Let's start with a complete example right in the beginning. We could write a Haskell code like:
$(generateJS "code" [d|
square :: Int -> Int
square n = n * n
poly :: Int -> Int -> Int
poly x y = square x + 10 * y + 100
res = poly 42 7
|])
If we load the module into GHCi, we can try that we have the functions defined:
λ> square 5
25
λ> res
1934
But additionally there's a code
value with JavaScript source code:
λ> :t code
code :: [Char] -- i.e. String
λ> putStr code
// square :: Int -> Int
function square (n) {
return n * n;
}
// poly :: Int -> Int -> Int
function poly (x, y) {
return square(x) + 10 * y + 100;
}
var res = poly(42, 7)
How does this work? Let's walk through in three steps.
The [d| ... |]
is a syntax of Template Haskell declaration quote. For example, [d| answer = 42 |]
would produce a value of type Q Dec
, which is an abstract syntax representation of it.
Relatedly, the $(...)
is splice syntax. As it appears on the top level, it splices declarations (of the type Q [Dec]
, or expressions of the type Q Exp
depending on the context) into the module. This functionality is often used to define some boilerplate, e.g. deriveGeneric
from generics-sop
package to define instances, or makeLenses
from lens
to define Lens
values. (Note that the $(...)
wrapper is not strictly required, but being explicit helps understand what happens).
We can write a first version of generateJS
, which would pass through the declarations given to it:
generateJS1 :: String -> Q [Dec] -> Q [Dec]
generateJS1 _ decs = decs
We simply return the declarations given as argument. The result of splicing $(generateJS1 "somename" [d| some code ... |])
would be the same as not wrapping that some codeE inside splice & quote. Not very interesting yet, but it is a start.
The Dec
type is not opaque. We can look into the values, alter them, or do something in addition.
template-haskell
library also offers a pretty-printing functionality in Language.Haskell.TH.Ppr
module, most importantly pprint
function which pretty prints various template-haskell
types as Haskell code. Let's play with it:
generateJS2a :: String -> Q [Dec] -> Q [Dec]
generateJS2a _ decsQ = do
decs <- decsQ
liftIO $ putStr $ unlines $ map pprint decs
return decs
In this snippet we pprint
the declarations and putStr
them. When we (re)load the module in GHCi, something like
square_0 :: GHC.Types.Int -> GHC.Types.Int
square_0 n_1 = n_1 GHC.Num.* n_1
will be printed in between other messages. There are few things we want to improve:
GHC.Num.*
)_1
suffix in n_1
)We can solve two first ones using syb
- scrap your boilerplate. All types in template-haskell
have instances of Data
type class, so using syb
is very easy way to do modifications:
simplifyNames :: Data a => a -> a
simplifyNames = SYB.everywhere (SYB.mkT simplifyName)
simplifyName :: Name -> Name
simplifyName (Name occ _) = Name occ NameS
We modify all Name
values inside any Data
, using simplifyName
. simplifyName
then makes all names NameS
- unqualified; dynamically bound; these names are pretty-printed without prefixes or suffixes. Exactly what we want.
To solve the third issue, let's generate the additional definition containing the literal code. For that we'll need to look into template-haskell
library documentation to find the names for combinators to write the code:
stringDec
:: String -- ^ value name
-> String -- ^ value's value
-> Q Dec
stringDec name' contents = do
name <- newName name'
valD (varP name) (normalB (stringE contents)) []
When looking at the template-haskell
data structures, one can learn few things about the Haskell surface syntax. One clear thing: it's huge. Also that
foo | bar = 1
| otherwise = 2
is a valid syntax, though not often used (at least on the top-level). This is we use normalB
, and not the expression directly: the body of our value is not guarded, but it could be.
Using simplifyNmaes
and stringDec
we can define an improved version of of generateJSa
:
generateJS2b :: String -> Q [Dec] -> Q [Dec]
generateJS2b name decsQ = do
decs <- decsQ
let decs' = map simplifyNames decs
dec <- stringDec name $ unlines $ map pprint decs'
return (dec : decs)
Now, when we reload in GHCi, the compiler is silent; yet there's new value defined, code
:
λ> putStr code
square :: Int -> Int
square n = n * n
poly :: Int -> Int -> Int
poly x y = (square x + (10 * y)) + 100
res = poly 42 7
Some extra parentheses are fine.
You probably guessed what we will do next. Instead of pretty-printing Haskell as Haskell, we will pretty-print Haskell as JavaScript. The generateJS3
is essentially the same as previously:
generateJS3 :: String -> Q [Dec] -> Q [Dec]
generateJS3 name decsQ = do
decs <- decsQ
let decs1 = map simplifyNames decs
let decs2 = map generateDec decs1
dec <- stringDec name $ unlines $ map show decs2
return (dec : decs)
but instead of pprint
we use generateDec
. generateDec
doesn't handle all of Haskell syntax, only the bits we need for our example:
generateDec :: Dec -> Doc
generateDec dec@SigD {} = PP.text $ "// " ++ pprint dec
generateDec (FunD name clauses) = generateFun name clauses
generateDec (ValD (VarP name) body []) = generateVal name body
generateDec dec =
PP.text $ "// unhandled Dec: " ++ show dec
Type signatures (SigD
) are put into comments, function definitions (FunD
: fun a b = ...
) and value definitions (ValD
: val = ...
) are handled by corresponding generatefun
and generateVal
functions. Their definitions are in the appendix at the end of the blog post. The last, catch all case, show
s the Haskell value, so we can see which cases are unhandled (if any).
As the result, if we once again reload the module in GHCi, the code
will have the value as in the introduction:
λ> putStr code
// square :: Int -> Int
function square (n) {
return n * n;
}
// poly :: Int -> Int -> Int
function poly (x, y) {
return square(x) + 10 * y + 100;
}
var res = poly(42, 7)
Is this silly? Well... kind of. Compiling Haskell to JavaScript in this way won't work for all of Haskell, for example type-classes are not resolved, not to mention lazy vs. strict semantics.
Yet this is an example demonstrating actual techniques used in the real world:
For example IOHK Plutus uses the very same approach, see example in their tutorial: https://prod.playground.plutus.iohkdev.io/tutorial/#_writing_basic_plutustx_programs. They use Typed Template Haskell, but I think the idea is similar: Examine template-haskell
values, generate corresponding for "your language". There might be some extra steps, as there often are in the compilers.
I use the "capturing the Haskell code" part when working with my template library zinza
. zinza
can compile its templates to a Haskell module, which will render the template. Using the pprint
trick, I can write the input data definitions once, so they can be inserted into generated module, making it self-contained. (I have yet to write about this workflow in zinza
documentation).
In a similar spirit, one can capture the Q [Dec]
output of e.g. makeLenses
, pretty-print it into a file, and by using a CPP switch rather load a once generated Template Haskell splice from a file. For example on a systems where Template Haskell is rather non-available or just better avoidable in general.
Obsidian Systems' reflex-platform
used -ddump-slices
and regular expression magic to achieve similar effect. That feels a lot more fragile. AFAIK currently they use a custom GHC which can dump and load Template Haskell splices. That way the loading is cleaner, then using CPP, but at the cost of custom patch. Yet, with a Source Plugin (yet if you don't have Template Haskell, you most likely don't have source plugins either) one could
generateFun :: Name -> [Clause] -> Doc
generateFun name [Clause pats (NormalB body) decs]
| vars <- mapMaybe isVar pats =
PP.text "function"
<+> ppName name
<+> ppParensCommaSep (map (PP.text . pprint) vars)
<+> PP.char '{'
$$
PP.nest 2 (PP.text "return" <+> generateExp body <> PP.char ';')
$$
PP.char '}'
where
isVar (VarP var) = Just var
isVar _ = Nothing
generateFun name _ = error $ "difficult function: " ++ pprint name
generateVal :: Name -> Body -> Doc
generateVal name (NormalB body) =
PP.text "var" <+> ppName name <+> PP.char '=' <+> generateExp body
ppParensCommaSep :: [Doc] -> Doc
ppParensCommaSep ds = PP.parens $ PP.hsep $ PP.punctuate PP.comma ds
nameStr :: Name -> String
nameStr (Name (OccName n) _) = n
ppName :: Name -> Doc
ppName = PP.text . nameStr
generateExp :: Exp -> Doc
generateExp (InfixE (Just l) (VarE op) (Just r))
| nameStr op `elem` ["*", "+"]
= generateExp l <+> ppName op <+> generateExp r
generateExp (VarE n) = ppName n
generateExp (AppE f x) =
let (g, ys) = extractArguments f [x]
in generateExp g <> ppParensCommaSep (map generateExp ys)
generateExp (LitE (IntegerL i)) = PP.integer i
generateExp exp
= error $ "generateExp: difficult expression -- " ++ show exp
extractArguments :: Exp -> [Exp] -> (Exp, [Exp])
extractArguments (AppE f x) ys = extractArguments f (x : ys)
extractArguments f ys = (f, ys)