[Code snippet] How to compile a `String` to an optimised `CoreExpr` using the GHC API

Hi,

since it took me a while to figure out what would be the best way to compile just an expression instead of a full module using the GHC API, here’s a small end-to-end example to do that, in the hope that it encourages a few people to play around with it:

import GHC
import GHC.Core
import GHC.Driver.Session
import GHC.Types.Name
import GHC.Utils.Outputable
import qualified GHC.LanguageExtensions as LangExt

import Control.Monad
import Control.Monad.IO.Class
import Data.List
import System.Directory
import System.Environment
import System.Exit
import System.FilePath

indent :: Int -> String -> String
indent n = unlines . map (\s -> replicate n ' ' ++ s) . lines

pprPrint :: Outputable a => a -> IO ()
pprPrint = putStrLn . showSDocUnsafe . ppr

compileToCore :: String -> [String] -> String -> IO CoreExpr
compileToCore libdir args expression = do
  tmp <- getTemporaryDirectory
  let file = tmp </> "_interactive_.hs"
  writeFile file ("module Interactive where import GHC.Exts; it = " ++ indent 2 expression)
  -- Initialize GHC session
  defaultErrorHandler defaultFatalMessager defaultFlushOut $ do
    runGhc (Just libdir) $ do
      -- Set up GHC session
      dflags <- getSessionDynFlags
      logger <- getLogger
      (dflags, rest_args, err_messages) <- parseDynamicFlags logger dflags (map (L noSrcSpan) args)
      when (not (null rest_args)) $ liftIO $ putStrLn ("Unhandled args: " ++ show rest_args) >> exitFailure
      when (not (null err_messages)) $ liftIO $ pprPrint err_messages >> exitFailure

      setSessionDynFlags $
        -- flip gopt_unset Opt_FullLaziness $
        -- flip gopt_unset Opt_WorkerWrapper $
        -- updOptLevel 1 $ -- if you want to compile with -O1 opts, make sure to unset -ffull-laziness and -fworker-wrapper above in addition to -flocal-float-out-top-level
        flip gopt_unset Opt_LocalFloatOutTopLevel $
        flip gopt_unset Opt_IgnoreInterfacePragmas $ -- This enables cross-module inlining
        flip xopt_set LangExt.MagicHash $
        dflags
      mod_guts <- compileToCoreSimplified file
      let binds = cm_binds mod_guts
      let Just (NonRec _ e) = find (\b -> case b of NonRec x e -> getOccString x == "it"; _ -> False) binds
      return e

main = do
  libdir:args <- getArgs
  getLine >>= compileToCore libdir args >>= pprPrint

It currently unsets Opt_LocalFloatOutTopLevel , a relatively recent addition. (GHC 9.6 I think. Very useful here, thanks @vmchale) Without it, the CoreExpr might not be self-contained/closed wrt. local bindings.

I first tried coming up with my own fake module instead of writing to a temporary file, but I ultimately gave up when I had to fake a ModSummary which is what all processing functions such as parseModule, typecheckModule, etc. need.

Compile it like this:

$ ghc-9.6 -package ghc -package ghc-boot -package directory -package filepath test.hs

Then use it like this

$ echo "3 + 5 :: Int" | ./test $(ghc-9.6 --print-libdir)
I# 8#
$ echo "let f x = x * 43 :: Int; {-# NOINLINE f #-} in f 3" | ./test $(ghc-9.6 --print-libdir)
join {
  f_sH6 [InlPrag=NOINLINE] :: Int -> Int
  [LclId[JoinId(1)(Nothing)], Arity=1, Str=<L>, Unf=OtherCon []]
  f_sH6 (eta_B1 [OS=OneShot] :: Int)
    = case eta_B1 of { I# x_aCA -> I# (*# x_aCA 43#) } } in
jump f_sH6 (I# 3#)

You can even set individual GHC flags such as -fspecialise or -fstrictness:

echo "let f x = x * 43 :: Int; {-# NOINLINE f #-} in f 3" | ./test $(ghc-9.6 --print-libdir) -fstrictness
join {
  f_alA [InlPrag=NOINLINE, Dmd=1C(1,!P(L))] :: Int -> Int
  [LclId[JoinId(1)(Just [!])], Arity=1, Str=<1L>, Unf=OtherCon []]
  f_alA (eta_B1 [OS=OneShot] :: Int)
    = case eta_B1 of { I# x_iCC -> I# (*# x_iCC 43#) } } in
jump f_alA (I# 3#)

(note the changed strictness signature <1L>.)

17 Likes

Very cool! I wonder if there’s any way to do this like

ghc -e '1 + 2 :: Int' -ddump-simpl
4 Likes

Cool idea, I updated the snippet to include parsing DynFlags

Edit: Only much later I realised that is not what the previous comment was aiming at. Anyway, now there’s DynFlags parsing