Deprecating Safe Haskell, or heavily investing in it?

Deprecate it. Just make sure old stable code doesn’t need a re-release because the language extension is no longer valid. There are a lot of stable, mature libraries out there which would break otherwise.

You do raise a good point: lambdabot and similar tools like Chris Smith’s CodeWorld do make use of whitelisting of code.

Instead of relying on Safe Haskell, it could have limited users to a whitelist of available imports. The same applies to lambdabot etc.

But that’s Linux only? What do we do outside of Linux? On Windows, macOS, BSD, …?

I’m not sure this is completely true. For proper isolation, CI services like GitHub Actions isolate at the operating-system level (they spin up a new virtualized OS instance). GitHub (shared) runners, GitLab shared runners, Buildkite runners, … are all pretty terrible at isolation (outside of Linux).

On topic though: I think the idea of Safe is good; the ergonomics are not, nor is adoption. This should be rethought.

I wish we had some way to force levels of purity. Something that allowed the type signature to guarantee that no IO was used (e.g. via unsafePerformIO, no FFI calls, …) and, in a separate step, to prove that the function did not use any system-dependent values either (e.g. anything unsized/implicitly sized).

2 Likes

Burn it.


OT:

And: no, I don’t trust the type system when it comes to IO at all. A bug in the RTS can launch missiles too.

There are solutions for that; e.g. the Haskell playground uses one of them: bubblewrap.

The other solution is syscall filters (e.g. via seccomp). Some Linux package managers do that to execute untrusted build systems: sydbox.

So Safe Haskell is not only useless, but the entire class of approaches is.

2 Likes

import Foreign.Marshal.Unsafe (unsafeLocalState)
import GHC.Conc (par)  -- or Control.Parallel (par) from the parallel package

challenge :: ()
challenge = unsafeLocalState (putStr "Try to predict this ") `par`
            unsafeLocalState (putStr "definition's output. ")


…possibly along with a bug in the UEFI - let’s burn that one too!

With that fire burning nicely, let’s also burn off all the problematic features or extensions listed in the paper:

  • unsafeLocalState, unsafePerformIO and unsafeCoerce, {-# RULES ... #-};
  • GeneralisedNewtypeDeriving (if the bug/s still haven’t been removed), TemplateHaskell

The paper mentions certain modules; they can probably be burned too:

  • System.IO.Unsafe
  • Unsafe.Coerce

If these modules can’t be retrofitted away from using unsafe features or extensions, they can also “go up in smoke”:

  • Data.ByteString
  • Debug.Trace
  • Control.Monad.ST

This doesn’t look good - GHC itself may have to burn, along with every other program in existence. It’s back to ye ol’ abacus!


Alright, that should be enough stupidity for everyone:

  • to those who want to burn Safe Haskell - implement practical alternative solutions to all the problems mentioned in the paper;

  • to those who want to keep Safe Haskell - address its bugs and limitations (e.g. that bug with GeneralisedNewtypeDeriving).

Perhaps, through both efforts, a more suitable alternative can be found which keeps everyone happy, or at least the least disappointed…

I think you somewhat misunderstood my stance.

The only case where I would find Safe Haskell interesting is to execute untrusted code. However, GHC Haskell is not CakeML and code is so far disconnected from what actually gets executed that it’s unrealistic to think Safe Haskell covers any type of sandboxing or “run untrusted code” features.

There may be other use cases to it, but they’re useless to me.

Useless features that cause pain to the ecosystem should indeed burn. And Safe Haskell does cause pain.

(And I’m confident there won’t be any successor that will be interesting to me)

1 Like

Well, one “can o’ worms” at a time: currently, @Ericson2314 is up to his eyebrows trying to make the initial separation of GHC and base. Giving that still-active project the resources it needs to be successful seems the better option right now - if the general consensus here is correct (very few people actively using Safe Haskell), then those few should be happy to wait for a project that everyone can benefit from.

Does CodeWorld really use SafeHaskell? I’d be surprised, because CodeWorld doesn’t actually run user code, it just compiles it with GHCJS and lets the user’s browser run it. AFAIK, it disallows TemplateHaskell specifically to avoid running user code on the server.

1 Like

As far as I can tell, the discussion here is mostly around whether it’s useful to compile/interpret untrusted code as part of a trusted codebase. In that discussion, my 2 cents is that, even though I have never needed to do that myself, it’s potentially a very useful thing, particularly since Haskell is a compiled language. I can imagine scenarios where Haskell is used to build platform-like environments, and users build functionality on top by submitting Haskell modules written against the platform API. This could enable use cases that would be impractical with the alternative approach of separately compiling the user code and running it behind process/VM/etc. isolation. Aside from the practicality, just giving your users a Haskell environment along with the Haddocks for the API available to them is probably 2-3 orders of magnitude easier than setting up strict isolation and building the API for crossing the isolation boundary.

That said, I’m not really sure how much Safe really contributes to any of that. Contrary to what’s been discussed here, I don’t think this even has anything to do with purity or IO either. It should be perfectly fine to run untrusted IO code. After all, untrusted Lua and JavaScript code is run in trusted environments all the time and both of these languages are impure (all code is in IO in a sense) and they don’t even have static types.

In the end, what allows us to trust untrusted code is the platform functionality it’s given access to. If I compile an untrusted module in an environment where it doesn’t have access to any Haskell packages, including base (and Prelude), then I should be able to run their IO code safely (unless I’m missing something). I should even be able to let them use TemplateHaskell as long as I don’t allow them to use the template-haskell package.

If I were building such a platform today, I’d accept users’ Haskell modules, and I’d compile them in an environment where they only have access to a package that I design specifically for that purpose and the modules from that package would only expose functionality that I have curated to be safe for the platform.
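
To make that concrete, here is a rough sketch of what such a curated module could look like. The module name PlatformApi, the PlatformM type and logLine are made-up illustrations for this post, not anything from an existing package:

{-# LANGUAGE NoImplicitPrelude #-}

-- Hypothetical curated module (illustration only): the sole import made
-- available to untrusted user code. It re-exports a vetted slice of base
-- and a restricted effect type instead of raw IO.
module PlatformApi
  ( Int, Bool (..), Maybe (..)
  , (+), (*), (&&), (||), not
  , PlatformM
  , logLine            -- the only effect the platform grants
  ) where

import Prelude (Bool (..), IO, Int, Maybe (..), String, not, putStrLn, (&&), (*), (+), (||))

-- The constructor is deliberately not exported, so user code cannot wrap
-- arbitrary IO actions in PlatformM; it can only use what is exported here.
newtype PlatformM a = PlatformM (IO a)

logLine :: String -> PlatformM ()
logLine s = PlatformM (putStrLn s)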

I believe the main goal of Safe is to formalize this approach and make it easier to set up that environment without having to curate every single bit in it. That’s why I think this thread should have more discussion of how much Safe really helps with that.

2 Likes

I spent a couple of years assisting in a functional programming course, which runs a lot of “untrusted code”. Safe Haskell seemed like a fitting use-case, but coerce and unsafePerformIO were never a problem: we just queried for any mention of them (I suppose you could still have pwned us with CPP) and warned if someone tried to use them before running the code.
The real issue with student assignments is highly performant, non-allocating infinite recursion, where timeouts can’t be used because the thread never calls yield, since it is non-allocating.
Something where “Total Haskell” might have helped, but that would have had a huge impact on the course itself if it only taught a total subset.
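
To illustrate the kind of loop I mean, here is a minimal made-up example; compiled with optimisations, the hot path allocates nothing, so in my experience a timeout may simply never fire:

{-# LANGUAGE BangPatterns #-}
module Main (main) where

import Control.Exception (evaluate)
import System.Timeout (timeout)

-- A tight loop with a strict accumulator: once optimised it performs no
-- heap allocation, so the evaluating thread never reaches a point where
-- the RTS can deliver the timeout's asynchronous exception.
spin :: Int -> Int
spin !n = spin (n + 1)

main :: IO ()
main = do
  -- Intended to give up after one second, but in practice this can hang
  -- indefinitely because the loop never yields to the scheduler.
  r <- timeout 1000000 (evaluate (spin 0))
  print r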

Moreover, the threat model of SafeHaskell does not apply at all to our situation. If someone actually wants to pwn the course… have fun, go ahead! If we find out, you will be severely punished, and you have almost nothing to gain. We are so confident in this risk assessment that the code is even run as admin (not my idea; the sysadmin said “it’s fine”).

tl;dr:
I don’t see any benefit in Safe Haskell so far.
For running untrusted code, it only takes care of a single threat model, which is better accounted for by other mechanisms.
For trusting third-party packages, I doubt it can actually take the place of carefully reviewing the code you use. And even if it could, it would be far better if it lived outside of GHC rather than inside it; a network of trust doesn’t need to be part of GHC or even Hackage directly.

4 Likes

You can use the -fno-omit-yields flag to have GHC automatically insert yields.
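
For reference, the flag can also be enabled per module with a pragma; a minimal sketch (module name made up):

{-# OPTIONS_GHC -fno-omit-yields #-}
-- Keeps the heap-check/yield points that GHC would otherwise omit from
-- non-allocating code, so the thread can still be preempted (e.g. so the
-- exception used by System.Timeout.timeout can actually be delivered).
module StudentSubmission where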

A.k.a. Liquid Haskell.

2 Likes

That’s precisely what we use, and it doesn’t work in all situations :woman_shrugging:. I have no real idea why; I used to have a reproducible example, but can’t find it now.

2 Likes

I agree with this completely, and from the same angle.

I work in a domain in which we run untrusted user (read: student) code. Safe Haskell never even crossed my mind as something to use to prevent undesirable classes of behaviour. As cool of an idea as it was in principle, it just doesn’t fully solve the problem it sets out to solve; and in this case, if the problem is not fully solved, it is not solved at all. A miss is as good as a mile.

Effective sandboxing is a battle-tested, language-agnostic solution that does not require any special treatment from the compiler/build system, and we should embrace it.

8 Likes

In the non-threaded run-time, that will print “definition’s output”. In the threaded run-time, the same result is likely, but both strings could be output (possibly interleaved, depending on the buffering mode). I still don’t see how this is a special reason to need SafeHaskell.

From page 2 of 12 in the paper:

…maybe this article can also help to clarify matters.

But if you still don’t see how any of that is a reason (special or otherwise) to have SafeHaskell…you might be happier using Standard ML instead of Haskell - you can sneak in all the observable effects you like, and you rarely have to change any type signatures as a result. Furthermore, and as I dimly recall, there are still some SML implementations which are only single-threaded (being able to choose your preferred language implementation - what a novelty!).

Kaplan, Okasaki, and Tarjan demonstrate a way to simplify a catenable list implementation by using careful mutation in a way slightly more general than pure lazy evaluation allows. Packages like lvish and monad-par use unsafe operations under the hood to offer functionality that just wouldn’t be available otherwise. The art of using impure operations to implement pure ones isn’t and shouldn’t be restricted to compiler writers. Unsafe coercions (often buried in rewrite rules) are essential for optimizing Haskell code that uses very heavy type-level computations.

1 Like

It isn’t - that’s why types like ST exist.

…and before long, there’ll no doubt be a need for coercions in kind and sort-level programming because optimising those will be essential, too. It’s like normalising relational DBs - there’ll always be another normal form to make them work just that bit better.

Oh, and of course dependent types will make this all so much easier…

ST supports one way to use mutable state to implement pure computations. Lazy evaluation is another. Do you think those should be the end of it? That nothing else of that sort could possibly be worth the trouble?
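
(For concreteness, a minimal sketch of that ST pattern: a mutable reference on the inside, an observably pure function on the outside.)

import Control.Monad (forM_)
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Internally this uses a mutable STRef, but runST guarantees the mutation
-- cannot escape, so the function it defines is observably pure.
sumList :: [Int] -> Int
sumList xs = runST $ do
  acc <- newSTRef 0
  forM_ xs $ \x -> modifySTRef' acc (+ x)
  readSTRef acc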

1 Like

No, they shouldn’t be the end of it. But it begs a similar question: beyond parametric encapsulation and lazy evaluation, has anything else been discovered that has had a similar impact to either of them?

As Bodigrim and yourself have both noted, more and more unsafe code is appearing (with various levels of “trust-washing”), none of which seems to suggest a new commonality or abstraction which can be used to manage the mayhem the way parametric encapsulation and lazy evaluation help us to do. Instead, it’s just an ever-growing pile of “unsafe abominations” which makes the claim of Haskell being a purely functional programming language an ever-growing joke…

…so let’s solve both problems at once:

  • Haskell is relabelled the “nondeterministic functional programming language” - the unsafe prefix can then be replaced with nondet.

  • As I suggested here, unrecoverable errors in Haskell no longer stop programs, but send them into an infinite regression: debuggers can then be attached to the still-running programs to examine their call stacks or other runtime meta-data as needed to discover what went wrong.

The research and development that went into HasFuse, an early Haskell implementation based on GHC, could be useful for this task.