Opening text files fails on M1 darwin (GHC 9.2.1)

With GHC 9.2.1 + bundled text library, opening a text file like so:

ghci> import qualified Data.Text.IO as TIO
ghci> TIO.readFile "/tmp/example.txt"
*** Exception: /tmp/example.txt: hGetContents: invalid argument (invalid byte sequence)
ghci>

Basically anything non-latin fails:

vanessa@Vanessas-MacBook-Air dickinson % cat /tmp/example.txt
«
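
For anyone who wants to check their own setup, here is a minimal sketch of a round-trip test. It is an illustration and not part of the original report: the names roundTrip and roundTripsCleanly are made up, and it sets the handle encoding to UTF-8 explicitly so the result does not depend on the locale. It assumes the failure mode is the same one shown above.

```haskell
import Control.Exception (IOException, try)
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import System.IO (IOMode (..), hSetEncoding, utf8, withFile)

-- | Hypothetical helper: write the text to the path as UTF-8, then read
-- it back as UTF-8. Any IOException (such as "invalid byte sequence")
-- is returned as a Left instead of being thrown.
roundTrip :: FilePath -> T.Text -> IO (Either IOException T.Text)
roundTrip path contents = try $ do
  withFile path WriteMode $ \h -> do
    hSetEncoding h utf8
    TIO.hPutStr h contents
  withFile path ReadMode $ \h -> do
    hSetEncoding h utf8
    -- Data.Text.IO.hGetContents is strict, so decoding happens here.
    TIO.hGetContents h

-- | Hypothetical helper: True when the text comes back unchanged.
roundTripsCleanly :: FilePath -> T.Text -> IO Bool
roundTripsCleanly path contents = do
  result <- roundTrip path contents
  return (either (const False) (== contents) result)
```

Calling roundTripsCleanly "/tmp/example.txt" (T.pack "«\n") in GHCi should give True on a compiler that decodes correctly; a False or a Left from roundTrip would suggest the setup is affected.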

This might be related to GHC issue #20640, “UTF-8 decoding doesn’t work correctly on GHC 9.2.1+AArch64 NCG”.

If you need a workaround, try compiling with the LLVM backend (-fllvm) instead of the AArch64 native code generator.

The new subword logic in GHC was not very well covered by the test suite. This is being rectified with a new QuickCheck-based test tool: test-primops (Ben Gamari / test-primops on GitLab).
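
To give an idea of the kind of check involved, here is a small sketch that compares 8-bit arithmetic against a wide reference model. It is not the code of test-primops and does not use QuickCheck; it simply enumerates every pair of Word8 operands. The names refOp, mismatches, checks and failingOps are made up for this illustration.

```haskell
import Data.Word (Word8)

-- | Reference model: do the arithmetic on Integer, then truncate the
-- result to 8 bits.
refOp :: (Integer -> Integer -> Integer) -> Word8 -> Word8 -> Word8
refOp f a b = fromInteger (f (toInteger a) (toInteger b) `mod` 256)

-- | All operand pairs for which the Word8 operation disagrees with the
-- Integer-based reference model.
mismatches
  :: (Word8 -> Word8 -> Word8)
  -> (Integer -> Integer -> Integer)
  -> [(Word8, Word8)]
mismatches op ref =
  [ (a, b)
  | a <- [minBound .. maxBound]
  , b <- [minBound .. maxBound]
  , op a b /= refOp ref a b
  ]

-- | The operations under test, paired with their mismatching operands.
checks :: [(String, [(Word8, Word8)])]
checks =
  [ ("add", mismatches (+) (+))
  , ("sub", mismatches (-) (-))
  , ("mul", mismatches (*) (*))
  ]

-- | Names of the operations that disagree with the model on some input.
failingOps :: [String]
failingOps = [name | (name, bad) <- checks, not (null bad)]
```

An empty failingOps means the narrow operations agree with the model on all inputs. A real tool would cover many more operations and widths, and would have to take care that the optimizer does not fold the operations away before they reach the code generator.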

Please try merge request !6934 (“Draft: Various codegen fixes”) and see if that alleviates the issue.

@angerman I think it would be good to have a PSA regarding these bugs.

People are encountering these with various packages and reporting them on the respective issue trackers. See e.g. haskell/text issue #381 on GitHub, “Unexpected conversion to surrogate pairs”.

@sjakobi good idea! Where would you expect such a PSA to be posted: here on Discourse, or on the ghc-devs mailing list?

The last such PSA I remember was made on Haskell Cafe: [Haskell-cafe] Serious bug with Natural in GHC 9.0.1

It should probably be crossposted to the other places where GHC releases are announced, e.g. r/haskell and this Discourse.

We do see a pattern here right? The release candidates aren’t getting the necessary testing they need. The first release that’s cut gets some testing because people aren’t hesitant to test a “release candidate” anymore, and the .1 ends up being effectively the -rc1.

I do agree on the PSA though. But then again I don’t even read the Cafe O_O.

Yeah, I guess I’m guilty of not playing enough with the release candidates even though head.hackage does make it pretty easy.

If I had tried this earlier, maybe GHC issue #20666, “Regression in superclass checking of instances” (which I just encountered in generic-random), wouldn’t have made it into 9.2.1.

I suspect that a lot of people don’t know how head.hackage can be used to work around dependency issues with early stage GHC releases. Maybe, if this was publicized a bit better, more people would give it a try.

Hi, I had heard about head.hackage but had not taken a look at it. It is great, and I agree it should be better publicized.

> We do see a pattern here right? The release candidates aren’t getting the necessary testing they need.

Today I had to ping one maintainer for the third time and email another for the second time, trying to get two packages upstream of HLS to support ghc-9.0.1!!
So I’m not sure that testing GHC release candidates would be easy, at least for projects living in the leaves of the dependency graph.

> We do see a pattern here right? The release candidates aren’t getting the necessary testing they need.

Indeed. Running the tests already present in the GHC source tree could be a good first step, I think. E.g., the test suite of the text submodule clearly demonstrates the issue.

Yes, and the test suite even runs tests of libraries. But libraries have to adhere to GHC’s odd test driver. E.g. there are tests for array (the tests directory of Glasgow Haskell Compiler / Packages / array on GitLab), base (the tests directory of Glasgow Haskell Compiler / Packages / base), unix (the tests directory of Glasgow Haskell Compiler / Packages / unix), …

text sadly doesn’t have the same arcane test-infra that ghc mandates for testing :frowning:

IMO, ghc shouldn’t impose the how of test-suites onto packages; we have cabal files to describe test-suites. We should just use those. This is an uphill battle though.

> We do see a pattern here right?

Yeah, Apple is being annoying with its custom/NIH stuff again :crazy_face:

That’s quite far fetched. If it was just for the calling convention we wouldn’t see these issues. It’s due to the proliferation of subword primops. Those are arguably enabled by the machinery we had to add to properly support foreign calls on AArch64-darwin.

A more accurate spin would be that Apple once again forced our hand to be less sloppy (not everything is a WORD, and this in a typed language after all!) and that this nudge allowed us to finally have enough machinery in place to make subword primops feasible.

We simply did not have good tests for subword primops. This would affect AArch64-Linux as well I believe. That this shows up on AArch64-darwin first should tell us something about platform popularity.

> IMO, ghc shouldn’t impose the how of test-suites onto packages; we have cabal files to describe test-suites. We should just use those. This is an uphill battle though.

It’s probably non-trivial to integrate the test suites of boot packages into the native GHC tests, but it should be reasonably simple to just run them as a separate build step once ghc-stage2 and cabal are available, shouldn’t it? I mean, what’s the point of maintaining head.hackage if not to run tests?

Now I’m a bit afraid of being taken as an official voice here; I’m not one. So please let me prefix this with: this is my view on things.

head.hackage is primarily a way to fix packages ahead of upstream to test compilation (and maybe testsuites) against them. It’s mostly an aid in ghc development. We do run head.hackage tests occasionally, but they are not part of regular CI runs, and I’m not sure how much attention anyone pays to them really :frowning:

I have started working on full blast tests for compilers. If this was further along, we might have been able to test earlier compilers faster. It’s however just a side project Hamish and I work on occasionally. I wish we’d had something like this in better shape. Quite frankly the resource use is intense, and we’ll need to find a better solution on reporting for regressions, …

We’ll likely run this against 9.2.1 soon’ish.

To go back to your earlier point: yes, it would be possible to set up some smaller tests for shipped packages. It just needs someone to actually write that CI job, especially once we have building and testing separated into two distinct CI stages. GHC dev is seriously short on manpower. I’d be happy to provide any form of guidance, mentoring and assistance!

I was waiting for the AArch64 NCG; at the time of the first release candidate I didn’t have the appropriate M1 hardware!

I will be paying attention to GHC 9.2.2 etc.