Encoding issues in GitHub Classroom autograding of Haskell assignments

While preparing an assignment in GitHub Classroom for my students, I ran into an issue related to character encoding. We use Portuguese in the program's input and output texts.

As an example, here (see this PR) is a simple assignment with a Haskell program that says hello world (Olá, mundo! in Portuguese).

module Main (main) where

main :: IO ()
main = putStrLn "Olá, mundo!"

The tests used for automatic grading look for the word Olá in the output of the program.

{
  "tests": [
    {
      "name": "Say hello world (with runhaskell)",
      "setup": "",
      "run": "runhaskell hello.hs",
      "input": "",
      "output": "Olá",
      "comparison": "included",
      "timeout": 10,
      "points": 1
    },
    {
      "name": "Say hello world (with ghc)",
      "setup": "ghc hello.hs",
      "run": "./hello",
      "input": "",
      "output": "Olá",
      "comparison": "included",
      "timeout": 10,
      "points": 1
    }
  ]
}

Here is the grading result.

The test with runhaskell fails because the text Olá is not found in the output of the program. Clearly there is a difference in encoding:

::error::The output for test Say hello world (with runhaskell) did not match%0AExpected:%0AOlá%0AActual:%0AOl?, mundo!

The test with ghc fails with a runtime error. It seems that the compiled program is not able to use the UTF-8 encoding when writing to stdout:

hello: <stdout>: commitBuffer: invalid argument (cannot encode character '\225')

Any clues on how to deal with this situation are very welcome.

Hmm, perhaps you need to set UTF-8 somewhere. Perhaps adding export LANG=C.UTF-8 to your workflow script will help?

If you can wrap the main provided by the students, would the with-utf8 package help here?
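
A minimal sketch of what that wrapping could look like, assuming the with-utf8 package is installed in the grading environment (plain runhaskell on a fresh runner may not have it) and that its Main.Utf8 module is used. The greeting is the one from the question; nothing else here comes from the original assignment.

```haskell
module Main (main) where

-- Assumption: the with-utf8 package is available to GHC on the runner.
import Main.Utf8 (withUtf8)

main :: IO ()
main = withUtf8 $ putStrLn "Olá, mundo!"  -- run the action with UTF-8 handles
```

The wrapper only touches main, so the rest of a student's code would stay unchanged.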

As I recently learned, setLocaleEncoding utf8 (which is what with-utf8 seems to use) works as long as you don’t spawn child processes. If your Haskell program needs to spawn child processes, LANG=C.UTF-8 runhaskell hello.hs seems to be the only solution that has worked for me so far. It doesn’t work on Windows, but that probably isn’t a problem for CI.
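
For reference, a minimal sketch of the setLocaleEncoding approach using only base. The helper name forceUtf8 is made up for this example, and setting the encoding on stdout and stderr explicitly is an extra precaution rather than something confirmed as necessary in this thread. As noted above, this affects only the current process, not any child processes it spawns.

```haskell
module Main (main) where

import GHC.IO.Encoding (setLocaleEncoding, utf8)
import System.IO (hSetEncoding, stderr, stdout)

-- Hypothetical helper: force UTF-8 regardless of the LANG setting.
forceUtf8 :: IO ()
forceUtf8 = do
  setLocaleEncoding utf8    -- default encoding for handles opened later
  hSetEncoding stdout utf8  -- set the standard handles explicitly as well
  hSetEncoding stderr utf8

main :: IO ()
main = do
  forceUtf8
  putStrLn "Olá, mundo!"
```

Calling forceUtf8 as the first action in main means the program should no longer depend on the locale of the CI runner when writing its output.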

LANG=C.UTF-8 runhaskell hello.hs works for me. Thanks.

I have also tried setting the locale with sudo localectl set-locale pt_BR.UTF-8, but it didn’t help.
