HF Coordination for test framework IDE integration

I’m bringing this one up as a pure conversation starter. I have very little idea of what is involved or of the value-effort balance of such a task.

All major languages have test integration in their IDE, where you can run single tests via the IDE by pressing a play button.

Here’s an example for C#.

It would be invaluable to coordinate the efforts of major (and minor) test frameworks. It seems to me that a great place to start is the tasty framework, which already unifies all other major testing frameworks, and so provides a great entry point for such a task. tasty-discover is also relevant here.


Having the little play button next to tests is really nice when working with Kotlin in IntelliJ. Here’s a potential minimal design for a solution in the spirit of Haskell that doesn’t rely on special language support for outside libraries.

Libraries like JUnit rely on Java annotations to unambiguously mark tests, but we don’t have anything like Java annotations in Haskell. We tend to use newtypes for this kind of situation.

What if there were something like this in a well-known location:

data TestResult = Passed | Failed Text | Error Text

class IsTest a where
  runTest :: a -> IO TestResult

Any module-level definition whose type has an IsTest instance would be presented with a play button, and the play button would invoke runTest on it. Then, we could get something like the JUnit behavior with a micro-library:

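-- Needs {-# LANGUAGE DerivingVia #-}, Control.Exception and Data.Text (Text, pack)
newtype Test = Test (IO ())
newtype TestFail = TestFail Text deriving Show via Text
instance Exception TestFail
instance IsTest Test where
  runTest (Test impl) = (impl >> pure Passed) `catch` handleFailure `catch` handleOther
    where
      handleFailure (TestFail msg) = pure (Failed msg)
      -- The signature pins the exception type; show yields a String, so pack it into Text
      handleOther :: SomeException -> IO TestResult
      handleOther e = pure (Error (pack (show e)))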

 ▶️ myTest = Test $ do ...

Existing test frameworks could integrate very easily with this by providing IsTest instances.
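To make the adapter idea concrete, here is a minimal sketch of what such an instance might look like. It is only an illustration under some assumptions: BoolProperty is a hypothetical stand-in for a framework's own test type, and String replaces Text so that the example only needs base.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
module Main (main) where

import Control.Exception (SomeException, evaluate, try)

-- Simplified copy of the interface sketched in the thread; String replaces
-- Text so that only base is needed.
data TestResult = Passed | Failed String | Error String
  deriving (Show, Eq)

class IsTest a where
  runTest :: a -> IO TestResult

-- Hypothetical framework type (an assumption): a named pure property.
data BoolProperty = BoolProperty String Bool

-- The adapter is just an instance for the framework's own test type.
instance IsTest BoolProperty where
  runTest (BoolProperty name holds) = do
    -- Force the Bool inside try so exceptions from pure code are caught.
    outcome <- try (evaluate holds)
    pure $ case outcome of
      Right True -> Passed
      Right False -> Failed (name ++ " does not hold")
      Left (e :: SomeException) -> Error (show e)

-- Module-level definitions an IDE could decorate with a play button.
reverseTwice :: BoolProperty
reverseTwice =
  BoolProperty "reverse twice" (reverse (reverse [1, 2, 3 :: Int]) == [1, 2, 3])

brokenProperty :: BoolProperty
brokenProperty = BoolProperty "one equals two" ((1 :: Int) == 2)

main :: IO ()
main = do
  runTest reverseTwice >>= print
  runTest brokenProperty >>= print
```

Running main prints Passed for the first definition and a Failed result for the second. An IDE plugin would call runTest on the selected binding instead of going through main.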


This seems plausible, and might not be vastly difficult to do as an HLS plugin? (Although how nice the UI could be would depend on what support LSP provides for this kind of thing, or on adding custom IDE support.) I’m reminded of hls-eval-plugin (the Eval plugin for Haskell Language Server) and of edsko’s proof-of-concept Diagrams plugin (Pull Request #3250 on haskell/haskell-language-server). In particular, the latter looks for bindings of type Diagram and renders them on hover; one could imagine instead a plugin that looks for bindings whose types instantiate IsTest.

I use the eval plugin all the time, just as I used its predecessor in Dante. It’s really quite nice.

I suspect that an implementation along the lines that you’re suggesting would not be astoundingly difficult.

Come to think of it, the eval plugin basically already can run unit tests with a clunky UI, so you’re definitely on to something here.

A problem is fitting this into the LSP interaction model. I know rust-analyzer supports this, so we could look at what they do.

Libraries like JUnit rely on Java annotations to unambiguously mark tests, but we don’t have anything like Java annotations in Haskell.

Haskell has these too: the ANN pragma. It’s not much used, and seems to be undocumented, but it is there!

Perhaps this is mostly a philosophical question, but if it’s not used and it’s undocumented, is it really there?


It is documented:

https://downloads.haskell.org/~ghc/9.4.2/docs/users_guide/extending_ghc.html#source-annotations


And here I was reading the page in the manual about pragmas. Thanks!


It looks like something like data Test = Test and {-# ANN myTest Test #-} could work as well, though I do like how the type class mechanism would allow lightweight adaptors from the various testing libraries to be written.


I’ve been thinking about your solution for a little bit…

Something interesting about your solution is that it treats test declarations as primitives. Initially, I was thinking about individual tests as CLI calls with some kind of filtering. In tasty, this might look like: cabal test --test-options='-p "/myTest/"'.

Your solution has some advantages: it’s very simple, and it avoids the need for any naming conventions or annotations.

However, it also has some disadvantages. The main one I can think of is that many test frameworks have flags that affect how tests run; another is that sometimes I want to filter down to a test helper that is called many times by different top-level tests.

I think your suggestion is better though, because it’s an easy solution to a complex problem.

However, we can recover some of the power of a CLI with:

data TestOpts where
  TestOpts :: forall opts. Data opts => opts -> TestOpts

simpleTest = TestOpts ()

runFailedTests = TestOpts ["--only-failed"]

Each test framework could add its own helpers, and one canonical implementation of IsTest, where runTest now has the type runTest :: TestOpts -> a -> IO TestResult.
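As a rough sketch of how a framework instance might consume such options with the runTest :: TestOpts -> a -> IO TestResult signature: the options are recovered from the existential wrapper with cast. Everything beyond TestOpts, IsTest and runTest is an assumption made for illustration, including the NamedTest type, the flagsOf helper and the --verbose flag; String replaces Text to keep the example within base.

```haskell
{-# LANGUAGE GADTs #-}
module Main (main) where

import Control.Monad (when)
import Data.Data (Data)
import Data.Maybe (fromMaybe)
import Data.Typeable (cast)

data TestResult = Passed | Failed String | Error String
  deriving Show

-- Existential wrapper: any value with a Data instance can serve as options.
data TestOpts where
  TestOpts :: Data opts => opts -> TestOpts

class IsTest a where
  runTest :: TestOpts -> a -> IO TestResult

-- Hypothetical framework test type: a name plus an action reporting success.
data NamedTest = NamedTest String (IO Bool)

-- Hypothetical helper: read the options as CLI-style flags when they are a
-- list of strings, and as "no flags" for any other option type.
flagsOf :: TestOpts -> [String]
flagsOf (TestOpts opts) = fromMaybe [] (cast opts)

instance IsTest NamedTest where
  runTest opts (NamedTest name action) = do
    -- "--verbose" is a made-up flag, used only to show options reaching the test.
    when ("--verbose" `elem` flagsOf opts) (putStrLn ("running " ++ name))
    ok <- action
    pure (if ok then Passed else Failed name)

main :: IO ()
main = do
  let t = NamedTest "addition" (pure (1 + 1 == (2 :: Int)))
  runTest (TestOpts ()) t >>= print
  runTest (TestOpts ["--verbose"]) t >>= print
```

The cast works because Data has Typeable as a superclass. A real framework would more likely define its own options record and cast to that rather than to a list of strings.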

An example:

{-# ANN myTest simpleTest #-}
myTest = _

-- The following is during development when I want to isolate a specific test
{-# ANN myTest (filterTest "mySpecificTestCase") #-}
myTest = _

Reading more, I think my small addition makes the ANN solution completely compatible with tasty, for example, even with all its CLI options and support for “ingredients”.

https://hackage.haskell.org/package/tasty-1.4.2.3/docs/Test-Tasty.html#v:defaultMain
https://hackage.haskell.org/package/tasty-1.4.2.3/docs/Test-Tasty-Ingredients.html#t:Ingredient

A codebase could declare its own TestOpts helpers with helpful ingredients and options for tests during developer debugging…

I’d very strongly suggest not using ANN if you can help it. It will trigger the TemplateHaskell pipeline, and thus increase compilation time, and may cause increased recompilation.


Thanks for the pointer!

I guess that finding a way to pass test options to the tests is going to be a little difficult then… I guess we can just say “this is for the 80% of use cases where it’s helpful and go from there”.

On the other hand, I worry about choosing a typeclass design that isn’t extensible.

I think the next step is to define the most valuable features in the design space and decide whether a type class-driven approach can hit most of the right notes! I’ll get on this over Thanksgiving, I think…

Using a type class to indicate “click to run this test with default options” doesn’t preclude having a more detailed means of running a test suite in “expert mode”, does it?

It might be worth looking at the detailed-0.9 test type. Cabal has support for that, but as far as I know, very few people actually use it.

Definitely not, but I think it’s worth considering the possible design space for an expert mode, just so we know the design space before implementation 🙂

Fwiw, I am convincing myself that your type class design is already very good. My biggest worry was not having control over options such as “accept golden tests”, but I think I can get these back, so I’m going to write something up for the type class design.

So I’ve thought on it a little more, and I think I have a better intuition for what makes Haskell’s testing situation a little different from other languages.

Most languages use an OO, method-oriented design for test layout and discovery. For example: Java uses the @Test annotation to indicate a test; Python’s unittest framework takes any class extending unittest.TestCase and treats each of its methods whose name starts with test as a test case.

Haskell differs in a few ways:

  1. the organization of test code is handled by an eDSL where you create a value representing a test tree, typically with combinators such as testGroup or testCase, and assigning them names
  2. we do not have access to location data for a test case, whereas Java and the like have more immediate access to this data. This is because combinators such as testGroup do not typically have access to the location they were called at.
  3. this one is more conceptual: Haskell test structures are less focused around top level declarations in comparison to other languages. In other languages, it is not as easy to have nested test cases within one declaration. In Haskell a single top-level declaration can quite easily contain many different tests.

In other words, in Haskell, the unit of test is actually a testGroup or testCase call, and not top level declarations. The test tree is normally discovered (e.g. tasty and HUnit) with a flag such as --list-tests.

I am wondering whether it is possible to have test frameworks provide not just the test tree, but also the location of said tests. For example:

-- Test.hs
tests = testGroup "all" [testGroup "unit" unitTests, testGroup "property" propertyTests]
-- Test/Unit.hs
unitTests = [testCase "my unit" ..., testCase "unit2" ...]

-- Current output of --list-tests
all.unit.my unit
all.unit.unit2
all.property.prop1

-- Suggested output of --list-tests-json
{"name" : "all.unit.my unit", "location" : "path/to/Test/Unit.hs"...}

I think this is a preferable option for a few reasons:

  • We can run compiled code for tests, not interpreted, which would be the case with a GHCi solution
  • It is what other test experiences appear to do
  • This solution provides a very easy way to get the global test tree (see screenshot below from a sample python project)
  • This is, afaict, what IntelliJ and other IDEs do for their test integration. For example, running test1 in the image below incurs the following command line call: /.../python3.9 ~/.vscode/extensions/.../visualstudio_py_testlauncher.py --us=. --up=*test.py --uvInt=2 --result-port=53890 -ttest.MyTest.test2 --testFile=./test.py
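On the point that combinators do not know where they were called: GHC's HasCallStack mechanism can give a combinator its call site, so a location-aware listing like the one suggested above could be sketched roughly as follows. This is not how tasty or any existing framework works; TestTree, testCaseAt, listTests and renderEntry are hypothetical names, and the JSON rendering leans on show for quoting, which is only adequate for plain ASCII names.

```haskell
module Main (main) where

import Data.List (intercalate)
import GHC.Stack (HasCallStack, SrcLoc (..), callStack, getCallStack)

-- Hypothetical test tree that remembers where each leaf was defined.
data TestTree
  = Group String [TestTree]
  | Case String (Maybe SrcLoc) (IO ())

-- Hypothetical combinator: the HasCallStack constraint exposes the call site,
-- which is the first entry of the captured call stack.
testCaseAt :: HasCallStack => String -> IO () -> TestTree
testCaseAt name action =
  case getCallStack callStack of
    (_, loc) : _ -> Case name (Just loc) action
    [] -> Case name Nothing action

-- Flatten the tree into (dotted name, location) pairs.
listTests :: TestTree -> [(String, Maybe SrcLoc)]
listTests = go []
  where
    go path (Group name children) = concatMap (go (path ++ [name])) children
    go path (Case name loc _) = [(intercalate "." (path ++ [name]), loc)]

-- Render one entry as a JSON object with "file:line" as the location.
renderEntry :: (String, Maybe SrcLoc) -> String
renderEntry (name, loc) =
  "{\"name\": " ++ show name ++ ", \"location\": " ++ maybe "null" renderLoc loc ++ "}"
  where
    renderLoc l = show (srcLocFile l ++ ":" ++ show (srcLocStartLine l))

tests :: TestTree
tests =
  Group "all"
    [ Group "unit"
        [ testCaseAt "my unit" (pure ())
        , testCaseAt "unit2" (pure ())
        ]
    ]

main :: IO ()
main = mapM_ (putStrLn . renderEntry) (listTests tests)
```

Each printed line pairs a dotted test name with the file and line of the testCaseAt call that created it, which is the information an editor would need to place a play button.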

I think the type class oriented design addresses a use case that I also find convincing: what if I have something that could be run directly, such as an IO () or a function whose arguments have Arbitrary instances? In some ways I think the two use cases are very related, but I think the experience provided by most IDEs would not work with a type class solution for the reasons above…

Just want to mention that, at least for Neovim, there is some integration to run tests from the editor: MrcJkb/neotest-haskell on GitHub, a Neotest adapter for Haskell (cabal-v2 or stack) with Hspec.
It supports hspec, uses tree-sitter to parse the source and find test definitions, and then calls the test runner executable.
For other languages, the language server is sometimes used and provides off-spec extensions to locate the tests, or code lenses are used. The client then also triggers the test runner executable.

For Vim something similar exists too: vim-test/vim-test on GitHub (“Run your tests at the speed of thought”). Afaik it uses regexes to discover tests.

What could help development of such things is a common output format for the test results, to make the parsing easier.


I’ve written an initial proposal for an integration that doesn’t use HLS directly, but instead provides a unified API that testing frameworks can support via the command line.

It isn’t fully fledged yet, but I’d like people’s initial thoughts as to whether it makes sense before I go ahead and try to implement this, which would require support in tasty and VSC’s Haskell extension.