I am trying to use tree-sitter with NeoVim. Initially it was appealing, since the regex-based highlighting broke often, and I thought queries on a parse tree would be better.
But it seems that Haskell is hard with tree-sitter, since it can’t know if something is a function or not without doing type checking.
For example if I have
int :: Int -> Int
int = id
and I do :Inspect on the latter int, I get @function.haskell.
But if I change to:
type Int2Int = Int -> Int
int :: Int2Int
int = id
the color is changed and it now looks like any other term.
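To make the cause concrete, here is a minimal sketch of a purely syntactic rule. It is not the actual tree-sitter-haskell query; it assumes a hypothetical rule that treats a name as a function whenever the written text of its signature contains an arrow. The names `Highlight` and `classify` are made up for illustration.

```haskell
import Data.List (isInfixOf)

-- Hypothetical highlight groups, for illustration only.
data Highlight = Function | Variable
  deriving (Eq, Show)

-- Assumed rule: a name counts as a function if the text of its
-- type signature contains an arrow. Type synonyms are not expanded,
-- because a syntactic tool only sees the tokens that were written.
classify :: String -> Highlight
classify signature
  | "->" `isInfixOf` signature = Function
  | otherwise                  = Variable

-- The two signatures from the example above.
direct, viaSynonym :: Highlight
direct     = classify "Int -> Int"
viaSynonym = classify "Int2Int"
```

Under this assumed rule `direct` is `Function` and `viaSynonym` is `Variable`, even though both names have the same type. Telling them apart requires resolving `Int2Int`, which is the type checker's job.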
This makes me wonder, does tree-sitter even make sense with Haskell? Are we going to highlight using LSP at some point? Then I suppose Haskellers could skip tree-sitter entirely.
I suppose you could make the point that highlighting doesn’t need to be perfect. But in that case, you could make the argument that regexes are good enough. They seem to have fewer moving parts.
Or should I, as the title suggests, just color functions the same as other terms, and enjoy highlighting on other elements?
Many LSPs support semantic highlighting, but, depending on your editor, the theme/color scheme in use must also define colors for these tokens, which means that the highlighting of some “smaller” languages only works with special themes.
Well all or nothing seems a bit extreme, no? tree-sitter covers an awful lot correctly at the moment, but yes, it does miss a bit and thus isn’t perfect. Despite this I get a lot of value from it still - it drives highlighting in my editor (Helix), but also helps code navigation. In Helix I can press Alt-o and Helix will grow my selection to the parent node in the AST. I use this all the time to select sub-expressions to extract them into separate functions, for example. This would be a lot more complicated with regular expressions!
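The "grow selection to the parent node" idea can be sketched on a toy tree. This is only an illustration under stated assumptions, not how Helix or tree-sitter implement it: `Node`, `Span`, `expand`, and the example tree for `f (g x) y` are all hypothetical, and ranges are half-open character offsets.

```haskell
-- A toy syntax tree: every node covers a half-open character range.
data Node = Node
  { nodeStart :: Int
  , nodeEnd   :: Int
  , children  :: [Node]
  }

type Span = (Int, Int)

nodeSpan :: Node -> Span
nodeSpan n = (nodeStart n, nodeEnd n)

-- True when the node's range contains the whole selection.
covers :: Node -> Span -> Bool
covers n (s, e) = nodeStart n <= s && e <= nodeEnd n

-- Range of the smallest node that covers the selection and is
-- strictly larger than it; Nothing if no such node exists.
expand :: Node -> Span -> Maybe Span
expand n sel
  | not (covers n sel) = Nothing
  | otherwise =
      case [r | c <- children n, Just r <- [expand c sel]] of
        (r : _) -> Just r
        []
          | nodeSpan n /= sel -> Just (nodeSpan n)
          | otherwise         -> Nothing

leaf :: Int -> Int -> Node
leaf s e = Node s e []

-- Hand-built tree for the text "f (g x) y".
example :: Node
example =
  Node 0 9
    [ leaf 0 1                                   -- f
    , Node 2 7 [Node 3 6 [leaf 3 4, leaf 5 6]]   -- (g x)
    , leaf 8 9                                   -- y
    ]
```

Starting from `x` at `(5, 6)`, repeated calls to `expand example` select `g x`, then `(g x)`, then the whole expression. Doing the same with regular expressions would mean re-deriving the nesting that the tree already records.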
I’m surprised to hear that there are people who want to color identifiers of function type differently from other identifiers! What benefit do you get from doing so?
The problem is that Treesitter does the same things that LSPs can do (your selection example is covered by the Selection Range Request), but not all LSPs support the same features. And Treesitter is too limited to be used instead of LSP (yes, I know that there are LSPs implemented using Treesitter and they all are “better than nothing”). So, the sooner Treesitter is dropped for LSPs, the better (Emacs just beginning to adopt Treesitter doesn’t help), as it “forces” the LSP to implement the missing features and all editors to be able to use the same features.
If I’m allowed to hazard a guess, I’d say, knowing whether an identifier is a function or not.
As always: as soon as you are used to a feature (in other LSPs), you miss it if an (LSP) implementation lacks that feature.
I think knowing if something is a free or bound variable in an expression could be useful sometimes - certainly in the case of shadowing (e.g., is this id the function (free), or some id I bound earlier, for instance in do notation (bound)?).
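As a rough illustration of the free/bound distinction, here is a sketch over a tiny lambda-calculus `Expr` type. The type and `freeVars` are hypothetical and far simpler than real Haskell syntax (no let, no do notation, no patterns), but the shadowing case works the same way.

```haskell
import Data.List (nub)

-- A deliberately tiny expression language.
data Expr
  = Var String
  | App Expr Expr
  | Lam String Expr   -- \x -> body, binds x inside body

-- Names that occur in the expression without an enclosing binder.
freeVars :: Expr -> [String]
freeVars (Var x)   = [x]
freeVars (App f a) = nub (freeVars f ++ freeVars a)
freeVars (Lam x b) = filter (/= x) (freeVars b)

-- \x -> id x : here id refers to something outside the expression.
usesOuterId :: Expr
usesOuterId = Lam "x" (App (Var "id") (Var "x"))

-- \id -> id x : here id is shadowed by the lambda's own binder.
shadowsId :: Expr
shadowsId = Lam "id" (App (Var "id") (Var "x"))
```

`freeVars usesOuterId` contains `"id"`, while `freeVars shadowsId` does not. This analysis only needs scoping information, so it sits somewhere between plain syntax and full type checking.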
Ha! Because four would be coloured “as a function”? The logical conclusion of this idea is a variety of colours for a variety of different type classifications, and that would indeed imply full type checking …
Exactly. You can define your own ones (they are just indices into an array, see “Integer Encoding for Tokens” in the LSP specification), but they have to be somehow supported by the client.
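For reference, a sketch of that integer encoding as the LSP specification describes it: each token becomes five integers (line delta, start-character delta, length, token type index, modifier bit set), with positions relative to the previous token. The `Token` record, the `legend` contents, and `encode` are illustrative assumptions, not code from any real server, and modifiers are always left at 0 here.

```haskell
import Data.List (elemIndex)

-- Assumed legend a server might announce; a token type is sent as
-- its index into this list.
legend :: [String]
legend = ["function", "variable", "type"]

data Token = Token
  { tokLine   :: Int     -- zero-based line
  , tokStart  :: Int     -- zero-based start character
  , tokLength :: Int
  , tokType   :: String  -- must appear in the legend
  }

-- Encode tokens (already sorted by position) as five integers each.
-- Returns Nothing if a token type is missing from the legend.
encode :: [Token] -> Maybe [Int]
encode toks = concat <$> traverse one (zip ((0, 0) : map pos toks) toks)
  where
    pos t = (tokLine t, tokStart t)
    one ((prevLine, prevStart), t) = do
      typeIx <- elemIndex (tokType t) legend
      let dLine  = tokLine t - prevLine
          -- The start is relative only when the line did not change.
          dStart = if dLine == 0 then tokStart t - prevStart else tokStart t
      pure [dLine, dStart, tokLength t, typeIx, 0]

-- Tokens for the two lines "int :: Int2Int" and "int = id".
sample :: [Token]
sample =
  [ Token 0 0 3 "function"
  , Token 0 7 7 "type"
  , Token 1 0 3 "function"
  , Token 1 6 2 "function"
  ]
```

The client receives only the numbers, so a custom entry in the legend is just another index. It still needs a theme that knows what to do with that name, which is the limitation described above.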