In short, I’ve been doing a really terrible job keeping the lights on and I don’t foresee a lot of additional bandwidth in the near term. I haven’t used it actively since the LSP options became usable, but I know there are people who do. Please reach out if you have any interest.
Also, a huge thank you to @andreasabel for doing all of the hard work of keeping Hasktags building with newer versions of GHC.
I see you are the maintainer of ghc-tags, Andrzej. Are you sure ghc-tags and hasktags perfectly overlap in their goal and scope?
I am a user of hasktags, Jack. Thanks for your work, and thanks for deciding to seek a new maintainer here: it is a positive way forward that shows consideration for the ecosystem and gives the previous maintainer well-deserved closure to attend to other commitments.
I’m not sure about ghc-tags, but I was a user of hasktags until something annoyed me and I moved to fast-tags and that library has been working great for me for years.
I think so. ghc-tags properly generates tags for all top-level definitions since it uses the GHC API, unlike hasktags, fast-tags etc., which use ad-hoc parsers. I use it for tags-based navigation in all of the Haskell projects I interact with, without any issues. When I was using hasktags, I was constantly encountering definitions I couldn’t jump to because of bugs in the parser. This was annoying enough that I ended up creating ghc-tags.
I use hasktags because it just works. Not perfectly but it hasn’t changed for more than a decade. The source to hasktags is trivial, I could keep it running for as long as I like.
I don’t have that kind of confidence in anything that depends on GHC’s API. As tempting as they are, their maintainers are more at risk of burning out.
Understandable. FWIW, ghc-tags uses ghc-lib so I don’t drown in CPP, and maintenance for the last 2 years has generally been "every ~6 months, spend an hour moving to the newest ghc-lib".
I’m more than happy with ghc-tags. Its adoption in GHC is only the consequence of its quality and good user experience, which I can vouch for in both personal and professional contexts.
Out of interest I looked at fast-tags vs hasktags vs ghc-tags on the same non-trivial work codebase (842 Haskell modules). Based on the output sizes, it’s clear that hasktags produces the least information, fast-tags produces more (likely due to a slightly better lexer), and ghc-tags the most information (understandably, it’s a full parser). In terms of speed, the difference between them is in the noise if you let them walk the directory structure themselves. I’ve found if you pipe fd into them, it turns seconds into something on the order of 200ms. I’ve updated my Emacs binding to do this, regardless of which tags implementation I’m using.
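For anyone who wants to script the "collect the file list first, then pipe it in" approach without fd, here is a minimal Python sketch. It only illustrates the shape of the pipeline and makes no attempt to match fd's speed. The function names and the list of skipped directories are assumptions made for this example, not part of any of the tools discussed.

```python
import os
import subprocess

# Hypothetical skip list: directories that usually hold build output or VCS data.
SKIP_DIRS = {".git", "dist-newstyle", ".stack-work"}

def collect_haskell_files(root):
    """Return a sorted list of all .hs files under root, skipping SKIP_DIRS."""
    found = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune skipped directories in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        found.extend(os.path.join(dirpath, f) for f in filenames if f.endswith(".hs"))
    return sorted(found)

def run_tagger(command, files):
    """Feed one path per line to the command's stdin and return its exit code."""
    result = subprocess.run(command, input="\n".join(files) + "\n", text=True)
    return result.returncode

# Example (assumes fast-tags is installed and reads the file list from stdin
# when given "-", as in the invocation quoted elsewhere in this thread):
# run_tagger(["fast-tags", "-o", "tags", "-"], collect_haskell_files("."))
```

The tag generator command is passed in as a list, so the same helper can be pointed at whichever implementation you use.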
Since the above results are not reproducible (private project), I can share how ghc-tags works locally for me with the ghc repo and the following config file:
unknown@electronics ghc $ rm TAGS*
unknown@electronics ghc $ time ghc-tags -e
libraries/ghc-heap/GHC/Exts/Stack/Decode.hs:12:14: warning: [GHC-53692] [-Wdeprecated-flags]
    -XTypeInType is deprecated: use -XDataKinds and -XPolyKinds instead
   |
12 | {-# LANGUAGE TypeInType #-}
   |              ^^^^^^^^^^
could not execute: hspec-discover
could not execute: hspec-discover
real 0m4,011s
user 0m24,264s
sys 0m3,897s
unknown@electronics ghc $ time ghc-tags -e
could not execute: hspec-discover
could not execute: hspec-discover
real 0m0,334s
user 0m0,861s
sys 0m0,237s
It runs on a little over 1500 modules. The cold run takes time, but after that it tracks the modification times of all modules, and subsequent runs re-parse only the modules that changed. I don’t observe any slowness when letting it do the directory traversal itself.
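The idea of re-parsing only modules whose modification time changed can be sketched as follows. This is an illustration of the general technique, not ghc-tags's actual implementation; the function name and the in-memory dictionary used as a cache are assumptions made for this example.

```python
import os

def changed_modules(paths, cache):
    """Return the paths whose mtime differs from the cached one.

    `cache` maps path -> last seen mtime in nanoseconds and is updated in place,
    so a second call with unchanged files returns an empty list.
    """
    changed = []
    for path in paths:
        mtime = os.stat(path).st_mtime_ns
        if cache.get(path) != mtime:
            changed.append(path)
            cache[path] = mtime
    return changed
```

On a cold run the cache is empty, so every module is reported as changed; on later runs only touched files come back, which is consistent with the large gap between the first and second timings above.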
If anyone wants to try this locally, you need to compile GHC first for the build system to generate header files needed for parsing some modules.
My workflow is to download some common dependencies and index them together with my project, so that names are resolved in the dependencies as well. The dependencies are static and are typically indexed only once, but when they’re not cached the delay matters to me, so I use fast-tags.
The raw speed numbers that don’t include file collection are:
$ hasktags --ctags -o tags.hasktags STDIN +RTS -s <files.txt
...
Total time 8.831s ( 8.865s elapsed)
$ fast-tags -o tags.fasttags - +RTS -s -N <files.txt
...
Total time 10.805s ( 1.268s elapsed)
$ ghc-tags -c -o tags.ghctags +RTS -s <files.txt
...
Total time 30.259s ( 6.278s elapsed)
Overall, fast-tags still seems to be the fastest in wall-clock terms (1.268s elapsed, against 6.278s for ghc-tags and 8.865s for hasktags). NB: it also indexes hsc, Alex and Happy files.
Regarding precision, there were comparisons stating that hasktags collects the least information. That report assumed that all tag generators produce the same output format, but that’s not the case: hasktags generates Emacs-style tags by default, fast-tags generates vim-style tags, and ghc-tags has no default and forces you to choose. Comparing the size of Emacs-style tags against vim-style ones is not right. Different formats also take different amounts of time to generate.
I compared vim-style tag generation; of the three, fast-tags produces the fewest entities:
However, hasktags' output contains the least information. For instance, it doesn’t output the tag type by default (e.g. whether it’s a constructor, function, type, etc.), while the others do. Maybe I missed the option to enable it. Example output:
$ grep -F "Key'F14" tags.*
tags.fasttags:26086:Key'F14 all-packages/GLFW-b-3.3.9.0/Graphics/UI/GLFW/Types.hs 358;" C
tags.ghctags:41912:Key'F14 all-packages/GLFW-b-3.3.9.0/Graphics/UI/GLFW/Types.hs 358;" c term:Key'F14
tags.hasktags:47312:Key'F14 ./all-packages/GLFW-b-3.3.9.0/Graphics/UI/GLFW/Types.hs 358
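To compare generators by number of entries and tag kinds rather than by file size, a small script along these lines may help. It is a rough sketch that assumes tab-separated vim-style tag lines with line-number addresses, as in the sample output above; tag files that use search patterns as addresses would need a more careful parser. The function names are made up for this example.

```python
from collections import Counter

def parse_tag_line(line):
    """Split a vim-style tag line into (name, path, kind); kind may be None.

    Returns None for pseudo-tag header lines and for malformed lines.
    """
    if line.startswith("!_TAG_"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return None
    name, path = fields[0], fields[1]
    kind = None
    # Extension fields follow the address; a field without ":" is a bare kind.
    for field in fields[3:]:
        if ":" not in field:
            kind = field
            break
        if field.startswith("kind:"):
            kind = field[len("kind:"):]
            break
    return name, path, kind

def summarize(lines):
    """Return (number of entries, Counter of kinds) for an iterable of tag lines."""
    entries = [e for e in map(parse_tag_line, lines) if e is not None]
    kinds = Counter(kind for _, _, kind in entries)
    return len(entries), kinds
```

Calling `summarize(open("tags.fasttags"))` for each tags file gives counts that can be compared directly; entries counted under `None` are tags emitted without a kind, as in the hasktags line above.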