Hello everyone!
Many programming language ecosystems today have a system for tracking known security vulnerabilities in packages. These vulnerability databases are used to provide a variety of services, including:
- Build tools that can warn if dependencies contain known vulnerabilities
- Source code hosting sites that notify maintainers when vulnerabilities become known in their packages (cf. GitHub’s Dependabot)
- Package indexes that warn people browsing a package about known vulnerabilities
These features are directly useful for developers. They’re also useful for organizations who need to live up to various standards like ISO 27001, and having these features can make the difference between being allowed to use Haskell on a project or not.
I think it would be good to have these things for Haskell - thus, I plan to submit an HFTP.
We’ve been having some discussions with GitHub about what it would take to get Dependabot working with Haskell. I’d like to focus on the part of the problem that enables their work first, while making sure that we have a path forward to the above points 1 and 3 as well. This seems like a good way to make our work useful as soon as possible.
To enable GitHub to set up Dependabot for Haskell, we basically just need to provide them with a data source for vulnerabilities. The most common way to do this is to have a repository full of files in some format that the language community decides on, with vulnerability reports added through a standard Git workflow. The language community supplies the data and a description of the format, and GitHub will take care of getting it imported. GitHub also has its own vulnerability database and process, and we’d need to write an importer that talks to their GraphQL API if we want their reports in our repository. We also need to tell them how to find dependencies - they can at least look in a `.cabal` file, but should we also point them at checked-in freeze files?
I would like to propose that we basically do what Rust does here, except when there are specific reasons not to. Here are some details about their approach:
- Their vulnerability report format consists of a TOML header followed by Markdown text. The header specifies metadata such as the package name and the affected versions, while the free text at the bottom is for a description of the vulnerability itself.
- There is a specific committee in charge of merging advisory PRs to the database.
- The contents of the advisory database are CC0 (public domain in jurisdictions that support it).
I think that we should make the following changes to what Rust does:
- Swap out Cargo’s versioning scheme for Hackage’s (specifically, I’d say that affected version ranges use the same syntax and semantics as `.cabal` files)
- Rust’s format describes which versions are not affected. This strikes me as more difficult to combine with everything else than describing those that are, so I think we should list the affected versions instead.
- We should review the categories of vulnerabilities - I think that we may need to swap some out. That will be part of the detailed proposal, but I would appreciate feedback on this here and now.
- The Rust format may optionally contain pointers to the affected functions. I think we should additionally allow datatypes there, for cases of things like hash collision attacks.
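To make the shape of these changes a bit more concrete, here is a minimal sketch of what an advisory record could look like on the Haskell side. Everything in it is an assumption for illustration: the type and field names (`Advisory`, `Affected`, `VersionRange`, `isAffected`), the package name, and the identifier are all made up, and the tiny `VersionRange` type only stands in for real `.cabal` range syntax. It uses `Data.Version` from `base` so that it stays self-contained.

```haskell
import Data.Version (Version, makeVersion)

-- All names in this sketch are hypothetical; none come from an existing library.

-- | A simplified stand-in for .cabal version ranges.
data VersionRange
  = AnyVersion
  | ThisVersion Version                  -- ==v
  | EarlierVersion Version               -- <v
  | OrLaterVersion Version               -- >=v
  | Union VersionRange VersionRange      -- r1 || r2
  | Intersect VersionRange VersionRange  -- r1 && r2

-- | Does a version fall inside a range?
withinRange :: Version -> VersionRange -> Bool
withinRange _ AnyVersion         = True
withinRange v (ThisVersion w)    = v == w
withinRange v (EarlierVersion w) = v < w
withinRange v (OrLaterVersion w) = v >= w
withinRange v (Union a b)        = withinRange v a || withinRange v b
withinRange v (Intersect a b)    = withinRange v a && withinRange v b

-- | An affected declaration: a function or, as suggested above, a datatype.
data Affected
  = AffectedFunction String
  | AffectedType String
  deriving (Eq, Show)

-- | Header metadata plus the free-text description.
data Advisory = Advisory
  { advisoryId       :: String
  , advisoryPackage  :: String
  , affectedVersions :: VersionRange  -- the versions that ARE affected
  , affectedDecls    :: [Affected]
  , description      :: String        -- Markdown body
  }

-- | Is this particular package version covered by the advisory?
isAffected :: Advisory -> String -> Version -> Bool
isAffected adv pkg v =
  advisoryPackage adv == pkg && withinRange v (affectedVersions adv)

-- | A made-up advisory: affected versions are >=1.2 && <1.4.1.
exampleAdvisory :: Advisory
exampleAdvisory = Advisory
  { advisoryId       = "EXAMPLE-0001"
  , advisoryPackage  = "example-hashmap"
  , affectedVersions = Intersect (OrLaterVersion (makeVersion [1, 2]))
                                 (EarlierVersion (makeVersion [1, 4, 1]))
  , affectedDecls    = [AffectedType "Example.HashMap.HashMap"]
  , description      = "Hash collisions can degrade lookups to linear time."
  }
```

With this shape, checking a dependency is a single membership test against the affected range, which is part of why listing affected versions seems easier to work with than listing unaffected ones.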
Note that the vulnerability database is intended to be used after the vulnerability has been fixed. Anyone may file a vulnerability alert, but we should encourage notifying maintainers first, of course.
Governance is also an important issue here. I’d hope that the database can be administered by representatives from trusted community infrastructure providers, such as Hackage trustees, Stackage, a Core Libraries Committee delegate, and GHC developers.
I’ve posted on the Rust Zulip instance and asked for their reflections about their setup, and I’ll summarize their replies either here or on the HFTP, depending on timing.
Some things that this could enable, in the long run, that I do not think should be part of the first iteration:
- Auditing of transitive dependencies by `stack` and `cabal` for known vulnerabilities
- Warnings about known vulnerabilities on the Web interfaces to Hackage and Stackage
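As a rough illustration of the auditing idea, the sketch below matches a resolved set of dependencies against a list of advisories. It is not how `stack` or `cabal` work today, and every name in it (`Advisory`, `affects`, `audit`, `renderFinding`, the sample packages and identifier) is hypothetical. The advisory is reduced to an identifier, a package name, and a predicate on versions, so that the audit itself is the only thing on display.

```haskell
import Data.Version (Version, makeVersion, showVersion)

-- Hypothetical, minimal view of an advisory: an identifier, a package name,
-- and a predicate deciding whether a given version is affected.
data Advisory = Advisory
  { advisoryId      :: String
  , advisoryPackage :: String
  , affects         :: Version -> Bool
  }

-- | One resolved dependency, for example as pinned in a freeze file.
type Dependency = (String, Version)

-- | Pair every dependency with the identifier of each advisory that applies to it.
audit :: [Advisory] -> [Dependency] -> [(Dependency, String)]
audit advisories deps =
  [ (dep, advisoryId adv)
  | dep@(name, version) <- deps
  , adv <- advisories
  , advisoryPackage adv == name
  , affects adv version
  ]

-- | Render a finding as a one-line warning.
renderFinding :: (Dependency, String) -> String
renderFinding ((name, version), ident) =
  name ++ "-" ++ showVersion version ++ " is affected by " ++ ident

-- Made-up sample data: versions of example-parser before 1.0 are affected.
sampleAdvisories :: [Advisory]
sampleAdvisories =
  [ Advisory "EXAMPLE-0002" "example-parser" (< makeVersion [1, 0]) ]

sampleDeps :: [Dependency]
sampleDeps =
  [ ("example-parser", makeVersion [0, 9, 2])
  , ("example-logger", makeVersion [2, 1])
  ]
```

Because the input is the full resolved dependency set rather than the direct dependencies listed in a `.cabal` file, transitive dependencies are covered without any extra work in the audit itself.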
What do you all think?