Things are moving pretty fast in the world of LLMs and we’re all trying to navigate our way through it. In open source we wonder how projects should deal with LLM-based code contributions, and whether they should have strict policies about them. In industrial code-bases we don’t have the random drive-by LLM contribution problem, but we do have issues around trust/correctness.[1]
The Haskell Foundation is considering running an online 1-day workshop on how companies are managing/navigating their use of LLMs and how open source maintainers view LLM contributions.
Ideally this would be a quick turnaround, hosted sometime mid-June (likely June 26th), as there’s lots of hunger for folks to learn from each other.
If you would like to give a talk at this event (it is online, so speakers can be anywhere), please send a short 1-2 paragraph talk proposal to jmct@haskell.foundation by end of day May 19th (anywhere in the world).
Potential topics include but are not limited to:
- Best practices around testing LLM-generated code
- Creative use of types in enforcing LLM conformance
- Prompting that helps with getting ‘good Haskell’
- Observed/experienced pitfalls with LLM-generated Haskell code
Talks can be ‘short’ (30 min. including Q+A) or ‘long’ (1hr including Q+A). Please state whether you’re proposing a short or long talk when you email me.
The idea here is that we learn from each other and figure out, as a community, how to best leverage these tools, whether we’re forced to by company policy or whether we’ve found the tools useful. We know that the use of these tools is controversial. A talk proposal does not have to be ‘pro’ LLM, but we do want it to be useful and actionable.
[1] This problem isn’t unique to industrial code-bases, but what used to be “we trust our team” has now become “do we trust our LLM?”.
It seems like an invitation for perspectives. Going by the bullet list, it seems like there’s an opportunity to show cases where LLM use has been less than stellar. I’m willing to see where it leads, and if there are compelling cases where LLMs have successfully demonstrated value, then I’m willing to invest my time watching a recording of any interesting topics.
The Haskell Foundation does not currently have an official position on this topic. There’s been no board vote on LLMs and/or their use.
Here’s what I do know:
- Many of our sponsors have found the use of LLMs in their companies to be useful and worthwhile, and are hoping to learn from each other.
- Many of the open source projects in the Haskell ecosystem are currently either setting policies around the use of LLMs for contributions or considering doing so.
Part of the motivation for a quick turnaround is that it seems like a useful time to hear various perspectives and learn from each other. Yes, this means that it’s likely many talks will be about how to use LLMs for Haskell more effectively. But if you’re a maintainer and are currently frustrated with LLM-generated contributions, please also submit a talk proposal! In the same way we wouldn’t want a talk that’s just “LLMs are the future” with no explanation, we wouldn’t want a talk that’s just a rant with no examples. The point of workshops is to learn from each other and prompt discussion. The Haskell Foundation wants to be a body that helps facilitate these discussions.
As @Jappie states, the topics written by @jaror would be in scope.
No shorter than 30 min, and up to an hour (both of these would include Q+A time). I’ll edit the post to say that any proposal should include its proposed length.
I’ve been thinking of creating a presentation about why LLMs are harmful for the open source ecosystem. But I feel this requires a lot of research into experience reports from projects and going through papers. I’m not sure it fits the format of this event, and it’s a bit short notice as well.
I think it’s the same as it would be for any other conference focussed on a tool or methodology. Some variant of a “Why Formal Methods Are Bad” talk could absolutely be welcome at a formal methods conference, provided it was thoughtful and constructive. But you probably wouldn’t want too much of that, since the event probably selects for people who are more interested in how to apply the tool.
This highlights my problem with this call for proposals. The foundation (not the “event”) is selecting for a particular audience: people who want to use AI (i.e. people who are pro-AI). If the foundation is not pro-AI then why is it only hosting a pro-AI event and not (also) an event aimed at trying to investigate and reduce the harms of AI? Or just hosting one, more neutral, event?
I should also make it clear that I don’t think the Haskell Foundation has some kind of hidden agenda to push AI on the community. Instead, I mainly want to point out tacit, perhaps unintended, implications of this announcement.
Hmm, I’m not sure I accept the framing, but I see what you’re saying.
Ultimately, the reason why we are planning this event is that people asked for an event like this. This includes sponsors and maintainers who are trying to decide how they should proceed.
I can’t promise that every proposal will be accepted, and I can’t even promise that we’ll get enough proposals for the event to happen, but I can promise that every proposal will be considered.
People only need to follow my Twitter account to understand my personal biases in the matter.
Based on feedback I’ve received about the short turnaround and the likely date (a US holiday), I’ve pushed back the proposal deadline and the planned date by a week. They are May 19th and June 26th, respectively.
For various reasons I have quite a lot of anecdotal data about LLM-generated Haskell. Are experience reports a good fit for this event? Or do you expect more rigorous content?
Sounds like you’re “starting with a conclusion” here? If you haven’t done the research yet, you should probably start with a question such as “what are the effects of LLMs on the open source ecosystem?”, and then try to keep your thumb off the scale, as it were.
Should it be a company-oriented talk, or a production grade/IC-oriented talk?
To be more specific: in my company, we have incentives to use LLMs, but each of us has a personal way of using them, which may not fit management’s idea of LLM usage.