[DataHaskell] first release of beam-duckdb

The beam maintainers are happy to announce the release of beam-duckdb, a beam backend for, well, DuckDB.

The idea of beam-duckdb is to help power data science workflows, under the wing of dataHaskell. As such, please read the release announcement over on the dataHaskell blog.

DuckDB has a lot of features, only a few of which are modeled in beam-duckdb right now. Do not hesitate to raise issues if there’s some functionality you’d like!

I should mention that this project was started at AmeriHac!

I’m just popping back in here to mention that the latest release, 0.3.1.0, adds support for the COPY statement, among other things.

This means that you can write a Haskell program to:

  • import data from a Parquet file;
  • process data using beam as always;
  • write data back to a Parquet file.

The beam documentation has been updated to show how to use this.
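
For anyone who has not used DuckDB's COPY before, here is a rough sketch of the SQL that sits underneath the first and last steps. To be clear, this is not the beam-duckdb API (the beam documentation is the place to look for that). It is only a small base-only Haskell program that prints plain DuckDB COPY statements, and the table name `trips`, the query, and the file names are made up for illustration.

```haskell
module Main (main) where

-- NOTE: illustration only. This does NOT use beam-duckdb; it just renders
-- the DuckDB SQL that a Parquet round trip boils down to. The table name,
-- query, and file names below are hypothetical.

-- | Wrap a string in single quotes for use as a SQL string literal,
-- doubling any embedded single quotes.
quoteLiteral :: String -> String
quoteLiteral s = "'" ++ concatMap escape s ++ "'"
  where
    escape '\'' = "''"
    escape c    = [c]

-- | Statement that loads a Parquet file into an existing table.
copyFromParquet :: String -> FilePath -> String
copyFromParquet table path =
  "COPY " ++ table ++ " FROM " ++ quoteLiteral path ++ " (FORMAT parquet);"

-- | Statement that writes the result of a query to a Parquet file.
copyToParquet :: String -> FilePath -> String
copyToParquet query path =
  "COPY (" ++ query ++ ") TO " ++ quoteLiteral path ++ " (FORMAT parquet);"

main :: IO ()
main = mapM_ putStrLn
  [ copyFromParquet "trips" "trips.parquet"
  , copyToParquet
      "SELECT city, count(*) AS n FROM trips GROUP BY city"
      "summary.parquet"
  ]
```

With beam-duckdb the middle step (the query) would be written with beam's usual query combinators rather than as a SQL string, which is the whole point of the backend.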
