[Initial feedback request] DataFrame library

I’m exploring the design space and wanted to try creating a dataframe library meant for exploratory data analysis. That is, cases where you don’t know the shape of the data beforehand and want to load it quickly and answer fairly basic questions.

Please let me know what you think of this direction and maybe clue me in on some existing tools in case I’m duplicating work.


The example does a very good job of introducing what this library is all about; the Haddocks less so.

I don’t think exploratory data analysis can be done without dynamic typing, so I welcome this!


Thanks. I’m going to ask non-Haskell people to try it out to see if it fulfills the goal of being easy to use.


If it targets non-Haskell people, would you consider building a wrapper for something like Python?

Thank you for sharing!

I do have some pointers for you on similar work:

  • frames : row-oriented, the first real Haskell dataframes library, with compile-time affordances for inferring column types.
  • colonnade : column-oriented, with adapters to some standard data formats and a nice way to talk about table headers.
  • heidi : row-oriented, using Generics (specifically, generics-sop) to extract the frame structure from regular Haskell algebraic types.

Like Polars, I’d like to eventually get parallelism gains from using Haskell. It might be worth wrapping Python for charting, though I noticed Haskell-chart already does this.

I’ve seen and worked with frames. I’m trying really hard to steer away from the Template Haskell approach and see how ergonomic I can make the API.

Colonnade seems great; I’ll look into it more.

Re Heidi: aren’t row-based representations more awkward to query?
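
To make the question concrete, here is a minimal sketch of what I mean. None of these types or functions come from heidi, frames, or colonnade; `Value`, `RowFrame`, `ColFrame`, `sumRows`, and `sumCols` are made-up names for illustration, and the only dependency assumed is `Data.Map.Strict` from containers. It sums one integer column in both layouts:

```haskell
import qualified Data.Map.Strict as M

-- Hypothetical cell type, for illustration only.
data Value = IntV Int | TextV String
  deriving (Eq, Show)

-- Row-oriented: one map of field name to value per record.
type RowFrame = [M.Map String Value]

-- Column-oriented: one list of values per column name.
type ColFrame = M.Map String [Value]

-- Row layout: the field is looked up again in every record.
sumRows :: String -> RowFrame -> Int
sumRows name rows =
  sum [n | row <- rows, Just (IntV n) <- [M.lookup name row]]

-- Column layout: one lookup, then a scan over that column's values.
sumCols :: String -> ColFrame -> Int
sumCols name cols =
  sum [n | IntV n <- M.findWithDefault [] name cols]

-- The same two-record table in both layouts.
rowData :: RowFrame
rowData =
  [ M.fromList [("name", TextV "a"), ("age", IntV 30)]
  , M.fromList [("name", TextV "b"), ("age", IntV 12)]
  ]

colData :: ColFrame
colData = M.fromList
  [ ("name", [TextV "a", TextV "b"])
  , ("age",  [IntV 30, IntV 12])
  ]

main :: IO ()
main = print (sumRows "age" rowData, sumCols "age" colData)
```

Both give the same answer, but the row version repeats the field lookup per record, which is the awkwardness I’m asking about for column-wise queries.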

Yeah, I started on Heidi knowing very little about DB internals :smiley: It’s mostly a proof of concept for the user-facing API.