The Dataframe project is great; I could have used it many times.
It’s also great that Dataframe puts a spotlight on the gulf between quick, routine exploratory data analysis (EDA) and typical rigorous Haskell engineering, where strong static typing is vitally important. This raises interesting questions about the product concept and roadmap.
Given that Dataframe types are dynamic and column names are strings, the library is philosophically much closer to Python’s pandas and Polars and to R’s tidyverse than to traditional Haskell. With that structural foundation, the Dataframe package has two obvious paths going forward:
Path 1) Continue in the dynamic direction, expanding functionality to compete directly with Python and R.
Path 2) Provide a bridge between a lightweight dynamic style and traditional Haskell strong static typing.
Path 1 is a big project, simply because the depth and breadth of other EDA platforms define the baseline, and therefore community expectations. For example, matplotlib and seaborn are flexible and define the cultural norm for plotting, and building analogous Haskell tools would need substantial engineering resources.
Next, suppose that a full-featured Dataframe is wildly successful. The result would essentially be a large EDSL that doesn’t share Haskell’s strengths, value propositions, trajectory, or culture. That’s not a bad thing, but it’s not Haskell.
Path 2 is another thing entirely. It would be straightforward to write adaptors between the Dataframe library and traditional Haskell packages like aeson, but there are bigger questions around the UX and around exactly which problems the bridge is solving.
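To make the Path 2 idea concrete, here is a minimal sketch of what a bridge could look like. It does not use the Dataframe API at all: DynRow is a hypothetical stand-in for a dynamically typed row with string column names, built on Data.Dynamic from base, and Person, column, and toPerson are names invented for illustration. The point is only that the runtime checks can be confined to one conversion step, with ordinary static typing on the far side.

```haskell
import Data.Dynamic (Dynamic, fromDynamic, toDyn)
import Data.Typeable (Typeable)

-- Hypothetical stand-in for a dynamically typed row: column names are
-- strings, and the type of each cell is only known at runtime.
-- (This is NOT the Dataframe library's representation.)
type DynRow = [(String, Dynamic)]

-- The statically typed record on the "traditional Haskell" side.
data Person = Person
  { name :: String
  , age  :: Int
  } deriving (Show, Eq)

-- Look up a column by name and check its type at runtime.
column :: Typeable a => String -> DynRow -> Either String a
column key row =
  case lookup key row of
    Nothing  -> Left ("missing column: " ++ key)
    Just dyn ->
      case fromDynamic dyn of
        Nothing -> Left ("unexpected type in column: " ++ key)
        Just v  -> Right v

-- The bridge: every runtime failure is reported at this boundary.
toPerson :: DynRow -> Either String Person
toPerson row = Person <$> column "name" row <*> column "age" row

goodRow, badRow :: DynRow
goodRow = [("name", toDyn "Ada"), ("age", toDyn (36 :: Int))]
badRow  = [("name", toDyn "Bob"), ("age", toDyn "forty")]

main :: IO ()
main = do
  print (toPerson goodRow)  -- Right (Person {name = "Ada", age = 36})
  print (toPerson badRow)   -- Left "unexpected type in column: age"
```

Writing this kind of adaptor by hand is the easy part. Whether users want to write it per record, have it derived, or never see it at all is the UX question.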
Not easy questions, but fun to consider!