Hitting performance issues with text maniplulation

Hi all, I’m very new to Haskell and I started writing a little command line program to test the waters.

It takes in a text file and rotates each character in the alphabet according to the Fibonacci sequence. It started unbearably slow as I calculated the sequence recursively for each Char, I got significant improvements by passing the last two values forward and then another small improvement switching from String to Text.

But I’ve hit a bit of an impasse I’m not sure where to start looking to make improvements. I arbitrarily expected to be able to convert my 1000 line / 80 col test file in under a second but I’m stuck around 5.5s

The majority of the code is in this file:

and I created a profile here:

I did try writing in JS out of curiosity and noticed I started hitting Infinity in my fib sequence, could it be an issue with adding very large numbers together? Any pointers would be greatly appreciated!

3 Likes

The docs for Data.Text.snoc say that it is O(n). Could that be (part of) the problem?

3 Likes

A good alternative to snoccing text is to use a text builder, for example @Bodigrim’s excellent linear-builder.

For the best performance you should use the linear Buffer API directly: Data.Text.Builder.Linear.Buffer. That way you can directly write into a mutable buffer avoiding most memory allocation (except for growing the buffer itself).

Edit: Although, I’m not sure if you can use it linearly inside the Text.foldl function.

2 Likes

That’s a powerful general solution. For this small example, which is all about accumulating single characters, I think I’d suggest consing onto a reversed list, and then unreversing at the end.

2 Likes

Thank you for your speedy and helpful replies, I had missed the bottleneck of snoc so reverting back to a String using : and then reversing at the end was the way, Instantly down to 0.1 seconds!

I’ll investigate linear-builder next as this is a project for learning new things.

8 Likes

Great! Hope you enjoy your Haskell journey.

1 Like

If one used unsafeCoerce to get around the linearity mismatch (evil, I know) would that affect compiler optimizations?