Hitting performance issues with text maniplulation

Hi all, I’m very new to Haskell and I started writing a little command line program to test the waters.

It takes in a text file and rotates each character in the alphabet according to the Fibonacci sequence. It started unbearably slow as I calculated the sequence recursively for each Char, I got significant improvements by passing the last two values forward and then another small improvement switching from String to Text.

But I’ve hit a bit of an impasse I’m not sure where to start looking to make improvements. I arbitrarily expected to be able to convert my 1000 line / 80 col test file in under a second but I’m stuck around 5.5s

The majority of the code is in this file:

and I created a profile here:

I did try writing in JS out of curiosity and noticed I started hitting Infinity in my fib sequence, could it be an issue with adding very large numbers together? Any pointers would be greatly appreciated!

The docs for Data.Text.snoc say that it is O(n). Could that be (part of) the problem?

A good alternative to snoccing text is to use a text builder, for example @Bodigrim’s excellent linear-builder.

For the best performance you should use the linear Buffer API directly: Data.Text.Builder.Linear.Buffer. That way you can directly write into a mutable buffer avoiding most memory allocation (except for growing the buffer itself).

Edit: Although, I’m not sure if you can use it linearly inside the Text.foldl function.

That’s a powerful general solution. For this small example, which is all about accumulating single characters, I think I’d suggest consing onto a reversed list, and then unreversing at the end.

Thank you for your speedy and helpful replies, I had missed the bottleneck of snoc so reverting back to a String using : and then reversing at the end was the way, Instantly down to 0.1 seconds!

I’ll investigate linear-builder next as this is a project for learning new things.

Great! Hope you enjoy your Haskell journey.

If one used unsafeCoerce to get around the linearity mismatch (evil, I know) would that affect compiler optimizations?