How to do text replacements with regex-tdfa

regex-tdfa does not provide regex replacement out of the box. What is an easy way to do it?

I found

 "haystack" =~ "regex" :: (String, String, String)

and suppose I could use that, but I’m worried about how well it performs when the text is always split into these three strings for each match.
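For reference, a minimal sketch of what that loop could look like on String. The name `replaceAll` is made up for illustration and is not part of regex-tdfa. Because `=~` is given the pattern as a String, the regex is recompiled for every match. Each step also searches only the remaining text, so an anchor such as `^` would match at the start of every remainder.

```haskell
import Text.Regex.TDFA ((=~))

-- | Hypothetical helper (not provided by regex-tdfa): replace every
-- non-overlapping match of the pattern, scanning left to right.
replaceAll :: String -> String -> String -> String
replaceAll pat rep s =
  case s =~ pat :: (String, String, String) of
    -- An empty middle component means "no match" or "empty match";
    -- stop in both cases so the recursion cannot loop forever.
    (_, "", _) -> s
    -- Keep the text before the match, substitute, continue on the rest.
    (before, _, after) -> before ++ rep ++ replaceAll pat rep after
```

For example, `replaceAll "[0-9]+" "N" "a1b22c"` evaluates to `"aNbNc"`.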

Is there another, more natural way to do replacements with regex-tdfa?

What happens if you have overlapping matches? That’s possible with regular expressions, no?

One can enumerate all the MatchArrays in one go and then use the indices to slice the text up and put the replacement in between (assuming there is no overlap).
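A rough sketch of that index-based approach, assuming a plain String haystack and a fixed replacement string. The name `replaceViaOffsets` is hypothetical. The pattern is compiled once with `makeRegex`, and `matchAll` returns the non-overlapping matches as a list of MatchArray values.

```haskell
import Data.Array ((!))
import Text.Regex.TDFA (Regex, makeRegex, matchAll)

-- | Hypothetical helper: collect all match offsets first, then rebuild
-- the text in a single left-to-right pass.
replaceViaOffsets :: String -> String -> String -> String
replaceViaOffsets pat rep s = go 0 s spans
  where
    regex = makeRegex pat :: Regex
    -- Element 0 of each MatchArray is the (offset, length) of the whole match.
    spans = [m ! 0 | m <- matchAll regex s]

    -- pos is the absolute offset at which rest begins in the original text.
    go _ rest [] = rest
    go pos rest ((off, len) : more) =
      let (before, fromMatch) = splitAt (off - pos) rest
      in before ++ rep ++ go (off + len) (drop len fromMatch) more
```

With these definitions, `replaceViaOffsets "[0-9]+" "N" "a1b22c"` gives `"aNbNc"`. Unlike a loop that searches the remainder again after each match, the regex here runs over the original text only once.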

Good point! I don’t think I have overlaps in my current use case though.

Good idea. Thanks heaps.

The documentation for regex-tdfa says that it doesn’t support find-and-replace, so there is no official way to do replacement. So maybe there isn’t any method other than the ways mentioned so far.

But is there really a reason to be worried about the performance cost of splitting strings? I don’t think splitting a string in Haskell requires copying. At least, it shouldn’t require copying the tail end of the string, which is what would be expensive if you’re iterating over a long text to find multiple non-overlapping matches. (Although people say that the only way to really know about performance characteristics is to run the code and measure them.)

Also, if you’re concerned about performance, have you considered using Text instead of String? Text is often said to be faster and more memory-efficient, and internally it uses indexing to make substrings, so splitting Text should be efficient too.

I’m using Text, so good to know that splitting shouldn’t be an issue.

However, maybe I can avoid this whole workaround: I’m now thinking of switching to a regex library that provides replacement directly. I might try pcre-heavy.

There is the lesser-known regex library, which does provide replacements on top of regex-tdfa.

Also there’s a regex-tdfa-powered replace here.

Wow, looks like there are quite a few options out there…

Thanks for the feedback!

Here is a way to do text replacement with parsers instead of regex: Replace.Megaparsec.streamEdit.

I find that Megaparsec parsers are nicer to work with than regex for text replacement.

You should be aware that Text.Regex.PCRE.Heavy.gsub works well for a single replacement, but it is very slow for multiple replacements.

Benchmarks

Incredible benchmarks! Pretty sad for all Regexes in Haskell…
