Re routing: A path-element based trie is a pretty obvious optimization, and shouldn’t be difficult to implement. You would just need a “helper” middleware that takes a dictionary mapping “subdirectories” to sub-handlers, inspects the request to pop off the first path element, looks up the matching sub-handler (serving a 404 if there isn’t one), and delegates to it. You can then nest those dispatchers to whatever tree depth your routes need. I’ve done this against plain WAI before, and it works fine. The trickiest part is deciding how to do the “popping”. One option is to actually modify the request, so that for each nested sub-handler, the first element in the path it processes is the one it cares about. The advantage is that the sub-handler is blissfully unaware of the part of the route it doesn’t need to care about; the downside is that if you’re serving something that needs to link to other resources, any absolute links you generate from the “current path” the handler sees will be wrong, because that path no longer matches the one in the actual HTTP request. Alternatively, you can add the handler’s local path to its configuration and make it skip that prefix when processing a request; the downside there is that it makes the setup a bit awkward.
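To make that concrete, here is a minimal sketch of the “modify the request” variant against plain WAI. The names `dispatch`, `leaf` and `app` are made up for illustration and are not part of any library; the only library calls are `pathInfo` and `responseLBS` from `wai` and the status values from `http-types`.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.ByteString.Lazy as LBS
import qualified Data.Map.Strict as Map
import Data.Text (Text)
import Network.HTTP.Types (status200, status404)
import Network.Wai (Application, pathInfo, responseLBS)

-- Hypothetical helper: dispatch on the first path element.
-- The Map is one level of the trie; nesting 'dispatch' gives deeper levels.
dispatch :: Map.Map Text Application -> Application
dispatch table req respond =
  case pathInfo req of
    (segment : rest)
      | Just sub <- Map.lookup segment table ->
          -- "Pop" the matched element, so the sub-handler only sees
          -- the part of the path that it cares about.
          sub req { pathInfo = rest } respond
    -- Empty path, or no sub-handler registered for this element.
    _ -> respond (responseLBS status404 [] "Not found")

-- Hypothetical leaf handler that ignores whatever path is left over.
leaf :: LBS.ByteString -> Application
leaf body _req respond =
  respond (responseLBS status200 [("Content-Type", "text/plain")] body)

-- Example route tree: /api/users, /api/posts and /about.
app :: Application
app =
  dispatch $
    Map.fromList
      [ ("api", dispatch (Map.fromList [("users", leaf "users"), ("posts", leaf "posts")]))
      , ("about", leaf "about")
      ]
```

You can serve `app` with Warp’s `run`, for example. Note that this only rewrites `pathInfo`; `rawPathInfo` is a separate field and still carries the path as originally requested, which is one possible way around the absolute-links problem.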
Re caching: you can just write a caching middleware and run your WAI application behind it for full-response caching, and this should give you the same kind of performance that you’d get from any other web framework’s full-response caching. IIRC, there are ready-to-use packages on Hackage for that, but don’t quote me on that. If you want to cache fragments, then you will need a solution specific to whatever HTML rendering system you use anyway, but whatever that is, integrating it with WAI (or Twain, or Scotty, or whatever you end up using) should be fairly straightforward.
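For the full-response case, a hand-rolled middleware could look roughly like the sketch below. This is not the API of any Hackage package; `Cache`, `newCache` and `cachingMiddleware` are names I made up. It assumes that caching only `GET` requests with a 200 status, keyed on raw path plus query string, is good enough, and it deliberately ignores expiry, `Cache-Control`, `Vary`, and racing concurrent misses. It also buffers every response body in memory.

```haskell
import Control.Monad (when)
import Data.ByteString (ByteString)
import Data.ByteString.Builder (toLazyByteString)
import qualified Data.ByteString.Lazy as LBS
import Data.IORef
import qualified Data.Map.Strict as Map
import Network.HTTP.Types (ResponseHeaders, Status, methodGet, status200)
import Network.Wai

-- Hypothetical in-memory cache: request key -> stored response parts.
type Cache = IORef (Map.Map ByteString (Status, ResponseHeaders, LBS.ByteString))

newCache :: IO Cache
newCache = newIORef Map.empty

-- Hypothetical full-response caching middleware (no expiry, no size limit).
cachingMiddleware :: Cache -> Middleware
cachingMiddleware cache app req respond
  -- Assumption: only GET requests are safe to cache.
  | requestMethod req /= methodGet = app req respond
  | otherwise = do
      hit <- Map.lookup key <$> readIORef cache
      case hit of
        -- Cache hit: answer without calling the wrapped application.
        Just (status, headers, body) ->
          respond (responseLBS status headers body)
        -- Cache miss: run the application and capture what it sends.
        Nothing -> app req $ \res -> do
          let (status, headers, withBody) = responseToStream res
          body <- withBody $ \stream -> do
            chunks <- newIORef mempty
            -- Collect every chunk; the flush action is a no-op here.
            stream (\chunk -> modifyIORef' chunks (<> chunk)) (pure ())
            toLazyByteString <$> readIORef chunks
          when (status == status200) $
            atomicModifyIORef' cache $ \m ->
              (Map.insert key (status, headers, body) m, ())
          respond (responseLBS status headers body)
  where
    key = rawPathInfo req <> rawQueryString req
```

You would create the cache once at startup with `newCache` and wrap your application as `cachingMiddleware cache app`. For anything beyond a toy you would at least want expiry and a size bound.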