I’ve wanted to do this particular kata for a long time and finally got to it.
There are some post hoc notes I’ve written down if you’re interested in reading a stream of grumpy remarks (with pictures of the intermediate steps!).
Thank you for your contribution. I enjoyed it completely. I love that you used massiv; I think it is an awesome library that demonstrates Haskell’s laziness allowing fusion and control over evaluation for parallel and concurrent programming. I’m not gonna lie, geomancy is really giving me some SIMD envy. This is a good example of Haskell’s numerical strengths. I hope to contribute in this way more in the future. I also hope to see more contributions from you.
Found out there’s a gotcha when adding multiple `-with-rtsopts` args in a package. Had to collect them into one pile of `- '"-with-rtsopts=-s -N -A64m"'` to avoid the repeated overwriting of it. Gotta check the rest of my packages for this…
This is GHC issue #18117, “It's not possible to specify multiple `-with-rtsopts` flags”; it might be helpful to chime in there.
The `interval` class is just a pair of numbers, with extra functions. I briefly dived into Hackage archives, but the solutions are way too complex for the task.
There is indeed a zoo of libraries for intervals on Hackage. As a maintainer of data-interval (interval datatype, interval arithmetic and interval-based containers), I’m curious if you can identify any possible improvements.
I ended up removing the Interval entirely. It was only needed to find the closest positive hit (if any).
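For readers wondering what "closest positive hit" looks like without an interval type, here is a minimal sketch. It is not the code from the repo: `closestPositive` and the epsilon argument are hypothetical names, and the candidate ray parameters are assumed to arrive as a plain list.

```haskell
import Data.List (foldl')

-- | Hypothetical helper: the smallest ray parameter strictly greater than
-- @eps@, or 'Nothing' when every candidate is behind (or at) the origin.
closestPositive :: Double -> [Double] -> Maybe Double
closestPositive eps = foldl' step Nothing
  where
    step best t
      | t <= eps = best -- not a usable hit, keep what we have
      | otherwise = case best of
          Nothing -> Just t
          Just b -> Just (min b t)
```

A single running `Maybe Double` carries everything the interval was used for here: the lower bound is the fixed epsilon and the upper bound is the best hit so far.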
The next book has a BVH building chapter and I’m going to take data-interval for a walk.
What’s the next book?
Next two books, actually…
“The Next Week” is finished too and the notes are up.
I’ll have some rest now and think if I want to do the final book in the series, delve into PBR book instead, or… do something else entirely (like generating that SPIR-V disassembler from the specs).
Use SIMD, do ssssttttuuuuppppiiiidddd tttthhhhiiiinnnnggggssss faster, with more energy efficiency
A long-delayed post-Zurihac update. The only primops used so far are the basics that the NCG got in 9.12. No special layouts and fancy primops as they will only appear in 9.16 (and then I’ll need some more). Just do the regular thing four lanes at once. It’s a good start… oh wai~
It got better since then. The models took a repo with half-staged commits and got it running without segfaults. They also told me that I was absolutely right, but my BVH code is stupid and could be much simpler and faster. And they also cooked up a benchmark harness, which I had been procrastinating on since the very beginning. I love getting faster, but hate writing benchmarks.
But one scene was quite resistant to the improvements, the finalSceneHigh from the 2nd book.
I wasn’t sure why and had to guess until I implemented a tiled scheduler. The abysmally slow tile was not where I expected it to be, but it persisted through most of the run at just 1/40 of the mean pixel rate. And also broke the tiled scheduler by leaving the last job working full steam, while the remaining 15 cores were idle.
I took a low-SPP run to measure the tiles and put the slowest first. Then I subdivided the slowest half. Then I subdivided the slowest quarter again. This mostly solved the idling cores, but the solution wasn’t satisfactory.
And the code was a mess. While refactoring it I thought that I could skip the tile/subtile distinction and the grid itself and just work with arbitrary rectangles. That also served one of the project’s long-standing goals - binding the renderer to a UI where I can select regions and get them rendered for debugging.
I still have no UI (or a job server to distribute the load over all of the household appliances around my house and my parents’ too), and no debugging. But after a few iterations the round limit and most of the manual size tweaking were gone as the system was optimizing itself by measuring and subdividing until the tiles are fast enough or too small.
The slow bunch is still slow, but the system now crushes it first and fills the scheduling gap at the end with the easier tiles.
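The measure-and-subdivide idea can be sketched roughly like this. It is an illustration under stated assumptions, not the scheduler from the repo: `Rect`, `splitRect` and `plan` are hypothetical names, and the `cost` argument is a pure stand-in for what would really be a timed low-SPP render of the rectangle.

```haskell
import Data.List (sortBy)
import Data.Ord (Down (..), comparing)

-- | Hypothetical pixel rectangle: origin and size.
data Rect = Rect { rx, ry, rw, rh :: Int }
  deriving (Eq, Show)

-- | Split a rectangle in two along its longer side.
splitRect :: Rect -> [Rect]
splitRect (Rect x y w h)
  | w >= h    = let l = w `div` 2 in [Rect x y l h, Rect (x + l) y (w - l) h]
  | otherwise = let t = h `div` 2 in [Rect x y w t, Rect x (y + t) w (h - t)]

-- | Subdivide until every rectangle is within the cost budget or cannot be
-- split without dropping below the minimum side; return the slowest first.
plan :: (Rect -> Double) -> Double -> Int -> Rect -> [Rect]
plan cost budget minSide root =
  map snd (sortBy (comparing (Down . fst)) (go root))
  where
    side = max 1 minSide -- guard against a zero or negative minimum
    go r
      | c <= budget || tooSmall r = [(c, r)]
      | otherwise = concatMap go (splitRect r)
      where
        c = cost r
    tooSmall (Rect _ _ w h) = max w h < 2 * side
```

Sorting by descending cost puts the expensive rectangles at the front of the queue, so the cheap ones are left to fill the scheduling gap at the end of the run.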
I even got back and recorded the historical renders to make a timeline mini-site: https://dpwiz.gitlab.io/rtow/
This is far from over, but I’m not dead-inside wrt this codebase anymore.
Give that package a prize or something: lucid-svg (DSL for SVG using lucid for HTML).
Seriously underrated for debugging stuff for cheap.