Sunday, July 8, 2012

Speed vs interactivity

Wherever rendering is going on, there is a lot of focus on speed. Where a render engine is being developed, even more so. While total render times are a big part of what makes or breaks a production, I’d like to address something I personally find even more important - interactivity.

People are more expensive than computers. It’s been said a thousand times in the context of rendering performance, but it’s a horse worth beating. Most render time is spent creating non-final images; at most places I’ve worked the ratio is substantial. Since rendering is expensive, effort spent lowering the number of times we render bad-looking frames is as well spent as effort spent lowering the time it takes to compute each of those frames in the first place. One of the first things to look at is the turnaround lookdev and lighting artists have on their work, which is oftentimes frustratingly slow - anywhere from minutes up to hours between making a technical change and seeing the result of that change in the form of an image. This landscape is slowly changing with the introduction of production-quality ray tracers in high-end CG pipelines, but it’s still not all fun and games.

Going back to writing a render engine, fast turnarounds can be hard to implement in practice, especially with the amount of setup we do before producing any pixels in order to decrease algorithmic complexity and increase sampling efficiency later on. I’m playing around with a few things to reduce the startup overhead of my interactive renders in Aurora. One of the beauties of unbiased rendering is that you can combine any number of results from various unbiased techniques and still be left with an unbiased result. This means that as soon as you’ve parsed the scene description, you can send one thread off to do all the heavy lifting that will make the total render time shorter - building acceleration structures, storing caches for textures and sampling, etc. - while another couple of threads suboptimally start rendering with the bare minimum of bells and whistles needed to get pixels in front of the user. Unbiased sampling efficiency knobs like roulette thresholds work the same way, and can start off low and be brought up later, once the user is happy with the initial result and hasn’t cancelled the render. Combined with an adaptive pixel sampling strategy, even the slowest of engines (*points at own source code*) can produce reasonable approximations of the final image in a very short time compared to a full render, all without introducing bias. Depending on how far you go, it could add to the total time it takes to converge to a final image, which is why this only makes sense for interactive renders, where the user is very likely to stop the render and make adjustments before it has fully converged.
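
To make the idea concrete, here is a minimal C++ sketch of deferring the acceleration structure build to a background thread while the render threads fall back to brute-force intersection in the meantime. This is only an illustration of the approach described above, not Aurora’s actual code - the types and functions (Bvh, buildBvh, bruteForceIntersect) are hypothetical placeholders.

```cpp
#include <atomic>
#include <future>
#include <utility>
#include <vector>

struct Ray { /* origin, direction, ... */ };
struct Hit { bool valid = false; /* distance, normal, ... */ };
struct Triangle { /* vertices, ... */ };

// Hypothetical placeholders for the two intersection paths.
struct Bvh {
    Hit intersect(const Ray&) const { return Hit{}; }   // fast path (stub)
};
Bvh buildBvh(const std::vector<Triangle>&) { return Bvh{}; }           // the heavy lifting
Hit bruteForceIntersect(const std::vector<Triangle>&, const Ray&) { return Hit{}; }

class InteractiveScene {
public:
    explicit InteractiveScene(std::vector<Triangle> tris)
        : triangles_(std::move(tris))
    {
        // Start the expensive build without blocking the first pixels.
        buildTask_ = std::async(std::launch::async, [this] {
            bvh_ = buildBvh(triangles_);
            bvhReady_.store(true, std::memory_order_release);
        });
    }

    // Render threads call this for every ray. Both branches return the same
    // intersection result, so samples taken before and after the BVH is
    // ready can simply be accumulated together without introducing bias.
    Hit intersect(const Ray& ray) const {
        if (bvhReady_.load(std::memory_order_acquire))
            return bvh_.intersect(ray);                  // fast once built
        return bruteForceIntersect(triangles_, ray);     // slow, but available now
    }

private:
    std::vector<Triangle> triangles_;
    Bvh bvh_;
    std::atomic<bool> bvhReady_{false};
    std::future<void> buildTask_;
};
```

The same shape works for texture and sampling caches: check a flag, use the cache once it exists, and fall back to computing the value directly until then.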

Some results, rendering at 512x512 resolution, 20 light bounces, about 70k triangles and 512 samples per pixel:



From looking at the absolute numbers I still have a long way to go before I'm hitting any kind of production-quality speed, but the relative times for user feedback are getting reasonable, considering I'm running these on an old MacBook Air.

I also finished refactoring the engine this weekend, and was pleasantly surprised to see render times drop by a factor of three(!). That says more about my initial attempt than about the current one, I’m afraid, since it was all restructuring and no magic features were added. A few hours of work later I have an initial implementation of multithreading, which on my dual-core processor brought the speed up by an additional 80-90%. There’s still a ton of work to be done on performance, but I’m itching to get back to the feature side, so it will have to wait. Next up is a couple more BRDFs, so I can get more visually interesting materials than matte surfaces, and some new infinite area light features.
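
For reference, a common way to structure that kind of multithreading is a tile scheduler where each worker pulls the next tile index from an atomic counter, so the cores stay busy even when tiles vary in cost. The sketch below is just that generic pattern under my own assumptions, not Aurora’s actual threading code, and renderTile is a hypothetical callback.

```cpp
#include <algorithm>
#include <atomic>
#include <functional>
#include <thread>
#include <vector>

// Split the image into tileSize x tileSize tiles and hand them out to
// numThreads workers; renderTile traces all pixels in one tile.
void RenderTiles(int width, int height, int tileSize, unsigned numThreads,
                 const std::function<void(int x0, int y0, int x1, int y1)>& renderTile)
{
    const int tilesX = (width  + tileSize - 1) / tileSize;
    const int tilesY = (height + tileSize - 1) / tileSize;
    const int tileCount = tilesX * tilesY;

    std::atomic<int> nextTile{0};
    std::vector<std::thread> workers;

    for (unsigned t = 0; t < numThreads; ++t) {
        workers.emplace_back([&] {
            // Each worker grabs tiles until the counter runs past the end.
            for (int i = nextTile.fetch_add(1); i < tileCount;
                     i = nextTile.fetch_add(1)) {
                const int tx = i % tilesX;
                const int ty = i / tilesX;
                const int x0 = tx * tileSize;
                const int y0 = ty * tileSize;
                const int x1 = std::min(x0 + tileSize, width);
                const int y1 = std::min(y0 + tileSize, height);
                renderTile(x0, y0, x1, y1);
            }
        });
    }
    for (auto& w : workers) w.join();
}
```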

-Espen
