Robust unit cube clipping for shadow mapping

crossposted to zeuxcg.blogspot.com

Shadow mapping is my primary area of interest in computer graphics, so expect more posts on this topic. Today I’d like to tell about robust unit cube clipping regarding different projection matrix building techniques.

The task at hand is relatively simple – given a set of points representing shadow receivers and a set of points representing shadow casters, build a matrix that, while used for shadow rendering, maximizes shadow map utilization (both in terms of shadow map area and depth precision) and at the same time eliminates all shadow clipping artifacts. I have not implemented anything except directional light support properly, so expect another post sooner or later.

Note that all points are assumed to be in world space, and for a lot of algorithms it’s preferred to take vertices of convex hull of receivers, clipped by view frustum, instead of actual receivers’ vertices – it’s not always required, but for the sake of simplicity we will assume that all receivers’ points are inside view frustum. Of course casters’ points are arbitrary.

Uniform Shadow Mapping

Uniform shadow mapping is shadow mapping with a simple orthographic projection, without any perspective reparametrization. As unit cube clipping is an operation that’s usually done *after* constructing some approximate matrix, let’s suppose we already have a view and projection matrix for our light configuration – note that in case of uniform light the only thing we care about is that view matrix has the same direction as the light, and the projection matrix represents some orthographic projection – everything else is irrelevant, so we can take arbitrary view position, more or less arbitrary up vector, construct a view matrix, and assume that projection matrix is actually identity matrix.

Now let’s construct two axis-aligned bounding boxes, one for receiver points, transformed to our light post-perspective space, another one for caster points, again in light PPS. Note that projection is orthographic, therefore there is no post-perspective division employed, and there are no singularities during AABB construction.

Now we have to build a new matrix that transforms our light viewprojection matrix to minimize shadow map wastage and Z precision loss, while preserving shadows. The actual choice of values depends on some states that are set while applying shadow map to scene.

At first, let’s deal with XY extents. Many papers propose choosing receivers’ XY extents, because we don’t care about points of casters that are outside receivers’ extents – i.e. can’t cast shadows on receivers. This provides correct results, but we can do slightly better – we can select intersection of casters’ and receivers’ XY extents:

float min_x = max(receivers.min_x, casters.min_x);
float min_y = max(receivers.min_y, casters.min_y);
float max_x = min(receivers.max_x, casters.max_x);
float max_y = min(receivers.max_y, casters.max_y);

What we get here is that if casters’ extents are “wider” than receivers’, we get the same extents as before. However, if you have a big receiver and relatively small casters on it, this will select smaller extents. Note that extents are always correct – they enclose all points that BOTH some caster and some receiver could project to – i.e. all points there you potentially could need shadow. Everywhere else there is no shadow.

Whether this is beneficial depends on scene type – you’ll usually get no benefit from this approach in real-life scenes if there is a single shadow map for the whole view – but if you have some kind of frustum splitting approach, depending on your scene, this could lead to quality improvement.

There is still a problem left – if you have CLAMP filtering set for your shadow map, this approach will cause visual bugs, because now some of receivers’ points are actually out of shadow map – so if there is a caster that fills border pixels with Z value that produces shadow, the shadow will stretch outwards. Solution? Either use BORDER filtering or when rendering to NxN shadowmap set a viewport with X = 1, Y = 1, width = N-2, height = N-2 – this will make sure that border pixels are not affected with casters. This is a small price to pay for potentially tighter frustum.

Now we have tight XY extents, and we’ll have to solve problems with Z.

Let’s suppose we’ve chosen ZN and ZF as our Z extents. This would mean that:

  1. All casters are clipped by ZN and ZF
  2. Shadow map contain depth values in range [0..1] in light PPS
  3. All receivers points that have Z
  4. All receivers points that have Z > ZF will have Z > 1 in light PPS

At first, we can’t let our casters be clipped by near plane – this would produce artifacts – so the resulting ZN value has to be less or equal to casters’ minimal Z value. If there are no receivers with Z ZF, the situation is opposite – pixel_depth is above 1, and the test always returns true, thus making all receivers points with Z > ZF shadowed. This means that there should be no receivers’ points with Z > ZF, so ZF is greater or equal than receivers.max_z.

Is there any reason to push ZF beyond that? No. No, because we don’t care about casters points that have Z > receivers.max_z – they are not going to cast shadows on any receiver anyway. Thus ZF = receivers.max_z.

Now that we have our XY and Z extents, we construct a scaling/biasing matrix that maps [min_x..max_x] x [min_y..max_y] x [ZN .. ZF] to [-1..1] x [-1..1] x [0..1] (or to [-1..1] x [-1..1] x [-1..1] in case you’re using OpenGL), and multiply your light viewprojection matrix by this matrix. By the way, the scaling/biasing matrix is of course equal to the corresponding orthographic projection matrix, so you can use existing functions like D3DXMatrixOrthoOffCenterLH to compute it.

Now that we’ve solved the problem for uniform shadow mapping, let’s move on to various perspective reparametrization algorithms.

Trapezoidal Shadow Mapping

The brief outline of TSM algorithm is as follows – construct light viewprojection matrix, transform receivers’ points to light PPS, approximate them by a trapezoid, construct a matrix that maps trapezoid to unit cube, multiply light viewprojection matrix by the trapezoid mapping matrix.

Thus applying unit cube clipping to TSM is very simple – you first construct a tight frustum for uniform shadow mapping (see above), and then use it for TSM, with no further corrections. This produces correct extents in the resulting matrix.

As we selected XY extents as intersection of casters’ and receivers’ extents, the slightly more correct approach would be to use not the receivers’ points for trapezoidal approximation, but rather points of the receivers’ volume, clipped by planes that correspond to the chosen XY extents. However, my experiments resulted in no significant quality improvement – texel distribution was good enough without this step.

Also note, that as TSM produces a frustum with relatively high FOV, the distortion of post-perspective W coordinate can affect Z coordinate badly. Some solutions for these problems are already presented in the original TSM paper (though expect another post about various methods for fixing Z errors).

Light-space Perspective Shadow Mapping

The brief outline of LiSPSM algorithm is as follows – construct light space with certain restrictions, transform all “interesting” points in the light space, build a perspective frustum that encloses all points, transform it back from light space to world space.

The problem is that if you treat only receivers’ points as “interesting”, you get shadow clipping due to Z extents; if you treat both receivers’ and casters’ points as “interesting”, the shadows are correct, but you get worse texel distribution. Also you can’t really fix Z extents AFTER you’ve computed the frustum – because perspective projection has singularities for points on plane with Z == 0, and occasionally some caster point gets near this plane and you get very high post-perspective Z extents, which screw the Z precision.

In short, I don’t know a good solution. If the light faces the viewer, we can use the approach for normal positional light for PSM (read below). Otherwise, it does not seem we can do anything. Currently I focus my frustum on both casters’ and receivers’ points. If you know a solution, I’d be more than happy to hear it.

Perspective Shadow Mapping

Note that I am not going to describe unit cube clipping for the PSM from the original paper by Stamminger and Drettakis, because there is a much better PSM algorithm, described in GPU Gems 1 by Simon Kozlov (Chapter 14, “Perspective Shadow Maps: Care and Feeding”).

The brief outline of the algorithm is as follows – construct virtual camera which is essentially a real camera slid back a bit to improve quality (to improve zf/zn ratio), transform light to virtual camera PPS. If it’s a directional light in PPS (which means that light’s direction was orthogonal to view direction), construct uniform shadow mapping matrix for the light direction in PPS (of course you should transform all casters and receivers to PPS). The resulting matrix should first transform incoming points to virtual camera PPS, and then transform them by uniform shadow mapping matrix.

If light is a positional one in PPS (which, of course, happens most often), then at first compute a bounding cone with center in light PPS around all receivers’ points (transformed to PPS again, of course). Then we have two cases – either light in PPS becomes an inverted one – that happens if light is shining from behind the viewer, i.e. Z coordinate of light direction in view space is positive – or light is a normal one. For further reference I suggest you read Stamminger and Drettakis paper (STAMMINGER M., DRETTAKIS G.: Perspective shadow maps. In Proceedings of SIGGRAPH(2002), pp. 557.562.).

If light is a normal one, then all casters that can possibly cast shadows on visible receivers’ regions are in front of virtual camera’s near plane – so we can construct a normal shadow mapping matrix from bounding cone parameters. If light is an inverted one, then usual shadow mapping matrix will not contain all casters; therefore Kozlov proposes to use an “inverse projection matrix” which is constructed by specifying –z_near as near plane value and z_near as far plane value. Again, for further reference I suggest you read his paper.

Now, we want to perform unit cube clipping, and there are actually three cases we have to resolve (directional light in PPS, positional light in PPS, inverted positional light in PPS). Why do we have problems? Well, at first, while all receivers points are inside original view frustum (and thus they are inside virtual view frustum), because we clipped receivers’ volumes, caster points are arbitrary, and so there are singularities for caster points with view space Z close to 0.

In case of normal positional light in PPS (directional light that shines in our face), we don’t care about casters that are beyond near plane; so we can clip our casters by near plane, and all caster points will be well-defined in post-projective space, which means that after we’ve computed the projection in PPS from bounding cone, we can find receivers’ and casters’ extents and use the same algorithm we had for uniform shadow mapping – just multiply the light matrix in PPS by unit cube clipping matrix (this will extend light frustum in PPS to hold all needed caster points).

Theoretically, you’ll need precise geometrical clipping of caster volumes by near plane. However, in practice, for my test scenes the simple approach of computing extents only for points in front of near plane worked well. Your mileage may vary.

In case of inverted positional light in PPS, we can no longer clip by near plane, because we’ll clip away potential casters. What we’d like to do instead is to take Z extents as in normal unit cube clipping for all caster points, and modify the shadow mapping matrix as usual. Why can’t we do that (and, as a matter of fact, why could not we do that for normal positional light in PPS, why do we need camera clipping)?

Well, the problem is, that if there are caster points with view-space Z near 0, the PPS Z extents of casters become very huge. As we use casters’ minimal Z value as our Z near value, this can lead to huge extents of unit cube clipping matrix, which makes Z precision very bad. For positional light in PPS, we avoid this by clipping casters by near plane, so that we no longer have casters with very big coordinates in PPS.

Luckily, with inverse projection matrix we can easily solve that – just clip casters’ minimal Z value with some small negative value, i.e. -10. (Note: here and below, “casters’ minimal Z value” refers to minimal Z value of casters transformed first to virtual camera PPS, and then by shadow mapping matrix).

Why does it work? Well, with normal perspective matrix, if we decrease Z near value’s magnitude (i.e. decrease the actual Z near value, as it’s positive for normal matrix), we enlarge our viewing frustum. It turns out that for inverse projection matrix, decreasing Z near value’s magnitude again enlarges viewing frustum – and clipping Z near by some negative value decreases its magnitude, while increasing the actual value.

Finally, if the light is a directional one in PPS, we can clip casters by camera’s near plane too, as we did in case of normal positional light.

Recap:

  1. If light is a directional one in PPS, construct uniform shadow mapping matrix as in first part of this post, only using casters clipped by camera near plane (either by geometrical clipping or just by throwing away caster points which are beyond near plane).
  2. If light is a positional one in PPS, construct bounding cone with center in light’s position that holds all receiver points. Then, for normal (non-inverted) light, construct shadow mapping matrix (view matrix: simple view matrix with eye position = light’s position, view direction = bounding cone direction, and arbitrary up vector; projection matrix: matrix with FOV wide enough to hold the whole bounding cone, and Z extents that encompass all receiver points), and perform unit cube clipping, only using casters clipped by camera near plane.
    For inverted light, construct an inverted projection matrix instead of a normal one, and instead of clipping casters clip casters’ minimal Z value.

Thanks to Simon Kozlov for suggesting solution for shadow clipping problems, basically all the text above consists of his ideas.

  • Sergei_am

    Немного обсуждения с ff по теме:

    [23:07] ну, камеру назад то все ведь двигают
    [23:07] вот и 2Д
    [23:07] правда у меня не факт что гладкое
    [23:10] у меня хуевъй код про тест качественная ли матрица или нет.. :)
    [23:10] как ни странно
    [23:10] у меня просто смотрится на отношение основ трапеции
    [23:10] оно должно бъть ближе к отношению расстояний между точками краев екрана, в world space
    [23:11] где верхний край екрана, ето самая верхнаяя видимая точка ландшафта справа и слева екрана
    [23:11] понятно, что когда справа горка, а слева равнина, оно дает немного худший результат… :)
    [23:12] “качественная ли матрица” – метрика что ли?
    [23:12] разве недостаточно тупо
    [23:12] да
    [23:12] брать область вблизи камеры
    [23:12] фиксированную
    [23:12] и смотреть на площадь?
    [23:12] и смотреть delta ?
    [23:13] ну да, только что подумав догадался об етом :)
    [23:13] на площадь в pps
    [23:13] а раньше – нет
    [23:13] да, именно так
    [23:13] надо только взять точку, ‘гарантированно’ из шедоумепа
    [23:13] т.е. кружочек, а не точку
    [23:13] угу
    [23:14] у меня более сериознъй проблем сейчас именно обхват сценъ
    [23:14] кастеръ втопку, из-за ландшафта – он весь кастер :)
    [23:15] а вот рисивер ландшафт… ето пока что не 100% решил
    [23:15] есть баги
    [23:15] ммм
    [23:15] какие баги?
    [23:15] * Zeux не понял проблемы
    [23:15] ну, тъ как обхватъваеш ландшафт?
    [23:16] как делаеш из него облако точек?
    [23:17] у меня нет ландшафта :( но я бы делал наверное N ббоксов
    [23:17] т.е. грид на ландшафте
    [23:17] и min/max height
    [23:17] сечеш фрустумом, точки convex-а кидаеш в шедоумеп
    [23:17] угу
    [23:17] ну, когда камера вблизи, патч на котором мъ – портит качество
    [23:17] скажем – может в 2 раза испортить :)
    [23:18] из-за z
    [23:18] из-за горки за спиной?
    [23:18] да
    [23:18] угу
    [23:18] из-за горки вокруг нас
    [23:18] из-за етого, я сделал по другому
    [23:19] я нахожу ландшафт – рейтрейсингом :)
    [23:20] вертикальными лучами, распределенными по фрустуму?
    [23:21] да
    [23:21] плотно с обоих сторон
    [23:21] реже посередине
    [23:21] примерно 60 лучей
    [23:21] я даже думал оптимизировать, ибо сейчас все лучи начинаются из камеръ
    [23:22] а ето бессмъслено, когда бегаем вертикально по ландшафту
    [23:22] много кеш мисов на 360
    [23:22] и всетаки
    [23:22] сейчас я бъ сделал дроблением близкого патча/ей на более мелкие боксъ
    [23:22] и опять все боксами
    [23:22] угу, я только хотел сказать :)
    [23:23] дробление на боксъ линейно, можно вообще on demand делать
    [23:24] можно даже пофиксить то что весь ландшафт кастер
    [23:24] тупо конусами – в патче по конусу наклона всех его нормалей
    [23:24] но – обхват И кастеров по xy даст скачки имхо
    [23:26] кстати, Zeux, тъ пробовал строить матрицу преобразования трапеции в квадрат? :)
    [23:26] по идее, если кастеры выходить будут за view, то рект будет по ресиверам все равно
    [23:26] ммм
    [23:26] трапеции?
    [23:26] ок, четърехугольника
    [23:26] произвольного
    [23:26] угу
    [23:26] много кода?
    [23:26] у меня один екран
    [23:27] хм
    [23:27] у меня один солвер матричный на 2 экрана
    [23:27] и то лузерский наверняка :)
    [23:28] у меня с этим подходом были проблемы, вот какие:
    [23:28] одна, минорная – там получается уравнение с 1 степенью свободы, 1 переменную можно произвольно выбирать – ее можно выбирать, чтобы не очень плохо было изменение W, пока не сделал
    [23:28] вторая, не минорная
    [23:29] какой солвер?!
    [23:29] я сейчас тот четырехугольник
    [23:29] http://www.everfall.com/paste/id.php?cqduim95vdkr
    [23:29] ну, гм
    [23:29] вот мое
    [23:29] 4-ехугольник в квадрат
    [23:29] ых
    [23:30] http://www.everfall.com/paste/id.php?gdbmkgfsxety
    [23:30] :)
    [23:32] а, ну…
    [23:32] я мое руками считал
    [23:33] трансляция одной точки в (0,0), ротация для стабильности, skew одной точки до (1,0), skew другой до (0,1), преспективное искажение третей до (1,1)
    [23:33] умножение нескольких таких простеньких матриц
    [23:33] ясно
    [23:33] результат на екране – развернутое Mathematic-ой
    [23:33] :)
    [23:33] так вот
    [23:33] я не знаю
    [23:34] откуда брать тот четырехугольник
    [23:34] :)
    [23:34] ну, у меня все симплистично
    [23:34] облако точек – ето само собой
    [23:34] симплистика в другом – в прямой по которой отодвигаем камеру
    [23:35] т.е. проецируем камеру – ето само собой
    [23:35] а вот отодвигать куда… масса направлений :)
    [23:35] линия взгляда – не катит
    [23:35] мммм
    [23:35] погоди
    [23:35] я писал там кучу евристик
    [23:35] что значит отодвигаем?
    [23:36] ну, я отодвигаю камеру – и из етой точки обхватъваю конусок облако
    [23:36] и нахожу четърехугольник
    [23:36] а.
    [23:36] *конусом
    [23:36] типа, две стороны конуса – две стороны четырехугольника
    [23:36] 100+ таких опътов – и берем лучший
    [23:36] ?
    [23:36] да
    [23:37] а min и max по направлению – другие 2
    [23:37] вооот, у меня есть такая же проблема, да :)
    [23:37] куда отодвигать
    [23:37] ну – в 3Д можно
    [23:37] по взгляду
    [23:38] однако если dueling frusta – ppc
    [23:38] по взгляду к сожалению есть corner cases
    [23:38] в tsm фактически по взгляду отодвигается
    [23:38] говно значит
    [23:38] ну, часто ок
    [23:38] но есть места, где ужас
    [23:38] я потом пробовал по линии что между средней точкой внизу и наверху (на горизонте)
    [23:38] лучше
    [23:39] ибо ландшафт диктует форму сценъ
    [23:39] в тсм точнее два варианта, либо по вью, либо по направлению собственно того конуса
    [23:39] и совпадание солнца и взгляда – пох
    [23:39] с центром в глазе
    [23:39] какого конуса?
    [23:40] его ведь надо собрать из чего-то
    [23:40] ну, строишь вокруг ресиверов из глаза конус
    [23:40] и двигаешь по направлению его, а не по вью
    [23:40] но тоже отстой.
    [23:40] ну ето фрустум
    [23:40] в общем случае
    [23:40] при сильно забитом фрустуме та же фигня
    [23:40] угу
    [23:41] О(N^2) не хотелось в O(N^3) делать
    [23:41] по точкам
    [23:41] N^2?!?
    [23:41] где? :)
    [23:41] точки * отодвигания
    [23:42] а, типа отодвигать на каждый вектор между точками?
    [23:42] если еще и начнем крутить направление…
    [23:42] не, отодвигать по направлению
    [23:42] я отодвигаю назад на расстояние всей сценъ – раз 100
    [23:42] т.е. да, 100*O(N)
    [23:43] угу
    [23:43] так вот, направление, между центрами нижней и верхней точкой – намного лучше чем взгляд
    [23:43] если ландшафтоподобнъе сценъ
    [23:43] у меня есть пока нечто, что работало у меня на тестовых сценах, но в общем случае будет лажать неимоверно, увы
    [23:43] камера – без значения
    [23:43] я напишу впрочем скоро в блог :)
    [23:44] кстати
    [23:44] можно построить hull
    [23:44] дада
    [23:44] и потом пробовать его оптимайзить
    [23:44] так я и делаю
    [23:44] дадада.
    [23:44] функцией распределения
    [23:44] :)
    [23:44] :)
    [23:44] нет
    [23:44] :)
    [23:44] что значит оптимайзить функцией распределения? :)
    [23:45] ну, когда въбираеш точку, которую убрать – смотриш на результантную матрицу
    [23:45] считаеш ошибку
    [23:45] а, угу
    [23:45] очевидное, да
    [23:45] да, у меня была идея такая, но пока заломало
    [23:45] заметь – ето один способ
    [23:45] я тупо его упрощаю
    [23:45] никто не мешает, делать как в Сталкере
    [23:45] запустить несколько евристик
    [23:45] считать много всего и выбирать лучшее
    [23:45] и взять лучшею
    [23:46] в идеале так и надо в общем, одна эвристика – ломается всяко где-нибудь
    [23:46] у меня пока плохо с обьектами :)
    [23:46] там не бокс, а фейс :)
    [23:46] top фейс бокса, сечение только с near,left,right
    [23:47] надо убить хак и написать честно, для обьектов влизи
    [23:48] а то въсокие ломают качество

  • Sergei_am

    Блин, исчезли ники участников – я и Zeux.