Doug Binks - 31 Jan 2015
As usual, whilst working on one aspect of Avoyd I hit a hurdle and decided to take a break by tweaking some visuals - specifically looking at the normals for my surfaces. I added a step to generate face normals in the pixel shader using the derivatives of world space position [see: Normals without normals by Angelo Pesce and the Volumes of Fun wiki on Computing Normals], and immediately noticed precision issues when close to the surface. I'll demonstrate the issue and my quick fix which uses eye relative position instead of world space, before explaining what's happening in full.
Figure 1: The image on the left shows the face normals calculated in the pixel shader using the world space position, and on the right we take the eye relative world space position.
Anyone who's spent a fair amount of time with floating point numbers will be familiar with precision issues. Since I was taking the delta of two values which were close together, I realised I would be best served by having those values near the origin. Rather than use the world space position I should be using an eye relative position, i.e. I emit world_pos - eye_pos from the vertex shader and use that to generate my normals.
I did this, and made an image for a tweet, getting some replies asking for further details, so I'm writing this up rather than working on my entity collision detection.
My original shaders (pseudo glsl) were:
```glsl
// Vertex shader, much removed
layout(std140) uniform uboWanKenobi
{
    mat4 matModelToWorldToViewToProj;
    mat4 matModelToWorld;
};

in  vec3 in_pos;
out vec3 world_pos;

void main(void)
{
    vec4 pos    = vec4(in_pos, 1.0);
    gl_Position = matModelToWorldToViewToProj * pos;
    world_pos   = (matModelToWorld * pos).xyz;
}

// Fragment shader, much removed
in  vec3 world_pos;
out vec4 fragCol;

void main(void)
{
    vec3 dFdxPos = dFdx( world_pos );
    vec3 dFdyPos = dFdy( world_pos );
    vec3 facenormal = normalize( cross(dFdxPos, dFdyPos) );
    fragCol = vec4( facenormal*0.5 + 0.5, 1.0 );
}
```
This outputs the world space position to the pixel shader, which then calculates the screen space derivatives of that to get the face normal.
The 24 bit mantissa of a 32 bit float gives approximately 7 decimal places of precision. It turns out that this isn't enough to represent dFdx( world_pos ) or dFdy( world_pos ) to a precision where the error won't show on an 8-bit monitor (~3 decimal places of precision) in the circumstances I took the screenshot in. For more on floating point do check out Bruce Dawson's blog posts about floating point issues and Tom Forsyth's post on precision.
In the screenshot I'm using a first person camera (as Avoyd is a First Person Editor), and am fairly close to the surface, so the two nearby triangles whose edge is visible span a distance in world space coordinates of about 0.2f across the screen. The camera is at a position in world space of around ( 300.0f, 300.0f, 300.0f ). With the image being around 1000 pixels tall, the world space distance between adjacent pixels is about 0.0002f. Note that 32 bit floats can represent this number to high accuracy on its own, but they can't represent the difference between 300.00000f and 300.00002f at all well, since this difference is at the 7th decimal place when written in floating point format as 3.0000002 * 10^2 (strictly this should be done in binary, which we do below).
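To put concrete numbers on this, here's a minimal standalone C++ sketch (my own illustration, not Avoyd code) which prints the gap between adjacent representable floats at the two magnitudes involved:

```cpp
#include <cmath>
#include <cstdio>

int main()
{
    // gap to the next representable float (1 ulp) near the origin and near 300
    float nearOrigin = 0.2f;
    float farAway    = 300.0f;
    printf( "ulp at %g\t= %g\n", nearOrigin, std::nextafter( nearOrigin, 1e30f ) - nearOrigin );
    printf( "ulp at %g\t= %g\n", farAway,    std::nextafter( farAway,    1e30f ) - farAway );

    // the approximate world space step between adjacent pixels in the screenshot
    printf( "per-pixel step\t= %g\n", 0.2f / 1000.0f );
    return 0;
}
```

Near 0.2f the gap is about 1.5e-8, far below the ~0.0002f per-pixel step; near 300.0f it's about 3.05e-5, so interpolated positions snap to a grid roughly 15% as coarse as the step itself, which is easily visible at 8 bits.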
In other words, when you're taking the gradient of a value emitted by the vertex shader you're not looking at a gradient derived by taking ( P1 - P0 ) / num_pixels, with P0 and P1 being the world space positions at vertices 0 and 1. You're instead taking the difference between the interpolated positions at neighbouring pixels in the quad being rasterized. Naturally I'm talking GPUs here, so there's some divergence in how this is done: some implementations calculate only one gradient for the whole quad, while others do the per-pixel calculation. Recent GLSL additions (dFdxCoarse, dFdxFine and their dFdy equivalents, from GL_ARB_derivative_control) allow you to select between these coarse and fine derivatives.
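As an illustration of the quad-based scheme - a sketch under the assumption of the usual 2x2 pixel quad, since real hardware varies - a fine x derivative differs per row, while a coarse one is reused across all four pixels:

```cpp
#include <cstdio>

int main()
{
    // interpolated values at the four pixels of a 2x2 rasterizer quad,
    // indexed as v[row][column] with row 0 on top
    float v[2][2] = { { 0.10f, 0.30f },
                      { 0.12f, 0.36f } };

    // fine x derivative: a separate horizontal difference per row
    float fineTop    = v[0][1] - v[0][0];
    float fineBottom = v[1][1] - v[1][0];

    // coarse x derivative: one horizontal difference shared by the whole quad
    float coarse = v[0][1] - v[0][0];

    printf( "fine: %g, %g  coarse: %g\n", fineTop, fineBottom, coarse );
    return 0;
}
```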
The solution is to move the absolute value of the quantity closer to the origin. If I'm comparing 0.20000f and 0.20002f then I have 3 more places of precision. You can do this by calculating world_pos - eye_pos. This works because things which are close to the camera take up more screen space than things which are far away, so you get the accuracy where you need it: close to your viewpoint. If you want even more accuracy, calculate world_pos - eye_pos - eye_forwards*near_dist so that you get the full possible precision.
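The difference is easy to reproduce on the CPU. Here's a small sketch of mine (assuming, as in the screenshot, a camera at 300.0f on one axis and a 0.2f span interpolated across 1024 pixels) which takes one pixel's step in each space:

```cpp
#include <cstdio>

int main()
{
    // parametric positions of two neighbouring pixels across a 1024 pixel span
    float t0 = 500.0f / 1024.0f;
    float t1 = 501.0f / 1024.0f;

    // world space: interpolate between vertex positions 300.0f and 300.2f
    float w0 = 300.0f + 0.2f * t0;
    float w1 = 300.0f + 0.2f * t1;
    printf( "world space delta  = %.9g\n", w1 - w0 ); // snaps to ~3e-5 steps

    // eye relative: eye at 300.0f, so interpolate between 0.0f and 0.2f
    float e0 = 0.2f * t0;
    float e1 = 0.2f * t1;
    printf( "eye relative delta = %.9g\n", e1 - e0 ); // very close to 0.2f/1024
    return 0;
}
```

The true per-pixel step is 0.2f/1024 ≈ 0.000195; in world space the computed delta lands on a multiple of the ~3.05e-5 grid at 300.0f (0.000183 here), while the eye relative delta is accurate to far more places than an 8-bit display can show.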
The corrected shaders become:
```glsl
// Vertex shader, much removed
layout(std140) uniform uboWanKenobi
{
    mat4 matModelToWorldToViewToProj;
    mat4 matModelToWorld;
    vec3 eye_pos; // world space eye position
};

in  vec3 in_pos;
out vec3 eye_relative_pos;

void main(void)
{
    vec4 pos         = vec4(in_pos, 1.0);
    gl_Position      = matModelToWorldToViewToProj * pos;
    eye_relative_pos = (matModelToWorld * pos).xyz - eye_pos;
}

// Fragment shader, much removed
in  vec3 eye_relative_pos;
out vec4 fragCol;

void main(void)
{
    vec3 dFdxPos = dFdx( eye_relative_pos );
    vec3 dFdyPos = dFdy( eye_relative_pos );
    vec3 facenormal = normalize( cross(dFdxPos, dFdyPos) );
    fragCol = vec4( facenormal*0.5 + 0.5, 1.0 );
}
```
Armed with some knowledge, runtime reloading of shaders, Runtime Compiled C++ and Mikko Mononen's excellent NanoVG I bring you this image demonstrating the problem of floating point accuracy and per-pixel gradients:
Figure 2: A graph of dFdx(Pos) calculated across a Pos width of 0.2f along 1024 pixels (only a few hundred of them graphed), alongside the normal problem and its solution.
Here I'm displaying a view similar to the one before, along with a graph drawn using NanoVG of the values calculated by:
```cpp
// precision test...
float P = 0.2f; // set to 300.0f to show the precision problem
float P0 = 000.0f + P;
float P1 = 000.2f + P;
const int N = 1024;
float dFdxPos[N];
float dPA = (P1 - P0) / (float)N; // analytic per-pixel gradient
for( int i = 0; i < N; ++i )
{
    float ti   = (float)i     / (float)N;
    float tip1 = (float)(i+1) / (float)N;
    float Pi   = P0 + (P1 - P0) * ti;   // interpolated position at pixel i
    float Pip1 = P0 + (P1 - P0) * tip1; // interpolated position at pixel i+1
    dFdxPos[i] = Pip1 - Pi;             // finite difference, as per dFdx
}
// then render graph of values with height 2.0f * dPA
```
Here I'm calculating dFdx( Pos ) by taking the finite difference between two values of Pos interpolated between P0 and P1 along 1024 points, and displaying the results on a graph whose height is twice the analytic gradient calculated from P0 and P1.
You can see that for the case where Pos is 300.0f, the graphed value of dFdx( Pos ) calculated per pixel jumps around the actual value.
Fixed point would help with these issues to some extent (much of the actual computation in the non-programmable parts of the GPU is done in fixed point). With 32 bits of precision we get ~9 decimal places, so to get 3 places of accuracy for an 8-bit monitor over 1000 pixels spanning a world space of 0.2 we could have a distance of 10,000.2000 - i.e. about 10km with my mapping of 1.0 unit to a meter. This is less than the distances I need, so the fixed exponent would have to be varied between draw calls to fit the entire scene in. For large scale scenes this type of solution is required anyway.
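As a sketch of that arithmetic - assuming a signed 32 bit value with 10^-4 units per step, matching the 10,000.2000 example above - the per-pixel delta stays exact however far from the origin you are:

```cpp
#include <cstdint>
#include <cstdio>

// fixed point sketch: signed 32 bit, 1 unit = 0.0001 world space units
const int32_t kUnitsPerMeter = 10000;

int32_t ToFixed( double pos ) { return (int32_t)( pos * kUnitsPerMeter + 0.5 ); }

int main()
{
    // a 0.0002 per-pixel step is exactly 2 units at the origin and at 10km
    int32_t near0 = ToFixed(     0.2000 ), near1 = ToFixed(     0.2002 );
    int32_t far0  = ToFixed( 10000.2000 ), far1  = ToFixed( 10000.2002 );
    printf( "delta near origin = %d units\n", near1 - near0 );
    printf( "delta at 10km     = %d units\n", far1  - far0  );
    return 0;
}
```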
Use eye relative space where possible if you need floating point position values.
32 bits don't float your boat when the ocean is large.
An Interpolation shader stage before the pixel shader would be useful (this is currently possible in a geometry shader but there are issues).
You might ask me why I'm bothering to calculate face normals in the pixel shader. Well, I wanted non-smooth terrain since it's more readable given the particular voxel polygon generation of the scenery. I could do this by baking face normals into my geometry, but that would lead to more vertices since I couldn't share them. I could generate them in a geometry shader, but the performance of that approach (mainly on Apple OS X) isn't good enough.