“Arithmetic! Algebra! Geometry! Grandiose trinity! Luminous triangle! Whoever has not known you is without sense!” –Comte de Lautréamont

Today, we’re going to make a big leap. We’re going beyond the purely spherical structures and the infinite plane we have been tracing so far, and introduce triangles – the essence of modern computer graphics, the element that entire virtual worlds are composed of. If you want to pick up where we left off last time, use the code of part 2. The finished code of what we’re going to do today can be found here. Let’s go!

The definition of a *triangle* is simple: It’s merely a list of three connected *vertices*, each of which stores its position and, later on, its normal. The winding order of the vertices from your point of view determines whether you’re looking at the front or the back. Traditionally, counter-clockwise winding order is considered ‘front’.

First, we need to be able to tell whether a ray hits a triangle, and where. A very popular (but certainly not the only) algorithm for determining ray-triangle intersections was introduced by the gentlemen Tomas Akenine-Möller and Ben Trumbore in 1997. You can read all the details in the paper ‘Fast, Minimum Storage Ray-Triangle Intersection’ here.

The code from the paper can easily be ported to HLSL shader code:

    static const float EPSILON = 1e-8;

    bool IntersectTriangle_MT97(Ray ray, float3 vert0, float3 vert1, float3 vert2,
        inout float t, inout float u, inout float v)
    {
        // find vectors for two edges sharing vert0
        float3 edge1 = vert1 - vert0;
        float3 edge2 = vert2 - vert0;

        // begin calculating determinant - also used to calculate U parameter
        float3 pvec = cross(ray.direction, edge2);

        // if determinant is near zero, ray lies in plane of triangle
        float det = dot(edge1, pvec);

        // use backface culling
        if (det < EPSILON)
            return false;
        float inv_det = 1.0f / det;

        // calculate distance from vert0 to ray origin
        float3 tvec = ray.origin - vert0;

        // calculate U parameter and test bounds
        u = dot(tvec, pvec) * inv_det;
        if (u < 0.0 || u > 1.0f)
            return false;

        // prepare to test V parameter
        float3 qvec = cross(tvec, edge1);

        // calculate V parameter and test bounds
        v = dot(ray.direction, qvec) * inv_det;
        if (v < 0.0 || u + v > 1.0f)
            return false;

        // calculate t, ray intersects triangle
        t = dot(edge2, qvec) * inv_det;

        return true;
    }

To use this function, you need a ray and the three vertices of a triangle. The return value tells you whether the triangle is hit or not. In case of a hit, three additional values are calculated: `t` describes the distance along the ray to the hit point, and `u` / `v` are two of the three barycentric coordinates that specify the location of the hit point on the triangle (where the last one can be calculated as `w = 1 - u - v`). If you don’t know about barycentric coordinates yet, read an excellent explanation on Scratchapixel.
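If you would like to check the intersection math outside of Unity, here is a minimal CPU-side sketch in Python. It is a port of the shader function above and not part of the tutorial's code; the tuple-based vector helpers and the name `intersect_triangle` are assumptions made for this example.

```python
EPSILON = 1e-8

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def intersect_triangle(origin, direction, v0, v1, v2):
    """Return (t, u, v) for a front-face hit, or None for a miss."""
    edge1, edge2 = sub(v1, v0), sub(v2, v0)
    pvec = cross(direction, edge2)
    det = dot(edge1, pvec)
    if det < EPSILON:            # parallel ray or back face: culled
        return None
    inv_det = 1.0 / det
    tvec = sub(origin, v0)
    u = dot(tvec, pvec) * inv_det
    if u < 0.0 or u > 1.0:
        return None
    qvec = cross(tvec, edge1)
    v = dot(direction, qvec) * inv_det
    if v < 0.0 or u + v > 1.0:
        return None
    t = dot(edge2, qvec) * inv_det
    return t, u, v

# A unit right triangle in the z = 0 plane, hit from z = +1 looking down.
V0, V1, V2 = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)
front_hit = intersect_triangle((0.25, 0.25, 1.0), (0.0, 0.0, -1.0), V0, V1, V2)
back_hit = intersect_triangle((0.25, 0.25, -1.0), (0.0, 0.0, 1.0), V0, V1, V2)
print(front_hit, back_hit)
```

The same ray fired from the other side returns `None`, which is the backface culling from the `det < EPSILON` test at work.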

Without further ado, let us trace a single, hard-coded triangle! Find your shader’s `Trace` function and add the following snippet:

    // Trace single triangle
    float3 v0 = float3(-150, 0, -150);
    float3 v1 = float3(150, 0, -150);
    float3 v2 = float3(0, 150 * sqrt(2), -150);
    float t, u, v;
    if (IntersectTriangle_MT97(ray, v0, v1, v2, t, u, v))
    {
        if (t > 0 && t < bestHit.distance)
        {
            bestHit.distance = t;
            bestHit.position = ray.origin + t * ray.direction;
            bestHit.normal = normalize(cross(v1 - v0, v2 - v0));
            bestHit.albedo = 0.00f;
            bestHit.specular = 0.65f * float3(1, 0.4f, 0.2f);
            bestHit.smoothness = 0.9f;
            bestHit.emission = 0.0f;
        }
    }

As mentioned, `t` stores the distance along the ray, and we can directly use it to calculate the hit point. The normal, which is important for correct reflection, can be obtained using the cross product of any two triangle edges. Enter play mode and enjoy your first self-traced triangle:

**Exercise:** Try to calculate the position using barycentric coordinates instead of distance. If you’re doing it right, the glossy triangle looks exactly the same as before.

We’ve overcome the first hurdle, but tracing full-blown triangle meshes is another story. We need to learn some basic things about meshes first. If you are familiar with this, feel free to skim over this next paragraph.

In computer graphics, a mesh is defined by a number of buffers, the most important ones being the *vertex* and *index* buffer. The *vertex buffer* is a list of 3D vectors, describing each vertex’s position in *object space* (meaning that those values won’t need to be changed when you translate, rotate or scale the object – they are transformed from *object space* to *world space* on the fly using matrix multiplication instead). The *index buffer* is a list of integers that are *indices* pointing into the vertex buffer. Every three indices make up a triangle. For example, if the index buffer is [0, 1, 2, 0, 2, 3], there are two triangles: The first triangle consists of the first, second and third vertex in the vertex buffer, while the second triangle consists of the first, third and fourth vertex. The index buffer thus also defines the aforementioned winding order. In addition to the vertex and index buffer, additional buffers can add information to each vertex. The most common additional buffers store *normals*, *texture coordinates* (known as *texcoords* or simply *UVs*) and *vertex colors*.
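To make the index buffer example concrete, here is a small Python sketch that expands the buffer [0, 1, 2, 0, 2, 3] into triangles. The four quad vertices and the helper name `triangles` are made up for illustration.

```python
# Four corners of a unit quad (hypothetical data) and the index buffer from the text.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
indices = [0, 1, 2, 0, 2, 3]

def triangles(vertices, indices):
    """Group the index buffer in threes and look up each vertex position."""
    return [tuple(vertices[i] for i in indices[k:k + 3])
            for k in range(0, len(indices), 3)]

quad = triangles(vertices, indices)
for tri in quad:
    print(tri)
```

Both triangles share the first and third vertex, which is exactly the saving an index buffer is meant to provide.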

The first thing we need to do is to actually find out about the GameObjects that should be part of the ray tracing process. The naive solution would be to just do a `FindObjectsOfType<MeshRenderer>()`, but we’ll go for something a little more flexible and fast. Let’s add a new component `RayTracingObject`:

    using UnityEngine;

    [RequireComponent(typeof(MeshRenderer))]
    [RequireComponent(typeof(MeshFilter))]
    public class RayTracingObject : MonoBehaviour
    {
        private void OnEnable()
        {
            RayTracingMaster.RegisterObject(this);
        }

        private void OnDisable()
        {
            RayTracingMaster.UnregisterObject(this);
        }
    }

This component is added to each object that we want to use in our ray tracing and takes care of registering them with the `RayTracingMaster`. Add these functions in the master:

    // Requires "using System.Collections.Generic;" at the top of the file
    private static bool _meshObjectsNeedRebuilding = false;
    private static List<RayTracingObject> _rayTracingObjects = new List<RayTracingObject>();

    public static void RegisterObject(RayTracingObject obj)
    {
        _rayTracingObjects.Add(obj);
        _meshObjectsNeedRebuilding = true;
    }

    public static void UnregisterObject(RayTracingObject obj)
    {
        _rayTracingObjects.Remove(obj);
        _meshObjectsNeedRebuilding = true;
    }

So far, so good – we know which objects to trace. Now comes the gritty part: We are about to gather all the data from Unity’s Meshes (matrix, vertex and index buffer, remember?), put them into our own data structures and upload them to the GPU so the shader can use them. Let’s start with our data structures and buffer definitions on the C# side, in the master:

    struct MeshObject
    {
        public Matrix4x4 localToWorldMatrix;
        public int indices_offset;
        public int indices_count;
    }

    private static List<MeshObject> _meshObjects = new List<MeshObject>();
    private static List<Vector3> _vertices = new List<Vector3>();
    private static List<int> _indices = new List<int>();
    private ComputeBuffer _meshObjectBuffer;
    private ComputeBuffer _vertexBuffer;
    private ComputeBuffer _indexBuffer;

…and let’s do the same thing in the shader. You’re used to this by now, aren’t you?

    struct MeshObject
    {
        float4x4 localToWorldMatrix;
        int indices_offset;
        int indices_count;
    };

    StructuredBuffer<MeshObject> _MeshObjects;
    StructuredBuffer<float3> _Vertices;
    StructuredBuffer<int> _Indices;

Our data structures are in place, so we can fill them with actual data now. We’re gathering all vertices from all meshes into one big `List<Vector3>`, and all indices into a big `List<int>`. While this is no problem for the vertices, we need to adjust the indices so that they still point to the proper vertex in our big buffer. As an example, imagine we have already added objects worth 1000 vertices so far, and now we’re adding a simple cube mesh. The first triangle might consist of the indices [0, 1, 2], but since we already have 1000 vertices in our buffer before we even start adding the cube’s vertices, we need to shift the indices, thus becoming [1000, 1001, 1002]. Here’s how it looks in code:

    // Requires "using System.Linq;" at the top of the file (for Select)
    private void RebuildMeshObjectBuffers()
    {
        if (!_meshObjectsNeedRebuilding)
        {
            return;
        }

        _meshObjectsNeedRebuilding = false;
        _currentSample = 0;

        // Clear all lists
        _meshObjects.Clear();
        _vertices.Clear();
        _indices.Clear();

        // Loop over all objects and gather their data
        foreach (RayTracingObject obj in _rayTracingObjects)
        {
            Mesh mesh = obj.GetComponent<MeshFilter>().sharedMesh;

            // Add vertex data
            int firstVertex = _vertices.Count;
            _vertices.AddRange(mesh.vertices);

            // Add index data - if the vertex buffer wasn't empty before, the
            // indices need to be offset
            int firstIndex = _indices.Count;
            var indices = mesh.GetIndices(0);
            _indices.AddRange(indices.Select(index => index + firstVertex));

            // Add the object itself
            _meshObjects.Add(new MeshObject()
            {
                localToWorldMatrix = obj.transform.localToWorldMatrix,
                indices_offset = firstIndex,
                indices_count = indices.Length
            });
        }

        // Strides in bytes: 4x4 floats + 2 ints = 72, float3 = 12, int = 4
        CreateComputeBuffer(ref _meshObjectBuffer, _meshObjects, 72);
        CreateComputeBuffer(ref _vertexBuffer, _vertices, 12);
        CreateComputeBuffer(ref _indexBuffer, _indices, 4);
    }
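The index shifting can be tried in isolation with a few lines of Python. This is only a sketch of the bookkeeping, assuming each mesh is given as a `(vertices, indices)` pair; `merge_meshes` and the dummy data are hypothetical names and values chosen to mirror the 1000-vertex example.

```python
def merge_meshes(meshes):
    """Concatenate per-mesh vertex and index lists into shared buffers."""
    vertices, indices, objects = [], [], []
    for mesh_vertices, mesh_indices in meshes:
        first_vertex = len(vertices)   # how far this mesh's indices must shift
        first_index = len(indices)     # where this mesh starts in the index buffer
        vertices.extend(mesh_vertices)
        indices.extend(i + first_vertex for i in mesh_indices)
        objects.append({"indices_offset": first_index,
                        "indices_count": len(mesh_indices)})
    return vertices, indices, objects

# 1000 dummy vertices already in the buffer, then a mesh whose first triangle is [0, 1, 2].
big_mesh = ([(0.0, 0.0, 0.0)] * 1000, [0, 1, 2])
cube_part = ([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)], [0, 1, 2])
vertices, indices, objects = merge_meshes([big_mesh, cube_part])
print(indices[3:6], objects[1])
```

The second mesh's triangle ends up as [1000, 1001, 1002], and its `indices_offset` of 3 tells the shader where that mesh begins in the shared index buffer.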

Call `RebuildMeshObjectBuffers` in the `OnRenderImage` function, and don’t forget to release the new buffers in `OnDisable`. Here are the two helper functions I used in the above code to make buffer handling a little easier:

    private static void CreateComputeBuffer<T>(ref ComputeBuffer buffer, List<T> data, int stride)
        where T : struct
    {
        // Do we already have a compute buffer?
        if (buffer != null)
        {
            // If no data or buffer doesn't match the given criteria, release it
            if (data.Count == 0 || buffer.count != data.Count || buffer.stride != stride)
            {
                buffer.Release();
                buffer = null;
            }
        }

        if (data.Count != 0)
        {
            // If the buffer has been released or wasn't there to
            // begin with, create it
            if (buffer == null)
            {
                buffer = new ComputeBuffer(data.Count, stride);
            }

            // Set data on the buffer
            buffer.SetData(data);
        }
    }

    private void SetComputeBuffer(string name, ComputeBuffer buffer)
    {
        if (buffer != null)
        {
            RayTracingShader.SetBuffer(0, name, buffer);
        }
    }

Great, we have our buffers, and they are filled with the required data! Now we just need to tell the shader about it. In `SetShaderParameters`, add the following code (and, thanks to our new helper function, you can shorten the code for the sphere buffer too while you’re at it):

    SetComputeBuffer("_Spheres", _sphereBuffer);
    SetComputeBuffer("_MeshObjects", _meshObjectBuffer);
    SetComputeBuffer("_Vertices", _vertexBuffer);
    SetComputeBuffer("_Indices", _indexBuffer);

Phew. That was tedious, but take a look at what we just did: We gathered all the internal data of our meshes (matrix, vertices and indices), put them in a nice and simple structure and sent it to the GPU, which is now eagerly waiting to make use of it.

Let’s not keep the GPU waiting. We already have code to trace a single triangle in the shader, and a mesh really is just a bunch of them. The only new thing here is that we use our matrix to transform the vertices from object to world space using the intrinsic function `mul` (for multiply). The matrix contains translation, rotation and scale of the object. It is 4×4, so we need a 4D vector for the multiplication. The first three components (x, y, z) are taken from our vertex buffer. We set the fourth component (w) to 1, because we’re dealing with a point. If it were a direction, we’d put a 0 there to ignore the translation in the matrix (rotation and scale would still apply). Confused? Read this tutorial at least eight times. Here’s the shader code:

    void IntersectMeshObject(Ray ray, inout RayHit bestHit, MeshObject meshObject)
    {
        uint offset = meshObject.indices_offset;
        uint count = offset + meshObject.indices_count;
        for (uint i = offset; i < count; i += 3)
        {
            float3 v0 = (mul(meshObject.localToWorldMatrix, float4(_Vertices[_Indices[i]], 1))).xyz;
            float3 v1 = (mul(meshObject.localToWorldMatrix, float4(_Vertices[_Indices[i + 1]], 1))).xyz;
            float3 v2 = (mul(meshObject.localToWorldMatrix, float4(_Vertices[_Indices[i + 2]], 1))).xyz;

            float t, u, v;
            if (IntersectTriangle_MT97(ray, v0, v1, v2, t, u, v))
            {
                if (t > 0 && t < bestHit.distance)
                {
                    bestHit.distance = t;
                    bestHit.position = ray.origin + t * ray.direction;
                    bestHit.normal = normalize(cross(v1 - v0, v2 - v0));
                    bestHit.albedo = 0.0f;
                    bestHit.specular = 0.65f;
                    bestHit.smoothness = 0.99f;
                    bestHit.emission = 0.0f;
                }
            }
        }
    }
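The role of the fourth component is easy to see with a tiny Python sketch. The matrix below (uniform scale by 2, translation by 10 along x) and the helper `mul` are made-up stand-ins for `localToWorldMatrix` and the HLSL intrinsic.

```python
def mul(matrix, vector):
    """Multiply a 4x4 matrix (list of rows) with a 4D column vector."""
    return tuple(sum(matrix[r][c] * vector[c] for c in range(4)) for r in range(4))

# Hypothetical object-to-world matrix: scale by 2, then translate by (10, 0, 0).
local_to_world = [
    [2.0, 0.0, 0.0, 10.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 2.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

point = mul(local_to_world, (1.0, 1.0, 1.0, 1.0))[:3]      # w = 1: a position
direction = mul(local_to_world, (1.0, 1.0, 1.0, 0.0))[:3]  # w = 0: a direction
print(point, direction)
```

With w = 1 the translation column is added, with w = 0 it is multiplied by zero and drops out. In this sketch the scale still affects the direction, so a transformed direction would need renormalizing before use.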

We’re just one step away from actually seeing all this in action. Let’s restructure our `Trace` function a little and add the tracing of mesh objects:

    RayHit Trace(Ray ray)
    {
        RayHit bestHit = CreateRayHit();
        uint count, stride, i;

        // Trace ground plane
        IntersectGroundPlane(ray, bestHit);

        // Trace spheres
        _Spheres.GetDimensions(count, stride);
        for (i = 0; i < count; i++)
        {
            IntersectSphere(ray, bestHit, _Spheres[i]);
        }

        // Trace mesh objects
        _MeshObjects.GetDimensions(count, stride);
        for (i = 0; i < count; i++)
        {
            IntersectMeshObject(ray, bestHit, _MeshObjects[i]);
        }

        return bestHit;
    }

That’s it! Let’s add some simple meshes (Unity’s primitives work just fine), give them a `RayTracingObject` component and watch the magic happen. **Don’t** use any detailed meshes (more than a few hundred triangles) yet! Our shader is missing the proper optimization and it might take seconds or even minutes to trace one sample per pixel if you go overboard. The result is that your GPU driver will be killed by the system, Unity potentially crashes and your machine needs rebooting.

Notice that our meshes are not smooth, but flat-shaded. Since we didn’t upload the vertices’ normals to a buffer yet, we need to use the cross product to get each triangle’s normal individually and can’t interpolate across the triangle area. We will take care of this issue in the next part of this tutorial series.

For the fun of it, I’ve downloaded the Stanford Bunny from Morgan McGuire’s archive and reduced it to 431 triangles using Blender’s decimate modifier. You can play around with the light settings and the hard-coded material in the shader’s `IntersectMeshObject` function. Here’s a dielectric bunny with nice soft shadows and subtle diffuse GI in the Graffiti Shelter:

…and a metallic bunny under the strongly directional light of Cape Hill that casts some disco-like spots on the floor plane:

…and two little bunnies hiding under a big rocky Suzanne under the blue sky of Kiara 9 Dusk (I hard-coded an alternative material for the first object by checking whether the index offset is 0):

It’s pretty cool to see a real mesh in your own ray tracer for the first time, isn’t it? We have juggled quite some data today, learned about the Möller-Trumbore intersection and integrated everything so that Unity’s GameObjects can be used right away. We have also seen one of the beautiful sides of ray tracing: As soon as you integrate a new intersection, all the cool effects (soft shadows, specular and diffuse GI etc.) just work.

Rendering the glossy bunny took forever, and I still had to use some slight filtering on the result to remove the most jarring noise. To overcome this, the scene is usually structured into a spatial structure like a grid, kD tree or a bounding volume hierarchy that considerably speeds up the rendering of larger scenes.

But first things first: What we should do next is to fix the normals, so our meshes (even if low-poly) look smoother than they do now. Automatic update of the matrices when objects are moved and an actual coupling to Unity’s materials instead of just a single hard-coded one also sound like a good idea. We will take care of this in the next part of this series.

You’ve made it this far. Thank you, and see you in part 4!

*“There is nothing worse than a sharp image of a fuzzy concept.” – Ansel Adams*

In the first part of this series, we created a Whitted ray tracer capable of tracing perfect reflections and hard shadows. What’s missing are the fuzzy effects: diffuse interreflection, glossy reflections and soft shadows.

Building upon the code we already have, we will iteratively solve the rendering equation formulated by James Kajiya in 1986 and transform our renderer into a *path tracer* able to capture the mentioned effects. Again, we will be using C# for scripts and HLSL for shaders. Code is hosted on Bitbucket.

This article is substantially more mathematical than the previous one, but don’t be afraid. I will try to explain every formula as best as I can. The formulas are there to help you see what’s going on and *why* our renderer works, so I advise you to try to understand them and ask away in the comment section should anything be unclear.

The following image is rendered using HDRI Haven’s Graffiti Shelter. The other images in this article are rendered using Kiara 9 Dusk.

Formally, the task of a photorealistic renderer is solving the rendering equation, which can be written as follows:

$$L(x, \, \vec \omega_{o}) = L_e(x, \, \vec \omega_{o}) + \int_{\Omega}{f_r(x, \, \vec \omega_{i}, \, \vec \omega_{o}) \, (\vec \omega_{i} \cdot \vec n) \, L(x, \, \vec \omega_{i}) \, d\vec \omega_{i}}$$

Let’s break it down. Ultimately we want to determine the brightness of a pixel on the screen. The rendering equation gives us the amount of light \(L(x, \, \vec \omega_{o})\) going from point \(x\) (a ray’s hit point) in direction \(\vec \omega_{o}\) (the direction the ray is coming from). The surface might be a light source itself, emitting light \(L_e(x, \, \vec \omega_{o})\) in our direction. Most surfaces don’t, so they only reflect light coming from outside. This is what the integral is for. Intuitively, it accumulates the light coming from every possible direction in the hemisphere \(\Omega\) around the normal (so for now we’re only considering light reaching the surface from *above*, not from *below* which would be required for translucent materials).

The first part \(f_r\) is called the *bidirectional reflectance distribution function* (BRDF). This function visually describes the kind of material we are dealing with – metal or dielectric, bright or dark, glossy or dull. The BRDF defines which proportion of light coming from \(\vec \omega_{i}\) is reflected in direction \(\vec \omega_{o}\). In practice, this is handled using a 3-component vector for the amount of red, green and blue light, each in range \([0,1]\).

The second part \((\vec \omega_{i} \cdot \vec n)\) is equivalent^{1} to \(cos \theta\) where \(\theta\) is the angle between incoming light and surface normal \(\vec n\). Imagine a beam of parallel light hitting a surface head-on. Now imagine the same beam hitting the surface at a flat angle. The light will spread over a larger area, but that also means that each point in this area appears darker than before. The cosine accounts for this.

Finally, the actual light coming from \(\vec \omega_{i}\) is determined recursively using the same equation. So the light at point \(x\) depends on incoming light from all possible directions in the upper hemisphere. In each of those directions from point \(x\) lies another point \(x\prime\), for which the brightness again depends on the incoming light from all possible directions in that point’s upper hemisphere. Rinse and repeat.

So here it is. An infinitely recursive integral equation over infinitely many hemispherical integration domains. We won’t be able to solve this equation directly, but there is a fairly simple solution.

^{1}Remember this! We will be talking about the cosine quite a lot, and when we do, we mean the dot product. Since \(\vec{a}\cdot\vec{b}=\|\vec{a}\|\ \|\vec{b}\|\cos(\theta)\) and we are dealing with *directions* (unit-length vectors), the dot product *is* the cosine for most purposes in Computer Graphics.

Monte Carlo integration is a numerical integration technique that allows us to estimate any integral using a finite number of random samples. Moreover, Monte Carlo guarantees convergence towards the correct solution – the more samples you take, the better. Here is the general form:

$$F_N \approx \frac{1}{N} \sum_{n=1}^{N}{\frac{f(x_n)}{p(x_n)}}$$

The integral of a function \(f(x)\) can thus be estimated by averaging random samples \(x_n\) over the integration domain. Each sample is divided by the probability density \(p(x_n)\) of it being chosen. This way, a sample that is frequently selected will be weighted less than a sample that is chosen only rarely.
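As a quick sanity check of the estimator, here is a minimal Python sketch that integrates \(\sin(x)\) over \([0, \pi]\) (exact value: 2) with uniform samples, so \(p(x) = \frac{1}{\pi}\). The function `monte_carlo` and its parameters are names chosen for this example only.

```python
import math
import random

def monte_carlo(f, sample, pdf, n, rng):
    """Average f(x) / p(x) over n random samples drawn by `sample`."""
    total = 0.0
    for _ in range(n):
        x = sample(rng)
        total += f(x) / pdf(x)
    return total / n

rng = random.Random(1)  # fixed seed so the run is repeatable
estimate = monte_carlo(
    math.sin,
    lambda r: r.uniform(0.0, math.pi),  # uniform samples on [0, pi]
    lambda x: 1.0 / math.pi,            # matching constant density
    100000,
    rng,
)
print(estimate)
```

Raising the sample count tightens the estimate around 2, though only slowly: the error shrinks with the square root of the number of samples.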

With uniform sampling over the hemisphere (each direction has the same probability of being selected), the probability of samples is constant: \(p(\omega) = \frac{1}{2 \pi}\) (because \(2 \pi\) is the surface area of the unit hemisphere). If you bring it all together, that’s what you get:

$$L(x, \, \vec \omega_{o}) \approx L_e(x, \, \vec \omega_{o}) + \frac{1}{N} \sum_{n=1}^{N}{\color{Green}{2 \pi \, f_r(x, \, \vec \omega_{i}, \, \vec \omega_{o}) \, (\vec \omega_{i} \cdot \vec n)} \, L(x, \, \vec \omega_{i})}$$

The emission \(L_e(x, \, \vec \omega_{o})\) is simply the return value of our `Shade` function. The \(\frac{1}{N}\) is what happens already in our `AddShader`. The multiplication with \(L(x, \, \vec \omega_{i})\) happens when we reflect the ray and trace it further. Our mission is to fill the green part of the equation with some life.

Before we can embark on our adventures, let’s take care of some things: sample accumulation, deterministic scenes and shader randomness.

For some reason or another, Unity wouldn’t give me an HDR texture as `destination` in `OnRenderImage`. The format for me was `R8G8B8A8_Typeless`, so the precision would quickly be too low for adding up more than a few samples. To overcome this, let’s add `private RenderTexture _converged` to our C# script. This will be our buffer to accumulate the results with high precision, before displaying it on screen. Initialize / release this texture exactly the same as `_target` in the `InitRenderTexture` function. In the `Render` function, double the blit:

    Graphics.Blit(_target, _converged, _addMaterial);
    Graphics.Blit(_converged, destination);

When you make a change to your rendering, it helps to compare it to previous results to judge the effect. Currently, we will be presented with a new random scene every time we restart play mode or recompile scripts. To overcome this, add a `public int SphereSeed` to your C# script and add the following line at the beginning of `SetUpScene`:

    Random.InitState(SphereSeed);

You can now manually set the seed of the scene. Enter any number and disable / reenable the `RayTracingMaster` component until you have found a scene that you like.

The settings used for the example images are: Sphere Seed 1223832719, Sphere Radius [5, 30], Spheres Max 10000, Sphere Placement Radius 100.

Before we can start any stochastic sampling, we need randomness in our shader. I’m using the canonical one-liner I found on the web somewhere, modified for more convenience:

    float2 _Pixel;
    float _Seed;

    float rand()
    {
        float result = frac(sin(_Seed / 100.0f * dot(_Pixel, float2(12.9898f, 78.233f))) * 43758.5453f);
        _Seed += 1.0f;
        return result;
    }

Initialize `_Pixel` directly in `CSMain` as `_Pixel = id.xy`, so each pixel will use different random values. `_Seed` is initialized from C# in the `SetShaderParameters` function.

    RayTracingShader.SetFloat("_Seed", Random.value);

The quality of the random numbers we generate here is uncertain. It would be worth investigating and testing this function, analyzing the effect of the parameters and comparing it to other approaches. For the time being, we’ll just use it and hope for the best.

First things first: We’ll need random directions uniformly distributed on the hemisphere. For the full sphere, this non-trivial challenge is described in detail in this article by Cory Simon. It is easily adapted to the hemisphere. Here’s the shader code:

    float3 SampleHemisphere(float3 normal)
    {
        // Uniformly sample hemisphere direction
        float cosTheta = rand();
        float sinTheta = sqrt(max(0.0f, 1.0f - cosTheta * cosTheta));
        float phi = 2 * PI * rand();
        float3 tangentSpaceDir = float3(cos(phi) * sinTheta, sin(phi) * sinTheta, cosTheta);

        // Transform direction to world space
        return mul(tangentSpaceDir, GetTangentSpace(normal));
    }

The directions are generated for a hemisphere centered around positive Z, so we need to transform it to be centered around whatever normal we need. To this end, we generate a tangent and binormal (two vectors orthogonal to the normal and orthogonal to each other). We first choose a helper vector to generate the tangent. We take positive X for this, and only fall back to positive Z if the normal is (nearly) coaligned with the X axis. Then we can use the cross product to generate a tangent and subsequently a binormal.

    float3x3 GetTangentSpace(float3 normal)
    {
        // Choose a helper vector for the cross product
        float3 helper = float3(1, 0, 0);
        if (abs(normal.x) > 0.99f)
            helper = float3(0, 0, 1);

        // Generate vectors
        float3 tangent = normalize(cross(normal, helper));
        float3 binormal = normalize(cross(normal, tangent));
        return float3x3(tangent, binormal, normal);
    }
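To convince yourself that the sampled directions really end up in the hemisphere around the normal, you can port both functions to the CPU. The following Python sketch mirrors the shader code above; the helper names and the test normal are assumptions for this example, and Python's `random` stands in for the shader's `rand()`.

```python
import math
import random

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    length = math.sqrt(dot(v, v))
    return (v[0] / length, v[1] / length, v[2] / length)

def get_tangent_space(normal):
    # Pick a helper that is not (nearly) parallel to the normal
    helper = (1.0, 0.0, 0.0)
    if abs(normal[0]) > 0.99:
        helper = (0.0, 0.0, 1.0)
    tangent = normalize(cross(normal, helper))
    binormal = normalize(cross(normal, tangent))
    return tangent, binormal, normal

def sample_hemisphere(normal, rng):
    # Uniform hemisphere sampling: cos(theta) is uniform in [0, 1)
    cos_theta = rng.random()
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))
    phi = 2.0 * math.pi * rng.random()
    x, y, z = math.cos(phi) * sin_theta, math.sin(phi) * sin_theta, cos_theta
    t, b, n = get_tangent_space(normal)
    # Row vector times a matrix with rows (t, b, n): x * t + y * b + z * n
    return tuple(x * t[i] + y * b[i] + z * n[i] for i in range(3))

rng = random.Random(7)
normal = normalize((1.0, 2.0, 3.0))
samples = [sample_hemisphere(normal, rng) for _ in range(5000)]
mean_cos = sum(dot(d, normal) for d in samples) / len(samples)
print(mean_cos)  # close to 0.5 for uniform hemisphere sampling
```

Every sample has unit length and a non-negative dot product with the normal, and the average cosine settles near 0.5, as expected when \(\cos \theta\) is drawn uniformly from \([0, 1)\).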

Now that we have our uniform random directions, we can start implementing the first BRDF. The Lambert BRDF is the most commonly used one for diffuse reflection, and it’s strikingly simple: \(f_r(x, \, \vec \omega_{i}, \, \vec \omega_{o}) = \frac{k_d}{\pi}\), where \(k_d\) is the albedo of the surface. Let’s insert it into our Monte Carlo rendering equation (I’ll drop the emission term for now) and see what happens:

$$L(x, \, \vec \omega_{o}) \approx \frac{1}{N} \sum_{n=1}^{N}{\color{BlueViolet}{2 k_d} \, (\vec \omega_{i} \cdot \vec n) \, L(x, \, \vec \omega_{i})}$$

Let’s put this in our shader right away. In the `Shade` function, replace the code inside the `if (hit.distance < 1.#INF)` clause with the following lines:

    // Diffuse shading
    ray.origin = hit.position + hit.normal * 0.001f;
    ray.direction = SampleHemisphere(hit.normal);
    ray.energy *= 2 * hit.albedo * sdot(hit.normal, ray.direction);
    return 0.0f;

The new direction of the reflected ray is determined using our uniform hemisphere sampling function. The ray’s energy is multiplied with the relevant part of the equation above. Since the surface does not emit any light (but only reflects the light it receives directly or indirectly from the sky), we return 0 here. Remember that our `AddShader` averages the samples for us, so we don’t need to care about \(\frac{1}{N} \sum\). The `CSMain` function already contains the multiplication with \(L(x, \, \vec \omega_{i})\) (the next reflected ray), so there’s not much for us to do.
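A handy way to check the factor \(2 k_d \, (\vec \omega_{i} \cdot \vec n)\) is a so-called furnace test: if light of brightness 1 arrives from every direction, a Lambert surface should reflect exactly its albedo. Here is a minimal Python sketch of that experiment, assuming a scalar albedo and using the fact that \(\cos \theta\) is uniform in \([0, 1)\) under uniform hemisphere sampling; `lambert_furnace` is a name invented for this example.

```python
import random

def lambert_furnace(albedo, n, rng):
    """Average 2 * k_d * cos(theta) * L over uniform hemisphere samples, with L = 1."""
    incoming_light = 1.0
    total = 0.0
    for _ in range(n):
        cos_theta = rng.random()  # uniform hemisphere sampling
        total += 2.0 * albedo * cos_theta * incoming_light
    return total / n

rng = random.Random(3)
reflected = lambert_furnace(0.8, 100000, rng)
print(reflected)  # close to the albedo of 0.8
```

If the result drifted away from the albedo, the surface would be creating or losing energy, which usually points to a missing or doubled factor.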

The `sdot` function is a simple utility that I have defined for myself. It simply returns the result of the dot product, with an optional factor and then clamped to \([0,1]\):

    float sdot(float3 x, float3 y, float f = 1.0f)
    {
        return saturate(dot(x, y) * f);
    }

Let’s recap what the code does so far. `CSMain` generates our primary camera rays and calls `Shade`. If a surface is hit, this function will in turn generate a new ray (uniform random in the hemisphere around the normal) and factor the material’s BRDF and the cosine into the ray’s energy. If the sky is hit, we’ll sample the HDRI – our only light source – and return the light, which is multiplied with the ray’s energy (i.e. the product of all prior hits starting from the camera). This is a single sample that is blended with the converged result. In the end, each sample has a contribution of \(\frac{1}{N}\).

Time to try it out. Since metals don’t have any diffuse reflection, let’s disable them for now in our C# script’s `SetUpScene` function (still calling `Random.value` here to preserve scene determinism):

    bool metal = Random.value < 0.0f;

Enter play mode and see how the initially noisy image clears up and converges to a nice rendering like this:

Not so bad for a few lines of code (and some careful math – I see you’re slowly becoming friends). Let’s spice things up with specular reflections by adding a Phong BRDF. The original Phong formulation had its share of problems (not reciprocal, not energy-conserving), but thankfully, other people took care of that. The modified Phong BRDF looks like this, where \(\vec \omega_{r}\) is the perfectly reflected light direction and \(\alpha\) is the Phong exponent controlling the roughness:

$$f_r(x, \, \vec \omega_{i}, \, \vec \omega_{o}) = k_s \, \frac{\alpha + 2}{2 \pi} \, (\vec \omega_{r} \cdot \vec \omega_{o})^\alpha$$

Here is a little 2D graph displaying what the Phong BRDF for \(\alpha=15\) looks like for an incident ray at 45° angle. Click in the bottom right corner to change the \(\alpha\) value yourself.

Plug it into our Monte Carlo rendering equation:

$$L(x, \, \vec \omega_{o}) \approx \frac{1}{N} \sum_{n=1}^{N}{\color{brown}{k_s \, (\alpha + 2) \, (\vec \omega_{r} \cdot \vec \omega_{o})^\alpha} \, (\vec \omega_{i} \cdot \vec n) \, L(x, \, \vec \omega_{i})}$$

And finally add this to the Lambert BRDF we already have:

$$L(x, \, \vec \omega_{o}) \approx \frac{1}{N} \sum_{n=1}^{N}{[\color{BlueViolet}{2 k_d} + \color{brown}{k_s \, (\alpha + 2) \, (\vec \omega_{r} \cdot \vec \omega_{o})^\alpha}] \, (\vec \omega_{i} \cdot \vec n) \, L(x, \, \vec \omega_{i})}$$

And here it is in code together with the Lambert diffuse:

    // Phong shading
    ray.origin = hit.position + hit.normal * 0.001f;
    float3 reflected = reflect(ray.direction, hit.normal);
    ray.direction = SampleHemisphere(hit.normal);
    float3 diffuse = 2 * min(1.0f - hit.specular, hit.albedo);
    float alpha = 15.0f;
    float3 specular = hit.specular * (alpha + 2) * pow(sdot(ray.direction, reflected), alpha);
    ray.energy *= (diffuse + specular) * sdot(hit.normal, ray.direction);
    return 0.0f;

Note that we substituted the dot product with a slightly different, but equivalent one (reflected \(\omega_o\) instead of \(\omega_i\)). Now reenable metallic materials in the `SetUpScene` function and give it a shot.

Playing around with different \(\alpha\) values, you will notice a problem: Lower exponents already take a long time to converge, and for higher exponents the noise is particularly stubborn. Even after several minutes of waiting, the result is far from pretty, which is unacceptable for such a simple scene. \(\alpha = 15\) and \(\alpha = 300\) with 8192 samples look like this:

“Why is that? We had such nice perfect reflections (\(\alpha = \infty\)) before!”, you might ask. The problem is that we are generating *uniform* samples, and weighting them according to the BRDF. For high Phong exponents, the value of the BRDF is tiny for all but those directions that are very close to the perfect reflection, and it is very unlikely that we will randomly select them using our *uniform* samples. On the other hand, *if* we actually hit one of those directions, the BRDF is huge to compensate for all the other tiny samples. A very high variance is the result. Paths with multiple specular reflections are even worse, resulting in the noise you see in the images above.
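The effect can be put into numbers with a simplified setup: assume \(k_s = 1\) and a ray hitting the surface head-on, so that the perfect reflection direction coincides with the normal. The specular sample weight then becomes \((\alpha + 2) \cos^{\alpha + 1} \theta\), and with \(\cos \theta\) uniform in \([0, 1)\) its mean is 1 while its variance works out to \(\frac{(\alpha + 2)^2}{2 \alpha + 3} - 1\). The Python sketch below, with function names made up for this example, evaluates this for both exponents.

```python
import random

def specular_weight(cos_theta, alpha):
    """Phong sample weight for k_s = 1 when the reflection direction equals the normal."""
    return (alpha + 2.0) * cos_theta ** (alpha + 1.0)

def analytic_variance(alpha):
    """Variance of the weight when cos(theta) is uniform in [0, 1)."""
    return (alpha + 2.0) ** 2 / (2.0 * alpha + 3.0) - 1.0

def sample_mean(alpha, n, rng):
    return sum(specular_weight(rng.random(), alpha) for _ in range(n)) / n

rng = random.Random(5)
mean_15 = sample_mean(15.0, 200000, rng)
print(mean_15)                   # the expected value stays at 1 ...
print(analytic_variance(15.0))   # ... but the variance is about 7.8 for alpha = 15
print(analytic_variance(300.0))  # ... and about 150 for alpha = 300
```

The expected brightness is the same in both cases, but the spread of the individual samples grows roughly linearly with \(\alpha\), which is the stubborn noise described above.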

To make our path tracer practical, we need a change of paradigm. Instead of wasting precious samples on areas where they won’t matter in the end (because they will get a very low BRDF and / or cosine factor), let’s *generate samples that matter*.

As a first step, we’ll get our perfect reflections back, and then see how we can generalize this idea. To do this, we will split our shading logic up into diffuse and specular. For each sample, we will randomly choose one or the other (depending on the ratio of \(k_d\) and \(k_s\)). For diffuse, we’ll stick to the uniform sampling, but for specular, we will explicitly reflect the ray in the single direction that matters. Since fewer samples will now be spent on each reflection type, we need to increase the contribution of the samples accordingly to end up with the same net amount, like so:

    // Calculate chances of diffuse and specular reflection
    hit.albedo = min(1.0f - hit.specular, hit.albedo);
    float specChance = energy(hit.specular);
    float diffChance = energy(hit.albedo);
    float sum = specChance + diffChance;
    specChance /= sum;
    diffChance /= sum;

    // Roulette-select the ray's path
    float roulette = rand();
    if (roulette < specChance)
    {
        // Specular reflection
        ray.origin = hit.position + hit.normal * 0.001f;
        ray.direction = reflect(ray.direction, hit.normal);
        ray.energy *= (1.0f / specChance) * hit.specular * sdot(hit.normal, ray.direction);
    }
    else
    {
        // Diffuse reflection
        ray.origin = hit.position + hit.normal * 0.001f;
        ray.direction = SampleHemisphere(hit.normal);
        ray.energy *= (1.0f / diffChance) * 2 * hit.albedo * sdot(hit.normal, ray.direction);
    }

    return 0.0f;

The `energy` function is a little helper that averages the color channels:

float energy(float3 color) { return dot(color, 1.0f / 3.0f); }
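If you want to convince yourself that dividing by the selection chance keeps the result the same on average, here is a tiny standalone Python sketch (run outside Unity, not part of the shader). The function names and the two scalar contributions `spec` and `diff` are made up for illustration:

```python
import random

def roulette_estimate(spec, diff, spec_chance, rng):
    """Pick one of two contributions at random and reweight it by 1 / chance."""
    if rng.random() < spec_chance:
        return spec / spec_chance
    return diff / (1.0 - spec_chance)

def expected_value(spec, diff, spec_chance):
    """Expectation of roulette_estimate: each branch times its probability."""
    return (spec_chance * (spec / spec_chance)
            + (1.0 - spec_chance) * (diff / (1.0 - spec_chance)))

spec, diff, spec_chance = 0.2, 0.5, 0.25
rng = random.Random(1)
n = 20_000
average = sum(roulette_estimate(spec, diff, spec_chance, rng) for _ in range(n)) / n
print(expected_value(spec, diff, spec_chance))  # spec + diff, up to rounding
print(average)                                  # close to spec + diff
```

Each branch is chosen less often, but counts for more when it is chosen, so the average still lands on the sum of both contributions.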

Here it is, the prettified variant of the Whitted ray tracer we built last time, but now with real diffuse shading (read ‘soft shadows, ambient occlusion, diffuse global illumination’):

Let’s look at the basic Monte Carlo formula again:

$$F_N \approx \frac{1}{N} \sum_{n=0}^{N}{\frac{f(x_n)}{p(x_n)}}$$

As you can see, we divide each sample’s contribution by the probability that this particular sample is chosen. So far, we used uniform sampling over the hemisphere and therefore had a constant \(p(\omega)=\frac{1}{2 \pi}\). As we saw earlier, this is far from optimal, for example for the Phong BRDF, which is large only in a very narrow set of directions.

Imagine we could find a probability distribution that exactly matches the integrand: \(p(x) = f(x)\). This is what would happen:

$$F_N \approx \frac{1}{N} \sum_{n=0}^{N}{1}$$

Now there are no samples that get very little contribution. Instead, those samples will inherently be selected with a lower probability. This will drastically reduce the variance of the result and make rendering converge faster.

In practice, it is unrealistic to find such a perfect distribution, because some factors of the integrand (in our case BRDF × cosine × incoming light) are not known (most prominently the incoming light), but already distributing samples according to BRDF × cosine or even only BRDF will do us a lot of good. This is known as Importance Sampling.
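The effect is easy to try out in one dimension. The following Python sketch is a toy example, not the renderer: the integrand \(3x^2\) on \([0, 1]\) and all function names are assumptions chosen for illustration. It compares uniform sampling, a distribution that only roughly follows the integrand, and the 'perfect' one:

```python
import random

def f(x):
    """Integrand on [0, 1]; its exact integral is 1."""
    return 3.0 * x * x

def estimate(sample, pdf, n, rng):
    """Monte Carlo estimate: mean of f(x) / p(x), plus the variance of the samples."""
    values = []
    for _ in range(n):
        u = 1.0 - rng.random()          # uniform in (0, 1], avoids x = 0
        x = sample(u)
        values.append(f(x) / pdf(x))
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, variance

rng = random.Random(7)
n = 50_000
# p(x) = 1: uniform sampling
uniform = estimate(lambda u: u, lambda x: 1.0, n, rng)
# p(x) = 2x: roughly follows f, sampled by inverting the CDF x^2
linear = estimate(lambda u: u ** 0.5, lambda x: 2.0 * x, n, rng)
# p(x) = 3x^2 = f(x): the 'perfect' distribution, sampled by inverting the CDF x^3
perfect = estimate(lambda u: u ** (1.0 / 3.0), lambda x: 3.0 * x * x, n, rng)
print("uniform:", uniform)
print("linear: ", linear)
print("perfect:", perfect)
```

All three means end up near 1, but the variance shrinks the better \(p\) follows \(f\), and vanishes when the two match.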

For the following steps, we need to replace our uniform sample distribution by a cosine (power) distribution. Remember, instead of multiplying uniform samples with a cosine, *lowering* their contribution, we want to generate proportionally *fewer* samples.

This article by Thomas Poulet describes how this is done. We’ll add an `alpha` parameter to our `SampleHemisphere` function that determines the power of the cosine sampling: 0 for uniform, 1 for cosine, or above for higher Phong exponents. In code:

    float3 SampleHemisphere(float3 normal, float alpha)
    {
        // Sample the hemisphere, where alpha determines the kind of the sampling
        float cosTheta = pow(rand(), 1.0f / (alpha + 1.0f));
        float sinTheta = sqrt(1.0f - cosTheta * cosTheta);
        float phi = 2 * PI * rand();
        float3 tangentSpaceDir = float3(cos(phi) * sinTheta, sin(phi) * sinTheta, cosTheta);
        // Transform direction to world space
        return mul(tangentSpaceDir, GetTangentSpace(normal));
    }

Now the probability of each sample is \(p(\omega) = \frac{\alpha + 1}{2 \pi} \, (\vec \omega \cdot \vec n)^\alpha\). The beauty of this might not be immediately obvious, but it will unfold in a minute.
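As a quick sanity check, here is a standalone Python sketch (not shader code; the function names are mine). It only looks at the \(\cos \theta\) part of the sampling, checks numerically that the density integrates to 1 over the hemisphere, and compares the average sampled \(\cos \theta\) with the value \(\frac{\alpha + 1}{\alpha + 2}\) that follows from this distribution:

```python
import math
import random

def sample_cos_theta(alpha, rng):
    """Same mapping as in SampleHemisphere: cosTheta = rand()^(1 / (alpha + 1))."""
    return rng.random() ** (1.0 / (alpha + 1.0))

def pdf(cos_theta, alpha):
    """p(omega) = (alpha + 1) / (2 pi) * cos^alpha, a density over solid angle."""
    return (alpha + 1.0) / (2.0 * math.pi) * cos_theta ** alpha

def integrate_pdf(alpha, steps=10_000):
    """Midpoint rule over the hemisphere, using d(omega) = 2 pi d(cos theta)."""
    h = 1.0 / steps
    return sum(pdf((i + 0.5) * h, alpha) * 2.0 * math.pi * h for i in range(steps))

def mean_cos_theta(alpha, n=100_000, seed=3):
    """Average cos(theta) over n samples drawn with sample_cos_theta."""
    rng = random.Random(seed)
    return sum(sample_cos_theta(alpha, rng) for _ in range(n)) / n

for alpha in (0.0, 1.0, 15.0):
    print(alpha, integrate_pdf(alpha), mean_cos_theta(alpha), (alpha + 1) / (alpha + 2))
```

The higher `alpha` gets, the closer the average moves to 1, meaning the samples crowd around the axis they are generated around.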

First, we’ll improve our diffuse rendering. Our uniform distribution already fits the constant Lambert BRDF quite well, but we can do better by including the cosine factor. The probability distribution of the cosine sampling (where \(\alpha = 1\)) is \(\frac{(\vec \omega_{i} \cdot \vec n)}{\pi}\), which simplifies our diffuse Monte Carlo formula to:

$$L(x, \, \vec \omega_{o}) \approx \frac{1}{N} \sum_{n=0}^{N}{\color{BlueViolet}{k_d} \, L(x, \, \vec \omega_{i})}$$

    // Diffuse reflection
    ray.origin = hit.position + hit.normal * 0.001f;
    ray.direction = SampleHemisphere(hit.normal, 1.0f);
    ray.energy *= (1.0f / diffChance) * hit.albedo;
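In case you wonder where the factor 2 and the cosine went, here is the per-sample weight written out. This is a sketch that assumes the Lambert BRDF \(\frac{k_d}{\pi}\) and divides BRDF × cosine by the cosine sampling probability:

$$\frac{\frac{k_d}{\pi} \, (\vec \omega_{i} \cdot \vec n)}{\frac{(\vec \omega_{i} \cdot \vec n)}{\pi}} = k_d$$

With uniform sampling, the denominator was \(\frac{1}{2 \pi}\) instead, which leaves \(2 \, k_d \, (\vec \omega_{i} \cdot \vec n)\), exactly the expression used in the code before this change.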

This will give our diffuse shading a slight speedup. Let’s take care of the real culprit now.

For the Phong BRDF, the procedure is similar. This time, we have a product of two cosines: the regular cosine from the rendering equation (like in the diffuse case) times the BRDF’s own powered cosine. We will only take care of the latter.

Let’s insert the probability distribution from above into our Phong equation. A detailed derivation can be found in Lafortune and Willems: Using the Modified Phong Reflectance Model for Physically Based Rendering (1994):

$$L(x, \, \vec \omega_{o}) \approx \frac{1}{N} \sum_{n=0}^{N}{\color{brown}{k_s \, \frac{\alpha + 2}{\alpha + 1}} \, (\vec \omega_{i} \cdot \vec n) \, L(x, \, \vec \omega_{i})}$$

    // Specular reflection
    float alpha = 15.0f;
    ray.origin = hit.position + hit.normal * 0.001f;
    ray.direction = SampleHemisphere(reflect(ray.direction, hit.normal), alpha);
    float f = (alpha + 2) / (alpha + 1);
    ray.energy *= (1.0f / specChance) * hit.specular * sdot(hit.normal, ray.direction, f);
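The weight in the formula can be retraced in one line. This sketch assumes the normalized Phong BRDF \(k_s \, \frac{\alpha + 2}{2 \pi} \cos^\alpha \gamma\) from the Lafortune and Willems paper, where \(\gamma\) is the angle between the sampled direction and the perfect reflection (my notation), and samples drawn around that reflection with the power cosine distribution from above:

$$\frac{k_s \, \frac{\alpha + 2}{2 \pi} \, \cos^\alpha \gamma \; (\vec \omega_{i} \cdot \vec n)}{\frac{\alpha + 1}{2 \pi} \, \cos^\alpha \gamma} = k_s \, \frac{\alpha + 2}{\alpha + 1} \, (\vec \omega_{i} \cdot \vec n)$$

The powered cosine cancels completely, so no sample is ever weighted by a huge or tiny BRDF value any more. Only the regular cosine from the rendering equation remains.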

These changes are enough to fix any problems with high Phong exponents and make our rendering converge in a way more reasonable time.

Finally, let’s extend our scene generation so we get varying values for smoothness and emission of our spheres! In C#, add a `public float smoothness` and `public Vector3 emission` to `struct Sphere`. Since we changed the size of the struct, we need to adjust the stride when creating the Compute Buffer (4 × number of floats, remember?). Make the `SetUpScene` function put in some values for smoothness and emission.

Back in the shader, add both variables to `struct Sphere` and `struct RayHit` and initialize them in `CreateRayHit`. Last but not least, set both values in `IntersectGroundPlane` (hard-coded, put in anything you want) and `IntersectSphere` (taking the values from the `Sphere`).

I’d like to use smoothness values the way I’m used to from the Unity Standard shader, which is different than the kinda arbitrary Phong exponent. Here’s a conversion that works reasonably well, to be used in the `Shade` function:

float SmoothnessToPhongAlpha(float s) { return pow(1000.0f, s * s); }

float alpha = SmoothnessToPhongAlpha(hit.smoothness);
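To get a feeling for the mapping, here is the same formula as a standalone Python sketch (a port for experimenting, not shader code) that prints the exponent for a few smoothness values:

```python
def smoothness_to_phong_alpha(s):
    """Python port of SmoothnessToPhongAlpha: 1000^(s^2)."""
    return 1000.0 ** (s * s)

for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"smoothness {s:.2f} -> alpha {smoothness_to_phong_alpha(s):.2f}")
```

The exponent runs from 1 at a smoothness of 0 to 1000 at a smoothness of 1, with 0.5 giving roughly 5.6, so most of the range is reserved for the glossy end.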

Using emission is simply done by returning the value in `Shade`:

return hit.emission;

Take a deep breath, relax, and wait for your image to clear into a soothing sight like this:

**Congratulations!** You have made it through this math-ridden forest. You implemented a path tracer capable of diffuse and specular shading, and you learned about Importance Sampling, applying the concept right away to make rendering converge in a matter of minutes rather than hours or days.

This article has been quite a leap from the last one in terms of complexity, but also in terms of quality of the results. Working your way through the math takes time, but is worth it, since it will drastically deepen your understanding of what’s going on and will allow you to extend the algorithm without breaking physical plausibility.

Thank you for following along! In the third part, we will leave the thicket of sampling and shading behind us (for now…), and go back to civilization for a rendezvous with Gentlemen Möller and Trumbore. They have a word or two to say about triangles.

In this article, we’re going to write a very simple ray tracer from scratch using compute shaders in Unity. The languages we will use are C# for the scripts and HLSL for the shaders. All code is also hosted on Bitbucket.

Follow along and you will end up with a rendering like this:

I would like to start by quickly reviewing the basic ray tracing theory. If you are familiar, please feel free to skip ahead.

Let’s think about how photographs emerge in the real world – highly simplified, but for the purpose of rendering this should be fine. It all starts with a light source emitting photons. A photon flies in a straight line until it hits a surface, at which point it is reflected or refracted and continues its journey minus some energy that has been absorbed by the surface. Eventually, some photons will hit the camera’s image sensor which in turn produces the resulting image. Ray tracing basically simulates these steps to create photorealistic images.

In practice, only a tiny fraction of the photons emitted by a light source will ever hit the camera. Therefore, applying the principle of Helmholtz reciprocity, calculations are commonly reversed: Instead of shooting photons from light sources, rays are shot from the camera into the scene, reflected or refracted and eventually hit a light source.

The ray tracer we are going to build is based on a 1980 paper by Turner Whitted. We will be able to simulate hard shadows and perfect reflections. It will also serve as a basis for more advanced effects like refraction, diffuse global illumination, glossy reflections and soft shadows.

Let’s start by creating a new Unity project. Create a C# script `RayTracingMaster.cs` and a compute shader `RayTracingShader.compute`. Fill the C# script with some basic code:

    using UnityEngine;
    public class RayTracingMaster : MonoBehaviour
    {
        public ComputeShader RayTracingShader;
        private RenderTexture _target;
        private void OnRenderImage(RenderTexture source, RenderTexture destination)
        {
            Render(destination);
        }
        private void Render(RenderTexture destination)
        {
            // Make sure we have a current render target
            InitRenderTexture();
            // Set the target and dispatch the compute shader
            RayTracingShader.SetTexture(0, "Result", _target);
            int threadGroupsX = Mathf.CeilToInt(Screen.width / 8.0f);
            int threadGroupsY = Mathf.CeilToInt(Screen.height / 8.0f);
            RayTracingShader.Dispatch(0, threadGroupsX, threadGroupsY, 1);
            // Blit the result texture to the screen
            Graphics.Blit(_target, destination);
        }
        private void InitRenderTexture()
        {
            if (_target == null || _target.width != Screen.width || _target.height != Screen.height)
            {
                // Release render texture if we already have one
                if (_target != null)
                    _target.Release();
                // Get a render target for Ray Tracing
                _target = new RenderTexture(Screen.width, Screen.height, 0,
                    RenderTextureFormat.ARGBFloat, RenderTextureReadWrite.Linear);
                _target.enableRandomWrite = true;
                _target.Create();
            }
        }
    }

The `OnRenderImage` function is automatically called by Unity whenever the camera has finished rendering. To render, we first create a render target of appropriate dimensions and tell the compute shader about it. The 0 is the index of the compute shader’s kernel function – we have only one.

Next, we *dispatch* the shader. This means that we are telling the GPU to get busy with a number of thread groups executing our shader code. Each thread group consists of a number of threads which is set in the shader itself. The size and number of thread groups can be specified in up to three dimensions, which makes it easy to apply compute shaders to problems of any dimensionality. In our case, we want to spawn one thread per pixel of the render target. The default thread group size as defined in the Unity compute shader template is `[numthreads(8,8,1)]`, so we’ll stick to that and spawn one thread group per 8×8 pixels. Finally, we write our result to the screen using `Graphics.Blit`.

Let’s give it a try. Add the `RayTracingMaster` component to the scene’s camera (this is important for `OnRenderImage` to be called), assign your compute shader and enter play mode. You should see the output of Unity’s compute shader template in the form of a beautiful triangle fractal.

Now that we can display things on screen, let’s generate some camera rays. Since Unity gives us a fully working camera, we will just use the calculated matrices to do this. Start by setting the matrices on the shader. Add the following lines to the script `RayTracingMaster.cs`:

    private Camera _camera;
    private void Awake()
    {
        _camera = GetComponent<Camera>();
    }
    private void SetShaderParameters()
    {
        RayTracingShader.SetMatrix("_CameraToWorld", _camera.cameraToWorldMatrix);
        RayTracingShader.SetMatrix("_CameraInverseProjection", _camera.projectionMatrix.inverse);
    }

Call `SetShaderParameters` from `OnRenderImage` before rendering.

In the shader, we define the matrices, a `Ray` structure and a function for construction. Please note that in HLSL, unlike in C#, a function or variable declaration needs to appear *before* it is used. For each screen pixel’s center, we calculate the origin and direction of the ray, and output the latter as color. Here is the full shader:

    #pragma kernel CSMain
    RWTexture2D<float4> Result;
    float4x4 _CameraToWorld;
    float4x4 _CameraInverseProjection;
    struct Ray
    {
        float3 origin;
        float3 direction;
    };
    Ray CreateRay(float3 origin, float3 direction)
    {
        Ray ray;
        ray.origin = origin;
        ray.direction = direction;
        return ray;
    }
    Ray CreateCameraRay(float2 uv)
    {
        // Transform the camera origin to world space
        float3 origin = mul(_CameraToWorld, float4(0.0f, 0.0f, 0.0f, 1.0f)).xyz;
        // Invert the perspective projection of the view-space position
        float3 direction = mul(_CameraInverseProjection, float4(uv, 0.0f, 1.0f)).xyz;
        // Transform the direction from camera to world space and normalize
        direction = mul(_CameraToWorld, float4(direction, 0.0f)).xyz;
        direction = normalize(direction);
        return CreateRay(origin, direction);
    }
    [numthreads(8,8,1)]
    void CSMain (uint3 id : SV_DispatchThreadID)
    {
        // Get the dimensions of the RenderTexture
        uint width, height;
        Result.GetDimensions(width, height);
        // Transform pixel to [-1,1] range
        float2 uv = float2((id.xy + float2(0.5f, 0.5f)) / float2(width, height) * 2.0f - 1.0f);
        // Get a ray for the UVs
        Ray ray = CreateCameraRay(uv);
        // Write some colors
        Result[id.xy] = float4(ray.direction * 0.5f + 0.5f, 1.0f);
    }
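The pixel-to-\([-1,1]\) transform in `CSMain` is easy to check by hand. Here it is as a standalone Python sketch (run outside Unity; the function name and the 8×8 target size are just assumptions for the example):

```python
def pixel_to_ndc(px, py, width, height, offset=(0.5, 0.5)):
    """Map a pixel index to the [-1, 1] range, the same way CSMain does."""
    u = (px + offset[0]) / width * 2.0 - 1.0
    v = (py + offset[1]) / height * 2.0 - 1.0
    return u, v

print(pixel_to_ndc(0, 0, 8, 8))   # center of the first pixel: (-0.875, -0.875)
print(pixel_to_ndc(7, 7, 8, 8))   # center of the last pixel:  (0.875, 0.875)
```

Thanks to the half-pixel offset, the centers sit symmetrically inside the range and never touch -1 or 1 themselves.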

Try rotating the camera in the inspector. You should see that the ‘colorful sky’ behaves accordingly.

Now let’s replace the colors with an actual skybox. I am using HDRI Haven’s Cape Hill in my examples, but you can of course use any one that you like. Download and drop it into Unity. In the import settings, remember to increase the maximum resolution if you downloaded a higher resolution than 2048. Now add a `public Texture SkyboxTexture` to the script, assign your texture in the inspector and set it on the shader by adding this line to the `SetShaderParameters` function:

RayTracingShader.SetTexture(0, "_SkyboxTexture", SkyboxTexture);

In the shader, define the texture and a corresponding sampler, and a π constant that we’ll use in a minute:

    Texture2D<float4> _SkyboxTexture;
    SamplerState sampler_SkyboxTexture;
    static const float PI = 3.14159265f;

Now instead of writing the direction as color, we’ll sample the skybox. To do this, we transform our Cartesian direction vector to spherical coordinates and map this to texture coordinates. Replace the last bit of `CSMain` with this:

    // Sample the skybox and write it
    float theta = acos(ray.direction.y) / -PI;
    float phi = atan2(ray.direction.x, -ray.direction.z) / -PI * 0.5f;
    Result[id.xy] = _SkyboxTexture.SampleLevel(sampler_SkyboxTexture, float2(phi, theta), 0);
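To see which texture coordinates a given direction ends up with, here is a standalone Python port of the two lines (for experimenting outside Unity; the function name is mine):

```python
import math

def direction_to_skybox_uv(x, y, z):
    """Python port of the theta / phi computation for a unit-length direction."""
    theta = math.acos(y) / -math.pi
    phi = math.atan2(x, -z) / -math.pi * 0.5
    return phi, theta

print(direction_to_skybox_uv(0.0, 0.0, -1.0))  # looking along -z, at the horizon
print(direction_to_skybox_uv(1.0, 0.0, 0.0))   # looking along +x, at the horizon
print(direction_to_skybox_uv(0.0, -1.0, 0.0))  # looking straight down
```

Note that the coordinates come out negative (`theta` between -1 and 0, `phi` between -0.5 and 0.5), so this mapping presumably relies on the texture sampler wrapping around instead of clamping.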

So far so good. Now we’re getting to the actual tracing of our rays. Mathematically, we will calculate the intersection between our ray and our scene geometry, and store the hit parameters (position, normal and distance along the ray). If our ray hits multiple objects, we will pick the closest one. Let’s define the struct `RayHit` in the shader:

    struct RayHit
    {
        float3 position;
        float distance;
        float3 normal;
    };
    RayHit CreateRayHit()
    {
        RayHit hit;
        hit.position = float3(0.0f, 0.0f, 0.0f);
        hit.distance = 1.#INF;
        hit.normal = float3(0.0f, 0.0f, 0.0f);
        return hit;
    }

Commonly, scenes are comprised of many triangles, but we will start simple: intersecting an infinite ground plane and a handful of spheres!

Intersecting a line with an infinite plane at \(y=0\) is pretty simple. We only accept hits in positive ray direction though, and reject any hit that is not closer than a potential previous hit.

By default, parameters in HLSL are passed by value and not by reference, so we would only be able to work on a copy and not propagate changes to the calling function. We pass `RayHit bestHit` with the `inout` qualifier to be able to modify the original struct. Here’s the shader code:

    void IntersectGroundPlane(Ray ray, inout RayHit bestHit)
    {
        // Calculate distance along the ray where the ground plane is intersected
        float t = -ray.origin.y / ray.direction.y;
        if (t > 0 && t < bestHit.distance)
        {
            bestHit.distance = t;
            bestHit.position = ray.origin + t * ray.direction;
            bestHit.normal = float3(0.0f, 1.0f, 0.0f);
        }
    }

To use it, let’s add a framework `Trace` function (we will extend it in a minute):

    RayHit Trace(Ray ray)
    {
        RayHit bestHit = CreateRayHit();
        IntersectGroundPlane(ray, bestHit);
        return bestHit;
    }

Furthermore, we need a basic shading function. Again, we pass the `Ray` with `inout` – we will modify it later on when we talk about reflection. For debug purposes, we return the normal if geometry was hit, and fall back to our skybox sampling code otherwise:

    float3 Shade(inout Ray ray, RayHit hit)
    {
        if (hit.distance < 1.#INF)
        {
            // Return the normal
            return hit.normal * 0.5f + 0.5f;
        }
        else
        {
            // Sample the skybox and write it
            float theta = acos(ray.direction.y) / -PI;
            float phi = atan2(ray.direction.x, -ray.direction.z) / -PI * 0.5f;
            return _SkyboxTexture.SampleLevel(sampler_SkyboxTexture, float2(phi, theta), 0).xyz;
        }
    }

We will use both functions down in `CSMain`. Remove the skybox sampling code if you haven’t already, and add the following lines to trace the ray and shade the hit:

    // Trace and shade
    RayHit hit = Trace(ray);
    float3 result = Shade(ray, hit);
    Result[id.xy] = float4(result, 1);

A plane is not the most exciting thing in the world, so let’s add a sphere right away. The math for a line-sphere intersection can be found on Wikipedia. This time there can be two ray hit candidates: the entry point `p1 - p2`, and the exit point `p1 + p2`. We will check the entry point first, and only use the exit point if the other one is not valid. A sphere in our case is defined as a `float4` comprised of position (xyz) and radius (w). Here’s the code:

    void IntersectSphere(Ray ray, inout RayHit bestHit, float4 sphere)
    {
        // Calculate distance along the ray where the sphere is intersected
        float3 d = ray.origin - sphere.xyz;
        float p1 = -dot(ray.direction, d);
        float p2sqr = p1 * p1 - dot(d, d) + sphere.w * sphere.w;
        if (p2sqr < 0)
            return;
        float p2 = sqrt(p2sqr);
        float t = p1 - p2 > 0 ? p1 - p2 : p1 + p2;
        if (t > 0 && t < bestHit.distance)
        {
            bestHit.distance = t;
            bestHit.position = ray.origin + t * ray.direction;
            bestHit.normal = normalize(bestHit.position - sphere.xyz);
        }
    }
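If you would like to test the plane and sphere math with concrete numbers before running it on the GPU, here is a standalone Python sketch of both intersections. It is a port for experimenting: it returns the distance (or `None`) instead of filling a `RayHit`, and it handles rays parallel to the plane explicitly, because Python would raise a `ZeroDivisionError` there:

```python
import math

def intersect_ground_plane(origin, direction):
    """Distance t along the ray to the plane y = 0, or None without a hit in front."""
    if direction[1] == 0.0:
        return None                              # ray runs parallel to the plane
    t = -origin[1] / direction[1]
    return t if t > 0.0 else None

def intersect_sphere(origin, direction, center, radius):
    """Distance t to the nearest sphere hit in front of the origin, or None."""
    d = [o - c for o, c in zip(origin, center)]
    p1 = -sum(a * b for a, b in zip(direction, d))
    p2sqr = p1 * p1 - sum(a * a for a in d) + radius * radius
    if p2sqr < 0.0:
        return None                              # the ray misses the sphere
    p2 = math.sqrt(p2sqr)
    t = p1 - p2 if p1 - p2 > 0.0 else p1 + p2    # entry point first, else exit point
    return t if t > 0.0 else None

center, radius = (0.0, 3.0, 0.0), 1.0
print(intersect_ground_plane((0.0, 1.0, 0.0), (0.0, -1.0, 0.0)))                # 1.0
print(intersect_sphere((0.0, 3.0, -5.0), (0.0, 0.0, 1.0), center, radius))  # 4.0, entry point
print(intersect_sphere((0.0, 3.0, 0.0), (0.0, 0.0, 1.0), center, radius))   # 1.0, exit point
print(intersect_sphere((0.0, 3.0, 5.0), (0.0, 0.0, 1.0), center, radius))   # None, sphere is behind
```

The third call starts inside the sphere, which is the case where the entry point is not valid and the exit point is used instead.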

To add a sphere, just call this function from `Trace`, for example:

    // Add a floating unit sphere
    IntersectSphere(ray, bestHit, float4(0, 3.0f, 0, 1.0f));

There is one problem with the current approach: We’re only testing the center of each pixel, so you can see nasty aliasing effects (the dreaded ‘jaggies’) in the result. To circumvent this, we are going to trace not one but multiple rays per pixel. Each ray gets a random offset inside the pixel’s region. To keep an acceptable frame rate, we’re doing progressive sampling, meaning that we will trace one ray per pixel each frame and average the result over time if the camera didn’t move. Every time the camera moves (or any other parameter like field of view, scene geometry or scene lighting is changed), we need to start all over.

Let’s create a very simple image effect shader that we will use for adding up several results. Name your shader `AddShader` and make sure the first line reads `Shader "Hidden/AddShader"`. After `Cull Off ZWrite Off ZTest Always`, add `Blend SrcAlpha OneMinusSrcAlpha` to enable alpha blending. Next, replace the default `frag` function with the following lines:

    float _Sample;
    float4 frag (v2f i) : SV_Target
    {
        return float4(tex2D(_MainTex, i.uv).rgb, 1.0f / (_Sample + 1.0f));
    }

This shader will now just draw the first sample with an opacity of \(1\), the next one with \(\frac{1}{2}\), then \(\frac{1}{3}\) and so on, averaging all samples with equal contribution.
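That blending with these opacities really produces a plain average can be checked with a few lines of standalone Python (a sketch with made-up sample values, using the usual alpha blend formula of source × alpha + destination × (1 − alpha)):

```python
def blend(accumulated, sample, sample_index):
    """Alpha-blend a new sample over the accumulated value with opacity 1 / (index + 1)."""
    alpha = 1.0 / (sample_index + 1.0)
    return sample * alpha + accumulated * (1.0 - alpha)

samples = [0.2, 0.9, 0.4, 0.7]
accumulated = 0.0
for index, sample in enumerate(samples):
    accumulated = blend(accumulated, sample, index)
print(accumulated, sum(samples) / len(samples))  # both are the mean of the samples
```

The first sample fully replaces whatever was in the buffer, and every later one is mixed in so that all samples seen so far carry the same weight.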

In the script, we still need to count the samples and make use of the newly created image effect shader:

    private uint _currentSample = 0;
    private Material _addMaterial;

You should also reset `_currentSample = 0` when the render target is rebuilt in `InitRenderTexture`, and add an `Update` function that detects camera transform changes:

    private void Update()
    {
        if (transform.hasChanged)
        {
            _currentSample = 0;
            transform.hasChanged = false;
        }
    }

To use our custom shader, we need to initialize a material, tell it about the current sample and use it for blitting to the screen in the `Render` function:

    // Blit the result texture to the screen
    if (_addMaterial == null)
        _addMaterial = new Material(Shader.Find("Hidden/AddShader"));
    _addMaterial.SetFloat("_Sample", _currentSample);
    Graphics.Blit(_target, destination, _addMaterial);
    _currentSample++;

So we’re doing progressive sampling, but we’re still always using the pixel center. In the compute shader, define a `float2 _PixelOffset` and use that in `CSMain` instead of the hard `float2(0.5f, 0.5f)` offset. Back in the script, create a random offset by adding this line to `SetShaderParameters`:

RayTracingShader.SetVector("_PixelOffset", new Vector2(Random.value, Random.value));

If you move the camera, you should see that the image still shows aliasing, but it will quickly vanish if you stand still for a couple of frames. Here is a side by side comparison of the good we’ve done:

The groundwork for our ray tracer is now done, so we can start dealing with the fancy things that actually set ray tracing apart from other rendering techniques. Perfect reflections are the first item on our list. The idea is simple: Whenever we hit the surface, we reflect the ray according to the law of reflection that you will probably remember from school (incident angle = angle of reflection), reduce its energy, and repeat until we hit the sky, run out of energy, or reach a fixed maximum number of bounces.

In the shader, add a `float3 energy` to the ray and initialize it in the `CreateRay` function as `ray.energy = float3(1.0f, 1.0f, 1.0f)`. The ray starts with full throughput on all color channels, and will diminish with each reflection.

Now we’re going to execute a maximum number of 8 traces (the original ray plus 7 bounces), and add up the results of the `Shade` function calls, but multiplied with the ray’s energy. As an example, imagine a ray that has been reflected once and lost \(\frac{3}{4}\) of its energy. Now it travels on and hits the sky, so we only transfer \(\frac{1}{4}\) of the energy of the sky hit to the pixel. Adjust your `CSMain` like this, replacing the previous `Trace` and `Shade` calls:

    // Trace and shade
    float3 result = float3(0, 0, 0);
    for (int i = 0; i < 8; i++)
    {
        RayHit hit = Trace(ray);
        result += ray.energy * Shade(ray, hit);
        if (!any(ray.energy))
            break;
    }

Our `Shade` function is now also responsible for updating the energy and generating the reflected ray, so here’s where the `inout` becomes important. To update the energy, we perform an element-wise multiplication with the specular color of the surface. For example, gold has a specular reflectivity of roughly `float3(1.0f, 0.78f, 0.34f)`, so it will reflect 100% of red light, 78% of green light, but only 34% of blue light, giving the reflection its distinct golden tint. Be careful not to go over 1 with any of those values, since you would create energy out of nowhere. Also, the reflectivity is often lower than you would think. See e.g. slide 64 in Physics and Math of Shading by Naty Hoffman for some values.

HLSL has an inbuilt function to reflect a ray using a given normal, which is great. Due to floating point inaccuracy, it can happen that a reflected ray is blocked by the surface it is reflected on. To prevent this self-occlusion we will offset the position just a bit along the normal direction. Here’s the new `Shade` function:

    float3 Shade(inout Ray ray, RayHit hit)
    {
        if (hit.distance < 1.#INF)
        {
            float3 specular = float3(0.6f, 0.6f, 0.6f);
            // Reflect the ray and multiply energy with specular reflection
            ray.origin = hit.position + hit.normal * 0.001f;
            ray.direction = reflect(ray.direction, hit.normal);
            ray.energy *= specular;
            // Return nothing
            return float3(0.0f, 0.0f, 0.0f);
        }
        else
        {
            // Erase the ray's energy - the sky doesn't reflect anything
            ray.energy = 0.0f;
            // Sample the skybox and write it
            float theta = acos(ray.direction.y) / -PI;
            float phi = atan2(ray.direction.x, -ray.direction.z) / -PI * 0.5f;
            return _SkyboxTexture.SampleLevel(sampler_SkyboxTexture, float2(phi, theta), 0).xyz;
        }
    }
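How the loop in `CSMain` and this `Shade` function work together can be mimicked with a few lines of standalone Python. This is a simplified sketch with a single color channel and a made-up function name; it follows one ray that bounces off a list of mirrors and then escapes into the sky:

```python
def trace_mirror_chain(speculars, sky_radiance):
    """Follow a ray that bounces off each surface in `speculars`, then hits the sky."""
    energy = 1.0
    result = 0.0
    for specular in speculars:
        result += energy * 0.0          # Shade returns black for a mirror hit
        energy *= specular              # each reflection costs some energy
    result += energy * sky_radiance     # the sky hit contributes what is left
    return result

print(trace_mirror_chain([0.25], 1.0))        # one bounce that loses 3/4 of the energy
print(trace_mirror_chain([0.6, 0.6, 0.6], 2.0))
```

The first call is the example from the text: after one reflection, only a quarter of the sky’s energy arrives at the pixel.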

You might want to increase the intensity of the skybox a little by multiplying it with a factor greater than one. Now play around with your `Trace` function. Put some spheres in a loop and you will end up with a result like this:

So we can trace mirror-like reflections, which allows us to render smooth metallic surfaces, but for non-metals we need one more thing: diffuse reflection. In brief, metals will only reflect incoming light tinted with their specular color, while non-metals allow light to refract into the surface, scatter and leave it in a random direction tinted with their albedo color. In case of an ideal Lambertian surface which is commonly assumed, the probability is proportional to the cosine of the angle between said direction and the surface normal. A more in-depth discussion of the topic can be found here.

To get started with diffuse lighting, let’s add a `public Light DirectionalLight` to our `RayTracingMaster` and assign the scene’s directional light. You might also want to detect the light’s transform changes in the `Update` function, just like we already do it for the camera’s transform. Now add the following lines to your `SetShaderParameters` function:

    Vector3 l = DirectionalLight.transform.forward;
    RayTracingShader.SetVector("_DirectionalLight", new Vector4(l.x, l.y, l.z, DirectionalLight.intensity));

Back in the shader, define `float4 _DirectionalLight`. In the `Shade` function, define the albedo color right below the specular color:

float3 albedo = float3(0.8f, 0.8f, 0.8f);

Replace the previously black return with a simple diffuse shading:

    // Return a diffuse-shaded color
    return saturate(dot(hit.normal, _DirectionalLight.xyz) * -1) * _DirectionalLight.w * albedo;

Remember that the dot product is defined as \(a \cdot b = ||a||\ ||b|| \cos \theta\). Since both our vectors (the normal and the light direction) are of unit length, the dot product is exactly what we are looking for: the cosine of the angle. The normal and the light direction point in opposite directions, so for head-on lighting the dot product returns -1 instead of 1. We need to flip the sign to make up for this. Finally, we saturate this value (i.e. clamp it to \([0,1]\) range) to prevent negative energy.
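Here is the same term as a standalone Python sketch with a few concrete light directions (single color channel, function name made up for illustration):

```python
import math

def diffuse_term(normal, light_direction, intensity, albedo):
    """saturate(dot(N, L) * -1) * intensity * albedo, for unit-length vectors."""
    cos_angle = -sum(n * l for n, l in zip(normal, light_direction))
    return max(0.0, min(1.0, cos_angle)) * intensity * albedo

up = (0.0, 1.0, 0.0)
print(diffuse_term(up, (0.0, -1.0, 0.0), 1.0, 0.8))                  # head-on light
print(diffuse_term(up, (math.sqrt(3.0) / 2.0, -0.5, 0.0), 1.0, 0.8))  # light tilted by 60 degrees
print(diffuse_term(up, (0.0, 1.0, 0.0), 1.0, 0.8))                   # light from below
```

Head-on light yields the full albedo, the tilted light half of it, and light arriving from below the surface is clamped to zero.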

For the directional light to cast shadows, we will trace a shadow ray. It starts at the surface position in question (again with a very small displacement to avoid self-shadowing), and points in the direction the light comes from. If anything blocks the way to infinity, we won’t use any diffuse light. Add these lines above the diffuse return statement:

    // Shadow test ray
    Ray shadowRay = CreateRay(hit.position + hit.normal * 0.001f, -1 * _DirectionalLight.xyz);
    RayHit shadowHit = Trace(shadowRay);
    if (shadowHit.distance != 1.#INF)
    {
        return float3(0.0f, 0.0f, 0.0f);
    }

Now we can trace some glossy plastic spheres with hard shadows! Setting 0.04 for specular and 0.8 for albedo yields the following image:

As today’s crescendo, let’s create some more complex and colorful scenes! Instead of hard-coding everything in the shader, we will define the scene in C# for more flexibility.

First we are going to extend the `RayHit` structure in the shader. Instead of globally defining the material properties in the `Shade` function, we will define them per object and store them in the `RayHit`. Add `float3 albedo` and `float3 specular` to the struct, and initialize them to `float3(0.0f, 0.0f, 0.0f)` in `CreateRayHit`. Also adjust the `Shade` function to use these values from `hit` instead of the hard-coded ones.

To establish a common understanding of what a sphere is on the CPU and the GPU, define a struct `Sphere` both in your shader and in the C# script. On the shader side, it looks like this:

    struct Sphere
    {
        float3 position;
        float radius;
        float3 albedo;
        float3 specular;
    };

Mirror this structure in your C# script.

In the shader, we need to make the `IntersectSphere` function work with our custom struct instead of the `float4`. This is simple to do:

    void IntersectSphere(Ray ray, inout RayHit bestHit, Sphere sphere)
    {
        // Calculate distance along the ray where the sphere is intersected
        float3 d = ray.origin - sphere.position;
        float p1 = -dot(ray.direction, d);
        float p2sqr = p1 * p1 - dot(d, d) + sphere.radius * sphere.radius;
        if (p2sqr < 0)
            return;
        float p2 = sqrt(p2sqr);
        float t = p1 - p2 > 0 ? p1 - p2 : p1 + p2;
        if (t > 0 && t < bestHit.distance)
        {
            bestHit.distance = t;
            bestHit.position = ray.origin + t * ray.direction;
            bestHit.normal = normalize(bestHit.position - sphere.position);
            bestHit.albedo = sphere.albedo;
            bestHit.specular = sphere.specular;
        }
    }

Also set `bestHit.albedo` and `bestHit.specular` in the `IntersectGroundPlane` function to adjust its material.

Next, define `StructuredBuffer<Sphere> _Spheres`. This is the place where the CPU will store all spheres that comprise the scene. Remove all hardcoded spheres from your `Trace` function and add the following lines:

    // Trace spheres
    uint numSpheres, stride;
    _Spheres.GetDimensions(numSpheres, stride);
    for (uint i = 0; i < numSpheres; i++)
        IntersectSphere(ray, bestHit, _Spheres[i]);

Now we will fill the scene with some life. Back in C#, let’s add some public parameters to control sphere placement and the actual compute buffer:

    public Vector2 SphereRadius = new Vector2(3.0f, 8.0f);
    public uint SpheresMax = 100;
    public float SpherePlacementRadius = 100.0f;
    private ComputeBuffer _sphereBuffer;

Set up the scene in `OnEnable`, and release the buffer in `OnDisable`. This way, a random scene will be generated every time you enable the component. The `SetUpScene` function will try to position spheres in a certain radius, and reject those that would intersect spheres already in existence. Half of the spheres are metallic (black albedo, colored specular), the other half is non-metallic (colored albedo, 4% specular):

    // Requires 'using System.Collections.Generic;' at the top of the script for List<T>
    private void OnEnable()
    {
        _currentSample = 0;
        SetUpScene();
    }
    private void OnDisable()
    {
        if (_sphereBuffer != null)
            _sphereBuffer.Release();
    }
    private void SetUpScene()
    {
        List<Sphere> spheres = new List<Sphere>();
        // Add a number of random spheres
        for (int i = 0; i < SpheresMax; i++)
        {
            Sphere sphere = new Sphere();
            // Radius and position
            sphere.radius = SphereRadius.x + Random.value * (SphereRadius.y - SphereRadius.x);
            Vector2 randomPos = Random.insideUnitCircle * SpherePlacementRadius;
            sphere.position = new Vector3(randomPos.x, sphere.radius, randomPos.y);
            // Reject spheres that are intersecting others
            foreach (Sphere other in spheres)
            {
                float minDist = sphere.radius + other.radius;
                if (Vector3.SqrMagnitude(sphere.position - other.position) < minDist * minDist)
                    goto SkipSphere;
            }
            // Albedo and specular color
            Color color = Random.ColorHSV();
            bool metal = Random.value < 0.5f;
            sphere.albedo = metal ? Vector3.zero : new Vector3(color.r, color.g, color.b);
            sphere.specular = metal ? new Vector3(color.r, color.g, color.b) : Vector3.one * 0.04f;
            // Add the sphere to the list
            spheres.Add(sphere);
        SkipSphere:
            continue;
        }
        // Assign to compute buffer
        _sphereBuffer = new ComputeBuffer(spheres.Count, 40);
        _sphereBuffer.SetData(spheres);
    }

The magic number 40 in `new ComputeBuffer(spheres.Count, 40)` is the stride of our buffer, i.e. the byte size of one sphere in memory. To calculate it, count the number of floats in the `Sphere` struct and multiply it by float’s byte size (4 bytes). Finally, set the buffer on the shader in the `SetShaderParameters` function:

RayTracingShader.SetBuffer(0, "_Spheres", _sphereBuffer);
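The stride arithmetic is simple enough to do in your head, but here it is spelled out as a standalone Python sketch (outside Unity; the dictionary and function name are just for illustration). It counts the floats per field of the `Sphere` struct and assumes the struct is tightly packed 32-bit floats:

```python
import struct

FLOAT_SIZE = struct.calcsize("f")   # 4 bytes for a 32-bit float

# Number of floats per field of the Sphere struct
sphere_fields = {"position": 3, "radius": 1, "albedo": 3, "specular": 3}

def stride(fields):
    """Byte size of one tightly packed element made of 32-bit floats."""
    return sum(fields.values()) * FLOAT_SIZE

print(stride(sphere_fields))  # 40

# Adding a float smoothness and a Vector3 emission grows the struct to 14 floats
extended_fields = dict(sphere_fields, smoothness=1, emission=3)
print(stride(extended_fields))  # 56
```

Whenever a field is added to the struct, the stride passed to `new ComputeBuffer` has to grow with it, otherwise the CPU and GPU disagree about where one sphere ends and the next begins.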

Congratulations, you made it! You now have a working GPU-powered Whitted ray tracer, able to render a plane and lots of spheres with mirror-like reflections, simple diffuse lighting and hard shadows. The full source code can be found on Bitbucket. Play around with the sphere placement parameters and enjoy the beautiful view:

We achieved quite something today, but there’s still a lot of ground to cover: diffuse global illumination, glossy reflections, soft shadows, non-opaque materials with refraction, and obviously using triangle meshes instead of spheres. In the next article, we will extend our Whitted ray tracer into a path tracer to conquer a number of the mentioned phenomena.

Thank you for taking the time to work through this article! Stay tuned, the follow-up is in the works.
