DirectX Raytracing: real-time ray tracing. The backward tracing method


This is the technology that should finally bring ray tracing to games. Ray tracing is not a new method: in the context of games it was being discussed 20 years ago, and the term itself appeared in computer graphics back in 1982, yet the method has still not made it into games. So what is it, anyway?

Ray tracing is a 3D rendering technique that uses a principle similar to real physical processes: to render an object, the system traces the trajectory of a virtual ray from the screen to that object.

In reality, we do not see objects by themselves, but the light reflected from them. Exceptions are objects that themselves serve as light sources. The ray tracing method uses roughly the same principles as applied to a virtual environment.

The problem is that this method is extremely demanding on hardware resources. With conventional rendering methods, the color or transparency of an object's material is set up front, and reflections and shadows are emulated with shaders and other tricks. With ray tracing, these characteristics are determined during the interaction of the virtual rays with the object, just as in reality. This requires enormous GPU resources even for a few individual objects, let alone using ray tracing as the main rendering method in a game.

For example, to build an image with a resolution of 1024 x 768 pixels, 786,432 primary rays have to be traced. Each ray can be both reflected and refracted, which ultimately multiplies the number of traced rays several times over. And while with conventional rendering the necessary polygons just have to be drawn in time, with ray tracing each ray requires continuous mathematical calculation from the moment it is emitted.

It is for this reason that ray tracing has long been used only where it does not need to run in real time. Games need exactly that. Nvidia says there has never been a graphics card with enough performance for it. Now there is Volta, which is supposed to accelerate ray tracing in hardware, although there are no details about this yet. It is still unclear how actively game developers will be able to use ray tracing and what we, as players, will get as a result. But we can already get a rough idea, for example from the video below.

This is the much anticipated Metro Exodus. It will be the first AAA game to use RTX technology. But do you see anything unusual in the video? If you don't know where to look, most people simply won't realize that this video shows the technology we have been waiting for over the last couple of decades. In fact, Metro will use ray tracing only for some global illumination effects: more specifically, for ambient occlusion and for indirect lighting. Classical rasterization is not going anywhere.

Thus, we can conclude that although the introduction of ray tracing into games can be called a revolution, for now it is a very small and inconspicuous one. We will see ray-traced effects only very locally, in lighting or in reflections, including mirrors. Something more global will probably appear in five years, or even later.

I think it is also worth saying a few words about Nvidia RTX technology and Volta cards. As you know, only two products on sale use the GV100 GPU: the Tesla V100 accelerator and the closer-to-the-people but insanely expensive Titan V adapter. Mass-market video cards with Volta GPUs will undoubtedly reach the market in the near future, but they will remain in the minority for quite a long time, especially considering that current Pascal adapters are more than enough for most users, and that the mining fever, which can ruin any launch, should not be discounted either. So will only owners of new Nvidia cards be able to enjoy ray tracing in games in the next couple of years? Not at all. Yes, Nvidia talks about some hardware-accelerated ray tracing capabilities of the Volta GPU, although there are no details yet. But bringing to the consumer market a technology that is not available to the masses would be pointless. That is why we have Microsoft DirectX Raytracing (DXR), a set of new tools and methods for the DirectX 12 API. It is responsible for both hardware and purely software implementations of ray tracing in games. The latter option is exactly what is interesting for everyone who does not have a Volta video card.

However, this raises the question of optimization and performance. Here is an example. Many people know Nvidia HairWorks and AMD TressFX Hair, technologies that make it possible to render hair and fur more realistically. They are optimized for GeForce and Radeon adapters respectively, yet they work fine on "foreign" video cards; they just cost more fps there. The situation with ray tracing in games will be similar. The only question is how much fps will drop on non-Volta adapters when ray-traced effects are enabled. This, by the way, is another argument that ray tracing in games will remain in its infancy for the coming years.

As for AMD, the company also recently introduced its own ray tracing engine, Radeon Rays 2.0, which is based on the Vulkan API. No restrictions on GPU generation were mentioned.

Summing up, we can say that for the majority, ray tracing in games in the coming years will either be unavailable due to insufficiently powerful video cards, or will pass unnoticed simply because the user will not see or understand where exactly it is used in a particular scene of a particular game. Although, who knows, maybe everything will turn out completely differently.


I know, it is a little disappointing. Where are the reflections, the shadows, the good looks? We will get to all of that, because we have only just begun. But this is a good start: the spheres look like circles, which is better than if they looked like cats. The reason they don't look like spheres is that we are missing an important component of how people determine the shape of an object: the way it interacts with light.

Lighting

The first step to add "realism" to our scene rendering is lighting simulation. Lighting is an insanely complex topic, so I'll present a very simplified model that's good enough for our purposes. Some parts of this model are not even close to physical models, they are just fast and look good.

We'll start with some simplifying assumptions that will make life easier for us.

First, we will declare that all light is white. This allows us to characterize any light source by a single real number i, called the brightness (intensity) of the light. Simulating colored light isn't much harder (you just need three intensity values, one per channel, and you compute all colors and lighting per channel), but to make our job easier we won't do it here.

Second, we will get rid of the atmosphere. This means that lights do not become less bright with distance. Attenuation of light with distance is not too hard to implement either, but for clarity we will skip it for now.

Light sources

Light has to come from somewhere. In this section we will define three different types of light sources.

Point sources

A point source emits light from a fixed point in space, called its position. Light is emitted uniformly in all directions, which is why it is also called omnidirectional light. A point source is therefore fully characterized by its position and its intensity.

An incandescent light bulb is a good real-world example of something that is well approximated by a point source. Although a bulb does not emit light from a single point and is not perfectly omnidirectional, the approximation is quite good.

Let's define the vector L as the direction from a point P in the scene to the light source Q. This vector, called the light vector, is simply L = Q - P. Note that since Q is fixed but P can be any point in the scene, L will in general be different for every point in the scene.

Directional sources

If a point source is a good approximation of an incandescent lamp, then what is a good approximation of the Sun?

This is a tricky question and the answer depends on what you want to render.

At the scale of the solar system, the Sun can be roughly considered a point source. After all, it emits light from a point (albeit a rather large one) and emits it in all directions, so it meets both requirements.

However, if your scene takes place on Earth, this is not a very good approximation. The Sun is so far away that every ray of light effectively has the same direction (Note: this approximation holds at the scale of a city, but not over longer distances; in fact, the ancient Greeks were able to calculate the radius of the Earth with surprising accuracy from the different directions of sunlight at different places). While it is possible to approximate this with a point source very far from the scene, that distance and the distances between objects in the scene differ so much in magnitude that precision errors can appear.

For such cases we will define directional light sources. Like a point source, a directional source has an intensity, but unlike a point source it has no position. Instead, it has a direction. You can think of it as an infinitely distant point source shining in a given direction.

For point sources we have to compute a new light vector L for every point P of the scene, but for a directional source L is given. In the scene with the Sun and the Earth, L would simply be the direction from the Earth towards the Sun.

Ambient lighting

Can every real-world light be modeled as a point or directional light? Almost always, yes (Note: but it won't necessarily be easy; an area light (imagine a source behind a diffuser) can be approximated by many point sources on its surface, but this is cumbersome, more computationally expensive, and the results are not perfect). Are these two types of source enough for our purposes? Unfortunately, no.

Imagine what is happening on the moon. The only significant source of illumination nearby is the sun. That is, the "front half" of the Moon relative to the Sun receives all the illumination, and the "rear half" is in complete darkness. We see this from different angles on Earth, and this effect creates what we call the "phases" of the moon.

However, the situation on Earth is slightly different. Even points that are not illuminated directly from a light source are not completely dark (just look at the floor under the table). How do light rays reach these points if the “view” to the light sources is blocked by something?

As I mentioned in the Color models section, when light hits an object, part of it is absorbed and the rest is scattered back into the scene. This means that light can arrive not only from light sources, but also from other objects that receive it from light sources and scatter it back. But why stop there? The scattered light in turn falls on some other object, part of it is absorbed, and part is scattered into the scene again. With each bounce the light loses some of its intensity, but in theory you could continue ad infinitum (Note: not really, because light has a quantum nature, but close enough).

This means that every object would have to be treated as a light source. As you can imagine, this greatly increases the complexity of our model, so we will not go down that route (Note: but you can at least google Global Illumination and look at the beautiful images).

But we still don't want every object to be either directly lit or completely dark (unless we are rendering a model of the solar system). To overcome this barrier, we will define a third type of light source, called ambient light, which is characterized only by its intensity. It is assumed to make an unconditional contribution of light to every point of the scene. This is a gross simplification of the extremely complex interaction between the light sources and the surfaces of the scene, but it works.

Illumination of one point

In general, there will be one ambient light in a scene (because ambient light only has a brightness value, and any number of them will trivially combine into a single ambient light) and an arbitrary number of point and directional lights.

To calculate the illuminance of a point, we simply need to calculate the amount of light contributed by each source and add them up to get a single number representing the total amount of light received by the point. We can then multiply the color of the surface at that point by that number to get the correct lit color.

So what happens when a ray of light with direction L, coming from a directional or point source, hits a point P of some object in our scene?

Intuitively, we can break down objects into two general classes, depending on how they behave with light: "matte" and "shiny". Since most of the objects around us can be considered “matte”, we will start with them.

Diffuse reflection

When a beam of light falls on a matte object, due to the roughness of its surface at the microscopic level, it reflects the beam into the scene evenly in all directions, that is, a “scattered” (“diffuse”) reflection is obtained.

To see this, look carefully at some matte object, for example a wall: if you move along the wall, its color does not change. That is, the light you see reflected from the object is the same no matter where you look at it from.

On the other hand, the amount of reflected light depends on the angle between the light beam and the surface. This is intuitively clear - the energy carried by the beam, depending on the angle, should be distributed over a smaller or larger surface, that is, the energy per unit area reflected into the scene will be higher or lower, respectively:

To express this mathematically, let's characterize the orientation of a surface by its normal vector. The normal vector, or simply "normal", is a vector that is perpendicular to the surface at some point. It is also a unit vector, that is, its length is 1. We will call this vector N.

Diffuse Reflection Modeling

So, a ray of light with direction L and intensity I falls on a surface with normal N. What fraction of the light is reflected back into the scene, as a function of N, L, and I?

As a geometric analogy, let's think of the intensity of the light as the "width" of the beam. Its energy is spread over a surface of size A. When N and L have the same direction, that is, the beam is perpendicular to the surface, A equals the width of the beam, which means that the energy reflected per unit area is equal to the incident energy per unit area. On the other hand, as the angle between L and N approaches 90°, A approaches infinity, so the energy per unit area approaches 0. But what happens in between?

The situation is shown in the diagram below. We know N, L, and the width of the beam W; I have added the angles α and β, as well as a few auxiliary points, to make the notation related to this diagram easier.

Technically, a beam of light has no width, so we will assume that everything happens on an infinitely small, flat patch of the surface. Even if it is the surface of a sphere, the area in question is so infinitesimally small that it is nearly flat relative to the size of the sphere, just as the Earth looks flat at small scales.

A beam of light of width W hits the surface at a point P, at an angle β. The normal at P is N, and the energy carried by the beam is spread over a patch of size A. We need to compute W / A.

One of the angles is β and another is 90°, so the third angle is 90° - β. But note that N and the surface also form a right angle, so α and β must add up to 90° as well. Hence α = 90° - β:

Now let's look at the right triangle formed by the width of the beam and the patch of surface it covers. Its angles are α, 90°, and 90° - α. The side adjacent to α is W, and the hypotenuse is A.

And now... trigonometry to the rescue! By definition, the cosine of an angle is the adjacent side divided by the hypotenuse; replacing the adjacent side with W and the hypotenuse with A, we get

    cos(α) = W / A

which, rearranged for the quantity we are after, gives

    W / A = cos(α)

We're almost done. α is the angle between N and L, so it can also be expressed as

    cos(α) = (N · L) / (|N| |L|)

And finally

    W / A = (N · L) / (|N| |L|)

So we have arrived at a very simple equation relating the reflected fraction of the light to the angle between the surface normal and the direction of the light.

Note that for angles greater than 90° this value becomes negative. If we used such a value without a second thought, we would end up with light sources that subtract light. That makes no physical sense; an angle greater than 90° simply means the light is actually hitting the back of the surface and does not contribute anything to the illumination of the point. So if the value becomes negative, we treat it as 0.

Diffuse reflection equation

We can now formulate an equation to calculate the total amount of light received by a point P with normal N, in a scene with ambient light of intensity I_A and n point or directional lights with intensities I_i and light vectors L_i, either known (for directional sources) or computed for P (for point sources):

    I_P = I_A + sum over the lights i of  I_i * (N · L_i) / (|N| |L_i|)

It is worth repeating that any term for which N · L_i < 0 must not be added to the illumination of the point.

Sphere Normals

There is only one little thing missing here: where do the normals come from?

This question is much trickier than it looks, as we will see in the second part of the article. Fortunately, for the case we are considering there is a very simple solution: the normal of any point on a sphere lies on the line passing through the center of the sphere. That is, if the center of the sphere is C, then the direction of the normal at a point P is P - C:

Why did I write "direction of the normal" and not "normal"? Besides being perpendicular to the surface, the normal must be a unit vector; that would only be the case if the radius of the sphere were 1, which is not always true. To compute the normal itself, we divide the vector by its length, obtaining a vector of length 1:

    N = (P - C) / |P - C|
This is mostly of theoretical interest, since the lighting equation written above divides by |N| anyway, but having "true" unit normals is good practice; it will make our work easier later on.

Diffuse Reflection Rendering

Let's translate all this into pseudocode. First, let's add a couple of lights to the scene:

light {
    type = ambient
    intensity = 0.2
}
light {
    type = point
    intensity = 0.6
    position = (2, 1, 0)
}
light {
    type = directional
    intensity = 0.2
    direction = (1, 4, 4)
}
Note that the intensities conveniently add up to 1.0; since it follows from the lighting equation that no point can receive more light than that, we will not get any "overexposed" areas.

The lighting equation is fairly easy to translate into pseudocode:

ComputeLighting(P, N) {
    i = 0.0
    for light in scene.Lights {
        if light.type == ambient {
            i += light.intensity
        } else {
            if light.type == point
                L = light.position - P
            else
                L = light.direction

            n_dot_l = dot(N, L)
            if n_dot_l > 0
                i += light.intensity * n_dot_l / (length(N) * length(L))
        }
    }
    return i
}
The only thing left is to use ComputeLighting in TraceRay. We replace the line that returns the color of the sphere

return closest_sphere.color

with this snippet:

P = O + closest_t * D                   # Compute intersection
N = P - closest_sphere.center           # Compute sphere normal at intersection
N = N / length(N)
return closest_sphere.color * ComputeLighting(P, N)
Just for fun, let's add a big yellow sphere:

sphere {
    color = (255, 255, 0)    # Yellow
    center = (0, -5001, 0)
    radius = 5000
}
We fire up the renderer, and behold, the spheres are finally starting to look like spheres!

But wait, how did the big yellow sphere turn into a flat yellow floor?

It didn't; it is just so big relative to the other three, and the camera is so close to it, that it looks flat. Just like our planet looks flat when we stand on its surface.

Reflection from a smooth surface

Now we turn our attention to the "shiny" objects. Unlike "matte" objects, "shiny" objects change their appearance when you look at them from different angles.

Take a billiard ball or a freshly washed car. Such objects exhibit a particular light propagation pattern, usually with bright areas that appear to move as you walk around them. Unlike matte objects, how you perceive the surface of these objects actually depends on the point of view.

Note that the red billiard balls stay red if you step back a couple of steps, but the bright white spot that gives them their "shiny" look appears to move. This means that the new effect does not replace diffuse reflection, but complements it.

Why does this happen? We can start with why it does not happen on matte objects. As we saw in the previous section, when a ray of light hits the surface of a matte object, it is scattered back into the scene evenly in all directions. Intuitively, this is due to the roughness of the object's surface: at the microscopic level it looks like many tiny surfaces pointing in random directions:

But what happens if the surface is not so uneven? Let's take the other extreme: a perfectly polished mirror. When a ray of light hits a mirror, it is reflected in a single direction, symmetric to the incident direction with respect to the mirror's normal. If we call the direction of the reflected light R and agree that L points toward the light source, we get the following situation:

Depending on how "polished" the surface is, it behaves more or less like a mirror; that is why this is called "specular" reflection (from the Latin "speculum", meaning "mirror").

For a perfectly polished mirror, an incident ray of light is reflected in a single direction. That is what lets us see objects clearly in a mirror: for every incident ray there is a single reflected ray. But not every object is perfectly polished; while most of the light is reflected in the direction R, some of it is reflected in directions close to R, and the closer a direction is to R, the more light is reflected along it. The "shininess" of the object determines how quickly the reflected light falls off as you move away from R:

What we are interested in is how much light is reflected back in the direction of our point of view (because that is the light we use to determine the color of each point). If V is the "view vector" pointing from the point toward the camera, and α is the angle between R and V, then this is what we have:

When α = 0°, all the light is reflected. When α = 90°, no light is reflected. As with diffuse reflection, we need a mathematical expression to determine what happens at intermediate values of α.

Modeling "mirror" reflection

Remember how I mentioned earlier that not all models are based on physical models? Well, here's one example of that. The model shown below is arbitrary, but it is used because it is easy to calculate and looks good.

Let's take cos(α). It has nice properties: cos(0) = 1, cos(±90°) = 0, and the values decrease gradually from 1 to 0 along a very pleasant curve:

It meets all the requirements for a specular reflection function, so why not use it?

But we are still missing one detail. As formulated, all objects are equally shiny. How do we change the equation to model different degrees of shininess?

Remember that shininess is a measure of how quickly the reflection function falls off as α grows. A very simple way to obtain different falloff curves is to raise the cosine to some positive exponent s. Since 0 ≤ cos(α) ≤ 1, we clearly have 0 ≤ cos(α)^s ≤ 1; that is, cos(α)^s behaves just like cos(α), only "narrower". Here is cos(α)^s for different values of s:

The larger the value of s, the "narrower" the function becomes around α = 0, and the shinier the object looks.

s is commonly called the specular exponent, and it is a property of the surface. Since the model is not based on physical reality, the values of s can only be determined by trial and error, essentially by adjusting them until they look "natural" (Note: for a physically based model, see the bidirectional reflectance distribution function, BRDF).
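To get a feel for how the exponent narrows the highlight, here is a small illustrative Python snippet (not part of the original pseudocode) that prints cos(α)^s for a few angles and exponents:

import math

# Print the specular factor cos(alpha)^s for several angles and exponents,
# to show how larger exponents concentrate the highlight around alpha = 0.
for s in (1, 10, 100, 1000):
    row = []
    for alpha_deg in (0, 10, 30, 60, 80):
        cos_a = math.cos(math.radians(alpha_deg))
        row.append(f"{cos_a ** s:.3f}")
    print(f"s={s:>4}: " + "  ".join(row))

With s = 1000 the factor is essentially zero everywhere except very close to α = 0, which is exactly the tight highlight of a very shiny object.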

Let's put it all together. A ray of light L hits a surface at a point P, where the normal is N and the specular exponent is s. How much light is reflected in the viewing direction V?

We have already decided that this value is cos(α)^s, where α is the angle between V and R, and R is L reflected with respect to N. So the first step is to compute R from N and L.

We can decompose L into two vectors, L_P and L_N, such that L = L_P + L_N, where L_N is parallel to N and L_P is perpendicular to it:

L_N is the projection of L onto N; by the properties of the dot product, and since |N| = 1, the length of this projection is N · L. We defined L_N to be parallel to N, so L_N = N * (N · L).

Since L = L_P + L_N, we immediately get L_P = L - L_N = L - N * (N · L).

Now let's look at R; since it is symmetric to L with respect to N, its component parallel to N is the same as that of L, and its perpendicular component is the opposite of that of L; that is, R = L_N - L_P:

Substituting the expressions obtained above, we get

    R = N * (N · L) - (L - N * (N · L))

and simplifying a bit, we get

    R = 2 * N * (N · L) - L
The specular reflection term

Now we are ready to write the specular reflection equation:

    I_P = I_A + sum over the lights i of  I_i * [ (N · L_i) / (|N| |L_i|)  +  ( (R_i · V) / (|R_i| |V|) )^s ]

As with diffuse lighting, cos(α) can be negative, and again we must ignore such terms. Also, not every object has to be shiny; for such objects (which we will mark with s = -1) the specular term is not computed at all.

Rendering with "specular" reflections

Let's add the "mirror" reflections we've been working on to the scene. First, let's make some changes to the scene itself:

sphere {
    center = (0, -1, 3)
    radius = 1
    color = (255, 0, 0)      # Red
    specular = 500           # Shiny
}
sphere {
    center = (-2, 1, 3)
    radius = 1
    color = (0, 0, 255)      # Blue
    specular = 500           # Shiny
}
sphere {
    center = (2, 1, 3)
    radius = 1
    color = (0, 255, 0)      # Green
    specular = 10            # Slightly shiny
}
sphere {
    color = (255, 255, 0)    # Yellow
    center = (0, -5001, 0)
    radius = 5000
    specular = 1000          # Very shiny
}
In the code, we need to change ComputeLighting so that it computes the specular term when necessary and adds it to the overall lighting. Note that it now also needs V and s:

ComputeLighting(P, N, V, s) {
    i = 0.0
    for light in scene.Lights {
        if light.type == ambient {
            i += light.intensity
        } else {
            if light.type == point
                L = light.position - P
            else
                L = light.direction

            # Diffuse
            n_dot_l = dot(N, L)
            if n_dot_l > 0
                i += light.intensity * n_dot_l / (length(N) * length(L))

            # Specular
            if s != -1 {
                R = 2 * N * dot(N, L) - L
                r_dot_v = dot(R, V)
                if r_dot_v > 0
                    i += light.intensity * pow(r_dot_v / (length(R) * length(V)), s)
            }
        }
    }
    return i
}
Finally, we need to change TraceRay to pass the new parameters to ComputeLighting. s is obvious: it is taken from the sphere data. But what about V? V is a vector pointing from the object toward the camera. Fortunately, in TraceRay we already have a vector pointing from the camera to the object: D, the direction of the ray being traced. So V is simply -D.

Here is the new TraceRay code with "mirror" reflection:

TraceRay(O, D, t_min, t_max) {
    closest_t = inf
    closest_sphere = NULL
    for sphere in scene.Spheres {
        t1, t2 = IntersectRaySphere(O, D, sphere)
        if t1 in [t_min, t_max] and t1 < closest_t {
            closest_t = t1
            closest_sphere = sphere
        }
        if t2 in [t_min, t_max] and t2 < closest_t {
            closest_t = t2
            closest_sphere = sphere
        }
    }
    if closest_sphere == NULL
        return BACKGROUND_COLOR

    P = O + closest_t * D                   # Compute intersection
    N = P - closest_sphere.center           # Compute sphere normal at intersection
    N = N / length(N)
    return closest_sphere.color * ComputeLighting(P, N, -D, closest_sphere.specular)
}
And here is our reward for all this juggling with vectors:

Shadows

Where there is light and objects, there must be shadows. So where are our shadows?

Let's start with a more fundamental question: why should there be shadows? Shadows appear where there is light, but its rays cannot reach a point because some other object stands in their way.

You'll notice that in the previous section we were interested in angles and vectors, but we only considered the light source and the point we need to color, and completely ignored everything else that happens in the scene - for example, an object that gets in the way.

Instead, we need to add some logic saying " if there is an object between the point and the source, then there is no need to add lighting coming from this source".

We want to highlight the following two cases:

It looks like we have all the tools we need to do this.

Let's start with a directional source. We know P; it is the point we are interested in. We know L; it is part of the definition of the light source. Knowing P and L, we can define a ray, P + t*L, that goes from the point toward the infinitely distant light source. Does this ray intersect any other object? If not, there is nothing between the point and the source, so we compute the illumination from this source and add it to the total. If it does, we ignore this source.

We already know how to calculate the nearest intersection between a ray and a sphere; we use it for ray tracing from the camera. We can again use it to calculate the closest intersection between the light beam and the rest of the scene.

However, the parameters are slightly different. Instead of starting at the camera, the rays start at P. The direction is not D but L. And we are interested in intersections with anything from P all the way to infinity; that means t_min = 0 and t_max = +inf.

We can handle point sources in a very similar way, with two differences. First, L is not given, but it is very easy to compute from the position of the source and P. Second, we are interested in intersections starting at P but only up to the light source (otherwise objects beyond the light source could cast shadows!); that is, in this case t_min = 0 and t_max = 1.

There is one edge case we need to consider. Take the ray P + t*L. If we look for intersections starting at t = 0, we will most likely find P itself at t = 0, because P really does lie on the sphere; in other words, every object would cast a shadow on itself (Note: more precisely, we want to avoid the situation where a point, rather than the whole object, casts a shadow on itself; an object with a more complex shape than a sphere (namely, any concave object) can cast true shadows on itself!).

The easiest way to deal with this is to use a small value ε as the lower bound instead of 0. Geometrically, we want the ray to start slightly off the surface, that is, near P but not exactly at P. So for directional sources the interval will be [ε, +inf], and for point sources [ε, 1].

Rendering with shadows

Let's turn this into pseudocode.

In the previous version, TraceRay calculated the nearest ray-sphere intersection and then calculated the illumination at the intersection. We need to extract the closest intersection code since we want to use it again to compute the shadows:

ClosestIntersection(O, D, t_min, t_max) {
    closest_t = inf
    closest_sphere = NULL
    for sphere in scene.Spheres {
        t1, t2 = IntersectRaySphere(O, D, sphere)
        if t1 in [t_min, t_max] and t1 < closest_t {
            closest_t = t1
            closest_sphere = sphere
        }
        if t2 in [t_min, t_max] and t2 < closest_t {
            closest_t = t2
            closest_sphere = sphere
        }
    }
    return closest_sphere, closest_t
}
As a result, TraceRay is much simpler:

TraceRay(O, D, t_min, t_max) {
    closest_sphere, closest_t = ClosestIntersection(O, D, t_min, t_max)
    if closest_sphere == NULL
        return BACKGROUND_COLOR

    P = O + closest_t * D                   # Compute intersection
    N = P - closest_sphere.center           # Compute sphere normal at intersection
    N = N / length(N)
    return closest_sphere.color * ComputeLighting(P, N, -D, closest_sphere.specular)
}
Now we need to add a shadow check to ComputeLighting:

ComputeLighting(P, N, V, s) {
    i = 0.0
    for light in scene.Lights {
        if light.type == ambient {
            i += light.intensity
        } else {
            if light.type == point {
                L = light.position - P
                t_max = 1
            } else {
                L = light.direction
                t_max = inf
            }

            # Shadow check
            shadow_sphere, shadow_t = ClosestIntersection(P, L, 0.001, t_max)
            if shadow_sphere != NULL
                continue

            # Diffuse
            n_dot_l = dot(N, L)
            if n_dot_l > 0
                i += light.intensity * n_dot_l / (length(N) * length(L))

            # Specular
            if s != -1 {
                R = 2 * N * dot(N, L) - L
                r_dot_v = dot(R, V)
                if r_dot_v > 0
                    i += light.intensity * pow(r_dot_v / (length(R) * length(V)), s)
            }
        }
    }
    return i
}
Here's what our newly rendered scene will look like:


Source code and working demo >>

Now we already have something.

Reflection

We have shiny objects. But is it possible to create objects that actually behave like mirrors? Of course, and in fact, their implementation in a ray tracer is very simple, but at first it may seem confusing.

Let's see how mirrors work. When we look in a mirror, we see rays of light reflecting off the mirror. Rays of light are reflected symmetrically about the surface normal:

Let's say we are tracing a ray and the closest intersection turns out to be a mirror. What color is this ray of light? Obviously not the color of the mirror itself, but whatever color the reflected ray has. All we need to do is compute the direction of the reflected ray and find out what color the light coming from that direction is. If only we had a function that, given a ray, returns the color of the light coming from its direction...

Oh wait, we have it: it's called TraceRay .

So we start with the main TraceRay loop to see what the ray emitted from the camera "sees". If TraceRay determines that the ray sees a reflective object, then it just has to calculate the direction of the reflected ray and call... itself.

At this point, I suggest that you reread the last three paragraphs until you understand them. If this is your first time reading about recursive ray tracing, then you may need to reread it a couple of times and think a bit before you really understand.

Don't rush, I'll wait.

Now that the euphoria of that beautiful Eureka! moment has subsided a little, let's formalize things a bit.

The most important thing in any recursive algorithm is to prevent an infinite loop. This algorithm has an obvious exit condition: the ray either hits a non-reflective object or hits nothing at all. But there is a simple case where we can still end up in an infinite loop: the infinite-corridor effect. It is what happens when you put one mirror in front of another and see endless copies of yourself in them!

There are many ways to prevent this problem. We will introduce a recursion limit into the algorithm; it controls how "deep" the recursion can go. Let's call it r. When r = 0, we see objects but no reflections. When r = 1, we see objects and the reflections of some objects. When r = 2, we see objects, reflections of objects, and reflections of reflections of objects. And so on. In general, there is little point in going deeper than 2 or 3 levels, because beyond that the difference is barely noticeable.

We will make another distinction as well: "reflectivity" does not have to be all or nothing; objects can be partially reflective and partially colored. We will assign every surface a number between 0 and 1 that defines its reflectivity, and then blend the locally lit color and the reflected color in proportion to that number.

Finally, we need to decide what parameters the recursive call to TraceRay should take. The ray starts at the surface of the object, at point P. The ray's direction is the direction of the light reflected off P; in TraceRay we have D, the direction from the camera to P, which is opposite to the direction the light travels, so the direction of the reflected ray is -D reflected with respect to N. Just as with shadows, we don't want objects to reflect themselves, so t_min = 0.001. We want to see objects reflected no matter how far away they are, so t_max = +inf. And finally, the recursion limit is one less than the recursion limit we are currently at.

Rendering with reflection

Let's add reflection to the ray tracer code.

As before, we first change the scene:

sphere {
    center = (0, -1, 3)
    radius = 1
    color = (255, 0, 0)      # Red
    specular = 500           # Shiny
    reflective = 0.2         # Slightly reflective
}
sphere {
    center = (-2, 1, 3)
    radius = 1
    color = (0, 0, 255)      # Blue
    specular = 500           # Shiny
    reflective = 0.3         # Slightly more reflective
}
sphere {
    center = (2, 1, 3)
    radius = 1
    color = (0, 255, 0)      # Green
    specular = 10            # Slightly shiny
    reflective = 0.4         # Even more reflective
}
sphere {
    color = (255, 255, 0)    # Yellow
    center = (0, -5001, 0)
    radius = 5000
    specular = 1000          # Very shiny
    reflective = 0.5         # Half reflective
}
We use the "reflect a ray" formula in a couple of places, so it is worth factoring it out into its own function. It takes a ray R and a normal N and returns R reflected with respect to N:

ReflectRay(R, N) {
    return 2 * N * dot(N, R) - R
}
The only change in ComputeLighting is to replace the reflection equation with a call to this new ReflectRay .

A small change has been made to the main method - we need to pass the top-level TraceRay a recursion limit:

color = TraceRay(O, D, 1, inf, recursion_depth)
The recursion_depth constant can be set to a reasonable value, such as 3 or 5.

The only important changes occur near the end of the TraceRay , where we recursively calculate the reflections:

TraceRay(O, D, t_min, t_max, depth) {
    closest_sphere, closest_t = ClosestIntersection(O, D, t_min, t_max)
    if closest_sphere == NULL
        return BACKGROUND_COLOR

    # Compute the local color
    P = O + closest_t * D                   # Compute intersection point
    N = P - closest_sphere.center           # Compute sphere normal at intersection
    N = N / length(N)
    local_color = closest_sphere.color * ComputeLighting(P, N, -D, closest_sphere.specular)

    # If we hit the recursion limit or the object is not reflective, we are done
    r = closest_sphere.reflective
    if depth <= 0 or r <= 0
        return local_color

    # Compute the reflected color
    R = ReflectRay(-D, N)
    reflected_color = TraceRay(P, R, 0.001, inf, depth - 1)

    return local_color * (1 - r) + reflected_color * r
}
Let the results speak for themselves:

To better understand the recursion depth limit, let's take a closer look at a close-up render of the scene with a low recursion limit:

And here is the same close-up of the same scene, this time rendered with a higher recursion limit:

As you can see, the difference is whether we see reflections of reflections of reflections of objects, or only reflections of objects.

Custom Camera

At the very beginning of the discussion of ray tracing we made two important assumptions: the camera is fixed at the origin (0, 0, 0) and points in the direction of Z+, and its "up" direction is Y+. In this section we will get rid of these restrictions, so that we can place the camera anywhere in the scene and point it in any direction.

Let's start with the position. You may have noticed that the camera position is used only once in all the pseudocode: as the origin of the rays coming from the camera in the top-level method. If we want to change the position of the camera, the only thing we need to do is use a different value for this origin.

Does changing the position affect the direction of the rays? Not at all. The direction of a ray is the vector from the camera to the projection plane; when we move the camera, the projection plane moves with it, so their relative positions do not change.

Now let's turn our attention to the orientation. Suppose we have a rotation matrix that rotates (0, 0, 1) into the desired view direction and (0, 1, 0) into the desired "up" direction (and, being a rotation matrix, it by definition does what is required). The position of the camera does not change if you simply rotate the camera in place. The direction does change, though: it undergoes the same rotation as the whole camera. So if we have a direction D and a rotation matrix R, the rotated direction is simply R * D.

Only the top-level function changes:

for x in [-Cw/2, Cw/2] {
    for y in [-Ch/2, Ch/2] {
        D = camera.rotation * CanvasToViewport(x, y)
        color = TraceRay(camera.position, D, 1, inf)
        canvas.PutPixel(x, y, color)
    }
}
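As an illustration (not from the original article), here is a minimal Python sketch of what camera.rotation could look like: a rotation of the view direction around the Y ("up") axis by a given angle. The function names here are made up for this example.

import math

def rotation_y(degrees):
    # 3x3 rotation matrix around the Y axis.
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    return [[ c, 0.0,   s],
            [0.0, 1.0, 0.0],
            [-s, 0.0,   c]]

def mul_mv(m, v):
    # Multiply a 3x3 matrix by a 3-component vector.
    return tuple(sum(m[i][j] * v[j] for j in range(3)) for i in range(3))

# Rotate the default view direction (towards Z+) by 30 degrees.
camera_rotation = rotation_y(30)
D = mul_mv(camera_rotation, (0.0, 0.0, 1.0))
print(D)    # roughly (0.5, 0.0, 0.866)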
Here is what our scene looks like when viewed from a different position and with a different orientation:

Where to go next

We will end the first part of the work with a brief overview of some interesting topics that we have not explored.

Optimization

As stated in the introduction, we have considered the clearest way to explain and implement the various features. As a result, the ray tracer is fully functional but not particularly fast. Here are some ideas you can explore on your own to speed it up. Just for fun, measure the execution time before and after implementing them. You will be very surprised!

Parallelization

The most obvious way to speed up a ray tracer is to trace several rays at the same time. Since each ray leaving the camera is independent of all the others, and most of the data structures are read-only, we can trace one ray per CPU core without much trouble and without complications due to synchronization issues.

In fact, ray tracers belong to a class of algorithms called embarrassingly parallel, precisely because their very nature makes them extremely easy to parallelize.
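As a rough illustration (not from the original article), here is a minimal Python sketch of per-scanline parallelization with the standard multiprocessing module; trace_row is a stand-in for whatever function traces one row of pixels:

from multiprocessing import Pool

WIDTH, HEIGHT = 640, 480

def trace_row(y):
    # Placeholder: a real tracer would call TraceRay for every x in row y
    # and return the list of resulting colors.
    return [(y % 256, 0, 0) for x in range(WIDTH)]

if __name__ == "__main__":
    # Each row is independent of the others, so rows can be traced in parallel.
    with Pool() as pool:
        image = pool.map(trace_row, range(HEIGHT))
    print(len(image), "rows traced")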

Value caching

Consider the values computed by IntersectRaySphere, which is where the ray tracer typically spends most of its time:

k1 = dot(D, D)
k2 = 2 * dot(OC, D)
k3 = dot(OC, OC) - r*r
Some of these values are constant for the whole frame: as long as the spheres and the camera do not move, r*r and dot(OC, OC) do not change. You can compute them once when loading the scene and store them in the spheres themselves, recomputing them only if the spheres move in the next frame. dot(D, D) is constant for a given ray, so you can compute it in ClosestIntersection and pass it to IntersectRaySphere.
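Here is a minimal Python sketch of this kind of caching (illustrative only; the field and function names are made up):

class Sphere:
    def __init__(self, center, radius, color):
        self.center = center
        self.radius = radius
        self.color = color
        # Cached value: recompute only if the sphere is resized.
        self.radius_squared = radius * radius

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def sub(a, b):
    return (a[0]-b[0], a[1]-b[1], a[2]-b[2])

def intersect_ray_sphere(O, D, sphere, d_dot_d):
    # d_dot_d = dot(D, D) is computed once per ray by the caller.
    OC = sub(O, sphere.center)
    k1 = d_dot_d
    k2 = 2 * dot(OC, D)
    k3 = dot(OC, OC) - sphere.radius_squared
    discriminant = k2*k2 - 4*k1*k3
    if discriminant < 0:
        return None, None
    sqrt_disc = discriminant ** 0.5
    return (-k2 + sqrt_disc) / (2*k1), (-k2 - sqrt_disc) / (2*k1)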

Shadow optimizations

If a point of an object is in shadow with respect to a light source because some other object was found in the way, there is a high probability that the neighboring point is in shadow with respect to the same light source because of the same object (this is called shadow coherence):

That is, when we are looking for objects between a point and a light source, we can first check if the last object that cast a shadow on the previous point relative to the same light source casts a shadow on the current point. If so, then we can finish; if not, then we just continue to check the rest of the objects in the usual way.

Likewise, when calculating the intersection between a ray of light and objects in the scene, we don't really need the closest intersection - it's enough to know that there is at least one intersection. We can use a special version of ClosestIntersection that returns the result as soon as it finds the first intersection (and for this we need to calculate and return not closest_t , but just a boolean value).
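Here is a minimal Python sketch of such an "any hit" helper, assuming the intersect_ray_sphere function and helpers from the caching sketch above:

def has_intersection(O, D, spheres, t_min, t_max, d_dot_d):
    # Return True as soon as any sphere blocks the ray; shadow rays never
    # need the closest hit, only the fact that some hit exists.
    for sphere in spheres:
        t1, t2 = intersect_ray_sphere(O, D, sphere, d_dot_d)
        if t1 is None:
            continue
        if t_min <= t1 <= t_max or t_min <= t2 <= t_max:
            return True
    return False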

Spatial structures

Computing the intersection of a ray with each sphere is quite a waste of resources. There are many data structures that allow you to discard entire groups of objects in one fell swoop without having to calculate individual intersections.

A detailed discussion of such structures is not the subject of our article, but the general idea is this: suppose we have several spheres close to each other. You can calculate the center and radius of the smallest sphere containing all these spheres. If the ray does not intersect this boundary sphere, then one can be sure that it does not intersect any of the spheres it contains, and this can be done in a single intersection test. Of course, if it intersects a sphere, then we still need to check whether it intersects any of the spheres it contains.

You can learn more about this by reading about bounding volume hierarchies.
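Here is a minimal Python sketch of the bounding-sphere rejection test described above (illustrative only; it reuses the dot and sub helpers from the caching sketch, and the grouping of spheres is left to you):

def length(v):
    return dot(v, v) ** 0.5

def bounding_sphere(spheres):
    # A simple (not minimal) bounding sphere: the centroid of the centers,
    # with a radius large enough to contain every member sphere.
    n = len(spheres)
    center = (sum(s.center[0] for s in spheres) / n,
              sum(s.center[1] for s in spheres) / n,
              sum(s.center[2] for s in spheres) / n)
    radius = max(length(sub(s.center, center)) + s.radius for s in spheres)
    return center, radius

def ray_may_hit(O, D, center, radius):
    # Conservative ray-sphere test: if it fails, every sphere inside the
    # bound can be skipped without computing individual intersections.
    OC = sub(O, center)
    k1 = dot(D, D)
    k2 = 2 * dot(OC, D)
    k3 = dot(OC, OC) - radius * radius
    return k2 * k2 - 4 * k1 * k3 >= 0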

Downsampling

Here is an easy way to make the ray tracer N times faster: compute N times fewer pixels!

Suppose we trace the rays for two pixels that are two pixels apart, and they happen to hit the same object. We can reasonably assume that the ray for the pixel between them will hit the same object too, skip the initial search for intersections with the whole scene, and jump straight to computing the color at that point.

If you do this in both the horizontal and the vertical direction, you can perform up to 75% fewer primary ray-scene intersection calculations.

Of course, this way you can easily miss a very thin object; unlike the optimizations discussed earlier, this one is "incorrect", in the sense that its results are not identical to what we would get without it; in a way, we are cheating to save work. The trick is knowing where the cheating will not be noticed, so that the results are still satisfactory.
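Here is a minimal Python sketch of a simplified variant that traces every other column and reuses a neighbor's color when the traced neighbors agree; trace_pixel is a stand-in for a function mapping pixel coordinates to a color:

def render_subsampled(trace_pixel, width, height):
    image = [[None] * width for _ in range(height)]
    for y in range(height):
        # Trace every other column.
        for x in range(0, width, 2):
            image[y][x] = trace_pixel(x, y)
        # Fill in the skipped columns.
        for x in range(1, width, 2):
            left = image[y][x - 1]
            right = image[y][x + 1] if x + 1 < width else left
            # Cheap heuristic: if both traced neighbors agree, reuse the color;
            # otherwise pay for a real trace.
            image[y][x] = left if left == right else trace_pixel(x, y)
    return image

# Usage with a dummy tracer that fakes a vertical color edge at x = 320:
img = render_subsampled(lambda x, y: (255, 0, 0) if x < 320 else (0, 0, 255), 640, 4)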

Other primitives

In the previous sections we used spheres as primitives because they are mathematically easy to work with. But once spheres are working, adding other primitives is straightforward.

Note that from TraceRay's point of view any object will do, as long as we can compute only two things for it: the value of t for the closest intersection between the ray and the object, and the normal at the intersection point. Everything else in the ray tracer is independent of the object type.

Triangles are a good choice. First you need to calculate the intersection between the ray and the plane containing the triangle, and if there is an intersection, then determine whether the point is inside the triangle.
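Here is a rough Python sketch of that two-step test (plane intersection, then an inside-the-triangle check using a same-side test); it reuses the dot and sub helpers from the caching sketch and adds a cross product:

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def intersect_ray_triangle(O, D, A, B, C, eps=1e-9):
    # Step 1: intersect the ray with the plane containing triangle ABC.
    N = cross(sub(B, A), sub(C, A))       # plane normal (not normalized)
    denom = dot(N, D)
    if abs(denom) < eps:                  # ray is parallel to the plane
        return None
    t = dot(N, sub(A, O)) / denom
    if t < eps:                           # intersection is behind the ray origin
        return None
    P = (O[0] + t*D[0], O[1] + t*D[1], O[2] + t*D[2])
    # Step 2: check that P lies inside the triangle (same-side test per edge).
    for V0, V1 in ((A, B), (B, C), (C, A)):
        if dot(cross(sub(V1, V0), sub(P, V0)), N) < 0:
            return None
    return t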

Constructive solid geometry

There is a very interesting type of object that is relatively easy to implement: a Boolean operation between other objects. For example, the intersection of two spheres can produce something that looks like a lens, and subtracting a small sphere from a bigger one can produce something that looks like the Death Star.

How does it work? For every object we can compute where the ray enters and where it exits it; in the case of a sphere, for example, the ray enters at min(t1, t2) and exits at max(t1, t2). Suppose we want to compute the intersection of two spheres; the ray is inside the combined object when it is inside both spheres, and outside otherwise. In the case of subtraction, the ray is inside when it is inside the first object but not inside the second.

More generally, if we want to compute the intersection between a ray and an object A ⊕ B (where ⊕ is any Boolean operator), we first compute the ray-A and ray-B intersections separately, which gives us the "inside" range of each object. Then we apply ⊕ to these ranges, which gives us the "inside" range of the combined object. We just need to find the first value of t that is both in this "inside" range and in the [t_min, t_max] range we are interested in:

The normal at the intersection point is either the normal of the object that creates the intersection, or its opposite, depending on whether we are looking "outside" or "inside" the original object.

Of course, A and B don't have to be primitives themselves; they can be the results of Boolean operations too! If this is implemented cleanly, we don't even need to know what they are, as long as we can get intersections and normals out of them. That way we can take three spheres and compute, for example, (A ∪ B) ∩ C.
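Here is a minimal Python sketch of the range logic for intersection and subtraction of two (t_enter, t_exit) ranges (a simplification: real CSG code also has to track which object supplies the surface normal at the chosen t):

def range_intersection(a, b):
    # a and b are (t_enter, t_exit) tuples, or None if the ray misses the object.
    if a is None or b is None:
        return None
    enter, exit_ = max(a[0], b[0]), min(a[1], b[1])
    return (enter, exit_) if enter <= exit_ else None

def range_subtraction(a, b):
    # Inside A but not inside B. For simplicity, only the first resulting
    # piece is returned (B can split A into two pieces).
    if a is None:
        return None
    if b is None or b[1] <= a[0] or b[0] >= a[1]:
        return a
    if b[0] > a[0]:
        return (a[0], b[0])
    if b[1] < a[1]:
        return (b[1], a[1])
    return None              # B completely covers A

print(range_intersection((1.0, 3.0), (2.0, 4.0)))    # (2.0, 3.0)
print(range_subtraction((1.0, 3.0), (2.0, 4.0)))     # (1.0, 2.0)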

Transparency

Not all objects have to be opaque; some can be partially transparent.

The implementation of transparency is very similar to the implementation of reflection. When the ray hits a partially transparent surface, we compute the local and reflected colors as before, but we also compute an additional color: the color of the light passing through the object, obtained by another call to TraceRay. Then we blend this color with the local and reflected colors according to the transparency of the object, and that's it.
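Here is a minimal Python sketch of that blending step, assuming the local, reflected, and transmitted colors have already been computed by TraceRay-style calls; the weighting scheme is just one plausible choice:

def blend_colors(local_color, reflected_color, transmitted_color,
                 reflectivity, transparency):
    # Transparency takes its share first; reflectivity splits what remains.
    local_w = (1 - transparency) * (1 - reflectivity)
    reflect_w = (1 - transparency) * reflectivity
    return tuple(local_w * l + reflect_w * r + transparency * t
                 for l, r, t in zip(local_color, reflected_color, transmitted_color))

# Example: a mostly transparent, slightly reflective red surface over a blue background.
print(blend_colors((255, 0, 0), (255, 255, 255), (0, 0, 255), 0.1, 0.7))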

Refraction

In real life, when a ray of light passes through a transparent object it changes direction (which is why a straw standing in a glass of water looks "broken"). The change of direction depends on the refractive index of each material, according to the following equation (Snell's law):

    sin(α1) / sin(α2) = n2 / n1

where α1 and α2 are the angles between the ray and the normal before and after crossing the surface, and n1 and n2 are the refractive indices of the material outside and inside the object.

For example, n_air is approximately 1.0 and n_water is approximately 1.33. So for a ray of light entering water at, say, a 60° angle to the normal, we get

    sin(α2) = sin(60°) * (1.0 / 1.33) ≈ 0.65,   that is   α2 ≈ 40.6°
Stop for a moment and realize: if you implement constructive solid geometry and transparency, you can model a magnifying glass (the intersection of two spheres) that will behave like a physically correct magnifying glass!
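For illustration only (this is not part of the original pseudocode), here is a minimal Python sketch of Snell's law in vector form; it assumes d is the unit incident direction, n is the unit surface normal pointing toward the incoming ray, and n1 and n2 are the refractive indices outside and inside the surface:

import math

def dot(a, b):
    return a[0]*b[0] + a[1]*b[1] + a[2]*b[2]

def refract(d, n, n1, n2):
    r = n1 / n2
    cos_i = -dot(d, n)
    sin2_t = r * r * (1.0 - cos_i * cos_i)
    if sin2_t > 1.0:
        return None                       # total internal reflection
    cos_t = math.sqrt(1.0 - sin2_t)
    # Refracted direction, Snell's law in vector form.
    return tuple(r * d[i] + (r * cos_i - cos_t) * n[i] for i in range(3))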

Supersampling

Supersampling is roughly the opposite of subsampling: here we aim for accuracy instead of speed. Suppose the rays corresponding to two neighboring pixels hit two different objects. We need to color each pixel with the corresponding color.

However, remember the analogy we started with: each ray is supposed to determine the "representative" color of one square of the "grid" through which we are looking. By using a single ray per pixel, we effectively decide that the color of the ray of light passing through the middle of the square determines the whole square, but that may not be the case.

This problem can be solved by tracing several rays per pixel - 4, 9, 16, and so on, and then averaging them to get the color of the pixel.

Of course, this makes the ray tracer 4, 9, or 16 times slower, for the same reason that subsampling makes it N times faster. Fortunately, there is a compromise. We can assume that the properties of an object change smoothly over its surface, so shooting 4 rays per pixel that hit the same object at slightly different points will not improve the scene very much. Therefore, we can start with one ray per pixel and compare neighboring rays: if they hit different objects, or if their colors differ by more than a predefined threshold, we apply pixel subdivision to both.
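Here is a minimal Python sketch of the plain (non-adaptive) variant: average several jittered samples inside the pixel square; trace is a stand-in for a function mapping a sub-pixel coordinate to a color:

import random

def supersample_pixel(trace, x, y, samples=4):
    # Average the colors of several jittered rays inside the pixel square.
    r = g = b = 0.0
    for _ in range(samples):
        color = trace(x + random.random(), y + random.random())
        r += color[0]; g += color[1]; b += color[2]
    return (r / samples, g / samples, b / samples)

# Usage with a dummy tracer that fakes an object edge at x = 100.5:
fake_trace = lambda sx, sy: (255, 0, 0) if sx < 100.5 else (0, 0, 255)
print(supersample_pixel(fake_trace, 100, 50, samples=16))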

Ray tracer pseudocode

Below is the full version of the pseudocode we created in the ray tracing chapters:

CanvasToViewport(x, y) {
    return (x*Vw/Cw, y*Vh/Ch, d)
}

ReflectRay(R, N) {
    return 2 * N * dot(N, R) - R
}

ComputeLighting(P, N, V, s) {
    i = 0.0
    for light in scene.Lights {
        if light.type == ambient {
            i += light.intensity
        } else {
            if light.type == point {
                L = light.position - P
                t_max = 1
            } else {
                L = light.direction
                t_max = inf
            }

            # Shadow check
            shadow_sphere, shadow_t = ClosestIntersection(P, L, 0.001, t_max)
            if shadow_sphere != NULL
                continue

            # Diffuse
            n_dot_l = dot(N, L)
            if n_dot_l > 0
                i += light.intensity * n_dot_l / (length(N) * length(L))

            # Specular
            if s != -1 {
                R = ReflectRay(L, N)
                r_dot_v = dot(R, V)
                if r_dot_v > 0
                    i += light.intensity * pow(r_dot_v / (length(R) * length(V)), s)
            }
        }
    }
    return i
}

ClosestIntersection(O, D, t_min, t_max) {
    closest_t = inf
    closest_sphere = NULL
    for sphere in scene.Spheres {
        t1, t2 = IntersectRaySphere(O, D, sphere)
        if t1 in [t_min, t_max] and t1 < closest_t {
            closest_t = t1
            closest_sphere = sphere
        }
        if t2 in [t_min, t_max] and t2 < closest_t {
            closest_t = t2
            closest_sphere = sphere
        }
    }
    return closest_sphere, closest_t
}

TraceRay(O, D, t_min, t_max, depth) {
    closest_sphere, closest_t = ClosestIntersection(O, D, t_min, t_max)
    if closest_sphere == NULL
        return BACKGROUND_COLOR

    # Compute the local color
    P = O + closest_t * D                   # Compute intersection point
    N = P - closest_sphere.center           # Compute sphere normal at intersection
    N = N / length(N)
    local_color = closest_sphere.color * ComputeLighting(P, N, -D, closest_sphere.specular)

    # If we hit the recursion limit or the object is not reflective, we are done
    r = closest_sphere.reflective
    if depth <= 0 or r <= 0
        return local_color

    # Compute the reflected color
    R = ReflectRay(-D, N)
    reflected_color = TraceRay(P, R, 0.001, inf, depth - 1)

    return local_color * (1 - r) + reflected_color * r
}

for x in [-Cw/2, Cw/2] {
    for y in [-Ch/2, Ch/2] {
        D = camera.rotation * CanvasToViewport(x, y)
        color = TraceRay(camera.position, D, 1, inf, recursion_depth)
        canvas.PutPixel(x, y, color)
    }
}
And here is the scene used to render the examples:

viewport_size = 1 x 1
projection_plane_d = 1

sphere {
    center = (0, -1, 3)
    radius = 1
    color = (255, 0, 0)      # Red
    specular = 500           # Shiny
    reflective = 0.2         # Slightly reflective
}
sphere {
    center = (-2, 1, 3)
    radius = 1
    color = (0, 0, 255)      # Blue
    specular = 500           # Shiny
    reflective = 0.3         # Slightly more reflective
}
sphere {
    center = (2, 1, 3)
    radius = 1
    color = (0, 255, 0)      # Green
    specular = 10            # Slightly shiny
    reflective = 0.4         # Even more reflective
}
sphere {
    color = (255, 255, 0)    # Yellow
    center = (0, -5001, 0)
    radius = 5000
    specular = 1000          # Very shiny
    reflective = 0.5         # Half reflective
}

light {
    type = ambient
    intensity = 0.2
}
light {
    type = point
    intensity = 0.6
    position = (2, 1, 0)
}
light {
    type = directional
    intensity = 0.2
    direction = (1, 4, 4)
}


Foreword

Moore's law, which describes the exponential growth of computing power over time, suggests that sooner or later the ray tracing methods used to create highly realistic images in 3D editors will become usable in real time in computer games.

But in reality, the prospects of ray tracing will be affected to a much greater extent by the laws passed by legislators, by the tastes of the voters, that is, the users, and by scientific and technological advances in quite distant fields.

Introduction

Let us briefly recall the essence of (backward) ray tracing. In the rasterization method used in modern real-time graphics, to draw an object, the triangles that make it up are projected onto the screen plane and drawn pixel by pixel, filling the depth buffer, which stores the distance to the screen plane. The depth buffer is needed so that the triangles closest to the viewer cover the more distant ones, and not the other way around. All other effects are built on top of rasterization.

In backward ray tracing, on the contrary, the image is built starting from the pixels of the screen rather than from the objects. An imaginary ray is cast through each point of the screen, away from the observer; it simulates a ray of light that arrived at the observer from that direction. For each ray, we determine which object it hits first, and the color of the object at the intersection point sets the color of that pixel. But then the most interesting part begins. After hitting the object, the ray continues its journey through the scene: rays are cast toward the light sources to check whether this point of the object is in shadow, a reflected ray can be cast if the object has mirror properties, and a refracted ray can be cast if the object is translucent.

In this example, the point of the object is directly lit by only one light source; the second one is blocked by another object.

Thus, there is some simulation of the propagation of light. The method has many complex modifications, but they are based on "ray tracing", that is, finding the intersection of a ray (light) with scene objects.

Problem

Although the tracing method makes it possible to draw a scene with lighting effects, transparency, and reflections, it is computationally extremely expensive. Finding the intersection of an arbitrary ray with the objects of a complex scene is highly non-trivial, and it cannot be accelerated by special (fairly simple) hardware as easily as the mathematically simple operation of triangle rasterization. That is why game graphics use rasterization, which quickly draws the geometry, that is, object shapes and textures with all kinds of shaders, while the lighting of almost the entire scene is static. Only for individual moving models are special-case methods used to draw shadows, and those are also based on rasterization: the shadows are, in effect, simply drawn.

The simplest example: the silhouette of an object is drawn into a separate buffer from the point of view of the light source, and the contents of this buffer are then projected, like a texture, onto the surface beneath the object. The result is dynamic moving shadows, which have been seen in computer games for a long time. The method allows improvements: the silhouette can be projected onto walls and curved surfaces, and its texture can be blurred to obtain a grayscale pattern instead of a sharp black-and-white silhouette. When applied, this gives a smooth transition from darkness to light, the so-called soft shadow. It is not a physically correct shadow, but it looks similar.

A ray-traced soft shadow would be more realistic, but much more expensive to compute. And the very first question is: will a gamer, in the heat of, say, a shooter, notice the difference between a heavily approximated shadow and a more physically correct one? Here we come to people's, that is, gamers', subjective perception of graphics. After all, the picture on the monitor only roughly approximates reality, and if you use different criteria, the measure of that approximation changes.

It turns out, unsurprisingly, that for the majority the decisive criterion is geometric detail, followed at some distance by high-quality texturing. In terms of texturing, ray tracing is roughly at parity with rasterization, so we will not dwell on texturing and material shaders here.

But drawing the scene geometry with ray tracing does not pay off, although it depends on the scene. Certain kinds of scenes are more efficient to draw with tracing, but the scenes of modern games are far from that class.

Later in the article we will look at various tracing projects in more detail, but, for example, Intel once demonstrated rendering of Quake III levels with ray tracing: low-poly levels were drawn slowly, at low resolution, on a very expensive high-end system far removed from the consumer market. The selling point was that dynamic shadows and complex reflections could be drawn.

Human vision and perception, however, are highly adaptive to lighting. If anything, shadows get in the way when the eye tries to pick out the objects it needs: when hunting, a typical occupation of our ancestors, prey could hide in the shade of trees, so the brain effectively tries to discount shadows in the image it forms.

Another point is that the actual lighting of a single scene can be incredibly varied, depending on the reflective properties of the surfaces, the properties of the medium (air in particular), and the properties of the light source. Here we mean not mirror reflections but the scattering of light by objects: the far corner of a room is darker than the area near the window simply because fewer of the photons bouncing around the room end up there. The air itself can also scatter light in various ways. Even so, for the approximate, simplified lighting models used in many games it is quite possible to pick realistic parameters for surface reflectance, for the air, and for the light source, so that the lighting of a game scene roughly matches reality.

Games also often use pre-rendered lighting, computed in advance by the same ray tracing method and baked into textures. And that is mostly fine: most of the time we see static light in real life. The sun moves slowly across the sky, and when we enter a room we switch the light on if it is not already on. Then we grab the machine gun, shoot out the light bulbs, and the light goes off. All of this can be precomputed and stored in special textures called lightmaps (to save space they are of lower resolution than the material textures, since lighting changes smoothly and can be interpolated well from a small texture). Alternatively, lighting can be computed for each vertex of a highly detailed scene, while shadows from moving models are drawn approximately with one of the special-case techniques.
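
To illustrate why a coarse lightmap is enough, here is a minimal sketch of sampling a low-resolution lightmap with bilinear interpolation; the 4x4 grid of baked values and the albedo are invented for the example.

```cpp
#include <cstdio>

const int W = 4, H = 4;
// Precomputed (baked) lighting, e.g. produced offline by a ray tracer.
float lightmap[H][W] = {
    {0.1f, 0.2f, 0.3f, 0.3f},
    {0.2f, 0.5f, 0.7f, 0.6f},
    {0.3f, 0.7f, 1.0f, 0.8f},
    {0.3f, 0.6f, 0.8f, 0.7f},
};

// Bilinear lookup at texture coordinates (u, v) in [0, 1]: lighting changes
// smoothly, so interpolating a small grid per pixel is usually good enough.
float sampleLightmap(float u, float v) {
    float x = u * (W - 1), y = v * (H - 1);
    int x0 = (int)x, y0 = (int)y;
    int x1 = x0 + 1 < W ? x0 + 1 : x0;
    int y1 = y0 + 1 < H ? y0 + 1 : y0;
    float fx = x - x0, fy = y - y0;
    float top    = lightmap[y0][x0] * (1 - fx) + lightmap[y0][x1] * fx;
    float bottom = lightmap[y1][x0] * (1 - fx) + lightmap[y1][x1] * fx;
    return top * (1 - fy) + bottom * fy;
}

int main() {
    // The final pixel value is the material texture modulated by the baked light.
    float albedo = 0.8f;                          // from the material texture
    float lit = albedo * sampleLightmap(0.4f, 0.6f);
    std::printf("lit value = %.3f\n", lit);
}
```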


Over the past few years, ray tracing seems to have become the "number one dream" of the real-time 3D graphics world. Interest in this rendering technique peaked when a young researcher, Daniel Pohl, announced his ray tracing project back in 2004.

The reason the general public took notice of the work was, in large part, that Pohl focused on the famous id Software titles Quake III, Quake IV, and the shooter Quake Wars. The researcher attracted a lot of press attention, and gamers began to dream of a brighter future in which their favorite games would be rendered with ray tracing and freed from rasterization.

Intel quickly took notice of the project, and it seemed like the perfect way for the company to justify increasing the number of cores in processors. The company quickly launched its own research program, and today Intel never misses an opportunity to highlight that ray tracing is the future of real-time 3D gaming. But is it really so? What technological realities lie behind the marketing hype? What are the real benefits of ray tracing? Can we expect ray tracing to replace rasterization? We will try to answer these questions.



Basic principles

The basic idea behind the ray tracing method is very simple: for every pixel on the display, the rendering engine draws a ray from the viewer's eye (camera) through that pixel into the rendered scene. The first intersection is used to determine the color of the pixel, as a function of the surface the ray hits.

But this alone is not enough to display a realistic scene. It is also necessary to determine the lighting of the pixel, which requires secondary rays (as opposed to the primary rays, which determine the visibility of the different objects that make up the scene). To calculate the scene's lighting effects, secondary rays are drawn from the intersection points to the different light sources. If these rays are blocked by an object, then that point lies in shadow with respect to the light source in question. Otherwise, the light source contributes to its illumination. The sum of all the secondary rays that reach light sources determines the amount of light falling on our scene element.
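
A minimal sketch of that lighting step, assuming the hit point and surface normal have already been found by a primary ray; the two lights, the single spherical occluder, and the simple Lambert term are all invented for the example.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

struct Vec3 {
    double x, y, z;
    Vec3 operator-(Vec3 b) const { return {x - b.x, y - b.y, z - b.z}; }
    double dot(Vec3 b) const { return x * b.x + y * b.y + z * b.z; }
    double length() const { return std::sqrt(dot(*this)); }
    Vec3 normalized() const { double l = length(); return {x / l, y / l, z / l}; }
};

struct Light { Vec3 position; double intensity; };
struct Sphere { Vec3 center; double radius; };

// True if the segment from `from` toward `to` is blocked by any occluder.
bool occluded(Vec3 from, Vec3 to, const std::vector<Sphere>& occluders) {
    Vec3 dir = (to - from).normalized();
    double maxT = (to - from).length();
    for (const Sphere& s : occluders) {
        Vec3 oc = from - s.center;
        double b = oc.dot(dir), c = oc.dot(oc) - s.radius * s.radius;
        double disc = b * b - c;
        if (disc < 0) continue;
        double t = -b - std::sqrt(disc);
        if (t > 1e-4 && t < maxT) return true;
    }
    return false;
}

int main() {
    Vec3 hitPoint{0, 0, 0}, normal{0, 1, 0};
    std::vector<Light> lights = {{{ 5, 5, 0}, 1.0}, {{-5, 5, 0}, 0.5}};
    std::vector<Sphere> occluders = {{{-2.5, 2.5, 0}, 1.0}};  // blocks the second light

    double lighting = 0.0;
    for (const Light& l : lights) {
        if (occluded(hitPoint, l.position, occluders)) continue;   // point is in shadow
        Vec3 toLight = (l.position - hitPoint).normalized();
        double nDotL = normal.dot(toLight);
        if (nDotL > 0) lighting += l.intensity * nDotL;            // Lambert diffuse term
    }
    std::printf("diffuse lighting at the point = %.3f\n", lighting);
}
```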

But that's not all. To get the most realistic rendering, the reflection and refraction characteristics of the material must also be taken into account. In other words, you need to know how much light is reflected at the point of intersection of the primary ray, as well as how much light passes through the material at that point. Again, to calculate the final color of the pixel, reflection and refraction rays have to be drawn.
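
A small sketch of how the directions of those reflection and refraction rays are obtained, using the standard mirror-reflection formula and Snell's law; the incident direction, the normal, and the index of refraction are example values only.

```cpp
#include <cmath>
#include <cstdio>

struct Vec3 {
    double x, y, z;
    Vec3 operator-(Vec3 b) const { return {x - b.x, y - b.y, z - b.z}; }
    Vec3 operator+(Vec3 b) const { return {x + b.x, y + b.y, z + b.z}; }
    Vec3 operator*(double s) const { return {x * s, y * s, z * s}; }
    double dot(Vec3 b) const { return x * b.x + y * b.y + z * b.z; }
};

// Mirror reflection: r = d - 2 (d . n) n, with d the incident direction.
Vec3 reflect(Vec3 d, Vec3 n) { return d - n * (2.0 * d.dot(n)); }

// Refraction by Snell's law; eta = n1 / n2. Returns false on total internal reflection.
bool refract(Vec3 d, Vec3 n, double eta, Vec3& out) {
    double cosI = -d.dot(n);
    double k = 1.0 - eta * eta * (1.0 - cosI * cosI);
    if (k < 0.0) return false;                 // no transmitted ray
    out = d * eta + n * (eta * cosI - std::sqrt(k));
    return true;
}

int main() {
    Vec3 incident{0.7071, -0.7071, 0.0};       // 45 degrees onto a horizontal surface
    Vec3 normal{0.0, 1.0, 0.0};

    Vec3 r = reflect(incident, normal);
    Vec3 t;
    bool ok = refract(incident, normal, 1.0 / 1.5, t);  // from air into glass

    std::printf("reflected: (%.3f, %.3f, %.3f)\n", r.x, r.y, r.z);
    if (ok) std::printf("refracted: (%.3f, %.3f, %.3f)\n", t.x, t.y, t.z);
}
```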

As a result, we get several types of rays. Primary rays are used to determine the visibility of an object and play roughly the role of the Z-buffer in rasterization. The secondary rays are divided into the following:

  • shadow/lighting rays;
  • reflection rays;
  • refraction rays.

Classic ray tracing algorithm.

This ray tracing algorithm is the result of the work of Turner Whitted, the researcher who invented it 30 years ago. Until then, ray tracing algorithms worked only with primary rays, and the improvements Whitted introduced were a giant step toward realistic scene rendering.

If you are familiar with physics, you have probably noticed that the ray tracing algorithm works "in the opposite direction" from the phenomena occurring in the real world. Contrary to a popular belief in the Middle Ages, our eyes do not emit rays of light; on the contrary, they receive the light of light sources reflected off the various objects around us. In principle, that is exactly how the very first ray tracing algorithms worked.

But the main drawback of those first algorithms was the huge computational load they imposed. For each light source you need to draw thousands of rays, many of which have no effect on the rendered scene at all (because they never cross the imaging plane). Modern ray tracing algorithms are optimizations of the basic approach and use so-called reverse (backward) ray tracing, in which the rays are drawn in the opposite direction to reality.


The original ray tracing algorithm led to a lot of unnecessary calculations.

Benefits of Ray Tracing

As you have seen, the main advantage of the ray tracing method is its simplicity and elegance. The algorithm uses a single primitive, the ray, to produce effects that, with standard rasterization, often require non-trivial approaches and complex simulation techniques.


An environment map gives a good approximation of the reflections of the surroundings, but the ray tracing method can even simulate the reflection of the car Luigi's eyes on his own hood.

Reflections are one area where ray tracing excels. Today's 3D game engines calculate reflections using environment maps. This technique gives a good approximation of the reflections of objects located "at infinity" or in the surroundings (as the name suggests), but for nearby objects the approach shows its limitations.

The developers of racing games, in particular, have come up with their own trick for simulating reflections of nearby objects: so-called dynamic cube maps. A camera is placed at the player's car, and the scene is rendered along the principal directions. The rendered results are then stored in cube maps, which are used to draw the reflections.
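
A conceptual sketch of that update step, assuming a hypothetical renderSceneFromCamera() routine provided by the engine (the name is made up for the example): the scene is rendered six times from the reflective object's position, once per cube face, typically at reduced resolution.

```cpp
#include <cstdio>
#include <vector>

struct Vec3 { double x, y, z; };
using FaceImage = std::vector<unsigned char>;   // pixels of one rendered cube face

// Stand-in for the engine's renderer (hypothetical; a real engine would
// rasterize the whole scene from this camera position and orientation).
FaceImage renderSceneFromCamera(Vec3 position, Vec3 forward, Vec3 up, int resolution) {
    (void)position; (void)forward; (void)up;
    return FaceImage(resolution * resolution * 3, 0);   // placeholder RGB pixels
}

int main() {
    Vec3 carPosition{0, 1, 0};          // the camera sits at the reflective object
    const int faceResolution = 256;     // usually lower than the main frame buffer

    // The six principal directions of a cube map: +X, -X, +Y, -Y, +Z, -Z.
    struct Face { Vec3 forward, up; };
    const Face faces[6] = {
        {{ 1, 0, 0}, {0, 1,  0}}, {{-1, 0, 0}, {0, 1,  0}},
        {{ 0, 1, 0}, {0, 0, -1}}, {{ 0, -1, 0}, {0, 0, 1}},
        {{ 0, 0, 1}, {0, 1,  0}}, {{ 0, 0, -1}, {0, 1,  0}},
    };

    std::vector<FaceImage> cubeMap;
    for (const Face& f : faces)
        cubeMap.push_back(renderSceneFromCamera(carPosition, f.forward, f.up, faceResolution));

    // The cube map is now ready to be sampled along reflected view vectors
    // when the car's body is shaded.
    std::printf("updated %zu cube faces at %dx%d\n", cubeMap.size(), faceResolution, faceResolution);
}
```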


Dynamic cubemaps can simulate reflections of nearby objects, such as an airplane on a teapot. But they can't cope with reflections of parts of the object on each other, for example, the spout of a teapot on its body.

Of course, dynamic cube maps also have their drawbacks. Computing several extra renders is rather expensive, so to keep performance from dropping too much, the cube maps are not updated as often as the main picture, which can cause a slight lag in the reflections. To reduce the load on fill rate, the faces are rendered at a lower resolution, which can lead to pixelated reflections. Finally, this technique is often limited to the player's car, while all other objects use simpler (spherical) environment maps.

With ray tracing, reflections come out correctly and without special algorithms, since everything is handled by the main rendering loop. Another advantage is the rendering of reflections of parts of an object on itself, for example the reflection of a side-view mirror on the body of a car, which is very difficult to obtain with rasterization; here such a reflection is produced in exactly the same way as any other.


The ray tracing method allows you to simulate the reflections of parts of objects on each other, for example, the reflection of a side view mirror on the body of a car.

Another indisputable advantage of the ray tracing method is the high-quality handling of transparency effects. With a rasterization algorithm it is extremely difficult to display transparency correctly, because the result depends directly on the rendering order. To get good results, the transparent polygons have to be sorted from farthest to nearest the camera and then rendered in that order.

In practice, though, this sorting is heavy in terms of computation, and transparency errors are still possible, because polygons are sorted rather than pixels. There are several techniques that avoid sorting the scene's polygons (such as depth peeling and A-buffers), but at the moment none of them fully solves the problem. The ray tracing algorithm, on the other hand, handles transparency effects elegantly.
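
For reference, a toy sketch of the back-to-front sorting and blending a rasterizer has to do; the three "polygons", reduced to a distance, a grayscale color, and an alpha value, are invented for the example, and sorting whole polygons rather than pixels is exactly where the artifacts mentioned above come from.

```cpp
#include <algorithm>
#include <cstdio>
#include <vector>

struct TransparentPoly {
    double distanceToCamera;   // e.g. distance of the polygon's centroid
    double color;              // grayscale, for brevity
    double alpha;              // opacity
};

int main() {
    std::vector<TransparentPoly> polys = {
        {2.0, 1.0, 0.5},       // near, bright, half transparent
        {8.0, 0.2, 0.7},       // far, dark
        {5.0, 0.6, 0.3},       // in between
    };

    // Sort farthest-to-nearest so each polygon is blended over what lies behind it.
    std::sort(polys.begin(), polys.end(),
              [](const TransparentPoly& a, const TransparentPoly& b) {
                  return a.distanceToCamera > b.distanceToCamera;
              });

    double framebuffer = 0.0;  // background value of one pixel
    for (const TransparentPoly& p : polys)
        framebuffer = p.color * p.alpha + framebuffer * (1.0 - p.alpha);  // "over" blend

    std::printf("final pixel value = %.3f\n", framebuffer);
}
```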


Proper handling of transparency effects with a rasterizer requires the use of complex algorithms, such as A-buffers.

Another important advantage is the calculation of shadows. In the world of rasterization, shadow mapping has become the standard technique, but it has several problems, such as "staircase" artifacts along shadow edges and the amount of memory it consumes. The ray tracing algorithm solves the shadow problem in a very elegant way, without resorting to complicated algorithms, using the same single primitive and without requiring additional memory.

Finally, another strong advantage of the ray tracing method is its native ability to work with curved surfaces. For several years now, GPUs have had support for curved surfaces (it appears and disappears as new drivers and new architectures are released). But where rasterizers have to run an initial tessellation pass to produce triangles (the only primitive a rasterization engine can work with), a ray tracing engine can intersect rays directly with the precise mathematical definition of the surface.

Myths about ray tracing

But the ray tracing method still shouldn't be idealized, so it's time to break some of the myths surrounding this algorithm.

To begin with, many gamers consider the ray tracing algorithm fundamentally better than rasterization because it is used in films. This is not true. Most CGI films (including all Pixar films) use an algorithm called REYES, which is based on rasterization. Pixar added ray tracing to its RenderMan rendering engine only later, during the production of Cars. And even for that film, ray tracing was used selectively so as not to overwhelm the available computing power. Before that project, Pixar used a plug-in only for limited applications of ray tracing, such as ambient occlusion (AO) shading.



The second common myth among ray tracing advocates concerns the complexity of scenes that can be rendered with ray tracing and rasterization. To understand, we need to take a closer look at each algorithm.

This is how the rasterization algorithm works with each triangle in the scene:

  • the set of pixels covered by the triangle is determined;
  • for each covered pixel, its depth is compared with the depth already stored in the Z-buffer at that position.

The main limitation of rasterization is the number of triangles: the algorithm has O(n) complexity, where n is the number of triangles. Its cost is linear in the triangle count because for every frame the full list of triangles is walked and processed one by one.
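
A toy illustration of that loop, with axis-aligned rectangles standing in for projected triangles and a per-pixel depth buffer keeping the nearest surface; the scene data is invented, but the structure of the loop (outer loop over primitives, hence the O(n) cost) is the point.

```cpp
#include <cstdio>
#include <vector>

struct Box2D { int x0, y0, x1, y1; double depth; char id; };  // stand-in for a projected triangle

int main() {
    const int W = 16, H = 8;
    std::vector<double> depthBuffer(W * H, 1e30);
    std::vector<char> colorBuffer(W * H, '.');

    std::vector<Box2D> primitives = {
        {2, 1, 10, 6, 5.0, 'A'},     // farther primitive
        {6, 3, 14, 7, 2.0, 'B'},     // nearer primitive, overlaps A
    };

    // For each primitive: find the covered pixels, then depth-test each one.
    for (const Box2D& p : primitives)
        for (int y = p.y0; y < p.y1; ++y)
            for (int x = p.x0; x < p.x1; ++x)
                if (p.depth < depthBuffer[y * W + x]) {     // keep the closest surface
                    depthBuffer[y * W + x] = p.depth;
                    colorBuffer[y * W + x] = p.id;
                }

    for (int y = 0; y < H; ++y) {
        for (int x = 0; x < W; ++x) std::putchar(colorBuffer[y * W + x]);
        std::putchar('\n');
    }
}
```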

In contrast, the ray tracing algorithm works as follows.

For each frame pixel:

  • a ray is drawn to determine which triangle is the closest;
  • for each candidate triangle, the distance from the intersection to the image plane is calculated.

As you can see, the processing order has been reversed. In the first case we took each polygon and looked at which pixels it covered; in the second we take each pixel and look at which polygon corresponds to it. So you might think that ray tracing depends less on the number of polygons than rasterization does, since the polygon count does not appear in the main loop. But in practice this is not the case: to determine which triangle a ray intersects first, we still have to process all the triangles in the scene. Ray tracing advocates will, of course, object that you do not have to test every triangle against every ray. With an appropriate data structure, it is easy enough to organize the triangles so that only a small percentage of them are tested per ray, which gives the ray tracing method a complexity of O(log n), where n is the number of polygons.
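
As a concrete (if toy) example of such a data structure, here is a compact bounding volume hierarchy (BVH) over spheres: a ray first tests a node's bounding box and only descends into boxes it actually hits, so only a small fraction of the primitives are intersected. The scene, the median split along x, and the global test counter are simplifications for illustration.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <memory>
#include <vector>

struct Vec3 { double x, y, z; };
struct Sphere { Vec3 c; double r; };
struct AABB { Vec3 lo, hi; };

AABB boundsOf(const std::vector<Sphere>& s) {
    AABB b{{1e30, 1e30, 1e30}, {-1e30, -1e30, -1e30}};
    for (const Sphere& sp : s) {
        b.lo = {std::min(b.lo.x, sp.c.x - sp.r), std::min(b.lo.y, sp.c.y - sp.r), std::min(b.lo.z, sp.c.z - sp.r)};
        b.hi = {std::max(b.hi.x, sp.c.x + sp.r), std::max(b.hi.y, sp.c.y + sp.r), std::max(b.hi.z, sp.c.z + sp.r)};
    }
    return b;
}

// Standard "slab" ray/box test; inv holds 1/direction per component.
bool hitBox(const AABB& b, Vec3 o, Vec3 inv) {
    double t0 = 0, t1 = 1e30;
    const double lo[3] = {b.lo.x, b.lo.y, b.lo.z}, hi[3] = {b.hi.x, b.hi.y, b.hi.z};
    const double oo[3] = {o.x, o.y, o.z}, ii[3] = {inv.x, inv.y, inv.z};
    for (int a = 0; a < 3; ++a) {
        double ta = (lo[a] - oo[a]) * ii[a], tb = (hi[a] - oo[a]) * ii[a];
        if (ta > tb) std::swap(ta, tb);
        t0 = std::max(t0, ta); t1 = std::min(t1, tb);
        if (t0 > t1) return false;
    }
    return true;
}

struct Node {
    AABB box;
    std::vector<Sphere> leaf;                    // non-empty only for leaves
    std::unique_ptr<Node> left, right;
};

std::unique_ptr<Node> build(std::vector<Sphere> s) {
    auto node = std::make_unique<Node>();
    node->box = boundsOf(s);
    if (s.size() <= 2) { node->leaf = std::move(s); return node; }
    // Split at the median center along x (a real builder would pick the widest axis).
    std::sort(s.begin(), s.end(), [](const Sphere& a, const Sphere& b) { return a.c.x < b.c.x; });
    size_t mid = s.size() / 2;
    node->left = build({s.begin(), s.begin() + mid});
    node->right = build({s.begin() + mid, s.end()});
    return node;
}

int g_tests = 0;  // how many sphere tests the traversal actually performed

bool hitSphere(const Sphere& s, Vec3 o, Vec3 d) {
    ++g_tests;
    Vec3 oc{o.x - s.c.x, o.y - s.c.y, o.z - s.c.z};
    double b = oc.x * d.x + oc.y * d.y + oc.z * d.z;
    double c = oc.x * oc.x + oc.y * oc.y + oc.z * oc.z - s.r * s.r;
    return b * b - c >= 0;
}

bool traverse(const Node* n, Vec3 o, Vec3 d, Vec3 inv) {
    if (!hitBox(n->box, o, inv)) return false;      // skip whole subtrees the ray misses
    if (!n->leaf.empty()) {
        for (const Sphere& s : n->leaf) if (hitSphere(s, o, d)) return true;
        return false;
    }
    return traverse(n->left.get(), o, d, inv) || traverse(n->right.get(), o, d, inv);
}

int main() {
    std::vector<Sphere> scene;
    for (int i = 0; i < 64; ++i) scene.push_back({{double(i * 3), 0, -10}, 1.0});
    auto root = build(scene);

    Vec3 origin{100, 0, 0}, dir{0, 0, -1};
    Vec3 inv{1e30, 1e30, -1};                        // 1/dir; a large value stands in for infinity
    bool hit = traverse(root.get(), origin, dir, inv);
    std::printf("hit=%d, sphere tests performed: %d of %zu\n", hit, g_tests, scene.size());
}
```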

The argument is valid. But ray tracing advocates are being a little disingenuous here, because the same is true of rasterization: game engines have used binary space partitioning (BSP) trees and other structures for years to limit the number of polygons processed per frame. Another point of contention is that such structures are most effective for static data. You compute them once and then simply query them, and that works very well. But what about dynamic data? The structure has to be recalculated for every frame, and there is no miracle formula for that: you still end up examining every polygon.

Simple algorithm?

The final myth concerns the natural simplicity and elegance of the ray tracing algorithm. Of course, a ray tracing algorithm can be written in a few lines of code (some algorithms fit on one side of a business card), but a high-performance ray tracing algorithm is a completely different matter.

David Luebke, an Nvidia engineer, made a comment that reflects reality perfectly: "Rasterization is fast, but you need to carefully consider how to perform complex visual effects. Ray tracing supports complex visual effects, but you need to carefully consider how to make it fast."


Code for a minimal ray tracing algorithm, written by Paul Heckbert to fit on a business card.

You only need to read a few articles about the optimizations that have to be applied to ray tracing to appreciate Luebke's words. For example, the most powerful ray tracers do not process rays independently; they use so-called ray packets to group rays that share an origin and have similar directions. This optimization maps well onto the single instruction, multiple data (SIMD) units inside CPUs and GPUs, and it works very well for primary rays, which have some degree of coherence (co-directionality), and for shadow rays. On the other hand, it is of no help for refraction or reflection rays.
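
A rough sketch of the packet idea, with plain arrays standing in for the four SIMD lanes (real implementations use SSE/AVX intrinsics or GPU warps); the shared origin, the slightly fanned-out directions, and the single test sphere are invented for the example.

```cpp
#include <cmath>
#include <cstdio>

struct RayPacket4 {
    double ox, oy, oz;            // shared origin (e.g. the camera)
    double dx[4], dy[4], dz[4];   // four nearly parallel, nearly unit-length directions
};

int main() {
    RayPacket4 p{0, 0, 0,
                 {-0.02, -0.01, 0.01, 0.02},   // slightly fanned-out x components
                 { 0.00,  0.00, 0.00, 0.00},
                 {-1.00, -1.00, -1.00, -1.00}};

    // One sphere, tested against all four lanes at once.
    const double cx = 0, cy = 0, cz = -10, r = 1;
    double hitT[4];

    // No lane-dependent branching here, which is exactly what makes coherent
    // packets SIMD-friendly; incoherent secondary rays break this property.
    for (int lane = 0; lane < 4; ++lane) {
        double ocx = p.ox - cx, ocy = p.oy - cy, ocz = p.oz - cz;
        double b = ocx * p.dx[lane] + ocy * p.dy[lane] + ocz * p.dz[lane];
        double c = ocx * ocx + ocy * ocy + ocz * ocz - r * r;
        double disc = b * b - c;
        hitT[lane] = disc >= 0 ? -b - std::sqrt(disc) : -1.0;
    }

    for (int lane = 0; lane < 4; ++lane)
        std::printf("lane %d: t = %.3f\n", lane, hitT[lane]);
}
```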

Moreover, as Daniel Pohl notes in his article about Quake Wars RT, ray packets become problematic with transparent textures (the familiar alpha textures used for trees): if the rays in a packet do not all behave the same way (some hit the surface, others pass through), the extra overhead can easily outweigh the benefit of the optimization.


Visualization of the "cost" of rendering each pixel, where red pixels are the most "expensive". As you can see, tree rendering is very expensive in the ray-traced version of Quake Wars.

Finally, as we already mentioned, the ray tracing method requires a suitable data structure to store the different elements of the scene, and it is this structure that will play a decisive role in the final performance. But choosing and then working with such a data structure is not as simple as it seems at first glance. Some structures perform better with static data, while others can be updated faster with dynamic data or take up less memory. As usual, it all comes down to finding an acceptable compromise. Miracles don't happen.

So, as we can see, the ray tracing algorithm is not always the model of simplicity and elegance some believe it to be. To get good performance out of it, you need programming solutions no less non-trivial than those required to obtain complex visual effects with rasterization.

Now that we have dispelled some of the myths associated with ray tracing, let's turn to the real problems that are associated with this technology.

And we will start with the main problem associated with this rendering algorithm: its slowness. Of course, some enthusiasts will say that this is no longer a problem, since the ray tracing algorithm is well parallelized, and the number of processor cores increases every year, so we should see a linear increase in ray tracing performance. Also, research on optimizations that can be applied to ray tracing is still in its infancy. If you look at the first 3D accelerators and compare them with what is available today, then there are indeed reasons for optimism.

However, this view misses an important point: the most interesting part of ray tracing is the secondary rays. In practice, an image computed with primary rays alone offers little improvement in quality over the classical Z-buffer algorithm. The problem with secondary rays is that they have no coherence (co-directionality) at all. Moving from one pixel to the next, completely different data has to be fetched, which defeats the usual caching techniques that are essential for good performance. This means that computing secondary rays depends heavily on the memory subsystem, and in particular on its latency. That is the worst possible case, because of all memory characteristics, latency is the one that has improved the least in recent years, and there is no reason to believe the situation will change in the foreseeable future. It is fairly easy to increase memory bandwidth by running multiple chips in parallel, but latency stays the same.


For graphics cards, memory latency falls much more slowly than bandwidth grows: where bandwidth improves by a factor of 10, latency improves by only a factor of two.

The GPU owes its popularity to the fact that building hardware specialized for rasterization turned out to be very effective. With rasterization, memory accesses are coherent, whether we are working with pixels, texels, or vertices, so small caches paired with massive memory bandwidth are an ideal recipe for great performance. Increasing bandwidth is very expensive, of course, but such a solution is viable as long as it pays off. By contrast, there are currently no solutions for speeding up memory access when computing many incoherent rays. For this reason, ray tracing will never be as efficient as rasterization.

Another characteristic problem of the ray tracing method concerns anti-aliasing (AA). Rays are simple mathematical abstractions with no real width, and the ray-triangle intersection test is a boolean function that answers "yes" or "no" but gives no partial answer such as "the ray intersects the triangle by 40%". A direct consequence is the appearance of "staircase" edges.

To solve this problem, several technologies have been proposed, such as beam tracing and cone tracing, which take into account the thickness of the rays, but their complexity does not allow for an efficient implementation. And the only technology that can give good results is the calculation of more rays than there are pixels, that is, supersampling (rendering at a higher resolution). It is hardly worth mentioning once again that this technology is much more expensive in terms of computing power than the multisampling used in modern GPUs.
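
A minimal sketch of the supersampling idea: each pixel is covered by a 4x4 grid of sub-samples whose results are averaged, smoothing the hard yes/no edge a single ray per pixel produces. The "scene" is just a 2D circle test, invented so the example stays tiny.

```cpp
#include <cstdio>

// Stand-in for "does this ray hit geometry": a circle of radius 8 around (12, 8).
bool hits(double x, double y) {
    double dx = x - 12.0, dy = y - 8.0;
    return dx * dx + dy * dy < 64.0;
}

int main() {
    const int W = 32, H = 16;
    const int S = 4;                         // 4x4 sub-samples per pixel = 16 "rays"
    const char* shades = " .:-=+*#";

    for (int py = 0; py < H; ++py) {
        for (int px = 0; px < W; ++px) {
            int covered = 0;
            for (int sy = 0; sy < S; ++sy)
                for (int sx = 0; sx < S; ++sx)
                    if (hits(px + (sx + 0.5) / S, py + (sy + 0.5) / S)) ++covered;
            // Averaged coverage gives a smooth edge instead of a "ladder".
            int shade = covered * 7 / (S * S);
            std::putchar(shades[shade]);
        }
        std::putchar('\n');
    }
}
```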

Hybrid rendering engine?

If you have read the whole article up to this point, you are probably already thinking that ray tracing cannot replace rasterization yet, but that perhaps the two technologies could be mixed. At first glance they do seem complementary: it is easy to imagine rasterizing the triangles to determine visibility, taking advantage of rasterization's excellent performance, and then applying ray tracing only to some surfaces, adding realism where it is needed, such as shadows, accurate reflections, and transparency. In fact, Pixar used this approach for Cars: the geometry is handled by REYES, and ray tracing is used "on demand" where particular effects have to be simulated.


For Cars, Pixar used a hybrid rendering engine combining REYES for rendering and on-demand ray tracing for reflections and ambient occlusion.

Unfortunately, even though such a scheme sounds promising, hybrid solutions are not that easy to implement. As we have already seen, one of the important drawbacks of ray tracing is the need to organize a data structure that limits the number of ray-object intersection tests, and using a hybrid model instead of pure ray tracing does not change that. The structure still has to be maintained, with all the disadvantages that entails. Suppose, for example, that ray tracing handles only static data while dynamic data is rendered via rasterization. Then we lose much of the benefit of ray tracing: since the ray tracer knows nothing about the dynamic objects, it cannot make them cast shadows or appear in reflections.

Moreover, in terms of performance, the biggest problem is the memory access pattern of the secondary rays, and those are exactly the rays we need in a hybrid engine. So the performance gain will be smaller than one might initially expect: since most of the rendering time is spent on secondary rays, the saving from not tracing primary rays is negligible.

In other words, when trying to combine the advantages of both methods, we inevitably combine their disadvantages, losing the elegance of the ray tracing method and the high performance of rasterization.

As we have repeatedly noted in this article, many problems remain to be solved before ray tracing becomes a worthy alternative to rasterization in real-time rendering. And, if you think about it, would it even be a cure for all ills? The advantages of ray tracing are not revolutionary enough to justify its significant performance cost. Its strong points are reflections and transparency, since these two effects are the hardest to reproduce with existing rasterization techniques. But, again, is that such a serious shortcoming? The world around us is not made up entirely of highly transparent or mirror-like objects, so our vision may well be satisfied with a rough approximation.

If you look at the latest racing simulators, such as Gran Turismo and Forza, the rendering quality is clearly quite satisfactory, even though the reflections on the bodywork are completely fake. An exact reflection of the rear-view mirror on the paint can hardly be considered a decisive step toward photorealism.


In fact, there are no true reflections here: the side-view mirror, for example, is not reflected on the car body. But do you really need an "honest" ray-traced rendering of the Audi R8?

Most enthusiasts believe that ray tracing is inherently better than rasterization, but they often base that opinion on images produced by offline, non-real-time engines, whose results are far beyond what today's games can do. There is also some confusion around ray tracing itself: enthusiasts compare rasterization against photorealistic images that are actually produced by a combination of several techniques, such as ray tracing for direct reflections, radiosity for diffuse interreflection, photon mapping for caustics, and so on, all combined to achieve the most photorealistic quality possible.


To get photorealistic rendering, you need to combine several technologies. Ray tracing alone is not sufficient to simulate complex interactions between different types of materials and light.

In its basic form, judging by the existing attempts at real-time implementation, the ray tracing method is only suitable for perfect mirror reflections and hard (sharp) shadows. Doom 3 proved a few years ago that a solid 3D engine with fully dynamic shadows can be built on rasterization, but in hindsight the game also showed that hard shadows are not realistic.



To create soft shadows or diffuse reflections (like what you see on textured metal, for example), more advanced ray tracing techniques are required, such as path tracing or distributed ray tracing. But such techniques require significantly more rays, so they are still poorly suited for real time.

Some users believe that sooner or later so much computing power will be available that the performance advantage of rasterization will no longer be the decisive factor: by the law of diminishing returns, the performance gains of rasterization will be given up in favor of the elegance of ray tracing, much as the performance advantage of coding in assembly language was eventually judged insufficient to outweigh the advantages of high-level languages.

However, this is unlikely to convince us. In any case, we are still far from the time when we can sacrifice performance for elegance and simplicity. Just look at what has happened over the last 10 years in offline rendering: if a frame of Toy Story took, on average, two hours to render, a frame of Ratatouille already took six and a half hours, even though processing power grew more than 400-fold between the two films. In other words, the more computing power and resources you give computer artists, the faster they absorb them.

If even a company like Pixar, which can afford to spend several hours of computation on a single frame, chose to use ray tracing only sparingly because of its performance cost, then the day when real-time 3D games have enough processing power to render everything with ray tracing is very, very far away. And enthusiasts will certainly find other uses for that computing power in the meantime.

At Gamescom 2018, Nvidia announced a series of Nvidia GeForce RTX graphics cards that will support Nvidia RTX real-time ray tracing technology. Our editors figured out how this technology will work and why it is needed.

What is Nvidia RTX?

Nvidia RTX is a platform that gives developers a set of tools opening access to a new level of computer graphics. Nvidia RTX is available only for the new generation of Nvidia GeForce RTX graphics cards, based on the Turing architecture. The main feature of the platform is real-time ray tracing.

What is ray tracing?

Ray tracing is a technique that simulates the behavior of light to create believable lighting. In today's games rays are not traced in real time, which is why the picture, although it often looks beautiful, is still not realistic enough: the hardware currently in use would need an enormous amount of resources for ray tracing.

This is corrected by the new Nvidia GeForce RTX graphics card series, which has enough power to calculate the path of the rays.

How does it work?

RTX casts rays of light from the player's (camera's) point of view into the surrounding space and uses them to calculate what color each pixel should be. When the rays hit something, they can:

  • reflect, which produces a reflection on the surface;
  • stop, which creates a shadow on the side of the object the light did not reach;
  • refract, which changes the direction of the ray or affects its color.

These behaviors make it possible to create more believable lighting and more realistic graphics. The process is very resource-intensive and has long been used in film effects. The only difference is that when rendering a movie frame, the authors have vast resources and what is effectively unlimited time, whereas in a game the device has a fraction of a second to form the picture, and it usually relies on a single graphics card rather than the many used in film production.

This prompted Nvidia to add dedicated cores to the GeForce RTX graphics cards that take on most of this workload, improving performance. The cards also carry AI hardware whose task is to predict and correct errors arising during tracing, which, according to the developers, further speeds up the work.

And how does ray tracing affect quality?

During the presentation, Nvidia showed a number of examples of ray tracing in action: in particular, it became known that some upcoming games, including Shadow of the Tomb Raider and Battlefield 5, will run on the RTX platform. The feature will be optional, however, since tracing requires one of the new video cards. The trailers shown during the presentation can be viewed below:

Shadow of the Tomb Raider, which will be released on September 14 this year:

Battlefield 5, which will be released on October 19:

Metro Exodus, which is scheduled for release on February 19, 2019:

Control, which has no release date yet:

Nvidia also named the other games that will receive ray tracing support.

How to enable RTX?

Due to the technical features of this technology, only video cards with Turing architecture will support ray tracing - the devices currently available cannot cope with the amount of work that tracing requires. At the moment, the only video cards with this architecture are the Nvidia GeForce RTX series, models of which are available for pre-order from 48,000 to 96,000 rubles.

Are there analogues for AMD?

AMD has its own variant of real-time ray tracing technology, present in its Radeon ProRender engine. The company announced the development back at GDC 2018, held in March. The main difference between AMD's approach and Nvidia's is that AMD gives access not only to tracing but also to rasterization, the technique used in all games today. This makes it possible to use tracing where it improves lighting and to save resources where tracing would just be an unnecessary load on the video card.

The technology, which will run on the Vulkan API, is still in development.

As Nvidia stated during its presentation, RTX technology will significantly improve the graphics in games by expanding the set of tools available to developers. Nevertheless, it is too early to talk about a general graphics revolution: not all games will support the technology, and the video cards that support it are quite expensive. Still, the launch of the new cards shows that graphics are progressing, and that progress will only continue.
