# How do I prevent moiré artifacts in this light casting algorithm?

``````
#define LOCAL_WG_SIZE 128u
const float PI = 3.1415926535897932384626433832795;
const float RAY_LENGTH = 256.0f;

// get the number of the current thread (0...1608)
uint renderNodeNum = local_coords;
// get the endpoint of the ray, it will be on a circle
endPoint.x = int(RAY_LENGTH * sin(float(renderNodeNum) * PI / 1024.0f)) + lightPos.x;
endPoint.y = int(RAY_LENGTH * cos(float(renderNodeNum) * PI / 1024.0f)) + lightPos.y;

// vector approximation. Works, but has moire artifacts.
// I've also tried Bresenham's line algorithm, but it leaves a cross shape as the light fades which looks ugly.
vec2 dt = normalize(vec2(endPoint - lightPos));
vec2 t = vec2(lightPos);
for (int k = 0; k < RAY_LENGTH; k++) {
    coords.x = int(t.x);
    coords.y = int(t.y);

    // calculate transparency
    currentAlpha = (transpPixel.b + transpPixel.g * 10.0f + transpPixel.r * 100.0f) / 111.0f;
    // calculate color
    lightRay.rgb = min(colorPixel.rgb, lightRay.rgb) - (1.0f - currentAlpha) - transmit;
    currentOutPixel.rgb = max(currentOutPixel.rgb, lightRay.rgb);
    currentOutPixel.a = lightRay.a;
    // write color
    imageStore(img_output, coords, currentOutPixel);

    t += dt;
}
``````

In principle, you should avoid scatter-style (casting) writes on the GPU. Writing to arbitrary output coordinates has only been possible since Shader Model 5, and it is meant for extreme situations. As a general rule, write your GPU code in a "gather" fashion.

The difference: in a gather design, each hardware thread is logically tied to one output position in the render target. The scheduler decides which rectangle (or cube) of the target buffers a launched thread group writes its results to.

So you should start from the designated destination and work back to the source, instead of starting from some source and computing a dynamic destination.

This way you not only please the hardware by avoiding contention and race conditions completely, you also avoid holes, because every output pixel is written exactly once by its own thread.
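As a rough illustration of the gather approach, here is a minimal sketch of the same light pass with one thread per output pixel. It is not a drop-in replacement. `img_output`, `lightPos`, `transmit`, and the attenuation formula come from the question. The image names `img_color` and `img_transp`, the `lightColor` uniform, the `rgba8` formats, the bindings, and the 16x16 work group size are all assumptions made for the example. It also assumes a single light whose position lies inside the image.

```glsl
#version 430

// Gather: one thread per OUTPUT pixel, instead of one thread per ray.
layout(local_size_x = 16, local_size_y = 16) in;

// img_output is from the question. img_color and img_transp are assumed
// names for the sources of colorPixel and transpPixel.
layout(rgba8, binding = 0) uniform readonly image2D img_color;
layout(rgba8, binding = 1) uniform readonly image2D img_transp;
layout(rgba8, binding = 2) uniform writeonly image2D img_output;

uniform ivec2 lightPos;   // light position in pixels
uniform vec4 lightColor;  // assumed: initial value of lightRay
uniform float transmit;   // per-step falloff, as in the question

const float RAY_LENGTH = 256.0;

void main() {
    ivec2 dst = ivec2(gl_GlobalInvocationID.xy);
    ivec2 size = imageSize(img_output);
    if (dst.x >= size.x || dst.y >= size.y) {
        return; // this thread falls outside the image
    }

    vec2 toDst = vec2(dst - lightPos);
    float dist = length(toDst);
    vec4 lightRay = vec4(0.0); // pixels beyond the light radius stay dark

    if (dist <= RAY_LENGTH) {
        lightRay = lightColor;
        int steps = int(ceil(dist));
        vec2 dt = (steps > 0) ? toDst / float(steps) : vec2(0.0);
        vec2 t = vec2(lightPos) + 0.5; // start at the light's pixel center

        // March from the light to this pixel, applying the question's
        // attenuation at every sample, both endpoints included.
        for (int k = 0; k <= steps; k++) {
            ivec2 c = ivec2(floor(t));
            vec4 colorPixel = imageLoad(img_color, c);
            vec4 transpPixel = imageLoad(img_transp, c);
            float currentAlpha = (transpPixel.b + transpPixel.g * 10.0
                                  + transpPixel.r * 100.0) / 111.0;
            lightRay.rgb = min(colorPixel.rgb, lightRay.rgb)
                           - (1.0 - currentAlpha) - transmit;
            if (all(lessThanEqual(lightRay.rgb, vec3(0.0)))) {
                break; // light fully absorbed, nothing left to carry
            }
            t += dt;
        }
    }

    // The only write this thread performs, at its own coordinate.
    imageStore(img_output, dst, vec4(max(lightRay.rgb, vec3(0.0)), lightRay.a));
}
```

Under these assumptions you would dispatch enough 16x16 groups to cover the lit area, for example `ceil(width / 16)` by `ceil(height / 16)`. Each thread repeats part of the march its neighbors also do, so this trades extra reads for the guarantee that no two threads write the same pixel and no pixel inside the radius is skipped.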