Hello Triangle, Meet Swift! (And Wide Color)

Two triangles rendered with Metal

The colors at left were gamma encoded after interpolation; those on the right were not.

For an iOS developer wanting to get their feet wet with Metal, a natural place to start is Apple’s Hello Triangle demo.

It is truly the “Hello World” of Metal. All it does is render a two-dimensional triangle, whose corners are red, green and blue, into an MTKView. The vertex and fragment shaders are about as simple as you can get. Even so, it’s a great way to start figuring out how the pieces of the pipeline fit together.

The only catch: it’s written in Objective-C.

As a Swift developer, I found myself wishing I could see a version of Hello Triangle in that language. So I decided to convert it to Swift. (The conversion itself was pretty straightforward: You can see the code in this repo.)

To spice things up a little, I also updated the demo to support wide color, which in Apple’s ecosystem means using the Display P3 color space. (Wide color refers to the ability to display colors outside of the traditional gamut, known as sRGB; it’s something I explored in this earlier post.)

Supporting wide color in Hello Triangle is conceptually simple: Instead of setting the vertices to pure red, green and blue as defined in sRGB, set them to the pure red, green and blue as defined in Display P3. On devices that support it, the corners of the triangle will appear brighter and more vivid.

But as a Metal novice, I found it a bit tricky. In macOS, the MTKView class has a settable colorspace property, which presumably makes things fairly simple, but in iOS that property isn’t available.

For that reason, it wasn’t immediately clear to me where in the Metal pipeline to make adjustments for wide color support.

I found an answer in this excellent Stack Overflow reply and related blog post. The author explains how to convert Display P3 color values (which range from 0.0 to 1.0, but actually refer to a wider-than-normal color space) to extended sRGB values (which are comparable to normal sRGB values, except that they can be negative or greater than 1.0) with the help of a matrix transform. The exact math depends on the colorPixelFormat of the MTKView, which determines where the gamma gets applied.

OK, so about gamma: the gist of gamma correction is that color intensities are often passed through a non-linear function before an image is saved. Because most images store only 256 levels per channel, and the human eye is more sensitive to changes in dark tones than in bright ones, the gamma function devotes more of those levels to the darks, sacrificing precision in the bright intensities. The values are then passed through an inverse function when presented on a display.

Because gamma encoding is not linear, values that are evenly spaced before encoding (also known as “compression”) won’t be evenly spaced after the encoding. (This blog post has a superb explanation of gamma correction for those who aren’t familiar.)
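To make that non-linearity concrete, here is a minimal sketch that pushes five evenly spaced linear values through the standard sRGB encoding curve. The function name srgbEncode and the sample values are my own choices for illustration; the constants are the same ones used in the gammaEncoded helper later in this post.

```swift
import Foundation

// sRGB transfer function (linear -> gamma encoded).
// Negative inputs are mirrored so the curve also works for extended sRGB.
func srgbEncode(_ c: Double) -> Double {
    if abs(c) <= 0.0031308 { return c * 12.92 }
    let magnitude = pow(abs(c), 1.0 / 2.4) * 1.055 - 0.055
    return c < 0 ? -magnitude : magnitude
}

// Evenly spaced in linear space...
let linearSteps: [Double] = [0.0, 0.25, 0.5, 0.75, 1.0]
// ...but not evenly spaced once encoded.
let encodedSteps = linearSteps.map(srgbEncode)

for (linear, encoded) in zip(linearSteps, encodedSteps) {
    print(String(format: "linear %.2f -> encoded %.4f", linear, encoded))
}
```

The first quarter step in linear space already lands a little above 0.5 after encoding, and each later step covers less ground than the one before it.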

There’s a lot of implicit gamma encoding and decoding that can happen in the Metal pipeline, and if you manipulate values without knowing which state you’re in, things can get screwed up fast.

As I learned from those earlier blog posts, there are a couple of options for handling gamma when rendering in wide color to an MTKView:

  1. convert your Display P3 color values to their linear (non-encoded) counterpart in sRGB, and allow the MTKView to apply the gamma encoding for you (by choosing pixel format .bgra10_xr_srgb), or
  2. convert the P3 values to linear sRGB and then pre-apply the gamma encoding yourself mathematically, choosing the pixel format .bgra10_xr.
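On the view side, the two options differ only in the pixel format. A rough sketch, assuming an iOS target; the GammaHandling enum and the configure function are hypothetical names for illustration and are not part of Apple's sample:

```swift
import MetalKit

// Hypothetical helper type: which side applies the gamma encoding.
enum GammaHandling {
    case viewEncodes   // option 1: supply linear values, the view encodes them
    case preEncoded    // option 2: supply values that are already gamma encoded
}

// Hypothetical helper: picks the pixel format that matches the chosen option.
func configure(_ view: MTKView, for handling: GammaHandling) {
    switch handling {
    case .viewEncodes:
        view.colorPixelFormat = .bgra10_xr_srgb
    case .preEncoded:
        view.colorPixelFormat = .bgra10_xr
    }
}
```

Whichever format is chosen, the render pipeline descriptor's color attachment pixel format has to match the view's colorPixelFormat.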

In this demo, this is the difference between converting the left corner’s “extended” color to 1.2249, -0.04203, -0.0196 (which is P3’s reddest red, converted to linear sRGB) and converting it to 1.0930, -0.2267, -0.1501 (P3’s reddest red as sRGB with gamma encoding applied; these are the numbers you would get if you used Apple’s ColorSync utility to convert to sRGB).
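Those two sets of numbers can be reproduced with a few lines of plain Swift. This is only a sketch: the helper names are mine, and the matrix is rounded to four decimals, so the last digit can differ slightly from the values quoted above.

```swift
import Foundation

// Rows of the linear Display P3 -> linear sRGB matrix (four-decimal rounding).
let p3ToSRGBRows: [SIMD3<Double>] = [
    SIMD3( 1.2249, -0.2247, 0.0),
    SIMD3(-0.0420,  1.0419, 0.0),
    SIMD3(-0.0197, -0.0786, 1.0979),
]

func dot(_ a: SIMD3<Double>, _ b: SIMD3<Double>) -> Double {
    return a.x * b.x + a.y * b.y + a.z * b.z
}

// Matrix times column vector: one dot product per row.
func linearP3ToLinearSRGB(_ c: SIMD3<Double>) -> SIMD3<Double> {
    return SIMD3(dot(p3ToSRGBRows[0], c),
                 dot(p3ToSRGBRows[1], c),
                 dot(p3ToSRGBRows[2], c))
}

// sRGB encoding curve, mirrored for negative (extended range) values.
func gammaEncode(_ c: Double) -> Double {
    if abs(c) <= 0.0031308 { return c * 12.92 }
    let magnitude = pow(abs(c), 1.0 / 2.4) * 1.055 - 0.055
    return c < 0 ? -magnitude : magnitude
}

// P3's reddest red. Its components are 0 or 1, so no gamma decoding is needed.
let linearRed = linearP3ToLinearSRGB(SIMD3(1.0, 0.0, 0.0))
let encodedRed = SIMD3(gammaEncode(linearRed.x),
                       gammaEncode(linearRed.y),
                       gammaEncode(linearRed.z))

print(String(format: "option 1 (linear):  %.4f, %.4f, %.4f",
             linearRed.x, linearRed.y, linearRed.z))
print(String(format: "option 2 (encoded): %.4f, %.4f, %.4f",
             encodedRed.x, encodedRed.y, encodedRed.z))
```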

While these conversions are probably best done in a shader, I only had three vertices to handle, so I did it in Swift code using matrix math (see below).

After trying options 1 and 2 above, I noticed an interesting difference in the visual results: when I let the MTKView apply gamma compression to my vertex colors (option 1, pictured above at left), the interior of the triangle was much lighter than when I used the technique in option 2 (right).

The issue was this: In option 1, not only were my triangle’s corners being assigned gamma-compressed values, but so were all of the pixels in between.

The way GPUs work is that values in between the defined vertices are computed automatically using a linear interpolation (or, strictly speaking, a barycentric interpolation) before being passed to the fragment shader.
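As a rough CPU-side illustration of what the rasterizer does for each pixel, barycentric interpolation is just a weighted sum of the three vertex values, with weights that add up to 1. The function and variable names here are hypothetical, and real hardware does this per fragment rather than in Swift:

```swift
// Blend three vertex colors using barycentric weights (w.x + w.y + w.z == 1).
func interpolate(_ a: SIMD3<Double>, _ b: SIMD3<Double>, _ c: SIMD3<Double>,
                 weights w: SIMD3<Double>) -> SIMD3<Double> {
    return a * w.x + b * w.y + c * w.z
}

let red   = SIMD3<Double>(1, 0, 0)
let green = SIMD3<Double>(0, 1, 0)
let blue  = SIMD3<Double>(0, 0, 1)

// The centroid weighs all three corners equally.
let centroid = interpolate(red, green, blue,
                           weights: SIMD3(1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0))

// Halfway along the red-green edge, blue contributes nothing.
let edgeMidpoint = interpolate(red, green, blue, weights: SIMD3(0.5, 0.5, 0.0))

print(centroid, edgeMidpoint)
```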

After the interpolation (which occurred in linear space), the gamma encoding moved all of the pixels toward lighter intensities (higher numbers, closer to 1.0).

But when I applied gamma encoding to the converted vertex colors “by hand” (option 2) and set the MTKView to the colorPixelFormat of .bgra10_xr, only the corners were gamma encoded, and the interpolation was effectively done in gamma space. The result was a triangle whose corners were the same color as in option 1, but whose interior values were biased toward the dark end, because of the nature of the gamma function described above.
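The difference is easy to see for a single channel. The sketch below looks only at the red channel halfway along the bottom edge, between the left (P3 red) and right (P3 blue) corners. It assumes the rounded matrix used in this post, under which P3 blue has no red component after conversion; the names are mine.

```swift
import Foundation

// sRGB encoding curve, mirrored for negative (extended range) values.
func encode(_ c: Double) -> Double {
    if abs(c) <= 0.0031308 { return c * 12.92 }
    let magnitude = pow(abs(c), 1.0 / 2.4) * 1.055 - 0.055
    return c < 0 ? -magnitude : magnitude
}

// Red channel of the two bottom corners, in linear extended sRGB.
let leftLinear = 1.2249   // P3 red
let rightLinear = 0.0     // P3 blue

// Option 1: interpolate in linear space, then gamma encode the result.
let option1Mid = encode((leftLinear + rightLinear) / 2)

// Option 2: gamma encode the corners first, then interpolate.
let option2Mid = (encode(leftLinear) + encode(rightLinear)) / 2

print(String(format: "option 1: %.3f, option 2: %.3f", option1Mid, option2Mid))
```

Option 1 comes out at roughly 0.81 and option 2 at roughly 0.55, which matches the lighter interior seen in the left-hand triangle.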

While neither option is necessarily wrong, you might argue that option 1 (interpolating in linear space) seems more natural, because light is additive in linear space.

Some specifics below:

Using this matrix and conversion functions from endavid

private static let linearP3ToLinearSRGBMatrix: matrix_float3x3 = {
    let col1 = float3([1.2249,  -0.2247,  0])
    let col2 = float3([-0.0420,   1.0419,  0])
    let col3 = float3([-0.0197,  -0.0786,  1.0979])
    return matrix_float3x3([col1, col2, col3])
}()

extension float3 {
    // sRGB gamma-encoded -> linear, mirrored for negative (extended) values
    var gammaDecoded: float3 {
        let f = {(c: Float) -> Float in
            if abs(c) <= 0.04045 {
                return c / 12.92
            }
            return sign(c) * powf((abs(c) + 0.055) / 1.055, 2.4)
        }
        return float3(f(x), f(y), f(z))
    }
    // linear -> sRGB gamma-encoded, mirrored for negative (extended) values
    var gammaEncoded: float3 {
        let f = {(c: Float) -> Float in
            if abs(c) <= 0.0031308 {
                return c * 12.92
            }
            return sign(c) * (powf(abs(c), 1/2.4) * 1.055 - 0.055)
        }
        return float3(f(x), f(y), f(z))
    }
}

…and a conversion function like this…

func toSRGB(_ p3: float3) -> float4 {
    // Note: gamma decoding not strictly necessary in this demo
    // because 0 and 1 always decode to 0 and 1
    let linearSrgb = p3.gammaDecoded * linearP3ToLinearSRGBMatrix
    let srgb = linearSrgb.gammaEncoded
    return float4(x: srgb.x, y: srgb.y, z: srgb.z, w: 1.0)
}

…the color adjustment went like this:

let p3red = float3([1.0, 0.0, 0.0])
let p3green = float3([0.0, 1.0, 0.0])
let p3blue = float3([0.0, 0.0, 1.0])

let vertex1 = Vertex(position: leftCorner, color: toSRGB(p3red))
let vertex2 = Vertex(position: top, color: toSRGB(p3green))
let vertex3 = Vertex(position: rightCorner, color: toSRGB(p3blue))

let myWideColorVertices = [vertex1, vertex2, vertex3]

I hope this port helps someone out there. And huge thanks to David Gavilan for his informative blog posts and for his incredibly helpful feedback on this post.

Hello Triangle Swift