GPU Texture Compression Everywhere

When I joined NVIDIA in 2005, one of my main goals was to work on texture and mesh processing tools. The NVIDIA Texture Tools were widely used, and Cem Cebenoyan, Sim Dietrich and Clint Brewer had been doing interesting work on mesh processing and optimization (nvtristrip, nvmeshmender). That was exactly the kind of work I wanted to be involved in.

However, the priorities of the tools team were different, and I ended up working on FX Composer instead. I wasn’t particularly excited about that, so in 2006, I switched to the Developer Technology group.

At the time, NVIDIA and ATI were competing for dominance in the GPU market. While we had a solid market share, our real goal was to grow the overall market: expanding the “pie” rather than just our slice of it. If you imagine a gamer with a fixed budget, we wanted them to allocate more of that budget to the GPU rather than the CPU. One way to achieve this was by encouraging developers to shift workloads from the CPU to the GPU.

This push was part of the broader GPGPU movement. CUDA had just been released, but it had no integration with graphics APIs, and compute shaders didn’t exist yet. One of the workloads that caught our attention was GPU texture compression. Under the pretext of harnessing the GPU, I found my way back to working on texture compression.

Another idea gaining traction at the time was runtime texture compression.

In April 2004, Farbrausch released .kkrieger, a first-person shooter that packed all its content into just 96 KB by using procedural generation for levels, models, and textures. But it wasn’t until late in the development of fr-041: debris in 2007 that they started using runtime DXT compression to reduce GPU memory usage and improve performance.

Around the same time, Allegorithmic was developing ProFX, the predecessor to Substance Designer, a middleware for real-time procedural texturing. ProFX also included a fast DXT encoder, allowing procedural textures to be converted into GPU-friendly formats at load time.

Simon Brown was working on PlayStation Home, Sony’s 3D social virtual world where players could create and customize their avatars. To support this, he wrote a fast DXT encoder optimized for the PS3’s SPUs, demonstrating the potential of offloading texture compression to parallel processors.

John Carmack had been talking about the megatexture technology for a while, but in 2006, when Jan Paul van Waveren published the details of their Real-Time DXT Compression implementation on the Intel Software Network, we at NVIDIA saw a potential problem: if Rage ended up CPU-limited, it could push gamers toward CPU upgrades rather than GPUs. That made real-time texture compression on the GPU a strategic priority for us.

Continue reading →

Tools for GPU Codec Development

I would like to share some details about the tools that I’ve built to develop the Spark codecs.

When I started working on Spark I had no idea how much time and effort I would be investing in this project. I started with very few tools and helpers, with the goal of obtaining results quickly. While I was able to create a proof of concept in a few days, over time it became clear that I would need better tools in order to maintain that fast development pace.

The first tool I built was Spark Report, a command-line application that automates codec testing. It runs the codecs on diverse image sets, calculates error metrics, generates detailed reports, compares results with other codecs, and tracks performance over time. With Spark Report, I could confidently iterate on the codecs, knowing that any changes I made wouldn’t introduce regressions and would improve results across a wide range of images.
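To make that concrete, the core of such a harness boils down to comparing decompressed output against the original with simple error metrics. This is just an illustrative sketch of RMSE and PSNR over 8-bit pixel data, not actual Spark Report code:

```cpp
#include <cmath>
#include <cstdint>
#include <limits>
#include <vector>

// Root-mean-square error between an original image and a codec's
// decoded output, both flattened to 8-bit component arrays.
double rmse(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b) {
    double sum = 0.0;
    for (size_t i = 0; i < a.size(); i++) {
        double d = double(a[i]) - double(b[i]);
        sum += d * d;
    }
    return std::sqrt(sum / double(a.size()));
}

// Peak signal-to-noise ratio in dB, relative to the 8-bit peak value.
double psnr(const std::vector<uint8_t>& a, const std::vector<uint8_t>& b) {
    double e = rmse(a, b);
    if (e == 0.0) return std::numeric_limits<double>::infinity();
    return 20.0 * std::log10(255.0 / e);
}
```

In practice the report also has to aggregate these numbers across an image corpus and diff them against stored baselines to flag regressions.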

In addition to detecting regressions and doing comparative analysis against other codecs, I also needed a tool to dig deeper and understand the behavior of the codecs and the consequences of my changes. Something I learned early is that you cannot trust error metrics too much and that there’s no alternative to visual inspection, so I also built Spark View, a tool to view the output of the codecs.

There’s always been some tension between codec development, tools, and quality-of-life improvements, but in retrospect I think that every effort on better tools has paid off handsomely; if anything, I’d say I delayed that work too much. In my defense, it was hard to justify the work without knowing the full scope of the project. Would I be working on Spark for a few months? Or a few years? Initially I did not know if it would be worth building a commercial product around it, but as things progressed, the scope of the project grew: more formats, more codecs, more platforms, and deeper codecs with increasingly complex optimizations.

Looking back, I wish I had prioritized tool development earlier, but hindsight is always clearer.

Continue reading →

Crossing the Ludicon

I’m excited to announce that I’m starting my own business to research and develop graphics and game technologies, with a focus on the texture and mesh processing pipelines.

My first product is a real-time ASTC encoder that is orders of magnitude faster than existing offline compressors. It targets a small subset of the available encoding space, but achieves competitive quality through carefully crafted algorithms and creative optimizations.

In addition to that, I’m exploring middleware products and applications that advance the state of the art in RDO texture compression, mesh processing algorithms such as simplification and parameterization, and alternative representations for rendering and physical simulation.

For inquiries, contact me at: castano@ludicon.com

BC1 Compression Revisited

The NVIDIA Texture Tools (NVTT) had the highest quality BC1 encoder that was openly available, but the rest of the code was mediocre: the remaining encoders had not received much attention, the CUDA code paths were not maintained, and the accompanying image processing and serialization code was conventional. There was too much code that was not particularly interesting; it was a hassle to build and required too much maintenance.

This compelled me to package the BC1 compressor independently as a single header library:

https://github.com/castano/icbc

While doing that I also took the opportunity to revisit the encoder and change the way it was vectorized. I wanted to write about that, but before getting into the details, let’s review how a BC1 compressor works.
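As a starting point, here is the structure a BC1 compressor has to target, shown from the decoder side: each 4×4 block stores two RGB565 endpoints and sixteen 2-bit indices into a four-entry palette derived from them. This sketch follows the published format description; it is not code taken from icbc:

```cpp
#include <array>
#include <cstdint>

struct Color { uint8_t r, g, b; };

// Expand a 5:6:5 endpoint to 8 bits per channel via bit replication.
static Color unpack565(uint16_t c) {
    uint32_t r = (c >> 11) & 31;
    uint32_t g = (c >> 5) & 63;
    uint32_t b = c & 31;
    return { uint8_t((r << 3) | (r >> 2)),
             uint8_t((g << 2) | (g >> 4)),
             uint8_t((b << 3) | (b >> 2)) };
}

// Weighted blend of two palette endpoints: (wa*a + wb*b) / div.
static Color lerpColor(Color a, Color b, int wa, int wb, int div) {
    return { uint8_t((wa * a.r + wb * b.r) / div),
             uint8_t((wa * a.g + wb * b.g) / div),
             uint8_t((wa * a.b + wb * b.b) / div) };
}

// Decode one 8-byte BC1 block into 16 RGB pixels.
std::array<Color, 16> decodeBC1(const uint8_t block[8]) {
    uint16_t c0 = uint16_t(block[0] | (block[1] << 8));
    uint16_t c1 = uint16_t(block[2] | (block[3] << 8));
    Color palette[4];
    palette[0] = unpack565(c0);
    palette[1] = unpack565(c1);
    if (c0 > c1) {
        // Four-color mode: two interpolants at 1/3 and 2/3.
        palette[2] = lerpColor(palette[0], palette[1], 2, 1, 3);
        palette[3] = lerpColor(palette[0], palette[1], 1, 2, 3);
    } else {
        // Three-color mode: midpoint, plus a (transparent) black entry.
        palette[2] = lerpColor(palette[0], palette[1], 1, 1, 2);
        palette[3] = { 0, 0, 0 };
    }
    // Sixteen 2-bit palette indices, packed little-endian.
    uint32_t indices = uint32_t(block[4]) | (uint32_t(block[5]) << 8) |
                       (uint32_t(block[6]) << 16) | (uint32_t(block[7]) << 24);
    std::array<Color, 16> out;
    for (int i = 0; i < 16; i++) {
        out[i] = palette[(indices >> (2 * i)) & 3];
    }
    return out;
}
```

A compressor works backwards from this: it picks the endpoint pair and per-pixel indices that minimize the reconstruction error of the block.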

Continue reading →

My DMV Experience

A few years ago I renewed my driver’s license, and when the new one arrived I was sad to find it had an error. The first half of my last name had been split off and recorded as my middle name.

This is a common mistake that Americans make, but I was unhappy about it and hoped it could be fixed easily, so I made an appointment at the DMV to correct it. I brought my previous license as proof of identity; it had not yet expired, so it seemed to me that it would be valid. However, I was informed that in order to correct the error I would also have to bring my birth certificate.

I then made another appointment and went with my birth certificate. I was born in Spain, so it was a Spanish birth certificate with a certified English translation. It turns out, however, that being a foreigner, I had to bring my passport instead.

Well, one more appointment, and as you may already be guessing, my passport was not enough. I am a permanent resident, so the DMV worker actually needed my green card.

For my last appointment I brought all the requested documents and more, and finally things went smoothly. There was no wait! Employees were polite and helpful! They scanned my documents, and I was able to request a new license with my last name corrected. I was given a temporary license that didn’t have the error, and was told my new license would be mailed in a few weeks. I was impressed!

A few days ago my new license arrived and my last name still had the same error.

“Climbing a big mountain is hard”

That was my son’s conclusion after today’s climb. At 9000 feet his head was aching with that uncomfortable combination of altitude, sun exposure, fatigue, and dehydration that mountaineers are so familiar with. Pyramid would have to wait for us another time. Less than 1000 feet to reach the summit. So close, but impossibly far at the same time.

This winter I bought Nacho his first pair of mountaineering boots and crampons. He has been practicing self belay and self arrest over the last couple of years, but it was time to take his climbing to the next level. One of our goals is to climb Shasta. This requires not only competency with crampons and ice axe, but also the fitness and endurance to ascend 7000 feet in a couple of days.

Continue reading →

Ladybugs

The ladybug, ladybird beetle, or simply lady beetle, has a special place in our culture: A children’s favorite, loved by gardeners, enemy of aphids, bringer of good fortune. What’s more exciting than finding a ladybug? Finding thousands of them!

Maia holds a lone ladybug by the South Yuba in April 2015.

Ladybugs are migratory insects. Here in California, during the winter months, they travel from the valley to the foothills and clump together at specific spots, typically sunny areas near water, covering rocks and vegetation in a living red carpet.

Continue reading →

Castle Peak 2015 – 2019

Nacho and I climbed Castle Peak together for the first time nearly 4 years ago. He had been asking me to take him camping in the snow and had been saying he wanted to climb a mountain (Mt. Shasta, no less!). To get him started I decided to take him to Castle Peak and climb it in two days.

I have to admit that back then I had very little experience camping in the winter, but I had done it enough times to feel confident taking him along. That said, I didn’t know what to expect. We didn’t have any specialized winter equipment, just our 3-season tent and regular camping gear; we even had to rent our snowshoes! Thankfully winters in California are fairly mild and weather forecasts are pretty accurate, so it was easy to pick a day with good weather and warm temperatures.

Continue reading →

2018 Recap

One of my goals for this year is to write more about our trips and adventures. I’ve been thinking I could write a guide book with all the material that I have, but in order to do that I would have to get better at documenting and organizing it. To get started I’m going to do a quick overview of the trips we did last year.

I thought 2018 was a slow year compared to the previous ones, but now that I sit down and look at everything we have done, I think it’s probably about average. I went on a total of 36 trips, totaling 63 days and 16 nights outdoors. The kids joined me on 24 of those trips (for 41 days and 10 nights). Initially I was thinking I could write a summary about each one, but that would be a very long post! Instead I’m just going to highlight the ones that I enjoyed the most.

None of the trips were particularly challenging. The year was punctuated by several injuries, and I’ve been feeling out of shape and more tired than usual. I don’t know if this is a sign of me getting old, or just that I need to slow down and give myself more time to recover. The kids are also growing up; it’s getting easier to take them along, and they both enjoy hiking and climbing, so I’ve been exploring with them more instead of going on personal trips.

Continue reading →

Lightmap optimizations for iOS

One of the main challenges of porting The Witness to iOS was reducing the app memory footprint. The lightmaps that we used in the PC version simply did not fit in the memory budget that we had for iOS.

As described in my previous article, on PC we compress our lightmaps using DXT5-RGBM. The DXT5 texture compression format is not available on iOS, so the first problem was to find a suitable alternative.
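For reference, RGBM packs an HDR value into an LDR texture by storing a shared multiplier in the alpha channel. Here is a minimal sketch, assuming a range factor of 8 (the constant actually used in The Witness may differ):

```cpp
#include <algorithm>
#include <array>
#include <cmath>

// Assumed range factor: HDR values in [0, 8] are representable.
constexpr float kRGBMRange = 8.0f;

// Pack an HDR color into normalized RGBM so that
// decode(rgbm) = rgb * M * kRGBMRange.
std::array<float, 4> encodeRGBM(float r, float g, float b) {
    float m = std::max({r, g, b}) / kRGBMRange;
    m = std::clamp(m, 1.0f / 255.0f, 1.0f);
    m = std::ceil(m * 255.0f) / 255.0f;  // quantize M up so RGB stays <= 1
    float scale = 1.0f / (m * kRGBMRange);
    return { r * scale, g * scale, b * scale, m };
}

// Reconstruct the HDR color from the packed RGBM value.
std::array<float, 3> decodeRGBM(const std::array<float, 4>& v) {
    float s = v[3] * kRGBMRange;
    return { v[0] * s, v[1] * s, v[2] * s };
}
```

Quantizing M upward keeps the scaled RGB channels within [0, 1], which matters because those channels are the part that subsequently goes through DXT5 compression.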

Continue reading →