OpenUSD is not a File Format

OpenUSD seems overcomplicated for a lot of simple tasks. Why? And why would we want that?

When people first encounter USD they ask a lot of questions. How much should go into one file? When should I use layers? Do I look like I know what a payload is?

When I first came across OpenUSD it wasn't called USD yet, but the ideas were all there or being developed. I asked a lot of questions like... why? Couldn't this be simpler? It took me a while, but once I saw how it was used something clicked for me. I started to think of USD not as a file format, but more like source code or a database. USD is not the product, it's the data that's used to build the product.

If you take this view I think the complex parts of USD start to make sense.

If you come to USD expecting a better FBX or an alternative to glTF I think you're going to be disappointed. USD can represent whatever you want, but you'll only need a subset of its features. You'll be trying to just load a model and find yourself reading about components, assemblies, opinions, layers, payloads, and on and on. You might ask, "Why is this so over-engineered for displaying a model?" I'd answer, "This isn't over-engineering, these are features that solve a different set of problems."

The Source Code Analogy

If we consider that USD files are like source code, what does that mean? In a software project many programmers can contribute by editing source files. All the source files are then used to create an application, server, phone app, CLI tool, whatever. An automated build process converts the source files into an output artifact, and that artifact gets distributed to users.

USD enables that same style of collaboration for 3D content. Each artist can work on whatever source USD files they need to change, and then a process can convert those source USD files into whatever artifact you're creating. Maybe that's frames of a movie, like at Pixar. Or maybe a simulated factory, like in NVIDIA Omniverse. Or it could be a usdz asset for Apple augmented reality. In all those cases artists and technical users can collaborate on the content and the automated process that makes the artifact can do smart things to make good output.

You could view most render farm managers (Tractor, OpenCue, Deadline) as being like build systems (make, cargo, webpack) - they orchestrate the conversion process. Software renderers (RenderMan, Arnold, V-Ray) are like compilers (gcc, rustc, javac) - they transform the source into the final output.

Real time renderers are analogous to interpreters (python, v8, deno) but maybe now I'm stretching the analogy too far 🤓

While these tools are not USD only, USD's open format makes it much easier to inspect and modify data in 3D content. In that way your 3D files are more like the source code files programmers normally deal with. You can inspect your USD content without using the original tool that generated it. It's also easier to write new tools that analyze or edit the USD source files. The point I'm trying to make is that USD offers us a path to applying source code concepts and tooling to our 3D content. I think this is an area that's ripe for innovation.

There are currently efforts underway to write USD linters which, much like source code linters, can help you find anti-patterns in your data. See the Pixar one and the NVIDIA one. Because USD is open source it's fairly easy for developers to write their own linter as well, or any other tool.

In my career I've written one-off Python scripts that open USD files looking for specific issues, a known kind of flipped normal for instance, and correct the normals in the data. These scripts didn't need to load the tool that created the USD file to inspect it, and didn't need to perform a render. They could inspect the data mathematically and make the correction in place, which let us run them across terabytes of 3D content in a few minutes. Well... maybe 30 minutes ;)
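To give a feel for what "mathematically inspect the data" can look like, here is a minimal sketch of one plausible flipped-normal check. It is not the actual script described above. It assumes triangle meshes, one stored normal per face, and right-handed winding, and it works on plain Python lists rather than a real USD file. In practice the points and normals would be read with the USD Python API (for example from a UsdGeom.Mesh prim). The function names here are made up for illustration.

```python
def face_normal(p0, p1, p2):
    """Unnormalized triangle normal implied by winding order (right-handed)."""
    ux, uy, uz = (p1[i] - p0[i] for i in range(3))
    vx, vy, vz = (p2[i] - p0[i] for i in range(3))
    # Cross product u x v
    return (uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx)


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def fix_flipped_normals(points, triangles, normals):
    """Return (corrected per-face normals, number of normals flipped).

    A stored normal counts as flipped when it points away from the
    normal implied by the triangle's winding order.
    """
    fixed = []
    flipped = 0
    for (i0, i1, i2), stored in zip(triangles, normals):
        geometric = face_normal(points[i0], points[i1], points[i2])
        if dot(geometric, stored) < 0:
            fixed.append((-stored[0], -stored[1], -stored[2]))
            flipped += 1
        else:
            fixed.append(stored)
    return fixed, flipped


# Hypothetical data: a unit quad split into two triangles, where the
# second stored normal points the wrong way.
points = [(0, 0, 0), (1, 0, 0), (1, 1, 0), (0, 1, 0)]
triangles = [(0, 1, 2), (0, 2, 3)]
normals = [(0, 0, 1), (0, 0, -1)]

fixed, flipped = fix_flipped_normals(points, triangles, normals)
print(f"flipped {flipped} of {len(normals)} normals")
```

Nothing here needs a renderer or the authoring tool, which is why this kind of check is cheap to run across a large library of files.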

Given that kind of freedom, imagine how easy it would be to write a 3D model generator or even an interactive 3D modeler, a tool for generating a specific kind of shading, or anything else you might need for your project.

The Database Analogy

USD is like a database in that it is designed to support the kinds of relationships that are needed for 3D content, and it is optimized for efficiently accessing the data you need.

All the composition arcs exist to support relationships between different entities. Sublayers are for overriding values non-destructively, references and payloads are used to import content from another entity, variants are for making several versions based on a single base entity, and inherits and specializes are for making classes of entities that all share some content.
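As a rough illustration of the "override values non-destructively" idea, here is a toy model in plain Python. It is not the USD API, and real composition is far richer than this. The layer dicts, attribute names, and the resolve function are all hypothetical. The only point is that a stronger layer can change what you see without touching the data in a weaker layer.

```python
# Toy model only: each "layer" is a dict of (prim path, attribute) -> value.
# This is NOT the USD API; it just illustrates non-destructive overrides.

def resolve(layer_stack, prim_path, attr):
    """Return the value from the strongest layer that has an opinion."""
    key = (prim_path, attr)
    for layer in layer_stack:  # strongest layer first
        if key in layer:
            return layer[key]
    return None  # no layer has an opinion about this attribute


# The base asset, as authored by a modeler.
model_layer = {
    ("/World/countertop/hotdog", "color"): "brown",
    ("/World/countertop/hotdog", "scale"): 1.0,
}

# Another artist's layer: overrides one value and leaves the rest alone.
lookdev_layer = {
    ("/World/countertop/hotdog", "color"): "golden",
}

layer_stack = [lookdev_layer, model_layer]

color = resolve(layer_stack, "/World/countertop/hotdog", "color")
scale = resolve(layer_stack, "/World/countertop/hotdog", "scale")
print(color, scale)
```

The modeler's layer still says "brown". Remove the override layer from the stack and the original value comes back, which is what makes the workflow non-destructive.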

USD also provides some database-like optimizations - intelligent caching, memory efficiency, good locality of reference on disk, and fast lookups even for massive files.

In terms of the type of database, I think it's a bit of a hybrid. From a high level it is relational (even if the relationships aren't the traditional ones). Entities use a namespace path as a key, for instance /World/countertop/hotdog. This is to ensure uniqueness while keeping key values human readable. However, USD does not offer a SQL interface, ACID guarantees, or other properties of a relational database.
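To make the "namespace path as a key" idea concrete, here is a small sketch using only the Python standard library. The prims dict and helper functions are hypothetical stand-ins. Real USD has its own path type (Sdf.Path) and stage queries for this, so treat it as an analogy rather than how USD is implemented.

```python
from pathlib import PurePosixPath

# Toy "database": each prim path is a unique, human-readable key.
# The entries are made-up example data.
prims = {
    "/World": {"type": "Xform"},
    "/World/countertop": {"type": "Xform"},
    "/World/countertop/hotdog": {"type": "Mesh"},
}


def parent_of(path):
    """The parent key is derived from the path itself."""
    return str(PurePosixPath(path).parent)


def children_of(path):
    """All keys whose parent is the given path."""
    return sorted(p for p in prims if p != path and parent_of(p) == path)


print(parent_of("/World/countertop/hotdog"))
print(children_of("/World/countertop"))
```

Because the hierarchy is encoded in the key, looking up an entity, its parent, or its children never requires a separate index or join table.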

In terms of storage it has more in common with a non-relational database. Content is stored in documents containing named entities whose data structures resemble strongly typed JSON objects. The actual on-disk format can be either human-readable text or optimized binary representations designed for fast, database-style lookups.

Nice Analogies, What Does That Mean?

These are some of the reasons for all that complexity. All the new concepts and complicated edge cases come from the desire to have a body of fast to access, easy to inspect, easy to modify source data that supports your ability to build your product. That's different from most 3D file formats, which I often refer to as publishing formats.

If you choose to use USD in your projects, there are still other kinds of complexity to deal with. You'll likely want some kind of version control, which USD doesn't provide an answer for. You'll need to choose a material representation that works for you, and there isn't an obvious path here because it's very project dependent. You need a good viewer, you need some kind of automation to generate your artifacts, etc. This is mostly stuff you don't have to worry about if you keep everything in a single tool like Maya, Houdini, or Blender and export to a simpler publishing format like glTF.

Why would I want all this extra complexity?

Here's the payoff: you get non-destructive and collaborative workflows, you get total control of your data, you get the ability to automate time consuming exports and builds, you get the ability to work at really large scale.

To me it's this: You get total freedom. Use whatever you want to make whatever you want.

Why Other Formats Don't Solve This Problem

I think USD is uniquely positioned in this way. Other formats don't really provide the same features. In my mind most 3D content formats are more like publishing formats. They are the artifact, not the source data. If we use Photoshop as an analogy, USD is the layers in your image, and other formats are the JPEG you export.

Let's look at a few alternatives.

First, glTF. Let's be clear, I think glTF is great. If you're delivering meshes with materials to be rendered by three.js or another real-time rendering framework, I think glTF is the way to go today. It's got great compression, it's pretty flexible in what it can represent, it's widely supported, and it's optimized for loading to the GPU. It will out-perform USD at loading files to render, and it's very likely faster to render in a real-time rendering pipeline. If glTF solves your problems you should use it. You'd only choose USD over glTF if you want USD's source-code-like and database-like features.

FBX is still a commonly used format and has wide support. I mostly see it used for importing skeletal animation: think of a model with some bones, and values that move the bones around to move the model. FBX has a lot of warts; it isn't an open format, and exports aren't always imported correctly in another tool. Parsing the files yourself is complex and challenging. Still, I do find myself using it when I need to work with Unreal Engine. For the purposes of this article, it is another publishing format, and I don't see it as a good way to store your source data.

Alembic is an interesting case. It was created to solve a lot of the same problems as USD, but USD has grown in adoption faster. In my opinion Alembic took a simpler approach. It has less of the complexity of USD's composition arcs. It doesn't have as much of a focus on compression and optimization for the database concept, because it originally was used for storing baked output for renders. USD was used for this too, along with its other features. I'd choose USD over Alembic for most tasks unless I had an existing codebase that used Alembic. There's just more momentum and more modern work around USD.

These other formats have a place, but none of them provide the kind of collaboration and freedom I believe USD can enable.

Let's Do a Reality Check

I like this vision of USD enabling us all to use any tool, make anything we want, do so with open source tools, share our work and build on each other's successes. It would be great. Today though, it is not the reality.

I think there are a few difficult things when choosing to work with USD.

  • Distribution: There's no clear path from, "I have some USD files" to "Here's how I can share them with a collaborator" or "Here's something I can send to my users"
  • Versioning: I've seen engineering teams get lost in this for months or years. There's no good default choice.
  • Materials: You can use any material system you want, but which should you use?
  • Viewers: There's no easy, free USD viewer anyone can use. usdview requires C++ compilation knowledge, and proprietary editors often only offer text views of USD layers for debugging.
  • Automation: There are few open source automation tools for USD, like a source code build system or webpack. Your proprietary editors might provide this, but if it doesn't suit your needs you can't really use USD without a lot of work.

As long as we're being pessimists, let me also point out that I'm presenting my viewpoint here. There are many well informed people, including those building USD and on the AOUSD Foundation, that disagree with some of my assertions here. In particular, people push back against the idea that the barrier to entry is as high as I state, and that USD is not a good publishing/delivery format. As always when reading what some guy on the internet says, test all this against your own experience and knowledge.

A Vision of What Could Be

Allow me to dream for a second.

Imagine creating a 3D project like we create software today. You initialize a project folder, use any tools you want - Blender, Maya, Houdini, an AI agent, custom Python scripts - and they all contribute to a unified USD project. Version control just works. Materials have sensible defaults. Collaborators can pull just what they need and contribute back.

Need to render images? Automation tools handle it. Need more power? Cloud workflows are built in. Want to export to a game engine? One command. Want to publish to the web? You got it.

In this world, the barrier to creating custom tools drops dramatically. We'd see ecosystems like npm or three.js emerge for 3D content creation (not just display). Studios could focus on creativity instead of constantly reinventing the technological wheel. Small creators could access powerful workflows. The cost of 3D content creation would plummet.

I'd love to see that. It would enable richer 3D experiences, better medical/architectural/scientific visualizations, product configurators with good 3D views, much cheaper VFX and animation for film studios and creators, and far more high-quality content across the board.

I believe the problems I've presented above are solvable. If we can build enough common tooling and settle on some useful defaults a lot of the barriers to USD adoption will fall away. The foundation is sound.

Some Takeaways

If you're fighting USD in your work, it may be growing pains, but it also might be that it's the wrong tool for the job. Do you need a 3D rendering format? Or do you need a lot of fast, complex, collaborative source data for generating your output? In the first case, I'd say use glTF. In the second, USD is the only game in town.

If you read this nodding your head at all, or have some ideas for USD tooling, defaults, etc. I'd love to hear about it!