Handling USD Errors in Python

Exploring the ways OpenUSD reports errors to Python clients. Let's look at how we can process and surface them!

National Archives photo no. 245-MS-2621L

This may be a bit of a niche topic, but it has come up several times in my career, and every time I lose a few hours digging around in docs and source code to get back up to speed. Here's my cheat sheet.

When writing anything user-facing with USD's Python bindings, you'll inevitably want to present errors in a user-friendly way. Errors in USD are reported in a very verbose and technical way, often including stack traces and symbols that look something like this:

Open(pxrInternal_v0_24__pxrReserved__::TfWeakPtr<
pxrInternal_v0_24__pxrReserved__::SdfLayer> rootLayer, 
pxrInternal_v0_24__pxrReserved__::TfWeakPtr<
pxrInternal_v0_24__pxrReserved__::SdfLayer> sessionLayer,
pxrInternal_v0_24__pxrReserved__::ArResolverContext pathResolverContext,
pxrInternal_v0_24__pxrReserved__::UsdStage::InitialLoadSet 
load=Usd.Stage.LoadAll)

That's Usd.Stage.Open, for what it's worth.

Let's explore errors in OpenUSD, how they're represented in Python, and what options we have for processing them.

Errors in C++

Errors mostly originate in C++, and it will be helpful to tour how those are represented before we get into the Python side. USD has pretty good documentation for these types of errors; check it out here.

The short version is: C++ libraries can report non-fatal errors during function execution. Error functionality is provided by the Tf library. These TfErrors don't raise exceptions. Callers can optionally watch for errors by using an object called TfErrorMark, or they can allow errors to be handled at a higher level, where they are typically written out to stderr.

When Python code is executing, errors are handled differently: if they have not been handled in C++ code before control returns to Python, they are converted into exceptions. This is because exceptions are seen as more pythonic. All of those unhandled C++ TfError objects are converted into a Tf.ErrorException that's raised in Python.

Back on the Python Side

Let's see one of these in action. Here's some output from an interactive Python shell. I'll create a temporary stage and attempt to define a mesh with an invalid name.

>>> stage = Usd.Stage.CreateInMemory()
>>> mesh = UsdGeom.Mesh.Define(stage, "~~~invalid name~~~")
Warning: in SdfPath at line 144 of /src/USD/pxr/usd/sdf/path.cpp -- 
Ill-formed SdfPath <~~~invalid name~~~>: :1:0(0): parse error matching
pxrInternal_v0_24__pxrReserved__::Sdf_PathParser::Path
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
pxr.Tf.ErrorException:
        Error in 
'pxrInternal_v0_24__pxrReserved__::UsdStage::_IsValidPathForCreatingPrim' 
at line 3634 in file /src/USD/pxr/usd/usd/stage.cpp : 'Path must be an 
absolute path: <>'

It's honestly not suuuper terrible for an error reported to a programmer. There are two distinct issues reported here. First, we see a Warning printed about parsing the SdfPath, because the path we've provided is not valid. Second, we see the Traceback and error message for the Tf.ErrorException that was raised. That exception has a pretty useful error, it says what was wrong and reports the function name and line number where it occurred.

In the equivalent C++ version this TfError would have been written to stderr, and mesh would be an invalid object. In Python we get an exception instead. So, let's take a look at our options for handling this exception and making it a bit more presentable for users.

Handling Tf.ErrorException

First thing, let's catch this exception type and do something with it. Depending on your needs, that might be enough! To keep this short I'm going to switch to using the function Tf.RaiseRuntimeError for the next few examples. This creates a TfError and raises a Tf.ErrorException the same way that C++ code does.

>>> Tf.RaiseRuntimeError("detected sinusoidal depleneration")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/USD/lib/python/pxr/Tf/__init__.py", line 218, in RaiseRuntimeError
    _RaiseRuntimeError(msg, codeInfo[0], codeInfo[1], codeInfo[2], codeInfo[3])
pxr.Tf.ErrorException:
        Error in '__main__.<module>' at line 1 in file <stdin> : 
'Python runtime error: detected sinusoidal depleneration'

With a try block we can clean this up.

>>> try:
...     Tf.RaiseRuntimeError("detected sinusoidal depleneration")
... except Exception as e:
...     # log or otherwise report
...     print(e)
...

        Error in '__main__.<module>' at line 2 in file <stdin> : 
'Python runtime error: detected sinusoidal depleneration'

This is cleaner: it doesn't show a Traceback, and it allows us to keep working even though we ran into an error. Still, this message is targeted more at programmers than end users. Our earlier example had text like "'pxrInternal_v0_24__pxrReserved__::UsdStage::_IsValidPathForCreatingPrim' at line 3634 in file /src/USD/pxr/usd/usd/stage.cpp : 'Path must be an absolute path: <>'".

The end part of that is kind of helpful: "Path must be an absolute path: <>". Unfortunately, Tf.ErrorException doesn't give us any accessor functions to split up that string. You could extract it by string parsing, but that doesn't seem great.
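For the curious, here is roughly what that string parsing could look like. This is a minimal sketch that assumes the message layout seen in the output above ("Error in '...' at line N in file ... : '...'"); the helper name parse_tf_error_message is made up for illustration, and nothing guarantees the layout stays stable between USD versions or when several errors are bundled into one exception.

```python
import re

# Layout assumed from the messages shown above:
#   Error in '<function>' at line <N> in file <path> : '<commentary>'
_TF_ERROR_PATTERN = re.compile(
    r"Error in '(?P<function>.*?)'\s+at line (?P<line>\d+)\s+"
    r"in file (?P<file>.*?)\s+:\s+'(?P<commentary>.*)'\s*$",
    re.DOTALL,
)


def parse_tf_error_message(message: str):
    """Split a Tf.ErrorException message into its parts, or return None."""
    match = _TF_ERROR_PATTERN.search(message)
    if match is None:
        return None
    parts = match.groupdict()
    parts["line"] = int(parts["line"])
    return parts


example = (
    "\n\tError in 'pxrInternal_v0_24__pxrReserved__::UsdStage::"
    "_IsValidPathForCreatingPrim' at line 3634 in file "
    "/src/USD/pxr/usd/usd/stage.cpp : 'Path must be an absolute path: <>'"
)
parsed = parse_tf_error_message(example)
print(parsed["commentary"])  # Path must be an absolute path: <>
```

It works on the sample, but it is tied to the exact wording of the message, which is why I'd rather not rely on it.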

Processing TfError Instead

The original TfError did have some separate fields that might be helpful. Tf.ErrorException doesn't present those for us to access, but there is a way to get them back.

The secret is to use the TfErrorMark that C++ code uses to detect errors. In Python it doesn't work quite the way it does in C++, though. If we just create one and then cause some errors, the errors still get converted into a Tf.ErrorException before they get added to the error mark. So, for instance, this raises an exception instead of adding an error to the mark named m.

>>> m = Tf.Error.Mark()
>>> mesh = UsdGeom.Mesh.Define(stage, "/~~~invalid name~~~")
Warning: in SdfPath at line 144 of /src/USD/pxr/usd/sdf/path.cpp -- Ill-formed SdfPath </~~~invalid name~~~>: :1:1(1): parse error matching tao::PXR_INTERNAL_NS_pegtl::ascii::eolf
Traceback ...etc...
>>> m.IsClean()
True

The Tf.Error.Mark is also not a context manager, so you can't use it in a with Tf.Error.Mark block. However, we can get errors into a Tf.Error.Mark using the function Tf.RepostErrors while handling a Tf.ErrorException. Like this.

>>> try:
...     Tf.RaiseRuntimeError("detected sinusoidal depleneration")
... except Tf.ErrorException as e:
...     m = Tf.Error.Mark()
...     Tf.RepostErrors(e)
...     if not m.IsClean():
...         print("\n".join([f"{err.errorCodeString} - {err.commentary}" for err in m.GetErrors()]))
...
True
TF_DIAGNOSTIC_RUNTIME_ERROR_TYPE - Python runtime error: detected sinusoidal depleneration

The True here is the return value from Tf.RepostErrors. After that, we see TF_DIAGNOSTIC_RUNTIME_ERROR_TYPE which is the error code string for this exception, and then we see the "commentary" which is the same string ending we got from the Tf.ErrorException, but without function name, filename and line number. Better than trying to parse the error string!

💡
There is a new function being added to USD called Tf.CatchAndRepostErrors that does exactly this, but in a context decorator that you can use to make this automatic. This change adds it. I would expect it to be in the next version of USD, so any version greater than 25.05.

So What Have We Got?

Armed with this knowledge we can now catch USD's main error type, and if we need to we can get back to Tf.Error and use its fields to try to get a more usable error message. Nice.

But! Are these messages user-friendly enough? In the original "invalid prim name" example we'd get this string: "Path must be an absolute path: <>". That's actually not super helpful to users. They did provide the path "~~~invalid name~~~", but this message says the path is empty.

That's because of that earlier Warning from SdfPath. It reported that the string was ill-formed and returned an empty path, which is represented by <> in the error message. Warnings are not converted to Tf.ErrorException, so we didn't catch the warning in the try block. The warning is a kind of Tf diagnostic which is essentially just a log message. Unfortunately the error message doesn't make much sense without this warning for context.

It is possible to intercept warnings and other log messages and process them yourself by using a TfDiagnosticMgr::Delegate. I'm not going to dive into that here, but if you're interested check out the C++ ones in UsdUtils here or here. You may also be interested in this one from NVIDIA. From Python you can create a UsdUtils.CoalescingDiagnosticDelegate if that serves your needs for logging.

Let's just conclude that in general USD error messages are only useful to programmers, and even then only to programmers who can see the whole log. Process the errors in whatever way is useful, but I imagine that for end users it is often best to just say, "encountered a runtime error while creating a Mesh".
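One way to get to that kind of message is to key off the error code string rather than the commentary. The sketch below is only an illustration: FRIENDLY_KINDS and user_message are hypothetical names, the wording is invented, and the two keys are simply the code strings that show up in this article's output.

```python
# Hypothetical mapping from Tf error code strings to user-facing wording.
# The two keys are the code strings that appear in this article's output.
FRIENDLY_KINDS = {
    "TF_DIAGNOSTIC_RUNTIME_ERROR_TYPE": "a runtime error",
    "TF_DIAGNOSTIC_CODING_ERROR_TYPE": "an internal error",
}


def user_message(action: str, error_codes) -> str:
    """Build one short sentence for end users; the details belong in the log."""
    for code in error_codes:
        if code in FRIENDLY_KINDS:
            return f"Encountered {FRIENDLY_KINDS[code]} while {action}."
    # Unknown or missing codes fall back to a generic sentence.
    return f"Encountered an unexpected error while {action}."


print(user_message("creating a Mesh", ["TF_DIAGNOSTIC_RUNTIME_ERROR_TYPE"]))
# Encountered a runtime error while creating a Mesh.
```

The error_codes argument could be fed from errorCodeString on the errors returned by a mark's GetErrors(), as in the earlier example.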

Another Exciting Wrinkle

I find that in spite of my best efforts, my code is sometimes busted. I know that probably never happens to you, dear reader. But, in my substandard work I find that I pass a USD function an argument of the wrong type, and in return I get a big ugly exception like this one. Imagine I just wanted to open one stage, and made this boneheaded function call asking for 1 stage.

>>> stage = Usd.Stage.Open(1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
Boost.Python.ArgumentError: Python argument types in
    Stage.Open(int)
did not match C++ signature:
    Open(pxrInternal_v0_24__pxrReserved__::TfWeakPtr<pxrInternal_v0_24__pxrReserved__::SdfLayer> rootLayer, pxrInternal_v0_24__pxrReserved__::TfWeakPtr<pxrInternal_v0_24__pxrReserved__::SdfLayer> sessionLayer, pxrInternal_v0_24__pxrReserved__::ArResolverContext pathResolverContext, pxrInternal_v0_24__pxrReserved__::UsdStage::InitialLoadSet load=Usd.Stage.LoadAll)
    Open(pxrInternal_v0_24__pxrReserved__::TfWeakPtr<pxrInternal_v0_24__pxrReserved__::SdfLayer> rootLayer, pxrInternal_v0_24__pxrReserved__::ArResolverContext pathResolverContext, pxrInternal_v0_24__pxrReserved__::UsdStage::InitialLoadSet load=Usd.Stage.LoadAll)
    Open(pxrInternal_v0_24__pxrReserved__::TfWeakPtr<pxrInternal_v0_24__pxrReserved__::SdfLayer> rootLayer, pxrInternal_v0_24__pxrReserved__::TfWeakPtr<pxrInternal_v0_24__pxrReserved__::SdfLayer> sessionLayer, pxrInternal_v0_24__pxrReserved__::UsdStage::InitialLoadSet load=Usd.Stage.LoadAll)
    Open(pxrInternal_v0_24__pxrReserved__::TfWeakPtr<pxrInternal_v0_24__pxrReserved__::SdfLayer> rootLayer, pxrInternal_v0_24__pxrReserved__::UsdStage::InitialLoadSet load=Usd.Stage.LoadAll)
    Open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > filePath, pxrInternal_v0_24__pxrReserved__::ArResolverContext pathResolverContext, pxrInternal_v0_24__pxrReserved__::UsdStage::InitialLoadSet load=Usd.Stage.LoadAll)
    Open(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > filePath, pxrInternal_v0_24__pxrReserved__::UsdStage::InitialLoadSet load=Usd.Stage.LoadAll)

If I'm trying to separate USD errors from general errors, I'd want to catch this and process it as a USD error. However, I can't find any* way to catch a Boost.Python.ArgumentError. It is not an importable type, so I wind up having to process that exception in a general except Exception block. This is lame, but unless (until?) USD manages to move off of boost for python bindings, I think we're stuck with this wart in exception handling.

🤠
I put an asterisk by any* above... I did see one suggestion online to deliberately cause this exception during startup, save the type, then process future exceptions based on that type. You can also search the string looking for Boost.Python.ArgumentError in it. Let's say no good way to catch one, but some pragmatic options.
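Both of those pragmatic options fit in a few lines. This is a sketch, not a recommendation: the helper names are mine, the name check assumes the type is reported exactly as in the traceback above (other USD builds may name it differently), and a stand-in class is used here so the snippet runs without USD installed.

```python
def qualified_name(exc: BaseException) -> str:
    """Return 'module.ClassName' for an exception instance."""
    cls = type(exc)
    return f"{cls.__module__}.{cls.__qualname__}"


def is_boost_argument_error(exc: BaseException) -> bool:
    # Assumption: the name matches what the traceback above prints.
    return qualified_name(exc) == "Boost.Python.ArgumentError"


def capture_exception_type(trigger):
    """Call `trigger`, which is expected to fail, and return the exception type."""
    try:
        trigger()
    except Exception as exc:
        return type(exc)
    raise RuntimeError("trigger did not raise")


# Stand-in class so the sketch runs without USD installed.
FakeArgumentError = type(
    "ArgumentError", (Exception,), {"__module__": "Boost.Python"}
)
print(is_boost_argument_error(FakeArgumentError("bad args")))  # True
print(is_boost_argument_error(ValueError("nope")))  # False
```

With USD available, the startup trick would be something like capture_exception_type(lambda: Usd.Stage.Open(1)), after which the saved type can be used in an ordinary except clause.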

tl;dr

Here's my approach in a project I'm working on. I want to have my own exception type that I can catch and use to return helpful messages in an API. Let's call it WickedAndPerniciousError. I'd like to process USD exceptions enough that I can at least get the error code when it is useful, but I don't plan to expose USD error strings to users. They are verbose, and can expose implementation details. I will log the full messages. I'd also like to have the error handling in a decorator I can use to wrap my Python functions.

Here's a short version of what I'm going with.

import logging
from functools import wraps

from pxr import Tf, Usd, UsdGeom

logger = logging.getLogger(__name__)

class WickedAndPerniciousError(Exception):
  def __init__(self, message: str):
    super().__init__(message)
    self.message = message


def handle_usd_errors(op_desc: str = "USD api operation"):
  def decorator(func):
    @wraps(func)
    def wrapper(*args, **kwargs):
      try:
        return func(*args, **kwargs)
      except Tf.ErrorException as e:
        logger.exception(f"{op_desc} failed with Tf.ErrorException {e}")
        # Repost into a mark so we can read the individual Tf.Error fields
        mark = Tf.Error.Mark()
        Tf.RepostErrors(e)
        error_codes = " ".join([tfe.errorCodeString for tfe in mark.GetErrors()])
        raise WickedAndPerniciousError(f"{op_desc} failed. Error Code: {error_codes}") from e
      except WickedAndPerniciousError:
        # Re-raise our own exception unchanged
        raise
      except Exception as e:
        # These may be Boost.Python.ArgumentError
        logger.exception(f"{op_desc} failed with unexpected error: {e}")
        raise WickedAndPerniciousError(f"{op_desc} failed with an unexpected error") from e
    return wrapper
  return decorator


@handle_usd_errors("Create Mesh")
def create_mesh(stage: Usd.Stage, path: str) -> UsdGeom.Mesh:
  return UsdGeom.Mesh.Define(stage, path)


# Now run it and see the error messages
stage = Usd.Stage.CreateInMemory()

try:
    mesh = create_mesh(stage, "/~~~invalid name~~~")
except WickedAndPerniciousError as e:
    print(f"Caught a wicked and pernicious error: {e.message}")

If you run this there's a lot of output in the terminal. That's because there's no TfDiagnosticMgr::Delegate and USD is writing to stderr. You also get my print statement from the end of the listing above.

Caught a wicked and pernicious error: Create Mesh failed. Error Code: TF_DIAGNOSTIC_CODING_ERROR_TYPE

We're reporting the exception from USD, but now using an exception type that we control and can present however we need. In my real application it will be handled in the public API and converted to error strings returned in a structured way.
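To give a rough idea of what "structured" could mean, here is a small sketch. It is not the code from my application: to_error_payload and the field names are hypothetical, and the exception class is repeated only so the snippet stands alone.

```python
class WickedAndPerniciousError(Exception):
    """Local copy of the exception from the listing above."""

    def __init__(self, message: str):
        super().__init__(message)
        self.message = message


def to_error_payload(exc: WickedAndPerniciousError) -> dict:
    # Field names here are hypothetical; shape them to suit your API.
    return {
        "ok": False,
        "error": {"type": type(exc).__name__, "message": exc.message},
    }


payload = to_error_payload(
    WickedAndPerniciousError(
        "Create Mesh failed. Error Code: TF_DIAGNOSTIC_CODING_ERROR_TYPE"
    )
)
print(payload["error"]["message"])
```

The point is only that callers get a predictable shape to work with, while the verbose USD text stays in the log.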

There is also all the logging done to stderr. In my application I'm using a custom delegate so I can redirect those messages into the standard Python logging library. I'll write more about that in a future article.

I hope this provides some ideas for how to deal with USD's error reporting in your Python code. Best of luck charming those snakes! 🐍🐍🐍