Demystifying VMT and RTTI in Unreal Engine C++

(Cover Photo: ©1984 – ITV Granada
Jeremy Brett, “Sherlock Holmes” TV Series)

It is no secret that, due to nature of modern OOP guidelines, Unreal Engine C++ source code is full of “virtual functions”. It is also no secret that, calling a virtual function is inherently slower than calling a non-virtual function… So, what’s the catch?

A non-virtual call is simply a jump (JMP) to a compiled-in pointer. On the other hand, a virtual call requires at least an extra indexed dereference, and sometimes a fixup addition, which is stored in a lookup table known as Virtual Method Table (VMT). This is simply why a virtual call is always slower than a non-virtual call.

To avoid this overhead, compilers usually steer clear of VMT whenever the call can be resolved at compile time. However, due to complex nature of Inheritence based class structures used in modern game engines, using VMT is unavoidable in most cases.

Virtual Method Table (VMT)

An object’s VMT (also known as vtable) contains the addresses of the object’s dynamically bound methods. Method calls are performed by fetching the method’s address from the object’s virtual method table.

C++ compiler creates a separate VMT for each class. When an object is created, a virtual pointer (VPTR) to this table is added as a hidden member of this object. The compiler also generates hidden code in the constructors of each class to initialize new object’s VPTR to the address of its class’ virtual method table.

The virtual method table is same for all objects belonging to the same class, and is therefore typically shared between them. Objects belonging to type-compatible classes in an inheritance hierarchy will have virtual method tables with the same layout.

Tip: The C++ standards do not mandate exactly how dynamic dispatch must be implemented, but compilers generally use minor variations on the same basic model. The VMT is generally a good performance trade-off to achieve dynamic dispatch, but there are alternatives, such as Binary Tree Dispatch (BTD), with higher performance but different costs.

Speaking of hidden codes and VMTs in the constructors of each class, each C++ object should also contain additional information about its type. An object’s data type is a crucial information in terms of casting.

Run-Time Type Information  (RTTI)

RTTI is a feature of the C++ programming language that exposes information about an object’s data type at runtime. It can apply to simple data types, such as integers and characters, or to generic types.

Run-Time Type Information is available only for classes that are polymorphic, which means they have at least one virtual method. In practice, this is not a limitation because base classes must have a virtual destructor to allow objects of derived classes to perform proper cleanup if they are deleted from a base pointer.

RTTI is used in three main C++ language elements:

  The dynamic_cast operator: Used for conversion of polymorphic types.

  The typeid operator: Used for identifying the exact type of an object.

  The type_info class: Used to hold the type information returned by the typeid operator.

In order to perform cast-related operations listed above, RTTI heavily uses VMT. For example, given an object of a polymorphic class, a type_info object can be obtained through the use of the typeid operator.  In principle, this is a simple operation which involves finding the VMT, through that finding the most-derived class object of which the object is part, and then extracting a pointer to the type_info object from that object’s virtual function table (or equivalent).

In terms of performance, using the dynamic_cast operator is more expensive than type_info. Given a pointer to an object of a polymorphic class, a cast to a pointer to another base subobject of the same derived class object can be done using a dynamic_cast.  In principle, this operation involves finding the VMT, through that finding the most-derived class object of which the object is part, and then using type information associated with that object to determine if the conversion (cast) is allowed, and finally performing any required adjustments of the pointer.  In principle, this checking involves the traversal of a data structure describing the base classes of the most derived class.  Thus, the run-time cost of a dynamic_cast may depend on the relative positions in the class hierarchy of the two classes involved.

Tip: In the original C++ design, Bjarne Stroustrup did not include RTTI, because he thought this mechanism was often misused.

Hacking VMT and RTTI Information using UE C++

In order to gather “header” information (that contains VMT and RTTI data) in an Unreal Engine C++ class/object, I have written the following LogClassHeader() C++ function using Visual Studio 2019 version 16.3.8 for Unreal Engine version 4.23.1.

void UDebugCore::LogClassHeader(void* const pThis, size_t nSize)
{
  FString fsClassName = "NULL";
  FString fsObjectName = "NULL";
  UObject* const pCastedToUObject = (UObject*)(pThis);
  void** const pCastedToHeader = reinterpret_cast<void**>(pThis);

  if (pCastedToUObject)
  {
    fsClassName = pCastedToUObject->GetClass()->GetName();
    fsObjectName = pCastedToUObject->GetName();
  }

  if (pCastedToHeader)
  {
    for (size_t i = 0; i < nSize / sizeof(void*); i++)
    {
      MACRO_PRINTF("Pointer[%04zu] = 0x%p", i, pCastedToHeader[i])
    }
  }
}

This function has 2 input parameters:

  pThis:  Object to extract Class Header Info from
  nSize:  Size of Header

Calling the function is very easy. You simply insert your call into the constructor of the class that you would like to hack. For example, the following code gathers C++ header information of <APlayerControllerThirdPerson> class.

APlayerControllerThirdPerson::APlayerControllerThirdPerson()
{
  UDebugCore::LogClassHeader(this, sizeof(*this));
}

When you run the code, all pointers that are available (and used) in the header of <APlayerControllerThirdPerson> will be listed. And, in case you need, instance name and class type is stored in fsObjectName and fsClassName variables, as a bonus.

Pointer[0000] = 0x00007FFC8A3ECB00
Pointer[0001] = 0x0000BC5B00000231
Pointer[0002] = 0x000001648346AC00
Pointer[0003] = 0x000AAD5B000AAD5B
Pointer[0004] = 0x0000000000000000
Pointer[0005] = 0x00000164834864E0
Pointer[0006] = 0x00007FFCAAB2EF98
Pointer[0007] = 0x00007FFCA8708208
Pointer[0008] = 0x0000000F00000000
            .....
             ...
              .

So, what do these numbers mean? Well, all these pointers are addresses of the virtual functions, followed by some of the member variables. In order to understand which is which, we need to decipher the structure of the data set.

Here comes the tricky part! With each and every update to Visual Studio C++ compiler, the structure tends to get updated as well. In other words, the header structure of a C++ class changes with each major compiler update. Try to think of it as a “living organism”. As I’m typing this sentence now, a new update with a new C++ header structure may be on its way. So, it is up to you (!) to analyze what’s going on under the hood.

Good news is, we can gather <template> information about C++ class header structure from the official Microsoft patent documents! Although they are not up-to-date, I think it is a good practice to start the investigation using the source of information itself.

Here is some of the Microsoft patents which describe various parts of C++ implementation used in Visual Studio:

  US Patent #5410705: “Method for generating an object data structure layout for a class in a compiler for an object-oriented programming language”

  US Patent #5617569: “Method and system for implementing pointers to members in a compiler for an object-oriented programming language”

  US Patent #5754862: “Method and system for accessing virtual base classes”

  US Patent #5297284: “Method and system for implementing virtual functions and virtual base classes and setting a this pointer for an object-oriented programming language”

  US Patent #5371891: “Method for object construction in a compiler for an object-oriented programming language”

  US Patent #5603030: “Method and system for destruction of objects using multiple destructor functions in an object-oriented computer system”

So, what I’m offering is, good old-fashioned reverse engineering”.

– “Is such a challenge worth it?”

– “Um, yes!… If it doesn’t break you, it can make you.”

References:

  Bjarne Stroustrup, “A History of C++: 1979—1991”, p. 50 – (March 1993)

  Keith Cooper & Linda Torczon, “Engineering A Compiler”, Morgan Kaufmann, 2nd Edition – (2011)

  Microsoft Visual Studio Documentation, “C++ Run-Time Type Information” – (November 2016)

  “Technical Report on C++ Performance”, OpenRCE.org – (September 2006)

  “Reversing Microsoft Visual C++: Classes, Methods and RTTI”, OpenRCE.org – (September 2006)

  “Intel® 64 and IA-32 Architectures Optimization Reference Manual” – (April 2018)