C++ Typecasts – An Assembly POV

This post can be read on its own or as a continuation of the post Addressing The Addresses. We want to understand how C++ type casts are carried out at the assembly level, which gives a clue about the address arithmetic that produces the offsets discussed in that post.

First let me demonstrate the typecasting of primitive types (I am taking some stuff from this wiki page). Consider the C++ code

int aVar = 65;
int* intPointer = &aVar;
    
char* charPointer = (char*) intPointer;

In the last line we are doing an explicit type conversion from int* to char*.

From the assembly instruction perspective, there is no difference between int* and char* as far as the cast itself is concerned: the address held by one pointer (of type int*) is simply copied into the other pointer (of type char*). Only the dereferencing differs. In the listings below, aP is a pointer to char and cP is a pointer to int, like so

char b = *aP;         mov     rax, QWORD PTR [rbp-8]
                      movzx   eax, BYTE PTR [rax]
                      mov     BYTE PTR [rbp-9], al

while for int

int d = *cP;          mov     rax, QWORD PTR [rbp-24]
                      mov     eax, DWORD PTR [rax]
                      mov     DWORD PTR [rbp-28], eax

For a char dereference a BYTE (1 byte) is read, while for an int dereference a DWORD (4 bytes) is read.

Moving on, we now consider user-defined data types, especially classes. Consider the following code

#include <iostream>
class UObjectBase
{
    int a;
    int b;
};

class UObject : public UObjectBase
{
public:
    virtual ~UObject() = default;
};

int main()
{
    UObjectBase bar;
    UObjectBase* ptr = &bar;

    UObject* fp = (UObject*) ptr;

    std::cout << &bar << '\n';
    std::cout << fp << '\n';
}

The output of the above program may be

0x7ffe4dd2f998
0x7ffe4dd2f990 // 8 bytes offset

However, removing the virtual function (the destructor ~UObject()) may lead to the output

0x7ffe4dd2f998
0x7ffe4dd2f998 // same address

Here we are downcasting from UObjectBase* to UObject*, and that leads to the offset of 8 bytes in the presence of a virtual function (and hence a vtable pointer). The assembly instructions (generated via Godbolt) look like

    UObject* fp = (UObject*) ptr;    cmp     QWORD PTR [rbp-8], 0
                                     mov     eax, 0
                                     mov     rax, QWORD PTR [rbp-8]
                                     sub     rax, 8
                                     jmp     .L3

where .L3 is some complex set of instructions, while, in the absence of the virtual function, the instructions are like so

    UObject* fp = (UObject*) ptr;    mov     rax, QWORD PTR [rbp-8]
                                     mov     QWORD PTR [rbp-16], rax

In the presence of the virtual function (or vtable) there is an instruction that subtracts 8 bytes from rax, which creates the offset between the addresses shown in the output. The cmp against 0 guards that adjustment, since a null pointer must stay null after the cast. This example was taken from a Stack Overflow post.

In the book “Effective C++, third edition”, item 27, Scott Meyers points out that

… a single object (e.g., an object of
type Derived) might have more than one address (e.g., its address
when pointed to by a Base* pointer and its address when pointed to by
a Derived* pointer). That can’t happen in C. It can’t happen in Java. It
can’t happen in C#. It does happen in C++.

Addressing The Addresses

In my books, computers are the best machines humans have ever built. A big wave to Alan Turing, you are the thinker I may resent if at all (well, Paul Dirac and Darwin are in the same league). All Turing machines (i.e., computers) have a tape to read from and to write to, like so

The purpose of this blog post is to understand the implications of “how” the tape is read. Based on this notion of “how”, a variety of “meanings” (software data types) emerge when considered at a low enough level of abstraction (assembly language).

We will take a live example from Karma, where we have a UObject pool allocator. The idea is to allocate a block of memory for a UObject, for instance when spawning an AActor (which is done in the routine SpawnActor()). Another example is UClass (which is a great-grandchild class of UObject). Our pool allocator, GUObjectAllocator, allocates the space from the blue strip, representing Karma’s pre-allocated memory, and returns the address in red, meaning that the subsequent block of memory addresses, whose extent is determined by the size of UClass, has been reserved.

However, the following error message (pardon the Normal block address which is not specific to the example I have in mind, written above) started appearing on application close (or, to be precise, on freeing the blue block of memory at application close)

Thus began my search for the grand resolution of the issue. I posted the error message in the CppIndia Discord server. In response, I was advised to use Address Sanitizer (ASAN) to look for the cause of the issue. And sure enough, Xcode ASAN greeted me with the following

==36231==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x00012ad047f8 at pc 0x00010084d208 bp 0x00016fdfcc20 sp 0x00016fdfc3e0
WRITE of size 64 at 0x00012ad047f8 thread T0
    #0 0x10084d204 in __asan_memset+0x104 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x41204) (BuildId: f0a7ac5c49bc3abc851181b6f92b308a32000000200000000100000000000b00)
    #1 0x102a971d4 in Karma::FGenericPlatformMemory::Memzero(void*, unsigned long) GenericPlatformMemory.h:72
    #2 Callstack .....

0x00012ad047f8 is located 8 bytes to the left of 1280000-byte region [0x00012ad04800,0x00012ae3d000)
allocated by thread T0 here:
    #0 0x10084ee68 in wrap_malloc+0x94 (libclang_rt.asan_osx_dynamic.dylib:arm64e+0x42e68) (BuildId: f0a7ac5c49bc3abc851181b6f92b308a32000000200000000100000000000b00)
    #1 0x102aa5b08 in Karma::FMemory::SystemMalloc(unsigned long) KarmaMemory.h:124
    #2 0x102aa5a24 in Karma::KarmaSmriti::StartUp() KarmaSmriti.cpp:25
    #3 0x1029ab45c in Karma::Application::PrepareMemorySoftBed() Application.cpp:65
    #4 0x1029aade8 in Karma::Application::PrepareApplicationForRun() Application.cpp:54
    #5 0x10000772c in main EntryPoint.h:79
    #6 0x195923f24  (<unknown module>)

Some report on Heap Overflow
==36231==ABORTING

I have put bold font to the text of interest. FGenericPlatformMemory::Memzero() is doing the “illegal” write to the block of memory at address 0x00012ad047f8, which is not the address, 0x00012ad04800, returned by GUObjectAllocator. This is reinforced by the message “0x00012ad047f8 is located 8 bytes to the left of 1280000-byte region [0x00012ad04800,0x00012ae3d000)”. So who, or rather how, is this offset of 8 bytes being introduced?

The typecasting done while generating the UClass object, here

ReturnClass = (UClass*)GUObjectAllocator.AllocateUObject(sizeof(UClass), alignof(UClass), true);

is the reason for the offset of 8 bytes: between related class types the C-style cast behaves like a static_cast downcast, which adjusts the address. This leads to the error message, because the app then writes to a place it is not supposed to write to. I have marked the “illegal” block of memory with pink in the cartoon above. The rectification is simple enough

ReturnClass = reinterpret_cast<UClass*>(GUObjectAllocator.AllocateUObject(sizeof(UClass), alignof(UClass), true)); 

The reinterpret_cast converts the pointer type without introducing offsets in the address. Thus the conversion from UObjectBase* to UClass* is achieved with ReturnClass having the address value 0x00012ad04800, which is the legal block of memory reserved by Karma’s pool allocator.

This might raise a question about how reinterpret_cast and the C-style cast compare in their working, which we leave for the future. What can be said is that for that we will need the assembly language equivalent of the code, something along the lines of this article.