What is reinterpret_cast<>?
Start with an ordinary value cast such as (char)int, which converts the value itself:
int x = 65;
std::cout << (char)x << std::endl; //prints character 'A' as 65 is the ascii code for 'A'.
//OR
std::cout << static_cast<char>(x) << std::endl; //C++ style
Similarly, reinterpret_cast<> works on pointers!
The type of a pointer decides how much data is read through it. An int pointer reads sizeof(int) bytes at a time (4 bytes on most platforms), whereas a char pointer reads 1 byte at a time.
reinterpret_cast<> lets us change this behavior.
int x = 54;
int* ptr = &x;
std::cout << *ptr << std::endl; //reads sizeof(int) bytes at a time (4 on most platforms)
char* c_ptr = reinterpret_cast<char*>(&x); //view the same address through a char pointer instead of an int pointer
std::cout << *c_ptr << std::endl; //reads only 1 byte; printed as a char, it shows the ASCII character for that byte ('6' for 54 on a little-endian machine)
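To make the "how many bytes does each pointer read" idea concrete, here is a minimal sketch. The helper name bytes_read_through is hypothetical (it is not a standard function); it simply reports sizeof of the pointed-to type, which is the number of bytes a dereference reads.

```cpp
#include <cstddef>
#include <iostream>

// Hypothetical helper (not part of the standard library): the number of bytes
// read when a pointer of type T* is dereferenced is sizeof(T).
template <typename T>
constexpr std::size_t bytes_read_through(const T*) {
    return sizeof(T);
}

// Prints the read size for an int* and for a char* view of the same object.
void print_read_sizes() {
    int x = 54;
    const int* ptr = &x;
    const char* c_ptr = reinterpret_cast<const char*>(&x);
    std::cout << "int*  reads " << bytes_read_through(ptr) << " byte(s)\n";
    std::cout << "char* reads " << bytes_read_through(c_ptr) << " byte(s)\n";
}
```

Calling print_read_sizes() from main should report sizeof(int) for the first pointer (4 on most platforms) and 1 for the second.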
Reinterpreting an int* as a char* changes its behavior. It does not change the actual bytes or the data; only the pointer type, and therefore how we access the data, changes. An int* reads 4 bytes at a time (on most platforms), which is its natural behavior, and a char* reads only 1 byte, so reinterpreting allows us to view the 4 bytes as 1-byte chunks.
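A small check of the claim that only the view changes, not the data. This is a sketch; the function name view_leaves_int_untouched is made up for illustration.

```cpp
// Hypothetical helper: returns true when the char* view points at the same
// address as the int and the int still holds its value after being read
// through that view.
bool view_leaves_int_untouched(int value) {
    int x = value;
    const char* c_ptr = reinterpret_cast<const char*>(&x);

    // Same address, different pointer type.
    const bool same_address =
        static_cast<const void*>(c_ptr) == static_cast<const void*>(&x);

    // Read one byte through the char* view; a read does not modify x.
    volatile char first = c_ptr[0];
    (void)first;

    return same_address && x == value;
}
```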
Using the [] operator
We can use the [] operator to access the individual bytes, just as with arrays.
int x = 54;
int* ptr = &x;
std::cout << *ptr << std::endl; //reads 4 bytes of data at a time
char* c_ptr = reinterpret_cast<char*>(&x); //view the same address through a char pointer instead of an int pointer
for(int i = 0; i < 4; i++){ //runs 4 times, assuming an int is 4 bytes
    std::cout << c_ptr[i] << std::endl;
}
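Printing the bytes as characters hides most of them, because many byte values are not printable. As a sketch, the hypothetical helper bytes_as_hex below walks the same bytes but formats each one as two hex digits. It uses unsigned char so that bytes of 0x80 and above are not treated as negative numbers.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: returns the bytes of an int, in memory order,
// as space-separated two-digit hex values.
std::string bytes_as_hex(int value) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    std::string out;
    char buf[4]; // two hex digits plus the terminating NUL fit easily
    for (std::size_t i = 0; i < sizeof(int); ++i) {
        std::snprintf(buf, sizeof buf, "%02x", static_cast<unsigned>(bytes[i]));
        if (i != 0) {
            out += ' ';
        }
        out += buf;
    }
    return out;
}
```

Assuming a 4-byte int, bytes_as_hex(54) gives "36 00 00 00" on a little-endian machine and "00 00 00 36" on a big-endian one.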
How does this work?
An int is 4 bytes of memory, but after reinterpreting its address as a char*, we can only view 1 byte at a time (because a char is 1 byte, a char* reads only one byte at a time). Therefore, to read all 4 bytes we need to access 4 indexes, and the [] operator lets us reach each of those locations.
int -> 64;
broken into 4 bytes -> 0x00 0x00 0x00 0x40
int -> byte1(0x40) ~ byte2(0x00) ~ byte3(0x00) ~ byte4(0x00) #follows little endian
char* -> [0]read^
reads 1 byte at a time! Unlike int which reads 4 bytes a time
So,
c_ptr[0] = first byte(0x40), #<- LSB
c_ptr[1] = second byte(0x00),
c_ptr[2] = third byte(0x00),
c_ptr[3] = fourth byte(0x00) #<- MSB
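The byte order shown above depends on the machine. A minimal sketch of checking it at run time, and of rebuilding the value from its bytes, follows. The function names are hypothetical, and the code assumes 8-bit bytes and a machine that is either purely little-endian or purely big-endian.

```cpp
#include <cstddef>

// Hypothetical helper: true when the byte at the lowest address is the
// least significant byte (little endian).
bool first_byte_is_lsb() {
    const unsigned int probe = 1u;
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&probe);
    return bytes[0] == 1;
}

// Hypothetical helper: reads the bytes of value one at a time and puts them
// back together according to the detected byte order.
unsigned int rebuild_from_bytes(unsigned int value) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    const bool little = first_byte_is_lsb();
    unsigned int result = 0;
    for (std::size_t i = 0; i < sizeof(value); ++i) {
        // Position of bytes[i] within the number: 0 is the LSB.
        const std::size_t significance = little ? i : sizeof(value) - 1 - i;
        result |= static_cast<unsigned int>(bytes[i]) << (8 * significance);
    }
    return result;
}
```

On a little-endian machine, c_ptr[0] holds the LSB, so rebuild_from_bytes shifts byte i left by 8 * i bits; either way the result equals the original value.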
Why does it work?
int, char, float, etc. are all just raw bytes with rules for how to read and write them. An int, for example, is typically 4 bytes and holds only whole numbers: no characters and no fractional part. The same idea applies to the other data types! Their pointers behave the same way: an int* always reads sizeof(int) bytes and a char* always reads 1 byte.
reinterpret_cast<> tells the compiler:
“Treat this block of memory as if it belongs to a different type, without changing the actual bytes.”
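reinterpret_cast<> just lets us change that behavior, that is, the way we access the raw bytes, so that memory of type x can be viewed through a pointer to type y! It's extremely dangerous, because it makes it very easy to read or write restricted or out-of-bounds memory. Viewing an object's bytes through a char* is allowed by the language, but accessing it through a pointer to most other unrelated types is undefined behavior.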
Now, in C/C++, arrays decay to a pointer to their first element in most expressions and when passed to functions (sizeof(arr) and &arr are exceptions: there the array does not decay). This means we can do this:
int arr[] = { 11, 22, 33, 44, 55 };
std::cout << *arr << std::endl;
OR
int arr[] = { 11, 22, 33, 44, 55 };
std::cout << *(arr + 1) << std::endl; //access element at index 1
OR
int arr[] = { 11, 22, 33, 44, 55 };
std::cout << arr[1] << std::endl; //access element at index 1
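The three forms above, and the sizeof exception, can be checked in one place. This is a sketch; element_count and decay_forms_agree are hypothetical names.

```cpp
#include <cstddef>

// Hypothetical helper: takes the array by reference, so it does NOT decay
// and the element count N is still known.
template <std::size_t N>
constexpr std::size_t element_count(const int (&)[N]) {
    return N;
}

// Hypothetical helper: true when all the access forms agree.
bool decay_forms_agree() {
    int arr[] = { 11, 22, 33, 44, 55 };
    int* first = arr; // the array decays to a pointer to its first element

    return *arr == 11                       // dereference the decayed pointer
        && *(arr + 1) == 22                 // pointer arithmetic, then dereference
        && arr[1] == *(first + 1)           // subscript is the same operation
        && sizeof(arr) == 5 * sizeof(int)   // sizeof sees the whole array
        && element_count(arr) == 5;         // a reference parameter keeps the size
}
```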
It's just basic pointer arithmetic and dereferencing!
pointer[index] is simply another way of writing *(pointer + index)!
It's basic math!
a + b = b + a
=> pointer + index = index + pointer
=> *(pointer + index) = *(index + pointer)
=> pointer[index] = index[pointer]
AND THAT'S WHY THIS WORKS TOO:
int arr[] = { 11, 22, 33, 44, 55 };
std::cout << 2[arr] << std::endl; //same as arr[2] !
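The equivalence can be verified directly. A minimal sketch follows, with hypothetical function names; some compilers may warn about the unusual 2[arr] spelling, but it is valid.

```cpp
// Hypothetical helper: true when every spelling of "element 2" agrees.
bool subscript_is_commutative() {
    int arr[] = { 11, 22, 33, 44, 55 };
    return arr[2] == *(arr + 2)   // definition of the subscript operator
        && 2[arr] == *(2 + arr)   // same definition with the operands swapped
        && arr[2] == 2[arr]       // so both spellings name the same element
        && &arr[2] == arr + 2;    // and the same address
}

// Hypothetical helper: reads the third element using the reversed spelling.
int third_element_reversed() {
    int arr[] = { 11, 22, 33, 44, 55 };
    return 2[arr];
}
```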
Printing values after using reinterpret_cast<>
int x = 64;
int* ptr = &x;
std::cout << *ptr << std::endl; //reads 4 bytes of data at a time
char* c_ptr = reinterpret_cast<char*>(&x); //view the same address through a char pointer instead of an int pointer
for(int i = 0; i < 4; i++){ //assuming an int is 4 bytes
    std::cout << c_ptr[i] << std::endl;
}
/*
OUTPUT:
@
//non printable character
//non printable character
//non printable character
*/
WHY IS THE OUTPUT LIKE THAT ?
64 is an int, and 64 in hex is 0x00000040.
On a little-endian machine, the least significant byte (LSB) is stored first.
So the bytes are:
byte 1 = 0x40,
byte 2 = 0x00,
byte 3 = 0x00,
byte 4 = 0x00
0x40 represented as a char is the character ‘@’! Therefore we get ‘@’, followed by three 0x00 bytes. 0x00 is the NUL character in ASCII, which is not printable, hence the blank lines!
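To make the non-printable bytes visible instead of printing blanks, each byte can be checked with std::isprint first. This is a sketch; describe_byte is a hypothetical helper, and the escape format \xNN is just one possible choice.

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper: returns a printable byte as the character itself and
// any other byte as an escape such as \x00.
std::string describe_byte(unsigned char byte) {
    if (std::isprint(byte)) {
        return std::string(1, static_cast<char>(byte));
    }
    char buf[8]; // room for backslash, 'x', two hex digits and the NUL
    std::snprintf(buf, sizeof buf, "\\x%02x", static_cast<unsigned>(byte));
    return buf;
}
```

Passing each c_ptr[i] from the loop above through describe_byte would show ‘@’ for 0x40 and \x00 for each NUL byte.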
See how it behaves like a character, just as (char)int would! This means that even casting the pointer to a different type changes how the data is read and displayed!