A practical example of how to use Ghidra to attack a crackme
First, download the crackme from the MalwareTech site; the password to the archive is also MalwareTech.

So, let’s see what is in the archive. We see the executable file vm1.exe and the dump file ram.bin. The explanation on the site says that we are dealing with an eight-bit virtual machine. The dump file is nothing more than a chunk of memory in which random data and the flag we need to find are interspersed. Let’s leave the dump file alone for a while and take a look at vm1.exe in Detect It Easy (DiE).
Crackme analysis in Detect It Easy
DiE does not show anything interesting, and the entropy looks normal. This means that no packer or protector has been applied on top of the binary, but it was still worth checking. Let’s load the file into Ghidra and see what it produces. I will give the complete listing of the application’s entry point, without the functions it calls (it is quite small), so that you understand what we are dealing with.
PUSH EBP
MOV EBP ,ESP
SUB ESP ,0x94
LEA ECX =>local_94 ,[0xffffff70 + EBP ]
CALL MD5::MD5
PUSH 0x1fb
PUSH 0x0
CALL dword ptr [->KERNEL32.DLL::GetProcessHeap ]
PUSH EAX
CALL dword ptr [->KERNEL32.DLL::HeapAlloc ]
MOV [DAT_0040423c ],EAX
PUSH 0x1fb
PUSH DAT_00404040
MOV EAX ,[DAT_0040423c ]
PUSH EAX
CALL memcpy
ADD ESP ,0xc
CALL FUN_004022e0
MOV ECX ,dword ptr [DAT_0040423c ]
PUSH ECX
LEA ECX =>local_94 ,[0xffffff70 + EBP ]
CALL MD5::digestString
MOV dword ptr [local_98 + EBP ],EAX
PUSH 0x30
PUSH s_We've_been_compromised!_0040302c
MOV EDX ,dword ptr [local_98 + EBP ]
PUSH EDX
PUSH 0x0
CALL dword ptr [->USER32.DLL::MessageBoxA ]
PUSH 0x0
CALL dword ptr [->KERNEL32.DLL::ExitProcess ]
XOR EAX ,EAX
MOV ESP ,EBP
POP EBP
RET
As you can see, the code is simple and easy to read. Let’s use the Ghidra decompiler and see what it produces.
undefined4 entry(void)
{
    HANDLE hHeap;
    char *lpText;
    DWORD dwFlags;
    SIZE_T dwBytes;
    MD5 local_94 [144];

    MD5(local_94);
    dwBytes = 0x1fb;
    dwFlags = 0;
    hHeap = GetProcessHeap();
    DAT_0040423c = (char *)HeapAlloc(hHeap,dwFlags,dwBytes);
    memcpy(DAT_0040423c,&DAT_00404040,0x1fb);
    FUN_004022e0();
    lpText = digestString(local_94,DAT_0040423c);
    MessageBoxA((HWND)0x0,lpText,"We\'ve been compromised!",0x30);
    ExitProcess(0);
    return 0;
}
I added indentation for readability and separated the variable declarations from the rest of the code. The code is very simple: first, memory is allocated on the heap (GetProcessHeap, then HeapAlloc), then 0x1fb (507) bytes are copied into it from DAT_00404040. But there is nothing interesting at 00404040! Recall that the crackme’s description said that ram.bin is a chunk of memory. Sure enough, if you look at the file size, it turns out to be 507 bytes.
We load ram.bin into HxD or any other hex editor and take a look.
File ram.bin in HxD Hex Editor
Alas, there is nothing intelligible there. But the role of DAT_0040423c becomes a bit clearer: it points to a copy of ram.bin (our 507 bytes allocated on the heap). Let’s rename DAT_0040423c to RAM to make the code easier to navigate. Next, go to the function FUN_004022e0.
Graphic representation of the function FUN_004022e0
Here is the decompiled function code:
void FUN_004022e0(void)
{
    byte bVar1;
    uint uVar2;
    byte bVar3;
    byte local_5;

    local_5 = 0;
    do {
        uVar2 = (uint)local_5;
        bVar1 = local_5 + 1;
        bVar3 = local_5 + 2;
        local_5 = local_5 + 3;
        uVar2 = FUN_00402270((byte *)(uint)*(byte *)(RAM + 0xff + uVar2),
                             (uint)*(byte *)(RAM + 0xff + (uint)bVar1),
                             *(undefined *)(RAM + 0xff + (uint)bVar3));
    } while ((uVar2 & 0xff) != 0);
    return;
}
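Stripped of the decompiler's casts, the loop above fetches three consecutive bytes per iteration from offset 0xff of the memory image. The following is only a sketch of that fetch step: the names `Instruction`, `kCodeBase` and `fetch` are made up for illustration and do not appear in the binary.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// One fetched VM instruction: an opcode byte followed by two operand bytes.
struct Instruction {
    std::uint8_t opcode;
    std::uint8_t operand1;
    std::uint8_t operand2;
};

// Offset of the VM code inside the memory image, taken from the
// "RAM + 0xff" expressions in the decompiled loop.
constexpr std::size_t kCodeBase = 0xff;

// Fetch the instruction at program counter `pc` (hypothetical helper).
// `pc` is a single byte, like local_5 in the listing, so pc + 1 and
// pc + 2 wrap around at 256 just as bVar1 and bVar3 do.
Instruction fetch(const std::vector<std::uint8_t>& ram, std::uint8_t pc) {
    return Instruction{
        ram.at(kCodeBase + pc),
        ram.at(kCodeBase + static_cast<std::uint8_t>(pc + 1)),
        ram.at(kCodeBase + static_cast<std::uint8_t>(pc + 2)),
    };
}
```

Using `at()` instead of `operator[]` is a deliberate choice here: an out-of-range fetch throws instead of silently reading past the 507-byte image.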
Since we already know that we are dealing with a virtual machine, everything becomes more or less clear. But to truly understand the pseudocode, you should always check it against the disassembly; otherwise the pseudocode can be confusing.
Ghidra pseudocode and disassembler
I outlined the instructions that increment the variable by one. Remember that the function FUN_00402270 is called with three arguments. Let’s look at how the first argument is prepared.
MOVZX ECX ,byte ptr [EBP + local_5 ]
MOV EDX ,dword ptr [RAM ]
MOVZX EAX ,byte ptr [0xff + EDX + ECX *0x1 ]
MOV dword ptr [EBP + local_14 ],EAX
MOV CL,byte ptr [EBP + local_5 ]
ADD CL,0x1 ; Variable increment
Obviously, a byte is read from [RAM] (at offset 0xff plus the counter) and stored in a local variable. The same code prepares each of the function’s arguments; the only difference is the registers and local variables involved. As a result, the call to FUN_00402270 looks like this:
MOV ECX ,dword ptr [EBP + local_c ]
PUSH ECX
MOV EDX ,dword ptr [EBP + local_10 ]
PUSH EDX
MOV EAX ,dword ptr [EBP + local_14 ]
PUSH EAX
CALL FUN_00402270
So, FUN_00402270 receives three parameters: three consecutive bytes from [RAM]. Go to the function FUN_00402270; here is its pseudocode:
uint FUN_00402270(byte *param_1,int param_2,undefined param_3)
{
    if (param_1 == (byte *)0x1) {
        *(undefined *)(RAM + param_2) = param_3;
    }
    else {
        if (param_1 == (byte *)0x2) {
            param_1 = (byte *)(RAM + param_2);
            DAT_00404240 = *param_1;
        }
        else {
            if (param_1 != (byte *)0x3) {
                return (uint)param_1 & 0xffffff00;
            }
            param_1 = (byte *)(RAM + param_2);
            *(byte *)(RAM + param_2) = *param_1 ^ DAT_00404240;
        }
    }
    return CONCAT31((int3)((uint)param_1 >> 8),1);
}
Here the first byte passed to the function is checked, and if it equals 0x1, 0x2 or 0x3, the next two arguments are processed. The check of the first parameter is especially easy to read in the disassembly. Apparently, this is the virtual machine’s command interpreter, and it supports only three VM commands.
Graphic representation of the interpreter in Ghidra
PUSH EBP
MOV EBP ,ESP
PUSH ECX
MOV EAX ,dword ptr [EBP + param_1 ]
MOV dword ptr [EBP + local_8 ],EAX
CMP dword ptr [EBP + local_8 ],0x1
JZ LAB_0040228e
CMP dword ptr [EBP + local_8 ],0x2
JZ LAB_0040229e
CMP dword ptr [EBP + local_8 ],0x3
JZ LAB_004022b0
JMP LAB_004022d1
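The compare-and-jump chain above is what a compiler typically emits for a switch on the opcode. Based on the pseudocode of FUN_00402270, the interpreter can be modelled roughly as follows. This is a sketch under assumptions: the `Vm` struct and the `step` function are hypothetical names, and the register is modelled as a single byte because the listing only ever moves one byte into DAT_00404240.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical model of the VM state: the memory image plus the
// byte-sized register that Ghidra shows as DAT_00404240.
struct Vm {
    std::vector<std::uint8_t> ram;
    std::uint8_t reg = 0;
};

// Execute one instruction. Returns false for an unknown opcode, which
// corresponds to the zero low byte that ends the loop in FUN_004022e0.
bool step(Vm& vm, std::uint8_t opcode, std::uint8_t addr, std::uint8_t value) {
    switch (opcode) {
    case 0x1: vm.ram.at(addr) = value;   return true; // store an immediate byte
    case 0x2: vm.reg = vm.ram.at(addr);  return true; // load the register from memory
    case 0x3: vm.ram.at(addr) ^= vm.reg; return true; // XOR memory with the register
    default:  return false;                           // anything else halts the VM
    }
}
```

Because the address operand is a single byte, these three commands can only touch the first 256 bytes of the image, which fits with the code itself being fetched from offset 0xff onwards.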
At this point, let me summarize. We have an application that works with 507 bytes of memory, a dump of which we have as ram.bin. Inside this dump, the data we are interested in is mixed with data we do not need. The application vm1.exe reads this memory three bytes at a time, starting at offset 0xff: the first byte is the opcode (0x1, 0x2 or 0x3), and the two bytes after it are its operands.
In other words, we have command mnemonics (p-code, or “pi-code”), each working with two arguments, and the 507-byte memory area is nothing more than a p-code tape mixed with garbage. Do not be afraid of the garbage: each command touches only the byte its operand points to, and, judging by the loop in FUN_004022e0, execution simply stops at the first byte that is not a known opcode.
INFO
P-code, or “pi-code”, is a set of mnemonics implemented for a custom command interpreter. It is also called code for a “hypothetical processor”, because the processor that executes the p-code was written by someone on their own.
Now let’s analyze the opcodes handled by the code shown above. For each one, I will immediately give C code equivalent to the disassembly listing.
LAB_0040228e:
MOV ECX ,dword ptr [RAM ]
ADD ECX ,dword ptr [EBP + param_2 ]
MOV DL,byte ptr [EBP + param_3 ]
MOV byte ptr [ECX ],DL
JMP LAB_004022d5
Let’s start restoring the logic of the virtual machine. Declare char ram[507]; it will be the virtual machine’s memory. Using fopen and fread, read the contents of the ram.bin file into this array. Four lines of assembly and a jump, everything is simple: in the array ram, at the index given by [EBP + param_2], we store the value param_3. In code, it looks like this:
ram[val_01] = val_02;
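The loading step mentioned above (fopen, then fread) could be sketched as follows. The function name `load_image` is made up for illustration, and the sketch reads into a `std::vector` of unsigned bytes instead of a fixed `char ram[507]`, so that byte values above 0x7f never turn into negative array indices.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <vector>

// Read a whole file into memory; returns an empty vector on failure.
// The path is supplied by the caller, e.g. "ram.bin".
std::vector<std::uint8_t> load_image(const char* path) {
    std::vector<std::uint8_t> image;
    std::FILE* f = std::fopen(path, "rb"); // "rb": binary mode matters on Windows
    if (!f) return image;
    std::uint8_t chunk[512];
    std::size_t n;
    while ((n = std::fread(chunk, 1, sizeof chunk, f)) > 0) {
        image.insert(image.end(), chunk, chunk + n);
    }
    std::fclose(f);
    return image;
}
```

A caller would then check that the returned size is 0x1fb (507) before running the emulator over it.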
We start analyzing the following subroutine:
LAB_0040229e:
MOV EAX ,[RAM ]
ADD EAX ,dword ptr [EBP + param_2 ]
MOV CL,byte ptr [EAX ]
MOV byte ptr [r1 ],CL ; DAT_00404240
JMP LAB_004022d5
It is very similar to the previous one and is also an analogue of the MOV operation, but here a virtual machine register is used (DAT_00404240 in the listing, r1 in our code): the value from the VM memory is loaded into it. From our point of view, the value comes from the array ram, indexed by param_2 in the disassembly and by val_01 in our code. In other words, this is the operation MOV reg,[mem].
int r1 = 0, r2 = 0; // We declare VM registers
r1 = ram[val_01];
The last subroutine is twice as complicated: instead of four lines of code, there are eight! We load the virtual machine register into EDX, then read from memory (remember our array ram, into which we loaded the contents of ram.bin?) the byte addressed by the first operand after the opcode (ECX), and XOR the two. The result is written back to the same memory cell.
LAB_004022b0:
MOVZX EDX ,byte ptr [r1 ] ; DAT_00404240
MOV EAX ,[RAM ]
ADD EAX ,dword ptr [EBP + param_2 ]
MOVZX ECX ,byte ptr [EAX ]
XOR ECX ,EDX
MOV EDX ,dword ptr [RAM ]
ADD EDX ,dword ptr [EBP + param_2 ]
MOV byte ptr [EDX ],CL
JMP LAB_004022d5
In C, it will look like this:
r2 = ram[val_01];
ram[val_01] = r2 ^ r1;
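To see the three operations work together, here is a small worked example on a made-up 16-byte image. Everything in it is invented for illustration (the layout, the key 0x20 and the `run` and `demo_image` names); it is not the contents of ram.bin.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Run a three-opcode VM over `ram`, fetching instructions from `code_base`.
// Returns the number of instructions executed before an unknown opcode.
std::size_t run(std::vector<std::uint8_t>& ram, std::size_t code_base) {
    std::uint8_t r1 = 0; // the VM register
    std::size_t executed = 0;
    for (std::size_t pc = code_base; pc + 2 < ram.size(); pc += 3) {
        const std::uint8_t op = ram[pc], addr = ram[pc + 1], value = ram[pc + 2];
        if (op == 0x1)      ram.at(addr) = value; // mov [addr], value
        else if (op == 0x2) r1 = ram.at(addr);    // mov r1, [addr]
        else if (op == 0x3) ram.at(addr) ^= r1;   // xor [addr], r1
        else break;                               // anything else halts
        ++executed;
    }
    return executed;
}

// Made-up image: data in bytes 0..3, code from offset 4.
std::vector<std::uint8_t> demo_image() {
    return {
        'A', 0x00, 0x00, 0x00, // data: 'A' is 'a' ^ 0x20
        0x1, 0x01, 0x20,       // mov [1], 0x20
        0x2, 0x01, 0x00,       // mov r1, [1]
        0x3, 0x00, 0x00,       // xor [0], r1
        0x0, 0x00, 0x00,       // unknown opcode: stop
    };
}
```

Calling `run(image, 4)` on this image executes three instructions: the key 0x20 is stored, loaded into the register and XORed into byte 0, turning the stored 'A' (0x41) into 'a' (0x61). The real dump plays the same trick on a larger scale, which is why the flag is not visible in the hex editor.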
That’s all. The three-instruction virtual machine has been reconstructed; it remains to apply the results of our work to the ram.bin file in order to get the flag. As I said, for this we read the file into the ram array and run it through our own VM interpreter. As a bonus, the loop prints the virtual machine mnemonics in a readable form, and at the end prints the flag. I added clarifying comments to the code.
#include <cstdio>
#include <iostream>
using namespace std;

int main()
{
    unsigned char ram[508] = {0}; // VM memory: 507 bytes of ram.bin plus a terminating zero
    int r1 = 0, r2 = 0;           // VM registers
    FILE *f = fopen("ram.bin", "rb");
    if (!f || fread(ram, 1, 507, f) != 507) return 1; // Load the dump
    fclose(f);
    int x = 0xff; // The VM code starts at offset 0xff (see FUN_004022e0)
    while (x + 2 < 507)
    {
        int command = ram[x];    // We take the command opcode
        int val_01 = ram[x + 1]; // First operand of the command
        int val_02 = ram[x + 2]; // Second operand of the command
        // Decoding the code
        if (command == 0x1)
        {
            ram[val_01] = val_02;
            cout << "mov [" << val_01 << "]," << val_02 << endl;
        }
        if (command == 0x2)
        {
            r1 = ram[val_01];
            cout << "mov r1,[" << val_01 << "]" << endl;
        }
        if (command == 0x3)
        {
            r2 = ram[val_01];
            ram[val_01] = r2 ^ r1;
            cout << "xor [" << val_01 << "],r1" << endl;
        }
        if (command > 3 || command < 1) break; // Any other opcode stops the VM
        x += 3;
    }
    printf("\n%s\n", (char *)ram); // Print the result
    return 0;
}
After executing this code, we will get the disassembled VM and flag.
Result of the restored virtual machine
Conclusion
I hope that, after reading the article, you will stop being afraid of the words “virtual machine” or “pi-code”. Of course, in real commercial protectors like VMProtect or Themida everything is much more complicated: a virtual machine can have a great many commands, their opcodes can change constantly, and various anti-debugging and anti-dump techniques can themselves be written in p-code, and much more. But now you have a first idea of how it works.
At the same time, we got better acquainted with the Ghidra toolbox and performed our first hack with it, even if it was only a crackme!