Angr_Lab

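Lab Preparation

Compiling

Each level lives in its own folder, which contains a template such as 00_angr_find.c.jinja. This is source that has not been compiled yet, so we have to build it ourselves. Angr_Lab ships a generator script, generate.py, for this:

python generate.py 1234 00_angr_find

Here 1234 is a seed of our own choosing, and the last argument is the name of the executable to produce.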
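
A seed makes a pseudo-random generator reproducible, which is presumably why the script asks for one: the same seed should produce the same binary. The snippet below only illustrates that property with Python's random module. It is not the code of generate.py, and the make_secret helper is made up for the illustration.

```python
import random
import string


def make_secret(seed: int, length: int = 8) -> str:
    """Hypothetical helper: derive an uppercase string from a seed."""
    rng = random.Random(seed)  # private generator, so global state is untouched
    return "".join(rng.choice(string.ascii_uppercase) for _ in range(length))


first = make_secret(1234)
second = make_secret(1234)
print(first == second)  # the same seed reproduces the same string
```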

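Running

Once the executable is built, we can try running it first.

Run the generated executable under WSL (it is a 32-bit program, so the 32-bit runtime libraries must be installed, and WSL2 is required).

The program asks for a flag: a correct one prints "Good Job." and a wrong one prints "Try again."

Analysis

Use IDA to analyze the program logic and find the address of the accepting execution path, and so on.

Solving with Angr

The same folder contains scaffold00.py, which holds the basic skeleton of an Angr solution along with many hints and explanations. We solve the level by following them.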

00_angr_find

After building the executable, we start by analyzing it.

main()

int __cdecl main(int argc, const char **argv, const char **envp)
{
  int i; // [esp+1Ch] [ebp-1Ch]
  char s1[9]; // [esp+23h] [ebp-15h] BYREF
  unsigned int v6; // [esp+2Ch] [ebp-Ch]

  v6 = __readgsdword(0x14u);
  printf("Enter the password: ");
  __isoc99_scanf("%8s", s1);
  for ( i = 0; i <= 7; ++i )
    s1[i] = complex_function(s1[i], i);
  if ( !strcmp(s1, "HXUITWOA") )
    puts("Good Job.");
  else
    puts("Try again.");
  return 0;
}

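After locating the encoding function, step into it.

complex_function(int a1, int a2)

int __cdecl complex_function(int a1, int a2)
{
  if ( a1 <= 64 || a1 > 90 ) // reject anything that is not an uppercase letter 'A'..'Z'
  {
    puts("Try again.");
    exit(1);
  }
  return (3 * a2 + a1 - 65) % 26 + 65; // shift the letter by 3 * a2 positions, wrapping within 'A'..'Z'
}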
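
Because complex_function is a plain shift within the uppercase alphabet, it can also be inverted by hand, which gives a value to compare against whatever Angr reports. The sketch below is a Python port of the decompiled function. The helper name recover_password is hypothetical and not part of the lab.

```python
def complex_function(a1: int, a2: int) -> int:
    """Python port of the decompiled complex_function (uppercase letters only)."""
    if a1 <= 64 or a1 > 90:
        raise ValueError("Try again.")  # the binary prints this and exits
    return (3 * a2 + a1 - 65) % 26 + 65


def recover_password(target: str) -> str:
    """Undo the per-index shift by subtracting 3 * i modulo 26 (hypothetical helper)."""
    return "".join(
        chr((ord(ch) - 65 - 3 * i) % 26 + 65) for i, ch in enumerate(target)
    )


password = recover_password("HXUITWOA")
# Re-encode the candidate to confirm it really maps onto the target string.
encoded = "".join(chr(complex_function(ord(ch), i)) for i, ch in enumerate(password))
print(password, encoded == "HXUITWOA")  # HUOZHHWF True
```

This only works because the transformation is trivially invertible. Angr needs no such insight, which is the point of the later levels.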

Finding the address of "Good Job." in the disassembly

The instruction that finally prints the "Good Job." string is in main(), so we look at the disassembly of main():

; int __cdecl main(int argc, const char **argv, const char **envp)
.text:08049242 public main
.text:08049242 main proc near ; DATA XREF: _start+2A↑o
.text:08049242
.text:08049242 var_2C = dword ptr -2Ch
.text:08049242 var_1C = dword ptr -1Ch
.text:08049242 s1 = byte ptr -15h
.text:08049242 var_C = dword ptr -0Ch
.text:08049242 var_4 = dword ptr -4
.text:08049242 argc = dword ptr 8
.text:08049242 argv = dword ptr 0Ch
.text:08049242 envp = dword ptr 10h
.text:08049242
.text:08049242 ; __unwind {
.text:08049242 lea ecx, [esp+4]
.text:08049246 and esp, 0FFFFFFF0h
.text:08049249 push dword ptr [ecx-4]
.text:0804924C push ebp
.text:0804924D mov ebp, esp
.text:0804924F push ecx
.text:08049250 sub esp, 34h
.text:08049253 mov eax, ecx
.text:08049255 mov eax, [eax+4]
.text:08049258 mov [ebp+var_2C], eax
.text:0804925B mov eax, large gs:14h
.text:08049261 mov [ebp+var_C], eax
.text:08049264 xor eax, eax
.text:08049266 sub esp, 0Ch
.text:08049269 push offset aEnterThePasswo ; "Enter the password: "
.text:0804926E call _printf
.text:08049273 add esp, 10h
.text:08049276 sub esp, 8
.text:08049279 lea eax, [ebp+s1]
.text:0804927C push eax
.text:0804927D push offset a8s ; "%8s"
.text:08049282 call ___isoc99_scanf
.text:08049287 add esp, 10h
.text:0804928A mov [ebp+var_1C], 0
.text:08049291 jmp short loc_80492C0
.text:08049293 ; ---------------------------------------------------------------------------
.text:08049293
.text:08049293 loc_8049293: ; CODE XREF: main+82↓j
.text:08049293 lea edx, [ebp+s1]
.text:08049296 mov eax, [ebp+var_1C]
.text:08049299 add eax, edx
.text:0804929B movzx eax, byte ptr [eax]
.text:0804929E movsx eax, al
.text:080492A1 sub esp, 8
.text:080492A4 push [ebp+var_1C]
.text:080492A7 push eax
.text:080492A8 call complex_function
.text:080492AD add esp, 10h
.text:080492B0 mov ecx, eax
.text:080492B2 lea edx, [ebp+s1]
.text:080492B5 mov eax, [ebp+var_1C]
.text:080492B8 add eax, edx
.text:080492BA mov [eax], cl
.text:080492BC add [ebp+var_1C], 1
.text:080492C0
.text:080492C0 loc_80492C0: ; CODE XREF: main+4F↑j
.text:080492C0 cmp [ebp+var_1C], 7
.text:080492C4 jle short loc_8049293
.text:080492C6 sub esp, 8
.text:080492C9 push offset s2 ; "HXUITWOA"
.text:080492CE lea eax, [ebp+s1]
.text:080492D1 push eax ; s1
.text:080492D2 call _strcmp
.text:080492D7 add esp, 10h
.text:080492DA test eax, eax
.text:080492DC jz short loc_80492F0
.text:080492DE sub esp, 0Ch
.text:080492E1 push offset s ; "Try again."
.text:080492E6 call _puts
.text:080492EB add esp, 10h
.text:080492EE jmp short loc_8049300
.text:080492F0 ; ---------------------------------------------------------------------------
.text:080492F0
.text:080492F0 loc_80492F0: ; CODE XREF: main+9A↑j
.text:080492F0 sub esp, 0Ch
.text:080492F3 push offset aGoodJob ; "Good Job."
.text:080492F8 call _puts
.text:080492FD add esp, 10h
.text:08049300
.text:08049300 loc_8049300: ; CODE XREF: main+AC↑j
.text:08049300 mov eax, 0
.text:08049305 mov ecx, [ebp+var_C]
.text:08049308 xor ecx, large gs:14h
.text:0804930F jz short loc_8049316
.text:08049311 call ___stack_chk_fail
.text:08049316 ; ---------------------------------------------------------------------------
.text:08049316
.text:08049316 loc_8049316: ; CODE XREF: main+CD↑j
.text:08049316 mov ecx, [ebp+var_4]
.text:08049319 leave
.text:0804931A lea esp, [ecx-4]
.text:0804931D retn
.text:0804931D ; } // starts at 8049242
.text:0804931D main endp
.text:0804931D
.text:0804931D ; ---------------------------------------------------------------------------

As we can see, the instruction we are looking for is: .text:080492F3 push offset aGoodJob ; "Good Job."
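
The CODE XREF comments in the listing give each jump source as an offset from the start of main, so they can be cross-checked with a little address arithmetic. A minimal sketch, using only addresses that appear in the listing above (the helper name xref_to_address is made up):

```python
MAIN_START = 0x08049242  # address of main in the listing


def xref_to_address(offset: int, base: int = MAIN_START) -> int:
    """Convert an IDA 'main+offset' cross-reference into an absolute address."""
    return base + offset


jz_to_good_job = xref_to_address(0x9A)  # "CODE XREF: main+9A" on loc_80492F0
print(hex(jz_to_good_job))  # 0x80492dc
```

0x080492DC is the jz short loc_80492F0 that follows the strcmp call, so the block containing 0x080492F3 is reached only when strcmp returns 0.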

Solving with Angr

Now that we understand the program's overall logic, we open scaffold00.py and solve the level with Angr.

Code

# Before you begin, here are a few notes about these capture-the-flag
# challenges.
#
# Each binary, when run, will ask for a password, which can be entered via stdin
# (typing it into the console.) Many of the levels will accept many different
# passwords. Your goal is to find a single password that works for each binary.
#
# If you enter an incorrect password, the program will print "Try again." If you
# enter a correct password, the program will print "Good Job."
#
# Each challenge will be accompanied by a file like this one, named
# "scaffoldXX.py". It will offer guidance as well as the skeleton of a possible
# solution. You will have to edit each file. In some cases, you will have to
# edit it significantly. While use of these files is recommended, you can write
# a solution without them, if you find that they are too restrictive.
#
# Places in the scaffoldXX.py that require a simple substitution will be marked
# with three question marks (???). Places that require more code will be marked
# with an ellipsis (...). Comments will document any new concepts, but will be
# omitted for concepts that have already been covered (you will need to use
# previous scaffoldXX.py files as a reference to solve the challenges.) If a
# comment documents a part of the code that needs to be changed, it will be
# marked with an exclamation point at the end, on a separate line (!).

import angr
import sys

def main(argv):
  # Create an Angr project.
  # If you want to be able to point to the binary from the command line, you can
  # use argv[1] as the parameter. Then, you can run the script from the command
  # line as follows:
  # python ./scaffold00.py [binary]
  # In other words, pass the binary to solve as the first command-line argument
  # and read it back through argv[1].
  # (!)
  path_to_binary = argv[1]  # :string
  project = angr.Project(path_to_binary)  # create the Angr project from a file path

  # Tell Angr where to start executing (should it start from the main()
  # function or somewhere else?) For now, use the entry_state function
  # to instruct Angr to start from the main() function.
  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,  # fill unconstrained memory with symbolic values
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}  # fill unconstrained registers with symbolic values
  )

  # Create a simulation manager initialized with the starting state. It provides
  # a number of useful tools to search and execute the binary.
  simulation = project.factory.simgr(initial_state)

  # Explore the binary to attempt to find the address that prints "Good Job."
  # You will have to find the address you want to find and insert it here.
  # This function will keep executing until it either finds a solution or it
  # has explored every possible path through the executable.
  # (!) The address below is the one that has to be filled in.
  print_good_address = 0x080492F3  # :integer (probably in hexadecimal)
  simulation.explore(find=print_good_address)

  # Check that we have found a solution. The simulation.explore() method will
  # set simulation.found to a list of the states that it could find that reach
  # the instruction we asked it to search for. Remember, in Python, if a list
  # is empty, it will be evaluated as false, otherwise true.
  if simulation.found:  # explore() stored the states that reached the target here
    # The explore method stops after it finds a single state that arrives at the
    # target address.
    solution_state = simulation.found[0]

    # Print the string that Angr wrote to stdin to follow solution_state. This
    # is our solution.
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    # If Angr could not find a path that reaches print_good_address, throw an
    # error. Perhaps you mistyped the print_good_address?
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

With the comments removed:

import angr
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  simulation = project.factory.simgr(initial_state)

  print_good_address = 0x080492F3
  simulation.explore(find=print_good_address)

  if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

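Workflow

  • Locate the target program and create an Angr project from it.
  • Specify where execution starts (standard input is symbolic by default).
  • Create a simulation manager, which provides the tools (methods) for searching and executing the binary.
  • From the analysis above we know the address of "Good Job."; pass it to the simulation manager's explore method as the accepting execution path.
  • Calling explore starts the symbolic execution.
  • The states that were found end up in simulation.found, an attribute of the simulation manager. If nothing was found, the list is empty and evaluates as false in an if statement; otherwise it is non-empty and evaluates as true.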

01_angr_avoid

Compiling and running

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/01_angr_avoid$ python3 generate.py 1234 01_angr_avoid
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/01_angr_avoid$ ./01_angr_avoid
Enter the password: aaaaaaaaa
Try again.

Analysis

Inspecting with IDA

In main() we can see a large number of paths that should be avoided. main() is so large that IDA warns that it cannot display it.

Address of "Good Job."

A string search leads us to the accepting execution path we need:

.text:080492CC ; int __cdecl maybe_good(char *s1, char *s2)
.text:080492CC public maybe_good
.text:080492CC maybe_good proc near ; CODE XREF: main+31E↓p
.text:080492CC ; main+336↓p ...
.text:080492CC
.text:080492CC s1 = dword ptr 8
.text:080492CC s2 = dword ptr 0Ch
.text:080492CC
.text:080492CC ; __unwind {
.text:080492CC endbr32
.text:080492D0 push ebp
.text:080492D1 mov ebp, esp
.text:080492D3 sub esp, 8
.text:080492D6 movzx eax, should_succeed
.text:080492DD test al, al
.text:080492DF jz short loc_804930A
.text:080492E1 sub esp, 4
.text:080492E4 push 8 ; n
.text:080492E6 push [ebp+s2] ; s2
.text:080492E9 push [ebp+s1] ; s1
.text:080492EC call _strncmp
.text:080492F1 add esp, 10h
.text:080492F4 test eax, eax
.text:080492F6 jnz short loc_804930A
.text:080492F8 sub esp, 0Ch
.text:080492FB push offset aGoodJob ; "Good Job."
.text:08049300 call _puts
.text:08049305 add esp, 10h
.text:08049308 jmp short loc_804931B
.text:0804930A ; ---------------------------------------------------------------------------
.text:0804930A
.text:0804930A loc_804930A: ; CODE XREF: maybe_good+13↑j
.text:0804930A ; maybe_good+2A↑j
.text:0804930A sub esp, 0Ch
.text:0804930D push offset s ; "Try again."
.text:08049312 call _puts
.text:08049317 add esp, 10h
.text:0804931A nop
.text:0804931B
.text:0804931B loc_804931B: ; CODE XREF: maybe_good+3C↑j
.text:0804931B nop
.text:0804931C leave
.text:0804931D retn
.text:0804931D ; } // starts at 80492CC
.text:0804931D maybe_good endp

As we can see, the instruction we are looking for is: .text:080492FB push offset aGoodJob ; "Good Job."

The main() function

Let's briefly analyze the main function:

; int __cdecl main(int argc, const char **argv, const char **envp)
.text:0804931E public main
.text:0804931E main proc near ; DATA XREF: _start+2A↑o
.text:0804931E
.text:0804931E var_4C = dword ptr -4Ch
.text:0804931E j = dword ptr -3Ch
.text:0804931E i = dword ptr -38h
.text:0804931E input = byte ptr -34h
.text:0804931E s2 = byte ptr -20h
.text:0804931E var_C = dword ptr -0Ch
.text:0804931E var_4 = dword ptr -4
.text:0804931E argc = dword ptr 8
.text:0804931E argv = dword ptr 0Ch
.text:0804931E envp = dword ptr 10h
.text:0804931E
.text:0804931E ; __unwind {
.text:0804931E endbr32
.text:08049322 lea ecx, [esp+4]
.text:08049326 and esp, 0FFFFFFF0h
.text:08049329 push dword ptr [ecx-4]
.text:0804932C push ebp
.text:0804932D mov ebp, esp
.text:0804932F push ecx
.text:08049330 sub esp, 54h
.text:08049333 mov eax, ecx
.text:08049335 mov eax, [eax+4]
.text:08049338 mov [ebp+var_4C], eax
.text:0804933B mov eax, large gs:14h
.text:08049341 mov [ebp+var_C], eax
.text:08049344 xor eax, eax
.text:08049346 mov [ebp+j], 0
.text:0804934D jmp short loc_804935E
.text:0804934F ; ---------------------------------------------------------------------------
.text:0804934F
.text:0804934F loc_804934F: ; CODE XREF: main+44↓j
.text:0804934F lea edx, [ebp+s2] ; zero-initialize the s2 array
.text:08049352 mov eax, [ebp+j]
.text:08049355 add eax, edx
.text:08049357 mov byte ptr [eax], 0
.text:0804935A add [ebp+j], 1
.text:0804935E
.text:0804935E loc_804935E: ; CODE XREF: main+2F↑j
.text:0804935E cmp [ebp+j], 13h
.text:08049362 jle short loc_804934F ; loop over j = 0..0x13 (20 bytes)
.text:08049364 lea eax, [ebp+s2]
.text:08049367 mov dword ptr [eax], 57514A4Dh
.text:0804936D mov dword ptr [eax+4], 454C4441h
.text:08049374 sub esp, 0Ch
.text:08049377 push offset aEnterThePasswo ; "Enter the password: "
.text:0804937C call _printf
.text:08049381 add esp, 10h
.text:08049384 sub esp, 8
.text:08049387 lea eax, [ebp+input]
.text:0804938A push eax ; input
.text:0804938B push offset a8s ; "%8s"
.text:08049390 call ___isoc99_scanf
.text:08049395 add esp, 10h
.text:08049398 mov [ebp+i], 0
.text:0804939F jmp short loc_80493CE ; i starts at 0
.text:080493A1 ; ---------------------------------------------------------------------------
.text:080493A1
.text:080493A1 loc_80493A1: ; CODE XREF: main+B4↓j
.text:080493A1 lea edx, [ebp+input]
.text:080493A4 mov eax, [ebp+i]
.text:080493A7 add eax, edx ; input + i
.text:080493A9 movzx eax, byte ptr [eax] ; input[i]
.text:080493AC movsx eax, al ; sign-extend the byte to 32 bits
.text:080493AF sub esp, 8 ; reserve stack space
.text:080493B2 push [ebp+i] ; push i
.text:080493B5 push eax ; push input[i]
.text:080493B6 call complex_function ; complex_function(input[i], i); same function as in 00_angr_find
.text:080493BB add esp, 10h ; the caller cleans up the stack
.text:080493BE mov ecx, eax ; the return value is the transformed character
.text:080493C0 lea edx, [ebp+input]
.text:080493C3 mov eax, [ebp+i]
.text:080493C6 add eax, edx ; input + i
.text:080493C8 mov [eax], cl ; input[i] = complex_function(input[i], i)
.text:080493CA add [ebp+i], 1
.text:080493CE
.text:080493CE loc_80493CE: ; CODE XREF: main+81↑j
.text:080493CE cmp [ebp+i], 7 ; compare i with 7
.text:080493D2 jle short loc_80493A1 ; for(i = 0; i <= 7; i++)
.text:080493D4 lea eax, [ebp+input] ; reached once all 8 characters have been transformed
.text:080493D7 add eax, 1 ; input + 1
.text:080493DA movzx eax, byte ptr [eax]
.text:080493DD movzx eax, al ; input[1]
.text:080493E0 and eax, 10h ; input[1] & 0x10, isolates bit 4
.text:080493E3 test eax, eax
.text:080493E5 setz dl ; dl = 1 if bit 4 of input[1] is clear
.text:080493E8 lea eax, [ebp+s2]
.text:080493EB add eax, 1
.text:080493EE movzx eax, byte ptr [eax] ; s2[1]
.text:080493F1 movzx eax, al
.text:080493F4 and eax, 10h
.text:080493F7 test eax, eax
.text:080493F9 setnz al ; al = 1 if bit 4 of s2[1] is set
.text:080493FC xor eax, edx ; bit 4 of s2[1] ('J' = 0x4A) is 0, so al equals dl after the xor
.text:080493FE test al, al ; al is 0 when bit 4 of input[1] is set (a mismatch)
.text:08049400 jz loc_808F34F ; taken on a mismatch; the target calls avoid_me
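
Two details of this listing are easy to verify outside IDA: the two dword constants stored into s2 spell out the reference string, and the and/setz/setnz/xor sequence compares a single bit of input[1] with the same bit of s2[1]. The sketch below models only the first check (mask 0x10) in Python. The function name takes_avoid_branch is hypothetical, and "input" here means the buffer after complex_function has already been applied.

```python
import struct

# x86 is little-endian, so the two 32-bit immediates are laid out byte by byte.
s2 = struct.pack("<II", 0x57514A4D, 0x454C4441)
print(s2.decode())  # MJQWADLE


def takes_avoid_branch(input_byte: int, ref_byte: int, mask: int = 0x10) -> bool:
    """Model of 080493E0..08049400: True when the jz to loc_808F34F is taken."""
    dl = (input_byte & mask) == 0  # setz dl
    al = (ref_byte & mask) != 0    # setnz al
    return (al ^ dl) == 0          # xor eax, edx; test al, al; jz


print(takes_avoid_branch(ord("J"), s2[1]))  # False: the bits agree
print(takes_avoid_branch(ord("Q"), s2[1]))  # True: bit 4 differs
```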

Following the jump

We follow the conditional jump at address 0x08049400:

.text:0808F34F loc_808F34F: ; CODE XREF: main+E2↑j
.text:0808F34F call avoid_me
.text:0808F354 lea eax, [ebp+input]
.text:0808F357 add eax, 1
.text:0808F35A movzx eax, byte ptr [eax]
.text:0808F35D movzx eax, al
.text:0808F360 and eax, 8
.text:0808F363 test eax, eax
.text:0808F365 setz dl
.text:0808F368 lea eax, [ebp+s2]
.text:0808F36B add eax, 1
.text:0808F36E movzx eax, byte ptr [eax]
.text:0808F371 movzx eax, al
.text:0808F374 and eax, 8
.text:0808F377 test eax, eax
.text:0808F379 setnz al
.text:0808F37C xor eax, edx
.text:0808F37E test al, al
.text:0808F380 jz loc_80B230F
.text:0808F386 lea eax, [ebp+input]
.text:0808F389 add eax, 1
.text:0808F38C movzx eax, byte ptr [eax]
.text:0808F38F movzx eax, al
.text:0808F392 and eax, 4
.text:0808F395 test eax, eax
.text:0808F397 setnz dl
.text:0808F39A lea eax, [ebp+s2]
.text:0808F39D add eax, 1
.text:0808F3A0 movzx eax, byte ptr [eax]
.text:0808F3A3 movzx eax, al
.text:0808F3A6 and eax, 4
.text:0808F3A9 test eax, eax
.text:0808F3AB setnz al
.text:0808F3AE xor eax, edx
.text:0808F3B0 test al, al
.text:0808F3B2 jz loc_80A0B66
.text:0808F3B8 call avoid_me
.text:0808F3BD lea eax, [ebp+input]
.text:0808F3C0 add eax, 1
.text:0808F3C3 movzx eax, byte ptr [eax]
.text:0808F3C6 movzx eax, al
.text:0808F3C9 and eax, 2
.text:0808F3CC test eax, eax
.text:0808F3CE setnz dl
.text:0808F3D1 lea eax, [ebp+s2]
.text:0808F3D4 add eax, 1
.text:0808F3D7 movzx eax, byte ptr [eax]
.text:0808F3DA movzx eax, al
.text:0808F3DD and eax, 2
.text:0808F3E0 test eax, eax
.text:0808F3E2 setnz al
.text:0808F3E5 xor eax, edx
.text:0808F3E7 test al, al
.text:0808F3E9 jz loc_8097FAD
.text:0808F3EF call avoid_me

Addresses to avoid

avoid_me() is called from many places, and every one of those calls marks a path to avoid. Avoiding them one by one is not practical, so we step into avoid_me() and simply avoid the function itself.

.text:080492BB ; void avoid_me()
.text:080492BB public avoid_me
.text:080492BB avoid_me proc near ; CODE XREF: main+14C↓p
.text:080492BB ; main+183↓p ...
.text:080492BB ; __unwind {
.text:080492BB endbr32
.text:080492BF push ebp
.text:080492C0 mov ebp, esp
.text:080492C2 mov should_succeed, 0
.text:080492C9 nop
.text:080492CA pop ebp
.text:080492CB retn
.text:080492CB ; } // starts at 80492BB
.text:080492CB avoid_me endp

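We choose to avoid: .text:080492C2 mov should_succeed, 0

Solving with Angr

Compared with the previous level, this one adds a large number of paths that should be avoided. With only a find target and no avoid target, the search would take far longer.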
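
Why avoid matters can be seen with a toy model: suppose every checked bit forks execution into a matching branch and a mismatching one, and a mismatch can never recover. The sketch below counts the states a naive worklist search visits with and without discarding mismatched states. It is an idealized illustration, not a model of Angr's internals, and count_states is a made-up helper.

```python
def count_states(depth: int, prune: bool) -> int:
    """Count states visited in a toy branch tree of the given depth.

    Each state is (level, matched). Every level forks into one matching and
    one mismatching successor; a mismatched state never becomes matched again.
    """
    visited = 0
    worklist = [(0, True)]
    while worklist:
        level, matched = worklist.pop()
        visited += 1
        if level == depth or (prune and not matched):
            continue  # leaf reached, or state discarded like an "avoid" hit
        worklist.append((level + 1, matched))  # the bit agrees
        worklist.append((level + 1, False))    # the bit disagrees
    return visited


print(count_states(16, prune=True))   # 33
print(count_states(16, prune=False))  # 131071
```

With pruning the count grows linearly with the number of checks; without it, it doubles with every check.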

Code

import angr
import sys

def main(argv):
  # Take the path of the binary and create an Angr project from it
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Choose the initial state; entry_state begins at the program's entry point,
  # from which execution reaches main()
  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )
  # Create a simulation manager, which provides the search and execution tools
  # (methods); note that it takes the initial state as its argument
  simulation = project.factory.simgr(initial_state)

  # Explore the binary, but this time, instead of only looking for a state that
  # reaches the print_good_address, also find a state that does not reach
  # will_not_succeed_address. The binary is pretty large, to save you some time,
  # everything you will need to look at is near the beginning of the address
  # space.
  # (!)
  print_good_address = 0x080492FB
  will_not_succeed_address = 0x080492C2
  # Explore with the simulation manager: run symbolic execution, searching for
  # the find address and discarding states that hit the avoid address
  simulation.explore(find=print_good_address, avoid=will_not_succeed_address)

  # The resulting states are stored in the simulation manager's found list
  if simulation.found:  # an empty list means no state reached the target
    solution_state = simulation.found[0]  # first state that reached the target
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())  # dump that state's stdin, decode it, and print it
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

With the comments removed:

import angr
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )
  simulation = project.factory.simgr(initial_state)

  print_good_address = 0x080492FB
  will_not_succeed_address = 0x080492C2

  simulation.explore(find=print_good_address, avoid=will_not_succeed_address)

  if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

The overall flow is much the same as in the previous level. The new part to learn is the avoid argument of explore:

simulation.explore(find=print_good_address, avoid=will_not_succeed_address)

02_angr_find_condition

Compiling and running

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/02_angr_find_condition$ python3 generate.py 1234 02_angr_find_condition
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/02_angr_find_condition$ ./02_angr_find_condition
Enter the password: aaa
Try again.

Analysis

Inspecting with IDA

Part of the main function:

; int __cdecl main(int argc, const char **argv, const char **envp)
.text:080492BB public main
.text:080492BB main proc near ; DATA XREF: _start+2A↑o
.text:080492BB
.text:080492BB var_4C = dword ptr -4Ch
.text:080492BB var_40 = dword ptr -40h
.text:080492BB var_3C = dword ptr -3Ch
.text:080492BB var_38 = dword ptr -38h
.text:080492BB s1 = byte ptr -34h
.text:080492BB s2 = byte ptr -20h
.text:080492BB var_C = dword ptr -0Ch
.text:080492BB var_4 = dword ptr -4
.text:080492BB argc = dword ptr 8
.text:080492BB argv = dword ptr 0Ch
.text:080492BB envp = dword ptr 10h
.text:080492BB
.text:080492BB ; __unwind {
.text:080492BB endbr32
.text:080492BF lea ecx, [esp+4]
.text:080492C3 and esp, 0FFFFFFF0h
.text:080492C6 push dword ptr [ecx-4]
.text:080492C9 push ebp
.text:080492CA mov ebp, esp
.text:080492CC push ecx
.text:080492CD sub esp, 54h
.text:080492D0 mov eax, ecx
.text:080492D2 mov eax, [eax+4]
.text:080492D5 mov [ebp+var_4C], eax
.text:080492D8 mov eax, large gs:14h
.text:080492DE mov [ebp+var_C], eax
.text:080492E1 xor eax, eax
.text:080492E3 mov [ebp+var_38], 0DEADBEEFh
.text:080492EA mov [ebp+var_40], 0
.text:080492F1 jmp short loc_8049302
.text:080492F3 ; ---------------------------------------------------------------------------
.text:080492F3
.text:080492F3 loc_80492F3: ; CODE XREF: main+4B↓j
.text:080492F3 lea edx, [ebp+s2]
.text:080492F6 mov eax, [ebp+var_40]
.text:080492F9 add eax, edx
.text:080492FB mov byte ptr [eax], 0
.text:080492FE add [ebp+var_40], 1
.text:08049302
.text:08049302 loc_8049302: ; CODE XREF: main+36↑j
.text:08049302 cmp [ebp+var_40], 13h
.text:08049306 jle short loc_80492F3
.text:08049308 lea eax, [ebp+s2]
.text:0804930B mov dword ptr [eax], 57514A4Dh
.text:08049311 mov dword ptr [eax+4], 454C4441h
.text:08049318 sub esp, 0Ch
.text:0804931B push offset aEnterThePasswo ; "Enter the password: "
.text:08049320 call _printf
.text:08049325 add esp, 10h
.text:08049328 sub esp, 8
.text:0804932B lea eax, [ebp+s1]
.text:0804932E push eax
.text:0804932F push offset a8s ; "%8s"
.text:08049334 call ___isoc99_scanf
.text:08049339 add esp, 10h
.text:0804933C mov [ebp+var_3C], 0
.text:08049343 jmp short loc_8049376
.text:08049345 ; ---------------------------------------------------------------------------
.text:08049345
.text:08049345 loc_8049345: ; CODE XREF: main+BF↓j
.text:08049345 mov eax, [ebp+var_3C]
.text:08049348 lea edx, [eax+8]
.text:0804934B lea ecx, [ebp+s1]
.text:0804934E mov eax, [ebp+var_3C]
.text:08049351 add eax, ecx
.text:08049353 movzx eax, byte ptr [eax]
.text:08049356 movsx eax, al
.text:08049359 sub esp, 8
.text:0804935C push edx
.text:0804935D push eax
.text:0804935E call complex_function
.text:08049363 add esp, 10h
.text:08049366 mov ecx, eax
.text:08049368 lea edx, [ebp+s1]
.text:0804936B mov eax, [ebp+var_3C]
.text:0804936E add eax, edx
.text:08049370 mov [eax], cl
.text:08049372 add [ebp+var_3C], 1
.text:08049376
.text:08049376 loc_8049376: ; CODE XREF: main+88↑j
.text:08049376 cmp [ebp+var_3C], 7
.text:0804937A jle short loc_8049345
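.text:0804937C cmp [ebp+var_38], 0DEADBEEFh ; var_38 is never modified after initialization, so every one of these comparisons has a fixed outcome
.text:08049383 jz loc_804B97C ; the author traced this jump to "Try again."; a target like this is what would have to be marked as an avoid path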
.text:08049389 cmp [ebp+var_38], 0DEADBEEFh
.text:08049390 jz loc_804A689
.text:08049396 cmp [ebp+var_38], 0DEADBEEFh
.text:0804939D jz loc_8049D16
.text:080493A3 cmp [ebp+var_38], 0DEADBEEFh
.text:080493AA jnz loc_8049863
.text:080493B0 cmp [ebp+var_38], 0DEADBEEFh
.text:080493B7 jz loc_8049610
.text:080493BD cmp [ebp+var_38], 0DEADBEEFh
.text:080493C4 jnz loc_80494ED
.text:080493CA cmp [ebp+var_38], 0DEADBEEFh
.text:080493D1 jnz loc_8049462
.text:080493D7 cmp [ebp+var_38], 0DEADBEEFh
.text:080493DE jnz short loc_8049421
.text:080493E0 sub esp, 8
.text:080493E3 lea eax, [ebp+s2]
.text:080493E6 push eax ; s2
.text:080493E7 lea eax, [ebp+s1]
.text:080493EA push eax ; s1
.text:080493EB call _strcmp
.text:080493F0 add esp, 10h
.text:080493F3 test eax, eax
.text:080493F5 jz short loc_804940C
.text:080493F7 sub esp, 0Ch
.text:080493FA push offset s ; "Try again."
.text:080493FF call _puts
.text:08049404 add esp, 10h
.text:08049407 jmp loc_804DF5E
.text:0804940C ; ---------------------------------------------------------------------------
.text:0804940C
.text:0804940C loc_804940C: ; CODE XREF: main+13A↑j
.text:0804940C sub esp, 0Ch
.text:0804940F push offset aGoodJob ; "Good Job."
.text:08049414 call _puts
.text:08049419 add esp, 10h
.text:0804941C jmp loc_804DF5E
.text:08049421 ; ---------------------------------------------------------------------------
.text:08049421
.text:08049421 loc_8049421: ; CODE XREF: main+123↑j
.text:08049421 sub esp, 8
.text:08049424 lea eax, [ebp+s2]
.text:08049427 push eax ; s2
.text:08049428 lea eax, [ebp+s1]
.text:0804942B push eax ; s1
.text:0804942C call _strcmp
.text:08049431 add esp, 10h
.text:08049434 test eax, eax
.text:08049436 jz short loc_804944D
.text:08049438 sub esp, 0Ch
.text:0804943B push offset s ; "Try again."
.text:08049440 call _puts
.text:08049445 add esp, 10h
.text:08049448 jmp loc_804DF5E
.text:0804944D ; ---------------------------------------------------------------------------
.text:0804944D
.text:0804944D loc_804944D: ; CODE XREF: main+17B↑j
.text:0804944D sub esp, 0Ch
.text:08049450 push offset aGoodJob ; "Good Job."
.text:08049455 call _puts
.text:0804945A add esp, 10h
.text:0804945D jmp loc_804DF5E
.text:08049462 ; ---------------------------------------------------------------------------
.text:08049462
.text:08049462 loc_8049462: ; CODE XREF: main+116↑j
.text:08049462 cmp [ebp+var_38], 0DEADBEEFh
.text:08049469 jz short loc_80494AC
.text:0804946B sub esp, 8
.text:0804946E lea eax, [ebp+s2]
.text:08049471 push eax ; s2
.text:08049472 lea eax, [ebp+s1]
.text:08049475 push eax ; s1
.text:08049476 call _strcmp
.text:0804947B add esp, 10h
.text:0804947E test eax, eax
.text:08049480 jz short loc_8049497
.text:08049482 sub esp, 0Ch
.text:08049485 push offset s ; "Try again."
.text:0804948A call _puts
.text:0804948F add esp, 10h
.text:08049492 jmp loc_804DF5E
.text:08049497 ; ---------------------------------------------------------------------------
.text:08049497
.text:08049497 loc_8049497: ; CODE XREF: main+1C5↑j
.text:08049497 sub esp, 0Ch
.text:0804949A push offset aGoodJob ; "Good Job."
.text:0804949F call _puts
.text:080494A4 add esp, 10h
.text:080494A7 jmp loc_804DF5E

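The main function is very large, and what sets this level apart is that it contains a great many accepting paths and a great many paths to avoid. As a result, it is hard to pin down the real path by constraining addresses, and even where that works it is very inefficient. Angr therefore offers other ways to specify the search condition, and this level requires one of them.

Here we decide whether a state lies on an accepting path by checking whether that state's standard output contains the string "Good Job.". The check is handed to Angr as a function.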
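
The contract of such a check is small: it receives a state and returns True or False. The sketch below exercises two predicates of that shape without Angr by using a stand-in object that offers only posix.dumps(fd), the call the scaffold uses. The fake_state helper is hypothetical, and the literal descriptor 1 is an assumption that replaces sys.stdout.fileno().

```python
from types import SimpleNamespace

STDOUT_FD = 1  # file descriptor 1 is standard output


def is_successful(state) -> bool:
    """True once the state's stdout contains the success message."""
    return b"Good Job." in state.posix.dumps(STDOUT_FD)


def should_abort(state) -> bool:
    """True once the state's stdout contains the failure message."""
    return b"Try again." in state.posix.dumps(STDOUT_FD)


def fake_state(stdout_bytes: bytes):
    """Stand-in for an Angr state: only posix.dumps(fd) is provided."""
    return SimpleNamespace(posix=SimpleNamespace(dumps=lambda fd: stdout_bytes))


good = fake_state(b"Enter the password: Good Job.\n")
bad = fake_state(b"Enter the password: Try again.\n")
print(is_successful(good), should_abort(good))  # True False
print(is_successful(bad), should_abort(bad))    # False True
```

Using a fixed descriptor avoids depending on how the solving script's own stdout happens to be set up, which may matter when the script runs under a tool that replaces sys.stdout.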

Solving with Angr

Code

# It is very useful to be able to search for a state that reaches a certain
# instruction. However, in some cases, you may not know the address of the
# specific instruction you want to reach (or perhaps there is no single
# instruction goal.) In this challenge, you don't know which instruction
# grants you success. Instead, you just know that you want to find a state where
# the binary prints "Good Job."

# Angr is powerful in that it allows you to search for a states that meets an
# arbitrary condition that you specify in Python, using a predicate you define
# as a function that takes a state and returns True if you have found what you
# are looking for, and False otherwise.
import angr
import sys

def main(argv):
  # Create the Angr project from the binary
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Choose the initial state; entry_state begins at the program's entry point,
  # from which execution reaches main()
  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # Create the simulation manager, which provides the search and execution tools
  simulation = project.factory.simgr(initial_state)

  # Define a function that checks if you have found the state you are looking
  # for.
  def is_successful(state):
    # Dump whatever has been printed out by the binary so far into a string.
    stdout_output = state.posix.dumps(sys.stdout.fileno())

    # Return whether 'Good Job.' has been printed yet.
    # (!)
    return b"Good Job" in stdout_output  # True if "Good Job" appears in standard output

  # Same as above, but this time check if the state should abort. If you return
  # False, Angr will continue to step the state. In this specific challenge, the
  # only time at which you will know you should abort is when the program prints
  # "Try again."
  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())

    return b"Try again" in stdout_output  # True if "Try again" appears in standard output

  # Tell Angr to explore the binary and find any state that is_successful identifies
  # as a successful state by returning True.
  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

With the comments removed:

import angr
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  initial_state = project.factory.entry_state(
    add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                 angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Good Job' in stdout_output  # True once "Good Job" appears in stdout

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Try again' in stdout_output  # True once "Try again" appears in stdout

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

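Notice that the arguments to explore have changed type. In simulation.explore(find=is_successful, avoid=should_abort), find and avoid are now callback functions instead of concrete addresses. Angr calls them on each state and uses the return value to decide whether the state lies on the <accept path> or the <avoid path>.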
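To see what these callbacks compute without running Angr, here is a minimal sketch. FakePosix, FakeState, and classify are hypothetical stand-ins that only imitate the state.posix.dumps(fd) call used in the script; they are not part of Angr, and the order in which classify consults the two predicates is a choice made for this sketch.

```python
STDOUT_FD = 1  # same value sys.stdout.fileno() normally returns

class FakePosix:
    """Hypothetical stand-in for state.posix; only models dumps()."""
    def __init__(self, stdout_bytes):
        self._stdout = stdout_bytes

    def dumps(self, fd):
        # Only stdout holds data in this sketch.
        return self._stdout if fd == STDOUT_FD else b''

class FakeState:
    """Hypothetical stand-in for an Angr state."""
    def __init__(self, stdout_bytes):
        self.posix = FakePosix(stdout_bytes)

def is_successful(state):
    return b'Good Job' in state.posix.dumps(STDOUT_FD)

def should_abort(state):
    return b'Try again' in state.posix.dumps(STDOUT_FD)

def classify(state):
    """Label a state the way the find/avoid predicates would sort it."""
    if is_successful(state):
        return 'found'
    if should_abort(state):
        return 'avoided'
    return 'active'  # neither message printed yet, keep stepping

for output in (b'Enter the password: Good Job.\n',
               b'Enter the password: Try again.\n',
               b'Enter the password: '):
    print(output, '->', classify(FakeState(output)))
```

Because the predicates only look at the output, the same two functions can be reused for any binary that prints these messages, whatever addresses the messages live at.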

03_angr_symbolic_registers

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/03_angr_symbolic_registers$ python3 generate.py 1234 03_angr_symbolic_registers
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/03_angr_symbolic_registers$ ./03_angr_symbolic_registers
Enter the password: aaaa
a
a
Try again.

Analysis

The main function

 ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:080495CF public main
.text:080495CF main proc near ; DATA XREF: _start+2A↑o
.text:080495CF
.text:080495CF var_14 = dword ptr -14h
.text:080495CF var_10 = dword ptr -10h
.text:080495CF var_C = dword ptr -0Ch
.text:080495CF var_4 = dword ptr -4
.text:080495CF argc = dword ptr 8
.text:080495CF argv = dword ptr 0Ch
.text:080495CF envp = dword ptr 10h
.text:080495CF
.text:080495CF ; __unwind {
.text:080495CF endbr32
.text:080495D3 lea ecx, [esp+4]
.text:080495D7 and esp, 0FFFFFFF0h
.text:080495DA push dword ptr [ecx-4]
.text:080495DD push ebp
.text:080495DE mov ebp, esp
.text:080495E0 push ecx
.text:080495E1 sub esp, 14h
.text:080495E4 sub esp, 0Ch
.text:080495E7 push offset aEnterThePasswo ; "Enter the password: "
.text:080495EC call _printf
.text:080495F1 add esp, 10h
.text:080495F4 call get_user_input
.text:080495F9 mov [ebp+var_14], eax
.text:080495FC mov [ebp+var_10], ebx
.text:080495FF mov [ebp+var_C], edx
.text:08049602 sub esp, 0Ch
.text:08049605 push [ebp+var_14]
.text:08049608 call complex_function_1
.text:0804960D add esp, 10h
.text:08049610 mov [ebp+var_14], eax
.text:08049613 sub esp, 0Ch
.text:08049616 push [ebp+var_10]
.text:08049619 call complex_function_2
.text:0804961E add esp, 10h
.text:08049621 mov [ebp+var_10], eax
.text:08049624 sub esp, 0Ch
.text:08049627 push [ebp+var_C]
.text:0804962A call complex_function_3
.text:0804962F add esp, 10h
.text:08049632 mov [ebp+var_C], eax
.text:08049635 cmp [ebp+var_14], 0
.text:08049639 jnz short loc_8049647
.text:0804963B cmp [ebp+var_10], 0
.text:0804963F jnz short loc_8049647
.text:08049641 cmp [ebp+var_C], 0
.text:08049645 jz short loc_8049659
.text:08049647
.text:08049647 loc_8049647: ; CODE XREF: main+6A↑j
.text:08049647 ; main+70↑j
.text:08049647 sub esp, 0Ch
.text:0804964A push offset s ; "Try again."
.text:0804964F call _puts
.text:08049654 add esp, 10h
.text:08049657 jmp short loc_8049669
.text:08049659 ; ---------------------------------------------------------------------------
.text:08049659
.text:08049659 loc_8049659: ; CODE XREF: main+76↑j
.text:08049659 sub esp, 0Ch
.text:0804965C push offset aGoodJob ; "Good Job."
.text:08049661 call _puts
.text:08049666 add esp, 10h
.text:08049669
.text:08049669 loc_8049669: ; CODE XREF: main+88↑j
.text:08049669 mov eax, 0
.text:0804966E mov ecx, [ebp+var_4]
.text:08049671 leave
.text:08049672 lea esp, [ecx-4]
.text:08049675 retn
.text:08049675 ; } // starts at 80495CF
.text:08049675 main endp

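After reading the user input, the program transforms the three values and then performs three checks. There is exactly one <accept path> and one <avoid path>, so both could be given as addresses, but checking the output in callback functions is more general. We first step into get_user_input(), then into the encryption function complex_function_1().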

get_user_input()

; __unwind {
.text:08049582 endbr32
.text:08049586 push ebp
.text:08049587 mov ebp, esp
.text:08049589 sub esp, 18h
.text:0804958C mov eax, large gs:14h
.text:08049592 mov [ebp+var_C], eax
.text:08049595 xor eax, eax
.text:08049597 lea ecx, [ebp+var_10]
.text:0804959A push ecx
.text:0804959B lea ecx, [ebp+var_14]
.text:0804959E push ecx
.text:0804959F lea ecx, [ebp+var_18]
.text:080495A2 push ecx
.text:080495A3 push offset aXXX ; reads three integers
.text:080495A8 call ___isoc99_scanf
.text:080495AD add esp, 10h
.text:080495B0 mov eax, [ebp+var_18]
.text:080495B3 mov edx, [ebp+var_14]
.text:080495B6 mov ebx, edx
.text:080495B8 mov edx, [ebp+var_10]
.text:080495BB nop
.text:080495BC mov ecx, [ebp+var_C]
.text:080495BF xor ecx, large gs:14h
.text:080495C6 jz short locret_80495CD
.text:080495C8 call ___stack_chk_fail

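The string constant aXXX is "%x %x %x", so scanf reads three hexadecimal integers. get_user_input() then hands them back in eax, ebx, and edx.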

complex_function_1()

 __unwind {
.text:08049218 endbr32
.text:0804921C push ebp
.text:0804921D mov ebp, esp
.text:0804921F xor [ebp+arg_0], 0B47ECDE5h
.text:08049226 add [ebp+arg_0], 7C34CCE5h
.text:0804922D mov eax, [ebp+arg_0]
.text:08049230 sub eax, 68313899h
.text:08049235 mov [ebp+arg_0], eax
.text:08049238 add [ebp+arg_0], 25546EB0h
.text:0804923F xor [ebp+arg_0], 667C4B5Fh
.text:08049246 add [ebp+arg_0], 5218863Ah
.text:0804924D add [ebp+arg_0], 4A61A7C0h
.text:08049254 xor [ebp+arg_0], 0D81E98A1h
.text:0804925B add [ebp+arg_0], 0EA4480Ah
.text:08049262 xor [ebp+arg_0], 0FA091CC9h
.text:08049269 xor [ebp+arg_0], 0BB654A8Dh
.text:08049270 xor [ebp+arg_0], 518DA374h
.text:08049277 mov eax, [ebp+arg_0]
.text:0804927A sub eax, 573A9DA6h
.text:0804927F mov [ebp+arg_0], eax
.text:08049282 mov eax, [ebp+arg_0]
.text:08049285 sub eax, 720B986Bh
.text:0804928A mov [ebp+arg_0], eax
.text:0804928D xor [ebp+arg_0], 62F378CAh
.text:08049294 mov eax, [ebp+arg_0]
.text:08049297 sub eax, 7C6D5375h
.text:0804929C mov [ebp+arg_0], eax
.text:0804929F xor [ebp+arg_0], 5073ECCAh
.text:080492A6 xor [ebp+arg_0], 1E57541Dh
.text:080492AD mov eax, [ebp+arg_0]
.text:080492B0 sub eax, 5EDD295Ch
.text:080492B5 mov [ebp+arg_0], eax
.text:080492B8 mov eax, [ebp+arg_0]
.text:080492BB sub eax, 70D80AE9h
.text:080492C0 mov [ebp+arg_0], eax
.text:080492C3 add [ebp+arg_0], 41CE452Bh
.text:080492CA xor [ebp+arg_0], 2CB9F02h
.text:080492D1 xor [ebp+arg_0], 1DA8339Eh
.text:080492D8 add [ebp+arg_0], 1708A4A7h
.text:080492DF xor [ebp+arg_0], 0CCAC92A7h
.text:080492E6 mov eax, [ebp+arg_0]
.text:080492E9 sub eax, 3D111768h
.text:080492EE mov [ebp+arg_0], eax
.text:080492F1 mov eax, [ebp+arg_0]
.text:080492F4 sub eax, 2369D87Bh
.text:080492F9 mov [ebp+arg_0], eax
.text:080492FC xor [ebp+arg_0], 184BE27Fh
.text:08049303 add [ebp+arg_0], 281A05Eh
.text:0804930A add [ebp+arg_0], 66205B90h
.text:08049311 xor [ebp+arg_0], 3C3B90F5h
.text:08049318 mov eax, [ebp+arg_0]
.text:0804931B pop ebp
.text:0804931C retn

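The function repeatedly applies xor, add, and sub with constants and returns the result. Note that the value of arg_0 is moved into eax along the way (mov eax, [ebp+arg_0]), and the final value is also returned in eax.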
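Each of these operations on a 32-bit value can be undone, so the whole chain is invertible and exactly one input maps to any given output. The sketch below models only the first five operations from the listing (the truncation is an assumption made to keep it short, and apply_ops and invert_ops are hypothetical helper names). Angr reaches the same kind of answer through its constraint solver instead of a hand-written inverse.

```python
MASK = 0xFFFFFFFF  # keep every result within 32 bits, like an x86 register

# First five operations of complex_function_1, copied from the disassembly.
OPS = [
    ('xor', 0xB47ECDE5),
    ('add', 0x7C34CCE5),
    ('sub', 0x68313899),
    ('add', 0x25546EB0),
    ('xor', 0x667C4B5F),
]

def apply_ops(value, ops=OPS):
    """Run the operations forward, wrapping around at 32 bits."""
    for op, const in ops:
        if op == 'xor':
            value ^= const
        elif op == 'add':
            value = (value + const) & MASK
        else:  # 'sub'
            value = (value - const) & MASK
    return value

def invert_ops(value, ops=OPS):
    """Undo the operations in reverse order (xor undoes itself, add and sub swap)."""
    for op, const in reversed(ops):
        if op == 'xor':
            value ^= const
        elif op == 'add':
            value = (value - const) & MASK
        else:  # 'sub'
            value = (value + const) & MASK
    return value

# The input that this shortened chain would turn into 0.
needed_input = invert_ops(0)
print(hex(needed_input))
```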

Now look at the condition for reaching the <accept path>:

.text:08049635 cmp [ebp+var_14], 0
.text:08049639 jnz short loc_8049647
.text:0804963B cmp [ebp+var_10], 0
.text:0804963F jnz short loc_8049647
.text:08049641 cmp [ebp+var_C], 0
.text:08049645 jz short loc_8049659

All three transformed integers must equal 0.

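Solving with Angr

Angr does not support a scanf call that reads several values, so in this challenge it cannot inject the symbols for the three hexadecimal integers automatically. We have to find a suitable point in the execution and inject them ourselves.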

.text:080495F9                 mov     [ebp+var_14], eax
.text:080495FC mov [ebp+var_10], ebx
.text:080495FF mov [ebp+var_C], edx

This is the injection point we need; the registers themselves become the symbols.

password0_size_in_bits = 32  # :integer, width of the symbolic bitvector in bits
password0 = claripy.BVS('password0', password0_size_in_bits)  # create the symbol
password1 = claripy.BVS('password1', password0_size_in_bits)
password2 = claripy.BVS('password2', password0_size_in_bits)
initial_state.regs.eax = password0
initial_state.regs.ebx = password1
initial_state.regs.edx = password2

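Injecting a symbol and then executing symbolically can be read as: [from the moment symbolic execution starts, this value is the unknown]. This challenge illustrates the point: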

image.png

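There are two cases:

  • The topmost red box in the figure above: if symbolic execution starts here, the unknowns are the three register values at that moment, and those are what Angr solves for. But eax, ebx, and edx do not hold the flag yet at this point, so we cannot start from the beginning of main() by default as in the earlier levels.
  • The second red box: Angr again solves for eax, ebx, and edx as they are at that moment, and now those three values are exactly the flag. Starting here is the correct choice.

Here is the code: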

# Angr doesn't currently support reading multiple things with scanf (Ex:
# scanf("%u %u").) You will have to tell the simulation engine to begin the
# program after scanf is called and manually inject the symbols into registers.
import angr
import claripy
import sys

def main(argv):
  # Create an Angr project for the binary.
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Sometimes, you want to specify where the program should start. The variable
  # start_address will specify where the symbolic execution engine should begin.
  # Note that we are using blank_state, not entry_state.
  # (!)
  start_address = 0x080495F9  # :integer (probably hexadecimal)
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                 angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # Create a symbolic bitvector (the datatype Angr uses to inject symbolic
  # values into the binary.) The first parameter is just a name Angr uses
  # to reference it.
  # You will have to construct multiple bitvectors. Copy the two lines below
  # and change the variable names. To figure out how many (and of what size)
  # you need, disassemble the binary and determine the format parameter passed
  # to scanf.
  # (!)
  password0_size_in_bits = 32  # :integer, width of the symbolic bitvector in bits
  password0 = claripy.BVS('password0', password0_size_in_bits)  # create the symbol
  password1 = claripy.BVS('password1', password0_size_in_bits)
  password2 = claripy.BVS('password2', password0_size_in_bits)
  ...

  # Set a register to a symbolic value. This is one way to inject symbols into
  # the program.
  # initial_state.regs stores a number of convenient attributes that reference
  # registers by name. For example, to set eax to password0, use:
  #
  # initial_state.regs.eax = password0
  #
  # You will have to set multiple registers to distinct bitvectors. Copy and
  # paste the line below and change the register. To determine which registers
  # to inject which symbol, disassemble the binary and look at the instructions
  # immediately following the call to scanf.
  # (!)
  initial_state.regs.eax = password0
  initial_state.regs.ebx = password1
  initial_state.regs.edx = password2

  # Create the simulation manager.
  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Good Job' in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Try again' in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]  # a state on the <accept path> was found

    # Solve for the symbolic values. If there are multiple solutions, we only
    # care about one, so we can use eval, which returns any (but only one)
    # solution. Pass eval the bitvector you want to solve for.
    # (!)
    solution0 = solution_state.solver.eval(password0)
    solution1 = solution_state.solver.eval(password1)
    solution2 = solution_state.solver.eval(password2)

    # Aggregate and format the solutions you computed above, and then print
    # the full string. Pay attention to the order of the integers, and the
    # expected base (decimal, octal, hexadecimal, etc).
    # Hexadecimal here, because scanf uses %x.
    solution = hex(solution0) + ' ' + hex(solution1) + ' ' + hex(solution2)  # :string
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

With the comments removed:

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  start_address = 0x080495F9  # :integer (probably hexadecimal)
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                 angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  password0_size_in_bits = 32
  password0 = claripy.BVS('password0', password0_size_in_bits)
  password1 = claripy.BVS('password1', password0_size_in_bits)
  password2 = claripy.BVS('password2', password0_size_in_bits)
  initial_state.regs.eax = password0
  initial_state.regs.ebx = password1
  initial_state.regs.edx = password2

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Good Job' in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Try again' in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution0 = solution_state.solver.eval(password0)
    solution1 = solution_state.solver.eval(password1)
    solution2 = solution_state.solver.eval(password2)

    # Hexadecimal, because scanf uses %x.
    solution = hex(solution0) + ' ' + hex(solution1) + ' ' + hex(solution2)  # :string
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)
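
The last step of the script formats the three solved integers for scanf("%x %x %x"). The following sketch builds the string the same way and checks that parsing each token as hexadecimal gives the original integers back. format_answer and the sample values are hypothetical; they are not output from the solver.

```python
def format_answer(values):
    """Join integers as hexadecimal tokens separated by single spaces."""
    return ' '.join(hex(v) for v in values)

# Made-up 32-bit values standing in for solver.eval() results.
sample = [0x1a2b3c4d, 0x0, 0xffffffff]

answer = format_answer(sample)

# int(token, 16) accepts the "0x" prefix that hex() produces.
parsed = [int(token, 16) for token in answer.split()]

print(answer)
print(parsed == sample)
```

The "0x" prefix is harmless here because %x also accepts an optional leading 0x, so the printed string can be typed into the program as is.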

04_angr_symbolic_stack

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/04_angr_symbolic_stack$ python3 generate.py 1234 04_angr_symbolic_stack
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/04_angr_symbolic_stack$ cp 04_angr_symbolic_stack ./1
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/04_angr_symbolic_stack$ ./04_angr_symbolic_stack
Enter the password: aaaaaaa
Try again.

Analysis

Analyzing with IDA

main()

.text:08049450 ; __unwind {
.text:08049450 endbr32
.text:08049454 lea ecx, [esp+4]
.text:08049458 and esp, 0FFFFFFF0h
.text:0804945B push dword ptr [ecx-4]
.text:0804945E push ebp
.text:0804945F mov ebp, esp
.text:08049461 push ecx
.text:08049462 sub esp, 4
.text:08049465 sub esp, 0Ch
.text:08049468 push offset aEnterThePasswo ; "Enter the password: "
.text:0804946D call _printf ; print the prompt
.text:08049472 add esp, 10h
.text:08049475 call handle_user ; the main logic
.text:0804947A mov eax, 0
.text:0804947F mov ecx, [ebp+var_4]
.text:08049482 leave
.text:08049483 lea esp, [ecx-4]
.text:08049486 retn
.text:08049486 ; } // starts at 8049450
.text:08049486 main endp

handle_user()

.text:080493D0 ; int handle_user()
.text:080493D0 public handle_user
.text:080493D0 handle_user proc near ; CODE XREF: main+25↓p
.text:080493D0
.text:080493D0 num_2 = dword ptr -10h
.text:080493D0 num_1 = dword ptr -0Ch
.text:080493D0
.text:080493D0 ; __unwind {
.text:080493D0 endbr32
.text:080493D4 push ebp
.text:080493D5 mov ebp, esp
.text:080493D7 sub esp, 18h
.text:080493DA sub esp, 4
.text:080493DD lea eax, [ebp+num_2] ; unlike the previous level, the input no longer ends up in registers (programs rarely do that)
.text:080493E0 push eax
.text:080493E1 lea eax, [ebp+num_1] ; same here
.text:080493E4 push eax
.text:080493E5 push offset aUU ; "%u %u"
.text:080493EA call ___isoc99_scanf
.text:080493EF add esp, 10h
.text:080493F2 mov eax, [ebp+num_1]
.text:080493F5 sub esp, 0Ch
.text:080493F8 push eax
.text:080493F9 call complex_function0 ; transform num_1
.text:080493FE add esp, 10h
.text:08049401 mov [ebp+num_1], eax
.text:08049404 mov eax, [ebp+num_2]
.text:08049407 sub esp, 0Ch
.text:0804940A push eax
.text:0804940B call complex_function1 ; transform num_2
.text:08049410 add esp, 10h
.text:08049413 mov [ebp+num_2], eax
.text:08049416 mov eax, [ebp+num_1] ; the key comparison
.text:08049419 cmp eax, 0A63C9805h
.text:0804941E jnz short loc_804942A
.text:08049420 mov eax, [ebp+num_2]
.text:08049423 cmp eax, 0B47ECDE5h
.text:08049428 jz short loc_804943C
.text:0804942A
.text:0804942A loc_804942A: ; CODE XREF: handle_user+4E↑j
.text:0804942A sub esp, 0Ch
.text:0804942D push offset s ; "Try again."
.text:08049432 call _puts
.text:08049437 add esp, 10h
.text:0804943A jmp short loc_804944D
.text:0804943C ; ---------------------------------------------------------------------------
.text:0804943C
.text:0804943C loc_804943C: ; CODE XREF: handle_user+58↑j
.text:0804943C sub esp, 0Ch
.text:0804943F push offset aGoodJob ; "Good Job."
.text:08049444 call _puts
.text:08049449 add esp, 10h
.text:0804944C nop
.text:0804944D
.text:0804944D loc_804944D: ; CODE XREF: handle_user+6A↑j
.text:0804944D nop
.text:0804944E leave
.text:0804944F retn
.text:0804944F ; } // starts at 80493D0
.text:0804944F handle_user endp

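Judging by the name of the challenge, we need to use local variables on the stack as the symbols.

I also had a misconception earlier: I thought the starting point of symbolic execution worked like a breakpoint, where everything before it has already run and the symbols are injected at that point. In fact, execution simply begins at that address, and none of the earlier instructions take effect.

So we have to think about the stack. Since nothing before scanf has executed, the local variables do not really exist either. To inject symbols through scanf's two arguments (the two local variables), we have to construct the stack ourselves.

Solving with Angr

The detailed analysis is easier to follow after reading the Angr code.

Here is the solution code: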

# This challenge will be more challenging than the previous challenges that you
# have encountered thus far. Since the goal of this CTF is to teach symbolic
# execution and not how to construct stack frames, these comments will walk you
# through understanding what is on the stack.
#   ! ! !
# IMPORTANT: Any addresses in this script aren't necessarily right! Disassemble
#            the binary yourself to determine the correct addresses!
#   ! ! !
import angr
import claripy
import sys

def main(argv):
  # Locate the binary and create an Angr project for it.
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # For this challenge, we want to begin after the call to scanf. Note that this
  # is in the middle of a function.
  #
  # This challenge requires dealing with the stack, so you have to pay extra
  # careful attention to where you start, otherwise you will enter a condition
  # where the stack is set up incorrectly. In order to determine where after
  # scanf to start, we need to look at the disassembly of the call and the
  # instruction immediately following it:
  #   sub    $0x4,%esp
  #   lea    -0x10(%ebp),%eax
  #   push   %eax
  #   lea    -0xc(%ebp),%eax
  #   push   %eax
  #   push   $0x80489c3
  #   call   8048370 <__isoc99_scanf@plt>
  #   add    $0x10,%esp
  # Now, the question is: do we start on the instruction immediately following
  # scanf (add $0x10,%esp), or the instruction following that (not shown)?
  # Consider what the 'add $0x10,%esp' is doing. Hint: it has to do with the
  # scanf parameters that are pushed to the stack before calling the function.
  # Given that we are not calling scanf in our Angr simulation, where should we
  # start?
  # Answer: start right after the add. That instruction cleans scanf's arguments
  # off the stack; if we started on it, every address we use in the caller's
  # frame would be shifted by 0x10.
  # (!)
  start_address = 0x80493F2
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                 angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # We are jumping into the middle of a function! Therefore, we need to account
  # for how the function constructs the stack. The second instruction of the
  # function is:
  #   mov    %esp,%ebp        (AT&T syntax)
  # At which point it allocates the part of the stack frame we plan to target:
  #   sub    $0x18,%esp
  # Note the value of esp relative to ebp. The space between them is (usually)
  # the stack space. Since esp was decreased by 0x18
  #
  #        high addresses
  #        /-------- The stack --------\
  # ebp -> |                           |
  #        |---------------------------|
  #        |                           |
  #        |---------------------------|
  #         . . . (total of 0x18 bytes)
  #         . . . Somewhere in here is
  #         . . . the data that stores
  #         . . . the result of scanf.
  # esp -> |                           |
  #        \---------------------------/
  #        low addresses
  #
  # Since we are starting after scanf, we are skipping this stack construction
  # step. To make up for this, we need to construct the stack ourselves. Let us
  # start by initializing ebp in the exact same way the program does.
  initial_state.regs.ebp = initial_state.regs.esp

  # scanf("%u %u") needs to be replaced by injecting two bitvectors. The
  # reason for this is that Angr does not (currently) automatically inject
  # symbols if scanf has more than one input parameter. This means Angr can
  # handle 'scanf("%u")', but not 'scanf("%u %u")'.
  # You can either copy and paste the line below or use a Python list.
  # (!)
  password_size_bits = 32
  password0 = claripy.BVS('password0', password_size_bits)
  password1 = claripy.BVS('password1', password_size_bits)

  # Here is the hard part. We need to figure out what the stack looks like, at
  # least well enough to inject our symbols where we want them. In order to do
  # that, let's figure out what the parameters of scanf are:
  #   sub    $0x4,%esp
  #   lea    -0x10(%ebp),%eax
  #   push   %eax
  #   lea    -0xc(%ebp),%eax
  #   push   %eax
  #   push   $0x80489c3
  #   call   8048370 <__isoc99_scanf@plt>
  #   add    $0x10,%esp
  # As you can see, the call to scanf looks like this:
  # scanf(  0x80489c3,   ebp - 0xc,   ebp - 0x10  )
  #      format_string    password0    password1
  # From this, we can construct our new, more accurate stack diagram:
  #
  #            /-------- The stack --------\
  # ebp ->     |          padding          |
  #            |---------------------------|
  # ebp - 0x01 |       more padding        |
  #            |---------------------------|
  # ebp - 0x02 |     even more padding     |
  #            |---------------------------|
  #                        . . .               <- How much padding? Hint: how
  #            |---------------------------|      many bytes is password0?
  # ebp - 0x0b |   password0, second byte  |      Answer: 8 bytes of padding;
  #            |---------------------------|      password0 occupies ebp - 0x09
  # ebp - 0x0c |   password0, first byte   |      down to ebp - 0x0c.
  #            |---------------------------|
  # ebp - 0x0d |   password1, last byte    |
  #            |---------------------------|
  #                        . . .
  #            |---------------------------|
  # ebp - 0x10 |   password1, first byte   |
  #            |---------------------------|
  #                        . . .
  #            |---------------------------|
  # esp ->     |                           |
  #            \---------------------------/
  #
  # Figure out how much space there is and allocate the necessary padding to
  # the stack by decrementing esp before you push the password bitvectors.
  padding_length_in_bytes = 0x8  # :integer
  initial_state.regs.esp -= padding_length_in_bytes  # like the assembly: sub esp, 0x8

  # Push the variables to the stack. Make sure to push them in the right order!
  # The syntax for the following function is:
  #
  # initial_state.stack_push(bitvector)
  #
  # This will push the bitvector on the stack, and decrement esp by the correct
  # amount. You will need to push multiple bitvectors on the stack.
  # (!)
  initial_state.stack_push(password0)  # :bitvector (claripy.BVS, claripy.BVV, claripy.BV)
  initial_state.stack_push(password1)

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Good Job' in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Try again' in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution0 = solution_state.solver.eval(password0)
    solution1 = solution_state.solver.eval(password1)

    solution = str(solution0) + ' ' + str(solution1)
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

With the comments removed:

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  start_address = 0x80493F2
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                 angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )
  initial_state.regs.ebp = initial_state.regs.esp

  password_size_bits = 32
  password0 = claripy.BVS('password0', password_size_bits)
  password1 = claripy.BVS('password1', password_size_bits)

  padding_length_in_bytes = 0x8
  initial_state.regs.esp -= padding_length_in_bytes

  initial_state.stack_push(password0)
  initial_state.stack_push(password1)

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Good Job' in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return b'Try again' in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution0 = solution_state.solver.eval(password0)
    solution1 = solution_state.solver.eval(password1)

    solution = str(solution0) + ' ' + str(solution1)
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

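Question 1: should symbolic execution start before or after add esp, 10h?

The key point is that Angr finishes injecting the symbols before symbolic execution begins.

Suppose we start at add esp, 10h. Angr first builds the state at that instruction. In the real program, esp at this point is where ___isoc99_scanf's retn left it: the three arguments were pushed before the call, so esp still points at them. Our script injects the two symbols before any instruction runs, placing them at ebp - 0xc and ebp - 0x10 with esp just below. Only then does add esp, 10h execute. Its job is to discard the arguments and return to the caller's stack top, so in our constructed state it moves esp back above the injected symbols, and later pushes overwrite them. The injection is wasted. The figures below make this easier to see.

This is the state when symbolic execution starts at add esp, 10h; the symbols have already been injected:

image.png

And this is the state after add esp, 10h has executed:

image.png

So symbolic execution has to start after add esp, 10h.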
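
The effect can be replayed with a toy model of the stack. This is a simplified sketch, not Angr: Machine, build_state, and run are hypothetical names, memory is a plain dictionary, and only the instructions from add esp, 10h up to the second mov eax, [ebp+num_2] in handle_user are modeled (the body of complex_function0 is skipped).

```python
class Machine:
    """Tiny model of a 32-bit stack: esp, ebp, and a dict of 4-byte slots."""
    def __init__(self, top=0x7FFF0000):
        self.esp = top
        self.ebp = top
        self.mem = {}

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

def build_state():
    """Mirror the injection the script performs before any instruction runs."""
    m = Machine()
    m.ebp = m.esp            # initial_state.regs.ebp = initial_state.regs.esp
    m.esp -= 0x8             # padding
    m.push('password0')      # lands at ebp - 0xc
    m.push('password1')      # lands at ebp - 0x10
    return m

def run(m, start_at_add):
    """Replay the instructions after scanf and return what the program reads."""
    if start_at_add:
        m.esp += 0x10                    # add esp, 10h
    num_1 = m.mem[m.ebp - 0xC]           # mov eax, [ebp+num_1]
    m.esp -= 0xC                         # sub esp, 0Ch
    m.push(num_1)                        # push eax
    m.push('return address')             # call complex_function0
    m.esp += 4                           # retn pops the return address
    m.esp += 0x10                        # add esp, 10h
    num_2 = m.mem[m.ebp - 0x10]          # mov eax, [ebp+num_2]
    return num_1, num_2

print(run(build_state(), start_at_add=False))
print(run(build_state(), start_at_add=True))
```

In the second run the push eax before the call lands on ebp - 0x10, so the program reads the pushed argument instead of password1.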

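Understanding symbol injection on the stack

In other words, assume that a region of the stack holds x, then apply the constraints and solve.

How to inject

1. Construct the stack

Look at how the passwords are referenced: ebp - 0xc and ebp - 0x10, both as ebp plus an offset. We need to build a stack of our own to hold the symbols, and make sure the later instructions resolve those references to our symbols.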

image.png

initial_state.regs.ebp = initial_state.regs.esp

Changing ebp this way creates a new stack frame, which in effect gives us a fresh stack.

2. Inject symbols into the stack

padding_length_in_bytes = 0x8
initial_state.regs.esp -= padding_length_in_bytes

initial_state.stack_push(password0)
initial_state.stack_push(password1)

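The first two statements adjust the offset so that the layout mimics the original stack and the later instructions reference the two symbols correctly.

The last two statements push the symbols onto the stack (the two 32-bit symbols were created earlier in the script).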
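
The padding value follows from the offsets in the disassembly: the first value pushed must end up at ebp - 0xc, and each push first lowers esp by 4. A minimal sketch of that arithmetic, assuming esp equals ebp before the adjustment (padding_for and slots_after_pushes are hypothetical helper names):

```python
def padding_for(first_offset, size=4):
    """Bytes to subtract from esp so the next push lands at ebp - first_offset."""
    return first_offset - size

def slots_after_pushes(padding, count, size=4):
    """Offsets below ebp at which successive pushes store their values."""
    return [padding + size * (i + 1) for i in range(count)]

padding = padding_for(0xC)               # num_1 lives at ebp - 0xc
offsets = slots_after_pushes(padding, 2)

print(hex(padding))
print([hex(o) for o in offsets])
```

The second offset matches num_2 at ebp - 0x10, which is why password0 has to be pushed before password1.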

3. Symbolic execution

simulation = project.factory.simgr(initial_state)

05_angr_symbolic_memory

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/05_angr_symbolic_memory$ python3 generate.py 1234 05_angr_symbolic_memory
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/05_angr_symbolic_memory$ ./05_angr_symbolic_memory
Enter the password: aaaaa
aaaaa
aaaaa
aaaaa
Try again.

Analysis

Inspecting with IDA

.text:080492BB ; __unwind {
.text:080492BB endbr32
.text:080492BF lea ecx, [esp+4]
.text:080492C3 and esp, 0FFFFFFF0h
.text:080492C6 push dword ptr [ecx-4]
.text:080492C9 push ebp
.text:080492CA mov ebp, esp
.text:080492CC push ecx
.text:080492CD sub esp, 14h
.text:080492D0 sub esp, 4
.text:080492D3 push 21h ; '!' ; n
.text:080492D5 push 0 ; c
.text:080492D7 push offset user_input ; s
.text:080492DC call _memset ; initialize the buffer
.text:080492E1 add esp, 10h
.text:080492E4 sub esp, 0Ch
.text:080492E7 push offset aEnterThePasswo ; "Enter the password: "
.text:080492EC call _printf
.text:080492F1 add esp, 10h
.text:080492F4 sub esp, 0Ch
.text:080492F7 push offset unk_8134378
.text:080492FC push offset unk_8134370
.text:08049301 push offset unk_8134368
.text:08049306 push offset user_input ; the input is stored in the buffer initialized above
.text:0804930B push offset a8s8s8s8s ; "%8s %8s %8s %8s"
.text:08049310 call ___isoc99_scanf
.text:08049315 add esp, 20h
.text:08049318 mov [ebp+var_C], 0
.text:0804931F jmp short loc_804934E
.text:08049321 ; ---------------------------------------------------------------------------
.text:08049321
.text:08049321 loc_8049321: ; CODE XREF: main+97↓j
.text:08049321 mov eax, [ebp+var_C]
.text:08049324 add eax, 8134360h
.text:08049329 movzx eax, byte ptr [eax]
.text:0804932C movsx eax, al
.text:0804932F sub esp, 8
.text:08049332 push [ebp+var_C]
.text:08049335 push eax ; transform the input stored in memory
.text:08049336 call complex_function
.text:0804933B add esp, 10h
.text:0804933E mov edx, eax
.text:08049340 mov eax, [ebp+var_C]
.text:08049343 add eax, 8134360h
.text:08049348 mov [eax], dl
.text:0804934A add [ebp+var_C], 1
.text:0804934E
.text:0804934E loc_804934E: ; CODE XREF: main+64↑j
.text:0804934E cmp [ebp+var_C], 1Fh
.text:08049352 jle short loc_8049321
.text:08049354 sub esp, 4
.text:08049357 push 20h ; ' ' ; n
.text:08049359 push offset s2 ; "HXUITWOAPNESFFEGWWORQMFUTTYBKKFF"
.text:0804935E push offset user_input ; s1
.text:08049363 call _strncmp
.text:08049368 add esp, 10h
.text:0804936B test eax, eax
.text:0804936D jz short loc_8049381
.text:0804936F sub esp, 0Ch
.text:08049372 push offset s ; "Try again."
.text:08049377 call _puts
.text:0804937C add esp, 10h
.text:0804937F jmp short loc_8049391
.text:08049381 ; ---------------------------------------------------------------------------
.text:08049381
.text:08049381 loc_8049381: ; CODE XREF: main+B2↑j
.text:08049381 sub esp, 0Ch
.text:08049384 push offset aGoodJob ; "Good Job."
.text:08049389 call _puts
.text:0804938E add esp, 10h
.text:08049391
.text:08049391 loc_8049391: ; CODE XREF: main+C4↑j
.text:08049391 mov eax, 0
.text:08049396 mov ecx, [ebp+var_4]
.text:08049399 leave
.text:0804939A lea esp, [ecx-4]
.text:0804939D retn
.text:0804939D ; } // starts at 80492BB
.text:0804939D main endp
.text:0804939D
.text:0804939D ; ---------------------------------------------------------------------------

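This level reads four inputs, so we cannot simply let Angr handle the input automatically. The input is also not saved on the stack; it lives in a fixed region of memory.

Looking at the assembly around the call to complex_function(), we can see that the memory address is a constant.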

.text:08049321                 mov     eax, [ebp+var_C]
.text:08049324 add eax, 8134360h
.text:08049329 movzx eax, byte ptr [eax]
.text:0804932C movsx eax, al
.text:0804932F sub esp, 8
.text:08049332 push [ebp+var_C]
.text:08049335 push eax ; transform the input stored in memory
.text:08049336 call complex_function

So the memory address that holds the input is: 0x8134360
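The layout of the four `%8s` fields can be sketched as follows. This is a minimal illustration, assuming each field occupies 8 bytes starting at 0x8134360, which agrees with the four buffer addresses pushed before the scanf call in the listing above. The helper name `field_addresses` is hypothetical and only used for this example.

```python
# Base address of the first scanf destination (user_input), taken from the disassembly.
USER_INPUT_BASE = 0x8134360
FIELD_SIZE = 8   # each "%8s" field holds 8 characters
FIELD_COUNT = 4  # scanf("%8s %8s %8s %8s")


def field_addresses(base=USER_INPUT_BASE, size=FIELD_SIZE, count=FIELD_COUNT):
    """Return the start address of each consecutive 8-byte input field."""
    return [base + index * size for index in range(count)]


if __name__ == "__main__":
    for index, address in enumerate(field_addresses()):
        print(f"password{index} -> {address:#x}")
```

These are the same offsets (0x0, 0x8, 0x10, 0x18) that the solution script adds to the base address when it stores the four symbols.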

Solving with Angr

Code

import angr
import claripy
import sys

def main(argv):
    # Load the binary and create an Angr project for it
    path_to_binary = argv[1]
    project = angr.Project(path_to_binary)

    # Address where symbolic execution starts.
    # Requirement: it must be after memset. Before memset the buffer has not
    # yet been used for input, so Angr would only recover useless data.
    # Requirement: it must be after scanf. Before scanf the state Angr
    # restores still holds the freshly initialized data.
    # Question: does the stack adjustment matter? Testing shows it does not;
    # starting before or after it both work.
    start_address = 0x08049315
    initial_state = project.factory.blank_state(
        addr=start_address,
        add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                     angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # The binary is calling scanf("%8s %8s %8s %8s").
    # (!)
    password_len_bits = 64  # 8 characters, 8 bits each, 64 bits in total
    password0 = claripy.BVS('password0', password_len_bits)
    password1 = claripy.BVS('password1', password_len_bits)
    password2 = claripy.BVS('password2', password_len_bits)
    password3 = claripy.BVS('password3', password_len_bits)
    ...

    # Determine the address of the global variable to which scanf writes the user
    # input. The function 'initial_state.memory.store(address, value)' will write
    # 'value' (a bitvector) to 'address' (a memory location, as an integer.) The
    # 'address' parameter can also be a bitvector (and can be symbolic!).
    # (!)
    password0_address = 0x8134360
    initial_state.memory.store(password0_address, password0)
    initial_state.memory.store(password0_address + 0x8, password1)
    initial_state.memory.store(password0_address + 0x10, password2)
    initial_state.memory.store(password0_address + 0x18, password3)

    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def should_abort(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        solution_state = simulation.found[0]

        # Solve for the symbolic values. We are trying to solve for a string.
        # Therefore, we will use eval, with named parameter cast_to=bytes
        # which returns bytes that can be decoded to a string instead of an integer.
        # (!)
        # cast_to=bytes returns the value as raw bytes; decode() then turns
        # those bytes into a str.
        solution0 = solution_state.solver.eval(password0, cast_to=bytes).decode()
        solution1 = solution_state.solver.eval(password1, cast_to=bytes).decode()
        solution2 = solution_state.solver.eval(password2, cast_to=bytes).decode()
        solution3 = solution_state.solver.eval(password3, cast_to=bytes).decode()
        solution = solution0 + solution1 + solution2 + solution3

        print(solution)
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

Code with comments removed

import angr
import claripy
import sys

def main(argv):
    path_to_binary = argv[1]
    project = angr.Project(path_to_binary)

    start_address = 0x08049318
    initial_state = project.factory.blank_state(
        addr=start_address,
        add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                     angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    password_len_bits = 64
    password0 = claripy.BVS('password0', password_len_bits)
    password1 = claripy.BVS('password1', password_len_bits)
    password2 = claripy.BVS('password2', password_len_bits)
    password3 = claripy.BVS('password3', password_len_bits)

    password0_address = 0x8134360
    initial_state.memory.store(password0_address, password0)
    initial_state.memory.store(password0_address + 0x8, password1)
    initial_state.memory.store(password0_address + 0x10, password2)
    initial_state.memory.store(password0_address + 0x18, password3)

    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def should_abort(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        solution_state = simulation.found[0]

        solution0 = solution_state.solver.eval(password0, cast_to=bytes).decode()
        solution1 = solution_state.solver.eval(password1, cast_to=bytes).decode()
        solution2 = solution_state.solver.eval(password2, cast_to=bytes).decode()
        solution3 = solution_state.solver.eval(password3, cast_to=bytes).decode()
        solution = solution0 + solution1 + solution2 + solution3

        print(solution)
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

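Where to start symbolic execution

memset()

The user input is written after memset(), so symbolic execution has to start after memset().

scanf()

If the symbols were injected before scanf(), the state Angr builds would describe memory as it is before scanf(), where the buffer still holds 00 00 00 00, and running scanf() afterwards would overwrite whatever we injected. Symbolic execution must therefore start after scanf().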
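The ordering argument can be illustrated with a toy model. This is only an analogy written in plain Python, not Angr code: the bytes `b"SYMBOLIC"` stand in for an injected symbolic value, the zero bytes stand in for whatever scanf would store, and the function name `buffer_at_check` is hypothetical.

```python
BUFFER_SIZE = 8


def buffer_at_check(inject_before_scanf):
    """Return the toy buffer contents as seen when the comparison loop starts."""
    buffer = bytearray(BUFFER_SIZE)
    injected = b"SYMBOLIC"                 # stands in for an injected symbolic value
    written_by_scanf = b"\x00" * BUFFER_SIZE  # stands in for whatever scanf stores

    if inject_before_scanf:
        buffer[:] = injected          # inject first ...
        buffer[:] = written_by_scanf  # ... then scanf runs and overwrites the buffer
    else:
        # Execution starts after scanf, so nothing overwrites the injected value.
        buffer[:] = injected
    return bytes(buffer)


if __name__ == "__main__":
    print(buffer_at_check(inject_before_scanf=True))
    print(buffer_at_check(inject_before_scanf=False))
```

Only in the second case does the value we control reach the code that transforms and compares the input.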

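add esp, 20h

Does execution have to start after this instruction? This level does not inject any symbols onto the stack, so it makes no difference whether we start before or after it. Testing afterwards confirmed this.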

start_address = 0x08049315   # this is the add esp, 20h after scanf
start_address = 0x08049318   # this is the instruction right after that add esp, 20h
# both produce the correct flag

06_angr_symbolic_dynamic_memory

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/06_angr_symbolic_dynamic_memory$ python3 generate.py 1234 06_angr_symbolic_dynamic_memory
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/06_angr_symbolic_dynamic_memory$ ./06_angr_symbolic_dynamic_memory
Enter the password: aaaaaaaaaaaa
Try again.

Analysis

Analysis with IDA

; int __cdecl main(int argc, const char **argv, const char **envp)
.text:080492FF public main
.text:080492FF main proc near ; DATA XREF: _start+2A↑o
.text:080492FF
.text:080492FF i = dword ptr -0Ch
.text:080492FF var_4 = dword ptr -4
.text:080492FF argc = dword ptr 8
.text:080492FF argv = dword ptr 0Ch
.text:080492FF envp = dword ptr 10h
.text:080492FF
.text:080492FF ; __unwind {
.text:080492FF endbr32
.text:08049303 lea ecx, [esp+4]
.text:08049307 and esp, 0FFFFFFF0h
.text:0804930A push dword ptr [ecx-4]
.text:0804930D push ebp
.text:0804930E mov ebp, esp
.text:08049310 push ecx
.text:08049311 sub esp, 14h
.text:08049314 sub esp, 0Ch
.text:08049317 push 9 ; size
.text:08049319 call _malloc ; allocate memory
.text:0804931E add esp, 10h
.text:08049321 mov ds:buffer0, eax
.text:08049326 sub esp, 0Ch
.text:08049329 push 9 ; size
.text:0804932B call _malloc ; allocate memory
.text:08049330 add esp, 10h
.text:08049333 mov ds:buffer1, eax
.text:08049338 mov eax, ds:buffer0
.text:0804933D sub esp, 4
.text:08049340 push 9 ; n
.text:08049342 push 0 ; c
.text:08049344 push eax ; s
.text:08049345 call _memset ; initialize the memory
.text:0804934A add esp, 10h
.text:0804934D mov eax, ds:buffer1
.text:08049352 sub esp, 4
.text:08049355 push 9 ; n
.text:08049357 push 0 ; c
.text:08049359 push eax ; s
.text:0804935A call _memset ; initialize the memory
.text:0804935F add esp, 10h
.text:08049362 sub esp, 0Ch
.text:08049365 push offset aEnterThePasswo ; "Enter the password: "
.text:0804936A call _printf
.text:0804936F add esp, 10h
.text:08049372 mov edx, ds:buffer1
.text:08049378 mov eax, ds:buffer0
.text:0804937D sub esp, 4
.text:08049380 push edx
.text:08049381 push eax
.text:08049382 push offset a8s8s ; "%8s %8s"
.text:08049387 call ___isoc99_scanf ; read the user input
.text:0804938C add esp, 10h
.text:0804938F mov [ebp+i], 0
.text:08049396 jmp short loc_8049402
.text:08049398 ; ---------------------------------------------------------------------------
.text:08049398
.text:08049398 loc_8049398: ; CODE XREF: main+107↓j
.text:08049398 mov edx, ds:buffer0
.text:0804939E mov eax, [ebp+i]
.text:080493A1 add eax, edx
.text:080493A3 movzx eax, byte ptr [eax]
.text:080493A6 movsx eax, al
.text:080493A9 sub esp, 8
.text:080493AC push [ebp+i]
.text:080493AF push eax
.text:080493B0 call complex_function ; transform password0
.text:080493B5 add esp, 10h
.text:080493B8 mov ecx, eax
.text:080493BA mov edx, ds:buffer0
.text:080493C0 mov eax, [ebp+i]
.text:080493C3 add eax, edx
.text:080493C5 mov edx, ecx
.text:080493C7 mov [eax], dl
.text:080493C9 mov eax, [ebp+i]
.text:080493CC lea edx, [eax+20h]
.text:080493CF mov ecx, ds:buffer1
.text:080493D5 mov eax, [ebp+i]
.text:080493D8 add eax, ecx
.text:080493DA movzx eax, byte ptr [eax]
.text:080493DD movsx eax, al
.text:080493E0 sub esp, 8
.text:080493E3 push edx
.text:080493E4 push eax
.text:080493E5 call complex_function ; transform password1
.text:080493EA add esp, 10h
.text:080493ED mov ecx, eax
.text:080493EF mov edx, ds:buffer1
.text:080493F5 mov eax, [ebp+i]
.text:080493F8 add eax, edx
.text:080493FA mov edx, ecx
.text:080493FC mov [eax], dl
.text:080493FE add [ebp+i], 1
.text:08049402
.text:08049402 loc_8049402: ; CODE XREF: main+97↑j
.text:08049402 cmp [ebp+i], 7
.text:08049406 jle short loc_8049398
.text:08049408 mov eax, ds:buffer0
.text:0804940D sub esp, 4
.text:08049410 push 8 ; n
.text:08049412 push offset s2 ; "HXUITWOA"
.text:08049417 push eax ; s1
.text:08049418 call _strncmp ; check
.text:0804941D add esp, 10h
.text:08049420 test eax, eax
.text:08049422 jnz short loc_8049440
.text:08049424 mov eax, ds:buffer1
.text:08049429 sub esp, 4
.text:0804942C push 8 ; n
.text:0804942E push offset aPnesffeg ; "PNESFFEG"
.text:08049433 push eax ; s1
.text:08049434 call _strncmp ; check
.text:08049439 add esp, 10h
.text:0804943C test eax, eax
.text:0804943E jz short loc_8049452
.text:08049440
.text:08049440 loc_8049440: ; CODE XREF: main+123↑j
.text:08049440 sub esp, 0Ch
.text:08049443 push offset s ; "Try again."
.text:08049448 call _puts
.text:0804944D add esp, 10h
.text:08049450 jmp short loc_8049462
.text:08049452 ; ---------------------------------------------------------------------------
.text:08049452
.text:08049452 loc_8049452: ; CODE XREF: main+13F↑j
.text:08049452 sub esp, 0Ch
.text:08049455 push offset aGoodJob ; "Good Job."
.text:0804945A call _puts
.text:0804945F add esp, 10h
.text:08049462
.text:08049462 loc_8049462: ; CODE XREF: main+151↑j
.text:08049462 mov eax, ds:buffer0
.text:08049467 sub esp, 0Ch
.text:0804946A push eax ; ptr
.text:0804946B call _free
.text:08049470 add esp, 10h
.text:08049473 mov eax, ds:buffer1
.text:08049478 sub esp, 0Ch
.text:0804947B push eax ; ptr
.text:0804947C call _free
.text:08049481 add esp, 10h
.text:08049484 mov eax, 0
.text:08049489 mov ecx, [ebp+var_4]
.text:0804948C leave
.text:0804948D lea esp, [ecx-4]
.text:08049490 retn
.text:08049490 ; } // starts at 80492FF
.text:08049490 main endp
.text:08049490
.text:08049490 ; ---------------------------------------------------------------------------

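This level differs from the previous one in that the memory it uses is no longer at a fixed address; it is allocated with malloc. So we cannot inject the symbols directly into a fixed memory location as we did in the previous level.

Although the address returned by malloc is not known in advance, it is stored in a pointer variable (global or local), and the address of that variable is fixed, so we can reach the allocated memory through it.

Angr offers a simpler approach. The only purpose of malloc here is to obtain a block of memory, so we can substitute a fake, unused block of memory for the one malloc would allocate. This keeps the address used later on fixed, which is much simpler. All we have to do is overwrite the pointer variable with the address of our fake block.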
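The pointer redirection can be sketched with a toy memory model. This is an analogy in plain Python rather than Angr code: the dictionary `memory` and the helpers `store`, `load` and `read_through_buffer0` are hypothetical, while the two addresses are the ones used in the solution script in this section.

```python
BUFFER0_POINTER = 0x0BB9CE00  # address of the global pointer buffer0 (from this write-up)
FAKE_HEAP0 = 0x0804C100       # unused address used as the fake allocation

memory = {}  # toy sparse memory: address -> byte value


def store(address, data):
    """Write a bytes object into the toy memory, one byte per address."""
    for offset, byte in enumerate(data):
        memory[address + offset] = byte


def load(address, size):
    """Read size bytes from the toy memory; untouched bytes read as zero."""
    return bytes(memory.get(address + offset, 0) for offset in range(size))


def read_through_buffer0(size):
    """Dereference buffer0 as the program does: load the pointer, then the data."""
    pointer = int.from_bytes(load(BUFFER0_POINTER, 4), "little")
    return load(pointer, size)


# Overwrite the pointer variable with the fake address (little-endian on x86).
store(BUFFER0_POINTER, FAKE_HEAP0.to_bytes(4, "little"))
# Put the data at the fake address; the program now finds it through buffer0.
store(FAKE_HEAP0, b"AAAAAAAA")

if __name__ == "__main__":
    print(read_through_buffer0(8))
```

The program never needs to know that malloc was bypassed: every access goes through the pointer variable, and that variable now points at memory we chose.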

Solving with Angr

Code

import angr
import claripy
import sys

def main(argv):
    # Create the Angr project
    path_to_binary = argv[1]
    project = angr.Project(path_to_binary)

    # Address where symbolic execution starts
    start_address = 0x0804938F  # right after scanf
    initial_state = project.factory.blank_state(
        addr=start_address,
        add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                     angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # The binary is calling scanf("%8s %8s").
    # (!)
    password_len_bits = 64  # eight characters, 8 bits each
    password0 = claripy.BVS('password0', password_len_bits)
    password1 = claripy.BVS('password1', password_len_bits)

    # Instead of telling the binary to write to the address of the memory
    # allocated with malloc, we can simply fake an address to any unused block of
    # memory and overwrite the pointer to the data. This will point the pointer
    # with the address of pointer_to_malloc_memory_address0 to fake_heap_address.
    # Be aware, there is more than one pointer! Analyze the binary to determine
    # global location of each pointer.
    # Note: by default, Angr stores integers in memory with big-endianness. To
    # specify to use the endianness of your architecture, use the parameter
    # endness=project.arch.memory_endness. On x86, this is little-endian.
    # (!)
    # In short: main uses a variable (global or local) to hold the address of
    # the allocated memory. We first find the address of that variable, then
    # overwrite the value it stores (originally the address returned by malloc)
    # with the address of an unused block of memory.
    fake_heap_address0 = 0x0804C100  # for now I use memory in the .bss section
    # Address of buffer0. buffer0 holds the address returned by malloc, and we
    # now overwrite its content with the address of another block of memory.
    pointer_to_malloc_memory_address0 = 0x0BB9CE00
    initial_state.memory.store(pointer_to_malloc_memory_address0, fake_heap_address0, endness=project.arch.memory_endness)
    fake_heap_address1 = 0x804C108
    pointer_to_malloc_memory_address1 = 0x0BB9CE08
    initial_state.memory.store(pointer_to_malloc_memory_address1, fake_heap_address1, endness=project.arch.memory_endness)

    # Store our symbolic values at our fake_heap_address. Look at the binary to
    # determine the offsets from the fake_heap_address where scanf writes.
    # (!)
    initial_state.memory.store(fake_heap_address0, password0)
    initial_state.memory.store(fake_heap_address1, password1)

    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def should_abort(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        solution_state = simulation.found[0]

        solution0 = solution_state.solver.eval(password0, cast_to=bytes).decode()
        solution1 = solution_state.solver.eval(password1, cast_to=bytes).decode()
        solution = solution0 + solution1

        print(solution)
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

Code with comments removed

import angr
import claripy
import sys

def main(argv):
    path_to_binary = argv[1]
    project = angr.Project(path_to_binary)

    start_address = 0x0804938F
    initial_state = project.factory.blank_state(
        addr=start_address,
        add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                     angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # The binary is calling scanf("%8s %8s").
    # (!)
    password_len_bits = 64
    password0 = claripy.BVS('password0', password_len_bits)
    password1 = claripy.BVS('password1', password_len_bits)

    fake_heap_address0 = 0x0804C100
    pointer_to_malloc_memory_address0 = 0x0BB9CE00
    initial_state.memory.store(pointer_to_malloc_memory_address0, fake_heap_address0, endness=project.arch.memory_endness)
    fake_heap_address1 = 0x804C108
    pointer_to_malloc_memory_address1 = 0x0BB9CE08
    initial_state.memory.store(pointer_to_malloc_memory_address1, fake_heap_address1, endness=project.arch.memory_endness)

    # Store our symbolic values at our fake_heap_address. Look at the binary to
    # determine the offsets from the fake_heap_address where scanf writes.

    initial_state.memory.store(fake_heap_address0, password0)
    initial_state.memory.store(fake_heap_address1, password1)

    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def should_abort(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        solution_state = simulation.found[0]

        solution0 = solution_state.solver.eval(password0, cast_to=bytes).decode()
        solution1 = solution_state.solver.eval(password1, cast_to=bytes).decode()
        solution = solution0 + solution1

        print(solution)
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

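Determining the symbolic execution start address

The reasoning is much the same as in the previous level, so refer to that section.

Creating the symbols

The flag we are looking for consists of two 8-character inputs, so we need to create two 64-bit symbols.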

password_len_bits = 64
password0 = claripy.BVS('password0', password_len_bits)
password1 = claripy.BVS('password1', password_len_bits)
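The relation between a 64-bit value and an 8-character string can be sketched in plain Python. This assumes, as I understand claripy's behaviour, that the solved value is an unsigned integer whose most significant byte is the first character, which is the byte order `cast_to=bytes` produces. The helper names and the sample string "ABCDEFGH" are hypothetical and are not the flag.

```python
PASSWORD_LEN_CHARS = 8
PASSWORD_LEN_BITS = PASSWORD_LEN_CHARS * 8  # 64 bits, as in the script


def int_to_password(value, length_bytes=PASSWORD_LEN_CHARS):
    """Render a solved integer as bytes, most significant byte first."""
    return value.to_bytes(length_bytes, byteorder="big")


def password_to_int(text):
    """Inverse direction: pack an ASCII string into a single integer."""
    return int.from_bytes(text.encode("ascii"), byteorder="big")


if __name__ == "__main__":
    sample = password_to_int("ABCDEFGH")  # hypothetical value, not the real flag
    print(hex(sample))
    print(int_to_password(sample).decode())
```

This is why the script decodes the result of `solver.eval(..., cast_to=bytes)` instead of printing the bare integer.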

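Replacing the memory

The address 0x0BB9CE00 holds the pointer returned by malloc. We replace that pointer with the address of an unused block of memory, 0x0804C100, so that later accesses use our fake memory instead of the block allocated by malloc. The second buffer is replaced in the same way.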

fake_heap_address0 = 0x0804C100
pointer_to_malloc_memory_address0 = 0x0BB9CE00
initial_state.memory.store(pointer_to_malloc_memory_address0, fake_heap_address0, endness=project.arch.memory_endness)

fake_heap_address1 = 0x804C108
pointer_to_malloc_memory_address1 = 0x0BB9CE08
initial_state.memory.store(pointer_to_malloc_memory_address1, fake_heap_address1, endness=project.arch.memory_endness)
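The `endness` argument matters because the program reads the pointer back as a little-endian integer. The following sketch uses only the standard `struct` module to show what the x86 code would read for each byte order; it is an illustration of the byte layout, not of Angr's internals.

```python
import struct

fake_heap_address0 = 0x0804C100

# The two possible byte layouts for the 32-bit pointer value.
stored_big_endian = struct.pack(">I", fake_heap_address0)
stored_little_endian = struct.pack("<I", fake_heap_address0)

# x86 reads 32-bit values from memory as little-endian.
read_back_from_big = struct.unpack("<I", stored_big_endian)[0]
read_back_from_little = struct.unpack("<I", stored_little_endian)[0]

if __name__ == "__main__":
    print(f"stored big-endian,    read back as {read_back_from_big:#010x}")
    print(f"stored little-endian, read back as {read_back_from_little:#010x}")
```

Only the little-endian layout reads back as 0x0804C100; with the other layout the program would dereference an unrelated address.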

Injecting the symbols

Inject the symbols into the fake memory.

initial_state.memory.store(fake_heap_address0, password0)
initial_state.memory.store(fake_heap_address1, password1)

Symbolic execution

simulation.explore(find=is_successful, avoid=should_abort)
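The `find` and `avoid` callbacks only look at the bytes the program has written to stdout. Their decision logic can be sketched without Angr as two plain functions on a bytes object; the function names here are hypothetical, and in the real script the bytes come from `state.posix.dumps(...)`.

```python
def is_successful_output(stdout_output):
    """True when the success message has been printed."""
    return b"Good Job" in stdout_output


def should_abort_output(stdout_output):
    """True when the failure message has been printed."""
    return b"Try again" in stdout_output


if __name__ == "__main__":
    print(is_successful_output(b"Enter the password: Good Job.\n"))
    print(should_abort_output(b"Enter the password: Try again.\n"))
```

States matching the first predicate end up in `simulation.found`; states matching the second are dropped, which keeps the search from wasting time on paths that have already failed.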

Result

Running scaffold06 reports an error, but the flag is still solved.


07_angr_symbolic_file

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/07_angr_symbolic_file$ python3 generate.py 1234 07_angr_symbolic_file
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/07_angr_symbolic_file$ ./07_angr_symbolic_file
Enter the password: aaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/07_angr_symbolic_file$

Analysis

Analysis with IDA

.text:080494F1 ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:080494F1 public main
.text:080494F1 main proc near ; DATA XREF: _start+2A↑o
.text:080494F1
.text:080494F1 var_1C= dword ptr -1Ch
.text:080494F1 argc= dword ptr 8
.text:080494F1 argv= dword ptr 0Ch
.text:080494F1 envp= dword ptr 10h
.text:080494F1
.text:080494F1 ; __unwind {
.text:080494F1 endbr32
.text:080494F5 lea ecx, [esp+4]
.text:080494F9 and esp, 0FFFFFFF0h
.text:080494FC push dword ptr [ecx-4]
.text:080494FF push ebp
.text:08049500 mov ebp, esp
.text:08049502 push edi
.text:08049503 push esi
.text:08049504 push ecx
.text:08049505 sub esp, 1Ch
.text:08049508 sub esp, 4
.text:0804950B push 40h ; '@' ; n
.text:0804950D push 0 ; c
.text:0804950F push offset buffer ; s
.text:08049514 call _memset ; initialize a block of memory
.text:08049519 add esp, 10h
.text:0804951C sub esp, 0Ch
.text:0804951F push offset aEnterThePasswo ; "Enter the password: "
.text:08049524 call _printf ; print the prompt
.text:08049529 add esp, 10h
.text:0804952C sub esp, 8
.text:0804952F push offset buffer
.text:08049534 push offset a64s ; "%64s"
.text:08049539 call ___isoc99_scanf ; read the string
.text:0804953E add esp, 10h
.text:08049541 sub esp, 8
.text:08049544 push 40h ; '@' ; n
.text:08049546 push offset buffer ; int
.text:0804954B call ignore_me ; explained in the scaffold: only there to simulate a file-read operation, can be ignored
.text:08049550 add esp, 10h
.text:08049553 sub esp, 4
.text:08049556 push 40h ; '@' ; n
.text:08049558 push 0 ; c
.text:0804955A push offset buffer ; s
.text:0804955F call _memset ; initialize the memory again
.text:08049564 add esp, 10h
.text:08049567 sub esp, 8
.text:0804956A push offset aRb ; "rb"
.text:0804956F push offset name ; "PNESFFEG.txt"
.text:08049574 call _fopen ; open the file PNESFFEG.txt
.text:08049579 add esp, 10h
.text:0804957C mov ds:fp, eax
.text:08049581 mov eax, ds:fp
.text:08049586 push eax ; stream
.text:08049587 push 40h ; '@' ; n
.text:08049589 push 1 ; size
.text:0804958B push offset buffer ; ptr
.text:08049590 call _fread ; read data from the file PNESFFEG.txt
.text:08049595 add esp, 10h
.text:08049598 mov eax, ds:fp
.text:0804959D sub esp, 0Ch
.text:080495A0 push eax ; stream
.text:080495A1 call _fclose ; close PNESFFEG.txt
.text:080495A6 add esp, 10h
.text:080495A9 sub esp, 0Ch
.text:080495AC push offset name ; "PNESFFEG.txt"
.text:080495B1 call _unlink ; delete the file PNESFFEG.txt
.text:080495B6 add esp, 10h
.text:080495B9 mov [ebp+var_1C], 0
.text:080495C0 jmp short loc_80495EF
.text:080495C2 ; ---------------------------------------------------------------------------
.text:080495C2
.text:080495C2 loc_80495C2: ; CODE XREF: main+102↓j
.text:080495C2 mov eax, [ebp+var_1C]
.text:080495C5 add eax, 804C0A0h
.text:080495CA movzx eax, byte ptr [eax]
.text:080495CD movsx eax, al
.text:080495D0 sub esp, 8
.text:080495D3 push [ebp+var_1C]
.text:080495D6 push eax ; transform the first 8 bytes that were read
.text:080495D7 call complex_function
.text:080495DC add esp, 10h
.text:080495DF mov edx, eax
.text:080495E1 mov eax, [ebp+var_1C]
.text:080495E4 add eax, 804C0A0h
.text:080495E9 mov [eax], dl
.text:080495EB add [ebp+var_1C], 1
.text:080495EF
.text:080495EF loc_80495EF: ; CODE XREF: main+CF↑j
.text:080495EF cmp [ebp+var_1C], 7
.text:080495F3 jle short loc_80495C2
.text:080495F5 mov edx, offset buffer ; the key comparison
.text:080495FA mov eax, offset aHxuitwoa ; "HXUITWOA"
.text:080495FF mov ecx, 9
.text:08049604 mov esi, edx
.text:08049606 mov edi, eax
.text:08049608 repe cmpsb
.text:0804960A setnbe dl
.text:0804960D setb al
.text:08049610 sub edx, eax
.text:08049612 mov eax, edx
.text:08049614 movsx eax, al
.text:08049617 test eax, eax
.text:08049619 jz short loc_8049635
.text:0804961B sub esp, 0Ch
.text:0804961E push offset s ; "Try again."
.text:08049623 call _puts
.text:08049628 add esp, 10h
.text:0804962B sub esp, 0Ch
.text:0804962E push 1 ; status
.text:08049630 call _exit
.text:08049635 ; ---------------------------------------------------------------------------
.text:08049635
.text:08049635 loc_8049635: ; CODE XREF: main+128↑j
.text:08049635 sub esp, 0Ch
.text:08049638 push offset aGoodJob ; "Good Job."
.text:0804963D call _puts
.text:08049642 add esp, 10h
.text:08049645 sub esp, 0Ch
.text:08049648 push 0 ; status
.text:0804964A call _exit
.text:0804964A ; } // starts at 80494F1
.text:0804964A main endp
.text:0804964A
.text:0804964A ; ---------------------------------------------------------------------------
.text:0804964F align 10h
.text:08049650
.text:08049650 ; =============== S U B R O U T I N E =======================================

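There is actually a simple solution: inject a symbol directly at the fixed memory address 0x804C0A0 and solve.

However, scaffold07 requires this level to be solved with a simulated file.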
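For comparison, the direct-injection alternative could look roughly like the sketch below. It is an untested assumption on my part, not the scaffold's intended method: it starts at 0x080495B9 (the `mov [ebp+var_1C], 0` right after the fread, fclose and unlink calls), stores one 64-bit symbol at the buffer, and reuses the find and avoid checks from the other levels. The function name `solve` is hypothetical, and the Angr imports are done inside it so the constants can be inspected without Angr installed.

```python
import sys

BUFFER_ADDRESS = 0x804C0A0  # global buffer, from `add eax, 804C0A0h`
START_ADDRESS = 0x080495B9  # `mov [ebp+var_1C], 0`, after fread/fclose/unlink
PASSWORD_BITS = 8 * 8       # only the first 8 bytes are transformed and compared


def solve(path_to_binary):
    """Sketch of the direct-injection approach; returns the solved bytes."""
    import angr
    import claripy

    project = angr.Project(path_to_binary)
    state = project.factory.blank_state(
        addr=START_ADDRESS,
        add_options={
            angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
            angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS,
        },
    )

    # Place the symbol where fread would have put the file content.
    password = claripy.BVS("password", PASSWORD_BITS)
    state.memory.store(BUFFER_ADDRESS, password)

    simulation = project.factory.simgr(state)
    simulation.explore(
        find=lambda s: b"Good Job" in s.posix.dumps(sys.stdout.fileno()),
        avoid=lambda s: b"Try again" in s.posix.dumps(sys.stdout.fileno()),
    )
    if not simulation.found:
        raise RuntimeError("Could not find the solution")
    return simulation.found[0].solver.eval(password, cast_to=bytes)


if __name__ == "__main__" and len(sys.argv) > 1:
    print(solve(sys.argv[1]))
```

This skips the file handling entirely, which is why the scaffold asks for the simulated-file approach instead: the point of the level is to practice the filesystem API.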

Solving with Angr

Code

# This challenge could, in theory, be solved in multiple ways. However, for the
# sake of learning how to simulate an alternate filesystem, please solve this
# challenge according to structure provided below. As a challenge, once you have
# an initial solution, try solving this in an alternate way.
#
# Problem description and general solution strategy:
# The binary loads the password from a file using the fread function. If the
# password is correct, it prints "Good Job." In order to keep consistency with
# the other challenges, the input from the console is written to a file in the
# ignore_me function. As the name suggests, ignore it, as it only exists to
# maintain consistency with other challenges.
# We want to:
# 1. Determine the file from which fread reads.
# 2. Use Angr to simulate a filesystem where that file is replaced with our own
#    simulated file.
# 3. Initialize the file with a symbolic value, which will be read with fread
#    and propagated through the program.
# 4. Solve for the symbolic input to determine the password.
import angr
import claripy
import sys

def main(argv):
    path_to_binary = argv[1]
    project = angr.Project(path_to_binary)

    start_address = 0x08049564  # placed after memset for now
    initial_state = project.factory.blank_state(
        addr=start_address,
        add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                     angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # Specify some information needed to construct a simulated file. For this
    # challenge, the filename is hardcoded, but in theory, it could be symbolic.
    # Note: to read from the file, the binary calls
    # 'fread(buffer, sizeof(char), 64, file)'.
    # (!)
    filename = "PNESFFEG.txt"      # :string, the file name
    symbolic_file_size_bytes = 64  # file size in bytes

    # Construct a bitvector for the password and then store it in the file's
    # backing memory. For example, imagine a simple file, 'hello.txt':
    #
    # Hello world, my name is John.
    # ^                       ^
    # ^ address 0             ^ address 24 (count the number of characters)
    # In order to represent this in memory, we would want to write the string to
    # the beginning of the file:
    #
    # hello_txt_contents = claripy.BVV('Hello world, my name is John.', 30*8)
    #
    # Perhaps, then, we would want to replace John with a
    # symbolic variable. We would call:
    #
    # name_bitvector = claripy.BVS('symbolic_name', 4*8)
    #
    # Then, after the program calls fopen('hello.txt', 'r') and then
    # fread(buffer, sizeof(char), 30, hello_txt_file), the buffer would contain
    # the string from the file, except four symbolic bytes where the name would be
    # stored.
    # (!)
    # The symbol is sized to match the size of the file.
    password = claripy.BVS('password', symbolic_file_size_bytes * 8)

    # Construct the symbolic file. The file_options parameter specifies the Linux
    # file permissions (read, read/write, execute etc.) The content parameter
    # specifies from where the stream of data should be supplied. If content is
    # an instance of SimSymbolicMemory (we constructed one above), the stream will
    # contain the contents (including any symbolic contents) of the memory,
    # beginning from address zero.
    # Set the content parameter to our BVS instance that holds the symbolic data.
    # (!)
    # An extra size parameter is passed here; remove it if it causes errors later.
    password_file = angr.storage.SimFile(filename, content=password, size=symbolic_file_size_bytes)

    # Add the symbolic file we created to the symbolic filesystem.
    initial_state.fs.insert(filename, password_file)

    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def should_abort(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        solution_state = simulation.found[0]

        solution = solution_state.solver.eval(password, cast_to=bytes).decode()

        print(solution)
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

Code with comments removed

import angr
import claripy
import sys

def main(argv):
    path_to_binary = argv[1]
    project = angr.Project(path_to_binary)

    start_address = 0x08049564
    initial_state = project.factory.blank_state(
        addr=start_address,
        add_options={angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                     angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    filename = "PNESFFEG.txt"
    symbolic_file_size_bytes = 64

    password = claripy.BVS('password', symbolic_file_size_bytes * 8)

    password_file = angr.storage.SimFile(filename, content=password, size=symbolic_file_size_bytes)

    initial_state.fs.insert(filename, password_file)

    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def should_abort(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        solution_state = simulation.found[0]

        solution = solution_state.solver.eval(password, cast_to=bytes).decode()

        print(solution)
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

Steps for building the simulated filesystem

1. Prepare the file parameters

  • File name
  • File size
filename = "PNESFFEG.txt"
symbolic_file_size_bytes = 64

2. Create the symbolic bitvector

password = claripy.BVS('password', symbolic_file_size_bytes * 8)
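# The input for this level is 64 characters. Only the first eight are checked later, but to match the challenge we still create a symbolic bitvector covering all 64 characters.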
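Since only the first eight bytes are constrained, the other 56 bytes of the solved value can be arbitrary and may not decode cleanly. A small helper can keep just the checked prefix. This is a sketch under that assumption; the name `extract_password` and the sample bytes are hypothetical, and the sample is not the real flag.

```python
CHECKED_LENGTH = 8  # the loop and the comparison only cover the first 8 bytes


def extract_password(raw, checked_length=CHECKED_LENGTH):
    """Keep only the bytes the program actually checks and decode them."""
    prefix = raw[:checked_length]
    # errors="replace" avoids an exception if a byte is not valid ASCII.
    return prefix.decode("ascii", errors="replace")


if __name__ == "__main__":
    # Hypothetical 64-byte solver result: 8 meaningful bytes plus arbitrary filler.
    solved = b"ABCDEFGH" + b"\xff" * 56
    print(extract_password(solved))
```

In the script this would be applied to the result of `solution_state.solver.eval(password, cast_to=bytes)` before printing.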

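3. Create the symbolic file

We take the parameters from above (file name and file size) and inject the symbol as the file's content. Together these three pieces are enough to create the file.

Because this is a symbolic file, the constraints we ultimately solve are constraints on the symbolic content of that file.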

password_file = angr.storage.SimFile(filename, content=password, size=symbolic_file_size_bytes)

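4. Add the symbolic file to the simulated filesystem

The first argument is the name of the symbolic file, and the second is the symbolic file itself.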

initial_state.fs.insert(filename, password_file)

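From this point on, any file the binary reads is read from the simulated filesystem, and the data it gets back contains our symbols (for this level, to be precise, all of the data is symbolic). That data is our unknown x.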

08_angr_constraints

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/08_angr_constraints$ python3 generate.py 1234 08_angr_constraints
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/08_angr_constraints$ ./08_angr_constraints
Enter the password: aaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/08_angr_constraints$

Analysis

Analysis with IDA

Main function

.text:080492EA ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:080492EA public main
.text:080492EA main proc near ; DATA XREF: _start+2A↑o
.text:080492EA
.text:080492EA var_C = dword ptr -0Ch
.text:080492EA var_4 = dword ptr -4
.text:080492EA argc = dword ptr 8
.text:080492EA argv = dword ptr 0Ch
.text:080492EA envp = dword ptr 10h
.text:080492EA
.text:080492EA ; __unwind {
.text:080492EA endbr32
.text:080492EE lea ecx, [esp+4]
.text:080492F2 and esp, 0FFFFFFF0h
.text:080492F5 push dword ptr [ecx-4]
.text:080492F8 push ebp
.text:080492F9 mov ebp, esp
.text:080492FB push ecx
.text:080492FC sub esp, 14h
.text:080492FF mov ds:password, 49555848h ; all four dwords belong to password (IDA splits them incorrectly); the reference data is assigned here
.text:08049309 mov ds:dword_804C034, 414F5754h
.text:08049313 mov ds:dword_804C038, 53454E50h
.text:0804931D mov ds:dword_804C03C, 47454646h
.text:08049327 sub esp, 4
.text:0804932A push 11h ; n
.text:0804932C push 0 ; c
.text:0804932E push offset buffer ; s
.text:08049333 call _memset ; initialize buffer
.text:08049338 add esp, 10h
.text:0804933B sub esp, 0Ch
.text:0804933E push offset aEnterThePasswo ; "Enter the password: "
.text:08049343 call _printf
.text:08049348 add esp, 10h
.text:0804934B sub esp, 8
.text:0804934E push offset buffer
.text:08049353 push offset a16s ; "%16s"
.text:08049358 call ___isoc99_scanf ; the input is stored in buffer
.text:0804935D add esp, 10h
.text:08049360 mov [ebp+var_C], 0
.text:08049367 jmp short loc_804939E
.text:08049369 ; ---------------------------------------------------------------------------
.text:08049369
.text:08049369 loc_8049369: ; CODE XREF: main+B8↓j
.text:08049369 mov eax, 0Fh
.text:0804936E sub eax, [ebp+var_C]
.text:08049371 mov edx, eax
.text:08049373 mov eax, [ebp+var_C]
.text:08049376 add eax, 804C040h ; transform the input (buffer) byte by byte
.text:0804937B movzx eax, byte ptr [eax]
.text:0804937E movsx eax, al
.text:08049381 sub esp, 8
.text:08049384 push edx
.text:08049385 push eax
.text:08049386 call complex_function
.text:0804938B add esp, 10h
.text:0804938E mov edx, eax
.text:08049390 mov eax, [ebp+var_C]
.text:08049393 add eax, 804C040h
.text:08049398 mov [eax], dl
.text:0804939A add [ebp+var_C], 1
.text:0804939E
.text:0804939E loc_804939E: ; CODE XREF: main+7D↑j
.text:0804939E cmp [ebp+var_C], 0Fh
.text:080493A2 jle short loc_8049369
.text:080493A4 sub esp, 8
.text:080493A7 push 10h
.text:080493A9 push offset buffer
.text:080493AE call check_equals_HXUITWOAPNESFFEG ; this is the key check
.text:080493B3 add esp, 10h
.text:080493B6 test eax, eax ; test the return value: nonzero jumps to "Good Job.", zero falls through to "Try again."
.text:080493B8 jnz short loc_80493CC
.text:080493BA sub esp, 0Ch
.text:080493BD push offset s ; "Try again."
.text:080493C2 call _puts
.text:080493C7 add esp, 10h
.text:080493CA jmp short loc_80493DC
.text:080493CC ; ---------------------------------------------------------------------------
.text:080493CC
.text:080493CC loc_80493CC: ; CODE XREF: main+CE↑j
.text:080493CC sub esp, 0Ch
.text:080493CF push offset aGoodJob ; "Good Job."
.text:080493D4 call _puts
.text:080493D9 add esp, 10h
.text:080493DC
.text:080493DC loc_80493DC: ; CODE XREF: main+E0↑j
.text:080493DC mov eax, 0
.text:080493E1 mov ecx, [ebp+var_4]
.text:080493E4 leave
.text:080493E5 lea esp, [ecx-4]
.text:080493E8 retn
.text:080493E8 ; } // starts at 80492EA
.text:080493E8 main endp
.text:080493E8
.text:080493E8 ; ---------------------------------------------------------------------------
.text:080493E9 align 10h
.text:080493F0
.text:080493F0 ; =============== S U B R O U T I N E =======================================

Step into check_equals_HXUITWOAPNESFFEG()

.text:08049298 ; _BOOL4 __cdecl check_equals_HXUITWOAPNESFFEG(int, unsigned int)
.text:08049298 public check_equals_HXUITWOAPNESFFEG
.text:08049298 check_equals_HXUITWOAPNESFFEG proc near ; CODE XREF: main+C4↓p
.text:08049298
.text:08049298 var_8 = dword ptr -8
.text:08049298 var_4 = dword ptr -4
.text:08049298 arg_0 = dword ptr 8
.text:08049298 arg_4 = dword ptr 0Ch
.text:08049298
.text:08049298 ; __unwind {
.text:08049298 endbr32
.text:0804929C push ebp
.text:0804929D mov ebp, esp
.text:0804929F sub esp, 10h
.text:080492A2 mov [ebp+var_8], 0
.text:080492A9 mov [ebp+var_4], 0
.text:080492B0 jmp short loc_80492D4
.text:080492B2 ; ---------------------------------------------------------------------------
.text:080492B2
.text:080492B2 loc_80492B2: ; CODE XREF: check_equals_HXUITWOAPNESFFEG+42↓j
.text:080492B2 mov edx, [ebp+var_4]
.text:080492B5 mov eax, [ebp+arg_0]
.text:080492B8 add eax, edx
.text:080492BA movzx edx, byte ptr [eax]
.text:080492BD mov eax, [ebp+var_4]
.text:080492C0 add eax, 804C030h ; this is the reference data
.text:080492C5 movzx eax, byte ptr [eax]
.text:080492C8 cmp dl, al ; compare the reference data with the transformed input
.text:080492CA jnz short loc_80492D0
.text:080492CC add [ebp+var_8], 1
.text:080492D0
.text:080492D0 loc_80492D0: ; CODE XREF: check_equals_HXUITWOAPNESFFEG+32↑j
.text:080492D0 add [ebp+var_4], 1
.text:080492D4
.text:080492D4 loc_80492D4: ; CODE XREF: check_equals_HXUITWOAPNESFFEG+18↑j
.text:080492D4 mov eax, [ebp+var_4]
.text:080492D7 cmp [ebp+arg_4], eax
.text:080492DA ja short loc_80492B2
.text:080492DC mov eax, [ebp+var_8]
.text:080492DF cmp eax, [ebp+arg_4]
.text:080492E2 setz al
.text:080492E5 movzx eax, al
.text:080492E8 leave
.text:080492E9 retn
.text:080492E9 ; } // starts at 8049298
.text:080492E9 check_equals_HXUITWOAPNESFFEG endp
.text:080492E9
.text:080492EA
.text:080492EA ; =============== S U B R O U T I N E =======================================

0x804C030 is the address of the reference data; it sits directly before buffer (0x804C040) in memory.
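The reference string can be double-checked outside IDA by decoding the four immediates that main() writes to 0x804C030 through 0x804C03F. The following is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the challenge.

```python
import struct

# Immediate operands of the four `mov ds:..., imm32` instructions in main().
dwords = [0x49555848, 0x414F5754, 0x53454E50, 0x47454646]

# x86 is little-endian, so each dword is stored lowest byte first.
reference = struct.pack("<4I", *dwords)

print(reference)  # b'HXUITWOAPNESFFEG'
```

The result agrees with the string embedded in the name of the check function.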

Solving with Angr

Code

# The binary asks for a 16 character password to which it applies a complex
# function and then compares with a reference string with the function
# check_equals_[reference string]. (Decompile the binary and take a look at it!)
# The source code for this function is provided here. However, the reference
# string in your version will be different than AABBCCDDEEFFGGHH:
#
# #define REFERENCE_PASSWORD = "AABBCCDDEEFFGGHH";
# int check_equals_AABBCCDDEEFFGGHH(char* to_check, size_t length) {
#   uint32_t num_correct = 0;
#   for (int i=0; i<length; ++i) {
#     if (to_check[i] == REFERENCE_PASSWORD[i]) {
#       num_correct += 1;
#     }
#   }
#   return num_correct == length;
# }
#
# ...
#
# char* input = user_input();
# char* encrypted_input = complex_function(input);
# if (check_equals_AABBCCDDEEFFGGHH(encrypted_input, 16)) {
#   puts("Good Job.");
# } else {
#   puts("Try again.");
# }
#
# (In this build the function is check_equals_HXUITWOAPNESFFEG().)
#
# The function checks if *to_check == "AABBCCDDEEFFGGHH". Verify this yourself.
# While you, as a human, can easily determine that this function is equivalent
# to simply comparing the strings, the computer cannot. Instead the computer
# would need to branch every time the if statement in the loop was called (16
# times), resulting in 2^16 = 65,536 branches, which will take too long of a
# time to evaluate for our needs.
#
# Why so many branches? According to the Angr tutorial slides, a state forks in
# two whenever it reaches an if that depends on symbolic data. Stacking such ifs
# causes path explosion, so the main goal of this level is to cut the number of
# branches down to something acceptable.
#
# We do not know how the complex_function works, but we want to find an input
# that, when modified by complex_function, will produce the string:
# AABBCCDDEEFFGGHH.
#
# In this puzzle, your goal will be to stop the program before this function is
# called and manually constrain the to_check variable to be equal to the
# password you identify by decompiling the binary. Since, you, as a human, know
# that if the strings are equal, the program will print "Good Job.", you can
# be assured that if the program can solve for an input that makes them equal,
# the input will be the correct password.
#
import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Start symbolic execution right after scanf. The input lives at a fixed
  # address, so we inject the symbol there ourselves. Starting anywhere before
  # scanf returns would leave the buffer with its pre-scanf contents, that is,
  # uninitialized garbage.
  start_address = 0x08049360
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # Create the symbolic bitvector
  password_len_bits = 16 * 8  # sixteen characters in total
  password = claripy.BVS('password', password_len_bits)

  # Inject the symbol
  password_address = 0x0804C040  # address of buffer
  initial_state.memory.store(password_address, password)

  # Create the simulation manager
  simulation = project.factory.simgr(initial_state)

  # Angr will not be able to reach the point at which the binary prints out
  # 'Good Job.'. We cannot use that as the target anymore.
  # (!)

  # Stop symbolic execution here, just before check_equals_ is called
  address_to_check_constraint = 0x080493AE
  simulation.explore(find=address_to_check_constraint)

  # Add the constraint
  if simulation.found:
    solution_state = simulation.found[0]

    # Recall that we need to constrain the to_check parameter (see top) of the
    # check_equals_ function. Determine the address that is being passed as the
    # parameter and load it into a bitvector so that we can constrain it.
    # (!)
    constrained_parameter_address = 0x0804C040  # address of the parameter
    constrained_parameter_size_bytes = 16       # size of the parameter in bytes
    # The loaded bitvector holds the contents of this memory, which is the
    # symbol we injected after it has gone through complex_function.
    constrained_parameter_bitvector = solution_state.memory.load(
      constrained_parameter_address,
      constrained_parameter_size_bytes
    )

    # We want to constrain the system to find an input that will make
    # constrained_parameter_bitvector equal the desired value.
    # (!)
    constrained_parameter_desired_value = "HXUITWOAPNESFFEG" # :string (encoded)

    # Specify a claripy expression (using Pythonic syntax) that tests whether
    # constrained_parameter_bitvector == constrained_parameter_desired_value.
    # Add the constraint to the state to let z3 attempt to find an input that
    # will make this expression true.
    solution_state.add_constraints(constrained_parameter_bitvector == constrained_parameter_desired_value)

    # Solve for the constrained_parameter_bitvector.
    # (!)
    solution = solution_state.solver.eval(password, cast_to=bytes)

    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

The code with the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  start_address = 0x08049360
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  password_len_bits = 16 * 8
  password = claripy.BVS('password', password_len_bits)

  password_address = 0x0804C040
  initial_state.memory.store(password_address, password)

  simulation = project.factory.simgr(initial_state)

  address_to_check_constraint = 0x080493AE
  simulation.explore(find=address_to_check_constraint)

  if simulation.found:
    solution_state = simulation.found[0]

    constrained_parameter_address = 0x0804C040
    constrained_parameter_size_bytes = 16
    constrained_parameter_bitvector = solution_state.memory.load(
      constrained_parameter_address,
      constrained_parameter_size_bytes
    )

    constrained_parameter_desired_value = "HXUITWOAPNESFFEG"
    solution_state.add_constraints(constrained_parameter_bitvector == constrained_parameter_desired_value)

    solution = solution_state.solver.eval(password, cast_to=bytes)

    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

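The check in this level is an if inside a loop, and every if splits a state in two, so the number of branches grows exponentially. That is why we add the constraint by hand in order to solve for the flag.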
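To make the growth concrete, the comparison loop can be modeled in plain Python by enumerating every combination of branch outcomes. This is only an illustrative sketch, not how Angr represents states: the function below mirrors the check_equals_ source quoted in the scaffold, and the names LENGTH, paths and accepting are made up for the example.

```python
from itertools import product

LENGTH = 16  # number of compared bytes, from `push 10h` before the call

def check_equals(outcomes):
    """Model of check_equals_: outcomes[i] is True when byte i matched."""
    num_correct = 0
    for matched in outcomes:
        if matched:  # a symbolic executor forks here on symbolic input
            num_correct += 1
    return num_correct == len(outcomes)

# Every combination of branch outcomes is a separate path for the executor.
paths = list(product([False, True], repeat=LENGTH))
accepting = [p for p in paths if check_equals(p)]

print(len(paths))      # 65536
print(len(accepting))  # 1
```

Only one of the 65,536 paths returns 1, which is why expressing the whole comparison as a single equality constraint is so much cheaper than exploring the loop.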

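Steps for adding a constraint manually

1. Decide what to constrain and how

Before implementing the constraint in Angr, we need to know exactly what we are constraining and how.

To decide what to constrain, we reproduce the check that the level itself performs.

In this level the object being constrained is the string after complex_function has encrypted it. The check compares that string with the reference data and prints "Good Job." when they are equal. This is the constraint we need to reproduce.

2. Choose where symbolic execution starts

As in the previous two levels, symbolic execution should start after scanf.

3. Create the symbol and inject it

The input is stored in a global variable, so the symbol can be injected directly.

password_len_bits = 16 * 8
password = claripy.BVS('password', password_len_bits)

password_address = 0x0804C040 # address of the global variable
initial_state.memory.store(password_address, password)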

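4. Drop the original check

The original check is too expensive to execute symbolically, which is why we replace it with a constraint of our own. Replacing it means not executing the original checking code at all and relying only on our constraint. In this level the code in question is the check_equals_HXUITWOAPNESFFEG() function.

address_to_check_constraint = 0x080493AE     # stop just before the check function is called
simulation.explore(find=address_to_check_constraint)

Dropping the original check is done by stopping symbolic execution right before check_equals_HXUITWOAPNESFFEG() is called.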

5. Add our own constraint

We load the memory into which the symbol was injected (note that by now complex_function has already transformed it) and add a constraint requiring it to equal the reference data.

Because this memory holds our symbol, constraining it in the current state is the same as constraining the symbolic bitvector after the complex_function transformation.

if simulation.found:
  solution_state = simulation.found[0]

  constrained_parameter_address = 0x0804C040 # start reading from this address
  constrained_parameter_size_bytes = 16 # number of bytes to read
  constrained_parameter_bitvector = solution_state.memory.load(
    constrained_parameter_address,
    constrained_parameter_size_bytes
  ) # load the memory into the constrained_parameter_bitvector bitvector

  # Add the constraint
  constrained_parameter_desired_value = "HXUITWOAPNESFFEG"
  solution_state.add_constraints(constrained_parameter_bitvector == constrained_parameter_desired_value)

6. Solve

Let z3 solve for a concrete password.

solution = solution_state.solver.eval(password, cast_to=bytes)
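
What the solver does here, finding an input whose transformed value equals the reference string, can be imitated in plain Python by brute force. The sketch below is only an illustration under a loud assumption: the real complex_function of this level is not shown in this write-up, so stand_in_transform reuses the formula from level 00 as a hypothetical stand-in. The password it prints is therefore not the answer for this binary.

```python
REFERENCE = b"HXUITWOAPNESFFEG"

def stand_in_transform(ch, index):
    # Hypothetical stand-in (the level 00 formula), NOT this level's complex_function.
    return (3 * index + ch - 65) % 26 + 65

def solve_byte(target, index):
    # Try every uppercase letter and keep the one that maps to the target byte.
    for ch in range(ord("A"), ord("Z") + 1):
        if stand_in_transform(ch, index) == target:
            return ch
    raise ValueError("no uppercase letter maps to the target byte")

# main() passes 0Fh - i as the second argument when transforming buffer[i].
password = bytes(solve_byte(target, 15 - i) for i, target in enumerate(REFERENCE))

# Re-applying the transform must reproduce the reference string.
transformed = bytes(stand_in_transform(ch, 15 - i) for i, ch in enumerate(password))
print(password, transformed == REFERENCE)
```

Brute force works for this toy because each byte is transformed independently. Angr and z3 need no such assumption, which is the reason for expressing the check as a constraint instead.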

09_angr_hooks

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/09_angr_hooks$ python3 generate.py 1234 09_angr_hooks
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/09_angr_hooks$ ./09_angr_hooks
Enter the password: aaaaaaaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/09_angr_hooks$

Analysis

Analyze the binary with IDA.

.text:0804930A                   ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:0804930A public main
.text:0804930A main proc near ; DATA XREF: _start+2A↑o
.text:0804930A
.text:0804930A var_10 = dword ptr -10h
.text:0804930A var_C = dword ptr -0Ch
.text:0804930A var_4 = dword ptr -4
.text:0804930A argc = dword ptr 8
.text:0804930A argv = dword ptr 0Ch
.text:0804930A envp = dword ptr 10h
.text:0804930A
.text:0804930A ; __unwind {
.text:0804930A F3 0F 1E FB endbr32
.text:0804930E 8D 4C 24 04 lea ecx, [esp+4]
.text:08049312 83 E4 F0 and esp, 0FFFFFFF0h
.text:08049315 FF 71 FC push dword ptr [ecx-4]
.text:08049318 55 push ebp
.text:08049319 89 E5 mov ebp, esp
.text:0804931B 51 push ecx
.text:0804931C 83 EC 14 sub esp, 14h
.text:0804931F C7 05 34 C0 04 08+ mov ds:password, 49555848h ; reference data
.text:0804931F 48 58 55 49
.text:08049329 C7 05 38 C0 04 08+ mov ds:dword_804C038, 414F5754h
.text:08049329 54 57 4F 41
.text:08049333 C7 05 3C C0 04 08+ mov ds:dword_804C03C, 53454E50h
.text:08049333 50 4E 45 53
.text:0804933D C7 05 40 C0 04 08+ mov ds:dword_804C040, 47454646h
.text:0804933D 46 46 45 47
.text:08049347 83 EC 04 sub esp, 4
.text:0804934A 6A 11 push 11h ; n
.text:0804934C 6A 00 push 0 ; c
.text:0804934E 68 44 C0 04 08 push offset buffer ; s
.text:08049353 E8 98 FD FF FF call _memset ; initialize buffer
.text:08049358 83 C4 10 add esp, 10h
.text:0804935B 83 EC 0C sub esp, 0Ch
.text:0804935E 68 16 A0 04 08 push offset aEnterThePasswo ; "Enter the password: "
.text:08049363 E8 48 FD FF FF call _printf
.text:08049368 83 C4 10 add esp, 10h
.text:0804936B 83 EC 08 sub esp, 8
.text:0804936E 68 44 C0 04 08 push offset buffer
.text:08049373 68 2B A0 04 08 push offset a16s ; "%16s"
.text:08049378 E8 83 FD FF FF call ___isoc99_scanf ; read the user input
.text:0804937D 83 C4 10 add esp, 10h
.text:08049380 C7 45 F0 00 00 00+ mov [ebp+var_10], 0
.text:08049380 00
.text:08049387 EB 35 jmp short loc_80493BE
.text:08049389 ; ---------------------------------------------------------------------------
.text:08049389
.text:08049389 loc_8049389: ; CODE XREF: main+B8↓j
.text:08049389 B8 12 00 00 00 mov eax, 12h
.text:0804938E 2B 45 F0 sub eax, [ebp+var_10]
.text:08049391 89 C2 mov edx, eax
.text:08049393 8B 45 F0 mov eax, [ebp+var_10]
.text:08049396 05 44 C0 04 08 add eax, 804C044h
.text:0804939B 0F B6 00 movzx eax, byte ptr [eax]
.text:0804939E 0F BE C0 movsx eax, al
.text:080493A1 83 EC 08 sub esp, 8
.text:080493A4 52 push edx
.text:080493A5 50 push eax
.text:080493A6 E8 AD FE FF FF call complex_function ; apply the transformation
.text:080493AB 83 C4 10 add esp, 10h
.text:080493AE 89 C2 mov edx, eax
.text:080493B0 8B 45 F0 mov eax, [ebp+var_10]
.text:080493B3 05 44 C0 04 08 add eax, 804C044h
.text:080493B8 88 10 mov [eax], dl
.text:080493BA 83 45 F0 01 add [ebp+var_10], 1
.text:080493BE
.text:080493BE loc_80493BE: ; CODE XREF: main+7D↑j
.text:080493BE 83 7D F0 0F cmp [ebp+var_10], 0Fh
.text:080493C2 7E C5 jle short loc_8049389
.text:080493C4 83 EC 08 sub esp, 8
.text:080493C7 6A 10 push 10h
.text:080493C9 68 44 C0 04 08 push offset buffer
.text:080493CE E8 E5 FE FF FF call check_equals_HXUITWOAPNESFFEG ; the check we will hook
.text:080493D3 83 C4 10 add esp, 10h
.text:080493D6 A3 58 C0 04 08 mov ds:equals, eax
.text:080493DB C7 45 F4 00 00 00+ mov [ebp+var_C], 0
.text:080493DB 00
.text:080493E2 EB 31 jmp short loc_8049415
.text:080493E4 ; ---------------------------------------------------------------------------
.text:080493E4
.text:080493E4 loc_80493E4: ; CODE XREF: main+10F↓j
.text:080493E4 8B 45 F4 mov eax, [ebp+var_C]
.text:080493E7 8D 50 09 lea edx, [eax+9]
.text:080493EA 8B 45 F4 mov eax, [ebp+var_C]
.text:080493ED 05 34 C0 04 08 add eax, 804C034h
.text:080493F2 0F B6 00 movzx eax, byte ptr [eax]
.text:080493F5 0F BE C0 movsx eax, al
.text:080493F8 83 EC 08 sub esp, 8
.text:080493FB 52 push edx
.text:080493FC 50 push eax
.text:080493FD E8 56 FE FF FF call complex_function
.text:08049402 83 C4 10 add esp, 10h
.text:08049405 89 C2 mov edx, eax
.text:08049407 8B 45 F4 mov eax, [ebp+var_C]
.text:0804940A 05 34 C0 04 08 add eax, 804C034h
.text:0804940F 88 10 mov [eax], dl
.text:08049411 83 45 F4 01 add [ebp+var_C], 1
.text:08049415
.text:08049415 loc_8049415: ; CODE XREF: main+D8↑j
.text:08049415 83 7D F4 0F cmp [ebp+var_C], 0Fh
.text:08049419 7E C9 jle short loc_80493E4
.text:0804941B 83 EC 08 sub esp, 8
.text:0804941E 68 44 C0 04 08 push offset buffer
.text:08049423 68 2B A0 04 08 push offset a16s ; "%16s"
.text:08049428 E8 D3 FC FF FF call ___isoc99_scanf
.text:0804942D 83 C4 10 add esp, 10h
.text:08049430 A1 58 C0 04 08 mov eax, ds:equals
.text:08049435 85 C0 test eax, eax
.text:08049437 74 22 jz short loc_804945B
.text:08049439 83 EC 04 sub esp, 4
.text:0804943C 6A 10 push 10h ; n
.text:0804943E 68 34 C0 04 08 push offset password ; s2
.text:08049443 68 44 C0 04 08 push offset buffer ; s1
.text:08049448 E8 C3 FC FF FF call _strncmp
.text:0804944D 83 C4 10 add esp, 10h
.text:08049450 85 C0 test eax, eax
.text:08049452 75 07 jnz short loc_804945B
.text:08049454 B8 01 00 00 00 mov eax, 1
.text:08049459 EB 05 jmp short loc_8049460
.text:0804945B ; ---------------------------------------------------------------------------
.text:0804945B
.text:0804945B loc_804945B: ; CODE XREF: main+12D↑j
.text:0804945B ; main+148↑j
.text:0804945B B8 00 00 00 00 mov eax, 0
.text:08049460
.text:08049460 loc_8049460: ; CODE XREF: main+14F↑j
.text:08049460 A3 58 C0 04 08 mov ds:equals, eax
.text:08049465 A1 58 C0 04 08 mov eax, ds:equals
.text:0804946A 85 C0 test eax, eax
.text:0804946C 75 12 jnz short loc_8049480
.text:0804946E 83 EC 0C sub esp, 0Ch
.text:08049471 68 0B A0 04 08 push offset s ; "Try again."
.text:08049476 E8 45 FC FF FF call _puts
.text:0804947B 83 C4 10 add esp, 10h
.text:0804947E EB 10 jmp short loc_8049490
.text:08049480 ; ---------------------------------------------------------------------------
.text:08049480
.text:08049480 loc_8049480: ; CODE XREF: main+162↑j
.text:08049480 83 EC 0C sub esp, 0Ch
.text:08049483 68 30 A0 04 08 push offset aGoodJob ; "Good Job."
.text:08049488 E8 33 FC FF FF call _puts
.text:0804948D 83 C4 10 add esp, 10h
.text:08049490
.text:08049490 loc_8049490: ; CODE XREF: main+174↑j
.text:08049490 B8 00 00 00 00 mov eax, 0
.text:08049495 8B 4D FC mov ecx, [ebp+var_4]
.text:08049498 C9 leave
.text:08049499 8D 61 FC lea esp, [ecx-4]
.text:0804949C C3 retn
.text:0804949C ; } // starts at 804930A
.text:0804949C main endp
.text:0804949C
.text:0804949C ; ---------------------------------------------------------------------------
.text:0804949D 66 90 90 align 10h
.text:080494A0
.text:080494A0 ; =============== S U B R O U T I N E =======================================
.text:080494A0
.text:080494A0
.text:080494A0 public __libc_csu_init
.text:080494A0 __libc_csu_init proc near ; DATA XREF: _start+21↑o
.text:080494A0
.text:080494A0 arg_0 = dword ptr 4
.text:080494A0 arg_4 = dword ptr 8
.text:080494A0 arg_8 = dword ptr 0Ch
.text:080494A0
.text:080494A0 ; __unwind {
.text:080494A0 F3 0F 1E FB endbr32
.text:080494A4 55 push ebp
.text:080494A5 E8 6B 00 00 00 call __x86_get_pc_thunk_bp
.text:080494AA 81 C5 56 2B 00 00 add ebp, (offset _GLOBAL_OFFSET_TABLE_ - $)
.text:080494B0 57 push edi
.text:080494B1 56 push esi
.text:080494B2 53 push ebx
.text:080494B3 83 EC 0C sub esp, 0Ch
.text:080494B6 89 EB mov ebx, ebp
.text:080494B8 8B 7C 24 28 mov edi, [esp+1Ch+arg_8]
.text:080494BC E8 3F FB FF FF call _init_proc
.text:080494C1 8D 9D 10 FF FF FF lea ebx, [ebp-0F0h]
.text:080494C7 8D 85 0C FF FF FF lea eax, [ebp-0F4h]
.text:080494CD 29 C3 sub ebx, eax
.text:080494CF C1 FB 02 sar ebx, 2
.text:080494D2 74 29 jz short loc_80494FD
.text:080494D4 31 F6 xor esi, esi
.text:080494D6 8D B4 26 00 00 00+ lea esi, [esi+0]
.text:080494D6 00
.text:080494DD 8D 76 00 lea esi, [esi+0]
.text:080494E0
.text:080494E0 loc_80494E0: ; CODE XREF: __libc_csu_init+5B↓j
.text:080494E0 83 EC 04 sub esp, 4
.text:080494E3 57 push edi
.text:080494E4 FF 74 24 2C push [esp+24h+arg_4]
.text:080494E8 FF 74 24 2C push [esp+28h+arg_0]
.text:080494EC FF 94 B5 0C FF FF+ call ss:(__frame_dummy_init_array_entry - 804C000h)[ebp+esi*4]
.text:080494EC FF
.text:080494F3 83 C6 01 add esi, 1
.text:080494F6 83 C4 10 add esp, 10h
.text:080494F9 39 F3 cmp ebx, esi
.text:080494FB 75 E3 jnz short loc_80494E0
.text:080494FD
.text:080494FD loc_80494FD: ; CODE XREF: __libc_csu_init+32↑j
.text:080494FD 83 C4 0C add esp, 0Ch
.text:08049500 5B pop ebx
.text:08049501 5E pop esi
.text:08049502 5F pop edi
.text:08049503 5D pop ebp
.text:08049504 C3 retn
.text:08049504 ; } // starts at 80494A0
.text:08049504 __libc_csu_init endp
.text:08049504
.text:08049504 ; ---------------------------------------------------------------------------
.text:08049505 8D B4 26 00 00 00+ align 10h
.text:08049510
.text:08049510 ; =============== S U B R O U T I N E =======================================

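The program reads input twice and checks twice:

  • First check: the input is encrypted and then compared with the reference data
  • Second check: the reference data is encrypted, a second input is read, and the two are compared

The part we need to change is the first comparison. It uses check_equals_HXUITWOAPNESFFEG(), whose loop executes an if for every byte and creates too many branches. The second check uses strncmp (for which angr supplies its own SimProcedure), so it needs no change.

So we use Angr to build a check function that replaces check_equals_HXUITWOAPNESFFEG(). This is a hooking process.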

Solving with Angr

Code

# This level performs the following computations:
#
# 1. Get 16 bytes of user input and encrypt it.
# 2. Save the result of check_equals_AABBCCDDEEFFGGHH (or similar)
# 3. Get another 16 bytes from the user and encrypt it.
# 4. Check that it's equal to a predefined password.
#
# The ONLY part of this program that we have to worry about is #2. We will be
# replacing the call to check_equals_ with our own version, using a hook, since
# check_equals_ will run too slowly otherwise.
#
# Notes on this build, taken from the disassembly:
# 1. Read 16 characters of user input and encrypt them.
# 2. Call check_equals_HXUITWOAPNESFFEG() and save the result (1 if equal, 0 if not).
# 3. Encrypt the reference data, then read user input a second time.
# 4. Check again, this time with strncmp().
# Only step 2 is a concern: check_equals_HXUITWOAPNESFFEG() contains too many if
# branches, so we replace it with our own check function by means of a hook.

import angr
import claripy
import sys

def main(argv):
  # Create the Angr project
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Since Angr can handle the initial call to scanf, we can start from the
  # beginning.
  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # Hook the address of where check_equals_ is called.
  # (!)
  check_equals_called_address = 0x080493CE # address of the call to check_equals_HXUITWOAPNESFFEG()

  # The length parameter in angr.Hook specifies how many bytes the execution
  # engine should skip after completing the hook. This will allow hooks to
  # replace certain instructions (or groups of instructions). Determine the
  # instructions involved in calling check_equals_, and then determine how many
  # bytes are used to represent them in memory. This will be the skip length.
  # (!)
  instruction_to_skip_length = 5 + 3 # the call is 5 bytes; the add that cleans up the arguments after it is 3 bytes
  @project.hook(check_equals_called_address, length=instruction_to_skip_length)
  def skip_check_equals_(state):
    # Determine the address where user input is stored. It is passed as a
    # parameter to the check_equals_ function. Then, load the string. Reminder:
    # int check_equals_(char* to_check, int length) { ...
    user_input_buffer_address = 0x0804C044 # :integer, probably hexadecimal
    user_input_buffer_length = 0x10

    # Reminder: state.memory.load will read the stored value at the address
    # user_input_buffer_address of byte length user_input_buffer_length.
    # It will return a bitvector holding the value. This value can either be
    # symbolic or concrete, depending on what was stored there in the program.
    #
    # The buffer loaded here holds symbolic data. We never created and injected
    # a symbol ourselves, but because execution starts from the default entry
    # state, Angr's scanf fills the destination buffer with symbols for us.
    user_input_string = state.memory.load(
      user_input_buffer_address,
      user_input_buffer_length
    )

    # Determine the string this function is checking the user input against.
    # It's encoded in the name of this function; decompile the program to find
    # it.
    check_against_string = "HXUITWOAPNESFFEG" # :string

    # gcc uses eax to store the return value, if it is an integer. We need to
    # set eax to 1 if check_against_string == user_input_string and 0 otherwise.
    # However, since we are describing an equation to be used by z3 (not to be
    # evaluated immediately), we cannot use Python if else syntax. Instead, we
    # have to use claripy's built in function that deals with if statements.
    # claripy.If(expression, ret_if_true, ret_if_false) will output an
    # expression that evaluates to ret_if_true if expression is true and
    # ret_if_false otherwise.
    # Think of it like the Python "value0 if expression else value1".
    state.regs.eax = claripy.If(
      user_input_string == check_against_string,
      claripy.BVV(1, 32),
      claripy.BVV(0, 32)
    )

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    # Since we are allowing Angr to handle the input, retrieve it by printing
    # the contents of stdin. Use one of the early levels as a reference.
    solution = solution_state.posix.dumps(sys.stdin.fileno())
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

The code with the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  check_equals_called_address = 0x080493CE

  instruction_to_skip_length = 5 + 3
  @project.hook(check_equals_called_address, length=instruction_to_skip_length)
  def skip_check_equals_(state):
    user_input_buffer_address = 0x0804C044
    user_input_buffer_length = 0x10

    user_input_string = state.memory.load(
      user_input_buffer_address,
      user_input_buffer_length
    )

    check_against_string = "HXUITWOAPNESFFEG"

    state.regs.eax = claripy.If(
      user_input_string == check_against_string,
      claripy.BVV(1, 32),
      claripy.BVV(0, 32)
    )

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution = solution_state.posix.dumps(sys.stdin.fileno())
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

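Implementing the hook with Angr

1. Find the address where the function to hook is called

Our goal is to replace check_equals_HXUITWOAPNESFFEG() with something that does the same job, so we first need to know where the function is called before we can hook it.
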
check_equals_called_address = 0x080493CE

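2. Install the hook and make sure the program still runs correctly afterwards

In this level, if the original call check_equals_HXUITWOAPNESFFEG is left in place, the function is still called and Angr cannot solve the challenge. The hook therefore has to cover the original call instruction, which is 5 bytes long. The script also skips the 3-byte add esp, 10h that follows it and cleans up the arguments, for 8 bytes in total.

instruction_to_skip_length = 5 + 3 # the call is 5 bytes; the add that cleans up the arguments after it is 3 bytes
@project.hook(check_equals_called_address, length=instruction_to_skip_length)
# check_equals_called_address: start address of the instructions the hook replaces
# length: total length in bytes of the instructions the hook replaces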
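
The byte counts can be read straight from the opcode column of the IDA listing. The sketch below is a standard-library-only illustration (the variable names are made up): it decodes the call at 0x080493CE and adds up the skip length. The computed target should match the address IDA shows for check_equals_HXUITWOAPNESFFEG.

```python
import struct

CALL_ADDRESS = 0x080493CE
call_bytes = bytes.fromhex("E8 E5 FE FF FF")  # call check_equals_HXUITWOAPNESFFEG
cleanup_bytes = bytes.fromhex("83 C4 10")     # add esp, 10h

# A near call is the opcode E8 followed by a signed 32-bit displacement that is
# relative to the address of the next instruction.
assert call_bytes[0] == 0xE8
(displacement,) = struct.unpack("<i", call_bytes[1:])
call_target = CALL_ADDRESS + len(call_bytes) + displacement

skip_length = len(call_bytes) + len(cleanup_bytes)

print(hex(call_target))  # 0x80492b8
print(skip_length)       # 8
```

Skipping only the 5-byte call would also work, because add esp, 10h would then execute normally and pop the two pushed arguments.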

3. Define the hook function

It goes directly below the hook decorator.

def skip_check_equals_(state):
  user_input_buffer_address = 0x0804C044
  user_input_buffer_length = 0x10

  user_input_string = state.memory.load(
    user_input_buffer_address,
    user_input_buffer_length
  )

  check_against_string = "HXUITWOAPNESFFEG"

  state.regs.eax = claripy.If(
    user_input_string == check_against_string,
    claripy.BVV(1, 32),
    claripy.BVV(0, 32)
  )

需要注意的是我们描述的是z3使用的方程, 所以不能直接使用if语句, 而是使用claripy内置的函数来表示if.

至此我们便完成了hook
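
To see why a plain Python if does not fit here, the following toy model may help. It is not claripy: Expr, Sym, If and evaluate are hypothetical stand-ins written only for this illustration. The point is that comparing a symbolic value produces an expression rather than True or False, so the choice between 1 and 0 has to be recorded as an if-then-else node and resolved later, once a concrete input is known.

```python
class Expr:
    """Toy expression node (a stand-in for a symbolic AST, not claripy)."""

    def __init__(self, op, *args):
        self.op = op
        self.args = args

    def __eq__(self, other):
        # Comparing symbolic values builds a new expression instead of a bool.
        return Expr("eq", self, other)

    def __bool__(self):
        # A Python `if` would need a concrete truth value, which does not exist yet.
        raise TypeError("symbolic expression has no concrete truth value")


def Sym(name):
    """Create a named symbolic value."""
    return Expr("sym", name)


def If(cond, then_value, else_value):
    """Record both outcomes in one expression instead of picking a branch now."""
    return Expr("if", cond, then_value, else_value)


def evaluate(expr, assignment):
    """Resolve an expression once every symbol has a concrete value."""
    if not isinstance(expr, Expr):
        return expr  # plain Python constant
    if expr.op == "sym":
        return assignment[expr.args[0]]
    if expr.op == "eq":
        left = evaluate(expr.args[0], assignment)
        right = evaluate(expr.args[1], assignment)
        return left == right
    if expr.op == "if":
        cond, then_value, else_value = expr.args
        chosen = then_value if evaluate(cond, assignment) else else_value
        return evaluate(chosen, assignment)
    raise ValueError(f"unknown op: {expr.op}")


user_input = Sym("user_input")
eax = If(user_input == b"HXUITWOAPNESFFEG", 1, 0)

print(evaluate(eax, {"user_input": b"HXUITWOAPNESFFEG"}))
print(evaluate(eax, {"user_input": b"AAAAAAAAAAAAAAAA"}))
```

In the real hook, the solver plays the role of evaluate: it searches for an input that makes the stored expression equal to 1.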

10_angr_simprocedures

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/10_angr_simprocedures$ python3 generate.py 1234 10_angr_simprocedures
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/10_angr_simprocedures$ ./10_angr_simprocedures
Enter the password: aaaaaaaaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/10_angr_simprocedures$

Analysis

Analyzing with IDA

.text:0804932A ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:0804932A public main
.text:0804932A main proc near ; DATA XREF: _start+2A↑o
.text:0804932A
.text:0804932A var_3C = dword ptr -3Ch
.text:0804932A var_2C = dword ptr -2Ch
.text:0804932A var_28 = dword ptr -28h
.text:0804932A var_24 = dword ptr -24h
.text:0804932A s = byte ptr -1Dh
.text:0804932A var_C = dword ptr -0Ch
.text:0804932A var_4 = dword ptr -4
.text:0804932A argc = dword ptr 8
.text:0804932A argv = dword ptr 0Ch
.text:0804932A envp = dword ptr 10h
.text:0804932A
.text:0804932A ; __unwind {
.text:0804932A endbr32
.text:0804932E lea ecx, [esp+4]
.text:08049332 and esp, 0FFFFFFF0h
.text:08049335 push dword ptr [ecx-4]
.text:08049338 push ebp
.text:08049339 mov ebp, esp
.text:0804933B push ecx
.text:0804933C sub esp, 44h
.text:0804933F mov eax, ecx
.text:08049341 mov eax, [eax+4]
.text:08049344 mov [ebp+var_3C], eax
.text:08049347 mov eax, large gs:14h
.text:0804934D mov [ebp+var_C], eax
.text:08049350 xor eax, eax
.text:08049352 mov [ebp+var_24], 0DEADBEEFh
.text:08049359 mov [ebp+var_2C], 11h
.text:08049360 sub esp, 4
.text:08049363 push 10h ; n
.text:08049365 push offset aHxuitwoapnesff ; "HXUITWOAPNESFFEG"
.text:0804936A push offset password ; dest
.text:0804936F call _memcpy
.text:08049374 add esp, 10h
.text:08049377 sub esp, 4
.text:0804937A push 11h ; n
.text:0804937C push 0 ; c
.text:0804937E lea eax, [ebp+s]
.text:08049381 push eax ; s
.text:08049382 call _memset
.text:08049387 add esp, 10h
.text:0804938A sub esp, 0Ch
.text:0804938D push offset aEnterThePasswo ; "Enter the password: "
.text:08049392 call _printf
.text:08049397 add esp, 10h
.text:0804939A sub esp, 8
.text:0804939D lea eax, [ebp+s]
.text:080493A0 push eax
.text:080493A1 push offset a16s ; "%16s"
.text:080493A6 call ___isoc99_scanf
.text:080493AB add esp, 10h
.text:080493AE mov [ebp+var_28], 0
.text:080493B5 jmp short loc_80493EC
.text:080493B7 ; ---------------------------------------------------------------------------
.text:080493B7
.text:080493B7 loc_80493B7: ; CODE XREF: main+C6↓j
.text:080493B7 mov eax, 12h
.text:080493BC sub eax, [ebp+var_28]
.text:080493BF mov edx, eax
.text:080493C1 lea ecx, [ebp+s]
.text:080493C4 mov eax, [ebp+var_28]
.text:080493C7 add eax, ecx
.text:080493C9 movzx eax, byte ptr [eax]
.text:080493CC movsx eax, al
.text:080493CF sub esp, 8
.text:080493D2 push edx
.text:080493D3 push eax
.text:080493D4 call complex_function
.text:080493D9 add esp, 10h
.text:080493DC mov ecx, eax
.text:080493DE lea edx, [ebp+s]
.text:080493E1 mov eax, [ebp+var_28]
.text:080493E4 add eax, edx
.text:080493E6 mov [eax], cl
.text:080493E8 add [ebp+var_28], 1
.text:080493EC
.text:080493EC loc_80493EC: ; CODE XREF: main+8B↑j
.text:080493EC cmp [ebp+var_28], 0Fh
.text:080493F0 jle short loc_80493B7
.text:080493F2 cmp [ebp+var_24], 0DEADBEEFh
.text:080493F9 jz loc_804A532
.text:080493FF cmp [ebp+var_24], 0DEADBEEFh
.text:08049406 jnz loc_8049C9F
.text:0804940C cmp [ebp+var_24], 0DEADBEEFh
.text:08049413 jnz loc_804985C
.text:08049419 cmp [ebp+var_24], 0DEADBEEFh
.text:08049420 jz loc_8049641
.text:08049426 cmp [ebp+var_24], 0DEADBEEFh
.text:0804942D jnz loc_804953A
.text:08049433 cmp [ebp+var_24], 0DEADBEEFh
.text:0804943A jz short loc_80494BB
.text:0804943C cmp [ebp+var_24], 0DEADBEEFh
.text:08049443 jz short loc_8049480
.text:08049445 cmp [ebp+var_24], 0DEADBEEFh
.text:0804944C jz short loc_8049467
.text:0804944E sub esp, 8
.text:08049451 push 10h
.text:08049453 lea eax, [ebp+s]
.text:08049456 push eax
.text:08049457 call check_equals_HXUITWOAPNESFFEG
.text:0804945C add esp, 10h
.text:0804945F mov [ebp+var_2C], eax
.text:08049462 jmp loc_804B654
.text:08049467 ; ---------------------------------------------------------------------------
.text:08049467
.text:08049467 loc_8049467: ; CODE XREF: main+122↑j
.text:08049467 sub esp, 8
.text:0804946A push 10h
.text:0804946C lea eax, [ebp+s]
.text:0804946F push eax
.text:08049470 call check_equals_HXUITWOAPNESFFEG
.text:08049475 add esp, 10h
.text:08049478 mov [ebp+var_2C], eax
.text:0804947B jmp loc_804B654
.text:08049480 ; ---------------------------------------------------------------------------
.text:08049480
.text:08049480 loc_8049480: ; CODE XREF: main+119↑j
.text:08049480 cmp [ebp+var_24], 0DEADBEEFh
.text:08049487 jnz short loc_80494A2
.text:08049489 sub esp, 8
.text:0804948C push 10h
.text:0804948E lea eax, [ebp+s]
.text:08049491 push eax
.text:08049492 call check_equals_HXUITWOAPNESFFEG
.text:08049497 add esp, 10h
.text:0804949A mov [ebp+var_2C], eax
.text:0804949D jmp loc_804B654
.text:080494A2 ; ---------------------------------------------------------------------------
.text:080494A2
.text:080494A2 loc_80494A2: ; CODE XREF: main+15D↑j
.text:080494A2 sub esp, 8
.text:080494A5 push 10h
.text:080494A7 lea eax, [ebp+s]
.text:080494AA push eax
.text:080494AB call check_equals_HXUITWOAPNESFFEG
.text:080494B0 add esp, 10h
.text:080494B3 mov [ebp+var_2C], eax
.text:080494B6 jmp loc_804B654
.text:080494BB ; ---------------------------------------------------------------------------
.text:080494BB
.text:080494BB loc_80494BB: ; CODE XREF: main+110↑j
.text:080494BB cmp [ebp+var_24], 0DEADBEEFh
.text:080494C2 jnz short loc_80494FF
.text:080494C4 cmp [ebp+var_24], 0DEADBEEFh
.text:080494CB jz short loc_80494E6
.text:080494CD sub esp, 8
.text:080494D0 push 10h
.text:080494D2 lea eax, [ebp+s]
.text:080494D5 push eax
.text:080494D6 call check_equals_HXUITWOAPNESFFEG
.text:080494DB add esp, 10h
.text:080494DE mov [ebp+var_2C], eax
.text:080494E1 jmp loc_804B654
.text:080494E6 ; ---------------------------------------------------------------------------
.text:080494E6
.text:080494E6 loc_80494E6: ; CODE XREF: main+1A1↑j
.text:080494E6 sub esp, 8
.text:080494E9 push 10h
.text:080494EB lea eax, [ebp+s]
.text:080494EE push eax
.text:080494EF call check_equals_HXUITWOAPNESFFEG
.text:080494F4 add esp, 10h
.text:080494F7 mov [ebp+var_2C], eax
.text:080494FA jmp loc_804B654
.text:080494FF ; ---------------------------------------------------------------------------

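What distinguishes this level from the previous one is the very large number of calls to check_equals_HXUITWOAPNESFFEG(). Most of them are dead code, but they make hooking harder: hooking every call site is impractical, so we have to switch from hooking the call sites to hooking the function itself, and Angr provides exactly this capability.

Solving with Angr

Code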

# This challenge is similar to the previous one. It operates under the same
# premise that you will have to replace the check_equals_ function. In this
# case, however, check_equals_ is called so many times that it wouldn't make
# sense to hook where each one was called. Instead, use a SimProcedure to write
# your own check_equals_ implementation and then hook the check_equals_ symbol
# to replace all calls to check_equals_ with a call to your SimProcedure.
#
# You may be thinking:
#   Why can't I just use hooks? The function is called many times, but if I hook
#   the address of the function itself (rather than the addresses where it is
#   called), I can replace its behavior everywhere. Furthermore, I can get the
#   parameters by reading them off the stack (with memory.load(regs.esp + xx)),
#   and return a value by simply setting eax! Since I know the length of the
#   function in bytes, I can return from the hook just before the 'ret'
#   instruction is called, which will allow the program to jump back to where it
#   was before it called my hook.
# If you thought that, then congratulations! You have just invented the idea of
# SimProcedures! Instead of doing all of that by hand, you can let the already-
# implemented SimProcedures do the boring work for you so that you can focus on
# writing a replacement function in a Pythonic way.
# As a bonus, SimProcedures allow you to specify custom calling conventions, but
# unfortunately it is not covered in this CTF.

import angr
import claripy
import sys

def main(argv):
    # Create the Angr project
    path_to_binary = argv[1]
    project = angr.Project(path_to_binary)

    # The input is read by a plain scanf call, so symbolic execution can start
    # from the default entry state
    initial_state = project.factory.entry_state(
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # Define a class that inherits angr.SimProcedure in order to take advantage
    # of Angr's SimProcedures.
    class ReplacementCheckEquals(angr.SimProcedure):
        # A SimProcedure replaces a function in the binary with a simulated one
        # written in Python. Other than it being written in Python, the function
        # acts largely the same as any function written in C. Any parameter after
        # 'self' will be treated as a parameter to the function you are replacing.
        # The parameters will be bitvectors. Additionally, the Python can return in
        # the usual Pythonic way. Angr will treat this in the same way it would
        # treat a native function in the binary returning. An example:
        #
        # int add_if_positive(int a, int b) {
        #   if (a >= 0 && b >= 0) return a + b;
        #   else return 0;
        # }
        #
        # could be simulated with...
        #
        # class ReplacementAddIfPositive(angr.SimProcedure):
        #   def run(self, a, b):
        #     if a >= 0 and b >=0:
        #       return a + b
        #     else:
        #       return 0
        #
        # Finish the parameters to the check_equals_ function. Reminder:
        # int check_equals_AABBCCDDEEFFGGHH(char* to_check, int length) { ...
        # (!)
        def run(self, to_check, len):
            # We can almost copy and paste the solution from the previous challenge.
            # Hint: Don't look up the address! It's passed as a parameter.
            # (!)
            user_input_buffer_address = to_check
            user_input_buffer_length = len

            # Note the use of self.state to find the state of the system in a
            # SimProcedure.
            user_input_string = self.state.memory.load(
                user_input_buffer_address,
                user_input_buffer_length
            )

            check_against_string = "HXUITWOAPNESFFEG"

            # Finally, instead of setting eax, we can use a Pythonic return statement
            # to return the output of this function.
            # Hint: Look at the previous solution.
            return claripy.If(user_input_string == check_against_string, claripy.BVV(1, 32), claripy.BVV(0, 32))

    # Hook the check_equals symbol. Angr automatically looks up the address
    # associated with the symbol. Alternatively, you can use 'hook' instead
    # of 'hook_symbol' and specify the address of the function. To find the
    # correct symbol, disassemble the binary.
    # (!)
    check_equals_symbol = "check_equals_HXUITWOAPNESFFEG" # :string
    project.hook_symbol(check_equals_symbol, ReplacementCheckEquals())

    # Create the simulation manager
    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def should_abort(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        solution_state = simulation.found[0]

        solution = solution_state.posix.dumps(sys.stdin.fileno())
        print(solution)
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

Code with comments removed

import angr
import claripy
import sys

def main(argv):
    path_to_binary = argv[1]
    project = angr.Project(path_to_binary)

    initial_state = project.factory.entry_state(
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    class ReplacementCheckEquals(angr.SimProcedure):
        def run(self, to_check, len):
            user_input_buffer_address = to_check
            user_input_buffer_length = len

            user_input_string = self.state.memory.load(
                user_input_buffer_address,
                user_input_buffer_length
            )

            check_against_string = "HXUITWOAPNESFFEG"

            return claripy.If(user_input_string == check_against_string, claripy.BVV(1, 32), claripy.BVV(0, 32))

    check_equals_symbol = "check_equals_HXUITWOAPNESFFEG"
    project.hook_symbol(check_equals_symbol, ReplacementCheckEquals())

    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def should_abort(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        solution_state = simulation.found[0]

        solution = solution_state.posix.dumps(sys.stdin.fileno())
        print(solution)
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

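Implementing the function hook with Angr

1. Define a class that inherits from angr.SimProcedure so that we can use SimProcedures

class ReplacementCheckEquals(angr.SimProcedure):

2. Define our own hook function

Mind the indentation: the function definition belongs inside the class.

    def run(self, to_check, len):
        user_input_buffer_address = to_check
        user_input_buffer_length = len

        user_input_string = self.state.memory.load(
            user_input_buffer_address,
            user_input_buffer_length
        )

        check_against_string = "HXUITWOAPNESFFEG"

        return claripy.If(user_input_string == check_against_string, claripy.BVV(1, 32), claripy.BVV(0, 32))

Note that the parameters after the first one, self, are the arguments actually passed in the program.

From those arguments we get the address of the user input and read the corresponding contents (the contents are symbolic, because stdin is symbolic by default).

We then use claripy's If to model the logic of the original function.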
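
The scaffold comments mention that, without a SimProcedure, the arguments would have to be read off the stack by hand. A minimal pure-Python sketch of that bookkeeping for a 32-bit cdecl call is shown below. The helper read_cdecl_args and the buffer address are hypothetical and only model the stack layout; the return address is the one following a call site in the IDA listing above.

```python
import struct

def read_cdecl_args(stack: bytes, esp_offset: int, count: int):
    """Read `count` 32-bit arguments as a cdecl callee sees them on entry.

    At function entry [esp] holds the return address, so argument i lives at
    [esp + 4 + 4*i]. Values are little-endian on x86.
    """
    first_arg = esp_offset + 4
    return list(struct.unpack_from(f"<{count}I", stack, first_arg))

# Hypothetical stack contents right after `call check_equals_...`:
# the return address, then the two pushed arguments (buffer pointer, length).
return_address = 0x0804945C
to_check = 0xFFFFD0AB   # made-up buffer address, for illustration only
length = 0x10
stack = struct.pack("<3I", return_address, to_check, length)

args = read_cdecl_args(stack, esp_offset=0, count=2)
print([hex(a) for a in args])
```

A SimProcedure does this lookup for you, which is why run() simply receives to_check and len as parameters.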

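3. Install the hook

With the hook function defined, we can now use it to replace the original function.

Angr makes this very convenient: you only need to know the symbol name, and Angr looks up the corresponding address and hooks it automatically.

check_equals_symbol = "check_equals_HXUITWOAPNESFFEG"
project.hook_symbol(check_equals_symbol, ReplacementCheckEquals())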

11_angr_sim_scanf

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/11_angr_sim_scanf$ python3 generate.py 1234 11_angr_sim_scanf
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/11_angr_sim_scanf$ ./11_angr_sim_scanf
Enter the password: aaaaaaaaaaaaaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/11_angr_sim_scanf$

Analysis

Analyzing with IDA

int __cdecl main(int argc, const char **argv, const char **envp)
{
  int i; // [esp+20h] [ebp-28h]
  char s[20]; // [esp+28h] [ebp-20h] BYREF
  unsigned int v7; // [esp+3Ch] [ebp-Ch]

  v7 = __readgsdword(0x14u);
  memset(s, 0, sizeof(s));
  qmemcpy(s, "HXUITWOA", 8);
  for ( i = 0; i <= 7; ++i )
    s[i] = complex_function(s[i], i); // encrypt the reference data
  printf("Enter the password: ");
  __isoc99_scanf("%u %u", buffer0, buffer1); // the input format is two unsigned integers
  if ( !strncmp(buffer0, s, 4u) && !strncmp(buffer1, &s[4], 4u) )
    puts("Good Job.");
  else
    puts("Try again.");
  return 0;
}

In this level scanf reads two values. Previously we injected symbols after the scanf call; in this level we hook the scanf() function itself.

Solving with Angr

Code

# This time, the solution involves simply replacing scanf with our own version,
# since Angr does not support requesting multiple parameters with scanf.
import angr
import claripy
import sys

def main(argv):
    # Create the Angr project
    path_to_binary = argv[1]
    project = angr.Project(path_to_binary)

    # Default initial state, starting at the program entry point
    initial_state = project.factory.entry_state(
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # Same kind of hook as in the previous level, but this time it replaces scanf
    class ReplacementScanf(angr.SimProcedure):
        # Finish the parameters to the scanf function. Hint: 'scanf("%u %u", ...)'.
        # (!)
        def run(self, format_string, scanf0_address, scanf1_address):
            # Hint: scanf0_address is passed as a parameter, isn't it?
            scanf_data_len = 4 * 8
            scanf0 = claripy.BVS('scanf0', scanf_data_len)  # an unsigned int is 4 bytes, i.e. 32 bits
            scanf1 = claripy.BVS('scanf1', scanf_data_len)

            # The scanf function writes user input to the buffers to which the
            # parameters point.
            # In other words, store our symbolic bitvectors at the addresses of the
            # variables, which we receive as parameters.
            self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
            self.state.memory.store(scanf1_address, scanf1, endness=project.arch.memory_endness)

            # Now, we want to 'set aside' references to our symbolic values in the
            # globals plugin included by default with a state. You will need to
            # store multiple bitvectors. You can either use a list, tuple, or multiple
            # keys to reference the different bitvectors.
            # (!)
            self.state.globals['solution0'] = scanf0
            self.state.globals['solution1'] = scanf1

    scanf_symbol = "__isoc99_scanf"
    project.hook_symbol(scanf_symbol, ReplacementScanf())

    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def should_abort(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        solution_state = simulation.found[0]

        # Grab whatever you set aside in the globals dict.
        stored_solutions0 = solution_state.globals['solution0']
        stored_solutions1 = solution_state.globals['solution1']
        solution0 = solution_state.solver.eval(stored_solutions0)
        solution1 = solution_state.solver.eval(stored_solutions1)

        print(solution0)
        print(solution1)

    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

Code with comments removed

import angr
import claripy
import sys

def main(argv):
    path_to_binary = argv[1]
    project = angr.Project(path_to_binary)

    initial_state = project.factory.entry_state(
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    class ReplacementScanf(angr.SimProcedure):

        def run(self, format_string, scanf0_address, scanf1_address):
            scanf_data_len = 4 * 8
            scanf0 = claripy.BVS('scanf0', scanf_data_len)
            scanf1 = claripy.BVS('scanf1', scanf_data_len)

            self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
            self.state.memory.store(scanf1_address, scanf1, endness=project.arch.memory_endness)

            self.state.globals['solution0'] = scanf0
            self.state.globals['solution1'] = scanf1

    scanf_symbol = "__isoc99_scanf"
    project.hook_symbol(scanf_symbol, ReplacementScanf())

    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def should_abort(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find=is_successful, avoid=should_abort)

    if simulation.found:
        solution_state = simulation.found[0]

        stored_solutions0 = solution_state.globals['solution0']
        stored_solutions1 = solution_state.globals['solution1']
        solution0 = solution_state.solver.eval(stored_solutions0)
        solution1 = solution_state.solver.eval(stored_solutions1)

        print(solution0)
        print(solution1)

    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

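Hook workflow

The overall hook workflow is much the same as in the previous level, except that we additionally store values in globals and read them back.

# store into globals
self.state.globals['solution0'] = scanf0
self.state.globals['solution1'] = scanf1
# read back from globals
stored_solutions0 = solution_state.globals['solution0']
stored_solutions1 = solution_state.globals['solution1']

Storing

First, what is being stored: scanf0 and scanf1 are the symbolic bitvectors we created.

Both symbols are created inside the SimProcedure object, but we ultimately solve for them outside it, where they are not visible. We therefore need somewhere global to keep them so that they can be solved for from outside.

Retrieving and solving

After retrieving them, we solve for concrete values with eval, as in the earlier levels, which gives the final answer.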
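
The two printed solutions are integers because the program reads them with "%u", while main compares the buffers byte by byte with strncmp. On 32-bit x86 the integers are stored little-endian, which is also why the hook passes endness=project.arch.memory_endness to memory.store. The sketch below shows that mapping with the standard struct module; the 4-byte chunk is an arbitrary example, not the real target bytes (those depend on complex_function, which is not shown for this level).

```python
import struct

def u32_to_memory_bytes(value: int) -> bytes:
    """Bytes that a 32-bit unsigned integer occupies in little-endian memory."""
    return struct.pack("<I", value)

def memory_bytes_to_u32(chunk: bytes) -> int:
    """Inverse: the integer whose little-endian encoding is `chunk` (4 bytes)."""
    return struct.unpack("<I", chunk)[0]

# Arbitrary 4-byte chunk used only for illustration.
chunk = b"ABCD"
value = memory_bytes_to_u32(chunk)
print(value, u32_to_memory_bytes(value))
```

Read this way, each solved integer is simply the number to type so that the corresponding 4 bytes end up in the buffer.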

12_angr_veritesting

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/12_angr_veritesting$ python3 generate.py 1234 12_angr_veritesting
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/12_angr_veritesting$ ./12_angr_veritesting
Enter the password: aaaaaaaaaaaaaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/12_angr_veritesting$

Analysis

.text:080492B8 ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:080492B8 public main
.text:080492B8 main proc near ; DATA XREF: _start+2A↑o
.text:080492B8
.text:080492B8 var_4C = dword ptr -4Ch
.text:080492B8 var_3C = dword ptr -3Ch
.text:080492B8 var_38 = dword ptr -38h
.text:080492B8 var_34 = dword ptr -34h
.text:080492B8 s = byte ptr -2Dh
.text:080492B8 var_C = dword ptr -0Ch
.text:080492B8 anonymous_0 = dword ptr -8
.text:080492B8 argc = dword ptr 8
.text:080492B8 argv = dword ptr 0Ch
.text:080492B8 envp = dword ptr 10h
.text:080492B8
.text:080492B8 ; __unwind {
.text:080492B8 endbr32
.text:080492BC lea ecx, [esp+4]
.text:080492C0 and esp, 0FFFFFFF0h
.text:080492C3 push dword ptr [ecx-4]
.text:080492C6 push ebp
.text:080492C7 mov ebp, esp
.text:080492C9 push ebx
.text:080492CA push ecx
.text:080492CB sub esp, 50h
.text:080492CE mov eax, ecx
.text:080492D0 mov eax, [eax+4]
.text:080492D3 mov [ebp+var_4C], eax
.text:080492D6 mov eax, large gs:14h
.text:080492DC mov [ebp+var_C], eax
.text:080492DF xor eax, eax
.text:080492E1 sub esp, 4
.text:080492E4 push 21h ; '!' ; n
.text:080492E6 push 0 ; c
.text:080492E8 lea eax, [ebp+s]
.text:080492EB push eax ; s
.text:080492EC call _memset
.text:080492F1 add esp, 10h
.text:080492F4 sub esp, 0Ch
.text:080492F7 push offset aEnterThePasswo ; "Enter the password: "
.text:080492FC call _printf
.text:08049301 add esp, 10h
.text:08049304 sub esp, 8
.text:08049307 lea eax, [ebp+s]
.text:0804930A push eax
.text:0804930B push offset a32s ; "%32s"
.text:08049310 call ___isoc99_scanf
.text:08049315 add esp, 10h
.text:08049318 mov [ebp+var_3C], 0
.text:0804931F mov [ebp+var_34], 0
.text:08049326 mov [ebp+var_38], 0
.text:0804932D jmp short loc_804935F
.text:0804932F ; ---------------------------------------------------------------------------
.text:0804932F
.text:0804932F loc_804932F: ; CODE XREF: main+AB↓j
.text:0804932F lea edx, [ebp+s]
.text:08049332 mov eax, [ebp+var_38]
.text:08049335 add eax, edx
.text:08049337 movzx eax, byte ptr [eax]
.text:0804933A movsx ebx, al
.text:0804933D mov eax, [ebp+var_38]
.text:08049340 add eax, 85h
.text:08049345 sub esp, 8
.text:08049348 push eax
.text:08049349 push 48h ; 'H'
.text:0804934B call complex_function
.text:08049350 add esp, 10h
.text:08049353 cmp ebx, eax
.text:08049355 jnz short loc_804935B
.text:08049357 add [ebp+var_3C], 1
.text:0804935B
.text:0804935B loc_804935B: ; CODE XREF: main+9D↑j
.text:0804935B add [ebp+var_38], 1
.text:0804935F
.text:0804935F loc_804935F: ; CODE XREF: main+75↑j
.text:0804935F cmp [ebp+var_38], 1Fh
.text:08049363 jle short loc_804932F
.text:08049365 cmp [ebp+var_3C], 20h ; ' '
.text:08049369 jnz short loc_8049385
.text:0804936B movzx eax, byte ptr [ebp+var_C]
.text:0804936F test al, al
.text:08049371 jnz short loc_8049385
.text:08049373 sub esp, 0Ch
.text:08049376 push offset aGoodJob ; "Good Job."
.text:0804937B call _puts
.text:08049380 add esp, 10h
.text:08049383 jmp short loc_8049395
.text:08049385 ; ---------------------------------------------------------------------------
.text:08049385
.text:08049385 loc_8049385: ; CODE XREF: main+B1↑j
.text:08049385 ; main+B9↑j
.text:08049385 sub esp, 0Ch
.text:08049388 push offset s ; "Try again."
.text:0804938D call _puts
.text:08049392 add esp, 10h
.text:08049395
.text:08049395 loc_8049395: ; CODE XREF: main+CB↑j
.text:08049395 mov eax, 0
.text:0804939A mov ecx, [ebp+var_C]
.text:0804939D xor ecx, large gs:14h
.text:080493A4 jz short loc_80493AB
.text:080493A6 call ___stack_chk_fail
.text:080493AB ; ---------------------------------------------------------------------------
.text:080493AB
.text:080493AB loc_80493AB: ; CODE XREF: main+EC↑j
.text:080493AB lea esp, [ebp-8]
.text:080493AE pop ecx
.text:080493AF pop ebx
.text:080493B0 pop ebp
.text:080493B1 lea esp, [ecx-4]
.text:080493B4 retn
.text:080493B4 ; } // starts at 80492B8
.text:080493B4 main endp
.text:080493B4
.text:080493B4 ; ---------------------------------------------------------------------------
.text:080493B5 align 10h
.text:080493C0
.text:080493C0 ; =============== S U B R O U T I N E =======================================

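This challenge has an if statement inside a loop, which causes path explosion. Unlike the earlier levels, the transformation and the check happen together inside the loop, rather than the check living in a separate function as before, which makes hooking harder.

Veritesting

The challenge hints at using Veritesting. A quick search shows that this technique targets path explosion by combining static symbolic execution with dynamic symbolic execution.

What are static and dynamic symbolic execution?

  • Dynamic symbolic execution: runs the program on concrete input values in an emulator and collects the symbolic constraints from the branch predicates along the current path. It then modifies one of those constraints to build a new feasible path constraint, asks a constraint solver for a new concrete input that satisfies it, and starts another round of analysis. I think of it simply as taking one step at a time.
  • Static symbolic execution: uses abstract symbols in place of concrete values. When it meets a branch statement it explores every branch and adds each branch condition to the corresponding path constraint. The concrete values are finally derived from the path constraints.

An analogy makes the difference between the two easier to understand:

  • Dynamic symbolic execution is like Kogoro Mouri: he keeps naming a suspect as the culprit on the strength of a single clue (and is usually wrong). The suspect then explains themselves and provides a new clue, so Mouri now has two clues. Based on those two he hastily accuses another suspect, who also denies it and provides a third clue, and so on. He cannot find the real culprit straight away, but at least the pool of suspects keeps shrinking. (Except that instead of the classic three suspects there may well be 2^16 of them.)
  • Static symbolic execution is like Conan: he calmly gathers every clue, puts them all together at the end, and identifies the culprit in one go.

Veritesting combines the two. It starts with dynamic execution and switches to static execution when it reaches simple code (code without system calls, indirect jumps, or statements that are hard to reason about precisely). In static mode it first recovers the control-flow graph dynamically and identifies which statements are easy for static execution and which are hard. It then switches back to dynamic execution for the cases that static execution handles poorly (complex cases are easier to handle by "guessing" with concrete values).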
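
To get a feel for the numbers, here is a toy count of states for a loop that branches once per iteration, as the comparison loop in main does (32 iterations in the listing above, so up to 2^32 paths without merging). This is only a counting model under simplified assumptions; it is not how angr implements Veritesting, and both function names are made up for this sketch.

```python
def count_paths_without_merging(iterations: int) -> int:
    """Fork into two states at every loop iteration and never merge them."""
    states = [0]  # each state only tracks the match counter
    for _ in range(iterations):
        states = [count + hit for count in states for hit in (0, 1)]
    return len(states)

def count_states_with_merging(iterations: int) -> int:
    """Merge the two sides of the branch after each iteration.

    After merging, one state is enough: the counter becomes a single
    expression such as count + If(match_i, 1, 0).
    """
    states = 1
    for _ in range(iterations):
        forked = states * 2      # the branch splits every state in two
        states = forked // 2     # merging joins each pair back into one
    return states

for n in (4, 8, 16):
    print(n, count_paths_without_merging(n), count_states_with_merging(n))
```

The first count doubles with every iteration while the second stays constant, which is the effect the veritesting option is meant to achieve on this loop.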

Solving with Angr

Code

We reuse the code from the first level directly; the only difference is the extra argument veritesting = True when creating the simulation manager: simulation = project.factory.simgr(initial_state, veritesting = True)

# When you construct a simulation manager, you will want to enable Veritesting:
# project.factory.simgr(initial_state, veritesting=True)
# Hint: use one of the first few levels' solutions as a reference.
# This challenge checks the transformed string with an if inside a loop, which
# makes the number of paths grow exponentially.

import angr
import claripy
import sys

def main(argv):
    # Create the Angr project
    file_path = argv[1]
    project = angr.Project(file_path)

    # Initial state for symbolic execution: there is only one input, so the
    # normal entry state is enough
    initial_state = project.factory.entry_state(
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    simulation = project.factory.simgr(initial_state, veritesting = True)

    # Try plain symbolic execution first, now that Veritesting is enabled
    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def is_false(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find = is_successful, avoid = is_false)

    if simulation.found:
        solution_state = simulation.found[0]
        print(solution_state.posix.dumps(sys.stdin.fileno()))
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

Code with comments removed

import angr
import claripy
import sys

def main(argv):
    file_path = argv[1]
    project = angr.Project(file_path)

    initial_state = project.factory.entry_state(
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    simulation = project.factory.simgr(initial_state, veritesting = True)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def is_false(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find = is_successful, avoid = is_false)

    if simulation.found:
        solution_state = simulation.found[0]
        print(solution_state.posix.dumps(sys.stdin.fileno()))
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

13_angr_static_binary

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/13_angr_static_binary$ python3 generate.py 1234 13_angr_static_binary
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/13_angr_static_binary$ ./13_angr_static_binary
Enter the password: aaaaaaaaaaaaaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/13_angr_static_binary$

Analysis

The pseudocode is clearer here.

int __cdecl main(int argc, const char **argv, const char **envp)
{
  int i; // [esp+1Ch] [ebp-3Ch]
  int j; // [esp+20h] [ebp-38h]
  char v6[20]; // [esp+24h] [ebp-34h] BYREF
  char v7[20]; // [esp+38h] [ebp-20h] BYREF
  unsigned int v8; // [esp+4Ch] [ebp-Ch]

  v8 = __readgsdword(0x14u);
  for ( i = 0; i <= 19; ++i )
    v7[i] = 0;
  qmemcpy(v7, "HXUITWOA", 8);
  printf("Enter the password: ");
  _isoc99_scanf("%8s", v6);
  for ( j = 0; j <= 7; ++j )
    v6[j] = complex_function(v6[j], j);
  if ( j_strcmp_ifunc(v6, v7) )
    puts("Try again.");
  else
    puts("Good Job.");
  return 0;
}

This is an ordinary program that reads a single flag and then checks it with the library function strcmp.

Solving with Angr

As the challenge requires, we need to replace some of the original library functions with Angr's built-in SimProcedures, which makes Angr run faster.

Code

# This challenge is the exact same as the first challenge, except that it was
# compiled as a static binary. Normally, Angr automatically replaces standard
# library functions with SimProcedures that work much more quickly.
#
# To solve the challenge, manually hook any standard library c functions that
# are used. Then, ensure that you begin the execution at the beginning of the
# main function. Do not use entry_state.
#
# Here are a few SimProcedures Angr has already written for you. They implement
# standard library functions. You will not need all of them:
# angr.SIM_PROCEDURES['libc']['malloc']
# angr.SIM_PROCEDURES['libc']['fopen']
# angr.SIM_PROCEDURES['libc']['fclose']
# angr.SIM_PROCEDURES['libc']['fwrite']
# angr.SIM_PROCEDURES['libc']['getchar']
# angr.SIM_PROCEDURES['libc']['strncmp']
# angr.SIM_PROCEDURES['libc']['strcmp']
# angr.SIM_PROCEDURES['libc']['scanf']
# angr.SIM_PROCEDURES['libc']['printf']
# angr.SIM_PROCEDURES['libc']['puts']
# angr.SIM_PROCEDURES['libc']['exit']
#
# As a reminder, you can hook functions with something similar to:
# project.hook(malloc_address, angr.SIM_PROCEDURES['libc']['malloc']())
#
# There are many more, see:
# https://github.com/angr/angr/tree/master/angr/procedures/libc
#
# Additionally, note that, when the binary is executed, the main function is not
# the first piece of code called. In the _start function, __libc_start_main is
# called to start your program. The initialization that occurs in this function
# can take a long time with Angr, so you should replace it with a SimProcedure.
# angr.SIM_PROCEDURES['glibc']['__libc_start_main']
# Note 'glibc' instead of 'libc'.

import angr
import sys

def main(argv):
    binary_path = argv[1]
    project = angr.Project(binary_path)

    # The initial state for symbolic execution is at the start of main
start_address = 0x08049E0F
initial_state = project.factory.blank_state(
addr = start_address,
add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)

#开始Hook
start_main_address = 0x0804A240
project.hook(start_main_address, angr.SIM_PROCEDURES['glibc']['__libc_start_main']())
strcmp_address = 0x080490D0
project.hook(strcmp_address, angr.SIM_PROCEDURES['libc']['strcmp']())
scanf_address = 0x08051330
project.hook(scanf_address, angr.SIM_PROCEDURES['libc']['scanf']())
puts_address = 0x0805EC90
project.hook(puts_address, angr.SIM_PROCEDURES['libc']['puts']())

simulation = project.factory.simgr(initial_state)

# 创建模拟执行器
def is_successful(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return "Good Job".encode() in stdout_output

def is_false(state):
stdout_output = state.posix.dumps(sys.stdout.fileno())
return "Try again".encode() in stdout_output

simulation.explore(find = is_successful, avoid = is_false)

if simulation.found:
solution_state = simulation.found[0]
print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
else:
raise Exception('Could not find the solution')

if __name__ == '__main__':
main(sys.argv)

Code with comments removed

import angr
import sys

def main(argv):
    binary_path = argv[1]
    project = angr.Project(binary_path)

    start_address = 0x08049E0F
    initial_state = project.factory.blank_state(
        addr = start_address,
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    start_main_address = 0x0804A240
    project.hook(start_main_address, angr.SIM_PROCEDURES['glibc']['__libc_start_main']())
    strcmp_address = 0x080490D0
    project.hook(strcmp_address, angr.SIM_PROCEDURES['libc']['strcmp']())
    scanf_address = 0x08051330
    project.hook(scanf_address, angr.SIM_PROCEDURES['libc']['scanf']())
    puts_address = 0x0805EC90
    project.hook(puts_address, angr.SIM_PROCEDURES['libc']['puts']())

    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def is_false(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find = is_successful, avoid = is_false)

    if simulation.found:
        solution_state = simulation.found[0]
        print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

14_angr_shared_library

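Compile (the shared library cannot be run directly)

Because the library cannot be run, there is no direct way to verify that the flag we solve for is correct. We can either write a small test program that reimplements the encryption function, or load the .so file and call validate directly. The first approach is simpler.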
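A minimal sketch of the first approach, written in Python. It assumes that complex_function in this level has the same form as the one decompiled in 00_angr_find, `(3 * i + c - 65) % 26 + 65`, and that the comparison string is "HXUITWOA" as in the decompiled validate; the helper names check and invert are made up for illustration. If the generated binary uses a different function, swap in the decompiled one.

```python
TARGET = "HXUITWOA"  # comparison string from the decompiled validate()

def complex_function(ch, i):
    """Assumed per-character transform, copied from the level 00 decompilation."""
    if ch <= 64 or ch > 90:  # only 'A'..'Z' is accepted
        raise ValueError("Try again.")
    return (3 * i + ch - 65) % 26 + 65

def check(password):
    """Return True if the transformed first 8 characters equal TARGET."""
    if len(password) < len(TARGET):
        return False
    transformed = "".join(
        chr(complex_function(ord(c), i)) for i, c in enumerate(password[:8])
    )
    return transformed == TARGET

def invert(target):
    """Brute-force each position: find the letter that maps onto target[i]."""
    result = []
    for i, wanted in enumerate(target):
        for ch in range(ord("A"), ord("Z") + 1):
            if complex_function(ch, i) == ord(wanted):
                result.append(chr(ch))
                break
    return "".join(result)

candidate = invert(TARGET)
print(candidate, check(candidate))
```

Any password printed by the Angr script can be passed to check() to confirm it, under the same assumption about complex_function.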

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/14_angr_shared_library$ python3 generate.py 1234 14_angr_shared_library_so.c
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/14_angr_shared_library$

Analysis

This is a .so file. Following the hint, we locate the function we need, validate:

_BOOL4 __cdecl validate(char *s1, int a2)
{
  char v3; // al
  char s2[20]; // [esp+4h] [ebp-24h] BYREF
  int j; // [esp+18h] [ebp-10h]
  int i; // [esp+1Ch] [ebp-Ch]

  if ( a2 <= 7 )
    return 0;
  for ( i = 0; i <= 19; ++i )
    s2[i] = 0;
  qmemcpy(s2, "HXUITWOA", 8);
  for ( j = 0; j <= 7; ++j )
  {
    v3 = complex_function(s1[j], j);
    s1[j] = v3;
  }
  return strcmp(s1, s2) == 0;
}

Solving with Angr

Code

# The shared library has the function validate, which takes a string and returns
# either true (1) or false (0). The binary calls this function. If it returns
# true, the program prints "Good Job." otherwise, it prints "Try again."
#
# Note: When you run this script, make sure you run it on
# lib14_angr_shared_library.so, not the executable. This level is intended to
# teach how to analyse binary formats that are not typical executables.

import angr
import claripy
import sys

def main(argv):
    path_to_binary = argv[1]

    # The shared library is compiled with position-independent code. You will need
    # to specify the base address. All addresses in the shared library will be
    # base + offset, where offset is their address in the file.
    # (!)
    base = 0x400000
    project = angr.Project(path_to_binary, load_options={
        'main_opts' : {
            'base_addr' : base
        }
    })

    # Initialize any symbolic values here; you will need at least one to pass to
    # the validate function.
    # (!)
    buffer_pointer = claripy.BVV(0x500000, 32)  # pointer to a region of memory

    # Begin the state at the beginning of the validate function, as if it was
    # called by the program. Determine the parameters needed to call validate and
    # replace 'parameters...' with bitvectors holding the values you wish to pass.
    # Recall that 'claripy.BVV(value, size_in_bits)' constructs a bitvector
    # initialized to a single value.
    # Remember to add the base value you specified at the beginning to the
    # function address!
    # Hint: int validate(char* buffer, int length) { ...
    # Another hint: the password is 8 bytes long.
    # (!)
    validate_function_address = base + 0x0000129C
    initial_state = project.factory.call_state(validate_function_address, buffer_pointer, 8)  # eight characters

    # You will need to add code to inject a symbolic value into the program. Also,
    # at the end of the function, constrain eax to equal true (value of 1) just
    # before the function returns. There are multiple ways to do this:
    # 1. Use a hook.
    # 2. Search for the address just before the function returns and then
    #    constrain eax (this may require putting code elsewhere)
    # (!)
    # Create the symbol
    password_len_bits = 8 * 8
    password = claripy.BVS("password", password_len_bits)
    # Store the symbol in the buffer
    initial_state.memory.store( buffer_pointer, password)

    simulation = project.factory.simgr(initial_state)

    success_address = base + 0x0000134C
    simulation.explore(find=success_address)

    if simulation.found:
        solution_state = simulation.found[0]

        # Determine where the program places the return value, and constrain it so
        # that it is true. Then, solve for the solution and print it.
        # (!)
        solution_state.add_constraints( solution_state.regs.eax != 0 )
        solution = solution_state.solver.eval(password, cast_to = bytes)
        print(solution)
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

Code with comments removed

import angr
import claripy
import sys

def main(argv):
    path_to_binary = argv[1]

    base = 0x400000
    project = angr.Project(path_to_binary, load_options={
        'main_opts' : {
            'base_addr' : base
        }
    })

    buffer_pointer = claripy.BVV(0x500000, 32)

    validate_function_address = base + 0x0000129C
    initial_state = project.factory.call_state(validate_function_address, buffer_pointer, 8)

    password_len_bits = 8 * 8
    password = claripy.BVS("password", password_len_bits)

    initial_state.memory.store( buffer_pointer, password)

    simulation = project.factory.simgr(initial_state)

    success_address = base + 0x0000134C
    simulation.explore(find=success_address)

    if simulation.found:
        solution_state = simulation.found[0]

        solution_state.add_constraints( solution_state.regs.eax != 0 )
        solution = solution_state.solver.eval(password, cast_to = bytes)
        print(solution)
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

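Symbolically executing the .so file

Because the functions in a .so file cannot be run directly, we use Angr to execute them symbolically.

Find the address of the function to execute (remember to add the base address)

We need to know where the function is before we can symbolically execute it.

validate_function_address = base + 0x0000129C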
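As a small worked example of the base + offset rule: with the base address 0x400000 chosen in the script, the two offsets used in this write-up rebase as follows. The helper name rebase is made up for illustration.

```python
BASE = 0x400000  # load base chosen in the solve script

def rebase(offset, base=BASE):
    """Turn a file offset from the disassembler into the address Angr will use."""
    return base + offset

validate_address = rebase(0x129C)  # start of validate
return_address = rebase(0x134C)    # the retn instruction at the end of validate

print(hex(validate_address))  # 0x40129c
print(hex(return_address))    # 0x40134c
```

Forgetting the base is an easy mistake to make here, because the disassembler shows only the offsets.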

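Create the arguments (bitvectors) and simulate a call to validate

The original function is int validate(char * input, int n);

The arguments passed to call_state are:

  • First argument: the address of the function (base + offset)
  • Second argument (from here on these are the original function's own arguments): input must be a bitvector 4 bytes (32 bits) wide; here it is the concrete pointer 0x500000
  • Third argument: validate only requires n to be at least 8, so the constant 8 is passed directly

validate_function_address = base + 0x0000129C
initial_state = project.factory.call_state(validate_function_address, buffer_pointer, 8)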

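Inject the symbolic bitvector into the buffer

The pointer to the buffer (input, that is buffer_pointer) was already fixed above, so that is the address where we store the symbol.

password_len_bits = 8 * 8
password = claripy.BVS("password", password_len_bits) # create the symbol

initial_state.memory.store( buffer_pointer, password) # store the symbol in the buffer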
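To see how a 64-bit bitvector corresponds to the 8 password characters: a bitvector stored without an explicit endness is laid out most-significant byte first, so the first character of the password sits in the top 8 bits, and that is also the order `cast_to = bytes` returns. The sketch below mimics this with plain Python integers; the example value and the helper names are arbitrary.

```python
PASSWORD_BITS = 8 * 8

def bitvector_value_to_bytes(value, bits=PASSWORD_BITS):
    """Most-significant byte first, like an un-swapped store into memory."""
    return value.to_bytes(bits // 8, "big")

def chop_bytes(value, bits=PASSWORD_BITS):
    """Split the value into 8-bit pieces, highest piece first."""
    return [(value >> shift) & 0xFF for shift in range(bits - 8, -1, -8)]

example = int.from_bytes(b"ABCDEFGH", "big")  # arbitrary 8-character value
print(hex(example))
print(bitvector_value_to_bytes(example))
print([chr(b) for b in chop_bytes(example)])
```

This is why the string buffer is stored without an endness argument, while integers such as key need the architecture's little-endian order.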

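Symbolic execution

Our goal is a return value of 1, so we stop symbolic execution when a state reaches the return instruction and then constrain the eax register in that state, which constrains the return value.

success_address = base + 0x0000134C # the retn instruction
simulation.explore(find=success_address)

solution_state.add_constraints( solution_state.regs.eax != 0 ) # constrain eax to be non-zero (true) in this state, then solve for the flag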

15_angr_arbitrary_read

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/15_angr_arbitrary_read$ python3 generate.py 1234 15_angr_arbitrary_read
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/15_angr_arbitrary_read$ ./15_angr_arbitrary_read
Enter the password: aaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/15_angr_arbitrary_read$

Analysis

int __cdecl main(int argc, const char **argv, const char **envp)
{
  char v4; // [esp+Ch] [ebp-1Ch] BYREF
  char *s; // [esp+1Ch] [ebp-Ch]

  s = try_again;
  printf("Enter the password: ");
  __isoc99_scanf("%u %20s", &key, &v4);
  if ( key == 41217380 )
    puts(s);
  else
    puts(try_again);
  return 0;
}

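In this level we exploit a stack overflow to print "Good Job.": scanf reads up to 20 characters into the buffer at ebp-1Ch, which lies only 16 bytes below the pointer s at ebp-Ch, so the last 4 characters overwrite s.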
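The overflow can be sketched by hand, outside of Angr. The example assumes the values that appear elsewhere in this write-up: the "Good Job." string at 0x4858554B (the address used in the solve script) and the key 41217380, taken from the decompiled comparison and assumed to be compared directly against the input. Both depend on the seed passed to generate.py, and the filler byte 'A' is arbitrary.

```python
import struct

GOOD_JOB_ADDRESS = 0x4858554B  # address of "Good Job." used in the solve script
KEY = 41217380                 # constant compared against key in the decompiled main
BUFFER_SIZE = 16               # distance from the buffer to the overwritten pointer

def build_input(address=GOOD_JOB_ADDRESS, key=KEY):
    """Build the line typed at the prompt: '<key> <16 filler bytes><pointer>'."""
    pointer_bytes = struct.pack("<I", address)  # x86 stores integers little-endian
    string_part = b"A" * BUFFER_SIZE + pointer_bytes
    # scanf("%20s") stops at whitespace, so every byte must be a non-space character.
    assert all(0x21 <= b <= 0x7E for b in string_part), "payload is not typeable"
    return str(key).encode() + b" " + string_part

payload = build_input()
print(payload)
```

Because the four address bytes happen to be printable ASCII, the whole input can be typed at the prompt, which is also why the solve script can afford to constrain every character to the printable range.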

Solving with Angr

Code

# This binary takes both an integer and a string as a parameter. A certain
# integer input causes the program to reach a buffer overflow with which we can
# read a string from an arbitrary memory location. Our goal is to use Angr to
# search the program for this buffer overflow and then automatically generate
# an exploit to read the string "Good Job."
#
# What is the point of reading the string "Good Job."?
# This CTF attempts to replicate a simplified version of a possible vulnerability
# where a user can exploit the program to print a secret, such as a password or
# a private key. In order to keep consistency with the other challenges and to
# simplify the challenge, the goal of this program will be to print "Good Job."
# instead.
#
# The general strategy for crafting this script will be to:
# 1) Search for calls of the 'puts' function, which will eventually be exploited
#    to print out "Good Job."
# 2) Determine if the first parameter of 'puts', a pointer to the string to be
#    printed, can be controlled by the user to be set to the location of the
#    "Good Job." string.
# 3) Solve for the input that prints "Good Job."
#
# Note: The script is structured to implement step #2 before #1.

# Some of the source code for this challenge:
#
# #include <stdio.h>
# #include <stdlib.h>
# #include <string.h>
# #include <stdint.h>
#
# // This will all be in .rodata
# char msg[] = "${ description }$";
# char* try_again = "Try again.";
# char* good_job = "Good Job.";
# uint32_t key;
#
# void print_msg() {
#   printf("%s", msg);
# }
#
# uint32_t complex_function(uint32_t input) {
#   ...
# }
#
# struct overflow_me {
#   char buffer[16];  // overflowing this buffer reaches to_print; if the
#                     // overflow holds the address of "Good Job.", that
#                     // string is what finally gets printed
#   char* to_print;
# };
#
# int main(int argc, char* argv[]) {
#   struct overflow_me locals;
#   locals.to_print = try_again;  // initialised to the address of "Try again."
#
#   print_msg();
#
#   printf("Enter the password: ");
#   scanf("%u %20s", &key, locals.buffer);  // note the %20s: the 4 bytes past
#                                           // buffer[16] are where the address
#                                           // of "Good Job." goes
#
#   key = complex_function(key);
#
#   switch (key) {
#     case ?:
#       puts(try_again);
#       break;
#
#     ...
#
#     case ?:
#       // Our goal is to trick this call to puts to print the "secret
#       // password" (which happens, in our case, to be the string
#       // "Good Job.")
#       puts(locals.to_print);
#       break;
#
#     ...
#   }
#
#   return 0;
# }

import angr
import claripy
import sys

def main(argv):
    path_to_binary = argv[1]
    project = angr.Project(path_to_binary)

    # You can either use a blank state or an entry state; just make sure to start
    # at the beginning of the program.
    # (!)
    initial_state = project.factory.entry_state(
        add_options = {angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                       angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # Again, scanf needs to be replaced.
    class ReplacementScanf(angr.SimProcedure):
        # Hint: scanf("%u %20s")
        def run(self, format_string, scanf0_address, scanf1_address):
            # %u
            scanf0 = claripy.BVS('scanf0', 4 * 8)

            # %20s
            scanf1 = claripy.BVS('scanf1', 20 * 8)

            # The bitvector.chop(bits=n) function splits the bitvector into a Python
            # list containing the bitvector in segments of n bits each. In this case,
            # we are splitting them into segments of 8 bits (one byte.)
            for char in scanf1.chop(bits=8):
                # Ensure that each character in the string is printable. An interesting
                # experiment, once you have a working solution, would be to run the code
                # without constraining the characters to the printable range of ASCII.
                # Even though the solution will technically work without this, it's more
                # difficult to enter in a solution that contains character you can't
                # copy, paste, or type into your terminal or the web form that checks
                # your solution.
                # (!)
                self.state.add_constraints(char >= 0x21, char <= 126)

            # Warning: Endianness only applies to integers. If you store a string in
            # memory and treat it as a little-endian integer, it will be backwards.
            # key is a global variable, so its address could also be hard-coded
            # to inject the symbol:
            #scanf0_address = 0x48587030
            self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
            self.state.memory.store(scanf1_address, scanf1)

            self.state.globals['solution0'] = scanf0
            self.state.globals['solution1'] = scanf1

    # Hook scanf()
    scanf_symbol = '__isoc99_scanf'  # :string
    project.hook_symbol(scanf_symbol, ReplacementScanf())

    # We will call this whenever puts is called. The goal of this function is to
    # determine if the pointer passed to puts is controllable by the user, such
    # that we can rewrite it to point to the string "Good Job."
    def check_puts(state):
        # Recall that puts takes one parameter, a pointer to the string it will
        # print. If we load that pointer from memory, we can analyse it to determine
        # if it can be controlled by the user input in order to point it to the
        # location of the "Good Job." string.
        #
        # Treat the implementation of this function as if puts was just called.
        # The stack, registers, memory, etc should be set up as if the x86 call
        # instruction was just invoked (but, of course, the function hasn't copied
        # the buffers yet.)
        # The stack will look as follows:
        #            ...
        # esp + 7 -> /----------------\
        # esp + 6 -> |      puts      |
        # esp + 5 -> |    parameter   |  (the argument is pushed first)
        # esp + 4 -> \----------------/
        # esp + 3 -> /----------------\
        # esp + 2 -> |     return     |
        # esp + 1 -> |     address    |  (pushed by the call instruction)
        #     esp -> \----------------/
        #
        # Hint: Look at level 08, 09, or 10 to review how to load a value from a
        # memory address. Remember to use the correct endianness in the future when
        # loading integers; it has been included for you here.
        # (!)
        puts_parameter = state.memory.load(state.regs.esp + 4, 4, endness=project.arch.memory_endness)

        # The following function takes a bitvector as a parameter and checks if it
        # can take on more than one value. While this does not necessary tell us we
        # have found an exploitable state, it is a strong indication that the
        # bitvector we checked may be controllable by the user.
        # Use it to determine if the pointer passed to puts is symbolic.
        # (!)
        if state.solver.symbolic(puts_parameter):
            # Determine the location of the "Good Job." string. We want to print it
            # out, and we will do so by attempting to constrain the puts parameter to
            # equal it. Hint: use 'objdump -s <binary>' to look for the string's
            # address in .rodata.
            # (!)
            good_job_string_address = 0x4858554B  # :integer, probably hexadecimal

            # Create an expression that will test if puts_parameter equals
            # good_job_string_address. If we add this as a constraint to our solver,
            # it will try and find an input to make this expression true. Take a look
            # at level 08 to remind yourself of the syntax of this.
            # (!)
            is_vulnerable_expression = puts_parameter == good_job_string_address  # :boolean bitvector expression

            # Have Angr evaluate the state to determine if all the constraints can
            # be met, including the one we specified above. If it can be satisfied,
            # we have found our exploit!
            #
            # When doing this, however, we do not want to edit our state in case we
            # have not yet found what we are looking for. To test if our expression
            # is satisfiable without editing the original, we need to clone the state.
            copied_state = state.copy()

            # We can now play around with the copied state without changing the
            # original. We need to add our vulnerable expression as a state to test it.
            # Look at level 08 and compare this call to how it is called there.
            copied_state.add_constraints(is_vulnerable_expression)

            # Finally, we test if we can satisfy the constraints of the state.
            if copied_state.satisfiable():
                # Before we return, let's add the constraint to the solver for real,
                # instead of just querying whether the constraint _could_ be added.
                state.add_constraints(is_vulnerable_expression)
                return True
            else:
                return False
        else:  # not state.solver.symbolic(puts_parameter)
            return False

    simulation = project.factory.simgr(initial_state)

    # In order to determine if we have found a vulnerable call to 'puts', we need
    # to run the function check_puts (defined above) whenever we reach a 'puts'
    # call. To do this, we will look for the place where the instruction pointer,
    # state.addr, is equal to the beginning of the puts function.
    def is_successful(state):
        # We are looking for puts. Check that the address is at the (very) beginning
        # of the puts function. Warning: while, in theory, you could look for
        # any address in puts, if you execute any instruction that adjusts the stack
        # pointer, the stack diagram above will be incorrect. Therefore, it is
        # recommended that you check for the very beginning of puts.
        # (!)
        puts_address = 0x08049090
        if state.addr == puts_address:
            # Return True if we determine this call to puts is exploitable.
            return check_puts(state)
        else:
            # We have not yet found a call to puts; we should continue!
            return False

    simulation.explore(find=is_successful)

    if simulation.found:
        solution_state = simulation.found[0]

        stored_solutions0 = solution_state.globals['solution0']
        stored_solutions1 = solution_state.globals['solution1']
        solution0 = solution_state.solver.eval(stored_solutions0)
        solution1 = solution_state.solver.eval(stored_solutions1, cast_to = bytes)
        print(solution0)
        print(solution1)
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

Code with comments removed

import angr
import claripy
import sys

def main(argv):
    path_to_binary = argv[1]
    project = angr.Project(path_to_binary)

    initial_state = project.factory.entry_state(
        add_options = {angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                       angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    class ReplacementScanf(angr.SimProcedure):

        def run(self, format_string, scanf0_address, scanf1_address):
            scanf0 = claripy.BVS('scanf0', 4 * 8)

            scanf1 = claripy.BVS('scanf1', 20 * 8)

            for char in scanf1.chop(bits=8):
                self.state.add_constraints(char >= 0x21, char <= 126)

            self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
            self.state.memory.store(scanf1_address, scanf1)

            self.state.globals['solution0'] = scanf0
            self.state.globals['solution1'] = scanf1

    scanf_symbol = '__isoc99_scanf'  # :string
    project.hook_symbol(scanf_symbol, ReplacementScanf())

    def check_puts(state):
        puts_parameter = state.memory.load(state.regs.esp + 4, 4, endness=project.arch.memory_endness)

        if state.solver.symbolic(puts_parameter):
            good_job_string_address = 0x4858554B

            is_vulnerable_expression = puts_parameter == good_job_string_address

            copied_state = state.copy()

            copied_state.add_constraints(is_vulnerable_expression)

            if copied_state.satisfiable():

                state.add_constraints(is_vulnerable_expression)
                return True
            else:
                return False
        else:
            return False

    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        puts_address = 0x08049090
        if state.addr == puts_address:
            return check_puts(state)
        else:
            return False

    simulation.explore(find=is_successful)

    if simulation.found:
        solution_state = simulation.found[0]

        stored_solutions0 = solution_state.globals['solution0']
        stored_solutions1 = solution_state.globals['solution1']
        solution0 = solution_state.solver.eval(stored_solutions0)
        solution1 = solution_state.solver.eval(stored_solutions1, cast_to = bytes)
        print(solution0)
        print(solution1)
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

How the exploit search works

Hook scanf()

This level takes two inputs, so we first hook scanf:

class ReplacementScanf(angr.SimProcedure):

    def run(self, format_string, scanf0_address, scanf1_address):
        scanf0 = claripy.BVS('scanf0', 4 * 8)

        scanf1 = claripy.BVS('scanf1', 20 * 8)

        for char in scanf1.chop(bits=8):
            self.state.add_constraints(char >= 0x21, char <= 126)

        self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
        self.state.memory.store(scanf1_address, scanf1)

        self.state.globals['solution0'] = scanf0
        self.state.globals['solution1'] = scanf1

scanf_symbol = '__isoc99_scanf'  # :string
project.hook_symbol(scanf_symbol, ReplacementScanf())

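Note that we create two symbols here, scanf0 and scanf1, and store them in globals. They go into globals because the constraint solving for these symbols happens later, outside this class, so the symbolic values must be kept somewhere that is reachable from the state.

Define check_puts()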

def check_puts(state):
    puts_parameter = state.memory.load(state.regs.esp + 4, 4, endness=project.arch.memory_endness)

    if state.solver.symbolic(puts_parameter):
        good_job_string_address = 0x4858554B

        is_vulnerable_expression = puts_parameter == good_job_string_address

        copied_state = state.copy()

        copied_state.add_constraints(is_vulnerable_expression)

        if copied_state.satisfiable():

            state.add_constraints(is_vulnerable_expression)
            return True
        else:
            return False
    else:
        return False

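This function is called when a state reaches puts, that is, when symbolic execution has arrived at the very beginning of puts. That state is passed to check_puts.

In this state (just inside puts) we can read esp and from it obtain the argument, the pointer to the string to be printed.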
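The same lookup can be illustrated on a hand-built stack image, without Angr. The sketch assumes the 32-bit cdecl layout from the stack diagram in the script (return address at esp, first argument at esp + 4); the return address in the fake stack is made up, while 0x4858554B is the "Good Job." address used in this write-up.

```python
import struct

def read_first_argument(stack, esp):
    """Read the 4-byte little-endian value at esp + 4 (the first cdecl argument)."""
    (argument,) = struct.unpack_from("<I", stack, esp + 4)
    return argument

# Fake stack as it looks right after `call puts`:
#   [esp]     return address (hypothetical value)
#   [esp + 4] pointer passed to puts
stack = struct.pack("<II", 0x080492A0, 0x4858554B)
esp = 0

print(hex(read_first_argument(stack, esp)))
```

In the Angr script the equivalent read is state.memory.load(state.regs.esp + 4, 4, endness=...), which returns a bitvector that may be symbolic instead of a plain integer.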

Locate the target string

if state.solver.symbolic(puts_parameter):
    good_job_string_address = 0x4858554B

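Copy the state and test satisfiability

Adding a constraint modifies the state, so we first test on a copy whether the condition that gives the result we want can be satisfied.

is_vulnerable_expression = puts_parameter == good_job_string_address

copied_state = state.copy()

copied_state.add_constraints(is_vulnerable_expression)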

If it is satisfiable, add the constraint to the current state

if copied_state.satisfiable():
    state.add_constraints(is_vulnerable_expression)

Run the symbolic execution and solve the constraints

simulation = project.factory.simgr(initial_state)

def is_successful(state):
    puts_address = 0x08049090
    if state.addr == puts_address:
        return check_puts(state)
    else:
        return False

simulation.explore(find=is_successful)

if simulation.found:
    solution_state = simulation.found[0]

    stored_solutions0 = solution_state.globals['solution0']
    stored_solutions1 = solution_state.globals['solution1']
    solution0 = solution_state.solver.eval(stored_solutions0)
    solution1 = solution_state.solver.eval(stored_solutions1, cast_to = bytes)
    print(solution0)
    print(solution1)
else:
    raise Exception('Could not find the solution')

16_angr_arbitrary_write

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/16_angr_arbitrary_write$ python3 generate.py 1234 16_angr_arbitrary_write
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/16_angr_arbitrary_write$ ./16_angr_arbitrary_write
Enter the password: aaaaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/16_angr_arbitrary_write$

Analysis

int __cdecl main(int argc, const char **argv, const char **envp)
{
  char s[16]; // [esp+Ch] [ebp-1Ch] BYREF
  char *dest; // [esp+1Ch] [ebp-Ch]

  dest = unimportant_buffer;
  memset(s, 0, sizeof(s));
  strncpy(password_buffer, "PASSWORD", 0xCu);
  printf("Enter the password: ");
  __isoc99_scanf("%u %20s", &key, s);
  if ( key == 0xA95593 )
    strncpy(dest, s, 0x10u);
  else
    strncpy(unimportant_buffer, s, 0x10u);
  if ( !strncmp(password_buffer, "TWOAPNES", 8u) )
    puts("Good Job.");
  else
    puts("Try again.");
  return 0;
}

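As we can see, no input entered the intended way can make the program print "Good Job.", because our input is only ever copied into unimportant_buffer. We therefore rely on a stack overflow again.

Let's first look at the stack layout:

[Figure: stack layout of main (image.png)]

If the last four of our 20 input characters, the ones that overwrite dest, hold the address of password_buffer, and key equals 0xA95593 so that the first branch is taken, then strncpy(dest, s, 0x10u); actually writes into password_buffer. The second argument s is also under our control, because it is our input. So if the input starts with "TWOAPNES", the program prints Good Job.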
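A hand-built version of such an input, as a sketch only. The address of password_buffer does not appear in this write-up, so PASSWORD_BUFFER_ADDRESS below is a made-up placeholder that has to be replaced with the real address from the disassembler. The key is the constant 0xA95593 from the decompiled main, and the filler byte is arbitrary.

```python
import struct

KEY = 0xA95593                        # constant compared against key in main
REFERENCE = b"TWOAPNES"               # string that password_buffer must start with
PASSWORD_BUFFER_ADDRESS = 0x41424344  # hypothetical placeholder, not the real address
BUFFER_SIZE = 16                      # size of s; dest sits right behind it

def build_input(address=PASSWORD_BUFFER_ADDRESS):
    """Build '<key> <reference + filler><dest pointer>' for scanf("%u %20s")."""
    filler = b"A" * (BUFFER_SIZE - len(REFERENCE))
    pointer_bytes = struct.pack("<I", address)  # little-endian on x86
    return str(KEY).encode() + b" " + REFERENCE + filler + pointer_bytes

payload = build_input()
print(payload)
```

A real address only works when typed at the prompt if none of its four bytes is whitespace or a NUL byte, since scanf's %s would stop there.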

Solving with Angr

Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
# Essentially, the program does the following:
#
# scanf("%d %20s", &key, user_input);
# ...
#   // if certain unknown conditions are true...
#   strncpy(random_buffer, user_input);
# ...
# if (strncmp(secure_buffer, reference_string)) {
#   // The secure_buffer does not equal the reference string.
#   puts("Try again.");
# } else {
#   // The two are equal.
#   puts("Good Job.");
# }
#
# If this program has no bugs in it, it would _always_ print "Try again." since
# user_input copies into random_buffer, not secure_buffer.
#
# The question is: can we find a buffer overflow that will allow us to overwrite
# the random_buffer pointer to point to secure_buffer? (Spoiler: we can, but we
# will need to use Angr.)
#
# We want to identify a place in the binary, when strncpy is called, when we can:
#  1) Control the source contents (not the source pointer!)
#     * This will allow us to write arbitrary data to the destination.
#  2) Control the destination pointer
#     * This will allow us to write to an arbitrary location.
# If we can meet both of those requirements, we can write arbitrary data to an
# arbitrary location. Finally, we need to constrain the source contents to be
# equal to the reference_string and the destination pointer to be equal to the
# secure_buffer.

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # You can either use a blank state or an entry state; just make sure to start
  # at the beginning of the program.
  initial_state = project.factory.entry_state(
    add_options = {angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                   angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  class ReplacementScanf(angr.SimProcedure):
    # Hint: scanf("%u %20s")
    def run(self, format_string, scanf0_address, scanf1_address):
      # %u
      scanf0 = claripy.BVS('scanf0', 32)

      # %20s
      scanf1 = claripy.BVS('scanf1', 20 * 8)

      for char in scanf1.chop(bits=8):
        self.state.add_constraints(char >= 0x21, char <= 0x7A)

      # The destination addresses are passed in by the caller, so the
      # hard-coded override below is not needed (it is commented out in the
      # stripped-down version of this script as well).
      # scanf0_address = 0x4858554C
      self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
      # scanf1 is a string, so byte order does not matter here.
      self.state.memory.store(scanf1_address, scanf1)

      self.state.globals['solutions0'] = scanf0
      self.state.globals['solutions1'] = scanf1

  scanf_symbol = '__isoc99_scanf' # :string
  project.hook_symbol(scanf_symbol, ReplacementScanf())

  # In this challenge, we want to check strncpy to determine if we can control
  # both the source and the destination. It is common that we will be able to
  # control at least one of the parameters, (such as when the program copies a
  # string that it received via stdin).
  def check_strncpy(state):
    # The stack will look as follows:
    # ...          ________________
    # esp + 15 -> /                \
    # esp + 14 -> |     param2     |
    # esp + 13 -> |      len       |
    # esp + 12 -> \________________/
    # esp + 11 -> /                \
    # esp + 10 -> |     param1     |
    #  esp + 9 -> |      src       |
    #  esp + 8 -> \________________/
    #  esp + 7 -> /                \
    #  esp + 6 -> |     param0     |
    #  esp + 5 -> |      dest      |
    #  esp + 4 -> \________________/
    #  esp + 3 -> /                \
    #  esp + 2 -> |     return     |
    #  esp + 1 -> |     address    |
    #      esp -> \________________/
    # (!)
    # The size argument of memory.load is in bytes, not bits (I originally
    # got this wrong and treated the 4 as a bit count).
    strncpy_src = state.memory.load(state.regs.esp + 8, 4, endness = project.arch.memory_endness)
    strncpy_dest = state.memory.load(state.regs.esp + 4, 4, endness = project.arch.memory_endness)
    strncpy_len = state.memory.load(state.regs.esp + 12, 4, endness = project.arch.memory_endness)

    # We need to find out if src is symbolic, however, we care about the
    # contents, rather than the pointer itself. Therefore, we have to load the
    # contents of src to determine if they are symbolic.
    # Hint: How many bytes is strncpy copying?
    # (!)
    src_contents = state.memory.load(strncpy_src, strncpy_len)

    # Our goal is to determine if we can write arbitrary data to an arbitrary
    # location. This means determining if the source contents are symbolic
    # (arbitrary data) and the destination pointer is symbolic (arbitrary
    # destination).
    # (!)
    if state.solver.symbolic(strncpy_dest) and state.solver.symbolic(src_contents):
      # Use ltrace to determine the password. Decompile the binary to determine
      # the address of the buffer it checks the password against. Our goal is to
      # overwrite that buffer to store the password.
      # (!)
      password_string = 'TWOAPNES' # :string
      buffer_address = 0x4858553C # :integer, probably in hexadecimal

      # Create an expression that tests whether the first n bytes of the source
      # contents equal the password. Warning: while typical Python slices
      # (array[start:end]) will work with bitvectors, they are indexed in an
      # odd way. The ranges must start with a high value and end with a low
      # value. Additionally, the bits are indexed from right to left. For
      # example, let a bitvector, b, equal 'ABCDEFGH', (64 bits). The following
      # will read bit 0-7 (total of 1 byte) from the right-most bit (the end of
      # the string).
      #  b[7:0] == 'H'
      # To access the beginning of the string, we need to access the last 16
      # bits, or bits 48-63:
      #  b[63:48] == 'AB'
      # In this specific case, since we don't necessarily know the length of the
      # contents (unless you look at the binary), we can use the following:
      #  b[-1:-16] == 'AB', since, in Python, -1 is the end of the list, and -16
      # is the 16th element from the end of the list. The actual numbers should
      # correspond with the length of password_string.
      # (!)
      does_src_hold_password = src_contents[-1:-64] == password_string

      # Create an expression to check if the dest parameter can be set to
      # buffer_address. If this is true, then we have found our exploit!
      # (!)
      does_dest_equal_buffer_address = strncpy_dest == buffer_address

      # In the previous challenge, we copied the state, added constraints to the
      # copied state, and then determined if the constraints of the new state
      # were satisfiable. Since that pattern is so common, Angr implemented a
      # parameter 'extra_constraints' for the satisfiable function that does the
      # exact same thing. Note that we can pass multiple expressions to
      # extra_constraints.
      if state.satisfiable(extra_constraints=(does_src_hold_password, does_dest_equal_buffer_address)):
        state.add_constraints(does_src_hold_password, does_dest_equal_buffer_address)
        return True
      else:
        return False
    else: # not state.solver.symbolic(???)
      return False

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    strncpy_address = 0x080490F0
    if state.addr == strncpy_address:
      return check_strncpy(state)
    else:
      return False

  simulation.explore(find=is_successful)

  if simulation.found:
    solution_state = simulation.found[0]

    solution0 = solution_state.solver.eval(solution_state.globals['solutions0'])
    solution1 = solution_state.solver.eval(solution_state.globals['solutions1'], cast_to = bytes)
    print(solution0)
    print(solution1)

  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Code with the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  initial_state = project.factory.entry_state(
    add_options = {angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                   angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  class ReplacementScanf(angr.SimProcedure):
    def run(self, format_string, scanf0_address, scanf1_address):
      scanf0 = claripy.BVS('scanf0', 32)
      scanf1 = claripy.BVS('scanf1', 20 * 8)

      for char in scanf1.chop(bits=8):
        self.state.add_constraints(char >= 0x21, char <= 0x7A)

      #scanf0_address = 0x4858554C
      self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
      self.state.memory.store(scanf1_address, scanf1)

      self.state.globals['solutions0'] = scanf0
      self.state.globals['solutions1'] = scanf1

  scanf_symbol = '__isoc99_scanf' # :string
  project.hook_symbol(scanf_symbol, ReplacementScanf())

  def check_strncpy(state):
    # The size argument is in bytes (I originally mistook the 4 for a bit count).
    strncpy_src = state.memory.load(state.regs.esp + 8, 4, endness = project.arch.memory_endness)
    strncpy_dest = state.memory.load(state.regs.esp + 4, 4, endness = project.arch.memory_endness)
    strncpy_len = state.memory.load(state.regs.esp + 12, 4, endness = project.arch.memory_endness)

    src_contents = state.memory.load(strncpy_src, strncpy_len)

    if state.solver.symbolic(strncpy_dest) and state.solver.symbolic(src_contents):
      password_string = 'TWOAPNES' # :string
      buffer_address = 0x4858553C # :integer, probably in hexadecimal
      does_src_hold_password = src_contents[-1:-64] == password_string

      does_dest_equal_buffer_address = strncpy_dest == buffer_address

      if state.satisfiable(extra_constraints=(does_src_hold_password, does_dest_equal_buffer_address)):
        state.add_constraints(does_src_hold_password, does_dest_equal_buffer_address)
        return True
      else:
        return False
    else:
      return False

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    strncpy_address = 0x080490F0
    if state.addr == strncpy_address:
      return check_strncpy(state)
    else:
      return False

  simulation.explore(find=is_successful)

  if simulation.found:
    solution_state = simulation.found[0]

    solution0 = solution_state.solver.eval(solution_state.globals['solutions0'])
    solution1 = solution_state.solver.eval(solution_state.globals['solutions1'], cast_to = bytes)
    print(solution0)
    print(solution1)

  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Solution walkthrough:

1. Hook scanf and make both of its output arguments symbolic

The previous challenges already covered this step.

class ReplacementScanf(angr.SimProcedure):
  def run(self, format_string, scanf0_address, scanf1_address):
    scanf0 = claripy.BVS('scanf0', 32)
    scanf1 = claripy.BVS('scanf1', 20 * 8)

    for char in scanf1.chop(bits=8):
      self.state.add_constraints(char >= 0x21, char <= 0x7A)

    #scanf0_address = 0x4858554C
    self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
    self.state.memory.store(scanf1_address, scanf1)

    self.state.globals['solutions0'] = scanf0
    self.state.globals['solutions1'] = scanf1

scanf_symbol = '__isoc99_scanf' # :string
project.hook_symbol(scanf_symbol, ReplacementScanf())
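
The two store calls differ only in how they treat byte order. As a plain-Python illustration (no angr required), the sketch below shows why a 32-bit integer has to be written in the architecture's little-endian order while a string is stored byte for byte, and which characters the 0x21..0x7A constraint allows. The sample values are invented for the example.

```python
import struct

# Hypothetical sample values, chosen only for illustration.
key = 0x11223344
text = b"ABCD"

# An integer must be laid out in the CPU's byte order. x86 is little-endian,
# so the least significant byte comes first in memory.
little = struct.pack("<I", key)
big = struct.pack(">I", key)

print(little.hex())  # 44332211
print(big.hex())     # 11223344

# A string is already a sequence of bytes in memory order, so there is
# nothing to swap: byte 0 of the string sits at the lowest address.
print(text.hex())    # 41424344

# The hook restricts every input character to the range 0x21..0x7A.
allowed = [chr(code) for code in range(0x21, 0x7A + 1)]
print(allowed[0], allowed[-1], len(allowed))  # ! z 90
```

This is why the integer store passes endness=project.arch.memory_endness and the string store does not.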

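2. Define check_strncpy(state)

Before defining it, look further down at the place where this function is called.

Whenever execution reaches strncpy, is_successful calls check_strncpy, so the state passed to check_strncpy is the state right at the entry of strncpy. At that point esp still points at the return address, with the arguments directly above it.

def is_successful(state):
  strncpy_address = 0x080490F0
  if state.addr == strncpy_address:
    return check_strncpy(state)
  else:
    return False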

3. Read esp from the current state and use it to fetch the argument values

# The size argument is in bytes (I originally mistook the 4 for a bit count).
strncpy_src = state.memory.load(state.regs.esp + 8, 4, endness = project.arch.memory_endness)
strncpy_dest = state.memory.load(state.regs.esp + 4, 4, endness = project.arch.memory_endness)
strncpy_len = state.memory.load(state.regs.esp + 12, 4, endness = project.arch.memory_endness)

These load the source string address, the destination buffer address, and the length, respectively.
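
To make the offsets concrete, here is a small plain-Python model of the 32-bit cdecl stack at the first instruction of strncpy. It is only a sketch: the helper read_u32 and every value on the toy stack are invented for illustration and are not taken from the binary.

```python
import struct

def read_u32(memory: bytes, offset: int) -> int:
    """Read a little-endian 32-bit value, like a 4-byte load on x86."""
    return struct.unpack_from("<I", memory, offset)[0]

# Hypothetical stack contents on entry to strncpy (all values invented).
stack = struct.pack(
    "<IIII",
    0x08049999,  # [esp]      return address
    0x11110000,  # [esp + 4]  param0: dest
    0x22220000,  # [esp + 8]  param1: src
    0x10,        # [esp + 12] param2: len
)

esp = 0  # offset of the stack pointer inside the toy buffer
dest = read_u32(stack, esp + 4)
src = read_u32(stack, esp + 8)
length = read_u32(stack, esp + 12)

print(hex(dest), hex(src), length)  # 0x11110000 0x22220000 16
```

The three memory.load calls do the same thing on the symbolic state: each reads 4 bytes at a fixed offset from esp and interprets them in little-endian order.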

4. Load the contents of src (that is, the input we use for the overflow)

src_contents = state.memory.load(strncpy_src, strncpy_len)

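5. Check for symbolic values (that is, check whether the overflow happened)

If we have hit the vulnerability, the destination pointer has been overwritten by our overflowing input. That input was made symbolic earlier, so the pointer must now be symbolic as well. Likewise, the source contents are our scanf1 and should also be symbolic. These two properties together indicate that we have successfully triggered the overflow.

if state.solver.symbolic(strncpy_dest) and state.solver.symbolic(src_contents):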

6. Add further constraints

Once we have confirmed that the overflow happened, we constrain the state further so that the program ends up printing "Good Job".

Two constraints are added:

  • Constrain the contents of src (our scanf1) to be: TWOAPNES
  • Constrain the dest pointer (the part overwritten by the overflow) to be: the address of secure_buffer

password_string = 'TWOAPNES' # :string
buffer_address = 0x4858553C # :integer, probably in hexadecimal
does_src_hold_password = src_contents[-1:-64] == password_string

does_dest_equal_buffer_address = strncpy_dest == buffer_address
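
The slice src_contents[-1:-64] is the least obvious part. The sketch below mimics the indexing rule described in the scaffold comments (bit 0 is the right-most bit, ranges run from high to low) using ordinary Python integers. The helper bits and the 16-byte sample buffer are assumptions made for illustration; this is not how claripy is implemented.

```python
def bits(value: int, high: int, low: int) -> int:
    """Return bits high..low inclusive, where bit 0 is the least significant."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)

data = b"ABCDEFGH"               # 8 bytes = 64 bits
b = int.from_bytes(data, "big")  # the first character lands in the top bits

print(bits(b, 7, 0).to_bytes(1, "big"))    # b'H'
print(bits(b, 63, 48).to_bytes(2, "big"))  # b'AB'

# A negative index counts from the top bit, so [-1:-64] means
# "the highest 64 bits", i.e. the first 8 bytes of the buffer.
buffer = b"TWOAPNES" + b"\x00" * 8         # hypothetical 16-byte source contents
value = int.from_bytes(buffer, "big")
total_bits = len(buffer) * 8
first_eight = bits(value, total_bits - 1, total_bits - 64).to_bytes(8, "big")

print(first_eight)                         # b'TWOAPNES'
```

Because the password is 8 characters long, 64 bits is exactly the part of the source that has to match.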

7. Check whether the constraints can be satisfied

See the description in the previous challenge.

if state.satisfiable(extra_constraints=(does_src_hold_password, does_dest_equal_buffer_address)):
  state.add_constraints(does_src_hold_password, does_dest_equal_buffer_address)

17_angr_arbitrary_jump

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/17_angr_arbitrary_jump$ python3 generate.py 1234 17_angr_arbitrary_jump
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/17_angr_arbitrary_jump$ ./17_angr_arbitrary_jump
Enter the password: aaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/17_angr_arbitrary_jump$

Analysis

int __cdecl main(int argc, const char **argv, const char **envp)
{
  printf("Enter the password: ");
  read_input();
  puts("Try again.");
  return 0;
}

The main function contains nothing related to "Good Job". Searching the strings turns up a print_good() function:

void __noreturn print_good()
{
  puts("Good Job.");
  exit(0);
}

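However, nothing references this function. This challenge requires us to construct the jump to it ourselves.

In the stack frame of read_input(), the return address sits just above the buffer v1, so overflowing v1 and overwriting the return address lets us reach print_good().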
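
The idea can be modeled in plain Python with a byte array standing in for the stack frame. This is a toy sketch under stated assumptions: BUFFER_SIZE, the saved-ebp slot, and the original return address are invented, and only 0x48585558 (the print_good address used in the solve script) comes from this write-up.

```python
import struct

BUFFER_SIZE = 16  # hypothetical size of the local buffer, not the real one

def make_frame(return_address: int) -> bytearray:
    """Model a frame as: buffer | saved ebp | return address."""
    saved = struct.pack("<II", 0xBFFF0000, return_address)
    return bytearray(BUFFER_SIZE) + bytearray(saved)

def unchecked_read(frame: bytearray, data: bytes) -> None:
    """Copy the input to the start of the buffer without a length check."""
    frame[0:len(data)] = data

def saved_return_address(frame: bytearray) -> int:
    return struct.unpack_from("<I", frame, BUFFER_SIZE + 4)[0]

frame = make_frame(0x08049300)  # invented legitimate return address
payload = b"A" * (BUFFER_SIZE + 4) + struct.pack("<I", 0x48585558)
unchecked_read(frame, payload)

print(hex(saved_return_address(frame)))  # 0x48585558
```

When the function returns, execution continues at whatever value now occupies the return-address slot, which is exactly what the Angr script later constrains eip to.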

Solving with Angr

Code

# An unconstrained state occurs when there are too many
# possible branches from a single instruction. This occurs, among other ways,
# when the instruction pointer (on x86, eip) is completely symbolic, meaning
# that user input can control the address of code the computer executes.
# For example, imagine the following pseudo assembly:
#
# mov user_input, eax
# jmp eax
#
# The value of what the user entered dictates the next instruction. This
# is an unconstrained state. It wouldn't usually make sense for the execution
# engine to continue. (Where should the program jump to if eax could be
# anything?) Normally, when Angr encounters an unconstrained state, it throws
# it out. In our case, we want to exploit the unconstrained state to jump to
# a location of our choosing. We will get to how to disable Angr's default
# behavior later.
#
# This challenge represents a classic stack-based buffer overflow attack to
# overwrite the return address and jump to a function that prints "Good Job."
# Our strategy for solving the challenge is as follows:
# 1. Initialize the simulation and ask Angr to record unconstrained states.
# 2. Step through the simulation until we have found a state where eip is
#    symbolic.
# 3. Constrain eip to equal the address of the "print_good" function.


import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Make a symbolic input that has a decent size to trigger overflow
  # (!)
  symbolic_input = claripy.BVS("input", 8 * 59)

  # Create initial state and set stdin to the symbolic input
  initial_state = project.factory.entry_state(
    stdin=symbolic_input,  # important: feed the symbolic input through stdin
    add_options = {
      angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
      angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS
    }
  )

  # Ensure that every byte of input is within the acceptable ASCII range (A..Z)
  # (!)
  for byte in symbolic_input.chop(bits=8):
    initial_state.add_constraints(
      claripy.And(
        byte >= 'A',
        byte <= 'Z'
      )
    )

  # The save_unconstrained=True parameter specifies to Angr to not throw out
  # unconstrained states. Instead, it will move them to the list called
  # 'simulation.unconstrained'. Additionally, we will be using a few stashes
  # that are not included by default, such as 'found' and 'not_needed'. You will
  # see how these are used later.
  # (!)
  simulation = project.factory.simgr(
    initial_state,
    save_unconstrained=True,
    stashes={
      'active' : [initial_state],
      'unconstrained' : [],
      'found' : [],
      'not_needed' : []
    }
  )

  # Explore will not work for us, since the method specified with the 'find'
  # parameter will not be called on an unconstrained state. Instead, we want to
  # explore the binary ourselves. To get started, construct an exit condition
  # to know when the simulation has found a solution. We will later move
  # states from the unconstrained list to the simulation.found list.
  # Create a boolean value that indicates a state has been found.
  def has_found_solution():
    return simulation.found

  # An unconstrained state occurs when there are too many possible branches
  # from a single instruction (for example, a jmp whose target could be almost
  # any address). This occurs, among other ways, when the instruction pointer
  # (on x86, eip) is completely symbolic, meaning that user input can control
  # the address of code the computer executes.
  # For example, imagine the following pseudo assembly (AT&T syntax):
  #
  # mov user_input, eax
  # jmp eax
  #
  # The value of what the user entered dictates the next instruction. This
  # is an unconstrained state. It wouldn't usually make sense for the execution
  # engine to continue. (Where should the program jump to if eax could be
  # anything?) Normally, when Angr encounters an unconstrained state, it throws
  # it out. In our case, we want to exploit the unconstrained state to jump to
  # a location of our choosing. Check if there are still unconstrained states
  # by examining the simulation.unconstrained list.
  # (!)
  def has_unconstrained_to_check():
    return simulation.unconstrained

  # The list simulation.active is a list of all states that can be explored
  # further.
  # (!)
  def has_active():
    return simulation.active

  while (has_active() or has_unconstrained_to_check()) and (not has_found_solution()):
    for unconstrained_state in simulation.unconstrained:
      def should_move(s):
        return s is unconstrained_state
      # Look for unconstrained states and move them to the 'found' stash.
      # A 'stash' should be a string that corresponds to a list that stores
      # all the states that the state group keeps. Values include:
      #  'active' = states that can be stepped
      #  'deadended' = states that have exited the program
      #  'errored' = states that encountered an error with Angr
      #  'unconstrained' = states that are unconstrained
      #  'found' = solutions
      #  anything else = whatever you want, perhaps you want a 'not_needed',
      #                  you can call it whatever you want
      # (!)
      simulation.move('unconstrained', 'found', filter_func = should_move)

    # Advance the simulation.
    simulation.step()

  if simulation.found:
    solution_state = simulation.found[0]

    # Constrain the instruction pointer to target the print_good function and
    # then solve for the user input (recall that this is
    # 'solution_state.posix.dumps(sys.stdin.fileno())')
    # (!)
    solution_state.add_constraints(solution_state.regs.eip == 0x48585558)

    solution = solution_state.posix.dumps(sys.stdin.fileno()).decode()
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Code with the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  symbolic_input = claripy.BVS("input", 8 * 59)

  initial_state = project.factory.entry_state(
    stdin=symbolic_input,
    add_options = {
      angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
      angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS
    }
  )

  for byte in symbolic_input.chop(bits=8):
    initial_state.add_constraints(
      claripy.And(
        byte >= 'A',
        byte <= 'Z'
      )
    )

  simulation = project.factory.simgr(
    initial_state,
    save_unconstrained=True,
    stashes={
      'active' : [initial_state],
      'unconstrained' : [],
      'found' : [],
      'not_needed' : []
    }
  )

  def has_found_solution():
    return simulation.found

  def has_unconstrained_to_check():
    return simulation.unconstrained

  def has_active():
    return simulation.active

  while (has_active() or has_unconstrained_to_check()) and (not has_found_solution()):
    for unconstrained_state in simulation.unconstrained:
      def should_move(s):
        return s is unconstrained_state

      simulation.move('unconstrained', 'found', filter_func = should_move)

    simulation.step()

  if simulation.found:
    solution_state = simulation.found[0]

    solution_state.add_constraints(solution_state.regs.eip == 0x48585558)

    solution = solution_state.posix.dumps(sys.stdin.fileno()).decode()
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)
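
The stash handling in the loop can be pictured with a toy model in plain Python. This is only a sketch of the bookkeeping, not angr's implementation: the move helper, the Stashes type, and the string "states" are all invented for illustration.

```python
from typing import Callable, Dict, List

Stashes = Dict[str, List[str]]

def move(stashes: Stashes, from_stash: str, to_stash: str,
         filter_func: Callable[[str], bool]) -> None:
    """Move every state for which filter_func is true; keep the rest in place."""
    keep: List[str] = []
    moved: List[str] = []
    for state in stashes[from_stash]:
        (moved if filter_func(state) else keep).append(state)
    stashes[from_stash] = keep
    stashes[to_stash].extend(moved)

# States are plain strings here; the names are made up for the example.
stashes: Stashes = {"active": ["s1"], "unconstrained": ["u1", "u2"], "found": []}

move(stashes, "unconstrained", "found", lambda state: state == "u1")
print(stashes)
# {'active': ['s1'], 'unconstrained': ['u2'], 'found': ['u1']}
```

In the solve script the filter selects one specific unconstrained state by identity, so each pass of the inner loop promotes exactly that state to 'found'.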

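One odd thing: in my experiments the symbolic input had to be at least 59 bytes long, even though the overflow itself only needs 52 bytes.

The characters at the end of the resulting output are also useless:

AAAABAABBAAAABAABBBABBBBBBBABBBAAABBAABABABBABBBXUXHBABAABB

Only XUXH is the return address we overwrite. The trailing BABAABB has no effect, yet the script finds no solution unless the symbol is at least 59 bytes long.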
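
The numbers above can be checked directly from the printed solution. The short sketch below packs the print_good address used in the script (0x48585558) in little-endian order and locates it inside the solution string; nothing here depends on angr.

```python
import struct

solution = "AAAABAABBAAAABAABBBABBBBBBBABBBAAABBAABABABBABBBXUXHBABAABB"
print_good_address = 0x48585558  # the value eip is constrained to in the script

# On x86 the address is stored least significant byte first.
target = struct.pack("<I", print_good_address)
offset = solution.encode().find(target)

print(target)         # b'XUXH'
print(offset)         # 48
print(offset + 4)     # 52
print(len(solution))  # 59
```

So the return address starts after 48 bytes of padding and ends at byte 52, and the remaining 7 bytes are leftover symbolic input.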

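Commonly used functions in Angr_Lab

angr.Project(path_to_binary)

Creates an Angr project. The path_to_binary argument is the path to the binary file.

Usage example

path_to_binary = argv[1] # take the file path from the command-line argument
project = angr.Project(path_to_binary) # create an Angr project for that binary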

project.factory.entry_state(add_options = {})

Sets the initial execution state, telling Angr to start at the program's entry point (from which execution then reaches main).

Usage example

initial_state = project.factory.entry_state(
  add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                  angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)

project.factory.blank_state(addr = , add_options = {})

Sets the initial execution state, telling Angr to start executing at the given address.

It is easy to confuse this with the previous function: one is entry_state, the other is blank_state.

Usage example

initial_state = project.factory.blank_state(
  addr=start_address,
  add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                  angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)

claripy.BVS('name', num)

Creates a symbolic bitvector. The first argument is the symbol name and the second is the length of the bitvector in bits.

Usage example

password_size_bits = 32
password0 = claripy.BVS('password0', password_size_bits)
password1 = claripy.BVS('password1', password_size_bits)

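simulation.explore(find= , avoid=)

Runs symbolic execution. find describes the paths to accept and avoid describes the paths to reject.

Both parameters accept several kinds of values.

Using an address as the argument

With addresses, find and avoid each take an address that marks the path to accept or avoid.

print_good_address = 0x080492F3
simulation.explore(find=print_good_address)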

Using a callback function as the argument

A callback must return True or False rather than an address.

def is_successful(state):
  stdout_output = state.posix.dumps(sys.stdout.fileno())

  return "Good Job".encode() in stdout_output # True if "Good Job." appears in standard output

def should_abort(state):
  stdout_output = state.posix.dumps(sys.stdout.fileno())

  return "Try again".encode() in stdout_output # True if "Try again." appears in standard output

simulation.explore(find=is_successful, avoid=should_abort)

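solution_state.posix.dumps(sys.stdin.fileno())

When relying on Angr's default symbolic stdin, the solved input is read back from the stdin file descriptor, sys.stdin.fileno().

Usage example

print(solution_state.posix.dumps(sys.stdin.fileno()).decode()) # decode to a str and print it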

solution_state.add_constraints( solution_state.xx)

Adds constraints. Note that they must be added to solution_state (I mistakenly used initial_state before).

Usage example

solution_state = simulation.found[0]
solution_state.add_constraints( solution_state.regs.eax != 0 )
solution = solution_state.solver.eval(password, cast_to = bytes)

solution_state.solver.eval(symbol)

When we create symbols ourselves, we use this function to solve for a concrete value of the symbol under the state's constraints.

Usage example

Integers

if simulation.found:
  solution_state = simulation.found[0]

  solution0 = solution_state.solver.eval(password0)
  solution1 = solution_state.solver.eval(password1)
  solution2 = solution_state.solver.eval(password2)

  # :string, convert to hexadecimal here because scanf uses %x
  solution = str(hex(solution0)) + ' ' + str(hex(solution1)) + ' ' + str(hex(solution2))
  print(solution)
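
As a side note on the formatting step, hex() already returns a string, so the str() wrapper is redundant, and its "0x" prefix is acceptable to scanf's %x conversion. The sketch below shows two equivalent ways to build the input line; the three integers are invented stand-ins for the values solver.eval would return.

```python
# Hypothetical solver results, standing in for solver.eval(...) return values.
solutions = (0xB9FFD04E, 0xCCF63FE8, 0x8FD4D959)

with_prefix = " ".join(hex(value) for value in solutions)
without_prefix = " ".join(format(value, "x") for value in solutions)

print(with_prefix)     # 0xb9ffd04e 0xccf63fe8 0x8fd4d959
print(without_prefix)  # b9ffd04e ccf63fe8 8fd4d959

# Round-trip check: base-16 parsing accepts both spellings.
parsed = [int(token, 16) for token in with_prefix.split()]
print(parsed == list(solutions))  # True
```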

Strings

solution = solution_state.solver.eval(password,cast_to=bytes)
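
Without cast_to, eval returns a Python integer; with cast_to=bytes it returns the same value as a byte string. Conceptually this corresponds to a big-endian conversion over the bitvector's width, which the plain-Python sketch below imitates. The 64-bit sample value is invented for illustration and the snippet does not call angr.

```python
# Hypothetical 64-bit value, as eval() might return for an 8-character symbol.
value = 0x54574F41504E4553
width_bits = 64

# Big-endian: the most significant byte becomes the first character.
as_bytes = value.to_bytes(width_bits // 8, "big")

print(as_bytes)           # b'TWOAPNES'
print(as_bytes.decode())  # TWOAPNES
```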
Author: LamのCrow
Article link: http://example.com/2022/05/18/AngrLab2c872/
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit LamのCrow when reposting.