Angr_Lab

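Lab Setup

Compiling

Each level has its own folder containing a file named like 00_angr_find.c.jinja. This is a source template that has not been compiled yet, so we have to build it ourselves. Angr_Lab provides the script generate.py to generate and compile it.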

python generate.py 1234 00_angr_find

Here 1234 is an arbitrary random number that we supply, and the last argument is the name of the compiled output file.

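Running

Once we have the executable, we can try running it first.

Run the generated executable on WSL (it is a 32-bit program, so the 32-bit runtime libraries must be installed, and WSL2 is required).

The program asks for a flag: a correct flag prints "Good Job." and a wrong one prints "Try again."

Analysis

Use IDA to analyze the program logic and find things such as the address of the accepted execution path.

Solving with Angr

The same folder contains a scaffold00.py file with the basic skeleton of an Angr solution, along with many hints and explanations. We solve the level by following them.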

00_angr_find

After building the executable, we start by analyzing it.

main()

int __cdecl main(int argc, const char **argv, const char **envp)
{
  int i; // [esp+1Ch] [ebp-1Ch]
  char s1[9]; // [esp+23h] [ebp-15h] BYREF
  unsigned int v6; // [esp+2Ch] [ebp-Ch]

  v6 = __readgsdword(0x14u);
  printf("Enter the password: ");
  __isoc99_scanf("%8s", s1);
  for ( i = 0; i <= 7; ++i )
    s1[i] = complex_function(s1[i], i);
  if ( !strcmp(s1, "HXUITWOA") )
    puts("Good Job.");
  else
    puts("Try again.");
  return 0;
}

Having found the encryption function, we step into it.

complex_function(int a1, int a2)

int __cdecl complex_function(int a1, int a2)
{
  if ( a1 <= 64 || a1 > 90 )                    // reject anything that is not an uppercase letter 'A'..'Z'
  {
    puts("Try again.");
    exit(1);
  }
  return (3 * a2 + a1 - 65) % 26 + 65;          // shift the letter by 3 * a2 positions, wrapping within 'A'..'Z'
}
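
Because complex_function is a plain position-dependent shift, it can also be inverted by hand, which gives a useful cross-check for whatever Angr reports later. The following is a minimal sketch in Python, assuming the target string "HXUITWOA" from the decompiled main() above; the names recover and invert are illustrative and do not come from the lab.

```python
TARGET = "HXUITWOA"  # comparison string taken from the decompiled main()


def complex_function(ch, i):
    """Python model of the decompiled complex_function(a1, a2)."""
    code = ord(ch)
    if code <= 64 or code > 90:
        raise ValueError("Try again.")  # the binary prints this and exits
    return chr((3 * i + code - 65) % 26 + 65)


def invert(ch, i):
    """Undo the shift applied to the character at index i."""
    return chr((ord(ch) - 65 - 3 * i) % 26 + 65)


def recover(target):
    """Recover the input that complex_function maps onto target."""
    return "".join(invert(ch, i) for i, ch in enumerate(target))


if __name__ == "__main__":
    password = recover(TARGET)
    # Round trip: encrypting the recovered password must give the target back.
    assert "".join(complex_function(c, i) for i, c in enumerate(password)) == TARGET
    print(password)
```

The value of TARGET depends on the seed passed to generate.py, so a binary built with a different seed will have a different string.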

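Finding the address of "Good Job" in the disassembly

The instruction that finally prints the "Good Job" string is in main(), so we look at the disassembly of main().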

; int __cdecl main(int argc, const char **argv, const char **envp)
.text:08049242                 public main
.text:08049242 main            proc near               ; DATA XREF: _start+2A↑o
.text:08049242
.text:08049242 var_2C          = dword ptr -2Ch
.text:08049242 var_1C          = dword ptr -1Ch
.text:08049242 s1              = byte ptr -15h
.text:08049242 var_C           = dword ptr -0Ch
.text:08049242 var_4           = dword ptr -4
.text:08049242 argc            = dword ptr  8
.text:08049242 argv            = dword ptr  0Ch
.text:08049242 envp            = dword ptr  10h
.text:08049242
.text:08049242 ; __unwind {
.text:08049242                 lea     ecx, [esp+4]
.text:08049246                 and     esp, 0FFFFFFF0h
.text:08049249                 push    dword ptr [ecx-4]
.text:0804924C                 push    ebp
.text:0804924D                 mov     ebp, esp
.text:0804924F                 push    ecx
.text:08049250                 sub     esp, 34h
.text:08049253                 mov     eax, ecx
.text:08049255                 mov     eax, [eax+4]
.text:08049258                 mov     [ebp+var_2C], eax
.text:0804925B                 mov     eax, large gs:14h
.text:08049261                 mov     [ebp+var_C], eax
.text:08049264                 xor     eax, eax
.text:08049266                 sub     esp, 0Ch
.text:08049269                 push    offset aEnterThePasswo ; "Enter the password: "
.text:0804926E                 call    _printf
.text:08049273                 add     esp, 10h
.text:08049276                 sub     esp, 8
.text:08049279                 lea     eax, [ebp+s1]
.text:0804927C                 push    eax
.text:0804927D                 push    offset a8s      ; "%8s"
.text:08049282                 call    ___isoc99_scanf
.text:08049287                 add     esp, 10h
.text:0804928A                 mov     [ebp+var_1C], 0
.text:08049291                 jmp     short loc_80492C0
.text:08049293 ; ---------------------------------------------------------------------------
.text:08049293
.text:08049293 loc_8049293:                            ; CODE XREF: main+82↓j
.text:08049293                 lea     edx, [ebp+s1]
.text:08049296                 mov     eax, [ebp+var_1C]
.text:08049299                 add     eax, edx
.text:0804929B                 movzx   eax, byte ptr [eax]
.text:0804929E                 movsx   eax, al
.text:080492A1                 sub     esp, 8
.text:080492A4                 push    [ebp+var_1C]
.text:080492A7                 push    eax
.text:080492A8                 call    complex_function
.text:080492AD                 add     esp, 10h
.text:080492B0                 mov     ecx, eax
.text:080492B2                 lea     edx, [ebp+s1]
.text:080492B5                 mov     eax, [ebp+var_1C]
.text:080492B8                 add     eax, edx
.text:080492BA                 mov     [eax], cl
.text:080492BC                 add     [ebp+var_1C], 1
.text:080492C0
.text:080492C0 loc_80492C0:                            ; CODE XREF: main+4F↑j
.text:080492C0                 cmp     [ebp+var_1C], 7
.text:080492C4                 jle     short loc_8049293
.text:080492C6                 sub     esp, 8
.text:080492C9                 push    offset s2       ; "HXUITWOA"
.text:080492CE                 lea     eax, [ebp+s1]
.text:080492D1                 push    eax             ; s1
.text:080492D2                 call    _strcmp
.text:080492D7                 add     esp, 10h
.text:080492DA                 test    eax, eax
.text:080492DC                 jz      short loc_80492F0
.text:080492DE                 sub     esp, 0Ch
.text:080492E1                 push    offset s        ; "Try again."
.text:080492E6                 call    _puts
.text:080492EB                 add     esp, 10h
.text:080492EE                 jmp     short loc_8049300
.text:080492F0 ; ---------------------------------------------------------------------------
.text:080492F0
.text:080492F0 loc_80492F0:                            ; CODE XREF: main+9A↑j
.text:080492F0                 sub     esp, 0Ch
.text:080492F3                 push    offset aGoodJob ; "Good Job."
.text:080492F8                 call    _puts
.text:080492FD                 add     esp, 10h
.text:08049300
.text:08049300 loc_8049300:                            ; CODE XREF: main+AC↑j
.text:08049300                 mov     eax, 0
.text:08049305                 mov     ecx, [ebp+var_C]
.text:08049308                 xor     ecx, large gs:14h
.text:0804930F                 jz      short loc_8049316
.text:08049311                 call    ___stack_chk_fail
.text:08049316 ; ---------------------------------------------------------------------------
.text:08049316
.text:08049316 loc_8049316:                            ; CODE XREF: main+CD↑j
.text:08049316                 mov     ecx, [ebp+var_4]
.text:08049319                 leave
.text:0804931A                 lea     esp, [ecx-4]
.text:0804931D                 retn
.text:0804931D ; } // starts at 8049242
.text:0804931D main            endp
.text:0804931D
.text:0804931D ; ---------------------------------------------------------------------------

What we are looking for is: .text:080492F3 push offset aGoodJob ; "Good Job."
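
When the listing is long, the address can also be pulled out of a copied IDA listing instead of being read off by eye. This is only a small convenience sketch: the helper address_of is hypothetical, and it assumes lines in the `.text:XXXXXXXX` format shown above.

```python
import re

# Three lines copied from the main() listing above.
LISTING = """\
.text:080492F0                 sub     esp, 0Ch
.text:080492F3                 push    offset aGoodJob ; "Good Job."
.text:080492F8                 call    _puts
"""


def address_of(listing, needle):
    """Return the address of the first listing line that contains needle."""
    for line in listing.splitlines():
        match = re.match(r"\.text:([0-9A-Fa-f]{8})\s", line)
        if match and needle in line:
            return int(match.group(1), 16)
    return None


if __name__ == "__main__":
    print(hex(address_of(LISTING, "aGoodJob")))
```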

Solving with Angr

Now that we understand the overall logic of the program, we open scaffold00.py and solve it with Angr.

Code

# Before you begin, here are a few notes about these capture-the-flag
# challenges.
#
# Each binary, when run, will ask for a password, which can be entered via stdin
# (typing it into the console.) Many of the levels will accept many different
# passwords. Your goal is to find a single password that works for each binary.
#
# If you enter an incorrect password, the program will print "Try again." If you
# enter a correct password, the program will print "Good Job."
#
# Each challenge will be accompanied by a file like this one, named
# "scaffoldXX.py". It will offer guidance as well as the skeleton of a possible
# solution. You will have to edit each file. In some cases, you will have to
# edit it significantly. While use of these files is recommended, you can write
# a solution without them, if you find that they are too restrictive.
#
# Places in the scaffoldXX.py that require a simple substitution will be marked
# with three question marks (???). Places that require more code will be marked
# with an ellipsis (...). Comments will document any new concepts, but will be
# omitted for concepts that have already been covered (you will need to use
# previous scaffoldXX.py files as a reference to solve the challenges.) If a
# comment documents a part of the code that needs to be changed, it will be
# marked with an exclamation point at the end, on a separate line (!).

import angr
import sys

def main(argv):
  # Create an Angr project.
  # If you want to be able to point to the binary from the command line, you can
  # use argv[1] as the parameter. Then, you can run the script from the command
  # line as follows:
  # python ./scaffold00.py [binary]
  # In other words, pass the binary to solve as the first command-line argument;
  # it is then picked up by path_to_binary = argv[1].
  # (!)
  path_to_binary = argv[1]  # :string
  project = angr.Project(path_to_binary)  # create the Angr project from a file path

  # Tell Angr where to start executing (should it start from the main()
  # function or somewhere else?) For now, use the entry_state function
  # to instruct Angr to start from the main() function.
  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,  # fill unconstrained memory with symbolic values
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}  # fill unconstrained registers with symbolic values
  )

  # Create a simulation manager initialized with the starting state. It provides
  # a number of useful tools to search and execute the binary.
  simulation = project.factory.simgr(initial_state)

  # Explore the binary to attempt to find the address that prints "Good Job."
  # You will have to find the address you want to find and insert it here. 
  # This function will keep executing until it either finds a solution or it 
  # has explored every possible path through the executable.
  # (!) the address below must be changed
  print_good_address = 0x080492F3  # :integer (probably in hexadecimal)
  simulation.explore(find=print_good_address)

  # Check that we have found a solution. The simulation.explore() method will
  # set simulation.found to a list of the states that it could find that reach
  # the instruction we asked it to search for. Remember, in Python, if a list
  # is empty, it will be evaluated as false, otherwise true.
  if simulation.found:  # explore() fills simulation.found with the states that reached the target
    # The explore method stops after it finds a single state that arrives at the
    # target address.
    solution_state = simulation.found[0]

    # Print the string that Angr wrote to stdin to follow solution_state. This 
    # is our solution.
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    # If Angr could not find a path that reaches print_good_address, throw an
    # error. Perhaps you mistyped the print_good_address?
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

With the comments removed:

import angr
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  simulation = project.factory.simgr(initial_state)

  print_good_address = 0x080492F3
  simulation.explore(find=print_good_address)

  if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

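Workflow

Locate the target program and create an Angr project from it.
Specify where execution starts (stdin is made symbolic by default).
Create a simulation manager, which provides the tools (methods) to search and execute the binary.
From the analysis above we know the address of "Good Job"; we pass it to the simulation manager's explore() as the accepted execution path to find.
Calling explore() starts the symbolic execution.
The states that reach the target are stored in simulation.found, an attribute of the simulation manager. If nothing was found, the list is empty and evaluates to false in an if; otherwise it is non-empty and evaluates to true.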
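
A natural last step, not part of the scaffold, is to feed the candidate password back into the real binary and check what it prints. The sketch below assumes the program reads the password from stdin and prints "Good Job." on success, as described earlier; the helper name accepts is illustrative.

```python
import subprocess


def accepts(command, password, timeout=5):
    """Run command, send password on stdin, and report whether it prints 'Good Job.'."""
    result = subprocess.run(
        command,
        input=password + "\n",
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return "Good Job." in result.stdout


# Example call against the level binary (the path is an assumption about your layout):
#   accepts(["./00_angr_find"], candidate_password)
```

Running the check through a separate process keeps it independent of Angr, so it also catches cases where the symbolic model and the real program disagree.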

01_angr_avoid

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/01_angr_avoid$ python3 generate.py 1234 01_angr_avoid
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/01_angr_avoid$ ./01_angr_avoid
Enter the password: aaaaaaaaa
Try again.

Analysis

Viewing in IDA

In main() we can see a large number of paths that should be avoided. main() is so large that IDA warns that it cannot display it.

Address of "Good Job"

A string search leads us to the accepted execution path we need.

.text:080492CC ; int __cdecl maybe_good(char *s1, char *s2)
.text:080492CC                 public maybe_good
.text:080492CC maybe_good      proc near               ; CODE XREF: main+31E↓p
.text:080492CC                                         ; main+336↓p ...
.text:080492CC
.text:080492CC s1              = dword ptr  8
.text:080492CC s2              = dword ptr  0Ch
.text:080492CC
.text:080492CC ; __unwind {
.text:080492CC                 endbr32
.text:080492D0                 push    ebp
.text:080492D1                 mov     ebp, esp
.text:080492D3                 sub     esp, 8
.text:080492D6                 movzx   eax, should_succeed
.text:080492DD                 test    al, al
.text:080492DF                 jz      short loc_804930A
.text:080492E1                 sub     esp, 4
.text:080492E4                 push    8               ; n
.text:080492E6                 push    [ebp+s2]        ; s2
.text:080492E9                 push    [ebp+s1]        ; s1
.text:080492EC                 call    _strncmp
.text:080492F1                 add     esp, 10h
.text:080492F4                 test    eax, eax
.text:080492F6                 jnz     short loc_804930A
.text:080492F8                 sub     esp, 0Ch
.text:080492FB                 push    offset aGoodJob ; "Good Job."
.text:08049300                 call    _puts
.text:08049305                 add     esp, 10h
.text:08049308                 jmp     short loc_804931B
.text:0804930A ; ---------------------------------------------------------------------------
.text:0804930A
.text:0804930A loc_804930A:                            ; CODE XREF: maybe_good+13↑j
.text:0804930A                                         ; maybe_good+2A↑j
.text:0804930A                 sub     esp, 0Ch
.text:0804930D                 push    offset s        ; "Try again."
.text:08049312                 call    _puts
.text:08049317                 add     esp, 10h
.text:0804931A                 nop
.text:0804931B
.text:0804931B loc_804931B:                            ; CODE XREF: maybe_good+3C↑j
.text:0804931B                 nop
.text:0804931C                 leave
.text:0804931D                 retn
.text:0804931D ; } // starts at 80492CC
.text:0804931D maybe_good      endp

What we are looking for is: .text:080492FB push offset aGoodJob ; "Good Job."

The main() function

Let's take a brief look at the main function.

; int __cdecl main(int argc, const char **argv, const char **envp)
.text:0804931E                 public main
.text:0804931E main            proc near               ; DATA XREF: _start+2A↑o
.text:0804931E
.text:0804931E var_4C          = dword ptr -4Ch
.text:0804931E j               = dword ptr -3Ch
.text:0804931E i               = dword ptr -38h
.text:0804931E input           = byte ptr -34h
.text:0804931E s2              = byte ptr -20h
.text:0804931E var_C           = dword ptr -0Ch
.text:0804931E var_4           = dword ptr -4
.text:0804931E argc            = dword ptr  8
.text:0804931E argv            = dword ptr  0Ch
.text:0804931E envp            = dword ptr  10h
.text:0804931E
.text:0804931E ; __unwind {
.text:0804931E                 endbr32
.text:08049322                 lea     ecx, [esp+4]
.text:08049326                 and     esp, 0FFFFFFF0h
.text:08049329                 push    dword ptr [ecx-4]
.text:0804932C                 push    ebp
.text:0804932D                 mov     ebp, esp
.text:0804932F                 push    ecx
.text:08049330                 sub     esp, 54h
.text:08049333                 mov     eax, ecx
.text:08049335                 mov     eax, [eax+4]
.text:08049338                 mov     [ebp+var_4C], eax
.text:0804933B                 mov     eax, large gs:14h
.text:08049341                 mov     [ebp+var_C], eax
.text:08049344                 xor     eax, eax
.text:08049346                 mov     [ebp+j], 0
.text:0804934D                 jmp     short loc_804935E
.text:0804934F ; ---------------------------------------------------------------------------
.text:0804934F
.text:0804934F loc_804934F:                            ; CODE XREF: main+44↓j
.text:0804934F                 lea     edx, [ebp+s2]   ; zero-initialize the s2 array
.text:08049352                 mov     eax, [ebp+j]
.text:08049355                 add     eax, edx
.text:08049357                 mov     byte ptr [eax], 0
.text:0804935A                 add     [ebp+j], 1
.text:0804935E
.text:0804935E loc_804935E:                            ; CODE XREF: main+2F↑j
.text:0804935E                 cmp     [ebp+j], 13h
.text:08049362                 jle     short loc_804934F ; repeat while j <= 13h, zeroing 20 bytes of s2
.text:08049364                 lea     eax, [ebp+s2]
.text:08049367                 mov     dword ptr [eax], 57514A4Dh ; "MJQW" (little-endian)
.text:0804936D                 mov     dword ptr [eax+4], 454C4441h ; "ADLE" (little-endian)
.text:08049374                 sub     esp, 0Ch
.text:08049377                 push    offset aEnterThePasswo ; "Enter the password: "
.text:0804937C                 call    _printf
.text:08049381                 add     esp, 10h
.text:08049384                 sub     esp, 8
.text:08049387                 lea     eax, [ebp+input]
.text:0804938A                 push    eax             ; input
.text:0804938B                 push    offset a8s      ; "%8s"
.text:08049390                 call    ___isoc99_scanf
.text:08049395                 add     esp, 10h
.text:08049398                 mov     [ebp+i], 0
.text:0804939F                 jmp     short loc_80493CE ; i = 0 was just set; jump to the loop condition
.text:080493A1 ; ---------------------------------------------------------------------------
.text:080493A1
.text:080493A1 loc_80493A1:                            ; CODE XREF: main+B4↓j
.text:080493A1                 lea     edx, [ebp+input]
.text:080493A4                 mov     eax, [ebp+i]
.text:080493A7                 add     eax, edx        ; input + i
.text:080493A9                 movzx   eax, byte ptr [eax] ; input[i]
.text:080493AC                 movsx   eax, al         ; sign-extend the byte to 32 bits
.text:080493AF                 sub     esp, 8          ; reserve stack space (alignment padding)
.text:080493B2                 push    [ebp+i]         ; push i
.text:080493B5                 push    eax             ; push input[i]
.text:080493B6                 call    complex_function ; complex_function(input[i], i); same function as in 00_angr_find
.text:080493BB                 add     esp, 10h        ; caller cleans up the stack
.text:080493BE                 mov     ecx, eax        ; the return value is the encrypted character
.text:080493C0                 lea     edx, [ebp+input]
.text:080493C3                 mov     eax, [ebp+i]
.text:080493C6                 add     eax, edx        ; input + i
.text:080493C8                 mov     [eax], cl       ; input[i] = complex_function(input[i], i)
.text:080493CA                 add     [ebp+i], 1
.text:080493CE
.text:080493CE loc_80493CE:                            ; CODE XREF: main+81↑j
.text:080493CE                 cmp     [ebp+i], 7      ; loop condition: i <= 7
.text:080493D2                 jle     short loc_80493A1 ; for (i = 0; i <= 7; i++)
.text:080493D4                 lea     eax, [ebp+input] ; execution arrives here once all 8 bytes are encrypted
.text:080493D7                 add     eax, 1          ; input + 1
.text:080493DA                 movzx   eax, byte ptr [eax]
.text:080493DD                 movzx   eax, al         ; input[1]
.text:080493E0                 and     eax, 10h        ; input[1] & 0x10, isolates bit 4
.text:080493E3                 test    eax, eax
.text:080493E5                 setz    dl              ; dl = 1 if bit 4 of input[1] is 0
.text:080493E8                 lea     eax, [ebp+s2]
.text:080493EB                 add     eax, 1
.text:080493EE                 movzx   eax, byte ptr [eax] ; s2[1]
.text:080493F1                 movzx   eax, al
.text:080493F4                 and     eax, 10h
.text:080493F7                 test    eax, eax
.text:080493F9                 setnz   al              ; al = 1 if bit 4 of s2[1] is set
.text:080493FC                 xor     eax, edx        ; s2[1] = 'J' (0x4A) has bit 4 clear, so al = 0 and the result equals dl
.text:080493FE                 test    al, al          ; zero when bit 4 of input[1] is set, i.e. it differs from s2[1]
.text:08049400                 jz      loc_808F34F     ; on a mismatch, jump to a block that starts with call avoid_me
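
The first bit check can be replayed in a few lines of Python to confirm which inputs take the jump. This is a sketch under two assumptions taken from the listing: s2 is built from the two dword stores at 0x08049367 and 0x0804936D, and the check uses the setz/setnz pair shown here (later checks in main() use other combinations). The name jz_taken is illustrative.

```python
import struct

# The two little-endian dword stores into s2.
s2 = struct.pack("<II", 0x57514A4D, 0x454C4441)


def jz_taken(input_byte, ref_byte, mask):
    """Model setz dl / setnz al / xor / test al, al / jz for one bit check."""
    dl = int((input_byte & mask) == 0)  # setz dl: 1 when the input bit is clear
    al = int((ref_byte & mask) != 0)    # setnz al: 1 when the reference bit is set
    return (al ^ dl) == 0               # jz is taken when the XOR result is zero


if __name__ == "__main__":
    print(s2)
    # 'Q' (0x51) has bit 4 set while s2[1] does not, so the jump is taken.
    print(jz_taken(ord("Q"), s2[1], 0x10))
```

With this polarity the jump to the avoid_me block is taken exactly when the tested bit of the encrypted input differs from the same bit of s2.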

Following the jump

We follow the conditional jump at address 0x08049400.

loc_808F34F:                            ; CODE XREF: main+E2↑j
.text:0808F34F                 call    avoid_me
.text:0808F354                 lea     eax, [ebp+input]
.text:0808F357                 add     eax, 1
.text:0808F35A                 movzx   eax, byte ptr [eax]
.text:0808F35D                 movzx   eax, al
.text:0808F360                 and     eax, 8
.text:0808F363                 test    eax, eax
.text:0808F365                 setz    dl
.text:0808F368                 lea     eax, [ebp+s2]
.text:0808F36B                 add     eax, 1
.text:0808F36E                 movzx   eax, byte ptr [eax]
.text:0808F371                 movzx   eax, al
.text:0808F374                 and     eax, 8
.text:0808F377                 test    eax, eax
.text:0808F379                 setnz   al
.text:0808F37C                 xor     eax, edx
.text:0808F37E                 test    al, al
.text:0808F380                 jz      loc_80B230F
.text:0808F386                 lea     eax, [ebp+input]
.text:0808F389                 add     eax, 1
.text:0808F38C                 movzx   eax, byte ptr [eax]
.text:0808F38F                 movzx   eax, al
.text:0808F392                 and     eax, 4
.text:0808F395                 test    eax, eax
.text:0808F397                 setnz   dl
.text:0808F39A                 lea     eax, [ebp+s2]
.text:0808F39D                 add     eax, 1
.text:0808F3A0                 movzx   eax, byte ptr [eax]
.text:0808F3A3                 movzx   eax, al
.text:0808F3A6                 and     eax, 4
.text:0808F3A9                 test    eax, eax
.text:0808F3AB                 setnz   al
.text:0808F3AE                 xor     eax, edx
.text:0808F3B0                 test    al, al
.text:0808F3B2                 jz      loc_80A0B66
.text:0808F3B8                 call    avoid_me
.text:0808F3BD                 lea     eax, [ebp+input]
.text:0808F3C0                 add     eax, 1
.text:0808F3C3                 movzx   eax, byte ptr [eax]
.text:0808F3C6                 movzx   eax, al
.text:0808F3C9                 and     eax, 2
.text:0808F3CC                 test    eax, eax
.text:0808F3CE                 setnz   dl
.text:0808F3D1                 lea     eax, [ebp+s2]
.text:0808F3D4                 add     eax, 1
.text:0808F3D7                 movzx   eax, byte ptr [eax]
.text:0808F3DA                 movzx   eax, al
.text:0808F3DD                 and     eax, 2
.text:0808F3E0                 test    eax, eax
.text:0808F3E2                 setnz   al
.text:0808F3E5                 xor     eax, edx
.text:0808F3E7                 test    al, al
.text:0808F3E9                 jz      loc_8097FAD
.text:0808F3EF                 call    avoid_me

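Addresses to avoid

We can see many calls to avoid_me(), and each one is a path to avoid. Avoiding them one by one is clearly impractical, so we step into avoid_me() and simply avoid the function itself.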

.text:080492BB ; void avoid_me()
.text:080492BB                 public avoid_me
.text:080492BB avoid_me        proc near               ; CODE XREF: main+14C↓p
.text:080492BB                                         ; main+183↓p ...
.text:080492BB ; __unwind {
.text:080492BB                 endbr32
.text:080492BF                 push    ebp
.text:080492C0                 mov     ebp, esp
.text:080492C2                 mov     should_succeed, 0
.text:080492C9                 nop
.text:080492CA                 pop     ebp
.text:080492CB                 retn
.text:080492CB ; } // starts at 80492BB
.text:080492CB avoid_me        endp

We choose to avoid: .text:080492C2 mov should_succeed, 0
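
Since IDA shows avoid_me as a named public symbol, the address can also be looked up by name instead of being hard-coded. The sketch below assumes the binary is not stripped; symbol_address and solve are illustrative names, while Loader.find_symbol and rebased_addr are part of angr's loader (CLE). Avoiding the function entry has the same effect as avoiding the instruction at 0x080492C2, because both lie on every path through avoid_me().

```python
def symbol_address(project, name):
    """Return the loaded address of a named symbol, or raise if it is missing."""
    symbol = project.loader.find_symbol(name)
    if symbol is None:
        raise LookupError(f"symbol {name!r} not found (is the binary stripped?)")
    return symbol.rebased_addr


def solve(path_to_binary, good_address):
    """Explore towards good_address while avoiding avoid_me(), looked up by name."""
    import angr  # imported lazily so symbol_address can be used without angr installed

    project = angr.Project(path_to_binary)
    simulation = project.factory.simgr(project.factory.entry_state())
    simulation.explore(find=good_address, avoid=symbol_address(project, "avoid_me"))
    if simulation.found:
        return simulation.found[0].posix.dumps(0)  # bytes that were read from stdin
    return None
```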

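Solving with Angr

Compared with the previous level, this one adds a large number of paths that should be avoided. With only a find address and no avoid address, the time cost would be much higher.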
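
The effect of an avoid target on the search can be illustrated with a toy model. This is not how angr is implemented; it only mimics the idea that every check forks the state in two and that avoided states are dropped instead of being stepped further. DEPTH, the path encoding, and all function names are hypothetical.

```python
from collections import deque

DEPTH = 8  # hypothetical number of bit checks on the way to the final comparison


def successors(path):
    """Each check forks into a matching branch (0) and an avoid_me branch (1)."""
    if len(path) >= DEPTH:
        return []
    return [path + (0,), path + (1,)]


def is_good(path):
    """Success means every check matched, so avoid_me was never called."""
    return len(path) == DEPTH and 1 not in path


def hits_avoid_me(path):
    """True when the most recent branch went through avoid_me."""
    return bool(path) and path[-1] == 1


def explore(start, find, avoid=None):
    """Breadth-first search; returns (found_state, number_of_states_stepped)."""
    active = deque([start])
    stepped = 0
    while active:
        state = active.popleft()
        if avoid is not None and avoid(state):
            continue  # dropped without being stepped
        if find(state):
            return state, stepped
        stepped += 1
        active.extend(successors(state))
    return None, stepped


if __name__ == "__main__":
    print(explore((), is_good)[1])                 # states stepped without avoid
    print(explore((), is_good, hits_avoid_me)[1])  # states stepped with avoid
```

In this model the number of stepped states grows exponentially with DEPTH when nothing is avoided and only linearly when the avoid_me branches are pruned, which is the same reason the avoid argument pays off in this level.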

Code

import angr
import sys

def main(argv):
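  # Take the binary and create an Angr project from it
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Specify the starting state; entry_state() begins at the program's entry point, from which execution reaches main()
  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )
  # Create a simulation manager, which provides the tools (methods) for searching and executing; note that its argument is the starting state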
  simulation = project.factory.simgr(initial_state)

  # Explore the binary, but this time, instead of only looking for a state that
  # reaches the print_good_address, also find a state that does not reach 
  # will_not_succeed_address. The binary is pretty large, to save you some time,
  # everything you will need to look at is near the beginning of the address 
  # space.
  # (!)
  print_good_address = 0x080492FB
  will_not_succeed_address = 0x080492C2
  # Use the simulation manager's explore(): start symbolic execution, search for the find address, and avoid the avoid address
  simulation.explore(find=print_good_address, avoid=will_not_succeed_address)

  # The resulting states are stored in the simulation manager's found list
  if simulation.found:  # an empty found list means no state reached the target
    solution_state = simulation.found[0]  # take the first state that reached the target
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())  # dump the stdin that leads there, decode it, and print it
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

With the comments removed:

import angr
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )
  simulation = project.factory.simgr(initial_state)

  print_good_address = 0x080492FB
  will_not_succeed_address = 0x080492C2

  simulation.explore(find=print_good_address, avoid=will_not_succeed_address)

  if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

The overall flow is much the same as in the previous level. The new thing to learn is:

simulation.explore(find=print_good_address, avoid=will_not_succeed_address)

02_angr_find_condition

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/02_angr_find_condition$ python3 generate.py 1234 02_angr_find_condition
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/02_angr_find_condition$ ./02_angr_find_condition
Enter the password: aaa
Try again.

Analysis

Analyzing with IDA

Part of the main function:

; int __cdecl main(int argc, const char **argv, const char **envp)
.text:080492BB                 public main
.text:080492BB main            proc near               ; DATA XREF: _start+2A↑o
.text:080492BB
.text:080492BB var_4C          = dword ptr -4Ch
.text:080492BB var_40          = dword ptr -40h
.text:080492BB var_3C          = dword ptr -3Ch
.text:080492BB var_38          = dword ptr -38h
.text:080492BB s1              = byte ptr -34h
.text:080492BB s2              = byte ptr -20h
.text:080492BB var_C           = dword ptr -0Ch
.text:080492BB var_4           = dword ptr -4
.text:080492BB argc            = dword ptr  8
.text:080492BB argv            = dword ptr  0Ch
.text:080492BB envp            = dword ptr  10h
.text:080492BB
.text:080492BB ; __unwind {
.text:080492BB                 endbr32
.text:080492BF                 lea     ecx, [esp+4]
.text:080492C3                 and     esp, 0FFFFFFF0h
.text:080492C6                 push    dword ptr [ecx-4]
.text:080492C9                 push    ebp
.text:080492CA                 mov     ebp, esp
.text:080492CC                 push    ecx
.text:080492CD                 sub     esp, 54h
.text:080492D0                 mov     eax, ecx
.text:080492D2                 mov     eax, [eax+4]
.text:080492D5                 mov     [ebp+var_4C], eax
.text:080492D8                 mov     eax, large gs:14h
.text:080492DE                 mov     [ebp+var_C], eax
.text:080492E1                 xor     eax, eax
.text:080492E3                 mov     [ebp+var_38], 0DEADBEEFh
.text:080492EA                 mov     [ebp+var_40], 0
.text:080492F1                 jmp     short loc_8049302
.text:080492F3 ; ---------------------------------------------------------------------------
.text:080492F3
.text:080492F3 loc_80492F3:                            ; CODE XREF: main+4B↓j
.text:080492F3                 lea     edx, [ebp+s2]
.text:080492F6                 mov     eax, [ebp+var_40]
.text:080492F9                 add     eax, edx
.text:080492FB                 mov     byte ptr [eax], 0
.text:080492FE                 add     [ebp+var_40], 1
.text:08049302
.text:08049302 loc_8049302:                            ; CODE XREF: main+36↑j
.text:08049302                 cmp     [ebp+var_40], 13h
.text:08049306                 jle     short loc_80492F3
.text:08049308                 lea     eax, [ebp+s2]
.text:0804930B                 mov     dword ptr [eax], 57514A4Dh
.text:08049311                 mov     dword ptr [eax+4], 454C4441h
.text:08049318                 sub     esp, 0Ch
.text:0804931B                 push    offset aEnterThePasswo ; "Enter the password: "
.text:08049320                 call    _printf
.text:08049325                 add     esp, 10h
.text:08049328                 sub     esp, 8
.text:0804932B                 lea     eax, [ebp+s1]
.text:0804932E                 push    eax
.text:0804932F                 push    offset a8s      ; "%8s"
.text:08049334                 call    ___isoc99_scanf
.text:08049339                 add     esp, 10h
.text:0804933C                 mov     [ebp+var_3C], 0
.text:08049343                 jmp     short loc_8049376
.text:08049345 ; ---------------------------------------------------------------------------
.text:08049345
.text:08049345 loc_8049345:                            ; CODE XREF: main+BF↓j
.text:08049345                 mov     eax, [ebp+var_3C]
.text:08049348                 lea     edx, [eax+8]
.text:0804934B                 lea     ecx, [ebp+s1]
.text:0804934E                 mov     eax, [ebp+var_3C]
.text:08049351                 add     eax, ecx
.text:08049353                 movzx   eax, byte ptr [eax]
.text:08049356                 movsx   eax, al
.text:08049359                 sub     esp, 8
.text:0804935C                 push    edx
.text:0804935D                 push    eax
.text:0804935E                 call    complex_function
.text:08049363                 add     esp, 10h
.text:08049366                 mov     ecx, eax
.text:08049368                 lea     edx, [ebp+s1]
.text:0804936B                 mov     eax, [ebp+var_3C]
.text:0804936E                 add     eax, edx
.text:08049370                 mov     [eax], cl
.text:08049372                 add     [ebp+var_3C], 1
.text:08049376
.text:08049376 loc_8049376:                            ; CODE XREF: main+88↑j
.text:08049376                 cmp     [ebp+var_3C], 7
.text:0804937A                 jle     short loc_8049345
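.text:0804937C                 cmp     [ebp+var_38], 0DEADBEEFh ; var_38 is never modified, so once this first jump is taken, the later jumps keep leading to the wrong place
.text:08049383                 jz      loc_804B97C     ; this path eventually reaches "Try again.", so the jump target should be treated as a path to avoid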
.text:08049389                 cmp     [ebp+var_38], 0DEADBEEFh
.text:08049390                 jz      loc_804A689
.text:08049396                 cmp     [ebp+var_38], 0DEADBEEFh
.text:0804939D                 jz      loc_8049D16
.text:080493A3                 cmp     [ebp+var_38], 0DEADBEEFh
.text:080493AA                 jnz     loc_8049863
.text:080493B0                 cmp     [ebp+var_38], 0DEADBEEFh
.text:080493B7                 jz      loc_8049610
.text:080493BD                 cmp     [ebp+var_38], 0DEADBEEFh
.text:080493C4                 jnz     loc_80494ED
.text:080493CA                 cmp     [ebp+var_38], 0DEADBEEFh
.text:080493D1                 jnz     loc_8049462
.text:080493D7                 cmp     [ebp+var_38], 0DEADBEEFh
.text:080493DE                 jnz     short loc_8049421
.text:080493E0                 sub     esp, 8
.text:080493E3                 lea     eax, [ebp+s2]
.text:080493E6                 push    eax             ; s2
.text:080493E7                 lea     eax, [ebp+s1]
.text:080493EA                 push    eax             ; s1
.text:080493EB                 call    _strcmp
.text:080493F0                 add     esp, 10h
.text:080493F3                 test    eax, eax
.text:080493F5                 jz      short loc_804940C
.text:080493F7                 sub     esp, 0Ch
.text:080493FA                 push    offset s        ; "Try again."
.text:080493FF                 call    _puts
.text:08049404                 add     esp, 10h
.text:08049407                 jmp     loc_804DF5E
.text:0804940C ; ---------------------------------------------------------------------------
.text:0804940C
.text:0804940C loc_804940C:                            ; CODE XREF: main+13A↑j
.text:0804940C                 sub     esp, 0Ch
.text:0804940F                 push    offset aGoodJob ; "Good Job."
.text:08049414                 call    _puts
.text:08049419                 add     esp, 10h
.text:0804941C                 jmp     loc_804DF5E
.text:08049421 ; ---------------------------------------------------------------------------
.text:08049421
.text:08049421 loc_8049421:                            ; CODE XREF: main+123↑j
.text:08049421                 sub     esp, 8
.text:08049424                 lea     eax, [ebp+s2]
.text:08049427                 push    eax             ; s2
.text:08049428                 lea     eax, [ebp+s1]
.text:0804942B                 push    eax             ; s1
.text:0804942C                 call    _strcmp
.text:08049431                 add     esp, 10h
.text:08049434                 test    eax, eax
.text:08049436                 jz      short loc_804944D
.text:08049438                 sub     esp, 0Ch
.text:0804943B                 push    offset s        ; "Try again."
.text:08049440                 call    _puts
.text:08049445                 add     esp, 10h
.text:08049448                 jmp     loc_804DF5E
.text:0804944D ; ---------------------------------------------------------------------------
.text:0804944D
.text:0804944D loc_804944D:                            ; CODE XREF: main+17B↑j
.text:0804944D                 sub     esp, 0Ch
.text:08049450                 push    offset aGoodJob ; "Good Job."
.text:08049455                 call    _puts
.text:0804945A                 add     esp, 10h
.text:0804945D                 jmp     loc_804DF5E
.text:08049462 ; ---------------------------------------------------------------------------
.text:08049462
.text:08049462 loc_8049462:                            ; CODE XREF: main+116↑j
.text:08049462                 cmp     [ebp+var_38], 0DEADBEEFh
.text:08049469                 jz      short loc_80494AC
.text:0804946B                 sub     esp, 8
.text:0804946E                 lea     eax, [ebp+s2]
.text:08049471                 push    eax             ; s2
.text:08049472                 lea     eax, [ebp+s1]
.text:08049475                 push    eax             ; s1
.text:08049476                 call    _strcmp
.text:0804947B                 add     esp, 10h
.text:0804947E                 test    eax, eax
.text:08049480                 jz      short loc_8049497
.text:08049482                 sub     esp, 0Ch
.text:08049485                 push    offset s        ; "Try again."
.text:0804948A                 call    _puts
.text:0804948F                 add     esp, 10h
.text:08049492                 jmp     loc_804DF5E
.text:08049497 ; ---------------------------------------------------------------------------
.text:08049497
.text:08049497 loc_8049497:                            ; CODE XREF: main+1C5↑j
.text:08049497                 sub     esp, 0Ch
.text:0804949A                 push    offset aGoodJob ; "Good Job."
.text:0804949F                 call    _puts
.text:080494A4                 add     esp, 10h
.text:080494A7                 jmp     loc_804DF5E

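The main function is very large, and unlike the previous levels, this one has a great many <accept paths> and <avoid paths>. As a result, it is hard to pin down the real path by constraining addresses, and even where that works it is very inefficient. Angr therefore offers other ways to express constraints, and this level requires one of them.

This level decides whether a <state> is on an <accept path> by checking whether its standard output contains the string "Good Job."; the check itself is expressed as a <function>.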
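The output check described above can be sketched as a small predicate factory. This is a minimal illustration, not part of the scaffold: `stdout_contains` and `fake_state` are hypothetical names, `fake_state` only imitates the `state.posix.dumps(fd)` call used later in the solution, and file descriptor 1 is assumed to be standard output.

```python
from types import SimpleNamespace

STDOUT_FD = 1  # assumption: descriptor 1 is standard output


def stdout_contains(marker: bytes):
    """Build a predicate that is True once `marker` has been printed."""
    def predicate(state) -> bool:
        # The solution script reads a state's output via state.posix.dumps(fd).
        return marker in state.posix.dumps(STDOUT_FD)
    return predicate


is_successful = stdout_contains(b"Good Job.")
should_abort = stdout_contains(b"Try again.")


def fake_state(output: bytes):
    """Hypothetical stand-in for an angr state, used only to try the predicates."""
    posix = SimpleNamespace(dumps=lambda fd: output if fd == STDOUT_FD else b"")
    return SimpleNamespace(posix=posix)


print(is_successful(fake_state(b"Enter the password: Good Job.\n")))
print(should_abort(fake_state(b"Enter the password: Try again.\n")))
```

Because the predicates only look at the output, they stay valid no matter how many "Good Job." and "Try again." blocks the compiler scattered through main.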

Solving with Angr

Code

# It is very useful to be able to search for a state that reaches a certain
# instruction. However, in some cases, you may not know the address of the
# specific instruction you want to reach (or perhaps there is no single
# instruction goal.) In this challenge, you don't know which instruction
# grants you success. Instead, you just know that you want to find a state where
# the binary prints "Good Job."

# Angr is powerful in that it allows you to search for a state that meets an
# arbitrary condition that you specify in Python, using a predicate you define
# as a function that takes a state and returns True if you have found what you
# are looking for, and False otherwise.
import angr
import sys

def main(argv):
  # Create an Angr project from the binary
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Specify the initial state; entry_state() begins at the binary's entry point, from which execution reaches main()
  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # Create a simulation manager, which provides the tools for searching and executing
  simulation = project.factory.simgr(initial_state)

  # Define a function that checks if you have found the state you are looking
  # for.
  def is_successful(state):
    # Dump whatever has been printed out by the binary so far into a string.
    stdout_output = state.posix.dumps(sys.stdout.fileno())

    # Return whether 'Good Job.' has been printed yet.
    # (!)
    return "Good Job".encode() in stdout_output  # True once "Good Job" appears in stdout

  # Same as above, but this time check if the state should abort. If you return
  # False, Angr will continue to step the state. In this specific challenge, the
  # only time at which you will know you should abort is when the program prints
  # "Try again."
  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())

    return "Try again".encode() in stdout_output  # True once "Try again" appears in stdout

  # Tell Angr to explore the binary and find any state that is_successful identifies
  # as a successful state by returning True.
  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

With the comments removed

import angr
import sys

def main(argv):

  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())

    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())

    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

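Notice that the argument types passed to explore have changed: in simulation.explore(find=is_successful, avoid=should_abort), the concrete addresses have been replaced by callback functions, and find and avoid now use each callback's return value to decide whether a state is on an <accept path> or an <avoid path>.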
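To make the difference between address targets and callback targets concrete, here is a toy search loop. It is only a sketch of the idea and not Angr's implementation: `ToyState`, `as_predicate`, and `explore` are hypothetical names, and the three-branch "program" is made up for the example.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, List, Union


@dataclass
class ToyState:
    """Hypothetical stand-in for a program state: an address plus its output."""
    addr: int
    stdout: bytes = b""
    successors: List["ToyState"] = field(default_factory=list)


Condition = Union[int, Callable[[ToyState], bool]]


def as_predicate(condition: Condition) -> Callable[[ToyState], bool]:
    """Turn an address into a predicate; pass callables through unchanged."""
    if callable(condition):
        return condition
    return lambda state: state.addr == condition


def explore(start: ToyState, find: Condition, avoid: Condition):
    """Breadth-first search that stops at the first state matching `find`."""
    is_found, is_avoided = as_predicate(find), as_predicate(avoid)
    found, avoided = [], []
    queue = deque([start])
    while queue and not found:
        state = queue.popleft()
        if is_found(state):
            found.append(state)
        elif is_avoided(state):
            avoided.append(state)  # pruned: its successors are never visited
        else:
            queue.extend(state.successors)
    return found, avoided


# A toy "program": one branch prints "Try again.", the other "Good Job."
good = ToyState(0x4000, b"Good Job.\n")
bad = ToyState(0x2000, b"Try again.\n")
root = ToyState(0x1000, successors=[bad, ToyState(0x3000, successors=[good])])

by_output = explore(root,
                    find=lambda s: b"Good Job" in s.stdout,
                    avoid=lambda s: b"Try again" in s.stdout)
by_address = explore(root, find=0x4000, avoid=0x2000)
```

In this toy both calls reach the same state. The callback form pays off when, as in this level, the success and failure messages are printed from many different addresses.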

03_angr_symbolic_registers

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/03_angr_symbolic_registers$ python3 generate.py 1234 03_angr_symbolic_registers
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/03_angr_symbolic_registers$ ./03_angr_symbolic_registers
Enter the password: aaaa
a
a
Try again.

Analysis

The main function

 ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:080495CF                 public main
.text:080495CF main            proc near               ; DATA XREF: _start+2A↑o
.text:080495CF
.text:080495CF var_14          = dword ptr -14h
.text:080495CF var_10          = dword ptr -10h
.text:080495CF var_C           = dword ptr -0Ch
.text:080495CF var_4           = dword ptr -4
.text:080495CF argc            = dword ptr  8
.text:080495CF argv            = dword ptr  0Ch
.text:080495CF envp            = dword ptr  10h
.text:080495CF
.text:080495CF ; __unwind {
.text:080495CF                 endbr32
.text:080495D3                 lea     ecx, [esp+4]
.text:080495D7                 and     esp, 0FFFFFFF0h
.text:080495DA                 push    dword ptr [ecx-4]
.text:080495DD                 push    ebp
.text:080495DE                 mov     ebp, esp
.text:080495E0                 push    ecx
.text:080495E1                 sub     esp, 14h
.text:080495E4                 sub     esp, 0Ch
.text:080495E7                 push    offset aEnterThePasswo ; "Enter the password: "
.text:080495EC                 call    _printf
.text:080495F1                 add     esp, 10h
.text:080495F4                 call    get_user_input
.text:080495F9                 mov     [ebp+var_14], eax
.text:080495FC                 mov     [ebp+var_10], ebx
.text:080495FF                 mov     [ebp+var_C], edx
.text:08049602                 sub     esp, 0Ch
.text:08049605                 push    [ebp+var_14]
.text:08049608                 call    complex_function_1
.text:0804960D                 add     esp, 10h
.text:08049610                 mov     [ebp+var_14], eax
.text:08049613                 sub     esp, 0Ch
.text:08049616                 push    [ebp+var_10]
.text:08049619                 call    complex_function_2
.text:0804961E                 add     esp, 10h
.text:08049621                 mov     [ebp+var_10], eax
.text:08049624                 sub     esp, 0Ch
.text:08049627                 push    [ebp+var_C]
.text:0804962A                 call    complex_function_3
.text:0804962F                 add     esp, 10h
.text:08049632                 mov     [ebp+var_C], eax
.text:08049635                 cmp     [ebp+var_14], 0
.text:08049639                 jnz     short loc_8049647
.text:0804963B                 cmp     [ebp+var_10], 0
.text:0804963F                 jnz     short loc_8049647
.text:08049641                 cmp     [ebp+var_C], 0
.text:08049645                 jz      short loc_8049659
.text:08049647
.text:08049647 loc_8049647:                            ; CODE XREF: main+6A↑j
.text:08049647                                         ; main+70↑j
.text:08049647                 sub     esp, 0Ch
.text:0804964A                 push    offset s        ; "Try again."
.text:0804964F                 call    _puts
.text:08049654                 add     esp, 10h
.text:08049657                 jmp     short loc_8049669
.text:08049659 ; ---------------------------------------------------------------------------
.text:08049659
.text:08049659 loc_8049659:                            ; CODE XREF: main+76↑j
.text:08049659                 sub     esp, 0Ch
.text:0804965C                 push    offset aGoodJob ; "Good Job."
.text:08049661                 call    _puts
.text:08049666                 add     esp, 10h
.text:08049669
.text:08049669 loc_8049669:                            ; CODE XREF: main+88↑j
.text:08049669                 mov     eax, 0
.text:0804966E                 mov     ecx, [ebp+var_4]
.text:08049671                 leave
.text:08049672                 lea     esp, [ecx-4]
.text:08049675                 retn
.text:08049675 ; } // starts at 80495CF
.text:08049675 main            endp

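After reading the user input, the program performs three checks. There is only one <accept path> and one <avoid path>, so both could be given as addresses, but the callback approach that inspects the output is more general. We first step into get_user_input(), then into the transformation function complex_function_1().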

get_user_input()

; __unwind {
.text:08049582                 endbr32
.text:08049586                 push    ebp
.text:08049587                 mov     ebp, esp
.text:08049589                 sub     esp, 18h
.text:0804958C                 mov     eax, large gs:14h
.text:08049592                 mov     [ebp+var_C], eax
.text:08049595                 xor     eax, eax
.text:08049597                 lea     ecx, [ebp+var_10]
.text:0804959A                 push    ecx
.text:0804959B                 lea     ecx, [ebp+var_14]
.text:0804959E                 push    ecx
.text:0804959F                 lea     ecx, [ebp+var_18]
.text:080495A2                 push    ecx
.text:080495A3                 push    offset aXXX     ; reads three integers
.text:080495A8                 call    ___isoc99_scanf
.text:080495AD                 add     esp, 10h
.text:080495B0                 mov     eax, [ebp+var_18]
.text:080495B3                 mov     edx, [ebp+var_14]
.text:080495B6                 mov     ebx, edx
.text:080495B8                 mov     edx, [ebp+var_10]
.text:080495BB                 nop
.text:080495BC                 mov     ecx, [ebp+var_C]
.text:080495BF                 xor     ecx, large gs:14h
.text:080495C6                 jz      short locret_80495CD
.text:080495C8                 call    ___stack_chk_fail

The string constant aXXX is "%x %x %x", so the function reads three hexadecimal integers.
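Since the input format is "%x %x %x", any recovered values have to be written back as hexadecimal text. The sketch below shows one way to format and re-parse such a line; `format_for_scanf_hex` and `parse_like_scanf_hex` are hypothetical helpers, and the parser only approximates scanf by splitting on whitespace and reading each token in base 16.

```python
def format_for_scanf_hex(values):
    """Join 32-bit integers into text that a "%x %x %x" format can read."""
    return " ".join(format(v & 0xFFFFFFFF, "x") for v in values)


def parse_like_scanf_hex(text):
    """Approximate "%x %x %x": whitespace-separated tokens read as base 16."""
    return [int(token, 16) for token in text.split()]


line = format_for_scanf_hex([0xDEADBEEF, 0x10, 255])
print(line)
print(parse_like_scanf_hex(line))
```

An optional 0x prefix, as produced by Python's hex(), is also accepted when a token is read in base 16, which is why the solution script below can print hex() output directly.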

complex_function_1()

 __unwind {
.text:08049218                 endbr32
.text:0804921C                 push    ebp
.text:0804921D                 mov     ebp, esp
.text:0804921F                 xor     [ebp+arg_0], 0B47ECDE5h
.text:08049226                 add     [ebp+arg_0], 7C34CCE5h
.text:0804922D                 mov     eax, [ebp+arg_0]
.text:08049230                 sub     eax, 68313899h
.text:08049235                 mov     [ebp+arg_0], eax
.text:08049238                 add     [ebp+arg_0], 25546EB0h
.text:0804923F                 xor     [ebp+arg_0], 667C4B5Fh
.text:08049246                 add     [ebp+arg_0], 5218863Ah
.text:0804924D                 add     [ebp+arg_0], 4A61A7C0h
.text:08049254                 xor     [ebp+arg_0], 0D81E98A1h
.text:0804925B                 add     [ebp+arg_0], 0EA4480Ah
.text:08049262                 xor     [ebp+arg_0], 0FA091CC9h
.text:08049269                 xor     [ebp+arg_0], 0BB654A8Dh
.text:08049270                 xor     [ebp+arg_0], 518DA374h
.text:08049277                 mov     eax, [ebp+arg_0]
.text:0804927A                 sub     eax, 573A9DA6h
.text:0804927F                 mov     [ebp+arg_0], eax
.text:08049282                 mov     eax, [ebp+arg_0]
.text:08049285                 sub     eax, 720B986Bh
.text:0804928A                 mov     [ebp+arg_0], eax
.text:0804928D                 xor     [ebp+arg_0], 62F378CAh
.text:08049294                 mov     eax, [ebp+arg_0]
.text:08049297                 sub     eax, 7C6D5375h
.text:0804929C                 mov     [ebp+arg_0], eax
.text:0804929F                 xor     [ebp+arg_0], 5073ECCAh
.text:080492A6                 xor     [ebp+arg_0], 1E57541Dh
.text:080492AD                 mov     eax, [ebp+arg_0]
.text:080492B0                 sub     eax, 5EDD295Ch
.text:080492B5                 mov     [ebp+arg_0], eax
.text:080492B8                 mov     eax, [ebp+arg_0]
.text:080492BB                 sub     eax, 70D80AE9h
.text:080492C0                 mov     [ebp+arg_0], eax
.text:080492C3                 add     [ebp+arg_0], 41CE452Bh
.text:080492CA                 xor     [ebp+arg_0], 2CB9F02h
.text:080492D1                 xor     [ebp+arg_0], 1DA8339Eh
.text:080492D8                 add     [ebp+arg_0], 1708A4A7h
.text:080492DF                 xor     [ebp+arg_0], 0CCAC92A7h
.text:080492E6                 mov     eax, [ebp+arg_0]
.text:080492E9                 sub     eax, 3D111768h
.text:080492EE                 mov     [ebp+arg_0], eax
.text:080492F1                 mov     eax, [ebp+arg_0]
.text:080492F4                 sub     eax, 2369D87Bh
.text:080492F9                 mov     [ebp+arg_0], eax
.text:080492FC                 xor     [ebp+arg_0], 184BE27Fh
.text:08049303                 add     [ebp+arg_0], 281A05Eh
.text:0804930A                 add     [ebp+arg_0], 66205B90h
.text:08049311                 xor     [ebp+arg_0], 3C3B90F5h
.text:08049318                 mov     eax, [ebp+arg_0]
.text:0804931B                 pop     ebp
.text:0804931C                 retn

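As the listing shows, the function repeatedly applies xor, add, and sub with constants and finally returns the result. Note that during the computation the value of arg_0 is moved into eax (mov eax, [ebp+arg_0]), and the value is also moved into eax just before returning.

Now look at the condition for finally reaching the <accept path>

.text:08049635                 cmp     [ebp+var_14], 0
.text:08049639                 jnz     short loc_8049647
.text:0804963B                 cmp     [ebp+var_10], 0
.text:0804963F                 jnz     short loc_8049647
.text:08049641                 cmp     [ebp+var_C], 0
.text:08049645                 jz      short loc_8049659

It checks whether all three transformed input integers are 0.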
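Each step of complex_function_1 (xor, add, or sub with a constant, modulo 2^32) can be undone, so in principle the required input is whatever maps to 0 when the steps are reversed. The sketch below illustrates this with only the first five operations copied from the listing above; it is an assumption-laden illustration, not the full function, so `needed_input` is not the real password. The names `STEPS`, `forward`, and `backward` are hypothetical.

```python
MASK = 0xFFFFFFFF  # 32-bit registers wrap around

# First five operations of complex_function_1, copied from the listing above.
STEPS = [
    ("xor", 0xB47ECDE5),
    ("add", 0x7C34CCE5),
    ("sub", 0x68313899),
    ("add", 0x25546EB0),
    ("xor", 0x667C4B5F),
]


def forward(value, steps=STEPS):
    """Apply the steps in program order, wrapping at 32 bits."""
    for op, const in steps:
        if op == "xor":
            value ^= const
        elif op == "add":
            value = (value + const) & MASK
        else:  # "sub"
            value = (value - const) & MASK
    return value


def backward(value, steps=STEPS):
    """Undo the steps in reverse order: xor undoes xor, sub undoes add."""
    for op, const in reversed(steps):
        if op == "xor":
            value ^= const
        elif op == "add":
            value = (value - const) & MASK
        else:  # undo "sub"
            value = (value + const) & MASK
    return value


needed_input = backward(0)  # input that this truncated chain maps to 0
print(hex(needed_input), forward(needed_input))
```

Doing this by hand for three long functions is tedious and error-prone, which is exactly the work symbolic execution takes over.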

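Solving with Angr

First, because Angr does not support scanf reading multiple values (as the scaffold's comments state), we cannot have Angr inject the symbols automatically for the three hexadecimal integers read in this level. We therefore need to find an injection point at a suitable place in the execution.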

.text:080495F9                 mov     [ebp+var_14], eax
.text:080495FC                 mov     [ebp+var_10], ebx
.text:080495FF                 mov     [ebp+var_C], edx

This is the injection point we need; the registers themselves serve as the symbols.

  password0_size_in_bits = 32  # :integer, width of each symbolic bitvector in bits
  password0 = claripy.BVS('password0', password0_size_in_bits)  # create the symbol
  password1 = claripy.BVS('password1', password0_size_in_bits)
  password2 = claripy.BVS('password2', password0_size_in_bits)
  initial_state.regs.eax = password0
  initial_state.regs.ebx = password1
  initial_state.regs.edx = password2

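Injecting a symbol and then executing symbolically can be understood as: [from the moment symbolic execution starts, you are the unknown]. This level is an example of exactly that.

There are two cases:

If symbolic execution starts at the topmost red box in the figure above, the unknowns are the values of the three registers at that moment. The symbols Angr recovers are then eax, ebx, and edx at that point, but those are not the flag we want, so we cannot simply start from the beginning of the program as in the previous levels.
In the other case, symbolic execution starts at the second red box. The symbols Angr recovers are again eax, ebx, and edx at that moment, and this time the three values are exactly our flag, so starting here is correct.

The code follows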

# Angr doesn't currently support reading multiple things with scanf (Ex: 
# scanf("%u %u").) You will have to tell the simulation engine to begin the
# program after scanf is called and manually inject the symbols into registers.
import angr
import claripy
import sys

def main(argv):
  # Create an Angr project from the binary
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Sometimes, you want to specify where the program should start. The variable
  # start_address will specify where the symbolic execution engine should begin.
  # Note that we are using blank_state, not entry_state.
  # (!)
  start_address = 0x080495F9  # :integer (probably hexadecimal)
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # Create a symbolic bitvector (the datatype Angr uses to inject symbolic
  # values into the binary.) The first parameter is just a name Angr uses
  # to reference it. 
  # You will have to construct multiple bitvectors. Copy the two lines below
  # and change the variable names. To figure out how many (and of what size)
  # you need, disassemble the binary and determine the format parameter passed
  # to scanf.
  # (!)
  password0_size_in_bits = 32  # :integer, width of each symbolic bitvector in bits
  password0 = claripy.BVS('password0', password0_size_in_bits)  # create the symbol
  password1 = claripy.BVS('password1', password0_size_in_bits)
  password2 = claripy.BVS('password2', password0_size_in_bits)
  ...

  # Set a register to a symbolic value. This is one way to inject symbols into
  # the program.
  # initial_state.regs stores a number of convenient attributes that reference
  # registers by name. For example, to set eax to password0, use:
  # initial_state.regs.eax = password0
  #
  # You will have to set multiple registers to distinct bitvectors. Copy and
  # paste the line below and change the register. To determine which registers
  # to inject which symbol, disassemble the binary and look at the instructions
  # immediately following the call to scanf.
  # (!)
  initial_state.regs.eax = password0
  initial_state.regs.ebx = password1
  initial_state.regs.edx = password2

  # Create a simulation manager
  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]  # an <accept path> was found

    # Solve for the symbolic values. If there are multiple solutions, we only
    # care about one, so we can use eval, which returns any (but only one)
    # solution. Pass eval the bitvector you want to solve for.
    # (!)
    solution0 = solution_state.solver.eval(password0)
    solution1 = solution_state.solver.eval(password1)
    solution2 = solution_state.solver.eval(password2)

    # Aggregate and format the solutions you computed above, and then print
    # the full string. Pay attention to the order of the integers, and the
    # expected base (decimal, octal, hexadecimal, etc).
    solution = hex(solution0) + ' ' + hex(solution1) + ' ' + hex(solution2)  # :string, hexadecimal because scanf uses %x
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

With the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  start_address = 0x080495F9  # :integer (probably hexadecimal)
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  password0_size_in_bits = 32  
  password0 = claripy.BVS('password0', password0_size_in_bits)
  password1 = claripy.BVS('password1', password0_size_in_bits)
  password2 = claripy.BVS('password2', password0_size_in_bits)
  initial_state.regs.eax = password0
  initial_state.regs.ebx = password1
  initial_state.regs.edx = password2

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution0 = solution_state.solver.eval(password0)
    solution1 = solution_state.solver.eval(password1)
    solution2 = solution_state.solver.eval(password2)

    solution = hex(solution0) + ' ' + hex(solution1) + ' ' + hex(solution2)
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

04_angr_symbolic_stack

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/04_angr_symbolic_stack$ python3 generate.py 1234 04_angr_symbolic_stack
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/04_angr_symbolic_stack$ cp 04_angr_symbolic_stack ./1
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/04_angr_symbolic_stack$ ./04_angr_symbolic_stack
Enter the password: aaaaaaa
Try again.

Analysis

Analyzing with IDA

main()

.text:08049450 ; __unwind {
.text:08049450                 endbr32
.text:08049454                 lea     ecx, [esp+4]
.text:08049458                 and     esp, 0FFFFFFF0h
.text:0804945B                 push    dword ptr [ecx-4]
.text:0804945E                 push    ebp
.text:0804945F                 mov     ebp, esp
.text:08049461                 push    ecx
.text:08049462                 sub     esp, 4
.text:08049465                 sub     esp, 0Ch
.text:08049468                 push    offset aEnterThePasswo ; "Enter the password: "
.text:0804946D                 call    _printf         ; print the prompt
.text:08049472                 add     esp, 10h
.text:08049475                 call    handle_user     ; the main logic
.text:0804947A                 mov     eax, 0
.text:0804947F                 mov     ecx, [ebp+var_4]
.text:08049482                 leave
.text:08049483                 lea     esp, [ecx-4]
.text:08049486                 retn
.text:08049486 ; } // starts at 8049450
.text:08049486 main            endp

handle_user()

.text:080493D0 ; int handle_user()
.text:080493D0                 public handle_user
.text:080493D0 handle_user     proc near               ; CODE XREF: main+25↓p
.text:080493D0
.text:080493D0 num_2           = dword ptr -10h
.text:080493D0 num_1           = dword ptr -0Ch
.text:080493D0
.text:080493D0 ; __unwind {
.text:080493D0                 endbr32
.text:080493D4                 push    ebp
.text:080493D5                 mov     ebp, esp
.text:080493D7                 sub     esp, 18h
.text:080493DA                 sub     esp, 4
.text:080493DD                 lea     eax, [ebp+num_2] ; unlike the previous level, the input can no longer be symbolized through registers (most challenges do not work that way)
.text:080493E0                 push    eax
.text:080493E1                 lea     eax, [ebp+num_1] ; same as above
.text:080493E4                 push    eax
.text:080493E5                 push    offset aUU      ; "%u %u"
.text:080493EA                 call    ___isoc99_scanf
.text:080493EF                 add     esp, 10h
.text:080493F2                 mov     eax, [ebp+num_1]
.text:080493F5                 sub     esp, 0Ch
.text:080493F8                 push    eax
.text:080493F9                 call    complex_function0 ; transforms num_1
.text:080493FE                 add     esp, 10h
.text:08049401                 mov     [ebp+num_1], eax
.text:08049404                 mov     eax, [ebp+num_2]
.text:08049407                 sub     esp, 0Ch
.text:0804940A                 push    eax
.text:0804940B                 call    complex_function1 ; transforms num_2
.text:08049410                 add     esp, 10h
.text:08049413                 mov     [ebp+num_2], eax
.text:08049416                 mov     eax, [ebp+num_1] ; the key comparison
.text:08049419                 cmp     eax, 0A63C9805h
.text:0804941E                 jnz     short loc_804942A
.text:08049420                 mov     eax, [ebp+num_2]
.text:08049423                 cmp     eax, 0B47ECDE5h
.text:08049428                 jz      short loc_804943C
.text:0804942A
.text:0804942A loc_804942A:                            ; CODE XREF: handle_user+4E↑j
.text:0804942A                 sub     esp, 0Ch
.text:0804942D                 push    offset s        ; "Try again."
.text:08049432                 call    _puts
.text:08049437                 add     esp, 10h
.text:0804943A                 jmp     short loc_804944D
.text:0804943C ; ---------------------------------------------------------------------------
.text:0804943C
.text:0804943C loc_804943C:                            ; CODE XREF: handle_user+58↑j
.text:0804943C                 sub     esp, 0Ch
.text:0804943F                 push    offset aGoodJob ; "Good Job."
.text:08049444                 call    _puts
.text:08049449                 add     esp, 10h
.text:0804944C                 nop
.text:0804944D
.text:0804944D loc_804944D:                            ; CODE XREF: handle_user+6A↑j
.text:0804944D                 nop
.text:0804944E                 leave
.text:0804944F                 retn
.text:0804944F ; } // starts at 80493D0
.text:0804944F handle_user     endp

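Judging by the level's name, we need to use local variables on the stack as the symbols.

I also had a misconception earlier: I thought the starting point of symbolic execution worked like a breakpoint, where everything before it has already run and the symbols are injected at the breakpoint. In fact, execution begins directly at that address, and none of the instructions before it ever take effect.

So here we have to deal with the stack. Because nothing before scanf has executed, the local variables do not really exist yet; to inject symbols through scanf's two arguments (the two local variables), we have to build the stack ourselves.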
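The stack arithmetic can be checked with a tiny model before touching Angr. The sketch below assumes the offsets from the IDA listing (num_1 at [ebp-0Ch], num_2 at [ebp-10h]) and an arbitrary starting esp; `ToyStack` is a hypothetical helper, and the strings merely stand in for the symbolic values.

```python
class ToyStack:
    """Tiny model of a 32-bit stack that grows downward (illustration only)."""

    def __init__(self, esp):
        self.esp = esp
        self.memory = {}  # address -> stored value

    def push(self, value):
        self.esp -= 4                   # push first moves esp down by 4 bytes...
        self.memory[self.esp] = value   # ...then stores the value at the new esp


stack = ToyStack(esp=0x7FFF0000)  # arbitrary starting esp (assumption)
ebp = stack.esp                   # mirrors `mov ebp, esp`

NUM_1_OFFSET = 0x0C  # num_1 lives at [ebp-0Ch]
NUM_2_OFFSET = 0x10  # num_2 lives at [ebp-10h]

# The first push must land on ebp-0xC, so esp has to sit at ebp-0x8 beforehand.
padding = NUM_1_OFFSET - 4
stack.esp -= padding

stack.push("password0")  # stands in for the first symbolic value (num_1)
stack.push("password1")  # stands in for the second symbolic value (num_2)

print(hex(padding), stack.memory[ebp - NUM_1_OFFSET], stack.memory[ebp - NUM_2_OFFSET])
```

The same reasoning carries over to the real state: set ebp equal to esp, lower esp by the padding, then push the two symbolic values in the order that places them at the offsets the function reads.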

Solving with Angr

The detailed analysis is clearer after reading the Angr code first.

The solution code follows

# This challenge will be more challenging than the previous challenges that you
# have encountered thus far. Since the goal of this CTF is to teach symbolic
# execution and not how to construct stack frames, these comments will walk you
# through understanding what is on the stack.
#   ! ! !
# IMPORTANT: Any addresses in this script aren't necessarily right! Disassemble
#            the binary yourself to determine the correct addresses!
#   ! ! !
import angr
import claripy
import sys

def main(argv):
  # Locate the binary and create the Angr project from it
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # For this challenge, we want to begin after the call to scanf. Note that this
  # is in the middle of a function.
  # This challenge requires dealing with the stack, so you have to pay extra
  # careful attention to where you start, otherwise you will enter a condition
  # where the stack is set up incorrectly. In order to determine where after
  # scanf to start, we need to look at the disassembly of the call and the
  # instruction immediately following it:
  #   sub    $0x4,%esp
  #   lea    -0x10(%ebp),%eax
  #   push   %eax
  #   lea    -0xc(%ebp),%eax
  #   push   %eax
  #   push   $0x80489c3
  #   call   8048370 <__isoc99_scanf@plt>
  #   add    $0x10,%esp
  # Now, the question is: do we start on the instruction immediately following
  # scanf (add $0x10,%esp), or the instruction following that (not shown)?
  # Consider what the 'add $0x10,%esp' is doing. Hint: it has to do with the
  # scanf parameters that are pushed to the stack before calling the function.
  # Given that we are not calling scanf in our Angr simulation, where should we
  # start?
  # Answer: start at the instruction after the add. The add cleans up the
  # arguments pushed for scanf; if we started at the add itself, every address
  # used to reach the calling function's stack data would have to be offset
  # by 0x10.
  # (!)
  start_address = 0x80493F2
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # We are jumping into the middle of a function! Therefore, we need to account
  # for how the function constructs the stack. The second instruction of the
  # function is:
  #   mov    %esp,%ebp        (AT&T syntax)
  # At which point it allocates the part of the stack frame we plan to target:
  #   sub    $0x18,%esp
  # Note the value of esp relative to ebp. The space between them is (usually)
  # the stack space. Since esp was decreased by 0x18, the frame looks like this:
  #
  #               high addresses
  #        /-------- The stack --------\
  # ebp -> |                           |
  #        |---------------------------|
  #        |                           |
  #        |---------------------------|
  #         . . . (total of 0x18 bytes)
  #         . . . Somewhere in here is
  #         . . . the data that stores
  #         . . . the result of scanf.
  # esp -> |                           |
  #        \---------------------------/
  #               low addresses
  # Since we are starting after scanf, we are skipping this stack construction
  # step. To make up for this, we need to construct the stack ourselves. Let us
  # start by initializing ebp in the exact same way the program does.
  initial_state.regs.ebp = initial_state.regs.esp

  # scanf("%u %u") needs to be replaced by injecting two bitvectors. The
  # reason for this is that Angr does not (currently) automatically inject
  # symbols if scanf has more than one input parameter. This means Angr can
  # handle 'scanf("%u")', but not 'scanf("%u %u")'.
  # You can either copy and paste the line below or use a Python list.
  # (!)
  password_size_bits = 32
  password0 = claripy.BVS('password0', password_size_bits)
  password1 = claripy.BVS('password1', password_size_bits)

  # Here is the hard part. We need to figure out what the stack looks like, at
  # least well enough to inject our symbols where we want them. In order to do
  # that, let's figure out what the parameters of scanf are:
  #   sub    $0x4,%esp
  #   lea    -0x10(%ebp),%eax
  #   push   %eax
  #   lea    -0xc(%ebp),%eax
  #   push   %eax
  #   push   $0x80489c3
  #   call   8048370 <__isoc99_scanf@plt>
  #   add    $0x10,%esp 
  # As you can see, the call to scanf looks like this:
  # scanf(  0x80489c3,   ebp - 0xc,   ebp - 0x10  )
  #      format_string    password0    password1
  # From this, we can construct our new, more accurate stack diagram:
  #
  #            /-------- The stack --------\
  # ebp ->     |          padding          |
  #            |---------------------------|
  # ebp - 0x01 |       more padding        |
  #            |---------------------------|
  # ebp - 0x02 |     even more padding     |
  #            |---------------------------|
  #                        . . .               <- How much padding? Hint: how
  #            |---------------------------|      many bytes is password0?  (8 bytes of padding; password0 occupies ebp - 0x09 down to ebp - 0x0c)
  # ebp - 0x0b |   password0, second byte  |
  #            |---------------------------|
  # ebp - 0x0c |   password0, first byte   |
  #            |---------------------------|
  # ebp - 0x0d |   password1, last byte    |
  #            |---------------------------|
  #                        . . .
  #            |---------------------------|
  # ebp - 0x10 |   password1, first byte   |
  #            |---------------------------|
  #                        . . .
  #            |---------------------------|
  # esp ->     |                           |
  #            \---------------------------/
  #
  # Figure out how much space there is and allocate the necessary padding to
  # the stack by decrementing esp before you push the password bitvectors.
  padding_length_in_bytes = 0x8  # :integer
  initial_state.regs.esp -= padding_length_in_bytes  # equivalent to the assembly: sub    esp, 0x8

  # Push the variables to the stack. Make sure to push them in the right order!
  # The syntax for the following function is:
  #
  # initial_state.stack_push(bitvector)
  #
  # This will push the bitvector on the stack, and decrement esp by the correct
  # amount. You will need to push multiple bitvectors on the stack.
  # (!)
  initial_state.stack_push(password0)  # :bitvector (claripy.BVS, claripy.BVV, claripy.BV)
  initial_state.stack_push(password1)

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution0 = solution_state.solver.eval(password0)
    solution1 = solution_state.solver.eval(password1)

    solution = str((solution0)) + ' ' + str((solution1))
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

With the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  start_address = 0x80493F2
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )
  initial_state.regs.ebp = initial_state.regs.esp

  password_size_bits = 32
  password0 = claripy.BVS('password0', password_size_bits)
  password1 = claripy.BVS('password1', password_size_bits)

  padding_length_in_bytes = 0x8
  initial_state.regs.esp -= padding_length_in_bytes

  initial_state.stack_push(password0)
  initial_state.stack_push(password1)

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution0 = solution_state.solver.eval(password0)
    solution1 = solution_state.solver.eval(password1)

    solution = str((solution0)) + ' ' + str((solution1))
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

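Question 1: should symbolic execution start before or after `add esp, 10h`?

The key point is that Angr injects the symbols before symbolic execution begins.

If we start at add esp, 10h, Angr builds the state at that instruction. At that point esp is where ___isoc99_scanf left it after its retn, and since the arguments were pushed before the call, esp still sits below the three pushed arguments. The two symbols are injected relative to this esp, and only then does add esp, 10h execute. That instruction discards the argument area and moves esp back to the caller's stack top, so the symbols injected earlier are left below the new stack top and are effectively invalidated. The figures below make this clearer.

Starting symbolic execution at add esp, 10h: symbol injection happens before execution begins

After add esp, 10h has executed

So we need to start symbolic execution after add esp, 10h.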
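
The effect of that add can be checked with plain arithmetic. The sketch below is not Angr code: it only models esp as an integer (the starting value 0x7FFF0000 is an arbitrary assumption) and replays the instructions around the scanf call quoted in the scaffold comments. It suggests that `sub $0x4` plus three 4-byte pushes move esp down by exactly 0x10, which the add then undoes.

```python
# Toy model of esp around the scanf call; the addresses are made up for illustration.
WORD = 4  # bytes per push on 32-bit x86


def esp_after_call_sequence(esp):
    """Return (esp when we reach the add, esp after 'add $0x10,%esp')."""
    esp -= 0x4            # sub  $0x4,%esp   (alignment padding)
    esp -= WORD           # push %eax        (&password1)
    esp -= WORD           # push %eax        (&password0)
    esp -= WORD           # push $0x80489c3  (format string)
    # call pushes a return address and scanf's ret pops it again: net change 0
    esp_at_add = esp
    esp_after_add = esp + 0x10   # add $0x10,%esp  (caller cleans up its arguments)
    return esp_at_add, esp_after_add


caller_esp = 0x7FFF0000  # hypothetical value of esp before the sequence
at_add, after_add = esp_after_call_sequence(caller_esp)
print(hex(caller_esp - at_add))   # distance below the caller's stack top at the add
print(after_add == caller_esp)    # the add restores the caller's stack top
```

Starting on the add would therefore leave our hand-built frame 0x10 bytes away from where the rest of main expects it.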

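Understanding symbol injection on the stack

In other words, we assume a region of the stack holds an unknown value x, then add constraints and solve for it.

How to inject

1. Build the stack

Look at how the passwords are referenced: ebp - 0xc and ebp - 0x10, both as ebp plus an offset. We need to build a stack ourselves to hold our symbols, and make sure the later instructions still resolve to them.

initial_state.regs.ebp = initial_state.regs.esp

Setting ebp establishes a new stack frame, which is the stack we build on.

Inject the symbols onto the stack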

padding_length_in_bytes = 0x8
initial_state.regs.esp -= padding_length_in_bytes

initial_state.stack_push(password0)
initial_state.stack_push(password1)

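The first two statements adjust the offset to mimic the original stack layout, so that the later instructions reference these two symbols correctly.

The next two statements inject the symbols onto the stack (the two 32-bit symbols were created earlier in the source).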
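
To convince ourselves that 0x8 bytes of padding is the right amount, here is a minimal stand-in for the stack written with the standard library only. It is a sketch, not Angr: the `memory` dict, the helper names and the initial esp are all assumptions made for illustration, and two constants stand in for the symbolic passwords.

```python
import struct

# Toy 32-bit stack: a dict mapping address -> byte. Not Angr, just arithmetic.
memory = {}


def stack_push(esp, value):
    """Mimic a 4-byte little-endian push: decrement esp, then store."""
    esp -= 4
    for offset, byte in enumerate(struct.pack("<I", value)):
        memory[esp + offset] = byte
    return esp


def load32(address):
    """Read a 4-byte little-endian value back from the toy memory."""
    return struct.unpack("<I", bytes(memory[address + i] for i in range(4)))[0]


esp = 0x7FFF0000          # hypothetical initial stack pointer
ebp = esp                 # initial_state.regs.ebp = initial_state.regs.esp
esp -= 0x8                # padding_length_in_bytes = 0x8
esp = stack_push(esp, 0x11111111)   # stands in for password0
esp = stack_push(esp, 0x22222222)   # stands in for password1

print(hex(ebp - esp))                 # total distance from ebp to esp
print(hex(load32(ebp - 0xC)))         # what the program reads as password0
print(hex(load32(ebp - 0x10)))        # what the program reads as password1
```

With the padding in place, the first push lands at ebp - 0xc and the second at ebp - 0x10, which are the two addresses passed to scanf.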

Symbolic execution

simulation = project.factory.simgr(initial_state)

05_angr_symbolic_memory

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/05_angr_symbolic_memory$ python3 generate.py 1234 05_angr_symbolic_memory
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/05_angr_symbolic_memory$ ./05_angr_symbolic_memory
Enter the password: aaaaa
aaaaa
aaaaa
aaaaa
Try again.

Analysis

View in IDA

.text:080492BB ; __unwind {
.text:080492BB                 endbr32
.text:080492BF                 lea     ecx, [esp+4]
.text:080492C3                 and     esp, 0FFFFFFF0h
.text:080492C6                 push    dword ptr [ecx-4]
.text:080492C9                 push    ebp
.text:080492CA                 mov     ebp, esp
.text:080492CC                 push    ecx
.text:080492CD                 sub     esp, 14h
.text:080492D0                 sub     esp, 4
.text:080492D3                 push    21h ; '!'       ; n
.text:080492D5                 push    0               ; c
.text:080492D7                 push    offset user_input ; s
.text:080492DC                 call    _memset         ; initialize the memory
.text:080492E1                 add     esp, 10h
.text:080492E4                 sub     esp, 0Ch
.text:080492E7                 push    offset aEnterThePasswo ; "Enter the password: "
.text:080492EC                 call    _printf
.text:080492F1                 add     esp, 10h
.text:080492F4                 sub     esp, 0Ch
.text:080492F7                 push    offset unk_8134378
.text:080492FC                 push    offset unk_8134370
.text:08049301                 push    offset unk_8134368
.text:08049306                 push    offset user_input ; the input is stored in the buffer initialized above
.text:0804930B                 push    offset a8s8s8s8s ; "%8s %8s %8s %8s"
.text:08049310                 call    ___isoc99_scanf
.text:08049315                 add     esp, 20h
.text:08049318                 mov     [ebp+var_C], 0
.text:0804931F                 jmp     short loc_804934E
.text:08049321 ; ---------------------------------------------------------------------------
.text:08049321
.text:08049321 loc_8049321:                            ; CODE XREF: main+97↓j
.text:08049321                 mov     eax, [ebp+var_C]
.text:08049324                 add     eax, 8134360h
.text:08049329                 movzx   eax, byte ptr [eax]
.text:0804932C                 movsx   eax, al
.text:0804932F                 sub     esp, 8
.text:08049332                 push    [ebp+var_C]
.text:08049335                 push    eax             ; transform the input stored in memory
.text:08049336                 call    complex_function
.text:0804933B                 add     esp, 10h
.text:0804933E                 mov     edx, eax
.text:08049340                 mov     eax, [ebp+var_C]
.text:08049343                 add     eax, 8134360h
.text:08049348                 mov     [eax], dl
.text:0804934A                 add     [ebp+var_C], 1
.text:0804934E
.text:0804934E loc_804934E:                            ; CODE XREF: main+64↑j
.text:0804934E                 cmp     [ebp+var_C], 1Fh
.text:08049352                 jle     short loc_8049321
.text:08049354                 sub     esp, 4
.text:08049357                 push    20h ; ' '       ; n
.text:08049359                 push    offset s2       ; "HXUITWOAPNESFFEGWWORQMFUTTYBKKFF"
.text:0804935E                 push    offset user_input ; s1
.text:08049363                 call    _strncmp
.text:08049368                 add     esp, 10h
.text:0804936B                 test    eax, eax
.text:0804936D                 jz      short loc_8049381
.text:0804936F                 sub     esp, 0Ch
.text:08049372                 push    offset s        ; "Try again."
.text:08049377                 call    _puts
.text:0804937C                 add     esp, 10h
.text:0804937F                 jmp     short loc_8049391
.text:08049381 ; ---------------------------------------------------------------------------
.text:08049381
.text:08049381 loc_8049381:                            ; CODE XREF: main+B2↑j
.text:08049381                 sub     esp, 0Ch
.text:08049384                 push    offset aGoodJob ; "Good Job."
.text:08049389                 call    _puts
.text:0804938E                 add     esp, 10h
.text:08049391
.text:08049391 loc_8049391:                            ; CODE XREF: main+C4↑j
.text:08049391                 mov     eax, 0
.text:08049396                 mov     ecx, [ebp+var_4]
.text:08049399                 leave
.text:0804939A                 lea     esp, [ecx-4]
.text:0804939D                 retn
.text:0804939D ; } // starts at 80492BB
.text:0804939D main            endp
.text:0804939D
.text:0804939D ; ---------------------------------------------------------------------------

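This challenge reads four inputs, so we cannot rely on Angr to handle scanf automatically. The input is also not stored on the stack; it lives in a block of global memory.

Looking at the assembly around the call to complex_function(), we can see that the address of that memory is a constant: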

.text:08049321                 mov     eax, [ebp+var_C]
.text:08049324                 add     eax, 8134360h
.text:08049329                 movzx   eax, byte ptr [eax]
.text:0804932C                 movsx   eax, al
.text:0804932F                 sub     esp, 8
.text:08049332                 push    [ebp+var_C]
.text:08049335                 push    eax             ; transform the input stored in memory
.text:08049336                 call    complex_function

So the address of the memory that holds the input is 0x8134360.
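
The four scanf destinations in the listing (user_input, unk_8134368, unk_8134370, unk_8134378) are spaced 8 bytes apart, one per "%8s". A short sketch of that layout follows; the constant and variable names are ours, only the addresses come from the disassembly.

```python
# Addresses taken from the IDA listing; the variable names are illustrative.
USER_INPUT_BASE = 0x8134360   # first scanf destination (user_input)
CHUNK_BYTES = 8               # each "%8s" fills up to 8 characters

# One destination per "%8s" in the format string "%8s %8s %8s %8s".
password_addresses = [USER_INPUT_BASE + i * CHUNK_BYTES for i in range(4)]
total_bytes = CHUNK_BYTES * len(password_addresses)

for index, address in enumerate(password_addresses):
    print(f"password{index} -> {address:#x}")
print(f"bytes compared by strncmp: {total_bytes:#x}")
```

The total of 0x20 bytes matches the length pushed for strncmp and the loop bound of 0x1F in main.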

Solving with Angr

Code

import angr
import claripy
import sys

def main(argv):
  # Get the binary and create an Angr project for it
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

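  # Address at which symbolic execution starts
  start_address = 0x08049315     # Must be after memset: if we started before it, the buffer would be zeroed again and the injected symbols lost
                                 # Must be after scanf: if we started before it, the state would still hold the freshly initialized buffer and scanf would overwrite it
                                 # Do we still need to care about the stack here? Testing shows we do not: starting on the add or after it both work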
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # The binary is calling scanf("%8s %8s %8s %8s").
  # (!)
  password_len_bits = 64 # 8 characters at 8 bits each, 64 bits in total
  password0 = claripy.BVS('password0', password_len_bits)
  password1 = claripy.BVS('password1', password_len_bits)
  password2 = claripy.BVS('password2', password_len_bits)
  password3 = claripy.BVS('password3', password_len_bits)

  # Determine the address of the global variable to which scanf writes the user
  # input. The function 'initial_state.memory.store(address, value)' will write
  # 'value' (a bitvector) to 'address' (a memory location, as an integer.) The
  # 'address' parameter can also be a bitvector (and can be symbolic!).
  # (!)
  password0_address = 0x8134360
  initial_state.memory.store(password0_address, password0)
  initial_state.memory.store(password0_address + 0x8, password1)
  initial_state.memory.store(password0_address + 0x10, password2)
  initial_state.memory.store(password0_address + 0x18, password3)

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    # Solve for the symbolic values. We are trying to solve for a string.
    # Therefore, we will use eval, with named parameter cast_to=bytes
    # which returns bytes that can be decoded to a string instead of an integer.
    # (!)
    # cast_to=bytes renders the value as raw bytes; decode() then turns them into a str
    solution0 = solution_state.solver.eval(password0,cast_to=bytes).decode()
    solution1 = solution_state.solver.eval(password1,cast_to=bytes).decode()
    solution2 = solution_state.solver.eval(password2,cast_to=bytes).decode()
    solution3 = solution_state.solver.eval(password3,cast_to=bytes).decode()
    solution = solution0 + solution1 + solution2 + solution3

    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Code with the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  start_address = 0x08049318
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  password_len_bits = 64
  password0 = claripy.BVS('password0', password_len_bits)
  password1 = claripy.BVS('password1', password_len_bits)
  password2 = claripy.BVS('password2', password_len_bits)
  password3 = claripy.BVS('password3', password_len_bits)

  password0_address = 0x8134360
  initial_state.memory.store(password0_address, password0)
  initial_state.memory.store(password0_address + 0x8, password1)
  initial_state.memory.store(password0_address + 0x10, password2)
  initial_state.memory.store(password0_address + 0x18, password3)

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution0 = solution_state.solver.eval(password0,cast_to=bytes).decode()
    solution1 = solution_state.solver.eval(password1,cast_to=bytes).decode()
    solution2 = solution_state.solver.eval(password2,cast_to=bytes).decode()
    solution3 = solution_state.solver.eval(password3,cast_to=bytes).decode()
    solution = solution0 + solution1 + solution2 + solution3

    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)
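
The cast_to=bytes step in the listing above can be pictured without Angr. As far as we understand it, the solver yields one integer per 64-bit symbol and cast_to=bytes renders that integer as 8 bytes, most significant byte first. The helper name and the sample value below are made up for illustration; this is a standard-library model of that behaviour, not Angr's implementation.

```python
def bitvector_value_to_text(value, size_bits=64):
    """Render an integer solution as text, most significant byte first."""
    raw = value.to_bytes(size_bits // 8, byteorder="big")
    return raw.decode()


# Hypothetical solver result: the integer whose bytes spell "ABCDEFGH".
sample_value = 0x4142434445464748
print(bitvector_value_to_text(sample_value))
```

This is also why the symbol stored at the lowest address corresponds to the first characters of the input.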

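Where to start symbolic execution

memset()

The user input is written after memset(), so symbolic execution has to start after memset().

scanf()

If we injected the symbols before scanf(), the state Angr reconstructs would also be the one from before scanf(), when the memory still reads 00 00 00 00 and so on. Symbolic execution therefore has to start after scanf().

`add esp, 20h`

Do we have to start after this instruction? This level does not inject symbols onto the stack, so it makes no difference whether we start on it or after it. Testing confirms this.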


start_address = 0x08049315   # this is add     esp, 20h
start_address = 0x08049318   # the instruction right after add     esp, 20h
# both yield the correct flag

06_angr_symbolic_dynamic_memory

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/06_angr_symbolic_dynamic_memory$ python3 generate.py 1234 06_angr_symbolic_dynamic_memory
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/06_angr_symbolic_dynamic_memory$ ./06_angr_symbolic_dynamic_memory
Enter the password: aaaaaaaaaaaa
Try again.

Analysis

Analyze with IDA

; int __cdecl main(int argc, const char **argv, const char **envp)
.text:080492FF                 public main
.text:080492FF main            proc near               ; DATA XREF: _start+2A↑o
.text:080492FF
.text:080492FF i               = dword ptr -0Ch
.text:080492FF var_4           = dword ptr -4
.text:080492FF argc            = dword ptr  8
.text:080492FF argv            = dword ptr  0Ch
.text:080492FF envp            = dword ptr  10h
.text:080492FF
.text:080492FF ; __unwind {
.text:080492FF                 endbr32
.text:08049303                 lea     ecx, [esp+4]
.text:08049307                 and     esp, 0FFFFFFF0h
.text:0804930A                 push    dword ptr [ecx-4]
.text:0804930D                 push    ebp
.text:0804930E                 mov     ebp, esp
.text:08049310                 push    ecx
.text:08049311                 sub     esp, 14h
.text:08049314                 sub     esp, 0Ch
.text:08049317                 push    9               ; size
.text:08049319                 call    _malloc         ; allocate memory
.text:0804931E                 add     esp, 10h
.text:08049321                 mov     ds:buffer0, eax
.text:08049326                 sub     esp, 0Ch
.text:08049329                 push    9               ; size
.text:0804932B                 call    _malloc         ; allocate memory
.text:08049330                 add     esp, 10h
.text:08049333                 mov     ds:buffer1, eax
.text:08049338                 mov     eax, ds:buffer0
.text:0804933D                 sub     esp, 4
.text:08049340                 push    9               ; n
.text:08049342                 push    0               ; c
.text:08049344                 push    eax             ; s
.text:08049345                 call    _memset         ; initialize the memory
.text:0804934A                 add     esp, 10h
.text:0804934D                 mov     eax, ds:buffer1
.text:08049352                 sub     esp, 4
.text:08049355                 push    9               ; n
.text:08049357                 push    0               ; c
.text:08049359                 push    eax             ; s
.text:0804935A                 call    _memset         ; initialize the memory
.text:0804935F                 add     esp, 10h
.text:08049362                 sub     esp, 0Ch
.text:08049365                 push    offset aEnterThePasswo ; "Enter the password: "
.text:0804936A                 call    _printf
.text:0804936F                 add     esp, 10h
.text:08049372                 mov     edx, ds:buffer1
.text:08049378                 mov     eax, ds:buffer0
.text:0804937D                 sub     esp, 4
.text:08049380                 push    edx
.text:08049381                 push    eax
.text:08049382                 push    offset a8s8s    ; "%8s %8s"
.text:08049387                 call    ___isoc99_scanf ; user input
.text:0804938C                 add     esp, 10h
.text:0804938F                 mov     [ebp+i], 0
.text:08049396                 jmp     short loc_8049402
.text:08049398 ; ---------------------------------------------------------------------------
.text:08049398
.text:08049398 loc_8049398:                            ; CODE XREF: main+107↓j
.text:08049398                 mov     edx, ds:buffer0
.text:0804939E                 mov     eax, [ebp+i]
.text:080493A1                 add     eax, edx
.text:080493A3                 movzx   eax, byte ptr [eax]
.text:080493A6                 movsx   eax, al
.text:080493A9                 sub     esp, 8
.text:080493AC                 push    [ebp+i]
.text:080493AF                 push    eax
.text:080493B0                 call    complex_function ; transform password0
.text:080493B5                 add     esp, 10h
.text:080493B8                 mov     ecx, eax
.text:080493BA                 mov     edx, ds:buffer0
.text:080493C0                 mov     eax, [ebp+i]
.text:080493C3                 add     eax, edx
.text:080493C5                 mov     edx, ecx
.text:080493C7                 mov     [eax], dl
.text:080493C9                 mov     eax, [ebp+i]
.text:080493CC                 lea     edx, [eax+20h]
.text:080493CF                 mov     ecx, ds:buffer1
.text:080493D5                 mov     eax, [ebp+i]
.text:080493D8                 add     eax, ecx
.text:080493DA                 movzx   eax, byte ptr [eax]
.text:080493DD                 movsx   eax, al
.text:080493E0                 sub     esp, 8
.text:080493E3                 push    edx
.text:080493E4                 push    eax
.text:080493E5                 call    complex_function ; transform password1
.text:080493EA                 add     esp, 10h
.text:080493ED                 mov     ecx, eax
.text:080493EF                 mov     edx, ds:buffer1
.text:080493F5                 mov     eax, [ebp+i]
.text:080493F8                 add     eax, edx
.text:080493FA                 mov     edx, ecx
.text:080493FC                 mov     [eax], dl
.text:080493FE                 add     [ebp+i], 1
.text:08049402
.text:08049402 loc_8049402:                            ; CODE XREF: main+97↑j
.text:08049402                 cmp     [ebp+i], 7
.text:08049406                 jle     short loc_8049398
.text:08049408                 mov     eax, ds:buffer0
.text:0804940D                 sub     esp, 4
.text:08049410                 push    8               ; n
.text:08049412                 push    offset s2       ; "HXUITWOA"
.text:08049417                 push    eax             ; s1
.text:08049418                 call    _strncmp        ; check
.text:0804941D                 add     esp, 10h
.text:08049420                 test    eax, eax
.text:08049422                 jnz     short loc_8049440
.text:08049424                 mov     eax, ds:buffer1
.text:08049429                 sub     esp, 4
.text:0804942C                 push    8               ; n
.text:0804942E                 push    offset aPnesffeg ; "PNESFFEG"
.text:08049433                 push    eax             ; s1
.text:08049434                 call    _strncmp        ; check
.text:08049439                 add     esp, 10h
.text:0804943C                 test    eax, eax
.text:0804943E                 jz      short loc_8049452
.text:08049440
.text:08049440 loc_8049440:                            ; CODE XREF: main+123↑j
.text:08049440                 sub     esp, 0Ch
.text:08049443                 push    offset s        ; "Try again."
.text:08049448                 call    _puts
.text:0804944D                 add     esp, 10h
.text:08049450                 jmp     short loc_8049462
.text:08049452 ; ---------------------------------------------------------------------------
.text:08049452
.text:08049452 loc_8049452:                            ; CODE XREF: main+13F↑j
.text:08049452                 sub     esp, 0Ch
.text:08049455                 push    offset aGoodJob ; "Good Job."
.text:0804945A                 call    _puts
.text:0804945F                 add     esp, 10h
.text:08049462
.text:08049462 loc_8049462:                            ; CODE XREF: main+151↑j
.text:08049462                 mov     eax, ds:buffer0
.text:08049467                 sub     esp, 0Ch
.text:0804946A                 push    eax             ; ptr
.text:0804946B                 call    _free
.text:08049470                 add     esp, 10h
.text:08049473                 mov     eax, ds:buffer1
.text:08049478                 sub     esp, 0Ch
.text:0804947B                 push    eax             ; ptr
.text:0804947C                 call    _free
.text:08049481                 add     esp, 10h
.text:08049484                 mov     eax, 0
.text:08049489                 mov     ecx, [ebp+var_4]
.text:0804948C                 leave
.text:0804948D                 lea     esp, [ecx-4]
.text:08049490                 retn
.text:08049490 ; } // starts at 80492FF
.text:08049490 main            endp
.text:08049490
.text:08049490 ; ---------------------------------------------------------------------------

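This level differs from the previous one in that the memory is no longer at a fixed address; it is allocated with malloc. So we cannot inject the symbols directly at a fixed address as we did before.

Although the address returned by malloc is not known in advance, it is stored in a pointer variable (global or local), and the address of that variable is fixed, so we can reach the memory through it.

Angr offers a simpler approach. The only purpose of malloc here is to obtain a block of memory, so we can pick (fake) an unused block of memory to stand in for the one malloc would return. That keeps the addresses used afterwards fixed, which is much simpler. All we have to do is overwrite the pointer variable with the address of our fake block.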
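
The idea can be sketched with a toy memory before touching Angr. Everything below is an illustrative stand-in: `memory` is a plain dict rather than real memory, the helper name is ours, and only the two addresses are the ones used in the solution.

```python
# Toy model: memory is a dict of address -> value. Not Angr, illustration only.
BUFFER0_VARIABLE = 0x0BB9CE00   # fixed address of the global pointer variable
FAKE_HEAP = 0x0804C100          # unused memory we choose ourselves

memory = {}

# The skipped malloc call would have stored some runtime address here.
# We overwrite the pointer with an address of our own choosing instead.
memory[BUFFER0_VARIABLE] = FAKE_HEAP

# Place the (stand-in) input at the fake heap address.
memory[FAKE_HEAP] = "password0"


def read_through_pointer(pointer_variable_address):
    """Follow the global pointer, as 'mov eax, ds:buffer0' plus a load would."""
    target = memory[pointer_variable_address]
    return memory[target]


print(read_through_pointer(BUFFER0_VARIABLE))
```

Because the program always goes through the pointer variable, it never notices that the block did not come from malloc.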

Solving with Angr

Code

import angr
import claripy
import sys

def main(argv):
  # Create the Angr project
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Address at which symbolic execution starts
  start_address = 0x0804938F # right after scanf and its add esp, 10h
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # The binary is calling scanf("%8s %8s").
  # (!)
  password_len_bits = 64 # eight characters at 8 bits each
  password0 = claripy.BVS('password0', password_len_bits)
  password1 = claripy.BVS('password1', password_len_bits)

  # Instead of telling the binary to write to the address of the memory
  # allocated with malloc, we can simply fake an address to any unused block of
  # memory and overwrite the pointer to the data. This will point the pointer
  # with the address of pointer_to_malloc_memory_address0 to fake_heap_address.
  # Be aware, there is more than one pointer! Analyze the binary to determine
  # global location of each pointer.
  # Note: by default, Angr stores integers in memory with big-endianness. To
  # specify to use the endianness of your architecture, use the parameter
  # endness=project.arch.memory_endness. On x86, this is little-endian.
  # (!)
  # In short: main keeps the address returned by malloc in a variable (global
  # or local). We first find the address of that variable, then overwrite the
  # value stored there (originally the malloc'd address) with the address of
  # an unused block of memory.
  fake_heap_address0 = 0x0804C100 # I started with memory in the .bss segment
  pointer_to_malloc_memory_address0 = 0x0BB9CE00    # address of buffer0; buffer0 holds the address returned by malloc, and we overwrite it with the address of another block
  initial_state.memory.store(pointer_to_malloc_memory_address0, fake_heap_address0, endness=project.arch.memory_endness)
  fake_heap_address1 = 0x804C108
  pointer_to_malloc_memory_address1 = 0x0BB9CE08
  initial_state.memory.store(pointer_to_malloc_memory_address1, fake_heap_address1, endness=project.arch.memory_endness)

  # Store our symbolic values at our fake_heap_address. Look at the binary to
  # determine the offsets from the fake_heap_address where scanf writes.
  # (!)


  initial_state.memory.store(fake_heap_address0, password0)
  initial_state.memory.store(fake_heap_address1, password1)

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution0 = solution_state.solver.eval(password0,cast_to=bytes).decode()
    solution1 = solution_state.solver.eval(password1,cast_to=bytes).decode()
    solution = solution0 + solution1

    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Code with the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  start_address = 0x0804938F
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # The binary is calling scanf("%8s %8s").
  # (!)
  password_len_bits = 64
  password0 = claripy.BVS('password0', password_len_bits)
  password1 = claripy.BVS('password1', password_len_bits)

  fake_heap_address0 = 0x0804C100
  pointer_to_malloc_memory_address0 = 0x0BB9CE00
  initial_state.memory.store(pointer_to_malloc_memory_address0, fake_heap_address0, endness=project.arch.memory_endness)
  fake_heap_address1 = 0x804C108
  pointer_to_malloc_memory_address1 = 0x0BB9CE08
  initial_state.memory.store(pointer_to_malloc_memory_address1, fake_heap_address1, endness=project.arch.memory_endness)

  # Store our symbolic values at our fake_heap_address. Look at the binary to
  # determine the offsets from the fake_heap_address where scanf writes.

  initial_state.memory.store(fake_heap_address0, password0)
  initial_state.memory.store(fake_heap_address1, password1)

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution0 = solution_state.solver.eval(password0,cast_to=bytes).decode()
    solution1 = solution_state.solver.eval(password1,cast_to=bytes).decode()
    solution = solution0 + solution1

    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

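Choosing the start address

The reasoning is much the same as in the previous level; see that level for details.

Creating the symbols

The flag we are after is two 8-character inputs, so we need to create two 64-bit symbols.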

1
2
3

password_len_bits = 64
password0 = claripy.BVS('password0', password_len_bits)
password1 = claripy.BVS('password1', password_len_bits)

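Replacing the memory

The address 0x0BB9CE00 holds the pointer returned by malloc. We overwrite it with an unused address, 0x0804C100, so that later memory accesses go to our fake buffer instead of the memory allocated by malloc. The second pointer is replaced in the same way.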

fake_heap_address0 = 0x0804C100
pointer_to_malloc_memory_address0 = 0x0BB9CE00
initial_state.memory.store(pointer_to_malloc_memory_address0, fake_heap_address0, endness=project.arch.memory_endness)

fake_heap_address1 = 0x804C108
pointer_to_malloc_memory_address1 = 0x0BB9CE08
initial_state.memory.store(pointer_to_malloc_memory_address1, fake_heap_address1, endness=project.arch.memory_endness)
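The `endness=project.arch.memory_endness` argument matters because the stored value is a pointer that the x86 program reads back as a little-endian 32-bit integer, while angr's raw memory stores default to big-endian byte order. The following is a minimal standard-library sketch of the byte layout involved; it does not use angr, and `pointer_bytes` and `wrong` are names introduced here purely for illustration.

```python
# Standard-library sketch of the byte order involved; no angr required.
fake_heap_address0 = 0x0804C100

# x86 is little-endian: the least significant byte comes first in memory.
pointer_bytes = fake_heap_address0.to_bytes(4, byteorder="little")
print(pointer_bytes.hex(" "))  # 00 c1 04 08

# Reading those four bytes back as a little-endian integer, the way the
# program does when it loads the pointer, recovers the fake heap address.
assert int.from_bytes(pointer_bytes, byteorder="little") == fake_heap_address0

# If the bytes were laid out big-endian instead, the program would read
# back a different (wrong) address.
wrong = int.from_bytes(fake_heap_address0.to_bytes(4, "big"), "little")
print(hex(wrong))  # 0xc10408
```

The symbolic passwords, by contrast, are stored without an `endness` argument: they are plain byte strings, so the default byte order keeps the first character at the lowest address.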

Injecting the symbols

Inject the symbols into the fake memory:

initial_state.memory.store(fake_heap_address0, password0)
initial_state.memory.store(fake_heap_address1, password1)

Symbolic execution

simulation.explore(find=is_successful, avoid=should_abort)

Result

Running scaffold06 reports an error, but the flag is still solved.

07_angr_symbolic_file

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/07_angr_symbolic_file$ python3 generate.py 1234 07_angr_symbolic_file
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/07_angr_symbolic_file$ ./07_angr_symbolic_file
Enter the password: aaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/07_angr_symbolic_file$

Analysis

Analyze with IDA

.text:080494F1 ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:080494F1 public main
.text:080494F1 main proc near                          ; DATA XREF: _start+2A↑o
.text:080494F1
.text:080494F1 var_1C= dword ptr -1Ch
.text:080494F1 argc= dword ptr  8
.text:080494F1 argv= dword ptr  0Ch
.text:080494F1 envp= dword ptr  10h
.text:080494F1
.text:080494F1 ; __unwind {
.text:080494F1 endbr32
.text:080494F5 lea     ecx, [esp+4]
.text:080494F9 and     esp, 0FFFFFFF0h
.text:080494FC push    dword ptr [ecx-4]
.text:080494FF push    ebp
.text:08049500 mov     ebp, esp
.text:08049502 push    edi
.text:08049503 push    esi
.text:08049504 push    ecx
.text:08049505 sub     esp, 1Ch
.text:08049508 sub     esp, 4
.text:0804950B push    40h ; '@'                       ; n
.text:0804950D push    0                               ; c
.text:0804950F push    offset buffer                   ; s
.text:08049514 call    _memset                         ; zero out the buffer
.text:08049519 add     esp, 10h
.text:0804951C sub     esp, 0Ch
.text:0804951F push    offset aEnterThePasswo          ; "Enter the password: "
.text:08049524 call    _printf                         ; print the prompt
.text:08049529 add     esp, 10h
.text:0804952C sub     esp, 8
.text:0804952F push    offset buffer
.text:08049534 push    offset a64s                     ; "%64s"
.text:08049539 call    ___isoc99_scanf                 ; read the input string
.text:0804953E add     esp, 10h
.text:08049541 sub     esp, 8
.text:08049544 push    40h ; '@'                       ; n
.text:08049546 push    offset buffer                   ; int
.text:0804954B call    ignore_me                       ; explained in the scaffold: it writes the console input to the file only for consistency with other levels, and can be ignored
.text:08049550 add     esp, 10h
.text:08049553 sub     esp, 4
.text:08049556 push    40h ; '@'                       ; n
.text:08049558 push    0                               ; c
.text:0804955A push    offset buffer                   ; s
.text:0804955F call    _memset                         ; zero out the buffer again
.text:08049564 add     esp, 10h
.text:08049567 sub     esp, 8
.text:0804956A push    offset aRb                      ; "rb"
.text:0804956F push    offset name                     ; "PNESFFEG.txt"
.text:08049574 call    _fopen                          ; open PNESFFEG.txt
.text:08049579 add     esp, 10h
.text:0804957C mov     ds:fp, eax
.text:08049581 mov     eax, ds:fp
.text:08049586 push    eax                             ; stream
.text:08049587 push    40h ; '@'                       ; n
.text:08049589 push    1                               ; size
.text:0804958B push    offset buffer                   ; ptr
.text:08049590 call    _fread                          ; read data from PNESFFEG.txt
.text:08049595 add     esp, 10h
.text:08049598 mov     eax, ds:fp
.text:0804959D sub     esp, 0Ch
.text:080495A0 push    eax                             ; stream
.text:080495A1 call    _fclose                         ; close PNESFFEG.txt
.text:080495A6 add     esp, 10h
.text:080495A9 sub     esp, 0Ch
.text:080495AC push    offset name                     ; "PNESFFEG.txt"
.text:080495B1 call    _unlink                         ; delete PNESFFEG.txt
.text:080495B6 add     esp, 10h
.text:080495B9 mov     [ebp+var_1C], 0
.text:080495C0 jmp     short loc_80495EF
.text:080495C2 ; ---------------------------------------------------------------------------
.text:080495C2
.text:080495C2 loc_80495C2:                            ; CODE XREF: main+102↓j
.text:080495C2 mov     eax, [ebp+var_1C]
.text:080495C5 add     eax, 804C0A0h
.text:080495CA movzx   eax, byte ptr [eax]
.text:080495CD movsx   eax, al
.text:080495D0 sub     esp, 8
.text:080495D3 push    [ebp+var_1C]
.text:080495D6 push    eax                             ; transform the first 8 bytes that were read
.text:080495D7 call    complex_function
.text:080495DC add     esp, 10h
.text:080495DF mov     edx, eax
.text:080495E1 mov     eax, [ebp+var_1C]
.text:080495E4 add     eax, 804C0A0h
.text:080495E9 mov     [eax], dl
.text:080495EB add     [ebp+var_1C], 1
.text:080495EF
.text:080495EF loc_80495EF:                            ; CODE XREF: main+CF↑j
.text:080495EF cmp     [ebp+var_1C], 7
.text:080495F3 jle     short loc_80495C2
.text:080495F5 mov     edx, offset buffer              ; the key comparison
.text:080495FA mov     eax, offset aHxuitwoa           ; "HXUITWOA"
.text:080495FF mov     ecx, 9
.text:08049604 mov     esi, edx
.text:08049606 mov     edi, eax
.text:08049608 repe cmpsb
.text:0804960A setnbe  dl
.text:0804960D setb    al
.text:08049610 sub     edx, eax
.text:08049612 mov     eax, edx
.text:08049614 movsx   eax, al
.text:08049617 test    eax, eax
.text:08049619 jz      short loc_8049635
.text:0804961B sub     esp, 0Ch
.text:0804961E push    offset s                        ; "Try again."
.text:08049623 call    _puts
.text:08049628 add     esp, 10h
.text:0804962B sub     esp, 0Ch
.text:0804962E push    1                               ; status
.text:08049630 call    _exit
.text:08049635 ; ---------------------------------------------------------------------------
.text:08049635
.text:08049635 loc_8049635:                            ; CODE XREF: main+128↑j
.text:08049635 sub     esp, 0Ch
.text:08049638 push    offset aGoodJob                 ; "Good Job."
.text:0804963D call    _puts
.text:08049642 add     esp, 10h
.text:08049645 sub     esp, 0Ch
.text:08049648 push    0                               ; status
.text:0804964A call    _exit
.text:0804964A ; } // starts at 80494F1
.text:0804964A main endp
.text:0804964A
.text:0804964A ; ---------------------------------------------------------------------------
.text:0804964F align 10h
.text:08049650
.text:08049650 ; =============== S U B R O U T I N E =======================================

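In fact, there is a simpler solution: inject the symbol directly at the fixed buffer address 0x804C0A0 and solve from there.

However, scaffold07 requires this level to be solved with a simulated file.

Solving with Angr

Code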

# This challenge could, in theory, be solved in multiple ways. However, for the
# sake of learning how to simulate an alternate filesystem, please solve this
# challenge according to structure provided below. As a challenge, once you have
# an initial solution, try solving this in an alternate way.
#
# Problem description and general solution strategy:
# The binary loads the password from a file using the fread function. If the
# password is correct, it prints "Good Job." In order to keep consistency with
# the other challenges, the input from the console is written to a file in the 
# ignore_me function. As the name suggests, ignore it, as it only exists to
# maintain consistency with other challenges.
# We want to:
# 1. Determine the file from which fread reads.
# 2. Use Angr to simulate a filesystem where that file is replaced with our own
#    simulated file.
# 3. Initialize the file with a symbolic value, which will be read with fread
#    and propagated through the program.
# 4. Solve for the symbolic input to determine the password.
import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  start_address = 0x08049564 # start right after the second memset
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # Specify some information needed to construct a simulated file. For this
  # challenge, the filename is hardcoded, but in theory, it could be symbolic. 
  # Note: to read from the file, the binary calls
  # 'fread(buffer, sizeof(char), 64, file)'.
  # (!)
  filename = "PNESFFEG.txt"  # :string, file name
  symbolic_file_size_bytes = 64 # file size in bytes

  # Construct a bitvector for the password and then store it in the file's
  # backing memory. For example, imagine a simple file, 'hello.txt':
  # Hello world, my name is John.
  # ^                       ^
  # ^ address 0             ^ address 24 (count the number of characters)
  # In order to represent this in memory, we would want to write the string to
  # the beginning of the file:
  # 
  # hello_txt_contents = claripy.BVV('Hello world, my name is John.', 30*8)
  #
  # Perhaps, then, we would want to replace John with a
  # symbolic variable. We would call:
  #
  # name_bitvector = claripy.BVS('symbolic_name', 4*8)
  #
  # Then, after the program calls fopen('hello.txt', 'r') and then
  # fread(buffer, sizeof(char), 30, hello_txt_file), the buffer would contain
  # the string from the file, except four symbolic bytes where the name would be
  # stored.
  # (!)
  # The symbolic bitvector is as large as the whole file.
  password = claripy.BVS('password', symbolic_file_size_bytes * 8)

  # Construct the symbolic file. The file_options parameter specifies the Linux
  # file permissions (read, read/write, execute etc.) The content parameter
  # specifies from where the stream of data should be supplied. If content is
  # an instance of SimSymbolicMemory (we constructed one above), the stream will
  # contain the contents (including any symbolic contents) of the memory,
  # beginning from address zero.
  # Set the content parameter to our BVS instance that holds the symbolic data.
  # (!)
  password_file = angr.storage.SimFile(filename, content=password, size = symbolic_file_size_bytes)  # an extra size argument was added here; remove it if it causes errors
  
  # Add the symbolic file we created to the symbolic filesystem.
  initial_state.fs.insert(filename, password_file)

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution = solution_state.solver.eval(password,cast_to=bytes).decode()

    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Code with the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  start_address = 0x08049564
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  filename = "PNESFFEG.txt"
  symbolic_file_size_bytes = 64

  password = claripy.BVS('password', symbolic_file_size_bytes * 8)

  password_file = angr.storage.SimFile(filename, content=password, size = symbolic_file_size_bytes)

  initial_state.fs.insert(filename, password_file)

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution = solution_state.solver.eval(password,cast_to=bytes).decode()

    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

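Steps for building the simulated filesystem

1. Prepare the file parameters

File name
File size

filename = "PNESFFEG.txt"
symbolic_file_size_bytes = 64

2. Create the symbolic bitvector

password = claripy.BVS('password', symbolic_file_size_bytes * 8)
# The input in this level is 64 characters long. Only the first eight are checked later, but we still create a 64-character symbolic bitvector to match the challenge.

3. Create the symbolic file

Using the parameters above (file name and file size) and injecting the symbol as the file's content, we have the three things needed to create a file.

Because this is a symbolic file, the constraints we ultimately solve are placed on the symbolic content of this file.

password_file = angr.storage.SimFile(filename, content=password, size = symbolic_file_size_bytes)

4. Add the symbolic file to the simulated filesystem

The first argument is the name of the symbolic file, and the second is the symbolic file itself.

initial_state.fs.insert(filename, password_file)

From this point on, whenever the binary reads the file, it reads from the simulated filesystem, so the data it gets contains our symbols (in this level, strictly speaking, all of it is symbolic). That data is our unknown x.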
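One practical consequence of using a 64-byte symbol: the disassembly compares only 9 bytes (`mov ecx, 9` before `repe cmpsb`), that is, the eight transformed characters plus the terminating NUL. The remaining bytes of the solved value are therefore unconstrained, and the solver may return anything for them. The sketch below shows one way to keep only the checked part before decoding; `extract_password` and `raw_solution` are hypothetical names and made-up data for illustration, not output from this binary.

```python
def extract_password(raw, checked_len=8):
    """Return only the bytes the binary actually checks, decoded as ASCII."""
    return raw[:checked_len].decode("ascii")

# Hypothetical solver output: 8 meaningful bytes, a NUL terminator,
# then arbitrary filler that would not decode cleanly as text.
raw_solution = b"ABCDEFGH" + b"\x00" + bytes([0xFF] * 55)

print(len(raw_solution))               # 64
print(extract_password(raw_solution))  # ABCDEFGH
```

Slicing before decoding avoids a UnicodeDecodeError when the unconstrained tail happens to contain bytes that are not valid text.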

08_angr_constraints

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/08_angr_constraints$ python3 generate.py 1234 08_angr_constraints
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/08_angr_constraints$ ./08_angr_constraints
Enter the password: aaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/08_angr_constraints$

Analysis

Analyze with IDA

Main function

.text:080492EA ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:080492EA                 public main
.text:080492EA main            proc near               ; DATA XREF: _start+2A↑o
.text:080492EA
.text:080492EA var_C           = dword ptr -0Ch
.text:080492EA var_4           = dword ptr -4
.text:080492EA argc            = dword ptr  8
.text:080492EA argv            = dword ptr  0Ch
.text:080492EA envp            = dword ptr  10h
.text:080492EA
.text:080492EA ; __unwind {
.text:080492EA                 endbr32
.text:080492EE                 lea     ecx, [esp+4]
.text:080492F2                 and     esp, 0FFFFFFF0h
.text:080492F5                 push    dword ptr [ecx-4]
.text:080492F8                 push    ebp
.text:080492F9                 mov     ebp, esp
.text:080492FB                 push    ecx
.text:080492FC                 sub     esp, 14h
.text:080492FF                 mov     ds:password, 49555848h ; these four dwords are all part of password (IDA split them); this initializes the reference data
.text:08049309                 mov     ds:dword_804C034, 414F5754h
.text:08049313                 mov     ds:dword_804C038, 53454E50h
.text:0804931D                 mov     ds:dword_804C03C, 47454646h
.text:08049327                 sub     esp, 4
.text:0804932A                 push    11h             ; n
.text:0804932C                 push    0               ; c
.text:0804932E                 push    offset buffer   ; s
.text:08049333                 call    _memset         ; zero out the buffer
.text:08049338                 add     esp, 10h
.text:0804933B                 sub     esp, 0Ch
.text:0804933E                 push    offset aEnterThePasswo ; "Enter the password: "
.text:08049343                 call    _printf
.text:08049348                 add     esp, 10h
.text:0804934B                 sub     esp, 8
.text:0804934E                 push    offset buffer
.text:08049353                 push    offset a16s     ; "%16s"
.text:08049358                 call    ___isoc99_scanf ; the input is stored in buffer
.text:0804935D                 add     esp, 10h
.text:08049360                 mov     [ebp+var_C], 0
.text:08049367                 jmp     short loc_804939E
.text:08049369 ; ---------------------------------------------------------------------------
.text:08049369
.text:08049369 loc_8049369:                            ; CODE XREF: main+B8↓j
.text:08049369                 mov     eax, 0Fh
.text:0804936E                 sub     eax, [ebp+var_C]
.text:08049371                 mov     edx, eax
.text:08049373                 mov     eax, [ebp+var_C]
.text:08049376                 add     eax, 804C040h   ; transform the input
.text:0804937B                 movzx   eax, byte ptr [eax]
.text:0804937E                 movsx   eax, al
.text:08049381                 sub     esp, 8
.text:08049384                 push    edx
.text:08049385                 push    eax
.text:08049386                 call    complex_function
.text:0804938B                 add     esp, 10h
.text:0804938E                 mov     edx, eax
.text:08049390                 mov     eax, [ebp+var_C]
.text:08049393                 add     eax, 804C040h
.text:08049398                 mov     [eax], dl
.text:0804939A                 add     [ebp+var_C], 1
.text:0804939E
.text:0804939E loc_804939E:                            ; CODE XREF: main+7D↑j
.text:0804939E                 cmp     [ebp+var_C], 0Fh
.text:080493A2                 jle     short loc_8049369
.text:080493A4                 sub     esp, 8
.text:080493A7                 push    10h
.text:080493A9                 push    offset buffer
.text:080493AE                 call    check_equals_HXUITWOAPNESFFEG ; the key check
.text:080493B3                 add     esp, 10h
.text:080493B6                 test    eax, eax        ; test the return value: if it is non-zero, print "Good Job."
.text:080493B8                 jnz     short loc_80493CC
.text:080493BA                 sub     esp, 0Ch
.text:080493BD                 push    offset s        ; "Try again."
.text:080493C2                 call    _puts
.text:080493C7                 add     esp, 10h
.text:080493CA                 jmp     short loc_80493DC
.text:080493CC ; ---------------------------------------------------------------------------
.text:080493CC
.text:080493CC loc_80493CC:                            ; CODE XREF: main+CE↑j
.text:080493CC                 sub     esp, 0Ch
.text:080493CF                 push    offset aGoodJob ; "Good Job."
.text:080493D4                 call    _puts
.text:080493D9                 add     esp, 10h
.text:080493DC
.text:080493DC loc_80493DC:                            ; CODE XREF: main+E0↑j
.text:080493DC                 mov     eax, 0
.text:080493E1                 mov     ecx, [ebp+var_4]
.text:080493E4                 leave
.text:080493E5                 lea     esp, [ecx-4]
.text:080493E8                 retn
.text:080493E8 ; } // starts at 80492EA
.text:080493E8 main            endp
.text:080493E8
.text:080493E8 ; ---------------------------------------------------------------------------
.text:080493E9                 align 10h
.text:080493F0
.text:080493F0 ; =============== S U B R O U T I N E =======================================

Stepping into check_equals_HXUITWOAPNESFFEG()

.text:08049298 ; _BOOL4 __cdecl check_equals_HXUITWOAPNESFFEG(int, unsigned int)
.text:08049298                 public check_equals_HXUITWOAPNESFFEG
.text:08049298 check_equals_HXUITWOAPNESFFEG proc near ; CODE XREF: main+C4↓p
.text:08049298
.text:08049298 var_8           = dword ptr -8
.text:08049298 var_4           = dword ptr -4
.text:08049298 arg_0           = dword ptr  8
.text:08049298 arg_4           = dword ptr  0Ch
.text:08049298
.text:08049298 ; __unwind {
.text:08049298                 endbr32
.text:0804929C                 push    ebp
.text:0804929D                 mov     ebp, esp
.text:0804929F                 sub     esp, 10h
.text:080492A2                 mov     [ebp+var_8], 0
.text:080492A9                 mov     [ebp+var_4], 0
.text:080492B0                 jmp     short loc_80492D4
.text:080492B2 ; ---------------------------------------------------------------------------
.text:080492B2
.text:080492B2 loc_80492B2:                            ; CODE XREF: check_equals_HXUITWOAPNESFFEG+42↓j
.text:080492B2                 mov     edx, [ebp+var_4]
.text:080492B5                 mov     eax, [ebp+arg_0]
.text:080492B8                 add     eax, edx
.text:080492BA                 movzx   edx, byte ptr [eax]
.text:080492BD                 mov     eax, [ebp+var_4]
.text:080492C0                 add     eax, 804C030h   ; the reference data
.text:080492C5                 movzx   eax, byte ptr [eax]
.text:080492C8                 cmp     dl, al          ; compare the reference data with the transformed input
.text:080492CA                 jnz     short loc_80492D0
.text:080492CC                 add     [ebp+var_8], 1
.text:080492D0
.text:080492D0 loc_80492D0:                            ; CODE XREF: check_equals_HXUITWOAPNESFFEG+32↑j
.text:080492D0                 add     [ebp+var_4], 1
.text:080492D4
.text:080492D4 loc_80492D4:                            ; CODE XREF: check_equals_HXUITWOAPNESFFEG+18↑j
.text:080492D4                 mov     eax, [ebp+var_4]
.text:080492D7                 cmp     [ebp+arg_4], eax
.text:080492DA                 ja      short loc_80492B2
.text:080492DC                 mov     eax, [ebp+var_8]
.text:080492DF                 cmp     eax, [ebp+arg_4]
.text:080492E2                 setz    al
.text:080492E5                 movzx   eax, al
.text:080492E8                 leave
.text:080492E9                 retn
.text:080492E9 ; } // starts at 8049298
.text:080492E9 check_equals_HXUITWOAPNESFFEG endp
.text:080492E9
.text:080492EA
.text:080492EA ; =============== S U B R O U T I N E =======================================

Here 0x804C030 is the address of the reference data, which sits just before buffer (0x804C040) in memory.
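The reference string never appears as a literal in main(); it is written as four 32-bit immediates. A short standard-library sketch, using the immediates from the listing above, shows how they decode to the string in the function name (`dwords` and `reference` are names chosen here for illustration):

```python
import struct

# Immediates from the four `mov ds:..., imm32` instructions in main().
dwords = [0x49555848, 0x414F5754, 0x53454E50, 0x47454646]

# "<4I" packs four unsigned 32-bit integers in little-endian order,
# which is how x86 lays them out in memory starting at 0x804C030.
reference = struct.pack("<4I", *dwords)

print(reference.decode("ascii"))  # HXUITWOAPNESFFEG
```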

Solving with Angr

Code

# The binary asks for a 16 character password to which it applies a complex
# function and then compares with a reference string with the function
# check_equals_[reference string]. (Decompile the binary and take a look at it!)
# The source code for this function is provided here. However, the reference
# string in your version will be different than AABBCCDDEEFFGGHH:
#
# #define REFERENCE_PASSWORD = "AABBCCDDEEFFGGHH";
# int check_equals_AABBCCDDEEFFGGHH(char* to_check, size_t length) {
#   uint32_t num_correct = 0;
#   for (int i=0; i<length; ++i) {
#     if (to_check[i] == REFERENCE_PASSWORD[i]) {
#       num_correct += 1;
#     }
#   }
#   return num_correct == length;
# }
# In this build, the password is transformed by complex_function() and then
# compared with the reference data in check_equals_HXUITWOAPNESFFEG().
# The calling code is shown below:
# ...
# 
# char* input = user_input();
# char* encrypted_input = complex_function(input);
# if (check_equals_AABBCCDDEEFFGGHH(encrypted_input, 16)) {
#   puts("Good Job.");
# } else {
#   puts("Try again.");
# }
#
# The function checks if *to_check == "AABBCCDDEEFFGGHH". Verify this yourself.
# While you, as a human, can easily determine that this function is equivalent
# to simply comparing the strings, the computer cannot. Instead the computer 
# would need to branch every time the if statement in the loop was called (16 
# times), resulting in 2^16 = 65,536 branches, which will take too long of a 
# time to evaluate for our needs.
# Why are there so many branches? According to the PPT tutorial that comes with
# Angr, whenever a state reaches an if whose condition is symbolic, it forks into
# two states. Stacking many such ifs causes path explosion, so the main goal of
# this level is to cut the branching down to an acceptable amount.
#
# We do not know how the complex_function works, but we want to find an input
# that, when modified by complex_function, will produce the string:
# AABBCCDDEEFFGGHH.
#
# In this puzzle, your goal will be to stop the program before this function is
# called and manually constrain the to_check variable to be equal to the
# password you identify by decompiling the binary. Since, you, as a human, know
# that if the strings are equal, the program will print "Good Job.", you can
# be assured that if the program can solve for an input that makes them equal,
# the input will be the correct password.
#
import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Set the start of symbolic execution. The input lives at a fixed address, so we start right after the scanf call and inject the symbol into that buffer ourselves, skipping scanf entirely.
  start_address = 0x08049360
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # Create the symbolic bitvector
  password_len_bits = 16 * 8     # 16 characters in total
  password = claripy.BVS('password', password_len_bits)

  # Inject the symbol
  password_address = 0x0804C040     # address of buffer
  initial_state.memory.store(password_address, password)     # store the symbol in the target memory

  # Create the simulation manager
  simulation = project.factory.simgr(initial_state)

  # Angr will not be able to reach the point at which the binary prints out
  # 'Good Job.'. We cannot use that as the target anymore. 
  # (!)

  # Stop symbolic execution here
  address_to_check_constraint = 0x080493AE     # the challenge asks us to stop before this function is called
  simulation.explore(find=address_to_check_constraint)

  # Add the constraint
  if simulation.found:
    solution_state = simulation.found[0]

    # Recall that we need to constrain the to_check parameter (see top) of the 
    # check_equals_ function. Determine the address that is being passed as the
    # parameter and load it into a bitvector so that we can constrain it.
    # (!)

    # The parameter to constrain
    constrained_parameter_address = 0x0804C040     # address of the parameter to constrain
    constrained_parameter_size_bytes = 16     # size of the parameter, in bytes
    constrained_parameter_bitvector = solution_state.memory.load(     # load it into a bitvector; this memory holds the data derived from the symbol we injected
      constrained_parameter_address,
      constrained_parameter_size_bytes
    )
    # We want to constrain the system to find an input that will make
    # constrained_parameter_bitvector equal the desired value.
    # (!)
    constrained_parameter_desired_value = "HXUITWOAPNESFFEG" # :string (encoded)

    # Specify a claripy expression (using Pythonic syntax) that tests whether
    # constrained_parameter_bitvector == constrained_parameter_desired_value.
    # Add the constraint to the state to let z3 attempt to find an input that
    # will make this expression true.
    solution_state.add_constraints(constrained_parameter_bitvector == constrained_parameter_desired_value)     # add the constraint using the bitvector loaded above

    # Solve for the constrained_parameter_bitvector.
    # (!)
    solution = solution_state.solver.eval(password, cast_to=bytes)

    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Code with the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  start_address = 0x08049360
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  password_len_bits = 16 * 8
  password = claripy.BVS('password', password_len_bits)

  password_address = 0x0804C040
  initial_state.memory.store(password_address, password)

  simulation = project.factory.simgr(initial_state)

  address_to_check_constraint = 0x080493AE
  simulation.explore(find=address_to_check_constraint)

  if simulation.found:
    solution_state = simulation.found[0]

    constrained_parameter_address = 0x0804C040
    constrained_parameter_size_bytes = 16
    constrained_parameter_bitvector = solution_state.memory.load(
      constrained_parameter_address,
      constrained_parameter_size_bytes
    )

    constrained_parameter_desired_value = "HXUITWOAPNESFFEG"
    solution_state.add_constraints(constrained_parameter_bitvector == constrained_parameter_desired_value)

    solution = solution_state.solver.eval(password, cast_to=bytes)

    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

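Because the check in this level is an if inside a loop, and each if splits a state in two, the number of branches grows exponentially. So we add the constraint by hand to solve for the flag.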
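To make the growth concrete, here is a small pure-Python model of the comparison loop. It is only a counting sketch, not how Angr is implemented: `explore` is a hypothetical helper that forks every path at each byte comparison, the way a symbolic executor would when the compared byte is symbolic.

```python
def explore(length):
    """Model the loop in check_equals: fork every path at each comparison.

    Returns (total_paths, successful_paths), where a path is successful
    when num_correct == length at the end of the loop.
    """
    paths = [0]  # each entry is the num_correct value carried by one path
    for _ in range(length):
        forked = []
        for num_correct in paths:
            forked.append(num_correct + 1)  # branch where the bytes match
            forked.append(num_correct)      # branch where the bytes differ
        paths = forked
    successful = sum(1 for num_correct in paths if num_correct == length)
    return len(paths), successful

print(explore(16))  # (65536, 1)
```

Only one of the 65,536 paths is the one we care about, which is why stopping before the loop and stating the equality directly is so much cheaper.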

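Steps for adding a constraint manually

1. Decide what to constrain and how

Before implementing a constraint with Angr, we need a clear idea of what to constrain and how.

To decide what to constrain, we model the check that the level itself performs.

In this level, the constrained object is the string after it has been transformed by complex_function, and the check compares it with the reference data, printing "Good Job" if they are equal. This is the constraint we need to reproduce.

2. Decide where symbolic execution starts

As in the previous two levels, symbolic execution should start right after scanf.

3. Create the symbol and inject it

The input is stored in a global variable, so we can inject the symbol directly.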

password_len_bits = 16 * 8
password = claripy.BVS('password', password_len_bits)

password_address = 0x0804C040  # address of the global variable
initial_state.memory.store(password_address, password)

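4. Remove the original check

The original check is too expensive to execute symbolically, which is why we replace it with our own constraint. Replacing it means not executing the original check at all and relying only on our constraint. In this level the original check is the check_equals_HXUITWOAPNESFFEG() function.

address_to_check_constraint = 0x080493AE  # stop before the check function is called
simulation.explore(find=address_to_check_constraint)

We skip the original check by stopping symbolic execution right before check_equals_HXUITWOAPNESFFEG() is called.

5. Add our own constraint

We read back the memory into which we injected the symbol (note that by now it has been transformed by complex_function) and add a constraint requiring it to equal the reference data.

Because the symbol was injected into this memory, constraining the memory in the current state amounts to constraining the symbolic bitvector after the complex_function transformation.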

if simulation.found:
  solution_state = simulation.found[0]

  constrained_parameter_address = 0x0804C040  # start reading from this address
  constrained_parameter_size_bytes = 16    # number of bytes to read
  constrained_parameter_bitvector = solution_state.memory.load(
    constrained_parameter_address,
    constrained_parameter_size_bytes
  )  # load the contents into the constrained_parameter_bitvector bitvector

  # Add the constraint
  constrained_parameter_desired_value = "HXUITWOAPNESFFEG"
  solution_state.add_constraints(constrained_parameter_bitvector == constrained_parameter_desired_value)
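A single 128-bit equality can stand in for the 16 byte comparisons because a raw memory load returns the bytes in address order, with the byte at the lowest address in the most significant position (Angr's default for raw loads). The sketch below checks that equivalence on concrete integers with the standard library only; `byte_at` is a hypothetical helper, and no Angr or claripy call is involved.

```python
reference = b"HXUITWOAPNESFFEG"

# Interpret the 16 bytes as one 128-bit number, first byte most significant.
as_int = int.from_bytes(reference, byteorder="big")

def byte_at(value, index, total_bytes=16):
    """Extract the byte that sits at offset `index` from the start address."""
    shift = 8 * (total_bytes - 1 - index)
    return (value >> shift) & 0xFF

# Offset 0 is the first character of the string.
print(chr(byte_at(as_int, 0)))  # H

# Equality of the whole value is the same as equality of every byte.
assert bytes(byte_at(as_int, i) for i in range(16)) == reference
```

Stated this way, the solver receives all 16 byte equalities at once instead of discovering them one fork at a time.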

6. Solve

Solve with z3:

solution = solution_state.solver.eval(password, cast_to=bytes)

09_angr_hooks

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/09_angr_hooks$ python3 generate.py 1234 09_angr_hooks
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/09_angr_hooks$ ./09_angr_hooks
Enter the password: aaaaaaaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/09_angr_hooks$

Analysis

Analyze with IDA

.text:0804930A                   ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:0804930A                                   public main
.text:0804930A                   main            proc near               ; DATA XREF: _start+2A↑o
.text:0804930A
.text:0804930A                   var_10          = dword ptr -10h
.text:0804930A                   var_C           = dword ptr -0Ch
.text:0804930A                   var_4           = dword ptr -4
.text:0804930A                   argc            = dword ptr  8
.text:0804930A                   argv            = dword ptr  0Ch
.text:0804930A                   envp            = dword ptr  10h
.text:0804930A
.text:0804930A                   ; __unwind {
.text:0804930A F3 0F 1E FB                       endbr32
.text:0804930E 8D 4C 24 04                       lea     ecx, [esp+4]
.text:08049312 83 E4 F0                          and     esp, 0FFFFFFF0h
.text:08049315 FF 71 FC                          push    dword ptr [ecx-4]
.text:08049318 55                                push    ebp
.text:08049319 89 E5                             mov     ebp, esp
.text:0804931B 51                                push    ecx
.text:0804931C 83 EC 14                          sub     esp, 14h
.text:0804931F C7 05 34 C0 04 08+                mov     ds:password, 49555848h ; reference data
.text:0804931F 48 58 55 49
.text:08049329 C7 05 38 C0 04 08+                mov     ds:dword_804C038, 414F5754h
.text:08049329 54 57 4F 41
.text:08049333 C7 05 3C C0 04 08+                mov     ds:dword_804C03C, 53454E50h
.text:08049333 50 4E 45 53
.text:0804933D C7 05 40 C0 04 08+                mov     ds:dword_804C040, 47454646h
.text:0804933D 46 46 45 47
.text:08049347 83 EC 04                          sub     esp, 4
.text:0804934A 6A 11                             push    11h             ; n
.text:0804934C 6A 00                             push    0               ; c
.text:0804934E 68 44 C0 04 08                    push    offset buffer   ; s
.text:08049353 E8 98 FD FF FF                    call    _memset         ; initialize buffer
.text:08049358 83 C4 10                          add     esp, 10h
.text:0804935B 83 EC 0C                          sub     esp, 0Ch
.text:0804935E 68 16 A0 04 08                    push    offset aEnterThePasswo ; "Enter the password: "
.text:08049363 E8 48 FD FF FF                    call    _printf
.text:08049368 83 C4 10                          add     esp, 10h
.text:0804936B 83 EC 08                          sub     esp, 8
.text:0804936E 68 44 C0 04 08                    push    offset buffer
.text:08049373 68 2B A0 04 08                    push    offset a16s     ; "%16s"
.text:08049378 E8 83 FD FF FF                    call    ___isoc99_scanf ; read user input
.text:0804937D 83 C4 10                          add     esp, 10h
.text:08049380 C7 45 F0 00 00 00+                mov     [ebp+var_10], 0
.text:08049380 00
.text:08049387 EB 35                             jmp     short loc_80493BE
.text:08049389                   ; ---------------------------------------------------------------------------
.text:08049389
.text:08049389                   loc_8049389:                            ; CODE XREF: main+B8↓j
.text:08049389 B8 12 00 00 00                    mov     eax, 12h
.text:0804938E 2B 45 F0                          sub     eax, [ebp+var_10]
.text:08049391 89 C2                             mov     edx, eax
.text:08049393 8B 45 F0                          mov     eax, [ebp+var_10]
.text:08049396 05 44 C0 04 08                    add     eax, 804C044h
.text:0804939B 0F B6 00                          movzx   eax, byte ptr [eax]
.text:0804939E 0F BE C0                          movsx   eax, al
.text:080493A1 83 EC 08                          sub     esp, 8
.text:080493A4 52                                push    edx
.text:080493A5 50                                push    eax
.text:080493A6 E8 AD FE FF FF                    call    complex_function ; transform the input
.text:080493AB 83 C4 10                          add     esp, 10h
.text:080493AE 89 C2                             mov     edx, eax
.text:080493B0 8B 45 F0                          mov     eax, [ebp+var_10]
.text:080493B3 05 44 C0 04 08                    add     eax, 804C044h
.text:080493B8 88 10                             mov     [eax], dl
.text:080493BA 83 45 F0 01                       add     [ebp+var_10], 1
.text:080493BE
.text:080493BE                   loc_80493BE:                            ; CODE XREF: main+7D↑j
.text:080493BE 83 7D F0 0F                       cmp     [ebp+var_10], 0Fh
.text:080493C2 7E C5                             jle     short loc_8049389
.text:080493C4 83 EC 08                          sub     esp, 8
.text:080493C7 6A 10                             push    10h
.text:080493C9 68 44 C0 04 08                    push    offset buffer
.text:080493CE E8 E5 FE FF FF                    call    check_equals_HXUITWOAPNESFFEG ; check the first input
.text:080493D3 83 C4 10                          add     esp, 10h
.text:080493D6 A3 58 C0 04 08                    mov     ds:equals, eax
.text:080493DB C7 45 F4 00 00 00+                mov     [ebp+var_C], 0
.text:080493DB 00
.text:080493E2 EB 31                             jmp     short loc_8049415
.text:080493E4                   ; ---------------------------------------------------------------------------
.text:080493E4
.text:080493E4                   loc_80493E4:                            ; CODE XREF: main+10F↓j
.text:080493E4 8B 45 F4                          mov     eax, [ebp+var_C]
.text:080493E7 8D 50 09                          lea     edx, [eax+9]
.text:080493EA 8B 45 F4                          mov     eax, [ebp+var_C]
.text:080493ED 05 34 C0 04 08                    add     eax, 804C034h
.text:080493F2 0F B6 00                          movzx   eax, byte ptr [eax]
.text:080493F5 0F BE C0                          movsx   eax, al
.text:080493F8 83 EC 08                          sub     esp, 8
.text:080493FB 52                                push    edx
.text:080493FC 50                                push    eax
.text:080493FD E8 56 FE FF FF                    call    complex_function
.text:08049402 83 C4 10                          add     esp, 10h
.text:08049405 89 C2                             mov     edx, eax
.text:08049407 8B 45 F4                          mov     eax, [ebp+var_C]
.text:0804940A 05 34 C0 04 08                    add     eax, 804C034h
.text:0804940F 88 10                             mov     [eax], dl
.text:08049411 83 45 F4 01                       add     [ebp+var_C], 1
.text:08049415
.text:08049415                   loc_8049415:                            ; CODE XREF: main+D8↑j
.text:08049415 83 7D F4 0F                       cmp     [ebp+var_C], 0Fh
.text:08049419 7E C9                             jle     short loc_80493E4
.text:0804941B 83 EC 08                          sub     esp, 8
.text:0804941E 68 44 C0 04 08                    push    offset buffer
.text:08049423 68 2B A0 04 08                    push    offset a16s     ; "%16s"
.text:08049428 E8 D3 FC FF FF                    call    ___isoc99_scanf
.text:0804942D 83 C4 10                          add     esp, 10h
.text:08049430 A1 58 C0 04 08                    mov     eax, ds:equals
.text:08049435 85 C0                             test    eax, eax
.text:08049437 74 22                             jz      short loc_804945B
.text:08049439 83 EC 04                          sub     esp, 4
.text:0804943C 6A 10                             push    10h             ; n
.text:0804943E 68 34 C0 04 08                    push    offset password ; s2
.text:08049443 68 44 C0 04 08                    push    offset buffer   ; s1
.text:08049448 E8 C3 FC FF FF                    call    _strncmp
.text:0804944D 83 C4 10                          add     esp, 10h
.text:08049450 85 C0                             test    eax, eax
.text:08049452 75 07                             jnz     short loc_804945B
.text:08049454 B8 01 00 00 00                    mov     eax, 1
.text:08049459 EB 05                             jmp     short loc_8049460
.text:0804945B                   ; ---------------------------------------------------------------------------
.text:0804945B
.text:0804945B                   loc_804945B:                            ; CODE XREF: main+12D↑j
.text:0804945B                                                           ; main+148↑j
.text:0804945B B8 00 00 00 00                    mov     eax, 0
.text:08049460
.text:08049460                   loc_8049460:                            ; CODE XREF: main+14F↑j
.text:08049460 A3 58 C0 04 08                    mov     ds:equals, eax
.text:08049465 A1 58 C0 04 08                    mov     eax, ds:equals
.text:0804946A 85 C0                             test    eax, eax
.text:0804946C 75 12                             jnz     short loc_8049480
.text:0804946E 83 EC 0C                          sub     esp, 0Ch
.text:08049471 68 0B A0 04 08                    push    offset s        ; "Try again."
.text:08049476 E8 45 FC FF FF                    call    _puts
.text:0804947B 83 C4 10                          add     esp, 10h
.text:0804947E EB 10                             jmp     short loc_8049490
.text:08049480                   ; ---------------------------------------------------------------------------
.text:08049480
.text:08049480                   loc_8049480:                            ; CODE XREF: main+162↑j
.text:08049480 83 EC 0C                          sub     esp, 0Ch
.text:08049483 68 30 A0 04 08                    push    offset aGoodJob ; "Good Job."
.text:08049488 E8 33 FC FF FF                    call    _puts
.text:0804948D 83 C4 10                          add     esp, 10h
.text:08049490
.text:08049490                   loc_8049490:                            ; CODE XREF: main+174↑j
.text:08049490 B8 00 00 00 00                    mov     eax, 0
.text:08049495 8B 4D FC                          mov     ecx, [ebp+var_4]
.text:08049498 C9                                leave
.text:08049499 8D 61 FC                          lea     esp, [ecx-4]
.text:0804949C C3                                retn
.text:0804949C                   ; } // starts at 804930A
.text:0804949C                   main            endp
.text:0804949C
.text:0804949C                   ; ---------------------------------------------------------------------------
.text:0804949D 66 90 90                          align 10h
.text:080494A0
.text:080494A0                   ; =============== S U B R O U T I N E =======================================
.text:080494A0
.text:080494A0
.text:080494A0                                   public __libc_csu_init
.text:080494A0                   __libc_csu_init proc near               ; DATA XREF: _start+21↑o
.text:080494A0
.text:080494A0                   arg_0           = dword ptr  4
.text:080494A0                   arg_4           = dword ptr  8
.text:080494A0                   arg_8           = dword ptr  0Ch
.text:080494A0
.text:080494A0                   ; __unwind {
.text:080494A0 F3 0F 1E FB                       endbr32
.text:080494A4 55                                push    ebp
.text:080494A5 E8 6B 00 00 00                    call    __x86_get_pc_thunk_bp
.text:080494AA 81 C5 56 2B 00 00                 add     ebp, (offset _GLOBAL_OFFSET_TABLE_ - $)
.text:080494B0 57                                push    edi
.text:080494B1 56                                push    esi
.text:080494B2 53                                push    ebx
.text:080494B3 83 EC 0C                          sub     esp, 0Ch
.text:080494B6 89 EB                             mov     ebx, ebp
.text:080494B8 8B 7C 24 28                       mov     edi, [esp+1Ch+arg_8]
.text:080494BC E8 3F FB FF FF                    call    _init_proc
.text:080494C1 8D 9D 10 FF FF FF                 lea     ebx, [ebp-0F0h]
.text:080494C7 8D 85 0C FF FF FF                 lea     eax, [ebp-0F4h]
.text:080494CD 29 C3                             sub     ebx, eax
.text:080494CF C1 FB 02                          sar     ebx, 2
.text:080494D2 74 29                             jz      short loc_80494FD
.text:080494D4 31 F6                             xor     esi, esi
.text:080494D6 8D B4 26 00 00 00+                lea     esi, [esi+0]
.text:080494D6 00
.text:080494DD 8D 76 00                          lea     esi, [esi+0]
.text:080494E0
.text:080494E0                   loc_80494E0:                            ; CODE XREF: __libc_csu_init+5B↓j
.text:080494E0 83 EC 04                          sub     esp, 4
.text:080494E3 57                                push    edi
.text:080494E4 FF 74 24 2C                       push    [esp+24h+arg_4]
.text:080494E8 FF 74 24 2C                       push    [esp+28h+arg_0]
.text:080494EC FF 94 B5 0C FF FF+                call    ss:(__frame_dummy_init_array_entry - 804C000h)[ebp+esi*4]
.text:080494EC FF
.text:080494F3 83 C6 01                          add     esi, 1
.text:080494F6 83 C4 10                          add     esp, 10h
.text:080494F9 39 F3                             cmp     ebx, esi
.text:080494FB 75 E3                             jnz     short loc_80494E0
.text:080494FD
.text:080494FD                   loc_80494FD:                            ; CODE XREF: __libc_csu_init+32↑j
.text:080494FD 83 C4 0C                          add     esp, 0Ch
.text:08049500 5B                                pop     ebx
.text:08049501 5E                                pop     esi
.text:08049502 5F                                pop     edi
.text:08049503 5D                                pop     ebp
.text:08049504 C3                                retn
.text:08049504                   ; } // starts at 80494A0
.text:08049504                   __libc_csu_init endp
.text:08049504
.text:08049504                   ; ---------------------------------------------------------------------------
.text:08049505 8D B4 26 00 00 00+                align 10h
.text:08049510
.text:08049510                   ; =============== S U B R O U T I N E =======================================

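The program reads input twice and checks it twice:

First, the input is read and encrypted, then compared with the reference data.
Second, the reference data itself is encrypted, another input is read, and the two are compared.

The part we need to change is the first comparison. It uses check_equals_HXUITWOAPNESFFEG(), which runs an if inside a loop and therefore produces too many branches. The second check uses strncmp, so it needs no change.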

So we use Angr to build a check function that replaces check_equals_HXUITWOAPNESFFEG(). This is a hooking process.
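
The reference data mentioned above is written by the four mov instructions at the start of main, each storing a 32-bit immediate. As a quick check, the following sketch (standard library only) decodes those immediates in little-endian order to recover the string that also appears in the function name.

```python
import struct

# The four 32-bit immediates stored at password, password+4, +8 and +12,
# copied from the mov instructions in the listing.
immediates = [0x49555848, 0x414F5754, 0x53454E50, 0x47454646]

# x86 is little-endian, so each dword is laid out low byte first.
reference = b"".join(struct.pack("<I", value) for value in immediates)

print(reference.decode("ascii"))
```

The result matches the raw bytes IDA shows next to each instruction (48 58 55 49, and so on).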

Solving with Angr

Code

# This level performs the following computations:
#
# 1. Get 16 bytes of user input and encrypt it.
# 2. Save the result of check_equals_AABBCCDDEEFFGGHH (or similar)
# 3. Get another 16 bytes from the user and encrypt it.
# 4. Check that it's equal to a predefined password.
#
# The ONLY part of this program that we have to worry about is #2. We will be
# replacing the call to check_equals_ with our own version, using a hook, since
# check_equals_ will run too slowly otherwise.
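# Notes for this particular binary:
# - In step 2 the function is check_equals_HXUITWOAPNESFFEG(); it returns 1 on
#   a match and 0 otherwise, and the result is saved in the global `equals`.
# - In step 3 it is the reference password that gets encrypted; the second
#   input is read afterwards and left unchanged.
# - Step 4 compares the two with strncmp().
# Only step 2 is a problem, because check_equals_ contains too many if
# branches, so we replace it with our own check by means of a hook.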

import angr
import claripy
import sys

def main(argv):
  # Create the Angr project
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Since Angr can handle the initial call to scanf, we can start from the
  # beginning, that is, from the default entry state.
  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # Hook the address of where check_equals_ is called.
  # (!)
  check_equals_called_address = 0x080493CE # address of the call to check_equals_HXUITWOAPNESFFEG()

  # The length parameter in angr.Hook specifies how many bytes the execution
  # engine should skip after completing the hook. This will allow hooks to
  # replace certain instructions (or groups of instructions). Determine the
  # instructions involved in calling check_equals_, and then determine how many
  # bytes are used to represent them in memory. This will be the skip length.
  # In this binary that covers the 5-byte call and the 3-byte add esp, 10h
  # that cleans up the arguments after it.
  # (!)
  instruction_to_skip_length = 5 + 3 # the call takes 5 bytes, the add that cleans up the arguments takes 3
  @project.hook(check_equals_called_address, length=instruction_to_skip_length)
  def skip_check_equals_(state):
    # Determine the address where user input is stored. It is passed as a
    # parameter ot the check_equals_ function. Then, load the string. Reminder:
    # int check_equals_(char* to_check, int length) { ...
    # In this binary the buffer is the global at 0x0804C044 (pushed as
    # "offset buffer" before the call) and the length argument is 0x10.
    user_input_buffer_address = 0x0804C044 # :integer, probably hexadecimal
    user_input_buffer_length = 0x10

    # Reminder: state.memory.load will read the stored value at the address
    # user_input_buffer_address of byte length user_input_buffer_length.
    # It will return a bitvector holding the value. This value can either be
    # symbolic or concrete, depending on what was stored there in the program.
    # The buffer loaded here holds a symbolic value. We never created a symbol
    # and injected it ourselves, but because execution starts from the default
    # entry state, Angr's scanf handling automatically places symbolic data in
    # the memory scanf writes to, which is the buffer above.
    user_input_string = state.memory.load(
      user_input_buffer_address,
      user_input_buffer_length
    )
  
    # Determine the string this function is checking the user input against.
    # It's encoded in the name of this function; decompile the program to find
    # it.
    # In this binary the function is check_equals_HXUITWOAPNESFFEG, so the
    # string is HXUITWOAPNESFFEG.
    check_against_string = "HXUITWOAPNESFFEG" # :string

    # gcc uses eax to store the return value, if it is an integer. We need to
    # set eax to 1 if check_against_string == user_input_string and 0 otherwise.
    # However, since we are describing an equation to be used by z3 (not to be
    # evaluated immediately), we cannot use Python if else syntax. Instead, we 
    # have to use claripy's built in function that deals with if statements.
    # claripy.If(expression, ret_if_true, ret_if_false) will output an
    # expression that evaluates to ret_if_true if expression is true and
    # ret_if_false otherwise.
    # Think of it like the Python "value0 if expression else value1".
    # In short: eax must end up as 1 when check_against_string equals
    # user_input_string and as 0 otherwise, expressed with
    # claripy.If(expression, ret_if_true, ret_if_false) rather than a Python if.
    state.regs.eax = claripy.If(
      user_input_string == check_against_string, 
      claripy.BVV(1, 32), 
      claripy.BVV(0, 32)
    )

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    # Since we are allowing Angr to handle the input, retrieve it by printing
    # the contents of stdin. Use one of the early levels as a reference.
    solution = solution_state.posix.dumps(sys.stdin.fileno())
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Code with the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  check_equals_called_address = 0x080493CE

  instruction_to_skip_length = 5 + 3
  @project.hook(check_equals_called_address, length=instruction_to_skip_length)
  def skip_check_equals_(state):

    user_input_buffer_address = 0x0804C044
    user_input_buffer_length = 0x10

    user_input_string = state.memory.load(
      user_input_buffer_address,
      user_input_buffer_length
    )
  
    check_against_string = "HXUITWOAPNESFFEG"

    state.regs.eax = claripy.If(
      user_input_string == check_against_string, 
      claripy.BVV(1, 32), 
      claripy.BVV(0, 32)
    )

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution = solution_state.posix.dumps(sys.stdin.fileno())
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

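Implementing the hook with Angr

1. Find the address where the function to be hooked is called

Our goal is to replace check_equals_HXUITWOAPNESFFEG() with something that implements the same behavior, so we first need to know where the function is called before we can hook it.

check_equals_called_address = 0x080493CE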

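2. Install the hook and make sure the program still runs correctly afterwards

In this challenge, if we leave the original call check_equals_HXUITWOAPNESFFEG in place, the function is still executed and Angr cannot solve the level, so the hook must also skip the original call instruction, which is 5 bytes long. The script additionally skips the 3-byte add esp, 10h that follows it, for 8 bytes in total. Skipping only the 5-byte call should work as well, since that add merely rebalances the stack.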

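  instruction_to_skip_length = 5 + 3 # the call takes 5 bytes, the add that cleans up the arguments takes 3
  @project.hook(check_equals_called_address, length=instruction_to_skip_length)
# check_equals_called_address: start address of the instructions the hook replaces
# length: number of bytes of instructions the hook replaces (skips)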
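
The numbers used for the hook can be cross-checked from the raw bytes IDA prints next to the instructions. The sketch below uses only the standard library. It decodes the relative call to find the function's address, then computes where execution resumes after the hook. The variable names are my own and are not part of the scaffold.

```python
call_address = 0x080493CE
call_bytes = bytes.fromhex("E8 E5 FE FF FF")   # call check_equals_HXUITWOAPNESFFEG
add_bytes = bytes.fromhex("83 C4 10")          # add esp, 10h

# Opcode E8 is a near call whose signed 32-bit displacement is relative
# to the address of the next instruction.
assert call_bytes[0] == 0xE8
displacement = int.from_bytes(call_bytes[1:], "little", signed=True)
call_target = (call_address + len(call_bytes) + displacement) & 0xFFFFFFFF

# Skipping the call and the stack cleanup resumes at the following mov.
skip_length = len(call_bytes) + len(add_bytes)
resume_address = call_address + skip_length

print(hex(call_target))
print(skip_length)
print(hex(resume_address))
```

The resume address should be the mov ds:equals, eax instruction at 0x080493D6, so the value our hook leaves in eax is stored exactly as the original return value would have been.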

3. Define the hook function

It goes directly below the @project.hook decorator.

def skip_check_equals_(state):

  user_input_buffer_address = 0x0804C044
  user_input_buffer_length = 0x10

  user_input_string = state.memory.load(
    user_input_buffer_address,
    user_input_buffer_length
  )

  check_against_string = "HXUITWOAPNESFFEG"

  state.regs.eax = claripy.If(
    user_input_string == check_against_string, 
    claripy.BVV(1, 32), 
    claripy.BVV(0, 32)
  )

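Note that what we are describing is an equation for z3 to use, so we cannot use a plain if statement directly. Instead, we use claripy's built-in function claripy.If to express the condition.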
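
The difference between a Python if and a deferred If can be shown with a toy model. This is plain Python, not claripy's implementation. The names Expr, Sym, Eq and If are hypothetical helpers invented for this illustration. The point is only that the expression is built first and evaluated later, once a concrete input is known.

```python
class Expr:
    """A deferred expression: nothing is computed until evaluate() is called."""

    def __init__(self, fn, text):
        self.fn = fn        # callable taking a model (dict of variable values)
        self.text = text    # human-readable form of the expression

    def evaluate(self, model):
        return self.fn(model)


def Sym(name):
    # A named unknown whose value comes from the model at evaluation time.
    return Expr(lambda model: model[name], name)


def Eq(expr, constant):
    return Expr(lambda model: expr.evaluate(model) == constant,
                f"({expr.text} == {constant!r})")


def If(cond, if_true, if_false):
    # Both outcomes are recorded; the choice is made only on evaluation.
    return Expr(lambda model: if_true if cond.evaluate(model) else if_false,
                f"If({cond.text}, {if_true}, {if_false})")


user_input = Sym("user_input")
eax = If(Eq(user_input, b"HXUITWOAPNESFFEG"), 1, 0)

print(eax.text)
right = eax.evaluate({"user_input": b"HXUITWOAPNESFFEG"})
wrong = eax.evaluate({"user_input": b"AAAAAAAAAAAAAAAA"})
print(right, wrong)
```

A Python if would need a concrete True or False at the moment the hook runs, but at that moment the input is still unknown, which is why the choice has to be stored as an expression.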

With that, the hook is complete.

10_angr_simprocedures

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/10_angr_simprocedures$ python3 generate.py 1234 10_angr_simprocedures
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/10_angr_simprocedures$ ./10_angr_simprocedures
Enter the password: aaaaaaaaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/10_angr_simprocedures$

Analysis

Analysis with IDA

.text:0804932A ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:0804932A                 public main
.text:0804932A main            proc near               ; DATA XREF: _start+2A↑o
.text:0804932A
.text:0804932A var_3C          = dword ptr -3Ch
.text:0804932A var_2C          = dword ptr -2Ch
.text:0804932A var_28          = dword ptr -28h
.text:0804932A var_24          = dword ptr -24h
.text:0804932A s               = byte ptr -1Dh
.text:0804932A var_C           = dword ptr -0Ch
.text:0804932A var_4           = dword ptr -4
.text:0804932A argc            = dword ptr  8
.text:0804932A argv            = dword ptr  0Ch
.text:0804932A envp            = dword ptr  10h
.text:0804932A
.text:0804932A ; __unwind {
.text:0804932A                 endbr32
.text:0804932E                 lea     ecx, [esp+4]
.text:08049332                 and     esp, 0FFFFFFF0h
.text:08049335                 push    dword ptr [ecx-4]
.text:08049338                 push    ebp
.text:08049339                 mov     ebp, esp
.text:0804933B                 push    ecx
.text:0804933C                 sub     esp, 44h
.text:0804933F                 mov     eax, ecx
.text:08049341                 mov     eax, [eax+4]
.text:08049344                 mov     [ebp+var_3C], eax
.text:08049347                 mov     eax, large gs:14h
.text:0804934D                 mov     [ebp+var_C], eax
.text:08049350                 xor     eax, eax
.text:08049352                 mov     [ebp+var_24], 0DEADBEEFh
.text:08049359                 mov     [ebp+var_2C], 11h
.text:08049360                 sub     esp, 4
.text:08049363                 push    10h             ; n
.text:08049365                 push    offset aHxuitwoapnesff ; "HXUITWOAPNESFFEG"
.text:0804936A                 push    offset password ; dest
.text:0804936F                 call    _memcpy
.text:08049374                 add     esp, 10h
.text:08049377                 sub     esp, 4
.text:0804937A                 push    11h             ; n
.text:0804937C                 push    0               ; c
.text:0804937E                 lea     eax, [ebp+s]
.text:08049381                 push    eax             ; s
.text:08049382                 call    _memset
.text:08049387                 add     esp, 10h
.text:0804938A                 sub     esp, 0Ch
.text:0804938D                 push    offset aEnterThePasswo ; "Enter the password: "
.text:08049392                 call    _printf
.text:08049397                 add     esp, 10h
.text:0804939A                 sub     esp, 8
.text:0804939D                 lea     eax, [ebp+s]
.text:080493A0                 push    eax
.text:080493A1                 push    offset a16s     ; "%16s"
.text:080493A6                 call    ___isoc99_scanf
.text:080493AB                 add     esp, 10h
.text:080493AE                 mov     [ebp+var_28], 0
.text:080493B5                 jmp     short loc_80493EC
.text:080493B7 ; ---------------------------------------------------------------------------
.text:080493B7
.text:080493B7 loc_80493B7:                            ; CODE XREF: main+C6↓j
.text:080493B7                 mov     eax, 12h
.text:080493BC                 sub     eax, [ebp+var_28]
.text:080493BF                 mov     edx, eax
.text:080493C1                 lea     ecx, [ebp+s]
.text:080493C4                 mov     eax, [ebp+var_28]
.text:080493C7                 add     eax, ecx
.text:080493C9                 movzx   eax, byte ptr [eax]
.text:080493CC                 movsx   eax, al
.text:080493CF                 sub     esp, 8
.text:080493D2                 push    edx
.text:080493D3                 push    eax
.text:080493D4                 call    complex_function
.text:080493D9                 add     esp, 10h
.text:080493DC                 mov     ecx, eax
.text:080493DE                 lea     edx, [ebp+s]
.text:080493E1                 mov     eax, [ebp+var_28]
.text:080493E4                 add     eax, edx
.text:080493E6                 mov     [eax], cl
.text:080493E8                 add     [ebp+var_28], 1
.text:080493EC
.text:080493EC loc_80493EC:                            ; CODE XREF: main+8B↑j
.text:080493EC                 cmp     [ebp+var_28], 0Fh
.text:080493F0                 jle     short loc_80493B7
.text:080493F2                 cmp     [ebp+var_24], 0DEADBEEFh
.text:080493F9                 jz      loc_804A532
.text:080493FF                 cmp     [ebp+var_24], 0DEADBEEFh
.text:08049406                 jnz     loc_8049C9F
.text:0804940C                 cmp     [ebp+var_24], 0DEADBEEFh
.text:08049413                 jnz     loc_804985C
.text:08049419                 cmp     [ebp+var_24], 0DEADBEEFh
.text:08049420                 jz      loc_8049641
.text:08049426                 cmp     [ebp+var_24], 0DEADBEEFh
.text:0804942D                 jnz     loc_804953A
.text:08049433                 cmp     [ebp+var_24], 0DEADBEEFh
.text:0804943A                 jz      short loc_80494BB
.text:0804943C                 cmp     [ebp+var_24], 0DEADBEEFh
.text:08049443                 jz      short loc_8049480
.text:08049445                 cmp     [ebp+var_24], 0DEADBEEFh
.text:0804944C                 jz      short loc_8049467
.text:0804944E                 sub     esp, 8
.text:08049451                 push    10h
.text:08049453                 lea     eax, [ebp+s]
.text:08049456                 push    eax
.text:08049457                 call    check_equals_HXUITWOAPNESFFEG
.text:0804945C                 add     esp, 10h
.text:0804945F                 mov     [ebp+var_2C], eax
.text:08049462                 jmp     loc_804B654
.text:08049467 ; ---------------------------------------------------------------------------
.text:08049467
.text:08049467 loc_8049467:                            ; CODE XREF: main+122↑j
.text:08049467                 sub     esp, 8
.text:0804946A                 push    10h
.text:0804946C                 lea     eax, [ebp+s]
.text:0804946F                 push    eax
.text:08049470                 call    check_equals_HXUITWOAPNESFFEG
.text:08049475                 add     esp, 10h
.text:08049478                 mov     [ebp+var_2C], eax
.text:0804947B                 jmp     loc_804B654
.text:08049480 ; ---------------------------------------------------------------------------
.text:08049480
.text:08049480 loc_8049480:                            ; CODE XREF: main+119↑j
.text:08049480                 cmp     [ebp+var_24], 0DEADBEEFh
.text:08049487                 jnz     short loc_80494A2
.text:08049489                 sub     esp, 8
.text:0804948C                 push    10h
.text:0804948E                 lea     eax, [ebp+s]
.text:08049491                 push    eax
.text:08049492                 call    check_equals_HXUITWOAPNESFFEG
.text:08049497                 add     esp, 10h
.text:0804949A                 mov     [ebp+var_2C], eax
.text:0804949D                 jmp     loc_804B654
.text:080494A2 ; ---------------------------------------------------------------------------
.text:080494A2
.text:080494A2 loc_80494A2:                            ; CODE XREF: main+15D↑j
.text:080494A2                 sub     esp, 8
.text:080494A5                 push    10h
.text:080494A7                 lea     eax, [ebp+s]
.text:080494AA                 push    eax
.text:080494AB                 call    check_equals_HXUITWOAPNESFFEG
.text:080494B0                 add     esp, 10h
.text:080494B3                 mov     [ebp+var_2C], eax
.text:080494B6                 jmp     loc_804B654
.text:080494BB ; ---------------------------------------------------------------------------
.text:080494BB
.text:080494BB loc_80494BB:                            ; CODE XREF: main+110↑j
.text:080494BB                 cmp     [ebp+var_24], 0DEADBEEFh
.text:080494C2                 jnz     short loc_80494FF
.text:080494C4                 cmp     [ebp+var_24], 0DEADBEEFh
.text:080494CB                 jz      short loc_80494E6
.text:080494CD                 sub     esp, 8
.text:080494D0                 push    10h
.text:080494D2                 lea     eax, [ebp+s]
.text:080494D5                 push    eax
.text:080494D6                 call    check_equals_HXUITWOAPNESFFEG
.text:080494DB                 add     esp, 10h
.text:080494DE                 mov     [ebp+var_2C], eax
.text:080494E1                 jmp     loc_804B654
.text:080494E6 ; ---------------------------------------------------------------------------
.text:080494E6
.text:080494E6 loc_80494E6:                            ; CODE XREF: main+1A1↑j
.text:080494E6                 sub     esp, 8
.text:080494E9                 push    10h
.text:080494EB                 lea     eax, [ebp+s]
.text:080494EE                 push    eax
.text:080494EF                 call    check_equals_HXUITWOAPNESFFEG
.text:080494F4                 add     esp, 10h
.text:080494F7                 mov     [ebp+var_2C], eax
.text:080494FA                 jmp     loc_804B654
.text:080494FF ; ---------------------------------------------------------------------------

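This level differs from the previous one in that check_equals_HXUITWOAPNESFFEG() is called from a great many places. Most of those calls are useless code, but they make hooking harder. To stay efficient, we have to switch from hooking each call site to hooking the function itself, and Angr provides this capability.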
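
To get a feel for how many call sites there are, one can scan the disassembly text for calls to the function. The sketch below does this with the standard library on the call lines copied from the excerpt above. The excerpt covers only the beginning of main, so the real count is presumably much higher.

```python
import re

# Call lines copied from the IDA listing (only the part shown in this section).
listing = """
.text:08049457                 call    check_equals_HXUITWOAPNESFFEG
.text:0804945C                 add     esp, 10h
.text:08049470                 call    check_equals_HXUITWOAPNESFFEG
.text:08049492                 call    check_equals_HXUITWOAPNESFFEG
.text:080494AB                 call    check_equals_HXUITWOAPNESFFEG
.text:080494D6                 call    check_equals_HXUITWOAPNESFFEG
.text:080494EF                 call    check_equals_HXUITWOAPNESFFEG
"""

# Capture the address of every line that calls a check_equals_ function.
pattern = re.compile(r"^\.text:([0-9A-F]{8})\s+call\s+check_equals_\w+", re.MULTILINE)
call_sites = [int(address, 16) for address in pattern.findall(listing)]

print(len(call_sites))
print([hex(address) for address in call_sites])
```

Each of these addresses would need its own address-based hook with the approach from the previous level, which is the bookkeeping that hooking the function itself avoids.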

Solving with Angr

Code

# This challenge is similar to the previous one. It operates under the same
# premise that you will have to replace the check_equals_ function. In this
# case, however, check_equals_ is called so many times that it wouldn't make
# sense to hook where each one was called. Instead, use a SimProcedure to write
# your own check_equals_ implementation and then hook the check_equals_ symbol
# to replace all calls to check_equals_ with a call to your SimProcedure.
#
# You may be thinking:
#   Why can't I just use hooks? The function is called many times, but if I hook
#   the address of the function itself (rather than the addresses where it is
#   called), I can replace its behavior everywhere. Furthermore, I can get the
#   parameters by reading them off the stack (with memory.load(regs.esp + xx)),
#   and return a value by simply setting eax! Since I know the length of the
#   function in bytes, I can return from the hook just before the 'ret'
#   instruction is called, which will allow the program to jump back to where it
#   was before it called my hook.
# If you thought that, then congratulations! You have just invented the idea of
# SimProcedures! Instead of doing all of that by hand, you can let the already-
# implemented SimProcedures do the boring work for you so that you can focus on
# writing a replacement function in a Pythonic way.
# As a bonus, SimProcedures allow you to specify custom calling conventions, but
# unfortunately it is not covered in this CTF.
# In short: hooking the function itself instead of every call site is exactly
# the idea behind SimProcedure; SimProcedure simply takes care of most of the
# details so that the idea is easier to put into practice.

import angr
import claripy
import sys

def main(argv):
  # Create the Angr project
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # The input is read by an ordinary scanf, so symbolic execution can start
  # from the default entry state
  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # Define a class that inherits angr.SimProcedure in order to take advantage
  # of Angr's SimProcedures.
  class ReplacementCheckEquals(angr.SimProcedure):
    # A SimProcedure replaces a function in the binary with a simulated one
    # written in Python. Other than it being written in Python, the function
    # acts largely the same as any function written in C. Any parameter after
    # 'self' will be treated as a parameter to the function you are replacing.
    # The parameters will be bitvectors. Additionally, the Python can return in
    # the usual Pythonic way. Angr will treat this in the same way it would
    # treat a native function in the binary returning. An example:
    #
    # int add_if_positive(int a, int b) {
    #   if (a >= 0 && b >= 0) return a + b;
    #   else return 0;
    # }
    #
    # could be simulated with...
    #
    # class ReplacementAddIfPositive(angr.SimProcedure):
    #   def run(self, a, b):
    #     if a >= 0 and b >=0:
    #       return a + b
    #     else:
    #       return 0
    #
    # Finish the parameters to the check_equals_ function. Reminder:
    # int check_equals_AABBCCDDEEFFGGHH(char* to_check, int length) { ...
    # (!)
    def run(self, to_check, len):
      # We can almost copy and paste the solution from the previous challenge.
      # Hint: Don't look up the address! It's passed as a parameter.
      # (!)
      user_input_buffer_address = to_check
      user_input_buffer_length = len

      # Note the use of self.state to find the state of the system in a
      # SimProcedure.
      user_input_string = self.state.memory.load(
        user_input_buffer_address,
        user_input_buffer_length
      )

      check_against_string = "HXUITWOAPNESFFEG".encode()  # compare against bytes rather than str
    
      # Finally, instead of setting eax, we can use a Pythonic return statement
      # to return the output of this function. 
      # Hint: Look at the previous solution.
      return claripy.If(user_input_string == check_against_string, claripy.BVV(1, 32), claripy.BVV(0, 32))


  # Hook the check_equals symbol. Angr automatically looks up the address
  # associated with the symbol. Alternatively, you can use 'hook' instead
  # of 'hook_symbol' and specify the address of the function. To find the
  # correct symbol, disassemble the binary.
  # (!)
  check_equals_symbol = "check_equals_HXUITWOAPNESFFEG" # :string
  project.hook_symbol(check_equals_symbol, ReplacementCheckEquals())

  # Create the simulation manager
  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution = solution_state.posix.dumps(sys.stdin.fileno())
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Code with comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  class ReplacementCheckEquals(angr.SimProcedure):
    def run(self, to_check, len):
      user_input_buffer_address = to_check
      user_input_buffer_length = len

      user_input_string = self.state.memory.load(
        user_input_buffer_address,
        user_input_buffer_length
      )

      check_against_string = "HXUITWOAPNESFFEG".encode()
  
      return claripy.If(user_input_string == check_against_string, claripy.BVV(1, 32), claripy.BVV(0, 32))

  check_equals_symbol = "check_equals_HXUITWOAPNESFFEG"
  project.hook_symbol(check_equals_symbol, ReplacementCheckEquals())

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    solution = solution_state.posix.dumps(sys.stdin.fileno())
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

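How to hook a function with Angr

1. Define a class that inherits from angr.SimProcedure so that the SimProcedure machinery can be used

class ReplacementCheckEquals(angr.SimProcedure):

2. Define our own hook function

Mind the indentation: the function definition belongs inside the class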

def run(self, to_check, len):
  user_input_buffer_address = to_check
  user_input_buffer_length = len

  user_input_string = self.state.memory.load(
    user_input_buffer_address,
    user_input_buffer_length
  )

  check_against_string = "HXUITWOAPNESFFEG".encode()

  return claripy.If(user_input_string == check_against_string, claripy.BVV(1, 32), claripy.BVV(0, 32))

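Note that the parameters after the leading self are the arguments actually passed to the original function.

From those arguments we get the address of the user input and load its contents (the contents are symbolic, because the input comes from default symbolic execution).

We then use claripy.If to model the logic of the original function.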
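
For comparison, the behaviour that the SimProcedure stands in for can be written as ordinary Python on concrete bytes. This is only a reference model, under the assumption that the original check_equals_HXUITWOAPNESFFEG simply compares the first `length` bytes of the buffer with the string in its name; the name check_equals_concrete is made up for this sketch.

```python
# Concrete reference model of the check (assumption: the real function only
# compares `length` bytes of the buffer against the string in its name).
REFERENCE = b"HXUITWOAPNESFFEG"


def check_equals_concrete(to_check: bytes, length: int) -> int:
    """Return 1 if the first `length` bytes match the reference, else 0."""
    return 1 if to_check[:length] == REFERENCE[:length] else 0


print(check_equals_concrete(b"HXUITWOAPNESFFEG", 16))  # matching input
print(check_equals_concrete(b"AAAAAAAAAAAAAAAA", 16))  # non-matching input
```

In the SimProcedure the buffer is symbolic, so a Python `if` cannot decide the comparison. claripy.If keeps both outcomes in one expression and leaves the decision to the solver.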

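3. Install the hook

With the hook function defined, we can use it to replace the original function.

Angr makes this very convenient: you only need the symbol name, and Angr looks up the corresponding address and hooks it.

check_equals_symbol = "check_equals_HXUITWOAPNESFFEG"
project.hook_symbol(check_equals_symbol, ReplacementCheckEquals())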

11_angr_sim_scanf

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/11_angr_sim_scanf$ python3 generate.py 1234 11_angr_sim_scanf
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/11_angr_sim_scanf$ ./11_angr_sim_scanf
Enter the password: aaaaaaaaaaaaaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/11_angr_sim_scanf$

Analysis

Analyzing with IDA

int __cdecl main(int argc, const char **argv, const char **envp)
{
  int i; // [esp+20h] [ebp-28h]
  char s[20]; // [esp+28h] [ebp-20h] BYREF
  unsigned int v7; // [esp+3Ch] [ebp-Ch]

  v7 = __readgsdword(0x14u);
  memset(s, 0, sizeof(s));
  qmemcpy(s, "HXUITWOA", 8);
  for ( i = 0; i <= 7; ++i )
    s[i] = complex_function(s[i], i);           // transform the reference data
  printf("Enter the password: ");
  __isoc99_scanf("%u %u", buffer0, buffer1);    // input format: two unsigned integers
  if ( !strncmp(buffer0, s, 4u) && !strncmp(buffer1, &s[4], 4u) )
    puts("Good Job.");
  else
    puts("Try again.");
  return 0;
}

In this level scanf reads two values. Previously we injected symbols after the scanf call; here we hook scanf() itself.

Solving with Angr

Code

# This time, the solution involves simply replacing scanf with our own version,
# since Angr does not support requesting multiple parameters with scanf.
import angr
import claripy
import sys

def main(argv):
  # Create the Angr project
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Default initial state (the program's entry point)
  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # Same hooking technique as the previous level, but this time we hook scanf
  class ReplacementScanf(angr.SimProcedure):
    # Finish the parameters to the scanf function. Hint: 'scanf("%u %u", ...)'.
    # (!)
    def run(self, format_string, scanf0_address, scanf1_address):
      # Hint: scanf0_address is passed as a parameter, isn't it?
      scanf_data_len = 4 * 8
      scanf0 = claripy.BVS('scanf0', scanf_data_len)  # an unsigned int is 4 bytes, i.e. 32 bits
      scanf1 = claripy.BVS('scanf1', scanf_data_len)

      # The scanf function writes user input to the buffers to which the
      # parameters point. Here we store our symbolic bitvectors at the
      # addresses received as parameters.
      self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
      self.state.memory.store(scanf1_address, scanf1, endness=project.arch.memory_endness)

      # Now, we want to 'set aside' references to our symbolic values in the
      # globals plugin included by default with a state. You will need to
      # store multiple bitvectors. You can either use a list, tuple, or multiple
      # keys to reference the different bitvectors.
      # (!)
      self.state.globals['solution0'] = scanf0
      self.state.globals['solution1'] = scanf1


  scanf_symbol = "__isoc99_scanf"
  project.hook_symbol(scanf_symbol, ReplacementScanf())

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    # Grab whatever you set aside in the globals dict.
    stored_solutions0 = solution_state.globals['solution0']
    stored_solutions1 = solution_state.globals['solution1']
    solution0 = solution_state.solver.eval(stored_solutions0)
    solution1 = solution_state.solver.eval(stored_solutions1)

    print(solution0)
    print(solution1)

  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Code with comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  initial_state = project.factory.entry_state(
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  class ReplacementScanf(angr.SimProcedure):

    def run(self, format_string, scanf0_address, scanf1_address):
      scanf_data_len = 4 * 8
      scanf0 = claripy.BVS('scanf0', scanf_data_len)
      scanf1 = claripy.BVS('scanf1', scanf_data_len)

      self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
      self.state.memory.store(scanf1_address, scanf1, endness=project.arch.memory_endness)

      self.state.globals['solution0'] = scanf0
      self.state.globals['solution1'] = scanf1

  scanf_symbol = "__isoc99_scanf"
  project.hook_symbol(scanf_symbol, ReplacementScanf())

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    stored_solutions0 = solution_state.globals['solution0']
    stored_solutions1 = solution_state.globals['solution1']
    solution0 = solution_state.solver.eval(stored_solutions0)
    solution1 = solution_state.solver.eval(stored_solutions1)

    print(solution0)
    print(solution1)

  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

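Hook workflow

The overall hook workflow is much the same as in the previous level, but this one adds storing to and reading from globals

# store into globals
      self.state.globals['solution0'] = scanf0
      self.state.globals['solution1'] = scanf1
# read back from globals
    stored_solutions0 = solution_state.globals['solution0']
    stored_solutions1 = solution_state.globals['solution1']

Storing

First, what is being stored: scanf0 and scanf1 are the symbolic bitvectors we created.

Both symbols are created inside the SimProcedure's run method, but we solve for them outside the class, where they are not visible. We therefore keep them in the state's globals so that they can be reached from outside.

Retrieving and solving

After retrieving them, we solve for concrete values with eval, as in earlier levels, and obtain the answer.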
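
The two printed values are unsigned integers, while the program compares the underlying bytes with strncmp. A small helper makes the relation between the two views visible. This is a sketch that assumes the 32-bit x86 target stores integers in little-endian order (which is what endness=project.arch.memory_endness selects in the script); the helper names are invented for illustration.

```python
def u32_to_bytes(value: int) -> bytes:
    """Bytes that end up in memory when scanf("%u") stores `value` on x86."""
    return value.to_bytes(4, byteorder="little")


def bytes_to_u32(data: bytes) -> int:
    """Unsigned integer a user must type so that memory holds `data`."""
    return int.from_bytes(data[:4], byteorder="little")


print(bytes_to_u32(b"ABCD"))      # 1145258561, i.e. 0x44434241
print(u32_to_bytes(0x44434241))   # b'ABCD'
```

Feeding each solved integer through u32_to_bytes shows the four characters that strncmp actually sees, which is handy for checking a result against the transformed reference string.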

12_angr_veritesting

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/12_angr_veritesting$ python3 generate.py 1234 12_angr_veritesting
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/12_angr_veritesting$ ./12_angr_veritesting
Enter the password: aaaaaaaaaaaaaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/12_angr_veritesting$

Analysis

.text:080492B8 ; int __cdecl main(int argc, const char **argv, const char **envp)
.text:080492B8                 public main
.text:080492B8 main            proc near               ; DATA XREF: _start+2A↑o
.text:080492B8
.text:080492B8 var_4C          = dword ptr -4Ch
.text:080492B8 var_3C          = dword ptr -3Ch
.text:080492B8 var_38          = dword ptr -38h
.text:080492B8 var_34          = dword ptr -34h
.text:080492B8 s               = byte ptr -2Dh
.text:080492B8 var_C           = dword ptr -0Ch
.text:080492B8 anonymous_0     = dword ptr -8
.text:080492B8 argc            = dword ptr  8
.text:080492B8 argv            = dword ptr  0Ch
.text:080492B8 envp            = dword ptr  10h
.text:080492B8
.text:080492B8 ; __unwind {
.text:080492B8                 endbr32
.text:080492BC                 lea     ecx, [esp+4]
.text:080492C0                 and     esp, 0FFFFFFF0h
.text:080492C3                 push    dword ptr [ecx-4]
.text:080492C6                 push    ebp
.text:080492C7                 mov     ebp, esp
.text:080492C9                 push    ebx
.text:080492CA                 push    ecx
.text:080492CB                 sub     esp, 50h
.text:080492CE                 mov     eax, ecx
.text:080492D0                 mov     eax, [eax+4]
.text:080492D3                 mov     [ebp+var_4C], eax
.text:080492D6                 mov     eax, large gs:14h
.text:080492DC                 mov     [ebp+var_C], eax
.text:080492DF                 xor     eax, eax
.text:080492E1                 sub     esp, 4
.text:080492E4                 push    21h ; '!'       ; n
.text:080492E6                 push    0               ; c
.text:080492E8                 lea     eax, [ebp+s]
.text:080492EB                 push    eax             ; s
.text:080492EC                 call    _memset
.text:080492F1                 add     esp, 10h
.text:080492F4                 sub     esp, 0Ch
.text:080492F7                 push    offset aEnterThePasswo ; "Enter the password: "
.text:080492FC                 call    _printf
.text:08049301                 add     esp, 10h
.text:08049304                 sub     esp, 8
.text:08049307                 lea     eax, [ebp+s]
.text:0804930A                 push    eax
.text:0804930B                 push    offset a32s     ; "%32s"
.text:08049310                 call    ___isoc99_scanf
.text:08049315                 add     esp, 10h
.text:08049318                 mov     [ebp+var_3C], 0
.text:0804931F                 mov     [ebp+var_34], 0
.text:08049326                 mov     [ebp+var_38], 0
.text:0804932D                 jmp     short loc_804935F
.text:0804932F ; ---------------------------------------------------------------------------
.text:0804932F
.text:0804932F loc_804932F:                            ; CODE XREF: main+AB↓j
.text:0804932F                 lea     edx, [ebp+s]
.text:08049332                 mov     eax, [ebp+var_38]
.text:08049335                 add     eax, edx
.text:08049337                 movzx   eax, byte ptr [eax]
.text:0804933A                 movsx   ebx, al
.text:0804933D                 mov     eax, [ebp+var_38]
.text:08049340                 add     eax, 85h
.text:08049345                 sub     esp, 8
.text:08049348                 push    eax
.text:08049349                 push    48h ; 'H'
.text:0804934B                 call    complex_function
.text:08049350                 add     esp, 10h
.text:08049353                 cmp     ebx, eax
.text:08049355                 jnz     short loc_804935B
.text:08049357                 add     [ebp+var_3C], 1
.text:0804935B
.text:0804935B loc_804935B:                            ; CODE XREF: main+9D↑j
.text:0804935B                 add     [ebp+var_38], 1
.text:0804935F
.text:0804935F loc_804935F:                            ; CODE XREF: main+75↑j
.text:0804935F                 cmp     [ebp+var_38], 1Fh
.text:08049363                 jle     short loc_804932F
.text:08049365                 cmp     [ebp+var_3C], 20h ; ' '
.text:08049369                 jnz     short loc_8049385
.text:0804936B                 movzx   eax, byte ptr [ebp+var_C]
.text:0804936F                 test    al, al
.text:08049371                 jnz     short loc_8049385
.text:08049373                 sub     esp, 0Ch
.text:08049376                 push    offset aGoodJob ; "Good Job."
.text:0804937B                 call    _puts
.text:08049380                 add     esp, 10h
.text:08049383                 jmp     short loc_8049395
.text:08049385 ; ---------------------------------------------------------------------------
.text:08049385
.text:08049385 loc_8049385:                            ; CODE XREF: main+B1↑j
.text:08049385                                         ; main+B9↑j
.text:08049385                 sub     esp, 0Ch
.text:08049388                 push    offset s        ; "Try again."
.text:0804938D                 call    _puts
.text:08049392                 add     esp, 10h
.text:08049395
.text:08049395 loc_8049395:                            ; CODE XREF: main+CB↑j
.text:08049395                 mov     eax, 0
.text:0804939A                 mov     ecx, [ebp+var_C]
.text:0804939D                 xor     ecx, large gs:14h
.text:080493A4                 jz      short loc_80493AB
.text:080493A6                 call    ___stack_chk_fail
.text:080493AB ; ---------------------------------------------------------------------------
.text:080493AB
.text:080493AB loc_80493AB:                            ; CODE XREF: main+EC↑j
.text:080493AB                 lea     esp, [ebp-8]
.text:080493AE                 pop     ecx
.text:080493AF                 pop     ebx
.text:080493B0                 pop     ebp
.text:080493B1                 lea     esp, [ecx-4]
.text:080493B4                 retn
.text:080493B4 ; } // starts at 80492B8
.text:080493B4 main            endp
.text:080493B4
.text:080493B4 ; ---------------------------------------------------------------------------
.text:080493B5                 align 10h
.text:080493C0
.text:080493C0 ; =============== S U B R O U T I N E =======================================

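This challenge has an if statement inside a loop, which causes path explosion. Unlike the earlier levels, the input is transformed and checked in the same loop rather than being verified by a separate function, which makes hooking harder.

Veritesting

The challenge hints at using Veritesting. A quick search shows that this technique targets path explosion by combining static symbolic execution with dynamic symbolic execution.

What static and dynamic symbolic execution are

Dynamic symbolic execution: the program is executed on a concrete input while the engine collects the symbolic constraints from every branch on the current path. It then modifies those constraints to form the path constraint of a new feasible path, asks the constraint solver for a concrete input that satisfies it, and starts a new round of analysis. I think of it simply as taking one step at a time.
Static symbolic execution: abstract symbols are used in place of concrete values. At each branch statement every branch is explored and the branch condition is added to the corresponding path constraint. The concrete values are finally derived from the path constraints.

An analogy makes the difference easier to see:

Dynamic symbolic execution is like Kogoro Mouri: on the strength of a single clue he accuses a suspect (usually wrongly), the suspect explains and supplies a new clue, so Mouri now has two clues. He then rashly accuses another suspect based on those two clues, who in turn explains and supplies a third clue, and so on. He cannot name the real culprit right away, but he does narrow down the suspects. (Except that the suspects are no longer the classic pick-one-of-three; there may well be 2^16 of them.)
Static symbolic execution is like Conan, who calmly gathers every clue, puts them together, and identifies the culprit in one go.

Veritesting combines the two: it starts with dynamic execution and switches to static execution when it reaches simple code (code without system calls, indirect jumps, or statements that are hard to reason about precisely). In static mode it first recovers the control-flow graph dynamically and separates the statements that are easy to analyze statically from those that are hard. It then switches back to dynamic execution for the cases static execution handles poorly (complex cases are easier to handle by "guessing" with concrete values).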
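
To see why merging helps with a loop like the one in this level, the following toy model counts live states with and without merging. It is not Angr's or Veritesting's actual algorithm: a state is reduced to the value of the match counter, and "merging" here just collapses states with the same counter, whereas real tools merge by building if-then-else expressions. The function names are hypothetical.

```python
def explore_without_merging(iterations: int) -> int:
    """Fork on every comparison; return the number of live states."""
    states = [0]  # each state is just the value of the match counter
    for _ in range(iterations):
        forked = []
        for counter in states:
            forked.append(counter + 1)  # branch: character matched
            forked.append(counter)      # branch: character did not match
        states = forked
    return len(states)


def explore_with_merging(iterations: int) -> int:
    """Same exploration, but states that agree on the counter are merged."""
    states = {0}
    for _ in range(iterations):
        states = {counter + 1 for counter in states} | states
    return len(states)


for n in (4, 8, 16):
    print(n, explore_without_merging(n), explore_with_merging(n))
```

For n iterations the first count is 2**n and the second is n + 1, so the 32-iteration loop of this level would need 2**32 states without any merging.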

Solving with Angr

Code

We reuse the code from the first level; the only difference is the extra argument veritesting = True when creating the simulation manager: simulation = project.factory.simgr(initial_state, veritesting = True)

# When you construct a simulation manager, you will want to enable Veritesting:
# project.factory.simgr(initial_state, veritesting=True)
# Hint: use one of the first few levels' solutions as a reference.
# This challenge checks the transformed string with an if inside a loop, so the
# number of paths grows exponentially.

import angr
import claripy
import sys

def main(argv):
    # Create the Angr project
    file_path = argv[1]
    project = angr.Project(file_path)

    # Initial state for symbolic execution: there is only one input, so the
    # default entry state is enough
    initial_state = project.factory.entry_state(
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY, 
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    simulation = project.factory.simgr(initial_state, veritesting = True)

    # Try plain exploration first, now that Veritesting is enabled
    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def is_false(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find = is_successful, avoid = is_false)

    if simulation.found:
        solution_state = simulation.found[0]
        print(solution_state.posix.dumps(sys.stdin.fileno()))
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

Code with comments removed

import angr
import claripy
import sys

def main(argv):
    file_path = argv[1]
    project = angr.Project(file_path)

    initial_state = project.factory.entry_state(
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY, 
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    simulation = project.factory.simgr(initial_state, veritesting = True)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def is_false(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find = is_successful, avoid = is_false)

    if simulation.found:
        solution_state = simulation.found[0]
        print(solution_state.posix.dumps(sys.stdin.fileno()))
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

13_angr_static_binary

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/13_angr_static_binary$ python3 generate.py 1234 13_angr_static_binary
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/13_angr_static_binary$ ./13_angr_static_binary
Enter the password: aaaaaaaaaaaaaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/13_angr_static_binary$

Analysis

The pseudocode is clearer here

int __cdecl main(int argc, const char **argv, const char **envp)
{
  int i; // [esp+1Ch] [ebp-3Ch]
  int j; // [esp+20h] [ebp-38h]
  char v6[20]; // [esp+24h] [ebp-34h] BYREF
  char v7[20]; // [esp+38h] [ebp-20h] BYREF
  unsigned int v8; // [esp+4Ch] [ebp-Ch]

  v8 = __readgsdword(0x14u);
  for ( i = 0; i <= 19; ++i )
    v7[i] = 0;
  qmemcpy(v7, "HXUITWOA", 8);
  printf("Enter the password: ");
  _isoc99_scanf("%8s", v6);
  for ( j = 0; j <= 7; ++j )
    v6[j] = complex_function(v6[j], j);
  if ( j_strcmp_ifunc(v6, v7) )
    puts("Try again.");
  else
    puts("Good Job.");
  return 0;
}

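An ordinary program that reads a single flag and checks it with the library function strcmp.

Solving with Angr

As the challenge asks, we replace the library functions with Angr's built-in SimProcedures, which makes Angr run much faster.

Code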

# This challenge is the exact same as the first challenge, except that it was
# compiled as a static binary. Normally, Angr automatically replaces standard
# library functions with SimProcedures that work much more quickly.
#
# To solve the challenge, manually hook any standard library c functions that
# are used. Then, ensure that you begin the execution at the beginning of the
# main function. Do not use entry_state.
#
# Here are a few SimProcedures Angr has already written for you. They implement
# standard library functions. You will not need all of them:
# angr.SIM_PROCEDURES['libc']['malloc']
# angr.SIM_PROCEDURES['libc']['fopen']
# angr.SIM_PROCEDURES['libc']['fclose']
# angr.SIM_PROCEDURES['libc']['fwrite']
# angr.SIM_PROCEDURES['libc']['getchar']
# angr.SIM_PROCEDURES['libc']['strncmp']
# angr.SIM_PROCEDURES['libc']['strcmp']
# angr.SIM_PROCEDURES['libc']['scanf']
# angr.SIM_PROCEDURES['libc']['printf']
# angr.SIM_PROCEDURES['libc']['puts']
# angr.SIM_PROCEDURES['libc']['exit']
#
# As a reminder, you can hook functions with something similar to:
# project.hook(malloc_address, angr.SIM_PROCEDURES['libc']['malloc']())
#
# There are many more, see:
# https://github.com/angr/angr/tree/master/angr/procedures/libc
#
# Additionally, note that, when the binary is executed, the main function is not
# the first piece of code called. In the _start function, __libc_start_main is
# called to start your program. The initialization that occurs in this function
# can take a long time with Angr, so you should replace it with a SimProcedure.
# angr.SIM_PROCEDURES['glibc']['__libc_start_main']
# Note 'glibc' instead of 'libc'.

import angr
import sys

def main(argv):
    binary_path = argv[1]
    project = angr.Project(binary_path)

    # Symbolic execution starts at the beginning of main
    start_address = 0x08049E0F
    initial_state = project.factory.blank_state(
        addr = start_address, 
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    # Install the hooks
    start_main_address = 0x0804A240
    project.hook(start_main_address, angr.SIM_PROCEDURES['glibc']['__libc_start_main']())
    strcmp_address = 0x080490D0
    project.hook(strcmp_address, angr.SIM_PROCEDURES['libc']['strcmp']())
    scanf_address = 0x08051330
    project.hook(scanf_address, angr.SIM_PROCEDURES['libc']['scanf']())
    puts_address = 0x0805EC90
    project.hook(puts_address, angr.SIM_PROCEDURES['libc']['puts']())

    # Create the simulation manager
    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def is_false(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find = is_successful, avoid = is_false)

    if simulation.found:
        solution_state = simulation.found[0]
        print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)

Code with comments removed

import angr
import sys

def main(argv):
    binary_path = argv[1]
    project = angr.Project(binary_path)

    start_address = 0x08049E0F
    initial_state = project.factory.blank_state(
        addr = start_address, 
        add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                        angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
    )

    start_main_address = 0x0804A240
    project.hook(start_main_address, angr.SIM_PROCEDURES['glibc']['__libc_start_main']())
    strcmp_address = 0x080490D0
    project.hook(strcmp_address, angr.SIM_PROCEDURES['libc']['strcmp']())
    scanf_address = 0x08051330
    project.hook(scanf_address, angr.SIM_PROCEDURES['libc']['scanf']())
    puts_address = 0x0805EC90
    project.hook(puts_address, angr.SIM_PROCEDURES['libc']['puts']())

    simulation = project.factory.simgr(initial_state)

    def is_successful(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Good Job".encode() in stdout_output

    def is_false(state):
        stdout_output = state.posix.dumps(sys.stdout.fileno())
        return "Try again".encode() in stdout_output

    simulation.explore(find = is_successful, avoid = is_false)

    if simulation.found:
        solution_state = simulation.found[0]
        print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
    else:
        raise Exception('Could not find the solution')

if __name__ == '__main__':
    main(sys.argv)
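
The four hand-written project.hook calls in this level's solution can also be driven from a table, which keeps the addresses in one place. The sketch below is one possible arrangement, not part of the original scaffold: install_hooks and HOOKS are hypothetical names, and the addresses are the ones from this write-up's build, so they would have to be looked up again for any other binary.

```python
# Addresses are the ones used in this write-up; they are specific to that
# build of the binary and must be re-checked in a disassembler for any other.
HOOKS = {
    0x0804A240: ("glibc", "__libc_start_main"),
    0x080490D0: ("libc", "strcmp"),
    0x08051330: ("libc", "scanf"),
    0x0805EC90: ("libc", "puts"),
}


def install_hooks(project, sim_procedures, hooks=HOOKS):
    """Hook every address in `hooks` with a fresh SimProcedure instance.

    `project` is expected to offer hook(address, procedure) and
    `sim_procedures` to be indexable as sim_procedures[library][name],
    which is how angr.Project and angr.SIM_PROCEDURES are used above.
    """
    for address, (library, name) in hooks.items():
        project.hook(address, sim_procedures[library][name]())
```

In the script it would be called as install_hooks(project, angr.SIM_PROCEDURES) right after the project is created.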

14_angr_shared_library

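Compile (the shared library cannot be run directly)

Since it cannot be run, there is no direct way to verify the flag we solve for. We can either write a small test program that mirrors the transformation function, or load the .so file and call validate directly; the first approach is simpler.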

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/14_angr_shared_library$ python3 generate.py 1234 14_angr_shared_library_so.c
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/14_angr_shared_library$

Analysis

This is a .so file; following the hint, we locate the function we need, validate

_BOOL4 __cdecl validate(char *s1, int a2)
{
  char v3; // al
  char s2[20]; // [esp+4h] [ebp-24h] BYREF
  int j; // [esp+18h] [ebp-10h]
  int i; // [esp+1Ch] [ebp-Ch]

  if ( a2 <= 7 )
    return 0;
  for ( i = 0; i <= 19; ++i )
    s2[i] = 0;
  qmemcpy(s2, "HXUITWOA", 8);
  for ( j = 0; j <= 7; ++j )
  {
    v3 = complex_function(s1[j], j);
    s1[j] = v3;
  }
  return strcmp(s1, s2) == 0;
}
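
Because the .so cannot be run directly, a Python port of the decompiled logic is one way to check a candidate flag. The sketch below mirrors validate as decompiled above. The body of complex_function for this level is not shown in this write-up, so the version here is a stand-in that reuses the level-00 formula purely as an assumption; substitute the real one from the disassembly. brute_force_password is a hypothetical helper, not part of the challenge.

```python
import string


def complex_function(value: int, index: int) -> int:
    """STAND-IN transform (assumption): level-00 formula, not this level's."""
    if not (65 <= value <= 90):
        raise ValueError("input must be an uppercase letter")
    return (3 * index + value - 65) % 26 + 65


def validate(buffer: bytes, length: int) -> bool:
    """Python port of the decompiled validate()."""
    if length <= 7:
        return False
    s1 = bytearray(buffer)
    for j in range(8):
        s1[j] = complex_function(s1[j], j)
    # strcmp stops at the first NUL byte, so compare only up to it
    return bytes(s1).split(b"\x00", 1)[0] == b"HXUITWOA"


def brute_force_password() -> bytes:
    """Each byte is transformed independently, so search them one at a time."""
    target = b"HXUITWOA"
    found = bytearray()
    for j, wanted in enumerate(target):
        for candidate in string.ascii_uppercase.encode():
            if complex_function(candidate, j) == wanted:
                found.append(candidate)
                break
    return bytes(found)


password = brute_force_password()
print(password, validate(password, len(password)))
```

Once the real complex_function is filled in, validate(flag, 8) can be used to confirm the flag printed by the Angr script.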

Solving with Angr

Code

# The shared library has the function validate, which takes a string and returns
# either true (1) or false (0). The binary calls this function. If it returns
# true, the program prints "Good Job." otherwise, it prints "Try again."
#
# Note: When you run this script, make sure you run it on
# lib14_angr_shared_library.so, not the executable. This level is intended to
# teach how to analyse binary formats that are not typical executables.

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]

  # The shared library is compiled with position-independent code. You will need
  # to specify the base address. All addresses in the shared library will be
  # base + offset, where offset is their address in the file.
  # (!)
  base = 0x400000
  project = angr.Project(path_to_binary, load_options={
    'main_opts' : {
      'base_addr' : base
    }
  })

  # Initialize any symbolic values here; you will need at least one to pass to
  # the validate function.
  # (!)


  buffer_pointer = claripy.BVV(0x500000, 32)  # pointer to a scratch memory region

  # Begin the state at the beginning of the validate function, as if it was
  # called by the program. Determine the parameters needed to call validate and
  # replace 'parameters...' with bitvectors holding the values you wish to pass.
  # Recall that 'claripy.BVV(value, size_in_bits)' constructs a bitvector
  # initialized to a single value.
  # Remember to add the base value you specified at the beginning to the
  # function address!
  # Hint: int validate(char* buffer, int length) { ...
  # Another hint: the password is 8 bytes long.
  # (!)
  validate_function_address = base + 0x0000129C
  initial_state = project.factory.call_state(validate_function_address, buffer_pointer, 8) # length: eight characters

  # You will need to add code to inject a symbolic value into the program. Also,
  # at the end of the function, constrain eax to equal true (value of 1) just
  # before the function returns. There are multiple ways to do this:
  # 1. Use a hook.
  # 2. Search for the address just before the function returns and then
  #    constrain eax (this may require putting code elsewhere)
  # (!)
  # Create the symbolic password (8 bytes).
  password_len_bits = 8 * 8
  password = claripy.BVS("password", password_len_bits)
  # Store the symbolic value in the buffer that buffer_pointer points to.
  initial_state.memory.store(buffer_pointer, password)

  simulation = project.factory.simgr(initial_state)

  success_address = base + 0x0000134C
  simulation.explore(find=success_address)

  if simulation.found:
    solution_state = simulation.found[0]

    # Determine where the program places the return value, and constrain it so
    # that it is true. Then, solve for the solution and print it.
    # (!)

    solution_state.add_constraints( solution_state.regs.eax != 0 )
    solution = solution_state.solver.eval(password, cast_to = bytes)
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Code with the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]

  base = 0x400000
  project = angr.Project(path_to_binary, load_options={
    'main_opts' : {
      'base_addr' : base
    }
  })

  buffer_pointer = claripy.BVV(0x500000, 32)

  validate_function_address = base + 0x0000129C
  initial_state = project.factory.call_state(validate_function_address, buffer_pointer, 8)

  password_len_bits = 8 * 8
  password = claripy.BVS("password", password_len_bits)

  initial_state.memory.store( buffer_pointer,  password)

  simulation = project.factory.simgr(initial_state)

  success_address = base + 0x0000134C
  simulation.explore(find=success_address)

  if simulation.found:
    solution_state = simulation.found[0]

    solution_state.add_constraints( solution_state.regs.eax != 0 )
    solution = solution_state.solver.eval(password, cast_to = bytes)
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

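How the .so file is symbolically executed

A shared library is not a standalone program, so we cannot simply run it to exercise validate; instead we use Angr to symbolically execute the function directly.

Find the address of the function to execute symbolically (remember to add the base address)

Angr has to know where the function starts before it can execute it symbolically.

validate_function_address = base + 0x0000129C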
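The base + offset arithmetic can be checked with a few lines of plain Python. This is only a sketch: the helper name rebase is hypothetical, and the two offsets are the ones used in this walkthrough's script (they come from one particular build of the library and will differ for other builds).

```python
BASE = 0x400000  # base address chosen in the script's load_options

def rebase(offset, base=BASE):
    """Turn a file offset from the disassembler into a loaded address."""
    return base + offset

# Offsets taken from this walkthrough; they are specific to this build.
validate_function_address = rebase(0x0000129C)  # start of validate
success_address = rebase(0x0000134C)            # retn at the end of validate

print(hex(validate_function_address), hex(success_address))
```

Keeping the base in one place makes it harder to forget the addition when a second address, such as the find target, is introduced later.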

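Create the arguments and simulate a call to validate

The original function is int validate(char *input, int n);

First argument: the address of the function (base + offset)
Second argument (from here on these are validate's own parameters): input, passed as a 32-bit (4-byte) bitvector holding the pointer to the buffer
Third argument: n is always 8, so a constant is enough

validate_function_address = base + 0x0000129C
initial_state = project.factory.call_state(validate_function_address, buffer_pointer, 8)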

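Inject the symbolic bitvector into the buffer

We already passed buffer_pointer as input, the pointer to the buffer, so the symbolic value is stored at that same address.

password_len_bits = 8 * 8
password = claripy.BVS("password", password_len_bits) # create the symbol

initial_state.memory.store( buffer_pointer,  password) # inject the symbol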

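Symbolic execution

We want the return value to be 1 (true), so exploration has to stop at the function's return instruction; constraining eax in that state then constrains the return value.

success_address = base + 0x0000134C # address of the retn instruction
simulation.explore(find=success_address)

solution_state.add_constraints( solution_state.regs.eax != 0 ) # require a nonzero (true) eax in this state; solving then yields the password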

15_angr_arbitrary_read

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/15_angr_arbitrary_read$ python3 generate.py 1234 15_angr_arbitrary_read
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/15_angr_arbitrary_read$ ./15_angr_arbitrary_read
Enter the password: aaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/15_angr_arbitrary_read$

Analysis

int __cdecl main(int argc, const char **argv, const char **envp)
{
  char v4; // [esp+Ch] [ebp-1Ch] BYREF
  char *s; // [esp+1Ch] [ebp-Ch]

  s = try_again;
  printf("Enter the password: ");
  __isoc99_scanf("%u %20s", &key, &v4);
  if ( key == 41217380 )
    puts(s);
  else
    puts(try_again);
  return 0;
}

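In this level we exploit a stack overflow to make the program print "Good Job.": %20s lets scanf write 20 characters into v4, but s lies only 16 bytes above it (ebp-0x1C versus ebp-0xC), so the last four characters overwrite the pointer that puts(s) prints.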
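As a rough illustration of what the overflowing input looks like, the sketch below builds it by hand with the standard library. The helper build_input and the filler byte are hypothetical; the key is the constant from the decompiled comparison, and the "Good Job." address is the one used in this walkthrough's script, so both are assumptions tied to this particular build.

```python
import struct

BUFFER_SIZE = 16               # distance from v4 (ebp-0x1C) to s (ebp-0xC)
GOOD_JOB_ADDRESS = 0x4858554B  # address used in this walkthrough's script
KEY = 41217380                 # constant compared against key in the decompiled main

def build_input(key, target_address, filler=b"A"):
    """Return '<key> <16 filler bytes><4 little-endian address bytes>'."""
    overflow = filler * BUFFER_SIZE + struct.pack("<I", target_address)
    return str(key).encode() + b" " + overflow

payload = build_input(KEY, GOOD_JOB_ADDRESS)
print(payload)
```

The address bytes all fall in the printable ASCII range, which is why the whole input can be typed at a terminal. The solver may choose different filler characters than the ones assumed here.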

Solving with Angr

Code

# This binary takes both an integer and a string as a parameter. A certain
# integer input causes the program to reach a buffer overflow with which we can
# read a string from an arbitrary memory location. Our goal is to use Angr to
# search the program for this buffer overflow and then automatically generate
# an exploit to read the string "Good Job."
#
# What is the point of reading the string "Good Job."?
# This CTF attempts to replicate a simplified version of a possible vulnerability
# where a user can exploit the program to print a secret, such as a password or
# a private key. In order to keep consistency with the other challenges and to
# simplify the challenge, the goal of this program will be to print "Good Job."
# instead.
#
# The general strategy for crafting this script will be to:
# 1) Search for calls of the 'puts' function, which will eventually be exploited
#    to print out "Good Job."
# 2) Determine if the first parameter of 'puts', a pointer to the string to be
#    printed, can be controlled by the user to be set to the location of the
#    "Good Job." string.
# 3) Solve for the input that prints "Good Job."
#
# Note: The script is structured to implement step #2 before #1.

# Some of the source code for this challenge:
# #include <stdio.h>
# #include <stdlib.h>
# #include <string.h>
# #include <stdint.h>
# 
# // This will all be in .rodata
# char msg[] = "${ description }$";
# char* try_again = "Try again.";
# char* good_job = "Good Job.";
# uint32_t key;
# 
# void print_msg() {
#   printf("%s", msg);
# }
#
# uint32_t complex_function(uint32_t input) {
#   ...
# }
# 
# struct overflow_me {
#   char buffer[16];  // overflowing this buffer overwrites to_print; writing the address of "Good Job." there makes that string the one printed
#   char* to_print;
# }; 
# 
# int main(int argc, char* argv[]) {
#   struct overflow_me locals;
#   locals.to_print = try_again;  // initialized to the address of "Try again."
# 
#   print_msg();
# 
#   printf("Enter the password: ");
#   scanf("%u %20s", &key, locals.buffer);  // note the %20s: the 4 bytes past buffer[16] are where the address of "Good Job." goes
#
#   key = complex_function(key);
# 
#   switch (key) {
#     case ?:
#       puts(try_again);
#       break;
#
#     ...
#
#     case ?:
#       // Our goal is to trick this call to puts to print the "secret
#       // password" (which happens, in our case, to be the string
#       // "Good Job.")
#       puts(locals.to_print);
#       break;
#   
#     ...
#   }
# 
#   return 0;
# }

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # You can either use a blank state or an entry state; just make sure to start
  # at the beginning of the program.
  # (!)
  initial_state = project.factory.entry_state(
    add_options = {angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY, 
                   angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  # Again, scanf needs to be replaced.
  class ReplacementScanf(angr.SimProcedure):
    # Hint: scanf("%u %20s")
    def run(self, format_string, scanf0_address, scanf1_address):
      # %u
      scanf0 = claripy.BVS('scanf0', 4 * 8)
    
      # %20s
      scanf1 = claripy.BVS('scanf1', 20 * 8)

      # The bitvector.chop(bits=n) function splits the bitvector into a Python
      # list containing the bitvector in segments of n bits each. In this case,
      # we are splitting them into segments of 8 bits (one byte.)
      for char in scanf1.chop(bits=8):
        # Ensure that each character in the string is printable. An interesting
        # experiment, once you have a working solution, would be to run the code
        # without constraining the characters to the printable range of ASCII.
        # Even though the solution will technically work without this, it's more
        # difficult to enter in a solution that contains character you can't
        # copy, paste, or type into your terminal or the web form that checks 
        # your solution.
        # (!)
        self.state.add_constraints(char >= 0x21, char <= 126)

      # Warning: Endianness only applies to integers. If you store a string in
      # memory and treat it as a little-endian integer, it will be backwards.
      # key is a global variable, so its address could also be used directly to
      # inject the symbol:
      #scanf0_address = 0x48587030
      self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
      self.state.memory.store(scanf1_address, scanf1)

      self.state.globals['solution0'] = scanf0
      self.state.globals['solution1'] = scanf1

  # Hook scanf()
  scanf_symbol = '__isoc99_scanf'  # :string
  project.hook_symbol(scanf_symbol, ReplacementScanf())

  # We will call this whenever puts is called. The goal of this function is to
  # determine if the pointer passed to puts is controllable by the user, such
  # that we can rewrite it to point to the string "Good Job."
  def check_puts(state):
    # Recall that puts takes one parameter, a pointer to the string it will
    # print. If we load that pointer from memory, we can analyse it to determine
    # if it can be controlled by the user input in order to point it to the
    # location of the "Good Job." string.
    #
    # Treat the implementation of this function as if puts was just called.
    # The stack, registers, memory, etc should be set up as if the x86 call
    # instruction was just invoked (but, of course, the function hasn't copied
    # the buffers yet.)
    # The stack will look as follows:
    # ...
    # esp + 7 -> /----------------\
    # esp + 6 -> |      puts      |
    # esp + 5 -> |    parameter   |     the parameter is pushed first
    # esp + 4 -> \----------------/
    # esp + 3 -> /----------------\
    # esp + 2 -> |     return     |
    # esp + 1 -> |     address    |     pushed by the call instruction that invokes puts
    #     esp -> \----------------/
    #
    # Hint: Look at level 08, 09, or 10 to review how to load a value from a
    # memory address. Remember to use the correct endianness in the future when
    # loading integers; it has been included for you here.
    # (!)
    puts_parameter = state.memory.load(state.regs.esp + 4, 4, endness=project.arch.memory_endness)

    # The following function takes a bitvector as a parameter and checks if it
    # can take on more than one value. While this does not necessarily tell us we
    # have found an exploitable state, it is a strong indication that the 
    # bitvector we checked may be controllable by the user.
    # Use it to determine if the pointer passed to puts is symbolic.
    # (!)
    if state.solver.symbolic(puts_parameter):
      # Determine the location of the "Good Job." string. We want to print it
      # out, and we will do so by attempting to constrain the puts parameter to
      # equal it. Hint: use 'objdump -s <binary>' to look for the string's
      # address in .rodata.
      # (!)
      good_job_string_address = 0x4858554B # :integer, probably hexadecimal

      # Create an expression that will test if puts_parameter equals
      # good_job_string_address. If we add this as a constraint to our solver,
      # it will try and find an input to make this expression true. Take a look
      # at level 08 to remind yourself of the syntax of this.
      # (!)
      is_vulnerable_expression = puts_parameter == good_job_string_address # :boolean bitvector expression

      # Have Angr evaluate the state to determine if all the constraints can
      # be met, including the one we specified above. If it can be satisfied,
      # we have found our exploit!
      #
      # When doing this, however, we do not want to edit our state in case we
      # have not yet found what we are looking for. To test if our expression
      # is satisfiable without editing the original, we need to clone the state.
      copied_state = state.copy()

      # We can now play around with the copied state without changing the
      # original. We need to add our vulnerable expression as a state to test it.
      # Look at level 08 and compare this call to how it is called there.
      copied_state.add_constraints(is_vulnerable_expression)

      # Finally, we test if we can satisfy the constraints of the state.
      if copied_state.satisfiable():
        # Before we return, let's add the constraint to the solver for real,
        # instead of just querying whether the constraint _could_ be added.
        state.add_constraints(is_vulnerable_expression)
        return True
      else:
        return False
    else: # not state.solver.symbolic(???)
      return False

  simulation = project.factory.simgr(initial_state)

  # In order to determine if we have found a vulnerable call to 'puts',  we need
  # to run the function check_puts (defined above) whenever we reach a 'puts'
  # call. To do this, we will look for the place where the instruction pointer,
  # state.addr, is equal to the beginning of the puts function.
  def is_successful(state):
    # We are looking for puts. Check that the address is at the (very) beginning
    # of the puts function. Warning: while, in theory, you could look for
    # any address in puts, if you execute any instruction that adjusts the stack
    # pointer, the stack diagram above will be incorrect. Therefore, it is
    # recommended that you check for the very beginning of puts.
    # (!)
    puts_address = 0x08049090
    if state.addr == puts_address:
      # Return True if we determine this call to puts is exploitable.
      return check_puts(state)
    else:
      # We have not yet found a call to puts; we should continue!
      return False

  simulation.explore(find=is_successful)

  if simulation.found:
    solution_state = simulation.found[0]

    stored_solutions0 = solution_state.globals['solution0']
    stored_solutions1 = solution_state.globals['solution1']
    solution0 = solution_state.solver.eval(stored_solutions0)
    solution1 = solution_state.solver.eval(stored_solutions1, cast_to = bytes)
    print(solution0)
    print(solution1)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Code with the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  initial_state = project.factory.entry_state(
    add_options = {angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY, 
                   angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  class ReplacementScanf(angr.SimProcedure):

    def run(self, format_string, scanf0_address, scanf1_address):
      scanf0 = claripy.BVS('scanf0', 4 * 8)
  
      scanf1 = claripy.BVS('scanf1', 20 * 8)

      for char in scanf1.chop(bits=8):
        self.state.add_constraints(char >= 0x21, char <= 126)

      self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
      self.state.memory.store(scanf1_address, scanf1)

      self.state.globals['solution0'] = scanf0
      self.state.globals['solution1'] = scanf1

  scanf_symbol = '__isoc99_scanf'  # :string
  project.hook_symbol(scanf_symbol, ReplacementScanf())

  def check_puts(state):
    puts_parameter = state.memory.load(state.regs.esp + 4, 4, endness=project.arch.memory_endness)

    if state.solver.symbolic(puts_parameter):
      good_job_string_address = 0x4858554B

      is_vulnerable_expression = puts_parameter == good_job_string_address

      copied_state = state.copy()

      copied_state.add_constraints(is_vulnerable_expression)

      if copied_state.satisfiable():

        state.add_constraints(is_vulnerable_expression)
        return True
      else:
        return False
    else:
      return False

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    puts_address = 0x08049090
    if state.addr == puts_address:
      return check_puts(state)
    else:
      return False

  simulation.explore(find=is_successful)

  if simulation.found:
    solution_state = simulation.found[0]

    stored_solutions0 = solution_state.globals['solution0']
    stored_solutions1 = solution_state.globals['solution1']
    solution0 = solution_state.solver.eval(stored_solutions0)
    solution1 = solution_state.solver.eval(stored_solutions1, cast_to = bytes)
    print(solution0)
    print(solution1)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

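Approach to finding the vulnerability

Hook scanf()

This level reads two inputs with one scanf call, so we first hook scanf and replace it with our own procedure.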

class ReplacementScanf(angr.SimProcedure):

  def run(self, format_string, scanf0_address, scanf1_address):
    scanf0 = claripy.BVS('scanf0', 4 * 8)
  
    scanf1 = claripy.BVS('scanf1', 20 * 8)

    for char in scanf1.chop(bits=8):
      self.state.add_constraints(char >= 0x21, char <= 126)

    self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
    self.state.memory.store(scanf1_address, scanf1)

    self.state.globals['solution0'] = scanf0
    self.state.globals['solution1'] = scanf1

scanf_symbol = '__isoc99_scanf'  # :string
project.hook_symbol(scanf_symbol, ReplacementScanf())

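Note that we create two symbols here, scanf0 and scanf1, and store them in globals. They go into globals because the constraint solving on these symbols happens later, outside this class, so the symbolic values have to be kept somewhere that stays reachable from the state.

Define check_puts()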

def check_puts(state):
  puts_parameter = state.memory.load(state.regs.esp + 4, 4, endness=project.arch.memory_endness)

  if state.solver.symbolic(puts_parameter):
    good_job_string_address = 0x4858554B

    is_vulnerable_expression = puts_parameter == good_job_string_address

    copied_state = state.copy()

    copied_state.add_constraints(is_vulnerable_expression)

    if copied_state.satisfiable():

      state.add_constraints(is_vulnerable_expression)
      return True
    else:
      return False
  else:
    return False

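This function is called whenever execution reaches puts, that is, when symbolic execution has arrived at the very start of puts. The state at that moment is passed to check_puts.

From that state (the context just after entering puts) we can read esp, and from esp + 4 the argument: the pointer to the string that will be printed.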
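To see why the argument sits at esp + 4, here is a small stand-alone model of the stack at the entry of puts. It is not Angr code: the dict-backed memory, the helper names, the stack pointer and the return address are all hypothetical values chosen for illustration; only the pointer value is the one from this walkthrough.

```python
def store_u32_le(memory, address, value):
    """Write a 32-bit value as four little-endian bytes."""
    for i, byte in enumerate(value.to_bytes(4, "little")):
        memory[address + i] = byte

def load_u32_le(memory, address):
    """Read four bytes and interpret them as a little-endian integer."""
    raw = bytes(memory[address + i] for i in range(4))
    return int.from_bytes(raw, "little")

memory = {}
esp = 0x7FFF0000                           # hypothetical stack pointer at entry to puts
store_u32_le(memory, esp, 0x08049300)      # hypothetical return address pushed by call
store_u32_le(memory, esp + 4, 0x4858554B)  # the pointer argument pushed before the call

puts_parameter = load_u32_le(memory, esp + 4)
print(hex(puts_parameter))
```

This mirrors the load in check_puts: four bytes at esp + 4, read with the architecture's little-endian byte order.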

Locate the target string to print

if state.solver.symbolic(puts_parameter):
  good_job_string_address = 0x4858554B

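Copy the state and test whether the constraint can be satisfied

Adding a constraint modifies the state, so we first test on a copy whether the constraint can be satisfied and gives the result we want.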

is_vulnerable_expression = puts_parameter == good_job_string_address

copied_state = state.copy()

copied_state.add_constraints(is_vulnerable_expression)

If it is satisfiable, add the constraint to the current state

if copied_state.satisfiable():
  state.add_constraints(is_vulnerable_expression)
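The copy, test, then commit pattern can be tried out without Angr. The sketch below uses a toy state with a single 8-bit unknown and brute-force checking; ToyState and try_constrain are hypothetical names that only imitate the shape of the calls used in check_puts and say nothing about how Angr's solver works internally.

```python
class ToyState:
    """Stand-in for a symbolic state: one 8-bit unknown plus constraints."""

    def __init__(self):
        self.constraints = []  # predicates over the unknown value

    def add_constraints(self, *predicates):
        self.constraints.extend(predicates)

    def satisfiable(self):
        # Brute force: does any of the 256 values satisfy every predicate?
        return any(all(p(v) for p in self.constraints) for v in range(256))

    def copy(self):
        clone = ToyState()
        clone.constraints = list(self.constraints)
        return clone

def try_constrain(state, predicate):
    """Commit predicate to state only if a copy remains satisfiable."""
    trial = state.copy()
    trial.add_constraints(predicate)
    if trial.satisfiable():
        state.add_constraints(predicate)
        return True
    return False

state = ToyState()
state.add_constraints(lambda v: v >= 0x21, lambda v: v <= 0x7E)

accepted = try_constrain(state, lambda v: v == 0x4B)  # 'K' is printable
rejected = try_constrain(state, lambda v: v == 0x00)  # conflicts with v >= 0x21
print(accepted, rejected, len(state.constraints))
```

The rejected constraint never touches the original state, which is the point of testing on a copy first.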

Run the symbolic execution and solve the constraints

simulation = project.factory.simgr(initial_state)

def is_successful(state):
  puts_address = 0x08049090
  if state.addr == puts_address:
    return check_puts(state)
  else:
    return False

simulation.explore(find=is_successful)

if simulation.found:
  solution_state = simulation.found[0]

  stored_solutions0 = solution_state.globals['solution0']
  stored_solutions1 = solution_state.globals['solution1']
  solution0 = solution_state.solver.eval(stored_solutions0)
  solution1 = solution_state.solver.eval(stored_solutions1, cast_to = bytes)
  print(solution0)
  print(solution1)
else:
  raise Exception('Could not find the solution')

16_angr_arbitrary_write

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/16_angr_arbitrary_write$ python3 generate.py 1234 16_angr_arbitrary_write
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/16_angr_arbitrary_write$ ./16_angr_arbitrary_write
Enter the password: aaaaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/16_angr_arbitrary_write$

Analysis

int __cdecl main(int argc, const char **argv, const char **envp)
{
  char s[16]; // [esp+Ch] [ebp-1Ch] BYREF
  char *dest; // [esp+1Ch] [ebp-Ch]

  dest = unimportant_buffer;
  memset(s, 0, sizeof(s));
  strncpy(password_buffer, "PASSWORD", 0xCu);
  printf("Enter the password: ");
  __isoc99_scanf("%u %20s", &key, s);
  if ( key == 0xA95593 )
    strncpy(dest, s, 0x10u);
  else
    strncpy(unimportant_buffer, s, 0x10u);
  if ( !strncmp(password_buffer, "TWOAPNES", 8u) )
    puts("Good Job.");
  else
    puts("Try again.");
  return 0;
}

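No input that stays within the intended bounds can make the program print "Good Job.", so we again rely on a stack overflow to get there.

Look at the stack layout first: s occupies the 16 bytes from ebp-0x1C to ebp-0xD, and dest sits directly above it at ebp-0xC.

So if the last four of our 20 input characters, the ones that overwrite dest, are the address of password_buffer, then the later strncpy(dest, s, 0x10u); actually writes into password_buffer. The second argument, s, is also under our control because it is our input: as long as the input starts with "TWOAPNES", the strncmp succeeds and the program prints Good Job. This branch is only taken when key == 0xA95593.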
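A hand-built version of such an input might look like the sketch below. The helper build_input and the filler byte are hypothetical; the key and password come from the decompiled main, and the destination address is the buffer_address value used in this walkthrough's script, assumed here to be the address of password_buffer in this particular build.

```python
import struct

PASSWORD = b"TWOAPNES"                # string compared by strncmp in main
PASSWORD_BUFFER_ADDRESS = 0x4858553C  # buffer_address from this walkthrough's script
KEY = 0xA95593                        # constant compared against key in main
S_SIZE = 16                           # sizeof(s); dest sits right after it

def build_input(key, password, dest_address, filler=b"A"):
    """Return '<key> <password padded to 16 bytes><4 little-endian address bytes>'."""
    body = password.ljust(S_SIZE, filler)    # fills s completely
    body += struct.pack("<I", dest_address)  # overwrites dest
    return str(key).encode() + b" " + body

payload = build_input(KEY, PASSWORD, PASSWORD_BUFFER_ADDRESS)
print(payload)
```

The first eight characters carry the data to write and the last four carry the location to write it to, which is what makes this an arbitrary write rather than just an arbitrary read.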

Solving with Angr

Code

# Essentially, the program does the following:
#
# scanf("%d %20s", &key, user_input);
# ...
#   // if certain unknown conditions are true...
#   strncpy(random_buffer, user_input);
#                  ^
# ...              |
# if (strncmp(secure_buffer, reference_string)) {
#   // The secure_buffer does not equal the reference string.
#   puts("Try again.");
# } else {
#   // The two are equal.
#   puts("Good Job.");
# }
#
# If this program has no bugs in it, it would _always_ print "Try again." since
# user_input copies into random_buffer, not secure_buffer.
#
# The question is: can we find a buffer overflow that will allow us to overwrite
# the random_buffer pointer to point to secure_buffer? (Spoiler: we can, but we
# will need to use Angr.)
#
# We want to identify a place in the binary, when strncpy is called, when we can:
#  1) Control the source contents (not the source pointer!)
#     * This will allow us to write arbitrary data to the destination.
#  2) Control the destination pointer
#     * This will allow us to write to an arbitrary location.
# If we can meet both of those requirements, we can write arbitrary data to an
# arbitrary location. Finally, we need to constrain the source contents to be
# equal to the reference_string and the destination pointer to be equal to the
# secure_buffer.

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # You can either use a blank state or an entry state; just make sure to start
  # at the beginning of the program.
  initial_state = project.factory.entry_state(
    add_options = {angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY, 
                   angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  class ReplacementScanf(angr.SimProcedure):
    # Hint: scanf("%u %20s")
    def run(self, format_string, scanf0_address, scanf1_address):
      # %u
      scanf0 = claripy.BVS('scanf0', 32)
    
      # %20s
      scanf1 = claripy.BVS('scanf1', 20* 8)

      for char in scanf1.chop(bits=8):
        self.state.add_constraints(char >= 0x21, char <= 0x7A)

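      # A hard-coded address for the global key could be used instead of the
      # hooked argument:
      #scanf0_address = 0x4858554C
      self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
      self.state.memory.store(scanf1_address, scanf1)  # a string, so byte order is not a concern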

      self.state.globals['solutions0'] = scanf0
      self.state.globals['solutions1'] = scanf1

  scanf_symbol = '__isoc99_scanf'  # :string
  project.hook_symbol(scanf_symbol, ReplacementScanf())

  # In this challenge, we want to check strncpy to determine if we can control
  # both the source and the destination. It is common that we will be able to
  # control at least one of the parameters, (such as when the program copies a
  # string that it received via stdin).
  def check_strncpy(state):
    # The stack will look as follows:
    # ...          ________________
    # esp + 15 -> /                \
    # esp + 14 -> |     param2     |
    # esp + 13 -> |      len       |
    # esp + 12 -> \________________/
    # esp + 11 -> /                \
    # esp + 10 -> |     param1     |
    #  esp + 9 -> |      src       |
    #  esp + 8 -> \________________/
    #  esp + 7 -> /                \
    #  esp + 6 -> |     param0     |
    #  esp + 5 -> |      dest      |
    #  esp + 4 -> \________________/
    #  esp + 3 -> /                \
    #  esp + 2 -> |     return     |
    #  esp + 1 -> |     address    |
    #      esp -> \________________/
    # (!)
    strncpy_src = state.memory.load(state.regs.esp + 8, 4, endness = project.arch.memory_endness)    # the size is 4 bytes (an earlier attempt mistakenly treated it as bits)
    strncpy_dest = state.memory.load(state.regs.esp + 4, 4, endness = project.arch.memory_endness)
    strncpy_len = state.memory.load(state.regs.esp + 12, 4, endness = project.arch.memory_endness)

    # We need to find out if src is symbolic, however, we care about the
    # contents, rather than the pointer itself. Therefore, we have to load the
    # the contents of src to determine if they are symbolic.
    # Hint: How many bytes is strncpy copying?
    # (!)
    src_contents = state.memory.load(strncpy_src, strncpy_len)

    # Our goal is to determine if we can write arbitrary data to an arbitrary
    # location. This means determining if the source contents are symbolic
    # (arbitrary data) and the destination pointer is symbolic (arbitrary
    # destination).
    # (!)
    if state.solver.symbolic(strncpy_dest) and state.solver.symbolic(src_contents):
      # Use ltrace to determine the password. Decompile the binary to determine
      # the address of the buffer it checks the password against. Our goal is to
      # overwrite that buffer to store the password.
      # (!)
      password_string = 'TWOAPNES' # :string
      buffer_address = 0x4858553C # :integer, probably in hexadecimal

      # Create an expression that tests whether the first n bytes of src_contents
      # equal password_string, where n is the length of the password. Warning:
      # while typical Python slices (array[start:end]) will work with bitvectors,
      # they are indexed in an odd way. The ranges must start with a high value
      # and end with a low value. Additionally, the bits are indexed from right
      # to left. For example, let a bitvector, b, equal 'ABCDEFGH', (64 bits).
      # The following will read bit 0-7 (total of 1 byte) from the right-most
      # bit (the end of the string).
      #  b[7:0] == 'H'
      # To access the beginning of the string, we need to access the last 16
      # bits, or bits 48-63:
      #  b[63:48] == 'AB'
      # In this specific case, since we don't necessarily know the length of the
      # contents (unless you look at the binary), we can use the following:
      #  b[-1:-16] == 'AB', since, in Python, -1 is the end of the list, and -16
      # is the 16th element from the end of the list. The actual numbers should
      # correspond with the length of password_string.
      # (!)
      does_src_hold_password = src_contents[-1:-64] == password_string

      # Create an expression to check if the dest parameter can be set to
      # buffer_address. If this is true, then we have found our exploit!
      # (!)
      does_dest_equal_buffer_address = strncpy_dest == buffer_address

      # In the previous challenge, we copied the state, added constraints to the
      # copied state, and then determined if the constraints of the new state
      # were satisfiable. Since that pattern is so common, Angr implemented a
      # parameter 'extra_constraints' for the satisfiable function that does the
      # exact same thing.  Note that we can pass multiple expressions to
      # extra_constraints.
      if state.satisfiable(extra_constraints=(does_src_hold_password, does_dest_equal_buffer_address)):
        state.add_constraints(does_src_hold_password, does_dest_equal_buffer_address)
        return True
      else:
        return False
    else: # not state.solver.symbolic(???)
      return False

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    strncpy_address = 0x080490F0
    if state.addr == strncpy_address:
      return check_strncpy(state)
    else:
      return False

  simulation.explore(find=is_successful)

  if simulation.found:
    solution_state = simulation.found[0]

    solution0 = solution_state.solver.eval(solution_state.globals['solutions0'])
    solution1 = solution_state.solver.eval(solution_state.globals['solutions1'], cast_to = bytes)
    print(solution0)
    print(solution1)

  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)
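The bit-indexing convention described in the comments of the script above (high index first, bit 0 at the right-most end, negative indices counted from the top) can be reproduced in plain Python. The function extract below is a hypothetical helper that only imitates that convention on byte strings; it is not part of claripy and assumes the requested width is a whole number of bytes.

```python
def extract(data, high, low):
    """Return bits high..low (inclusive, bit 0 = least significant) of data,
    read as one big-endian integer, as a byte string."""
    total_bits = len(data) * 8
    if high < 0:              # negative indices count back from the total width
        high += total_bits
    if low < 0:
        low += total_bits
    width = high - low + 1
    value = int.from_bytes(data, "big")
    chunk = (value >> low) & ((1 << width) - 1)
    return chunk.to_bytes(width // 8, "big")

b = b"ABCDEFGH"
last_char = extract(b, 7, 0)        # lowest 8 bits: end of the string
first_two = extract(b, 63, 48)      # highest 16 bits: start of the string
first_two_neg = extract(b, -1, -16) # same span with negative indices

src = b"TWOAPNES" + b"A" * 8        # 16 bytes, like the strncpy source
first_eight = extract(src, -1, -64) # the 8 bytes compared with the password
print(last_char, first_two, first_two_neg, first_eight)
```

With a 16-byte source, the slice [-1:-64] therefore selects the first eight characters, which is the part strncmp later compares against the password.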

Code with the comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  initial_state = project.factory.entry_state(
    add_options = {angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY, 
                   angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

  class ReplacementScanf(angr.SimProcedure):
    def run(self, format_string, scanf0_address, scanf1_address):
      scanf0 = claripy.BVS('scanf0', 32)
      scanf1 = claripy.BVS('scanf1', 20 * 8)

      for char in scanf1.chop(bits=8):
        self.state.add_constraints(char >= 0x21, char <= 0x7A)

      #scanf0_address = 0x4858554C
      self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
      self.state.memory.store(scanf1_address, scanf1)

      self.state.globals['solutions0'] = scanf0
      self.state.globals['solutions1'] = scanf1

  scanf_symbol = '__isoc99_scanf'  # :string
  project.hook_symbol(scanf_symbol, ReplacementScanf())

  def check_strncpy(state):
    strncpy_src = state.memory.load(state.regs.esp + 8, 4, endness = project.arch.memory_endness)
    strncpy_dest = state.memory.load(state.regs.esp + 4, 4, endness = project.arch.memory_endness)
    strncpy_len = state.memory.load(state.regs.esp + 12, 4, endness = project.arch.memory_endness)

    src_contents = state.memory.load(strncpy_src, strncpy_len)

    if state.solver.symbolic(strncpy_dest) and state.solver.symbolic(src_contents):
      password_string = 'TWOAPNES' # :string
      buffer_address = 0x4858553C # :integer, probably in hexadecimal
      does_src_hold_password = src_contents[-1:-64] == password_string

      does_dest_equal_buffer_address = strncpy_dest == buffer_address

      if state.satisfiable(extra_constraints=(does_src_hold_password, does_dest_equal_buffer_address)):
        state.add_constraints(does_src_hold_password, does_dest_equal_buffer_address)
        return True
      else:
        return False
    else: 
      return False

  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    strncpy_address = 0x080490F0
    if state.addr == strncpy_address:
      return check_strncpy(state)
    else:
      return False

  simulation.explore(find=is_successful)

  if simulation.found:
    solution_state = simulation.found[0]

    solution0 = solution_state.solver.eval(solution_state.globals['solutions0'])
    solution1 = solution_state.solver.eval(solution_state.globals['solutions1'], cast_to = bytes)
    print(solution0)
    print(solution1)

  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Solution walkthrough:

1. Hook scanf and make both of its arguments symbolic

Earlier levels used the same procedure.

class ReplacementScanf(angr.SimProcedure):
  def run(self, format_string, scanf0_address, scanf1_address):
    scanf0 = claripy.BVS('scanf0', 32)
    scanf1 = claripy.BVS('scanf1', 20 * 8)

    for char in scanf1.chop(bits=8):
      self.state.add_constraints(char >= 0x21, char <= 0x7A)

    #scanf0_address = 0x4858554C
    self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
    self.state.memory.store(scanf1_address, scanf1)

    self.state.globals['solutions0'] = scanf0
    self.state.globals['solutions1'] = scanf1

scanf_symbol = '__isoc99_scanf'  # :string
project.hook_symbol(scanf_symbol, ReplacementScanf())
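
The `endness` argument in the hook matters only for the integer: on 32-bit x86 a 4-byte value sits in memory least-significant byte first, whereas a string is just a sequence of bytes. The sketch below uses only the standard `struct` module to show the two layouts. The value is an arbitrary example, not something taken from the challenge binary.

```python
import struct

value = 0x11223344  # arbitrary example value (assumption, not from the binary)

little = struct.pack('<I', value)  # byte order x86 uses for a 32-bit integer
big = struct.pack('>I', value)     # byte order of a plain big-endian store

print(little.hex())  # 44332211
print(big.hex())     # 11223344
```

Since angr's raw memory stores are big-endian unless told otherwise, leaving out `endness` for scanf0 would hand the program the second layout.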

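2. Define check_strncpy(state)

For this step, first look further down at the place where the function is called.

As the code shows, check_strncpy runs whenever execution reaches strncpy, so the state passed to it is the state at the entry of strncpy.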

def is_successful(state):
  strncpy_address = 0x080490F0
  if state.addr == strncpy_address:
    return check_strncpy(state)
  else:
    return False

3. Read esp from the current state and use it to load the arguments

strncpy_src = state.memory.load(state.regs.esp + 8, 4, endness = project.arch.memory_endness)    # the size is 4 bytes; I first misread it as bits
strncpy_dest = state.memory.load(state.regs.esp + 4, 4, endness = project.arch.memory_endness)
strncpy_len = state.memory.load(state.regs.esp + 12, 4, endness = project.arch.memory_endness)

These load, respectively, the source string address, the destination buffer address, and the length.
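
The offsets 4, 8 and 12 follow from the cdecl calling convention: at the first instruction of the callee, [esp] holds the return address and the arguments follow in order. Here is a minimal sketch of that layout in plain Python. All addresses and the length are hypothetical values chosen for illustration.

```python
import struct

# Hypothetical values for illustration only (not read from the binary).
return_address = 0x08049300
dest, src, length = 0x1000, 0x2000, 16

# Stack contents at the first instruction of strncpy, lowest address first.
stack = struct.pack('<4I', return_address, dest, src, length)

def load_u32(offset):
    """Read a little-endian 32-bit value at esp + offset."""
    return struct.unpack_from('<I', stack, offset)[0]

print(hex(load_u32(4)))   # dest   -> 0x1000
print(hex(load_u32(8)))   # src    -> 0x2000
print(load_u32(12))       # length -> 16
```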

4. Load the contents of src (our input string)

src_contents = state.memory.load(strncpy_src, strncpy_len)

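5. Check that the values are symbolic (i.e. that the overflow happened)

If we have found the vulnerability, the destination buffer address at this point has been overwritten by our overflowing input. That input was made symbolic earlier, so the address should be a symbolic unknown as well. Likewise, the source string holds our scanf1, so its contents should also be symbolic. Together these two properties indicate that we have triggered the stack overflow.

if state.solver.symbolic(strncpy_dest) and state.solver.symbolic(src_contents):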

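6. Add further constraints

Having confirmed the stack overflow, we add further constraints so that the program ends up printing "Good Job".

Two constraints are added:

The contents of src (our scanf1) must be: TWOAPNES
The address in dest (the part overwritten by the overflow) must be: the address of secure_buffer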

password_string = 'TWOAPNES' # :string
buffer_address = 0x4858553C # :integer, probably in hexadecimal
does_src_hold_password = src_contents[-1:-64] == password_string

does_dest_equal_buffer_address = strncpy_dest == buffer_address
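
The slice `[-1:-64]` is a bit-level slice: bit 0 is the least significant bit, negative indices count down from the top bit, and both ends are inclusive. On a 128-bit value it therefore selects the top 64 bits, which are the first 8 bytes in memory. The sketch below mimics this with plain integers. It assumes that src_contents is 16 bytes wide and that the bytes after the password are zero padding.

```python
def extract(value, width, high, low):
    """Return bits high..low (inclusive, bit 0 = least significant) of value.

    Negative indices count down from the top bit, mirroring the slice used
    in the solution script.
    """
    if high < 0:
        high += width
    if low < 0:
        low += width
    return (value >> low) & ((1 << (high - low + 1)) - 1)

data = b'TWOAPNES' + b'\x00' * 8      # 16 bytes: password plus assumed padding
as_int = int.from_bytes(data, 'big')  # raw memory loads are big-endian

top = extract(as_int, 128, -1, -64)
print(top.to_bytes(8, 'big'))  # b'TWOAPNES'
```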

7. Check whether the constraints are satisfiable

See the description in the previous level.

if state.satisfiable(extra_constraints=(does_src_hold_password, does_dest_equal_buffer_address)):
  state.add_constraints(does_src_hold_password, does_dest_equal_buffer_address)

17_angr_arbitrary_jump

Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/17_angr_arbitrary_jump$ python3 generate.py 1234 17_angr_arbitrary_jump
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/17_angr_arbitrary_jump$ ./17_angr_arbitrary_jump
Enter the password: aaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/17_angr_arbitrary_jump$

Analysis

int __cdecl main(int argc, const char **argv, const char **envp)
{
  printf("Enter the password: ");
  read_input();
  puts("Try again.");
  return 0;
}

The main function contains nothing related to "Good Job". Searching the strings turns up a print_good() function:

void __noreturn print_good()
{
  puts("Good Job.");
  exit(0);
}

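However, nothing references it. This level requires us to construct the jump ourselves in order to call that function.

In read_input()'s stack frame the return address sits right after v1. Overflowing v1 overwrites the return address, which lets us call print_good().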
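
The idea can be sketched with a toy stack frame in plain Python. The buffer size and the original return address below are hypothetical and chosen only for illustration. The target address is the one the solution script later constrains eip to.

```python
import struct

BUFFER_SIZE = 16         # hypothetical size of v1, for illustration only
PRINT_GOOD = 0x48585558  # address that the solution script constrains eip to
OLD_RETURN = 0x08049000  # hypothetical original return address

# Toy frame: the buffer followed directly by the saved return address.
frame = bytearray(BUFFER_SIZE) + bytearray(struct.pack('<I', OLD_RETURN))

def unchecked_copy(memory, data):
    """Copy data into memory from offset 0 without checking the buffer size."""
    memory[0:len(data)] = data

payload = b'A' * BUFFER_SIZE + struct.pack('<I', PRINT_GOOD)
unchecked_copy(frame, payload)

new_return = struct.unpack_from('<I', frame, BUFFER_SIZE)[0]
print(hex(new_return))  # 0x48585558
```

In the real binary the distance between the start of v1 and the return address has to be read from the disassembly or recovered by the solver.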

Solving with Angr

Code

# An unconstrained state occurs when there are too many
# possible branches from a single instruction. This occurs, among other ways,
# when the instruction pointer (on x86, eip) is completely symbolic, meaning
# that user input can control the address of code the computer executes.
# For example, imagine the following pseudo assembly:
#
# mov user_input, eax
# jmp eax
#
# The value of what the user entered dictates the next instruction. This
# is an unconstrained state. It wouldn't usually make sense for the execution
# engine to continue. (Where should the program jump to if eax could be
# anything?) Normally, when Angr encounters an unconstrained state, it throws
# it out. In our case, we want to exploit the unconstrained state to jump to
# a location of our choosing. We will get to how to disable Angr's default
# behavior later.
#
# This challenge represents a classic stack-based buffer overflow attack to
# overwrite the return address and jump to a function that prints "Good Job."
# Our strategy for solving the challenge is as follows:
# 1. Initialize the simulation and ask Angr to record unconstrained states.
# 2. Step through the simulation until we have found a state where eip is
#    symbolic.
# 3. Constrain eip to equal the address of the "print_good" function.


import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)

  # Make a symbolic input that has a decent size to trigger overflow
  # (!)
  symbolic_input = claripy.BVS("input", 8 * 59)

  # Create initial state and set stdin to the symbolic input
  initial_state = project.factory.entry_state(
          stdin=symbolic_input,
          add_options = {
              angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
              angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS
              }
          )

  # Ensure that every byte of input is within the acceptable ASCII range (A..Z)
  # (!)
  for byte in symbolic_input.chop(bits=8):
    initial_state.add_constraints(
      claripy.And(
        byte >= 'A',
        byte <= 'Z'
      )
    )

  # The save_unconstrained=True parameter specifies to Angr to not throw out
  # unconstrained states. Instead, it will move them to the list called
  # 'simulation.unconstrained'.  Additionally, we will be using a few stashes
  # that are not included by default, such as 'found' and 'not_needed'. You will
  # see how these are used later.
  # (!)
  simulation = project.factory.simgr(
    initial_state,
    save_unconstrained=True,
    stashes={
      'active' : [initial_state],
      'unconstrained' : [],
      'found' : [],
      'not_needed' : []
    }
  )

  # Explore will not work for us, since the method specified with the 'find'
  # parameter will not be called on an unconstrained state. Instead, we want to
  # explore the binary ourselves. To get started, construct an exit condition
  # to know when the simulation has found a solution. We will later move
  # states from the unconstrained list to the simulation.found list.
  # Create a boolean value that indicates a state has been found.
  def has_found_solution():
    return simulation.found

  # An unconstrained state occurs when there are too many possible branches
  # from a single instruction. This occurs, among other ways, when the
  # instruction pointer (on x86, eip) is completely symbolic, meaning
  # that user input can control the address of code the computer executes.
  # For example, imagine the following pseudo assembly (AT&T operand order):
  #
  # mov user_input, eax
  # jmp eax
  #
  # The value of what the user entered dictates the next instruction. This
  # is an unconstrained state. It wouldn't usually make sense for the execution
  # engine to continue. (Where should the program jump to if eax could be
  # anything?) Normally, when Angr encounters an unconstrained state, it throws
  # it out. In our case, we want to exploit the unconstrained state to jump to
  # a location of our choosing.  Check if there are still unconstrained states
  # by examining the simulation.unconstrained list.
  # (!)
  def has_unconstrained_to_check():
    return simulation.unconstrained

  # The list simulation.active is a list of all states that can be explored
  # further.
  # (!)
  def has_active():
    return simulation.active

  while (has_active() or has_unconstrained_to_check()) and (not has_found_solution()):
    for unconstrained_state in simulation.unconstrained:
      def should_move(s):
        return s is unconstrained_state
        # Look for unconstrained states and move them to the 'found' stash.
        # A 'stash' should be a string that corresponds to a list that stores
        # all the states that the state group keeps. Values include:
        #  'active' = states that can be stepped
        #  'deadended' = states that have exited the program
        #  'errored' = states that encountered an error with Angr
        #  'unconstrained' = states that are unconstrained
        #  'found' = solutions
        #  anything else = whatever you want, perhaps you want a 'not_needed',
        #                  you can call it whatever you want
        # (!)
      simulation.move('unconstrained', 'found', filter_func = should_move)

    # Advance the simulation.
    simulation.step()

  if simulation.found:
    solution_state = simulation.found[0]

    # Constrain the instruction pointer to target the print_good function and
    # then solve for the user input (recall that this is
    # 'solution_state.posix.dumps(sys.stdin.fileno())')
    # (!)
    solution_state.add_constraints(solution_state.regs.eip == 0x48585558)

    solution = solution_state.posix.dumps(sys.stdin.fileno()).decode()
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

Code with comments removed

import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  project = angr.Project(path_to_binary)


  symbolic_input = claripy.BVS("input", 8 * 59)

  initial_state = project.factory.entry_state(
          stdin=symbolic_input,
          add_options = {
              angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
              angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS
              }
          )

  for byte in symbolic_input.chop(bits=8):
    initial_state.add_constraints(
      claripy.And(
        byte >= 'A',
        byte <= 'Z'
      )
    )

  simulation = project.factory.simgr(
    initial_state,
    save_unconstrained=True,
    stashes={
      'active' : [initial_state],
      'unconstrained' : [],
      'found' : [],
      'not_needed' : []
    }
  )

  def has_found_solution():
    return simulation.found

  def has_unconstrained_to_check():
    return simulation.unconstrained

  def has_active():
    return simulation.active

  while (has_active() or has_unconstrained_to_check()) and (not has_found_solution()):
    for unconstrained_state in simulation.unconstrained:
      def should_move(s):
        return s is unconstrained_state

      simulation.move('unconstrained', 'found', filter_func = should_move)

    simulation.step()

  if simulation.found:
    solution_state = simulation.found[0]

    solution_state.add_constraints(solution_state.regs.eip == 0x48585558)

    solution = solution_state.posix.dumps(sys.stdin.fileno()).decode()
    print(solution)
  else:
    raise Exception('Could not find the solution')

if __name__ == '__main__':
  main(sys.argv)

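One oddity: the script only works when the symbolic input is 59 bytes or longer, even though the overflow itself needs only 52 bytes.

The characters at the end of the resulting output are also unused:

AAAABAABBAAAABAABBBABBBBBBBABBBAAABBAABABABBABBBXUXHBABAABB

Only XUXH is the return address we want to overwrite; the trailing BABAABB serves no purpose. Yet with a symbol shorter than 59 bytes the script does not produce a solution.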
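
The link between XUXH and the target address can be checked with the standard `struct` module. This small sketch takes the output string above and the address 0x48585558 from the script, and locates the little-endian address bytes inside the solution.

```python
import struct

solution = b'AAAABAABBAAAABAABBBABBBBBBBABBBAAABBAABABABBABBBXUXHBABAABB'
target = struct.pack('<I', 0x48585558)  # little-endian bytes of the address

offset = solution.index(target)
print(target)         # b'XUXH'
print(offset)         # 48
print(len(solution))  # 59
```

The 48 padding bytes plus the 4 address bytes account for the 52 bytes mentioned above. Every byte of the address also falls inside A..Z, which is why the range constraint on the input does not rule the solution out.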

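Commonly used functions in Angr_Lab

angr.Project(path_to_binary)

Creates an Angr project. The argument path_to_binary is the path to the binary file.

Usage example

path_to_binary = argv[1]  # take the file path from the command-line argument
project = angr.Project(path_to_binary)  # create an Angr project for that binary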

project.factory.entry_state(add_options = {})

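Sets the initial execution state and tells Angr to start at the program's entry point (the startup code that eventually calls main).

Usage example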

initial_state = project.factory.entry_state(
  add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                  angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
)

project.factory.blank_state(addr = , add_options = {})

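Sets the initial execution state and tells Angr to start executing at the specified address.

Note that this is easy to confuse with the previous function: one is entry_state, the other is blank_state.

Usage example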

initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = { angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                    angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )

claripy.BVS('', num)

Creates a symbolic bitvector. The first argument is the symbol name; the second is the length of the bitvector in bits.

Usage example

password_size_bits = 32
password0 = claripy.BVS('password0', password_size_bits)
password1 = claripy.BVS('password1', password_size_bits)

simulation.explore(find= , avoid=)

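Runs symbolic execution. find specifies the path to accept and avoid specifies the path to avoid.

Both parameters accept several forms.

Using addresses as arguments

When addresses are passed, find and avoid each take an address as the target to accept or avoid.

print_good_address = 0x080492F3
simulation.explore(find=print_good_address)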

Using callback functions as arguments

When a callback is passed, it must return True or False rather than an address.

def is_successful(state):
  stdout_output = state.posix.dumps(sys.stdout.fileno())

  return "Good Job".encode() in stdout_output  # True if "Good Job." appears in stdout

def should_abort(state):
  stdout_output = state.posix.dumps(sys.stdout.fileno())

  return "Try again".encode() in stdout_output  # True if "Try again." appears in stdout

simulation.explore(find=is_successful, avoid=should_abort)
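
The two callbacks can be tried out without a binary by handing them a stand-in object. The sketch below is only a test harness: `make_fake_state` is a hypothetical helper that imitates the single call `state.posix.dumps(fd)` and is not part of angr.

```python
from types import SimpleNamespace

def make_fake_state(stdout_bytes):
    """Build a stand-in exposing only posix.dumps(fd); not a real angr state."""
    posix = SimpleNamespace(dumps=lambda fd: stdout_bytes)
    return SimpleNamespace(posix=posix)

def is_successful(state):
    return b'Good Job' in state.posix.dumps(1)   # fd 1 is stdout

def should_abort(state):
    return b'Try again' in state.posix.dumps(1)

good = make_fake_state(b'Enter the password: Good Job.\n')
bad = make_fake_state(b'Enter the password: Try again.\n')

print(is_successful(good), should_abort(good))  # True False
print(is_successful(bad), should_abort(bad))    # False True
```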

solution_state.posix.dumps(sys.stdin.fileno())

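With the default symbol injection, the solved input is read back from standard input, i.e. the file descriptor sys.stdin.fileno().

Usage example

print(solution_state.posix.dumps(sys.stdin.fileno()).decode())  # decode to a str and print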

solution_state.add_constraints( solution_state.xx)

Adds a constraint. Note that it must be added to solution_state (I mistakenly used initial_state before).

Usage example

solution_state = simulation.found[0]  
solution_state.add_constraints( solution_state.regs.eax != 0 )
solution = solution_state.solver.eval(password, cast_to = bytes)

solution_state.solver.eval(symbol)

When we create the symbols ourselves, this is the function that solves the constraints for a symbol's concrete value.

Usage example

Integers

if simulation.found:
  solution_state = simulation.found[0]

  solution0 = solution_state.solver.eval(password0)
  solution1 = solution_state.solver.eval(password1)
  solution2 = solution_state.solver.eval(password2)

  solution = str(hex(solution0)) + ' ' + str(hex(solution1)) + ' ' + str(hex(solution2))   # :string, formatted as hexadecimal because scanf uses %x
  print(solution)
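
The formatting step can be written more compactly. This small sketch uses made-up integers in place of the solver results and shows two ways to produce text that a %x conversion can read. As far as I know, %x accepts input both with and without the 0x prefix.

```python
values = [0xb9ffd04e, 0xccf63fe8, 0x8fd4d959]  # made-up example values

with_prefix = ' '.join(hex(v) for v in values)             # hex() already returns a str
without_prefix = ' '.join(format(v, 'x') for v in values)  # bare hexadecimal digits

print(with_prefix)     # 0xb9ffd04e 0xccf63fe8 0x8fd4d959
print(without_prefix)  # b9ffd04e ccf63fe8 8fd4d959
```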

Strings

solution = solution_state.solver.eval(password,cast_to=bytes)
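
Without cast_to, eval returns a Python integer. With cast_to=bytes the same value comes back as a big-endian byte string. The plain-Python sketch below shows the equivalent conversion, assuming a 64-bit symbol that holds eight characters. The integer is built from a known string here instead of coming from a solver.

```python
password_bits = 64                           # assumed symbol width: 8 characters
as_int = int.from_bytes(b'TWOAPNES', 'big')  # stands in for eval() without cast_to

as_bytes = as_int.to_bytes(password_bits // 8, 'big')
print(as_bytes)           # b'TWOAPNES'
print(as_bytes.decode())  # TWOAPNES
```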
