# Before you begin, here are a few notes about these capture-the-flag
# challenges.
#
# Each binary, when run, will ask for a password, which can be entered via stdin
# (typing it into the console). Many of the levels will accept many different
# passwords. Your goal is to find a single password that works for each binary.
#
# If you enter an incorrect password, the program will print "Try again." If you
# enter a correct password, the program will print "Good Job."
#
# Each challenge will be accompanied by a file like this one, named
# "scaffoldXX.py". It will offer guidance as well as the skeleton of a possible
# solution. You will have to edit each file. In some cases, you will have to
# edit it significantly. While use of these files is recommended, you can write
# a solution without them, if you find that they are too restrictive.
#
# Places in the scaffoldXX.py that require a simple substitution will be marked
# with three question marks (???). Places that require more code will be marked
# with an ellipsis (...). Comments will document any new concepts, but will be
# omitted for concepts that have already been covered (you will need to use
# previous scaffoldXX.py files as a reference to solve the challenges). If a
# comment documents a part of the code that needs to be changed, it will be
# marked with an exclamation point at the end, on a separate line (!).
import angr
import sys

def main(argv):
  # Create an Angr project.
  # If you want to be able to point to the binary from the command line, you can
  # use argv[1] as the parameter. Then, you can run the script from the command
  # line as follows:
  # python ./scaffold00.py [binary]
  # In other words, the binary to solve is passed as the first command-line
  # argument.
  # (!)
  path_to_binary = argv[1]  # :string
  project = angr.Project(path_to_binary)  # takes the path to the binary
  # Tell Angr where to start executing (should it start from the main()
  # function or somewhere else?) For now, use the entry_state function, which
  # starts at the binary's entry point so that execution proceeds into main().
  initial_state = project.factory.entry_state(
    add_options = {
      # Fill unconstrained memory with symbolic values.
      angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
      # Fill unconstrained registers with symbolic values.
      angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS
    }
  )
  # Create a simulation manager initialized with the starting state. It provides
  # a number of useful tools to search and execute the binary.
  simulation = project.factory.simgr(initial_state)
  # Explore the binary to attempt to find the address that prints "Good Job."
  # You will have to find the address you want to find and insert it here.
  # This function will keep executing until it either finds a solution or it
  # has explored every possible path through the executable.
  # (!)
  print_good_address = 0x080492F3  # :integer (probably in hexadecimal)
  simulation.explore(find=print_good_address)
  # Check that we have found a solution. The simulation.explore() method will
  # set simulation.found to a list of the states that it could find that reach
  # the instruction we asked it to search for. Remember, in Python, if a list
  # is empty, it will be evaluated as false, otherwise true.
  if simulation.found:
    # The explore method stops after it finds a single state that arrives at the
    # target address.
    solution_state = simulation.found[0]
    # Print the string that Angr wrote to stdin to follow solution_state. This
    # is our solution.
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    # If Angr could not find a path that reaches print_good_address, throw an
    # error. Perhaps you mistyped the print_good_address?
    raise Exception('Could not find the solution')
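The check on `simulation.found` relies only on ordinary Python list truthiness. A minimal sketch of that pattern, using a hypothetical `pick_first` helper and plain strings standing in for Angr states:

```python
def pick_first(found):
    """Return the first element of `found`, or raise if the list is empty."""
    if found:  # an empty list evaluates as false
        return found[0]
    raise Exception('Could not find the solution')


# Plain strings stand in for Angr states here (illustration only).
print(pick_first(['state_a', 'state_b']))
```

Only the first element is used because any state that reaches the target yields a valid password.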
  # Explore the binary, but this time, instead of only looking for a state that
  # reaches the print_good_address, also find a state that does not reach
  # will_not_succeed_address. The binary is pretty large; to save you some time,
  # everything you will need to look at is near the beginning of the address
  # space.
  # (!)
  print_good_address = 0x080492FB
  will_not_succeed_address = 0x080492C2
  # Start symbolic execution: search for the find address while avoiding the
  # avoid address.
  simulation.explore(find=print_good_address, avoid=will_not_succeed_address)
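  # The resulting states are stored in simulation.found, a list of states.
  if simulation.found:  # an empty list means no matching state was found
    solution_state = simulation.found[0]  # a state that reached the target
    # Dump what was written to stdin, decode it to a string, and print it.
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    raise Exception('Could not find the solution')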
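The find/avoid idea can be illustrated without Angr as a graph search that stops at one node and refuses to expand another. This is only an analogy, not how Angr is implemented; the `explore` function and the graph labels below are made up for illustration:

```python
from collections import deque


def explore(graph, start, find, avoid):
    """Breadth-first search for `find` that never expands `avoid`."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == find:
            return path
        if node == avoid:
            continue  # prune: do not step this state any further
        for successor in graph.get(node, []):
            if successor not in seen:
                seen.add(successor)
                queue.append(path + [successor])
    return None


# Hypothetical control-flow graph of a password checker.
graph = {
    'entry': ['check'],
    'check': ['try_again', 'good_job'],
    'try_again': ['exit'],
    'good_job': ['exit'],
}
path = explore(graph, 'entry', find='good_job', avoid='try_again')
print(path)
```

Pruning avoided nodes early is what saves time: nothing reachable only through the "Try again." branch is ever visited.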
# It is very useful to be able to search for a state that reaches a certain
# instruction. However, in some cases, you may not know the address of the
# specific instruction you want to reach (or perhaps there is no single
# instruction goal). In this challenge, you don't know which instruction
# grants you success. Instead, you just know that you want to find a state where
# the binary prints "Good Job."
# Angr is powerful in that it allows you to search for a state that meets an
# arbitrary condition that you specify in Python, using a predicate you define
# as a function that takes a state and returns True if you have found what you
# are looking for, and False otherwise.

import angr
import sys
  # Define a function that checks if you have found the state you are looking
  # for.
  def is_successful(state):
    # Dump whatever has been printed out by the binary so far into a string.
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    # Return whether 'Good Job.' has been printed yet.
    # (!)
    return "Good Job".encode() in stdout_output  # True if stdout contains "Good Job"
  # Same as above, but this time check if the state should abort. If you return
  # False, Angr will continue to step the state. In this specific challenge, the
  # only time at which you will know you should abort is when the program prints
  # "Try again."
  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output  # True if stdout contains "Try again"
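Both predicates only inspect the bytes the program has written to stdout so far, so they can be tried out without Angr. A minimal sketch, assuming hypothetical `FakePosix` and `FakeState` stand-ins that mimic just the `state.posix.dumps(fd)` call used above:

```python
STDOUT_FD = 1  # conventional file descriptor number for stdout


class FakePosix:
    """Hypothetical stand-in returning canned bytes for a file descriptor."""

    def __init__(self, streams):
        self._streams = streams

    def dumps(self, fd):
        return self._streams.get(fd, b'')


class FakeState:
    """Hypothetical stand-in for an Angr state (illustration only)."""

    def __init__(self, stdout_bytes):
        self.posix = FakePosix({STDOUT_FD: stdout_bytes})


def is_successful(state):
    return b'Good Job' in state.posix.dumps(STDOUT_FD)


def should_abort(state):
    return b'Try again' in state.posix.dumps(STDOUT_FD)


winning = FakeState(b'Enter the password: Good Job.\n')
losing = FakeState(b'Enter the password: Try again.\n')
print(is_successful(winning), should_abort(losing))
```

Comparing against bytes matters because `dumps` returns bytes, not a string.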
  # Tell Angr to explore the binary and find any state that is_successful
  # identifies as a successful state by returning True.
  simulation.explore(find=is_successful, avoid=should_abort)
  if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    raise Exception('Could not find the solution')
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/03_angr_symbolic_registers$ python3 generate.py 1234 03_angr_symbolic_registers
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/03_angr_symbolic_registers$ ./03_angr_symbolic_registers
Enter the password: aaaa a a
Try again.
# Angr doesn't currently support reading multiple things with scanf (e.g.
# scanf("%u %u")). You will have to tell the simulation engine to begin the
# program after scanf is called and manually inject the symbols into registers.

import angr
import claripy
import sys
  # Sometimes, you want to specify where the program should start. The variable
  # start_address will specify where the symbolic execution engine should begin.
  # Note that we are using blank_state, not entry_state.
  # (!)
  start_address = 0x080495F9  # :integer (probably hexadecimal)
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = {
      angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
      angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS
    }
  )
  # Create a symbolic bitvector (the datatype Angr uses to inject symbolic
  # values into the binary). The first parameter is just a name Angr uses
  # to reference it.
  # You will have to construct multiple bitvectors. Copy the two lines below
  # and change the variable names. To figure out how many (and of what size)
  # you need, disassemble the binary and determine the format parameter passed
  # to scanf.
  # (!)
  password0_size_in_bits = 32  # :integer, width of each symbolic bitvector
  password0 = claripy.BVS('password0', password0_size_in_bits)
  password1 = claripy.BVS('password1', password0_size_in_bits)
  password2 = claripy.BVS('password2', password0_size_in_bits)
  ...
  # Set a register to a symbolic value. This is one way to inject symbols into
  # the program.
  # initial_state.regs stores a number of convenient attributes that reference
  # registers by name. For example, to set eax to password0, use:
  #
  # initial_state.regs.eax = password0
  #
  # You will have to set multiple registers to distinct bitvectors. Copy and
  # paste the line below and change the register. To determine which symbol to
  # inject into which register, disassemble the binary and look at the
  # instructions immediately following the call to scanf.
  # (!)
  initial_state.regs.eax = password0
  initial_state.regs.ebx = password1
  initial_state.regs.edx = password2
  if simulation.found:  # an accepting path was found
    solution_state = simulation.found[0]
    # Solve for the symbolic values. If there are multiple solutions, we only
    # care about one, so we can use eval, which returns any (but only one)
    # solution. Pass eval the bitvector you want to solve for.
    # (!)
    solution0 = solution_state.solver.eval(password0)
    solution1 = solution_state.solver.eval(password1)
    solution2 = solution_state.solver.eval(password2)
    # Aggregate and format the solutions you computed above, and then print
    # the full string. Pay attention to the order of the integers, and the
    # expected base (decimal, octal, hexadecimal, etc).
    # Note: the values are written in hexadecimal because scanf uses %x.
    solution = hex(solution0) + ' ' + hex(solution1) + ' ' + hex(solution2)  # :string
    print(solution)
  else:
    raise Exception('Could not find the solution')
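The formatting step is plain Python and can be checked in isolation. A minimal sketch with a hypothetical `format_passwords` helper; the integers below are made-up values, not solutions for any real binary:

```python
def format_passwords(values):
    """Join integers as hexadecimal tokens separated by single spaces."""
    return ' '.join(format(value, 'x') for value in values)


# Made-up 32-bit values standing in for solver results.
print(format_passwords([0xb9ffd04e, 0xccf63fe8, 0x8fd4d959]))
```

`hex()` produces the same digits with a `0x` prefix, which a `%x` conversion also accepts, so either form should work as input.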
# This challenge will be more challenging than the previous challenges that you
# have encountered thus far. Since the goal of this CTF is to teach symbolic
# execution and not how to construct stack frames, these comments will walk you
# through understanding what is on the stack.
#   ! ! !
# IMPORTANT: Any addresses in this script aren't necessarily right! Disassemble
#            the binary yourself to determine the correct addresses!
#   ! ! !

import angr
import claripy
import sys
  # For this challenge, we want to begin after the call to scanf. Note that this
  # is in the middle of a function.
  #
  # This challenge requires dealing with the stack, so you have to pay extra
  # careful attention to where you start, otherwise you will enter a condition
  # where the stack is set up incorrectly. In order to determine where after
  # scanf to start, we need to look at the disassembly of the call and the
  # instruction immediately following it:
  #   sub    $0x4,%esp
  #   lea    -0x10(%ebp),%eax
  #   push   %eax
  #   lea    -0xc(%ebp),%eax
  #   push   %eax
  #   push   $0x80489c3
  #   call   8048370 <__isoc99_scanf@plt>
  #   add    $0x10,%esp
  # Now, the question is: do we start on the instruction immediately following
  # scanf (add $0x10,%esp), or the instruction following that (not shown)?
  # Consider what the 'add $0x10,%esp' is doing. Hint: it has to do with the
  # scanf parameters that are pushed to the stack before calling the function.
  # Given that we are not calling scanf in our Angr simulation, where should we
  # start?
  # Answer: start at the instruction after the add. The add only cleans up the
  # scanf arguments; if we started at the add itself, every address we use for
  # the caller's stack data would have to be adjusted by 0x10.
  # (!)
  start_address = 0x80493F2
  initial_state = project.factory.blank_state(
    addr=start_address,
    add_options = {
      angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
      angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS
    }
  )
  # We are jumping into the middle of a function! Therefore, we need to account
  # for how the function constructs the stack. The second instruction of the
  # function is (in AT&T syntax):
  #   mov    %esp,%ebp
  # At which point it allocates the part of the stack frame we plan to target:
  #   sub    $0x18,%esp
  # Note the value of esp relative to ebp. The space between them is (usually)
  # the stack space. Since esp was decreased by 0x18, the frame looks like this:
  #
  #   (high addresses)
  #        /-------- The stack --------\
  # ebp -> |                           |
  #        |---------------------------|
  #        |                           |
  #        |---------------------------|
  #         . . . (total of 0x18 bytes)
  #         . . . Somewhere in here is
  #         . . . the data that stores
  #         . . . the result of scanf.
  # esp -> |                           |
  #        \---------------------------/
  #   (low addresses)
  #
  # Since we are starting after scanf, we are skipping this stack construction
  # step. To make up for this, we need to construct the stack ourselves. Let us
  # start by initializing ebp in the exact same way the program does.
  initial_state.regs.ebp = initial_state.regs.esp
  # scanf("%u %u") needs to be replaced by injecting two bitvectors. The
  # reason for this is that Angr does not (currently) automatically inject
  # symbols if scanf has more than one input parameter. This means Angr can
  # handle 'scanf("%u")', but not 'scanf("%u %u")'.
  # You can either copy and paste the line below or use a Python list.
  # (!)
  password_size_bits = 32
  password0 = claripy.BVS('password0', password_size_bits)
  password1 = claripy.BVS('password1', password_size_bits)
  # Here is the hard part. We need to figure out what the stack looks like, at
  # least well enough to inject our symbols where we want them. In order to do
  # that, let's figure out what the parameters of scanf are:
  #   sub    $0x4,%esp
  #   lea    -0x10(%ebp),%eax
  #   push   %eax
  #   lea    -0xc(%ebp),%eax
  #   push   %eax
  #   push   $0x80489c3
  #   call   8048370 <__isoc99_scanf@plt>
  #   add    $0x10,%esp
  # As you can see, the call to scanf looks like this:
  # scanf(  0x80489c3,   ebp - 0xc,   ebp - 0x10  )
  #      format_string    password0    password1
  # From this, we can construct our new, more accurate stack diagram:
  #
  #            /-------- The stack --------\
  # ebp ->     |          padding          |
  #            |---------------------------|
  # ebp - 0x01 |       more padding        |
  #            |---------------------------|
  # ebp - 0x02 |     even more padding     |
  #            |---------------------------|
  #                        . . .               <- How much padding? Hint: how
  #            |---------------------------|      many bytes is password0?
  # ebp - 0x0b |   password0, second byte  |
  #            |---------------------------|
  # ebp - 0x0c |   password0, first byte   |
  #            |---------------------------|
  # ebp - 0x0d |   password1, last byte    |
  #            |---------------------------|
  #                        . . .
  #            |---------------------------|
  # ebp - 0x10 |   password1, first byte   |
  #            |---------------------------|
  #                        . . .
  #            |---------------------------|
  # esp ->     |                           |
  #            \---------------------------/
  #
  # Figure out how much space there is and allocate the necessary padding to
  # the stack by decrementing esp before you push the password bitvectors.
  # Answer: 0x8 bytes of padding, so that the first 4-byte push places
  # password0 at ebp - 0x0c.
  padding_length_in_bytes = 0x8  # :integer
  initial_state.regs.esp -= padding_length_in_bytes  # equivalent to: sub esp, 0x8
  # Push the variables to the stack. Make sure to push them in the right order!
  # The syntax for the following function is:
  #
  # initial_state.stack_push(bitvector)
  #
  # This will push the bitvector on the stack, and decrement esp by the correct
  # amount. You will need to push multiple bitvectors on the stack.
  # (!)
  initial_state.stack_push(password0)  # :bitvector (claripy.BVS, claripy.BVV, claripy.BV)
  initial_state.stack_push(password1)
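The padding arithmetic can be sanity-checked with a toy model of a 32-bit stack that grows downward. This is a sketch only: the `ToyStack` class and the frame base are made up, and it models nothing beyond the esp bookkeeping described above.

```python
class ToyStack:
    """Minimal model of a 32-bit downward-growing stack (illustration only)."""

    def __init__(self, esp):
        self.esp = esp
        self.memory = {}

    def push(self, label):
        self.esp -= 4                  # a 32-bit push first decrements esp by 4
        self.memory[self.esp] = label  # then stores the value at the new esp


ebp = 0x7fff0000            # arbitrary made-up frame base
stack = ToyStack(esp=ebp)   # mirrors: ebp = esp
stack.esp -= 0x8            # padding, as in padding_length_in_bytes
stack.push('password0')
stack.push('password1')

# Offset of each value below ebp.
offsets = {label: ebp - address for address, label in stack.memory.items()}
print({label: hex(offset) for label, offset in offsets.items()})
```

The two offsets match the addresses handed to scanf in the disassembly (ebp - 0xc and ebp - 0x10), which is why password0 has to be pushed first.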
  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output
  # Determine the address of the global variable to which scanf writes the user
  # input. The function 'initial_state.memory.store(address, value)' will write
  # 'value' (a bitvector) to 'address' (a memory location, as an integer). The
  # 'address' parameter can also be a bitvector (and can be symbolic!).
  # (!)
  password0_address = 0x8134360
  initial_state.memory.store(password0_address, password0)
  initial_state.memory.store(password0_address + 0x8, password1)
  initial_state.memory.store(password0_address + 0x10, password2)
  initial_state.memory.store(password0_address + 0x18, password3)
  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output
  if simulation.found:
    solution_state = simulation.found[0]

    # Solve for the symbolic values. We are trying to solve for a string.
    # Therefore, we will use eval, with named parameter cast_to=bytes
    # which returns bytes that can be decoded to a string instead of an integer.
    # (!)
    # cast_to=bytes returns the value as a bytes object; .decode() then turns
    # it into a str.
    solution0 = solution_state.solver.eval(password0, cast_to=bytes).decode()
    solution1 = solution_state.solver.eval(password1, cast_to=bytes).decode()
    solution2 = solution_state.solver.eval(password2, cast_to=bytes).decode()
    solution3 = solution_state.solver.eval(password3, cast_to=bytes).decode()
    solution = solution0 + solution1 + solution2 + solution3

    print(solution)
  else:
    raise Exception('Could not find the solution')
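What `cast_to=bytes` buys you can be seen with plain integers: the solver's answer is a number, and the password is that number's bytes read from the most significant byte downward. A sketch with a hypothetical `int_to_text` helper and a made-up value:

```python
def int_to_text(value, size_in_bytes):
    """Interpret an integer as big-endian bytes and decode them as ASCII."""
    return value.to_bytes(size_in_bytes, 'big').decode('ascii')


# Made-up 64-bit value standing in for what eval() returns without cast_to.
as_integer = 0x4142434445464748
print(int_to_text(as_integer, 8))
```

Decoding only succeeds when every byte is valid text, so a real script may want to handle decoding errors if the solver picks non-printable bytes.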
  # Instead of telling the binary to write to the address of the memory
  # allocated with malloc, we can simply fake an address to any unused block of
  # memory and overwrite the pointer to the data. This will point the pointer
  # with the address of pointer_to_malloc_memory_address0 to fake_heap_address.
  # Be aware, there is more than one pointer! Analyze the binary to determine
  # global location of each pointer.
  # Note: by default, Angr stores integers in memory with big-endianness. To
  # specify to use the endianness of your architecture, use the parameter
  # endness=project.arch.memory_endness. On x86, this is little-endian.
  # (!)
  # In short: the program keeps the address returned by malloc in a variable
  # (global or local). We locate that variable and overwrite the value it
  # stores, replacing the malloc'd address with the address of unused memory.
  fake_heap_address0 = 0x0804C100  # unused memory in the .bss segment
  # Address of buffer0, the variable that holds the malloc'd address. We
  # overwrite its contents with the fake address.
  pointer_to_malloc_memory_address0 = 0x0BB9CE00
  initial_state.memory.store(pointer_to_malloc_memory_address0, fake_heap_address0, endness=project.arch.memory_endness)
  fake_heap_address1 = 0x804C108
  pointer_to_malloc_memory_address1 = 0x0BB9CE08
  initial_state.memory.store(pointer_to_malloc_memory_address1, fake_heap_address1, endness=project.arch.memory_endness)
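The endianness note is easy to see with the standard `struct` module. This sketch only shows the two byte orders for the fake heap address used above; it does not call Angr:

```python
import struct

fake_heap_address = 0x0804C100

big_endian = struct.pack('>I', fake_heap_address)     # most significant byte first
little_endian = struct.pack('<I', fake_heap_address)  # least significant byte first

print(big_endian.hex())
print(little_endian.hex())
```

An x86 program reading the pointer expects the little-endian layout, which is why the store passes `endness=project.arch.memory_endness`.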
  # Store our symbolic values at our fake_heap_address. Look at the binary to
  # determine the offsets from the fake_heap_address where scanf writes.
  # (!)
# This challenge could, in theory, be solved in multiple ways. However, for the
# sake of learning how to simulate an alternate filesystem, please solve this
# challenge according to the structure provided below. As a challenge, once you
# have an initial solution, try solving this in an alternate way.
#
# Problem description and general solution strategy:
# The binary loads the password from a file using the fread function. If the
# password is correct, it prints "Good Job." In order to keep consistency with
# the other challenges, the input from the console is written to a file in the
# ignore_me function. As the name suggests, ignore it, as it only exists to
# maintain consistency with other challenges.
# We want to:
# 1. Determine the file from which fread reads.
# 2. Use Angr to simulate a filesystem where that file is replaced with our own
#    simulated file.
# 3. Initialize the file with a symbolic value, which will be read with fread
#    and propagated through the program.
# 4. Solve for the symbolic input to determine the password.
  # Specify some information needed to construct a simulated file. For this
  # challenge, the filename is hardcoded, but in theory, it could be symbolic.
  # Note: to read from the file, the binary calls
  # 'fread(buffer, sizeof(char), 64, file)'.
  # (!)
  filename = "PNESFFEG.txt"  # :string
  symbolic_file_size_bytes = 64  # file size in bytes
  # Construct a bitvector for the password and then store it in the file's
  # backing memory. For example, imagine a simple file, 'hello.txt':
  #
  # Hello world, my name is John.
  # ^                       ^
  # ^ address 0             ^ address 24 (count the number of characters)
  # In order to represent this in memory, we would want to write the string to
  # the beginning of the file:
  #
  # hello_txt_contents = claripy.BVV('Hello world, my name is John.', 30*8)
  #
  # Perhaps, then, we would want to replace John with a
  # symbolic variable. We would call:
  #
  # name_bitvector = claripy.BVS('symbolic_name', 4*8)
  #
  # Then, after the program calls fopen('hello.txt', 'r') and then
  # fread(buffer, sizeof(char), 30, hello_txt_file), the buffer would contain
  # the string from the file, except four symbolic bytes where the name would be
  # stored.
  # (!)
  # The symbolic value is as large as the file.
  password = claripy.BVS('password', symbolic_file_size_bytes * 8)
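The offsets in the hello.txt example can be checked with ordinary bytes. In this sketch the hypothetical `with_placeholder` helper marks the four bytes that would become symbolic; it is an illustration of the layout, not of Angr's file model:

```python
contents = b'Hello world, my name is John.'
name_offset = contents.index(b'John')  # position where the name starts


def with_placeholder(data, offset, length, filler=b'?'):
    """Replace `length` bytes at `offset` with a visible filler byte."""
    return data[:offset] + filler * length + data[offset + length:]


print(name_offset)
print(with_placeholder(contents, name_offset, 4))
```

Everything outside the placeholder stays concrete, which is the same split between fixed and symbolic bytes the comment describes.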
  # Construct the symbolic file. The file_options parameter specifies the Linux
  # file permissions (read, read/write, execute etc.); it is not used in the
  # call below. The content parameter specifies from where the stream of data
  # should be supplied. If content is an instance of SimSymbolicMemory, the
  # stream will contain the contents (including any symbolic contents) of the
  # memory, beginning from address zero.
  # In this script, content is set to our BVS instance that holds the symbolic
  # data.
  # (!)
  # Note: the size argument is an addition to the scaffold; remove it if it
  # causes errors.
  password_file = angr.storage.SimFile(filename, content=password, size=symbolic_file_size_bytes)
  # Add the symbolic file we created to the symbolic filesystem.
  initial_state.fs.insert(filename, password_file)
  simulation = project.factory.simgr(initial_state)

  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output
# The binary asks for a 16 character password to which it applies a complex
# function and then compares with a reference string with the function
# check_equals_[reference string]. (Decompile the binary and take a look at it!)
# The source code for this function is provided here. However, the reference
# string in your version will be different than AABBCCDDEEFFGGHH (in this
# build, the function is check_equals_HXUITWOAPNESFFEG):
#
# #define REFERENCE_PASSWORD "AABBCCDDEEFFGGHH"
# int check_equals_AABBCCDDEEFFGGHH(char* to_check, size_t length) {
#   uint32_t num_correct = 0;
#   for (int i=0; i<length; ++i) {
#     if (to_check[i] == REFERENCE_PASSWORD[i]) {
#       num_correct += 1;
#     }
#   }
#   return num_correct == length;
# }
#
# ...
#
# char* input = user_input();
# char* encrypted_input = complex_function(input);
# if (check_equals_AABBCCDDEEFFGGHH(encrypted_input, 16)) {
#   puts("Good Job.");
# } else {
#   puts("Try again.");
# }
#
# The function checks if *to_check == "AABBCCDDEEFFGGHH". Verify this yourself.
# While you, as a human, can easily determine that this function is equivalent
# to simply comparing the strings, the computer cannot. Instead the computer
# would need to branch every time the if statement in the loop was called (16
# times), resulting in 2^16 = 65,536 branches, which will take too long to
# evaluate for our needs.
# Why so many branches? As the Angr tutorial slides explain, each time a state
# reaches an if, it forks into two states. Stacked ifs therefore cause path
# explosion, so the main goal of this level is to reduce the branching to a
# manageable amount.
#
# We do not know how the complex_function works, but we want to find an input
# that, when modified by complex_function, will produce the string:
# AABBCCDDEEFFGGHH.
#
# In this puzzle, your goal will be to stop the program before this function is
# called and manually constrain the to_check variable to be equal to the
# password you identify by decompiling the binary. Since you, as a human, know
# that if the strings are equal, the program will print "Good Job.", you can
# be assured that if the program can solve for an input that makes them equal,
# the input will be the correct password.

import angr
import claripy
import sys
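The 2^16 figure can be reproduced directly. The sketch below ports the C loop from the comment above to Python and, with a hypothetical `count_paths` helper, enumerates every combination of per-character branch outcomes; it counts paths by brute force and does not model Angr itself:

```python
from itertools import product


def check_equals(to_check, reference):
    """Python port of the check_equals_ loop shown above."""
    num_correct = 0
    for i in range(len(reference)):
        if to_check[i] == reference[i]:  # one two-way branch per character
            num_correct += 1
    return num_correct == len(reference)


def count_paths(length):
    """Count distinct sequences of taken / not-taken outcomes over the loop."""
    return sum(1 for _ in product((True, False), repeat=length))


print(count_paths(16))
print(check_equals('AABBCCDDEEFFGGHH', 'AABBCCDDEEFFGGHH'))
```

Only one of those outcome sequences (all sixteen comparisons true) leads to success, which is why constraining the buffer directly is so much cheaper than exploring the loop.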
  # Angr will not be able to reach the point at which the binary prints out
  # 'Good Job.'. We cannot use that as the target anymore.
  # (!)
  # Begin adding constraints.
  if simulation.found:
    solution_state = simulation.found[0]
    # Recall that we need to constrain the to_check parameter (see top) of the
    # check_equals_ function. Determine the address that is being passed as the
    # parameter and load it into a bitvector so that we can constrain it.
    # (!)
    # The parameter to constrain.
    constrained_parameter_address = 0x0804C040  # address of the parameter
    constrained_parameter_size_bytes = 16  # size of the parameter in bytes
    # Load it into a bitvector. The bitvector holds the data at that address,
    # which already contains the symbols we injected.
    constrained_parameter_bitvector = solution_state.memory.load(
      constrained_parameter_address,
      constrained_parameter_size_bytes
    )
    # We want to constrain the system to find an input that will make
    # constrained_parameter_bitvector equal the desired value.
    # (!)
    constrained_parameter_desired_value = "HXUITWOAPNESFFEG".encode()  # :string (encoded)
    # Specify a claripy expression (using Pythonic syntax) that tests whether
    # constrained_parameter_bitvector == constrained_parameter_desired_value.
    # Add the constraint to the state to let z3 attempt to find an input that
    # will make this expression true.
    solution_state.add_constraints(constrained_parameter_bitvector == constrained_parameter_desired_value)
    # Solve for the password, the symbolic input that makes
    # constrained_parameter_bitvector equal the desired value.
    # (!)
    solution = solution_state.solver.eval(password, cast_to=bytes)

    print(solution)
  else:
    raise Exception('Could not find the solution')
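The strategy of fixing the output and solving for the input can be mimicked on a toy scale. In the sketch below, `toy_transform` is a made-up per-character scramble, NOT the binary's complex_function, and the brute-force search in the hypothetical `solve_for_input` stands in for the constraint solver:

```python
import string


def toy_transform(char, index):
    """Hypothetical scramble over A-Z; not the binary's complex_function."""
    return chr((ord(char) - ord('A') + 3 * index) % 26 + ord('A'))


def solve_for_input(target):
    """For each position, find a letter whose transformed value hits the target."""
    solution = []
    for index, wanted in enumerate(target):
        for candidate in string.ascii_uppercase:
            if toy_transform(candidate, index) == wanted:
                solution.append(candidate)
                break
    return ''.join(solution)


target = 'HXUITWOAPNESFFEG'  # reference string used in this write-up
password = solve_for_input(target)
print(password)
```

A real solver reasons about the constraints symbolically instead of enumerating candidates, but the contract is the same: the answer is any input whose transformed value equals the reference string.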
# This level performs the following computations:
#
# 1. Get 16 bytes of user input and encrypt it.
# 2. Save the result of check_equals_AABBCCDDEEFFGGHH (or similar; here it is
#    check_equals_HXUITWOAPNESFFEG): 1 if equal, 0 otherwise.
# 3. Get another 16 bytes from the user and encrypt it.
# 4. Check that it's equal to a predefined password (this time using strcmp).
#
# The ONLY part of this program that we have to worry about is #2. We will be
# replacing the call to check_equals_ with our own version, using a hook, since
# check_equals_ will run too slowly otherwise (it contains too many branches).
  # Since Angr can handle the initial call to scanf, we can start from the
  # beginning.
  initial_state = project.factory.entry_state(
    add_options = {
      angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
      angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS
    }
  )
  # Hook the address of where check_equals_ is called.
  # (!)
  check_equals_called_address = 0x080493CE  # address of the call to check_equals_HXUITWOAPNESFFEG()
  # The length parameter of project.hook specifies how many bytes the execution
  # engine should skip after completing the hook. This will allow hooks to
  # replace certain instructions (or groups of instructions). Determine the
  # instructions involved in calling check_equals_, and then determine how many
  # bytes are used to represent them in memory. This will be the skip length.
  # (!)
  # The call instruction takes 5 bytes; the add that cleans up the arguments
  # afterwards takes 3 bytes.
  instruction_to_skip_length = 5 + 3
  @project.hook(check_equals_called_address, length=instruction_to_skip_length)
  def skip_check_equals_(state):
    # Determine the address where user input is stored. It is passed as a
    # parameter to the check_equals_ function. Then, load the string. Reminder:
    # int check_equals_(char* to_check, int length) { ...
    user_input_buffer_address = 0x0804C044  # :integer, probably hexadecimal
    user_input_buffer_length = 0x10
    # Reminder: state.memory.load will read the stored value at the address
    # user_input_buffer_address of byte length user_input_buffer_length.
    # It will return a bitvector holding the value. This value can either be
    # symbolic or concrete, depending on what was stored there in the program.
    # Here the buffer holds symbolic data: we never created or injected a
    # symbol ourselves, but because execution began from the default entry
    # state, Angr filled the memory that scanf wrote to with symbols.
    user_input_string = state.memory.load(
      user_input_buffer_address,
      user_input_buffer_length
    )

    # Determine the string this function is checking the user input against.
    # It's encoded in the name of this function; decompile the program to find
    # it. A bytes literal is used so that it can be compared with a bitvector.
    check_against_string = b"HXUITWOAPNESFFEG"  # :string
    # gcc uses eax to store the return value, if it is an integer. We need to
    # set eax to 1 if check_against_string == user_input_string and 0 otherwise.
    # However, since we are describing an equation to be used by z3 (not to be
    # evaluated immediately), we cannot use Python if else syntax. Instead, we
    # have to use claripy's built in function that deals with if statements.
    # claripy.If(expression, ret_if_true, ret_if_false) will output an
    # expression that evaluates to ret_if_true if expression is true and
    # ret_if_false otherwise.
    # Think of it like the Python "value0 if expression else value1".
    state.regs.eax = claripy.If(
      user_input_string == check_against_string,
      claripy.BVV(1, 32),
      claripy.BVV(0, 32)
    )
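The skip length used for the hook above is simply the sum of the encoded sizes of the instructions being replaced. The sketch below illustrates that arithmetic with placeholder encodings: the operand bytes are hypothetical (zeros and an assumed `0x10` cleanup), and only the instruction sizes (a 5-byte `call rel32` and a 3-byte `add esp, imm8`) are taken from the text.

```python
# Hypothetical encodings for illustration only: the operand bytes differ in a
# real binary, but the sizes of these two instruction forms do not.
CALL_REL32 = bytes.fromhex("e8 00 00 00 00")  # call <rel32>: opcode E8 + 4-byte offset
ADD_ESP_IMM8 = bytes.fromhex("83 c4 10")      # add esp, 0x10: caller removes the arguments


def skip_length(*instructions: bytes) -> int:
    """Return the total number of bytes the hook has to step over."""
    return sum(len(encoding) for encoding in instructions)


instruction_to_skip_length = skip_length(CALL_REL32, ADD_ESP_IMM8)
print(instruction_to_skip_length)  # 8
```

If the disassembly shows a different cleanup sequence after the call, the same sum over those instructions gives the value to pass as `length`.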
simulation = project.factory.simgr(initial_state)
  def is_successful(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Good Job".encode() in stdout_output

  def should_abort(state):
    stdout_output = state.posix.dumps(sys.stdout.fileno())
    return "Try again".encode() in stdout_output

  # Explore until a state prints "Good Job", pruning states that print "Try again".
  simulation.explore(find=is_successful, avoid=should_abort)

  if simulation.found:
    solution_state = simulation.found[0]

    # Since we are allowing Angr to handle the input, retrieve it by printing
    # the contents of stdin. Use one of the early levels as a reference.
    solution = solution_state.posix.dumps(sys.stdin.fileno())
    print(solution)
  else:
    raise Exception('Could not find the solution')
# This challenge is similar to the previous one. It operates under the same
# premise that you will have to replace the check_equals_ function. In this
# case, however, check_equals_ is called so many times that it wouldn't make
# sense to hook where each one was called. Instead, use a SimProcedure to write
# your own check_equals_ implementation and then hook the check_equals_ symbol
# to replace all calls to check_equals_ with a call to your SimProcedure.
#
# You may be thinking:
#   Why can't I just use hooks? The function is called many times, but if I hook
#   the address of the function itself (rather than the addresses where it is
#   called), I can replace its behavior everywhere. Furthermore, I can get the
#   parameters by reading them off the stack (with memory.load(regs.esp + xx)),
#   and return a value by simply setting eax! Since I know the length of the
#   function in bytes, I can return from the hook just before the 'ret'
#   instruction is called, which will allow the program to jump back to where it
#   was before it called my hook.
# If you thought that, then congratulations! You have just invented the idea of
# SimProcedures! Instead of doing all of that by hand, you can let the already-
# implemented SimProcedures do the boring work for you so that you can focus on
# writing a replacement function in a Pythonic way.
# As a bonus, SimProcedures allow you to specify custom calling conventions, but
# unfortunately it is not covered in this CTF.
  # Define a class that inherits angr.SimProcedure in order to take advantage
  # of Angr's SimProcedures.
  class ReplacementCheckEquals(angr.SimProcedure):
    # A SimProcedure replaces a function in the binary with a simulated one
    # written in Python. Other than it being written in Python, the function
    # acts largely the same as any function written in C. Any parameter after
    # 'self' will be treated as a parameter to the function you are replacing.
    # The parameters will be bitvectors. Additionally, the Python can return in
    # the usual Pythonic way. Angr will treat this in the same way it would
    # treat a native function in the binary returning. An example:
    #
    # int add_if_positive(int a, int b) {
    #   if (a >= 0 && b >= 0) return a + b;
    #   else return 0;
    # }
    #
    # could be simulated with...
    #
    # class ReplacementAddIfPositive(angr.SimProcedure):
    #   def run(self, a, b):
    #     if a >= 0 and b >= 0:
    #       return a + b
    #     else:
    #       return 0
    #
    # Finish the parameters to the check_equals_ function. Reminder:
    # int check_equals_AABBCCDDEEFFGGHH(char* to_check, int length) { ...
    # (!)
    def run(self, to_check, length):
      # We can almost copy and paste the solution from the previous challenge.
      # Hint: Don't look up the address! It's passed as a parameter.
      # (!)
      user_input_buffer_address = to_check
      user_input_buffer_length = length

      # Note the use of self.state to find the state of the system in a
      # SimProcedure.
      user_input_string = self.state.memory.load(
        user_input_buffer_address,
        user_input_buffer_length
      )

      # A bytes literal is used so that it can be compared with a bitvector.
      check_against_string = b"HXUITWOAPNESFFEG"

      # Finally, instead of setting eax, we can use a Pythonic return statement
      # to return the output of this function.
      # Hint: Look at the previous solution.
      return claripy.If(
        user_input_string == check_against_string,
        claripy.BVV(1, 32),
        claripy.BVV(0, 32)
      )
  # Hook the check_equals symbol. Angr automatically looks up the address
  # associated with the symbol. Alternatively, you can use 'hook' instead
  # of 'hook_symbol' and specify the address of the function. To find the
  # correct symbol, disassemble the binary.
  # (!)
  check_equals_symbol = "check_equals_HXUITWOAPNESFFEG"  # :string
  project.hook_symbol(check_equals_symbol, ReplacementCheckEquals())
# This time, the solution involves simply replacing scanf with our own version,
# since Angr does not support requesting multiple parameters with scanf.
import angr
import claripy
import sys
  # This reuses the SimProcedure hook from the previous level, but the function
  # being replaced is scanf.
  class ReplacementScanf(angr.SimProcedure):
    # Finish the parameters to the scanf function. Hint: 'scanf("%u %u", ...)'.
    # (!)
    def run(self, format_string, scanf0_address, scanf1_address):
      # Hint: scanf0_address is passed as a parameter, isn't it?
      scanf_data_len = 4 * 8  # an unsigned int is 4 bytes, i.e. 32 bits
      scanf0 = claripy.BVS('scanf0', scanf_data_len)
      scanf1 = claripy.BVS('scanf1', scanf_data_len)

      # The scanf function writes user input to the buffers to which the
      # parameters point. Here that means storing our symbolic bitvectors at
      # the addresses received as parameters.
      self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
      self.state.memory.store(scanf1_address, scanf1, endness=project.arch.memory_endness)

      # Now, we want to 'set aside' references to our symbolic values in the
      # globals plugin included by default with a state. You will need to
      # store multiple bitvectors. You can either use a list, tuple, or multiple
      # keys to reference the different bitvectors.
      # (!)
      self.state.globals['solution0'] = scanf0
      self.state.globals['solution1'] = scanf1
# When you construct a simulation manager, you will want to enable Veritesting:
# project.factory.simgr(initial_state, veritesting=True)
# Hint: use one of the first few levels' solutions as a reference.
# This level checks the encrypted string with an if inside a loop, which makes
# the number of paths grow exponentially.
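To see why the note above talks about exponential growth, the following sketch counts the execution paths produced by a loop of independent two-way checks. It is a toy model, not Angr itself: it assumes each compared character forks the state into exactly one "match" and one "mismatch" path, and `count_paths` is a name made up for this illustration.

```python
from itertools import product


def count_paths(num_checks: int) -> int:
    """Enumerate every match/mismatch outcome of num_checks independent branches."""
    return sum(1 for _ in product((True, False), repeat=num_checks))


for n in (4, 8, 16):
    # Each additional check doubles the number of paths to explore.
    print(n, count_paths(n))
```

Under that assumption a 16-character comparison already yields 65536 paths, which is the blow-up that Veritesting is meant to reduce by merging states.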
  if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()))
  else:
    raise Exception('Could not find the solution')
if __name__ == '__main__': main(sys.argv)
13_angr_static_binary
Compile and run

lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/13_angr_static_binary$ python3 generate.py 1234 13_angr_static_binary
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/13_angr_static_binary$ ./13_angr_static_binary
Enter the password: aaaaaaaaaaaaaaaa
Try again.
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/13_angr_static_binary$
# This challenge is the exact same as the first challenge, except that it was
# compiled as a static binary. Normally, Angr automatically replaces standard
# library functions with SimProcedures that work much more quickly.
#
# To solve the challenge, manually hook any standard library c functions that
# are used. Then, ensure that you begin the execution at the beginning of the
# main function. Do not use entry_state.
#
# Here are a few SimProcedures Angr has already written for you. They implement
# standard library functions. You will not need all of them:
# angr.SIM_PROCEDURES['libc']['malloc']
# angr.SIM_PROCEDURES['libc']['fopen']
# angr.SIM_PROCEDURES['libc']['fclose']
# angr.SIM_PROCEDURES['libc']['fwrite']
# angr.SIM_PROCEDURES['libc']['getchar']
# angr.SIM_PROCEDURES['libc']['strncmp']
# angr.SIM_PROCEDURES['libc']['strcmp']
# angr.SIM_PROCEDURES['libc']['scanf']
# angr.SIM_PROCEDURES['libc']['printf']
# angr.SIM_PROCEDURES['libc']['puts']
# angr.SIM_PROCEDURES['libc']['exit']
#
# As a reminder, you can hook functions with something similar to:
# project.hook(malloc_address, angr.SIM_PROCEDURES['libc']['malloc']())
#
# There are many more, see:
# https://github.com/angr/angr/tree/master/angr/procedures/libc
#
# Additionally, note that, when the binary is executed, the main function is not
# the first piece of code called. In the _start function, __libc_start_main is
# called to start your program. The initialization that occurs in this function
# can take a long time with Angr, so you should replace it with a SimProcedure.
# angr.SIM_PROCEDURES['glibc']['__libc_start_main']
# Note 'glibc' instead of 'libc'.
# ...
  if simulation.found:
    solution_state = simulation.found[0]
    print(solution_state.posix.dumps(sys.stdin.fileno()).decode())
  else:
    raise Exception('Could not find the solution')
# The shared library has the function validate, which takes a string and returns
# either true (1) or false (0). The binary calls this function. If it returns
# true, the program prints "Good Job." otherwise, it prints "Try again."
#
# Note: When you run this script, make sure you run it on
# lib14_angr_shared_library.so, not the executable. This level is intended to
# teach how to analyse binary formats that are not typical executables.
import angr
import claripy
import sys

def main(argv):
  path_to_binary = argv[1]
  # The shared library is compiled with position-independent code. You will need
  # to specify the base address. All addresses in the shared library will be
  # base + offset, where offset is their address in the file.
  # (!)
  base = 0x400000
  project = angr.Project(path_to_binary, load_options={
    'main_opts' : {
      'base_addr' : base
    }
  })
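The base + offset rule described above is plain addition, but it is easy to forget when copying offsets out of a disassembler. A minimal sketch, using the base and the two file offsets that appear in this write-up; the helper name `rebase` is made up for the example:

```python
BASE = 0x400000  # the base_addr handed to angr.Project in this write-up


def rebase(offset: int, base: int = BASE) -> int:
    """Translate a file offset inside the shared library to its loaded address."""
    return base + offset


validate_function_address = rebase(0x129C)  # offset of validate()
success_address = rebase(0x134C)            # offset used as the explore target
print(hex(validate_function_address), hex(success_address))
```

Any address handed to Angr for this project (the call_state target, the find address) has to go through the same translation.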
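  # Initialize any symbolic values here; you will need at least one to pass to
  # the validate function.
  # (!)
  # Assumption: 0x3000000 is an arbitrary address not otherwise used by the
  # library; any free, writable location works as the password buffer.
  buffer_pointer = claripy.BVV(0x3000000, 32)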
  # Begin the state at the beginning of the validate function, as if it was
  # called by the program. Determine the parameters needed to call validate and
  # replace 'parameters...' with bitvectors holding the values you wish to pass.
  # Recall that 'claripy.BVV(value, size_in_bits)' constructs a bitvector
  # initialized to a single value.
  # Remember to add the base value you specified at the beginning to the
  # function address!
  # Hint: int validate(char* buffer, int length) { ...
  # Another hint: the password is 8 bytes long.
  # (!)
  validate_function_address = base + 0x0000129C
  initial_state = project.factory.call_state(validate_function_address, buffer_pointer, 8)  # length: eight characters
  # You will need to add code to inject a symbolic value into the program. Also,
  # at the end of the function, constrain eax to equal true (value of 1) just
  # before the function returns. There are multiple ways to do this:
  # 1. Use a hook.
  # 2. Search for the address just before the function returns and then
  #    constrain eax (this may require putting code elsewhere)
  # (!)
  # Create the symbolic password.
  password_len_bits = 8 * 8
  password = claripy.BVS("password", password_len_bits)
  # Store it in the buffer that validate() will read.
  initial_state.memory.store(buffer_pointer, password)
simulation = project.factory.simgr(initial_state)
  success_address = base + 0x0000134C
  simulation.explore(find=success_address)

  if simulation.found:
    solution_state = simulation.found[0]

    # Determine where the program places the return value, and constrain it so
    # that it is true. Then, solve for the solution and print it.
    # (!)
    solution_state.add_constraints(solution_state.regs.eax != 0)
    solution = solution_state.solver.eval(password, cast_to=bytes)
    print(solution)
  else:
    raise Exception('Could not find the solution')
# This binary takes both an integer and a string as a parameter. A certain
# integer input causes the program to reach a buffer overflow with which we can
# read a string from an arbitrary memory location. Our goal is to use Angr to
# search the program for this buffer overflow and then automatically generate
# an exploit to read the string "Good Job."
#
# What is the point of reading the string "Good Job."?
# This CTF attempts to replicate a simplified version of a possible vulnerability
# where a user can exploit the program to print a secret, such as a password or
# a private key. In order to keep consistency with the other challenges and to
# simplify the challenge, the goal of this program will be to print "Good Job."
# instead.
#
# The general strategy for crafting this script will be to:
# 1) Search for calls of the 'puts' function, which will eventually be exploited
#    to print out "Good Job."
# 2) Determine if the first parameter of 'puts', a pointer to the string to be
#    printed, can be controlled by the user to be set to the location of the
#    "Good Job." string.
# 3) Solve for the input that prints "Good Job."
#
# Note: The script is structured to implement step #2 before #1.
# Some of the source code for this challenge:
#
# #include <stdio.h>
# #include <stdlib.h>
# #include <string.h>
# #include <stdint.h>
#
# // This will all be in .rodata
# char msg[] = "${ description }$";
# char* try_again = "Try again.";
# char* good_job = "Good Job.";
# uint32_t key;
#
# void print_msg() {
#   printf("%s", msg);
# }
#
# uint32_t complex_function(uint32_t input) {
#   ...
# }
#
# struct overflow_me {
#   char buffer[16];  // overflowing this buffer overwrites to_print below
#   char* to_print;   // if it ends up holding the address of "Good Job.", that string is printed
# };
#
# int main(int argc, char* argv[]) {
#   struct overflow_me locals;
#   locals.to_print = try_again;  // initialized to the address of "Try again."
#
#   print_msg();
#
#   printf("Enter the password: ");
#   // Note the %20s: buffer holds 16 bytes, so the remaining 4 bytes land in
#   // to_print, which is where the address of "Good Job." goes.
#   scanf("%u %20s", &key, locals.buffer);
#
#   key = complex_function(key);
#
#   switch (key) {
#     case ?:
#       puts(try_again);
#       break;
#
#     ...
#
#     case ?:
#       // Our goal is to trick this call to puts to print the "secret
#       // password" (which happens, in our case, to be the string
#       // "Good Job.")
#       puts(locals.to_print);
#       break;
#
#     ...
#   }
#
#   return 0;
# }
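The layout of `struct overflow_me` in the listing above determines what the string half of the input has to look like: 16 bytes fill `buffer`, and the next 4 bytes overwrite `to_print`. The sketch below builds such a string by hand. It assumes a 32-bit little-endian target and uses the "Good Job." address quoted later in this write-up (it will differ for other builds); `build_string_input` is a name invented for the example, and the integer key still has to be found separately, which is what Angr does.

```python
import struct

BUFFER_SIZE = 16               # sizeof(locals.buffer) in the listing
GOOD_JOB_ADDRESS = 0x4858554B  # address from this write-up; build specific


def build_string_input(target_address: int, filler: bytes = b"A") -> bytes:
    """Fill the buffer, then append the pointer that replaces to_print."""
    padding = filler * BUFFER_SIZE
    pointer = struct.pack("<I", target_address)  # little-endian 32-bit pointer
    return padding + pointer


payload = build_string_input(GOOD_JOB_ADDRESS)
print(len(payload), payload)
```

The result is exactly 20 bytes long, which matches the `%20s` limit in the scanf call.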
  # You can either use a blank state or an entry state; just make sure to start
  # at the beginning of the program.
  # (!)
  initial_state = project.factory.entry_state(
    add_options = {angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                   angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )
      # The bitvector.chop(bits=n) function splits the bitvector into a Python
      # list containing the bitvector in segments of n bits each. In this case,
      # we are splitting them into segments of 8 bits (one byte.)
      for char in scanf1.chop(bits=8):
        # Ensure that each character in the string is printable. An interesting
        # experiment, once you have a working solution, would be to run the code
        # without constraining the characters to the printable range of ASCII.
        # Even though the solution will technically work without this, it's more
        # difficult to enter in a solution that contains character you can't
        # copy, paste, or type into your terminal or the web form that checks
        # your solution.
        # (!)
        self.state.add_constraints(char >= 0x21, char <= 0x7E)

      # Warning: Endianness only applies to integers. If you store a string in
      # memory and treat it as a little-endian integer, it will be backwards.
      # Note: key is a global variable, so the symbol could also be injected
      # directly at its address:
      # scanf0_address = 0x48587030
      self.state.memory.store(scanf0_address, scanf0, endness=project.arch.memory_endness)
      self.state.memory.store(scanf1_address, scanf1)
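The two ideas in the loop above, cutting a bitvector into bytes and keeping each byte in the printable range, can be tried on ordinary integers. This is a pure-Python stand-in rather than claripy code: `chop` here mimics what `bitvector.chop(bits=8)` is understood to return (segments ordered from the most significant end), and `is_printable` mirrors the `0x21`..`0x7E` constraint.

```python
from typing import List


def chop(value: int, total_bits: int, bits: int = 8) -> List[int]:
    """Split value into total_bits // bits segments, most significant first."""
    mask = (1 << bits) - 1
    count = total_bits // bits
    return [(value >> (bits * (count - 1 - i))) & mask for i in range(count)]


def is_printable(byte: int) -> bool:
    """Same range as the constraint in the text: '!' (0x21) through '~' (0x7E)."""
    return 0x21 <= byte <= 0x7E


word = int.from_bytes(b"KUXH", "big")  # four string bytes read as one 32-bit value
segments = chop(word, 32)
print([chr(s) for s in segments])
print(all(is_printable(s) for s in segments))
```

Note that the string bytes come back in their original order, which is the point of the endianness warning: the string is stored without a little-endian byte swap.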
  # We will call this whenever puts is called. The goal of this function is to
  # determine if the pointer passed to puts is controllable by the user, such
  # that we can rewrite it to point to the string "Good Job."
  def check_puts(state):
    # Recall that puts takes one parameter, a pointer to the string it will
    # print. If we load that pointer from memory, we can analyse it to determine
    # if it can be controlled by the user input in order to point it to the
    # location of the "Good Job." string.
    #
    # Treat the implementation of this function as if puts was just called.
    # The stack, registers, memory, etc should be set up as if the x86 call
    # instruction was just invoked (but, of course, the function hasn't copied
    # the buffers yet.)
    # The stack will look as follows:
    # ...
    # esp + 7 -> /----------------\
    # esp + 6 -> |      puts      |
    # esp + 5 -> |    parameter   |  (the parameter is pushed first)
    # esp + 4 -> \----------------/
    # esp + 3 -> /----------------\
    # esp + 2 -> |     return     |
    # esp + 1 -> |     address    |  (pushed by the call instruction)
    # esp     -> \----------------/
    #
    # Hint: Look at level 08, 09, or 10 to review how to load a value from a
    # memory address. Remember to use the correct endianness in the future when
    # loading integers; it has been included for you here.
    # (!)
    puts_parameter = state.memory.load(state.regs.esp + 4, 4, endness=project.arch.memory_endness)
    # The following function takes a bitvector as a parameter and checks if it
    # can take on more than one value. While this does not necessarily tell us we
    # have found an exploitable state, it is a strong indication that the
    # bitvector we checked may be controllable by the user.
    # Use it to determine if the pointer passed to puts is symbolic.
    # (!)
    if state.solver.symbolic(puts_parameter):
      # Determine the location of the "Good Job." string. We want to print it
      # out, and we will do so by attempting to constrain the puts parameter to
      # equal it. Hint: use 'objdump -s <binary>' to look for the string's
      # address in .rodata.
      # (!)
      good_job_string_address = 0x4858554B  # :integer, probably hexadecimal

      # Create an expression that will test if puts_parameter equals
      # good_job_string_address. If we add this as a constraint to our solver,
      # it will try and find an input to make this expression true. Take a look
      # at level 08 to remind yourself of the syntax of this.
      # (!)
      is_vulnerable_expression = puts_parameter == good_job_string_address  # :boolean bitvector expression
      # Have Angr evaluate the state to determine if all the constraints can
      # be met, including the one we specified above. If it can be satisfied,
      # we have found our exploit!
      #
      # When doing this, however, we do not want to edit our state in case we
      # have not yet found what we are looking for. To test if our expression
      # is satisfiable without editing the original, we need to clone the state.
      copied_state = state.copy()

      # We can now play around with the copied state without changing the
      # original. We need to add our vulnerable expression as a constraint to
      # test it. Look at level 08 and compare this call to how it is called there.
      copied_state.add_constraints(is_vulnerable_expression)

      # Finally, we test if we can satisfy the constraints of the state.
      if copied_state.satisfiable():
        # Before we return, let's add the constraint to the solver for real,
        # instead of just querying whether the constraint _could_ be added.
        state.add_constraints(is_vulnerable_expression)
        return True
      else:
        return False
    else:  # not state.solver.symbolic(???)
      return False
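The stack diagram in `check_puts` says that, on entry to `puts`, the return address sits at `esp` and the first argument at `esp + 4`, both stored little-endian. The sketch below reproduces that read on a toy stack held in a `bytes` object. The return address is a made-up value, and `read_u32_le` is a helper written for this example; in the real script the equivalent read is done by `state.memory.load` with the endness argument.

```python
import struct


def read_u32_le(memory: bytes, address: int) -> int:
    """Read a 32-bit little-endian integer from a flat memory image."""
    return struct.unpack_from("<I", memory, address)[0]


# Toy stack right after the call instruction: esp points at the return
# address, and the argument pushed before the call follows it.
esp = 0
return_address = 0x080492F0  # hypothetical value, for illustration only
puts_argument = 0x4858554B   # address of "Good Job." used in this write-up
stack = struct.pack("<II", return_address, puts_argument)

print(hex(read_u32_le(stack, esp)))      # the return address
print(hex(read_u32_le(stack, esp + 4)))  # the pointer puts would print
```

Reading the same four bytes as big-endian would give a different number, which is why the endness is passed explicitly when loading integers.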
simulation = project.factory.simgr(initial_state)
  # In order to determine if we have found a vulnerable call to 'puts', we need
  # to run the function check_puts (defined above) whenever we reach a 'puts'
  # call. To do this, we will look for the place where the instruction pointer,
  # state.addr, is equal to the beginning of the puts function.
  def is_successful(state):
    # We are looking for puts. Check that the address is at the (very) beginning
    # of the puts function. Warning: while, in theory, you could look for
    # any address in puts, if you execute any instruction that adjusts the stack
    # pointer, the stack diagram above will be incorrect. Therefore, it is
    # recommended that you check for the very beginning of puts.
    # (!)
    puts_address = 0x08049090
    if state.addr == puts_address:
      # Return True if we determine this call to puts is exploitable.
      return check_puts(state)
    else:
      # We have not yet found a call to puts; we should continue!
      return False
simulation.explore(find=is_successful)
if simulation.found: solution_state = simulation.found[0]
# Essentially, the program does the following:
#
# scanf("%d %20s", &key, user_input);
# ...
#   // if certain unknown conditions are true...
#   strncpy(random_buffer, user_input);
#                          ^
# ...                      |
# if (strncmp(secure_buffer, reference_string)) {
#   // The secure_buffer does not equal the reference string.
#   puts("Try again.");
# } else {
#   // The two are equal.
#   puts("Good Job.");
# }
#
# If this program has no bugs in it, it would _always_ print "Try again." since
# user_input copies into random_buffer, not secure_buffer.
#
# The question is: can we find a buffer overflow that will allow us to overwrite
# the random_buffer pointer to point to secure_buffer? (Spoiler: we can, but we
# will need to use Angr.)
#
# We want to identify a place in the binary, when strncpy is called, when we can:
# 1) Control the source contents (not the source pointer!)
#    * This will allow us to write arbitrary data to the destination.
# 2) Control the destination pointer
#    * This will allow us to write to an arbitrary location.
# If we can meet both of those requirements, we can write arbitrary data to an
# arbitrary location. Finally, we need to constrain the source contents to be
# equal to the reference_string and the destination pointer to be equal to the
# secure_buffer.
  # You can either use a blank state or an entry state; just make sure to start
  # at the beginning of the program.
  initial_state = project.factory.entry_state(
    add_options = {angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
                   angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS}
  )
  # In this challenge, we want to check strncpy to determine if we can control
  # both the source and the destination. It is common that we will be able to
  # control at least one of the parameters, (such as when the program copies a
  # string that it received via stdin).
  def check_strncpy(state):
    # The stack will look as follows:
    # ...          ________________
    # esp + 15 -> /                \
    # esp + 14 -> |     param2     |
    # esp + 13 -> |      len       |
    # esp + 12 -> \________________/
    # esp + 11 -> /                \
    # esp + 10 -> |     param1     |
    # esp +  9 -> |      src       |
    # esp +  8 -> \________________/
    # esp +  7 -> /                \
    # esp +  6 -> |     param0     |
    # esp +  5 -> |      dest      |
    # esp +  4 -> \________________/
    # esp +  3 -> /                \
    # esp +  2 -> |     return     |
    # esp +  1 -> |     address    |
    # esp      -> \________________/
    # (!)
    # The size argument of memory.load is in bytes (4), not bits; it is easy to
    # get this wrong on a first attempt.
    strncpy_src = state.memory.load(state.regs.esp + 8, 4, endness = project.arch.memory_endness)
    strncpy_dest = state.memory.load(state.regs.esp + 4, 4, endness = project.arch.memory_endness)
    strncpy_len = state.memory.load(state.regs.esp + 12, 4, endness = project.arch.memory_endness)
    # We need to find out if src is symbolic, however, we care about the
    # contents, rather than the pointer itself. Therefore, we have to load the
    # contents of src to determine if they are symbolic.
    # Hint: How many bytes is strncpy copying?
    # (!)
    src_contents = state.memory.load(strncpy_src, strncpy_len)

    # Our goal is to determine if we can write arbitrary data to an arbitrary
    # location. This means determining if the source contents are symbolic
    # (arbitrary data) and the destination pointer is symbolic (arbitrary
    # destination).
    # (!)
    if state.solver.symbolic(strncpy_dest) and state.solver.symbolic(src_contents):
      # Use ltrace to determine the password. Decompile the binary to determine
      # the address of the buffer it checks the password against. Our goal is to
      # overwrite that buffer to store the password.
      # A bytes literal is used so that it can be compared with a bitvector.
      # (!)
      password_string = b'TWOAPNES'  # :string
      buffer_address = 0x4858553C  # :integer, probably in hexadecimal
      # Create an expression that tests whether the first n bytes of
      # src_contents equal the password, where n is the password length. Warning:
      # while typical Python slices (array[start:end]) will work with bitvectors,
      # they are indexed in an odd way. The ranges must start with a high value
      # and end with a low value. Additionally, the bits are indexed from right
      # to left. For example, let a bitvector, b, equal 'ABCDEFGH', (64 bits).
      # The following will read bit 0-7 (total of 1 byte) from the right-most
      # bit (the end of the string).
      # b[7:0] == 'H'
      # To access the beginning of the string, we need to access the last 16
      # bits, or bits 48-63:
      # b[63:48] == 'AB'
      # In this specific case, since we don't necessarily know the length of the
      # contents (unless you look at the binary), we can use the following:
      # b[-1:-16] == 'AB', since, in Python, -1 is the end of the list, and -16
      # is the 16th element from the end of the list. The actual numbers should
      # correspond with the length of password_string.
      # (!)
      does_src_hold_password = src_contents[-1:-64] == password_string

      # Create an expression to check if the dest parameter can be set to
      # buffer_address. If this is true, then we have found our exploit!
      # (!)
      does_dest_equal_buffer_address = strncpy_dest == buffer_address
      # In the previous challenge, we copied the state, added constraints to the
      # copied state, and then determined if the constraints of the new state
      # were satisfiable. Since that pattern is so common, Angr implemented a
      # parameter 'extra_constraints' for the satisfiable function that does the
      # exact same thing. Note that we can pass multiple expressions to
      # extra_constraints.
      if state.satisfiable(extra_constraints=(does_src_hold_password, does_dest_equal_buffer_address)):
        state.add_constraints(does_src_hold_password, does_dest_equal_buffer_address)
        return True
      else:
        return False
    else:  # not state.solver.symbolic(???)
      return False
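The bitvector slicing rules used for `does_src_hold_password` (high index first, bit 0 on the right, negative indices counted from the top) are easy to get backwards. The sketch below emulates them on a plain integer so the examples from the comment can be checked by hand. `bits` is a helper written for this illustration, not a claripy function, and the negative-index handling reflects how the comment describes `b[-1:-16]`.

```python
def bits(value: int, width: int, high: int, low: int) -> int:
    """Return bits high..low (inclusive) of a width-bit value; bit 0 is the rightmost."""
    if high < 0:
        high += width  # -1 refers to the most significant bit
    if low < 0:
        low += width
    return (value >> low) & ((1 << (high - low + 1)) - 1)


b = int.from_bytes(b"ABCDEFGH", "big")  # 64 bits, with 'A' in the top byte

print(chr(bits(b, 64, 7, 0)))                     # last character of the string
print(bits(b, 64, 63, 48).to_bytes(2, "big"))     # first two characters
print(bits(b, 64, -1, -16).to_bytes(2, "big"))    # same slice with negative indices
```

For the 8-character password in this level the slice has to cover 64 bits, which is where the `[-1:-64]` in the script comes from.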
17_angr_arbitrary_jump
编译并运行
1 2 3 4 5 6
lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/17_angr_arbitrary_jump$ python3 generate.py 1234 17_angr_arbitrary_jump lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/17_angr_arbitrary_jump$ ./17_angr_arbitrary_j ump Enter the password: aaaaa Try again. lamecrow@LAPTOP-PUE31HT9:/mnt/d/TRY/SYC_TODOLIST/angr_ctf/17_angr_arbitrary_jump$
Analysis
int __cdecl main(int argc, const char **argv, const char **envp)
{
  printf("Enter the password: ");
  read_input();
  puts("Try again.");
  return 0;
}
# An unconstrained state occurs when there are too many
# possible branches from a single instruction. This occurs, among other ways,
# when the instruction pointer (on x86, eip) is completely symbolic, meaning
# that user input can control the address of code the computer executes.
# For example, imagine the following pseudo assembly:
#
# mov user_input, eax
# jmp eax
#
# The value of what the user entered dictates the next instruction. This
# is an unconstrained state. It wouldn't usually make sense for the execution
# engine to continue. (Where should the program jump to if eax could be
# anything?) Normally, when Angr encounters an unconstrained state, it throws
# it out. In our case, we want to exploit the unconstrained state to jump to
# a location of our choosing. We will get to how to disable Angr's default
# behavior later.
#
# This challenge represents a classic stack-based buffer overflow attack to
# overwrite the return address and jump to a function that prints "Good Job."
# Our strategy for solving the challenge is as follows:
# 1. Initialize the simulation and ask Angr to record unconstrained states.
# 2. Step through the simulation until we have found a state where eip is
#    symbolic.
# 3. Constrain eip to equal the address of the "print_good" function.
  # Make a symbolic input that has a decent size to trigger overflow
  # (!)
  symbolic_input = claripy.BVS("input", 8 * 59)

  # Create initial state and set stdin to the symbolic input
  initial_state = project.factory.entry_state(
    stdin=symbolic_input,  # the program reads the symbolic bitvector as its input
    add_options = {
      angr.options.SYMBOL_FILL_UNCONSTRAINED_MEMORY,
      angr.options.SYMBOL_FILL_UNCONSTRAINED_REGISTERS
    }
  )

  # Ensure that every byte of input is within the acceptable ASCII range (A..Z)
  # (!)
  for byte in symbolic_input.chop(bits=8):
    initial_state.add_constraints(
      claripy.And(
        byte >= 'A',
        byte <= 'Z'
      )
    )

  # The save_unconstrained=True parameter specifies to Angr to not throw out
  # unconstrained states. Instead, it will move them to the list called
  # 'simulation.unconstrained'. Additionally, we will be using a few stashes
  # that are not included by default, such as 'found' and 'not_needed'. You will
  # see how these are used later.
  # (!)
  simulation = project.factory.simgr(
    initial_state,
    save_unconstrained=True,
    stashes={
      'active' : [initial_state],
      'unconstrained' : [],
      'found' : [],
      'not_needed' : []
    }
  )
  # Explore will not work for us, since the method specified with the 'find'
  # parameter will not be called on an unconstrained state. Instead, we want to
  # explore the binary ourselves. To get started, construct an exit condition
  # to know when the simulation has found a solution. We will later move
  # states from the unconstrained list to the simulation.found list.
  # Create a boolean value that indicates a state has been found.
  def has_found_solution():
    return simulation.found

  # An unconstrained state occurs when there are too many possible branches
  # from a single instruction. This occurs, among other ways, when the
  # instruction pointer (on x86, eip) is completely symbolic, meaning
  # that user input can control the address of code the computer executes.
  # For example, imagine the following pseudo assembly (AT&T syntax):
  #
  # mov user_input, eax
  # jmp eax
  #
  # The value of what the user entered dictates the next instruction. This
  # is an unconstrained state. It wouldn't usually make sense for the execution
  # engine to continue. (Where should the program jump to if eax could be
  # anything?) Normally, when Angr encounters an unconstrained state, it throws
  # it out. In our case, we want to exploit the unconstrained state to jump to
  # a location of our choosing. Check if there are still unconstrained states
  # by examining the simulation.unconstrained list.
  # (!)
  def has_unconstrained_to_check():
    return simulation.unconstrained

  # The list simulation.active is a list of all states that can be explored
  # further.
  # (!)
  def has_active():
    return simulation.active
  while (has_active() or has_unconstrained_to_check()) and (not has_found_solution()):
    for unconstrained_state in simulation.unconstrained:
      def should_move(s):
        return s is unconstrained_state
      # Look for unconstrained states and move them to the 'found' stash.
      # A 'stash' should be a string that corresponds to a list that stores
      # all the states that the state group keeps. Values include:
      #  'active' = states that can be stepped
      #  'deadended' = states that have exited the program
      #  'errored' = states that encountered an error with Angr
      #  'unconstrained' = states that are unconstrained
      #  'found' = solutions
      #  anything else = whatever you want, perhaps you want a 'not_needed',
      #                  you can call it whatever you want
      # (!)
      simulation.move('unconstrained', 'found', filter_func = should_move)

    # Advance the simulation.
    simulation.step()

  if simulation.found:
    solution_state = simulation.found[0]

    # Constrain the instruction pointer to target the print_good function and
    # then solve for the user input (recall that this is
    # 'solution_state.posix.dumps(sys.stdin.fileno())')
    # (!)
    solution_state.add_constraints(solution_state.regs.eip == 0x48585558)

    solution = solution_state.posix.dumps(sys.stdin.fileno()).decode()
    print(solution)
  else:
    raise Exception('Could not find the solution')
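The script restricts every input byte to A..Z and then forces eip to 0x48585558, so the solver can only succeed if the address itself is made of uppercase letters. On 32-bit x86 the return address sits on the stack in little-endian byte order, which the sketch below checks in plain Python. The padding length is a made-up placeholder, since the real offset depends on the stack layout of `read_input` and is not given here; `build_payload` and `is_upper_ascii` are illustrative helpers, not part of angr.

```python
import struct

# Target address taken from the eip constraint in the script above.
PRINT_GOOD_ADDR = 0x48585558

# ASSUMPTION: placeholder distance from the start of the buffer to the saved
# return address. The real value must be read from the binary.
HYPOTHETICAL_PADDING_LEN = 16


def pack_return_address(addr: int) -> bytes:
    """Encode a 32-bit address as it is laid out in memory on x86 (little-endian)."""
    return struct.pack("<I", addr)


def build_payload(padding_len: int, addr: int, filler: bytes = b"A") -> bytes:
    """Filler bytes up to the saved return address, followed by the new address."""
    return filler * padding_len + pack_return_address(addr)


def is_upper_ascii(data: bytes) -> bool:
    """True if every byte is in 'A'..'Z', the range the script allows."""
    return all(ord("A") <= byte <= ord("Z") for byte in data)


payload = build_payload(HYPOTHETICAL_PADDING_LEN, PRINT_GOOD_ADDR)
print(pack_return_address(PRINT_GOOD_ADDR))  # b'XUXH'
print(is_upper_ascii(payload))               # True
```

The four address bytes come out as `XUXH`, so the password printed by the solve script should contain that sequence at whatever offset overwrites the return address.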
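To see the control flow of the manual exploration loop without loading a binary, here is a toy model that uses plain dictionaries and lists in place of angr's simulation manager. Everything in it is hypothetical: `move`, `toy_step`, and `explore_manually` only mimic the shape of the loop above, and a "state" is just a countdown integer that turns into an unconstrained marker when it reaches zero.

```python
def move(stashes, src, dst, filter_func):
    """Move every state in stashes[src] that passes filter_func into stashes[dst]."""
    kept, moved = [], []
    for state in stashes[src]:
        (moved if filter_func(state) else kept).append(state)
    stashes[src] = kept
    stashes[dst].extend(moved)


def toy_step(stashes):
    """Toy stand-in for a step: count each active state down by one.

    A state that has reached 0 leaves 'active' and is recorded as unconstrained,
    imitating execution reaching a symbolic instruction pointer.
    """
    next_active = []
    for state in stashes["active"]:
        if state == 0:
            stashes["unconstrained"].append("symbolic-eip")
        else:
            next_active.append(state - 1)
    stashes["active"] = next_active


def explore_manually(stashes, step):
    """Same loop shape as the script: move unconstrained states, then step."""
    iterations = 0
    while (stashes["active"] or stashes["unconstrained"]) and not stashes["found"]:
        move(stashes, "unconstrained", "found", lambda state: True)
        step(stashes)
        iterations += 1
    return iterations


stashes = {"active": [2], "unconstrained": [], "found": [], "not_needed": []}
iterations = explore_manually(stashes, toy_step)
print(iterations, stashes["found"])
```

The loop needs one extra pass after the unconstrained state appears, because the move into `found` happens at the top of the next iteration; that is why the loop condition also checks the unconstrained list and not only the active one.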