XCTF2023 flagio

Original link: 7th XCTF Final - Super Flagio
The challenge author's write-up lays out the approach very clearly and is well worth studying and reproducing.

Prerequisites

Lua file compilation pipeline

It is similar to many other VM-based languages (the Java virtual machine, for example).

image

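What is LuaJIT

A fast Lua implementation with JIT compilation: code is compiled to native machine code at run time, which speeds up execution.

The relationship between LuaJIT and cocos2d-x

cocos2d-x is written in C++. The framework provides many C++ interfaces for controlling animation, rendering, the physics engine, audio and so on, and it exposes these C++ interfaces to LuaJIT so that Lua logic can control the game's resources while it runs.

An example of how the two relate

In a cocos2d-x game, a character's movement animations are usually implemented through the C++ interfaces that cocos2d-x provides, for example:

  • Left: left(); // implemented in cocos2d-x C++ code
  • Right: right(); // same as above
  • Jump: jump(); // same as above

The actual game logic runs in Lua. For that logic to show up as animation, the C++ interfaces have to be exposed to the Lua code, which then drives the animation.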

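Installing LuaJIT

Reference: Installation (luajit.org)

Note: much of the content in the subsections below is translated directly from the official site.

Introduction

LuaJIT is distributed only as a source package. This page describes how to build and install LuaJIT on different operating systems and with different C compilers.

Requirements

System requirements

LuaJIT works out of the box on most systems.

Cross-compiling LuaJIT

The GNU Makefile-based build system allows cross-compiling on any host for any supported target, as long as both architectures have the same pointer size. If you want to cross-compile for any 32-bit target on a 64-bit OS, you need to install the multilib development package and build the host part as 32-bit.

When the host and target operating systems differ, you must specify TARGET_SYS explicitly; otherwise assembling or linking will fail.

For example, when compiling on a Windows host for embedded Linux or Android, add TARGET_SYS=Linux to the examples below.

For Android

For Android, you can cross-compile with the Android NDK.

Doing the build

I'm jumping ahead here: first see Determining the LuaJIT version[^1]. You also need to install android-ndk-r20b.

Adapt the script from the official site. Building LuaJIT-2.1.0-beta3 did not succeed, so I switched to LuaJIT-2.1, which also works.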

# environment: kali-linux-2022.1-vmware-amd64
# Android/ARM, armeabi-v7a (ARMv7 VFP), Android 4.1+ (JB)

NDKDIR=/home/kali/Android/NDK/android-ndk-r20b
NDKBIN=$NDKDIR/toolchains/llvm/prebuilt/linux-x86_64/bin
NDKCROSS=$NDKBIN/arm-linux-androideabi-
NDKCC=$NDKBIN/armv7a-linux-androideabi16-clang
OUTPUT_DIR=output_dir
make HOST_CC="gcc -m32" CROSS=$NDKCROSS \
STATIC_CC=$NDKCC DYNAMIC_CC="$NDKCC -fPIC" \
TARGET_LD=$NDKCC TARGET_AR="$NDKBIN/llvm-ar rcus" \
TARGET_STRIP=$NDKBIN/llvm-strip \
PREFIX="$OUTPUT_DIR"

Determining the LuaJIT version

Steps

  • Open libgame.so in IDA
  • Search the strings for LuaJIT

The version turns out to be LuaJIT 2.1.0-beta3

image

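Anti-debugging

The next step uses frida for debugging, so we first need to check whether the binary contains any frida-detection code.

The Java layer has already been reviewed (it mainly does root detection), so the places to check are:

  • JNI_OnLoad
  • .init_array

JNI_OnLoad

It creates a thread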

jint JNI_OnLoad(JavaVM *vm, void *reserved)
{
__int64 v2; // x0
__int64 v3; // x0
pthread_t v5[2]; // [xsp+0h] [xbp-20h] BYREF

v5[1] = *(_QWORD *)(_ReadStatusReg(ARM64_SYSREG(3, 3, 13, 0, 2)) + 40);
thread_key(vm, reserved);
v2 = pthread_create(v5, 0LL, (void *(*)(void *))sub_1D4E0C, 0LL);// create the thread
v3 = pthread_getspecific_(v2);
sub_1D30F4(v3); // not important
return 65540;
}

Thread logic

void __noreturn sub_1D4E0C()
{
unsigned int pid; // w19
__int64 flag; // x0

pid = getpid();
while ( 1 )
{
flag = check_function(pid); // first detection function
sub_1D4CFC(flag); // possibly another detection function
sleep(1u);
}
}

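Detection function

It reads the process's status file. If some process is tracing the app, TracerPid (v5) is nonzero and the function returns true, so our goal is to make this function return 0.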

bool __fastcall check_function(unsigned int a1)
{
FILE *v1; // x0
FILE *v2; // x19
_BOOL4 v3; // w20
int v5; // [xsp+14h] [xbp-53Ch] BYREF
char buffer[1024]; // [xsp+18h] [xbp-538h] BYREF
char s[264]; // [xsp+418h] [xbp-138h] BYREF

_ReadStatusReg(ARM64_SYSREG(3, 3, 13, 0, 2));
snprintf(s, 0xFFu, "/proc/%d/status", a1); // build the path of the status file to check
v1 = fopen(s, "rt");
if ( v1 )
{
v2 = v1;
while ( fgets(buffer, 1024, v2) )
{
if ( !memcmp(buffer, "TracerPid:", 0xAu) )// is this the TracerPid line?
{
v5 = 0;
sscanf(buffer, "%*s%d", &v5);
v3 = v5 != 0;
printf("%s", buffer);
goto LABEL_8;
}
}
v3 = 1;
LABEL_8:
fclose(v2);
}
else
{
return 1;
}
return v3;
}
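
The check above boils down to reading the TracerPid field of /proc/<pid>/status. Here is a minimal Python sketch of the same decision, run on sample status text rather than a live process. The helper name is_traced and the sample strings are made up for illustration; like the decompiled code, it treats a missing TracerPid line as "traced".

```python
def is_traced(status_text: str) -> bool:
    """Mirror check_function(): True when TracerPid is nonzero.

    As in the decompiled code (the v3 = 1 fallback), a status text
    without any TracerPid line is also reported as traced.
    """
    for line in status_text.splitlines():
        if line.startswith("TracerPid:"):
            fields = line.split()
            return int(fields[1]) != 0
    return True


# Hypothetical excerpts of /proc/<pid>/status, for illustration only.
sample_clean = "Name:\tgame\nState:\tS (sleeping)\nTracerPid:\t0\n"
sample_traced = "Name:\tgame\nState:\tS (sleeping)\nTracerPid:\t4321\n"

print(is_traced(sample_clean))
print(is_traced(sample_traced))
```

Any bypass therefore only has to make the function see TracerPid: 0, or make it return 0 outright.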

.init_array

Nothing here looks like a detection function

.init_array:0000000000E7A708                               ; ELF Initialization Function Table
.init_array:0000000000E7A708 ; ===========================================================================
.init_array:0000000000E7A708
.init_array:0000000000E7A708 ; Segment type: Pure data
.init_array:0000000000E7A708 AREA .init_array, DATA, ALIGN=3
.init_array:0000000000E7A708 ; ORG 0xE7A708
.init_array:0000000000E7A708 00 60 1C 00 00 00 00 00 off_E7A708 DCQ start ; DATA XREF: LOAD:off_88↑o
.init_array:0000000000E7A708 ; LOAD:00000000000001D8↑o
.init_array:0000000000E7A710 48 60 1C 00 00 00 00 00 DCQ sub_1C6048
.init_array:0000000000E7A718 E8 60 1C 00 00 00 00 00 DCQ sub_1C60E8
.init_array:0000000000E7A720 08 61 1C 00 00 00 00 00 DCQ sub_1C6108
.init_array:0000000000E7A728 DC 62 1C 00 00 00 00 00 DCQ sub_1C62DC

........

.init_array:0000000000E7AAF8 ; .init_array ends

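Hook

Once the anti-debugging is bypassed, we can start hooking.

Most of the Lua bytecode files are encrypted, but by hooking the function that loads Lua files we can get at the bytecode that is actually loaded.

Known hook point (this is wrong; it is corrected later in "Recovering the shuffled opcodes")

Known hook point: luaL_loadbuffer(). However, the libgame.so we are analyzing is stripped, so we need to recover symbols and find the file offset of luaL_loadbuffer().

Recovering symbols (this is not actually that function; it is corrected later in "Recovering the shuffled opcodes")

We can compile the same version of libluajit.so and then recover symbols with the bindiff[^2] plugin.

We already know the version (Determining the LuaJIT version[^1])

After recovery, a search turns up luaL_loadbuffer()

Reasons for settling on the first match:

  • Its similarity is high, and the similarity of the other candidates is far too low
  • The number and types of its parameters match luaL_loadbuffer() (the type widths match; the types themselves have to be fixed by hand)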

image

luaL_loadbuffer() prototype

LUALIB_API int luaL_loadbuffer(lua_State *L, const char *buf, size_t size, const char *name);

Parameters

  • L: the Lua interpreter state
  • buf: pointer to the binary contents of the Lua file being loaded
  • size: size of buf
  • name: module path name

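Hooking luaL_loadbuffer()

Getting the function address

luaL_loadbuffer() was recovered above, so we only need to look up its file offset

image

Writing the hook script

The hook script gets the module and then hooks the function through its address relative to the module base (the binary is stripped, so we can only hook by address)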

{
var base = Process.findModuleByName("libgame.so").base; // base address of the target module
var luaL_loadbuffer = base.add(0xAC4E9C);
Interceptor.attach(luaL_loadbuffer, {
    onEnter: function(args) {
        console.log("[luaL_loadbuffer]: name = ", args[3].readCString());
    },
    onLeave: function(retval) {

    }
})
}

Output after the game reaches the main menu

[luaL_loadbuffer]: name =  assets/src/537350069
[luaL_loadbuffer]: name = cocos/init.pyc
[luaL_loadbuffer]: name = cocos/cocos2d/Cocos2d.pyc
[luaL_loadbuffer]: name = cocos/cocos2d/Cocos2dConstants.pyc
[luaL_loadbuffer]: name = cocos/cocos2d/extern.pyc
[luaL_loadbuffer]: name = cocos/cocos2d/bitExtend.pyc
[luaL_loadbuffer]: name = cocos/cocos2d/DrawPrimitives.pyc
[luaL_loadbuffer]: name = cocos/cocos2d/Opengl.pyc
[luaL_loadbuffer]: name = cocos/cocos2d/OpenglConstants.pyc
[luaL_loadbuffer]: name = cocos/cocosbuilder/CCBReaderLoad.pyc
[luaL_loadbuffer]: name = cocos/cocosdenshion/AudioEngine.pyc
[luaL_loadbuffer]: name = cocos/cocostudio/CocoStudio.pyc
[luaL_loadbuffer]: name = cocos/cocos2d/json.pyc
[luaL_loadbuffer]: name = cocos/cocostudio/StudioConstants.pyc
[luaL_loadbuffer]: name = cocos/ui/GuiConstants.pyc
[luaL_loadbuffer]: name = cocos/ui/experimentalUIConstants.pyc
[luaL_loadbuffer]: name = cocos/extension/ExtensionConstants.pyc
[luaL_loadbuffer]: name = cocos/network/NetworkConstants.pyc
[luaL_loadbuffer]: name = cocos/spine/SpineConstants.pyc
[luaL_loadbuffer]: name = core/Enums.pyc
[luaL_loadbuffer]: name = core/Resources.pyc
[luaL_loadbuffer]: name = scene/MainMenu.pyc

Output after tapping Start Game

[luaL_loadbuffer]: name =  scene/GameScene.pyc
[luaL_loadbuffer]: name = core/GameMap.pyc
[luaL_loadbuffer]: name = entity/Enemy.pyc
[luaL_loadbuffer]: name = entity/Mario.pyc

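Output after hitting a question-mark block

We know that hitting a question-mark block gives you an invincibility mushroom, so the final input check should happen here, which means core/Util.pyc should contain the checking logic.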

[luaL_loadbuffer]: name =  core/Util.pyc

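Determining the file type

From the luaL_loadbuffer() prototype we know that the buf parameter points to the file contents, so we can read the first four bytes to determine the file type

Modifying the JS code

Added output of the file's first four bytes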

{
var luaL_loadbuffer = base.add(0xAC4E9C);
Interceptor.attach(luaL_loadbuffer, {
onEnter: function(args) {
console.log("[luaL_loadbuffer]: name = ", args[3].readCString());
var head = args[1].readByteArray(4); // first four bytes of the chunk
console.log("[luaL_loadbuffer]: content = ", head);
},
onLeave: function(retval) {

}
})
}

Resulting output

[luaL_loadbuffer]: name =  assets/src/537350069
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/init.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/cocos2d/Cocos2d.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/cocos2d/Cocos2dConstants.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/cocos2d/extern.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/cocos2d/bitExtend.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/cocos2d/DrawPrimitives.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/cocos2d/Opengl.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/cocos2d/OpenglConstants.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/cocosbuilder/CCBReaderLoad.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/cocosdenshion/AudioEngine.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/cocostudio/CocoStudio.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/cocos2d/json.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/cocostudio/StudioConstants.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/ui/GuiConstants.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/ui/experimentalUIConstants.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/extension/ExtensionConstants.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/network/NetworkConstants.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = cocos/spine/SpineConstants.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = core/Enums.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = core/Resources.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = scene/MainMenu.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02
[luaL_loadbuffer]: name = scene/GameScene.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = core/GameMap.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = entity/Enemy.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.
[luaL_loadbuffer]: name = entity/Mario.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02
[luaL_loadbuffer]: name = core/Util.pyc
[luaL_loadbuffer]: content = 0 1 2 3 4 5 6 7 8 9 A B C D E F 0123456789ABCDEF
00000000 1b 4c 4a 02 .LJ.

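luac64 files

The first five bytes of every file are 1b 4c 4a 02 0A (the hook above prints only the first four).

  • 1b 4c 4a: LuaJIT magic number
  • 02: bytecode version
  • 0A: flags byte marking a 64-bit luac64 file (bit 0x08, FR2, is the two-slot frame layout used by 64-bit GC64 builds; bit 0x02 means debug info is stripped)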
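
The header layout above can be checked mechanically. The following is a minimal Python sketch, not part of the original write-up: the function name parse_header is made up, and the flag-bit names are the ones I recall from LuaJIT's lj_bcdump.h, so treat them as an assumption and check them against your own LuaJIT tree.

```python
LJ_MAGIC = b"\x1bLJ"  # 1b 4c 4a

# Flag bits of the header's flags byte (assumed from lj_bcdump.h).
FLAG_BE = 0x01     # big-endian dump
FLAG_STRIP = 0x02  # debug info stripped
FLAG_FFI = 0x04    # contains FFI constants
FLAG_FR2 = 0x08    # two-slot frame layout (GC64 builds)


def parse_header(data: bytes) -> dict:
    """Decode magic, version and flags from the first five bytes."""
    if data[:3] != LJ_MAGIC:
        raise ValueError("not a LuaJIT bytecode dump")
    # The flags field is ULEB128-encoded; for small values like 0x0A
    # it occupies exactly one byte, which is all this sketch handles.
    flags = data[4]
    return {
        "version": data[3],
        "big_endian": bool(flags & FLAG_BE),
        "stripped": bool(flags & FLAG_STRIP),
        "ffi": bool(flags & FLAG_FFI),
        "fr2": bool(flags & FLAG_FR2),
    }


header = parse_header(bytes.fromhex("1b4c4a020a"))
print(header)
```

A decompiler has to agree with both the version byte and the FR2 bit, otherwise frame-related operands are decoded incorrectly.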

A classic luac file

1b 4c 4a 01             | Header LuaJIT 2.0 BC
00 | Flags: None
11 40 74 65 73 74 73 2f | Chunkname: @tests/test-1.lua
74 65 73 74 2d 31 2e 6c |
75 61 |
| .. prototype ..
8a 01 | prototype length 138
02 | prototype flags PROTO_VARARG
00 | parameters number 0
07 | framesize 7
00 01 01 12 | size uv: 0 kgc: 1 kn: 1 bc: 19
31 | debug size 49
00 07 | firstline: 0 numline: 7
| .. bytecode ..
32 00 00 00 | 0001 TNEW 0 0
27 01 01 00 | 0002 KSHORT 1 1
27 02 0a 00 | 0003 KSHORT 2 10
27 03 01 00 | 0004 KSHORT 3 1
49 01 04 80 | 0005 FORI 1 => 0010
20 05 04 04 | 0006 => MULVV 5 4 4
14 05 00 05 | 0007 ADDVN 5 5 0 ; 1
39 05 04 00 | 0008 TSETV 5 0 4
4b 01 fc 7f | 0009 FORL 1 => 0006
27 01 01 00 | 0010 => KSHORT 1 1
27 02 0a 00 | 0011 KSHORT 2 10
27 03 01 00 | 0012 KSHORT 3 1
49 01 04 80 | 0013 FORI 1 => 0018
34 05 00 00 | 0014 => GGET 5 0 ; "print"
36 06 04 00 | 0015 TGETV 6 0 4
3e 05 02 01 | 0016 CALL 5 1 2
4b 01 fc 7f | 0017 FORL 1 => 0014
47 00 01 00 | 0018 => RET0 0 1
| .. uv ..
| .. kgc ..
0a 70 72 69 6e 74 | kgc: "print"
| .. knum ..
02 | knum int: 1
| .. debug ..
01 | pc001: line 1
02 | pc002: line 2
02 | pc003: line 2
02 | pc004: line 2
02 | pc005: line 2
...
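
The bytecode rows in the listing above follow a fixed 32-bit layout: the lowest byte is the opcode, the next byte is operand A, and the upper 16 bits are either operand D or the pair C, B. Here is a small Python sketch that decodes a few rows of the listing; the helper names decode_ins and jump_target are made up for illustration, and the jump bias of 0x8000 is inferred from the FORI/FORL rows shown.

```python
def decode_ins(word: bytes) -> dict:
    """Split one 4-byte little-endian LuaJIT instruction into its fields."""
    ins = int.from_bytes(word, "little")
    return {
        "op": ins & 0xFF,          # opcode number (index into the dispatch table)
        "a": (ins >> 8) & 0xFF,    # operand A
        "c": (ins >> 16) & 0xFF,   # operand C (three-operand form)
        "b": (ins >> 24) & 0xFF,   # operand B (three-operand form)
        "d": (ins >> 16) & 0xFFFF, # operand D (two-operand form)
    }


def jump_target(pc: int, d: int) -> int:
    """Target of a jump at listing position pc; D is biased by 0x8000."""
    return pc + 1 + (d - 0x8000)


kshort = decode_ins(bytes.fromhex("27020a00"))  # 0003 KSHORT 2 10
mulvv = decode_ins(bytes.fromhex("20050404"))   # 0006 MULVV 5 4 4
fori = decode_ins(bytes.fromhex("49010480"))    # 0005 FORI 1 => 0010
forl = decode_ins(bytes.fromhex("4b01fc7f"))    # 0009 FORL 1 => 0006

print(kshort["op"], kshort["a"], kshort["d"])
print(mulvv["a"], mulvv["b"], mulvv["c"])
print(jump_target(5, fori["d"]), jump_target(9, forl["d"]))
```

Only the opcode byte is affected when a build shuffles the opcode order; the operand layout stays the same, which is why recovering the opcode mapping is enough to make a standard decompiler work again.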

Dumping the luac64 files from memory

frida script

{
var luaL_loadbuffer = base.add(0xAC4E9C);
Interceptor.attach(luaL_loadbuffer, {
    onEnter: function(args) {
        var chunk_addr = args[1]; // buffer pointer
        var chunk_name = args[3].readCString();
        var chunk_size = args[2].toInt32();

        // keep the file name, drop the last three characters ("pyc"), append "luac64"
        var new_name = chunk_name.slice(chunk_name.lastIndexOf('/') + 1, -3) + "luac64";

        var currentApplication = Java.use("android.app.ActivityThread").currentApplication();
        var dir = currentApplication.getApplicationContext().getFilesDir().getPath();

        // getPath() has no trailing slash, so add the separator explicitly
        var out_path = dir + "/" + new_name;
        console.log(out_path);
        var file = new File(out_path, "wb");
        file.write(chunk_addr.readByteArray(chunk_size));
        file.close(); // must be called; a bare `file.close` does nothing
    },
    onLeave: function(retval) {

    }
})
}
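
The output name in the script comes from a single slice expression, and it is easy to misread what it produces. Here is a small Python sketch of the same rule (the function name dump_name is made up for illustration), applied to two chunk names from the hook output above:

```python
def dump_name(chunk_name: str) -> str:
    """Same rule as the JS slice: file name minus its last 3 characters, plus "luac64"."""
    file_name = chunk_name[chunk_name.rfind("/") + 1:]
    return file_name[:-3] + "luac64"


print(dump_name("core/Util.pyc"))
print(dump_name("assets/src/537350069"))
```

Two consequences are worth knowing: a chunk name without a three-letter extension (such as the first one in the log) loses its last three characters, and two chunks with the same file name in different directories would overwrite each other.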

Resulting luac64 files

adb pull can only fetch them one at a time

image

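Recovering the shuffled opcodes

Looking back at the Lua file compilation pipeline[^4]: if the opcode order of the JIT engine has been modified, a standard Lua decompiler cannot produce the correct source.

lj_obj.h

Open the source from Installing LuaJIT[^5] in VS Code. lj_obj.h is in the src directory, and it contains:

  • Garbage-collection support
  • Definitions of the data types used by the JIT
  • Functions that implement garbage collection and control operations on the various types

The most important piece is the lua_State structure, which is tied to how opcodes are interpreted, so we need to look into it.

lua_State

From: lj_obj.h

Purpose: the state structure of the interpreter during execution, holding all of the interpreter's state. When the JIT interpreter runs, it instantiates a lua_State object to manage the stack and global variables, load and run code, and manage memory.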

struct lua_State {
GCHeader;
uint8_t dummy_ffid; /* Fake FF_C for curr_funcisL() on dummy frames. */
uint8_t status; /* Thread status. */
MRef glref; /* Link to global state. */
GCRef gclist; /* GC chain. */
TValue *base; /* Base of currently executing function. */
TValue *top; /* First free slot in the stack. */
MRef maxstack; /* Last free slot in the stack. */
MRef stack; /* Stack base. */
GCRef openupval; /* List of open upvalues in the stack. */
GCRef env; /* Thread environment (table of globals). */
void *cframe; /* End of C stack frame chain. */
MSize stacksize; /* True stack size (incl. LJ_STACK_EXTRA). */
};

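Purpose of each member:

  • GCHeader: not important

    • Used for memory management
  • dummy_ffid: not important

    • Used to decide whether a function call needs a new stack frame
  • status: not important

    • Records the thread status
  • glref: important

    • Pointer to the global state structure global_State (as the official comment says), used to access global variables and global state.
    • Its type is MRef, which is simply a pointer type; a global build flag decides whether it is 64-bit or 32-bit. What matters is that at run time it points to the global state structure global_State.
  • gclist: not important

    • Pointer to the objects that are about to be collected
  • base:

    • Stack frame of the currently executing function
  • top

    • Stack-top pointer
  • stack:

    • Start address of the stack
  • cframe:

    • Top of the C stack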

global_State

From: what the glref member of lua_State points to.

Purpose: holds LuaJIT's run-time global state, shared by all threads.

typedef struct global_State {
GCRef *strhash; /* String hash table (hash chain anchors). */
MSize strmask; /* String hash mask (size of hash table - 1). */
MSize strnum; /* Number of strings in hash table. */
lua_Alloc allocf; /* Memory allocator. */
void *allocd; /* Memory allocator data. */
GCState gc; /* Garbage collector. */
volatile int32_t vmstate; /* VM state or current JIT code trace number. */
SBuf tmpbuf; /* Temporary string buffer. */
GCstr strempty; /* Empty string. */
uint8_t stremptyz; /* Zero terminator of empty string. */
uint8_t hookmask; /* Hook mask. */
uint8_t dispatchmode; /* Dispatch mode. */
uint8_t vmevmask; /* VM event mask. */
GCRef mainthref; /* Link to main thread. */
TValue registrytv; /* Anchor for registry. */
TValue tmptv, tmptv2; /* Temporary TValues. */
Node nilnode; /* Fallback 1-element hash part (nil key and value). */
GCupval uvhead; /* Head of double-linked list of all open upvalues. */
int32_t hookcount; /* Instruction hook countdown. */
int32_t hookcstart; /* Start count for instruction hook counter. */
lua_Hook hookf; /* Hook function. */
lua_CFunction wrapf; /* Wrapper for C function calls. */
lua_CFunction panic; /* Called as a last resort for errors. */
BCIns bc_cfunc_int; /* Bytecode for internal C function calls. */
BCIns bc_cfunc_ext; /* Bytecode for external C function calls. */
GCRef cur_L; /* Currently executing lua_State. */
MRef jit_base; /* Current JIT code L->base or NULL. */
MRef ctype_state; /* Pointer to C type state. */
GCRef gcroot[GCROOT_MAX]; /* GC roots. */
} global_State;

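lua_newstate()

From: found by cross-referencing lua_State[^6] and global_State[^7].

Purpose: called at program start to initialize a new lua_State object, which is the one global lua_State. (New threads created later get their own per-thread lua_State objects.)

Logic: it introduces a new GG_State structure. The code below shows that both lua_State[^6] and global_State[^7] are members of it.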

LUA_API lua_State *lua_newstate(lua_Alloc f, void *ud)
{
GG_State *GG = (GG_State *)f(ud, NULL, 0, sizeof(GG_State)); // a higher-level global state structure
lua_State *L = &GG->L; // it contains the lua_State
global_State *g = &GG->g; // and the global_State

..... // initialization of lua_State and global_State

}

GG_State

From: the new structure type that shows up in lua_newstate()

Purpose: stores the run-time data of the LuaJIT virtual machine

typedef struct GG_State {
lua_State L; /* Main thread. */
global_State g; /* Global state. */
#if LJ_TARGET_MIPS
ASMFunction got[LJ_GOT__MAX]; /* Global offset table. */
#endif
#if LJ_HASJIT
jit_State J; /* JIT state. */
HotCount hotcount[HOTCOUNT_SIZE]; /* Hot counters. */
#endif
ASMFunction dispatch[GG_LEN_DISP]; /* Instruction dispatch tables. */
BCIns bcff[GG_NUM_ASMFF]; /* Bytecode for ASM fast functions. */
} GG_State;

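Purpose of each member:

  • L:

    • The main thread's lua_State, used to access the thread's stack and related information
  • g:

    • The global state
  • got:

    • Not important
  • J:

    • Records the JIT compiler state
  • hotcount:

    • Hot counters for Lua code, so that the JIT compiler can optimize the hottest spots
  • dispatch: very, very important

    • Official comment: Instruction dispatch tables
  • bcff:

    • Bytecode (for the ASM fast functions)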

luaL_loadbuffer()

Starting from bytecode loading, keep following the code to see how the LuaJIT VM executes

The call order is: luaL_loadbuffer() -> luaL_loadbufferx() -> lua_loadx()

LUALIB_API int luaL_loadstring(lua_State *L, const char *s)
{
return luaL_loadbuffer(L, s, strlen(s), s);
}

LUALIB_API int luaL_loadbuffer(lua_State *L, const char *buf, size_t size,
const char *name)
{
return luaL_loadbufferx(L, buf, size, name, NULL);
}

LUALIB_API int luaL_loadbufferx(lua_State *L, const char *buf, size_t size,
const char *name, const char *mode)
{
StringReaderCtx ctx;
ctx.str = buf;
ctx.size = size;
return lua_loadx(L, reader_string, &ctx, name, mode);
}

LUA_API int lua_loadx(lua_State *L, lua_Reader reader, void *data,
const char *chunkname, const char *mode)
{
LexState ls;
int status;
ls.rfunc = reader;
ls.rdata = data;
ls.chunkarg = chunkname ? chunkname : "?";
ls.mode = mode;
lj_buf_init(L, &ls.sb);
status = lj_vm_cpcall(L, NULL, &ls, cpparser);
lj_lex_cleanup(L, &ls);
lj_gc_check(L);
return status;
}

lua_loadx()

Continue into this function

Logic:

  • The structure is initialized first (including the bytecode)
  • lj_vm_cpcall(L, NULL, &ls, cpparser); is called to start the VM, with the bytecode passed along

LUA_API int lua_loadx(lua_State *L, lua_Reader reader, void *data,
const char *chunkname, const char *mode)
{

LexState ls;
int status;
ls.rfunc = reader;
ls.rdata = data;// structure whose member holds the bytecode
ls.chunkarg = chunkname ? chunkname : "?";
ls.mode = mode;
lj_buf_init(L, &ls.sb);
// start the VM
status = lj_vm_cpcall(L, NULL, &ls, cpparser);
lj_lex_cleanup(L, &ls);
lj_gc_check(L);
return status;
}

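lj_vm_cpcall()

Purpose: start the VM

Logic:

  • For performance reasons much of the VM's logic is implemented directly in assembly, so we need to look at the ARM64 dasc file.

LJ_ASMF int lj_vm_cpcall(lua_State *L, lua_CFunction func, void *ud,
lua_CPFunction cp);

Parameters:

  • L: pointer to the lua_State
  • func: pointer to the C function to execute
  • ud: user-data pointer
  • cp: the function that the stub actually calls (cpparser in the call above)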

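Assembly symbols

Symbols used (compare with the assembly below; "->" means the register's value changes, and the meaning of the new value is given after it):

  • CARG1:

    • Register: x0
    • Purpose: argument register args[0] -> a stack offset or something like that
  • L:

    • Type: lua_State
    • Register: x23
    • Purpose: denotes the lua_State variable
  • LREG:

    • Register: x23
    • Purpose: holds the lua_State address
  • RA: (callee-saved register)

    • Register: x27
    • Purpose: start address of the current stack -> stack length
  • SAVE_L:

    • Memory: [sp, #176]
    • Purpose: saves the lua_State argument
  • GL:

    • Type: global_State
    • Register: x22
    • Purpose: denotes the global state variable kept inside GG_State
  • RB:

    • Register: x17
    • Purpose: stack-top pointer
  • SAVE_PC:

    • Memory: [sp, #168]
    • Purpose: also stores the lua_State pointer (?)
  • RC: (callee-saved register)

    • Register: x28
    • Purpose: C stack-top pointer
  • SAVE_NRES

    • Memory: [sp, #200]
    • Purpose: stack length (?)
  • SAVE_ERRF:

    • Memory: [sp, #196]
    • Purpose: number of error functions
  • SAVE_CFRAME:

    • Memory: [sp, #160]
    • Purpose: C stack-top pointer
  • fp:

    • Register: x29
    • Purpose: frame pointer
  • CARG4

    • Register: x3
    • Purpose: args[3]
  • BASE

    • Register: x19
    • Purpose: args[0]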

Assembly

Logic: initialization, initialization, and more initialization...

// Definitions, from the top of the file, of every symbol used in this function
.type L, lua_State, LREG
.define LREG, x23 // Register holding lua_State (also in SAVE_L).
.define CARG1, x0
.define RA, x27
.define SAVE_L, [sp, #176] // a stack slot named SAVE_L
.type GL, global_State, GLREG
.define GLREG, x22
.define RB, x17
.define SAVE_PC, [sp, #168]
.define RC, x28
.define SAVE_NRES, [sp, #200]
.define SAVE_ERRF, [sp, #196]
.define SAVE_CFRAME, [sp, #160]
.define fp, x29 // Yes, we have to maintain a frame pointer.
.define CARG4, x3
.define BASE, x19

|->vm_cpcall:				// Setup protected C frame, call C.
| // (lua_State *L, lua_CFunction func, void *ud, lua_CPFunction cp)
| saveregs
| // take the lua_State pointer that was passed in
| mov L, CARG1
| // load the stack start address from lua_State separately
| ldr RA, L:CARG1->stack
| // save the lua_State pointer on the C stack
| str CARG1, SAVE_L
| // load the global_State pointer from lua_State
| ldr GL, L->glref // Setup pointer to global state.
| // RB = stack-top pointer
| ldr RB, L->top
| // store the lua_State pointer in the SAVE_PC slot
| str CARG1, SAVE_PC // Any value outside of bytecode is ok.
| // RC = current C frame pointer (L->cframe)
| ldr RC, L->cframe
| // RA = RA - RB: stack start - stack top, i.e. the negated offset of top within the stack
| sub RA, RA, RB // Compute -savestack(L, L->top).
| // store the low 32 bits of RA (RAw) in SAVE_NRES
| str RAw, SAVE_NRES // Neg. delta means cframe w/o frame.
| // wzr is the 32-bit zero register; SAVE_ERRF = 0 means there is no error function
| str wzr, SAVE_ERRF // No error function.
| // save the previous C frame pointer on the stack
| str RC, SAVE_CFRAME
| // link our own C frame into L->cframe
| str fp, L->cframe // Add our C frame to cframe chain.
| // store the lua_State pointer in the cur_L member of global_State
| str L, GL->cur_L
| // indirect call (branch with link) to the function in CARG4
| blr CARG4 // (lua_State *L, lua_CFunction func, void *ud)
| // the return value becomes BASE
| mov BASE, CRET1
| // FRAME_CP is a constant: the frame-type tag of a protected C frame
| mov PC, #FRAME_CP
| // if BASE is nonzero, jump back to label 3
| cbnz BASE, <3 // Else continue with the call.
|
| b ->vm_leave_cp // No base? Just remove C frame.

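Label 3:

Assembly symbols

  • RB:

    • Register: x17
    • Purpose: stack-top pointer -> old base
  • LJ_TISNUM:

    • Purpose: the JIT VM's tag value for the number type, used to tell types apart
  • These all look like initialization steps, so I did not read the rest too closely

Assembly

Main task: follow the code into ins_call and keep an eye on CARG3 (in the C call, the structure passed through this argument contains the opcodes)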

|  // entry point for vm_cpcall
|3: // Entry point for vm_cpcall/vm_resume (BASE = base, PC = ftype).
| // global_State.cur_L = lua_State
| str L, GL->cur_L
| // RB = previous base (L->base), CARG1 = L->top
| ldp RB, CARG1, L->base // RB = old base (for vmeta_call).
| // TISNUM = ((LJ_TISNUM >> 1) & 0xffff) << 48;
| movz TISNUM, #(LJ_TISNUM>>1)&0xffff, lsl #48
| // TISNUMhi = ((LJ_TISNUM >> 1) & 0xffff) << 16;
| movz TISNUMhi, #(LJ_TISNUM>>1)&0xffff, lsl #16
| //
| add PC, PC, BASE
| // TISNIL = ~0 (movn writes the bitwise NOT of the immediate, i.e. all ones)
| movn TISNIL, #0
| //
| sub PC, PC, RB // PC = frame delta + frame type
| //
| sub NARGS8:RC, CARG1, BASE
| //
| st_vmstate ST_INTERP
| //
|->vm_call_dispatch:
| // RB = old base, BASE = new base, RC = nargs*8, PC = caller PC
| ldr CARG3, [BASE, FRAME_FUNC]
| //
| checkfunc CARG3, ->vmeta_call
| //
|->vm_call_dispatch_f:
| // invoke ins_call
| ins_call
| //
| // BASE = new base, CARG3 = func, RC = nargs*8, PC = caller PC
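
The three movz/movn constants are a handy fingerprint for locating this code in a stripped binary, because they show up as plain immediates in the disassembly. Here is a short Python sketch that evaluates them. It assumes, from LuaJIT's lj_obj.h as I recall it, that LJ_TNUMX is (~13u) and that GC64 builds such as ARM64 use it as LJ_TISNUM; check both against your source tree.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

LJ_TNUMX = ~13 & MASK32  # assumption: #define LJ_TNUMX (~13u)
LJ_TISNUM = LJ_TNUMX     # assumption: GC64 builds define LJ_TISNUM as LJ_TNUMX

imm16 = (LJ_TISNUM >> 1) & 0xFFFF

TISNUM = imm16 << 48     # movz TISNUM,   #imm16, lsl #48
TISNUMhi = imm16 << 16   # movz TISNUMhi, #imm16, lsl #16
TISNIL = ~0 & MASK64     # movn TISNIL, #0 writes the NOT of 0

print(hex(TISNUM), hex(TISNUMhi), hex(TISNIL))
```

Under these assumptions the results are 0xFFF9000000000000, 0xFFF90000 and 0xFFFFFFFFFFFFFFFF, which are exactly the immediates moved into X24, X25 and X26 in the IDA listing of this routine later in the write-up.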

ins_call

Assembly

Same approach as above: keep following the code, watching CARG3, which carries the opcodes

|.macro ins_call
| // BASE = new base, CARG3 = LFUNC/CFUNC, RC = nargs*8, PC = caller PC
| str PC, [BASE, FRAME_PC]
| ins_callt // keep following into this macro
|.endmacro

ins_callt

Assembly

Watch the opcode pointer: ls->rdata->str

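|.macro ins_callt
| // BASE = new base, CARG3 = LFUNC/CFUNC, RC = nargs*8, FRAME_PC(BASE) = PC
| // load the address of the bytecode; I wasn't sure at first why the pc field can be read directly instead of `ls->rdata->str`, but CARG3 here is the function object loaded from [BASE, FRAME_FUNC]
| ldr PC, LFUNC:CARG3->pc
| // fetch one fixed-size 4-byte instruction and advance PC by 4
| ldr INSw, [PC], #4
| // TMP1 = GL + ((uint64_t)(INS & 0xFF) << 3)
| add TMP1, GL, INS, uxtb #3
| // decode operand A
| decode_RA RA, INS
| // TMP0 = *(uint64_t *)((char *)TMP1 + GG_G2DISP)
| // i.e. add the GG_G2DISP constant to TMP1 and load a 64-bit value: the address of the handler for this instruction
| ldr TMP0, [TMP1, #GG_G2DISP]
| // turn operand A into a slot address: RA = BASE + RA * 8
| add RA, BASE, RA, lsl #3
| // jump to the handler
| br TMP0
|.endmacro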

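How the VM looks up the address of the handler to execute

  • Finding the dispatch entry

    • TMP1 starts as the opcode value * 8 (each dispatch entry is an 8-byte pointer)
    • The value of GL is added to it
    • GL + GG_G2DISP is the start of the dispatch array
    • Adding opcode * 8 to that gives the slot that holds the address of the opcode's handler

#define GG_OFS(field)	((int)offsetof(GG_State, field)) // offset of a field from the start of GG_State
#define GG_G2DISP (GG_OFS(dispatch) - GG_OFS(g)) // difference between the addresses of the dispatch and g fields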
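
The lookup steps above amount to one line of address arithmetic. Here is a minimal Python sketch of it, assuming the GG_G2DISP value of 0xF30 that this write-up recovers from the target binary further down; the function name dispatch_slot and the GL address are made up for illustration.

```python
GG_G2DISP = 0xF30  # value recovered from libgame.so later in this write-up


def dispatch_slot(gl: int, ins: int, g2disp: int = GG_G2DISP) -> int:
    """Address of the dispatch-table slot holding the handler pointer for ins.

    Mirrors:  add TMP1, GL, INS, uxtb #3   /   ldr TMP0, [TMP1, #GG_G2DISP]
    """
    opcode = ins & 0xFF          # uxtb: zero-extend the low byte
    return gl + (opcode << 3) + g2disp


gl = 0x7000001000                # hypothetical run-time value of GL (x22)
ins = 0x000A0227                 # an instruction word whose opcode byte is 0x27

print(hex(dispatch_slot(gl, ins)))
```

Reading the 8-byte pointer stored at that address gives the handler that `br TMP0` jumps to, so dumping all 97 slots from GL + GG_G2DISP reveals the whole opcode-to-handler mapping without executing any particular instruction.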

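Finding the offset of ins_call and determining GG_G2DISP

The challenge author does not explain how to get the file offset of the corresponding instructions. I first tried searching for a code signature, but that produced far too many candidates.

So I went back to the source and found that the function bindiff had identified as luaL_loadbuffer() is actually lua_loadx(). (Only then did I notice that the libluajit.so I had compiled was 32-bit, which is probably why the bindiff similarity was rather low.)

The comparison below shows that both contain the same string "?", which confirms that it is lua_loadx().

lua_loadx() source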

LUA_API int lua_loadx(lua_State *L, lua_Reader reader, void *data,
const char *chunkname, const char *mode)
{
LexState ls;
int status;
ls.rfunc = reader;
ls.rdata = data;
ls.chunkarg = chunkname ? chunkname : "?";
ls.mode = mode;
lj_buf_init(L, &ls.sb);
status = lj_vm_cpcall(L, NULL, &ls, cpparser);
lj_lex_cleanup(L, &ls);
lj_gc_check(L);
return status;
}

IDA decompilation

__int64 __fastcall luaL_loadbuffer(__int64 a1, __int64 a2, __int64 a3, char *a4)
{
char *v4; // x8
unsigned int v6; // w20
__int64 v8[2]; // [xsp+0h] [xbp-E0h] BYREF
char v9[64]; // [xsp+10h] [xbp-D0h] BYREF
__int64 v10; // [xsp+50h] [xbp-90h]
__int64 v11; // [xsp+58h] [xbp-88h]
__int64 v12; // [xsp+60h] [xbp-80h]
__int64 v13; // [xsp+68h] [xbp-78h]
__int64 (__fastcall *v14)(); // [xsp+70h] [xbp-70h]
__int64 *v15; // [xsp+78h] [xbp-68h]
char *v16; // [xsp+90h] [xbp-50h]
__int64 v17; // [xsp+98h] [xbp-48h]

v14 = sub_AC4E7C;
v15 = v8;
v4 = "?"; // signature string
if ( a4 )
v4 = a4;
v8[0] = a2;
v8[1] = a3;
v16 = v4;
v17 = 0LL;
v12 = 0LL;
v13 = a1;
v10 = 0LL;
v11 = 0LL;
v6 = sub_ACFEB4(a1, 0LL);
sub_ABA3AC(a1, v9);
if ( *(_QWORD *)(*(_QWORD *)(a1 + 16) + 32LL) >= *(_QWORD *)(*(_QWORD *)(a1 + 16) + 40LL) )
sub_AD1D54(a1);
return v6;
}

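Continuing into lua_loadx()

Since lua_loadx() eventually calls lj_vm_cpcall(), we can try following it to find the corresponding assembly.

I followed v6 = sub_ACFEB4(a1, 0LL); in the IDA decompilation above.

sub_ACFEB4()

It contains two JUMPOUTs, which is very suspicious: one reason IDA emits a JUMPOUT is an unconditional jump out of the function, and ins_call is implemented in assembly, so there is good reason to suspect that the ins_call logic lives here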

void __fastcall sub_ACFEB4(_QWORD *a1, __int64 a2, __int64 a3, __int64 (*a4)(void))
{
__int64 v4; // x27
__int64 v5; // x22
__int64 v6; // x17
__int64 v7; // x28
__int64 v8[24]; // [xsp+0h] [xbp+0h] BYREF
int v9; // [xsp+C4h] [xbp+C4h]
int v10; // [xsp+C8h] [xbp+C8h]

v4 = a1[7];
v8[22] = (__int64)a1;
v5 = a1[2];
v6 = a1[5];
v8[21] = (__int64)a1;
v7 = a1[10];
v10 = v4 - v6;
v9 = 0;
v8[20] = v7;
a1[10] = v8;
*(_QWORD *)(v5 + 344) = a1;
if ( !a4() )
JUMPOUT(0xACFC10LL);
JUMPOUT(0xACFE5CLL);
}
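
One way to gain confidence that sub_ACFEB4 really is vm_cpcall is to check that the indices IDA shows (a1[2], a1[5], a1[7], a1[10]) line up with the lua_State fields the assembly touches. Here is a minimal Python sketch using ctypes to compute the field offsets. It assumes a 64-bit GC64 layout in which every GCRef/MRef/pointer is 8 bytes and GCHeader expands to nextgc, marked, gct; it should be run on a 64-bit Python so that ctypes uses 8-byte alignment.

```python
import ctypes


class LuaState(ctypes.Structure):
    """lua_State as laid out on a 64-bit GC64 build (assumed layout)."""
    _fields_ = [
        ("nextgc", ctypes.c_uint64),    # GCHeader: GCRef nextgc
        ("marked", ctypes.c_uint8),     # GCHeader: uint8_t marked
        ("gct", ctypes.c_uint8),        # GCHeader: uint8_t gct
        ("dummy_ffid", ctypes.c_uint8),
        ("status", ctypes.c_uint8),
        ("glref", ctypes.c_uint64),     # MRef
        ("gclist", ctypes.c_uint64),    # GCRef
        ("base", ctypes.c_uint64),      # TValue *
        ("top", ctypes.c_uint64),       # TValue *
        ("maxstack", ctypes.c_uint64),  # MRef
        ("stack", ctypes.c_uint64),     # MRef
        ("openupval", ctypes.c_uint64), # GCRef
        ("env", ctypes.c_uint64),       # GCRef
        ("cframe", ctypes.c_uint64),    # void *
        ("stacksize", ctypes.c_uint32), # MSize
    ]


for name in ("glref", "base", "top", "stack", "cframe"):
    off = getattr(LuaState, name).offset
    print(f"{name}: byte offset {off:#x}, a1[{off // 8}]")
```

Under these assumptions glref, top, stack and cframe land at a1[2], a1[5], a1[7] and a1[10], matching the decompilation above, and base/top at offset 0x20 match the LDP X17, X0, [X23,#0x20] in the listing that follows.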

Following the JUMPOUT

The second JUMPOUT leads to the corresponding assembly. The fifth line from the end (LDR X8, [X9,#0xF30]) gives GG_G2DISP = 0xF30

.text:0000000000ACFE5C                               loc_ACFE5C                              ; CODE XREF: sub_ACFD34+5C↑j
.text:0000000000ACFE5C ; sub_ACFEB4+6C↓j
.text:0000000000ACFE5C D7 AE 00 F9 STR X23, [X22,#0x158]
.text:0000000000ACFE60 F1 02 42 A9 LDP X17, X0, [X23,#0x20]
.text:0000000000ACFE64 38 FF FF D2 MOV X24, #0xFFF9000000000000
.text:0000000000ACFE68 39 FF BF D2 MOV X25, #0xFFF90000
.text:0000000000ACFE6C B5 02 13 8B ADD X21, X21, X19
.text:0000000000ACFE70 1A 00 80 92 MOV X26, #0xFFFFFFFFFFFFFFFF
.text:0000000000ACFE74 B5 02 11 CB SUB X21, X21, X17
.text:0000000000ACFE78 1C 00 13 CB SUB X28, X0, X19
.text:0000000000ACFE7C DA 82 00 B9 STR W26, [X22,#0x80]
.text:0000000000ACFE7C
.text:0000000000ACFE80
.text:0000000000ACFE80 loc_ACFE80 ; CODE XREF: sub_ACDAF0+2750↓j
.text:0000000000ACFE80 ; .text:0000000000AD0644↓j
.text:0000000000ACFE80 ; .text:0000000000AD065C↓j
.text:0000000000ACFE80 ; .text:0000000000AD0690↓j
.text:0000000000ACFE80 ; sub_ACFCA8+1624↓j
.text:0000000000ACFE80 62 02 5F F8 LDUR X2, [X19,#-0x10]
.text:0000000000ACFE84 4F FC 6F 93 ASR X15, X2, #0x2F ; '/'
.text:0000000000ACFE88 FF 25 00 B1 CMN X15, #9
.text:0000000000ACFE8C 42 B8 40 92 AND X2, X2, #0x7FFFFFFFFFFF
.text:0000000000ACFE90 61 1E 00 54 B.NE loc_AD025C
.text:0000000000ACFE90
.text:0000000000ACFE94
.text:0000000000ACFE94 loc_ACFE94 ; CODE XREF: sub_ACEE20+1218↓j
.text:0000000000ACFE94 ; sub_ACEE20+12CC↓j
.text:0000000000ACFE94 75 82 1F F8 STUR X21, [X19,#-8]
.text:0000000000ACFE98 55 10 40 F9 LDR X21, [X2,#0x20]
.text:0000000000ACFE9C B0 46 40 B8 LDR W16, [X21],#4
.text:0000000000ACFEA0 C9 0E 30 8B ADD X9, X22, W16,UXTB#3
.text:0000000000ACFEA4 1B 3E 48 D3 UBFX X27, X16, #8, #8
.text:0000000000ACFEA8 28 99 47 F9 LDR X8, [X9,#0xF30]
.text:0000000000ACFEAC 7B 0E 1B 8B ADD X27, X19, X27,LSL#3
.text:0000000000ACFEB0 00 01 1F D6 BR X8
.text:0000000000ACFEB0
.text:0000000000ACFEB0 ; End of function sub_ACFE08

Looking at the opcodes

lj_bc.h defines 97 opcodes in total

_(ISLT,	var,	___,	var,	lt) \
_(ISGE, var, ___, var, lt) \
_(ISLE, var, ___, var, le) \
_(ISGT, var, ___, var, le) \
\
_(ISEQV, var, ___, var, eq) \
_(ISNEV, var, ___, var, eq) \
_(ISEQS, var, ___, str, eq) \
......

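Hooking the assembly to find all opcode handler addresses

Hooking instruction execution

My first idea was to hook instruction execution directly. That does not work because:

  • Executing instructions runs a huge amount of code with many repeats (the output of my hook looked exactly like that), so it is extremely inefficient
  • Some instructions are never executed, so not all of them can be captured

// Flawed script: it hooks the jump to a handler and reads the register at that point
{
var load_opcode = base.add(0xACFEB0);
Interceptor.attach(load_opcode, {
    onEnter: function(args) {
        var opcode_addr = this.context.x8;
        console.log(opcode_addr);
    },
    onLeave: function(retval) {}
})
}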

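Hook to grab the GL address, locate the dispatch table with the GG_G2DISP offset found earlier, and print it

There are plenty of possible hook points for this: the initialization code in lj_vm_cpcall() shown earlier uses GL many times. The register that holds GL is x22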

{
var load_opcode = base.add(0xACFEA0);
var GG_G2DISP = 0xF30;
Interceptor.attach(load_opcode, {
    onEnter: function(args) {
        var GL_addr = this.context.x22; // x22 holds GL, the global_State pointer
        var dispatch = GL_addr.add(GG_G2DISP); // start of the dispatch array
        for (var i = 0; i < 97; i++) {
            var p = dispatch.add(i * 8).readPointer(); // note: dispatch is stored inline in the struct as an array, so stepping by 8 reads each handler pointer in turn
            console.log("[dispatch]: ", p.sub(base)); // note: subtract the .so base address to get the file offset
        }
    },
    onLeave: function(retval) {}
})
}

This yields the file offset of each instruction's handler

[dispatch]:  0xacdaf0;
[dispatch]: 0xacdb70;
[dispatch]: 0xacdbf0;
[dispatch]: 0xacdc70;
[dispatch]: 0xacdcf0;
[dispatch]: 0xacdd74;
[dispatch]: 0xacddf4
[dispatch]: 0xacde44
[dispatch]: 0xacde94
[dispatch]: 0xacdf20
[dispatch]: 0xacdfac
[dispatch]: 0xacdff0
[dispatch]: 0xace034
[dispatch]: 0xace060
[dispatch]: 0xace08c
[dispatch]: 0xace0b0
[dispatch]: 0xace0d0
[dispatch]: 0xace0f0
[dispatch]: 0xace120
[dispatch]: 0xace160
[dispatch]: 0xace1a0
[dispatch]: 0xace1d8
[dispatch]: 0xace210
[dispatch]: 0xace234
[dispatch]: 0xace258
[dispatch]: 0xace278
[dispatch]: 0xace2a8
[dispatch]: 0xace2ec
[dispatch]: 0xace334
[dispatch]: 0xace348
[dispatch]: 0xace3f0
[dispatch]: 0xace458
[dispatch]: 0xace4c8
[dispatch]: 0xace534
[dispatch]: 0xace5a0
[dispatch]: 0xace614
[dispatch]: 0xace65c
[dispatch]: 0xace6d0
[dispatch]: 0xace73c
[dispatch]: 0xace7a8
[dispatch]: 0xace81c
[dispatch]: 0xace864
[dispatch]: 0xace8d8
[dispatch]: 0xace944
[dispatch]: 0xace9b0
[dispatch]: 0xacea24
[dispatch]: 0xacea6c
[dispatch]: 0xaceae0
[dispatch]: 0xaceb28
[dispatch]: 0xaceb74
[dispatch]: 0xaceba8
[dispatch]: 0xacec18
[dispatch]: 0xacec80
[dispatch]: 0xacecb4
[dispatch]: 0xacece8
[dispatch]: 0xaced24
[dispatch]: 0xaced6c
[dispatch]: 0xacedcc
[dispatch]: 0xacee20
[dispatch]: 0xacee38
[dispatch]: 0xacee50
[dispatch]: 0xaceedc
[dispatch]: 0xacef70
[dispatch]: 0xacefdc
[dispatch]: 0xacf024
[dispatch]: 0xacf0d4
[dispatch]: 0xacf1d0
[dispatch]: 0xacf260
[dispatch]: 0xacf2f4
[dispatch]: 0xacf35c
[dispatch]: 0xacf36c
[dispatch]: 0xacf3b4
[dispatch]: 0xacf3c0
[dispatch]: 0xacf47c
[dispatch]: 0xacf4cc
[dispatch]: 0xacf570
[dispatch]: 0xacf62c
[dispatch]: 0xacf6a4
[dispatch]: 0xacf734
[dispatch]: 0xacf7ec
[dispatch]: 0xacf7ec
[dispatch]: 0xacf878
[dispatch]: 0xacf918
[dispatch]: 0xacf918
[dispatch]: 0xacf94c
[dispatch]: 0xacf998
[dispatch]: 0xacf998
[dispatch]: 0xacf9b0
[dispatch]: 0xacf9d4
[dispatch]: 0xacfa10
[dispatch]: 0xacfa10
[dispatch]: 0xacfa50
[dispatch]: 0xacfa80
[dispatch]: 0xacfa80
[dispatch]: 0xacfb04
[dispatch]: 0xacfb08
[dispatch]: 0xacfb50

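Matching each address to its opcode

Use the addresses from the frida output to locate every handler, then recover each instruction by comparing it against build_ins() in vm_arm64.dasc. The match is made mainly on distinctive assembly sequences.

vm_arm64.dasc relies on many define'd symbols. Replacing them with the actual register names by search-and-replace makes the comparison with IDA much easier.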
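
The substitution step can be automated. The sketch below is a minimal Python illustration; the symbol-to-register mapping is read off the BC_ISLT example in this write-up, while `DASC_SYMBOLS` and `substitute_symbols` are hypothetical names and the table is not a complete list of the symbols in vm_arm64.dasc.

```python
import re

# Symbol -> register mapping, taken from the BC_ISLT handler in this write-up.
DASC_SYMBOLS = {
    "BASE": "x19", "PC": "x21",
    "RA": "x27", "RC": "x28",
    "RB": "x17", "RBw": "w17",
    "CARG1": "x0", "CARG2": "x1",
    "CARG1w": "w0", "CARG2w": "w1",
}

# Whole-word match, longest names first, so CARG1 never matches inside CARG1w.
_PATTERN = re.compile(
    r"\b("
    + "|".join(sorted(map(re.escape, DASC_SYMBOLS), key=len, reverse=True))
    + r")\b"
)


def substitute_symbols(line: str) -> str:
    """Replace every known dasc symbol in `line` with its register name."""
    return _PATTERN.sub(lambda m: DASC_SYMBOLS[m.group(1)], line)


print(substitute_symbols("| ldr CARG1, [BASE, RA, lsl #3]"))
```

Matching on word boundaries is the reason to prefer a small script over a plain editor replace: a naive replace of `RB` would also rewrite part of `RBw`.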

Example

The first address serves as the example.

Original vm_arm64.dasc code

case BC_ISLT: case BC_ISGE: case BC_ISLE: case BC_ISGT:
| // RA = src1, RC = src2, JMP with RC = target
| ldr CARG1, [BASE, RA, lsl #3]
| ldrh RBw, [PC, # OFS_RD]
| ldr CARG2, [BASE, RC, lsl #3]
| add PC, PC, #4
| add RB, PC, RB, lsl #2
| sub RB, RB, #0x20000
| checkint CARG1, >3
| checkint CARG2, >4
| cmp CARG1w, CARG2w
if (op == BC_ISLT) {
| csel PC, RB, PC, lt
} else if (op == BC_ISGE) {
| csel PC, RB, PC, ge
} else if (op == BC_ISLE) {
| csel PC, RB, PC, le
} else {
| csel PC, RB, PC, gt

dasc code after substitution

case BC_ISLT: case BC_ISGE: case BC_ISLE: case BC_ISGT:
| // x27 = src1, x28 = src2, JMP with x28 = target
| ldr x0, [x19, x27, lsl #3]
| ldrh w17, [x21, #2]
| ldr x1, [x19, x28, lsl #3]
| add x21, x21, #4
| add x17, x21, x17, lsl #2
| sub x17, x17, #0x20000
| checkint x0, >3
| checkint x1, >4
| cmp w0, w1
if (op == BC_ISLT) {
| csel x21, x17, x21, lt
} else if (op == BC_ISGE) {
| csel x21, x17, x21, ge
} else if (op == BC_ISLE) {
| csel x21, x17, x21, le
} else {
| csel x21, x17, x21, gt

IDA code

This is almost identical to the substituted code. The final instruction is the distinguishing one, and it confirms that this handler is BC_ISLT.

.text:0000000000ACDAF0                               ; __unwind { // AAD020
.text:0000000000ACDAF0 60 7A 7B F8 LDR X0, [X19,X27,LSL#3]
.text:0000000000ACDAF4 B1 06 40 79 LDRH W17, [X21,#2]
.text:0000000000ACDAF8 61 7A 7C F8 LDR X1, [X19,X28,LSL#3]
.text:0000000000ACDAFC B5 12 00 91 ADD X21, X21, #4
.text:0000000000ACDB00 B1 0A 11 8B ADD X17, X21, X17,LSL#2
.text:0000000000ACDB04 31 82 40 D1 SUB X17, X17, #0x20,LSL#12 ; ' '
.text:0000000000ACDB08 3F 83 40 EB CMP X25, X0,LSR#32
.text:0000000000ACDB0C 61 01 00 54 B.NE loc_ACDB38
.text:0000000000ACDB0C
.text:0000000000ACDB10 3F 83 41 EB CMP X25, X1,LSR#32
.text:0000000000ACDB14 21 02 00 54 B.NE loc_ACDB58
.text:0000000000ACDB14
.text:0000000000ACDB18 1F 00 01 6B CMP W0, W1
.text:0000000000ACDB1C 35 B2 95 9A CSEL X21, X17, X21, LT ; signature instruction
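
Since the four comparison handlers differ only in the condition of the final CSEL, that condition can also be read straight from the machine code. The sketch below is a minimal Python decoder for the 64-bit A64 CSEL encoding; `decode_csel` is a hypothetical helper name, and the input bytes are the ones shown in the IDA listing.

```python
import struct

# A64 condition codes, indexed by the 4-bit cond field.
CONDITIONS = ["eq", "ne", "cs", "cc", "mi", "pl", "vs", "vc",
              "hi", "ls", "ge", "lt", "gt", "le", "al", "nv"]


def _reg(n: int) -> str:
    return "xzr" if n == 31 else f"x{n}"


def decode_csel(raw: bytes):
    """Decode a 64-bit CSEL; return (rd, rn, rm, cond) or None if it is not one."""
    (word,) = struct.unpack("<I", raw)      # instructions are stored little-endian
    # Fixed bits of CSEL Xd, Xn, Xm, cond: bits 31..21 = 10011010100, bits 11..10 = 00
    if (word & 0xFFE00C00) != 0x9A800000:
        return None
    rd = word & 0x1F
    rn = (word >> 5) & 0x1F
    cond = (word >> 12) & 0xF
    rm = (word >> 16) & 0x1F
    return _reg(rd), _reg(rn), _reg(rm), CONDITIONS[cond]


print(decode_csel(bytes.fromhex("35B2959A")))   # bytes at 0xACDB1C
```

Running the same decoder on the last instruction of the next three handlers would be one way to tell BC_ISGE, BC_ISLE and BC_ISGT apart without reading each listing by hand.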

Recovering the opcode order

Points to watch:

  • Symbols defined in the code are handled by text substitution, as described earlier

  • Some constants are defined in other files; search globally and substitute them in the same way

  • .macro definitions, for example the very common ins_next (present in almost every instruction, so it can be used to find where a handler ends), mov_false and mov_true. Remember the first distinctive instruction of each so that you can recognize it.

  • Mind the difference between ~ (bitwise NOT) and arithmetic negation

  • Some instructions are defined inside a macro and then simply invoke that macro (for example BC_ADDVN, BC_ADDNV, BC_ADDVV)

    • As an aside, instructions of the BC_ADDVN kind are identified mainly by the adds and fadd arguments passed to the macro ins_arithdn adds, fadd; the vk generated from these two arguments identifies the specific instruction within the group.

[dispatch]: 0xacdaf0 = BC_ISLT
[dispatch]: 0xacdb70 = BC_ISGE
[dispatch]: 0xacdbf0 = BC_ISLE
[dispatch]: 0xacdc70 = BC_ISGT
[dispatch]: 0xacdcf0 = BC_ISEQV
[dispatch]: 0xacdd74 = BC_ISNEV
[dispatch]: 0xacddf4 = BC_ISEQS
[dispatch]: 0xacde44 = BC_ISNES
[dispatch]: 0xacde94 = BC_ISEQN
[dispatch]: 0xacdf20 = BC_ISNEN
[dispatch]: 0xacdfac = BC_ISEQP
[dispatch]: 0xacdff0 = BC_ISNEP
[dispatch]: 0xace034 = BC_KSTR (identifying this one requires the value of the LJ_TSTR constant, defined in lj_obj.h, which tags the type a pointer refers to, e.g. string or function)
[dispatch]: 0xace060 = BC_KCDATA
[dispatch]: 0xace08c = BC_KSHORT
[dispatch]: 0xace0b0 = BC_KNUM
[dispatch]: 0xace0d0 = BC_KPRI
[dispatch]: 0xace0f0 = BC_KNIL
[dispatch]: 0xace120 = BC_ISTC
[dispatch]: 0xace160 = BC_ISFC
[dispatch]: 0xace1a0 = BC_IST
[dispatch]: 0xace1d8 = BC_ISF
[dispatch]: 0xace210 = BC_ISTYPE
[dispatch]: 0xace234 = BC_ISNUM
[dispatch]: 0xace258 = BC_MOV
[dispatch]: 0xace278 = BC_NOT
[dispatch]: 0xace2a8 = BC_UNM
[dispatch]: 0xace2ec = BC_LEN
[dispatch]: 0xace334 = BC_RETM
[dispatch]: 0xace348 = BC_RET
[dispatch]: 0xace3f0 = BC_RET0
[dispatch]: 0xace458 = BC_RET1
[dispatch]: 0xace4c8 = BC_ADDVN
[dispatch]: 0xace534 = BC_SUBVN
[dispatch]: 0xace5a0 = BC_MULVN
[dispatch]: 0xace614 = BC_DIVVN
[dispatch]: 0xace65c = BC_MODVN
[dispatch]: 0xace6d0 = BC_ADDNV
[dispatch]: 0xace73c = BC_SUBNV
[dispatch]: 0xace7a8 = BC_MULNV
[dispatch]: 0xace81c = BC_DIVNV
[dispatch]: 0xace864 = BC_MODNV
[dispatch]: 0xace8d8 = BC_ADDVV
[dispatch]: 0xace944 = BC_SUBVV
[dispatch]: 0xace9b0 = BC_MULVV
[dispatch]: 0xacea24 = BC_DIVVV
[dispatch]: 0xacea6c = BC_MODVV
[dispatch]: 0xaceae0 = BC_POW
[dispatch]: 0xaceb28 = BC_CAT
[dispatch]: 0xaceb74 = BC_UGET
[dispatch]: 0xaceba8 = BC_USETV
[dispatch]: 0xacec18 = BC_USETS
[dispatch]: 0xacec80 = BC_USETN
[dispatch]: 0xacecb4 = BC_USETP
[dispatch]: 0xacece8 = BC_UCLO
[dispatch]: 0xaced24 = BC_FNEW
[dispatch]: 0xaced6c = BC_TNEW
[dispatch]: 0xacedcc = BC_TDUP
[dispatch]: 0xacee20 = BC_GGET
[dispatch]: 0xacee38 = BC_GSET
[dispatch]: 0xacee50 = BC_TGETV
[dispatch]: 0xaceedc = BC_TGETS
[dispatch]: 0xacef70 = BC_TGETB
[dispatch]: 0xacefdc = BC_TGETR
[dispatch]: 0xacf024 = BC_TSETV
[dispatch]: 0xacf0d4 = BC_TSETS
[dispatch]: 0xacf1d0 = BC_TSETB
[dispatch]: 0xacf260 = BC_TSETM
[dispatch]: 0xacf2f4 = BC_TSETR
[dispatch]: 0xacf35c = BC_CALLM
[dispatch]: 0xacf36c = BC_CALL
[dispatch]: 0xacf3b4 = BC_CALLMT
[dispatch]: 0xacf3c0 = BC_CALLT
[dispatch]: 0xacf47c = BC_ITERC
[dispatch]: 0xacf4cc = BC_ITERN
[dispatch]: 0xacf570 = BC_VARG
[dispatch]: 0xacf62c = BC_ISNEXT
[dispatch]: 0xacf6a4 = BC_FORI
[dispatch]: 0xacf734 = BC_JFORI
[dispatch]: 0xacf7ec = BC_IFORL (same address)
[dispatch]: 0xacf7ec = BC_IFORL (same address)
[dispatch]: 0xacf878 = BC_JFORL
[dispatch]: 0xacf918 = BC_IITERL (same address)
[dispatch]: 0xacf918 = BC_IITERL (same address)
[dispatch]: 0xacf94c = BC_JITERL
[dispatch]: 0xacf998 = ins_NEXT (same address)
[dispatch]: 0xacf998 = ins_NEXT (same address)
[dispatch]: 0xacf9b0 = BC_JLOOP
[dispatch]: 0xacf9d4 = BC_JMP
[dispatch]: 0xacfa10 = BC_IFUNCF (same address)
[dispatch]: 0xacfa10 = BC_IFUNCF (same address)
[dispatch]: 0xacfa50 = BC_JFUNCF
[dispatch]: 0xacfa80 = BC_IFUNCV (same address)
[dispatch]: 0xacfa80 = BC_IFUNCV (same address)
[dispatch]: 0xacfb04 = NYI
[dispatch]: 0xacfb08 = BC_FUNCC
[dispatch]: 0xacfb50 = BC_FUNCCW

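Instructions of the same kind generally sit next to each other and the order rarely changes, but the sheer volume makes the whole analysis time-consuming.

The result above differs slightly from the original write-up: there, only instructions 93 and 94 share an address, whereas here five pairs do.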
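
The shared entries can be found mechanically instead of by eye. The sketch below is a minimal Python illustration; `shared_handlers` is a hypothetical helper name, and `tail` holds the last 20 offsets of the dump, starting at opcode 0x4d (FORI).

```python
def shared_handlers(offsets):
    """Return (i, i + 1, offset) for neighbouring opcodes that share one handler address."""
    return [(i, i + 1, off)
            for i, (off, nxt) in enumerate(zip(offsets, offsets[1:]))
            if off == nxt]


TAIL_START = 0x4D   # opcode number of the first entry in `tail` (FORI)
tail = [
    0xacf6a4, 0xacf734, 0xacf7ec, 0xacf7ec, 0xacf878,
    0xacf918, 0xacf918, 0xacf94c, 0xacf998, 0xacf998,
    0xacf9b0, 0xacf9d4, 0xacfa10, 0xacfa10, 0xacfa50,
    0xacfa80, 0xacfa80, 0xacfb04, 0xacfb08, 0xacfb50,
]

pairs = [(TAIL_START + a, TAIL_START + b) for a, b, _ in shared_handlers(tail)]
print(pairs)   # [(79, 80), (82, 83), (85, 86), (89, 90), (92, 93)]
```

According to the recovered opcode order, each of the five pairs is a plain opcode followed by its I-prefixed variant: FORL/IFORL, ITERL/IITERL, LOOP/ILOOP, FUNCF/IFUNCF and FUNCV/IFUNCV.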

Decompiling the Lua files

I used luajit-decompiler, following the steps in the original write-up.

First modification

Search the whole project for the _OPCODES tuple and replace its contents with the following:

(0x0, instructions.ISLT),
(0x1, instructions.ISGE),
(0x2, instructions.ISLE),
(0x3, instructions.ISGT),
(0x4, instructions.ISEQV),
(0x5, instructions.ISNEV),
(0x6, instructions.ISEQS),
(0x7, instructions.ISNES),
(0x8, instructions.ISEQN),
(0x9, instructions.ISNEN),
(0xa, instructions.ISEQP),
(0xb, instructions.ISNEP),
(0xc, instructions.KSTR),
(0xd, instructions.KCDATA),
(0xe, instructions.KSHORT),
(0xf, instructions.KNUM),
(0x10, instructions.KPRI),
(0x11, instructions.KNIL),
(0x12, instructions.ISTC),
(0x13, instructions.ISFC),
(0x14, instructions.IST),
(0x15, instructions.ISF),
(0x16, instructions.ISTYPE),
(0x17, instructions.ISNUM),
(0x18, instructions.MOV),
(0x19, instructions.NOT),
(0x1a, instructions.UNM),
(0x1b, instructions.LEN),
(0x1c, instructions.RETM),
(0x1d, instructions.RET),
(0x1e, instructions.RET0),
(0x1f, instructions.RET1),
(0x20, instructions.ADDVN),
(0x21, instructions.SUBVN),
(0x22, instructions.MULVN),
(0x23, instructions.DIVVN),
(0x24, instructions.MODVN),
(0x25, instructions.ADDNV),
(0x26, instructions.SUBNV),
(0x27, instructions.MULNV),
(0x28, instructions.DIVNV),
(0x29, instructions.MODNV),
(0x2a, instructions.ADDVV),
(0x2b, instructions.SUBVV),
(0x2c, instructions.MULVV),
(0x2d, instructions.DIVVV),
(0x2e, instructions.MODVV),
(0x2f, instructions.POW),
(0x30, instructions.CAT),
(0x31, instructions.UGET),
(0x32, instructions.USETV),
(0x33, instructions.USETS),
(0x34, instructions.USETN),
(0x35, instructions.USETP),
(0x36, instructions.UCLO),
(0x37, instructions.FNEW),
(0x38, instructions.TNEW),
(0x39, instructions.TDUP),
(0x3a, instructions.GGET),
(0x3b, instructions.GSET),
(0x3c, instructions.TGETV),
(0x3d, instructions.TGETS),
(0x3e, instructions.TGETB),
(0x3f, instructions.TGETR),
(0x40, instructions.TSETV),
(0x41, instructions.TSETS),
(0x42, instructions.TSETB),
(0x43, instructions.TSETM),
(0x44, instructions.TSETR),
(0x45, instructions.CALLM),
(0x46, instructions.CALL),
(0x47, instructions.CALLMT),
(0x48, instructions.CALLT),
(0x49, instructions.ITERC),
(0x4a, instructions.ITERN),
(0x4b, instructions.VARG),
(0x4c, instructions.ISNEXT),
(0x4d, instructions.FORI),
(0x4e, instructions.JFORI),
(0x4f, instructions.FORL),
(0x50, instructions.IFORL),
(0x51, instructions.JFORL),
(0x52, instructions.ITERL),
(0x53, instructions.IITERL),
(0x54, instructions.JITERL),
(0x55, instructions.LOOP),
(0x56, instructions.ILOOP),
(0x57, instructions.JLOOP),
(0x58, instructions.JMP),
(0x59, instructions.FUNCF),
(0x5a, instructions.IFUNCF),
(0x5b, instructions.JFUNCF),
(0x5c, instructions.FUNCV),
(0x5d, instructions.IFUNCV),
(0x5e, instructions.JFUNCV),
(0x5f, instructions.FUNCC),
(0x60, instructions.FUNCCW)
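
Typing 97 entries by hand is error-prone, so the table can be generated from the recovered order instead. The sketch below is a minimal Python illustration; `opcode_table` is a hypothetical helper name, and only the first few opcode names are listed to keep it short.

```python
def opcode_table(names):
    """Render one '(0x.., instructions.NAME)' entry per opcode, numbered in list order."""
    return [f"(0x{index:x}, instructions.{name})" for index, name in enumerate(names)]


# First entries of the order recovered from the dispatch table; extend with the rest.
recovered = ["ISLT", "ISGE", "ISLE", "ISGT", "ISEQV", "ISNEV"]
print(",\n".join(opcode_table(recovered)))
```

The opcode number is simply the position in the dispatch table, which is why getting the handler order right in the previous step matters.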

Second modification

Edit ljd/bytecode/instructions.py, replacing everything from line 97 onward:

ISLT = _IDef("ISLT", T_VAR, None, T_VAR, "if {A} < {D}")
ISGE = _IDef("ISGE", T_VAR, None, T_VAR, "if {A} >= {D}")
ISLE = _IDef("ISLE", T_VAR, None, T_VAR, "if {A} <= {D}")
ISGT = _IDef("ISGT", T_VAR, None, T_VAR, "if {A} > {D}")

ISEQV = _IDef("ISEQV", T_VAR, None, T_VAR, "if {A} == {D}")
ISNEV = _IDef("ISNEV", T_VAR, None, T_VAR, "if {A} ~= {D}")

ISEQS = _IDef("ISEQS", T_VAR, None, T_STR, "if {A} == {D}")
ISNES = _IDef("ISNES", T_VAR, None, T_STR, "if {A} ~= {D}")

ISEQN = _IDef("ISEQN", T_VAR, None, T_NUM, "if {A} == {D}")
ISNEN = _IDef("ISNEN", T_VAR, None, T_NUM, "if {A} ~= {D}")

ISEQP = _IDef("ISEQP", T_VAR, None, T_PRI, "if {A} == {D}")
ISNEP = _IDef("ISNEP", T_VAR, None, T_PRI, "if {A} ~= {D}")

# Constant ops.

KSTR = _IDef("KSTR", T_DST, None, T_STR, "{A} = {D}")
KCDATA = _IDef("KCDATA", T_DST, None, T_CDT, "{A} = {D}")
KSHORT = _IDef("KSHORT", T_DST, None, T_SLIT, "{A} = {D}")
KNUM = _IDef("KNUM", T_DST, None, T_NUM, "{A} = {D}")
KPRI = _IDef("KPRI", T_DST, None, T_PRI, "{A} = {D}")

KNIL = _IDef("KNIL", T_BS, None, T_BS, "{from_A_to_D} = nil")

# Unary test and copy ops

ISTC = _IDef("ISTC", T_DST, None, T_VAR, "{A} = {D}; if {D}")
ISFC = _IDef("ISFC", T_DST, None, T_VAR, "{A} = {D}; if not {D}")

IST = _IDef("IST", None, None, T_VAR, "if {D}")
ISF = _IDef("ISF", None, None, T_VAR, "if not {D}")

ISTYPE = _IDef("ISTYPE", T_VAR, None, T_LIT, "ISTYPE unknow")
ISNUM = _IDef("ISNUM", T_VAR, None, T_LIT, "ISNUM unknow")
# Unary ops

MOV = _IDef("MOV", T_DST, None, T_VAR, "{A} = {D}")
NOT = _IDef("NOT", T_DST, None, T_VAR, "{A} = not {D}")
UNM = _IDef("UNM", T_DST, None, T_VAR, "{A} = -{D}")
LEN = _IDef("LEN", T_DST, None, T_VAR, "{A} = #{D}")

# Returns.

RETM = _IDef("RETM", T_BS, None, T_LIT,
"return {from_A_x_D_minus_one}, ...MULTRES")

RET = _IDef("RET", T_RBS, None, T_LIT,
"return {from_A_x_D_minus_two}")

RET0 = _IDef("RET0", T_RBS, None, T_LIT, "return")
RET1 = _IDef("RET1", T_RBS, None, T_LIT, "return {A}")

# Binary ops

ADDVN = _IDef("ADDVN", T_DST, T_VAR, T_NUM, "{A} = {B} + {C}")
SUBVN = _IDef("SUBVN", T_DST, T_VAR, T_NUM, "{A} = {B} - {C}")
MULVN = _IDef("MULVN", T_DST, T_VAR, T_NUM, "{A} = {B} * {C}")
DIVVN = _IDef("DIVVN", T_DST, T_VAR, T_NUM, "{A} = {B} / {C}")
MODVN = _IDef("MODVN", T_DST, T_VAR, T_NUM, "{A} = {B} % {C}")

ADDNV = _IDef("ADDNV", T_DST, T_VAR, T_NUM, "{A} = {C} + {B}")
SUBNV = _IDef("SUBNV", T_DST, T_VAR, T_NUM, "{A} = {C} - {B}")
MULNV = _IDef("MULNV", T_DST, T_VAR, T_NUM, "{A} = {C} * {B}")
DIVNV = _IDef("DIVNV", T_DST, T_VAR, T_NUM, "{A} = {C} / {B}")
MODNV = _IDef("MODNV", T_DST, T_VAR, T_NUM, "{A} = {C} % {B}")

ADDVV = _IDef("ADDVV", T_DST, T_VAR, T_VAR, "{A} = {B} + {C}")
SUBVV = _IDef("SUBVV", T_DST, T_VAR, T_VAR, "{A} = {B} - {C}")
MULVV = _IDef("MULVV", T_DST, T_VAR, T_VAR, "{A} = {B} * {C}")
DIVVV = _IDef("DIVVV", T_DST, T_VAR, T_VAR, "{A} = {B} / {C}")
MODVV = _IDef("MODVV", T_DST, T_VAR, T_VAR, "{A} = {B} % {C}")

POW = _IDef("POW", T_DST, T_VAR, T_VAR, "{A} = {B} ^ {C} (pow)")
CAT = _IDef("CAT", T_DST, T_RBS, T_RBS,
"{A} = {concat_from_B_to_C}")

# Upvalue and function ops.

UGET = _IDef("UGET", T_DST, None, T_UV, "{A} = {D}")

USETV = _IDef("USETV", T_UV, None, T_VAR, "{A} = {D}")
USETS = _IDef("USETS", T_UV, None, T_STR, "{A} = {D}")
USETN = _IDef("USETN", T_UV, None, T_NUM, "{A} = {D}")
USETP = _IDef("USETP", T_UV, None, T_PRI, "{A} = {D}")

UCLO = _IDef("UCLO", T_RBS, None, T_JMP,
"nil uvs >= {A}; goto {D}")

FNEW = _IDef("FNEW", T_DST, None, T_FUN, "{A} = function {D}")

# Table ops.

TNEW = _IDef("TNEW", T_DST, None, T_LIT, "{A} = new table("
" array: {D_array},"
" dict: {D_dict})")

TDUP = _IDef("TDUP", T_DST, None, T_TAB, "{A} = copy {D}")

GGET = _IDef("GGET", T_DST, None, T_STR, "{A} = _env[{D}]")
GSET = _IDef("GSET", T_VAR, None, T_STR, "_env[{D}] = {A}")

TGETV = _IDef("TGETV", T_DST, T_VAR, T_VAR, "{A} = {B}[{C}]")
TGETS = _IDef("TGETS", T_DST, T_VAR, T_STR, "{A} = {B}.{C}")
TGETB = _IDef("TGETB", T_DST, T_VAR, T_LIT, "{A} = {B}[{C}]")
TGETR = _IDef("TGETR", T_DST, T_VAR, T_VAR, "unkown TGETR")

TSETV = _IDef("TSETV", T_VAR, T_VAR, T_VAR, "{B}[{C}] = {A}")
TSETS = _IDef("TSETS", T_VAR, T_VAR, T_STR, "{B}.{C} = {A}")
TSETB = _IDef("TSETB", T_VAR, T_VAR, T_LIT, "{B}[{C}] = {A}")

TSETM = _IDef("TSETM", T_BS, None, T_NUM,
"for i = 0, MULTRES, 1 do"
" {A_minus_one}[{D_low} + i] = slot({A} + i)")

TSETR = _IDef("TSETR", T_VAR, T_VAR, T_VAR, "unkow TSETR")
# Calls and vararg handling. T = tail call.

CALLM = _IDef("CALLM", T_BS, T_LIT, T_LIT,
"{from_A_x_B_minus_two} = {A}({from_A_plus_one_x_C}, ...MULTRES)")

CALL = _IDef("CALL", T_BS, T_LIT, T_LIT,
"{from_A_x_B_minus_two} = {A}({from_A_plus_one_x_C_minus_one})")

CALLMT = _IDef("CALLMT", T_BS, None, T_LIT,
"return {A}({from_A_plus_one_x_D}, ...MULTRES)")

CALLT = _IDef("CALLT", T_BS, None, T_LIT,
"return {A}({from_A_plus_one_x_D_minus_one})")

ITERC = _IDef("ITERC", T_BS, T_LIT, T_LIT,
"{A}, {A_plus_one}, {A_plus_two} ="
" {A_minus_three}, {A_minus_two}, {A_minus_one};"
" {from_A_x_B_minus_two} ="
" {A_minus_three}({A_minus_two}, {A_minus_one})")

ITERN = _IDef("ITERN", T_BS, T_LIT, T_LIT,
"{A}, {A_plus_one}, {A_plus_two} ="
" {A_minus_three}, {A_minus_two}, {A_minus_one};"
" {from_A_x_B_minus_two} ="
" {A_minus_three}({A_minus_two}, {A_minus_one})")

VARG = _IDef("VARG", T_BS, T_LIT, T_LIT,
"{from_A_x_B_minus_two} = ...")

ISNEXT = _IDef("ISNEXT", T_BS, None, T_JMP,
"Verify ITERN at {D}; goto {D}")

# Loops and branches. I/J = interp/JIT, I/C/L = init/call/loop.

FORI = _IDef("FORI", T_BS, None, T_JMP,
"for {A_plus_three} = {A},{A_plus_one},{A_plus_two}"
" else goto {D}")

JFORI = _IDef("JFORI", T_BS, None, T_JMP,
"for {A_plus_three} = {A},{A_plus_one},{A_plus_two}"
" else goto {D}")

FORL = _IDef("FORL", T_BS, None, T_JMP,
"{A} = {A} + {A_plus_two};"
" if cmp({A}, sign {A_plus_two}, {A_plus_one}) goto {D}")

IFORL = _IDef("IFORL", T_BS, None, T_JMP,
"{A} = {A} + {A_plus_two};"
" if cmp({A}, sign {A_plus_two}, {A_plus_one}) goto {D}")

JFORL = _IDef("JFORL", T_BS, None, T_JMP,
"{A} = {A} + {A_plus_two};"
" if cmp({A}, sign {A_plus_two}, {A_plus_one}) goto {D}")

ITERL = _IDef("ITERL", T_BS, None, T_JMP,
"{A_minus_one} = {A}; if {A} != nil goto {D}")

IITERL = _IDef("IITERL", T_BS, None, T_JMP,
"{A_minus_one} = {A}; if {A} != nil goto {D}")

JITERL = _IDef("JITERL", T_BS, None, T_LIT,
"{A_minus_one} = {A}; if {A} != nil goto {D}")

LOOP = _IDef("LOOP", T_RBS, None, T_JMP, "Noop")
ILOOP = _IDef("ILOOP", T_RBS, None, T_JMP, "Noop")
JLOOP = _IDef("JLOOP", T_RBS, None, T_LIT, "Noop")

JMP = _IDef("JMP", T_RBS, None, T_JMP, " goto {D}")

# Function headers. I/J = interp/JIT, F/V/C = fixarg/vararg/C func.
# Shouldn't be ever seen - they are not stored in raw dump?

FUNCF = _IDef("FUNCF", T_RBS, None, None,
"Fixed-arg function with frame size {A}")

IFUNCF = _IDef("IFUNCF", T_RBS, None, None,
"Interpreted fixed-arg function with frame size {A}")

JFUNCF = _IDef("JFUNCF", T_RBS, None, T_LIT,
"JIT compiled fixed-arg function with frame size {A}")

FUNCV = _IDef("FUNCV", T_RBS, None, None,
"Var-arg function with frame size {A}")

IFUNCV = _IDef("IFUNCV", T_RBS, None, None,
"Interpreted var-arg function with frame size {A}")

JFUNCV = _IDef("JFUNCV", T_RBS, None, T_LIT,
"JIT compiled var-arg function with frame size {A}")

FUNCC = _IDef("FUNCC", T_RBS, None, None,
"C function with frame size {A}")
FUNCCW = _IDef("FUNCCW", T_RBS, None, None,
"Wrapped C function with frame size {A}")

UNKNW = _IDef("UNKNW", T_LIT, T_LIT, T_LIT, "Unknown instruction")

Third modification

Edit ljd/ast/builder.py:

diff -urNa ljd-old/ast/builder.py ljd-new/ast/builder.py
--- ljd-old/ast/builder.py 2020-05-09 03:43:27.000000000 -0700
+++ ljd-new/ast/builder.py 2022-09-20 02:04:53.955771000 -0700
@@ -276,7 +276,7 @@
last = instructions[-1]
opcode = 256 if len(instructions) == 1 else instructions[-2].opcode

- if opcode <= ins.ISF.opcode:
+ if opcode in (ins.ISLT.opcode, ins.ISGE.opcode, ins.ISLE.opcode, ins.ISGT.opcode, ins.ISEQV.opcode, ins.ISNEV.opcode, ins.ISEQS.opcode, ins.ISNES.opcode, ins.ISEQN.opcode, ins.ISNEN.opcode, ins.ISEQP.opcode, ins.ISNEP.opcode, ins.ISTC.opcode, ins.ISFC.opcode, ins.IST.opcode, ins.ISF.opcode):
assert last.opcode != ins.ISNEXT.opcode
return _build_conditional_warp(state, last_addr, instructions)
else:
@@ -507,7 +507,7 @@
expression = _build_unary_expression(state, addr, instruction)

# Binary assignment operators (A = B op C)
- elif opcode <= ins.POW.opcode:
+ elif ins.ADDVN.opcode <= opcode <= ins.POW.opcode:
expression = _build_binary_expression(state, addr, instruction)

# Concat assignment type (A = B .. B + 1 .. ... .. C - 1 .. C)
@@ -515,7 +515,7 @@
expression = _build_concat_expression(state, addr, instruction)

# Constant assignment operators except KNIL, which is weird anyway
- elif opcode <= ins.KPRI.opcode:
+ elif ins.KSTR.opcode <= opcode <= ins.KPRI.opcode:
expression = _build_const_expression(state, addr, instruction)

elif opcode == ins.UGET.opcode:
@@ -524,7 +524,7 @@
elif opcode == ins.USETV.opcode:
expression = _build_slot(state, addr, instruction.CD)

- elif opcode <= ins.USETP.opcode:
+ elif ins.USETS.opcode <= opcode <= ins.USETP.opcode:
expression = _build_const_expression(state, addr, instruction)

elif opcode == ins.FNEW.opcode:
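
The patch swaps open-ended range checks for explicit ones because a check such as `opcode <= ins.ISF.opcode` only works when every smaller opcode is a comparison. In the order recovered for this game, KSTR through KNIL (0xc to 0x11) sit below ISF (0x15). The sketch below is a minimal Python illustration using the numbers from the _OPCODES table; `is_conditional_old` and `is_conditional_new` are hypothetical names, not functions from the decompiler.

```python
# Opcode numbers from the recovered _OPCODES table in this write-up.
OPCODES = {"ISLT": 0x00, "ISNEP": 0x0B, "KSTR": 0x0C, "KNIL": 0x11,
           "ISTC": 0x12, "ISF": 0x15}

# ISLT..ISNEP plus ISTC, ISFC, IST, ISF: the same set the patched check enumerates.
CONDITIONALS = set(range(OPCODES["ISLT"], OPCODES["ISNEP"] + 1)) | \
               set(range(OPCODES["ISTC"], OPCODES["ISF"] + 1))


def is_conditional_old(opcode: int) -> bool:
    """Original check: assumes everything up to ISF is a comparison."""
    return opcode <= OPCODES["ISF"]


def is_conditional_new(opcode: int) -> bool:
    """Patched check: explicit membership, independent of opcode order."""
    return opcode in CONDITIONALS


print(is_conditional_old(OPCODES["KSTR"]), is_conditional_new(OPCODES["KSTR"]))  # True False
```

With the old check a constant load like KSTR would be treated as a conditional jump, which is the kind of misclassification the diff avoids.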

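Then run the decompiler with the arguments given in its readme.

Analyzing the Lua code

GameScene.lua

collisionH()

This file is analyzed first because it is only loaded when we finally bump the question-mark block, which suggests it contains the final check.

The analysis turns up the logic that finally wins the game. The deciding calls are the two member functions onWin() and doMarioDie(). It appears that when Mario finally touches the Goomba, he wins if he is in the NORMAL state and dies otherwise.

function slot0.collisionH(slot0)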
if slot0.curPos.x <= slot0.marioSize.width / 2 then
slot2.x = slot1
end

if slot2.x >= slot0.mapSize.width - slot1 then
slot2.x = slot0.mapSize.width - slot1
end

if (slot0.mario.face == MarioState.RIGHT or slot3 == MarioState.JUMP_RIGHT) and slot0.mainMap.enemyList[1] and slot4.position.x <= slot2.x + slot0.marioSize.width and slot2.y <= slot4.position.y + 10 then
if slot0.mario.bodyType == MarioBodyState.NORMAL then -- Mario's body state is NORMAL
slot0.win = true

slot0.mainMap:onWin() -- win
else
slot0:doMarioDie() -- die

return
end
end

if slot0.mainMap.flagPos.x <= slot2.x + slot0.marioSize.width then
slot0:stopForWin()

return
end

slot0.curPos = slot2

slot0.mario:setPosition(slot2)
end

slot0.collisionV()

Since the final outcome depends on slot0.mario.bodyType, the next step is to look for functions that touch this field. Only slot0.collisionV() does.

It contains two key pieces of logic:

  • slot0.mario:changeForGotMushroom()
  • slot0.mainMap:breakBrick(slot8, slot0.mario.bodyType) (the logic tied to this field)

function slot0.collisionV(slot0)
if slot0.curPos.y <= 0 then -- fell off the cliff
slot0:doMarioDie() -- die

return
end

.....

for slot6 = 6, slot0.marioSize.width - 7 do
-- if Mario ate a mushroom
if slot0.mainMap:isMarioEatMushroom(slot0.mainMap:pointToTileCoord(cc.p(slot1.x - slot0.marioSize.width / 2 + slot6, slot1.y + slot0.marioSize.height))) then
-- call this function
slot0.mario:changeForGotMushroom()
cc.SimpleAudioEngine:getInstance():playEffect(music_eatmushroomOrFlower)
end

slot9 = cc.p(slot1.x, slot0.mainMap:tileCoordToPoint(slot8).y - slot0.marioSize.height)
slot11 = false

if TileType.BRICK == slot0.mainMap:tileTypeAtCoord(slot8) or TileType.LAND == slot10 or TileType.INPUT == slot10 then
if slot0.jumpOffset > 0 then
slot0.mainMap:breakBrick(slot8, slot0.mario.bodyType) -- the logic tied to this field

slot0.jumpOffset = 0
slot0.curPos = slot9

slot0.mario:setPosition(slot9)

slot11 = true
end
elseif TileType.TRAP == slot10 then
slot0:doMarioDie()
end

.....
end

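slot0.breakBrick()

I looked up changeForGotMushroom() first, but that led nowhere.

Searching for breakBrick() next turned up its definition.

Key call:

  • slot0:showNewMushroom(slot1, slot2): makes the mushroom appear

Starting from that call, work out the conditions needed to reach it:

  • The first-level if must take the elseif branch

  • Second level:

    • slot5 collects the input and converts it to a string
    • slot5 is passed to the create() method of core.Util, which returns slot6
    • slot6:OoO() is called (note that it differs from the function below: the letter case is inverted, so these are two separate functions)
    • The return value of slot6:oOo() decides whether slot0:showNewMushroom(slot1, slot2) runs

function slot0.breakBrick(slot0, slot1, slot2)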
if slot0:getPropertiesForGID(slot0.brickLayer:getTileGIDAt(slot1)) == 0 then
slot0:chageBlockState(slot1)
elseif slot4 == 601 and slot0:itemInList(slot0.mushroomCoordList, slot1) then
for slot9 = 0, 31 do
slot5 = "" .. slot0.labelList["input" .. tostring(slot9)]:getString()
end

slot6 = require("core.Util").create(slot5)

slot6:OoO()

if slot6:oOo() then
slot0:showNewMushroom(slot1, slot2)
slot0:removeItem(slot0.mushroomCoordList, slot1)
end
end
end

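Util.create()

Following the logic above, a global search locates create().

note: Lua array indices start at 1, so index conversion needs attention when reimplementing the VM. There are two cases:

  • Relative offset: for example, the slot6 - 32 in slot1.slot[2][slot6] = string.byte(slot0, slot6 - 32) is a relative offset and needs no change
  • Absolute index: for example, the [slot6 + 96] in slot1.slot[2][slot6 + 96] = slot2[slot6 - 32] is an absolute index and must be decremented by one

Logic: judging from slot2, this is a virtual machine. create() mainly performs initialization; the execution logic lives in OoO() below.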
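
The index conversion described in the note can be made concrete with a 0-based version of the memory setup. The sketch below is a minimal Python illustration, assuming a 32-byte input and the 32-entry slot2 table; `init_memory` is a hypothetical name, and the loop stops after 32 entries because both sources are exhausted at that point.

```python
def init_memory(user_input: bytes, table: list) -> list:
    """0-based rebuild of the slot[2] array that create() fills in."""
    mem = [0] * 256                       # Lua: slot[2][1..256] = 0
    for i in range(32, 64):               # Lua loop variable starts at 33; shifted down by one
        k = i - 32                        # relative offset: same expression as in the Lua code
        if k < len(user_input):
            mem[i] = user_input[k]        # Lua: slot[2][slot6]
        if k < len(table):
            mem[i + 96] = table[k]        # Lua: slot[2][slot6 + 96], i.e. 129.. becomes 128..
    return mem


mem = init_memory(b"AB", [94, 106])
print(mem[32], mem[33], mem[128], mem[129])   # 65 66 94 106
```

In 0-based terms the input therefore occupies cells 32 to 63 and the constant table cells 128 to 159.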

function slot0.create(slot0)
slot2 = {
94,
106,
91,
110,
86,
100,
82,
20,
32,
20,
80,
21,
83,
107,
88,
98,
81,
19,
79,
10,
49,
117,
68,
120,
61,
13,
75,
115,
48,
8,
76,
123
}
uv0.new().slot[1] = {
65,
30,
37,
10,
50,
0,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
0,
50,
191,
37,
10,
50,
1,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
1,
50,
192,
37,
10,
50,
2,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
2,
50,
193,
37,
10,
50,
3,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
3,
50,
194,
37,
10,
50,
4,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
4,
50,
195,
37,
10,
50,
5,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
5,
50,
196,
37,
10,
50,
6,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
6,
50,
197,
37,
10,
50,
7,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
7,
50,
198,
37,
10,
50,
8,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
8,
50,
199,
37,
10,
50,
9,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
9,
50,
200,
37,
10,
50,
10,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
10,
50,
201,
37,
10,
50,
11,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
11,
50,
202,
37,
10,
50,
12,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
12,
50,
203,
37,
10,
50,
13,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
13,
50,
204,
37,
10,
50,
14,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
14,
50,
205,
37,
10,
50,
15,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
15,
50,
206,
37,
10,
50,
16,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
16,
50,
207,
37,
10,
50,
17,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
17,
50,
208,
37,
10,
50,
18,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
18,
50,
209,
37,
10,
50,
19,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
19,
50,
210,
37,
10,
50,
20,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
20,
50,
211,
37,
10,
50,
21,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
21,
50,
212,
37,
10,
50,
22,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
22,
50,
213,
37,
10,
50,
23,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
23,
50,
214,
37,
10,
50,
24,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
24,
50,
215,
37,
10,
50,
25,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
25,
50,
216,
37,
10,
50,
26,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
26,
50,
217,
37,
10,
50,
27,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
27,
50,
218,
37,
10,
50,
28,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
28,
50,
219,
37,
10,
50,
29,
37,
11,
36,
10,
11,
33,
10,
66,
10,
49,
29,
50,
220,
37,
10,
50,
30,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
30,
50,
221,
37,
10,
50,
31,
37,
11,
36,
10,
11,
34,
10,
66,
10,
49,
31,
144,
144,
144,
144
}

for slot6 = 1, 256 do
slot1.slot[2][slot6] = 0
end

for slot6 = 33, 65 do
slot1.slot[2][slot6] = string.byte(slot0, slot6 - 32) -- slot0 is the input string passed in
slot1.slot[2][slot6 + 96] = slot2[slot6 - 32]
end

slot1.slot[3] = 0
slot1.slot[4] = 0
slot1.slot[5] = 1
slot1.slot[6] = 0

return slot1
end

function slot0.OoO(slot0)
while true do
opcode = slot0.slot[1][slot0.slot[5]]

if opcode == 17 then
slot1 = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[1][33 + slot1] = slot0:lil(slot0.slot[1][33 + slot1] + 1, 255)
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 33 then
slot1 = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[slot1 - 7] = slot0:lil(slot0.slot[slot1 - 7] + 1, 255)
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 65 then
slot0.slot[6] = slot0.slot[6] + 1
slot0.slot[2][slot0.slot[6]] = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 18 then
slot1 = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[1][33 + slot1] = slot0:lil(slot0.slot[1][33 + slot1] - 1, 255)
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 35 then
slot1 = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[slot1 - 7] = slot0:ili(slot0.slot[slot1 - 7], slot0.slot[1][slot0.slot[5] + 2])
slot0.slot[5] = slot0.slot[5] + 3
elseif opcode == 50 then
slot0.slot[6] = slot0.slot[6] + 1
slot0.slot[2][slot0.slot[6]] = slot0.slot[2][33 + slot0.slot[1][slot0.slot[5] + 1]]
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 36 then
slot1 = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[slot1 - 7] = slot0:ili(slot0.slot[slot1 - 7], slot0.slot[slot0.slot[1][slot0.slot[5] + 2] - 7])
slot0.slot[5] = slot0.slot[5] + 3
elseif opcode == 49 then
slot0.slot[2][224 + slot0.slot[1][slot0.slot[5] + 1]] = slot0.slot[2][slot0.slot[6]]
slot0.slot[6] = slot0.slot[6] - 1
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 34 then
slot1 = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[slot1 - 7] = slot0:lil(slot0.slot[slot1 - 7] - 1, 255)
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 66 then
slot0.slot[6] = slot0.slot[6] + 1
slot0.slot[2][slot0.slot[6]] = slot0.slot[slot0.slot[1][slot0.slot[5] + 1] - 7]
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 37 then
slot0.slot[slot0.slot[1][slot0.slot[5] + 1] - 7] = slot0.slot[2][slot0.slot[6]]
slot0.slot[6] = slot0.slot[6] - 1
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 144 then
return
else
break
end
end
end

slot0.OoO()

Points to watch during analysis:

  • When reimplementing in C++, lil(arg0, 255) can be dropped: the function mainly guards against overflow, and C++ can simply use unsigned char
  • When renaming through text substitution, do not replace everything at once. The decompiler reuses the slotX names across functions, so a blanket replace also renames variables in other functions; it is best to rename them one at a time

function slot0.OoO(slot0)
while true do
opcode = slot0.slot[1][slot0.slot[5]]

if opcode == 17 then
slot1 = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[1][33 + slot1] = slot0:lil(slot0.slot[1][33 + slot1] + 1, 255)
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 33 then
slot1 = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[slot1 - 7] = slot0:lil(slot0.slot[slot1 - 7] + 1, 255)
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 65 then
slot0.slot[6] = slot0.slot[6] + 1
slot0.slot[2][slot0.slot[6]] = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 18 then
slot1 = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[1][33 + slot1] = slot0:lil(slot0.slot[1][33 + slot1] - 1, 255)
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 35 then
slot1 = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[slot1 - 7] = slot0:ili(slot0.slot[slot1 - 7], slot0.slot[1][slot0.slot[5] + 2])
slot0.slot[5] = slot0.slot[5] + 3
elseif opcode == 50 then
slot0.slot[6] = slot0.slot[6] + 1
slot0.slot[2][slot0.slot[6]] = slot0.slot[2][33 + slot0.slot[1][slot0.slot[5] + 1]]
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 36 then
slot1 = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[slot1 - 7] = slot0:ili(slot0.slot[slot1 - 7], slot0.slot[slot0.slot[1][slot0.slot[5] + 2] - 7])
slot0.slot[5] = slot0.slot[5] + 3
elseif opcode == 49 then
slot0.slot[2][224 + slot0.slot[1][slot0.slot[5] + 1]] = slot0.slot[2][slot0.slot[6]]
slot0.slot[6] = slot0.slot[6] - 1
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 34 then
slot1 = slot0.slot[1][slot0.slot[5] + 1]
slot0.slot[slot1 - 7] = slot0:lil(slot0.slot[slot1 - 7] - 1, 255)
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 66 then
slot0.slot[6] = slot0.slot[6] + 1
slot0.slot[2][slot0.slot[6]] = slot0.slot[slot0.slot[1][slot0.slot[5] + 1] - 7]
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 37 then
slot0.slot[slot0.slot[1][slot0.slot[5] + 1] - 7] = slot0.slot[2][slot0.slot[6]]
slot0.slot[6] = slot0.slot[6] - 1
slot0.slot[5] = slot0.slot[5] + 2
elseif opcode == 144 then
return
else
break
end
end
end
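The dispatch loop above advances slot[5] by a fixed amount per opcode, so the instruction lengths can be collected into a small table. The following is a sketch rather than part of the original solution: `instruction_length` and `count_instructions` are hypothetical helpers, and treating opcode 144 as a one-byte instruction is an assumption made only so the walk can terminate.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Encoded length in bytes of one instruction, or 0 for an unknown opcode.
inline std::size_t instruction_length(std::uint8_t opcode) {
    switch (opcode) {
        case 17: case 18:            // inc/dec a byte at a fixed offset
        case 33: case 34:            // inc/dec a register
        case 37: case 49:            // pop to register / pop to memory
        case 50: case 65: case 66:   // push memory / immediate / register
            return 2;
        case 35: case 36:            // xor register with immediate / register
            return 3;
        case 144:                    // return (length 1 is an assumption)
            return 1;
        default:
            return 0;
    }
}

// Counts instructions up to and including the first return opcode.
inline std::size_t count_instructions(const std::vector<std::uint8_t>& code) {
    std::size_t pc = 0;
    std::size_t count = 0;
    while (pc < code.size()) {
        const std::size_t len = instruction_length(code[pc]);
        if (len == 0) break;          // unknown opcode: stop, like the Lua `else break`
        ++count;
        if (code[pc] == 144) break;   // return opcode ends the walk
        pc += len;
    }
    return count;
}
```

A table like this is enough to split the bytecode into instructions before deciding what each one does.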

Reconstructing the VM

The VM logic was reimplemented in C++, split into:

  • main.cpp
  • vm.h
  • vm.cpp

main.cpp

#include "vm.h"
#include <iostream>

using std::cout, std::cin, std::endl;

int main() {
char input[] = "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa";
VM vm0(input);
vm0.OoO_run();
}

vm.h

#ifndef VM_H_
#define VM_H_

class VM {
private:
unsigned char ins_values[32] = {94,106,91,110,86,100,82,20,32,20,
80,21,83,107,88,98,81,19,79,10,
49,117,68,120,61,13,75,115,48,8,
76,123};
unsigned char opcodes[548] = {
65, 30, 37, 10, 50, 0, 37, 11, 36, 10,
11, 34, 10, 66, 10, 49, 0, 50, 191, 37,
10, 50, 1, 37, 11, 36, 10, 11, 33, 10,
66, 10, 49, 1, 50, 192, 37, 10, 50, 2,
37, 11, 36, 10, 11, 34, 10, 66, 10, 49,
2, 50, 193, 37, 10, 50, 3, 37, 11, 36,
10, 11, 33, 10, 66, 10, 49, 3, 50, 194,
37, 10, 50, 4, 37, 11, 36, 10, 11, 34,
10, 66, 10, 49, 4, 50, 195, 37, 10, 50,
5, 37, 11, 36, 10, 11, 33, 10, 66, 10,
49, 5, 50, 196, 37, 10, 50, 6, 37, 11,
36, 10, 11, 34, 10, 66, 10, 49, 6, 50,
197, 37, 10, 50, 7, 37, 11, 36, 10, 11,
33, 10, 66, 10, 49, 7, 50, 198, 37, 10,
50, 8, 37, 11, 36, 10, 11, 34, 10, 66,
10, 49, 8, 50, 199, 37, 10, 50, 9, 37,
11, 36, 10, 11, 33, 10, 66, 10, 49, 9,
50, 200, 37, 10, 50, 10, 37, 11, 36, 10,
11, 34, 10, 66, 10, 49, 10, 50, 201, 37,
10, 50, 11, 37, 11, 36, 10, 11, 33, 10,
66, 10, 49, 11, 50, 202, 37, 10, 50, 12,
37, 11, 36, 10, 11, 34, 10, 66, 10, 49,
12, 50, 203, 37, 10, 50, 13, 37, 11, 36,
10, 11, 33, 10, 66, 10, 49, 13, 50, 204,
37, 10, 50, 14, 37, 11, 36, 10, 11, 34,
10, 66, 10, 49, 14, 50, 205, 37, 10, 50,
15, 37, 11, 36, 10, 11, 33, 10, 66, 10,
49, 15, 50, 206, 37, 10, 50, 16, 37, 11,
36, 10, 11, 34, 10, 66, 10, 49, 16, 50,
207, 37, 10, 50, 17, 37, 11, 36, 10, 11,
33, 10, 66, 10, 49, 17, 50, 208, 37, 10,
50, 18, 37, 11, 36, 10, 11, 34, 10, 66,
10, 49, 18, 50, 209, 37, 10, 50, 19, 37,
11, 36, 10, 11, 33, 10, 66, 10, 49, 19,
50, 210, 37, 10, 50, 20, 37, 11, 36, 10,
11, 34, 10, 66, 10, 49, 20, 50, 211, 37,
10, 50, 21, 37, 11, 36, 10, 11, 33, 10,
66, 10, 49, 21, 50, 212, 37, 10, 50, 22,
37, 11, 36, 10, 11, 34, 10, 66, 10, 49,
22, 50, 213, 37, 10, 50, 23, 37, 11, 36,
10, 11, 33, 10, 66, 10, 49, 23, 50, 214,
37, 10, 50, 24, 37, 11, 36, 10, 11, 34,
10, 66, 10, 49, 24, 50, 215, 37, 10, 50,
25, 37, 11, 36, 10, 11, 33, 10, 66, 10,
49, 25, 50, 216, 37, 10, 50, 26, 37, 11,
36, 10, 11, 34, 10, 66, 10, 49, 26, 50,
217, 37, 10, 50, 27, 37, 11, 36, 10, 11,
33, 10, 66, 10, 49, 27, 50, 218, 37, 10,
50, 28, 37, 11, 36, 10, 11, 34, 10, 66,
10, 49, 28, 50, 219, 37, 10, 50, 29, 37,
11, 36, 10, 11, 33, 10, 66, 10, 49, 29,
50, 220, 37, 10, 50, 30, 37, 11, 36, 10,
11, 34, 10, 66, 10, 49, 30, 50, 221, 37,
10, 50, 31, 37, 11, 36, 10, 11, 34, 10,
66, 10, 49, 31, 144, 144, 144, 144
};

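unsigned char memory[256] = {0}; // the first 32 bytes are the stack, the next 33 bytes hold the input, a gap follows, ins_values is copied to memory[i + 95] by the constructor, and the last 33 bytes (from index 223) store the output
unsigned char eax; // slot[3]; judged to be a general-purpose register: it is only ever selected through an instruction operand and never referenced directly (unlike SP, which appears directly in vm.cpp with a clear role)
unsigned char ebx; // slot[4]; judged to be a general-purpose register, for the same reason
int PC; // slot[5]; judged to be the program counter: every instruction advances it by a fixed length
unsigned char SP; // slot[6]; judged to be the stack pointer: it is incremented and decremented repeatedly, always as part of push/pop logic

public:
VM(const char input[32]); // the constructor plays the role of create()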
void OoO_run();
unsigned char * getRegister(unsigned char operand);
char * registerString(unsigned char operand);
};

#endif

vm.cpp

#include "vm.h"
#include <iostream>

using std::cout, std::cin, std::endl;

VM::VM(const char *input) {
    // Initialize memory
    for (int i = 32; i <= 64; i++) { // <= 64 mirrors the Lua for loop
        memory[i] = input[i - 32];
        if (i < 64) // ins_values has only 32 entries
            memory[i + 95] = ins_values[i - 32];
    }
    // Initialize registers
    eax = 0;
    ebx = 0;
    PC = 0;  // Lua indices start at 1, C++ indices start at 0
    SP = -1; // Lua starts at 1, so SP starts at -1 here instead of 0 (255 as unsigned char; the first push wraps it to 0)
}

void VM::OoO_run() {
    unsigned char head;
    unsigned char operand1 = 0;
    unsigned char operand2 = 0;

    while (true) {
        // Fetch the opcode
        head = opcodes[PC];

        cout << "[PC: " << PC << "] ";

        switch (head) {
            // Increment a byte at a fixed offset (the Lua code indexes slot[1], the bytecode array)
            case 17: {
                operand1 = opcodes[PC + 1];
                opcodes[32 + operand1] += 1;
                PC += 2;
                cout << "add [" << (int)operand1 << ", 0x20], 1" << endl;
                break;
            }
            // Increment the selected register
            case 33: {
                operand1 = opcodes[PC + 1];
                *getRegister(operand1) += 1;
                PC += 2;
                cout << "add " << registerString(operand1) << ", 1" << endl;
                break;
            }
            // Push an immediate
            case 65: {
                operand1 = opcodes[PC + 1]; // read the operand before PC moves
                SP += 1;
                memory[SP] = operand1;
                PC += 2;
                cout << "push " << (int)operand1 << endl;
                break;
            }
            // Decrement a byte at a fixed offset (same array as opcode 17)
            case 18: {
                operand1 = opcodes[PC + 1];
                opcodes[32 + operand1] -= 1;
                PC += 2;
                cout << "sub [" << (int)operand1 << ", 0x20], 1" << endl;
                break;
            }
            // XOR a register with an immediate
            case 35: {
                operand1 = opcodes[PC + 1];
                operand2 = opcodes[PC + 2];
                *getRegister(operand1) ^= operand2;
                PC += 3;
                cout << "xor " << registerString(operand1) << ", " << (int)operand2 << endl;
                break;
            }
            // Push a byte from memory
            case 50: {
                operand1 = opcodes[PC + 1];
                SP += 1;
                memory[SP] = memory[32 + operand1];
                PC += 2;
                cout << "push [" << (int)operand1 << ", 0x20]" << endl;
                break;
            }
            // XOR two registers
            case 36: {
                operand1 = opcodes[PC + 1];
                operand2 = opcodes[PC + 2];
                // XOR the register contents, not the operand bytes
                *getRegister(operand1) ^= *getRegister(operand2);
                PC += 3;
                cout << "xor " << registerString(operand1) << ", " << registerString(operand2) << endl;
                break;
            }
            // Pop the top of the stack into memory
            case 49: {
                operand1 = opcodes[PC + 1];
                // Lua writes slot[2][224 + operand] (1-based), which is index 223 + operand here
                memory[223 + operand1] = memory[SP];
                SP -= 1;
                PC += 2;
                cout << "pop [" << (int)operand1 << ", 0xDF]" << endl;
                break;
            }
            // Decrement a register
            case 34: {
                operand1 = opcodes[PC + 1];
                *getRegister(operand1) -= 1;
                PC += 2;
                cout << "sub " << registerString(operand1) << ", 1" << endl;
                break;
            }
            // Push a register
            case 66: {
                operand1 = opcodes[PC + 1];
                SP += 1;
                memory[SP] = *getRegister(operand1);
                PC += 2;
                cout << "push " << registerString(operand1) << endl;
                break;
            }
            // Pop the top of the stack into the selected register
            case 37: {
                operand1 = opcodes[PC + 1];
                *getRegister(operand1) = memory[SP];
                SP -= 1;
                PC += 2;
                cout << "pop " << registerString(operand1) << endl;
                break;
            }
            // Return
            case 144: {
                return;
            }
            // Unknown opcode: leave the loop, like the Lua `else break`
            default: {
                return;
            }
        }
    }
}

unsigned char *VM::getRegister(unsigned char operand) {
switch(operand - 7) {
case 3:
return &eax;
case 4:
return &ebx;
case 5:
throw "expected PC register";
case 6:
return &SP;
default:
throw "expected register index";
}
}

char *VM::registerString(unsigned char operand) {
switch(operand - 7) {
case 3:
return "eax";
case 4:
return "ebx";
case 5:
return "PC";
case 6:
return "SP";
default:
throw "expected register index";
}
}
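Because Lua tables are 1-based and C++ arrays are 0-based, every memory index taken from the decompiled script shifts down by one in the C++ port. The sketch below spells that mapping out; `to_cpp_index`, `load_address`, and `store_address` are hypothetical helper names introduced only for illustration.

```cpp
#include <cstddef>

// Lua index k corresponds to C++ index k - 1.
constexpr std::size_t to_cpp_index(std::size_t lua_index) {
    return lua_index - 1;
}

// Opcode 50 reads slot[2][33 + operand] in the Lua script.
constexpr std::size_t load_address(unsigned char operand) {
    return to_cpp_index(33u + operand);
}

// Opcode 49 writes slot[2][224 + operand] in the Lua script.
constexpr std::size_t store_address(unsigned char operand) {
    return to_cpp_index(224u + operand);
}
```

With this mapping, a load with operand 0xbf addresses the same byte as a store with operand 0, which is how each block of the program picks up the output byte written by the previous one.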

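Instruction trace output

Analyzing the first three blocks shows that each step XORs adjacent bytes and then adds or subtracts 1. Note that the final block subtracts again, so the last two bytes are both decremented.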

push 1e
pop eax // eax = 0x1e
push [0, 0x20]
pop ebx // ebx = input[0]
xor eax, ebx // eax = input[0] ^ 0x1e
sub eax, 1 // eax = (input[0] ^ 0x1e) - 1
push eax
pop [0, 0xDF] // output[0] = (input[0] ^ 0x1e) - 1


push [bf, 0x20]
pop eax // eax = output[0]
push [1, 0x20]
pop ebx // ebx = input[1]
xor eax, ebx // eax = output[0] ^ input[1] = ((input[0] ^ 0x1e) - 1) ^ input[1]
add eax, 1 // eax = (((input[0] ^ 0x1e) - 1) ^ input[1]) + 1
push eax
pop [1, 0xDF] // output[1] = (((input[0] ^ 0x1e) - 1) ^ input[1]) + 1


push [c0, 0x20]
pop eax // eax = output[1]
push [2, 0x20]
pop ebx // ebx = input[2]
xor eax, ebx // eax = output[1] ^ input[2]
sub eax, 1 // eax = (output[1] ^ input[2]) - 1
push eax
pop [2, 0xDF]


push [c1, 0x20]
pop eax
push [3, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [3, 0xDF]


push [c2, 0x20]
pop eax
push [4, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [4, 0xDF]


push [c3, 0x20]
pop eax
push [5, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [5, 0xDF]


push [c4, 0x20]
pop eax
push [6, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [6, 0xDF]


push [c5, 0x20]
pop eax
push [7, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [7, 0xDF]


push [c6, 0x20]
pop eax
push [8, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [8, 0xDF]


push [c7, 0x20]
pop eax
push [9, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [9, 0xDF]


push [c8, 0x20]
pop eax
push [a, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [a, 0xDF]


push [c9, 0x20]
pop eax
push [b, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [b, 0xDF]


push [ca, 0x20]
pop eax
push [c, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [c, 0xDF]


push [cb, 0x20]
pop eax
push [d, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [d, 0xDF]
push [cc, 0x20]
pop eax
push [e, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [e, 0xDF]
push [cd, 0x20]
pop eax
push [f, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [f, 0xDF]
push [ce, 0x20]
pop eax
push [10, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [10, 0xDF]
push [cf, 0x20]
pop eax
push [11, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [11, 0xDF]
push [d0, 0x20]
pop eax
push [12, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [12, 0xDF]
push [d1, 0x20]
pop eax
push [13, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [13, 0xDF]
push [d2, 0x20]
pop eax
push [14, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [14, 0xDF]
push [d3, 0x20]
pop eax
push [15, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [15, 0xDF]
push [d4, 0x20]
pop eax
push [16, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [16, 0xDF]
push [d5, 0x20]
pop eax
push [17, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [17, 0xDF]
push [d6, 0x20]
pop eax
push [18, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [18, 0xDF]
push [d7, 0x20]
pop eax
push [19, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [19, 0xDF]
push [d8, 0x20]
pop eax
push [1a, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [1a, 0xDF]
push [d9, 0x20]
pop eax
push [1b, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [1b, 0xDF]
push [da, 0x20]
pop eax
push [1c, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [1c, 0xDF]
push [db, 0x20]
pop eax
push [1d, 0x20]
pop ebx
xor eax, ebx
add eax, 1
push eax
pop [1d, 0xDF]
push [dc, 0x20]
pop eax
push [1e, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [1e, 0xDF]
push [dd, 0x20]
pop eax
push [1f, 0x20]
pop ebx
xor eax, ebx
sub eax, 1
push eax
pop [1f, 0xDF]

Process finished with exit code 0

The resulting VM encryption logic
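The trace is regular enough that the whole bytecode can be regenerated from a template, which is a convenient cross-check on the disassembly. The following is only a sketch: `build_program` is a hypothetical helper, and the template is inferred from the trace and the opcode array rather than taken from the challenge author's generator.

```cpp
#include <cstdint>
#include <initializer_list>
#include <vector>

// Rebuilds the VM bytecode from the repeating pattern seen in the trace.
inline std::vector<std::uint8_t> build_program() {
    std::vector<std::uint8_t> code;
    auto emit = [&code](std::initializer_list<int> bytes) {
        for (int b : bytes) code.push_back(static_cast<std::uint8_t>(b));
    };

    emit({65, 30, 37, 10});                  // push 0x1e; pop eax
    for (int i = 0; i < 32; ++i) {
        if (i > 0) {
            emit({50, 190 + i, 37, 10});     // push previous output byte; pop eax
        }
        emit({50, i, 37, 11});               // push input[i]; pop ebx
        emit({36, 10, 11});                  // xor eax, ebx
        const bool subtract = (i % 2 == 0) || (i == 31);
        emit({subtract ? 34 : 33, 10});      // sub eax, 1  or  add eax, 1
        emit({66, 10, 49, i});               // push eax; pop output[i]
    }
    emit({144, 144, 144, 144});              // trailing return opcodes
    return code;
}
```

The result has 548 bytes, the same length as the `opcodes` array in vm.h, so the two can be compared byte by byte.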

void VM::encrypt(const char *input) {
char* output = new char[32];

for (int i = 0; i < 32; i++) {
if (i == 0)
output[i] = input[i] ^ 0x1e;
else
output[i] = input[i] ^ output[i - 1];

if (i % 2 == 0 || i == 31)
output[i] -= 1;
else
output[i] += 1;

cout << (int)output[i] << endl;
}
delete[] output;
}
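For experimenting outside the VM class, the same transformation can be written as a free function that returns the bytes instead of printing them. This is a sketch under the same rules as the method above; `encrypt_bytes` is an illustrative name, not part of the original code.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Chained XOR with alternating -1 / +1, as recovered from the trace.
inline std::vector<std::uint8_t> encrypt_bytes(const std::string& input) {
    std::vector<std::uint8_t> out(input.size());
    std::uint8_t prev = 0x1e;  // the first byte is XORed with the constant 0x1e
    for (std::size_t i = 0; i < input.size(); ++i) {
        std::uint8_t v = static_cast<std::uint8_t>(
            static_cast<std::uint8_t>(input[i]) ^ prev);
        // Even indices and index 31 subtract 1, the other odd indices add 1.
        const bool subtract = (i % 2 == 0) || (i == 31);
        v = static_cast<std::uint8_t>(subtract ? v - 1 : v + 1);
        out[i] = v;
        prev = v;              // the next byte is chained to this output byte
    }
    return out;
}
```

Using `std::uint8_t` keeps every intermediate value in the 0..255 range, matching the byte-wide registers of the VM.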

Deriving the decryption logic

void VM::decrypt() {
char* output = new char[33];
output[32] = '\0'; // terminate the buffer so it can be printed as a C string below

for (int i = 31; i > -1; i--) {
if (i % 2 == 0 || i == 31)
ins_values[i] += 1;
else
ins_values[i] -= 1;

if (i == 0)
output[i] = ins_values[i] ^ 0x1e;
else
output[i] = ins_values[i - 1] ^ ins_values[i];

cout << (int)output[i] << endl;
}

cout << output << endl;
delete[] output;
}
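The inversion can also be sketched as a free function that leaves its input untouched. Unlike the method above, it walks forward and reads the previous ciphertext byte directly, so nothing has to be modified in place; `decrypt_bytes` is an illustrative name, not part of the original code.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Inverts the chained XOR: undo the +/-1 first, then XOR with the previous
// ciphertext byte (or with 0x1e for the first byte).
inline std::string decrypt_bytes(const std::vector<std::uint8_t>& cipher) {
    std::string plain(cipher.size(), '\0');
    for (std::size_t i = 0; i < cipher.size(); ++i) {
        // Encryption subtracted 1 at even indices and at index 31.
        const bool subtracted = (i % 2 == 0) || (i == 31);
        const std::uint8_t undone = static_cast<std::uint8_t>(
            subtracted ? cipher[i] + 1 : cipher[i] - 1);
        const std::uint8_t prev = (i == 0)
            ? static_cast<std::uint8_t>(0x1e)
            : cipher[i - 1];
        plain[i] = static_cast<char>(undone ^ prev);
    }
    return plain;
}
```

Calling it on the 32 values of `ins_values` from vm.h yields a 32-character string.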

Getting the flag

A76 6957 A53ED A929
0CC F8E0 3F1A9 B7E0

Collect the mushroom to solve the challenge

1BC2571D712893B54F4386114E1FFEB8

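[^1]: # Determining the LuaJIT version

[^2]: # bindiff

BinDiff compares two IDA databases (in essence, presumably a comparison of the actual logic) in order to recover function symbols.

# Usage

## Open the files to compare in IDA

Open both files in IDA:

* Keep the file whose symbols are to be recovered open
* For the library, save the database and close IDA

## Using BinDiff

* Click File -> BinDiff
* Select the database to compare against

![image](flagio1/image-20230428191458-e1l2k33.png)

## Symbol recovery

* When the comparison finishes, four new windows open

  * ![image](flagio1/image-20230428191659-ooqxn6y.png)
  * Matched Function (functions that were matched)
  * Statistics
  * Primary Unmatched (functions in the primary database with no match)
  * Secondary Unmatched (functions in the secondary database with no match)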

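## How to reopen the analysis windows

BinDiff has four windows in total; to reopen one, click View -> BinDiff -> ....

![image](flagio1/image-20230504202243-3xz2slk.png)

## Matched Function window

This window lists the matched functions.

### Background color

The background color indicates similarity (green -> red as similarity decreases); the exact value is in the Similarity column.

![image](flagio1/image-20230428191940-rnk9e4k.png)

## How to load a saved comparison

After a BinDiff comparison, IDA asks on exit whether to save the comparison results. Choosing Yes saves a `libgame.so_vs_libluajit.so.BinDiff` file, which can be loaded the next time.

To load the file: click File -> Load file -> BinDiff results.

![image](flagio1/image-20230504201848-au4ot0p.png)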

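# Notes

## Function names

Matching does not rename functions automatically (matches are similarity-based, so the analyst has to decide whether a rename is warranted).

Matched names therefore have to be looked up by searching in the Matched Function window[^3].

[^3]: ## Matched Function window

This window lists the matched functions.

### Background color

The background color indicates similarity (green -> red as similarity decreases); the exact value is in the Similarity column.

![image](flagio1/image-20230428191940-rnk9e4k.png)

[^4]: ## Lua file compilation flow

[^5]: ## LuaJIT installation

[^6]: ### lua_State

[^7]: ## Global_State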

Author: LamのCrow
Article link: http://example.com/2023/05/13/flagio/
Copyright notice: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 4.0. Please credit LamのCrow when reposting.