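A First Look at Vulnerability Discovery with angr (Part 1)
Word count: 1443, 2025-08-24 16:48:07
Vulnerability Discovery with angr (Part 1): Stack Overflow Detection in Detail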
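1. Introduction to angr
angr is a Python-based binary analysis framework. Its main capabilities include:
- Symbolic execution
- Binary code analysis
- Constraint solving
- Automatic exploit generation (AEG)
In CTF reverse engineering, angr is commonly used to:
- Find the correct input to a complex computation via constraint solving
- Recover flags automatically
- Exploit vulnerabilities automatically (AEG)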
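2. The Basic AEG Workflow
Automatic exploit generation (AEG) usually proceeds in three steps:
- Find the vulnerability: identify potential flaws in the program
- Generate the exploit: build exploit code for the vulnerability that was found
- Verify the exploit: confirm that the generated exploit actually works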
3. Analysis of the Official Example
3.1 The Example Code
The official angr example is a simple heap overflow:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

char component_name[128] = {0};

typedef struct component {
    char name[32];
    int (*do_something)(int arg);
} comp_t;

int sample_func(int x) {
    printf(" - %s - recieved argument %d\n", component_name, x);
}

comp_t *initialize_component(char *cmp_name) {
    int i = 0;
    comp_t *cmp;
    cmp = malloc(sizeof(struct component));
    cmp->do_something = sample_func;
    printf("Copying component name...\n");
    while(*cmp_name)
        cmp->name[i++] = *cmp_name++;
    cmp->name[i] = '\0';
    return cmp;
}

int main(void) {
    comp_t *cmp;
    printf("Component Name:\n");
    read(0, component_name, sizeof component_name);
    printf("Initializing component...\n");
    cmp = initialize_component(component_name);
    printf("Running component...\n");
    cmp->do_something(1);
}
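The vulnerability:
- When the input in component_name is longer than 32 bytes, the unbounded copy loop overflows cmp->name and overwrites the cmp->do_something member
- The later call to cmp->do_something(1) then lets the attacker hijack control flow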
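The layout behind this overflow can be checked without running the binary. The sketch below is a toy model in plain Python (not part of the official example): it mirrors comp_t with ctypes to find the offset of the function pointer, then replays the copy loop on a byte buffer. The names Component and simulate_copy are made up for illustration, and the function pointer is represented as a plain void pointer.

```python
import ctypes


class Component(ctypes.Structure):
    """Mirror of comp_t from the C source (function pointer modeled as void*)."""
    _fields_ = [
        ("name", ctypes.c_char * 32),
        ("do_something", ctypes.c_void_p),
    ]


def simulate_copy(cmp_name: bytes) -> bytes:
    """Replay the unbounded copy loop of initialize_component on a raw buffer."""
    # Extra slack so the toy model never writes outside its own buffer.
    heap = bytearray(ctypes.sizeof(Component) + 128)
    i = 0
    for byte in cmp_name:
        if byte == 0:        # while (*cmp_name)
            break
        heap[i] = byte       # cmp->name[i++] = *cmp_name++
        i += 1
    heap[i] = 0              # cmp->name[i] = '\0'
    return bytes(heap)


PTR_OFFSET = Component.do_something.offset
PTR_SIZE = ctypes.sizeof(ctypes.c_void_p)

# 32 bytes fill name[]; everything after that lands on do_something.
payload = b"A" * 32 + b"B" * 8
heap = simulate_copy(payload)
overwritten_ptr = heap[PTR_OFFSET:PTR_OFFSET + PTR_SIZE]
print(hex(PTR_OFFSET), overwritten_ptr)
```

Because the loop stops at the first NUL byte, a real payload for this bug would need a target address without embedded zero bytes before its end, which is one reason constraint solving is useful here.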
3.2 Walkthrough of the Official AEG Script
The core logic of the official angr script:
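# Excerpt. It assumes "import angr" and "from angr import sim_options as so";
# fully_symbolic, find_symbolic_buffer and shellcode come from the rest of the script.
def main(binary):
    p = angr.Project(binary)
    es = p.factory.entry_state(add_options={so.REVERSE_MEMORY_NAME_MAP, so.TRACK_ACTION_HISTORY})
    sm = p.factory.simulation_manager(es, save_unconstrained=True)

    # Step until an unconstrained state with a fully symbolic PC shows up
    exploitable_state = None
    while exploitable_state is None:
        sm.step()
        if len(sm.unconstrained) > 0:
            for u in sm.unconstrained:
                if fully_symbolic(u, u.regs.pc):
                    exploitable_state = u
                    break
            # Discard the unconstrained states that were just examined
            sm.drop(stash='unconstrained')

    ep = exploitable_state

    # Check whether the shellcode fits into a symbolic buffer
    for buf_addr in find_symbolic_buffer(ep, len(shellcode)):
        memory = ep.memory.load(buf_addr, len(shellcode))
        sc_bvv = ep.solver.BVV(shellcode)
        if ep.satisfiable(extra_constraints=(memory == sc_bvv, ep.regs.pc == buf_addr)):
            # The buffer can hold the shellcode and PC can point at it
            ep.add_constraints(memory == sc_bvv)
            ep.add_constraints(ep.regs.pc == buf_addr)
            break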
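Key points:
- save_unconstrained=True keeps every unconstrained state
- Check whether the PC register is fully symbolic
- Look for a buffer that can hold the shellcode
- Add constraints so that PC points at the shellcode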
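The excerpt does not show how find_symbolic_buffer picks its candidates. The underlying idea, finding a contiguous run of input-controlled bytes long enough for the shellcode, can be sketched in plain Python. This is a simplified stand-in and not the official helper: find_buffers and the addresses below are hypothetical, and the set of symbolic addresses is assumed to be known already.

```python
from typing import Iterable, Iterator


def find_buffers(symbolic_addrs: Iterable[int], length: int) -> Iterator[int]:
    """Yield every address that starts a run of `length` consecutive symbolic bytes."""
    addrs = set(symbolic_addrs)
    for addr in sorted(addrs):
        if all(addr + off in addrs for off in range(length)):
            yield addr


# Hypothetical layout: 4 symbolic bytes at 0x1000 and 8 symbolic bytes at 0x2000.
symbolic = list(range(0x1000, 0x1004)) + list(range(0x2000, 0x2008))

# A 6-byte payload only fits in the second region.
candidates = list(find_buffers(symbolic, 6))
print([hex(a) for a in candidates])
```

Each candidate would still have to pass the satisfiability check from the script, since a byte being symbolic does not guarantee it can take every value the shellcode needs.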
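3.3 Limitations of the Official Approach
The official approach identifies vulnerabilities only by looking for unconstrained states, which causes several problems:
- It only detects bugs that leave PC under attacker control
- It stops at the first vulnerability, so bugs further along the path are never found
- It is insensitive to heap bugs and to some stack overflows, such as one that reaches only the saved rbp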
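4. An Improved Stack Overflow Detection Method
4.1 Stack Overflow Recap
A stack overflow unfolds as follows:
- Data on the stack is overwritten
- The overwrite reaches the saved rbp (ignoring canaries)
- The overwrite reaches the saved return address
- The function finishes and starts to return:
  - leave (equivalent to mov rsp, rbp; pop rbp)
  - ret (equivalent to pop rip)
- The program crashes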
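The sequence above can be replayed on a toy stack. The following sketch is plain Python rather than angr code, and it assumes the simplest possible frame: a local buffer directly below the saved rbp, no canary, no padding. The function name run_function and the two addresses are invented for the example.

```python
import struct

STACK_SIZE = 0x100


def run_function(user_input: bytes, buf_size: int = 0x10):
    """Model one x86-64 call frame and return (rbp, rip) after leave; ret."""
    stack = bytearray(STACK_SIZE)
    caller_rbp = 0x7FFFFFFFE000   # hypothetical caller frame pointer
    ret_addr = 0x401234           # hypothetical return address

    # State after "call" and the prologue "push rbp; mov rbp, rsp".
    # rbp is kept as an offset into the bytearray.
    rbp = STACK_SIZE - 0x20
    struct.pack_into("<Q", stack, rbp, caller_rbp)      # [rbp]     = saved rbp
    struct.pack_into("<Q", stack, rbp + 8, ret_addr)    # [rbp + 8] = return address

    # The local buffer sits directly below the saved rbp.
    buf = rbp - buf_size
    stack[buf:buf + len(user_input)] = user_input       # unchecked read into the buffer

    # leave: mov rsp, rbp; pop rbp
    rsp = rbp
    (new_rbp,) = struct.unpack_from("<Q", stack, rsp)
    rsp += 8
    # ret: pop rip
    (rip,) = struct.unpack_from("<Q", stack, rsp)
    return new_rbp, rip


safe = run_function(b"A" * 0x10)
smashed = run_function(b"A" * 0x10 + b"B" * 8 + b"C" * 8)
print([hex(v) for v in safe], [hex(v) for v in smashed])
```

With 0x10 bytes of input both saved values survive; with 0x20 bytes the first 8 extra bytes replace the saved rbp and the next 8 replace the return address, which is exactly what leave and ret then load.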
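4.2 The Improved Approach
Core idea: inspect the stack just before a function returns, and repair the stack data so that execution can continue.
Steps:
- On entry to a new function, save the correct rbp value
- Before the function returns, check:
  - whether the rbp value about to be popped is symbolic
  - whether the return address about to be popped is symbolic
- If an overflow is found:
  - record the vulnerability
  - repair the stack (restore the correct rbp and return address)
  - continue execution to find vulnerabilities further along the path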
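Before looking at the angr version, the bookkeeping can be sketched as a toy model in plain Python. Here a "symbolic" value is just a flag on a stack slot, and the expected return address is passed in explicitly (in angr it would come from the call stack that the engine tracks separately from stack memory). Slot, Frame and Tracker are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Slot:
    value: int
    symbolic: bool = False   # True once attacker-controlled input lands here


@dataclass
class Frame:
    saved_rbp: Slot
    ret_addr: Slot


@dataclass
class Tracker:
    shadow: dict = field(default_factory=dict)    # return address -> known-good rbp
    findings: list = field(default_factory=list)

    def on_entry(self, ret_addr: int, rbp: int) -> None:
        """Step 1: remember the correct rbp when a function is entered."""
        self.shadow[ret_addr] = rbp

    def before_return(self, expected_ret: int, frame: Frame) -> None:
        """Steps 2 and 3: check both slots, record the finding, repair the frame."""
        good_rbp = self.shadow[expected_ret]
        if frame.ret_addr.symbolic:
            self.findings.append(("pc overflow", hex(expected_ret)))
            frame.saved_rbp = Slot(good_rbp)
            frame.ret_addr = Slot(expected_ret)
        elif frame.saved_rbp.symbolic:
            self.findings.append(("bp overflow", hex(expected_ret)))
            frame.saved_rbp = Slot(good_rbp)


tracker = Tracker()
tracker.on_entry(ret_addr=0x401234, rbp=0x7FFF0000)
frame = Frame(Slot(0x4141414141414141, symbolic=True),
              Slot(0x4242424242424242, symbolic=True))
tracker.before_return(0x401234, frame)
print(tracker.findings, hex(frame.ret_addr.value))
```

After the repair the frame looks as if no overflow had happened, so the simulated return goes back to the caller and later functions on the same path can still be checked.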
4.3 Key Implementation Techniques
4.3.1 Detecting Function Entry
Function entry is detected by recognizing the prologue instructions:
def check_head(state):
    insns = state.project.factory.block(state.addr).capstone.insns
    if len(insns) >= 2:
        # Look for the prologue: push rbp; mov rbp, rsp
        ins0 = insns[0].insn
        ins1 = insns[1].insn
        if (len(ins0.operands) == 1 and len(ins1.operands) == 2 and
                ins0.mnemonic == "push" and ins0.reg_name(ins0.operands[0].reg) == "rbp" and
                ins1.mnemonic == "mov" and ins1.reg_name(ins1.operands[0].reg) == "rbp" and
                ins1.reg_name(ins1.operands[1].reg) == "rsp"):
            # rbp still holds the caller's value here; save it,
            # keyed by the return address of this call
            pre_target = state.callstack.ret_addr
            state.globals['rbp_list'][hex(pre_target)] = state.regs.rbp
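check_head only matches when push rbp is the very first instruction of the block. Binaries built with control-flow protection enabled start each function with endbr64, in which case that test would not fire. As a possible alternative, and not what the article's script does, the prologue can be matched on raw bytes instead of capstone operands. The helper name is_frame_prologue is made up; the input could be, for example, the bytes of the current block.

```python
ENDBR64 = bytes.fromhex("f30f1efa")       # endbr64
PUSH_RBP = bytes.fromhex("55")            # push rbp
MOV_RBP_RSP = (
    bytes.fromhex("4889e5"),              # mov rbp, rsp
    bytes.fromhex("488bec"),              # mov rbp, rsp (alternative encoding)
)


def is_frame_prologue(code: bytes) -> bool:
    """Return True if code starts with [endbr64;] push rbp; mov rbp, rsp."""
    if code.startswith(ENDBR64):
        code = code[len(ENDBR64):]
    if not code.startswith(PUSH_RBP):
        return False
    # bytes.startswith accepts a tuple and matches any of its members
    return code[len(PUSH_RBP):].startswith(MOV_RBP_RSP)


plain = bytes.fromhex("554889e54883ec10")           # push rbp; mov rbp, rsp; sub rsp, 0x10
with_cet = bytes.fromhex("f30f1efa554889e5")        # endbr64; push rbp; mov rbp, rsp
no_frame = bytes.fromhex("4883ec08")                # sub rsp, 8
print(is_frame_prologue(plain), is_frame_prologue(with_cet), is_frame_prologue(no_frame))
```

Functions compiled without a frame pointer have no such prologue at all, so either form of the check only covers code that keeps rbp as a frame pointer.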
4.3.2 Detecting Function Exit
Function exit is detected by recognizing the leave; ret sequence:
def check_end(state):
    if state.addr == 0:
        return
    insns = state.project.factory.block(state.addr).capstone.insns
    if len(insns) >= 2:
        flag = 0
        # Look for leave; ret
        for ins in insns:
            if ins.insn.mnemonic == "leave":
                flag += 1
            if ins.insn.mnemonic == "ret":
                flag += 1
        if flag == 2:
            # Check for a stack overflow
            ...
4.3.3 Detecting the Overflow
Check whether the saved rbp and the return address have become symbolic:
rbp = state.regs.rbp
byte_s = state.arch.bytes
endness = state.arch.memory_endness  # little-endian on x86-64
# The saved rbp lives at [rbp], the return address at [rbp + 8]
stack_rbp = state.memory.load(rbp, byte_s, endness=endness)
stack_ret = state.memory.load(rbp + byte_s, byte_s, endness=endness)
pre_target = state.callstack.ret_addr
pre_rbp = state.globals['rbp_list'][hex(pre_target)]
if stack_ret.symbolic:  # the return address was overwritten
    num = check_symbolic_bits(state, stack_ret)
    print_pc_overflow_msg(state, num // 8)  # bits to bytes
    # Repair the stack
    state.memory.store(rbp, pre_rbp, endness=endness)
    state.memory.store(rbp + byte_s, state.solver.BVV(pre_target, 64),
                       endness=endness)
    return
if stack_rbp.symbolic:  # only the saved rbp was overwritten
    num = check_symbolic_bits(state, stack_rbp)
    print_bp_overflow_msg(state, num // 8)  # bits to bytes
    state.memory.store(rbp, pre_rbp, endness=endness)
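The loads and stores above pass an explicit little-endian endness because x86-64 keeps multi-byte values least significant byte first, and the repaired values must be written back in that same byte order. A small plain-Python illustration of the difference, using struct instead of angr (the helper names and addresses are invented for the example):

```python
import struct


def store_le64(memory: bytearray, offset: int, value: int) -> None:
    """Write a 64-bit value the way x86-64 lays it out in memory."""
    struct.pack_into("<Q", memory, offset, value)


def load_le64(memory: bytearray, offset: int) -> int:
    """Read a 64-bit little-endian value back."""
    return struct.unpack_from("<Q", memory, offset)[0]


frame = bytearray(16)
store_le64(frame, 0, 0x7FFFFFFFE000)   # saved rbp at [rbp]
store_le64(frame, 8, 0x401234)         # return address at [rbp + 8]

raw_ret = bytes(frame[8:16])
as_big_endian = int.from_bytes(raw_ret, "big")   # what a big-endian read would give
print(raw_ret.hex(), hex(load_le64(frame, 8)), hex(as_big_endian))
```

Reading the same eight bytes with the wrong byte order yields a value that is no longer a valid address, so a repair written that way would send the restored return somewhere unintended.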
4.3.4 Counting Symbolic Bits
Count how many bits of a value are symbolic:
def check_symbolic_bits(state, val):
    bits = 0
    for idx in range(state.arch.bits):
        if val[idx].symbolic:
            bits += 1
    return bits
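The bit count is what turns into the "overflowed by N bytes" figure in the report. For a simple frame the expected result can be worked out by hand. The sketch below assumes a buffer that sits directly below the saved rbp with no padding, which need not hold for every compiler or optimization level; overwritten_bytes is a hypothetical helper, not part of the script.

```python
def overwritten_bytes(read_len: int, slot_offset: int, slot_size: int = 8) -> int:
    """Bytes of the slot [slot_offset, slot_offset + slot_size) covered by a
    write of read_len bytes that starts at the beginning of the buffer."""
    return max(0, min(read_len, slot_offset + slot_size) - slot_offset)


BUF_SIZE = 0x10
SAVED_RBP = BUF_SIZE          # offset of the saved rbp from the buffer start
RET_ADDR = BUF_SIZE + 8       # offset of the return address

# A 0x20-byte read covers both slots completely.
full_rbp = overwritten_bytes(0x20, SAVED_RBP)
full_ret = overwritten_bytes(0x20, RET_ADDR)

# A 0x1c-byte read reaches only the low 4 bytes of the return address,
# so 32 of its 64 bits would be symbolic.
partial_ret = overwritten_bytes(0x1C, RET_ADDR)
partial_bits = partial_ret * 8
print(full_rbp, full_ret, partial_ret, partial_bits)
```

This is also why the check counts bits one by one instead of asking whether the whole value is symbolic: a partial overwrite still shows up, with a smaller count.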
4.4 Complete Implementation
import angr


def check_symbolic_bits(state, val):
    # Count how many bits of val are symbolic
    bits = 0
    for idx in range(state.arch.bits):
        if val[idx].symbolic:
            bits += 1
    return bits


def print_pc_overflow_msg(state, byte_s):
    print("\n[========find a pc overflow========]")
    print("over for", hex(byte_s), "bytes")
    print("[PC]stdout:\n", state.posix.dumps(1))
    print("[PC]trigger overflow input:")
    print(state.posix.dumps(0))


def print_bp_overflow_msg(state, byte_s):
    print("\n[========find a bp overflow========]")
    print("over for", hex(byte_s), "bytes")
    print("[BP]stdout:\n", state.posix.dumps(1))
    print("[BP]trigger overflow input:")
    print(state.posix.dumps(0))


def check_end(state):
    if state.addr == 0:
        return
    insns = state.project.factory.block(state.addr).capstone.insns
    if len(insns) >= 2:
        flag = 0
        # Look for leave; ret
        for ins in insns:
            if ins.insn.mnemonic == "leave":
                flag += 1
            if ins.insn.mnemonic == "ret":
                flag += 1
        if flag == 2:
            rbp = state.regs.rbp
            byte_s = state.arch.bytes
            endness = state.arch.memory_endness  # little-endian on x86-64
            # The saved rbp lives at [rbp], the return address at [rbp + 8]
            stack_rbp = state.memory.load(rbp, byte_s, endness=endness)
            stack_ret = state.memory.load(rbp + byte_s, byte_s, endness=endness)
            pre_target = state.callstack.ret_addr
            pre_rbp = state.globals['rbp_list'][hex(pre_target)]
            if stack_ret.symbolic:
                # The return address was overwritten: report, then repair both slots
                num = check_symbolic_bits(state, stack_ret)
                print_pc_overflow_msg(state, num // 8)  # bits to bytes
                state.memory.store(rbp, pre_rbp, endness=endness)
                state.memory.store(rbp + byte_s, state.solver.BVV(pre_target, 64),
                                   endness=endness)
                return
            if stack_rbp.symbolic:
                # Only the saved rbp was overwritten: report, then repair it
                num = check_symbolic_bits(state, stack_rbp)
                print_bp_overflow_msg(state, num // 8)  # bits to bytes
                state.memory.store(rbp, pre_rbp, endness=endness)


def check_head(state):
    insns = state.project.factory.block(state.addr).capstone.insns
    if len(insns) >= 2:
        # Look for the prologue: push rbp; mov rbp, rsp
        ins0 = insns[0].insn
        ins1 = insns[1].insn
        if len(ins0.operands) == 1 and len(ins1.operands) == 2:
            ins0_name = ins0.mnemonic
            ins0_op0 = ins0.reg_name(ins0.operands[0].reg)
            ins1_name = ins1.mnemonic
            ins1_op0 = ins1.reg_name(ins1.operands[0].reg)
            ins1_op1 = ins1.reg_name(ins1.operands[1].reg)
            if (ins0_name == "push" and ins0_op0 == "rbp" and
                    ins1_name == "mov" and ins1_op0 == "rbp" and ins1_op1 == "rsp"):
                # Save the caller's rbp, keyed by the return address of this call
                pre_target = state.callstack.ret_addr
                state.globals['rbp_list'][hex(pre_target)] = state.regs.rbp


if __name__ == '__main__':
    filename = "stack1"
    p = angr.Project(filename, auto_load_libs=False)
    state = p.factory.entry_state()
    state.globals['rbp_list'] = {}
    simgr = p.factory.simulation_manager(state, save_unconstrained=True)
    while simgr.active:
        # Inspect every active state before stepping it
        for act in simgr.active:
            check_head(act)
            check_end(act)
        simgr.step()
5. Experimental Validation
5.1 Test Program
#include <stdio.h>
#include <string.h>
#include <unistd.h>

void func() {
    char pwd[0x10] = {0};
    puts("input admin password:");
    read(0, pwd, 0x20);
}

void over() {
    puts("over!");
    char c[0x10] = {0};
    read(0, c, 0x20);
}

int main(int argc, char const *argv[]) {
    char name[0x10] = {0};
    puts("input your name:");
    read(0, name, 0x10);
    over();
    if(strstr(name, "admin")) {
        func();
        puts("welcome admin~");
    } else {
        printf("welcome, %s\n", name);
    }
    return 0;
}
// Build: gcc stack1.c -o stack1 -fno-stack-protector
5.2 Test Results
The improved script finds both stack overflows:
- the overflow caused by read(0, c, 0x20) in over()
- the overflow caused by read(0, pwd, 0x20) in func()
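6. Summary
This article covered:
- The basics of angr and the AEG workflow
- How the official AEG approach works and where it falls short
- An improved stack overflow detection method:
  - function entry/exit detection
  - stack state checks
  - stack data repair
- The complete implementation and experimental validation
Key advantages:
- Finds stack overflows on multiple paths
- Detects rbp overwrites as well as PC overwrites
- Explores paths more deeply by repairing the stack data
Part 2 will cover using angr to automatically discover heap vulnerabilities (UAF, double free, and so on).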