binary-lifting
Binary Lifting Skill
This skill covers techniques and tools for lifting binary executables to LLVM IR, enabling advanced analysis, transformation, and recompilation of existing binaries.
Core Concepts
What is Binary Lifting?
Binary lifting is the process of translating low-level machine code (x86, ARM, etc.) into a higher-level intermediate representation (LLVM IR), enabling:
- Static and dynamic analysis
- Deobfuscation and vulnerability research
- Code recompilation and optimization
- Cross-architecture translation
Lifting Pipeline
Binary → Disassembly → IR Generation → Optimization → Analysis/Recompilation
Major Lifting Frameworks
Production-Grade Tools
- RetDec (Avast): Full decompiler with C output, multi-architecture support
- McSema (Trail of Bits): x86/x64 to LLVM IR, function recovery
- revng: Based on QEMU, supports multiple architectures
- reopt (Galois): Focus on correctness and formal methods
Research/Specialized Tools
- Rellume: Fast x86-64 to LLVM lifting for JIT scenarios
- fcd: Pattern-based decompiler with optimization passes
- bin2llvm: QEMU-based binary to LLVM translator
- llvm-mctoll: Microsoft's machine code to LLVM lifter
Language-Specific Lifters
- llvm2c/IR->C: Convert LLVM IR back to C code
- llvm2cranelift: LLVM IR to Cranelift IR
- Leaven: LLVM IR to Go language
- masxinlingvonta: JVM bytecode to LLVM IR
Implementation Techniques
Instruction Semantics Translation
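```cpp
// Example: translating x86 ADD to LLVM IR
#include "llvm/IR/IRBuilder.h"
using namespace llvm;

// Flag helpers are assumed to be defined elsewhere in the lifter.
void updateCarryFlag(IRBuilder<> &builder, Value *op1, Value *op2, Value *result);
void updateOverflowFlag(IRBuilder<> &builder, Value *op1, Value *op2, Value *result);
void updateSignFlag(IRBuilder<> &builder, Value *result);
void updateZeroFlag(IRBuilder<> &builder, Value *result);

Value *translateADD(IRBuilder<> &builder, Value *op1, Value *op2) {
  Value *result = builder.CreateAdd(op1, op2, "add_result");
  // Update the flags that ADD defines (CF, OF, SF, ZF, etc.)
  updateCarryFlag(builder, op1, op2, result);
  updateOverflowFlag(builder, op1, op2, result);
  updateSignFlag(builder, result);
  updateZeroFlag(builder, result);
  return result;
}
```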
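To make the flag updates concrete, here is a minimal Python sketch of what a lifter has to model for a 64-bit ADD. It is not taken from any of the tools listed above; the function name `add64_with_flags` is hypothetical, and AF and PF are left out for brevity.

```python
MASK64 = (1 << 64) - 1
SIGN64 = 1 << 63

def add64_with_flags(a, b):
    """Return the 64-bit sum and the CF, OF, SF, ZF values an x86 ADD would set."""
    a &= MASK64
    b &= MASK64
    full = a + b                # unbounded Python integer, keeps the carry-out
    result = full & MASK64
    flags = {
        "CF": int(full > MASK64),  # unsigned overflow: carry out of bit 63
        # signed overflow: both operands have a sign bit that differs from the result's
        "OF": int(((a ^ result) & (b ^ result) & SIGN64) != 0),
        "SF": int((result & SIGN64) != 0),
        "ZF": int(result == 0),
    }
    return result, flags
```

In lifted IR each of these flags becomes its own value or state-structure field, which is why optimization passes that remove unused flag computations matter so much after lifting.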
Control Flow Recovery
- Linear Sweep: Simple, but decodes data embedded in code as if it were instructions
- Recursive Descent: Follow control flow, better coverage
- Speculative Disassembly: Handle indirect jumps/calls
- Machine Learning: Use ML to identify function boundaries
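The difference between the first two strategies can be sketched on a toy instruction set. Everything here is hypothetical (the four opcodes, their encodings, and the `SAMPLE` buffer); real disassemblers work on actual ISAs and must also deal with indirect branches.

```python
# Hypothetical toy ISA: opcode -> (mnemonic, length in bytes)
OPCODES = {0x00: ("ret", 1), 0x01: ("nop", 1), 0x02: ("jmp", 2), 0x03: ("jz", 2)}

def decode(code, pc):
    """Decode one instruction at pc, or return None if the bytes are not valid code."""
    if pc >= len(code) or code[pc] not in OPCODES:
        return None
    name, size = OPCODES[code[pc]]
    if pc + size > len(code):
        return None
    target = code[pc + 1] if size == 2 else None  # absolute one-byte target
    return name, size, target

def linear_sweep(code):
    """Decode from offset 0 to the end; undecodable bytes are skipped one at a time."""
    found, pc = {}, 0
    while pc < len(code):
        inst = decode(code, pc)
        if inst is None:
            pc += 1
            continue
        found[pc] = inst[0]
        pc += inst[1]
    return found

def recursive_descent(code, entry=0):
    """Decode only addresses reachable from entry by following control flow."""
    found, worklist = {}, [entry]
    while worklist:
        pc = worklist.pop()
        if pc in found:
            continue
        inst = decode(code, pc)
        if inst is None:
            continue
        name, size, target = inst
        found[pc] = name
        if name == "ret":
            continue                      # no successors
        if name in ("jmp", "jz"):
            worklist.append(target)       # branch target
        if name != "jmp":
            worklist.append(pc + size)    # fall-through successor
    return found

# jmp 4; two data bytes at offsets 2 and 3; nop; ret
SAMPLE = bytes([0x02, 0x04, 0x01, 0x00, 0x01, 0x00])
```

On `SAMPLE`, `linear_sweep` reports instructions at offsets 0, 2, 3, 4 and 5, treating the two data bytes as `nop` and `ret`, while `recursive_descent` reports only offsets 0, 4 and 5.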
Handling Indirect Control Flow
- Value Set Analysis (VSA)
- Symbolic execution for jump target resolution
- Type recovery for virtual table reconstruction
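As an illustration of jump target resolution, the sketch below enumerates the targets of a bounded jump table. It assumes the index bound has already been recovered, for example by VSA or symbolic execution from a preceding bounds check; the function name, the flat `image` buffer, and the little-endian absolute-address table layout are all assumptions made for the example.

```python
import struct

def resolve_jump_table(image, image_base, table_addr, max_index, entry_size=4):
    """Enumerate targets of `jmp [table + idx * entry_size]` for 0 <= idx <= max_index."""
    fmt = {4: "<I", 8: "<Q"}[entry_size]  # little-endian absolute addresses
    targets = []
    for idx in range(max_index + 1):
        off = table_addr - image_base + idx * entry_size
        if off < 0 or off + entry_size > len(image):
            break  # table runs outside the mapped image; stop rather than guess
        (target,) = struct.unpack_from(fmt, image, off)
        targets.append(target)
    return targets
```

Each recovered target can then be fed back into recursive-descent disassembly and emitted as a case of an LLVM `switch` instruction in the lifted function.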
Triton Integration
The Triton symbolic execution engine can be combined with lifting:
```python
from triton import TritonContext, ARCH, Instruction

ctx = TritonContext(ARCH.X86_64)
# Make the inputs symbolic so the resulting AST is an expression, not a constant
ctx.symbolizeRegister(ctx.registers.rax)
ctx.symbolizeRegister(ctx.registers.rbx)

# Symbolically execute and extract AST
inst = Instruction(b"\x48\x01\xd8")  # add rax, rbx
ctx.processing(inst)

# Convert Triton AST to LLVM IR
ast = ctx.getRegisterAst(ctx.registers.rax)
# triton_ast_to_llvm is a placeholder for a user-supplied converter; recent Triton
# builds with LLVM support also provide ctx.liftToLLVM(ast) for this step.
llvm_ir = triton_ast_to_llvm(ast)
```
Deobfuscation via Lifting
Approach
- Lift obfuscated binary to LLVM IR
- Apply optimization passes to simplify
- Use custom passes for specific obfuscation patterns
- Re-emit cleaned code
Useful Optimization Passes
- Dead Store Elimination (DSE)
- Global Value Numbering (GVN)
- Constant Propagation
- Instruction Combining
- Loop Simplification
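To show why such passes help with deobfuscation, here is a toy sketch of constant propagation followed by dead-assignment elimination (a register-level analogue of DSE) on straight-line three-address code. The tuple-based IR and both function names are invented for this example and are not LLVM's pass implementations.

```python
import operator

OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul, "xor": operator.xor}

def fold_constants(program):
    """Propagate known constants through straight-line code and fold constant expressions."""
    known, out = {}, []
    for dest, op, a, b in program:
        a = known.get(a, a)   # replace a variable operand by its constant, if known
        b = known.get(b, b)
        if op == "const":
            known[dest] = a
        elif isinstance(a, int) and isinstance(b, int):
            known[dest] = OPS[op](a, b)
            op, a, b = "const", known[dest], None
        else:
            known.pop(dest, None)  # dest now holds a non-constant value
        out.append((dest, op, a, b))
    return out

def eliminate_dead(program, live_out):
    """Drop assignments whose result is never read before leaving the block."""
    live, kept = set(live_out), []
    for dest, op, a, b in reversed(program):
        if dest not in live:
            continue
        live.discard(dest)
        live.update(x for x in (a, b) if isinstance(x, str))
        kept.append((dest, op, a, b))
    return kept[::-1]

# y = (x + (7 ^ 7)) - (7 ^ 7), padded with a junk multiplication
OBFUSCATED = [
    ("t0", "const", 7, None),
    ("t1", "xor", "t0", "t0"),
    ("t2", "add", "x", "t1"),
    ("t3", "mul", "t0", "t0"),
    ("y", "sub", "t2", "t1"),
]
```

Running `eliminate_dead(fold_constants(OBFUSCATED), {"y"})` leaves only `t2 = x + 0` and `y = t2 - 0`; removing the remaining identities is the kind of rewrite instruction combining performs.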
VMP/VM Handler Recovery
- Identify dispatcher patterns
- Extract VM bytecode semantics
- Convert handlers to native IR
- Example: TicklingVMProtect for VMProtect analysis
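The handler-conversion step can be sketched with a toy stack VM. The opcode numbers and handler semantics below are made up for illustration and do not correspond to VMProtect or any real protector; the point is only that interpreting handlers over symbolic values turns bytecode into a plain expression.

```python
def lift_vm_bytecode(bytecode):
    """Symbolically run a toy stack VM and return the expression it computes."""
    stack, pc = [], 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == 0x01:            # PUSH imm8
            stack.append(str(bytecode[pc + 1]))
            pc += 2
        elif op == 0x02:          # PUSH_ARG n
            stack.append(f"arg{bytecode[pc + 1]}")
            pc += 2
        elif op in (0x10, 0x11):  # ADD / XOR: pop two operands, push the result
            rhs, lhs = stack.pop(), stack.pop()
            stack.append(f"({lhs} {'+' if op == 0x10 else '^'} {rhs})")
            pc += 1
        elif op == 0xFF:          # RET: result is on top of the stack
            return stack.pop()
        else:
            raise ValueError(f"unknown handler {op:#x} at offset {pc}")
    raise ValueError("bytecode ended without RET")
```

For the bytecode `02 00 01 05 10 02 01 11 FF` this returns `((arg0 + 5) ^ arg1)`. A real lifter would build IR values instead of strings, so the dispatcher loop disappears once the bytecode is fixed.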
Best Practices
- Architecture Support: Handle endianness, calling conventions, ABI differences
- Memory Modeling: Accurate memory layout for global/stack variables
- External Dependencies: Handle library calls and system calls
- Validation: Compare execution traces of original vs lifted code
- Incremental Lifting: Support partial program analysis
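The validation item can be sketched as a trace diff. This assumes both the original and the lifted program can be made to emit comparable state snapshots (here, dicts of register values) at the same synchronization points; how the traces are collected is outside the sketch, and `first_divergence` is a hypothetical helper name.

```python
from itertools import zip_longest

def first_divergence(original_trace, lifted_trace):
    """Return (index, original_state, lifted_state) at the first mismatch, or None if equal."""
    for i, (orig, lifted) in enumerate(zip_longest(original_trace, lifted_trace)):
        if orig != lifted:   # a missing entry shows up as None on the shorter side
            return i, orig, lifted
    return None
```

Reporting only the first divergence is a deliberate choice: later mismatches are usually consequences of the first one, so it points most directly at the mistranslated instruction.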
Dynamic Binary Lifting
Runtime Translation
- Instrew: Fast instrumentation through LLVM
- QBDI: QuarkslaB Dynamic Binary Instrumentation
- binopt: Runtime optimization of binary code
JIT Recompilation
Lift frequently executed code paths for runtime optimization:
- Profile-guided lifting
- Hot path detection
- Speculative optimization
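Hot path detection is commonly done with per-block execution counters. The sketch below is one minimal way to do it; the class name, the default threshold, and the idea of keying on basic-block start addresses are assumptions for the example, not the mechanism of any tool listed above.

```python
from collections import Counter

class HotPathDetector:
    """Counts basic-block executions and flags a block once when it becomes hot."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.counts = Counter()
        self.lifted = set()   # blocks already handed to the lifter

    def record(self, block_addr):
        """Count one execution; return True exactly once, when the block crosses the threshold."""
        self.counts[block_addr] += 1
        if block_addr not in self.lifted and self.counts[block_addr] >= self.threshold:
            self.lifted.add(block_addr)
            return True
        return False
```

A runtime would call `record` from its dispatch loop and lift and recompile the block the first time it returns `True`, continuing to interpret or run the original code for cold blocks.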
Resources
For a complete list of lifting tools and research papers, refer to the LIFT section in the main README.md.
Getting Detailed Information
When you need detailed and up-to-date resource links, tool lists, or project references, fetch the latest data from:
https://raw.githubusercontent.com/gmh5225/awesome-llvm-security/refs/heads/main/README.md
This README contains comprehensive curated lists of:
- Binary lifting frameworks and tools (LIFT section)
- Related research papers and documentation
- Implementation examples and tutorials
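One possible way to pull a single section out of that README is sketched below. It assumes the README uses `#`-style markdown headings and that the section heading contains the word "LIFT"; neither has been checked against the actual file, and both helper names are hypothetical.

```python
import re
from urllib.request import urlopen

README_URL = "https://raw.githubusercontent.com/gmh5225/awesome-llvm-security/refs/heads/main/README.md"

def fetch_readme(url=README_URL, timeout=10):
    """Download the README as text (requires network access)."""
    with urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8")

def extract_section(markdown, title):
    """Return the body under the first heading containing title, up to the next
    heading of the same or a higher level. Returns '' if no heading matches."""
    out, level = [], None
    for line in markdown.splitlines():
        m = re.match(r"^(#+)\s+(.*)$", line)
        if m:
            if level is not None and len(m.group(1)) <= level:
                break                      # next sibling or parent heading ends the section
            if level is None and title.lower() in m.group(2).lower():
                level = len(m.group(1))    # found the section heading
                continue
        if level is not None:
            out.append(line)
    return "\n".join(out).strip()
```

Typical use would be `extract_section(fetch_readme(), "LIFT")`. The heading match is a simple substring test and does not skip `#` lines inside code blocks.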