binary-lifting

Binary Lifting Skill

This skill covers techniques and tools for lifting binary executables to LLVM IR, enabling advanced analysis, transformation, and recompilation of existing binaries.

Core Concepts

What is Binary Lifting?

Binary lifting is the process of translating low-level machine code (x86, ARM, etc.) into a higher-level intermediate representation (LLVM IR), enabling:

Static and dynamic analysis
Deobfuscation and vulnerability research
Code recompilation and optimization
Cross-architecture translation

Lifting Pipeline

Binary → Disassembly → IR Generation → Optimization → Analysis/Recompilation

Major Lifting Frameworks

Production-Grade Tools

RetDec (Avast): Full decompiler with C output, multi-architecture support
McSema (Trail of Bits): x86/x64 to LLVM IR, function recovery
revng: Based on QEMU, supports multiple architectures
reopt (Galois): Focus on correctness and formal methods

Research/Specialized Tools

Rellume: Fast x86-64 to LLVM lifting for JIT scenarios
fcd: Pattern-based decompiler with optimization passes
bin2llvm: QEMU-based binary to LLVM translator
llvm-mctoll: Microsoft's machine code to LLVM lifter

Language-Specific Lifters

llvm2c/IR->C: Convert LLVM IR back to C code
llvm2cranelift: LLVM IR to Cranelift IR
Leaven: LLVM IR to Go language
masxinlingvonta: JVM bytecode to LLVM IR

Implementation Techniques

Instruction Semantics Translation

cpp

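// Example: translating x86 ADD to LLVM IR
#include "llvm/IR/IRBuilder.h"

using namespace llvm;

// The flag helpers are assumed to be defined elsewhere in the lifter; each one
// derives a flag from the operands/result and writes it to the modeled flag state.
void updateCarryFlag(IRBuilder<> &builder, Value *op1, Value *op2, Value *result);
void updateOverflowFlag(IRBuilder<> &builder, Value *op1, Value *op2, Value *result);
void updateSignFlag(IRBuilder<> &builder, Value *result);
void updateZeroFlag(IRBuilder<> &builder, Value *result);

Value *translateADD(IRBuilder<> &builder, Value *op1, Value *op2) {
    Value *result = builder.CreateAdd(op1, op2, "add_result");

    // Update the flags that ADD defines (CF, OF, SF, ZF shown; AF and PF omitted)
    updateCarryFlag(builder, op1, op2, result);
    updateOverflowFlag(builder, op1, op2, result);
    updateSignFlag(builder, result);
    updateZeroFlag(builder, result);

    return result;
}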
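
The flag helpers in the snippet above are left undefined, so it may help to see what they would have to compute. The following is a minimal Python sketch of the semantics of a 64-bit ADD, not code from any particular lifter; the function name `add64_with_flags` and the dictionary layout are assumptions made for illustration, and the AF and PF flags are left out.

```python
MASK64 = (1 << 64) - 1   # all 64 bits set
SIGN64 = 1 << 63         # sign bit of a 64-bit value

def add64_with_flags(a, b):
    """Model a 64-bit ADD: return (wrapped result, {CF, OF, SF, ZF})."""
    a &= MASK64
    b &= MASK64
    full = a + b                 # unbounded Python integer sum
    result = full & MASK64       # what the destination register would hold
    flags = {
        # Carry: the unsigned sum did not fit in 64 bits.
        "CF": int(full > MASK64),
        # Overflow: both operands have the same sign and the result's sign differs.
        "OF": int(((a ^ result) & (b ^ result) & SIGN64) != 0),
        # Sign: top bit of the result.
        "SF": int((result & SIGN64) != 0),
        # Zero: result is exactly zero.
        "ZF": int(result == 0),
    }
    return result, flags
```

A lifter emits IR that performs these same comparisons on `op1`, `op2`, and `result`; a reference model like this one can then serve as an oracle when testing the emitted IR.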

Control Flow Recovery

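Linear Sweep: Decodes bytes sequentially from the start of a section; simple, but misdecodes data embedded in code as instructions
Recursive Descent: Follows control flow from known entry points; avoids embedded data but misses code reached only through unresolved indirect branches
Speculative Disassembly: Tentatively decodes regions not reached by direct control flow, such as targets of indirect jumps/calls
Machine Learning: Use ML to identify function boundaries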
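
The difference between the first two strategies is easiest to see on a tiny example. The sketch below uses an invented four-opcode instruction set (the opcode values, `decode`, `linear_sweep`, and `recursive_descent` are all hypothetical, not taken from a real disassembler) and a code buffer with two data bytes placed after an unconditional jump.

```python
# Toy ISA (hypothetical): opcode -> (mnemonic, instruction length in bytes).
# Two-byte instructions carry an absolute target address in their second byte.
OPCODES = {0x01: ("nop", 1), 0x02: ("jmp", 2), 0x03: ("jz", 2), 0x04: ("ret", 1)}

def decode(code, addr):
    """Return (mnemonic, length, target), or None if no instruction decodes here."""
    if addr >= len(code) or code[addr] not in OPCODES:
        return None
    name, length = OPCODES[code[addr]]
    if addr + length > len(code):
        return None
    target = code[addr + 1] if length == 2 else None
    return name, length, target

def linear_sweep(code):
    """Decode every byte position in order, regardless of reachability."""
    addr, found = 0, []
    while addr < len(code):
        insn = decode(code, addr)
        if insn is None:
            addr += 1            # skip an undecodable byte and resynchronize
            continue
        found.append(addr)
        addr += insn[1]
    return found

def recursive_descent(code, entry=0):
    """Decode only addresses reachable from the entry point."""
    seen, worklist = set(), [entry]
    while worklist:
        addr = worklist.pop()
        if addr in seen:
            continue
        insn = decode(code, addr)
        if insn is None:
            continue
        seen.add(addr)
        name, length, target = insn
        if name in ("jmp", "jz"):
            worklist.append(target)          # branch target
        if name not in ("jmp", "ret"):
            worklist.append(addr + length)   # fall-through successor
    return sorted(seen)

code = bytes([
    0x02, 0x04,   # 0: jmp 4
    0x03, 0x99,   # 2: embedded data that happens to look like "jz 0x99"
    0x01,         # 4: nop
    0x04,         # 5: ret
])
sweep_addrs = linear_sweep(code)         # reports address 2 as an instruction
descent_addrs = recursive_descent(code)  # never visits address 2
```

Here the sweep treats the data at address 2 as an instruction, while the descent skips it because nothing branches there. The reverse failure, code that is only reachable through an indirect jump, is what the speculative and analysis-based approaches try to recover.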

Handling Indirect Control Flow

Value Set Analysis (VSA)
Symbolic execution for jump target resolution
Type recovery for virtual table reconstruction
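
As a rough illustration of how a value set turns an indirect jump into concrete targets, the sketch below assumes an earlier analysis has already bounded the jump-table index (for example, from a preceding bounds check) and simply reads the corresponding table entries. The function name `resolve_jump_table`, the 32-bit little-endian entry format, and the sample image are assumptions for this example; real VSA tracks strided intervals per abstract memory location and is considerably more involved.

```python
import struct

ENTRY_SIZE = 4  # assumption: 32-bit little-endian absolute targets

def resolve_jump_table(memory, table_offset, index_values):
    """Enumerate the targets of `jmp [table + index * ENTRY_SIZE]`.

    memory        bytes of the section that holds the table
    table_offset  offset of the table inside `memory`
    index_values  iterable of possible index values (the recovered value set)
    """
    targets = set()
    for index in index_values:
        offset = table_offset + index * ENTRY_SIZE
        if offset < 0 or offset + ENTRY_SIZE > len(memory):
            continue  # the value set may over-approximate; skip slots outside the image
        (target,) = struct.unpack_from("<I", memory, offset)
        targets.add(target)
    return sorted(targets)

# A four-entry table in which two cases share one handler.
image = struct.pack("<4I", 0x1000, 0x1010, 0x1020, 0x1010)
index_set = range(0, 4)          # value set for the index: {0, 1, 2, 3}
targets = resolve_jump_table(image, 0, index_set)
```

Each recovered target becomes an additional successor of the indirect jump in the lifted control flow graph, typically emitted as a `switch` over the index.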

Triton Integration

The Triton symbolic execution engine can be used alongside lifting:

python

from triton import TritonContext, ARCH, Instruction

ctx = TritonContext(ARCH.X86_64)

# Symbolically execute and extract AST
inst = Instruction(b"\x48\x01\xd8")  # add rax, rbx
ctx.processing(inst)

# Convert Triton AST to LLVM IR
# (triton_ast_to_llvm is a placeholder for a user-supplied converter, not a Triton API)
ast = ctx.getRegisterAst(ctx.registers.rax)
llvm_ir = triton_ast_to_llvm(ast)

Deobfuscation via Lifting

Approach

Lift obfuscated binary to LLVM IR
Apply optimization passes to simplify
Use custom passes for specific obfuscation patterns
Re-emit cleaned code

Useful Optimization Passes

Dead Store Elimination (DSE)
Global Value Numbering (GVN)
Constant Propagation
Instruction Combining
Loop Simplification
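
To give a feel for why standard passes strip a lot of obfuscation, here is a small sketch that runs constant propagation followed by a dead-definition sweep over a toy three-address IR. The tuple-based IR, the operation names, and both function names are invented for this example; LLVM's own passes operate on SSA form and handle far more cases.

```python
import operator

# Toy IR (hypothetical): each instruction is (dest, op, a, b); operands are
# either ints (constants) or strings (variable names). "const" uses only `a`.
OPS = {"add": operator.add, "sub": operator.sub,
       "mul": operator.mul, "xor": operator.xor}

def propagate_constants(block):
    """Forward pass: substitute known constants and fold fully constant operations."""
    known, out = {}, []
    for dest, op, a, b in block:
        a = known.get(a, a)
        b = known.get(b, b)
        if op == "const":
            known[dest] = a
        elif isinstance(a, int) and isinstance(b, int):
            known[dest] = OPS[op](a, b)
            op, a, b = "const", known[dest], None
        else:
            known.pop(dest, None)    # dest is no longer a known constant
        out.append((dest, op, a, b))
    return out

def eliminate_dead(block, live_out):
    """Backward pass: drop definitions whose result is never read."""
    live, out = set(live_out), []
    for dest, op, a, b in reversed(block):
        if dest not in live:
            continue
        live.discard(dest)
        live.update(x for x in (a, b) if isinstance(x, str))
        out.append((dest, op, a, b))
    return out[::-1]

# An obfuscated way of computing r = x + 0.
obfuscated = [
    ("k",  "const", 7,    None),
    ("t1", "mul",   "k",  2),
    ("t2", "sub",   "t1", 14),
    ("r",  "add",   "x",  "t2"),
]
simplified = eliminate_dead(propagate_constants(obfuscated), live_out={"r"})
```

After both passes only `("r", "add", "x", 0)` remains; an instruction-combining step would then reduce `x + 0` to `x`.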

VMP/VM Handler Recovery

Identify dispatcher patterns
Extract VM bytecode semantics
Convert handlers to native IR
Example: TicklingVMProtect for VMProtect analysis
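
The steps above can be sketched on a toy stack-based VM. Everything in this example is hypothetical: the opcode values, the handler set, and the function name `lift_vm_bytecode` are invented and do not correspond to VMProtect or any other real protector. The idea is that once each handler's semantics are known, the VM bytecode can be walked with a symbolic stack to rebuild a plain expression.

```python
# Hypothetical handler table for a toy stack VM.
PUSH_IMM = 0x10   # push the next byte as a constant
PUSH_ARG = 0x11   # push the function argument
ADD      = 0x20   # pop two values, push their sum
XOR      = 0x21   # pop two values, push their xor
RET      = 0xFF   # return the top of the stack

def lift_vm_bytecode(bytecode):
    """Symbolically interpret the bytecode and return the expression it computes."""
    stack, pc = [], 0
    while pc < len(bytecode):
        op = bytecode[pc]
        if op == PUSH_IMM:
            stack.append(str(bytecode[pc + 1]))
            pc += 2
        elif op == PUSH_ARG:
            stack.append("arg")
            pc += 1
        elif op in (ADD, XOR):
            rhs, lhs = stack.pop(), stack.pop()
            symbol = "+" if op == ADD else "^"
            stack.append(f"({lhs} {symbol} {rhs})")
            pc += 1
        elif op == RET:
            return stack.pop()
        else:
            raise ValueError(f"unknown handler 0x{op:02x} at offset {pc}")
    raise ValueError("bytecode ended without RET")

program = bytes([PUSH_ARG, PUSH_IMM, 5, ADD, PUSH_IMM, 42, XOR, RET])
expression = lift_vm_bytecode(program)
```

In a real workflow the strings would be IR values built with an IR builder, and the dispatcher loop disappears entirely from the lifted output.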

Best Practices

Architecture Support: Handle endianness, calling conventions, ABI differences
Memory Modeling: Accurate memory layout for global/stack variables
External Dependencies: Handle library calls and system calls
Validation: Compare execution traces of original vs lifted code
Incremental Lifting: Support partial program analysis
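
For the validation item, one simple approach is to record the register state after each instruction in both the original and the lifted program and report the first step where they differ. The sketch below assumes the traces have already been collected as lists of dictionaries; the function name `first_divergence` and the trace format are assumptions for illustration.

```python
from itertools import zip_longest

def first_divergence(original_trace, lifted_trace):
    """Return (step, original_state, lifted_state) at the first mismatch, else None.

    A trace that ends early counts as a mismatch: the missing state is reported as None.
    """
    pairs = zip_longest(original_trace, lifted_trace)
    for step, (expected, actual) in enumerate(pairs):
        if expected != actual:
            return step, expected, actual
    return None

# Hand-written sample traces: register state after each executed instruction.
original = [{"rax": 1}, {"rax": 3}, {"rax": 6}]
lifted_ok = [{"rax": 1}, {"rax": 3}, {"rax": 6}]
lifted_bad = [{"rax": 1}, {"rax": 3}, {"rax": 7}]

match = first_divergence(original, lifted_ok)
mismatch = first_divergence(original, lifted_bad)
```

Reporting the first diverging step rather than only the final result points directly at the instruction whose translation is wrong.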

Dynamic Binary Lifting

Runtime Translation

Instrew: Fast instrumentation through LLVM
QBDI: QuarkslaB Dynamic Binary Instrumentation
binopt: Runtime optimization of binary code

JIT Recompilation

Lift frequently executed code paths for runtime optimization:

Profile-guided lifting
Hot path detection
Speculative optimization
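
Hot path detection is often implemented with per-block execution counters and a threshold. The following is a minimal sketch of that idea; the class name `HotPathDetector`, the default threshold, and the addresses are hypothetical and not taken from any of the tools listed above.

```python
from collections import Counter

class HotPathDetector:
    """Count basic-block executions and flag a block once it crosses a threshold."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.counts = Counter()
        self.lifted = set()

    def record(self, block_addr):
        """Register one execution; return True exactly once, when the block turns hot."""
        self.counts[block_addr] += 1
        is_hot = self.counts[block_addr] >= self.threshold
        if is_hot and block_addr not in self.lifted:
            self.lifted.add(block_addr)   # the caller would lift and recompile here
            return True
        return False

detector = HotPathDetector(threshold=3)
executed = [0x400, 0x400, 0x500, 0x400, 0x400]
decisions = [detector.record(addr) for addr in executed]
```

Only the third execution of block `0x400` triggers lifting; later executions would run the recompiled code instead of being counted again for that decision.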

Resources

For a complete list of lifting tools and research papers, refer to the LIFT section in the main README.md.

Getting Detailed Information

When you need detailed and up-to-date resource links, tool lists, or project references, fetch the latest data from:

https://raw.githubusercontent.com/gmh5225/awesome-llvm-security/refs/heads/main/README.md

This README contains comprehensive curated lists of:

Binary lifting frameworks and tools (LIFT section)
Related research papers and documentation
Implementation examples and tutorials
