analyzing-golang-malware-with-ghidra
Compare original and translation side by side
🇺🇸
Original
English🇨🇳
Translation
ChineseAnalyzing Golang Malware with Ghidra
使用Ghidra分析Golang恶意软件
Overview
概述
Go (Golang) has become a popular language for malware authors due to its cross-compilation capabilities, static linking that produces self-contained binaries, and the complexity it introduces for reverse engineering. Go binaries contain the entire runtime, standard library, and all dependencies statically linked, resulting in large binaries (often 5-15MB) with thousands of functions. Ghidra struggles with Go-specific string formats (non-null-terminated), stripped function names, and goroutine concurrency patterns. Specialized tools like GoResolver (Volexity, 2025) use control-flow graph similarity to automatically deobfuscate and recover function names in stripped or obfuscated Go binaries.
Go(Golang)因其跨编译能力、生成自包含二进制文件的静态链接特性,以及给逆向工程带来的复杂性,已成为恶意软件作者青睐的语言。Go二进制文件静态链接了整个运行时、标准库及所有依赖项,导致文件体积较大(通常为5-15MB),包含数千个函数。Ghidra在处理Go特有的字符串格式(非空终止)、已剥离的函数名以及goroutine并发模式时存在困难。GoResolver(Volexity,2025)等专用工具利用控制流图相似度,自动对已剥离或混淆的Go二进制文件进行反混淆并恢复函数名。
When to Use
使用场景
- When investigating security incidents that require analyzing golang malware with ghidra
- When building detection rules or threat hunting queries for this domain
- When SOC analysts need structured procedures for this analysis type
- When validating security monitoring coverage for related attack techniques
- 调查需要用Ghidra分析Golang恶意软件的安全事件时
- 构建该领域的检测规则或威胁狩猎查询时
- SOC分析师需要此类分析的结构化流程时
- 验证相关攻击技术的安全监控覆盖范围时
Prerequisites
前置条件
- Ghidra 11.0+ with JDK 17+
- GoResolver plugin (for function name recovery)
- Go Reverse Engineering Tool Kit (go-re.tk)
- Python 3.9+ for helper scripts
- Understanding of Go runtime internals (goroutines, channels, interfaces)
- Familiarity with Go binary structure (pclntab, moduledata, itab)
- 安装有JDK 17+的Ghidra 11.0+
- GoResolver插件(用于恢复函数名)
- Go逆向工程工具包(go-re.tk)
- 用于辅助脚本的Python 3.9+
- 了解Go运行时内部机制(goroutines、channels、interfaces)
- 熟悉Go二进制文件结构(pclntab、moduledata、itab)
Key Concepts
核心概念
Go Binary Structure
Go二进制文件结构
Go binaries embed rich metadata in the (PC Line Table) structure, which maps program counters to function names, source files, and line numbers. Even stripped binaries retain this metadata. The structure contains pointers to type information, itabs (interface tables), and the pclntab itself. Go strings are stored as a pointer-length pair rather than null-terminated C strings.
pclntabmoduledataGo二进制文件在(PC行表)结构中嵌入了丰富的元数据,该结构将程序计数器映射到函数名、源文件和行号。即使是已剥离符号的二进制文件也会保留此元数据。结构包含指向类型信息、itabs(接口表)和pclntab本身的指针。Go字符串以指针-长度对的形式存储,而非空终止的C字符串。
pclntabmoduledataFunction Recovery in Stripped Binaries
已剥离二进制文件中的函数恢复
Despite stripping symbol tables, Go binaries retain function names within the pclntab. However, obfuscation tools like garble rename functions to random strings. GoResolver addresses this by computing control-flow graph signatures of obfuscated functions and matching them against a database of known Go standard library and third-party package functions.
尽管符号表被剥离,Go二进制文件仍会在pclntab中保留函数名。但garble等混淆工具会将函数重命名为随机字符串。GoResolver通过计算混淆函数的控制流图签名,并与已知的Go标准库和第三方包函数数据库进行匹配来解决此问题。
Crate/Dependency Extraction
包/依赖提取
Go's dependency management embeds module paths and version strings in the binary. Extracting these reveals the malware's third-party dependencies (HTTP libraries, encryption packages, C2 frameworks), which provides insight into capabilities without full reverse engineering.
Go的依赖管理会将模块路径和版本字符串嵌入二进制文件中。提取这些信息可以揭示恶意软件的第三方依赖(HTTP库、加密包、C2框架),无需完全逆向工程即可了解其功能。
Workflow
工作流程
Step 1: Initial Binary Analysis
步骤1:初始二进制分析
python
#!/usr/bin/env python3
"""Analyze Go binary metadata for malware analysis."""
import struct
import sys
import re
def find_go_build_info(data):
"""Extract Go build information from binary."""
# Go buildinfo magic: \xff Go buildinf:
magic = b'\xff Go buildinf:'
offset = data.find(magic)
if offset == -1:
return None
print(f"[+] Go build info at offset 0x{offset:x}")
# Extract Go version string nearby
go_version = re.search(rb'go\d+\.\d+(?:\.\d+)?', data[offset:offset+256])
if go_version:
print(f" Go Version: {go_version.group().decode()}")
return offset
def find_pclntab(data):
"""Locate the pclntab (PC Line Table) structure."""
# pclntab magic bytes vary by Go version
magics = {
b'\xfb\xff\xff\xff\x00\x00': "Go 1.2-1.15",
b'\xfa\xff\xff\xff\x00\x00': "Go 1.16-1.17",
b'\xf1\xff\xff\xff\x00\x00': "Go 1.18-1.19",
b'\xf0\xff\xff\xff\x00\x00': "Go 1.20+",
}
for magic, version in magics.items():
offset = data.find(magic)
if offset != -1:
print(f"[+] pclntab found at 0x{offset:x} ({version})")
return offset, version
return None, None
def extract_function_names(data, pclntab_offset):
"""Extract function names from pclntab."""
if pclntab_offset is None:
return []
functions = []
# Function name strings follow specific patterns
func_pattern = re.compile(
rb'(?:main|runtime|fmt|net|os|crypto|encoding|io|sync|'
rb'syscall|reflect|strings|bytes|path|time|math|sort|'
rb'github\.com|golang\.org)[/\.][\w/.]+',
)
for match in func_pattern.finditer(data):
name = match.group().decode('utf-8', errors='replace')
if len(name) > 4 and len(name) < 200:
functions.append(name)
return sorted(set(functions))
def extract_go_strings(data):
"""Extract Go-style strings (pointer+length pairs)."""
# Go strings are not null-terminated; extract readable sequences
strings = []
ascii_pattern = re.compile(rb'[\x20-\x7e]{10,}')
for match in ascii_pattern.finditer(data):
s = match.group().decode('ascii')
# Filter for interesting malware strings
interesting = [
'http', 'https', 'tcp', 'udp', 'dns',
'cmd', 'shell', 'exec', 'upload', 'download',
'encrypt', 'decrypt', 'key', 'token', 'password',
'c2', 'beacon', 'agent', 'implant', 'bot',
'mutex', 'persist', 'registry', 'scheduled',
]
if any(kw in s.lower() for kw in interesting):
strings.append(s)
return strings
def extract_dependencies(data):
"""Extract Go module dependencies from binary."""
deps = []
# Module paths follow pattern: github.com/user/repo
dep_pattern = re.compile(
rb'((?:github\.com|gitlab\.com|golang\.org|gopkg\.in|'
rb'go\.etcd\.io|google\.golang\.org)/[^\x00\s]{5,80})'
)
for match in dep_pattern.finditer(data):
dep = match.group().decode('utf-8', errors='replace')
deps.append(dep)
unique_deps = sorted(set(deps))
return unique_deps
def analyze_go_binary(filepath):
"""Full analysis of Go malware binary."""
with open(filepath, 'rb') as f:
data = f.read()
print(f"[+] Analyzing Go binary: {filepath}")
print(f" File size: {len(data):,} bytes")
print("=" * 60)
# Build info
find_go_build_info(data)
# pclntab
pclntab_offset, go_version = find_pclntab(data)
# Functions
functions = extract_function_names(data, pclntab_offset)
print(f"\n[+] Recovered {len(functions)} function names")
# Categorize functions
categories = {
"network": [], "crypto": [], "os_exec": [],
"file_io": [], "main": [], "third_party": [],
}
for f in functions:
if 'net/' in f or 'http' in f.lower():
categories["network"].append(f)
elif 'crypto' in f:
categories["crypto"].append(f)
elif 'os/exec' in f or 'syscall' in f:
categories["os_exec"].append(f)
elif 'os.' in f or 'io/' in f:
categories["file_io"].append(f)
elif f.startswith('main.'):
categories["main"].append(f)
elif 'github.com' in f or 'golang.org' in f:
categories["third_party"].append(f)
for cat, funcs in categories.items():
if funcs:
print(f"\n [{cat}] ({len(funcs)} functions):")
for fn in funcs[:10]:
print(f" {fn}")
# Dependencies
deps = extract_dependencies(data)
print(f"\n[+] Dependencies ({len(deps)}):")
for dep in deps[:20]:
print(f" {dep}")
# Suspicious strings
sus_strings = extract_go_strings(data)
print(f"\n[+] Suspicious strings ({len(sus_strings)}):")
for s in sus_strings[:20]:
print(f" {s}")
if __name__ == "__main__":
if len(sys.argv) < 2:
print(f"Usage: {sys.argv[0]} <go_binary>")
sys.exit(1)
analyze_go_binary(sys.argv[1])python
#!/usr/bin/env python3
"""Analyze Go binary metadata for malware analysis."""
import struct
import sys
import re
def find_go_build_info(data):
"""Extract Go build information from binary."""
# Go buildinfo magic: \xff Go buildinf:
magic = b'\xff Go buildinf:'
offset = data.find(magic)
if offset == -1:
return None
print(f"[+] Go build info at offset 0x{offset:x}")
# Extract Go version string nearby
go_version = re.search(rb'go\d+\.\d+(?:\.\d+)?', data[offset:offset+256])
if go_version:
print(f" Go Version: {go_version.group().decode()}")
return offset
def find_pclntab(data):
"""Locate the pclntab (PC Line Table) structure."""
# pclntab magic bytes vary by Go version
magics = {
b'\xfb\xff\xff\xff\x00\x00': "Go 1.2-1.15",
b'\xfa\xff\xff\xff\x00\x00': "Go 1.16-1.17",
b'\xf1\xff\xff\xff\x00\x00': "Go 1.18-1.19",
b'\xf0\xff\xff\xff\x00\x00': "Go 1.20+",
}
for magic, version in magics.items():
offset = data.find(magic)
if offset != -1:
print(f"[+] pclntab found at 0x{offset:x} ({version})")
return offset, version
return None, None
def extract_function_names(data, pclntab_offset):
"""Extract function names from pclntab."""
if pclntab_offset is None:
return []
functions = []
# Function name strings follow specific patterns
func_pattern = re.compile(
rb'(?:main|runtime|fmt|net|os|crypto|encoding|io|sync|'
rb'syscall|reflect|strings|bytes|path|time|math|sort|'
rb'github\.com|golang\.org)[/\.][\w/.]+',
)
for match in func_pattern.finditer(data):
name = match.group().decode('utf-8', errors='replace')
if len(name) > 4 and len(name) < 200:
functions.append(name)
return sorted(set(functions))
def extract_go_strings(data):
"""Extract Go-style strings (pointer+length pairs)."""
# Go strings are not null-terminated; extract readable sequences
strings = []
ascii_pattern = re.compile(rb'[\x20-\x7e]{10,}')
for match in ascii_pattern.finditer(data):
s = match.group().decode('ascii')
# Filter for interesting malware strings
interesting = [
'http', 'https', 'tcp', 'udp', 'dns',
'cmd', 'shell', 'exec', 'upload', 'download',
'encrypt', 'decrypt', 'key', 'token', 'password',
'c2', 'beacon', 'agent', 'implant', 'bot',
'mutex', 'persist', 'registry', 'scheduled',
]
if any(kw in s.lower() for kw in interesting):
strings.append(s)
return strings
def extract_dependencies(data):
"""Extract Go module dependencies from binary."""
deps = []
# Module paths follow pattern: github.com/user/repo
dep_pattern = re.compile(
rb'((?:github\.com|gitlab\.com|golang\.org|gopkg\.in|'
rb'go\.etcd\.io|google\.golang\.org)/[^\x00\s]{5,80})'
)
for match in dep_pattern.finditer(data):
dep = match.group().decode('utf-8', errors='replace')
deps.append(dep)
unique_deps = sorted(set(deps))
return unique_deps
def analyze_go_binary(filepath):
"""Full analysis of Go malware binary."""
with open(filepath, 'rb') as f:
data = f.read()
print(f"[+] Analyzing Go binary: {filepath}")
print(f" File size: {len(data):,} bytes")
print("=" * 60)
# Build info
find_go_build_info(data)
# pclntab
pclntab_offset, go_version = find_pclntab(data)
# Functions
functions = extract_function_names(data, pclntab_offset)
print(f"\n[+] Recovered {len(functions)} function names")
# Categorize functions
categories = {
"network": [], "crypto": [], "os_exec": [],
"file_io": [], "main": [], "third_party": [],
}
for f in functions:
if 'net/' in f or 'http' in f.lower():
categories["network"].append(f)
elif 'crypto' in f:
categories["crypto"].append(f)
elif 'os/exec' in f or 'syscall' in f:
categories["os_exec"].append(f)
elif 'os.' in f or 'io/' in f:
categories["file_io"].append(f)
elif f.startswith('main.'):
categories["main"].append(f)
elif 'github.com' in f or 'golang.org' in f:
categories["third_party"].append(f)
for cat, funcs in categories.items():
if funcs:
print(f"\n [{cat}] ({len(funcs)} functions):")
for fn in funcs[:10]:
print(f" {fn}")
# Dependencies
deps = extract_dependencies(data)
print(f"\n[+] Dependencies ({len(deps)}):")
for dep in deps[:20]:
print(f" {dep}")
# Suspicious strings
sus_strings = extract_go_strings(data)
print(f"\n[+] Suspicious strings ({len(sus_strings)}):")
for s in sus_strings[:20]:
print(f" {s}")
if __name__ == "__main__":
if len(sys.argv) < 2:
print(f"Usage: {sys.argv[0]} <go_binary>")
sys.exit(1)
analyze_go_binary(sys.argv[1])Step 2: Ghidra Analysis Script
步骤2:Ghidra分析脚本
python
undefinedpython
undefinedGhidra script (run within Ghidra's script manager)
Ghidra script (run within Ghidra's script manager)
Save as AnalyzeGoBinary.py in Ghidra scripts directory
Save as AnalyzeGoBinary.py in Ghidra scripts directory
@category MalwareAnalysis
@category MalwareAnalysis
@description Analyze Go binary structure and recover metadata
@description Analyze Go binary structure and recover metadata
def analyze_go_binary_ghidra():
"""Ghidra script for Go binary analysis."""
from ghidra.program.model.mem import MemoryAccessException
program = getCurrentProgram()
memory = program.getMemory()
listing = program.getListing()
print("[+] Go Binary Analysis Script")
print(f" Program: {program.getName()}")
# Find pclntab
pclntab_magics = [
bytes([0xf0, 0xff, 0xff, 0xff]), # Go 1.20+
bytes([0xf1, 0xff, 0xff, 0xff]), # Go 1.18-1.19
bytes([0xfa, 0xff, 0xff, 0xff]), # Go 1.16-1.17
bytes([0xfb, 0xff, 0xff, 0xff]), # Go 1.2-1.15
]
for magic in pclntab_magics:
addr = memory.findBytes(
program.getMinAddress(), magic, None, True, None
)
if addr:
print(f"[+] pclntab found at {addr}")
# Create label
program.getSymbolTable().createLabel(
addr, "go_pclntab", None,
ghidra.program.model.symbol.SourceType.ANALYSIS
)
break
# Fix Go string definitions
# Go strings are ptr+len, not null terminated
print("[+] Fixing Go string references...")
# Search for function names containing package paths
symbol_table = program.getSymbolTable()
func_count = 0
for symbol in symbol_table.getAllSymbols(True):
name = symbol.getName()
if ('.' in name and
any(pkg in name for pkg in
['main.', 'runtime.', 'net.', 'crypto.', 'os.'])):
func_count += 1
print(f"[+] Found {func_count} Go function symbols")def analyze_go_binary_ghidra():
"""Ghidra script for Go binary analysis."""
from ghidra.program.model.mem import MemoryAccessException
program = getCurrentProgram()
memory = program.getMemory()
listing = program.getListing()
print("[+] Go Binary Analysis Script")
print(f" Program: {program.getName()}")
# Find pclntab
pclntab_magics = [
bytes([0xf0, 0xff, 0xff, 0xff]), # Go 1.20+
bytes([0xf1, 0xff, 0xff, 0xff]), # Go 1.18-1.19
bytes([0xfa, 0xff, 0xff, 0xff]), # Go 1.16-1.17
bytes([0xfb, 0xff, 0xff, 0xff]), # Go 1.2-1.15
]
for magic in pclntab_magics:
addr = memory.findBytes(
program.getMinAddress(), magic, None, True, None
)
if addr:
print(f"[+] pclntab found at {addr}")
# Create label
program.getSymbolTable().createLabel(
addr, "go_pclntab", None,
ghidra.program.model.symbol.SourceType.ANALYSIS
)
break
# Fix Go string definitions
# Go strings are ptr+len, not null terminated
print("[+] Fixing Go string references...")
# Search for function names containing package paths
symbol_table = program.getSymbolTable()
func_count = 0
for symbol in symbol_table.getAllSymbols(True):
name = symbol.getName()
if ('.' in name and
any(pkg in name for pkg in
['main.', 'runtime.', 'net.', 'crypto.', 'os.'])):
func_count += 1
print(f"[+] Found {func_count} Go function symbols")Execute
Execute
analyze_go_binary_ghidra()
undefinedanalyze_go_binary_ghidra()
undefinedValidation Criteria
验证标准
- Go version and build information extracted from binary
- pclntab located and parsed for function name recovery
- Third-party dependencies identified revealing malware capabilities
- Main package functions enumerated for targeted analysis
- Network, crypto, and OS exec functions categorized
- Ghidra analysis correctly labels Go runtime structures
- 从二进制文件中提取出Go版本和构建信息
- 定位并解析pclntab以恢复函数名
- 识别第三方依赖以揭示恶意软件功能
- 枚举主包函数以进行针对性分析
- 对网络、加密和OS执行函数进行分类
- Ghidra分析正确标记Go运行时结构