binary-analysis-patterns
Install this skill
npx skills add wshobson/agentsWorks across Claude Code, Cursor, Codex, Copilot & Antigravity
Binary analysis patterns focus on the structural translation between high-level language constructs and machine-level instruction sequences. This skill set provides the framework for identifying standard program behavior by recognizing recurring assembly idioms. It covers stack frame management through function prologues and epilogues across x86-64, ARM64, and ARM32 architectures. Users can identify control flow structures like conditional branches, loops, and switch-case jump tables. Additionally, it addresses data handling, including array indexing, structure field offsets, and common string operations like strcpy or strlen. By mapping these patterns, developers and security researchers can reconstruct original logic, identify data structure access, and explain arithmetic optimizations like lea-based multiplication, facilitating effective static binary auditing without relying solely on decompiler outputs.
When to Use This Skill
- •Reverse engineering stripped binaries lacking symbol tables
- •Security auditing of compiled modules for hidden backdoors or vulnerabilities
- •Interpreting assembly outputs from debuggers during crash analysis
- •Developing custom static analysis plugins for existing disassemblers
How to Invoke This Skill
Example prompts that trigger this skill in Claude Code, Cursor, or Antigravity:
- “Explain this x86 function prologue
- “How does this loop translate to C code?
- “Identify the calling convention used in this assembly block
- “Show me the array indexing pattern for this assembly
- “Deconstruct this switch-case jump table logic
Pro Tips
- 💡Always start with high-level control flow graphs (CFGs) before diving into individual assembly instructions to grasp the overall program structure.
- 💡Utilize interactive disassemblers like Ghidra or IDA Pro to automate pattern recognition and cross-referencing, significantly speeding up your analysis.
- 💡Pay close attention to calling conventions specific to the architecture and operating system, as they dictate argument passing and stack management.
What this skill does
- •Identifying function calling conventions across major CPU architectures
- •Translating low-level branch and jump instructions into high-level control flows
- •Calculating structure member offsets and array element access patterns
- •Recognizing optimized arithmetic sequences such as division by bit-shifting
- •Reconstructing logic for standard C library functions like strlen or memcpy
When not to use it
- ✕Analyzing obfuscated or packed binaries where instruction patterns are scrambled
- ✕Performance profiling of code execution in real-time environments
- ✕High-level logic understanding of large, modern object-oriented codebases
Example workflow
- Load the raw binary into a disassembler or debugger
- Identify the function entry point to determine the stack frame setup
- Analyze register usage to ascertain the specific calling convention
- Map conditional jump sequences to reconstruct high-level if-else or loop logic
- Calculate structure offsets from base pointers to name variables
- Document the reconstructed logic flow for the audit report
Prerequisites
- –Foundational understanding of x86-64 or ARM assembly syntax
- –Basic knowledge of C language memory layout
- –Access to a disassembler like IDA Pro, Ghidra, or binaryninja
Pitfalls & limitations
- !Compiler optimizations frequently obscure simple code structures
- !Multiple instruction patterns can achieve the same logical outcome, leading to potential misinterpretation
- !Shared register usage in complex functions can make tracking data flow difficult
FAQ
How it compares
While manual analysis requires parsing instruction by instruction, this skill set applies specific pattern-matching heuristics to categorize blocks of assembly instantly, reducing cognitive load.
📄 Full skill instructions — original source: wshobson/agents
Comprehensive patterns and techniques for analyzing compiled binaries, understanding assembly code, and reconstructing program logic.
## Disassembly Fundamentals
### x86-64 Instruction Patterns
#### Function Prologue/Epilogue
; Standard prologue
push rbp ; Save base pointer
mov rbp, rsp ; Set up stack frame
sub rsp, 0x20 ; Allocate local variables
; Leaf function (no calls)
; May skip frame pointer setup
sub rsp, 0x18 ; Just allocate locals
; Standard epilogue
mov rsp, rbp ; Restore stack pointer
pop rbp ; Restore base pointer
ret
; Leave instruction (equivalent)
leave ; mov rsp, rbp; pop rbp
ret#### Calling Conventions
**System V AMD64 (Linux, macOS)**
; Arguments: RDI, RSI, RDX, RCX, R8, R9, then stack
; Return: RAX (and RDX for 128-bit)
; Caller-saved: RAX, RCX, RDX, RSI, RDI, R8-R11
; Callee-saved: RBX, RBP, R12-R15
; Example: func(a, b, c, d, e, f, g)
mov rdi, [a] ; 1st arg
mov rsi, [b] ; 2nd arg
mov rdx, [c] ; 3rd arg
mov rcx, [d] ; 4th arg
mov r8, [e] ; 5th arg
mov r9, [f] ; 6th arg
push [g] ; 7th arg on stack
call func**Microsoft x64 (Windows)**
; Arguments: RCX, RDX, R8, R9, then stack
; Shadow space: 32 bytes reserved on stack
; Return: RAX
; Example: func(a, b, c, d, e)
sub rsp, 0x28 ; Shadow space + alignment
mov rcx, [a] ; 1st arg
mov rdx, [b] ; 2nd arg
mov r8, [c] ; 3rd arg
mov r9, [d] ; 4th arg
mov [rsp+0x20], [e] ; 5th arg on stack
call func
add rsp, 0x28### ARM Assembly Patterns
#### ARM64 (AArch64) Calling Convention
; Arguments: X0-X7
; Return: X0 (and X1 for 128-bit)
; Frame pointer: X29
; Link register: X30
; Function prologue
stp x29, x30, [sp, #-16]! ; Save FP and LR
mov x29, sp ; Set frame pointer
; Function epilogue
ldp x29, x30, [sp], #16 ; Restore FP and LR
ret#### ARM32 Calling Convention
; Arguments: R0-R3, then stack
; Return: R0 (and R1 for 64-bit)
; Link register: LR (R14)
; Function prologue
push {fp, lr}
add fp, sp, #4
; Function epilogue
pop {fp, pc} ; Return by popping PC## Control Flow Patterns
### Conditional Branches
; if (a == b)
cmp eax, ebx
jne skip_block
; ... if body ...
skip_block:
; if (a < b) - signed
cmp eax, ebx
jge skip_block ; Jump if greater or equal
; ... if body ...
skip_block:
; if (a < b) - unsigned
cmp eax, ebx
jae skip_block ; Jump if above or equal
; ... if body ...
skip_block:### Loop Patterns
; for (int i = 0; i < n; i++)
xor ecx, ecx ; i = 0
loop_start:
cmp ecx, [n] ; i < n
jge loop_end
; ... loop body ...
inc ecx ; i++
jmp loop_start
loop_end:
; while (condition)
jmp loop_check
loop_body:
; ... body ...
loop_check:
cmp eax, ebx
jl loop_body
; do-while
loop_body:
; ... body ...
cmp eax, ebx
jl loop_body### Switch Statement Patterns
; Jump table pattern
mov eax, [switch_var]
cmp eax, max_case
ja default_case
jmp [jump_table + eax*8]
; Sequential comparison (small switch)
cmp eax, 1
je case_1
cmp eax, 2
je case_2
cmp eax, 3
je case_3
jmp default_case## Data Structure Patterns
### Array Access
; array[i] - 4-byte elements
mov eax, [rbx + rcx*4] ; rbx=base, rcx=index
; array[i] - 8-byte elements
mov rax, [rbx + rcx*8]
; Multi-dimensional array[i][j]
; arr[i][j] = base + (i * cols + j) * element_size
imul eax, [cols]
add eax, [j]
mov edx, [rbx + rax*4]### Structure Access
struct Example {
int a; // offset 0
char b; // offset 4
// padding // offset 5-7
long c; // offset 8
short d; // offset 16
};; Accessing struct fields
mov rdi, [struct_ptr]
mov eax, [rdi] ; s->a (offset 0)
movzx eax, byte [rdi+4] ; s->b (offset 4)
mov rax, [rdi+8] ; s->c (offset 8)
movzx eax, word [rdi+16] ; s->d (offset 16)### Linked List Traversal
; while (node != NULL)
list_loop:
test rdi, rdi ; node == NULL?
jz list_done
; ... process node ...
mov rdi, [rdi+8] ; node = node->next (assuming next at offset 8)
jmp list_loop
list_done:## Common Code Patterns
### String Operations
; strlen pattern
xor ecx, ecx
strlen_loop:
cmp byte [rdi + rcx], 0
je strlen_done
inc ecx
jmp strlen_loop
strlen_done:
; ecx contains length
; strcpy pattern
strcpy_loop:
mov al, [rsi]
mov [rdi], al
test al, al
jz strcpy_done
inc rsi
inc rdi
jmp strcpy_loop
strcpy_done:
; memcpy using rep movsb
mov rdi, dest
mov rsi, src
mov rcx, count
rep movsb### Arithmetic Patterns
; Multiplication by constant
; x * 3
lea eax, [rax + rax*2]
; x * 5
lea eax, [rax + rax*4]
; x * 10
lea eax, [rax + rax*4] ; x * 5
add eax, eax ; * 2
; Division by power of 2 (signed)
mov eax, [x]
cdq ; Sign extend to EDX:EAX
and edx, 7 ; For divide by 8
add eax, edx ; Adjust for negative
sar eax, 3 ; Arithmetic shift right
; Modulo power of 2
and eax, 7 ; x % 8### Bit Manipulation
; Test specific bit
test eax, 0x80 ; Test bit 7
jnz bit_set
; Set bit
or eax, 0x10 ; Set bit 4
; Clear bit
and eax, ~0x10 ; Clear bit 4
; Toggle bit
xor eax, 0x10 ; Toggle bit 4
; Count leading zeros
bsr eax, ecx ; Bit scan reverse
xor eax, 31 ; Convert to leading zeros
; Population count (popcnt)
popcnt eax, ecx ; Count set bits## Decompilation Patterns
### Variable Recovery
; Local variable at rbp-8
mov qword [rbp-8], rax ; Store to local
mov rax, [rbp-8] ; Load from local
; Stack-allocated array
lea rax, [rbp-0x40] ; Array starts at rbp-0x40
mov [rax], edx ; array[0] = value
mov [rax+4], ecx ; array[1] = value### Function Signature Recovery
; Identify parameters by register usage
func:
; rdi used as first param (System V)
mov [rbp-8], rdi ; Save param to local
; rsi used as second param
mov [rbp-16], rsi
; Identify return by RAX at end
mov rax, [result]
ret### Type Recovery
; 1-byte operations suggest char/bool
movzx eax, byte [rdi] ; Zero-extend byte
movsx eax, byte [rdi] ; Sign-extend byte
; 2-byte operations suggest short
movzx eax, word [rdi]
movsx eax, word [rdi]
; 4-byte operations suggest int/float
mov eax, [rdi]
movss xmm0, [rdi] ; Float
; 8-byte operations suggest long/double/pointer
mov rax, [rdi]
movsd xmm0, [rdi] ; Double## Ghidra Analysis Tips
### Improving Decompilation
// In Ghidra scripting
// Fix function signature
Function func = getFunctionAt(toAddr(0x401000));
func.setReturnType(IntegerDataType.dataType, SourceType.USER_DEFINED);
// Create structure type
StructureDataType struct = new StructureDataType("MyStruct", 0);
struct.add(IntegerDataType.dataType, "field_a", null);
struct.add(PointerDataType.dataType, "next", null);
// Apply to memory
createData(toAddr(0x601000), struct);### Pattern Matching Scripts
# Find all calls to dangerous functions
for func in currentProgram.getFunctionManager().getFunctions(True):
for ref in getReferencesTo(func.getEntryPoint()):
if func.getName() in ["strcpy", "sprintf", "gets"]:
print(f"Dangerous call at {ref.getFromAddress()}")## IDA Pro Patterns
### IDAPython Analysis
import idaapi
import idautils
import idc
# Find all function calls
def find_calls(func_name):
for func_ea in idautils.Functions():
for head in idautils.Heads(func_ea, idc.find_func_end(func_ea)):
if idc.print_insn_mnem(head) == "call":
target = idc.get_operand_value(head, 0)
if idc.get_func_name(target) == func_name:
print(f"Call to {func_name} at {hex(head)}")
# Rename functions based on strings
def auto_rename():
for s in idautils.Strings():
for xref in idautils.XrefsTo(s.ea):
func = idaapi.get_func(xref.frm)
if func and "sub_" in idc.get_func_name(func.start_ea):
# Use string as hint for naming
pass## Best Practices
### Analysis Workflow
1. **Initial triage**: File type, architecture, imports/exports
2. **String analysis**: Identify interesting strings, error messages
3. **Function identification**: Entry points, exports, cross-references
4. **Control flow mapping**: Understand program structure
5. **Data structure recovery**: Identify structs, arrays, globals
6. **Algorithm identification**: Crypto, hashing, compression
7. **Documentation**: Comments, renamed symbols, type definitions
### Common Pitfalls
- **Optimizer artifacts**: Code may not match source structure
- **Inline functions**: Functions may be expanded inline
- **Tail call optimization**:
jmp instead of call + ret- **Dead code**: Unreachable code from optimization
- **Position-independent code**: RIP-relative addressing
How to Use This Skill Unit
Option A: Project-Specific (Recommended)
- Click "Download" above
- In your project, create the directory:
.agent/skills/binary-analysis-patterns/ - Save the file as
SKILL.md - The agent will automatically discover the skill based on its description.
Option B: Global Installation (All Agents)
Save the file to these locations to make it available across all projects:
- Claude Code:
~/.claude/skills/wshobson/agents/binary-analysis-patterns/SKILL.md - Cursor:
~/.cursor/skills/wshobson/agents/binary-analysis-patterns/SKILL.md - Antigravity:
~/.gemini/antigravity/skills/wshobson/agents/binary-analysis-patterns/SKILL.md
🚀 Install with CLI:npx skills add wshobson/agents
