x86 Assembly

Assembly Language:

The lowest-level human-readable language, and the highest level to which a compiled binary can be reliably translated back.

Importance in Malware Analysis: Malware samples are compiled binaries, so the original C/C++ source is not available. Disassembling a sample recovers assembly code but not the original variable and function names, which are typically lost during compilation. Reading assembly is therefore essential for understanding a binary's behavior during reverse engineering.

Binary and Hex Representation:

  • Binary : Program code on disk/CPU is a sequence of 1s and 0s.
  • Hex : 8 bits = 1 byte, written as two hex digits, each covering 4 bits (e.g., 10100101 = 0xA5). Hex makes binary far more readable than raw 1s and 0s.
  • Instruction Composition: Hex sequence includes opcodes (operations) and operands (targets).
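As a quick check of the binary/hex relationship above, the following Python sketch converts the example byte between notations. Python is used only as a calculator here, and the variable names are illustrative.

```python
# One byte is 8 bits, and each hex digit covers 4 of those bits (a nibble).
value = 0b10100101

hex_text = format(value, "#04x")   # hex with a 0x prefix, two digits
bin_text = format(value, "08b")    # binary, padded to 8 bits
high_nibble = value >> 4           # upper 4 bits: 1010 -> 0xA
low_nibble = value & 0x0F          # lower 4 bits: 0101 -> 0x5

print(hex_text, bin_text, hex(high_nibble), hex(low_nibble))
```

The two nibbles 0xA and 0x5 are exactly the two hex digits of 0xA5.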

Opcodes:

Hex values representing CPU instructions.

A disassembler is a program that translates opcodes into human-readable assembly instructions.

  • Example:

    Assembly: mov eax,0x5f (move 0x5f into EAX register).

    Disassembled: 040000: b8 5f 00 00 00

    • 040000: Memory address of the instruction.
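    • b8: Opcode for mov eax, imm32 (load a 32-bit immediate into EAX).
    • 5f 00 00 00: The operand 0x0000005f in little-endian order, least significant byte first (most significant first it would read 00 00 00 5f).

Endianness: Little-endian stores the least significant byte first, so the 32-bit value 0x0000005f appears as 5f 00 00 00.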

Note: Disassemblers handle opcode translation; manual conversion is rarely needed.
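Purely for illustration, the encoding in the example above can be reproduced by hand. This is a minimal sketch that assumes only the single form `mov eax, imm32` (opcode B8); the function name is made up for this example.

```python
import struct

def encode_mov_eax_imm32(value):
    """Return the machine-code bytes for `mov eax, imm32`."""
    opcode = b"\xb8"                      # B8 = mov eax, imm32
    operand = struct.pack("<I", value)    # "<I" = little-endian unsigned 32-bit
    return opcode + operand

encoded = encode_mov_eax_imm32(0x5F)
print(encoded.hex(" "))                   # b8 5f 00 00 00
```

Passing a value such as 0x12345678 makes the byte reversal easier to see: the operand bytes come out as 78 56 34 12.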

Operands

Targets or data for opcodes in instructions.

Types:

Immediate Operands: Fixed/constant values (e.g., 0x5f in mov eax, 0x5f).

Register Operands: CPU registers (e.g., eax in mov eax, 0x5f).

Memory Operands: Memory locations, denoted by square brackets (e.g., [eax] means the address stored in EAX). Operation targets the data at that memory address.

Instructions:

Instructions direct the CPU to perform operations on operands (registers, memory, or immediate values) and store the results in registers or memory. Reading them is essential for interpreting disassembled malware code.

MOV (Move):

Copies a value from source to destination.

Syntax: mov destination, source

  • Examples:

    mov eax, 0x5f : Moves immediate value 0x5f into EAX.

    mov ebx, eax : Copies EAX value into EBX.

    mov eax, [0x5fc53e] : Moves value at memory address 0x5fc53e into EAX.

    mov eax, [ebx] (where EBX = 0x5fc53e) : Same as above, using register as memory pointer.

    mov eax, [ebp+4] : Moves value at memory address (EBP + 4) into EAX (offset arithmetic).

  • Notes: Square brackets denote memory references; supports arithmetic in addressing.

LEA (Load Effective Address):

Loads the computed address of the source (not the data stored there) into the destination. Because it can add and scale registers in a single instruction, compilers often use it for general arithmetic (e.g., addition or multiplication by small constants).

  • Syntax: lea destination, source

  • Example:

    lea eax, [ebp+4]: EAX = address (EBP + 4), not the value at that address.

    By contrast, mov eax, [ebp+4] would load the data stored at (EBP + 4).
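The difference between LEA and MOV can be sketched with a toy machine state. The register values, the address 0x1000, and the stored value below are all made up for illustration; this is a simplified model, not how a real emulator is structured.

```python
# Toy state: register values and a sparse "memory" keyed by address.
regs = {"eax": 0, "ebp": 0x1000}
memory = {0x1004: 0xDEADBEEF}   # hypothetical 32-bit value stored at EBP + 4

def lea(base, offset):
    """lea dst, [base+offset]: compute the address only, no memory access."""
    return (regs[base] + offset) & 0xFFFFFFFF

def mov_load(base, offset):
    """mov dst, [base+offset]: compute the address, then read memory there."""
    return memory[lea(base, offset)]

address = lea("ebp", 4)       # the address itself
value = mov_load("ebp", 4)    # the data stored at that address
print(hex(address), hex(value))
```

Both instructions evaluate the same bracketed expression; only MOV follows the resulting address into memory.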

NOP (No Operation):

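Does nothing and simply moves execution to the next instruction (the one-byte opcode 0x90 is technically xchg eax, eax, which exchanges EAX with itself). It still consumes CPU cycles, so it can serve as padding or a short delay. Malware and exploits use it to build a NOP sled: a run of NOPs placed before shellcode so that execution still reaches the shellcode when control flow is redirected imprecisely.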

  • Syntax: nop

Shift Instructions:

Shifts the bits in the destination by count positions. Shifting is a faster alternative to multiplication or division by powers of 2 (shl multiplies by 2 and shr divides an unsigned value by 2 for each bit shifted). Vacated bit positions are filled with zeros, and the last bit shifted out is stored in the Carry Flag (CF).

  • Syntax:

    shr destination, count (Shift Right)

    shl destination, count (Shift Left)

  • Examples:

    shl eax, 1: If EAX = 00000010 (2), becomes 00000100 (4).

    shr eax, 1: If EAX = 00000101 (5), becomes 00000010 (2), CF = 1.
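The two examples above can be reproduced with a small model of 32-bit shifts. This sketch assumes a count between 1 and 31 and returns the Carry Flag alongside the result; the function names simply mirror the mnemonics.

```python
MASK32 = 0xFFFFFFFF

def shl(value, count):
    """Shift left on a 32-bit value; returns (result, carry_flag)."""
    carry = (value >> (32 - count)) & 1      # last bit pushed out on the left
    return (value << count) & MASK32, carry

def shr(value, count):
    """Logical shift right on a 32-bit value; returns (result, carry_flag)."""
    carry = (value >> (count - 1)) & 1       # last bit pushed out on the right
    return (value & MASK32) >> count, carry

print(shl(0b00000010, 1))   # (4, 0)
print(shr(0b00000101, 1))   # (2, 1)
```

The low bit of 5 is lost from the result of shr but survives in CF, which is why code can test it afterwards.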

Rotate Instructions:

Rotates bits in the destination by count positions; shifted-out bits wrap around to the other end. Useful for bit manipulation without data loss.

  • Syntax:

    ror destination, count (Rotate Right)

    rol destination, count (Rotate Left)

  • Examples:

    ror al, 1: If AL = 10101010, becomes 01010101.

    rol al, 1: If AL = 01010101, becomes 10101010.
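A rotate can be modelled as two shifts combined with OR. The sketch below works on 8-bit values to match the AL examples above; the helper names are illustrative.

```python
def ror8(value, count):
    """Rotate an 8-bit value right; bits leaving on the right re-enter on the left."""
    count %= 8
    return ((value >> count) | (value << (8 - count))) & 0xFF

def rol8(value, count):
    """Rotate an 8-bit value left; bits leaving on the left re-enter on the right."""
    count %= 8
    return ((value << count) | (value >> (8 - count))) & 0xFF

print(format(ror8(0b10101010, 1), "08b"))   # 01010101
print(format(rol8(0b01010101, 1), "08b"))   # 10101010
```

Because no bits are discarded, rotating left by n undoes rotating right by n.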

Flags Register (EFLAGS):

A special register in the x86 architecture whose individual bits (flags) reflect the outcome of arithmetic, logical, or other operations. The flags are used to track results and control program flow (e.g., conditional jumps), and they are updated after arithmetic/logical operations (e.g., add, sub, shl). Understanding them is critical for interpreting assembly code behavior, especially in malware analysis.

| Flag | Abbreviation | Explanation |
|---|---|---|
| Carry | CF | Set (1) if a carry-out or borrow occurs from the most significant bit in arithmetic or shifts. |
| Parity | PF | Set (1) if the least significant byte of the result has an even number of 1 bits. |
| Auxiliary | AF | Set (1) if a carry-out or borrow occurs from bit 3 to bit 4 (used in BCD arithmetic). |
| Zero | ZF | Set (1) if the result of an operation is zero. |
| Sign | SF | Set (1) if the result is negative (most significant bit = 1). |
| Overflow | OF | Set (1) if signed arithmetic overflows (e.g., positive + positive = negative). |
| Direction | DF | Controls string operations: 0 = forward, 1 = backward. |
| Interrupt Enable | IF | Enables (1) or disables (0) maskable hardware interrupts. |

Malware may manipulate flags (e.g., CF in shifts, IF for interrupt control) to obscure logic or detect debugging.
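To make the status flags concrete, here is a rough model of how a 32-bit add would set them. It is a simplified sketch covering only the six arithmetic flags; the function and helper names are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF

def sign_bit(x):
    """Most significant bit of a 32-bit value."""
    return (x >> 31) & 1

def add32_flags(a, b):
    """Add two 32-bit values and return (result, flags) as `add` would set them."""
    full = (a & MASK32) + (b & MASK32)
    result = full & MASK32
    flags = {
        "CF": int(full > MASK32),                            # carry out of bit 31
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),   # even parity of low byte
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),              # carry from bit 3 to bit 4
        "ZF": int(result == 0),
        "SF": sign_bit(result),
        "OF": int(sign_bit(a) == sign_bit(b) and sign_bit(result) != sign_bit(a)),
    }
    return result, flags

print(add32_flags(0xFFFFFFFF, 1))   # wraps to 0: CF and ZF are set
print(add32_flags(0x7FFFFFFF, 1))   # positive + positive = negative: OF and SF are set
```

The two calls show why CF and OF are separate: the first is an unsigned overflow, the second a signed one.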

Arithmetic Instructions:

ADD (Addition):

Adds value to destination and stores the result in destination. The value can be an immediate (constant), a register, or a memory operand.

Syntax: add destination, value

Ex. add eax, 5 → EAX += 5.

SUB (Subtraction):

Subtracts value from destination and stores the result in destination. Flags affected include ZF (Zero Flag): Set (1) if result = 0. CF (Carry Flag): Set (1) if destination < value (unsigned borrow).

Syntax: sub destination, value

Ex. sub eax, 3 → EAX -= 3.

MUL (Multiplication):

Multiplies EAX by value (unsigned); the 64-bit result is stored in EDX:EAX, with the lower 32 bits in EAX and the upper 32 bits in EDX. The value must be a register or memory operand; mul does not accept an immediate. Note: Check how EAX and EDX were set beforehand, as they are used implicitly.

Syntax: mul value

Ex. EAX = 0x2, EBX = 0x3: mul ebx → EDX = 0x00000000, EAX = 0x00000006 (6).

DIV (Division):

Divides the 64-bit value in EDX:EAX by value (unsigned). Quotient → EAX. Remainder → EDX. The value must be a register or memory operand; div does not accept an immediate. Note: EDX:EAX must be set up beforehand.

Syntax: div value

Ex. EDX:EAX = 0x0000000000000008, ECX = 0x2: div ecx → EAX = 4, EDX = 0.
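The implicit use of EDX:EAX by MUL and DIV can be sketched as follows. This models only the unsigned 32-bit forms, and the function names and the exceptions raised are choices made for this example.

```python
MASK32 = 0xFFFFFFFF

def mul32(eax, operand):
    """Unsigned mul: returns (edx, eax) holding the 64-bit product."""
    product = (eax & MASK32) * (operand & MASK32)
    return (product >> 32) & MASK32, product & MASK32

def div32(edx, eax, operand):
    """Unsigned div of EDX:EAX by operand: returns (quotient, remainder)."""
    if operand == 0:
        raise ZeroDivisionError("real hardware raises a divide error here")
    dividend = ((edx & MASK32) << 32) | (eax & MASK32)
    quotient, remainder = divmod(dividend, operand)
    if quotient > MASK32:
        raise OverflowError("quotient does not fit in EAX")
    return quotient, remainder

print(mul32(0x2, 0x3))               # (0, 6): EDX = 0, EAX = 6
print(mul32(0xFFFFFFFF, 0x2))        # the product spills into EDX
print(div32(0x0, 0x8, 0x2))          # (4, 0): EAX = 4, EDX = 0
```

The second call shows why EDX matters: a product that does not fit in 32 bits leaves its upper half there.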

INC (Increment):

Increments register by 1.

Syntax: inc register

Ex. inc eax → EAX += 1.

DEC (Decrement):

Decrements register by 1.

Syntax: dec register

Ex. dec eax → EAX -= 1.

Logical Instructions:

AND (Bitwise AND):

Performs bitwise AND; 1 if both bits are 1, else 0.

Syntax: and destination, value

OR (Bitwise OR):

Performs bitwise OR; 1 if at least one bit is 1, else 0.

Syntax: or destination, value

NOT (Bitwise NOT):

Inverts all bits in register (1 → 0, 0 → 1).

Syntax: not register

XOR (Bitwise XOR):

Performs bitwise XOR; 1 if bits differ, 0 if same. Use Case: xor reg, reg zeros a register efficiently (a shorter encoding than mov reg, 0, and at least as fast).

Syntax: xor destination, value
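The four logical instructions map directly onto Python's bitwise operators, which makes them easy to experiment with. The two 8-bit input values below are arbitrary examples.

```python
MASK8 = 0xFF
a = 0b11001100
b = 0b10101010

and_result = a & b         # 1 only where both bits are 1
or_result = a | b          # 1 where at least one bit is 1
xor_result = a ^ b         # 1 where the bits differ
not_result = ~a & MASK8    # invert every bit; the mask keeps the value 8 bits wide
zeroed = a ^ a             # any value XORed with itself is 0 (the xor reg, reg idiom)

for name, value in [("and", and_result), ("or", or_result),
                    ("xor", xor_result), ("not", not_result), ("zero", zeroed)]:
    print(name, format(value, "08b"))
```

The mask on NOT is needed only because Python integers are unbounded; a real register has a fixed width.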

Conditionals:

Purpose: Enable the CPU to compare values (e.g., equal, greater, less) and alter program flow based on the result. This involves comparison instructions (TEST, CMP) and branching instructions (JMP, conditional jumps). Malware often uses these for decision-making, loops, or anti-analysis tricks.

TEST:

Performs a bitwise AND and sets flags (e.g., ZF) but does not store the result. Commonly used to check whether an operand is zero (NULL), e.g., test eax, eax. Flags Affected: ZF, SF, PF (Zero, Sign, Parity).

Syntax: test destination, source

Ex. test al, 0x7c → ZF = 1 if the result of AL AND 0x7c is 0.

CMP (Compare):

Subtracts source from destination to set flags, but does not modify either operand. Flags Affected: ZF (Zero Flag): Set (1) if destination = source. CF (Carry Flag): Set (1) if source > destination (unsigned). Both are cleared if destination > source (unsigned).

Syntax: cmp destination, source

Ex. cmp eax, ebx

if EAX = EBX → ZF = 1.

if EBX > EAX → CF = 1.

if EAX > EBX → ZF = 0, CF = 0.

Branching:

Alters the Instruction Pointer (EIP/RIP) to change control flow from linear to conditional paths. Types: Unconditional (JMP) and conditional jumps.

JMP (Unconditional Jump):

Sets Instruction Pointer to location; next instruction fetched from there.

Syntax: jmp location

Ex. jmp 0x401000 → Execution jumps to address 0x401000.

Conditional Jumps:

Jump to location only if a specific flag condition is met (often after CMP or TEST).

Syntax: j<condition> location (e.g., jz 0x401000).

  • Common Instructions:
| Instruction | Condition | Explanation |
|---|---|---|
| jz | ZF = 1 | Jump if zero (equal after CMP). |
| jnz | ZF = 0 | Jump if not zero (not equal). |
| je | ZF = 1 | Jump if equal (same as jz). |
| jne | ZF = 0 | Jump if not equal (same as jnz). |
| jg | ZF = 0 and SF = OF | Jump if greater (signed comparison). |
| jl | SF ≠ OF | Jump if less (signed comparison). |
| jge | SF = OF | Jump if greater or equal (signed). |
| jle | ZF = 1 or SF ≠ OF | Jump if less or equal (signed). |
| ja | CF = 0 and ZF = 0 | Jump if above (unsigned comparison). |
| jb | CF = 1 | Jump if below (unsigned). |
| jae | CF = 0 | Jump if above or equal (unsigned). |
| jbe | CF = 1 or ZF = 1 | Jump if below or equal (unsigned). |

Signed vs. Unsigned: Signed jumps (jg, jl, etc.) use SF (Sign) and OF (Overflow) to compare values that may be negative. Unsigned jumps (ja, jb, etc.) use CF (Carry) to compare magnitudes.
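The signed/unsigned distinction is easiest to see by running the same comparison through both kinds of jump. The sketch below models cmp on 32-bit values and a handful of jump conditions; the `taken` helper and the `CONDITIONS` table are names invented for this example.

```python
MASK32 = 0xFFFFFFFF

def sign_bit(x):
    """Most significant bit of a 32-bit value."""
    return (x >> 31) & 1

def cmp32(dst, src):
    """Return the flags set by `cmp dst, src` (a subtraction that is not stored)."""
    dst &= MASK32
    src &= MASK32
    result = (dst - src) & MASK32
    return {
        "ZF": int(result == 0),
        "CF": int(dst < src),   # unsigned borrow
        "SF": sign_bit(result),
        "OF": int(sign_bit(dst) != sign_bit(src) and sign_bit(result) != sign_bit(dst)),
    }

CONDITIONS = {
    "je": lambda f: f["ZF"] == 1,
    "jne": lambda f: f["ZF"] == 0,
    "jg": lambda f: f["ZF"] == 0 and f["SF"] == f["OF"],
    "jl": lambda f: f["SF"] != f["OF"],
    "ja": lambda f: f["CF"] == 0 and f["ZF"] == 0,
    "jb": lambda f: f["CF"] == 1,
}

def taken(jump, dst, src):
    """Would `jump` be taken right after `cmp dst, src`?"""
    return CONDITIONS[jump](cmp32(dst, src))

# 0xFFFFFFFF is 4294967295 when unsigned but -1 when signed.
print(taken("ja", 0xFFFFFFFF, 1))   # True: above, as an unsigned comparison
print(taken("jg", 0xFFFFFFFF, 1))   # False: -1 is not greater than 1
print(taken("jl", 0xFFFFFFFF, 1))   # True: -1 is less than 1
```

When reading disassembly, the choice of ja/jb versus jg/jl is therefore a hint about whether the original variables were unsigned or signed.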

Stack:

The stack is a Last In, First Out (LIFO) memory region that stores variables, arguments, and control flow data (e.g., return addresses). Malware exploits the stack (e.g., buffer overflows) to hijack control flow.

Key Registers: ESP (Stack Pointer): Points to the top of the stack. EBP (Base Pointer): References the stack frame base.

PUSH:

Pushes source onto the stack: ESP is decremented first (the stack grows downward), then the value is stored at the new ESP. The source can be a register, an immediate, or a memory value. pusha/pushad often indicate shellcode saving register state.

Syntax: push source

pusha: Pushes all 16-bit general-purpose registers (order: AX, CX, DX, BX, original SP, BP, SI, DI).

pushad: Pushes all 32-bit general-purpose registers (order: EAX, ECX, EDX, EBX, original ESP, EBP, ESI, EDI).

POP:

Retrieves the value at the top of the stack (the address in ESP) into destination, then increments ESP so it points to the new top, reflecting the stack shrinking.

Syntax: pop destination

popa: Pops into 16-bit registers (order: DI, SI, BP, BX, DX, CX, AX); the saved SP value is skipped.

popad: Pops into 32-bit registers (order: EDI, ESI, EBP, EBX, EDX, ECX, EAX); the saved ESP value is skipped.
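The way ESP moves during PUSH and POP can be sketched with a toy stack. The starting ESP of 0x1000 and the `Stack` class are assumptions for illustration; the model handles only 32-bit values.

```python
class Stack:
    """Toy model of a downward-growing 32-bit stack."""

    def __init__(self, esp=0x1000):
        self.esp = esp        # hypothetical starting stack pointer
        self.memory = {}      # address -> 32-bit value

    def push(self, value):
        self.esp -= 4                         # make room first (stack grows down)
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self):
        value = self.memory[self.esp]         # read the current top
        self.esp += 4                         # then shrink the stack
        return value

stack = Stack()
stack.push(0x11)
stack.push(0x22)
esp_after_pushes = stack.esp      # two pushes lower ESP by 8 bytes
first = stack.pop()               # last value pushed comes out first (LIFO)
second = stack.pop()
print(hex(esp_after_pushes), hex(first), hex(second), hex(stack.esp))
```

After a matching number of pushes and pops, ESP is back where it started, which is what a balanced function is expected to leave behind.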

CALL:

Initiates a function call: pushes the return address (the address of the next instruction) onto the stack, adjusting ESP accordingly, then jumps to location. Malware often uses it to invoke malicious routines or APIs.

Syntax: call location

Process:

  • Arguments placed on stack or in registers (per calling convention).
  • Function prologue sets up stack frame (EBP, ESP adjustments).
  • Epilogue restores stack for caller.
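The return-address mechanics of CALL can be sketched on a toy machine state. The addresses below are made up, the 5-byte length assumes a near call encoded as E8 plus a 32-bit offset, and `ret` is included only as the counterpart that consumes the pushed address; it is not covered in the notes above.

```python
def call(state, target, instruction_length=5):
    """Simulate `call target`: push the return address, then jump."""
    return_address = state["eip"] + instruction_length   # address of the next instruction
    state["esp"] -= 4
    state["stack"][state["esp"]] = return_address
    state["eip"] = target

def ret(state):
    """Simulate `ret`: pop the return address back into EIP."""
    state["eip"] = state["stack"][state["esp"]]
    state["esp"] += 4

# Hypothetical addresses for illustration only.
state = {"eip": 0x401000, "esp": 0x1000, "stack": {}}
call(state, 0x402000)
eip_in_callee = state["eip"]
saved_return = state["stack"][state["esp"]]
ret(state)
print(hex(eip_in_callee), hex(saved_return), hex(state["eip"]), hex(state["esp"]))
```

Because the return address lives on the stack next to local data, overwriting it is what lets a buffer overflow redirect execution.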