x86 Architecture
CPU architecture:
Control Unit:
Retrieves instructions from Main Memory, which sits outside the CPU. The Instruction Pointer (IP) stores the address of the next instruction to execute. On 32-bit systems it is called EIP (Extended Instruction Pointer). On 64-bit systems it is called RIP (Register Instruction Pointer).
Arithmetic Logic Unit (ALU):
Executes the instruction fetched from Memory. The results are then stored in either Registers or Memory.
Registers:
The CPU's internal storage for fast data access. They are much smaller and faster than Main Memory. They hold the data that instructions need immediately and speed up processing by keeping that data inside the CPU.
Memory (Main Memory / RAM):
Stores all code and data needed for a program. When a program runs, its code and data are loaded into Memory. The CPU fetches instructions from Memory one at a time.
I/O Devices:
Facilitate interaction between the computer and the user/environment. Examples: Input: Keyboard, Mouse. Output: Displays, Printers. Storage: Hard Disks, USB drives.
Registers:
The CPU's internal storage medium. They offer faster data access than other storage (e.g., RAM), but they are limited in size, so they must be used efficiently.
Types:
- Instruction Pointer
- General-Purpose Registers
- Status Flag Registers
- Segment Registers
Instruction Pointer (IP):
Holds the memory address of the next instruction to execute. Also called the Program Counter (PC).
16-bit: IP (e.g., Intel 8086, origin of x86). 32-bit: EIP (Extended Instruction Pointer). 64-bit: RIP (Register Instruction Pointer).
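The role of the Instruction Pointer can be sketched with a toy fetch-and-execute loop. This is only an illustration: the three-instruction "machine" (add, jmp, halt), the `run` function, and the sample program are all made up for this note and are not real x86 instructions.

```python
def run(program):
    """Execute a toy program; return the accumulator and the IP values visited."""
    ip = 0        # instruction pointer: index of the next instruction
    acc = 0       # a single accumulator register
    trace = []    # every IP value at fetch time
    while ip < len(program):
        op, arg = program[ip]   # fetch the instruction the IP points to
        trace.append(ip)
        ip += 1                 # IP now holds the address of the next instruction
        if op == "add":
            acc += arg
        elif op == "jmp":
            ip = arg            # a jump simply overwrites the IP
        elif op == "halt":
            break
    return acc, trace

# Hypothetical program: instruction 2 is skipped by the jump at instruction 1.
program = [("add", 5), ("jmp", 3), ("add", 100), ("add", 1), ("halt", 0)]
acc, trace = run(program)
print(acc, trace)
```

The trace shows that control flow is nothing more than the sequence of values the IP takes, which is why overwriting a value that later lands in the IP changes what the program executes.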
General-Purpose Registers:
Used for general instruction execution. Their size depends on the system: 32-bit systems use 32-bit registers (e.g., EAX), and 64-bit systems extend them to 64 bits (e.g., RAX).
- EAX/RAX (Accumulator):
Stores results of arithmetic operations. Addressing: RAX (64-bit), EAX (32-bit), AX (16-bit), AH/AL (8-bit high/low).
- EBX/RBX (Base Register):
Stores base address for memory offsets. Addressing: RBX (64-bit), EBX (32-bit), BX (16-bit), BH/BL (8-bit).
- ECX/RCX (Counter Register):
Used in loops/counting operations. Addressing: RCX (64-bit), ECX (32-bit), CX (16-bit), CH/CL (8-bit).
- EDX/RDX (Data Register):
Used in multiplication/division operations. Addressing: RDX (64-bit), EDX (32-bit), DX (16-bit), DH/DL (8-bit).
- ESP/RSP (Stack Pointer):
Points to the top of the stack (with Stack Segment register). Addressing: RSP (64-bit), ESP (32-bit), SP (16-bit). There is no high-byte form like AH; the low byte (SPL) is addressable only in 64-bit mode.
- EBP/RBP (Base Pointer):
Accesses stack parameters (with Stack Segment register). Addressing: RBP (64-bit), EBP (32-bit).
- ESI/RSI (Source Index):
Used in string operations (with Data Segment register). Addressing: RSI (64-bit), ESI (32-bit).
- EDI/RDI (Destination Index):
Used in string operations (with Extra Segment register). Addressing: RDI (64-bit), EDI (32-bit).
- R8–R15:
Exclusive to 64-bit systems. Addressing: e.g., R8 (64-bit), R8D (32-bit), R8W (16-bit), R8B (8-bit). Suffixes: D (Double-word), W (Word), B (Byte).
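The addressing scheme above (RAX, EAX, AX, AH, AL) describes different views of the same 64 bits. A minimal sketch of that idea, using plain bit masks; the function name `register_views` and the sample value are illustrative only.

```python
MASK64 = (1 << 64) - 1

def register_views(rax: int) -> dict:
    """Return the narrower views x86 exposes on a 64-bit accumulator value."""
    rax &= MASK64
    return {
        "RAX": rax,                # full 64 bits
        "EAX": rax & 0xFFFFFFFF,   # low 32 bits
        "AX": rax & 0xFFFF,        # low 16 bits
        "AH": (rax >> 8) & 0xFF,   # bits 8..15 (high byte of AX)
        "AL": rax & 0xFF,          # bits 0..7 (low byte of AX)
    }

views = register_views(0x1122334455667788)
for name, value in views.items():
    print(f"{name:>3} = {value:#x}")
```

With this sample value, EAX is 0x55667788, AX is 0x7788, AH is 0x77 and AL is 0x88. The same masks apply to RBX, RCX and RDX; R8 to R15 follow the same pattern for their D, W and B forms but have no high-byte view.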
Status Flag Registers:
Provide feedback on the status of execution. On 32-bit systems the register is called EFLAGS (32-bit); on 64-bit systems it is called RFLAGS (64-bit). It is made up of individual flags, each 1 bit wide. Key flags:
- Zero Flag (ZF):
Set to 1 if result = 0 (e.g., RAX - RAX).
- Carry Flag (CF):
Set to 1 if the unsigned result does not fit in the destination register (e.g., 0xFFFFFFFF + 1 in a 32-bit register).
- Sign Flag (SF):
Set to 1 if result is negative (most significant bit = 1).
- Trap Flag (TF):
Set to 1 for debugging (CPU executes one instruction at a time). Used by malware to detect debuggers.
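How ZF, CF and SF follow from a result can be sketched for 32-bit addition and subtraction. This is a simplified model covering only these three flags; the helper names `add32_flags` and `sub32_flags` are made up for the example.

```python
MASK32 = 0xFFFFFFFF

def add32_flags(a: int, b: int) -> dict:
    """Model a 32-bit ADD and the ZF, CF and SF values it would produce."""
    full = (a & MASK32) + (b & MASK32)   # unbounded sum
    result = full & MASK32               # what actually fits in the register
    return {
        "result": result,
        "ZF": int(result == 0),          # result is zero
        "CF": int(full > MASK32),        # unsigned sum did not fit in 32 bits
        "SF": (result >> 31) & 1,        # copy of the most significant bit
    }

def sub32_flags(a: int, b: int) -> dict:
    """Model a 32-bit SUB; for subtraction CF signals an unsigned borrow."""
    a &= MASK32
    b &= MASK32
    result = (a - b) & MASK32
    return {
        "result": result,
        "ZF": int(result == 0),
        "CF": int(a < b),
        "SF": (result >> 31) & 1,
    }

print(add32_flags(0xFFFFFFFF, 1))   # wraps to 0: ZF and CF set
print(add32_flags(0x7FFFFFFF, 1))   # top bit becomes 1: SF set
print(sub32_flags(5, 5))            # a value minus itself: ZF set
```

Conditional jumps read these bits, so tracking which instruction last set them is a routine step when following branches in disassembly.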
Segment Registers:
Divide the flat memory space into segments for easier addressing. Size: 16-bit registers.
Types:
- CS (Code Segment): Points to the code section in memory.
- DS (Data Segment): Points to the data section in memory.
- SS (Stack Segment): Points to the stack in memory.
- ES, FS, GS (Extra Segments): Point to additional data sections. Combine with DS to create four distinct data sections.
Memory:
Memory Abstraction: Programs see a limited, abstracted view of Memory, not the full system Memory. The OS restricts access to only the Memory allocated to the program. Abstraction details are omitted for brevity; focus is on the program’s perspective. Relevance: Critical for reverse-engineering malware.
Code:
Contains the program's executable code (instructions for the CPU). PE File Reference: Corresponds to the .text section in a Portable Executable (PE) file. Has execute permissions, so the CPU is allowed to run the contents of this region as instructions.
Data:
Contains initialized data that is laid out when the program is built. PE File Reference: Corresponds to the .data section in a PE file. Examples: Global variables, static data. Characteristics: The section does not grow or shrink during execution, although a program can still overwrite the values of its writable globals.
Heap (Dynamic Memory):
Contains variables and data created and destroyed at runtime. Memory is allocated when a variable is created and freed when it is deleted. Purpose: Supports dynamic memory allocation.
Stack:
Contains local variables, arguments passed to functions, and the return address of the calling function. Significance in Malware Analysis: Because it stores the return address, it is tied to the CPU's control flow and is targeted by malware (e.g., via buffer overflows) to hijack execution.
Stack:
Stores arguments, local variables, and control flow data. Critical in malware analysis because overwriting its control flow data lets an attacker hijack execution. Structure: Last In First Out (LIFO) memory.
Key Registers
Stack Pointer (ESP/RSP):
Points to the top of the stack. Adjusts when elements are pushed (added) or popped (removed). 32-bit: ESP; 64-bit: RSP.
Base Pointer (EBP/RBP):
Constant reference address for the current stack frame. Tracks local variables and arguments. 32-bit: EBP; 64-bit: RBP.
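The way the Stack Pointer moves on push and pop can be sketched with a toy 32-bit stack. This assumes the usual x86 behaviour that the stack grows toward lower addresses; the class name `ToyStack` and the starting address 0x1000 are arbitrary choices for the example.

```python
class ToyStack:
    WORD = 4  # bytes per 32-bit stack slot

    def __init__(self, top: int = 0x1000):
        self.esp = top       # stack pointer: address of the current top
        self.memory = {}     # address -> 32-bit value

    def push(self, value: int) -> None:
        self.esp -= self.WORD                  # stack grows toward lower addresses
        self.memory[self.esp] = value & 0xFFFFFFFF

    def pop(self) -> int:
        value = self.memory.pop(self.esp)      # read the value at the top
        self.esp += self.WORD                  # stack shrinks toward higher addresses
        return value

stack = ToyStack()
stack.push(0xAAAA)
stack.push(0xBBBB)
esp_after_pushes = stack.esp     # 0x1000 - 8 = 0xFF8
first_popped = stack.pop()       # last value pushed comes out first (LIFO)
second_popped = stack.pop()
print(hex(esp_after_pushes), hex(first_popped), hex(second_popped), hex(stack.esp))
```

After both pops the Stack Pointer is back at its starting value, which is the invariant a well-behaved function maintains across its own pushes and pops.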
Stack Layout (High to Low Memory Addresses)
- Arguments: Passed to the function; located at higher addresses than the Return Address.
- Return Address: Address where the Instruction Pointer resumes after the function returns.
- Old Base Pointer: Base Pointer of the calling function, saved just below the Return Address; the current Base Pointer points at this slot.
- Local Variables: Stored at lower addresses than the Base Pointer; allocated during function execution.
- Stack Pointer: Marks the lowest used address (the top of the stack); moves as the stack grows and shrinks.
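The layout can be written out as offsets from the Base Pointer. The sketch below assumes a 32-bit frame with 4-byte slots and arguments pushed in reverse order (as in the common cdecl convention); the function name `frame_layout` is made up for the example, and real compilers may pad or reorder local variables.

```python
def frame_layout(num_args: int, num_locals: int, word: int = 4):
    """List (offset from EBP, label) pairs, from high to low addresses."""
    rows = []
    for i in range(num_args, 0, -1):
        rows.append((word * (i + 1), f"argument {i}"))   # above the return address
    rows.append((word, "return address"))
    rows.append((0, "saved (old) base pointer"))         # EBP points here
    for i in range(1, num_locals + 1):
        rows.append((-word * i, f"local variable {i}"))  # below EBP
    return rows

layout = frame_layout(num_args=2, num_locals=2)
for offset, label in layout:
    print(f"[ebp{offset:+d}] {label}")
```

In disassembly this is why positive offsets from EBP (such as [ebp+8]) usually indicate arguments and negative offsets (such as [ebp-4]) usually indicate local variables.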
Stack Buffer Overflow
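Malware writes past the end of a local variable (typically a fixed-size buffer). Because the Return Address sits at a higher address than the local variables, the excess bytes can overwrite it and redirect control flow to an attacker-chosen address.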
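The overwrite can be simulated safely on a byte array that stands in for one stack frame. This is a toy model, not an exploit: the 8-byte buffer size, the addresses, and the helper names are all assumptions chosen for illustration, and the frame is laid out from low to high addresses as buffer, saved Base Pointer, Return Address.

```python
import struct

BUF_SIZE = 8  # hypothetical size of the local buffer, in bytes

def make_frame(saved_ebp: int, return_address: int) -> bytearray:
    """Build a fake frame: [buffer][saved EBP][return address], low to high."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<II", saved_ebp, return_address))

def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Copy data into the buffer without a length check, like an unsafe strcpy."""
    for i, byte in enumerate(data):
        frame[i] = byte   # nothing stops i from running past BUF_SIZE

def read_return_address(frame: bytearray) -> int:
    # The return address sits after the buffer and the 4-byte saved EBP.
    return struct.unpack_from("<I", frame, BUF_SIZE + 4)[0]

frame = make_frame(saved_ebp=0xBFFFF000, return_address=0x08048400)
original = read_return_address(frame)

# 8 bytes fill the buffer, 4 clobber saved EBP, the last 4 replace the return address.
payload = b"A" * BUF_SIZE + b"B" * 4 + struct.pack("<I", 0xDEADBEEF)
unchecked_copy(frame, payload)
hijacked = read_return_address(frame)
print(hex(original), "->", hex(hijacked))
```

An input of 8 bytes or fewer leaves the Return Address untouched, which is the whole point of bounds checking.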
Function Prologue
Prepares the stack for function execution.
- Steps:
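- Arguments are pushed onto the stack by the caller.
- Return Address is pushed by the call instruction.
- Old Base Pointer is pushed.
- Base Pointer is set to the current Stack Pointer, which now points at the saved Old Base Pointer.
- Stack Pointer is lowered to reserve space for local variables and keeps adjusting as the function uses the stack.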
Function Epilogue
Restores the stack after function execution.
- Steps:
- Stack Pointer is set back to the Base Pointer, discarding the local variables.
- Old Base Pointer is popped into the Base Pointer.
- Return Address is popped into the Instruction Pointer.
- Each pop raises the Stack Pointer, leaving it at the new top of the stack.
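The prologue and epilogue can be traced together on a toy 32-bit CPU. The sketch assumes the classic frame-pointer sequence (push ebp / mov ebp, esp / sub esp, N, then mov esp, ebp / pop ebp / ret) with caller-pushed arguments; the class `ToyCPU` and every address in it are hypothetical.

```python
class ToyCPU:
    def __init__(self):
        self.esp = 0x1000      # stack pointer
        self.ebp = 0x2000      # pretend base pointer of the caller's frame
        self.eip = 0x400000    # instruction pointer
        self.mem = {}          # address -> 32-bit value

    def push(self, value: int) -> None:
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self) -> int:
        value = self.mem[self.esp]
        self.esp += 4
        return value

    def call(self, target: int, return_address: int, args: list) -> None:
        for arg in reversed(args):    # caller pushes arguments, last one first
            self.push(arg)
        self.push(return_address)     # CALL pushes the return address
        self.eip = target

    def prologue(self, local_bytes: int) -> None:
        self.push(self.ebp)           # push ebp     (save old base pointer)
        self.ebp = self.esp           # mov ebp, esp (new frame base)
        self.esp -= local_bytes       # sub esp, N   (room for locals)

    def epilogue(self) -> None:
        self.esp = self.ebp           # mov esp, ebp (discard locals)
        self.ebp = self.pop()         # pop ebp      (restore old base pointer)
        self.eip = self.pop()         # ret          (return address -> EIP)

cpu = ToyCPU()
cpu.call(target=0x401000, return_address=0x400005, args=[1, 2])
cpu.prologue(local_bytes=8)
first_arg = cpu.mem[cpu.ebp + 8]      # arguments sit above the return address
saved_ret = cpu.mem[cpu.ebp + 4]
cpu.epilogue()
print(hex(cpu.esp), hex(cpu.ebp), hex(cpu.eip))
```

After the epilogue the Base Pointer and Instruction Pointer hold the caller's values again. The Stack Pointer still sits below the two arguments; in this sketch the caller is assumed to remove them afterwards.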