An array is a fundamental data structure that stores elements of the same type in contiguous memory locations, enabling O(1) random access via an integer index computed as base_address + index × element_size. Arrays are the basis for many higher-level structures (stacks, queues, heaps, hash tables) and are cache-efficient due to spatial locality. In advanced usage, multi-dimensional arrays, dynamic arrays (vectors), and circular arrays extend the basic model to support resizing, matrix operations, and efficient queue implementations.
address(A[i]) = base(A) + i × sizeof(element)
LaTeX: \text{address}(A[i]) = \text{base}(A) + i \times \text{sizeof}(\text{element})
| Symbol | Meaning | Unit |
|---|---|---|
| \text{address}(A[i]) | Memory address of the i-th element | bytes |
| \text{base}(A) | Memory address of first element A[0] | bytes |
| i | Zero-based index of the desired element | dimensionless |
| \text{sizeof}(\text{element}) | Size of each element in bytes | bytes |
Problem
An integer array A starts at memory address 1000. Each integer occupies 4 bytes. Find the address of A[7] and calculate the time to access A[7] vs a linked list of 8 nodes.
Solution
Step 1: Apply address formula: address(A[7]) = 1000 + 7 × 4 = 1000 + 28 = 1028. Step 2: Array access A[7]: 1 arithmetic operation + 1 memory dereference = O(1) time. Step 3: Linked list access at index 7: must traverse 7 next-pointers from head = O(n) time = 7 pointer dereferences. Step 4: For n=1000 elements, array still takes O(1), linked list takes up to O(1000).
Answer
A[7] is at address 1028; array access O(1) vs linked list O(n) — array is 7× faster for this access
| Operation | Static Array | Dynamic Array (avg) | Dynamic Array (worst) | Notes |
|---|---|---|---|---|
| Access by index | O(1) | O(1) | O(1) | Computed directly from base address |
| Search (unsorted) | O(n) | O(n) | O(n) | Linear scan required |
| Search (sorted) | O(log n) | O(log n) | O(log n) | Binary search applicable |
| Insert at end | O(1) | O(1) amortised | O(n) | Dynamic array may resize |
| Insert at position i | O(n) | O(n) | O(n) | Requires shifting elements |
| Delete at position i | O(n) | O(n) | O(n) | Requires shifting elements |
Wikimedia Commons, CC BY-SA
A linked list is a linear data structure in which elements (nodes) are stored at arbitrary memory locations, with each node containing a data field and one or more pointers linking it to adjacent nodes. Unlike arrays, linked lists do not require contiguous memory, enabling O(1) insertion and deletion at known positions without shifting elements, but they sacrifice O(1) random access (requiring O(n) traversal) and incur pointer overhead per node. Variants include singly linked lists, doubly linked lists, circular linked lists, and skip lists, each trading memory for specific operation efficiency.
Dynamic Programming (DP) is an algorithmic paradigm that solves complex problems by breaking them into overlapping subproblems, solving each subproblem only once, and storing results to avoid redundant computation (memoization or tabulation). It applies when a problem exhibits two key properties: optimal substructure (optimal solution contains optimal solutions to subproblems) and overlapping subproblems. DP is used in sequence alignment, shortest-path problems, knapsack optimization, and compiler design.
A hash table (also called a hash map) is a data structure that implements an associative array by using a hash function to map keys to array indices (buckets), enabling average-case O(1) time for insertion, deletion, and lookup operations. When multiple keys hash to the same index (a collision), the table resolves it via chaining (each bucket holds a linked list of entries) or open addressing (probing for an alternate empty slot). Hash tables underlie Python dictionaries, Java HashMaps, database indexing, caches, and symbol tables in compilers, making them one of the most widely used data structures in software engineering.
The word "array" comes from Old French "arreer" meaning "to put in order" or "to arrange," derived from a Germanic root. In computing, the term was established in the early FORTRAN specifications (1954–1957) where multi-dimensional arrays (called "subscripted variables") were a key feature. The systematic use of zero-based indexing was popularised by C (1972) and Dijkstra's 1982 note "Why numbering should start at zero."