Data Structures 101: Arrays and Lists
How to hold "data in a row" — the difference between arrays and linked lists, how to choose, and what Java's ArrayList / LinkedList look like inside
Arrays: rooms in a row
When a program needs to hold many pieces of data in order, the most basic structure is the array. An array reserves one contiguous region of memory and packs the data into it with no gaps.
index : 0 1 2 3 +------+------+------+------+ array : | 10 | 20 | 30 | 40 | +------+------+------+------+ address: 100 104 108 112 (if int = 4 bytes) address of a[2] = 100 + 2 * 4 = 108 -> one calculation: O(1)
Because the elements sit contiguously, the location of "element i" can be computed in one step (start address + i × element size). Access by index therefore takes constant time, O(1), no matter how many elements there are.
The weakness is that the reserved length cannot be changed afterwards — and inserting one element in the middle means shifting every element behind it one slot to the right.
Linked lists: connected by arrows
A linked list chains together nodes, each holding a value and a reference (an arrow) to the next node. Unlike an array, the nodes can live anywhere in memory.
head | v +------+------+ +------+------+ +------+------+ | 10 | next-+---> | 20 | next-+---> | 30 | null | +------+------+ +------+------+ +------+------+ reaching element i means walking from the head: O(n)
Its strength is modification. Once you are at the right place, both insertion and deletion are just re-pointing arrows — nothing needs to be shifted. In exchange, reaching "element i" means walking from the head one node at a time — time proportional to the number of elements (written O(n) when there are n elements).
Terminology alert: "list" means different things in different languages
Python's list, despite the name, is an "array that grows automatically" inside (a dynamic array — explained in the next section). Java's List is the name of an interface, not a data structure (more on this below). In this article, "list" means a linked list.
Comparing time complexity
In practice we usually reach for a dynamic array — an array plus the convenience of growing automatically (Java's ArrayList, Python's list, and so on). Let us line up all three and compare the cost of the main operations — their time complexity, a measure of how processing time grows when there are n elements. Note that practical linked lists (such as Java's LinkedList) also keep a reference to the tail, so appending at the end needs no walk and is O(1).
| Aspect | Array (fixed length) | Dynamic array (ArrayList) | Linked list (LinkedList) |
|---|---|---|---|
| Access element i | O(1) | O(1) | O(n) |
| Append at the end | — (fixed length) | Amortized O(1) | O(1) |
| Insert at the head | O(n) (if there is spare room) | O(n) | O(1) |
| Insert / delete in the middle | O(n) (if there is spare room) | O(n) | Re-pointing is O(1) (walking there is O(n)) |
| Memory layout | Contiguous, no waste | Contiguous, with spare capacity | Scattered, extra space per reference |
Appending to a dynamic array is "amortized O(1)" because copying happens only when the array is full: a larger array is allocated and everything is copied over. Each copy is expensive, but it happens less and less often, so the average cost per append stays constant.
How to choose: mostly reads, or mostly writes?
The deciding question is how the data will be used. If reads (index lookups) dominate, choose the array family; if insertions and deletions at the head or in the middle (writes) dominate, a linked list is the theoretical answer.
Insert 5 at the head of an array: everything shifts right O(n) before: | 10 | 20 | 30 | 40 | after : | 5 | 10 | 20 | 30 | 40 | -> -> -> -> Insert 5 at the head of a linked list: re-point one arrow O(1) before: head -> [10] -> [20] -> [30] after : head -> [5] -> [10] -> [20] -> [30]
| If your usage is… | Choose |
|---|---|
| Frequent reads by index (read-heavy) | Array / ArrayList |
| Mostly appending at the end | ArrayList (amortized O(1)) |
| Mostly inserting / deleting at the head or middle (write-heavy) | LinkedList (beware the cost of walking there) |
| Pushing and popping at both ends (queue / deque) | ArrayDeque (the Java standard choice) |
| Not sure | ArrayList |
On modern CPUs, arrays are the default
One more practical factor hides behind the complexity table. CPUs read memory into caches in blocks, so an array laid out contiguously benefits from prefetching and runs very fast, while a linked list scattered across memory rarely stays in cache. Even at equal Big-O, the array family usually wins in measurements — which is why "when in doubt, use a dynamic array" is the practical default.
A closer look: inside Java's ArrayList and LinkedList
In Java, everything above shows up directly in the type names. List is an interface — a contract of what you can do: add, get, count — while the implementing class decides how the data is actually held.
// "List" is an interface (a contract of what you can do) // ArrayList / LinkedList are its implementations (actual structures) List<String> a = new ArrayList<>(); // an array inside List<String> b = new LinkedList<>(); // a doubly linked list inside
Inside, ArrayList is exactly what the name says: an array. Elements live in an Object[] called elementData; when it fills up, a new array about 1.5 times longer is allocated and everything is copied over (the initial capacity is 10, allocated on the first add).
// Inside ArrayList (simplified from OpenJDK) public class ArrayList<E> { Object[] elementData; // the data is just an array int size; // how many elements are actually stored public boolean add(E e) { if (size == elementData.length) { // full: allocate ~1.5x array and copy everything over int oldCapacity = elementData.length; int newCapacity = oldCapacity + (oldCapacity >> 1); elementData = Arrays.copyOf(elementData, newCapacity); } elementData[size++] = e; return true; } }
That is why ArrayList's get(i) is O(1), appending is amortized O(1), and inserting or deleting in the middle is O(n), with a copy that shifts the elements behind it — exactly the "dynamic array" column of the table above.
LinkedList, on the other hand, is a doubly linked list inside. Each node holds references to both its neighbors, and the list object itself remembers only two things: the head (first) and the tail (last).
// Inside LinkedList (simplified from OpenJDK) public class LinkedList<E> { Node<E> first; // head node Node<E> last; // tail node int size; private static class Node<E> { E item; // the value Node<E> next; // reference to the next node Node<E> prev; // reference to the previous node (doubly linked) } // get(i) walks from the closer end (still O(n)) Node<E> node(int index) { if (index < (size >> 1)) { // first half: from the head Node<E> x = first; for (int i = 0; i < index; i++) x = x.next; return x; } else { // second half: from the tail Node<E> x = last; for (int i = size - 1; i > index; i--) x = x.prev; return x; } } }
Being doubly linked, get(i) walks from whichever end is closer — the head for the first half, the tail for the second half. A clever trick, but still O(n). Adding or removing at either end is O(1), which also makes it usable as a queue or deque.
A practical note
In Java too, the default when in doubt is ArrayList. LinkedList wins only in limited scenarios such as frequent insertion and deletion while traversing with an iterator — and if all you need is a deque, ArrayDeque is usually faster.
Summary
An array remembers by location: contiguous memory and index arithmetic make reads O(1), at the cost of slow insertion and deletion.
A linked list remembers by connection: modifications are just re-pointed arrows, at the cost of O(n) reads.
Read-heavy → array family; write-heavy → consider a linked list. But thanks to CPU caches, the practical default is the dynamic array (ArrayList).
Java's List is an interface (a contract of what you can do); ArrayList / LinkedList are implementations (the actual structures). The names tell you what is inside.
Keep reading: the data structures series
Stacks →
Push on top, pop from the top. The LIFO discipline and where it shines: undo, function calls, depth-first search.
Queues →
First come, first served. The FIFO discipline, the ring buffer, and where queues shine: BFS and task processing.
Trees →
Hierarchy in branches. O(log n) binary search trees and what is inside Java's TreeMap.