Java Collections Tutorial (Java 2 Foundations, 2026 Practice)

I’ve spent a lot of time maintaining Java systems that started on Java 2 and still run critical workloads today. When I open a codebase like that, I’m rarely looking at fancy new frameworks first. I’m looking at how data is stored, passed around, and modified. Collections are the backbone of those decisions. If you choose the wrong collection, you pay for it with latency, memory churn, or correctness bugs that only show up in production.

In this tutorial, I focus on the Java Collections Framework as it exists in the Java 2 era and how we use it today in 2026. I’ll walk you through the core interfaces, the most practical implementations, and the trade-offs that actually matter on real systems. I’ll show complete code examples, point out common mistakes, and add performance guidance from the trenches. If you already know the basics, this is the second layer: the part that makes your code predictable, safe, and fast.

Why collections are the actual design surface

In my experience, most Java applications aren’t limited by algorithms. They’re limited by data choices. Collections are where those choices become real. A list that should have been a map leads to O(n) lookups. A set that should have been ordered makes your output unstable and tests flaky. A queue that should have been blocking turns into a busy loop. When I review production incidents, collection misuse is a recurring theme.

Collections give you a standard language for intent: “ordered, allows duplicates” is a List. “unique, hash-based membership” is a HashSet. “key to value mapping” is a Map. These choices aren’t about style; they’re about making your code future-proof and understandable to everyone on the team.

Here’s my rule of thumb: pick the collection that most clearly matches the behavior you want. Then check the big-O costs of the operations you will perform most often. When those two align, your code tends to be correct and fast.

Core interfaces and how I think about them

The interfaces define behavior. The implementations define the costs. I treat interfaces as contracts and implementations as strategy.

Collection vs Map

Collection is the root for List, Set, and Queue types. Map is separate because it models key-value pairs. That split is more than formal. It forces you to think about identity. If you need identity via keys, pick Map. If identity is inherent in the object itself, pick a Collection.

List

I reach for List when:

I need ordering by index.
I need to allow duplicates.
I’m mostly iterating in order.

If I’m adding and removing frequently in the middle, I’ll think about LinkedList (which only pays off when I’m already positioned there with an iterator) or a different approach altogether.

Set

I reach for Set when:

I need uniqueness.
I’m checking membership often.
I want duplicates dropped rather than stored.

If ordering matters, I pick LinkedHashSet (insertion order) or TreeSet (sorted order).

Queue and Deque

Queues model FIFO processing; Deques allow insertion and removal at both ends. In practice, Deque is one of the most underrated interfaces. It replaces Stack, supports queue operations, and is the foundation for many buffer-like structures.

Map

Map handles key-value lookups. I think about:

Are keys unique? (they must be)
Do I need ordering? (HashMap vs LinkedHashMap vs TreeMap)
Do I need concurrency? (ConcurrentHashMap or other concurrent types)

Lists: arrays, nodes, and legacy baggage

Lists seem easy but are full of sharp edges. The trick is to match your access patterns to the underlying structure.

ArrayList: default for most cases

ArrayList is backed by a dynamically resizing array. I use it for most list needs.

Key behaviors:

Fast random access: typically O(1)
Append is amortized O(1)
Insert/remove in the middle is O(n)

If you’re mostly iterating and appending, ArrayList is the right choice.

import java.util.*;
public class ProductCatalog {
    public static void main(String[] args) {
        // The element type lets the for-each loop below read Strings directly
        List<String> productNames = new ArrayList<>();
        productNames.add("Laptop Pro 16");
        productNames.add("NoiseCancel Headphones");
        productNames.add("Smart Thermostat");
        for (String name : productNames) {
            System.out.println(name);
        }
        // Random access is fast
        System.out.println("First item: " + productNames.get(0));
    }
}

Common mistake I see: repeatedly inserting at the front. That turns into O(n) shifts each time. If you need that pattern, switch to a Deque or rethink the design.

LinkedList: good for head/tail operations, weak for indexing

LinkedList is a doubly-linked list. It’s useful when you add/remove at the ends often and don’t need random access.

Access by index: O(n)
Add/remove at ends: O(1)

I rarely use it as a general List. I use it as a Deque when the collection represents a queue or stack.

import java.util.*;
public class JobQueue {
    public static void main(String[] args) {
        Deque<String> jobs = new LinkedList<>();
        jobs.addLast("image-resize");
        jobs.addLast("video-transcode");
        jobs.addFirst("urgent-thumbnail"); // jumps the queue
        while (!jobs.isEmpty()) {
            System.out.println("Processing: " + jobs.removeFirst());
        }
    }
}

Vector and Stack: legacy with costs

Vector synchronizes every method, so it pays locking overhead even when only one thread uses it. Stack extends Vector and is considered legacy. I avoid both unless I’m trapped in old APIs. Modern code uses ArrayList and Deque.

If you need a stack, use ArrayDeque:

import java.util.*;
public class ExpressionChecker {
    public static void main(String[] args) {
        String expr = "(a+b)*(c-d)";
        Deque<Character> stack = new ArrayDeque<>();
        for (char ch : expr.toCharArray()) {
            if (ch == '(') {
                stack.push(ch);
            } else if (ch == ')') {
                // A closing parenthesis with nothing open is unbalanced
                if (stack.isEmpty()) {
                    System.out.println("Unbalanced");
                    return;
                }
                stack.pop();
            }
        }
        System.out.println(stack.isEmpty() ? "Balanced" : "Unbalanced");
    }
}

Sets: uniqueness with distinct ordering choices

Sets enforce uniqueness, but their ordering behavior is where the real decisions lie.

HashSet: fast membership, no order

HashSet is backed by a HashMap. I use it when I need fast lookups and don’t care about order.

Add/contains/remove: typically O(1)
Iteration order: undefined

import java.util.*;
public class EmailDeduplicator {
    public static void main(String[] args) {
        Set<String> emails = new HashSet<>();
        // Placeholder addresses
        emails.add("anne@example.com");
        emails.add("raj@example.com");
        emails.add("anne@example.com"); // duplicate ignored
        System.out.println("Unique count: " + emails.size()); // 2
    }
}

LinkedHashSet: preserves insertion order

If you need uniqueness and stable iteration order, LinkedHashSet is a simple upgrade. I use it for user-facing lists where ordering matters but duplicates are not allowed.
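
Here is a minimal sketch of that upgrade. The class name, method name, and tag values are made up for illustration; the only library behavior it relies on is that LinkedHashSet iterates in insertion order and ignores repeated elements.

```java
import java.util.*;

class RecentTags {
    // Returns the distinct values in first-seen order.
    static List<String> distinctInOrder(List<String> input) {
        Set<String> unique = new LinkedHashSet<>(input); // drops repeats, keeps order
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<String> tags = Arrays.asList("java", "collections", "java", "performance", "collections");
        System.out.println(distinctInOrder(tags)); // [java, collections, performance]
    }
}
```

Swapping LinkedHashSet for HashSet here would still deduplicate, but the printed order would no longer be something you could rely on.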

TreeSet: sorted order

TreeSet keeps elements sorted using natural ordering or a comparator. It’s backed by a tree structure, so operations are O(log n). I use it when I need order-based operations like range queries.

import java.util.*;
public class TopScores {
    public static void main(String[] args) {
        // Declaring the variable as TreeSet exposes first() and last() without casts
        TreeSet<Integer> scores = new TreeSet<>();
        scores.add(980);
        scores.add(1200);
        scores.add(760);
        System.out.println("Lowest: " + scores.first());
        System.out.println("Highest: " + scores.last());
    }
}

EnumSet: small, fast, type-safe

EnumSet is a powerhouse when your elements are enum constants. It is implemented as a bit vector, so it’s compact and fast. I always choose EnumSet over HashSet for enums.

import java.util.*;
public class AccessControl {
    enum Permission { READ, WRITE, DELETE, ADMIN }
    public static void main(String[] args) {
        EnumSet<Permission> perms = EnumSet.of(Permission.READ, Permission.WRITE);
        System.out.println(perms.contains(Permission.DELETE)); // false
    }
}

Queues and Deques: scheduling and buffers

Queues are where many systems start to look like systems. If your application processes tasks, events, or messages, you’re going to touch these interfaces.

PriorityQueue: ordered by priority, not FIFO

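PriorityQueue is based on a binary heap. The head is always the smallest element according to natural ordering or the supplied comparator; reverse the comparator if you want the largest first. Only the head is ordered, so iterating over the queue does not yield sorted output.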

Insert: O(log n)
Remove min/max: O(log n)
Peek: O(1)

import java.util.*;
public class SupportTickets {
    static class Ticket {
        final String id;
        final int priority; // lower value = higher priority
        Ticket(String id, int priority) {
            this.id = id;
            this.priority = priority;
        }
    }
    public static void main(String[] args) {
        PriorityQueue<Ticket> queue =
                new PriorityQueue<>(Comparator.comparingInt((Ticket t) -> t.priority));
        queue.add(new Ticket("TCK-101", 2));
        queue.add(new Ticket("TCK-102", 1));
        queue.add(new Ticket("TCK-103", 3));
        // poll() always returns the ticket with the lowest priority value
        while (!queue.isEmpty()) {
            Ticket t = queue.poll();
            System.out.println("Handle: " + t.id + " (priority " + t.priority + ")");
        }
    }
}

ArrayDeque: fast double-ended queue

ArrayDeque is a resizable array implementation for Deque. It’s fast and cache-friendly. I use it for stacks, queues, or any double-ended operations.
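
A short sketch of both uses, with illustrative class and method names. It shows the same ArrayDeque type acting as a FIFO queue (offerLast/pollFirst) and as a LIFO stack (push/pop).

```java
import java.util.*;

class ArrayDequeDemo {
    // FIFO: elements come out in the order they went in.
    static List<String> drainAsQueue(List<String> items) {
        Deque<String> queue = new ArrayDeque<>();
        for (String item : items) {
            queue.offerLast(item);
        }
        List<String> out = new ArrayList<>();
        while (!queue.isEmpty()) {
            out.add(queue.pollFirst());
        }
        return out;
    }

    // LIFO: push and pop both operate on the head of the deque.
    static List<String> drainAsStack(List<String> items) {
        Deque<String> stack = new ArrayDeque<>();
        for (String item : items) {
            stack.push(item);
        }
        List<String> out = new ArrayList<>();
        while (!stack.isEmpty()) {
            out.add(stack.pop());
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> items = Arrays.asList("a", "b", "c");
        System.out.println(drainAsQueue(items)); // [a, b, c]
        System.out.println(drainAsStack(items)); // [c, b, a]
    }
}
```

One thing to keep in mind: ArrayDeque rejects null elements, so a legacy LinkedList that stores nulls cannot be swapped over blindly.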

BlockingQueue: when threads are involved

BlockingQueue is for producer-consumer patterns. It arrived with java.util.concurrent in Java 5, so it postdates Java 2 itself, but in long-lived systems it is still foundational for concurrency.

import java.util.concurrent.*;
public class LogPipeline {
    public static void main(String[] args) throws Exception {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(10);
        Thread producer = new Thread(() -> {
            try {
                queue.put("app-started");
                queue.put("user-login");
                queue.put("order-submitted");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // keep the interrupt visible
            }
        });
        Thread consumer = new Thread(() -> {
            try {
                while (true) {
                    String event = queue.take(); // blocks until an event is available
                    System.out.println("Processed: " + event);
                    if ("order-submitted".equals(event)) break; // last event is the stop signal
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();
        consumer.start();
        producer.join();
        consumer.join();
    }
}

Maps: keys, values, and operational clarity

Maps are where I see the most performance wins and the most silent bugs. The key choice should reflect identity, not accidental string formatting.

HashMap: the default

HashMap is fast and flexible. It’s my default map choice unless I need ordering.

import java.util.*;
public class SessionStore {
    public static void main(String[] args) {
        Map<String, String> sessions = new HashMap<>();
        sessions.put("token-1", "user-anne");
        sessions.put("token-2", "user-raj");
        String user = sessions.get("token-1");
        System.out.println("Token belongs to: " + user);
    }
}

LinkedHashMap: stable iteration and LRU patterns

LinkedHashMap preserves insertion order and can be configured for access-order, which makes it perfect for a simple LRU cache.

import java.util.*;
public class LruCacheExample {
    public static void main(String[] args) {
        final int maxSize = 3;
        // Third constructor argument true = access order instead of insertion order
        Map<String, String> cache = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > maxSize;
            }
        };
        cache.put("A", "alpha");
        cache.put("B", "beta");
        cache.put("C", "gamma");
        cache.get("A"); // move A to most-recent
        cache.put("D", "delta"); // evicts B
        System.out.println(cache.keySet()); // [C, A, D]
    }
}

TreeMap: sorted keys

TreeMap maintains sorted keys and supports range queries. I use it for time-series indexing or ordered reports.
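
To make the range-query part concrete, here is a small sketch. The timestamps and latency values are made-up data and the class and method names are illustrative; subMap, floorKey, and lastEntry are standard NavigableMap methods.

```java
import java.util.*;

class LatencyTimeline {
    // Samples with from <= timestamp < to, returned as a live view of the map.
    static SortedMap<Long, Integer> window(NavigableMap<Long, Integer> samples, long from, long to) {
        return samples.subMap(from, true, to, false);
    }

    public static void main(String[] args) {
        // Key: timestamp in seconds, value: latency in ms (made-up data)
        NavigableMap<Long, Integer> samples = new TreeMap<>();
        samples.put(100L, 12);
        samples.put(200L, 15);
        samples.put(300L, 40);
        samples.put(400L, 11);

        System.out.println(window(samples, 200L, 400L)); // {200=15, 300=40}
        System.out.println(samples.floorKey(250L));      // 200 (closest key at or below 250)
        System.out.println(samples.lastEntry());         // 400=11
    }
}
```

The same queries on a HashMap would need a full scan, which is the trade I accept the O(log n) operations for.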

ConcurrentHashMap: safe for high concurrency

In modern systems, I use ConcurrentHashMap for shared state where multiple threads read and update frequently. It avoids the global lock that used to be common in older synchronized map wrappers.

Patterns I use in real code

Here are a few patterns I find myself repeating across codebases.

1) Defensive copying to avoid shared mutation

If you expose a collection, you invite external modification. I return an unmodifiable view or a copy.

import java.util.*;
public class Team {
    private final List<String> members = new ArrayList<>();
    public void addMember(String name) {
        members.add(name);
    }
    // Callers get a read-only view; only Team can change the list
    public List<String> getMembers() {
        return Collections.unmodifiableList(members);
    }
}

2) Using interfaces in method signatures

I code to interfaces to keep my options open.

public void printNames(List<String> names) {
    for (String name : names) {
        System.out.println(name);
    }
}

3) Pre-sizing to avoid reallocation

If I know the size, I pass it in. It reduces resize cost.

List<String> logLines = new ArrayList<>(expectedLines);

4) Using computeIfAbsent for maps

computeIfAbsent (Java 8 and later) replaces the get, null-check, put sequence with one call, which makes map usage safer and clearer.

import java.util.*;
public class Grouping {
    public static void main(String[] args) {
        Map<String, List<String>> byCity = new HashMap<>();
        String city = "Berlin";
        String user = "Marta";
        // Creates the list on first use, then appends to it
        byCity.computeIfAbsent(city, c -> new ArrayList<>()).add(user);
        System.out.println(byCity); // {Berlin=[Marta]}
    }
}

Common mistakes and how I avoid them

I’ve seen all of these in production code. Most are easy to fix once you notice the pattern.

Mistake 1: Using a List for membership checks

If you repeatedly call contains on a List, you’re doing O(n) work each time. Switch to a Set if you only care about membership.

Mistake 2: Using HashMap with mutable keys

If the fields used in equals or hashCode change after insertion, the key becomes unreachable. Use immutable keys for maps and sets.
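
The failure is easy to reproduce. In this sketch, Point is a deliberately mutable key type invented for the demonstration; nothing about it comes from a real codebase.

```java
import java.util.*;

class MutableKeyDemo {
    // Deliberately mutable key type, for illustration only.
    static class Point {
        int x;
        int y;
        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }
        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Point)) return false;
            Point p = (Point) o;
            return x == p.x && y == p.y;
        }
        @Override
        public int hashCode() {
            return 31 * x + y;
        }
    }

    // Returns true if the entry can still be found after its key is mutated.
    static boolean reachableAfterMutation() {
        Map<Point, String> labels = new HashMap<>();
        Point key = new Point(1, 2);
        labels.put(key, "warehouse");
        key.x = 99; // hash code changes, but the entry was filed under the old hash
        boolean viaMutatedKey = labels.containsKey(key);
        boolean viaOriginalValue = labels.containsKey(new Point(1, 2)); // hash matches, equals no longer does
        return viaMutatedKey || viaOriginalValue;
    }

    public static void main(String[] args) {
        System.out.println(reachableAfterMutation()); // false
    }
}
```

The entry is still in the map and still counted by size(), which is why this bug tends to show up as a slow leak rather than an exception.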

Mistake 3: Relying on HashMap iteration order

HashMap does not guarantee order. If you need stability, use LinkedHashMap or TreeMap.

Mistake 4: Synchronizing collections manually

Using Collections.synchronizedList is fragile if you don’t synchronize during iteration. In 2026, I strongly prefer concurrent collections when concurrency is real.

Mistake 5: Overusing LinkedList

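LinkedList sounds fast because insertion is O(1), but that only holds once you are already at the position. Getting there by index is an O(n) walk, and every node is a separate allocation with poor cache locality. Use ArrayList unless you have a real queue or deque pattern.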

Iteration, views, and fail-fast behavior

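Iteration is where correctness bugs hide. Most java.util collections have fail-fast iterators: they throw ConcurrentModificationException when they detect that the collection was structurally modified during iteration by anything other than the iterator itself. The check is best-effort rather than guaranteed, so it is a bug detector, not something to build logic on. I treat it as a gift, not a nuisance.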
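
A minimal sketch of the failure mode, with made-up host names. It removes an element through the list itself while a for-each loop is running, which is exactly what the iterator is designed to detect.

```java
import java.util.*;

class FailFastDemo {
    // Removes through the list while a for-each loop is running.
    // Returns true if the iterator detected the modification.
    static boolean unsafeRemovalFails() {
        List<String> hosts = new ArrayList<>(Arrays.asList("down-1", "up-1", "up-2"));
        try {
            for (String h : hosts) {
                if (h.startsWith("down")) {
                    hosts.remove(h); // structural change behind the iterator's back
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true; // thrown by the next call to the iterator
        }
    }

    public static void main(String[] args) {
        System.out.println(unsafeRemovalFails()); // true
    }
}
```

Detection is best-effort: depending on which element is removed, the loop can also end quietly having skipped an element, so the absence of an exception proves nothing.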

Safe removal during iteration

If I need to remove elements while iterating, I use the iterator’s remove method, not the collection’s remove.

import java.util.*;
public class CleanupExample {
    public static void main(String[] args) {
        List<String> hosts = new ArrayList<>(Arrays.asList("up-1", "down-1", "up-2", "down-2"));
        Iterator<String> it = hosts.iterator();
        while (it.hasNext()) {
            String h = it.next();
            if (h.startsWith("down")) {
                it.remove(); // removes the element last returned by next()
            }
        }
        System.out.println(hosts); // [up-1, up-2]
    }
}

Unmodifiable views vs defensive copies

An unmodifiable view prevents modification but still reflects changes to the original. A defensive copy isolates state. I choose based on intent.

import java.util.*;
public class ViewVsCopy {
    public static void main(String[] args) {
        List<String> src = new ArrayList<>();
        src.add("A");
        List<String> view = Collections.unmodifiableList(src); // read-only window onto src
        List<String> copy = new ArrayList<>(src);              // independent snapshot
        src.add("B");
        System.out.println(view); // [A, B]
        System.out.println(copy); // [A]
    }
}

Why fail-fast matters in production

If a collection is modified during iteration from another thread, fail-fast can surface bugs early. It does not make iteration thread-safe. If you need safe concurrent iteration, use a concurrent collection or snapshot the data first.

Sorting and ordering strategies

Ordering is a frequent source of subtle bugs, especially when ordering is unstable or dependent on locale. I always make ordering explicit.

Sorting lists with comparators

I use Comparator.comparing for clarity and keep it close to the data.

import java.util.*;
public class SortUsers {
    static class User {
        final String name;
        final int score;
        User(String name, int score) {
            this.name = name;
            this.score = score;
        }
    }
    public static void main(String[] args) {
        List<User> users = Arrays.asList(new User("Ana", 12), new User("Bo", 7), new User("Cal", 12));
        // Highest score first; ties broken by name
        users.sort(Comparator.comparingInt((User u) -> u.score).reversed().thenComparing(u -> u.name));
        for (User u : users) {
            System.out.println(u.name + " " + u.score);
        }
    }
}

TreeSet and TreeMap comparators

If you use TreeSet or TreeMap, your comparator defines uniqueness. Two keys that compare as equal are treated as duplicates. I keep that in mind when sorting by partial fields.
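
Here is a small sketch of that effect. Player and the sample data are invented for the demonstration: with a score-only comparator, two players with the same score collapse into one entry, and adding a tie-breaker keeps both.

```java
import java.util.*;

class TreeSetUniqueness {
    static class Player {
        final String name;
        final int score;
        Player(String name, int score) {
            this.name = name;
            this.score = score;
        }
    }

    // Number of players a TreeSet keeps under the given comparator.
    static int retained(Comparator<Player> order, List<Player> players) {
        Set<Player> set = new TreeSet<>(order);
        set.addAll(players);
        return set.size();
    }

    public static void main(String[] args) {
        List<Player> players = Arrays.asList(
                new Player("Ana", 12), new Player("Bo", 7), new Player("Cal", 12));

        Comparator<Player> byScore = Comparator.comparingInt(p -> p.score);
        Comparator<Player> byScoreThenName = byScore.thenComparing(p -> p.name);

        System.out.println(retained(byScore, players));         // 2 (Cal ties with Ana and is dropped)
        System.out.println(retained(byScoreThenName, players)); // 3
    }
}
```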

When not to sort

Sorting is O(n log n). If I only need the top N items, I use a PriorityQueue or a bounded structure instead of sorting the full list.
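
One way to sketch the bounded approach, assuming integer scores and a helper name (topN) chosen for this example: keep a min-heap of at most n elements, so the smallest retained score sits at the head and is the one to evict.

```java
import java.util.*;

class TopN {
    // Returns the n largest values in descending order, using O(n) extra memory.
    static List<Integer> topN(Iterable<Integer> scores, int n) {
        if (n <= 0) {
            return new ArrayList<>();
        }
        PriorityQueue<Integer> heap = new PriorityQueue<>(n); // min-heap: smallest kept score at the head
        for (int score : scores) {
            if (heap.size() < n) {
                heap.add(score);
            } else if (score > heap.peek()) {
                heap.poll();     // evict the current smallest
                heap.add(score);
            }
        }
        List<Integer> result = new ArrayList<>(heap);
        result.sort(Comparator.reverseOrder()); // only n elements get sorted
        return result;
    }

    public static void main(String[] args) {
        List<Integer> scores = Arrays.asList(980, 1200, 760, 1500, 300);
        System.out.println(topN(scores, 3)); // [1500, 1200, 980]
    }
}
```

Each input element costs at most O(log n) heap work, so the total is O(m log n) for m inputs, and memory stays proportional to n rather than to the dataset.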

Interop with arrays and legacy APIs

Java 2 era code often moves between arrays and collections. I keep these rules in mind.

Arrays.asList gotcha

Arrays.asList returns a fixed-size list backed by the array. You can set, but not add or remove.

import java.util.*;
public class ArraysAsListGotcha {
    public static void main(String[] args) {
        String[] arr = {"a", "b", "c"};
        List<String> list = Arrays.asList(arr);
        list.set(0, "z"); // writes through to the array
        System.out.println(Arrays.toString(arr)); // [z, b, c]
        // list.add("d"); // UnsupportedOperationException
    }
}

If I need a resizable list, I wrap it: new ArrayList<>(Arrays.asList(arr)).

toArray patterns

If I need a typed array, I pass a zero-length array as a template. The list then allocates an array of the right runtime type and size, so there is no cast and no length to keep in sync.

String[] names = list.toArray(new String[0]);

Legacy APIs and raw types

Old APIs may return raw types. I isolate the cast and add validation to avoid ClassCastException deep in the code.
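
A sketch of that isolation layer, under the assumption that the legacy call returns a raw List that should contain only Strings. legacyFetchNames is a hypothetical stand-in for the old API, and toStringList is an illustrative helper name.

```java
import java.util.*;

class LegacyAdapter {
    // Hypothetical stand-in for an old API that returns a raw List.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static List legacyFetchNames() {
        List raw = new ArrayList();
        raw.add("anne");
        raw.add("raj");
        return raw;
    }

    // Copies an untyped collection into a typed list, failing at the boundary
    // instead of deep inside the code that later reads the elements.
    static List<String> toStringList(Collection<?> raw) {
        List<String> typed = new ArrayList<>(raw.size());
        for (Object o : raw) {
            if (!(o instanceof String)) {
                String actual = (o == null) ? "null" : o.getClass().getName();
                throw new IllegalArgumentException("Expected String but got: " + actual);
            }
            typed.add((String) o);
        }
        return typed;
    }

    public static void main(String[] args) {
        List<String> names = toStringList(legacyFetchNames());
        System.out.println(names); // [anne, raj]
    }
}
```

The rest of the code only ever sees List<String>, so the raw type and its warnings stay confined to one adapter class.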

Concurrency choices in 2026 with Java 2 foundations

Java 2 era code often uses synchronized wrappers or Vector. I treat those as last resorts. Today, I choose concurrent collections when multiple threads touch the data.

Synchronized wrappers

Collections.synchronizedList or synchronizedMap wrap a collection with a single lock. They work, but iteration must still be synchronized manually.

List<String> list = Collections.synchronizedList(new ArrayList<>());
// Correct iteration pattern: hold the wrapper's lock for the whole loop
synchronized (list) {
    for (String s : list) {
        System.out.println(s);
    }
}

ConcurrentHashMap

ConcurrentHashMap scales under read-heavy and write-heavy load. I use it for shared caches, counters, and registries.

import java.util.concurrent.*;
import java.util.concurrent.atomic.LongAdder;
public class MetricsRegistry {
    public static void main(String[] args) {
        ConcurrentHashMap<String, LongAdder> counters = new ConcurrentHashMap<>();
        // Atomically creates the counter on first use, then increments it
        counters.computeIfAbsent("requests", k -> new LongAdder()).increment();
        System.out.println(counters.get("requests").sum()); // 1
    }
}

CopyOnWriteArrayList

CopyOnWriteArrayList is great for read-heavy collections with rare writes, like listener lists. Writes are expensive because they copy the whole array.
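
A listener-list sketch, with EventBus and Listener as illustrative names rather than any particular library's API. It shows the property that makes this collection attractive here: a listener can register another listener in the middle of a dispatch without triggering ConcurrentModificationException, because the loop iterates over a snapshot.

```java
import java.util.*;
import java.util.concurrent.CopyOnWriteArrayList;

class EventBus {
    interface Listener {
        void onEvent(String event);
    }

    private final List<Listener> listeners = new CopyOnWriteArrayList<>();

    void register(Listener l) {
        listeners.add(l); // copies the backing array
    }

    void publish(String event) {
        // Iterates over the array as it was when the loop started
        for (Listener l : listeners) {
            l.onEvent(event);
        }
    }

    static List<String> demo() {
        EventBus bus = new EventBus();
        List<String> seen = new ArrayList<>();
        bus.register(e -> seen.add("first:" + e));
        // Registers a new listener during dispatch; the in-flight loop does not see it
        bus.register(e -> bus.register(e2 -> seen.add("late:" + e2)));
        bus.publish("started");
        bus.publish("stopped");
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [first:started, first:stopped, late:stopped]
    }
}
```

With a plain ArrayList in place of CopyOnWriteArrayList, the registration inside the dispatch loop would be a structural modification during iteration.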

Maps in the real world: multi-map and index patterns

I often need to group values by key or index data by multiple attributes. The standard library doesn’t ship a multi-map, but it’s easy to build.

Grouping values by key

import java.util.*;
public class MultiMapExample {
    public static void main(String[] args) {
        Map<String, List<String>> byCategory = new HashMap<>();
        add(byCategory, "books", "Java 2 Guide");
        add(byCategory, "books", "Concurrency Notes");
        add(byCategory, "tools", "Profiler");
        System.out.println(byCategory);
    }
    // Appends value to the list for key, creating the list on first use
    static void add(Map<String, List<String>> map, String key, String value) {
        map.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }
}

Indexing by multiple keys

If I need to query by id and by email, I keep two maps and update both in the same method on every write. This is cheaper than scanning a list.

import java.util.*;
public class UserIndex {
    static class User {
        final String id;
        final String email;
        User(String id, String email) { this.id = id; this.email = email; }
    }
    private final Map<String, User> byId = new HashMap<>();
    private final Map<String, User> byEmail = new HashMap<>();
    // Both indexes are updated together so they never disagree
    public void add(User u) {
        byId.put(u.id, u);
        byEmail.put(u.email, u);
    }
    public User findById(String id) { return byId.get(id); }
    public User findByEmail(String email) { return byEmail.get(email); }
}

I prefer explicit duplication to repeated searches through a list when the query is hot.

Equality, hashing, and comparator correctness

If you store custom objects in a Set or Map, you have to get equals and hashCode right. If you store them in a TreeSet or TreeMap, your comparator must be consistent with equals.

Immutable keys are safer keys

I avoid mutable keys because modifying them after insertion breaks lookup. If I must use a mutable object, I never mutate the fields used for hashing.

Consistency between equals and compareTo

If compareTo says two objects are equal but equals says they are different, TreeSet will treat one as a duplicate and discard it. I always write tests for this when using sorted collections.
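
The JDK itself has a well-known case of this: BigDecimal.equals takes scale into account, while BigDecimal.compareTo does not. The sketch below (class and method names are illustrative) puts the same two values into a HashSet and a TreeSet.

```java
import java.math.BigDecimal;
import java.util.*;

class CompareVsEquals {
    // Returns {hashSetSize, treeSetSize} for the values 1.0 and 1.00.
    static int[] sizes() {
        BigDecimal a = new BigDecimal("1.0");
        BigDecimal b = new BigDecimal("1.00");

        Set<BigDecimal> hashed = new HashSet<>();
        hashed.add(a);
        hashed.add(b); // kept: equals() sees different scales

        Set<BigDecimal> sorted = new TreeSet<>();
        sorted.add(a);
        sorted.add(b); // dropped: compareTo() returns 0

        return new int[] { hashed.size(), sorted.size() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(sizes())); // [2, 1]
    }
}
```

Neither answer is wrong on its own; the bug appears when code moves data between the two kinds of set and assumes the counts will match.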

Memory and GC considerations

Collections trade memory for speed. In older systems with constrained heaps, memory pressure can cause unexpected slowdowns. I watch these patterns:

ArrayList over-allocation

ArrayList grows by ~1.5x. If I know I’ll hold 10 million items, I pre-size to avoid massive intermediate arrays.

HashMap capacity and load factor

If I know the entry count, I compute the capacity to avoid rehashing. Rehashing is expensive and can cause GC spikes.

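int expected = 1000000;
// Smallest capacity that holds `expected` entries without crossing the 0.75 load factor
int capacity = (int) (expected / 0.75f) + 1;
Map<String, String> map = new HashMap<>(capacity); // key and value types are placeholders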

LinkedList node overhead

Every element in a LinkedList allocates a node object. This is slower and heavier on GC than ArrayList, which stores references in a contiguous array.

Choosing collections under real workloads

When I decide, I use these questions as a checklist:

Do I need ordering? If yes, is it insertion order or sorted order?
Do I need uniqueness? If yes, is uniqueness based on equals or a comparator?
Do I need fast random access or fast head/tail operations?
Is the collection shared across threads? If yes, use concurrent structures.
Is the collection exposed to callers? If yes, use copies or unmodifiable views.
What is the hot operation? If 90% of operations are contains, use a Set.

Practical scenarios with do/don’t guidance

These are patterns I’ve used in production, with clear do/don’t guidance.

Scenario 1: Deduplicating user input

If I ingest IDs from a file and need unique entries, I use a HashSet. I do not use a List with contains.

Set<String> ids = new HashSet<>();
for (String id : inputIds) {
    ids.add(id.trim()); // normalize before adding so " 42" and "42" collapse
}

Scenario 2: User-facing ordered lists

If I show the user a list of recent actions, I use LinkedHashSet or LinkedHashMap to preserve order and avoid duplicates.

Scenario 3: Top N leaderboard

If I need the top N scores, I use a min-heap PriorityQueue of size N instead of sorting the entire list. This keeps memory stable when the dataset is huge.

Scenario 4: Single-threaded task queue

If the same thread produces and consumes, I use ArrayDeque for raw speed. ArrayDeque is not thread-safe, so as soon as a second thread is involved, even a single producer and a single consumer, I use a BlockingQueue.

Scenario 5: Read-mostly registry

If I have a registry that is read constantly and updated rarely, I use CopyOnWriteArrayList or a volatile snapshot list, not a synchronized ArrayList.

Performance considerations with practical ranges

I avoid exact numbers because hardware and JVM settings vary, but I do share rough ranges that help with intuition. These are the kinds of differences I see in production-like conditions:

ArrayList iteration is often 2x to 5x faster than LinkedList iteration because of cache locality.
HashSet contains is usually 5x to 50x faster than List contains on large collections.
TreeMap and TreeSet operations are commonly 3x to 10x slower than HashMap and HashSet for simple lookups, but they provide ordering and range queries.
CopyOnWriteArrayList reads are fast, but writes can be 10x to 100x more expensive than ArrayList due to array copying.
ConcurrentHashMap can be near HashMap speed on reads and 2x to 4x slower on writes compared to single-threaded HashMap, but it scales safely under concurrency.

I use these ranges as intuition, not as laws. If performance is critical, I measure with a benchmark that matches the real data shape and access patterns.

Testing collection behavior

I test collection behavior when it affects correctness. A small investment in tests can prevent nasty regressions.

Example: ordering guarantees

import java.util.*;
public class OrderingTest {
    public static void main(String[] args) {
        Set<String> set = new LinkedHashSet<>();
        set.add("A");
        set.add("B");
        set.add("C");
        System.out.println(set); // [A, B, C]
    }
}

Example: equals and hashCode

I always test that equal objects hash the same and that mutated objects do not break map lookups.

Modern utility methods that still fit Java 2 mindsets

Even if you’re in a Java 2 era codebase, modern utilities can simplify code.

Collections utilities I use often

Collections.emptyList, emptySet, emptyMap for safe returns.
Collections.singletonList for fixed one-element collections.
Collections.unmodifiableList to protect internal data.
Collections.binarySearch for sorted lists.
Collections.shuffle when test data needs randomization.
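
A short sketch that exercises several of these together. The class name, method names, and sample data are invented for the example; the Collections calls themselves are standard.

```java
import java.util.*;

class CollectionsUtilityDemo {
    // Index of key in an already sorted list, or -1 when it is absent.
    static int indexOf(List<String> sorted, String key) {
        int pos = Collections.binarySearch(sorted, key);
        return pos >= 0 ? pos : -1; // a negative result encodes the insertion point
    }

    // Never returns null, and never hands out the internal list itself.
    static List<String> ownersOf(Map<String, List<String>> owners, String resource) {
        List<String> found = owners.get(resource);
        if (found == null) {
            return Collections.emptyList();
        }
        return Collections.unmodifiableList(found);
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>(Arrays.asList("raj", "anne", "marta"));
        Collections.sort(names); // binarySearch requires sorted input
        System.out.println(indexOf(names, "marta")); // 1
        System.out.println(indexOf(names, "zed"));   // -1

        Map<String, List<String>> owners = new HashMap<>();
        owners.put("db", Collections.singletonList("anne"));
        System.out.println(ownersOf(owners, "db"));    // [anne]
        System.out.println(ownersOf(owners, "cache")); // []
    }
}
```

Returning an empty list instead of null means callers can loop over the result without a null check.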

Avoiding accidental mutability

If a method returns a collection, I decide explicitly whether the caller can modify it. In my experience, unintentional mutability is one of the biggest sources of bugs in older Java systems.

When NOT to use a collection

Sometimes the right answer is not a collection at all.

If you only need one or two values, a pair of fields or a small object is clearer than a List.
If you need index-based fixed-size storage, an array might be simpler and faster.
If you need memory-mapped storage or off-heap structures, standard collections are the wrong tool.

A practical collection selection checklist

When I’m on-call or reviewing changes, this is the checklist I use:

1) Behavior: what must be true (order, uniqueness, key identity)?

2) Hot operations: what happens most often (get, add, contains, remove)?

3) Size: how big can it get (thousands, millions, billions)?

4) Concurrency: who touches it (single thread, many threads, fork-join)?

5) Exposure: who can mutate it (internal only, external callers)?

6) Failure mode: what happens if it grows or becomes inconsistent?

If a collection choice fails any of those, I re-think it before I ship.

Final thoughts

Java 2-era collections are still a foundation for modern Java systems. They’re simple, proven, and incredibly fast when used correctly. The trick is to treat them as design decisions, not just data containers. When you match your collection to the behavior you need, your code becomes clearer, faster, and more reliable.

The biggest wins I’ve seen in legacy systems rarely came from a new framework. They came from swapping a List for a Set, replacing a synchronized Map with a ConcurrentHashMap, or switching from LinkedList to ArrayList. Those are small changes that compound over years. That’s the power of choosing the right collection.