Data structures contain pointers

Share
Copied to clipboard.
Trey Hunner smiling in a t-shirt against a yellow wall
Trey Hunner
3 min. read 4 min. video Python 3.10—3.14

In Python, data structures don't contain objects. So, what do they contain?

For a more thorough dive into this topic, see the much longer article, Variables and objects in Python.

Referencing the same object in multiple places

Let's point a variable row to a list of three zeroes:

>>> row = [0, 0, 0]

Now let's make a new variable that points to a list-of-lists:

>>> boat = [row, row, row]

We now have a list of three lists, each with three zeroes in it:

>>> boat
[[0, 0, 0], [0, 0, 0], [0, 0, 0]]

What would happen if we look up index 1, and then index 1 again, and change that to the number 1?

>>> boat[1][1] = 1

What do you think might happen? What will change in our lists?

We're looking up the second list, and then the second value in the second list, and assigning that value to 1. So we've asked to change the middle item in the middle list to the number 1.

That's not quite what happens though:

>>> boat
[[0, 1, 0], [0, 1, 0], [0, 1, 0]]

Instead, we changed the middle number in all three of our inner lists.

Why did this happen?

Well...

Data structures store references, not objects

Lists in Python don't actually contain objects. They contain references to objects.

And when we made our outer list, we gave it three references to the same list:

>>> id(boat[0])
139699039162432
>>> id(boat[1])
139699039162432
>>> boat[0] is boat[1]
True

Python's data structures contain pointers to objects, not the objects themselves.

If we look at our row list, we'll see that it changed also:

>>> row
[0, 1, 0]

We've talked about the fact that Python's variables are not like buckets that contain objects. Instead, they refer to objects, or they point to them.

Python's data structures also are not like buckets that contain objects. They point or refer to objects as well.

That means that one object could be pointed to by multiple variables, multiple data structures, or even the same data structure multiple times.

So our boat list stores three pointers to the same list:

>>> boat = [row, row, row]

When we changed one of our inner lists, we mutated that list (one of our 2 types of change in Python):

>>> boat[1][1] = 1
>>> boat
[[0, 1, 0], [0, 1, 0], [0, 1, 0]]

But we actually store three references to the same list.

So it seems like we changed three lists, but we actually just changed one list that we happen to refer to three times in the same list.

Avoid referencing the same mutable object

If we wanted to avoid this issue, we could have copied our row list using the list copy method:

>>> row = [0, 0, 0]
>>> boat = [row.copy(), row.copy(), row.copy()]

Or we could have just manually created three equivalent lists:

>>> row = [0, 0, 0]
>>> boat = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]

Either way, changing one of the three sub-lists wouldn't change the others because they're all independent lists:

>>> boat[1][1] = 1
>>> boat
[[0, 0, 0], [0, 1, 0], [0, 0, 0]]

These are three different lists stored in different parts of memory in Python:

>>> boat[0] is boat[1]
False
>>> id(boat[0])
140660999377344
>>> id(boat[1])
140660999379648

An ouroboros: A list that contains itself

Data structures contain references to objects, not the objects themselves.

This append call is the ultimate demonstration of this fact:

>>> x = []
>>> x.append(x)

At this point, the first and only element of this list is the list itself:

>>> x[0] is x
True

And the first element of that list is also the list itself:

>>> x[0][0] is x
True

And the first element of that list is the list itself as well:

>>> x[0][0][0] is x
True

It's one list all the way down. So we've made an infinitely recursive data structure:

>>> x[0][0][0][0][0] is x
True

Python represents this list to the Python prompt by putting three dots inside these square brackets:

>>> x
[[...]]

It's smart enough not to show an infinite number of square brackets.

We didn't stick a bucket inside itself here, because that's physically impossible.

Data structures don't contain objects, they contain pointers to objects. And our list happens to contain a pointer that points back to itself.

Lists can store pointers that point to anything. This list just happens to point to itself.

Python's data structures contain pointers to objects

Just as variables don't contain objects, they just contain pointers to objects, data structures in Python also just store pointers to objects.

There's no actual "containment" in Python, just containment of pointers.

Series: Assignment and Mutation

Python's variables aren't buckets that contain things; they're pointers that reference objects.

The way Python's variables work can often confuse folks new to Python, both new programmers and folks moving from other languages like C++ or Java.

To track your progress on this Python Morsels topic trail, sign in or sign up.

0%
5 Keys to Python Success 🔑

Sign up for my 5 day email course and learn essential concepts that introductory courses often overlook!