Scatter plots are one of the most common and useful plots for visualizing relationships between two continuous variables. The plotly Python library provides a powerful graph objects framework for building customizable interactive scatter plots.
In this comprehensive guide, we will explore how to:
- Create basic scatter plots with graph objects
- Customize markers, colors, scales, and styles
- Animate scatter plot traces over time
- Add interactivity with hover text, click events, selections
- Plot geographic data on scatter mapbox plots
- Leverage datasets to quickly generate plots
- Combine multiple traces for advanced use cases
- Perform statistical analysis and model fitting
- Apply best practices for scatter plot design
Getting Started with Basic Scatter Plots
To get started, import plotly.graph_objects and instantiate a Figure containing a Scatter trace:
import plotly.graph_objects as go
fig = go.Figure(data=go.Scatter(
    x=[0, 1, 2, 3],
    y=[2, 1, 4, 3]
))
fig.show()
By default, Plotly picks the drawing mode from the number of points: traces with fewer than 20 points are drawn as "lines+markers", and longer traces as "lines". To show markers only, set the mode parameter to "markers":
fig = go.Figure(data=go.Scatter(
    x=[0, 1, 2, 3],
    y=[2, 1, 4, 3],
    mode="markers"
))
Customizing Markers, Colors, and Styles
We have extensive control over the visual styling of the markers. Here are some of the customizations we can apply:
Size, Symbol, Color
fig = go.Figure(data=go.Scatter(
    x=[0, 1, 2, 3],
    y=[2, 1, 4, 3],
    mode="markers",
    marker=dict(
        size=30,
        symbol='pentagon',
        color='rgb(200, 0, 0)',
    )
))
Border Width, Color
fig = go.Figure(data=go.Scatter(
    x=[0, 1, 2, 3],
    y=[2, 1, 4, 3],
    mode="markers",
    marker=dict(
        # Outline drawn around each marker
        line=dict(
            width=4,
            color='rgb(0, 0, 0)'
        )
    )
))
Opacity
fig = go.Figure(data=go.Scatter(
    x=[0, 1, 2, 3],
    y=[2, 1, 4, 3],
    mode="markers",
    marker=dict(
        opacity=0.5
    )
))
We can also set these properties to arrays to visually encode an extra variable.
For example, mapping the marker size to a third variable:
fig = go.Figure(data=go.Scatter(
    x=[0, 1, 2, 3],
    y=[2, 1, 4, 3],
    mode="markers",
    marker=dict(
        # One size per point
        size=[10, 20, 30, 40],
    )
))
Customizing Colorscales
To visualize a third continuous variable, we can map a colorscale to the markers.
First we import plotly.express to access its built-in colorscales, then pass a numeric array as the marker color together with a colorscale and showscale=True:
import plotly.express as px
fig = go.Figure(data=go.Scatter(
    x=[0, 1, 2, 3],
    y=[2, 1, 4, 3],
    mode="markers",
    marker=dict(
        # Numeric values are mapped onto the colorscale
        color=[180, 220, 280, 340],
        colorscale=px.colors.sequential.Viridis,
        showscale=True
    )
))
| Colorscale | Description |
|---|---|
| Viridis | Perceptually uniform, print-friendly |
| Cividis | Designed for readers with color vision deficiency |
| Turbo | High-contrast rainbow scale, not perceptually uniform |
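We can also color the markers using a categorical column from a dataset. Because marker.color expects numbers or valid color strings, convert the category labels to integer codes first:
import pandas as pd
df = pd.DataFrame({
    'x': [0, 1, 2, 3],
    'y': [2, 1, 4, 3],
    'category': ['a', 'b', 'a', 'b']
})
# Map each label to an integer code: 'a' -> 0, 'b' -> 1
codes = df['category'].astype('category').cat.codes
fig = go.Figure(data=go.Scatter(
    x=df['x'],
    y=df['y'],
    mode='markers',
    marker=dict(
        color=codes,
        colorscale=['blue', 'red'],
        showscale=True
    )
))
This makes it easy to see clusters and categories at a glance.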
Animating Scatter Plot Traces Over Time
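To create animated scatter plots, we add frames defining the data for each timestep, plus a play button in the layout to start the animation:
import numpy as np
t = np.linspace(0, 20, 100)
x = np.sin(t) + np.random.randn(100)*0.2
y = np.cos(t) + np.random.randn(100)*0.2
fig = go.Figure(
    data=go.Scatter(
        x=[x[0]],
        y=[y[0]],
        mode="markers+lines"
    ),
    layout=go.Layout(
        # Fix the axis ranges so the view does not rescale between frames
        xaxis=dict(range=[-2, 2]),
        yaxis=dict(range=[-2, 2]),
        # Without a button (or slider), the frames are never played
        updatemenus=[dict(
            type="buttons",
            buttons=[dict(label="Play", method="animate", args=[None])]
        )]
    ),
    # Frame k shows the first k+1 points
    frames=[go.Frame(
        data=[go.Scatter(
            x=x[:k+1],
            y=y[:k+1]
        )]
    ) for k in range(len(x))]
)
fig.show()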
This plots the points sequentially over time. We can also vary style properties such as color and size from frame to frame.
Some common animation examples include:
- Simulating model predictions
- Visualizing movement over time
- Showing temporal patterns and seasonality
Adding Interactivity to Scatter Plots
Plotly figures have built-in support for hover tooltips, click events, and selections.
Hover Text
To set the text displayed when hovering over a point, use the hovertext or text parameter, supplying one label per point:
import numpy as np
n = 10
fig = go.Figure(data=go.Scatter(
    x=np.random.rand(n),
    y=np.random.rand(n),
    # One label per point: 'Point A' through 'Point J'
    hovertext=[f'Point {chr(65 + i)}' for i in range(n)],
    mode='markers'
))
Click Events
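We can also run Python callbacks when a point is clicked. Callbacks registered with on_click only fire on a go.FigureWidget displayed in a Jupyter environment (which requires ipywidgets), not on a plain go.Figure. Note also that clickmode is a layout attribute, not a trace attribute:
import numpy as np
def click_handler(trace, points, state):
    # point_inds is empty when the click did not hit a point on this trace
    if points.point_inds:
        ind = points.point_inds[0]
        print(f'Clicked on Point {ind}')
fig = go.FigureWidget(data=go.Scatter(
    x=np.random.rand(10),
    y=np.random.rand(10),
    mode='markers'
))
fig.update_layout(clickmode='event+select')
fig.data[0].on_click(click_handler)
fig  # display the widget as the last expression of a notebook cell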
Selections
There is also built-in support for selecting points with the box and lasso tools. Setting dragmode makes the lasso the default drag action:
import numpy as np
fig = go.Figure(data=go.Scatter(
    x=np.random.rand(10),
    y=np.random.rand(10),
    mode='markers'
))
fig.update_layout(
    dragmode='lasso'
)
These interactivity features make Plotly scatter plots much more engaging and usable for things like data cleaning, identification of outliers, and dynamic linking to other plots.
Geographic Scatter Plots with Mapbox
For geographic data, we can create interactive scatter plots drawn on top of map tiles.
Create a go.Scattermapbox trace and pass in the latitude and longitude data. Recent Plotly releases also provide go.Scattermap, a MapLibre-based replacement, and mark the Mapbox traces as deprecated:
import plotly.graph_objects as go
fig = go.Figure(data=go.Scattermapbox(
    lat=[45, 53, 38],
    lon=[-75, -3, -97],
    mode="markers",
    marker=go.scattermapbox.Marker(
        size=14
    )
))
Then configure the Mapbox layout:
fig.update_layout(
    mapbox=dict(
        style="open-street-map",
        center=dict(lat=40, lon=-20),
        # A low zoom level keeps all three points in view
        zoom=1
    )
)
The "open-street-map" style used here does not require a Mapbox access token. A token is only needed for styles served by Mapbox itself, such as "streets" or "satellite".
Common use cases include:
- Plotting travel paths over time
- Visualizing spatial point patterns
- Linking geographic regions to multivariate data
Statistical Analysis and Model Fitting
Scatter plots provide an intuitive visualization for statistical analysis between two variables.
We can fit models to assess the relationship between the variables. For example, using NumPy's polyfit to fit a second-degree polynomial:
import numpy as np
x = np.array([0, 1, 2, 3, 4, 5])
y = np.array([1, 3, 2, 4, 7, 10])
# Least-squares fit of a degree-2 polynomial
z = np.polyfit(x, y, 2)
f = np.poly1d(z)
# Evaluate the fitted curve on a dense grid for a smooth line
xp = np.linspace(0, 5, 100)
yp = f(xp)
fig = go.Figure(data=go.Scatter(
    x=x, y=y,
    mode='markers',
    name='data'
))
fig.add_trace(go.Scatter(x=xp, y=yp,
    mode='lines',
    name='quadratic fit',
    line=dict(color='darkblue', width=2)))
fig.show()
We can also add linear regression, confidence intervals, compute correlation coefficients, and more. These modeling capabilities make Plotly scatter plots useful for quantitative analysis.
Leveraging Datasets for Quick Plots
For rapid data exploration, Plotly Express lets you instantly visualize DataFrames:
import plotly.express as px
df = px.data.iris()
fig = px.scatter(df, x="sepal_width", y="sepal_length")
fig.show()
Plotly Express handles details like axis labels, hover text, colors, and legends automatically.
We can then customize the generated figure with the graph objects API:
import plotly.express as px
df = px.data.iris()
# custom_data makes the species column available to the hover template
fig = px.scatter(df, x="sepal_width", y="sepal_length",
                 custom_data=["species"])
fig.update_traces(
    hovertemplate="Species: %{customdata[0]}<extra></extra>"
)
fig.update_layout(
    title='My Custom Iris Plot',
    width=800,
    height=500
)
This gives the best of both worlds: quick exploration with Express then customization with graph objects.
Combining Multiple Trace Types
By layering several traces with different modes in one figure, we can build richer plots:
import numpy as np
import plotly.graph_objects as go
t = np.linspace(0, 10, 200)
fig = go.Figure()
fig.add_trace(go.Scatter(
    x=t, y=np.sin(t),
    mode='lines',
    name='sin function'
))
fig.add_trace(go.Scatter(
    x=t, y=np.cos(t),
    mode='markers',
    name='cos samples',
    marker=dict(size=10)
))
fig.show()
Some ideas for multiple traces:
- Regressions + data points
- Fitted models + confidence intervals
- Data points + smoothed trends
- Timeseries forecast vs actual
- Geographic regions + categories
By overlaying scatter traces, you can build rich, customized data visualizations.
Design Best Practices for Scatter Plots
Here are some key tips for effective scatter plot design:
Label Clearly
- Give the plots and axes clear descriptive labels
- Include units of measurement where applicable
- Use plot subtitles and captions if needed for clarity
Show Origins
- Anchor axes at zero when possible
- If an axis is truncated or transformed (for example, log-scaled), indicate this clearly on the axis
Visualize Distributions
- Use marginal histograms or KDE plots to show the distribution of each variable
- These can help identify patterns such as gaps and natural clusters
Color and Symbol Encode
- Use color or symbols to visually encode categories
- Choose distinct, easily differentiable colors and symbols
Control Density
- Adjust marker size, opacity, jitter to control overplotting
- Show marker density with heatmaps
By following these kinds of best practices, you can create scatter plots that clearly communicate the relationships in your data.
Conclusion
In this comprehensive guide, we explored how to leverage Plotly's powerful graph objects API for building customizable scatter plots in Python.
Key topics included:
- Basic scatter plot configuration
- Customizing markers, colors, styles, and scales
- Adding animations, interactivity, geographic mapping
- Statistical analysis and model fitting
- Quick plotting with datasets
- Combining multiple traces
- Best practices for effective visualization
With the graph objects framework, you have extensive control to create interactive publication-quality scatter plots tailored to your specific needs.
This flexibility makes Plotly a strong choice for advanced statistical data visualization in Python.


