Creating Interactive Scatter Plots with Plotly Graph Objects

Scatter plots are one of the most common and useful plots for visualizing relationships between two continuous variables. The Plotly Python library provides a graph objects API (plotly.graph_objects) for building customizable, interactive scatter plots.

In this comprehensive guide, we will explore how to:

Create basic scatter plots with graph objects
Customize markers, colors, scales, and styles
Animate scatter plot traces over time
Add interactivity with hover text, click events, selections
Plot geographic data on scatter mapbox plots
Leverage datasets to quickly generate plots
Combine multiple traces for advanced use cases
Apply statistical analysis and model fitting
Follow best practices for scatter plot design

Getting Started with Basic Scatter Plots

To get started, import plotly.graph_objects and instantiate a Figure containing a Scatter trace:

import plotly.graph_objects as go

fig = go.Figure(data=go.Scatter(
    x=[0, 1, 2, 3],
    y=[2, 1, 4, 3]  
))

fig.show()

By default, Plotly connects the points with lines (for traces with fewer than 20 points it draws both lines and markers). To switch to markers-only mode, set the mode parameter to "markers":

fig = go.Figure(data=go.Scatter(
    x=[0, 1, 2, 3],
    y=[2, 1, 4, 3],
    mode="markers"
))

Customizing Markers, Colors, and Styles

We have extensive control over the visual styling of the markers. Here are some of the customizations we can apply:

Size, Symbol, Color

fig = go.Figure(data=go.Scatter(
    x=[0, 1, 2, 3],
    y=[2, 1, 4, 3], 
    mode="markers", 
    marker=dict(
        size=30,
        symbol='pentagon',
        color='rgb(200, 0, 0)',
    )
))

Border Width, Color

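fig = go.Figure(data=go.Scatter(
    x=[0, 1, 2, 3],
    y=[2, 1, 4, 3],
    mode="markers",
    marker=dict(
        size=30,
        # The marker outline is configured through marker.line
        line=dict(
            width=4,
            color='rgb(0, 0, 0)'
        )
    )
))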

Opacity

fig = go.Figure(data=go.Scatter(
    marker=dict(
        opacity=0.5
    )
))

We can also set these properties to arrays to visually encode an extra variable.

For example, mapping the marker size to a third variable:

fig = go.Figure(data=go.Scatter(
    x=[0, 1, 2, 3],
    y=[2, 1, 4, 3],
    mode="markers",
    marker=dict(
        # One size per point
        size=[10, 20, 30, 40],
    )
))

Customizing Colorscales

To visualize a third continuous variable, we can map a colorscale to the markers.

First we import plotly.express to access its built-in colorscales, then pass a colorscale to the marker parameter along with showscale=True to display the colorbar:

import plotly.express as px
fig = go.Figure(data=go.Scatter(
    x=[0, 1, 2, 3],
    y=[2, 1, 4, 3],
    mode="markers", 
    marker=dict(
        color=[180, 220, 280, 340],
        colorscale=px.colors.sequential.Viridis, 
        showscale=True
    )
))

Colorscale	Description
Viridis	Perceptually uniform, print-friendly
Cividis	Friendly to color vision deficiency
Turbo	Distinct colors

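We can also color the markers using a categorical column from a dataset. Marker colors must be numbers or color strings, so the category labels are converted to integer codes first:

import pandas as pd
df = pd.DataFrame({
    'x': [0, 1, 2, 3],
    'y': [2, 1, 4, 3],
    'category': ['a', 'b', 'a', 'b']
})

# Map each category label to an integer code (a -> 0, b -> 1)
codes = df['category'].astype('category').cat.codes

fig = go.Figure(data=go.Scatter(
    x=df['x'],
    y=df['y'],
    mode='markers',
    marker=dict(
        color=codes,
        colorscale=['blue', 'red'],
        showscale=True
    )
))

This makes it easy to spot clusters and categories at a glance.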

Animating Scatter Plot Traces Over Time

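To create animated scatter plots, we add frames defining the data for each timestep. Frames only play when triggered, so the layout also needs a Play button:

import numpy as np
import plotly.graph_objects as go

t = np.linspace(0, 20, 100)
x = np.sin(t) + np.random.randn(100) * 0.2
y = np.cos(t) + np.random.randn(100) * 0.2

fig = go.Figure(
    data=go.Scatter(x=[x[0]], y=[y[0]], mode="markers+lines"),
    # Frame k shows the first k+1 points
    frames=[
        go.Frame(data=[go.Scatter(x=x[:k + 1], y=y[:k + 1])])
        for k in range(len(x))
    ],
    layout=go.Layout(
        # Fix the axis ranges so they do not rescale on every frame
        xaxis=dict(range=[-2, 2], autorange=False),
        yaxis=dict(range=[-2, 2], autorange=False),
        updatemenus=[dict(
            type="buttons",
            buttons=[dict(label="Play", method="animate", args=[None])]
        )]
    )
)

fig.show()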

This plots the points sequentially over time. We can also vary styles such as colors and sizes from frame to frame.

Some common animation examples include:

Simulating model predictions
Visualizing movement over time
Showing temporal patterns and seasonality

Adding Interactivity to Scatter Plots

Plotly figures have built-in support for hover tooltips, click events, and selections.

Hover Text

To set the text displayed when hovering over a point, use the hovertext or text parameter, with one label per point:

import numpy as np

fig = go.Figure(data=go.Scatter(
    x=np.random.rand(10),
    y=np.random.rand(10),
    # Ten labels, 'Point A' through 'Point J', to match the ten points
    hovertext=[f'Point {c}' for c in 'ABCDEFGHIJ'],
    mode='markers'
))

Click Events

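In a Jupyter notebook we can also execute Python callbacks when points are clicked. Callbacks require a go.FigureWidget rather than a plain go.Figure, and clickmode='event+select' is set on the layout, not on the trace:

import numpy as np
import plotly.graph_objects as go

def click_handler(trace, points, state):
    # point_inds is empty when the click did not hit this trace
    if points.point_inds:
        ind = points.point_inds[0]
        print(f'Clicked on Point {ind}')

fig = go.FigureWidget(data=go.Scatter(
    x=np.random.rand(10),
    y=np.random.rand(10),
    mode='markers'
))
fig.update_layout(clickmode='event+select')

fig.data[0].on_click(click_handler)
fig  # display the widget as the last expression in a notebook cell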

Selections

There is also built-in support for selecting points with rectangles or lasso shapes:

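fig = go.Figure(data=go.Scatter(
    x=np.random.rand(10),
    y=np.random.rand(10),
    mode='markers',
    # An empty list starts with no points selected, so every point
    # is drawn in the unselected style until a selection is made
    selectedpoints=[]
))

fig.update_layout(
    dragmode='lasso'  # use 'select' for rectangular selection
)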

These interactivity features make Plotly scatter plots much more engaging and usable for things like data cleaning, identification of outliers, and dynamic linking to other plots.

Geographic Scatter Plots with Mapbox

For geographic data, we can create interactive scatter plots mapped onto Mapbox maps.

Create a go.Scattermapbox trace and pass in the latitude and longitude data:

import plotly.graph_objects as go
fig = go.Figure(data=go.Scattermapbox(
    lat=[45, 53, 38],
    lon=[-75, -3, -97], 
    mode = "markers",
    marker = go.scattermapbox.Marker(
        size = 14
    )
))

Then configure the Mapbox layout:

fig.update_layout(
    mapbox=dict(
        style="open-street-map",
        center=dict(lat=40, lon=-20),
        zoom=1  # low zoom level so all three markers are in view
    )
)

The "open-street-map" style used here does not need a Mapbox access token. Styles hosted by Mapbox, such as "streets" or "satellite", do require one.

Common use cases include:

Plotting travel paths over time
Visualizing spatial point patterns
Linking geographic regions to multivariate data

Statistical Analysis and Model Fitting

Scatter plots provide an intuitive visualization for statistical analysis between two variables.

We can fit models to assess the correlation and relationship. For example, using numpy polyfit to fit a polynomial:

import numpy as np
x = np.array([0, 1, 2, 3, 4, 5])
y = np.array([1, 3, 2, 4, 7, 10])

z = np.polyfit(x, y, 2)
f = np.poly1d(z)

xp = np.linspace(0, 5, 100)
yp = f(xp)

fig = go.Figure(data=go.Scatter(
    x=x, y=y,
    mode='markers',
    name='data'
))
# Overlay the fitted quadratic as a smooth line
fig.add_trace(go.Scatter(x=xp, y=yp,
                mode='lines',
                name='quadratic fit',
                line=dict(color='darkblue', width=2)))

fig.show()

We can also add linear regression, confidence intervals, compute correlation coefficients, and more. These modeling capabilities make Plotly scatter plots useful for quantitative analysis.

Leveraging Datasets for Quick Plots

For rapid data exploration, Plotly Express lets you instantly visualize DataFrames:

import plotly.express as px
df = px.data.iris()

fig = px.scatter(df, x="sepal_width", y="sepal_length") 
fig.show()

Plotly Express handles details like axes labels, hovers, colors, and legends automatically.

We can then customize the generated figure with the same update methods that graph objects figures provide:

import plotly.express as px

df = px.data.iris()

# custom_data makes the species column available to the hover template
fig = px.scatter(df, x="sepal_width", y="sepal_length",
                 custom_data=["species"])

fig.update_traces(
    hovertemplate="Species: %{customdata[0]}<extra></extra>"
)
fig.update_layout(
    title='My Custom Iris Plot',
    width=800,
    height=500
)

This gives the best of both worlds: quick exploration with Express then customization with graph objects.

Combining Multiple Trace Types

By layering different trace types, we can build rich scatter plot use cases:

import numpy as np
import plotly.graph_objects as go

t = np.linspace(0, 10, 200)

fig = go.Figure() 

fig.add_trace(go.Scatter(
    x=t, y=np.sin(t),
    mode='lines',
    name='sin function'
))

fig.add_trace(go.Scatter(
    x=t, y=np.cos(t),
    mode='markers',
    name='cos samples',
    marker=dict(size=10)
))

fig.show()

Some ideas for multiple traces:

Regressions + data points
Fitted models + confidence intervals
Data points + smoothed trends
Timeseries forecast vs actual
Geographic regions + categories

By overlaying scatter traces, you can build rich, customized data visualizations.

Design Best Practices for Scatter Plots

Here are some key tips for effective scatter plot design:

Label Clearly

Give the plots and axes clear descriptive labels
Include units of measurement where applicable
Use plot subtitles and captions if needed for clarity

Show Origins

Anchor axes at zero when possible
If an axis is zoomed or transformed (for example, log-scaled), indicate this clearly in the axis label

Visualize Distributions

Use marginal histograms or KDE plots to show the distribution of each variable
This can help identify patterns such as gaps and natural clusters

Color and Symbol Encode

Use color or symbols to visually encode categories
Apply distinct, easily differentiable colors and symbols

Control Density

Adjust marker size, opacity, jitter to control overplotting
Show marker density with heatmaps

By following these kinds of best practices, you can create scatter plots that clearly communicate the relationships in your data.

Conclusion

In this comprehensive guide, we explored how to leverage Plotly's powerful graph objects API for building customizable scatter plots in Python.

Key topics included:

Basic scatter plot configuration
Customizing markers, colors, styles, and scales
Adding animations, interactivity, geographic mapping
Statistical analysis and model fitting
Quick plotting with datasets
Combining multiple traces
Best practices for effective visualization

With the graph objects framework, you have extensive control to create interactive publication-quality scatter plots tailored to your specific needs.

This flexibility makes Plotly one of the best libraries for advanced statistical data visualization in Python.
