{ggseqplot}

      ggplotify sequence plots (updated slides)

Motivation


  • {TraMineR}’s default plots are almost publication-ready
  • But: virtually always some adjustments are necessary
  • Adjusting them requires some knowledge of base R’s plot()
  • However, today most (new) R users prefer {ggplot2}

Motivation


  • Goals:

    • Provide a package that uses {ggplot2} to render sequence plots
    • and that allows for using {ggplot2} functions and extensions to change the plot appearance
  • Tasks:

    • reshape sequence data into format required by {ggplot2} (usually long data format)
    • use appropriate geom_* functions to rebuild seqplot functions
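
The two tasks above can be sketched with toy data. This is only an illustration of the general idea, not the package's actual internals: the data frame `wide` and its columns are hypothetical, and base R's reshape() stands in for whatever reshaping {ggseqplot} performs.

```r
library(ggplot2)

# Hypothetical toy sequences in wide format:
# one row per case, one column per time point
wide <- data.frame(
  id = 1:4,
  t1 = c("A", "A", "B", "C"),
  t2 = c("A", "B", "B", "C"),
  t3 = c("B", "B", "C", "C")
)

# Task 1: reshape to long format (one row per case and time point)
long <- reshape(
  wide,
  direction = "long",
  varying   = c("t1", "t2", "t3"),
  v.names   = "state",
  timevar   = "time",
  times     = 1:3,
  idvar     = "id"
)

# Task 2: rebuild a state distribution plot with a geom_* function;
# position = "fill" stacks the states as relative frequencies
p <- ggplot(long, aes(x = factor(time), fill = state)) +
  geom_bar(position = "fill", width = 1) +
  labs(x = "Time", y = "Rel. frequency")
p
```

Because the result is an ordinary ggplot object, further layers, scales, and themes can be added with `+`.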


install.packages("ggseqplot")
library(ggseqplot)

What we can do with {ggseqplot}


Summarization plots        TraMineR functions   ggseqplot function  ggplot2 function
State Distribution Plot    seqdplot, seqstatd   ggseqdplot          geom_bar
Entropy Line Plot          seqHtplot, seqstatd  ggseqeplot          geom_line
Modal State Sequence Plot  seqmsplot, seqmodst  ggseqmsplot         geom_bar
Mean Time Plot             seqmtplot, seqmeant  ggseqmtplot         geom_bar
Transition Rate Plot       seqtrate             ggseqtrplot         geom_tile

What we can do with {ggseqplot}


Representation plots              TraMineR functions  ggseqplot functions      ggplot2 and related functions
Sequence Index Plot               seqiplot            ggseqiplot               geom_rect, ggh4x::facetted_pos_scales
Sequence Frequency Plot           seqfplot, seqtab    ggseqfplot (ggseqiplot)  geom_rect, ggh4x::facetted_pos_scales
Representative Sequence Plot      seqrplot, seqrep    ggseqrplot (ggseqiplot)  geom_rect, ggrepel::geom_label_repel,
                                                                               ggrepel::geom_text_repel,
                                                                               ggtext::element_markdown, patchwork
Relative Frequency Sequence Plot  seqrfplot           (ggseqiplot)             geom_rect, geom_boxplot, patchwork

Example


We use the well-known example data from {TraMineR} to render some plots.


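# attach the packages used throughout the examples
library(TraMineR)    # seqdef(), seqdplot(), example data
library(ggplot2)     # scales, themes, labs()
library(colorspace)  # scale_fill_discrete_sequential()
library(hrbrthemes)  # theme_ipsum()
library(forcats)     # fct_drop()
library(ggh4x)       # force_panelsizes()

data(actcal)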

set.seed(1)
actcal <- actcal[sample(nrow(actcal),300),]

actcal.lab <- c(
  "> 37 hours", "19-36 hours", 
  "1-18 hours", "no work"
  )

actcal.seq <- seqdef(
  actcal,
  13:24,
  labels = actcal.lab
  )

State distribution plots


seqdplot(actcal.seq)

TraMineR dplot - default version

ggseqdplot(actcal.seq)

ggseqplot dplot - default version

State distribution plots


ggseqdplot(actcal.seq,
           border = TRUE)

State distribution plots


ggseqdplot(actcal.seq,
           border = TRUE) +
  # Built-in months abbreviations for axis labels
  scale_x_discrete(labels = month.abb)

State distribution plots


ggseqdplot(actcal.seq,
           border = TRUE) +
  # Built-in months abbreviations for axis labels
  scale_x_discrete(labels = month.abb) + 
  # change the fill color palette
  scale_fill_discrete_sequential("heat")

State distribution plots


ggseqdplot(actcal.seq,
           border = TRUE) +
  # Built-in months abbreviations for axis labels
  scale_x_discrete(labels = month.abb) + 
  # change the fill color palette
  scale_fill_discrete_sequential("heat") +
  # apply & adjust alternative theme
  theme_ipsum() +
  theme(
    legend.position = "bottom",
    legend.title = element_blank(),
    legend.text = element_text(size = 11)
    )

State distribution plots

ggseqdplot(actcal.seq, border = TRUE,
           group = actcal$sex,   # Group by gender
           dissect = "row") +    # separate plot for each state
  scale_x_discrete(labels = month.abb) +
  scale_fill_discrete_sequential("heat")

Sequence index plots


seqIplot(actcal.seq, sortv = "from.end")

TraMineR iplot

ggseqiplot(actcal.seq, sortv = "from.end")

ggseqplot iplot

Sequence index plots


ggseqiplot(actcal.seq, 
           sortv = "from.end")

Sequence index plots


ggseqiplot(actcal.seq, 
           sortv = "from.end", 
           group = actcal$sex)

Sequence index plots


ggseqiplot(actcal.seq, 
           sortv = "from.end", 
           group = actcal$sex,
           facet_scale = "fixed")

Sequence index plots


# using {ggh4x} to get varying plot sizes

# a vector storing the heights of the subplots   
hghts <- table(fct_drop(actcal$sex))/nrow(actcal.seq)

ggseqiplot(actcal.seq, 
           sortv = "from.end", 
           group = actcal$sex,
           facet_ncol = 1) +
  force_panelsizes(rows = hghts) +
  theme(panel.spacing = unit(1, "lines"))
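
The height vector used above is simply each group's share of all sequences. A toy sketch with a hypothetical five-case grouping factor shows the computation in isolation; base R's droplevels() is used here in place of forcats::fct_drop() so that the snippet needs no extra packages.

```r
# Hypothetical grouping factor with one unused level ("other")
grp <- factor(c("man", "woman", "woman", "man", "woman"),
              levels = c("man", "woman", "other"))

# Drop the empty level, then convert group counts to shares;
# the shares sum to 1 and can serve as relative panel heights
hghts <- table(droplevels(grp)) / length(grp)
hghts
```

Dropping unused levels first matters: an empty level would otherwise contribute a zero-height entry without a matching panel.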

Sequence index plots


ggseqiplot(actcal.seq, sortv = "from.end") + 
  # Use months abbreviations for axis labels
  scale_x_discrete(labels = month.abb) + 
  # change the fill and border color
  scale_fill_discrete_sequential("heat") +
  scale_color_discrete_sequential("heat") +
  # add a title and an axis title
  labs(x = "Month",
       title = "Piccarreta-flavored Index Plot") +
  # let the time run "bottom-up" instead of "left-right"
  coord_flip() +
  # Change the position and size 
  # of the title and the legend position
  theme(legend.position = "top",
        plot.title = element_text(size = 30),
        plot.title.position = "plot")

Representative sequence index plots

# Compute dissimilarity matrix
lcs.dis <- seqdist(actcal.seq, method="LCS")

seqrplot(actcal.seq, diss = lcs.dis, coverage = .7)

TraMineR rplot

# Compute dissimilarity matrix
lcs.dis <- seqdist(actcal.seq, method="LCS")

ggseqrplot(actcal.seq, diss = lcs.dis, coverage = .7, border = TRUE)

ggseqplot rplot

Transition rate plots

ggseqtrplot(actcal.seq, 
            group = actcal$sex)

Some words of caution


Warning

  • I am working on this package alone in my spare time and I am a novice R developer
  • Double check ggseqplots by comparing them to the TraMineR plots


Important

  • Don’t use ggseqrfplot until the current issue is fixed
  • Dependencies are not specified correctly: update tidyverse package(s)

  • Both issues should be fixed after update to version 0.8.3 (on CRAN since 2023-09-22)
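
A quick way to check whether an installation predates the fixed release is to compare version numbers. The helper `needs_update()` below is hypothetical (it is not part of {ggseqplot}); it only wraps base R's package_version() comparison.

```r
# Hypothetical helper: TRUE if `installed` is older than `required`
needs_update <- function(installed, required = "0.8.3") {
  package_version(installed) < package_version(required)
}

# Compare the installed {ggseqplot} version, if the package is available
if (requireNamespace("ggseqplot", quietly = TRUE)) {
  installed <- as.character(utils::packageVersion("ggseqplot"))
  if (needs_update(installed)) {
    message("ggseqplot ", installed, " is older than 0.8.3; ",
            "consider running install.packages(\"ggseqplot\")")
  }
}
```

package_version() compares version components numerically, so a version such as 0.10.0 is correctly treated as newer than 0.8.3.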

Future plans


  • High priority: Fix known issues (rfplot; dependencies)
  • Low priority: include additional plot types and revise current functions