Connected scatterplot with R

Cedric Vidonne

Lei Chen

A connected scatterplot shows how a series of data points evolves, typically over time, by joining consecutive points with straight line segments. It is not always the most intuitive chart to read, but it can be effective for storytelling.
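Because the points are joined in sequence, row order matters. The sketch below illustrates this with made-up toy data (the `toy` data frame and its values are hypothetical). It uses `geom_path()`, which joins points in the order they appear in the data. This is one possible approach, not necessarily the one used for the chart on this page.

```r
library(ggplot2)

# Hypothetical toy data: values are made up purely for illustration
toy <- data.frame(
  year = 2001:2005,
  x = c(1, 3, 2, 4, 3),
  y = c(2, 3, 5, 4, 1)
)

# Make sure rows are in chronological order before plotting
toy <- toy[order(toy$year), ]

# geom_path() joins points in row order, so the line can double back on itself
p_path <- ggplot(toy, aes(x, y)) +
  geom_path() +
  geom_point()

# geom_line() would instead join points by increasing x, losing the time order
p_line <- ggplot(toy, aes(x, y)) +
  geom_line() +
  geom_point()
```

Printing `p_path` and `p_line` side by side should make the difference visible: only the first keeps the chronological sequence.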

Connected scatterplot

# Loading required packages
library(unhcrthemes)
library(tidyverse)
library(scales)
library(ggrepel)

# Loading data
df <- read_csv("https://raw.githubusercontent.com/GDS-ODSSS/unhcr-dataviz-platform/master/data/correlation/scatterplot_connected.csv")

# Plot
ggplot(
  df,
  aes(
    x = refugee_number,
    y = idp_number
  )
) +
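  # Draw a segment from each point to the next one; the last point has no
  # successor, so its end coordinates are NA and that row is dropped quietly
  geom_segment(
    aes(
      xend = c(tail(refugee_number, n = -1), NA),
      yend = c(tail(idp_number, n = -1), NA)
    ),
    color = unhcr_pal(n = 1, "pal_grey"),
    na.rm = TRUE
  ) +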
  geom_point(
    color = unhcr_pal(n = 1, "pal_blue"),
    size = 3
  ) +
  # Label only every second row to limit overlapping text
  geom_text_repel(
    data = df[seq(1, nrow(df), 2), ],
    aes(label = year),
    size = 8 / .pt,
    point.padding = 5
  ) +
  labs(
    title = "Evolution of refugee vs IDP population in Afghanistan\n2001-2021",
    y = "Number of IDPs",
    x = "Number of refugees",
    caption = "Source: Data source here\n© UNHCR, The UN Refugee Agency"
  ) +
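  # Abbreviate large axis values with short-scale suffixes (K, M, ...)
  scale_x_continuous(labels = label_number(scale_cut = cut_short_scale())) +
  scale_y_continuous(labels = label_number(scale_cut = cut_short_scale())) +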
  theme_unhcr(
    grid = "XY",
    axis = FALSE,
    axis_title = "xy"
  )

A connected scatterplot showing evolution of refugee vs IDP population in Afghanistan | 2001-2021
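The plot code relies on two small data manipulations: shifting a vector by one position to get each segment's end point, and picking every second row for the labels. The base R sketch below walks through both on a hypothetical vector `x` (values are made up and are not the UNHCR data), assuming the rows are already in chronological order.

```r
# Hypothetical values, in chronological order (not the UNHCR data)
x <- c(10, 20, 15, 30)

# Drop the first element and pad with NA so each row holds the next row's value
xend <- c(tail(x, n = -1), NA)

# Each row now describes one segment from x to xend; the last row has no successor
segments <- data.frame(x = x, xend = xend)
print(segments)

# Row indices of every second observation, as used for the year labels
label_rows <- seq(1, length(x), 2)
print(label_rows)
```

The trailing `NA` is why the last observation gets a point but no outgoing segment.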

