Boxplot with R

Cedric Vidonne

Lei Chen


The boxplot uses boxes and lines to show the distribution of one or more groups of numeric data, based on a five-number summary of the data points: the upper extreme (maximum), upper quartile (Q3), median, lower quartile (Q1), and lower extreme (minimum).
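
As a rough illustration of the five-number summary, the sketch below computes it in base R for a small vector of made-up ages (the `ages` values and all variable names are hypothetical, not taken from the dataset used on this page). It also shows one detail worth keeping in mind when reading the charts: with its default settings, `geom_boxplot()` draws each whisker only as far as the most extreme value within 1.5 times the interquartile range of the box, and plots anything beyond that as an individual point.

```r
# Hypothetical ages, for illustration only (not from the UNHCR dataset)
ages <- c(2, 7, 12, 18, 23, 29, 34, 41, 56, 90)

# Five-number summary: minimum, Q1, median, Q3, maximum
five <- quantile(ages, probs = c(0, 0.25, 0.5, 0.75, 1), names = FALSE)
names(five) <- c("min", "q1", "median", "q3", "max")

# Fences at 1.5 * IQR beyond the box, the default rule in geom_boxplot()
iqr <- five[["q3"]] - five[["q1"]]
lower_fence <- five[["q1"]] - 1.5 * iqr
upper_fence <- five[["q3"]] + 1.5 * iqr

# Whiskers end at the most extreme observations inside the fences
inside <- ages[ages >= lower_fence & ages <= upper_fence]
lower_whisker <- min(inside)
upper_whisker <- max(inside)

# Observations outside the fences are drawn as separate points
outliers <- ages[ages < lower_fence | ages > upper_fence]

print(five)
```

With these made-up values, the age of 90 lies beyond the upper fence, so it would appear as a separate point and the upper whisker would stop at 56 rather than at the maximum.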

More about: Boxplot


Boxplot

# Loading required packages
library(unhcrthemes)
library(tidyverse)

# Loading data
df <- read_csv("https://raw.githubusercontent.com/GDS-ODSSS/unhcr-dataviz-platform/master/data/distribution/boxplot.csv")

# Plot: one box per country of asylum
ggplot(df, aes(x = country, y = age)) +
  geom_boxplot(fill = unhcr_pal(n = 1, "pal_blue"),
               alpha = 0.3,
               color = unhcr_pal(n = 1, "pal_grey"),
               width = .5) +
  labs(
    title = "Refugees age distribution by country of asylum | 2020",
    caption = "Source: Data source here\n© UNHCR, The UN Refugee Agency",
    x = NULL,
    y = "Age"
  ) +
  # No padding below the lowest value, 10% above; a tick every 10 years
  scale_y_continuous(expand = expansion(c(0, 0.1)),
                     breaks = seq(0, 100, 10)) +
  theme_unhcr(grid = "Y")

A boxplot showing age distribution | 2020
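
To check the numbers behind each box, it can help to tabulate the quartiles per group before plotting. The sketch below assumes a data frame with the same `country` and `age` columns as the plot above, but uses a small hypothetical data frame (`toy`) so that it runs without downloading anything. The country labels and ages are invented.

```r
library(dplyr)

# Hypothetical data with the same column names as the plot (country, age)
toy <- data.frame(
  country = rep(c("A", "B"), each = 5),
  age     = c(5, 10, 20, 30, 40, 15, 25, 35, 45, 55)
)

# Quartiles per country: the bottom edge, middle line and top edge of each box
box_stats <- toy %>%
  group_by(country) %>%
  summarise(
    q1     = quantile(age, 0.25, names = FALSE),
    median = median(age),
    q3     = quantile(age, 0.75, names = FALSE),
    .groups = "drop"
  )

print(box_stats)
```

The same pipeline should work on `df` as loaded above, provided its columns are named `country` and `age` as the plotting code implies.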


Grouped boxplot

# Loading required packages
library(unhcrthemes)
library(tidyverse)

# Loading data
df <- read_csv("https://raw.githubusercontent.com/GDS-ODSSS/unhcr-dataviz-platform/master/data/distribution/boxplot.csv")

# Plot: one box per country and gender, distinguished by fill colour
ggplot(df, aes(x = country, y = age, fill = gender)) +
  geom_boxplot(alpha = 0.4,
               color = unhcr_pal(n = 1, "pal_grey"),
               width = .5) +
  labs(
    title = "Refugees age distribution and gender\nby country of asylum | 2020",
    caption = "Source: Data source here\n© UNHCR, The UN Refugee Agency",
    x = NULL,
    y = "Age"
  ) +
  scale_fill_unhcr_d(nmax = 3, order = c(2, 1)) +
  # No padding below the lowest value, 10% above; a tick every 10 years
  scale_y_continuous(expand = expansion(c(0, 0.1)),
                     breaks = seq(0, 100, 10)) +
  theme_unhcr(grid = "Y")

A boxplot showing age distribution by gender | 2020
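
Mapping `fill = gender` splits every country into one box per gender, so each box is built from fewer observations than in the ungrouped chart. A quick count per combination, sketched below on a hypothetical data frame (`toy`, with invented values), is one way to spot groups that are too small to summarise meaningfully.

```r
library(dplyr)

# Hypothetical data with the columns used in the grouped plot
toy <- data.frame(
  country = rep(c("A", "B"), each = 6),
  gender  = rep(c("Female", "Male"), times = 6),
  age     = c(4, 9, 17, 22, 31, 38, 12, 19, 27, 33, 45, 60)
)

# Number of observations behind each box (one row per country and gender)
group_sizes <- toy %>%
  count(country, gender)

print(group_sizes)
```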

