[R] Hybrid map: Chloropleth X Time series

The {geofacet} extension provide a facet_geo() function which allows to position plots in a similar pattern to the original geography. The amount of information given by the whole plot is thus more important than on classical chloropleths, where the different entities of the map are colored according to a single variable. But such plots are harder to read and less accessible than chloropleths.

In this tutorial, we will see how we can mix both approaches, and add a “classic” chloropleth map to the plots produced with {geofacet}. As an example, we will use the TidyTuesday dataset about milk production in the US

1. Time series map with {geofacet}

This dataset contains a table listing milk production by state from 1970 to 2017.

library(tidyverse)
# Load data 
production<-read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2019/2019-01-29/state_milk_production.csv')

head(production)
## # A tibble: 6 x 4
##   region    state          year milk_produced
##   <chr>     <chr>         <dbl>         <dbl>
## 1 Northeast Maine          1970     619000000
## 2 Northeast New Hampshire  1970     356000000
## 3 Northeast Vermont        1970    1970000000
## 4 Northeast Massachusetts  1970     658000000
## 5 Northeast Rhode Island   1970      75000000
## 6 Northeast Connecticut    1970     661000000

In this table, milk production is given in pounds. We will make a quick conversion to liters.

# 100 pounds of milk ~ 44 liters
production <- production %>%
  mutate(milk_liter=milk_produced*(44/100))

We will start with a combined graph with facet_wrap(), showing the temporal evolution of the milk production in each state.

ggplot(
  data=production, 
  aes(x=year,y=milk_liter/10^9))+
  geom_line()+
  # One plot by state
  facet_wrap(~state)+
  scale_x_continuous(breaks=c(1970,1986,2002))+
  scale_y_continuous(breaks=c(0,7,14))+
  labs(
    y="Milk production (billion liters)",
    x="")+
  theme_minimal()

In the plot above, the states are arranged in alphabetical order, not by geographical location. After loading the {geofacet} extension, we just have to change facet_wrap() to facet_geo() to convert this plot into a map.

library(geofacet)

ggplot(
  data=production, 
  aes(x=year,y=milk_liter/10^9))+
  geom_line()+
  # One plot by state
  facet_geo(~state)+
  scale_x_continuous(breaks=c(1970,1986,2002))+
  scale_y_continuous(breaks=c(0,7,14))+
  labs(
    y="Milk production (billion liters)",
    x="")+
  theme_minimal()

Looking closely at the graph above, we can see that no data is available for the District of Columbia in our data set. To remove this state from our plot, we will simply change the reference grid. The use of the predefined grids in geofacet is quite straightforward, and many grids are avaible for various countries.

library(geofacet)

ggplot(
  data=production, 
  aes(x=year,y=milk_liter/10^9))+
  geom_line()+
  # Reference grid without DC
  facet_geo(~state, grid = "us_state_without_DC_grid3")+
  scale_x_continuous(breaks=c(1970,1986,2002))+
  scale_y_continuous(breaks=c(0,7,14))+
  labs(
    y="Milk production (billion liters)",
    x="")+
  theme_minimal()

2. Adding chloropleth

The resulting graph may be difficult to read, due to the amount of information available. To make it easier to read, we will color each state according to the amount of milk produced relative to the country’s total production in 2017, in a similar way to chloropleths.

To do so, we will start by calculating this ratio for each state.

# Compute percentage produced by each state and by year
production<-production%>%
  dplyr::group_by(year)%>%
  dplyr::mutate(tot=sum(milk_liter))%>%
  ungroup()%>%
  mutate(per=milk_liter/tot*100)

We are now ready to add this information on the map with geom_rect().

chloropleth<- ggplot(
  production, aes(x=year,y=milk_liter/10^9))+
  # Place geom_rect() before geom_line()
  geom_rect(
    # Select only 2017
    data=production%>%filter(year==2017),
    # Fill according to percentage
    aes(fill=per),xmin=1970,xmax=Inf,ymin=0,ymax=Inf,
    inherit.aes = FALSE)+
  scale_fill_gradient(low="#e1e5f2",high="#1f7a8c")+
  geom_line(color="#d1495b")+
  scale_x_continuous(breaks=c(1970,1986,2002))+
  scale_y_continuous(breaks=c(0,7,14))+
  labs(
    y="Milk production (billion liters)",
    x="",
    fill="% of country\nproduction (2017)")+
  facet_geo(~state, grid = "us_state_without_DC_grid3")+
  theme_minimal()

chloropleth

So California is clearly visible as the main milk producer in the US, which may be surprising given the state’s rather dry climate, which can limit the production of forage resources.

To finish the layout of the graph, we can manually add the grid of graphs, which has been covered by geom_rect().

# Grid coordinates
grid <- tibble(
  x=c(1986,2002),
  y=c(7,14)
)

chloropleth+
  # Add grid to plot
  geom_segment(
    data=grid,aes(x=x,xend=x),y=0,yend=Inf,color="white"
  )+
  geom_segment(
    data=grid,aes(y=y,yend=y),x=0,xend=Inf,color="white"
  )

3. Customize plot

You may now customize the plot! You will find below an example (full code available here).