Gaspare Tortorici - Remittances from Italy

The financial side of migration

Migrants often keep in touch with those left behind through letters, postcards, messages, phone and video calls, etc. These are all ways to bridge geographical distances and feel closer to one’s family. While often neglected, there is a possibly less romantic, intimately financial side of this: remittances – defined as the money that migrants send back home to their families and friends.

According to the Italian Statistics Office, there were about 5 million foreign residents in Italy as of 2023 (or about 8.7% of the whole population). The Bank of Italy has created a fascinating data set that contains information on formal remittance flows from each Italian province to any country of the world, from 2005 to 2023, on a yearly basis.

This post offers a graphical representation of it, and comments on some of the most glaring facts that shine through the data.

What countries receive most remittances?

After downloading the data from the Bank of Italy website (https://www.bancaditalia.it/statistiche/tematiche/rapporti-estero/rimesse-immigrati/), we start exploring it by computing some period totals. As far as I understand, the unit of measurement is nominal (not real) €s, meaning that the data is not adjusted for inflation. Additionally, informal channels are not picked up, suggesting that these figures might understate the actual size of the phenomenon.
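To make the nominal/real distinction concrete, here is a minimal sketch of how one could deflate a nominal series. The figures and the price index below are placeholders I made up for illustration, not values from the Bank of Italy file, and which price index is appropriate (Italian or recipient-country prices) is itself a judgment call.

```r
# Hypothetical inputs: placeholder numbers, not actual data
nominal <- c(`2021` = 100, `2022` = 100, `2023` = 100) # nominal flows, millions of €
cpi     <- c(`2021` = 100, `2022` = 108, `2023` = 114) # price index, base year 2021 = 100

# Real value = nominal value divided by the price index relative to the base year
real <- nominal / (cpi / 100)

round(real, 1)
```

With a constant nominal flow and rising prices, the real series declines, which is why nominal totals summed over many years should be read with some caution.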

Code

library(readxl)
library(tidyverse)
library(viridis)
library(ggthemes)
library(treemapify)

a <- read_excel("rimesse.xlsx", sheet = 3, skip = 1)

# glimpse(a)
names(a)[5] <- "value"

largest_recepients <- a %>% 
                      group_by(country_name, iso = country_code) %>% # Group by country
                      summarise(value = round(sum(value, na.rm = T), 0)) %>% # Sum all yearly values and round to the nearest integer
                      ungroup() %>% # Ungroup
                      arrange(desc(value)) %>% # Arrange the result in descending order
                      mutate(country_name = str_to_title(country_name)) # This is a pet peeve but I like when country names are not written in capital letters. The first will suffice!

non_zero <- largest_recepients %>%
            filter(value > 0) # 164 countries

# sum(non_zero$value) # 54824 million, i.e. about €54.8 billion!

ten_largest <- largest_recepients %>%
               slice(1:10) %>% # Get a slice of the first 10 rows
               pull(country_name) # Pull a vector with the names of the largest recipients

Migrants in Italy send money to 164 of the 200-odd countries in the world today. The overall 2016-2024 volume is mind-boggling: €54.824 billion. That is a fourth of the planned 2021-2026 Next Generation EU (https://www.italynextgeneration.eu/recoveryfund-en/) funding to Italy. Given that the list of countries is very long, it may be worth visualising the data through a tree map.

Code

ggplot(non_zero %>%
        mutate(value = value/1000), 
       aes(area = value, fill = value, label = paste(country_name, "\n", round(value, 2)))) +
  geom_treemap() +
  geom_treemap_text(family = "Palatino", colour = "white", place = "centre", grow = F, reflow = F) +
  scale_fill_gradient("Remittances, Millions of €s", high = "#023047", low = "royalblue") +
  theme(legend.position = "bottom")

Let us consider Bangladesh, the largest recipient of remittances from Italy: it secured more than €1.1 billion in 2023 alone. Obviously, the Bangladeshi diaspora sends money from other countries too, so it is useful to put this number in perspective. According to the World Bank, Bangladesh secured a grand total of about €21 billion in remittances (5% of its GDP), meaning that the community in Italy accounted for a mere 5.5% of all remittances that flowed to Bangladesh in 2023.
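As a quick sanity check, the share can be recomputed from the rounded figures quoted in the text. This is only back-of-the-envelope arithmetic: €1.1 billion is a lower bound and €21 billion is approximate, so the result need not match the 5.5% above exactly.

```r
# Rounded figures quoted in the text, in billions of €
from_italy_bn <- 1.1 # lower bound ("more than €1.1 billion")
total_bn      <- 21  # approximate total received by Bangladesh in 2023

# Share of total remittances that originates in Italy, in percent
share_pct <- from_italy_bn / total_bn * 100

round(share_pct, 1) # 5.2
```

The gap between this figure and the 5.5% quoted above presumably reflects rounding in the two inputs.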

Here I report the 10 largest recipients of remittances from Italy:


Rank	Country Name	Millions of €s
1	Bangladesh	6770
2	Romania	5000
3	The Philippines	3922
4	Pakistan	3761
5	Morocco	3447
6	Senegal	3066
7	India	2934
8	Sri Lanka	2383
9	Georgia	2099
10	Peru	2041
…	…	…

A cursory analysis of the table suggests that the ranking does not perfectly reflect the size of these ethnic communities in Italy. As of 2023, there were more than a million Romanians, while Bangladeshis numbered less than a fifth of that. The second most represented country among migrants (i.e. Albania) does not even make it to the remittance top-10 list (it is currently the 13th largest recipient, but it received more in the past). Why is this? A proper appraisal of the causes is beyond the scope of this post, but the following factors are likely to be part of the story: different propensity to remit, remitting costs, and favourable exchange rates; tighter family links; economic success in Italy; longer migration spells; different development levels.

Migrants and remittances

Let us test this idea slightly more formally and focus on a single year, 2023. We download some data on foreigners in Italy from the Italian Statistics Office (http://dati.istat.it/Index.aspx?DataSetCode=DCIS_POPSTRCIT1), and merge it with the remittance data. We then create a scatterplot, and overlay a regression line on it.

Code

istat <- read.csv("istat_stranieri.csv", sep = "|") 

# names(istat)
# glimpse(istat)

istat_2023 <- istat %>%
              filter(Territorio == "Italia", Sesso == "totale") %>% # Select national level statistics, for both men and women
              select(7, 11) %>% # Select country codes and how many foreigners there were
              setNames(c("iso", "value_migration")) %>%
              mutate(value_migration = value_migration/1000)

remittances_2023 <- a %>% 
                    filter(year == 2023) %>%
                    group_by(country_name, iso = country_code) %>% 
                    summarise(value_remittances = round(sum(value, na.rm = T), 0)) %>%
                    ungroup() %>% 
                    arrange(desc(value_remittances)) %>%
                    mutate(country_name = str_to_title(country_name))
                    
# Now we merge these two objects using their iso code
                    
df <- istat_2023 %>%
      left_join(remittances_2023, by = "iso") %>%
      filter(value_remittances > 0) # We only want positive values of remittances

# cor(df$value_migration, df$value_remittances)

ggplot(df, aes(x = value_migration, y = value_remittances, label = iso)) +
  geom_point(color = "#023047", size = 2) + 
  geom_text(vjust = 1.75, size = 3) +
  geom_smooth(method = "lm", se = F, color = "royalblue") +
  theme_minimal() +
  labs(x = "Number of Migrants, Thousands", y = "Remittances sent, Millions of €s")

There is a moderate positive correlation that would increase if one excluded some outliers like Romania and Bangladesh.
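The effect of outliers on a correlation coefficient can be illustrated with a small sketch. The data frame below is entirely made up (the codes "XX" and "YY" stand in for two outlying countries) and only mirrors the column names used above; it says nothing about the actual size of the effect in the real data.

```r
# Toy data: five well-behaved points plus two outliers (all values invented)
toy <- data.frame(
  iso               = c("AA", "BB", "CC", "DD", "EE", "XX", "YY"),
  value_migration   = c(10, 20, 30, 40, 50, 200, 30),
  value_remittances = c(12, 19, 33, 41, 48, 60, 150)
)

# Pearson correlation on the full sample
cor_all <- cor(toy$value_migration, toy$value_remittances)

# Same statistic after dropping the two outlying codes
trimmed     <- subset(toy, !(iso %in% c("XX", "YY")))
cor_trimmed <- cor(trimmed$value_migration, trimmed$value_remittances)

c(all = cor_all, trimmed = cor_trimmed)
```

On the real data, the same comparison can be made by running `cor()` on `df` before and after filtering out the two country codes.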

Code

ggplot(df %>%
         filter(iso != "RO", iso != "BD"), aes(x = value_migration, y = value_remittances, label = iso)) +
  geom_point(color = "#023047", size = 2) +
  geom_text(vjust = 1.75, size = 3) +
  geom_smooth(method = "lm", se = F, color = "royalblue") +
  theme_minimal() +
  labs(x = "Number of Migrants, Thousands", y = "Remittances sent, Millions of €s")

The cloud in the bottom-left corner indicates that most ethnic communities are relatively small and remit relatively little in absolute terms. Groups in the top-left remit disproportionately more relative to their size; the opposite is true for groups in the bottom right.
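One way to make "disproportionately more relative to their size" explicit is to divide remittances by the number of residents. The sketch below uses invented numbers and assumes the units shown on the axes (thousands of residents, millions of €); the column name `per_resident_k` is my own.

```r
# Toy data with made-up values, using the same column names as df
toy_ratio <- data.frame(
  iso               = c("AA", "BB", "CC"),
  value_migration   = c(50, 400, 120),  # thousands of residents
  value_remittances = c(300, 200, 120)  # millions of €
)

# Millions of € per thousand residents equals thousands of € per resident
toy_ratio$per_resident_k <- toy_ratio$value_remittances / toy_ratio$value_migration

# Rank communities from the highest to the lowest amount remitted per resident
toy_ratio[order(-toy_ratio$per_resident_k), ]
```

In this toy example "AA" would sit in the top-left of the scatterplot (small community, large flows) and "BB" in the bottom-right.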

From Italy to the World

We can also plot the data on a map. Before being able to do that we need to merge our remittances data to a shapefile of the world. We use country codes rather than country names because our sources are in different languages.

Code

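library(sf)

crs <- 4326 # Coordinate Reference System

shp_world <- st_read(dsn = "world-administrative-boundaries/world-administrative-boundaries.shp", quiet = T) %>%
             st_transform(crs = crs)

largest_recepients <- a %>%
                      group_by(country_name, country_code) %>% 
                      summarise(value = round(sum(value, na.rm = T), 0)) %>%
                      ungroup() %>%
                      arrange(desc(value))

shp_world <- shp_world %>%
             mutate(value = largest_recepients$value[match(iso_3166_1_, largest_recepients$country_code, nomatch = NA)]) %>%
             mutate(value = value/1000)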

# ggplot() +
#   geom_sf(data = shp_world, size = .1, color = "black", aes(fill = value)) +
#   scale_fill_viridis("Remittances sent, Millions of €s", direction = - 1, end = .95) +
#   theme_map() +
#   theme(legend.position = "bottom") +
#   coord_sf()

# ggsave("remittances_world.png", height = 8, width = 10)
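The lookup above relies on `match()`, which returns the position of each code in the second vector and `NA` when a code is absent. Here is a minimal sketch on toy data (codes and values are invented, and `shapes` is just a stand-in for the shapefile's attribute table) that also lists the codes that fail to match on either side.

```r
# Toy stand-ins for the remittance table and the shapefile attributes
remit  <- data.frame(country_code = c("BD", "RO", "PH"), value = c(3, 2, 1))
shapes <- data.frame(iso = c("RO", "BD", "ZZ"))

# Position of each shape code within the remittance table (NA if not found)
idx <- match(shapes$iso, remit$country_code)
shapes$value <- remit$value[idx]

# Codes present on one side only: worth checking before drawing the map
no_data  <- setdiff(shapes$iso, remit$country_code) # shapes without remittance data
no_shape <- setdiff(remit$country_code, shapes$iso) # remittance rows without a shape

shapes
```

Countries that end up with `NA` are drawn with the map's missing-value colour, so a quick `setdiff()` check helps tell genuine zeros apart from mismatched codes.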

What happened over time?

So far, we have neglected the time dimension. That’s a shame! Let us start by plotting remittance trends for the 10 largest recipients.

Code

largest_recepients_time <- a %>%
                           group_by(country_name, year) %>% 
                           summarise(value = round(sum(value, na.rm = T), 0)) %>% 
                           ungroup() %>% 
                           arrange(desc(value)) %>%
                           mutate(country_name = str_to_title(country_name)) %>%
                           filter(country_name %in% ten_largest) %>%
                           filter(year < 2024) 

ggplot(largest_recepients_time) +
  geom_line(aes(x = year, y = value/1000, group = country_name), size = 1, color = "royalblue") +
  facet_wrap(.~ reorder(country_name, - value), ncol = 5) +
  theme_minimal() +
  theme(legend.position = "none") +
  labs(x = "Year", y = "Millions of €s")

I allocated a facet to each country because 10 different lines in a single chart would make it look somewhat crowded. The rise of Bangladesh is quite impressive; so is the fall of Romania. To visualise how much each series has changed relative to itself over the period 2016-2023, we can compute average year-on-year growth rates and plot them on a slope graph.

Code

largest_recepients_time_change <- largest_recepients_time %>%
                                  arrange(country_name, year) %>%
                                  group_by(country_name) %>%
                                  mutate(yoy_change = (value - lag(value, 1)) / lag(value, 1)) %>%
                                  summarise(yoy_average = mean(yoy_change, na.rm = T)) %>%
                                  mutate(begin_value = 0) %>%
                                  mutate(end_value = begin_value + yoy_average*100) %>%
                                  select(1, 3, 4) %>%
                                  ungroup() %>%
                                  pivot_longer(names_to = "time", values_to = "points", 2:3) %>%
                                  mutate(time = case_when(time == "begin_value" ~ "2016",
                                                                   TRUE ~ "2023"))

largest_recepients_time_change$country_name <- fct_reorder(largest_recepients_time_change$country_name, 
                                                           - largest_recepients_time_change$points, .fun = mean)

# custom_palette <- c("#7400B8", "#6930C3", "#5E60CE", "#5390D9", "#4EA8DE", "#48BFE3", "#56CFE1", "#64DFDF", "#72EFDD", "#80ffdb")

ggplot(largest_recepients_time_change, 
       aes(x = time, y = points, group = country_name, color = country_name)) +
  geom_line(size = 1) +
  geom_point(size = 3) +
  # scale_color_manual("Countries", values = custom_palette) +
  scale_color_viridis("Countries", discrete = T) +
  labs(x = "Time", y = "Average Annual Growth, %") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1),
       legend.position = "right")

The steeper the slope, the faster a country’s remittance flow has grown. Remittances to Georgia displayed the fastest growth rate; remittances to Bangladesh have increased at a more moderate pace, despite being the largest in absolute terms.
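The slope chart plots the arithmetic mean of year-on-year changes. A compound annual growth rate is a common alternative, and the two can differ noticeably for volatile series. The sketch below compares them on a made-up series; the helper names `avg_yoy` and `cagr` are mine and do not appear in the code above.

```r
# Arithmetic mean of year-on-year growth rates (what the slope chart uses)
avg_yoy <- function(x) {
  changes <- diff(x) / head(x, -1) # (x_t - x_{t-1}) / x_{t-1}
  mean(changes, na.rm = TRUE)
}

# Compound annual growth rate between the first and last observation
cagr <- function(x) {
  n <- length(x) - 1 # number of yearly steps
  (x[length(x)] / x[1])^(1 / n) - 1
}

series <- c(100, 200, 100) # invented values: doubles, then halves

avg_yoy(series) # (1.0 + (-0.5)) / 2 = 0.25
cagr(series)    # (100 / 100)^(1/2) - 1 = 0
```

The series ends where it started, yet the average of yearly changes is +25%, so rankings based on the mean year-on-year change can reward volatility as well as sustained growth.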

Where do remittances originate in Italy?

As per official figures, 83.4% of all migrants reside in central and northern Italy. Do remittances display a similar spatial distribution?

Code

c <- read_excel("rimesse.xlsx", sheet = 4, skip = 1)

c <- c %>%
     pivot_longer(names_to = "year", values_to = "value", 6:24) %>%
     mutate(year = as.integer(str_replace(year, "year_", ""))) %>%
     group_by(province_name, region_name) %>%
     summarise(value = sum(value, na.rm = T)) %>%
     ungroup() 

c$province_name <- gsub("REGGIO EMILIA", "REGGIO NELL'EMILIA", c$province_name)
c$province_name <- gsub("REGGIO CALABRIA", "REGGIO DI CALABRIA", c$province_name)
c$province_name <- gsub("VERBANO CUSIO OSSOLA", "VERBANO-CUSIO-OSSOLA", c$province_name)
c$province_name <- gsub("MONZA-BRIANZA", "MONZA E DELLA BRIANZA", c$province_name)

# unique(c$province_name)

shp_reg <- st_read(dsn = 'ProvCM01012024_g/Reg01012024_g_WGS84.shp', quiet = T) 

shp_italy <- st_read(dsn = 'ProvCM01012024_g/ProvCM01012024_g_WGS84.shp', quiet = T) %>%
             st_transform(crs = crs) %>%
             mutate(DEN_UTS = toupper(DEN_UTS)) %>%
             mutate(value = c$value[match(DEN_UTS, c$province_name, nomatch = NA)]) %>%
             mutate(value = value/1000) %>%
             mutate(
               value_bin = cut(
                 value,
                 breaks = quantile(na.rm = T, value, probs = seq(0, 1, by = 0.2)), 
                 include.lowest = TRUE,
                 right = FALSE
               )
             )

# glimpse(shp_italy)

ggplot() +
  geom_sf(data = shp_italy, size = .1, color = "black", aes(fill = value_bin)) +
  geom_sf(data = shp_reg, color = "black", size = 1, fill = NA) +
  theme_map() +
  scale_fill_viridis("Remittances sent, Millions of €s", direction = - 1, end = .95, discrete = T, guide = guide_legend(nrow = 1))

That’s a pretty Dalmatian! As a rule of thumb, southern Italy is lighter, but remittances are not nearly as concentrated as one would have expected. This might be due to the fact that migrants tend to live in urban areas. For instance, the provinces of Rome, Bari, Naples, and Catania all belong to the highest quintile.
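The map colours come from quintile bins built with `quantile()` and `cut()`. Here is a minimal sketch of that step on a toy vector (ten invented values rather than the provincial totals), using the same arguments as the code above.

```r
# Toy values standing in for the provincial totals
value <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

# Quintile boundaries: 0%, 20%, ..., 100%
breaks <- quantile(value, probs = seq(0, 1, by = 0.2), na.rm = TRUE)

# Left-closed bins; include.lowest = TRUE closes the last bin on the right
# so that the maximum is not dropped
value_bin <- cut(value, breaks = breaks, include.lowest = TRUE, right = FALSE)

table(value_bin)
```

By construction each bin holds roughly a fifth of the observations, so the colours convey rank rather than magnitude; this may partly explain why the map looks less concentrated than the underlying totals. Note also that `cut()` stops with an error if two quantiles coincide, which can happen when the data contain many ties.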

Citation

BibTeX citation:

@online{tortorici2024,
  author = {Tortorici, Gaspare},
  title = {Remittances from {Italy}},
  date = {2024-08-08},
  url = {http://www.gasparetortorici.info/posts/08_08_2024_post/remittances_website.html},
  langid = {en}
}

For attribution, please cite this work as:

Tortorici, Gaspare. 2024. “Remittances from Italy.” August 8, 2024. http://www.gasparetortorici.info/posts/08_08_2024_post/remittances_website.html.
