MATH167R: Tidying and manipulating data 
 
Overview of today 
Tidying datasets 
Exploratory analysis 
 
 
One dataset, four ways 
library (tidyverse) 
 table1 
# A tibble: 6 × 4
  country      year  cases population
  <chr>       <dbl>  <dbl>      <dbl>
1 Afghanistan  1999    745   19987071
2 Afghanistan  2000   2666   20595360
3 Brazil       1999  37737  172006362
4 Brazil       2000  80488  174504898
5 China        1999 212258 1272915272
6 China        2000 213766 1280428583 
 
 
# A tibble: 12 × 4
   country      year type            count
   <chr>       <dbl> <chr>           <dbl>
 1 Afghanistan  1999 cases             745
 2 Afghanistan  1999 population   19987071
 3 Afghanistan  2000 cases            2666
 4 Afghanistan  2000 population   20595360
 5 Brazil       1999 cases           37737
 6 Brazil       1999 population  172006362
 7 Brazil       2000 cases           80488
 8 Brazil       2000 population  174504898
 9 China        1999 cases          212258
10 China        1999 population 1272915272
11 China        2000 cases          213766
12 China        2000 population 1280428583 
 
 
 
library (tidyverse) 
 table3 
# A tibble: 6 × 3
  country      year rate             
  <chr>       <dbl> <chr>            
1 Afghanistan  1999 745/19987071     
2 Afghanistan  2000 2666/20595360    
3 Brazil       1999 37737/172006362  
4 Brazil       2000 80488/174504898  
5 China        1999 212258/1272915272
6 China        2000 213766/1280428583 
 
 
# A tibble: 3 × 3
  country     `1999` `2000`
  <chr>        <dbl>  <dbl>
1 Afghanistan    745   2666
2 Brazil       37737  80488
3 China       212258 213766 
 
 
# A tibble: 3 × 3
  country         `1999`     `2000`
  <chr>            <dbl>      <dbl>
1 Afghanistan   19987071   20595360
2 Brazil       172006362  174504898
3 China       1272915272 1280428583 
 
 
 
 
 
Which way? 
Suppose I wanted to create the following plot:
Which format should I use? Try to recreate this plot.
What if we wanted to compute the number of cases per capita , for each country and year?
 
Munging data 
What if we want to convert from table2 or table4a and table4b to table1?
Today, we’ll discuss how to do this in both base R  and the tidyverse .
 
pivot_longer() and pivot_wider() 
The pivot_longer() and pivot_wider() functions are useful for reshaping data: pivot_longer() “lengthens” datasets and pivot_wider() does the opposite.
They are challenging to use because you need to have a clear vision of the data frame you want, not because they are complicated in R.
 
Example: pivot_longer() 
 
Agnostic 
27 
34 
60 
81 
76 
137 
122 
109 
84 
96 
 
Atheist 
12 
27 
37 
52 
35 
70 
73 
59 
74 
76 
 
Buddhist 
27 
21 
30 
34 
33 
58 
62 
39 
53 
54 
 
Catholic 
418 
617 
732 
670 
638 
1116 
949 
792 
633 
1489 
 
Don’t know/refused 
15 
14 
15 
11 
10 
35 
21 
17 
18 
116 
 
Evangelical Prot 
575 
869 
1064 
982 
881 
1486 
949 
723 
414 
1529 
 
Hindu 
1 
9 
7 
9 
11 
34 
47 
48 
54 
37 
 
Historically Black Prot 
228 
244 
236 
238 
197 
223 
131 
81 
78 
339 
 
Jehovah's Witness 
20 
27 
24 
24 
21 
30 
15 
11 
6 
37 
 
Jewish 
19 
19 
25 
25 
30 
95 
69 
87 
151 
162 
 
Mainline Prot 
289 
495 
619 
655 
651 
1107 
939 
753 
634 
1328 
 
Mormon 
29 
40 
48 
51 
56 
112 
85 
49 
42 
69 
 
Muslim 
6 
7 
9 
10 
9 
23 
16 
8 
6 
22 
 
Orthodox 
13 
17 
23 
32 
32 
47 
38 
42 
46 
73 
 
Other Christian 
9 
7 
11 
13 
13 
14 
18 
14 
12 
18 
 
Other Faiths 
20 
33 
40 
46 
49 
63 
46 
40 
41 
71 
 
Other World Religions 
5 
2 
3 
4 
2 
7 
3 
4 
4 
8 
 
Unaffiliated 
217 
299 
374 
365 
341 
528 
407 
321 
258 
597 
 
 
 
 
 
 
Example: pivot_longer() 
 
Example: pivot_longer() 
 relig_income |>  
   pivot_longer (cols =  ! religion, names_to =  "income" , values_to =  "count" ) 
 
 
Agnostic 
<$10k 
27 
 
Agnostic 
$10-20k 
34 
 
Agnostic 
$20-30k 
60 
 
Agnostic 
$30-40k 
81 
 
Agnostic 
$40-50k 
76 
 
Agnostic 
$50-75k 
137 
 
Agnostic 
$75-100k 
122 
 
Agnostic 
$100-150k 
109 
 
Agnostic 
>150k 
84 
 
Agnostic 
Don't know/refused 
96 
 
Atheist 
<$10k 
12 
 
Atheist 
$10-20k 
27 
 
Atheist 
$20-30k 
37 
 
Atheist 
$30-40k 
52 
 
Atheist 
$40-50k 
35 
 
Atheist 
$50-75k 
70 
 
Atheist 
$75-100k 
73 
 
Atheist 
$100-150k 
59 
 
Atheist 
>150k 
74 
 
Atheist 
Don't know/refused 
76 
 
Buddhist 
<$10k 
27 
 
Buddhist 
$10-20k 
21 
 
Buddhist 
$20-30k 
30 
 
Buddhist 
$30-40k 
34 
 
Buddhist 
$40-50k 
33 
 
Buddhist 
$50-75k 
58 
 
Buddhist 
$75-100k 
62 
 
Buddhist 
$100-150k 
39 
 
Buddhist 
>150k 
53 
 
Buddhist 
Don't know/refused 
54 
 
Catholic 
<$10k 
418 
 
Catholic 
$10-20k 
617 
 
Catholic 
$20-30k 
732 
 
Catholic 
$30-40k 
670 
 
Catholic 
$40-50k 
638 
 
Catholic 
$50-75k 
1116 
 
Catholic 
$75-100k 
949 
 
Catholic 
$100-150k 
792 
 
Catholic 
>150k 
633 
 
Catholic 
Don't know/refused 
1489 
 
Don’t know/refused 
<$10k 
15 
 
Don’t know/refused 
$10-20k 
14 
 
Don’t know/refused 
$20-30k 
15 
 
Don’t know/refused 
$30-40k 
11 
 
Don’t know/refused 
$40-50k 
10 
 
Don’t know/refused 
$50-75k 
35 
 
Don’t know/refused 
$75-100k 
21 
 
Don’t know/refused 
$100-150k 
17 
 
Don’t know/refused 
>150k 
18 
 
Don’t know/refused 
Don't know/refused 
116 
 
Evangelical Prot 
<$10k 
575 
 
Evangelical Prot 
$10-20k 
869 
 
Evangelical Prot 
$20-30k 
1064 
 
Evangelical Prot 
$30-40k 
982 
 
Evangelical Prot 
$40-50k 
881 
 
Evangelical Prot 
$50-75k 
1486 
 
Evangelical Prot 
$75-100k 
949 
 
Evangelical Prot 
$100-150k 
723 
 
Evangelical Prot 
>150k 
414 
 
Evangelical Prot 
Don't know/refused 
1529 
 
Hindu 
<$10k 
1 
 
Hindu 
$10-20k 
9 
 
Hindu 
$20-30k 
7 
 
Hindu 
$30-40k 
9 
 
Hindu 
$40-50k 
11 
 
Hindu 
$50-75k 
34 
 
Hindu 
$75-100k 
47 
 
Hindu 
$100-150k 
48 
 
Hindu 
>150k 
54 
 
Hindu 
Don't know/refused 
37 
 
Historically Black Prot 
<$10k 
228 
 
Historically Black Prot 
$10-20k 
244 
 
Historically Black Prot 
$20-30k 
236 
 
Historically Black Prot 
$30-40k 
238 
 
Historically Black Prot 
$40-50k 
197 
 
Historically Black Prot 
$50-75k 
223 
 
Historically Black Prot 
$75-100k 
131 
 
Historically Black Prot 
$100-150k 
81 
 
Historically Black Prot 
>150k 
78 
 
Historically Black Prot 
Don't know/refused 
339 
 
Jehovah's Witness 
<$10k 
20 
 
Jehovah's Witness 
$10-20k 
27 
 
Jehovah's Witness 
$20-30k 
24 
 
Jehovah's Witness 
$30-40k 
24 
 
Jehovah's Witness 
$40-50k 
21 
 
Jehovah's Witness 
$50-75k 
30 
 
Jehovah's Witness 
$75-100k 
15 
 
Jehovah's Witness 
$100-150k 
11 
 
Jehovah's Witness 
>150k 
6 
 
Jehovah's Witness 
Don't know/refused 
37 
 
Jewish 
<$10k 
19 
 
Jewish 
$10-20k 
19 
 
Jewish 
$20-30k 
25 
 
Jewish 
$30-40k 
25 
 
Jewish 
$40-50k 
30 
 
Jewish 
$50-75k 
95 
 
Jewish 
$75-100k 
69 
 
Jewish 
$100-150k 
87 
 
Jewish 
>150k 
151 
 
Jewish 
Don't know/refused 
162 
 
Mainline Prot 
<$10k 
289 
 
Mainline Prot 
$10-20k 
495 
 
Mainline Prot 
$20-30k 
619 
 
Mainline Prot 
$30-40k 
655 
 
Mainline Prot 
$40-50k 
651 
 
Mainline Prot 
$50-75k 
1107 
 
Mainline Prot 
$75-100k 
939 
 
Mainline Prot 
$100-150k 
753 
 
Mainline Prot 
>150k 
634 
 
Mainline Prot 
Don't know/refused 
1328 
 
Mormon 
<$10k 
29 
 
Mormon 
$10-20k 
40 
 
Mormon 
$20-30k 
48 
 
Mormon 
$30-40k 
51 
 
Mormon 
$40-50k 
56 
 
Mormon 
$50-75k 
112 
 
Mormon 
$75-100k 
85 
 
Mormon 
$100-150k 
49 
 
Mormon 
>150k 
42 
 
Mormon 
Don't know/refused 
69 
 
Muslim 
<$10k 
6 
 
Muslim 
$10-20k 
7 
 
Muslim 
$20-30k 
9 
 
Muslim 
$30-40k 
10 
 
Muslim 
$40-50k 
9 
 
Muslim 
$50-75k 
23 
 
Muslim 
$75-100k 
16 
 
Muslim 
$100-150k 
8 
 
Muslim 
>150k 
6 
 
Muslim 
Don't know/refused 
22 
 
Orthodox 
<$10k 
13 
 
Orthodox 
$10-20k 
17 
 
Orthodox 
$20-30k 
23 
 
Orthodox 
$30-40k 
32 
 
Orthodox 
$40-50k 
32 
 
Orthodox 
$50-75k 
47 
 
Orthodox 
$75-100k 
38 
 
Orthodox 
$100-150k 
42 
 
Orthodox 
>150k 
46 
 
Orthodox 
Don't know/refused 
73 
 
Other Christian 
<$10k 
9 
 
Other Christian 
$10-20k 
7 
 
Other Christian 
$20-30k 
11 
 
Other Christian 
$30-40k 
13 
 
Other Christian 
$40-50k 
13 
 
Other Christian 
$50-75k 
14 
 
Other Christian 
$75-100k 
18 
 
Other Christian 
$100-150k 
14 
 
Other Christian 
>150k 
12 
 
Other Christian 
Don't know/refused 
18 
 
Other Faiths 
<$10k 
20 
 
Other Faiths 
$10-20k 
33 
 
Other Faiths 
$20-30k 
40 
 
Other Faiths 
$30-40k 
46 
 
Other Faiths 
$40-50k 
49 
 
Other Faiths 
$50-75k 
63 
 
Other Faiths 
$75-100k 
46 
 
Other Faiths 
$100-150k 
40 
 
Other Faiths 
>150k 
41 
 
Other Faiths 
Don't know/refused 
71 
 
Other World Religions 
<$10k 
5 
 
Other World Religions 
$10-20k 
2 
 
Other World Religions 
$20-30k 
3 
 
Other World Religions 
$30-40k 
4 
 
Other World Religions 
$40-50k 
2 
 
Other World Religions 
$50-75k 
7 
 
Other World Religions 
$75-100k 
3 
 
Other World Religions 
$100-150k 
4 
 
Other World Religions 
>150k 
4 
 
Other World Religions 
Don't know/refused 
8 
 
Unaffiliated 
<$10k 
217 
 
Unaffiliated 
$10-20k 
299 
 
Unaffiliated 
$20-30k 
374 
 
Unaffiliated 
$30-40k 
365 
 
Unaffiliated 
$40-50k 
341 
 
Unaffiliated 
$50-75k 
528 
 
Unaffiliated 
$75-100k 
407 
 
Unaffiliated 
$100-150k 
321 
 
Unaffiliated 
>150k 
258 
 
Unaffiliated 
Don't know/refused 
597 
 
 
 
 
 
 
Example: pivot_longer() 
 
2 Pac 
Baby Don't Cry (Keep... 
2000-02-26 
87 
82 
72 
77 
87 
94 
99 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
2Ge+her 
The Hardest Part Of ... 
2000-09-02 
91 
87 
92 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
3 Doors Down 
Kryptonite 
2000-04-08 
81 
70 
68 
67 
66 
57 
54 
53 
51 
51 
51 
51 
47 
44 
38 
28 
22 
18 
18 
14 
12 
7 
6 
6 
6 
5 
5 
4 
4 
4 
4 
3 
3 
3 
4 
5 
5 
9 
9 
15 
14 
13 
14 
16 
17 
21 
22 
24 
28 
33 
42 
42 
49 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
3 Doors Down 
Loser 
2000-10-21 
76 
76 
72 
69 
67 
65 
55 
59 
62 
61 
61 
59 
61 
66 
72 
76 
75 
67 
73 
70 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
504 Boyz 
Wobble Wobble 
2000-04-15 
57 
34 
25 
17 
17 
31 
36 
49 
53 
57 
64 
70 
75 
76 
78 
85 
92 
96 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
98^0 
Give Me Just One Nig... 
2000-08-19 
51 
39 
34 
26 
26 
19 
2 
2 
3 
6 
7 
22 
29 
36 
47 
67 
66 
84 
93 
94 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
 
 
 
 
 
Example: pivot_longer() 
 billboard |>  
   pivot_longer ( 
     cols =  starts_with ("wk" ), 
     names_to =  "week" , 
     names_prefix =  "wk" , 
     values_to =  "rank" , 
     values_drop_na =  TRUE  
   ) 
 
 
2 Pac 
Baby Don't Cry (Keep... 
2000-02-26 
1 
87 
 
2 Pac 
Baby Don't Cry (Keep... 
2000-02-26 
2 
82 
 
2 Pac 
Baby Don't Cry (Keep... 
2000-02-26 
3 
72 
 
2 Pac 
Baby Don't Cry (Keep... 
2000-02-26 
4 
77 
 
2 Pac 
Baby Don't Cry (Keep... 
2000-02-26 
5 
87 
 
2 Pac 
Baby Don't Cry (Keep... 
2000-02-26 
6 
94 
 
2 Pac 
Baby Don't Cry (Keep... 
2000-02-26 
7 
99 
 
2Ge+her 
The Hardest Part Of ... 
2000-09-02 
1 
91 
 
2Ge+her 
The Hardest Part Of ... 
2000-09-02 
2 
87 
 
2Ge+her 
The Hardest Part Of ... 
2000-09-02 
3 
92 
 
3 Doors Down 
Kryptonite 
2000-04-08 
1 
81 
 
3 Doors Down 
Kryptonite 
2000-04-08 
2 
70 
 
3 Doors Down 
Kryptonite 
2000-04-08 
3 
68 
 
3 Doors Down 
Kryptonite 
2000-04-08 
4 
67 
 
3 Doors Down 
Kryptonite 
2000-04-08 
5 
66 
 
3 Doors Down 
Kryptonite 
2000-04-08 
6 
57 
 
3 Doors Down 
Kryptonite 
2000-04-08 
7 
54 
 
3 Doors Down 
Kryptonite 
2000-04-08 
8 
53 
 
3 Doors Down 
Kryptonite 
2000-04-08 
9 
51 
 
3 Doors Down 
Kryptonite 
2000-04-08 
10 
51 
 
 
 
 
 
 
Example: pivot_wider() 
 
4842 
Release 
1 
 
4842 
I80_1 
1 
 
4842 
Lisbon 
1 
 
4842 
Rstr 
1 
 
4842 
Base_TD 
1 
 
4842 
BCE 
1 
 
4842 
BCW 
1 
 
4842 
BCE2 
1 
 
4842 
BCW2 
1 
 
4842 
MAE 
1 
 
4842 
MAW 
1 
 
4843 
Release 
1 
 
4843 
I80_1 
1 
 
4843 
Lisbon 
1 
 
4843 
Rstr 
1 
 
4843 
Base_TD 
1 
 
4843 
BCE 
1 
 
4843 
BCW 
1 
 
4843 
BCE2 
1 
 
4843 
BCW2 
1 
 
4843 
MAE 
1 
 
4843 
MAW 
1 
 
4844 
Release 
1 
 
4844 
I80_1 
1 
 
4844 
Lisbon 
1 
 
4844 
Rstr 
1 
 
4844 
Base_TD 
1 
 
4844 
BCE 
1 
 
4844 
BCW 
1 
 
4844 
BCE2 
1 
 
4844 
BCW2 
1 
 
4844 
MAE 
1 
 
4844 
MAW 
1 
 
4845 
Release 
1 
 
4845 
I80_1 
1 
 
4845 
Lisbon 
1 
 
4845 
Rstr 
1 
 
4845 
Base_TD 
1 
 
4847 
Release 
1 
 
4847 
I80_1 
1 
 
4847 
Lisbon 
1 
 
4848 
Release 
1 
 
4848 
I80_1 
1 
 
4848 
Lisbon 
1 
 
4848 
Rstr 
1 
 
4849 
Release 
1 
 
4849 
I80_1 
1 
 
4850 
Release 
1 
 
4850 
I80_1 
1 
 
4850 
Rstr 
1 
 
4850 
Base_TD 
1 
 
4850 
BCE 
1 
 
4850 
BCW 
1 
 
4851 
Release 
1 
 
4851 
I80_1 
1 
 
4854 
Release 
1 
 
4854 
I80_1 
1 
 
4855 
Release 
1 
 
4855 
I80_1 
1 
 
4855 
Lisbon 
1 
 
4855 
Rstr 
1 
 
4855 
Base_TD 
1 
 
4857 
Release 
1 
 
4857 
I80_1 
1 
 
4857 
Lisbon 
1 
 
4857 
Rstr 
1 
 
4857 
Base_TD 
1 
 
4857 
BCE 
1 
 
4857 
BCW 
1 
 
4857 
BCE2 
1 
 
4857 
BCW2 
1 
 
4858 
Release 
1 
 
4858 
I80_1 
1 
 
4858 
Lisbon 
1 
 
4858 
Rstr 
1 
 
4858 
Base_TD 
1 
 
4858 
BCE 
1 
 
4858 
BCW 
1 
 
4858 
BCE2 
1 
 
4858 
BCW2 
1 
 
4858 
MAE 
1 
 
4858 
MAW 
1 
 
4859 
Release 
1 
 
4859 
I80_1 
1 
 
4859 
Lisbon 
1 
 
4859 
Rstr 
1 
 
4859 
Base_TD 
1 
 
4861 
Release 
1 
 
4861 
I80_1 
1 
 
4861 
Lisbon 
1 
 
4861 
Rstr 
1 
 
4861 
Base_TD 
1 
 
4861 
BCE 
1 
 
4861 
BCW 
1 
 
4861 
BCE2 
1 
 
4861 
BCW2 
1 
 
4861 
MAE 
1 
 
4861 
MAW 
1 
 
4862 
Release 
1 
 
4862 
I80_1 
1 
 
4862 
Lisbon 
1 
 
4862 
Rstr 
1 
 
4862 
Base_TD 
1 
 
4862 
BCE 
1 
 
4862 
BCW 
1 
 
4862 
BCE2 
1 
 
4862 
BCW2 
1 
 
4863 
Release 
1 
 
4863 
I80_1 
1 
 
4864 
Release 
1 
 
4864 
I80_1 
1 
 
4865 
Release 
1 
 
4865 
I80_1 
1 
 
4865 
Lisbon 
1 
 
 
 
 
 
 
Example: pivot_wider() 
 fish_encounters |>  
   pivot_wider (names_from =  station, values_from =  seen) 
 
 
4842 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
 
4843 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
 
4844 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
 
4845 
1 
1 
1 
1 
1 
NA 
NA 
NA 
NA 
NA 
NA 
 
4847 
1 
1 
1 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
4848 
1 
1 
1 
1 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
4849 
1 
1 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
4850 
1 
1 
NA 
1 
1 
1 
1 
NA 
NA 
NA 
NA 
 
4851 
1 
1 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
4854 
1 
1 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
4855 
1 
1 
1 
1 
1 
NA 
NA 
NA 
NA 
NA 
NA 
 
4857 
1 
1 
1 
1 
1 
1 
1 
1 
1 
NA 
NA 
 
4858 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
 
4859 
1 
1 
1 
1 
1 
NA 
NA 
NA 
NA 
NA 
NA 
 
4861 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
1 
 
4862 
1 
1 
1 
1 
1 
1 
1 
1 
1 
NA 
NA 
 
4863 
1 
1 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
4864 
1 
1 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
4865 
1 
1 
1 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
 
 
 
 
 
 
Example: pivot_wider() 
 
01 
Alabama 
income 
24476 
136 
 
01 
Alabama 
rent 
747 
3 
 
02 
Alaska 
income 
32940 
508 
 
02 
Alaska 
rent 
1200 
13 
 
04 
Arizona 
income 
27517 
148 
 
04 
Arizona 
rent 
972 
4 
 
05 
Arkansas 
income 
23789 
165 
 
05 
Arkansas 
rent 
709 
5 
 
06 
California 
income 
29454 
109 
 
06 
California 
rent 
1358 
3 
 
08 
Colorado 
income 
32401 
109 
 
08 
Colorado 
rent 
1125 
5 
 
09 
Connecticut 
income 
35326 
195 
 
09 
Connecticut 
rent 
1123 
5 
 
10 
Delaware 
income 
31560 
247 
 
10 
Delaware 
rent 
1076 
10 
 
11 
District of Columbia 
income 
43198 
681 
 
11 
District of Columbia 
rent 
1424 
17 
 
12 
Florida 
income 
25952 
70 
 
12 
Florida 
rent 
1077 
3 
 
13 
Georgia 
income 
27024 
106 
 
13 
Georgia 
rent 
927 
3 
 
15 
Hawaii 
income 
32453 
218 
 
15 
Hawaii 
rent 
1507 
18 
 
16 
Idaho 
income 
25298 
208 
 
16 
Idaho 
rent 
792 
7 
 
17 
Illinois 
income 
30684 
83 
 
17 
Illinois 
rent 
952 
3 
 
18 
Indiana 
income 
27247 
117 
 
18 
Indiana 
rent 
782 
3 
 
19 
Iowa 
income 
30002 
143 
 
19 
Iowa 
rent 
740 
4 
 
20 
Kansas 
income 
29126 
208 
 
20 
Kansas 
rent 
801 
5 
 
21 
Kentucky 
income 
24702 
159 
 
21 
Kentucky 
rent 
713 
4 
 
22 
Louisiana 
income 
25086 
155 
 
22 
Louisiana 
rent 
825 
4 
 
23 
Maine 
income 
26841 
187 
 
23 
Maine 
rent 
808 
7 
 
24 
Maryland 
income 
37147 
152 
 
24 
Maryland 
rent 
1311 
5 
 
25 
Massachusetts 
income 
34498 
199 
 
25 
Massachusetts 
rent 
1173 
5 
 
26 
Michigan 
income 
26987 
82 
 
26 
Michigan 
rent 
824 
3 
 
27 
Minnesota 
income 
32734 
189 
 
27 
Minnesota 
rent 
906 
4 
 
28 
Mississippi 
income 
22766 
194 
 
28 
Mississippi 
rent 
740 
5 
 
29 
Missouri 
income 
26999 
113 
 
29 
Missouri 
rent 
784 
4 
 
30 
Montana 
income 
26249 
206 
 
30 
Montana 
rent 
751 
9 
 
31 
Nebraska 
income 
30020 
146 
 
31 
Nebraska 
rent 
773 
4 
 
32 
Nevada 
income 
29019 
213 
 
32 
Nevada 
rent 
1017 
6 
 
33 
New Hampshire 
income 
33172 
387 
 
33 
New Hampshire 
rent 
1052 
9 
 
34 
New Jersey 
income 
35075 
148 
 
34 
New Jersey 
rent 
1249 
4 
 
35 
New Mexico 
income 
24457 
214 
 
35 
New Mexico 
rent 
809 
6 
 
36 
New York 
income 
31057 
69 
 
36 
New York 
rent 
1194 
3 
 
37 
North Carolina 
income 
26482 
111 
 
37 
North Carolina 
rent 
844 
3 
 
38 
North Dakota 
income 
32336 
245 
 
38 
North Dakota 
rent 
775 
9 
 
39 
Ohio 
income 
27435 
94 
 
39 
Ohio 
rent 
764 
2 
 
40 
Oklahoma 
income 
26207 
101 
 
40 
Oklahoma 
rent 
766 
3 
 
41 
Oregon 
income 
27389 
146 
 
41 
Oregon 
rent 
988 
4 
 
42 
Pennsylvania 
income 
28923 
119 
 
42 
Pennsylvania 
rent 
885 
3 
 
44 
Rhode Island 
income 
30210 
259 
 
44 
Rhode Island 
rent 
957 
6 
 
45 
South Carolina 
income 
25454 
123 
 
45 
South Carolina 
rent 
836 
4 
 
46 
South Dakota 
income 
28821 
276 
 
46 
South Dakota 
rent 
696 
7 
 
47 
Tennessee 
income 
25453 
102 
 
47 
Tennessee 
rent 
808 
4 
 
48 
Texas 
income 
28063 
110 
 
48 
Texas 
rent 
952 
2 
 
49 
Utah 
income 
27928 
239 
 
49 
Utah 
rent 
948 
6 
 
50 
Vermont 
income 
29351 
361 
 
50 
Vermont 
rent 
945 
11 
 
51 
Virginia 
income 
32545 
202 
 
51 
Virginia 
rent 
1166 
5 
 
53 
Washington 
income 
32318 
113 
 
53 
Washington 
rent 
1120 
4 
 
54 
West Virginia 
income 
23707 
203 
 
54 
West Virginia 
rent 
681 
6 
 
55 
Wisconsin 
income 
29868 
135 
 
55 
Wisconsin 
rent 
813 
3 
 
56 
Wyoming 
income 
30854 
342 
 
56 
Wyoming 
rent 
828 
11 
 
72 
Puerto Rico 
income 
NA 
NA 
 
72 
Puerto Rico 
rent 
464 
6 
 
 
 
 
 
 
Example: pivot_wider() 
 us_rent_income |>  
   pivot_wider ( 
     names_from =  variable, 
     values_from =  c (estimate, moe) 
   ) 
 
 
01 
Alabama 
24476 
747 
136 
3 
 
02 
Alaska 
32940 
1200 
508 
13 
 
04 
Arizona 
27517 
972 
148 
4 
 
05 
Arkansas 
23789 
709 
165 
5 
 
06 
California 
29454 
1358 
109 
3 
 
08 
Colorado 
32401 
1125 
109 
5 
 
09 
Connecticut 
35326 
1123 
195 
5 
 
10 
Delaware 
31560 
1076 
247 
10 
 
11 
District of Columbia 
43198 
1424 
681 
17 
 
12 
Florida 
25952 
1077 
70 
3 
 
13 
Georgia 
27024 
927 
106 
3 
 
15 
Hawaii 
32453 
1507 
218 
18 
 
16 
Idaho 
25298 
792 
208 
7 
 
17 
Illinois 
30684 
952 
83 
3 
 
18 
Indiana 
27247 
782 
117 
3 
 
19 
Iowa 
30002 
740 
143 
4 
 
20 
Kansas 
29126 
801 
208 
5 
 
21 
Kentucky 
24702 
713 
159 
4 
 
22 
Louisiana 
25086 
825 
155 
4 
 
23 
Maine 
26841 
808 
187 
7 
 
24 
Maryland 
37147 
1311 
152 
5 
 
25 
Massachusetts 
34498 
1173 
199 
5 
 
26 
Michigan 
26987 
824 
82 
3 
 
27 
Minnesota 
32734 
906 
189 
4 
 
28 
Mississippi 
22766 
740 
194 
5 
 
29 
Missouri 
26999 
784 
113 
4 
 
30 
Montana 
26249 
751 
206 
9 
 
31 
Nebraska 
30020 
773 
146 
4 
 
32 
Nevada 
29019 
1017 
213 
6 
 
33 
New Hampshire 
33172 
1052 
387 
9 
 
34 
New Jersey 
35075 
1249 
148 
4 
 
35 
New Mexico 
24457 
809 
214 
6 
 
36 
New York 
31057 
1194 
69 
3 
 
37 
North Carolina 
26482 
844 
111 
3 
 
38 
North Dakota 
32336 
775 
245 
9 
 
39 
Ohio 
27435 
764 
94 
2 
 
40 
Oklahoma 
26207 
766 
101 
3 
 
41 
Oregon 
27389 
988 
146 
4 
 
42 
Pennsylvania 
28923 
885 
119 
3 
 
44 
Rhode Island 
30210 
957 
259 
6 
 
45 
South Carolina 
25454 
836 
123 
4 
 
46 
South Dakota 
28821 
696 
276 
7 
 
47 
Tennessee 
25453 
808 
102 
4 
 
48 
Texas 
28063 
952 
110 
2 
 
49 
Utah 
27928 
948 
239 
6 
 
50 
Vermont 
29351 
945 
361 
11 
 
51 
Virginia 
32545 
1166 
202 
5 
 
53 
Washington 
32318 
1120 
113 
4 
 
54 
West Virginia 
23707 
681 
203 
6 
 
55 
Wisconsin 
29868 
813 
135 
3 
 
56 
Wyoming 
30854 
828 
342 
11 
 
72 
Puerto Rico 
NA 
464 
NA 
6 
 
 
 
 
 
 
One dataset, four ways 
library (tidyverse) 
 table1 
# A tibble: 6 × 4
  country      year  cases population
  <chr>       <dbl>  <dbl>      <dbl>
1 Afghanistan  1999    745   19987071
2 Afghanistan  2000   2666   20595360
3 Brazil       1999  37737  172006362
4 Brazil       2000  80488  174504898
5 China        1999 212258 1272915272
6 China        2000 213766 1280428583 
 
 
# A tibble: 12 × 4
   country      year type            count
   <chr>       <dbl> <chr>           <dbl>
 1 Afghanistan  1999 cases             745
 2 Afghanistan  1999 population   19987071
 3 Afghanistan  2000 cases            2666
 4 Afghanistan  2000 population   20595360
 5 Brazil       1999 cases           37737
 6 Brazil       1999 population  172006362
 7 Brazil       2000 cases           80488
 8 Brazil       2000 population  174504898
 9 China        1999 cases          212258
10 China        1999 population 1272915272
11 China        2000 cases          213766
12 China        2000 population 1280428583 
 
 
 
library (tidyverse) 
 table3 
# A tibble: 6 × 3
  country      year rate             
  <chr>       <dbl> <chr>            
1 Afghanistan  1999 745/19987071     
2 Afghanistan  2000 2666/20595360    
3 Brazil       1999 37737/172006362  
4 Brazil       2000 80488/174504898  
5 China        1999 212258/1272915272
6 China        2000 213766/1280428583 
 
 
# A tibble: 3 × 3
  country     `1999` `2000`
  <chr>        <dbl>  <dbl>
1 Afghanistan    745   2666
2 Brazil       37737  80488
3 China       212258 213766 
 
 
# A tibble: 3 × 3
  country         `1999`     `2000`
  <chr>            <dbl>      <dbl>
1 Afghanistan   19987071   20595360
2 Brazil       172006362  174504898
3 China       1272915272 1280428583 
 
 
 
 
 
Munging data 
What if we want to convert from table2 or table4a and table4b to table1?
Group work : Work through how to convert table2 to table1 (and back). If time permits, try converting table4a and table4b table1 (and back).
 
Working with messier data: US population, 2010-2020 
 state_population <-  readr:: read_csv ("https://www2.census.gov/programs-surveys/popest/datasets/2010-2020/national/totals/nst-est2020.csv" ) 
 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
309327143 
311583481 
313877662 
316059947 
318386329 
320738994 
323071755 
325122128 
326838199 
328329953 
329398742 
329484123 
 
020 
1 
0 
00 
Northeast Region 
55317240 
55318414 
55380764 
55608318 
55782661 
55912775 
56021339 
56052790 
56063777 
56083383 
56084543 
56002934 
55924275 
55849869 
 
020 
2 
0 
00 
Midwest Region 
66927001 
66929737 
66975328 
67164092 
67348275 
67576524 
67765576 
67885682 
68018175 
68160342 
68263019 
68340091 
68357895 
68316744 
 
020 
3 
0 
00 
South Region 
114555744 
114563042 
114869421 
116019483 
117264196 
118397213 
119666248 
121049223 
122419547 
123611036 
124649156 
125686544 
126494232 
126662754 
 
020 
4 
0 
00 
West Region 
71945553 
71946912 
72101630 
72791588 
73482530 
74173435 
74933166 
75751299 
76570256 
77267367 
77841481 
78300384 
78622340 
78654756 
 
040 
3 
6 
01 
Alabama 
4779736 
4780118 
4785514 
4799642 
4816632 
4831586 
4843737 
4854803 
4866824 
4877989 
4891628 
4907965 
4920706 
4921532 
 
040 
4 
9 
02 
Alaska 
710231 
710246 
713982 
722349 
730810 
737626 
737075 
738430 
742575 
740983 
736624 
733603 
732074 
731158 
 
040 
4 
8 
04 
Arizona 
6392017 
6392292 
6407342 
6473416 
6556344 
6634690 
6732873 
6832810 
6944767 
7048088 
7164228 
7291843 
7393900 
7421401 
 
040 
3 
7 
05 
Arkansas 
2915918 
2916029 
2921998 
2941038 
2952876 
2960459 
2968759 
2979732 
2991815 
3003855 
3012161 
3020985 
3029672 
3030522 
 
040 
4 
9 
06 
California 
37253956 
37254522 
37319550 
37636311 
37944551 
38253768 
38586706 
38904296 
39149186 
39337785 
39437463 
39437610 
39418894 
39368078 
 
040 
4 
8 
08 
Colorado 
5029196 
5029319 
5047539 
5121900 
5193660 
5270774 
5352637 
5454328 
5543844 
5617421 
5697155 
5758486 
5798266 
5807719 
 
040 
1 
1 
09 
Connecticut 
3574097 
3574151 
3579173 
3588632 
3595211 
3595792 
3595697 
3588561 
3579830 
3575324 
3574561 
3566022 
3561494 
3557006 
 
040 
3 
5 
10 
Delaware 
897934 
897947 
899647 
907590 
915518 
924062 
933131 
942065 
949989 
957942 
966985 
976668 
984899 
986809 
 
040 
3 
5 
11 
District of Columbia 
601723 
601767 
605282 
620290 
635737 
651559 
663603 
677014 
687576 
697079 
704147 
708253 
712185 
712816 
 
040 
3 
5 
12 
Florida 
18801310 
18804589 
18846143 
19055607 
19302016 
19551678 
19853880 
20219111 
20627237 
20977089 
21254926 
21492056 
21688239 
21733312 
 
040 
3 
5 
13 
Georgia 
9687653 
9688737 
9712209 
9803630 
9903580 
9975592 
10071204 
10183353 
10308442 
10417031 
10519389 
10628020 
10695662 
10710017 
 
040 
4 
9 
15 
Hawaii 
1360301 
1360304 
1364004 
1379562 
1395199 
1408822 
1415335 
1422999 
1428885 
1425763 
1423102 
1415615 
1410587 
1407006 
 
040 
4 
8 
16 
Idaho 
1567582 
1567658 
1570819 
1584272 
1595910 
1612053 
1632248 
1652495 
1684036 
1719745 
1752074 
1789060 
1818238 
1826913 
 
040 
2 
3 
17 
Illinois 
12830632 
12831572 
12840545 
12867783 
12883029 
12895778 
12885092 
12859585 
12821709 
12779893 
12724685 
12667017 
12615162 
12587530 
 
040 
2 
3 
18 
Indiana 
6483802 
6484050 
6490555 
6517250 
6538989 
6570575 
6596019 
6611442 
6637898 
6662068 
6698481 
6731010 
6752183 
6754953 
 
040 
2 
4 
19 
Iowa 
3046355 
3046877 
3050819 
3066772 
3076844 
3093935 
3110643 
3122541 
3133210 
3143734 
3149900 
3159596 
3164115 
3163561 
 
040 
2 
4 
20 
Kansas 
2853118 
2853120 
2858266 
2869677 
2886024 
2894306 
2901861 
2910717 
2912977 
2910892 
2912748 
2912635 
2915024 
2913805 
 
040 
3 
6 
21 
Kentucky 
4339367 
4339330 
4348464 
4370817 
4387865 
4406906 
4416992 
4429126 
4440306 
4455590 
4464273 
4472345 
4477899 
4477251 
 
040 
3 
7 
22 
Louisiana 
4533372 
4533500 
4544635 
4576244 
4602067 
4626040 
4645938 
4666998 
4681346 
4673673 
4664450 
4658285 
4650984 
4645318 
 
040 
1 
1 
23 
Maine 
1328361 
1328354 
1327651 
1328473 
1328094 
1328543 
1331217 
1329098 
1332348 
1335743 
1340123 
1345770 
1349647 
1350141 
 
040 
3 
5 
24 
Maryland 
5773552 
5773787 
5788784 
5840241 
5888375 
5925197 
5960064 
5988528 
6007014 
6028186 
6042153 
6054954 
6059529 
6055802 
 
040 
1 
1 
25 
Massachusetts 
6547629 
6547788 
6566440 
6614218 
6664269 
6715158 
6764864 
6797484 
6827280 
6863560 
6885720 
6894883 
6898116 
6893574 
 
040 
2 
3 
26 
Michigan 
9883640 
9884112 
9877597 
9883053 
9898289 
9914802 
9932033 
9934483 
9954117 
9976752 
9987286 
9984795 
9976330 
9966555 
 
040 
2 
4 
27 
Minnesota 
5303925 
5303933 
5310934 
5346620 
5377500 
5414722 
5452665 
5484002 
5525360 
5569283 
5608762 
5640053 
5655855 
5657342 
 
040 
3 
6 
28 
Mississippi 
2967297 
2968129 
2970615 
2979147 
2984599 
2989839 
2991892 
2990231 
2990595 
2990674 
2982879 
2978227 
2971253 
2966786 
 
040 
2 
4 
29 
Missouri 
5988927 
5988941 
5996089 
6011182 
6026027 
6042989 
6059130 
6075411 
6091384 
6111382 
6125986 
6140475 
6151737 
6151548 
 
040 
4 
8 
30 
Montana 
989415 
989400 
990730 
997518 
1004168 
1014158 
1022657 
1031495 
1042137 
1053862 
1061818 
1070123 
1078405 
1080577 
 
040 
2 
4 
31 
Nebraska 
1826341 
1826311 
1829591 
1840914 
1853691 
1865813 
1879955 
1892059 
1906483 
1916998 
1925512 
1932571 
1937258 
1937552 
 
040 
4 
8 
32 
Nevada 
2700551 
2700683 
2702483 
2713114 
2744670 
2776956 
2818935 
2868531 
2919555 
2972097 
3030725 
3090771 
3128500 
3138259 
 
040 
1 
1 
33 
New Hampshire 
1316470 
1316457 
1316807 
1320444 
1324677 
1327272 
1334257 
1337480 
1343694 
1350395 
1355064 
1360783 
1365533 
1366275 
 
040 
1 
2 
34 
New Jersey 
8791894 
8791959 
8799451 
8828552 
8845671 
8857821 
8867277 
8870312 
8873584 
8888147 
8891730 
8891258 
8890883 
8882371 
 
040 
4 
8 
35 
New Mexico 
2059179 
2059199 
2064614 
2080707 
2087715 
2092833 
2090236 
2090071 
2092555 
2092844 
2093754 
2099634 
2106117 
2106319 
 
040 
1 
2 
36 
New York 
19378102 
19378117 
19399956 
19499921 
19574362 
19626488 
19653431 
19657321 
19636391 
19593849 
19544098 
19463131 
19382373 
19336776 
 
040 
3 
5 
37 
North Carolina 
9535483 
9535762 
9574586 
9658913 
9751810 
9846717 
9937295 
10037218 
10161802 
10275758 
10391358 
10501384 
10581885 
10600823 
 
040 
2 
4 
38 
North Dakota 
672591 
672575 
674752 
685526 
702227 
723149 
738736 
755537 
756114 
756755 
760062 
763724 
765224 
765309 
 
040 
2 
3 
39 
Ohio 
11536504 
11536763 
11539449 
11545735 
11550971 
11579692 
11606573 
11622315 
11640060 
11665706 
11680892 
11696507 
11699855 
11693217 
 
040 
3 
7 
40 
Oklahoma 
3751351 
3751582 
3760014 
3788824 
3819320 
3853891 
3879187 
3910518 
3928143 
3933602 
3943488 
3960676 
3977682 
3980783 
 
040 
4 
9 
41 
Oregon 
3831074 
3831083 
3837614 
3872672 
3900102 
3924110 
3965447 
4018542 
4093271 
4147294 
4183538 
4216116 
4237408 
4241507 
 
040 
1 
2 
42 
Pennsylvania 
12702379 
12702891 
12711406 
12747052 
12769123 
12779538 
12792392 
12789838 
12788468 
12794679 
12809107 
12798883 
12794404 
12783254 
 
040 
1 
1 
44 
Rhode Island 
1052567 
1052970 
1053994 
1053829 
1054893 
1055560 
1056511 
1056886 
1057816 
1056554 
1059338 
1058158 
1058004 
1057125 
 
040 
3 
5 
45 
South Carolina 
4625364 
4625358 
4635846 
4672655 
4719027 
4766469 
4826858 
4896006 
4963031 
5027102 
5091702 
5157702 
5205864 
5218040 
 
040 
2 
4 
46 
South Dakota 
814180 
814198 
816193 
823740 
833859 
842751 
849670 
854663 
863693 
873732 
879386 
887127 
891688 
892717 
 
040 
3 
6 
47 
Tennessee 
6346105 
6346281 
6355518 
6400298 
6455752 
6496943 
6544617 
6595354 
6651277 
6714748 
6778180 
6830325 
6875939 
6886834 
 
040 
3 
7 
48 
Texas 
25145561 
25146072 
25241897 
25645504 
26084120 
26479646 
26963092 
27468531 
27914064 
28291024 
28624564 
28986794 
29286467 
29360759 
 
040 
4 
8 
49 
Utah 
2763885 
2763891 
2775413 
2814797 
2854146 
2898773 
2938327 
2983626 
3044241 
3103540 
3155153 
3203383 
3239542 
3249879 
 
040 
1 
1 
50 
Vermont 
625741 
625727 
625886 
627197 
626361 
626603 
625693 
625810 
624366 
625132 
624802 
624046 
623821 
623347 
 
040 
3 
5 
51 
Virginia 
8001024 
8001046 
8024004 
8102437 
8187456 
8255861 
8315430 
8367303 
8417651 
8471011 
8510920 
8556642 
8587217 
8590563 
 
040 
4 
9 
53 
Washington 
6724540 
6724540 
6743009 
6827479 
6898599 
6966252 
7057531 
7167287 
7299961 
7427951 
7526793 
7614024 
7678379 
7693612 
 
040 
3 
5 
54 
West Virginia 
1852994 
1853008 
1854265 
1856606 
1857446 
1854768 
1850569 
1843332 
1832435 
1818683 
1805953 
1795263 
1788150 
1784787 
 
040 
2 
3 
55 
Wisconsin 
5686986 
5687285 
5690538 
5705840 
5720825 
5738012 
5753199 
5762927 
5775170 
5793147 
5809319 
5824581 
5833464 
5832655 
 
040 
4 
8 
56 
Wyoming 
563626 
563775 
564531 
567491 
576656 
582620 
583159 
586389 
585243 
579994 
579054 
580116 
582030 
582328 
 
040 
X 
X 
72 
Puerto Rico 
3725789 
3726157 
3721525 
3678732 
3634488 
3593077 
3534874 
3473232 
3406672 
3325284 
3193344 
3193553 
3167851 
3159343 
 
 
 
 
 
 
National population data 
We can use the SUMLEV variable to obtain only national data:
library (tidyverse) 
 US_pop <-  state_population |>  
   filter (SUMLEV ==  "010" ) 
 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
309327143 
311583481 
313877662 
316059947 
318386329 
320738994 
323071755 
325122128 
326838199 
328329953 
329398742 
329484123 
 
 
 
 
 
National population data 
How can we generate the following plot?
 
National population data 
Note that our data cannot be readily used with ggplot to generate the previous plot. Ideally, we would like two vectors of the same length: one representing Year and one representing Population.
 
Manipulating data for visualization 
library (tidyverse) 
 US_pop <-  state_population |>  
   filter (SUMLEV ==  "010" ) 
 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
309327143 
311583481 
313877662 
316059947 
318386329 
320738994 
323071755 
325122128 
326838199 
328329953 
329398742 
329484123 
 
 
 
 
 
 
Approach 1: Build a data frame manually 
# base R  
 plot_dat <-  data.frame ( 
   Year =  2010 : 2020 , 
   Population =  US_pop[1 , 8 : 19 ] 
 ) 
head (plot_dat) 
 
  Year Population.POPESTIMATE2010 Population.POPESTIMATE2011
1 2010                  309327143                  311583481
2 2011                  309327143                  311583481
3 2012                  309327143                  311583481
4 2013                  309327143                  311583481
5 2014                  309327143                  311583481
6 2015                  309327143                  311583481
  Population.POPESTIMATE2012 Population.POPESTIMATE2013
1                  313877662                  316059947
2                  313877662                  316059947
3                  313877662                  316059947
4                  313877662                  316059947
5                  313877662                  316059947
6                  313877662                  316059947
  Population.POPESTIMATE2014 Population.POPESTIMATE2015
1                  318386329                  320738994
2                  318386329                  320738994
3                  318386329                  320738994
4                  318386329                  320738994
5                  318386329                  320738994
6                  318386329                  320738994
  Population.POPESTIMATE2016 Population.POPESTIMATE2017
1                  323071755                  325122128
2                  323071755                  325122128
3                  323071755                  325122128
4                  323071755                  325122128
5                  323071755                  325122128
6                  323071755                  325122128
  Population.POPESTIMATE2018 Population.POPESTIMATE2019
1                  326838199                  328329953
2                  326838199                  328329953
3                  326838199                  328329953
4                  326838199                  328329953
5                  326838199                  328329953
6                  326838199                  328329953
  Population.POPESTIMATE042020 Population.POPESTIMATE2020
1                    329398742                  329484123
2                    329398742                  329484123
3                    329398742                  329484123
4                    329398742                  329484123
5                    329398742                  329484123
6                    329398742                  329484123 
 
 
 
 
National population data 
What went wrong? US_pop[1, 8:19] is a tibble, and we need a numeric.
 plot_dat <-  data.frame ( 
   Year =  2010 : 2020 , 
   Population =  as.numeric (US_pop[1 , 8 : 19 ]) 
 ) 
 
 
Error in data.frame(Year = 2010:2020, Population = as.numeric(US_pop[1, : arguments imply differing number of rows: 11, 12 
 
 
 
 
National population data 
Now what? US_pop has an extra column POPESTIMATE042020 which we need to remove.
# base R  
 plot_dat <-  data.frame ( 
   Year =  2010 : 2020 , 
   Population =  as.numeric (US_pop[, c (8 : 17 , 19 )]) 
 ) 
head (plot_dat) 
  Year Population
1 2010  309327143
2 2011  311583481
3 2012  313877662
4 2013  316059947
5 2014  318386329
6 2015  320738994 
 
 
 
 
National population data 
Finally, we can use the plot function to get our desired visualization:
plot (plot_dat$ Year, plot_dat$ Population) 
 
 
Approach 2: Tidyverse 
Now let’s see how we might approach the same task in the tidyverse. Our strategy will be to “tidy” the data so we have one row for each observation and one column for each observation.
First, let’s remove the unnecessary column using the select() function.
 US_pop |>   
   select (- POPESTIMATE042020)   
 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
309327143 
311583481 
313877662 
316059947 
318386329 
320738994 
323071755 
325122128 
326838199 
328329953 
329484123 
 
 
 
 
 
 
Tidyverse strategy 
Now let’s use the pivot_longer function to combine the POPESTIMATE columns into two variables: Year and Population:
 US_pop |>   
   select (- POPESTIMATE042020) |>  
   pivot_longer (contains ("POPESTIMATE" ),  
                names_to =  "Year" , 
                values_to =  "Population" )   
 
 
Tidyverse strategy 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
POPESTIMATE2010 
309327143 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
POPESTIMATE2011 
311583481 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
POPESTIMATE2012 
313877662 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
POPESTIMATE2013 
316059947 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
POPESTIMATE2014 
318386329 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
POPESTIMATE2015 
320738994 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
POPESTIMATE2016 
323071755 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
POPESTIMATE2017 
325122128 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
POPESTIMATE2018 
326838199 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
POPESTIMATE2019 
328329953 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
POPESTIMATE2020 
329484123 
 
 
 
 
 
What do we need to do next?
 
Tidyverse strategy 
Finally, we need to use the str_sub() function to shorten our Year variable before turning it into a numeric.
 US_pop |>   
   select (- POPESTIMATE042020) |>  
   pivot_longer (contains ("POPESTIMATE" ),  
                names_to =  "Year" , 
                values_to =  "Population" ) |>  
   mutate (Year =  as.numeric (stringr:: str_sub (Year, start =  12 ))) 
 
 
Tidyverse strategy 
Now, Year is numeric as desired:
 
010 
0 
0 
00 
United States 
308745538 
308758105 
2010 
309327143 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
2011 
311583481 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
2012 
313877662 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
2013 
316059947 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
2014 
318386329 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
2015 
320738994 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
2016 
323071755 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
2017 
325122128 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
2018 
326838199 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
2019 
328329953 
 
010 
0 
0 
00 
United States 
308745538 
308758105 
2020 
329484123 
 
 
 
 
 
 
Tidyverse strategy 
Now we can |> it to ggplot:
 US_pop |>   
   select (- POPESTIMATE042020) |>  
   pivot_longer (contains ("POPESTIMATE" ),  
                names_to =  "Year" , 
                values_to =  "Population" ) |>  
   mutate (Year =  as.numeric (stringr:: str_sub (Year, start =  12 ))) |>  
   ggplot (aes (x =  Year, y =  Population)) +   
   geom_line () 
 
 
Tidyverse strategy 
 
Tidyverse strategy: More complicated plots 
We can also work with the region level data, mapping the color aesthetic to the region:
 state_population |>   
   filter (SUMLEV ==  "020" ) |>  
   select (- POPESTIMATE042020) |>  
   pivot_longer (contains ("POPESTIMATE" ),  
                names_to =  "Year" , 
                values_to =  "Population" ) |>  
   mutate (Year =  as.numeric (stringr:: str_sub (Year, start =  12 ))) |>  
   ggplot (aes (x =  Year, y =  Population, color =  NAME)) +   
   geom_line () 
 
 
Tidyverse strategy: More complicated plots 
 
Example: World Bank Inflation Data 
The World Bank provides historical data on inflation in countries around the world here .
In particular, let’s explore the annual consumer price inflation data for the US.
The consumer price inflation  measures the “annual percentage change in the cost to the average consumer of acquiring a basket of goods.”
 
 
Example: World Bank Inflation Data 
library (openxlsx) 
 cpi <-  read.xlsx ("https://thedocs.worldbank.org/en/doc/1ad246272dbbc437c74323719506aa0c-0350012021/original/Inflation-data.xlsx" , 
                  sheet =  5 ) 
 cpi <-  cpi[1 : 203 ,] 
 
 
ABW 
314 
Aruba 
Inflation 
Headline Consumer Price Inflation 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
3.200 
2.996 
1.875 
2.277 
4.049 
2.890 
3.320 
3.660 
2.530 
3.400 
3.610 
5.390 
8.960 
-2.140 
2.080 
4.320 
0.630 
-2.370 
0.420 
0.480 
-0.890 
-0.474 
3.58 
4.257 
1.222 
0.744 
5.520 
Annual average inflation 
 
AFG 
512 
Afghanistan 
Inflation 
Headline Consumer Price Inflation 
25.510 
25.510 
-12.520 
-10.680 
10.230 
11.850 
0.600 
7.650 
5.000 
9.900 
13.400 
22.200 
18.200 
15.900 
20.400 
8.700 
-2.100 
18.400 
27.500 
71.500 
47.400 
43.800 
58.190 
33.990 
20.010 
14.000 
14.010 
14.010 
14.010 
14.010 
0.001 
-43.400 
51.930 
35.663 
16.358 
10.569 
6.780 
8.680 
26.420 
-6.810 
2.180 
11.800 
6.440 
7.390 
4.670 
-0.660 
4.380 
4.976 
0.63 
2.302 
5.443 
5.062 
NA 
Annual average inflation 
 
AGO 
614 
Angola 
Inflation 
Headline Consumer Price Inflation 
7.970 
5.780 
15.800 
15.670 
27.420 
29.000 
80.700 
69.010 
48.460 
101.300 
46.708 
1.391 
1.833 
1.833 
1.833 
1.833 
1.833 
1.833 
1.833 
1.833 
1.833 
85.265 
299.097 
1379.476 
949.771 
2672.230 
4146.010 
221.492 
107.429 
248.248 
325.029 
152.586 
108.893 
98.219 
43.525 
22.961 
13.305 
12.249 
12.465 
13.721 
14.480 
13.484 
10.285 
8.782 
7.298 
9.159 
32.378 
29.844 
19.63 
17.079 
21.024 
23.846 
23.827 
Annual average inflation 
 
ALB 
914 
Albania 
Inflation 
Headline Consumer Price Inflation 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
-0.180 
35.717 
225.996 
85.005 
22.565 
7.755 
12.799 
33.212 
20.552 
0.402 
0.025 
3.105 
5.185 
2.378 
2.830 
2.357 
2.410 
2.908 
3.405 
2.290 
3.565 
3.415 
2.028 
1.935 
1.622 
1.910 
1.291 
1.987 
2.03 
1.411 
1.621 
2.041 
6.725 
Annual average inflation 
 
ARE 
466 
United Arab Emirates 
Inflation 
Headline Consumer Price Inflation 
21.985 
21.985 
21.985 
21.985 
21.985 
21.985 
21.985 
21.970 
13.043 
10.989 
10.066 
7.946 
7.083 
1.297 
2.433 
3.500 
5.435 
5.498 
4.995 
2.792 
0.604 
3.400 
6.383 
5.273 
5.699 
4.330 
2.976 
2.966 
1.994 
2.100 
1.348 
2.800 
2.918 
3.119 
5.041 
6.195 
9.285 
11.128 
12.272 
1.570 
0.884 
0.883 
0.663 
1.094 
2.344 
4.067 
1.620 
1.967 
3.06 
-1.931 
-2.079 
0.180 
4.828 
Annual average inflation 
 
ARG 
213 
Argentina 
Inflation 
Headline Consumer Price Inflation 
13.587 
34.732 
58.446 
61.249 
23.475 
182.927 
443.966 
176.002 
175.515 
159.507 
100.764 
104.476 
164.777 
343.811 
626.719 
672.181 
90.097 
131.327 
342.955 
3079.810 
2313.960 
171.672 
24.900 
10.611 
4.177 
3.376 
0.156 
0.527 
0.925 
-1.167 
-0.939 
-1.065 
25.869 
13.443 
4.416 
9.642 
10.898 
8.830 
8.585 
6.270 
10.461 
9.775 
10.043 
10.619 
38.081 
26.538 
39.400 
25.675 
34.28 
53.548 
42.015 
48.409 
72.431 
Annual average inflation 
 
ARM 
911 
Armenia 
Inflation 
Headline Consumer Price Inflation 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
3731.800 
5273.449 
176.743 
18.653 
14.046 
8.673 
0.648 
-0.791 
3.146 
1.790 
4.543 
5.928 
0.717 
3.433 
4.553 
9.018 
3.544 
7.274 
7.653 
2.540 
5.788 
2.981 
3.731 
-1.406 
1.031 
2.52 
1.443 
1.211 
7.185 
8.641 
Annual average inflation 
 
ATG 
311 
Antigua and Barbuda 
Inflation 
Headline Consumer Price Inflation 
8.870 
8.630 
12.670 
21.720 
12.970 
8.610 
15.620 
13.750 
6.150 
16.330 
18.999 
11.465 
4.170 
2.346 
3.844 
0.996 
0.500 
3.602 
6.800 
4.362 
6.591 
4.518 
3.000 
3.100 
6.485 
2.737 
2.965 
0.362 
3.262 
1.135 
-0.162 
1.935 
2.408 
1.994 
2.030 
2.099 
1.788 
1.416 
5.334 
-0.550 
3.370 
3.457 
3.377 
1.059 
1.089 
0.969 
-0.489 
2.425 
1.21 
1.475 
1.058 
1.627 
7.531 
Annual average inflation 
 
AUS 
193 
Australia 
Inflation 
Headline Consumer Price Inflation 
3.440 
6.138 
6.024 
9.091 
15.417 
15.162 
13.323 
12.310 
8.005 
9.122 
10.136 
9.488 
11.352 
10.039 
3.960 
6.735 
9.050 
8.533 
7.216 
7.534 
7.333 
3.177 
1.012 
1.754 
1.970 
4.628 
2.615 
0.225 
0.860 
1.483 
4.457 
4.407 
2.982 
2.733 
2.343 
2.692 
3.555 
2.328 
4.350 
1.771 
2.918 
3.304 
1.763 
2.450 
2.488 
1.508 
1.277 
1.972 
1.91 
1.610 
0.847 
2.820 
6.594 
Annual average inflation 
 
AUT 
122 
Austria 
Inflation 
Headline Consumer Price Inflation 
4.370 
4.704 
6.355 
7.531 
9.522 
8.445 
7.319 
5.495 
3.574 
3.707 
6.328 
6.803 
5.436 
3.339 
5.663 
3.190 
1.705 
1.402 
1.916 
2.568 
3.262 
3.337 
4.021 
3.632 
2.953 
2.243 
1.861 
1.306 
0.922 
0.569 
2.345 
2.650 
1.810 
1.356 
2.061 
2.299 
1.442 
2.169 
3.216 
0.506 
1.814 
3.287 
2.486 
2.000 
1.606 
0.897 
0.892 
2.230 
2.00 
1.490 
1.382 
2.767 
8.547 
Annual average inflation 
 
AZE 
912 
Azerbaijan 
Inflation 
Headline Consumer Price Inflation 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
106.600 
917.000 
1129.700 
1664.000 
411.760 
19.795 
3.674 
-0.773 
-8.525 
1.805 
1.863 
2.777 
2.131 
6.707 
9.542 
8.232 
16.566 
20.797 
1.332 
5.685 
7.772 
1.010 
2.432 
1.433 
4.033 
12.441 
12.844 
2.33 
2.712 
2.760 
6.650 
13.852 
Annual average inflation 
 
BDI 
618 
Burundi 
Inflation 
Headline Consumer Price Inflation 
-0.200 
3.851 
3.829 
6.006 
15.722 
15.710 
6.857 
6.835 
23.896 
36.541 
1.200 
12.178 
5.854 
8.158 
14.301 
3.821 
1.668 
7.108 
4.490 
11.667 
6.991 
9.011 
5.328 
9.714 
14.713 
19.361 
26.416 
31.063 
12.471 
3.522 
24.430 
9.300 
-1.370 
10.650 
8.180 
13.250 
2.750 
8.410 
24.410 
10.560 
6.490 
9.590 
18.160 
7.940 
4.410 
5.540 
5.560 
16.643 
-2.81 
-0.709 
7.321 
8.313 
18.801 
Annual average inflation 
 
BEL 
124 
Belgium 
Inflation 
Headline Consumer Price Inflation 
3.910 
4.344 
5.448 
6.955 
12.678 
12.768 
9.069 
7.096 
4.471 
4.469 
6.649 
7.627 
8.727 
7.662 
6.342 
4.869 
1.294 
1.553 
1.161 
3.107 
3.448 
3.215 
2.428 
2.754 
2.378 
1.468 
2.077 
1.628 
0.949 
1.121 
2.545 
2.469 
1.645 
1.589 
2.097 
2.781 
1.791 
1.823 
4.489 
-0.053 
2.189 
3.532 
2.840 
1.113 
0.340 
0.561 
1.974 
2.224 
2.05 
1.249 
0.741 
2.440 
9.598 
Annual average inflation 
 
BEN 
638 
Benin 
Inflation 
Headline Consumer Price Inflation 
6.320 
2.770 
3.900 
6.870 
14.420 
-34.410 
8.260 
9.860 
6.840 
10.270 
9.596 
0.795 
4.059 
-6.067 
10.265 
1.151 
0.414 
-1.339 
3.445 
-0.202 
1.112 
2.100 
5.921 
0.445 
38.535 
14.458 
4.917 
3.803 
5.755 
0.348 
4.210 
3.981 
2.422 
1.507 
0.883 
5.364 
3.787 
1.294 
7.939 
0.398 
2.179 
2.720 
6.740 
0.890 
-1.010 
0.270 
-0.850 
1.769 
0.83 
-0.933 
3.023 
5.917 
1.351 
Annual average inflation 
 
BFA 
748 
Burkina Faso 
Inflation 
Headline Consumer Price Inflation 
1.773 
2.061 
-2.923 
7.604 
8.720 
18.757 
-8.401 
29.986 
8.269 
14.993 
12.259 
7.556 
12.058 
8.346 
4.847 
7.054 
-2.700 
-2.850 
4.191 
-0.322 
-0.807 
2.502 
-1.991 
0.614 
24.720 
7.772 
6.109 
2.929 
4.911 
-1.080 
-0.169 
4.727 
2.300 
2.040 
-0.400 
6.409 
2.336 
-0.233 
10.668 
0.864 
-0.608 
2.760 
3.820 
0.530 
-0.260 
0.950 
-0.240 
0.351 
1.96 
-3.233 
1.884 
3.848 
14.290 
Annual average inflation 
 
BGD 
513 
Bangladesh 
Inflation 
Headline Consumer Price Inflation 
3.540 
-0.590 
45.290 
45.010 
54.650 
16.620 
-2.430 
9.340 
17.370 
13.590 
15.385 
14.545 
12.875 
9.531 
10.414 
10.465 
10.175 
10.828 
9.674 
8.734 
10.522 
8.285 
3.624 
2.979 
6.150 
10.117 
2.455 
4.959 
8.648 
6.179 
2.483 
1.908 
3.719 
5.361 
6.103 
7.040 
6.770 
9.109 
8.900 
5.441 
8.151 
10.264 
6.549 
7.524 
6.983 
6.186 
5.506 
5.611 
5.55 
5.475 
5.691 
5.307 
7.697 
Annual average inflation 
 
BGR 
918 
Bulgaria 
Inflation 
Headline Consumer Price Inflation 
-0.400 
-0.100 
0.000 
0.000 
0.000 
0.000 
0.000 
0.000 
0.000 
0.000 
0.000 
0.002 
2.800 
2.800 
2.800 
2.800 
2.700 
2.700 
2.500 
6.404 
23.900 
333.500 
91.317 
72.824 
96.866 
61.455 
125.528 
1037.480 
17.802 
2.555 
10.305 
7.450 
5.863 
2.389 
6.142 
5.028 
7.255 
8.416 
12.348 
2.747 
2.443 
4.218 
2.954 
0.889 
-1.414 
-0.098 
-0.795 
1.188 
2.81 
2.457 
1.673 
2.787 
15.325 
Annual average inflation 
 
BHR 
419 
Bahrain 
Inflation 
Headline Consumer Price Inflation 
1.629 
5.789 
5.088 
14.332 
24.393 
16.160 
22.496 
17.731 
15.785 
2.206 
3.800 
11.400 
8.700 
3.200 
0.000 
-2.400 
-2.500 
-1.700 
0.200 
1.200 
1.299 
0.899 
-0.302 
2.599 
3.999 
3.139 
-0.185 
4.602 
-0.418 
-1.259 
-0.730 
-1.175 
-0.496 
1.679 
2.248 
2.618 
2.041 
3.252 
3.520 
2.797 
1.961 
-0.373 
2.771 
3.296 
2.653 
1.836 
2.800 
1.387 
2.08 
1.000 
-2.318 
-0.606 
3.626 
Annual average inflation 
 
BHS 
313 
Bahamas, The 
Inflation 
Headline Consumer Price Inflation 
6.152 
4.615 
6.831 
5.482 
13.072 
10.360 
4.256 
3.190 
6.109 
9.091 
12.231 
11.046 
6.101 
4.000 
3.966 
4.624 
5.414 
5.994 
4.051 
5.405 
4.628 
7.304 
5.577 
2.748 
1.367 
2.023 
1.262 
0.561 
1.207 
1.119 
1.732 
2.362 
1.498 
3.656 
1.160 
1.819 
1.958 
2.407 
4.434 
1.666 
1.621 
3.200 
1.970 
0.720 
1.510 
1.870 
-0.350 
1.553 
2.23 
1.335 
0.039 
2.905 
5.605 
Annual average inflation 
 
BIH 
963 
Bosnia and Herzegovina 
Inflation 
Headline Consumer Price Inflation 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
NA 
12.901 
-11.717 
5.732 
-0.279 
2.841 
4.973 
4.573 
0.313 
0.547 
0.282 
3.582 
6.083 
1.547 
7.399 
-0.390 
2.208 
3.658 
2.058 
-0.091 
-0.923 
-1.015 
-1.059 
0.810 
1.27 
0.617 
-1.051 
1.982 
14.021 
Annual average inflation 
 
BLR 
913 
Belarus 
Inflation 
Headline Consumer Price Inflation 
0.000 
-0.400 
-0.200 
0.000 
-0.100 
0.100 
0.100 
0.200 
0.900 
1.200 
1.100 
1.200 
3.100 
1.000 
-0.700 
0.500 
2.300 
1.600 
0.000 
1.800 
4.600 
94.200 
970.700 
1190.316 
2220.897 
685.185 
51.651 
64.075 
73.270 
294.639 
167.577 
60.845 
42.650 
28.492 
18.241 
10.373 
6.958 
8.295 
14.734 
12.835 
7.662 
53.175 
59.499 
18.332 
18.095 
13.476 
11.803 
6.032 
4.87 
5.600 
5.548 
9.460 
15.210 
Annual average inflation 
 
BLZ 
339 
Belize 
Inflation 
Headline Consumer Price Inflation 
21.410 
18.630 
20.730 
66.960 
20.780 
20.770 
-2.900 
12.150 
4.870 
-11.820 
7.051 
11.224 
6.839 
4.996 
3.383 
4.153 
0.794 
2.021 
5.321 
0.000 
2.047 
3.225 
2.397 
1.473 
2.569 
2.890 
6.425 
1.026 
-0.847 
-1.207 
0.614 
1.150 
2.200 
2.620 
3.070 
3.651 
4.240 
2.318 
6.394 
-1.104 
0.918 
1.695 
1.300 
0.507 
1.203 
-0.852 
0.665 
1.148 
0.24 
0.187 
0.120 
3.260 
6.266 
Annual average inflation 
 
BOL 
218 
Bolivia 
Inflation 
Headline Consumer Price Inflation 
3.956 
0.000 
5.556 
34.211 
41.176 
9.722 
12.658 
7.865 
6.250 
20.588 
48.780 
32.240 
126.446 
273.358 
1289.541 
11702.827 
273.555 
14.433 
15.988 
15.248 
17.181 
21.442 
12.043 
8.512 
7.867 
10.193 
12.430 
4.704 
7.670 
2.156 
4.606 
1.602 
0.929 
3.343 
4.438 
5.399 
2.335 
8.705 
14.010 
3.342 
2.496 
9.884 
4.515 
5.736 
5.764 
4.064 
3.627 
2.820 
2.27 
1.839 
0.941 
0.737 
1.746 
Annual average inflation 
 
BRA 
223 
Brazil 
Inflation 
Headline Consumer Price Inflation 
22.260 
20.140 
16.520 
12.650 
27.590 
28.920 
42.010 
43.650 
38.700 
53.920 
90.227 
101.725 
100.543 
135.028 
192.122 
225.990 
147.143 
228.336 
629.113 
1430.725 
2947.733 
432.787 
951.962 
1927.381 
2075.888 
66.007 
15.758 
6.927 
3.195 
4.858 
7.044 
6.840 
8.450 
14.715 
6.597 
6.870 
4.184 
3.641 
5.679 
4.888 
5.039 
6.636 
5.404 
6.204 
6.329 
9.030 
8.739 
3.446 
3.66 
3.733 
3.212 
8.302 
9.280 
Annual average inflation 
 
BRB 
316 
Barbados 
Inflation 
Headline Consumer Price Inflation 
7.284 
7.469 
11.877 
16.855 
38.923 
20.295 
4.987 
8.352 
9.482 
13.170 
18.498 
14.589 
10.287 
5.275 
4.648 
4.858 
0.199 
3.567 
4.719 
6.334 
3.036 
6.300 
5.996 
1.184 
0.700 
2.838 
2.386 
7.711 
-1.269 
1.560 
2.436 
2.801 
0.169 
1.577 
1.428 
6.065 
7.308 
4.034 
8.108 
3.645 
5.824 
9.430 
4.530 
1.810 
1.770 
-1.110 
1.280 
4.410 
3.69 
4.100 
2.882 
3.064 
9.439 
Annual average inflation 
 
 
 
 
 
 
Example: World Bank Inflation Data 
How do we create a line plot with Year on the x-axis and CPI on the y-axis for the US?
 cpi |>  
   pivot_longer (cols =  ` 1970 ` : ` 2022 ` , 
                names_to =  "Year" ,  
                values_to =  "CPI" ) |>  
   mutate (Year =  as.numeric (Year)) |>  
   select (Country, Year, CPI) 
 
 
Aruba 
1970 
NA 
 
Aruba 
1971 
NA 
 
Aruba 
1972 
NA 
 
Aruba 
1973 
NA 
 
Aruba 
1974 
NA 
 
Aruba 
1975 
NA 
 
Aruba 
1976 
NA 
 
Aruba 
1977 
NA 
 
Aruba 
1978 
NA 
 
Aruba 
1979 
NA 
 
Aruba 
1980 
NA 
 
Aruba 
1981 
NA 
 
Aruba 
1982 
NA 
 
Aruba 
1983 
NA 
 
Aruba 
1984 
NA 
 
Aruba 
1985 
NA 
 
Aruba 
1986 
NA 
 
Aruba 
1987 
NA 
 
Aruba 
1988 
NA 
 
Aruba 
1989 
NA 
 
Aruba 
1990 
NA 
 
Aruba 
1991 
NA 
 
Aruba 
1992 
NA 
 
Aruba 
1993 
NA 
 
Aruba 
1994 
NA 
 
Aruba 
1995 
NA 
 
Aruba 
1996 
3.200 
 
Aruba 
1997 
2.996 
 
Aruba 
1998 
1.875 
 
Aruba 
1999 
2.277 
 
 
 
 
 
 
 
Example: World Bank Inflation Data 
 cpi |>  
   pivot_longer (cols =  ` 1970 ` : ` 2022 ` , 
                names_to =  "Year" ,  
                values_to =  "CPI" ) |>  
   mutate (Year =  as.numeric (Year)) |>  
   filter (Country.Code ==  "USA" ) |>  
   ggplot (mapping =  aes (x =  Year, y =  CPI)) +  
   geom_line () 
 
 
Data manipulation cheatsheet 
filter() subset rows 
select() subset columns, use with contains(), starts_with(), ends_with(), … 
mutate() create columns 
group_by(), summarize(), count() group and summarize groups 
rename() rename columns 
pivot_longer(), pivot_wider() reshape data 
inner_join(), left_join(), right_join(), outer_join() combine data (like SQL) 
 
Much, much more! (Click me for a cheat sheet) 
 
Exploratory data analysis 
Exploratory data analysis is the process of learning about a dataset by visualizing, summarizing, and transforming it.
By now we’ve seen most of the skills you need to carry out exploratory data analysis (EDA). As Wickham, Çetinkaya-Rundel, and Grolemund put it, the cycle of EDA looks something like this:
Generate questions about your data.
 
Explore your questions by visualizing, transforming, and modelling your data.
 
Use your findings to refine your questions and generate new questions.
 
 
You can read their walkthrough of the EDA process here .