A blog about doing things with R and reading books

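I practice R using data from the Portal Site of Official Statistics of Japan (e-Stat), the OECD, UCI, and other sources. From time to time I also post reading notes.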

OECD Purchasing power parities (PPP) data analysis 1 - reading a CSV file with the read_csv() function in R and making a data frame to analyze.

f:id:cross_hyou:20211219074420j:plain

Photo by Andrew Svk on Unsplash 

In this post, I will analyze OECD Purchasing power parities (PPP).

f:id:cross_hyou:20211219075432p:plain

I downloaded the CSV file below from the OECD website.

f:id:cross_hyou:20211219075725p:plain

I will analyze the data with R. First, I load the tidyverse package.

f:id:cross_hyou:20211219080014p:plain

Let's load the CSV file with the read_csv() function.

f:id:cross_hyou:20211219080214p:plain
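As a minimal sketch of this loading step: the file path and the three rows below are hypothetical stand-ins (not real OECD values) that only mimic the column layout described in this post. I load readr directly here because it is the tidyverse package that provides read_csv().

```r
library(readr)  # part of the tidyverse; provides read_csv()

# Hypothetical stand-in for the downloaded OECD file: a few made-up rows
# that only mimic the column names listed in this post.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "LOCATION,INDICATOR,SUBJECT,MEASURE,FREQUENCY,TIME,Value",
  "AAA,PPP,TOT,NATUSD,A,2019,1.5",
  "AAA,PPP,TOT,NATUSD,A,2020,1.6",
  "BBB,PPP,TOT,NATUSD,A,2020,100.2"
), csv_path)

# read_csv() returns a tibble and guesses each column's type from the data.
df_raw <- read_csv(csv_path, show_col_types = FALSE)
print(df_raw)
```

With the real file, only the path passed to read_csv() would change.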

We have 7 variables: LOCATION, INDICATOR, SUBJECT, MEASURE, FREQUENCY, TIME and Value.
Let's check each variable.

LOCATION

f:id:cross_hyou:20211219080529p:plain
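One way to get the number of observations per LOCATION is dplyr's count(). This is only a sketch on made-up toy data (the codes "AAA" and "BBB" are hypothetical), not necessarily the exact call used above.

```r
library(dplyr)

# Made-up toy data (not OECD values): "AAA" has 3 rows, "BBB" has 2.
toy <- tibble(
  LOCATION = c("AAA", "AAA", "AAA", "BBB", "BBB"),
  TIME     = c(2018, 2019, 2020, 2019, 2020)
)

# count() tallies rows per LOCATION; sort = TRUE lists the largest groups first.
loc_counts <- count(toy, LOCATION, sort = TRUE)
print(loc_counts)
```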

The locations from USA, TUR, SWE through BEL, AUT, AUS each have 61 observations.

So I will keep only those locations.

f:id:cross_hyou:20211219081421p:plain
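A possible way to keep only the locations with a complete series is to group by LOCATION and filter on the group size. The sketch below uses made-up toy data where a complete series has 3 rows; in this post the corresponding number is 61. The name full_n is my own, introduced just for illustration.

```r
library(dplyr)

# Made-up toy data: "AAA" and "CCC" have 3 rows each, "BBB" has only 2.
toy <- tibble(
  LOCATION = c("AAA", "AAA", "AAA", "BBB", "BBB", "CCC", "CCC", "CCC"),
  TIME     = c(2018, 2019, 2020, 2019, 2020, 2018, 2019, 2020)
)

full_n <- 3  # hypothetical; this would be 61 for the data in this post

# Within each LOCATION group, n() is the group size, so the filter
# keeps or drops whole groups at once.
toy_full <- toy %>%
  group_by(LOCATION) %>%
  filter(n() == full_n) %>%
  ungroup()

print(toy_full)
```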

INDICATOR

f:id:cross_hyou:20211219081625p:plain

For INDICATOR, there is only PPP. So I delete INDICATOR from df.

f:id:cross_hyou:20211219081753p:plain
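As a small sketch of this check-then-drop pattern, assuming made-up toy data: n_distinct() confirms that the column holds a single value, and select() with a minus sign removes it. The same pattern applies to the other single-valued columns.

```r
library(dplyr)

# Made-up toy data: INDICATOR is "PPP" in every row.
toy <- tibble(
  LOCATION  = c("AAA", "BBB"),
  INDICATOR = c("PPP", "PPP"),
  Value     = c(1.5, 100.2)
)

# A column with one distinct value carries no information for the analysis.
n_indicator <- n_distinct(toy$INDICATOR)
print(n_indicator)

# select() with a minus sign drops the named column and keeps the rest.
toy_small <- select(toy, -INDICATOR)
print(names(toy_small))
```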

SUBJECT

f:id:cross_hyou:20211219081905p:plain

For SUBJECT, there is only TOT, so I can delete SUBJECT from df too.

f:id:cross_hyou:20211219082025p:plain

MEASURE

f:id:cross_hyou:20211219082154p:plain

For MEASURE, there is only NATUSD, so I will delete MEASURE too.

f:id:cross_hyou:20211219082310p:plain

FREQUENCY

f:id:cross_hyou:20211219082455p:plain

For FREQUENCY, there is only A, so I will delete it too.

f:id:cross_hyou:20211219082618p:plain
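Checking and deleting the columns one by one is what I did here. As an alternative sketch (not the method used in this post), every single-valued column can be dropped in one step with select() and where(). The toy data below is made up.

```r
library(dplyr)

# Made-up toy data: INDICATOR, SUBJECT, MEASURE and FREQUENCY are constant.
toy <- tibble(
  LOCATION  = c("AAA", "BBB"),
  INDICATOR = c("PPP", "PPP"),
  SUBJECT   = c("TOT", "TOT"),
  MEASURE   = c("NATUSD", "NATUSD"),
  FREQUENCY = c("A", "A"),
  TIME      = c(2019, 2020),
  Value     = c(1.5, 100.2)
)

# Keep only the columns that have more than one distinct value.
toy_varying <- select(toy, where(~ n_distinct(.x) > 1))
print(names(toy_varying))
```

The one-by-one approach is slower, but it makes me look at each variable, which is the point of this post.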

TIME

f:id:cross_hyou:20211219082905p:plain

TIME is a numeric variable, so I use the summary() function. The most recent TIME is 2020.

Value

f:id:cross_hyou:20211219083042p:plain

Value is the purchasing power parity (PPP). The median is 1.007 while the mean is 68.948, so the distribution must be strongly right-skewed: a few very large values pull the mean far above the median.
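To illustrate why a mean far above the median points to skew, here is a tiny sketch with made-up numbers (not the OECD values): most values sit near 1, and a single large value drags the mean upward while the median barely notices it.

```r
# Made-up values: four close to 1 and one very large value.
x <- c(0.7, 0.9, 1.0, 1.1, 900)

# The median depends only on the middle value, so the large value has no effect.
med_x <- median(x)

# The mean uses every value, so the single large value pulls it upward.
mean_x <- mean(x)

print(c(median = med_x, mean = mean_x))

# summary() reports both statistics together with the quartiles and range.
print(summary(x))
```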

All right, let's look at the data frame "df" with the summary() function.

f:id:cross_hyou:20211219083332p:plain

I will rename the variables to iso, year and ppp, and change the data type of iso to factor.

f:id:cross_hyou:20211219083705p:plain
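A minimal sketch of this last step, again on made-up toy data: rename() takes new_name = old_name pairs, and mutate() with factor() converts iso to a factor.

```r
library(dplyr)

# Made-up toy data with the three columns that remain at this point.
toy <- tibble(
  LOCATION = c("AAA", "BBB"),
  TIME     = c(2019, 2020),
  Value    = c(1.5, 100.2)
)

toy_clean <- toy %>%
  rename(iso = LOCATION, year = TIME, ppp = Value) %>%  # new_name = old_name
  mutate(iso = factor(iso))                             # character to factor

# For a factor column, summary() shows counts per level instead of
# the generic character summary.
print(summary(toy_clean))
```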

All right, now I have a good data frame to analyze.
That's it. Thank you!

Next post is

 

www.crosshyou.info