www.crosshyou.info

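I practice R using data from e-Stat (the portal site for Japanese government statistics), the OECD, UCI and other sources. From time to time I also post reading notes.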

OECD Social spending data analysis 2 - Using the filter(), select(), inner_join() and rename() functions in R to build a dataframe for analysis.

Photo by Milos Prelevic on Unsplash

www.crosshyou.info

This post follows on from the post linked above. In the previous post, I loaded the OECD Social spending data into R. I also prepared a CSV file with country ISO codes and continent names.

I got this data from List of Countries by Continent - StatisticsTimes.com.

Let's use the read_csv() function to load the CSV file.
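Here is a minimal sketch of this step, not my exact code. The column names `ISO`, `name` and `Continent` and the three sample rows are assumptions for illustration; the real file from StatisticsTimes.com may use different headers.

```r
library(readr)

# Hypothetical stand-in for the real CSV file. With a file on disk you
# would pass its path to read_csv() instead of this string.
csv_text <- "ISO,name,Continent
AUS,Australia,Oceania
JPN,Japan,Asia
FRA,France,Europe"

# I() tells read_csv() to treat the string as literal data, not a path.
continent <- read_csv(I(csv_text), show_col_types = FALSE)
continent
```

Wrapping the text in I() requires readr 2.0 or later.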

Using the glimpse() function, I display the 'continent' dataframe.

Then I use the inner_join() function to merge 'df' and 'continent'.
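A small sketch of how such a merge works, using toy data. The key column name `ISO`, the result name `merged` and all values are assumptions for illustration; only `LOCATION`, `name` and `Continent` are names that appear in this post.

```r
library(dplyr)

# Toy stand-ins for the real dataframes (values are made up).
df <- tibble(
  LOCATION = c("AUS", "JPN", "XXX"),
  TIME     = c(2015, 2015, 2015),
  Value    = c(1.0, 2.0, 3.0)
)
continent <- tibble(
  ISO       = c("AUS", "JPN"),
  name      = c("Australia", "Japan"),
  Continent = c("Oceania", "Asia")
)

# inner_join() keeps only rows whose key appears in both dataframes,
# so the unmatched "XXX" row is dropped.
merged <- inner_join(df, continent, by = c("LOCATION" = "ISO"))
merged
```

Because inner_join() silently drops unmatched rows, it is worth comparing row counts before and after the join.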

Let's check whether the merge worked.

Good! I successfully merged the 'df' and 'continent' dataframes.

To build a dataframe for analysis, I check how many observations each SUBJECT and MEASURE combination has, using the table() function.
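As a rough illustration of this kind of count, here is a sketch on made-up rows. Only the column names `SUBJECT` and `MEASURE` and the codes come from this post; the data and the name `counts` are assumptions.

```r
# Toy stand-in for the merged data; the rows are made up.
df <- data.frame(
  SUBJECT = c("PRIV", "PUB", "PUB", "PUB", "PRIV"),
  MEASURE = c("PC_GDP", "PC_GDP", "USD_CAP", "PC_GDP", "PC_GDP")
)

# Cross-tabulate: rows are SUBJECT values, columns are MEASURE values.
counts <- table(df$SUBJECT, df$MEASURE)
counts
```

Cells with a count of zero show combinations that do not exist in the data.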

Based on the above table, I decided to use the PRIV & PC_GDP, PUB & PC_GDP and PUB & USD_CAP observations.

I make the PRIV & PC_GDP dataframe using the filter(), select() and rename() functions.
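The pattern can be sketched like this on toy data. The values and the new column name `priv_gdp` are assumptions for illustration, not necessarily the names I actually used.

```r
library(dplyr)

# Toy stand-in for the merged 'df' (values are made up).
df <- tibble(
  LOCATION  = c("AUS", "AUS", "JPN"),
  SUBJECT   = c("PRIV", "PUB", "PRIV"),
  MEASURE   = c("PC_GDP", "PC_GDP", "PC_GDP"),
  TIME      = c(2015, 2015, 2015),
  Value     = c(1.5, 18.0, 3.0),
  name      = c("Australia", "Australia", "Japan"),
  Continent = c("Oceania", "Oceania", "Asia")
)

priv_pc_gdp <- df %>%
  # Keep only one SUBJECT / MEASURE combination.
  filter(SUBJECT == "PRIV", MEASURE == "PC_GDP") %>%
  # Drop the columns that are now constant.
  select(LOCATION, TIME, Value, name, Continent) %>%
  # rename() takes new_name = old_name.
  rename(priv_gdp = Value)
priv_pc_gdp
```

Giving the value column a distinct name in each dataframe avoids name clashes when the dataframes are joined later.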

Let's see what 'priv_pc_gdp' looks like with the glimpse() function.

I see that AUS stands for Australia, which is in Oceania.

Next, I make the PUB & PC_GDP dataframe.

Let's see what the 'pub_pc_gdp' dataframe looks like with the glimpse() function.

And I make the PUB & USD_CAP dataframe.

Let's look at 'pub_usd_cap' with the glimpse() function.

Now I have three dataframes: 'priv_pc_gdp', 'pub_pc_gdp' and 'pub_usd_cap'.
I will merge these dataframes into one using the inner_join() function.
LOCATION, TIME, name and Continent are the variables common to all three, so they serve as the join keys.
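A minimal sketch of chaining two joins on these four keys. The value column names (`priv_gdp`, `pub_gdp`, `pub_usd`), the helper `keys` and all numbers are assumptions for illustration.

```r
library(dplyr)

# The four columns shared by all three dataframes.
keys <- c("LOCATION", "TIME", "name", "Continent")

# Toy stand-ins (values are made up).
priv_pc_gdp <- tibble(
  LOCATION = c("AUS", "JPN"), TIME = c(2015, 2015),
  name = c("Australia", "Japan"), Continent = c("Oceania", "Asia"),
  priv_gdp = c(1.5, 3.0)
)
pub_pc_gdp <- tibble(
  LOCATION = "AUS", TIME = 2015,
  name = "Australia", Continent = "Oceania",
  pub_gdp = 18.0
)
pub_usd_cap <- tibble(
  LOCATION = "AUS", TIME = 2015,
  name = "Australia", Continent = "Oceania",
  pub_usd = 8000
)

# Join two at a time; only rows present in all three survive.
df2 <- priv_pc_gdp %>%
  inner_join(pub_pc_gdp, by = keys) %>%
  inner_join(pub_usd_cap, by = keys)
df2
```

If `by` is omitted, inner_join() joins on all shared column names and prints a message saying which ones it used; naming the keys explicitly makes the intent clearer.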

Let's see what 'df2' looks like with the glimpse() function.

Well done! 
I modify the 'df2' dataframe with the rename() and select() functions.
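Roughly, this kind of tidy-up looks like the sketch below. The new names (`iso`, `year`, `continent`) and the value columns are assumptions for illustration; the point is only how rename() and select() combine.

```r
library(dplyr)

# Toy stand-in for the merged 'df2' (values are made up).
df2 <- tibble(
  LOCATION = "AUS", TIME = 2015, priv_gdp = 1.5,
  name = "Australia", Continent = "Oceania",
  pub_gdp = 18.0, pub_usd = 8000
)

df2 <- df2 %>%
  # rename() takes new_name = old_name.
  rename(iso = LOCATION, year = TIME, continent = Continent) %>%
  # select() also fixes the column order.
  select(iso, name, continent, year, priv_gdp, pub_gdp, pub_usd)
df2
```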

Let's see what the 'df2' dataframe looks like with the glimpse() function.

I convert continent to a factor.
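A short sketch of the conversion with base R, on a made-up column:

```r
# Toy stand-in: a character column of continent names (made-up rows).
df2 <- data.frame(
  continent = c("Oceania", "Asia", "Europe", "Asia"),
  stringsAsFactors = FALSE
)

# factor() turns the strings into a categorical variable;
# by default the levels are the sorted unique values.
df2$continent <- factor(df2$continent)
levels(df2$continent)
```

With a factor, summary() reports a count per continent instead of just saying the column is of class character.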

Lastly, let's look at summary statistics for the 'df2' dataframe with the summary() function.

I see there are no NA observations in the 'df2' dataframe. There are five continents: Asia, Europe, North America, Oceania and South America. So there are no African countries in this dataframe.
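The same two checks can also be made explicitly rather than read off the summary() output. This is a sketch on toy data; the column `pub_gdp`, the helper names `na_counts` and `continent_counts`, and the values are assumptions.

```r
# Toy stand-in for 'df2' (values are made up).
df2 <- data.frame(
  continent = factor(c("Asia", "Europe", "Europe")),
  pub_gdp   = c(20.1, 25.3, 22.0)
)

# summary() shows level counts for factors and quantiles for numerics.
summary(df2)

# Count missing values per column.
na_counts <- colSums(is.na(df2))
na_counts

# Count observations per continent.
continent_counts <- table(df2$continent)
continent_counts
```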

That's it. Thank you!

The next post is:

www.crosshyou.info

To read from the first post:

www.crosshyou.info