crosshyou

主にクロス表(分割表)分析をしようかなと思いはじめましたが、あまりクロス表の分析はできず。R言語の練習ブログになっています。

OECD Social spending data analysis 1 - Load CSV file data using R, read_csv() function.

UnsplashAlexander Schimmeckが撮影した写真 

In this post I will analyze OECD Social spending data using R.

OECD (2022), Social spending (indicator). doi: 10.1787/7497563b-en (Accessed on 26 November 2022)

This indicator is measured as a percentage of GDP or USD per capita.

From the web site, I downloaded CSV file as below.

Before loading the CSV file, I load some packages such as tidyverse and so on.

I loaded tidyvese, moderndive, infer and gridExtra packages. tidyverse is de-facto standard package in R for data science. I load moderndive package and infer package because I am learning Statistical Inference via Data Science (moderndive.com)

I loaded gridExtra package to display multiple plots at one screen.

Now, let's load the CSV file data with read_csv() function.

Then, using glimpse() function, let's check if data is correctly loaded.

Nice! The CSV file data is loaded succesfully into R.

There are 7 variables in the dataframe, let's see each variables one by one.

Location:

LOCATION is ISO country code, I see USA(United States of America) has the most observations.

 

INDICATOR:

INDICATOR has no variation, has just one value, SOCEXP. So, I will remove INDICATOR.

 

SUBJECT:

According to OECD web site below, PUB means Public, PRIV means Privae, PUBNET means Public net and TOTNET means Total net. We see PUB has the most observations.

 

MEASURE:

PC_GDP means a percentage of GDP, USD_CAP means ISD per capita.

 

FREQUENCY:

There is no variation in FREQUENCY, then I remove it.

 

TIME:

TIME means year, I see 2017 has the most observations, 190.

 

Value:

There is no NA in Value.

That's it. Thank you!

Next post is

www.crosshyou.info