crosshyou

I originally set out mainly to analyze cross tables (contingency tables), but I have not done much of that. This has become a blog for practicing the R language.

OECD Net ODA data analysis 1 - using read_csv() function to read CSV file data into R.

f:id:cross_hyou:20210626103706j:plain

Photo by Wengang Zhai on Unsplash

In this blog post, I will analyze OECD Net ODA data.

f:id:cross_hyou:20210626103757p:plain

I got the data from the OECD website.

f:id:cross_hyou:20210626103829p:plain

This is what the CSV file looks like.
I use R for data analysis.

First, I load the tidyverse package.

f:id:cross_hyou:20210626104103p:plain

Next, I use the read_csv() function to read the CSV file.

f:id:cross_hyou:20210626104738p:plain

We see that LOCATION, INDICATOR, SUBJECT, MEASURE, and FREQUENCY are character columns.
TIME and Value are numeric.
'Flag Codes' is logical.
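The two steps above can be sketched as follows. This is only a minimal illustration, not the exact code from the screenshots: the real file name is not shown in the text, so the sketch writes a tiny stand-in CSV to a temporary file. The column names come from the post, but every value (including the codes "S1", "M1", and so on) is a made-up placeholder.

```r
library(tidyverse)  # attaches readr, dplyr and the other core packages

# Hypothetical stand-in for the downloaded OECD file; all values are made up.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  '"LOCATION","INDICATOR","SUBJECT","MEASURE","FREQUENCY","TIME","Value","Flag Codes"',
  '"AUS","ODA","S1","M1","A",2018,100.5,',
  '"AUS","ODA","S2","M2","A",2018,0.2,',
  '"JPN","ODA","S1","M1","A",2018,200.0,'
), csv_path)

df <- read_csv(csv_path)  # column types are guessed from the data
glimpse(df)               # one line per column, with its type
```

A column with no values at all, like 'Flag Codes' here, is guessed as logical, which matches what the post reports for the real file.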

Let's use the table() function to see which values each character variable takes and how often each one occurs.

f:id:cross_hyou:20210626105058p:plain

For LOCATION, there are many different locations, and the largest number of observations for a single location is 128.

f:id:cross_hyou:20210626105610p:plain
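As a rough sketch of how table() can be used for this kind of check, here is a toy data frame with made-up values (not the OECD data). The name loc_counts is a helper introduced only for this example.

```r
# Toy data frame with some of the post's character columns; values are made up.
df <- data.frame(
  LOCATION  = c("AUS", "AUS", "AUS", "JPN", "JPN"),
  INDICATOR = "ODA",
  SUBJECT   = c("S1", "S2", "S1", "S1", "S2"),
  stringsAsFactors = FALSE
)

loc_counts <- table(df$LOCATION)  # number of rows per location
loc_counts
max(loc_counts)                   # largest count for a single location
length(loc_counts)                # number of distinct locations

table(df$INDICATOR)               # a single entry means the column is constant
```

A table with only one entry tells us the column carries no information, which is the basis for dropping such columns.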

We see that INDICATOR and FREQUENCY each have just one value, and 'Flag Codes' has no values at all.
So we can delete those variables.
We also see that SUBJECT and MEASURE each have two values.
So we can convert them to factors.

f:id:cross_hyou:20210626110020p:plain

f:id:cross_hyou:20210626110201p:plain
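One possible way to do both steps with dplyr is sketched below. The screenshots may use a different approach, and the data here are a made-up toy tibble that only mimics the column layout described in the post.

```r
library(dplyr)

# Toy tibble with the post's column names; all values are placeholders.
df <- tibble(
  LOCATION     = c("AUS", "AUS", "JPN", "JPN"),
  INDICATOR    = "ODA",
  SUBJECT      = c("S1", "S2", "S1", "S2"),
  MEASURE      = c("M1", "M2", "M1", "M2"),
  FREQUENCY    = "A",
  TIME         = 2018,
  Value        = c(1, 2, 3, 4),
  `Flag Codes` = NA
)

df <- df %>%
  select(-INDICATOR, -FREQUENCY, -`Flag Codes`) %>%  # drop uninformative columns
  mutate(SUBJECT = factor(SUBJECT),                  # two-valued columns become factors
         MEASURE = factor(MEASURE))

str(df)
```

Column names that contain a space, such as 'Flag Codes', need backticks when referred to inside select().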

I delete the "spec" attribute, attr("spec"), to make df simpler.

f:id:cross_hyou:20210626112224p:plain
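A minimal sketch of removing that attribute, assuming df was created by read_csv(). The small CSV written here is a made-up stand-in, not the OECD file.

```r
library(readr)

# Hypothetical three-column file, just to get a data frame from read_csv().
csv_path <- tempfile(fileext = ".csv")
writeLines(c("LOCATION,TIME,Value", "AUS,2018,1.5", "JPN,2018,2.5"), csv_path)
df <- read_csv(csv_path)

names(attributes(df))     # lists the attributes attached by read_csv()
attr(df, "spec") <- NULL  # drop the stored column specification
names(attributes(df))     # "spec" no longer appears
str(df)                   # the data themselves are unchanged
```

Assigning NULL to an attribute removes it, so str(df) no longer prints the column specification.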

I convert the variable names to lower case with the tolower() function.

f:id:cross_hyou:20210626112832p:plain
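The renaming step can be sketched like this, using a one-row toy data frame with placeholder values:

```r
# Toy data frame with the remaining column names from the post.
df <- data.frame(LOCATION = "AUS", SUBJECT = "S1", MEASURE = "M1",
                 TIME = 2018, Value = 1.5)

names(df) <- tolower(names(df))  # replace every column name with its lower-case form
names(df)
# "location" "subject" "measure" "time" "value"
```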

All right, that's it for this post.

Thank you!

The next post is:

www.crosshyou.info