OECD Household savings data analysis 2 - Data visualization, histogram, bar chart and box plot with R ggplot2


www.crosshyou.info

This post follows on from the post linked above. In the previous post, I made a tidy data frame.

In this post, I will visualize the data to get a feel for what it looks like.

We have three variables: 'year', 'savings', and 'locations'.

Let's start with 'year'.
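The original code is not shown in this text, so here is only a minimal sketch of how such a histogram could be drawn with ggplot2. The data frame name `df`, its column names, and all values are made up for illustration; they are not the OECD data.

```r
library(ggplot2)

# Toy stand-in for the tidy data frame (hypothetical name and invented values).
set.seed(1)
df <- data.frame(
  year     = sample(1970:2022, 200, replace = TRUE),
  savings  = rnorm(200, mean = 5, sd = 6),
  location = sample(c("AAA", "BBB", "CCC"), 200, replace = TRUE)
)

# One bin per year, so each bar's height is the number of observations in that year.
p_year <- ggplot(df, aes(x = year)) +
  geom_histogram(binwidth = 1, color = "white") +
  labs(x = "year", y = "number of observations")
p_year
```

With `binwidth = 1` the bars can be read directly as counts per year, which makes it easy to see which years have the most observations.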

'year' runs from 1970 to 2022. The years with the most observations are around 2015.

Next, let's look at 'savings'.

Almost all savings values are between -20 and 20.
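A histogram of a continuous variable might look like the sketch below. Again, `df` and its values are invented placeholders, and the bin width of 2 is just an assumption that suits this toy data.

```r
library(ggplot2)

# Toy stand-in for the tidy data frame (hypothetical name and invented values).
set.seed(2)
df <- data.frame(savings = rnorm(300, mean = 5, sd = 7))

# Histogram of savings; binwidth is a guess and should be tuned to the real data.
p_savings <- ggplot(df, aes(x = savings)) +
  geom_histogram(binwidth = 2, color = "white") +
  labs(x = "savings", y = "number of observations")
p_savings
```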

How about location?
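For a categorical variable, a bar chart of counts is the usual choice. The sketch below uses `geom_bar()`, which counts rows per category by itself. The data frame `df`, the column name `location`, and the placeholder codes are assumptions for illustration only.

```r
library(ggplot2)

# Toy stand-in: placeholder location codes with invented numbers of rows.
df <- data.frame(
  location = rep(c("AAA", "BBB", "CCC", "DDD"), times = c(12, 10, 5, 3))
)

# geom_bar() counts the rows for each location; coord_flip() puts the codes
# on the vertical axis so many locations stay readable.
p_location <- ggplot(df, aes(x = location)) +
  geom_bar() +
  coord_flip() +
  labs(x = "location", y = "number of observations")
p_location
```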

USA and AUS have the most observations, more than 50 each, while RUS and CRI have fewer than 10.

Then, let's look at savings by year.

Savings do not seem to have changed much over the years.
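One way to compare the distribution of savings across years is a box plot with one box per year. This is only a sketch on invented data; `df` and its columns are hypothetical names.

```r
library(ggplot2)

# Toy stand-in: 5 years x 4 placeholder locations, with random savings values.
set.seed(3)
df <- expand.grid(year = 2018:2022, location = c("AAA", "BBB", "CCC", "DDD"))
df$savings <- rnorm(nrow(df), mean = 5, sd = 6)

# factor(year) makes ggplot2 draw a separate box for each year
# instead of treating year as one continuous axis.
p_by_year <- ggplot(df, aes(x = factor(year), y = savings)) +
  geom_boxplot() +
  labs(x = "year", y = "savings")
p_by_year
```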

How about savings by location?
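The same idea works per location. In the sketch below the locations are ordered by their median savings with `reorder()`, which is my own assumption about a convenient layout, not necessarily what the original plot did. The data are invented.

```r
library(ggplot2)

# Toy stand-in: three placeholder locations with different invented savings levels.
set.seed(4)
df <- data.frame(
  location = rep(c("AAA", "BBB", "CCC"), each = 20),
  savings  = c(rnorm(20, 8, 3), rnorm(20, -4, 3), rnorm(20, 2, 3))
)

# reorder() sorts the locations by median savings, so locations with a
# negative median end up together at one end of the axis.
p_by_location <- ggplot(df, aes(x = reorder(location, savings, FUN = median), y = savings)) +
  geom_boxplot() +
  coord_flip() +
  labs(x = "location", y = "savings")
p_by_location
```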

Almost all locations have a positive median savings value, but LVA and GRC have negative medians.

For further analysis, I would like to make a subset of the data frame: only years from 2000 onward, and only locations that have 23 observations, which means they have a value for every year from 2000 to 2022.

Let's make it.
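The subsetting can be sketched with dplyr as below: filter the years, count the rows per location, and keep only locations with a complete set of 23 years. The names `df`, `df_sub`, and the toy values are assumptions, not the original code or data.

```r
library(dplyr)

# Toy stand-in: AAA and CCC cover 2000-2022 completely, BBB starts in 2005.
df <- rbind(
  data.frame(location = "AAA", year = 1995:2022),
  data.frame(location = "BBB", year = 2005:2022),
  data.frame(location = "CCC", year = 2000:2022)
)
df$savings <- seq_len(nrow(df)) / 10  # placeholder numbers only

df_sub <- df %>%
  filter(year >= 2000) %>%   # keep 2000 onward
  add_count(location) %>%    # adds column n = rows per location
  filter(n == 23)            # keep locations with all 23 years

unique(df_sub$location)
```

In this toy example only the locations with a full 2000 to 2022 series remain, and the helper column `n` is still attached to the result.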

Let's see the summary statistics of the subset data frame.

Now, the 'n' column is not needed anymore, so I remove it.

All right. I will do further analysis with the subset in the next post.
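Dropping a helper column can be done with `select()` and a minus sign. This is a sketch on a small made-up data frame; `df_sub` is a hypothetical name.

```r
library(dplyr)

# Toy stand-in for the subset, including the helper column n (invented values).
df_sub <- data.frame(
  location = "AAA",
  year     = 2000:2022,
  savings  = seq(1, 12, length.out = 23),
  n        = 23L
)

# Remove the helper column n; all other columns are kept unchanged.
df_sub <- df_sub %>% select(-n)
names(df_sub)
```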
That's it. Thank you!

The next post is

www.crosshyou.info

To read from the first post, go to

www.crosshyou.info
