Generated by Bing Image Creator: photo of real iris flower garden
This post follows the one above. In the previous post, I made a tidy data frame.
In this post, I will visualize the data to get a feel for what it looks like.
We have 'year', 'savings', 'locations'.
Let's start with 'year'.
The years range from 1970 to 2022. The years with the most observations are around 2015.
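As a rough illustration of this check, here is a minimal pandas sketch. The toy data frame and the column names `year`, `location`, and `savings` are assumptions for the example, not the actual data from the previous post.

```python
import pandas as pd

# Hypothetical toy data; the real data frame comes from the previous post.
df = pd.DataFrame({
    "year": [2014, 2015, 2015, 2015, 2016, 2016],
    "location": ["USA", "USA", "AUS", "GRC", "AUS", "GRC"],
    "savings": [5.1, 6.0, 4.2, -3.5, 4.8, -2.9],
})

# First and last year present in the data.
year_range = (df["year"].min(), df["year"].max())

# Number of observations per year, ordered by year (ready for a bar chart).
obs_per_year = df["year"].value_counts().sort_index()

# Year with the most observations.
busiest_year = obs_per_year.idxmax()

print(year_range, busiest_year)
```

Calling `obs_per_year.plot.bar()` would draw the counts as a bar chart if matplotlib is installed.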
Next, let's look at 'savings'.
Almost all savings values are between -20 and 20.
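A claim like this can also be checked numerically rather than only by eye. The sketch below uses made-up values (an assumption, not the real savings column) to compute the share of values that fall between -20 and 20.

```python
import pandas as pd

# Hypothetical savings values, chosen only to illustrate the check.
savings = pd.Series([-25.0, -3.5, 0.0, 4.2, 6.0, 19.9, 31.0], name="savings")

# Boolean mask: True where the value lies in [-20, 20] (inclusive on both ends).
in_range = savings.between(-20, 20)

# The mean of a boolean mask is the fraction of True values.
share_in_range = in_range.mean()

# Count, mean, quartiles, min and max in one table.
summary = savings.describe()

print(f"{share_in_range:.1%} of values are between -20 and 20")
```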
How about location?
USA and AUS have the most observations, more than 50 each, while RUS and CRI have fewer than 10.
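One way to get these counts is `value_counts` on the location column. This is only a sketch: the toy data and the small thresholds (3 and 2 instead of 50 and 10) are assumptions scaled down for the example.

```python
import pandas as pd

# Hypothetical toy data with uneven numbers of rows per location.
df = pd.DataFrame({
    "location": ["USA"] * 4 + ["AUS"] * 3 + ["RUS"] * 1,
    "savings": [5.0, 6.0, 4.5, 5.5, 4.0, 4.2, 3.9, 9.0],
})

# Observations per location, sorted from most to fewest.
obs_per_location = df["location"].value_counts()

# Thresholds are illustrative; the post uses 50 and 10 on the real data.
many = sorted(obs_per_location[obs_per_location >= 3].index)
few = sorted(obs_per_location[obs_per_location < 2].index)

print(obs_per_location)
```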
Then, let's look at savings by year.
Savings do not seem to have changed much over the years.
How about savings by location?
Almost all locations have a positive median savings value, but LVA and GRC have negative medians.
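The medians behind a per-location box plot can be computed with a group-by. The sketch below assumes columns named `location` and `savings` and uses invented numbers, so the output does not reproduce the real result.

```python
import pandas as pd

# Hypothetical toy data: one location with positive savings, one mostly negative.
df = pd.DataFrame({
    "location": ["USA", "USA", "USA", "GRC", "GRC", "GRC"],
    "savings": [4.0, 6.0, 5.0, -3.0, 1.0, -2.0],
})

# Median savings for each location.
median_by_location = df.groupby("location")["savings"].median()

# Locations whose median is below zero.
negative = median_by_location[median_by_location < 0].index.tolist()

print(median_by_location)
print(negative)
```

Grouping by `year` instead of `location` gives the same kind of summary for savings by year.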
For further analysis, I would like to make a subset of the data frame: years from 2000 onward, keeping only locations that have 23 observations, which means each location covers every year from 2000 to 2022.
Let's make it.
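The filtering could look roughly like the following in pandas. The toy data, the column names, and the helper column `n` are assumptions for this sketch, and it assumes at most one row per location and year, so that a count of 23 means the years 2000 to 2022 are all present.

```python
import pandas as pd

# Hypothetical toy data: AAA has every year from 1995, BBB only from 2005.
years_a = list(range(1995, 2023))  # 28 years, 23 of them from 2000 onward
years_b = list(range(2005, 2023))  # 18 years
df = pd.DataFrame({
    "location": ["AAA"] * len(years_a) + ["BBB"] * len(years_b),
    "year": years_a + years_b,
    "savings": 1.0,  # constant placeholder value
})

# Step 1: keep only the years from 2000 onward.
recent = df[df["year"] >= 2000].copy()

# Step 2: attach the number of observations per location as column 'n'.
recent["n"] = recent.groupby("location")["year"].transform("count")

# Step 3: keep only locations with a complete run of 23 observations.
subset = recent[recent["n"] == 23].reset_index(drop=True)

print(subset["location"].unique())
```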
Let's look at the summary statistics of the subset data frame.
Now 'n' is no longer needed, so I remove it.
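A short sketch of these two steps, again with a made-up data frame standing in for the real subset (the column names are assumptions):

```python
import pandas as pd

# Hypothetical stand-in for the subset, including the helper column 'n'.
subset = pd.DataFrame({
    "location": ["AAA", "AAA", "BBB", "BBB"],
    "year": [2000, 2001, 2000, 2001],
    "savings": [2.0, 4.0, -1.0, 3.0],
    "n": 23,
})

# Summary statistics for the numeric columns.
summary = subset.describe()

# Drop the helper column once the filtering is done.
subset = subset.drop(columns="n")

print(summary)
print(subset.columns.tolist())
```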
All right. I will do further analysis with the subset in the next post.
That's it. Thank you!
Next post is
To read from the first post,