www.crosshyou.info

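I practice R using data from the Portal Site of Official Statistics of Japan (e-Stat), the OECD, UCI, and other sources. I also post reading notes from time to time.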

OECD Crop production data analysis 8 - Hierarchical Clustering using R

Photo by Dustin Humes on Unsplash

www.crosshyou.info

This post follows on from the post linked above. In this post, I perform hierarchical clustering using R.

First, I make a matrix for clustering. I start by subsetting only the data for the year 2020.
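A minimal sketch of this step, assuming a hypothetical data frame `df` with columns `location`, `year`, `maize`, `rice` and `wheat`. The column names, country codes and values are placeholders invented for illustration, not the actual OECD figures.

```r
# Hypothetical stand-in for the crop data (placeholder values, not OECD figures).
df <- data.frame(
  location = rep(c("AAA", "BBB", "CCC", "DDD"), times = 2),
  year     = rep(c(1990, 2020), each = 4),
  maize    = c(10, 200, 35, 5, 15, 260, 50, 8),
  rice     = c(120, 8, 40, 2, 150, 10, 55, 3),
  wheat    = c(60, 70, 20, 30, 75, 50, 25, 45)
)

# Keep only the rows for 2020.
df2020 <- df[df$year == 2020, ]

# Build a numeric matrix with one row per country.
# The row names later become the labels on the dendrogram.
mat2020 <- as.matrix(df2020[, c("maize", "rice", "wheat")])
rownames(mat2020) <- as.character(df2020$location)

mat2020
```

Putting the country codes in the row names rather than in a column keeps the matrix purely numeric, which is what scale() and dist() expect.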

Let's use the summary() function to see the summary statistics.
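As a small illustration, summary() can be called directly on the matrix. The matrix `mat2020` below is a hypothetical placeholder with invented values, not the real data.

```r
# Hypothetical 2020 matrix: rows are countries, columns are crops (placeholder values).
mat2020 <- matrix(
  c(15, 260, 50, 8,    # maize
    150, 10, 55, 3,    # rice
    75, 50, 25, 45),   # wheat
  nrow = 4,
  dimnames = list(c("AAA", "BBB", "CCC", "DDD"), c("maize", "rice", "wheat"))
)

# summary() reports min, quartiles, median, mean and max for each column.
s <- summary(mat2020)
s
```

If the columns turn out to have very different ranges, that is a common reason to standardize them before computing distances.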

Then, I use the scale() function to standardize the variables.
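A sketch of the standardization step, again on a hypothetical placeholder matrix `mat2020` (the values are invented for illustration).

```r
# Hypothetical 2020 matrix (placeholder values).
mat2020 <- matrix(
  c(15, 260, 50, 8, 150, 10, 55, 3, 75, 50, 25, 45),
  nrow = 4,
  dimnames = list(c("AAA", "BBB", "CCC", "DDD"), c("maize", "rice", "wheat"))
)

# scale() subtracts each column's mean and divides by its standard deviation.
scaled2020 <- scale(mat2020)

scaled2020

# The centers and scales that were used are kept as attributes.
attr(scaled2020, "scaled:center")
attr(scaled2020, "scaled:scale")
```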

You will see that the scaled matrix has mean 0 and standard deviation 1 for every variable.
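One way to check this claim is with colMeans() and apply(). The matrix below is a hypothetical placeholder, so only the pattern matters, not the numbers.

```r
# Hypothetical 2020 matrix (placeholder values), standardized with scale().
mat2020 <- matrix(
  c(15, 260, 50, 8, 150, 10, 55, 3, 75, 50, 25, 45),
  nrow = 4,
  dimnames = list(c("AAA", "BBB", "CCC", "DDD"), c("maize", "rice", "wheat"))
)
scaled2020 <- scale(mat2020)

# Column means: zero up to floating-point rounding error.
col_means <- colMeans(scaled2020)
round(col_means, 10)

# Column standard deviations: one for every variable.
col_sds <- apply(scaled2020, 2, sd)
col_sds
```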

Now, I can use the dist() function to calculate the pairwise distances.
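A sketch of the distance calculation on a hypothetical standardized matrix. By default dist() uses the Euclidean distance between rows; whether the original analysis used that default is an assumption here.

```r
# Hypothetical 2020 matrix (placeholder values), standardized with scale().
mat2020 <- matrix(
  c(15, 260, 50, 8, 150, 10, 55, 3, 75, 50, 25, 45),
  nrow = 4,
  dimnames = list(c("AAA", "BBB", "CCC", "DDD"), c("maize", "rice", "wheat"))
)
scaled2020 <- scale(mat2020)

# Pairwise Euclidean distances between rows (countries).
d2020 <- dist(scaled2020, method = "euclidean")

d2020
```

With 4 rows, the result holds 4 * 3 / 2 = 6 distances, one for each pair of countries.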

Next, I can use the hclust() function for hierarchical clustering.
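A sketch of the clustering call on hypothetical placeholder data. hclust() defaults to complete linkage; the linkage method used in the original analysis is not stated, so the default here is an assumption.

```r
# Hypothetical 2020 matrix (placeholder values).
mat2020 <- matrix(
  c(15, 260, 50, 8, 150, 10, 55, 3, 75, 50, 25, 45),
  nrow = 4,
  dimnames = list(c("AAA", "BBB", "CCC", "DDD"), c("maize", "rice", "wheat"))
)

# Standardize, compute distances, then cluster (complete linkage is the default).
hc2020 <- hclust(dist(scale(mat2020)), method = "complete")

hc2020

# The merge history and the height of each merge are stored in the result.
hc2020$merge
hc2020$height
```

Other linkage methods such as "average" or "ward.D2" can be passed through the `method` argument and may give a different tree.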

Finally, I can use the plot() function to visualize the hierarchical clustering result.
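A sketch of the plotting step on hypothetical placeholder data. The title and the `hang = -1` setting are illustrative choices, not necessarily those of the original plot.

```r
# Hypothetical 2020 matrix (placeholder values), clustered as in the steps above.
mat2020 <- matrix(
  c(15, 260, 50, 8, 150, 10, 55, 3, 75, 50, 25, 45),
  nrow = 4,
  dimnames = list(c("AAA", "BBB", "CCC", "DDD"), c("maize", "rice", "wheat"))
)
hc2020 <- hclust(dist(scale(mat2020)))

# Draw the dendrogram; hang = -1 aligns all labels at the bottom.
plot(hc2020, hang = -1, main = "Hierarchical clustering, 2020",
     xlab = "", sub = "")
```

Countries that are joined low in the tree are close to each other in the standardized variables.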

CHN (China) and IND (India) are clustered into the same group.
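Group membership can also be checked in code rather than read off the dendrogram. The sketch below uses cutree() on hypothetical placeholder data; the number of groups `k = 2` is an arbitrary choice for illustration.

```r
# Hypothetical 2020 matrix (placeholder values), clustered as in the steps above.
mat2020 <- matrix(
  c(15, 260, 50, 8, 150, 10, 55, 3, 75, 50, 25, 45),
  nrow = 4,
  dimnames = list(c("AAA", "BBB", "CCC", "DDD"), c("maize", "rice", "wheat"))
)
hc2020 <- hclust(dist(scale(mat2020)))

# Cut the tree into k groups; the result is a named vector of group numbers.
groups <- cutree(hc2020, k = 2)
groups

# List the countries in each group.
split(names(groups), groups)
```

With the real data, two countries such as CHN and IND belong to the same group when their entries in `groups` are equal.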

Let's follow the same procedure with the data for the first year, 1990.
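Since the procedure is identical for both years, it can be wrapped in a small helper. The function `cluster_year()` and the data frame `df` below are hypothetical, with invented placeholder values rather than the actual OECD data.

```r
# Hypothetical stand-in for the crop data (placeholder values, not OECD figures).
df <- data.frame(
  location = rep(c("AAA", "BBB", "CCC", "DDD"), times = 2),
  year     = rep(c(1990, 2020), each = 4),
  maize    = c(10, 200, 35, 5, 15, 260, 50, 8),
  rice     = c(120, 8, 40, 2, 150, 10, 55, 3),
  wheat    = c(60, 70, 20, 30, 75, 50, 25, 45)
)

# Hypothetical helper: subset one year, build the matrix, scale, and cluster.
cluster_year <- function(data, yr) {
  sub <- data[data$year == yr, ]
  m <- as.matrix(sub[, c("maize", "rice", "wheat")])
  rownames(m) <- as.character(sub$location)
  hclust(dist(scale(m)))
}

hc1990 <- cluster_year(df, 1990)
hc2020 <- cluster_year(df, 2020)

# Show both dendrograms side by side for comparison.
op <- par(mfrow = c(1, 2))
plot(hc1990, hang = -1, main = "1990", xlab = "", sub = "")
plot(hc2020, hang = -1, main = "2020", xlab = "", sub = "")
par(op)
```

Each year is standardized separately here, so the two trees compare the relative positions of countries within a year, not absolute production levels across years.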

I see some differences between 1990 and 2020. For example, the USA is in a cluster by itself in 1990, but in 2020 it is clustered with BRA (Brazil). So the clustering suggests that the world changed between 1990 and 2020.

That's it. Thank you!

The next post is:

www.crosshyou.info

 

To read from the first post:

www.crosshyou.info