A blog about doing things with R and reading books

I practice R using data from Japan's Portal Site of Official Statistics (e-Stat), the OECD, UCI, and other sources. I also post reading notes from time to time.

OECD Purchasing power parities (PPP) data analysis 5 - PCA (Principal Component Analysis) using R.

f:id:cross_hyou:20220101200456j:plain

Photo by Aron Visuals on Unsplash 

www.crosshyou.info

This post follows on from the post linked above.
In this post, I will perform PCA (Principal Component Analysis).

I referred to the article below.
Principal Component Analysis (PCA) 101, using R | by Peter Nistrup | Towards Data Science

 

First, I will make a subset of df for PCA.

f:id:cross_hyou:20220101201356p:plain

Now I have the subset_pca data frame, which has 6 numerical variables: ppp, gdp, capi, l_gdp, l_capi and l_ppp.
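The code for this step is only shown as a screenshot, so here is a minimal sketch of how such a subset could be built. The toy df below is made up purely for illustration (the real df comes from the earlier posts), the location column is hypothetical, and I am assuming that the l_ columns are natural logs of the raw columns.

```r
# Toy stand-in for df. The values are random placeholders, not OECD data.
set.seed(1)
df <- data.frame(
  location = paste0("C", 1:10),      # hypothetical country codes
  ppp  = runif(10, 0.5, 150),
  gdp  = runif(10, 1e4, 1e7),
  capi = runif(10, 5e3, 8e4)
)

# Assumption: l_gdp, l_capi and l_ppp are natural logs of the raw columns.
df$l_gdp  <- log(df$gdp)
df$l_capi <- log(df$capi)
df$l_ppp  <- log(df$ppp)

# Keep only the six numeric columns used for PCA.
subset_pca <- df[, c("ppp", "gdp", "capi", "l_gdp", "l_capi", "l_ppp")]
str(subset_pca)
```

Selecting columns by name keeps non-numeric columns such as the country code out of prcomp(), which only accepts numeric input.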

Then, I use the prcomp() function for PCA.

f:id:cross_hyou:20220101201808p:plain

The results above show that PC1 accounts for 39% of the variance and PC2 accounts for 32%. So, PC1 and PC2 together account for 72% of the variance.
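As a rough sketch of where these percentages come from, the block below runs prcomp() on the built-in USArrests data, which I use here only as a stand-in for subset_pca, so the numbers will differ from the 39% and 32% above. Whether the original call used center and scale. is an assumption on my part.

```r
# Stand-in data: USArrests ships with R. It is NOT the OECD data from this post.
pca_result <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# summary() prints the standard deviation, proportion of variance
# and cumulative proportion for each principal component.
summary(pca_result)

# The same proportions computed by hand from the component variances.
var_explained <- pca_result$sdev^2 / sum(pca_result$sdev^2)
round(var_explained, 2)          # share of variance per component
round(cumsum(var_explained), 2)  # cumulative share
```

Scaling matters when the variables are on very different scales, as gdp and ppp are. Without it, the variable with the largest variance would dominate PC1.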

Then, I make a plot to visualize the PCA result using the screeplot() function.

f:id:cross_hyou:20220101202706p:plain

f:id:cross_hyou:20220101202719p:plain
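Here is a minimal sketch of a scree plot, again on the built-in USArrests data as a stand-in. The dashed line at 1 is my own addition for illustration (a common rule of thumb for scaled data) and I do not know whether the screenshot above includes it.

```r
# Stand-in data: USArrests, not the OECD data from this post.
pca_result <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# screeplot() draws the variance (sdev^2) of each principal component.
screeplot(pca_result, type = "lines", main = "Scree plot")

# With scaled variables each original variable has variance 1, so
# components above this line explain more than one original variable.
abline(h = 1, lty = 2)
```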

Let's make a PC1 vs PC2 plot.

f:id:cross_hyou:20220101203632p:plain

f:id:cross_hyou:20220101203643p:plain
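A PC1 vs PC2 plot like this can be sketched with base graphics. The example uses USArrests as a stand-in, and the axis labels with percentages are my own choice, not necessarily what the screenshot shows.

```r
# Stand-in data: USArrests, not the OECD data from this post.
pca_result <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Percentage of variance per component, used in the axis labels.
pct <- round(100 * pca_result$sdev^2 / sum(pca_result$sdev^2), 1)

# pca_result$x holds the scores: one row per observation, one column per PC.
plot(pca_result$x[, "PC1"], pca_result$x[, "PC2"],
     xlab = paste0("PC1 (", pct[1], "%)"),
     ylab = paste0("PC2 (", pct[2], "%)"),
     main = "PC1 vs PC2", pch = 19)

# Label each point with its row name.
text(pca_result$x[, "PC1"], pca_result$x[, "PC2"],
     labels = rownames(pca_result$x), pos = 3, cex = 0.6)
```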

Then, I add row names to pca_result$x.

f:id:cross_hyou:20220101204801p:plain
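Adding row names can be sketched as below. The toy data frame and its location column are hypothetical, and I am assuming the labels come from a column of the original data frame.

```r
# Toy stand-in: random placeholder values with a hypothetical label column.
set.seed(42)
toy <- data.frame(
  location = c("AAA", "BBB", "CCC", "DDD", "EEE"),
  ppp  = rlnorm(5),
  gdp  = rlnorm(5, meanlog = 10),
  capi = rlnorm(5, meanlog = 9)
)

# PCA on the numeric columns only.
pca_result <- prcomp(toy[, c("ppp", "gdp", "capi")], center = TRUE, scale. = TRUE)

# The score matrix keeps the row order of the input,
# so the labels can be assigned directly.
rownames(pca_result$x) <- toy$location
head(pca_result$x)
```

With row names in place, plots and dendrograms built from pca_result$x can show the labels instead of row numbers.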

Then, let's make a hierarchical clustering dendrogram.

f:id:cross_hyou:20220101205257p:plain

f:id:cross_hyou:20220101205307p:plain
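A dendrogram of this kind can be sketched with dist() and hclust(). The example again uses USArrests as a stand-in. Clustering on the first two components with Euclidean distance and complete linkage are all assumptions of mine, since the actual settings are only visible in the screenshot.

```r
# Stand-in data: USArrests, not the OECD data from this post.
pca_result <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Assumption: cluster on the PC1 and PC2 scores only.
scores <- pca_result$x[, 1:2]

# Euclidean distances between observations, then complete-linkage clustering.
hc <- hclust(dist(scores), method = "complete")

# The row names of the score matrix become the leaf labels.
plot(hc, main = "Hierarchical clustering on PC1 and PC2", xlab = "", sub = "", cex = 0.6)
```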

That's it. Thank you!

The next post is

 

www.crosshyou.info

 


To read the 1st post,

www.crosshyou.info