www.crosshyou.info

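I practice R using data from e-Stat (the portal site for official Japanese government statistics), the OECD, UCI, and other sources. I also post reading notes from time to time.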

OECD Total official and private flows data analysis 5 - Classification using R's glmnet package.

Generated by Bing Image Creator: Photo of dandelions and violets blooming all over a field

www.crosshyou.info

This post follows on from the post linked above. In the previous post, I used R's 'tree' package. In this post, I will use R's 'glmnet' package for classification.
First, I load the 'glmnet' package.

Since 'glmnet' requires a matrix object as input, I will build a matrix to use with 'glmnet'.

I also add squared, cubed and interaction variables.
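One common way to build such a matrix is model.matrix(). The following is only a minimal sketch: the toy data frame and the names x1, x2, y and mtx_demo are made up for illustration and are not the variables used in this post.

```r
# Hypothetical toy data; x1, x2 and y are illustrative names only
set.seed(1)
df <- data.frame(
  y  = rbinom(20, 1, 0.5),
  x1 = rnorm(20),
  x2 = rnorm(20)
)

# model.matrix() expands the formula into a numeric matrix.
# I() protects the arithmetic, x1:x2 is the interaction term,
# and [, -1] drops the intercept column (glmnet adds its own intercept).
mtx_demo <- model.matrix(
  y ~ x1 + x2 + I(x1^2) + I(x2^2) + I(x1^3) + I(x2^3) + x1:x2,
  data = df
)[, -1]

dim(mtx_demo)        # 20 rows, 7 columns
colnames(mtx_demo)
```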

Then, I will divide 'mtx' into two matrices: one for training and the other for testing.
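A split like this can be done by sampling row indices. This is a sketch on made-up data; the 70/30 ratio and the names mtx_demo, y_demo, x_train and x_test are assumptions, not the values used in this post.

```r
# Hypothetical predictor matrix and binary outcome
set.seed(123)
mtx_demo <- matrix(rnorm(100 * 3), nrow = 100,
                   dimnames = list(NULL, c("x1", "x2", "x3")))
y_demo <- rbinom(100, 1, 0.5)

# Randomly pick 70% of the row numbers for training (ratio is an assumption)
train_idx <- sample(nrow(mtx_demo), size = round(0.7 * nrow(mtx_demo)))

x_train <- mtx_demo[train_idx, ]
x_test  <- mtx_demo[-train_idx, ]   # negative index = all remaining rows
y_train <- y_demo[train_idx]
y_test  <- y_demo[-train_idx]

dim(x_train)
dim(x_test)
```

Using the same index vector for the matrix and the outcome keeps rows and labels aligned.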

Okay, let's use the 'glmnet' package. First, I use the cv.glmnet() function to find the best lambda.

I set alpha = 1, so the penalty is the lasso (L1) penalty.

Let's plot the result.

cvfit_lasso$lambda.min holds the best lambda, that is, the value that minimizes the cross-validated error.
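The cross-validation step described above can be sketched as follows. The data here is simulated, and family = "binomial" is my assumption for a two-class outcome; the exact arguments used in this post may differ.

```r
library(glmnet)

# Simulated data: the 0/1 outcome depends only on x1 and x2
set.seed(42)
n <- 200
x <- matrix(rnorm(n * 5), nrow = n,
            dimnames = list(NULL, paste0("x", 1:5)))
y <- rbinom(n, 1, plogis(1.5 * x[, 1] - 1.0 * x[, 2]))

# alpha = 1 selects the lasso penalty (alpha = 0 would be ridge);
# family = "binomial" fits a penalized logistic regression
cvfit <- cv.glmnet(x, y, family = "binomial", alpha = 1)

plot(cvfit)         # cross-validated deviance against log(lambda)
cvfit$lambda.min    # lambda with the lowest cross-validated error
cvfit$lambda.1se    # largest lambda within one standard error of the minimum
```

lambda.1se gives a sparser model than lambda.min and is a common alternative when a simpler model is preferred.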

I use the best lambda to train the model.

Let's see the beta coefficients.

v2, v3 and tv are not included in this estimate: the lasso penalty shrank their coefficients to exactly zero.
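To show how dropped variables appear in the output, here is a small sketch on simulated data. The names x1 to x5 are hypothetical, and family = "binomial" is again an assumption.

```r
library(glmnet)

# Simulated data: only x1 and x2 carry signal
set.seed(42)
n <- 200
x <- matrix(rnorm(n * 5), nrow = n,
            dimnames = list(NULL, paste0("x", 1:5)))
y <- rbinom(n, 1, plogis(1.5 * x[, 1] - 1.0 * x[, 2]))

cvfit <- cv.glmnet(x, y, family = "binomial", alpha = 1)

# Refit at the single best lambda
fit <- glmnet(x, y, family = "binomial", alpha = 1,
              lambda = cvfit$lambda.min)

b <- coef(fit)   # sparse matrix; entries printed as "." are exactly zero
print(b)

# Convert to a named numeric vector and list the variables that were dropped
b_dense <- as.matrix(b)[, 1]
names(b_dense)[b_dense == 0]
```

Which variables end up at zero depends on the data and on lambda, so the result of the last line can be empty.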

Let's predict using 'fit_lasso'.

Let's make a contingency table.
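Prediction and the contingency table can be sketched like this. Everything here is simulated: the train/test rows, the variable names and family = "binomial" are assumptions for illustration.

```r
library(glmnet)

set.seed(7)
n <- 300
x <- matrix(rnorm(n * 4), nrow = n,
            dimnames = list(NULL, paste0("x", 1:4)))
y <- rbinom(n, 1, plogis(1.2 * x[, 1] - 0.8 * x[, 2]))
train <- 1:200   # first 200 rows for training, last 100 for testing

cvfit <- cv.glmnet(x[train, ], y[train], family = "binomial", alpha = 1)
fit <- glmnet(x[train, ], y[train], family = "binomial", alpha = 1,
              lambda = cvfit$lambda.min)

# type = "class" returns the predicted label ("0" or "1") for each test row
pred_class <- as.vector(predict(fit, newx = x[-train, ], type = "class"))

# Rows: predicted class, columns: actual class
tab <- table(predicted = pred_class, actual = y[-train])
tab
```

With type = "response" instead, predict() returns probabilities, which lets you choose a cutoff other than 0.5.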

So, this lasso regression predicts only (124 + 39) / (124 + 39 + 66 + 41) = 163 / 270 ≈ 60% of the cases correctly.

So, in this case, the tree model is better than the lasso regression.
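The accuracy arithmetic can be checked in R. The four counts come from the table in this post; which off-diagonal cell holds 66 and which holds 41 is my assumption, and it does not affect the accuracy.

```r
# Correct predictions (124 and 39) sit on the diagonal;
# the placement of the two error counts (41 and 66) is assumed
tab <- matrix(c(124, 41,
                66,  39),
              nrow = 2,
              dimnames = list(predicted = c("0", "1"),
                              actual    = c("0", "1")))

accuracy <- sum(diag(tab)) / sum(tab)   # (124 + 39) / 270
round(accuracy, 4)                      # 0.6037
1 - accuracy                            # misclassification rate, about 0.396
```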

That's it. Thank you!
The next post is:

www.crosshyou.info

To read this series from the first post, see:

www.crosshyou.info