World Bank's Pregnant women receiving prenatal care (%) data analysis 6 - Classification practice with tidymodels package

Generated by Bing Image Creator: Close up photo of wisteria flowers.


In this post, I practice classification using the tidymodels workflow.

I follow the tidymodels article "Preprocess your data with recipes".

First, I make a dummy variable, ninety, which is 1 if care (%) is 90 or greater and 0 otherwise.
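A minimal sketch of this step, assuming df has a numeric column named care. The tiny data frame is a hypothetical stand-in for the World Bank data, and storing ninety as a factor is my assumption (classification models in tidymodels expect a factor outcome).

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in for the post's df (the real values come from the World Bank).
df <- tibble(
  country = c("A", "B", "C", "D"),
  care    = c(98.2, 90.0, 75.4, 61.0)
)

# ninety is 1 when care (%) is 90 or greater, otherwise 0.
# It is stored as a factor so it can serve as a classification outcome.
df <- df %>%
  mutate(ninety = factor(if_else(care >= 90, 1, 0), levels = c(0, 1)))

df
```

Note that a value of exactly 90 counts as 1 because the comparison is >= rather than >.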

Let's see the proportion of ninety by group.

High_income and Upper_middle_income have proportions of 0.75 or more, while Low_income and Lower_middle_income are below 0.4.
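The proportion by group could be computed along these lines. The data values are made up for illustration, and the column name prop_ninety is my own choice.

```r
library(dplyr)
library(tibble)

# Hypothetical stand-in data; the group labels follow the post, the values do not.
df <- tibble(
  group  = c("High_income", "High_income", "Low_income", "Low_income", "Low_income"),
  ninety = factor(c(1, 1, 0, 1, 0), levels = c(0, 1))
)

# The mean of a logical vector is the share of TRUE values,
# so this gives the proportion of rows with ninety == 1 in each group.
prop_by_group <- df %>%
  group_by(group) %>%
  summarize(prop_ninety = mean(ninety == "1"))

prop_by_group
```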

Then, I divide df into a training data set and a test data set using the initial_split() function.
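A sketch of the split, assuming the default 3/4 training proportion. The names data_split, train_data and test_data are hypothetical, the data frame is a made-up stand-in, and the seed is only there to make the split reproducible.

```r
library(rsample)
library(tibble)

set.seed(123)

# Hypothetical stand-in for df: 100 rows with a year and a binary outcome.
df <- tibble(
  year   = rep(2001:2020, times = 5),
  ninety = factor(rep(c(0, 1), times = 50), levels = c(0, 1))
)

# prop = 3/4 is the default and is written out here only for clarity.
data_split <- initial_split(df, prop = 3/4)
train_data <- training(data_split)
test_data  <- testing(data_split)

nrow(train_data)
nrow(test_data)
```

initial_split() also has a strata argument, which can keep the share of ninety similar in both sets; whether the original analysis used it is not stated.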

A summary of the data is shown below.

Next, I create a recipe.

The outcome variable is ninety, and the predictor variables are group and year. I do not use the other variables in the analysis.
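A recipe with this outcome and these predictors might look like the sketch below. The training data are a hypothetical stand-in, the name ninety_rec is mine, and the step_dummy() call is an optional addition of my own rather than something the post confirms.

```r
library(recipes)
library(dplyr)
library(tibble)

# Hypothetical stand-in for the training data.
train_data <- tibble(
  country = c("A", "B", "C", "D"),
  group   = factor(c("High_income", "Low_income", "High_income", "Low_income")),
  year    = c(2010, 2012, 2015, 2018),
  care    = c(98, 60, 95, 70),
  ninety  = factor(c(1, 0, 1, 0), levels = c(0, 1))
)

# Only group and year appear on the right-hand side of the formula,
# so the other columns (country, care) are left out of the recipe.
ninety_rec <- recipe(ninety ~ group + year, data = train_data) %>%
  step_dummy(all_nominal_predictors())  # assumption: encode group as dummy columns

# Lists each variable with its type and role (outcome or predictor).
rec_summary <- summary(ninety_rec)
rec_summary
```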

Next, I create a model. Since this is a classification problem, I use a logistic regression model.
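In parsnip, a logistic regression specification can be written as follows. The object name lr_mod is hypothetical, and the "glm" engine is an assumption (it is parsnip's default for logistic_reg()).

```r
library(parsnip)

# logistic_reg() declares the model type; set_engine() picks the fitting
# backend. "glm" uses stats::glm with a binomial family.
lr_mod <- logistic_reg() %>%
  set_engine("glm")

lr_mod
```

At this point nothing has been estimated; the object only describes what kind of model to fit.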

Next, I make a workflow.

Next, I fit the model using the fit() function.

Let's see ninety_fit.
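The workflow and fit steps can be sketched as below. The training data are simulated, so the coefficients will not match the post; ninety_wflow, ninety_rec and lr_mod are hypothetical names, while ninety_fit follows the post.

```r
library(tidymodels)

set.seed(123)
groups <- c("High_income", "Upper_middle_income", "Lower_middle_income", "Low_income")

# Simulated stand-in for the training data: richer groups are given a higher
# chance of ninety == 1, roughly echoing the proportions described in the post.
train_data <- tibble(
  group = factor(sample(groups, 200, replace = TRUE), levels = groups),
  year  = sample(2000:2020, 200, replace = TRUE)
) %>%
  mutate(
    p      = if_else(group %in% groups[1:2], 0.8, 0.3),
    ninety = factor(rbinom(n(), 1, p), levels = c(0, 1))
  ) %>%
  select(-p)

ninety_rec <- recipe(ninety ~ group + year, data = train_data)
lr_mod     <- logistic_reg() %>% set_engine("glm")

# A workflow bundles the preprocessing recipe and the model specification.
ninety_wflow <- workflow() %>%
  add_model(lr_mod) %>%
  add_recipe(ninety_rec)

# fit() applies the recipe to the training data and then estimates the model.
ninety_fit <- ninety_wflow %>%
  fit(data = train_data)

# Pull out the fitted model and show its coefficients as a tibble.
coefs <- ninety_fit %>%
  extract_fit_parsnip() %>%
  tidy()
coefs
```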

I can use the predict() function to predict ninety with the test data set.

I use the augment() function to see the prediction results.

I see that some observations are correctly predicted and some are not.
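A sketch of both calls on simulated data, assuming a fitted workflow like the one described above. All data values are made up, and the object names other than ninety_fit are hypothetical.

```r
library(tidymodels)

set.seed(123)
groups <- c("High_income", "Upper_middle_income", "Lower_middle_income", "Low_income")

# Simulated stand-in for df.
df <- tibble(
  group = factor(sample(groups, 300, replace = TRUE), levels = groups),
  year  = sample(2000:2020, 300, replace = TRUE)
) %>%
  mutate(
    p      = if_else(group %in% groups[1:2], 0.8, 0.3),
    ninety = factor(rbinom(n(), 1, p), levels = c(0, 1))
  ) %>%
  select(-p)

data_split <- initial_split(df)
train_data <- training(data_split)
test_data  <- testing(data_split)

ninety_fit <- workflow() %>%
  add_model(logistic_reg() %>% set_engine("glm")) %>%
  add_recipe(recipe(ninety ~ group + year, data = train_data)) %>%
  fit(data = train_data)

# predict() returns only the predicted class (.pred_class) by default.
predict(ninety_fit, new_data = test_data)

# augment() adds .pred_class, .pred_0 and .pred_1 to the test data,
# so predictions sit next to the observed outcome.
ninety_aug <- augment(ninety_fit, new_data = test_data)

# Count how many test rows are predicted correctly and incorrectly.
ninety_aug %>%
  mutate(correct = .pred_class == ninety) %>%
  count(correct)
```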

Let's make an ROC curve with roc_curve() and autoplot().

I calculate the area under the ROC curve with the roc_auc() function.
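A small sketch of both yardstick calls on hand-made predictions shaped like augment() output. The four rows are hypothetical, and treating "1" as the event via event_level = "second" is my assumption about how the factor levels are ordered.

```r
library(yardstick)
library(ggplot2)
library(tibble)

# Hypothetical predictions: observed class and predicted probability of class 1.
ninety_aug <- tibble(
  ninety  = factor(c(1, 1, 0, 0), levels = c(0, 1)),
  .pred_1 = c(0.9, 0.4, 0.6, 0.2)
)

# yardstick treats the first factor level as the event by default.
# Here the event is "1", the second level, so event_level = "second" is set.
roc_df   <- roc_curve(ninety_aug, truth = ninety, .pred_1, event_level = "second")
roc_plot <- autoplot(roc_df)  # print(roc_plot) draws the curve

# 3 of the 4 positive/negative pairs are ranked correctly, so the AUC is 0.75.
ninety_auc <- roc_auc(ninety_aug, truth = ninety, .pred_1, event_level = "second")
ninety_auc
```

An equivalent choice is to keep the default event level and pass .pred_0 instead, if "0" is the class of interest.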

I calculate the share of observations that are correctly predicted with the accuracy() function.

With logistic regression, 71.1% of observations are correctly predicted.
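To make the metric concrete, here is a tiny sketch with hypothetical predictions, unrelated to the 71.1% figure: accuracy is the number of rows where the predicted class equals the observed class, divided by the number of rows.

```r
library(yardstick)
library(tibble)

# Hypothetical observed and predicted classes; 4 of the 5 rows agree.
ninety_aug <- tibble(
  ninety      = factor(c(1, 1, 0, 0, 1), levels = c(0, 1)),
  .pred_class = factor(c(1, 0, 0, 0, 1), levels = c(0, 1))
)

# accuracy() compares the truth column with the predicted class column.
ninety_acc <- accuracy(ninety_aug, truth = ninety, estimate = .pred_class)
ninety_acc  # .estimate is 4 / 5 = 0.8
```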

That's it. Thank you!
