www.crosshyou.info

I practice R using data from e-Stat (the portal site of official statistics of Japan), the OECD, UCI and other sources. Sometimes I also post reading notes.

World Bank's Pregnant women receiving prenatal care (%) data analysis 5 - Regression Analysis (one categorical, one numerical) with tidymodels workflow

Generated by Bing Image Creator: Close up photo of yellow roses

www.crosshyou.info

This post follows the post linked above. In this post, I will do regression analysis using the tidymodels workflow.

I refer to tidymodels - Build a model.

To begin with, I load the needed packages: tidymodels, broom.mixed and dotwhisker.

Then, let's make a graph to visualize the data.

I see that the High_income group has a different slope from the other groups.
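As a rough sketch of this kind of graph: the data frame `df` below is simulated stand-in data, not the real World Bank values. The column names `year`, `group` and `care` and the group names Lower_middle and Upper_middle are assumptions made for illustration.

```r
library(tidymodels)  # attaches ggplot2, dplyr, tibble, parsnip, broom, ...

# Simulated stand-in data (NOT the real World Bank values).
# Only High_income and Low_income are group names taken from the post;
# the other two group names are hypothetical.
set.seed(123)
groups <- c("High_income", "Low_income", "Lower_middle", "Upper_middle")
df <- expand.grid(year = 1984:2021, group = groups) |>
  as_tibble() |>
  mutate(
    start = c(96, 55, 65, 80)[as.integer(group)],    # made-up level in 1984
    slope = c(0.05, 0.9, 0.7, 0.4)[as.integer(group)], # made-up yearly change
    care  = start + slope * (year - 1984) + rnorm(n(), sd = 3)
  ) |>
  select(year, group, care)

# Scatter plot with one fitted straight line per group
p <- ggplot(df, aes(x = year, y = care, color = group)) +
  geom_point(alpha = 0.5) +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)
p
```

With one regression line per group, a difference in slopes between groups is easy to spot by eye.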

Next, I make a linear regression model.

Then, I fit the model.

With the usual lm() function, I use the summary() function to see the result. But in the tidymodels workflow, I use the tidy() function.
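A minimal sketch of these three steps (model specification, fit, tidy), again on simulated stand-in data. The formula `care ~ year * group` is an assumption based on the interaction terms discussed in this post, and the object names `lm_mod`, `lm_fit` and `coefs` are just illustrative.

```r
library(tidymodels)

# Simulated stand-in data (NOT the real World Bank values);
# Lower_middle and Upper_middle are hypothetical group names.
set.seed(123)
groups <- c("High_income", "Low_income", "Lower_middle", "Upper_middle")
df <- expand.grid(year = 1984:2021, group = groups) |>
  as_tibble() |>
  mutate(
    start = c(96, 55, 65, 80)[as.integer(group)],
    slope = c(0.05, 0.9, 0.7, 0.4)[as.integer(group)],
    care  = start + slope * (year - 1984) + rnorm(n(), sd = 3)
  ) |>
  select(year, group, care)

# 1. Specify the model: ordinary linear regression with the "lm" engine
lm_mod <- linear_reg() |>
  set_engine("lm")

# 2. Fit it; year * group expands to year + group + year:group
lm_fit <- lm_mod |>
  fit(care ~ year * group, data = df)

# 3. Get the coefficient table as a tibble
coefs <- tidy(lm_fit)
coefs
```

The tibble has the columns term, estimate, std.error, statistic and p.value, so it can be filtered or sorted like any other data frame.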

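The base (reference) group is High_income, so the year coefficient is the slope for High_income. Each interaction term (such as year:groupLow_income) means the slope difference from High_income, and the result shows that these interaction terms are statistically significant.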

I use the dwplot() function to visualize the results.

I see that all three interaction terms are different from 0.

Next, let's predict the care (%) value using a new data set.
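A hedged sketch of a dot-and-whisker plot of the coefficients, on simulated stand-in data. Keeping only the interaction terms before plotting is my own choice for readability here, and the styling arguments follow the tidymodels "Build a model" article.

```r
library(tidymodels)
library(dotwhisker)

# Simulated stand-in data (NOT the real World Bank values);
# Lower_middle and Upper_middle are hypothetical group names.
set.seed(123)
groups <- c("High_income", "Low_income", "Lower_middle", "Upper_middle")
df <- expand.grid(year = 1984:2021, group = groups) |>
  as_tibble() |>
  mutate(
    start = c(96, 55, 65, 80)[as.integer(group)],
    slope = c(0.05, 0.9, 0.7, 0.4)[as.integer(group)],
    care  = start + slope * (year - 1984) + rnorm(n(), sd = 3)
  ) |>
  select(year, group, care)

lm_fit <- linear_reg() |>
  set_engine("lm") |>
  fit(care ~ year * group, data = df)

# Keep only the interaction terms (term names containing ":")
interaction_terms <- tidy(lm_fit) |>
  filter(grepl(":", term, fixed = TRUE))

# Dots are estimates, whiskers are confidence intervals;
# the dashed vertical line marks 0
p_dw <- dwplot(
  interaction_terms,
  dot_args = list(size = 2, color = "black"),
  whisker_args = list(color = "black"),
  vline = geom_vline(xintercept = 0, colour = "grey50", linetype = 2)
)
p_dw
```

If a whisker does not cross the dashed line at 0, that interaction term is different from 0 at the plotted confidence level.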

I use expand.grid() function to make new data points.

Then, I use the predict() function to make predictions.

One of the advantages of using the tidymodels workflow in regression analysis is that the result is a tibble.

So, I can do data manipulation easily.

I can make a tibble of confidence intervals.

Then, I merge the new data points, predicted means and confidence intervals.

Now, I can visualize it with the ggplot2 workflow.
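The prediction steps above can be sketched as follows, on simulated stand-in data. The years 1984, 2006 and 2021 and the two groups are the ones compared in this post; the object names (`new_points`, `mean_pred`, `conf_int_pred`, `plot_data`) and the plot styling are assumptions.

```r
library(tidymodels)

# Simulated stand-in data (NOT the real World Bank values);
# Lower_middle and Upper_middle are hypothetical group names.
set.seed(123)
groups <- c("High_income", "Low_income", "Lower_middle", "Upper_middle")
df <- expand.grid(year = 1984:2021, group = groups) |>
  as_tibble() |>
  mutate(
    start = c(96, 55, 65, 80)[as.integer(group)],
    slope = c(0.05, 0.9, 0.7, 0.4)[as.integer(group)],
    care  = start + slope * (year - 1984) + rnorm(n(), sd = 3)
  ) |>
  select(year, group, care)

lm_fit <- linear_reg() |>
  set_engine("lm") |>
  fit(care ~ year * group, data = df)

# New data points: every combination of three years and two groups
new_points <- expand.grid(
  year  = c(1984, 2006, 2021),
  group = factor(c("High_income", "Low_income"), levels = levels(df$group))
)

# Predicted means: a tibble with a .pred column
mean_pred <- predict(lm_fit, new_data = new_points)

# Confidence intervals: a tibble with .pred_lower and .pred_upper
conf_int_pred <- predict(lm_fit, new_data = new_points, type = "conf_int")

# Merge the new data points, predicted means and confidence intervals
plot_data <- new_points |>
  bind_cols(mean_pred) |>
  bind_cols(conf_int_pred)

# Point = predicted mean, error bar = confidence interval
dodge <- position_dodge(width = 0.3)
p_pred <- ggplot(plot_data, aes(x = factor(year), y = .pred, color = group)) +
  geom_point(position = dodge) +
  geom_errorbar(aes(ymin = .pred_lower, ymax = .pred_upper),
                width = 0.2, position = dodge) +
  labs(x = "year", y = "care (%)")
p_pred
```

Because predict() returns tibbles with the same row order as the new data, bind_cols() is enough to line everything up.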

From the above graph, I can say there is a difference between High_income and Low_income in 1984 and 2006, but there is no difference in 2021.

That's it. Thank you!

The next post is

www.crosshyou.info

To read from the first post,

www.crosshyou.info