OECD Trust in government data analysis 5 - Multiple linear regression (two numerical variables) with R: log(per capita GDP) is still significant after controlling for year.

Generated by Bing Image Creator: Photograph of coral leaf with beautiful fishes, more colorful.


This post follows on from the previous post. In this post, I will do a multiple regression analysis using the 'infer' package workflow in R.

I use the 'df' data frame. First, I add log(gdp) to 'df'.
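A minimal sketch of this step might look like the following. The data here is a made-up toy data frame, not the OECD data, and the raw column name pc_gdp is an assumption; only trust, year and l_pc_gdp are names used in this post.

```r
library(dplyr)

# Toy stand-in for 'df': made-up numbers, NOT the OECD data.
# 'pc_gdp' is an assumed name for the raw per capita GDP column.
set.seed(1)
df <- data.frame(
  year   = rep(2010:2019, times = 5),
  pc_gdp = runif(50, min = 20000, max = 80000)
)
df$trust <- 15 * log(df$pc_gdp) - 120 + rnorm(50, sd = 8)

# Add the natural log of per capita GDP as a new column.
df <- df |> mutate(l_pc_gdp = log(pc_gdp))
head(df)
```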

Let's look at the observed slope parameters.
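With the 'infer' workflow, the observed fit can be sketched as specify() followed by fit(). The toy data below is simulated only so the snippet runs on its own, so its estimates will not match the ones discussed in this post.

```r
library(infer)

# Toy stand-in for 'df': made-up numbers, NOT the OECD data.
set.seed(1)
df <- data.frame(year = rep(2010:2019, times = 5),
                 l_pc_gdp = runif(50, min = 10, max = 11.3))
df$trust <- 15 * df$l_pc_gdp - 0.2 * (df$year - 2010) - 120 + rnorm(50, sd = 8)

# Declare the response and both explanatory variables, then fit the model.
observed_fit <- df |>
  specify(trust ~ l_pc_gdp + year) |>
  fit()
observed_fit  # one row per term, with its estimate
```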

The result shows that l_pc_gdp has a positive relationship with trust, whereas year has a negative relationship with trust.

These estimates can also be obtained with the lm() function.
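As a rough illustration, the equivalent base R call could look like this. It again uses simulated toy data rather than the OECD data; on the same data frame, the coefficients should agree with the 'infer' fit because both are least-squares fits of the same formula.

```r
# Toy stand-in for 'df': made-up numbers, NOT the OECD data.
set.seed(1)
df <- data.frame(year = rep(2010:2019, times = 5),
                 l_pc_gdp = runif(50, min = 10, max = 11.3))
df$trust <- 15 * df$l_pc_gdp - 0.2 * (df$year - 2010) - 120 + rnorm(50, sd = 8)

# Ordinary least squares with the same formula as the 'infer' workflow.
fit_lm <- lm(trust ~ l_pc_gdp + year, data = df)
coef(fit_lm)  # intercept and the two slopes
```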

Next, let's generate the bootstrap distribution.
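One way to sketch the bootstrap step is to add generate() between specify() and fit(), so the model is refit on each resample. The number of replicates (200) is an arbitrary choice to keep the example fast, and the data is simulated toy data.

```r
library(infer)

# Toy stand-in for 'df': made-up numbers, NOT the OECD data.
set.seed(1)
df <- data.frame(year = rep(2010:2019, times = 5),
                 l_pc_gdp = runif(50, min = 10, max = 11.3))
df$trust <- 15 * df$l_pc_gdp - 0.2 * (df$year - 2010) - 120 + rnorm(50, sd = 8)

# Resample rows with replacement and refit the model on every resample.
boot_fits <- df |>
  specify(trust ~ l_pc_gdp + year) |>
  generate(reps = 200, type = "bootstrap") |>
  fit()
head(boot_fits)  # columns: replicate, term, estimate
```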

I use the visualize() function to see histograms of the estimates.
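A small sketch of the plotting call, assuming bootstrap fits built as in the 'infer' workflow and using simulated toy data, could be:

```r
library(infer)

# Toy stand-in for 'df': made-up numbers, NOT the OECD data.
set.seed(1)
df <- data.frame(year = rep(2010:2019, times = 5),
                 l_pc_gdp = runif(50, min = 10, max = 11.3))
df$trust <- 15 * df$l_pc_gdp - 0.2 * (df$year - 2010) - 120 + rnorm(50, sd = 8)

boot_fits <- df |>
  specify(trust ~ l_pc_gdp + year) |>
  generate(reps = 200, type = "bootstrap") |>
  fit()

# One histogram of bootstrap estimates per model term.
p <- visualize(boot_fits)
p
```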

The l_pc_gdp histogram does not include 0, so l_pc_gdp appears to be a significant variable. In contrast, the year histogram includes 0, so year may not be significant.

Let's calculate the 95% confidence intervals.
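The interval calculation can be sketched with get_confidence_interval(), passing the bootstrap fits and the observed fit. This uses its default interval type and simulated toy data, so the endpoints will differ from the values reported in this post.

```r
library(infer)

# Toy stand-in for 'df': made-up numbers, NOT the OECD data.
set.seed(1)
df <- data.frame(year = rep(2010:2019, times = 5),
                 l_pc_gdp = runif(50, min = 10, max = 11.3))
df$trust <- 15 * df$l_pc_gdp - 0.2 * (df$year - 2010) - 120 + rnorm(50, sd = 8)

observed_fit <- df |> specify(trust ~ l_pc_gdp + year) |> fit()
boot_fits <- df |>
  specify(trust ~ l_pc_gdp + year) |>
  generate(reps = 200, type = "bootstrap") |>
  fit()

# 95% confidence interval for every term in the model.
ci <- get_confidence_interval(boot_fits,
                              point_estimate = observed_fit,
                              level = 0.95)
ci  # columns: term, lower_ci, upper_ci
```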

The l_pc_gdp confidence interval runs from 12.3 to 17.7, so it does not include 0.

The year confidence interval runs from -0.4523 to 0.0297, so it includes 0.

Let's compare these with the theory-based (formula-based) confidence intervals.
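A common way to get formula-based intervals in base R is confint() on an lm() fit. This is a sketch under that assumption, since the exact call used for this comparison is not shown here, and it runs on simulated toy data.

```r
# Toy stand-in for 'df': made-up numbers, NOT the OECD data.
set.seed(1)
df <- data.frame(year = rep(2010:2019, times = 5),
                 l_pc_gdp = runif(50, min = 10, max = 11.3))
df$trust <- 15 * df$l_pc_gdp - 0.2 * (df$year - 2010) - 120 + rnorm(50, sd = 8)

# t-distribution based 95% intervals for each coefficient.
fit_lm <- lm(trust ~ l_pc_gdp + year, data = df)
theory_ci <- confint(fit_lm, level = 0.95)
theory_ci
```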

The theory-based confidence intervals are similar to the simulation-based ('infer' package workflow) confidence intervals.

Let's shade the simulation-based confidence intervals on the histograms.
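Shading can be sketched by adding shade_confidence_interval() to the visualize() plot and passing the interval table as endpoints. As before, the data is a simulated toy data frame and 200 replicates is an arbitrary choice.

```r
library(infer)

# Toy stand-in for 'df': made-up numbers, NOT the OECD data.
set.seed(1)
df <- data.frame(year = rep(2010:2019, times = 5),
                 l_pc_gdp = runif(50, min = 10, max = 11.3))
df$trust <- 15 * df$l_pc_gdp - 0.2 * (df$year - 2010) - 120 + rnorm(50, sd = 8)

observed_fit <- df |> specify(trust ~ l_pc_gdp + year) |> fit()
boot_fits <- df |>
  specify(trust ~ l_pc_gdp + year) |>
  generate(reps = 200, type = "bootstrap") |>
  fit()
ci <- get_confidence_interval(boot_fits,
                              point_estimate = observed_fit,
                              level = 0.95)

# Histograms per term with the 95% interval shaded on each panel.
p <- visualize(boot_fits) + shade_confidence_interval(endpoints = ci)
p
```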

That's it. Thank you!
