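A blog for doing things in R and reading books

I practice R using data from e-Stat (the portal site of Japanese government statistics), the OECD, UCI, and other sources. I also post reading notes from time to time.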

World Bank's Lead time to import data analysis 7 - Linear Regression Analysis and Heteroskedasticity Check

Generated by Bing Image Creator: Close-up photo of Crape Myrtle flowers on the local spring water

 

www.crosshyou.info

This post is a continuation of the post above. In that post, I found that the change in Lead time to import and the initial Lead time to import are negatively correlated.

In this post, I will check whether this relationship still holds after controlling for region and income group.

I have the following two models in mind.

1. change = beta0 + beta1 * lttl_2007 + beta2 * region + beta3 * group + u

2. change = beta0 + beta1 * log(lttl_2007) + beta2 * region + beta3 * group + u

beta1 is the coefficient I would like to examine. If beta1 is statistically different from 0, then lttl_2007 (or log(lttl_2007)) is a statistically significant variable.

I will use the lm() function to fit both models.
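As a rough illustration of what fitting these two models with lm() can look like, here is a minimal sketch on simulated data. It is not the actual World Bank data: the data frame name dat, the region and income group labels, and all simulated values are assumptions made only for this example, so the numbers it prints will differ from the results reported in this post.

```r
# Simulated stand-in data (hypothetical); the real data set is not reproduced here.
set.seed(123)
n <- 150
dat <- data.frame(
  lttl_2007 = runif(n, min = 1, max = 40),
  region = factor(sample(c("Region A", "Region B", "Region C"), n, replace = TRUE)),
  group = factor(sample(c("High income", "Low income"), n, replace = TRUE))
)
# Assumed data-generating process with a negative slope on lttl_2007.
dat$change <- 3 - 0.8 * dat$lttl_2007 + rnorm(n, sd = 2)

# Model 1: level of lttl_2007, controlling for region and income group.
model1 <- lm(change ~ lttl_2007 + region + group, data = dat)

# Model 2: log of lttl_2007, with the same controls.
model2 <- lm(change ~ log(lttl_2007) + region + group, data = dat)

# Pull out the row for beta1 (estimate, standard error, t value, p-value).
coef_m1 <- summary(model1)$coefficients["lttl_2007", ]
coef_m2 <- summary(model2)$coefficients["log(lttl_2007)", ]
print(coef_m1)
print(coef_m2)
```

Because region and group are factors, lm() expands each of them into dummy variables automatically, so "beta2 * region" in the model above stands for a set of region dummies rather than a single slope.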

Let's check model 1.

The coefficient of lttl_2007 is -1.11 and its p-value is less than 2e-16, so lttl_2007 is statistically significant.

How about model 2?

The coefficient of log(lttl_2007) is -5.34 and statistically significant.

For linear regression, it is important to check for heteroskedasticity. Let's check it with the lmtest::bptest() function, which performs the Breusch-Pagan test.
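To show what the test computes, here is a minimal sketch, again on simulated data (the data frame dat and its values are assumptions for illustration, not the data used in this post). By default bptest() runs the studentized (Koenker) version of the test, which can be reproduced by hand: regress the squared residuals on the same regressors and use n times the R-squared of that auxiliary regression as a chi-squared statistic. The call to lmtest::bptest() is only run if the lmtest package is installed.

```r
# Simulated stand-in data (hypothetical), with constant error variance.
set.seed(42)
n <- 150
dat <- data.frame(
  lttl_2007 = runif(n, min = 1, max = 40),
  region = factor(sample(c("Region A", "Region B", "Region C"), n, replace = TRUE)),
  group = factor(sample(c("High income", "Low income"), n, replace = TRUE))
)
dat$change <- 3 - 0.8 * dat$lttl_2007 + rnorm(n, sd = 2)

fit <- lm(change ~ lttl_2007 + region + group, data = dat)

# Auxiliary regression: squared residuals on the same regressors.
aux <- lm(resid(fit)^2 ~ lttl_2007 + region + group, data = dat)

# Studentized Breusch-Pagan statistic: n * R^2, chi-squared with
# degrees of freedom equal to the number of regressors (excluding the intercept).
bp_stat <- n * summary(aux)$r.squared
bp_df <- length(coef(aux)) - 1
bp_p <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)
cat("BP statistic:", bp_stat, " df:", bp_df, " p-value:", bp_p, "\n")

# Same test via lmtest, if the package is available.
if (requireNamespace("lmtest", quietly = TRUE)) {
  print(lmtest::bptest(fit))
}
```

The null hypothesis is homoskedasticity, so a small p-value is evidence of heteroskedasticity, while a large p-value only means the test found no evidence against constant variance.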

For model 1,

The p-value is greater than 0.05, so I cannot reject the null hypothesis of homoskedasticity. I conclude that there is no evidence of heteroskedasticity in model 1.

For model 2,

The p-value is 0.02, which is less than 0.05, so I cannot rule out heteroskedasticity.
Thus, I prefer model 1.

The conclusion is that, after controlling for region and income group, the 2007 Lead time to import and its change from 2007 to 2018 are negatively correlated.

That's it. Thank you!
To read this series from the first post, see:

www.crosshyou.info