www.crosshyou.info

I practice R using data from the Portal Site of Official Statistics of Japan (e-Stat) and from the OECD. I occasionally post reading notes as well.

Data_Analysis

OECD Adult education level data analysis 7 - Time series analysis, serial correlation, Cochrane-Orcutt estimation using R

Photo by Mattia Bericchia on Unsplash. www.crosshyou.info This post follows the post above. In the previous post, I ran time-series regressions with differenced data and found that these models are not valid. So, I will do with level d…

OECD Adult education level data analysis 6 - Time series regression analysis using R

Photo by Johannes W on Unsplash. www.crosshyou.info This post follows the post above. In this post, I will do time series regression analysis. Before starting, let's see which LOCATION has the most observations. USA has 41 o…

OECD Adult education level data analysis 5 - Cross sectional regression analysis using R

Photo by Aditi Bhatt on Unsplash. www.crosshyou.info This post follows the post above. In the previous post, I made some scatter plots to get a sense of the relationships between variables. In this post, I will do cross sectional regress…

OECD Adult education level data analysis 4 - Making a scatter plot using ggplot() + geom() function with R

Photo by Sergey Leont'ev on Unsplash. www.crosshyou.info This post follows the post above. In this post, let's look at the relationship between two variables. First, let's see the correlations. I use the cor() function to see the correlation. TRY an…

OECD Adult education level data analysis 3 - using ggplot() + geom_boxplot() function to see categorical variable and numerical variable relationship

Photo by Jean Vella on Unsplash. www.crosshyou.info This post follows the post above. I made histograms in the previous post; in this post, I will make another type of graph, the boxplot. Let's start with LOCATION and TRY. I see RUS,…

OECD Adult education level data analysis 2 - calculate summary statistics and making histograms using R

Photo by S. Tsuchiya on Unsplash. www.crosshyou.info This post follows the post above. In the post above, I made a data frame to work with. Let's check each variable name and its explanation. BUPSRY: Below upper secondary, in …

OECD Adult education level data analysis 1 - Importing CSV file into R using read_csv() and making a data frame to analyze.

Photo by Weiye Tan on Unsplash. In this post, I will analyze OECD Adult education level data. First, I downloaded a CSV file from the OECD website. {Education attainment - Adult education level - OECD Data} It looks like the screenshot above. I also…

OECD Tourism flows data analysis 7 - regression analysis with infer package of R

Photo by Alexander Cifuentes on Unsplash. www.crosshyou.info This post follows the post above. In this post, I will do cross sectional regression analysis using the infer package. First, I select one year for cross sectional regression. …

OECD Tourism flows data analysis 6 - Testing for serial correlation of the error term using R

Photo by Bob Brewer on Unsplash. www.crosshyou.info This post follows the post above. In the previous posts, I made 6 models. These are time series regression models. So, I would like to check whether there are serial correlati…

OECD Tourism flows data analysis 5 - regression with differenced data using R

Photo by guy stevens on Unsplash. www.crosshyou.info This post follows the post above. In the previous post, I regressed acc_nights, inter_arr and inter_dep on per_capita and a trend term for the Japan tourism flow data. In this post, I will …

OECD Tourism flows data analysis 4 - Time series regression for Japan tourism data with R.

Photo by Han Chenxu on Unsplash. www.crosshyou.info This post follows the post above. In the post above, I found that 2020 and 2021 show a COVID-19 effect and that each location has its own characteristics. So I will focus on a few locatio…

OECD Tourism flows data analysis 3 - Boxplots and Scatter Plots using R.

Photo by Quang Nguyen Vinh on Unsplash. www.crosshyou.info This post follows the post above. In this post, I will draw some graphs to see the relationship between two variables. First, I look at per capita and location with boxplots. I see SAU, AUS…

OECD Tourism flows data analysis 2 - Data wrangling and one variable visualization with R.

Photo by 2H Media on Unsplash. www.crosshyou.info This post follows the post above. I make an ACC_NIGHTS-only data frame with the filter() function in R. Next, I make an INTER_ARR-only data frame. I make an INTER_DEP-only data frame. Then, I …

OECD Tourism flows data analysis 1 - Load CSV data into R.

Photo by Erik Knoef on Unsplash. In this post, I will analyze OECD Tourism flows data. I downloaded a CSV file from the OECD website. I also downloaded GDP per capita data from the OECD website. I use R for data analysis. First, I load …

OECD Gender wage pay gap data analysis 7 - Line Graph and Boxplot with R - gender pay gap is decreasing

Photo by Alejandro Contreras on Unsplash. www.crosshyou.info This post follows the post above. Let's check which LOCATION has the most observations. I see GBR has the most, with 66 observations. USA has the 2nd most, FIN …

OECD Gender wage pay gap data analysis 6 - Multiple Linear Regression with R

Photo by Matteo Vella on Unsplash. www.crosshyou.info This post follows the post above. In the post above, I did simple linear regression analysis. This time, I will do multiple linear regression (MLR). I make a new variable, fac…

OECD Gender wage pay gap data analysis 5 - Simple Linear Regression with R and calculating confidence interval.

Photo by Laura Smetsers on Unsplash. www.crosshyou.info This post follows the post above. I will do regression analysis. First, I will do simple linear regression analysis. I use SELFEMPLOYEMENT as a dependent variable and EMPLOYE…

OECD Gender wage pay gap data analysis 4 - Estonia is the outlier from the point of gender wage pay gap view.

Photo by Renato Pozzi on Unsplash. www.crosshyou.info This post follows the post above. I make a small data frame that contains only the 2010 and 2014 observations. First, I use the filter() function to get only the 2010 and 2014 data, then I …

OECD Gender wage pay gap data analysis 3 - Bootstrapping Confidence Interval and Traditional Method p-value with R

Photo by Maria Tejada on Unsplash. www.crosshyou.info This post follows the post above. I will calculate a confidence interval. First, I make a bootstrap confidence interval. I use the R infer package. I use specify(), generate() and c…

OECD Gender wage gap data analysis 2 - Statistical Inference with Infer package, mean difference, calculation p-value with simulation based method

Photo by James Wainscoat on Unsplash. www.crosshyou.info This post follows the post above. In the previous post, I saw that the mean of EMPLOYEE is 17.9 and the mean of SELFEMPLOYED is 30.6. Let's see whether the difference is statistically si…

OECD Gender wage gap data analysis 1 - Load CSV file data into R

Photo by Kumiko SHIMIZU on Unsplash. Hello. In this post, I will analyze the OECD Gender wage gap data with R. First, I download a CSV file like the one below from the OECD website, https://data.oecd.org/earnwage/gender-wage-gap.htm I use R to analyze …

OECD Nuclear power plants data analysis 5 - Hypothesis test for One proportion using R infer package.

Photo by Craig Manners on Unsplash. www.crosshyou.info This post follows the post above. In this post, I will do a hypothesis test for one proportion: the proportion of nuclear power plants in Japan. In the previous post, I found Japan n…

OECD Nuclear power plants data analysis 4 - Getting confidence interval for one proportion using R infer package

Photo by Maarten van den Heuvel on Unsplash. www.crosshyou.info This post follows the post above. In this post, I will get a confidence interval for one proportion. In this case, number of nuclear power plants in Japan / number of …

OECD Nuclear power plants data analysis 3 - Hypothesis testing using R with infer package

Photo by Yan Agrit on Unsplash. www.crosshyou.info This post follows the post above. In this post, I do hypothesis testing using R with the infer package. I refer to B Inference Examples | Statistical Inference via Data Science (mode…

OECD Nuclear power plants data analysis 2 - Getting Confidence Interval using R with infer package

Photo by Eean Chen on Unsplash. www.crosshyou.info This post follows the post above. I will calculate a confidence interval in this post. There are two ways to calculate a confidence interval: one is the bootstrap method and the other is…

OECD Nuclear power plants data analysis 1 - Loading CSV data with R - USA has the most nuclear power plants.

Photo by Lukáš Lehotský on Unsplash. In this post, I will play around with OECD Nuclear power plants data with R. OECD Nuclear power plants data is defined as the number of nuclear units in operation as of 1 January 2019. It is measured a…

OECD social spending data analysis 5 - Bootstrapping with R infer package

Photo by Sonika Agarwal on Unsplash. www.crosshyou.info This post follows the post above. In this post, I will do bootstrapping with the R infer package. Suppose df2$priv_pc_gdp is the population. So the true mean of priv_pc_gdp is The true…

OECD Social spending data analysis 4 - Calculating Confidence Interval using R

Photo by Arda Demirkaynak on Unsplash. www.crosshyou.info This post follows the post above. In the previous post, I made some visualizations with the R ggplot2 package. In this post, I will calculate confidence intervals. Fi…

OECD Social spending data analysis 3 - Data Visualization with 5 Named Graphs (5NG) using R

Photo by Alicia Steels on Unsplash. www.crosshyou.info This post follows the post above. In the previous post, I made a data frame for data analysis, named 'df2'. Now, let's start the data analysis with data visualization. I will make 5 Name…

OECD Social spending data analysis 2 - Using filter(), select(), inner_join(), rename() function with R to make a dataframe to analyze.

Photo by Milos Prelevic on Unsplash. www.crosshyou.info This post follows the post above. In the previous post, I loaded OECD Social spending data into R. I also load country ISO code and continent name data from a CSV file like the one below. I …