2021-09-19

OECD Meat Consumption Data Analysis 6 - POULTRY Consumption is positively correlated with GDP

Data_Analysis

f:id:cross_hyou:20210919075836j:plain

Photo by corina ardeleanu on Unsplash

www.crosshyou.info

This post follows the post linked above.

I have GDP data in a CSV file like the one below.

f:id:cross_hyou:20210919080056p:plain

So, let's combine this GDP data with the meat consumption data.

f:id:cross_hyou:20210919080421p:plain

Next, I use the inner_join() function to combine the df2 data frame and the gdp data frame.

f:id:cross_hyou:20210919080559p:plain
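The join itself is only visible in the screenshot, so here is a minimal sketch of the same idea on toy data. I am assuming that both data frames share coun and year as key columns; the country codes and all values below are made up.

```r
library(dplyr)

# Toy stand-ins for df2 and gdp; country codes and values are made up.
df2 <- tibble(
  coun = c("AAA", "AAA", "BBB", "BBB"),
  year = c(2020, 2029, 2020, 2029),
  pokg = c(10.1, 10.5, 30.2, 31.0)
)
gdp <- tibble(
  coun  = c("AAA", "BBB"),
  year  = c(2020, 2020),
  usd   = c(3.1e6, 5.9e6),
  capit = c(25000, 23900)
)

# inner_join() keeps only the rows whose (coun, year) key exists in both
# tables, so years without GDP data drop out of the result.
joined <- inner_join(df2, gdp, by = c("coun", "year"))
print(joined)
```

If the GDP table covers fewer years than the consumption table, the joined data frame only covers the overlapping years.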

Let's see the result.

f:id:cross_hyou:20210919080732p:plain

usd is the total GDP value, and capit is per capita GDP.

Summary of df2:

f:id:cross_hyou:20210919080959p:plain

We see that the year range has changed: it is now 1990 to 2020.

Now, let's look at capit and the 4 KG_CAP variables.

f:id:cross_hyou:20210919082211p:plain

f:id:cross_hyou:20210919082223p:plain

I see that shkg is not strongly correlated with capit.

Let's look at the correlation between capit and the 4 KG_CAP variables by country.

f:id:cross_hyou:20210919083340p:plain

f:id:cross_hyou:20210919083350p:plain

We see that some countries have negative correlations and some countries have positive correlations.
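A per-country correlation like this can be sketched with group_by() and summarise(). This is a toy example, not the code from the screenshot: the country codes and values are made up, and bekg just stands in for any of the KG_CAP columns.

```r
library(dplyr)

# Two hypothetical countries: in AAA consumption rises with capit,
# in BBB it falls.
toy <- tibble(
  coun  = rep(c("AAA", "BBB"), each = 10),
  capit = rep(seq(10000, 19000, by = 1000), times = 2),
  bekg  = c(5:14, 14:5)
)

# One correlation coefficient per country, sorted from lowest to highest.
cor_by_country <- toy %>%
  group_by(coun) %>%
  summarise(r = cor(capit, bekg)) %>%
  arrange(r)
print(cor_by_country)
```

Sorting by r makes it easy to see which countries sit on the negative side and which on the positive side.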

How about pikg: PIG KG_CAP?

f:id:cross_hyou:20210919083451p:plain

f:id:cross_hyou:20210919083506p:plain

How about POULTRY?

f:id:cross_hyou:20210919083902p:plain

f:id:cross_hyou:20210919083913p:plain

All countries have positive correlations for POULTRY.

Lastly, how about shkg?

f:id:cross_hyou:20210919084319p:plain

f:id:cross_hyou:20210919084329p:plain

We see that some countries have negative correlations while others have positive correlations.

Per capita POULTRY consumption and per capita GDP are positively correlated in all countries.

That's it. Thank you!

Next post is

www.crosshyou.info

To read the 1st post,

www.crosshyou.info

2021-09-18

OECD Meat Consumption Data Analysis 5 - scatter plot using R ggplot2::geom_point()

Data_Analysis

f:id:cross_hyou:20210918164320j:plain

Photo by Casey Horner on Unsplash

www.crosshyou.info

This post follows the post linked above.
In this post, let's draw scatter plots using R's ggplot2::geom_point().
First of all, let's look at the correlations among the 4 KG_CAP variables.

f:id:cross_hyou:20210918164528p:plain

bekg: BEEF KG_CAP and pokg: POULTRY KG_CAP are the most strongly correlated pair.

pikg: PIG KG_CAP and shkg: SHEEP KG_CAP are the most weakly correlated pair.
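A correlation matrix like the one in the screenshot can be produced with base R's cor(). Here is a small sketch on made-up values; only the column names follow this series.

```r
# Toy per capita consumption values (made up).
toy <- data.frame(
  bekg = c(10, 12, 15, 20, 22),
  pikg = c(30, 28, 35, 25, 33),
  pokg = c(8, 11, 14, 19, 24),
  shkg = c(2, 1, 3, 1, 2)
)

# cor() on a data frame of numeric columns returns the full pairwise
# Pearson correlation matrix; round() makes it easier to read.
cor_mat <- round(cor(toy), 2)
print(cor_mat)
```

The strongest and weakest pairs are then the largest and smallest off-diagonal entries in absolute value.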

Then, let's draw scatter plots.
First, bekg: BEEF KG_CAP and pikg: PIG KG_CAP.

f:id:cross_hyou:20210918164914p:plain

f:id:cross_hyou:20210918165123p:plain

We see that some countries seem to have a negative correlation.
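For reference, a scatter plot of this kind can be sketched as follows. The data are made up, and coloring the points by country is my assumption about how the country-level pattern becomes visible; the real plot may be built differently.

```r
library(ggplot2)

# Two hypothetical countries, each with a negative bekg/pikg relationship.
toy <- data.frame(
  coun = rep(c("AAA", "BBB"), each = 4),
  bekg = c(10, 12, 14, 16, 30, 28, 26, 24),
  pikg = c(20, 19, 17, 16, 5, 6, 8, 9)
)

# geom_point() draws one point per row; mapping color to coun separates
# the countries visually.
p1 <- ggplot(toy, aes(x = bekg, y = pikg, color = coun)) +
  geom_point() +
  labs(title = "bekg vs pikg")
print(p1)
```

Storing the plot in an object such as p1 makes it possible to combine several plots later.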

Let's see bekg and pokg: POULTRY KG_CAP

f:id:cross_hyou:20210918165308p:plain

f:id:cross_hyou:20210918165318p:plain

How about bekg and shkg: SHEEP KG_CAP?

f:id:cross_hyou:20210918165453p:plain

f:id:cross_hyou:20210918165503p:plain

Many countries have relatively low values for shkg compared to bekg.

pikg and pokg

f:id:cross_hyou:20210918170418p:plain

f:id:cross_hyou:20210918170429p:plain

pikg and shkg

f:id:cross_hyou:20210918170113p:plain

f:id:cross_hyou:20210918170123p:plain

Lastly, pokg and shkg

f:id:cross_hyou:20210918170555p:plain

f:id:cross_hyou:20210918170606p:plain

Now we have 6 scatter plot objects, p1 to p6.
Let's show them all at once. We use the gridExtra::grid.arrange() function.

f:id:cross_hyou:20210918170758p:plain

f:id:cross_hyou:20210918170829p:plain
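Here is a small sketch of grid.arrange() with three toy plots instead of six. The data are made up and the ncol = 2 layout is my choice, not necessarily the layout in the screenshot.

```r
library(ggplot2)
library(gridExtra)

# Toy data (made up).
toy <- data.frame(
  bekg = c(10, 12, 15, 20),
  pikg = c(30, 28, 35, 25),
  pokg = c(8, 11, 14, 19)
)

p1 <- ggplot(toy, aes(x = bekg, y = pikg)) + geom_point()
p2 <- ggplot(toy, aes(x = bekg, y = pokg)) + geom_point()
p3 <- ggplot(toy, aes(x = pikg, y = pokg)) + geom_point()

# grid.arrange() draws the plots on one page, filling a 2-column grid
# row by row, and invisibly returns the combined layout object.
combined <- grid.arrange(p1, p2, p3, ncol = 2)
```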

That's it. Thank you!

Next post is

www.crosshyou.info

To read the 1st post,

www.crosshyou.info

2021-09-18

OECD Meat Consumption Data Analysis 4 - The USA consumes the most meat.

Data_Analysis

f:id:cross_hyou:20210918135117j:plain

Photo by Claiton Conto on Unsplash

www.crosshyou.info

This post follows the post linked above.
Let's look at the KG_CAP data as country averages.

First, bekg: BEEF KG_CAP

f:id:cross_hyou:20210918140731p:plain

f:id:cross_hyou:20210918140811p:plain

ARG has the highest average beef consumption. IND has the lowest.
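A country average like this can be sketched with group_by(), summarise() and arrange(). The country codes and values below are made up for illustration.

```r
library(dplyr)

# Toy data: three hypothetical countries, two years each.
toy <- tibble(
  coun = rep(c("AAA", "BBB", "CCC"), each = 2),
  year = rep(c(1990, 1991), times = 3),
  bekg = c(40, 42, 0.5, 0.7, 30, 32)
)

# Average bekg per country, sorted from highest to lowest.
avg_bekg <- toy %>%
  group_by(coun) %>%
  summarise(mean_bekg = mean(bekg)) %>%
  arrange(desc(mean_bekg))
print(avg_bekg)
```

The first and last rows of the sorted result are the highest and lowest consuming countries.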

How about pikg: PIG KG_CAP?

f:id:cross_hyou:20210918141449p:plain

f:id:cross_hyou:20210918141503p:plain

TUR, PAK and IRN do not eat PIG.

pokg: POULTRY KG_CAP

f:id:cross_hyou:20210918142306p:plain

f:id:cross_hyou:20210918142320p:plain

ISR has the highest POULTRY consumption.

Next, shkg: SHEEP KG_CAP

f:id:cross_hyou:20210918142731p:plain

f:id:cross_hyou:20210918142809p:plain

All right, let's look at the total (BEEF + PIG + POULTRY + SHEEP) KG_CAP.

f:id:cross_hyou:20210918143255p:plain

f:id:cross_hyou:20210918143308p:plain

The USA has the highest total meat consumption per capita.

That's it. Thank you!
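The total can be sketched by adding the four KG_CAP columns with mutate() before averaging. Again this is toy data with made-up values, and the name total is my own.

```r
library(dplyr)

# Toy data: two hypothetical countries, two years each.
toy <- tibble(
  coun = c("AAA", "AAA", "BBB", "BBB"),
  bekg = c(10, 12, 30, 32),
  pikg = c(20, 22, 25, 27),
  pokg = c(15, 16, 40, 44),
  shkg = c(1, 2, 1, 1)
)

# Row-wise total of the four meats, then the country average of that total.
total_by_country <- toy %>%
  mutate(total = bekg + pikg + pokg + shkg) %>%
  group_by(coun) %>%
  summarise(mean_total = mean(total)) %>%
  arrange(desc(mean_total))
print(total_by_country)
```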

Next post is

www.crosshyou.info

To see the 1st post,

www.crosshyou.info

2021-09-18

OECD Meat Consumption Data Analysis 3 - PIG and POULTRY are trending up while BEEF and SHEEP are trending down.

Data_Analysis

f:id:cross_hyou:20210918075109j:plain

Photo by Nathan Anderson on Unsplash

www.crosshyou.info

This post follows the post linked above.

Let's see coun: country.

f:id:cross_hyou:20210918075630p:plain

We see that every country has 40 observations.

Let's see year

f:id:cross_hyou:20210918075821p:plain

We see that every year has 38 observations.
So, the df2 data frame has 38 coun * 40 year = 1520 observations.

f:id:cross_hyou:20210918080058p:plain
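The same check can be sketched on a small complete country-by-year grid with dplyr's count(). The toy grid below has 3 hypothetical countries and 4 years, so the numbers are much smaller than in df2.

```r
library(dplyr)

# Complete grid: 3 hypothetical countries x 4 years = 12 rows.
toy <- expand.grid(
  coun = c("AAA", "BBB", "CCC"),
  year = 1990:1993,
  stringsAsFactors = FALSE
)

per_country <- count(toy, coun)  # rows per country = number of years
per_year    <- count(toy, year)  # rows per year = number of countries
print(per_country)
print(per_year)

# For a complete grid, the row count is countries times years.
n_expected <- n_distinct(toy$coun) * n_distinct(toy$year)
print(nrow(toy) == n_expected)
```

The count per country equals the number of years, and the count per year equals the number of countries.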

Let's see histograms for KG_CAP data.

First, bekg: BEEF KG_CAP

f:id:cross_hyou:20210918080354p:plain

f:id:cross_hyou:20210918080415p:plain

pikg: PIG KG_CAP

f:id:cross_hyou:20210918080602p:plain

f:id:cross_hyou:20210918080612p:plain

pokg: POULTRY KG_CAP

f:id:cross_hyou:20210918080742p:plain

f:id:cross_hyou:20210918080813p:plain

shkg: SHEEP KG_CAP

f:id:cross_hyou:20210918081042p:plain

f:id:cross_hyou:20210918081054p:plain

We see that POULTRY is the closest to a normal distribution.

Next, let's look at the time trend.

We will look at the average KG_CAP by year.

First, bekg: BEEF KG_CAP

f:id:cross_hyou:20210918081753p:plain

f:id:cross_hyou:20210918081804p:plain
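A yearly average and its line plot can be sketched like this. The data are made up, and using geom_line() is my assumption about the plot type.

```r
library(dplyr)
library(ggplot2)

# Toy data: two hypothetical countries, three years each.
toy <- tibble(
  coun = rep(c("AAA", "BBB"), each = 3),
  year = rep(1990:1992, times = 2),
  bekg = c(20, 18, 16, 10, 10, 8)
)

# Average across countries for each year.
trend <- toy %>%
  group_by(year) %>%
  summarise(mean_bekg = mean(bekg))
print(trend)

# Line plot of the yearly average.
p <- ggplot(trend, aes(x = year, y = mean_bekg)) +
  geom_line()
print(p)
```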

pikg: PIG KG_CAP time trend

f:id:cross_hyou:20210918082326p:plain

f:id:cross_hyou:20210918082337p:plain

pokg: POULTRY KG_CAP time trend

f:id:cross_hyou:20210918082552p:plain

f:id:cross_hyou:20210918082604p:plain

shkg: SHEEP KG_CAP time trend

f:id:cross_hyou:20210918082847p:plain

f:id:cross_hyou:20210918082857p:plain

We see that BEEF and SHEEP are trending down and PIG and POULTRY are trending up.
That's it. Thank you!

Next post is

www.crosshyou.info

To read the 1st post,

www.crosshyou.info

2021-09-13

OECD Meat Consumption Data Analysis 2 - PIG is the most popular meat.

Data_Analysis

f:id:cross_hyou:20210913202112j:plain

Photo by boris misevic on Unsplash

www.crosshyou.info

This post follows the post linked above.
Now we know there are 4 subjects and 2 measures.

The 4 subjects are BEEF, PIG, POULTRY and SHEEP.

The 2 measures are KG_CAP and THND_TONNE.
So, we have 8 combinations for them.

Let's make dataframes for each combination.

BEEF and KG_CAP

f:id:cross_hyou:20210913202537p:plain

BEEF and THND_TONNE

f:id:cross_hyou:20210913202626p:plain

PIG and KG_CAP

f:id:cross_hyou:20210913202707p:plain

PIG and THND_TONNE

f:id:cross_hyou:20210913202753p:plain

POULTRY and KG_CAP

f:id:cross_hyou:20210913202833p:plain

POULTRY and THND_TONNE

f:id:cross_hyou:20210913202918p:plain

SHEEP and KG_CAP

f:id:cross_hyou:20210913203014p:plain

SHEEP and THND_TONNE

f:id:cross_hyou:20210913203106p:plain

Then, I use the inner_join() function to merge those 8 data frames into one data frame.

f:id:cross_hyou:20210913203223p:plain
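The steps above can be sketched with filter(), select() and inner_join(). This toy version builds only two of the eight data frames. The column names (coun, subj, meas, year, valu) follow this series, while joining on coun and year and all the values are my assumptions.

```r
library(dplyr)

# Toy long-format data: 2 countries x 2 subjects x 2 measures, one year.
long <- expand.grid(
  coun = c("AAA", "BBB"),
  subj = c("BEEF", "PIG"),
  meas = c("KG_CAP", "THND_TONNE"),
  year = 2000,
  stringsAsFactors = FALSE
)
long$valu <- 1:8  # made-up values

# One data frame per (subject, measure) combination, with valu renamed.
bekg <- long %>%
  filter(subj == "BEEF", meas == "KG_CAP") %>%
  select(coun, year, bekg = valu)
pikg <- long %>%
  filter(subj == "PIG", meas == "KG_CAP") %>%
  select(coun, year, pikg = valu)

# Merge them into one wide data frame keyed by country and year.
wide <- inner_join(bekg, pikg, by = c("coun", "year"))
print(wide)
```

Repeating the join for the remaining combinations gives one column per subject and measure.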

All right, let's use summary() function to see summary statistics.

f:id:cross_hyou:20210913203456p:plain

Variable names ending in kg are KG_CAP, which is per capita consumption, and variable names ending in th are THND_TONNE, which is total consumption.

Let's see bekg: BEEF KG_CAP, pikg: PIG KG_CAP, pokg: POULTRY KG_CAP and shkg: SHEEP KG_CAP.

pikg has the largest mean, so we find PIG is the most widely eaten meat.

That's it. Thank you!

Next post is

www.crosshyou.info

To read the 1st post,

www.crosshyou.info

2021-09-12

OECD Meat Consumption Data Analysis 1 - Using R to read CSV data with read_csv() function.

Data_Analysis

f:id:cross_hyou:20210912113233j:plain

Photo by Wolfgang Hasselmann on Unsplash

Hello! In this post, I will analyze OECD Meat Consumption data using R.

f:id:cross_hyou:20210912113432p:plain

I got a CSV file like the one below from the OECD web site (Agricultural output - Meat consumption - OECD Data).

f:id:cross_hyou:20210912113633p:plain

First, I load the tidyverse package, which is a very useful package for data science.

f:id:cross_hyou:20210912113944p:plain

Let's read the CSV file with read_csv() function.

f:id:cross_hyou:20210912114206p:plain
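Here is a self-contained sketch of read_csv() that writes a tiny CSV to a temporary file first. The short column names follow this series, but the file contents are made up, and I do not know the exact arguments used in the screenshot.

```r
library(readr)

# Write a tiny made-up CSV to a temporary file so the example can run.
tmp <- tempfile(fileext = ".csv")
writeLines(c(
  "coun,indi,subj,meas,freq,year,valu",
  "AAA,MEATCONSUMP,BEEF,KG_CAP,A,1990,10.5",
  "AAA,MEATCONSUMP,BEEF,THND_TONNE,A,1990,250"
), tmp)

# col_types uses the compact spec: c = character, i = integer, d = double.
df <- read_csv(tmp, col_types = "cccccid")
print(df)
```

Without col_types, read_csv() guesses the column types from the data.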

Let's look at each variable.

coun is country name.

f:id:cross_hyou:20210912114348p:plain

We see that every country has 320 observations.

indi is indicator.

f:id:cross_hyou:20210912114527p:plain

indi has only one value, MEATCONSUMP, so we do not need to keep this variable in our data frame.

f:id:cross_hyou:20210912114715p:plain
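Checking for a constant column and dropping it can be sketched as below, on toy data with made-up values. The name df_small is my own.

```r
library(dplyr)

# Toy data with a constant indi column.
df <- tibble(
  coun = c("AAA", "AAA", "BBB"),
  indi = "MEATCONSUMP",
  valu = c(10.5, 250, 7.2)
)

# count() shows how many rows each value of indi has.
print(count(df, indi))

# A column with a single distinct value carries no information, so drop it.
df_small <- select(df, -indi)
print(names(df_small))
```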

subj means subject.

f:id:cross_hyou:20210912114859p:plain

subj has 4 values: SHEEP, POULTRY, PIG and BEEF. Each subj has 3040 observations. I convert subj to a factor.

f:id:cross_hyou:20210912115116p:plain
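A quick sketch of the conversion with base R's factor(), on a made-up character vector:

```r
# A character vector with the four subject values (order is arbitrary).
subj <- c("SHEEP", "POULTRY", "PIG", "BEEF", "PIG")

# factor() stores the distinct values as levels, sorted alphabetically
# by default.
subj_f <- factor(subj)
print(levels(subj_f))

# table() then counts the observations per level.
print(table(subj_f))
```

With a factor, summary() shows counts per level instead of just the length and class of a character vector.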

meas is measure.

f:id:cross_hyou:20210912115243p:plain

meas has two values, THND_TONNE and KG_CAP. I convert meas to a factor.

f:id:cross_hyou:20210912115528p:plain

freq is frequency.

f:id:cross_hyou:20210912115703p:plain

freq has only one value, A, so I will remove freq from my data frame.

f:id:cross_hyou:20210912115829p:plain

year is year.

f:id:cross_hyou:20210912120001p:plain

The oldest year is 1990 and the newest year is 2029! Maybe this data frame includes estimated data.

valu is meat consumption value.

f:id:cross_hyou:20210912120248p:plain

The median is 24.77 and the mean is 2291.58, so I see that the data are skewed.

Let's see summary of my data frame.

f:id:cross_hyou:20210912120531p:plain

That's it!
Thank you!

Next post is

www.crosshyou.info

2021-09-12

Data analysis of average working time by prefecture 7 - Women's average working time gets shorter as per capita prefectural income rises.

Data analysis

f:id:cross_hyou:20210912083231j:plain

Photo by Xavier von Erlach on Unsplash

www.crosshyou.info

This post continues from the post linked above.

This time, let's run a regression analysis on male_m: men's average working time (minutes).

f:id:cross_hyou:20210912083929p:plain

The p-value is 0.703, so the model is not significant.

Let's simplify the model with the step() function.

f:id:cross_hyou:20210912084222p:plain

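Only female_m: women's average working time (minutes) barely remained. However, the p-value of the female_m coefficient is 0.112, so it is not significant even at the 10% level.

In other words, men's average working time is unrelated to factors such as per capita prefectural income, the share of employees working at establishments with 300 or more people, the difference between eastern and western Japan, or being a large prefecture.

Next, let's run a regression analysis on female_m: women's average working time (minutes).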

f:id:cross_hyou:20210912084740p:plain

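The p-value of this model is 2.937e-05, so the model is significant.

big6 and nosea are significant at the 5% level or below, and inc, male, female and male_m are significant at the 10% level or below.

I use the coef() and round() functions to display the coefficients in an easy-to-read form.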

f:id:cross_hyou:20210912085543p:plain
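As a rough sketch of this step, here is lm() followed by round(coef()) on synthetic data. The variable names inc, big6 and female_m follow this post, but the data are simulated, the model has fewer regressors than the real one, and the numbers used to generate the data are arbitrary choices of mine.

```r
set.seed(42)

# Synthetic data: 47 rows, 6 of them flagged as big prefectures.
n <- 47
toy <- data.frame(
  inc  = runif(n, 2000, 4500),          # income in thousand yen (simulated)
  big6 = rep(c(1, 0), times = c(6, 41)) # dummy for the six big prefectures
)

# Simulated outcome with arbitrary "true" effects plus noise.
toy$female_m <- 300 - 0.016 * toy$inc - 35 * toy$big6 + rnorm(n, sd = 5)

# Fit the linear model and show the coefficients rounded to 4 digits.
model <- lm(female_m ~ inc + big6, data = toy)
coefs <- round(coef(model), 4)
print(coefs)
```

Rounding makes it easier to read coefficients that differ by several orders of magnitude, such as a dummy effect in minutes and an effect per thousand yen.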

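The big6 coefficient shows that women in Tokyo, Kanagawa, Chiba, Saitama, Aichi and Osaka work about 35 minutes less on average, other things being equal.

The nosea coefficient shows that in landlocked prefectures the average working time is about 11 minutes shorter, other things being equal.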

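The inc coefficient shows that when income rises by 1 thousand yen, the average working time gets 0.01643 minutes shorter, other things being equal. An increase of 1 million yen makes it about 16 minutes shorter. The minimum of inc is 2002, which is about 2 million yen, and the maximum is 4525, which is about 4.5 million yen, so the lowest and highest income levels differ by more than 30 minutes.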
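The arithmetic can be checked with a few lines of R. The coefficient and the minimum and maximum of inc are the values quoted in this post; the variable names are my own.

```r
# Coefficient on inc: minutes of working time per 1 thousand yen of income.
b_inc <- -0.01643

# Effect of a 1 million yen (= 1000 thousand yen) increase in income.
per_million <- b_inc * 1000
print(per_million)  # -16.43

# Implied gap between the lowest and highest income levels.
inc_min <- 2002
inc_max <- 4525
gap <- b_inc * (inc_max - inc_min)
print(round(gap, 2))  # -41.45
```

Applied to the full range quoted here, the coefficient implies a gap of roughly 41 minutes.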

To summarize, men's average working time is not affected by the other factors,

while women's average working time is affected by external factors.

That's all for this time.

To read from the 1st post,

www.crosshyou.info