Generated by Bing Image Creator: Closeup flowering blue, yellow, pink and red roses. Background is natural high mountains dark night sky and a nebula, photo
In this series of posts, I will analyze the "Lead time to import" data from World Bank Data using R.
Lead time to import, median case (days) | Data (worldbank.org)
I downloaded two CSV files.
One is the data file and the other is the metadata file.
The data file looks like the screenshot below.
The metadata file looks like the screenshot below.
First, I load the tidyverse package.
Then, I use the read_csv() function to load the CSV file.
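The loading step might look like the sketch below. The inline CSV text is a made-up stand-in for the downloaded data file, whose name is not shown in this post, and the column names are assumptions. If the real download has extra lines above the column header, the skip argument of read_csv() would be needed as well.

```r
library(tidyverse)

# Toy stand-in for the downloaded data file (values are made up).
# With a real file, pass its path instead of I(csv_text).
csv_text <- 'Country Name,Country Code,Indicator Name,2007,2018
Japan,JPN,Lead time to import,3,2
World,WLD,Lead time to import,,
'

# I() tells read_csv() that the string is literal data, not a file path.
df_raw <- read_csv(I(csv_text), show_col_types = FALSE)
df_raw
```

Empty cells are read as NA, which matters for the later step that removes incomplete rows.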
Since this data frame is not tidy (each year is a separate column), I will convert it to a tidy data frame with the pivot_longer() function.
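Here is a minimal sketch of the reshaping step on a toy table. The column names and values are assumptions for illustration. The four-digit year columns are gathered into a year column and a value column.

```r
library(tidyverse)

# Toy wide table: one row per country, one column per year (values are made up).
df_wide <- tibble(
  `Country Name` = c("Japan", "Brazil"),
  `Country Code` = c("JPN", "BRA"),
  `2007` = c(3, NA),
  `2018` = c(2, 5)
)

# Gather every column whose name is a four-digit year.
# The old column names go to "year", the cell values go to "lttl".
df_long <- df_wide %>%
  pivot_longer(
    cols = matches("^\\d{4}$"),
    names_to = "year",
    values_to = "lttl"
  )
df_long
```

Note that year comes out as a character column, because it is built from column names.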
To make the later workflow easier, I will rename the variables and remove unnecessary ones.
Then, I will remove rows that include NA.
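These two steps could be sketched as follows on toy data. Which columns to keep is my assumption here. select() can rename and drop columns in one call, and drop_na() removes every row that has an NA in any remaining column.

```r
library(tidyverse)

# Toy long table (values are made up; column names are assumptions).
df_long <- tibble(
  `Country Name` = c("Japan", "Japan", "Brazil", "Brazil"),
  `Country Code` = c("JPN", "JPN", "BRA", "BRA"),
  `Indicator Name` = "Lead time to import, median case (days)",
  year = c("2007", "2018", "2007", "2018"),
  lttl = c(3, 2, NA, 5)
)

df_clean <- df_long %>%
  # Keep only the needed columns, renaming `Country Code` to code.
  select(code = `Country Code`, year, lttl) %>%
  # Drop rows with an NA in any remaining column.
  drop_na()
df_clean
```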
Next, I load the metadata.
I rename the variables and remove unnecessary ones.
Then, I omit rows that include NA.
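The metadata cleaning might look like this sketch. The toy table and its column names (Region, IncomeGroup, SpecialNotes) are assumptions, not the real file. The order matters: selecting columns first means an NA in a dropped column such as a notes field does not remove an otherwise complete row.

```r
library(tidyverse)

# Toy metadata table (contents are made up for illustration).
meta_raw <- tibble(
  `Country Code` = c("JPN", "BRA", "WLD"),
  Region = c("East Asia & Pacific", "Latin America & Caribbean", NA),
  IncomeGroup = c("High income", "Upper middle income", NA),
  SpecialNotes = c(NA, NA, "Aggregate row")
)

meta <- meta_raw %>%
  # Rename and keep only the columns used later.
  select(code = `Country Code`, region = Region, group = IncomeGroup) %>%
  # Rows without a region or group are removed here.
  drop_na()
meta
```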
So far, I have two tidy data frames. I will merge them with the inner_join() function, which keeps only the rows whose key appears in both.
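A minimal sketch of the merge, assuming both data frames share a code column as the key. The toy values, including the code "XXX", are made up to show that unmatched rows are dropped.

```r
library(tidyverse)

# Toy versions of the two tidy data frames (values are made up).
df_clean <- tibble(
  code = c("JPN", "JPN", "BRA", "XXX"),
  year = c("2007", "2018", "2018", "2018"),
  lttl = c(3, 2, 5, 4)
)
meta <- tibble(
  code = c("JPN", "BRA"),
  region = c("East Asia & Pacific", "Latin America & Caribbean"),
  group = c("High income", "Upper middle income")
)

# Keep only rows whose code exists in both data frames.
df <- inner_join(df_clean, meta, by = "code")
df
```

The row with code "XXX" has no match in meta, so it does not appear in the result.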
Looking at the screenshot above, I found that year is <chr>, so I will change it to numeric. I will also change code, region and group to factors.
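The type conversion could be written as in the sketch below, using a toy data frame with the same column names. Using across() for the three factor columns is my choice here, and three separate as.factor() calls would work just as well.

```r
library(tidyverse)

# Toy merged data frame with all text columns as character (values are made up).
df <- tibble(
  code = c("JPN", "JPN", "BRA"),
  year = c("2007", "2018", "2018"),
  lttl = c(3, 2, 5),
  region = c("East Asia & Pacific", "East Asia & Pacific", "Latin America & Caribbean"),
  group = c("High income", "High income", "Upper middle income")
)

df <- df %>%
  mutate(
    year = as.numeric(year),                     # "2007" -> 2007
    across(c(code, region, group), as.factor)    # character -> factor
  )
glimpse(df)
```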
All right.
Next, let's use the skimr::skim() function to get summary statistics.
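A minimal sketch of the call on a toy data frame, assuming the skimr package is installed. The values are made up, so the output will not match the numbers from the real data.

```r
library(tidyverse)
library(skimr)

# Toy data frame with the same column types as the merged data (values are made up).
df <- tibble(
  code = factor(c("JPN", "JPN", "BRA")),
  year = c(2007, 2018, 2018),
  lttl = c(3, 2, 5),
  region = factor(c("East Asia & Pacific", "East Asia & Pacific", "Latin America & Caribbean")),
  group = factor(c("High income", "High income", "Upper middle income"))
)

# skim() reports missing counts for every column, the number of unique levels
# for factors, and quantiles (including min and max) for numeric columns.
summary_tbl <- skim(df)
summary_tbl
```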
I see there are no NAs. There are 153 codes, 7 regions and 4 groups. The minimum year is 2007 and the maximum year is 2018. The minimum lttl (lead time to import) is 1 and the maximum lttl is 81.
That's it. Thank you!
Next post is,