Loading and exploring the dataset
Loading and examining mpi_disaggregation_by_ethnic_racial_caste_groups.xlsx, the Multidimensional Poverty Index (MPI) disaggregated by ethnic, racial, and caste groups.
Importing libraries and packages
# Mathematical operations and data manipulation
import pandas as pd
Set paths
# Path to datasets directory
data_path = "./datasets"
# Path to assets directory (for saving results)
assets_path = "./assets"
Loading dataset
dataset = pd.read_excel(
    f"{data_path}/mpi_disaggregation_by_ethnic_racial_caste_groups.xlsx",
    header=[0, 1],
    skiprows=2,
)  # noqa
dataset.columns = dataset.columns.to_flat_index()
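Reading with `header=[0, 1]` produces a two-level column index, and `to_flat_index()` collapses it into a single level of tuples. The following is a minimal sketch of that step on a toy frame; the `toy` frame and its single row are stand-ins built for illustration, not the spreadsheet itself.

```python
import pandas as pd

# Toy two-level header that mimics a few of the spreadsheet's columns
# (assumption: a stand-in for the real file, which is not read here)
columns = pd.MultiIndex.from_tuples(
    [
        ("Unnamed: 0_level_0", "Country"),
        ("Multidimensional Poverty Index (MPI)", "Value for the country"),
        ("Region", "Unnamed: 47_level_1"),
    ]
)
toy = pd.DataFrame(
    [["Belize", 0.017109, "Latin America and the Caribbean"]],
    columns=columns,
)

# Collapse the MultiIndex into a flat Index whose labels are (level 0, level 1) tuples
toy.columns = toy.columns.to_flat_index()
flat_labels = list(toy.columns)
print(flat_labels)
```

Each label is now a plain tuple, which is why the column names in the outputs that follow appear as parenthesized pairs.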
Wrangling
dataset.head()
| | (Unnamed: 0_level_0, Country) | (Unnamed: 1_level_0, Country) | (Type of survey, Unnamed: 2_level_1) | (Type of survey, Unnamed: 3_level_1) | (Survey year, Unnamed: 4_level_1) | (Survey year, Unnamed: 5_level_1) | (Ethnic/racial/caste group, Unnamed: 6_level_1) | (Ethnic/racial/caste group, Unnamed: 7_level_1) | (Multidimensional Poverty Index (MPI), Value for the country) | (Multidimensional Poverty Index (MPI), Value for the country.1) | ... | (Contribution of deprivation in indicator to overall multidimensional poverty, Electricity) | (Contribution of deprivation in indicator to overall multidimensional poverty, Electricity.1) | (Contribution of deprivation in indicator to overall multidimensional poverty, Housing) | (Contribution of deprivation in indicator to overall multidimensional poverty, Housing.1) | (Contribution of deprivation in indicator to overall multidimensional poverty, Assets) | (a, Unnamed: 43_level_1) | (Population share by ethnic/racial/caste group, Unnamed: 44_level_1) | (Population size by ethnic/racial/ caste group, Unnamed: 45_level_1) | (Population size, 2019) | (Region, Unnamed: 47_level_1) |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | (%) | NaN | (%) | NaN | (%) | NaN | (%) | (thousands) | (thousands) | NaN |
1 | Bangladesh | NaN | MICS | NaN | 2019 | NaN | Bengali | NaN | 0.104060 | NaN | ... | 2.308714 | NaN | 12.478603 | NaN | 8.562996 | NaN | 98.809242 | 161104.688057 | 163046.173 | South Asia |
2 | Bangladesh | NaN | MICS | NaN | 2019 | NaN | Other | NaN | 0.104060 | NaN | ... | 8.593331 | NaN | 11.53615 | NaN | 10.738271 | NaN | 1.190756 | 1941.482818 | 163046.173 | South Asia |
3 | Belize | NaN | MICS | NaN | 2015/2016 | NaN | Creole | NaN | 0.017109 | NaN | ... | 3.383591 | NaN | 6.162911 | NaN | 4.409921 | NaN | 22.916001 | 89.452839 | 390.351 | Latin America and the Caribbean |
4 | Belize | NaN | MICS | NaN | 2015/2016 | NaN | Garifuna | NaN | 0.017109 | NaN | ... | 2.96302 | NaN | 2.96302 | NaN | 2.96302 | NaN | 5.251431 | 20.499014 | 390.351 | Latin America and the Caribbean |
5 rows × 48 columns
dataset.shape
(322, 48)
dataset.dtypes
(Unnamed: 0_level_0, Country) object
(Unnamed: 1_level_0, Country) float64
(Type of survey, Unnamed: 2_level_1) object
(Type of survey, Unnamed: 3_level_1) float64
(Survey year, Unnamed: 4_level_1) object
(Survey year, Unnamed: 5_level_1) float64
(Ethnic/racial/caste group, Unnamed: 6_level_1) object
(Ethnic/racial/caste group, Unnamed: 7_level_1) float64
(Multidimensional Poverty Index (MPI), Value for the country) float64
(Multidimensional Poverty Index (MPI), Value for the country.1) object
(Multidimensional Poverty Index (MPI), Value for the ethnic/racial/caste group) float64
(a, Unnamed: 11_level_1) object
(Headcount, Unnamed: 12_level_1) object
(a, Unnamed: 13_level_1) object
(Number of multidimen-sionally poor people by ethnic/racial/ caste group, Unnamed: 14_level_1) object
(a, Unnamed: 15_level_1) object
(Intensity of deprivation, Unnamed: 16_level_1) object
(a, Unnamed: 17_level_1) object
(Contribution of deprivation in dimension to overall multidimensional poverty, Health) object
(Contribution of deprivation in dimension to overall multidimensional poverty, Health.1) object
(Contribution of deprivation in dimension to overall multidimensional poverty, Education) object
(Contribution of deprivation in dimension to overall multidimensional poverty, Education.1) object
(Contribution of deprivation in dimension to overall multidimensional poverty, Standard of living) object
(a, Unnamed: 23_level_1) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Nutrition) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Nutrition.1) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Child mortality) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Child mortality.1) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Years of schooling) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Years of schooling.1) object
(Contribution of deprivation in indicator to overall multidimensional poverty, School attendance) object
(Contribution of deprivation in indicator to overall multidimensional poverty, School attendance.1) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Cooking fuel) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Cooking fuel.1) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Sanitation) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Sanitation.1) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Drinking water) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Drinking water.1) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Electricity) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Electricity.1) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Housing) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Housing.1) object
(Contribution of deprivation in indicator to overall multidimensional poverty, Assets) object
(a, Unnamed: 43_level_1) object
(Population share by ethnic/racial/caste group, Unnamed: 44_level_1) object
(Population size by ethnic/racial/ caste group, Unnamed: 45_level_1) object
(Population size, 2019) object
(Region, Unnamed: 47_level_1) object
dtype: object
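Most columns come back as `object` because row 0 of the preview holds unit strings such as `(%)` and `(thousands)` rather than data. The sketch below shows one possible way to handle that on a toy frame: drop the units row, drop columns that contain no values at all, and convert what remains to numbers. It is not a step taken in this notebook; the `toy` frame and the `spacer` column name are hypothetical, and the two numeric values are copied from the preview above.

```python
import numpy as np
import pandas as pd

# Toy frame shaped like the loaded sheet: a units row first, plus an empty spacer column
# (assumption: "spacer" is a hypothetical name for one of the all-NaN columns)
toy = pd.DataFrame(
    {
        "Country": [np.nan, "Bangladesh", "Belize"],
        "spacer": [np.nan, np.nan, np.nan],
        "Population share by group": ["(%)", 98.809242, 22.916001],
    }
)

cleaned = (
    toy.drop(index=0)  # remove the units row
    .dropna(axis="columns", how="all")  # remove columns with no values at all
    .reset_index(drop=True)
)

# With the unit string gone, the column can be parsed as numeric
cleaned["Population share by group"] = pd.to_numeric(
    cleaned["Population share by group"]
)
print(cleaned.dtypes)
```

On the real sheet, some of the seemingly empty columns are typed `object` rather than `float64`, so they may hold occasional footnote markers and would need inspecting before being dropped.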
columns = dataset.columns
print(columns)
Index([ ('Unnamed: 0_level_0', 'Country'),
('Unnamed: 1_level_0', 'Country'),
('Type of survey', 'Unnamed: 2_level_1'),
('Type of survey', 'Unnamed: 3_level_1'),
('Survey year', 'Unnamed: 4_level_1'),
('Survey year', 'Unnamed: 5_level_1'),
('Ethnic/racial/caste group', 'Unnamed: 6_level_1'),
('Ethnic/racial/caste group', 'Unnamed: 7_level_1'),
('Multidimensional Poverty Index (MPI)', 'Value for the country'),
('Multidimensional Poverty Index (MPI)', 'Value for the country.1'),
('Multidimensional Poverty Index (MPI)', 'Value for the ethnic/racial/caste group'),
('a', 'Unnamed: 11_level_1'),
('Headcount', 'Unnamed: 12_level_1'),
('a', 'Unnamed: 13_level_1'),
('Number of multidimen-sionally poor people by ethnic/racial/ caste group', 'Unnamed: 14_level_1'),
('a', 'Unnamed: 15_level_1'),
('Intensity of deprivation', 'Unnamed: 16_level_1'),
('a', 'Unnamed: 17_level_1'),
('Contribution of deprivation in dimension to overall multidimensional poverty', 'Health'),
('Contribution of deprivation in dimension to overall multidimensional poverty', 'Health.1'),
('Contribution of deprivation in dimension to overall multidimensional poverty', 'Education'),
('Contribution of deprivation in dimension to overall multidimensional poverty', 'Education.1'),
('Contribution of deprivation in dimension to overall multidimensional poverty', 'Standard of living'),
('a', 'Unnamed: 23_level_1'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Nutrition'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Nutrition.1'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Child mortality'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Child mortality.1'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Years of schooling'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Years of schooling.1'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'School attendance'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'School attendance.1'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Cooking fuel'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Cooking fuel.1'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Sanitation'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Sanitation.1'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Drinking water'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Drinking water.1'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Electricity'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Electricity.1'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Housing'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Housing.1'),
('Contribution of deprivation in indicator to overall multidimensional poverty', 'Assets'),
('a', 'Unnamed: 43_level_1'),
('Population share by ethnic/racial/caste group', 'Unnamed: 44_level_1'),
('Population size by ethnic/racial/ caste group', 'Unnamed: 45_level_1'),
('Population size', 2019),
('Region', 'Unnamed: 47_level_1')],
dtype='object')
dataset.columns = [
    "Country",
    "Country",
    "Type of survey",
    "Type of survey",
    "Survey year",
    "Survey year",
    "Ethnic/racial/caste group",
    "Ethnic/racial/caste group",
    "MPI: Value for the country",
    "MPI: Value for the country",
    "MPI: Value for the group",
    "a",
    "Headcount (%)",
    "a",
    "Number of multidimensionally poor people by group (thousands)",
    "a",
    "Intensity of deprivation (%)",
    "a",
    "Health (%)",
    "Health",
    "Education (%)",
    "Education",
    "Standard of living (%)",
    "a",
    "Nutrition (%)",
    "Nutrition",
    "Child mortality (%)",
    "Child mortality",
    "Years of schooling (%)",
    "Years of schooling",
    "School attendance (%)",
    "School attendance",
    "Cooking fuel (%)",
    "Cooking fuel",
    "Sanitation (%)",
    "Sanitation",
    "Drinking water (%)",
    "Drinking water",
    "Electricity (%)",
    "Electricity",
    "Housing (%)",
    "Housing",
    "Assets (%)",
    "a",
    "Population share by group (%)",
    "Population size by group (thousands)",
    "Population size (thousands)",
    "Region",
]
dataset.head()
| | Country | Country | Type of survey | Type of survey | Survey year | Survey year | Ethnic/racial/caste group | Ethnic/racial/caste group | MPI: Value for the country | MPI: Value for the country | ... | Electricity (%) | Electricity | Housing (%) | Housing | Assets (%) | a | Population share by group (%) | Population size by group (thousands) | Population size (thousands) | Region |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | ... | (%) | NaN | (%) | NaN | (%) | NaN | (%) | (thousands) | (thousands) | NaN |
1 | Bangladesh | NaN | MICS | NaN | 2019 | NaN | Bengali | NaN | 0.104060 | NaN | ... | 2.308714 | NaN | 12.478603 | NaN | 8.562996 | NaN | 98.809242 | 161104.688057 | 163046.173 | South Asia |
2 | Bangladesh | NaN | MICS | NaN | 2019 | NaN | Other | NaN | 0.104060 | NaN | ... | 8.593331 | NaN | 11.53615 | NaN | 10.738271 | NaN | 1.190756 | 1941.482818 | 163046.173 | South Asia |
3 | Belize | NaN | MICS | NaN | 2015/2016 | NaN | Creole | NaN | 0.017109 | NaN | ... | 3.383591 | NaN | 6.162911 | NaN | 4.409921 | NaN | 22.916001 | 89.452839 | 390.351 | Latin America and the Caribbean |
4 | Belize | NaN | MICS | NaN | 2015/2016 | NaN | Garifuna | NaN | 0.017109 | NaN | ... | 2.96302 | NaN | 2.96302 | NaN | 2.96302 | NaN | 5.251431 | 20.499014 | 390.351 | Latin America and the Caribbean |
5 rows × 48 columns
dataset.to_csv(
    f"{data_path}/changed_columns_mpi_disagg_by_groups.csv", index=False
)
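The renamed frame contains repeated labels (for example `Country` twice and `a` several times), which affects how the saved CSV reads back. The sketch below illustrates this on a toy frame written to an in-memory buffer instead of a file; `toy` and `buffer` are illustrative names. By default, `pd.read_csv` makes repeated header names unique by appending a numeric suffix.

```python
import io

import pandas as pd

# Toy frame with a repeated column label, as in the renamed dataset
toy = pd.DataFrame([["Belize", None, "MICS"]])
toy.columns = ["Country", "Country", "Type of survey"]

# Flag labels that repeat an earlier one
has_duplicates = toy.columns.duplicated().any()

# Round-trip through CSV using an in-memory buffer (a stand-in for the file on disk)
buffer = io.StringIO()
toy.to_csv(buffer, index=False)
buffer.seek(0)
reloaded = pd.read_csv(buffer)

# The second "Country" is renamed on read
reloaded_labels = list(reloaded.columns)
print(reloaded_labels)
```

Any later notebook that loads the CSV should therefore expect suffixed names such as `Country.1` and not the duplicated labels assigned here.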