Loading and exploring the dataset

This section loads and inspects mpi_disaggregation_by_ethnic_racial_caste_groups.xlsx, the Multidimensional Poverty Index (MPI) table disaggregated by ethnic, racial and caste groups.

Importing libraries and packages

# Mathematical operations and data manipulation
import pandas as pd

Set paths

# Path to datasets directory
data_path = "./datasets"
# Path to assets directory (for saving results to)
assets_path = "./assets"

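Loading dataset

# skiprows=2 skips the first two rows of the sheet;
# header=[0, 1] reads the next two rows as a two-level column header.
dataset = pd.read_excel(
    f"{data_path}/mpi_disaggregation_by_ethnic_racial_caste_groups.xlsx",
    header=[0, 1],
    skiprows=2,
)  # noqa
# Flatten the two-level header into a single level of (level 0, level 1) tuples
dataset.columns = dataset.columns.to_flat_index()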
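
To see what the `to_flat_index()` call does, here is a minimal sketch on a toy frame. The labels echo the spreadsheet header, but the frame and its single row are invented for illustration and are not taken from the dataset.

```python
import pandas as pd

# Toy two-level header, loosely modelled on the spreadsheet (values are made up).
columns = pd.MultiIndex.from_tuples(
    [
        ("Unnamed: 0_level_0", "Country"),
        ("Multidimensional Poverty Index (MPI)", "Value for the country"),
        ("Region", "Unnamed: 2_level_1"),
    ]
)
toy = pd.DataFrame([["Belize", 0.017, "Latin America and the Caribbean"]], columns=columns)

# to_flat_index() returns a plain Index whose elements are tuples,
# one tuple per column, instead of a MultiIndex.
toy.columns = toy.columns.to_flat_index()
flat_labels = list(toy.columns)
print(flat_labels)
```

Each column is now addressed by a single tuple label, which is why the outputs in the next section show column names such as `(Region, Unnamed: 47_level_1)`.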

Wrangling

dataset.head()
(Unnamed: 0_level_0, Country) (Unnamed: 1_level_0, Country) (Type of survey, Unnamed: 2_level_1) (Type of survey, Unnamed: 3_level_1) (Survey year, Unnamed: 4_level_1) (Survey year, Unnamed: 5_level_1) (Ethnic/racial/caste group, Unnamed: 6_level_1) (Ethnic/racial/caste group, Unnamed: 7_level_1) (Multidimensional Poverty Index (MPI), Value for the country) (Multidimensional Poverty Index (MPI), Value for the country.1) ... (Contribution of deprivation in indicator to overall multidimensional poverty, Electricity) (Contribution of deprivation in indicator to overall multidimensional poverty, Electricity.1) (Contribution of deprivation in indicator to overall multidimensional poverty, Housing) (Contribution of deprivation in indicator to overall multidimensional poverty, Housing.1) (Contribution of deprivation in indicator to overall multidimensional poverty, Assets) (a, Unnamed: 43_level_1) (Population share by ethnic/racial/caste group, Unnamed: 44_level_1) (Population size by ethnic/racial/ caste group, Unnamed: 45_level_1) (Population size, 2019) (Region, Unnamed: 47_level_1)
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... (%) NaN (%) NaN (%) NaN (%) (thousands) (thousands) NaN
1 Bangladesh NaN MICS NaN 2019 NaN Bengali NaN 0.104060 NaN ... 2.308714 NaN 12.478603 NaN 8.562996 NaN 98.809242 161104.688057 163046.173 South Asia
2 Bangladesh NaN MICS NaN 2019 NaN Other NaN 0.104060 NaN ... 8.593331 NaN 11.53615 NaN 10.738271 NaN 1.190756 1941.482818 163046.173 South Asia
3 Belize NaN MICS NaN 2015/2016 NaN Creole NaN 0.017109 NaN ... 3.383591 NaN 6.162911 NaN 4.409921 NaN 22.916001 89.452839 390.351 Latin America and the Caribbean
4 Belize NaN MICS NaN 2015/2016 NaN Garifuna NaN 0.017109 NaN ... 2.96302 NaN 2.96302 NaN 2.96302 NaN 5.251431 20.499014 390.351 Latin America and the Caribbean

5 rows × 48 columns

dataset.shape
(322, 48)
dataset.dtypes
(Unnamed: 0_level_0, Country)                                                                            object
(Unnamed: 1_level_0, Country)                                                                           float64
(Type of survey, Unnamed: 2_level_1)                                                                     object
(Type of survey, Unnamed: 3_level_1)                                                                    float64
(Survey year, Unnamed: 4_level_1)                                                                        object
(Survey year, Unnamed: 5_level_1)                                                                       float64
(Ethnic/racial/caste group, Unnamed: 6_level_1)                                                          object
(Ethnic/racial/caste group, Unnamed: 7_level_1)                                                         float64
(Multidimensional Poverty Index (MPI), Value for the country)                                           float64
(Multidimensional Poverty Index (MPI), Value for the country.1)                                          object
(Multidimensional Poverty Index (MPI), Value for the ethnic/racial/caste group)                         float64
(a, Unnamed: 11_level_1)                                                                                 object
(Headcount, Unnamed: 12_level_1)                                                                         object
(a, Unnamed: 13_level_1)                                                                                 object
(Number of multidimen-sionally poor people by ethnic/racial/ caste group, Unnamed: 14_level_1)           object
(a, Unnamed: 15_level_1)                                                                                 object
(Intensity of deprivation, Unnamed: 16_level_1)                                                          object
(a, Unnamed: 17_level_1)                                                                                 object
(Contribution of deprivation in dimension to overall multidimensional poverty, Health)                   object
(Contribution of deprivation in dimension to overall multidimensional poverty, Health.1)                 object
(Contribution of deprivation in dimension to overall multidimensional poverty, Education)                object
(Contribution of deprivation in dimension to overall multidimensional poverty, Education.1)              object
(Contribution of deprivation in dimension to overall multidimensional poverty, Standard of living)       object
(a, Unnamed: 23_level_1)                                                                                 object
(Contribution of deprivation in indicator to overall multidimensional poverty, Nutrition)                object
(Contribution of deprivation in indicator to overall multidimensional poverty, Nutrition.1)              object
(Contribution of deprivation in indicator to overall multidimensional poverty, Child mortality)          object
(Contribution of deprivation in indicator to overall multidimensional poverty, Child mortality.1)        object
(Contribution of deprivation in indicator to overall multidimensional poverty, Years of schooling)       object
(Contribution of deprivation in indicator to overall multidimensional poverty, Years of schooling.1)     object
(Contribution of deprivation in indicator to overall multidimensional poverty, School attendance)        object
(Contribution of deprivation in indicator to overall multidimensional poverty, School attendance.1)      object
(Contribution of deprivation in indicator to overall multidimensional poverty, Cooking fuel)             object
(Contribution of deprivation in indicator to overall multidimensional poverty, Cooking fuel.1)           object
(Contribution of deprivation in indicator to overall multidimensional poverty, Sanitation)               object
(Contribution of deprivation in indicator to overall multidimensional poverty, Sanitation.1)             object
(Contribution of deprivation in indicator to overall multidimensional poverty, Drinking water)           object
(Contribution of deprivation in indicator to overall multidimensional poverty, Drinking water.1)         object
(Contribution of deprivation in indicator to overall multidimensional poverty, Electricity)              object
(Contribution of deprivation in indicator to overall multidimensional poverty, Electricity.1)            object
(Contribution of deprivation in indicator to overall multidimensional poverty, Housing)                  object
(Contribution of deprivation in indicator to overall multidimensional poverty, Housing.1)                object
(Contribution of deprivation in indicator to overall multidimensional poverty, Assets)                   object
(a, Unnamed: 43_level_1)                                                                                 object
(Population share by ethnic/racial/caste group, Unnamed: 44_level_1)                                     object
(Population size by ethnic/racial/ caste group, Unnamed: 45_level_1)                                     object
(Population size, 2019)                                                                                  object
(Region, Unnamed: 47_level_1)                                                                            object
dtype: object
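
Most columns are reported as `object` even though they hold numbers. Judging from the `head()` output, a likely cause is that row 0 carries unit labels such as `(%)` and `(thousands)`, and several columns appear to be empty spacers. The sketch below shows one possible cleanup on a toy frame. The column names `Headcount` and `Spacer` and all values are hypothetical, and this step is not part of the original notebook.

```python
import pandas as pd

# Toy frame mimicking the loaded sheet: row 0 carries unit labels, not data.
# All names and values here are invented for illustration.
toy = pd.DataFrame(
    {
        "Country": [None, "Bangladesh", "Belize"],
        "Headcount": ["(%)", 24.1, 4.3],
        "Spacer": [None, None, None],
    }
)

# Drop the units row, then drop columns that contain no values at all.
clean = toy.iloc[1:].dropna(axis="columns", how="all").reset_index(drop=True)

# Coerce the measurement column to numbers; unparseable cells become NaN.
clean["Headcount"] = pd.to_numeric(clean["Headcount"], errors="coerce")
print(clean.dtypes)
```

Using `errors="coerce"` is a design choice: it hides unexpected text cells by turning them into `NaN`, so it is worth checking how many missing values appear after conversion.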
columns = dataset.columns
print(columns)
Index([                                                                       ('Unnamed: 0_level_0', 'Country'),
                                                                              ('Unnamed: 1_level_0', 'Country'),
                                                                       ('Type of survey', 'Unnamed: 2_level_1'),
                                                                       ('Type of survey', 'Unnamed: 3_level_1'),
                                                                          ('Survey year', 'Unnamed: 4_level_1'),
                                                                          ('Survey year', 'Unnamed: 5_level_1'),
                                                            ('Ethnic/racial/caste group', 'Unnamed: 6_level_1'),
                                                            ('Ethnic/racial/caste group', 'Unnamed: 7_level_1'),
                                              ('Multidimensional Poverty Index (MPI)', 'Value for the country'),
                                            ('Multidimensional Poverty Index (MPI)', 'Value for the country.1'),
                            ('Multidimensional Poverty Index (MPI)', 'Value for the ethnic/racial/caste group'),
                                                                                   ('a', 'Unnamed: 11_level_1'),
                                                                           ('Headcount', 'Unnamed: 12_level_1'),
                                                                                   ('a', 'Unnamed: 13_level_1'),
             ('Number of multidimen-sionally poor people by ethnic/racial/ caste group', 'Unnamed: 14_level_1'),
                                                                                   ('a', 'Unnamed: 15_level_1'),
                                                            ('Intensity of deprivation', 'Unnamed: 16_level_1'),
                                                                                   ('a', 'Unnamed: 17_level_1'),
                     ('Contribution of deprivation in dimension to overall multidimensional poverty', 'Health'),
                   ('Contribution of deprivation in dimension to overall multidimensional poverty', 'Health.1'),
                  ('Contribution of deprivation in dimension to overall multidimensional poverty', 'Education'),
                ('Contribution of deprivation in dimension to overall multidimensional poverty', 'Education.1'),
         ('Contribution of deprivation in dimension to overall multidimensional poverty', 'Standard of living'),
                                                                                   ('a', 'Unnamed: 23_level_1'),
                  ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Nutrition'),
                ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Nutrition.1'),
            ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Child mortality'),
          ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Child mortality.1'),
         ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Years of schooling'),
       ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Years of schooling.1'),
          ('Contribution of deprivation in indicator to overall multidimensional poverty', 'School attendance'),
        ('Contribution of deprivation in indicator to overall multidimensional poverty', 'School attendance.1'),
               ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Cooking fuel'),
             ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Cooking fuel.1'),
                 ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Sanitation'),
               ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Sanitation.1'),
             ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Drinking water'),
           ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Drinking water.1'),
                ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Electricity'),
              ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Electricity.1'),
                    ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Housing'),
                  ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Housing.1'),
                     ('Contribution of deprivation in indicator to overall multidimensional poverty', 'Assets'),
                                                                                   ('a', 'Unnamed: 43_level_1'),
                                       ('Population share by ethnic/racial/caste group', 'Unnamed: 44_level_1'),
                                       ('Population size by ethnic/racial/ caste group', 'Unnamed: 45_level_1'),
                                                                                      ('Population size', 2019),
                                                                              ('Region', 'Unnamed: 47_level_1')],
      dtype='object')
# Replace the flattened tuple labels with short, readable names
# (units are moved into the column name where they are known).
dataset.columns = [
    "Country",
    "Country",
    "Type of survey",
    "Type of survey",
    "Survey year",
    "Survey year",
    "Ethnic/racial/caste group",
    "Ethnic/racial/caste group",
    "MPI: Value for the country",
    "MPI: Value for the country",
    "MPI: Value for the group",
    "a",
    "Headcount (%)",
    "a",
    "Number of multidimensionally poor people by group (thousands)",
    "a",
    "Intensity of deprivation (%)",
    "a",
    "Health (%)",
    "Health",
    "Education (%)",
    "Education",
    "Standard of living (%)",
    "a",
    "Nutrition (%)",
    "Nutrition",
    "Child mortality (%)",
    "Child mortality",
    "Years of schooling (%)",
    "Years of schooling",
    "School attendance (%)",
    "School attendance",
    "Cooking fuel (%)",
    "Cooking fuel",
    "Sanitation (%)",
    "Sanitation",
    "Drinking water (%)",
    "Drinking water",
    "Electricity (%)",
    "Electricity",
    "Housing (%)",
    "Housing",
    "Assets (%)",
    "a",
    "Population share by group (%)",
    "Population size by group (thousands)",
    "Population size (thousands)",
    "Region",
]
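
The new list reuses several names ("Country", "a" and others). With duplicate labels, selecting a column by name in pandas returns every matching column as a DataFrame rather than a single Series. The following sketch, in plain Python, shows one way to spot repeated labels and make them unique. The helpers `duplicated_labels` and `make_unique` are hypothetical names introduced here, and the `_2`, `_3` suffix scheme is only one possible convention.

```python
from collections import Counter


def duplicated_labels(labels):
    """Return the labels that occur more than once, with their counts."""
    counts = Counter(labels)
    return {label: n for label, n in counts.items() if n > 1}


def make_unique(labels):
    """Suffix repeated labels with a running number: 'a', 'a_2', 'a_3', ..."""
    seen = Counter()
    unique = []
    for label in labels:
        seen[label] += 1
        unique.append(label if seen[label] == 1 else f"{label}_{seen[label]}")
    return unique


# A short sample in the spirit of the list above, not the full 48 names.
sample = ["Country", "Country", "Health (%)", "Health", "a", "a", "a"]
print(duplicated_labels(sample))
print(make_unique(sample))
```

This simple scheme assumes no existing label already ends in a generated suffix such as `a_2`; if that could happen, a collision check would be needed.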
dataset.head()
Country Country Type of survey Type of survey Survey year Survey year Ethnic/racial/caste group Ethnic/racial/caste group MPI: Value for the country MPI: Value for the country ... Electricity (%) Electricity Housing (%) Housing Assets (%) a Population share by group (%) Population size by group (thousands) Population size (thousands) Region
0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN ... (%) NaN (%) NaN (%) NaN (%) (thousands) (thousands) NaN
1 Bangladesh NaN MICS NaN 2019 NaN Bengali NaN 0.104060 NaN ... 2.308714 NaN 12.478603 NaN 8.562996 NaN 98.809242 161104.688057 163046.173 South Asia
2 Bangladesh NaN MICS NaN 2019 NaN Other NaN 0.104060 NaN ... 8.593331 NaN 11.53615 NaN 10.738271 NaN 1.190756 1941.482818 163046.173 South Asia
3 Belize NaN MICS NaN 2015/2016 NaN Creole NaN 0.017109 NaN ... 3.383591 NaN 6.162911 NaN 4.409921 NaN 22.916001 89.452839 390.351 Latin America and the Caribbean
4 Belize NaN MICS NaN 2015/2016 NaN Garifuna NaN 0.017109 NaN ... 2.96302 NaN 2.96302 NaN 2.96302 NaN 5.251431 20.499014 390.351 Latin America and the Caribbean

5 rows × 48 columns

dataset.to_csv(
    f"{data_path}/changed_columns_mpi_disagg_by_groups.csv", index=False
)
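
One thing to keep in mind when this CSV is read back: `pd.read_csv` renames repeated header names by appending `.1`, `.2` and so on, so the duplicate names written above will not survive a round trip unchanged. The sketch below demonstrates this with an in-memory buffer and a toy frame (invented values), rather than the real file.

```python
import io

import pandas as pd

# Toy frame with a repeated column name, written to an in-memory CSV.
toy = pd.DataFrame(
    [["Belize", None, 1.0, None]],
    columns=["Country", "Country", "Assets (%)", "a"],
)
buffer = io.StringIO()
toy.to_csv(buffer, index=False)
buffer.seek(0)

# On reading, pandas de-duplicates the header: the second "Country"
# comes back as "Country.1".
reloaded = pd.read_csv(buffer)
print(list(reloaded.columns))
```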