User-defined functions

User-defined functions can be run through the pandas apply method. Much like Python's built-in map function, this method accepts a user-defined function, plus any additional arguments to forward to it, calls the function on each element of a particular column, and returns the results as a new column (a new Series).
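As a minimal sketch of that behaviour, the snippet below applies a small function to a toy Series, first on its own and then with an extra positional argument forwarded through args. The helper add_offset and the values are made up for illustration and are not part of the dataset used on this page.

```python
import pandas as pd


def add_offset(value, offset=0):
    """Hypothetical helper: shift a single value by a fixed offset."""
    return value + offset


# Toy data, not taken from the MPI dataset
scores = pd.Series([1.0, 2.5, 4.0])

# apply calls the function once per element and returns a new Series
unchanged = scores.apply(add_offset)

# Extra positional arguments are forwarded to the function through args
shifted = scores.apply(add_offset, args=(10,))

print(shifted.tolist())
```

The original Series is left untouched; each call to apply produces a new object that can be assigned to a new column.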

Importing libraries and packages

# Mathematical operations and data manipulation
import pandas as pd

Set paths

# Path to datasets directory
data_path = "./datasets"
# Path to assets directory (for saving results to)
assets_path = "./assets"

Loading dataset

dataset = pd.read_csv(f"{data_path}/cleaned_mpi_disagg_by_groups.csv")

Wrangling

dataset
| | Country | Type of survey | Survey year | Ethnic/racial/caste group | MPI: Value for the country | MPI: Value for the group | Headcount (%) | Number of multidimensionally poor people by group (thousands) | Intensity of deprivation (%) | Health (%) | ... | Cooking fuel (%) | Sanitation (%) | Drinking water (%) | Electricity (%) | Housing (%) | Assets (%) | Population share by group (%) | Population size by group (thousands) | Population size (thousands) | Region |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | Bangladesh | MICS | 2019 | Bengali | 0.104060 | 0.102702 | 24.384759 | 39284.990511 | 42.117223 | 17.441109 | ... | 12.484664 | 8.274627 | 0.569494 | 2.308714 | 12.478603 | 8.562996 | 98.809242 | 161104.688057 | 163046.173 | South Asia |
| 1 | Bangladesh | MICS | 2019 | Other | 0.104060 | 0.216783 | 45.868093 | 890.521140 | 47.262356 | 10.881517 | ... | 11.733451 | 10.198139 | 8.354676 | 8.593331 | 11.536150 | 10.738271 | 1.190756 | 1941.482818 | 163046.173 | South Asia |
| 2 | Belize | MICS | 2015/2016 | Creole | 0.017109 | 0.003768 | 1.051818 | 0.940881 | 35.820526 | 52.086931 | ... | 1.126231 | 3.964365 | 1.126231 | 3.383591 | 6.162911 | 4.409921 | 22.916001 | 89.452839 | 390.351 | Latin America and the Caribbean |
| 3 | Belize | MICS | 2015/2016 | Garifuna | 0.017109 | 0.003887 | 1.097083 | 0.224891 | 35.433114 | 85.184902 | ... | 2.963020 | 2.963020 | 0.000000 | 2.963020 | 2.963020 | 2.963020 | 5.251431 | 20.499014 | 390.351 | Latin America and the Caribbean |
| 4 | Belize | MICS | 2015/2016 | Maya | 0.017109 | 0.078922 | 18.631953 | 8.557940 | 42.358151 | 37.911840 | ... | 11.931632 | 7.811719 | 2.319572 | 9.465594 | 11.165109 | 4.267081 | 11.766724 | 45.931523 | 390.351 | Latin America and the Caribbean |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 291 | Uganda | DHS | 2016 | Lango | 0.281028 | 0.331589 | 67.104161 | 1765.869191 | 49.414003 | 22.405487 | ... | 11.113079 | 10.360502 | 8.658755 | 10.666937 | 9.859275 | 4.350616 | 5.944340 | 2631.534572 | 44269.587 | Sub-Saharan Africa |
| 292 | Uganda | DHS | 2016 | Lugbara | 0.281028 | 0.380233 | 71.118927 | 809.259358 | 53.464323 | 21.582941 | ... | 10.391146 | 10.167766 | 7.661117 | 8.479015 | 9.504942 | 5.541577 | 2.570378 | 1137.895905 | 44269.587 | Sub-Saharan Africa |
| 293 | Uganda | DHS | 2016 | Other | 0.281028 | 0.348234 | 66.484243 | 5747.428972 | 52.378464 | 24.311363 | ... | 10.578003 | 9.857720 | 8.236721 | 9.430651 | 9.725867 | 5.491462 | 19.527625 | 8644.798738 | 44269.587 | Sub-Saharan Africa |
| 294 | Viet Nam | MICS | 2013/2014 | Ethnic minorities | 0.019334 | 0.070516 | 16.658276 | 2241.570304 | 42.331207 | 14.158440 | ... | 12.865621 | 11.510227 | 5.163109 | 1.879746 | 8.144484 | 3.683084 | 13.949722 | 13456.195951 | 96462.108 | East Asia and the Paficic |
| 295 | Viet Nam | MICS | 2013/2014 | Kinh/Hoa | 0.019334 | 0.011037 | 2.988247 | 2480.421534 | 36.934501 | 16.320269 | ... | 12.656878 | 11.753646 | 3.399943 | 0.674117 | 9.583682 | 2.956655 | 86.050278 | 83005.912049 | 96462.108 | East Asia and the Paficic |

296 rows × 26 columns

# Select the first 20 rows (by index label) and three columns of interest
dataset_subset = dataset.loc[
    list(range(20)),
    ["Country", "MPI: Value for the country", "Intensity of deprivation (%)"],
]
print(dataset_subset)
                            Country  MPI: Value for the country  \
0                        Bangladesh                    0.104060   
1                        Bangladesh                    0.104060   
2                            Belize                    0.017109   
3                            Belize                    0.017109   
4                            Belize                    0.017109   
5                            Belize                    0.017109   
6                            Belize                    0.017109   
7   Bolivia, Plurinational State of                    0.037754   
8   Bolivia, Plurinational State of                    0.037754   
9   Bolivia, Plurinational State of                    0.037754   
10  Bolivia, Plurinational State of                    0.037754   
11  Bolivia, Plurinational State of                    0.037754   
12                     Burkina Faso                    0.523424   
13                     Burkina Faso                    0.523424   
14                     Burkina Faso                    0.523424   
15                     Burkina Faso                    0.523424   
16                     Burkina Faso                    0.523424   
17                     Burkina Faso                    0.523424   
18                     Burkina Faso                    0.523424   
19                     Burkina Faso                    0.523424   

    Intensity of deprivation (%)  
0                      42.117223  
1                      47.262356  
2                      35.820526  
3                      35.433114  
4                      42.358151  
5                      36.699757  
6                      39.199564  
7                      37.935901  
8                      33.333334  
9                      41.581705  
10                     43.263215  
11                     43.184847  
12                     55.149454  
13                     56.443775  
14                     62.004858  
15                     53.393632  
16                     68.189025  
17                     70.047671  
18                     59.310508  
19                     70.925540  

def categorize_iop(iop):
    """Map an intensity of deprivation (%) value to a low/medium/high label."""
    if iop < 10:
        return "Low Intensity of deprivation (%)"
    elif iop < 40:
        return "Medium Intensity of deprivation (%)"
    else:
        return "High Intensity of deprivation (%)"
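
The cut-offs 10 and 40 are fixed inside categorize_iop. As a sketch of how the additional arguments accepted by apply could make them adjustable, the hypothetical variant below takes the thresholds as keyword arguments. The function name, the parameter names low and high, and the sample values are illustrative assumptions and not part of the original notebook.

```python
import pandas as pd


def categorize_with_thresholds(iop, low=10, high=40):
    """Hypothetical variant of categorize_iop with adjustable cut-offs."""
    if iop < low:
        return "Low Intensity of deprivation (%)"
    elif iop < high:
        return "Medium Intensity of deprivation (%)"
    return "High Intensity of deprivation (%)"


# Illustrative values, not taken from the dataset
intensity = pd.Series([5.0, 35.8, 42.1, 70.9])

# With the defaults, the behaviour matches the fixed 10 / 40 cut-offs
default_labels = intensity.apply(categorize_with_thresholds)

# Keyword arguments given to apply are forwarded to the function
custom_labels = intensity.apply(categorize_with_thresholds, low=36, high=50)

print(default_labels.tolist())
print(custom_labels.tolist())
```

Keeping the defaults equal to the original cut-offs means existing calls keep their meaning, while other thresholds can be tried without editing the function body.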

# Draw a random sample of 100 rows, keeping only the columns of interest
dataset_sample = dataset[
    [
        "Country",
        "MPI: Value for the country",
        "Intensity of deprivation (%)",
        "Ethnic/racial/caste group",
        "Number of multidimensionally poor people by group (thousands)",
    ]
].sample(n=100)
dataset_sample
| | Country | MPI: Value for the country | Intensity of deprivation (%) | Ethnic/racial/caste group | Number of multidimensionally poor people by group (thousands) |
| --- | --- | --- | --- | --- | --- |
| 7 | Bolivia, Plurinational State of | 0.037754 | 37.935901 | Aymara | 223.806399 |
| 69 | Cuba | 0.002689 | 38.146138 | Mulato/Mestizo/Other | 23.120644 |
| 247 | Sierra Leone | 0.292899 | 52.130735 | Korankoh | 212.726294 |
| 193 | Mongolia | 0.028127 | 40.553683 | Other | 57.338328 |
| 242 | Serbia | 0.000433 | 39.860407 | Roma | 2.019261 |
| ... | ... | ... | ... | ... | ... |
| 156 | Kyrgyzstan | 0.001426 | 0.000000 | Russian | 0.000000 |
| 198 | Nigeria | 0.254390 | 45.378658 | Igala | 635.924786 |
| 208 | Paraguay | 0.018849 | 36.654904 | Guaraní and Spanish speaker | 25.251111 |
| 234 | Senegal | 0.262862 | 49.028081 | Other /non Senegalese | 357.846501 |
| 249 | Sierra Leone | 0.292899 | 44.758216 | Loko | 65.443718 |

100 rows × 5 columns

# Apply categorize_iop to every value and store the labels in a new column
dataset_sample["Intensity of deprivation Category"] = dataset_sample[
    "Intensity of deprivation (%)"
].apply(categorize_iop)
dataset_sample.head(10)
| | Country | MPI: Value for the country | Intensity of deprivation (%) | Ethnic/racial/caste group | Number of multidimensionally poor people by group (thousands) | Intensity of deprivation Category |
| --- | --- | --- | --- | --- | --- | --- |
| 7 | Bolivia, Plurinational State of | 0.037754 | 37.935901 | Aymara | 223.806399 | Medium Intensity of deprivation (%) |
| 69 | Cuba | 0.002689 | 38.146138 | Mulato/Mestizo/Other | 23.120644 | Medium Intensity of deprivation (%) |
| 247 | Sierra Leone | 0.292899 | 52.130735 | Korankoh | 212.726294 | High Intensity of deprivation (%) |
| 193 | Mongolia | 0.028127 | 40.553683 | Other | 57.338328 | High Intensity of deprivation (%) |
| 242 | Serbia | 0.000433 | 39.860407 | Roma | 2.019261 | Medium Intensity of deprivation (%) |
| 158 | Lao People's Democratic Republic | 0.108333 | 46.754810 | Chinese-Tibetan | 86.627640 | High Intensity of deprivation (%) |
| 154 | Kyrgyzstan | 0.001426 | 36.415604 | Kyrgyz | 12.373465 | Medium Intensity of deprivation (%) |
| 88 | Gambia | 0.203638 | 45.463789 | Mandinka | 261.295819 | High Intensity of deprivation (%) |
| 295 | Viet Nam | 0.019334 | 36.934501 | Kinh/Hoa | 2480.421534 | Medium Intensity of deprivation (%) |
| 84 | Gabon | 0.069695 | 66.626418 | Pygmée | 5.908309 | High Intensity of deprivation (%) |
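
Because categorize_iop only compares each value against fixed cut-offs, the same labels can also be produced without calling a Python function per element. The sketch below is an alternative and not the approach used on this page: it assumes pandas.cut with right=False, so the bins are [-inf, 10), [10, 40) and [40, inf), which mirrors the < 10 and < 40 comparisons. The sample values are invented to exercise the bin edges.

```python
import numpy as np
import pandas as pd

# Illustrative values chosen around the cut-offs, not taken from the dataset
intensity = pd.Series([0.0, 9.99, 10.0, 39.86, 40.0, 70.93])

labels = [
    "Low Intensity of deprivation (%)",
    "Medium Intensity of deprivation (%)",
    "High Intensity of deprivation (%)",
]

# right=False makes each bin closed on the left and open on the right
categories = pd.cut(
    intensity,
    bins=[-np.inf, 10, 40, np.inf],
    right=False,
    labels=labels,
)

print(categories.tolist())
```

The result is a categorical column rather than plain strings. One difference to keep in mind: a missing value (NaN) fails both comparisons in categorize_iop and is therefore labelled High, whereas pandas.cut leaves it as missing.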