The unique function
Finding the number of unique countries in the MPI (Multidimensional Poverty Index) dataset, which is disaggregated by ethnic/racial/caste group.
Importing libraries and packages
# Mathematical operations and data manipulation
import pandas as pd
Set paths
# Path to datasets directory
data_path = "./datasets"
# Path to assets directory (for saving results to)
assets_path = "./assets"
Loading dataset
dataset = pd.read_csv(f"{data_path}/cleaned_mpi_disagg_by_groups.csv")
Wrangling
dataset.head()
| | Country | Type of survey | Survey year | Ethnic/racial/caste group | MPI: Value for the country | MPI: Value for the group | Headcount (%) | Number of multidimensionally poor people by group (thousands) | Intensity of deprivation (%) | Health (%) | ... | Cooking fuel (%) | Sanitation (%) | Drinking water (%) | Electricity (%) | Housing (%) | Assets (%) | Population share by group (%) | Population size by group (thousands) | Population size (thousands) | Region |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | Bangladesh | MICS | 2019 | Bengali | 0.104060 | 0.102702 | 24.384759 | 39284.990511 | 42.117223 | 17.441109 | ... | 12.484664 | 8.274627 | 0.569494 | 2.308714 | 12.478603 | 8.562996 | 98.809242 | 161104.688057 | 163046.173 | South Asia |
1 | Bangladesh | MICS | 2019 | Other | 0.104060 | 0.216783 | 45.868093 | 890.521140 | 47.262356 | 10.881517 | ... | 11.733451 | 10.198139 | 8.354676 | 8.593331 | 11.536150 | 10.738271 | 1.190756 | 1941.482818 | 163046.173 | South Asia |
2 | Belize | MICS | 2015/2016 | Creole | 0.017109 | 0.003768 | 1.051818 | 0.940881 | 35.820526 | 52.086931 | ... | 1.126231 | 3.964365 | 1.126231 | 3.383591 | 6.162911 | 4.409921 | 22.916001 | 89.452839 | 390.351 | Latin America and the Caribbean |
3 | Belize | MICS | 2015/2016 | Garifuna | 0.017109 | 0.003887 | 1.097083 | 0.224891 | 35.433114 | 85.184902 | ... | 2.963020 | 2.963020 | 0.000000 | 2.963020 | 2.963020 | 2.963020 | 5.251431 | 20.499014 | 390.351 | Latin America and the Caribbean |
4 | Belize | MICS | 2015/2016 | Maya | 0.017109 | 0.078922 | 18.631953 | 8.557940 | 42.358151 | 37.911840 | ... | 11.931632 | 7.811719 | 2.319572 | 9.465594 | 11.165109 | 4.267081 | 11.766724 | 45.931523 | 390.351 | Latin America and the Caribbean |
5 rows × 26 columns
dataset["Country"].unique()
array(['Bangladesh', 'Belize', 'Bolivia, Plurinational State of',
'Burkina Faso', 'Central African Republic', 'Chad', 'Colombia',
"Cote d'Ivoire", 'Cuba', 'Ecuador', 'Gabon', 'Gambia', 'Georgia',
'Ghana', 'Guatemala', 'Guinea', 'Guinea-Bissau', 'Guyana', 'India',
'Kazakhstan', 'Kenya', 'Kyrgyzstan',
"Lao People's Democratic Republic", 'Malawi', 'Mali',
'Moldova, Republic of', 'Mongolia', 'Nigeria', 'North Macedonia',
'Paraguay', 'Peru', 'Philippines', 'Senegal', 'Serbia',
'Sierra Leone', 'Sri Lanka', 'Suriname', 'Togo',
'Trinidad and Tobago', 'Uganda', 'Viet Nam'], dtype=object)
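`Series.unique()` returns the distinct values in order of first appearance rather than sorted, and it keeps a missing value as an entry of its own. The following is a minimal sketch on a small hand-made Series; the `toy` values are illustrative assumptions and are not read from the dataset file.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: a few country names with repeats and one missing value
toy = pd.Series(["Belize", "Bangladesh", "Belize", np.nan, "Bangladesh", "Chad"])

# unique() keeps the order of first appearance and includes the missing value
distinct = toy.unique()
print(distinct)

# Drop missing values and sort when an alphabetical list is wanted
distinct_sorted = sorted(toy.dropna().unique())
print(distinct_sorted)
```

The country list above happens to look alphabetical only because the rows of the file are already ordered by country.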
dataset["Country"].nunique()
41
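The count from `nunique()` matches the length of `unique()` only when the column has no missing values, because `nunique()` excludes NaN by default (`dropna=True`) while `unique()` keeps it. The sketch below illustrates this, and also shows `DataFrame.nunique()`, which counts distinct values for every column at once. The `toy` frame is a hypothetical stand-in that borrows two column names from the dataset; its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame mimicking two of the dataset's columns
toy = pd.DataFrame({
    "Country": ["Bangladesh", "Bangladesh", "Belize", "Belize", np.nan],
    "Region": ["South Asia", "South Asia",
               "Latin America and the Caribbean",
               "Latin America and the Caribbean", "South Asia"],
})

n_default = toy["Country"].nunique()               # NaN excluded by default
n_with_nan = toy["Country"].nunique(dropna=False)  # NaN counted as a value
n_from_unique = len(toy["Country"].unique())       # unique() keeps NaN

# DataFrame.nunique() returns the distinct count for each column
per_column = toy.nunique()

print(n_default, n_with_nan, n_from_unique)
print(per_column)
```

If the two counts differ on a real column, that is a quick signal that the column contains missing values worth inspecting.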