Reading from a JSON file
JSON is a schema-less, text-based format for structured data, built from key-value pairs (objects) and ordered lists (arrays). pandas can read a JSON file directly into a DataFrame.
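As a minimal sketch of how a list of JSON records maps onto rows and columns, the snippet below parses a small in-memory string rather than a file. The sample records and the `name` and `cases` keys are made up for illustration and are not part of the dataset used on this page.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: a JSON list of flat records (not the COVID dataset)
json_text = '[{"name": "A", "cases": 1}, {"name": "B", "cases": 2}]'

# Each record becomes one row; each key becomes one column
df = pd.read_json(StringIO(json_text), orient="records")
print(df)
```

Wrapping the string in `StringIO` lets `pd.read_json` treat it like a file, so the same call works once a real path is substituted.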
Importing libraries and packages
# Mathematical operations and data manipulation
import pandas as pd

# Reading data
import json
Set paths
# Path to datasets directory
data_path = "./datasets"
# Path to assets directory (for saving results to)
assets_path = "./assets"
Loading dataset
# The .json file stores JSON records in a list
json_object = pd.read_json(f"{data_path}/covid.json", orient="records")
json_object
| | Date | Data | Location |
---|---|---|---|
0 | {'Day': 31, 'Month': 12, 'Year': 2019} | {'Cases': 0, 'Deaths': 0, 'Population': 380417... | {'Country': 'Afghanistan', 'Code': 'AFG', 'Con... |
1 | {'Day': 31, 'Month': 12, 'Year': 2019} | {'Cases': 0, 'Deaths': 0, 'Population': 430530... | {'Country': 'Algeria', 'Code': 'DZA', 'Contine... |
2 | {'Day': 31, 'Month': 12, 'Year': 2019} | {'Cases': 0, 'Deaths': 0, 'Population': 295772... | {'Country': 'Armenia', 'Code': 'ARM', 'Contine... |
3 | {'Day': 31, 'Month': 12, 'Year': 2019} | {'Cases': 0, 'Deaths': 0, 'Population': 252032... | {'Country': 'Australia', 'Code': 'AUS', 'Conti... |
4 | {'Day': 31, 'Month': 12, 'Year': 2019} | {'Cases': 0, 'Deaths': 0, 'Population': 885877... | {'Country': 'Austria', 'Code': 'AUT', 'Contine... |
... | ... | ... | ... |
53624 | {'Day': 10, 'Month': 12, 'Year': 2020} | {'Cases': 202, 'Deaths': 16, 'Population': 380... | {'Country': 'Afghanistan', 'Code': 'AFG', 'Con... |
53625 | {'Day': 11, 'Month': 12, 'Year': 2020} | {'Cases': 63, 'Deaths': 10, 'Population': 3804... | {'Country': 'Afghanistan', 'Code': 'AFG', 'Con... |
53626 | {'Day': 12, 'Month': 12, 'Year': 2020} | {'Cases': 113, 'Deaths': 11, 'Population': 380... | {'Country': 'Afghanistan', 'Code': 'AFG', 'Con... |
53627 | {'Day': 13, 'Month': 12, 'Year': 2020} | {'Cases': 298, 'Deaths': 9, 'Population': 3804... | {'Country': 'Afghanistan', 'Code': 'AFG', 'Con... |
53628 | {'Day': 14, 'Month': 12, 'Year': 2020} | {'Cases': 746, 'Deaths': 6, 'Population': 3804... | {'Country': 'Afghanistan', 'Code': 'AFG', 'Con... |
53629 rows × 3 columns
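The output above has only three columns because each cell still holds a whole nested object. The sketch below illustrates this on two hand-written records that mimic the structure shown above; the sample values and the `nested` and `years` names are assumptions for illustration, not the actual file contents.

```python
from io import StringIO

import pandas as pd

# Hypothetical nested records mimicking the structure shown above
json_text = """[
  {"Date": {"Day": 31, "Month": 12, "Year": 2019},
   "Location": {"Country": "Afghanistan", "Code": "AFG"}},
  {"Date": {"Day": 1, "Month": 1, "Year": 2020},
   "Location": {"Country": "Algeria", "Code": "DZA"}}
]"""
nested = pd.read_json(StringIO(json_text), orient="records")

# Each cell is a Python dict, so the columns have dtype object
print(nested.dtypes)
print(type(nested.loc[0, "Date"]))

# A single nested field can be pulled out row by row
years = nested["Date"].apply(lambda d: d["Year"])
print(years.tolist())
```

Extracting fields one at a time like this works but scales poorly with many nested keys, which is why flattening the whole structure in one step is usually preferable.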
Wrangling
# pd.read_json does not flatten nested JSON objects; it leaves them as
# dict-valued columns. Parse the file with the json module and flatten
# the result with pd.json_normalize instead.

# Loading data using Python JSON module
with open(f"{data_path}/covid.json", "r") as f:
    data = json.load(f)

# Normalizing data
dataset_17 = pd.json_normalize(data)
dataset_17
| | Date.Day | Date.Month | Date.Year | Data.Cases | Data.Deaths | Data.Population | Data.Rate | Location.Country | Location.Code | Location.Continent |
---|---|---|---|---|---|---|---|---|---|---|
0 | 31 | 12 | 2019 | 0 | 0 | 38041757 | 0.000000 | Afghanistan | AFG | Asia |
1 | 31 | 12 | 2019 | 0 | 0 | 43053054 | 0.000000 | Algeria | DZA | Africa |
2 | 31 | 12 | 2019 | 0 | 0 | 2957728 | 0.000000 | Armenia | ARM | Europe |
3 | 31 | 12 | 2019 | 0 | 0 | 25203200 | 0.000000 | Australia | AUS | Oceania |
4 | 31 | 12 | 2019 | 0 | 0 | 8858775 | 0.000000 | Austria | AUT | Europe |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
53624 | 10 | 12 | 2020 | 202 | 16 | 38041757 | 6.968658 | Afghanistan | AFG | Asia |
53625 | 11 | 12 | 2020 | 63 | 10 | 38041757 | 7.134266 | Afghanistan | AFG | Asia |
53626 | 12 | 12 | 2020 | 113 | 11 | 38041757 | 6.868768 | Afghanistan | AFG | Asia |
53627 | 13 | 12 | 2020 | 298 | 9 | 38041757 | 7.052776 | Afghanistan | AFG | Asia |
53628 | 14 | 12 | 2020 | 746 | 6 | 38041757 | 9.013779 | Afghanistan | AFG | Asia |
53629 rows × 10 columns
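Once the data is flat, the separate day, month and year columns can be combined into a single datetime column. The following is a rough sketch on two hand-written records that mimic the dataset's structure; the `records`, `flat`, `parts` and `date` names are hypothetical and the values are illustrative only.

```python
import pandas as pd

# Hypothetical records mimicking the nested structure of the dataset
records = [
    {"Date": {"Day": 31, "Month": 12, "Year": 2019}, "Data": {"Cases": 0}},
    {"Date": {"Day": 14, "Month": 12, "Year": 2020}, "Data": {"Cases": 746}},
]

# Flatten nested keys into dotted column names such as "Date.Day"
flat = pd.json_normalize(records)

# pd.to_datetime can assemble dates from columns named year, month, day
parts = flat[["Date.Year", "Date.Month", "Date.Day"]].rename(
    columns={"Date.Year": "year", "Date.Month": "month", "Date.Day": "day"}
)
flat["date"] = pd.to_datetime(parts)
print(flat[["date", "Data.Cases"]])
```

If dotted column names are inconvenient, `pd.json_normalize` also accepts a `sep` argument (for example `sep="_"`) to change the separator used when joining nested keys.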