Reading from a JSON file

JSON is a schema-less, text-based format for structured data, built from key-value pairs (objects) and ordered lists (arrays). pandas can read a JSON file directly into a DataFrame with pd.read_json, although nested objects are not flattened automatically.
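As a minimal sketch of the direct route, the snippet below reads two flat JSON records into a DataFrame. The inline string `raw` and the name `flat_df` are illustrative assumptions, not part of the dataset used later; the string is wrapped in `StringIO` so that pandas treats it like a file.

```python
from io import StringIO

import pandas as pd

# Two flat JSON records (illustrative sample, not the real covid.json file)
raw = '[{"Country": "Afghanistan", "Code": "AFG"}, {"Country": "Algeria", "Code": "DZA"}]'

# Each record becomes one row; each key becomes one column
flat_df = pd.read_json(StringIO(raw), orient="records")
print(flat_df)
```

Because every value here is a scalar, each key maps cleanly to one column and no further wrangling is needed.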

Importing libraries and packages

# Mathematical operations and data manipulation
import pandas as pd

# Reading data
import json

Set paths

# Path to datasets directory
data_path = "./datasets"
# Path to assets directory (for saving results to)
assets_path = "./assets"

Loading dataset

Covid JSON File

# The .json file holds a list of JSON records. Each nested object
# (Date, Data, Location) is loaded as a dict-valued column.
json_object = pd.read_json(f"{data_path}/covid.json", orient="records")
json_object
Date Data Location
0 {'Day': 31, 'Month': 12, 'Year': 2019} {'Cases': 0, 'Deaths': 0, 'Population': 380417... {'Country': 'Afghanistan', 'Code': 'AFG', 'Con...
1 {'Day': 31, 'Month': 12, 'Year': 2019} {'Cases': 0, 'Deaths': 0, 'Population': 430530... {'Country': 'Algeria', 'Code': 'DZA', 'Contine...
2 {'Day': 31, 'Month': 12, 'Year': 2019} {'Cases': 0, 'Deaths': 0, 'Population': 295772... {'Country': 'Armenia', 'Code': 'ARM', 'Contine...
3 {'Day': 31, 'Month': 12, 'Year': 2019} {'Cases': 0, 'Deaths': 0, 'Population': 252032... {'Country': 'Australia', 'Code': 'AUS', 'Conti...
4 {'Day': 31, 'Month': 12, 'Year': 2019} {'Cases': 0, 'Deaths': 0, 'Population': 885877... {'Country': 'Austria', 'Code': 'AUT', 'Contine...
... ... ... ...
53624 {'Day': 10, 'Month': 12, 'Year': 2020} {'Cases': 202, 'Deaths': 16, 'Population': 380... {'Country': 'Afghanistan', 'Code': 'AFG', 'Con...
53625 {'Day': 11, 'Month': 12, 'Year': 2020} {'Cases': 63, 'Deaths': 10, 'Population': 3804... {'Country': 'Afghanistan', 'Code': 'AFG', 'Con...
53626 {'Day': 12, 'Month': 12, 'Year': 2020} {'Cases': 113, 'Deaths': 11, 'Population': 380... {'Country': 'Afghanistan', 'Code': 'AFG', 'Con...
53627 {'Day': 13, 'Month': 12, 'Year': 2020} {'Cases': 298, 'Deaths': 9, 'Population': 3804... {'Country': 'Afghanistan', 'Code': 'AFG', 'Con...
53628 {'Day': 14, 'Month': 12, 'Year': 2020} {'Cases': 746, 'Deaths': 6, 'Population': 3804... {'Country': 'Afghanistan', 'Code': 'AFG', 'Con...

53629 rows × 3 columns
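The output above shows that each cell holds a whole Python dict rather than a scalar. The sketch below reproduces this on two records copied from the first rows of the dataset, then pulls a single nested field out by hand. The names `raw`, `nested_df`, and `countries` are assumptions made for illustration.

```python
from io import StringIO

import pandas as pd

# Two nested records with the same structure as covid.json
raw = """[
  {"Date": {"Day": 31, "Month": 12, "Year": 2019},
   "Data": {"Cases": 0, "Deaths": 0, "Population": 38041757, "Rate": 0.0},
   "Location": {"Country": "Afghanistan", "Code": "AFG", "Continent": "Asia"}},
  {"Date": {"Day": 31, "Month": 12, "Year": 2019},
   "Data": {"Cases": 0, "Deaths": 0, "Population": 43053054, "Rate": 0.0},
   "Location": {"Country": "Algeria", "Code": "DZA", "Continent": "Africa"}}
]"""

# read_json keeps each nested object as a dict inside a single column
nested_df = pd.read_json(StringIO(raw), orient="records")
print(type(nested_df.loc[0, "Location"]))

# Extracting one nested field requires mapping over the dicts
countries = nested_df["Location"].map(lambda d: d["Country"])
print(countries.tolist())
```

Extracting fields one at a time like this quickly becomes tedious, which is the motivation for normalizing the whole structure in one step.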

Wrangling

# pd.read_json does not flatten nested JSON objects: it leaves them
# as dicts inside the columns. To get one column per nested field,
# parse the file with the json module and flatten the result.

# Loading data using Python JSON module
with open(f"{data_path}/covid.json", "r") as f:
    data = json.load(f)

# Normalizing data: nested keys become dotted column names
dataset_17 = pd.json_normalize(data)
dataset_17
Date.Day Date.Month Date.Year Data.Cases Data.Deaths Data.Population Data.Rate Location.Country Location.Code Location.Continent
0 31 12 2019 0 0 38041757 0.000000 Afghanistan AFG Asia
1 31 12 2019 0 0 43053054 0.000000 Algeria DZA Africa
2 31 12 2019 0 0 2957728 0.000000 Armenia ARM Europe
3 31 12 2019 0 0 25203200 0.000000 Australia AUS Oceania
4 31 12 2019 0 0 8858775 0.000000 Austria AUT Europe
... ... ... ... ... ... ... ... ... ... ...
53624 10 12 2020 202 16 38041757 6.968658 Afghanistan AFG Asia
53625 11 12 2020 63 10 38041757 7.134266 Afghanistan AFG Asia
53626 12 12 2020 113 11 38041757 6.868768 Afghanistan AFG Asia
53627 13 12 2020 298 9 38041757 7.052776 Afghanistan AFG Asia
53628 14 12 2020 746 6 38041757 9.013779 Afghanistan AFG Asia

53629 rows × 10 columns
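Once the records are flat, the three date parts can be combined into a single datetime column. The following is a sketch under stated assumptions: it uses two records copied from the first rows above, passes `sep="_"` to `pd.json_normalize` so the column names contain no dots, and introduces the hypothetical names `records`, `flat`, and `date`.

```python
import pandas as pd

# Two already-parsed records with the same structure as covid.json
records = [
    {"Date": {"Day": 31, "Month": 12, "Year": 2019},
     "Data": {"Cases": 0, "Deaths": 0, "Population": 38041757, "Rate": 0.0},
     "Location": {"Country": "Afghanistan", "Code": "AFG", "Continent": "Asia"}},
    {"Date": {"Day": 31, "Month": 12, "Year": 2019},
     "Data": {"Cases": 0, "Deaths": 0, "Population": 43053054, "Rate": 0.0},
     "Location": {"Country": "Algeria", "Code": "DZA", "Continent": "Africa"}},
]

# Flatten nested keys, joining parent and child names with "_" instead of "."
flat = pd.json_normalize(records, sep="_")

# pd.to_datetime can assemble dates from columns named year, month, and day
parts = flat[["Date_Year", "Date_Month", "Date_Day"]].rename(
    columns={"Date_Year": "year", "Date_Month": "month", "Date_Day": "day"}
)
flat["date"] = pd.to_datetime(parts)

print(flat[["date", "Location_Country", "Data_Cases"]])
```

Underscore-separated names are a matter of taste; their main practical benefit is that columns such as `flat.Date_Year` remain reachable through attribute access, which dotted names do not allow.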