Reading CSV data from a compressed file

Pandas can read a CSV file directly from a compressed file, such as .zip, .gz, .bz2, or .xz (but not .7z). The only requirement is that the intended CSV file is the only file inside the compressed file.
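When an archive holds more than one file, the single-file requirement above is not met. One possible workaround, sketched here with the standard-library zipfile module, is to open the wanted member yourself and pass the file handle to pd.read_csv. The archive is built in memory only to keep the example self-contained; the member names and the variable dataset_from_zip are illustrative assumptions, not part of the original datasets.

```python
import io
import zipfile

import pandas as pd

# Build a small in-memory ZIP archive with two members (hypothetical names).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, mode="w") as archive:
    archive.writestr("example_1.csv", "Column 1,Column 2\n2,1500\n3,1300\n")
    archive.writestr("readme.txt", "Not a data file.\n")
buffer.seek(0)

# Open the archive and hand only the CSV member to pandas.
with zipfile.ZipFile(buffer) as archive:
    with archive.open("example_1.csv") as member:
        dataset_from_zip = pd.read_csv(member)

print(dataset_from_zip)
```

For an archive on disk, the same pattern should apply with the archive path passed to zipfile.ZipFile instead of the in-memory buffer.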

Importing libraries and packages

# Mathematical operations and data manipulation
import pandas as pd

Set paths

# Path to datasets directory
data_path = "./datasets"
# Path to assets directory (for saving results to)
assets_path = "./assets"

Loading datasets

dataset_9 = pd.read_csv(f"{data_path}/example_1.zip")
dataset_9
   Column 1  Column 2   Column 3  Column 4
0         2      1500       Good    300000
1         3      1300       Fair    240000
2         3      1900  Very good    450000
3         3      1850        Bad    280000
4         2      1640       Good    310000
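The call above names no compression format, so pandas infers it from the file extension (the default is compression="infer"). When a file has no recognizable extension, the format can be passed explicitly. The sketch below writes a tiny DataFrame to a temporary directory to show both cases; the file names sample.csv.gz and sample.data are hypothetical.

```python
import tempfile
from pathlib import Path

import pandas as pd

sample = pd.DataFrame({"Column 1": [2, 3], "Column 2": [1500, 1300]})

with tempfile.TemporaryDirectory() as tmp_dir:
    # The ".gz" suffix lets pandas infer gzip for both writing and reading.
    inferred_path = Path(tmp_dir) / "sample.csv.gz"
    sample.to_csv(inferred_path, index=False)
    inferred = pd.read_csv(inferred_path)

    # No recognizable suffix here, so the format is named explicitly.
    explicit_path = Path(tmp_dir) / "sample.data"
    sample.to_csv(explicit_path, index=False, compression="bz2")
    explicit = pd.read_csv(explicit_path, compression="bz2")

# Both round trips should give back the same table.
print(inferred.equals(explicit))
```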
dataset_10 = pd.read_csv(f"{data_path}/example_1.tar.xz")
dataset_10
   example_1.csv  Column 2   Column 3  Column 4
0            2.0    1500.0       Good  300000.0
1            3.0    1300.0       Fair  240000.0
2            3.0    1900.0  Very good  450000.0
3            3.0    1850.0        Bad  280000.0
4            2.0    1640.0       Good  310000.0
5            NaN       NaN        NaN       NaN
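In the output above, the first column is labelled with the member name and there is a trailing row of NaN values. A likely explanation is that only the .xz layer was decompressed and the tar container's own metadata was then parsed as CSV text; recent pandas releases document tar support, so results may differ by version. One way to avoid depending on that behaviour is to unpack the member with the standard-library tarfile module, as in this minimal sketch. The archive is built in memory for the sake of a runnable example, and the name dataset_from_tar is an assumption.

```python
import io
import tarfile

import pandas as pd

csv_bytes = b"Column 1,Column 2\n2,1500\n3,1300\n"

# Build a small in-memory .tar.xz archive with a single CSV member.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w:xz") as archive:
    info = tarfile.TarInfo(name="example_1.csv")
    info.size = len(csv_bytes)
    archive.addfile(info, io.BytesIO(csv_bytes))
buffer.seek(0)

# Unpack the member explicitly, then let pandas parse the plain CSV bytes.
with tarfile.open(fileobj=buffer, mode="r:xz") as archive:
    member = archive.extractfile("example_1.csv")
    dataset_from_tar = pd.read_csv(member)

print(dataset_from_tar)
```

With a file on disk, tarfile.open can take the archive path instead of fileobj, and the rest of the pattern should stay the same.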
# Not supported: pandas cannot read .7z archives directly
# dataset_11 = pd.read_csv(f'{data_path}/example_1.7z')
# dataset_11