YouTube
This notebook visualizes the total number of subscribers and the total number of views for the top 30 YouTube channels in the music category (January 2020), using the FacetGrid() function provided by Seaborn.
Importing libraries and packages
# Warnings
import warnings

# Mathematical operations and data manipulation
import pandas as pd

# Plotting
import matplotlib.pyplot as plt
import seaborn as sns

# Apply Seaborn's default theme and silence warnings in the output
sns.set()
warnings.filterwarnings("ignore")
Set paths
# Path to datasets directory
data_path = "./datasets"
# Path to assets directory (for saving results to)
assets_path = "./assets"
Loading dataset
dataset = pd.read_csv(f"{data_path}/YouTube.csv")
Exploring dataset
# Shape of the dataset
print("Shape of the dataset: ", dataset.shape)
# View
dataset
Shape of the dataset: (30, 3)
| | Channel | Subs (in millions) | Views (in millions) |
---|---|---|---|
0 | T-Series | 123.0 | 94410 |
1 | Canal KondZilla | 54.5 | 27860 |
2 | Zee Music Company | 48.5 | 22689 |
3 | Ed Sheeran | 43.2 | 18905 |
4 | EminemMusic | 40.2 | 773 |
5 | Ariana Grande | 39.3 | 953 |
6 | Taylor Swift | 36.8 | 310 |
7 | JustinBieberVEVO | 33.1 | 19326 |
8 | BLACKPINK | 32.4 | 8112 |
9 | Alan Walker | 31.7 | 7470 |
10 | Shemaroo Filmi Gaane | 31.0 | 14708 |
11 | ibighit | 30.9 | 7659 |
12 | One Direction | 30.4 | 356 |
13 | Wave Music | 30.4 | 20569 |
14 | Sony Music India | 29.9 | 12077 |
15 | El Reino Infantil | 29.2 | 26159 |
16 | Maroon 5 | 29.2 | 294 |
17 | Trap Nation | 27.9 | 10195 |
18 | Speed Records | 27.4 | 13769 |
19 | GR6 EXPLODE | 27.2 | 13341 |
20 | TaylorSwiftVEVO | 27.0 | 18096 |
21 | SonyMusicIndiaVEVO | 27.0 | 12577 |
22 | Ozuna | 27.0 | 13059 |
23 | Daddy Yankee | 26.9 | 9796 |
24 | YRF | 26.8 | 14253 |
25 | Spinnin' Records | 26.4 | 15738 |
26 | Bruno Mars | 26.3 | 11411 |
27 | RihannaVEVO | 25.9 | 14768 |
28 | T-Series Bhakti Sagar | 25.8 | 10552 |
29 | KatyPerryVEVO | 25.8 | 18603 |
Preprocessing
# Channel names (first column) as a list
channels = dataset[dataset.columns[0]].tolist()
print(channels)
['T-Series', 'Canal KondZilla', 'Zee Music Company', 'Ed Sheeran ', 'EminemMusic ', 'Ariana Grande ', 'Taylor Swift', 'JustinBieberVEVO ', ' BLACKPINK', 'Alan Walker', 'Shemaroo Filmi Gaane', 'ibighit', 'One Direction', 'Wave Music ', 'Sony Music India ', 'El Reino Infantil', 'Maroon 5 ', 'Trap Nation', 'Speed Records', 'GR6 EXPLODE ', 'TaylorSwiftVEVO ', 'SonyMusicIndiaVEVO', 'Ozuna', 'Daddy Yankee', 'YRF', "Spinnin' Records", 'Bruno Mars', 'RihannaVEVO ', 'T-Series Bhakti Sagar', 'KatyPerryVEVO']
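The printed list shows that some channel names carry stray leading or trailing spaces (for example 'Ed Sheeran ' and ' BLACKPINK'). The notebook keeps them as they are. If clean labels are wanted, one option is the pandas `.str.strip()` accessor. The sketch below uses a hypothetical three-row frame named `raw` rather than the real dataset, and `clean_channels` is an illustrative name.

```python
import pandas as pd

# Hypothetical miniature of the "Channel" column, including the stray spaces
raw = pd.DataFrame({"Channel": ["Ed Sheeran ", " BLACKPINK", "Ozuna"]})

# .str.strip() removes leading and trailing whitespace from every entry
clean_channels = raw["Channel"].str.strip().tolist()
print(clean_channels)
```

Stripping before plotting would only affect how the labels are spelled, not the order or the values of the bars.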
# Subscriber counts in millions (second column) as a list
subs = dataset[dataset.columns[1]].tolist()
print(subs)
[123.0, 54.5, 48.5, 43.2, 40.2, 39.3, 36.8, 33.1, 32.4, 31.7, 31.0, 30.9, 30.4, 30.4, 29.9, 29.2, 29.2, 27.9, 27.4, 27.2, 27.0, 27.0, 27.0, 26.9, 26.8, 26.4, 26.3, 25.9, 25.8, 25.8]
# View counts in millions (third column) as a list
views = dataset[dataset.columns[2]].tolist()
print(views)
[94410, 27860, 22689, 18905, 773, 953, 310, 19326, 8112, 7470, 14708, 7659, 356, 20569, 12077, 26159, 294, 10195, 13769, 13341, 18096, 12577, 13059, 9796, 14253, 15738, 11411, 14768, 10552, 18603]
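# Build a long-format frame: one row per channel and measure,
# with "Type" marking whether the value is a subscriber or view count
data = pd.DataFrame(
    {
        "YouTube Channels": channels + channels,
        "Subscribers/Views in millions": subs + views,
        "Type": ["Subscribers"] * len(subs) + ["Views"] * len(views),
    }
)
data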
| | YouTube Channels | Subscribers/Views in millions | Type |
---|---|---|---|
0 | T-Series | 123.0 | Subscribers |
1 | Canal KondZilla | 54.5 | Subscribers |
2 | Zee Music Company | 48.5 | Subscribers |
3 | Ed Sheeran | 43.2 | Subscribers |
4 | EminemMusic | 40.2 | Subscribers |
5 | Ariana Grande | 39.3 | Subscribers |
6 | Taylor Swift | 36.8 | Subscribers |
7 | JustinBieberVEVO | 33.1 | Subscribers |
8 | BLACKPINK | 32.4 | Subscribers |
9 | Alan Walker | 31.7 | Subscribers |
10 | Shemaroo Filmi Gaane | 31.0 | Subscribers |
11 | ibighit | 30.9 | Subscribers |
12 | One Direction | 30.4 | Subscribers |
13 | Wave Music | 30.4 | Subscribers |
14 | Sony Music India | 29.9 | Subscribers |
15 | El Reino Infantil | 29.2 | Subscribers |
16 | Maroon 5 | 29.2 | Subscribers |
17 | Trap Nation | 27.9 | Subscribers |
18 | Speed Records | 27.4 | Subscribers |
19 | GR6 EXPLODE | 27.2 | Subscribers |
20 | TaylorSwiftVEVO | 27.0 | Subscribers |
21 | SonyMusicIndiaVEVO | 27.0 | Subscribers |
22 | Ozuna | 27.0 | Subscribers |
23 | Daddy Yankee | 26.9 | Subscribers |
24 | YRF | 26.8 | Subscribers |
25 | Spinnin' Records | 26.4 | Subscribers |
26 | Bruno Mars | 26.3 | Subscribers |
27 | RihannaVEVO | 25.9 | Subscribers |
28 | T-Series Bhakti Sagar | 25.8 | Subscribers |
29 | KatyPerryVEVO | 25.8 | Subscribers |
30 | T-Series | 94410.0 | Views |
31 | Canal KondZilla | 27860.0 | Views |
32 | Zee Music Company | 22689.0 | Views |
33 | Ed Sheeran | 18905.0 | Views |
34 | EminemMusic | 773.0 | Views |
35 | Ariana Grande | 953.0 | Views |
36 | Taylor Swift | 310.0 | Views |
37 | JustinBieberVEVO | 19326.0 | Views |
38 | BLACKPINK | 8112.0 | Views |
39 | Alan Walker | 7470.0 | Views |
40 | Shemaroo Filmi Gaane | 14708.0 | Views |
41 | ibighit | 7659.0 | Views |
42 | One Direction | 356.0 | Views |
43 | Wave Music | 20569.0 | Views |
44 | Sony Music India | 12077.0 | Views |
45 | El Reino Infantil | 26159.0 | Views |
46 | Maroon 5 | 294.0 | Views |
47 | Trap Nation | 10195.0 | Views |
48 | Speed Records | 13769.0 | Views |
49 | GR6 EXPLODE | 13341.0 | Views |
50 | TaylorSwiftVEVO | 18096.0 | Views |
51 | SonyMusicIndiaVEVO | 12577.0 | Views |
52 | Ozuna | 13059.0 | Views |
53 | Daddy Yankee | 9796.0 | Views |
54 | YRF | 14253.0 | Views |
55 | Spinnin' Records | 15738.0 | Views |
56 | Bruno Mars | 11411.0 | Views |
57 | RihannaVEVO | 14768.0 | Views |
58 | T-Series Bhakti Sagar | 10552.0 | Views |
59 | KatyPerryVEVO | 18603.0 | Views |
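The long-format frame above is assembled by concatenating Python lists. A possible alternative, not used in the notebook, is `DataFrame.melt()`, which performs the same wide-to-long reshape directly on a frame. The sketch below is a minimal illustration on the first two rows of the dataset; the names `wide` and `long_df` and the intermediate column renaming are assumptions made for the example.

```python
import pandas as pd

# First two rows of the dataset, in the original wide layout
wide = pd.DataFrame(
    {
        "Channel": ["T-Series", "Canal KondZilla"],
        "Subs (in millions)": [123.0, 54.5],
        "Views (in millions)": [94410, 27860],
    }
)

# Rename the columns so the melted "Type" values read "Subscribers" and "Views"
renamed = wide.rename(
    columns={
        "Channel": "YouTube Channels",
        "Subs (in millions)": "Subscribers",
        "Views (in millions)": "Views",
    }
)

# melt() stacks the two measure columns into one value column plus a label column
long_df = renamed.melt(
    id_vars="YouTube Channels",
    var_name="Type",
    value_name="Subscribers/Views in millions",
)
print(long_df)
```

The result has the same three columns as `data`, in a different order, with all subscriber rows followed by all view rows.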
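Visualisation
# Plot a FacetGrid with two columns, one per value of "Type".
# sharex=False gives each facet its own x-axis scale, since view counts
# are orders of magnitude larger than subscriber counts.
g = sns.FacetGrid(data, col="Type", hue="Type", sharex=False, height=8)
g.map(sns.barplot, "Subscribers/Views in millions", "YouTube Channels")
plt.show()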
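Seaborn also provides `catplot()`, a figure-level function that creates the FacetGrid and draws the categorical plot in a single call. The following is a minimal sketch of that approach, not the notebook's method. The four-row `sample` frame reuses values from the tables above, and the names `sample` and `g` and the smaller `height` are choices made for the example.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Small sample in the same long format as `data` (values from the tables above)
sample = pd.DataFrame(
    {
        "YouTube Channels": ["T-Series", "Canal KondZilla"] * 2,
        "Subscribers/Views in millions": [123.0, 54.5, 94410.0, 27860.0],
        "Type": ["Subscribers"] * 2 + ["Views"] * 2,
    }
)

# kind="bar" draws bar plots; col="Type" creates one facet per measure,
# and sharex=False lets each facet use its own x-axis scale
g = sns.catplot(
    data=sample,
    kind="bar",
    x="Subscribers/Views in millions",
    y="YouTube Channels",
    col="Type",
    sharex=False,
    height=4,
)
plt.show()
```

As with `FacetGrid`, the returned object exposes the individual axes through `g.axes`, so titles and labels can be adjusted afterwards.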