YouTube

This notebook visualizes the total number of subscribers and the total number of views for the top 30 YouTube channels in the music category (as of January 2020), using the FacetGrid class provided by Seaborn.

Importing libraries and packages

# Warnings
import warnings

# Data loading and manipulation
import pandas as pd

# Plotting
import matplotlib.pyplot as plt
import seaborn as sns

# Apply Seaborn's default plot theme
sns.set()
# Suppress all warnings to keep the notebook output clean
warnings.filterwarnings("ignore")

Setting paths

# Path to datasets directory
data_path = "./datasets"
# Path to assets directory (for saving results)
assets_path = "./assets"

Loading dataset

dataset = pd.read_csv(f"{data_path}/YouTube.csv")

Exploring dataset

# Shape of the dataset
print("Shape of the dataset: ", dataset.shape)
# Display the full dataset
dataset
Shape of the dataset:  (30, 3)
    Channel                Subs (in millions)  Views (in millions)
 0  T-Series                            123.0                94410
 1  Canal KondZilla                      54.5                27860
 2  Zee Music Company                    48.5                22689
 3  Ed Sheeran                           43.2                18905
 4  EminemMusic                          40.2                  773
 5  Ariana Grande                        39.3                  953
 6  Taylor Swift                         36.8                  310
 7  JustinBieberVEVO                     33.1                19326
 8  BLACKPINK                            32.4                 8112
 9  Alan Walker                          31.7                 7470
10  Shemaroo Filmi Gaane                 31.0                14708
11  ibighit                              30.9                 7659
12  One Direction                        30.4                  356
13  Wave Music                           30.4                20569
14  Sony Music India                     29.9                12077
15  El Reino Infantil                    29.2                26159
16  Maroon 5                             29.2                  294
17  Trap Nation                          27.9                10195
18  Speed Records                        27.4                13769
19  GR6 EXPLODE                          27.2                13341
20  TaylorSwiftVEVO                      27.0                18096
21  SonyMusicIndiaVEVO                   27.0                12577
22  Ozuna                                27.0                13059
23  Daddy Yankee                         26.9                 9796
24  YRF                                  26.8                14253
25  Spinnin' Records                     26.4                15738
26  Bruno Mars                           26.3                11411
27  RihannaVEVO                          25.9                14768
28  T-Series Bhakti Sagar                25.8                10552
29  KatyPerryVEVO                        25.8                18603

Preprocessing

# Channel names (first column)
channels = dataset[dataset.columns[0]].tolist()
print(channels)
['T-Series', 'Canal KondZilla', 'Zee Music Company', 'Ed Sheeran ', 'EminemMusic ', 'Ariana Grande ', 'Taylor Swift', 'JustinBieberVEVO ', ' BLACKPINK', 'Alan Walker', 'Shemaroo Filmi Gaane', 'ibighit', 'One Direction', 'Wave Music ', 'Sony Music India ', 'El Reino Infantil', 'Maroon 5 ', 'Trap Nation', 'Speed Records', 'GR6 EXPLODE ', 'TaylorSwiftVEVO ', 'SonyMusicIndiaVEVO', 'Ozuna', 'Daddy Yankee', 'YRF', "Spinnin' Records", 'Bruno Mars', 'RihannaVEVO ', 'T-Series Bhakti Sagar', 'KatyPerryVEVO']
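The printed list shows that several channel names carry stray leading or trailing whitespace (for example 'Ed Sheeran ' and ' BLACKPINK'). This does not change the plot further down, but it can cause surprises when names are matched or looked up. A minimal sketch of one way to clean them, using a few names copied from the output above; `channels_sample` and `cleaned_channels` are names introduced here for illustration and are not part of the original notebook:

```python
# A few names copied verbatim from the printed list above,
# including their stray leading/trailing spaces.
channels_sample = ["T-Series", "Ed Sheeran ", "EminemMusic ", " BLACKPINK"]

# str.strip() removes whitespace at both ends and leaves inner spaces intact.
cleaned_channels = [name.strip() for name in channels_sample]

print(cleaned_channels)
# ['T-Series', 'Ed Sheeran', 'EminemMusic', 'BLACKPINK']
```

On the full DataFrame, a comparable pandas call would be `dataset["Channel"].str.strip()`, assuming the first column is named `Channel` as in the displayed table.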
# Subscriber counts in millions (second column)
subs = dataset[dataset.columns[1]].tolist()
print(subs)
[123.0, 54.5, 48.5, 43.2, 40.2, 39.3, 36.8, 33.1, 32.4, 31.7, 31.0, 30.9, 30.4, 30.4, 29.9, 29.2, 29.2, 27.9, 27.4, 27.2, 27.0, 27.0, 27.0, 26.9, 26.8, 26.4, 26.3, 25.9, 25.8, 25.8]
# View counts in millions (third column)
views = dataset[dataset.columns[2]].tolist()
print(views)
[94410, 27860, 22689, 18905, 773, 953, 310, 19326, 8112, 7470, 14708, 7659, 356, 20569, 12077, 26159, 294, 10195, 13769, 13341, 18096, 12577, 13059, 9796, 14253, 15738, 11411, 14768, 10552, 18603]
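# Stack subscribers and views into one long-format DataFrame;
# the "Type" column records which measure each row holds.
data = pd.DataFrame(
    {
        "YouTube Channels": channels + channels,
        "Subscribers/Views in millions": subs + views,
        "Type": ["Subscribers"] * len(subs) + ["Views"] * len(views),
    }
)
data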
    YouTube Channels       Subscribers/Views in millions  Type
 0  T-Series                                       123.0  Subscribers
 1  Canal KondZilla                                 54.5  Subscribers
 2  Zee Music Company                               48.5  Subscribers
 3  Ed Sheeran                                      43.2  Subscribers
 4  EminemMusic                                     40.2  Subscribers
 5  Ariana Grande                                   39.3  Subscribers
 6  Taylor Swift                                    36.8  Subscribers
 7  JustinBieberVEVO                                33.1  Subscribers
 8  BLACKPINK                                       32.4  Subscribers
 9  Alan Walker                                     31.7  Subscribers
10  Shemaroo Filmi Gaane                            31.0  Subscribers
11  ibighit                                         30.9  Subscribers
12  One Direction                                   30.4  Subscribers
13  Wave Music                                      30.4  Subscribers
14  Sony Music India                                29.9  Subscribers
15  El Reino Infantil                               29.2  Subscribers
16  Maroon 5                                        29.2  Subscribers
17  Trap Nation                                     27.9  Subscribers
18  Speed Records                                   27.4  Subscribers
19  GR6 EXPLODE                                     27.2  Subscribers
20  TaylorSwiftVEVO                                 27.0  Subscribers
21  SonyMusicIndiaVEVO                              27.0  Subscribers
22  Ozuna                                           27.0  Subscribers
23  Daddy Yankee                                    26.9  Subscribers
24  YRF                                             26.8  Subscribers
25  Spinnin' Records                                26.4  Subscribers
26  Bruno Mars                                      26.3  Subscribers
27  RihannaVEVO                                     25.9  Subscribers
28  T-Series Bhakti Sagar                           25.8  Subscribers
29  KatyPerryVEVO                                   25.8  Subscribers
30  T-Series                                     94410.0  Views
31  Canal KondZilla                              27860.0  Views
32  Zee Music Company                            22689.0  Views
33  Ed Sheeran                                   18905.0  Views
34  EminemMusic                                    773.0  Views
35  Ariana Grande                                  953.0  Views
36  Taylor Swift                                   310.0  Views
37  JustinBieberVEVO                             19326.0  Views
38  BLACKPINK                                     8112.0  Views
39  Alan Walker                                   7470.0  Views
40  Shemaroo Filmi Gaane                         14708.0  Views
41  ibighit                                       7659.0  Views
42  One Direction                                  356.0  Views
43  Wave Music                                   20569.0  Views
44  Sony Music India                             12077.0  Views
45  El Reino Infantil                            26159.0  Views
46  Maroon 5                                       294.0  Views
47  Trap Nation                                  10195.0  Views
48  Speed Records                                13769.0  Views
49  GR6 EXPLODE                                  13341.0  Views
50  TaylorSwiftVEVO                              18096.0  Views
51  SonyMusicIndiaVEVO                           12577.0  Views
52  Ozuna                                        13059.0  Views
53  Daddy Yankee                                  9796.0  Views
54  YRF                                          14253.0  Views
55  Spinnin' Records                             15738.0  Views
56  Bruno Mars                                   11411.0  Views
57  RihannaVEVO                                  14768.0  Views
58  T-Series Bhakti Sagar                        10552.0  Views
59  KatyPerryVEVO                                18603.0  Views
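The long-format table above is built by concatenating Python lists by hand. As an alternative sketch, pandas' `melt` can produce a similar layout directly from the wide DataFrame. The example below uses only the first three rows of the dataset to stay short; the names `wide` and `long_df` are introduced here for illustration, and the column renaming is an assumption made so that the `Type` values match the ones used above.

```python
import pandas as pd

# First three rows of the dataset shown earlier (wide format).
wide = pd.DataFrame(
    {
        "Channel": ["T-Series", "Canal KondZilla", "Zee Music Company"],
        "Subs (in millions)": [123.0, 54.5, 48.5],
        "Views (in millions)": [94410, 27860, 22689],
    }
)

# Rename the columns first so that the melted "Type" values read
# "Subscribers" and "Views", as in the hand-built table.
renamed = wide.rename(
    columns={
        "Channel": "YouTube Channels",
        "Subs (in millions)": "Subscribers",
        "Views (in millions)": "Views",
    }
)

# melt() stacks the two value columns into one column and records
# the source column name of each value in "Type".
long_df = renamed.melt(
    id_vars="YouTube Channels",
    value_vars=["Subscribers", "Views"],
    var_name="Type",
    value_name="Subscribers/Views in millions",
)

print(long_df)
```

The rows come out in the same order as in the hand-built table (all subscriber rows, then all view rows), but `melt` places the `Type` column before the value column.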

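Visualisation

# Plot a FacetGrid with two columns, one per "Type" (Subscribers, Views).
# sharex=False gives each panel its own x-axis, because view counts are
# far larger than subscriber counts.
g = sns.FacetGrid(data, col="Type", hue="Type", sharex=False, height=8)
# Pass the channel order explicitly so both panels list the channels identically
g.map(sns.barplot, "Subscribers/Views in millions", "YouTube Channels", order=channels)
plt.show()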
[Figure: FacetGrid with two horizontal bar-plot panels (Subscribers, Views); x-axis "Subscribers/Views in millions", y-axis "YouTube Channels"]
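The FacetGrid call sets `sharex=False`. A quick back-of-the-envelope check with the largest values from the dataset suggests why separate x-axes help: on a shared axis scaled to the largest view count, even the longest subscriber bar would be barely visible. The variable names below are introduced here for illustration.

```python
# Largest values in each group, taken from the dataset above (T-Series).
max_subs = 123.0   # subscribers, in millions
max_views = 94410  # views, in millions

# Fraction of a shared x-axis that the longest subscriber bar would span
# if the axis were scaled to fit the largest view count.
fraction = max_subs / max_views

print(f"{fraction:.2%}")
# 0.13%
```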