Would you like to inspect the original subtitles? These are the user-uploaded subtitles that are being translated:
1
00:00:00,000 --> 00:00:02,280
yo what's up YouTube it's been a while
2
00:00:02,280 --> 00:00:04,019
since I made a tutorial but it's great
3
00:00:04,019 --> 00:00:06,000
to be back recently I've been getting a
4
00:00:06,000 --> 00:00:07,680
lot of questions about what projects
5
00:00:07,680 --> 00:00:09,780
should I build to get a job to be honest
6
00:00:09,780 --> 00:00:11,400
there's no one project that will
7
00:00:11,400 --> 00:00:13,080
guarantee that you'll get a job how I
8
00:00:13,080 --> 00:00:14,340
like to look at it is that you should
9
00:00:14,340 --> 00:00:16,199
build something that you're able to talk
10
00:00:16,199 --> 00:00:18,420
about for at least 10 minutes during an
11
00:00:18,420 --> 00:00:20,279
interview for example if you built a
12
00:00:20,279 --> 00:00:21,840
calculator app do you think that you can
13
00:00:21,840 --> 00:00:25,080
talk about it for at least 10 minutes so
14
00:00:25,080 --> 00:00:26,699
I built this calculator app that can add
15
00:00:26,699 --> 00:00:28,920
numbers subtract numbers multiply etc
16
00:00:28,920 --> 00:00:31,260
it's super cool
17
00:00:31,260 --> 00:00:33,239
um yeah that's about it as you can see
18
00:00:33,239 --> 00:00:34,860
this calculator app wasn't that
19
00:00:34,860 --> 00:00:36,840
interesting I wasn't passionate about it
20
00:00:36,840 --> 00:00:38,399
and I couldn't really talk about it that
21
00:00:38,399 --> 00:00:40,800
much however if you build something that
22
00:00:40,800 --> 00:00:42,540
can solve a real life problem that
23
00:00:42,540 --> 00:00:44,579
you're facing you'll be more inclined to
24
00:00:44,579 --> 00:00:46,680
talk more about it so for today's video
25
00:00:46,680 --> 00:00:48,480
I want to talk about a problem that I've
26
00:00:48,480 --> 00:00:50,760
been facing recently and how I was able
27
00:00:50,760 --> 00:00:52,500
to solve it with python automation
28
00:00:52,500 --> 00:00:54,539
remember folks if you ever find yourself
29
00:00:54,539 --> 00:00:57,059
doing a manual task over and over
30
00:00:57,059 --> 00:00:59,160
there's probably a way that you can
31
00:00:59,160 --> 00:01:01,800
automate it with code so without further
32
00:01:01,800 --> 00:01:04,140
ado let's get into it so before we touch
33
00:01:04,140 --> 00:01:06,060
any code we need to identify a problem
34
00:01:06,060 --> 00:01:07,979
that we want to solve with automation
35
00:01:07,979 --> 00:01:10,140
for those of you that don't know I help
36
00:01:10,140 --> 00:01:12,000
my girlfriend run a TikTok account
37
00:01:12,000 --> 00:01:14,340
for our cat papaya you guys should
38
00:01:14,340 --> 00:01:15,140
follow
39
00:01:15,140 --> 00:01:17,580
papayohoe.cat on TikTok
40
00:01:17,580 --> 00:01:19,320
currently if you want to be a content
41
00:01:19,320 --> 00:01:21,119
creator you have to spread yourself out
42
00:01:21,119 --> 00:01:23,159
and repost your content on multiple
43
00:01:23,159 --> 00:01:25,200
platforms so that way you can grow we've
44
00:01:25,200 --> 00:01:26,759
been creating our own content on TikTok
45
00:01:26,759 --> 00:01:28,560
but we also want to grow our
46
00:01:28,560 --> 00:01:30,479
presence on YouTube shorts the problem
47
00:01:30,479 --> 00:01:33,600
is we have over 500 TikTok videos and
48
00:01:33,600 --> 00:01:35,460
ain't nobody got time to manually do
49
00:01:35,460 --> 00:01:37,619
this now moving forward we need to
50
00:01:37,619 --> 00:01:39,299
identify the steps to solve the problem
51
00:01:39,299 --> 00:01:41,579
manually so first I would open up the
52
00:01:41,579 --> 00:01:43,799
TikTok website click the share button
53
00:01:43,799 --> 00:01:46,500
copy the link and use some third-party
54
00:01:46,500 --> 00:01:48,840
websites to download the video without a
55
00:01:48,840 --> 00:01:50,820
watermark and then I would upload it to
56
00:01:50,820 --> 00:01:52,979
YouTube shorts everything I'm doing here
57
00:01:52,979 --> 00:01:55,979
is totally fine but it's very tedious
58
00:01:55,979 --> 00:01:58,560
and cumbersome and it takes a lot of
59
00:01:58,560 --> 00:02:00,840
time so since I know Python and I know
60
00:02:00,840 --> 00:02:03,360
how to code I can automate this and
61
00:02:03,360 --> 00:02:04,439
that's what I'll be showing you guys
62
00:02:04,439 --> 00:02:06,299
today so before you watch this video
63
00:02:06,299 --> 00:02:08,098
make sure you know some python so that
64
00:02:08,098 --> 00:02:09,840
you can follow along if you don't have
65
00:02:09,840 --> 00:02:12,060
any python experience don't worry I have
66
00:02:12,060 --> 00:02:13,739
a playlist over here it's completely
67
00:02:13,739 --> 00:02:15,959
free and taught by myself so check it out
68
00:02:15,959 --> 00:02:18,180
without further ado let's go coding
69
00:02:18,180 --> 00:02:19,140
time
70
00:02:19,140 --> 00:02:21,000
alright so first things first let's open
71
00:02:21,000 --> 00:02:23,459
up Papaya's TikTok page and also just
72
00:02:23,459 --> 00:02:25,319
a reminder that this is just an example
73
00:02:25,319 --> 00:02:27,360
so feel free to use any TikTok page
74
00:02:27,360 --> 00:02:28,980
that you want to automate downloading
75
00:02:28,980 --> 00:02:31,020
videos for to get the link to the video
76
00:02:31,020 --> 00:02:33,480
all we have to do is open the video and
77
00:02:33,480 --> 00:02:35,520
copy the address at the top this method
78
00:02:35,520 --> 00:02:37,800
requires a lot of clicking so instead we
79
00:02:37,800 --> 00:02:39,840
should think like programmers every
80
00:02:39,840 --> 00:02:42,900
website basically uses HTML in order to
81
00:02:42,900 --> 00:02:44,640
link a user to another page we have to
82
00:02:44,640 --> 00:02:47,400
use an a tag and provide an href with
83
00:02:47,400 --> 00:02:49,500
the URL that we want to link the user to
84
00:02:49,500 --> 00:02:52,080
so to access the HTML on the page all we
85
00:02:52,080 --> 00:02:53,580
have to do is right click and click
86
00:02:53,580 --> 00:02:55,800
inspect and this opens the console tool
87
00:02:55,800 --> 00:02:58,140
so that we can inspect the HTML on the
88
00:02:58,140 --> 00:03:00,180
page and as you move your cursor around
89
00:03:00,180 --> 00:03:02,040
you'll see that it will highlight each
90
00:03:02,040 --> 00:03:04,980
HTML element so for our case we want to
91
00:03:04,980 --> 00:03:07,260
see the a tags for each of these videos
92
00:03:07,260 --> 00:03:09,660
and when I move my mouse here I see that
93
00:03:09,660 --> 00:03:11,700
the stuff that I want is highlighted so
94
00:03:11,700 --> 00:03:13,800
let's expand this so click the arrow to
95
00:03:13,800 --> 00:03:15,599
expand and now as you can see there are
96
00:03:15,599 --> 00:03:17,940
two divs the first one is for videos and
97
00:03:17,940 --> 00:03:20,340
liked and the second one is for all of
98
00:03:20,340 --> 00:03:22,980
the videos so now let's expand this and
99
00:03:22,980 --> 00:03:25,620
let's expand this one and here we go we
100
00:03:25,620 --> 00:03:28,200
got each individual video so now let's
101
00:03:28,200 --> 00:03:30,720
open the first one and let's try to find
102
00:03:30,720 --> 00:03:32,940
the a tag I'm going to expand this
103
00:03:32,940 --> 00:03:36,300
expand this expand this and look at that
104
00:03:36,300 --> 00:03:38,640
we found the a tag and I'm just going to
105
00:03:38,640 --> 00:03:40,200
click on it just to make sure that it
106
00:03:40,200 --> 00:03:42,060
works and this looks like the correct
107
00:03:42,060 --> 00:03:44,459
link so let's close this so now that we
108
00:03:44,459 --> 00:03:46,260
confirmed that this works the next thing
109
00:03:46,260 --> 00:03:48,299
we need to do is find a way to get this
110
00:03:48,299 --> 00:03:50,700
HTML document so that way we can grab
111
00:03:50,700 --> 00:03:53,400
all the hrefs for each video so we can
112
00:03:53,400 --> 00:03:55,560
achieve this very easily by using python
113
00:03:55,560 --> 00:03:58,319
selenium and beautiful soup so now open
114
00:03:58,319 --> 00:04:00,480
a new file and give it a name and save
115
00:04:00,480 --> 00:04:04,099
it and inside here type from selenium
116
00:04:04,099 --> 00:04:07,620
import webdriver and this will basically
117
00:04:07,620 --> 00:04:09,780
import the selenium Library which
118
00:04:09,780 --> 00:04:11,640
basically allows you to write automation
119
00:04:11,640 --> 00:04:14,519
code that opens a web browser and then
120
00:04:14,519 --> 00:04:17,100
it can click stuff search stuff etc
121
00:04:17,100 --> 00:04:19,019
you can kind of think of this as a bot
122
00:04:19,019 --> 00:04:21,860
so cool next just type driver equals
123
00:04:21,860 --> 00:04:23,759
webdriver.Chrome which basically
124
00:04:23,759 --> 00:04:25,560
specifies the browser that we want to
125
00:04:25,560 --> 00:04:27,419
use so in this case we want to use
126
00:04:27,419 --> 00:04:30,780
Chrome next type driver.get and then
127
00:04:30,780 --> 00:04:33,000
open the parenthesis and inside here
128
00:04:33,000 --> 00:04:35,220
open quotation marks and paste the link
129
00:04:35,220 --> 00:04:37,139
of the website that you want to get so
130
00:04:37,139 --> 00:04:38,699
in this case we want to get the TikTok
131
00:04:38,699 --> 00:04:41,040
website and next we want to add a
132
00:04:41,040 --> 00:04:43,380
delay to the code so that way it waits
133
00:04:43,380 --> 00:04:45,360
for the website to load so we can do
134
00:04:45,360 --> 00:04:47,340
time.sleep and then open the
135
00:04:47,340 --> 00:04:49,620
parenthesis and put one for one second
136
00:04:49,620 --> 00:04:52,139
and in order to use time in Python we
137
00:04:52,139 --> 00:04:54,360
have to import time so let's type import
138
00:04:54,360 --> 00:04:57,000
time cool so now that we have loaded the
139
00:04:57,000 --> 00:04:59,460
web page we want to parse the HTML
140
00:04:59,460 --> 00:05:02,100
inside it and a great library for this
141
00:05:02,100 --> 00:05:04,139
is beautiful soup so to use beautiful
142
00:05:04,139 --> 00:05:06,479
soup all we have to do is import it so
143
00:05:06,479 --> 00:05:08,840
we have to type from bs4 import
144
00:05:08,840 --> 00:05:11,699
BeautifulSoup and now in our code we
145
00:05:11,699 --> 00:05:14,580
can do soup equals BeautifulSoup and
146
00:05:14,580 --> 00:05:16,080
then open the parenthesis and type
147
00:05:16,080 --> 00:05:19,380
driver.page_source which
148
00:05:19,380 --> 00:05:21,120
will give us the HTML Source from the
149
00:05:21,120 --> 00:05:23,340
page and now hit comma space and open
150
00:05:23,340 --> 00:05:25,580
the quotes and inside here type
151
00:05:25,580 --> 00:05:28,020
html.parser to specify that we want to
152
00:05:28,020 --> 00:05:30,660
parse HTML cool now in the next line
153
00:05:30,660 --> 00:05:33,240
just type print open the parenthesis and
154
00:05:33,240 --> 00:05:36,360
do soup.prettify and then open the
155
00:05:36,360 --> 00:05:38,220
parenthesis this line will let us test
156
00:05:38,220 --> 00:05:40,440
whether our code works and prettify just
157
00:05:40,440 --> 00:05:42,360
makes the HTML look pretty so that it's
158
00:05:42,360 --> 00:05:44,639
easy for us to read now hit save and
159
00:05:44,639 --> 00:05:46,080
save this code somewhere on your
160
00:05:46,080 --> 00:05:48,360
computer and now open your terminal and
161
00:05:48,360 --> 00:05:49,979
in VS Code at the top you can click
162
00:05:49,979 --> 00:05:52,320
terminal and click new terminal and this
163
00:05:52,320 --> 00:05:54,419
will just open a terminal and to run
164
00:05:54,419 --> 00:05:55,740
this code you can type the command
165
00:05:55,740 --> 00:05:58,620
Python and I'm using python3 so I'll
166
00:05:58,620 --> 00:06:01,440
type three space and then type the
167
00:06:01,440 --> 00:06:03,600
name of your file so for me it's scrape
168
00:06:03,600 --> 00:06:05,759
underscore video dot py if you don't have
169
00:06:05,759 --> 00:06:07,440
python on your computer make sure to
170
00:06:07,440 --> 00:06:09,539
install it installing python is out of
171
00:06:09,539 --> 00:06:11,280
the scope for this tutorial so I won't
172
00:06:11,280 --> 00:06:12,780
be covering that and if you run into
173
00:06:12,780 --> 00:06:15,240
errors related to missing dependencies
174
00:06:15,240 --> 00:06:17,880
you can fix that very easily by typing
175
00:06:17,880 --> 00:06:19,919
pip and in my case since I'm using
176
00:06:19,919 --> 00:06:22,500
python3 I have to put a 3 at the end and
177
00:06:22,500 --> 00:06:24,840
then you can just do install and
178
00:06:24,840 --> 00:06:26,880
basically the name of the module so here
179
00:06:26,880 --> 00:06:29,039
you can type selenium and then we want
180
00:06:29,039 --> 00:06:31,380
to install beautiful soup which is bs4
181
00:06:31,380 --> 00:06:34,440
like this and then hit enter and this
182
00:06:34,440 --> 00:06:35,639
will basically install the libraries
183
00:06:35,639 --> 00:06:37,860
that we need to run this code once
184
00:06:37,860 --> 00:06:39,419
everything is set up feel free to run
185
00:06:39,419 --> 00:06:41,639
the code and as you can see it opened a
186
00:06:41,639 --> 00:06:43,919
web browser and then it loads the page
187
00:06:43,919 --> 00:06:46,259
and after one second the page closes
188
00:06:46,259 --> 00:06:48,360
because the code finished executing and
189
00:06:48,360 --> 00:06:50,220
now if you look inside my terminal I
190
00:06:50,220 --> 00:06:52,800
basically got the HTML for the page cool
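The page-loading step described so far could be sketched roughly like this. It is only a sketch, assuming Selenium and a Chrome driver are installed; the helper name `fetch_page_html`, its parameters, and the account URL in the comments are made up for illustration, and the one-second wait mirrors the `time.sleep(1)` from the walkthrough.

```python
import time


def fetch_page_html(driver, url, wait_seconds=1):
    """Load `url` in the browser, wait briefly, and return the rendered HTML."""
    driver.get(url)            # navigate the browser to the page
    time.sleep(wait_seconds)   # crude fixed wait so the page has time to load
    return driver.page_source  # the HTML as the browser currently sees it


# Usage with a real browser (needs `pip3 install selenium bs4`):
#
#   from selenium import webdriver
#   from bs4 import BeautifulSoup
#
#   driver = webdriver.Chrome()
#   html = fetch_page_html(driver, "https://www.tiktok.com/@some_account")  # placeholder URL
#   soup = BeautifulSoup(html, "html.parser")
#   print(soup.prettify())
#   driver.quit()
```

Passing the driver in as a parameter is a design choice of this sketch, not something from the video; it keeps the helper easy to try out with a stand-in object before opening a real browser.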
191
00:06:52,800 --> 00:06:55,139
so now we basically have this HTML
192
00:06:55,139 --> 00:06:57,419
within our python code and the cool
193
00:06:57,419 --> 00:06:59,220
thing about beautiful soup is that it's
194
00:06:59,220 --> 00:07:02,940
able to find elements within the HTML so
195
00:07:02,940 --> 00:07:05,039
for example each video is within a div
196
00:07:05,039 --> 00:07:09,419
with a class called tiktok-x6y
197
00:07:09,419 --> 00:07:11,400
etc and they all have the same class
198
00:07:11,400 --> 00:07:13,740
name so basically with beautiful soup we
199
00:07:13,740 --> 00:07:15,960
can specify that we only want divs with
200
00:07:15,960 --> 00:07:19,500
the name tiktok-x6y etc and it
201
00:07:19,500 --> 00:07:21,120
will select all of them and return it
202
00:07:21,120 --> 00:07:22,979
back as a list and to make our lives
203
00:07:22,979 --> 00:07:24,660
easier we should try to get the div
204
00:07:24,660 --> 00:07:27,539
closest to the a tag and the div that
205
00:07:27,539 --> 00:07:31,319
contains the a tag is tiktok-yz6i
206
00:07:31,319 --> 00:07:34,560
etc and just to be safe we should double
207
00:07:34,560 --> 00:07:36,479
check that all the other divs have the
208
00:07:36,479 --> 00:07:38,520
same naming so let's open the second
209
00:07:38,520 --> 00:07:41,400
video and open the first div and open
210
00:07:41,400 --> 00:07:44,699
this div and open this div as well and
211
00:07:44,699 --> 00:07:46,740
here we see the a tag and now if we look
212
00:07:46,740 --> 00:07:48,960
at the class we see
213
00:07:48,960 --> 00:07:51,060
tiktok-yz6ijl
214
00:07:51,060 --> 00:07:53,280
etc which basically matches the one at
215
00:07:53,280 --> 00:07:55,259
the top so this makes our lives very
216
00:07:55,259 --> 00:07:57,120
easy so now we can just look for the div
217
00:07:57,120 --> 00:07:59,340
with this class name so let's copy this
218
00:07:59,340 --> 00:08:01,319
and now let's go back to our code and
219
00:08:01,319 --> 00:08:03,560
now remove this line and type videos
220
00:08:03,560 --> 00:08:07,380
equals soup.find_all and
221
00:08:07,380 --> 00:08:09,120
now open the parenthesis and open the
222
00:08:09,120 --> 00:08:11,220
quotation marks and type div and then
223
00:08:11,220 --> 00:08:13,800
add a comma and then space and then open
224
00:08:13,800 --> 00:08:15,539
the curly brackets and now inside
225
00:08:15,539 --> 00:08:17,460
here open the quotation marks and type
226
00:08:17,460 --> 00:08:20,759
class and then add a colon and open
227
00:08:20,759 --> 00:08:22,680
the quotation marks and paste that class
228
00:08:22,680 --> 00:08:24,840
name and all we're doing here is we're
229
00:08:24,840 --> 00:08:27,599
saying we want to find all the divs with
230
00:08:27,599 --> 00:08:31,379
the class name tiktok-yz etc and I'll
231
00:08:31,379 --> 00:08:33,719
do print and let's get the length of the
232
00:08:33,719 --> 00:08:35,458
videos so that way we know how many
233
00:08:35,458 --> 00:08:37,679
videos we got back and then next we can
234
00:08:37,679 --> 00:08:39,539
just Loop through each video so we can
235
00:08:39,539 --> 00:08:43,260
do for video in videos and add a colon
236
00:08:43,260 --> 00:08:45,120
and go to the next line and now just
237
00:08:45,120 --> 00:08:47,100
type print open the parenthesis and
238
00:08:47,100 --> 00:08:49,560
type video and now we can do Dot and
239
00:08:49,560 --> 00:08:51,480
then we can type a which means it's
240
00:08:51,480 --> 00:08:53,820
getting the a tag within the div and
241
00:08:53,820 --> 00:08:55,620
then open the square brackets and then
242
00:08:55,620 --> 00:08:57,720
open the quotation marks and type href
243
00:08:57,720 --> 00:08:59,399
which means we want to grab the href
244
00:08:59,399 --> 00:09:01,560
value from inside the a tag
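Here is a small sketch of the link-extraction step, run against a hand-written HTML snippet instead of the live page. The class name `tiktok-example-video-div` is a placeholder, since the real class name is only partly spoken in the video, and the function name `extract_video_links` is hypothetical. It assumes the `bs4` package is installed.

```python
from bs4 import BeautifulSoup

# Placeholder: the real class name is truncated in the video, so this one is invented.
VIDEO_DIV_CLASS = "tiktok-example-video-div"


def extract_video_links(html, div_class=VIDEO_DIV_CLASS):
    """Return the href of the first <a> tag inside every matching <div>."""
    soup = BeautifulSoup(html, "html.parser")
    videos = soup.find_all("div", {"class": div_class})
    links = []
    for video in videos:
        # video.a is the first <a> inside the div; skip divs that lack a usable one
        if video.a is not None and video.a.has_attr("href"):
            links.append(video.a["href"])
    return links


sample_html = """
<div class="tiktok-example-video-div"><a href="https://example.com/video/1">one</a></div>
<div class="tiktok-example-video-div"><a href="https://example.com/video/2">two</a></div>
<div class="something-else"><a href="https://example.com/ignored">other</a></div>
"""
print(extract_video_links(sample_html))
```

The extra check for a missing `a` tag or `href` is an addition of this sketch; the walkthrough indexes `video.a["href"]` directly, which raises an error if a div has no link.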
245
00:09:01,560 --> 00:09:03,600
and now let's hit save and now go back
246
00:09:03,600 --> 00:09:04,860
to your terminal and let's run this
247
00:09:04,860 --> 00:09:07,200
command again cool so it opens the page
248
00:09:07,200 --> 00:09:09,600
and then one second later it closes and
249
00:09:09,600 --> 00:09:11,100
now if you look at my terminal again we
250
00:09:11,100 --> 00:09:13,080
basically got all the links of the
251
00:09:13,080 --> 00:09:15,240
videos inside the page and if you look
252
00:09:15,240 --> 00:09:18,240
closely we only got 29 videos this
253
00:09:18,240 --> 00:09:19,320
doesn't really add up because
254
00:09:19,320 --> 00:09:21,720
technically we actually have around 500
255
00:09:21,720 --> 00:09:23,339
videos so it looks like something's not
256
00:09:23,339 --> 00:09:25,019
working properly so let's go back to
257
00:09:25,019 --> 00:09:27,120
this page so one interesting thing about
258
00:09:27,120 --> 00:09:29,459
this page is that as you scroll more
259
00:09:29,459 --> 00:09:32,040
videos get loaded so this is going to be
260
00:09:32,040 --> 00:09:33,779
a problem for our code because we're
261
00:09:33,779 --> 00:09:35,760
only loading whatever we can see
262
00:09:35,760 --> 00:09:37,860
initially so if we want to see more
263
00:09:37,860 --> 00:09:41,100
videos we have to tell selenium to
264
00:09:41,100 --> 00:09:43,320
scroll through the page until it reaches
265
00:09:43,320 --> 00:09:46,140
the bottom so that way we can get all of
266
00:09:46,140 --> 00:09:48,240
the videos now that sounds super
267
00:09:48,240 --> 00:09:49,980
complicated and that's where Google
268
00:09:49,980 --> 00:09:52,140
comes in we should use Google as a
269
00:09:52,140 --> 00:09:54,000
resource to help us find a way to
270
00:09:54,000 --> 00:09:55,920
achieve this so I came across this
271
00:09:55,920 --> 00:09:58,320
medium post by Quan Wei and this does
272
00:09:58,320 --> 00:10:00,360
exactly what we want to do here's the
273
00:10:00,360 --> 00:10:02,339
code that basically handles scrolling
274
00:10:02,339 --> 00:10:04,440
all the way to the bottom of the page so
275
00:10:04,440 --> 00:10:06,000
we only care about this part of the code
276
00:10:06,000 --> 00:10:07,920
so let's copy it and let's go back to
277
00:10:07,920 --> 00:10:09,779
our code and let's paste this in the
278
00:10:09,779 --> 00:10:11,760
middle before we grab the page Source
279
00:10:11,760 --> 00:10:13,680
from the driver and I'm missing an S
280
00:10:13,680 --> 00:10:15,600
here so let's add it back so basically
281
00:10:15,600 --> 00:10:17,880
this code just runs a while loop that
282
00:10:17,880 --> 00:10:20,160
executes JavaScript code to scroll down
283
00:10:20,160 --> 00:10:22,200
and then we have a variable i to keep
284
00:10:22,200 --> 00:10:24,300
track of how many times we have scrolled
285
00:10:24,300 --> 00:10:25,980
and here we tell the program to sleep
286
00:10:25,980 --> 00:10:28,260
for scroll_pause_time which is just one
287
00:10:28,260 --> 00:10:30,060
second and here we execute another
288
00:10:30,060 --> 00:10:32,100
JavaScript function which basically
289
00:10:32,100 --> 00:10:34,019
grabs the new scroll height of the page
290
00:10:34,019 --> 00:10:36,360
and we basically compare the height of
291
00:10:36,360 --> 00:10:38,519
the screen multiplied by the number of
292
00:10:38,519 --> 00:10:40,440
times that we have scrolled and we check
293
00:10:40,440 --> 00:10:42,899
whether this value is greater than the
294
00:10:42,899 --> 00:10:44,399
current scroll height and if it's
295
00:10:44,399 --> 00:10:46,140
greater that basically just means that
296
00:10:46,140 --> 00:10:48,420
we've reached the end of the page cool
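The scrolling logic just described might look something like the sketch below. It is written from the spoken description, not copied from the Medium post, so the function name `scroll_to_bottom` and the return value are assumptions; `execute_script` is the Selenium WebDriver method for running JavaScript in the page.

```python
import time


def scroll_to_bottom(driver, scroll_pause_time=1):
    """Scroll one screen height at a time until the page stops growing.

    Returns the number of scroll steps performed.
    """
    screen_height = driver.execute_script("return window.screen.height;")
    i = 1  # how many screen heights the next scroll should reach
    while True:
        # scroll down to i screen heights from the top of the page
        driver.execute_script(f"window.scrollTo(0, {screen_height * i});")
        i += 1
        time.sleep(scroll_pause_time)  # give lazily loaded videos time to appear
        scroll_height = driver.execute_script("return document.body.scrollHeight;")
        # once the next target position is past the page height, we hit the end
        if screen_height * i > scroll_height:
            break
    return i - 1
```

With a real browser this would be called as `scroll_to_bottom(driver)` right before reading `driver.page_source`. A fixed pause is simple but can stop early on a slow connection, so a longer pause may be needed.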
297
00:10:48,420 --> 00:10:50,519
so now let's hit save and now let's run
298
00:10:50,519 --> 00:10:53,459
our code again hit enter and here it
299
00:10:53,459 --> 00:10:56,100
opens the browser and let's watch some
300
00:10:56,100 --> 00:10:57,000
magic
301
00:10:57,000 --> 00:10:59,100
you see that it's scrolling by itself
302
00:10:59,100 --> 00:11:01,620
isn't that awesome and now I'll just
303
00:11:01,620 --> 00:11:03,300
speed up the video and let it scroll all
304
00:11:03,300 --> 00:11:05,339
the way to the bottom five minutes later
305
00:11:05,339 --> 00:11:07,260
and that took a while because we had a
306
00:11:07,260 --> 00:11:08,760
lot of videos but now look at our
307
00:11:08,760 --> 00:11:11,339
console we actually have a lot of videos
308
00:11:11,339 --> 00:11:13,560
and when I scroll all the way to the top
309
00:11:13,560 --> 00:11:15,839
you're gonna see that we have 457 videos
310
00:11:15,839 --> 00:11:18,120
in total I guess I was off by like 50
311
00:11:18,120 --> 00:11:20,640
but still that's a lot of videos cool
312
00:11:20,640 --> 00:11:22,140
now that we have all these links the
313
00:11:22,140 --> 00:11:23,820
next thing we need to do is just go to
314
00:11:23,820 --> 00:11:25,100
this website
315
00:11:25,100 --> 00:11:27,899
ssstik.io which basically allows you to
316
00:11:27,899 --> 00:11:29,760
download TikTok videos without a
317
00:11:29,760 --> 00:11:31,320
watermark so let's paste the link to one
318
00:11:31,320 --> 00:11:33,060
of the videos and before we hit download
319
00:11:33,060 --> 00:11:35,040
right click on the page and go to
320
00:11:35,040 --> 00:11:37,380
inspect and then go to the Network tab
321
00:11:37,380 --> 00:11:39,540
and basically on this tab it will record
322
00:11:39,540 --> 00:11:42,120
network activity so let me show you what
323
00:11:42,120 --> 00:11:43,920
that means so let's go back to the page
324
00:11:43,920 --> 00:11:45,959
and click download and if you look at
325
00:11:45,959 --> 00:11:47,279
the right you're going to see that three
326
00:11:47,279 --> 00:11:49,380
Network activities happened when we
327
00:11:49,380 --> 00:11:51,360
click the download button and if you
328
00:11:51,360 --> 00:11:53,339
click response you're gonna see that it
329
00:11:53,339 --> 00:11:55,920
returns HTML and if you look here
330
00:11:55,920 --> 00:11:58,079
there's an a tag and if I scroll to the
331
00:11:58,079 --> 00:11:59,760
right and you're going to see that this
332
00:11:59,760 --> 00:12:02,040
a tag is for this one without watermark
333
00:12:02,040 --> 00:12:03,959
so basically we only care about this
334
00:12:03,959 --> 00:12:06,360
href from this a tag and if you right
335
00:12:06,360 --> 00:12:08,519
click on this network activity go to
336
00:12:08,519 --> 00:12:11,820
copy and then click copy as curl go to
337
00:12:11,820 --> 00:12:14,640
the site called curlconverter.com and
338
00:12:14,640 --> 00:12:16,560
inside this text field just paste what
339
00:12:16,560 --> 00:12:18,540
we copied and this will turn the curl
340
00:12:18,540 --> 00:12:20,940
command into python so now scroll to the
341
00:12:20,940 --> 00:12:23,640
bottom and click copy to clipboard and
342
00:12:23,640 --> 00:12:25,260
now let's go back to your code and let's
343
00:12:25,260 --> 00:12:26,880
create a function and call it download
344
00:12:26,880 --> 00:12:29,640
video and open the parenthesis and here
345
00:12:29,640 --> 00:12:31,920
it can take a link as a parameter and
346
00:12:31,920 --> 00:12:34,380
add a colon and hit enter and now
347
00:12:34,380 --> 00:12:36,779
paste the code that we just copied and
348
00:12:36,779 --> 00:12:38,459
now let's move this import to the top of
349
00:12:38,459 --> 00:12:40,800
the file so delete this line and scroll
350
00:12:40,800 --> 00:12:42,839
to the top and let's paste it on line
351
00:12:42,839 --> 00:12:45,360
four and let's fix the spacing and hit
352
00:12:45,360 --> 00:12:47,940
enter and now scroll down and now let's
353
00:12:47,940 --> 00:12:49,740
indent all the code that we just copied
354
00:12:49,740 --> 00:12:52,019
and now let's fix the spacing so
355
00:12:52,019 --> 00:12:53,579
basically all the code here just
356
00:12:53,579 --> 00:12:56,040
represents the data that gets sent over
357
00:12:56,040 --> 00:12:58,320
when we make the request to that website
358
00:12:58,320 --> 00:12:59,880
and if you scroll all the way to the
359
00:12:59,880 --> 00:13:01,500
bottom you're going to see that we make
360
00:13:01,500 --> 00:13:05,459
a request to post data to this link and
361
00:13:05,459 --> 00:13:07,680
we pass in the parameters and the
362
00:13:07,680 --> 00:13:11,040
cookies headers and the data and if you
363
00:13:11,040 --> 00:13:12,660
look in the data field you're going to
364
00:13:12,660 --> 00:13:15,300
see this ID which has a link to the TikTok
365
00:13:15,300 --> 00:13:17,579
video so instead of hard coding a
366
00:13:17,579 --> 00:13:20,519
link let's replace this with link which
367
00:13:20,519 --> 00:13:22,200
is the parameter that we added to this
368
00:13:22,200 --> 00:13:23,760
function so now let's copy the function
369
00:13:23,760 --> 00:13:25,920
name and here instead of printing this
370
00:13:25,920 --> 00:13:28,320
value let's just call our download video
371
00:13:28,320 --> 00:13:30,060
function and now let's go back to the
372
00:13:30,060 --> 00:13:31,860
bottom of the page and after we make
373
00:13:31,860 --> 00:13:33,899
this request we'll get back a response
374
00:13:33,899 --> 00:13:36,779
and since the response is HTML all we
375
00:13:36,779 --> 00:13:38,760
got to do is just use beautiful soup to
376
00:13:38,760 --> 00:13:40,860
parse the HTML data so now let's create
377
00:13:40,860 --> 00:13:43,560
a new variable called download soup and
378
00:13:43,560 --> 00:13:46,500
this will equal beautiful soup open the
379
00:13:46,500 --> 00:13:49,260
parenthesis and type response and if we
380
00:13:49,260 --> 00:13:51,420
dot text this will give us the HTML
381
00:13:51,420 --> 00:13:54,120
inside the response and put a comma and
382
00:13:54,120 --> 00:13:56,519
open the quotation marks and type HTML
383
00:13:56,519 --> 00:13:59,880
dot parser and then hit enter and like I
384
00:13:59,880 --> 00:14:02,040
mentioned before we only care about the
385
00:14:02,040 --> 00:14:04,139
first a tag so we can get that very
386
00:14:04,139 --> 00:14:07,920
easily so let's do download link equals
387
00:14:07,920 --> 00:14:13,139
download soup dot a square bracket href
388
00:14:13,139 --> 00:14:14,639
and now the last thing that we need to
389
00:14:14,639 --> 00:14:17,040
do is to download the file so we have to
390
00:14:17,040 --> 00:14:19,139
import another Library so let's scroll
391
00:14:19,139 --> 00:14:21,000
to the top and now in line five just
392
00:14:21,000 --> 00:14:25,500
type from urllib.request import urlopen
393
00:14:25,500 --> 00:14:28,139
and hit enter and basically we will
394
00:14:28,139 --> 00:14:30,360
use this library to allow us to download
395
00:14:30,360 --> 00:14:32,820
the raw data of the file so now let's go
396
00:14:32,820 --> 00:14:34,920
back to the bottom of the page and after
397
00:14:34,920 --> 00:14:36,600
the download link let's create a new
398
00:14:36,600 --> 00:14:40,620
variable called mp4_file equals urlopen
399
00:14:40,620 --> 00:14:42,839
and then we can pass it the download
400
00:14:42,839 --> 00:14:45,060
link and this will download the file as
401
00:14:45,060 --> 00:14:47,699
raw data and now all we have to do is
402
00:14:47,699 --> 00:14:50,100
save this file onto our computer so
403
00:14:50,100 --> 00:14:52,260
let's do with open
404
00:14:52,260 --> 00:14:54,839
and then open the parenthesis and let's
405
00:14:54,839 --> 00:14:57,360
use an F string here so F quotation
406
00:14:57,360 --> 00:14:59,880
marks and I want to put it in a folder
407
00:14:59,880 --> 00:15:02,639
called videos slash and open the curly
408
00:15:02,639 --> 00:15:04,380
brackets and let's put the ID of the
409
00:15:04,380 --> 00:15:06,959
video here and then we can do dot MP4
410
00:15:06,959 --> 00:15:09,480
and then add a comma and open the
411
00:15:09,480 --> 00:15:12,180
quotation marks WB which stands for
412
00:15:12,180 --> 00:15:14,399
writing in binary so we need this in
413
00:15:14,399 --> 00:15:16,740
order to write to a file and then type
414
00:15:16,740 --> 00:15:19,380
as output and then put a colon and hit
415
00:15:19,380 --> 00:15:21,060
enter and now we're going to write a
416
00:15:21,060 --> 00:15:23,100
while loop so while true all we're going
417
00:15:23,100 --> 00:15:27,920
to do is data equals mp4_file.read
418
00:15:27,920 --> 00:15:30,480
4096 so now let's get back to the code
419
00:15:30,480 --> 00:15:34,560
so hit enter and now type if data so
420
00:15:34,560 --> 00:15:36,779
basically if we're able to read data we
421
00:15:36,779 --> 00:15:38,339
want to write this data to the output
422
00:15:38,339 --> 00:15:41,760
file so do output.write and then
423
00:15:41,760 --> 00:15:43,920
open the parenthesis and put data and
424
00:15:43,920 --> 00:15:45,720
then hit enter and then add an else
425
00:15:45,720 --> 00:15:47,760
statement where basically if no data
426
00:15:47,760 --> 00:15:49,320
comes back that just means that we
427
00:15:49,320 --> 00:15:51,300
finish reading from the file so now we
428
00:15:51,300 --> 00:15:52,740
can add a break statement which means
429
00:15:52,740 --> 00:15:54,959
that we're done with the while loop so now hit
430
00:15:54,959 --> 00:15:56,820
save and before you run the code make
431
00:15:56,820 --> 00:15:58,680
sure you make a folder called videos in
432
00:15:58,680 --> 00:16:00,180
the folder where your script is located
433
00:16:00,180 --> 00:16:02,579
because in the example here I'm creating
434
00:16:02,579 --> 00:16:04,380
the video inside a folder called videos
435
00:16:04,380 --> 00:16:07,079
and I'm giving it an ID and that just
436
00:16:07,079 --> 00:16:09,180
reminded me I forgot to pass an ID
437
00:16:09,180 --> 00:16:11,699
parameter so let's scroll up and
438
00:16:11,699 --> 00:16:13,860
here inside the download video add a
439
00:16:13,860 --> 00:16:16,980
comma and put ID here and in here let's
440
00:16:16,980 --> 00:16:19,440
just pass the index as the ID so now if
441
00:16:19,440 --> 00:16:20,880
we want the index all we have to do is
442
00:16:20,880 --> 00:16:23,519
add index here and add a comma and here
443
00:16:23,519 --> 00:16:25,860
on the videos we do enumerate and open
444
00:16:25,860 --> 00:16:27,660
the parenthesis and close the
445
00:16:27,660 --> 00:16:29,459
parenthesis and this will basically give
446
00:16:29,459 --> 00:16:31,800
us the index along with the video and
447
00:16:31,800 --> 00:16:33,839
now let's hit save and now let's run our
448
00:16:33,839 --> 00:16:36,600
script cool so it opens the page and now
449
00:16:36,600 --> 00:16:39,000
it scrolls to the bottom and yeah of
450
00:16:39,000 --> 00:16:41,040
course I got an error live coding is
451
00:16:41,040 --> 00:16:43,440
just too hard so basically the error is
452
00:16:43,440 --> 00:16:45,300
saying that download video is not
453
00:16:45,300 --> 00:16:48,600
defined so let's look at my code and I'm
454
00:16:48,600 --> 00:16:50,579
assuming the error is because I declared
455
00:16:50,579 --> 00:16:53,459
the function after it was called so
456
00:16:53,459 --> 00:16:55,620
that's why the code can't find it so we
457
00:16:55,620 --> 00:16:57,540
can fix this very easily so let's copy
458
00:16:57,540 --> 00:16:59,820
this whole function and let's delete it
459
00:16:59,820 --> 00:17:01,620
and now scroll to the top of the file
460
00:17:01,620 --> 00:17:04,140
and let's paste it above everything else
461
00:17:04,140 --> 00:17:07,199
and hit save and let's try this again oh
462
00:17:07,199 --> 00:17:09,179
crap looks like I got another error I
463
00:17:09,179 --> 00:17:10,919
just had a spelling mistake so I was
464
00:17:10,919 --> 00:17:13,079
supposed to capitalize this F so let me
465
00:17:13,079 --> 00:17:15,720
do that and let me try this again all
466
00:17:15,720 --> 00:17:17,220
right let's cross our fingers and hope
467
00:17:17,220 --> 00:17:19,799
that everything works fine
468
00:17:19,799 --> 00:17:22,859
ah crap we got another error hmm it
469
00:17:22,859 --> 00:17:24,299
looks like our script wasn't able to
470
00:17:24,299 --> 00:17:27,839
read the href inside the a tag but on
471
00:17:27,839 --> 00:17:29,520
the bright side it looks like we did
472
00:17:29,520 --> 00:17:31,860
download one video so let's open this
473
00:17:31,860 --> 00:17:34,520
video and make sure that it works
474
00:17:34,520 --> 00:17:37,080
nice it looks like it works
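The read and write loop described earlier can be sketched roughly as follows. This is a minimal sketch, not the exact code from the video: the function name copy_in_chunks and the in-memory streams are assumptions made for illustration, while the 4096 byte chunk size and the if data / else break structure follow what is said in the video.

```python
import io

CHUNK_SIZE = 4096  # bytes per read, the size mentioned in the video


def copy_in_chunks(source, output, chunk_size=CHUNK_SIZE):
    """Copy one binary file-like object to another, a chunk at a time.

    copy_in_chunks is a hypothetical helper name. Returns the number
    of bytes copied.
    """
    total = 0
    while True:
        data = source.read(chunk_size)
        if data:
            # We were able to read data, so write it to the output file.
            output.write(data)
            total += len(data)
        else:
            # An empty read means we finished reading from the source.
            break
    return total


if __name__ == "__main__":
    # In-memory stand-ins for the video stream and the output file.
    fake_video = io.BytesIO(b"\x00" * 10_000)
    fake_output = io.BytesIO()
    print(copy_in_chunks(fake_video, fake_output))
```

In a real script the source would be the opened video stream and the output would be a file opened in binary write mode inside the videos folder.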
475
00:17:37,080 --> 00:17:38,580
but technically we're supposed to
476
00:17:38,580 --> 00:17:40,559
download all the videos right and the
477
00:17:40,559 --> 00:17:42,059
issue here is that we're spamming a
478
00:17:42,059 --> 00:17:43,919
server with a lot of requests very
479
00:17:43,919 --> 00:17:45,900
quickly so the server probably thinks
480
00:17:45,900 --> 00:17:47,280
that we're a bot and we can actually
481
00:17:47,280 --> 00:17:49,080
get around this very easily by just
482
00:17:49,080 --> 00:17:51,179
adding an arbitrary delay so now let's
483
00:17:51,179 --> 00:17:52,980
scroll down and go to where we called
484
00:17:52,980 --> 00:17:55,320
the download video so right here so
485
00:17:55,320 --> 00:17:57,539
after we download one video let's do
486
00:17:57,539 --> 00:17:59,760
time dot sleep and let's just sleep for
487
00:17:59,760 --> 00:18:02,280
10 seconds just to be safe so let's save
488
00:18:02,280 --> 00:18:04,440
this and let's run the code again
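The loop with the index and the delay might look something like the sketch below. It assumes a download function that takes the video and an ID, as described in the video; the names download_all and delay_seconds and the stub downloader are hypothetical and only there so the example runs without touching the network.

```python
import time


def download_all(videos, download_video, delay_seconds=10):
    """Download each video in turn, pausing after every download.

    download_video is any callable taking (video, id). The pause keeps
    us from sending a lot of requests to the server very quickly.
    """
    downloaded_ids = []
    # enumerate gives us the index along with the video.
    for index, video in enumerate(videos):
        download_video(video, index)  # pass the index as the ID
        downloaded_ids.append(index)
        time.sleep(delay_seconds)
    return downloaded_ids


if __name__ == "__main__":
    # Stub downloader that only prints, so nothing is actually fetched.
    def fake_download(video, video_id):
        print(f"downloading {video} as videos/{video_id}.mp4")

    download_all(["first", "second"], fake_download, delay_seconds=0)
```

The 10 second default matches the delay chosen in the video; a shorter delay may work, but the right value depends on the server.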
489
00:18:04,440 --> 00:18:07,080
third time's the charm right crossing my
490
00:18:07,080 --> 00:18:09,600
fingers let's go and look at that we got
491
00:18:09,600 --> 00:18:11,340
zero and one
492
00:18:11,340 --> 00:18:13,559
and let's wait another 10 seconds and
493
00:18:13,559 --> 00:18:14,940
you're gonna see the next video get
494
00:18:14,940 --> 00:18:17,400
downloaded and look the third video just
495
00:18:17,400 --> 00:18:19,559
came in so because there's a 10 second
496
00:18:19,559 --> 00:18:21,600
delay it's gonna be a long while before
497
00:18:21,600 --> 00:18:23,880
we download all the videos but basically
498
00:18:23,880 --> 00:18:25,919
look it's working and this is awesome
499
00:18:25,919 --> 00:18:28,500
and unfortunately YouTube has a cap and
500
00:18:28,500 --> 00:18:30,480
I can't upload more than 10 videos at a
501
00:18:30,480 --> 00:18:32,100
time so I guess what we have right now
502
00:18:32,100 --> 00:18:34,200
is pretty good anyways that's it for
503
00:18:34,200 --> 00:18:36,000
this video I hope that you guys learned
504
00:18:36,000 --> 00:18:37,860
something new I wanted to share my
505
00:18:37,860 --> 00:18:39,780
thought process of how I approach this
506
00:18:39,780 --> 00:18:42,000
problem and how I was able to come up
507
00:18:42,000 --> 00:18:44,100
with the solution as you can see we only
508
00:18:44,100 --> 00:18:45,780
scratched the surface of automation
509
00:18:45,780 --> 00:18:47,700
there are so many things that you guys
510
00:18:47,700 --> 00:18:50,460
can do if you ever get stuck try to read
511
00:18:50,460 --> 00:18:53,039
the documentation for Beautiful Soup and
512
00:18:53,039 --> 00:18:55,200
also Selenium and if you guys are up for
513
00:18:55,200 --> 00:18:57,000
the challenge try to build your own web
514
00:18:57,000 --> 00:18:59,880
scraper to grab some useful data maybe
515
00:18:59,880 --> 00:19:02,520
you guys can even build a sneaker bot or
516
00:19:02,520 --> 00:19:04,200
even some application that will notify
517
00:19:04,200 --> 00:19:05,880
you when an item that you want to buy
518
00:19:05,880 --> 00:19:08,280
goes on sale anyways the world is your
519
00:19:08,280 --> 00:19:10,080
oyster so try to build something cool
520
00:19:10,080 --> 00:19:11,700
and let me know what you guys built in
521
00:19:11,700 --> 00:19:13,260
the comments below thank you guys I'll
522
00:19:13,260 --> 00:19:14,390
see you later
523
00:19:14,390 --> 00:19:17,939
[Music]
524
00:19:21,370 --> 00:19:24,530
[Music]