Automate downloading TikTok videos with no watermark: Python Automation tutorial [English (auto-generated) subtitles]

Yo, what's up YouTube! It's been a while since I made a tutorial, but it's great to be back. Recently I've been getting a lot of questions about "what projects should I build to get a job?" To be honest, there's no one project that will guarantee that you'll get a job. How I like to look at it is that you should build something that you're able to talk about for at least 10 minutes during an interview. For example, if you built a calculator app, do you think that you can talk about it for at least 10 minutes?

"So I built this calculator app that can add numbers, subtract numbers, multiply, etc. It's super cool. Um, yeah, that's about it."

As you can see, this calculator app wasn't that interesting. I wasn't passionate about it and I couldn't really talk about it that much. However, if you build something that can solve a real-life problem that you're facing, you'll be more inclined to talk more about it. So for today's video I want to talk about a problem that I've been facing recently and how I was able to solve it
with Python automation. Remember, folks: if you ever find yourself doing a manual task over and over, there's probably a way that you can automate it with code. So without further ado, let's get into it.

Before we touch any code, we need to identify a problem that we want to solve with automation. For those of you that don't know, I help my girlfriend run a TikTok account for our cat Papaya. You guys should follow papayohoe.cat on TikTok. Currently, if you want to be a content creator, you have to spread yourself out and repost your content on multiple platforms so that you can grow. We've been creating our own content on TikTok, but we also want to grow our presence on YouTube Shorts. The problem is we have over 500 TikTok videos, and ain't nobody got time to manually do this.

Moving forward, we need to identify the steps to solve the problem manually. First I would open up the TikTok website, click the share button, copy the link, and use some third-party website to download the video without a watermark, and then I would upload it to
YouTube Shorts. Everything I'm doing here is totally fine, but it's very tedious and cumbersome and it takes a lot of time. So since I know Python and I know how to code, I can automate this, and that's what I'll be showing you guys today. Before you watch this video, make sure you know some Python so that you can follow along. If you don't have any Python experience, don't worry: I have a playlist over here, it's completely free and taught by myself, so check it out. Without further ado, let's go. Loading time.

All right, so first things first, let's open up Papaya's TikTok page. Just a reminder that this is just an example, so feel free to use any TikTok page that you want to automate downloading videos for. To get the link to a video, all we have to do is open the video and copy the address at the top. This method requires a lot of clicking, so instead we should think like programmers. Basically every website uses HTML. In order to link a user to another page, we have to use an a tag and provide an href with the URL that we want to link the user to.
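To make the a tag and href idea concrete, here is a minimal sketch that pulls every href out of a small piece of HTML using only Python's standard library. The sample HTML, the example.com URLs and the LinkCollector class name are made up for illustration; the video itself uses Beautiful Soup for this step later on.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag it sees (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Made-up HTML standing in for a page that links to two videos
sample_html = (
    '<div>'
    '<a href="https://example.com/video/1">first</a>'
    '<a href="https://example.com/video/2">second</a>'
    '</div>'
)

collector = LinkCollector()
collector.feed(sample_html)
print(collector.links)
```

Each link on a page is just an a tag with an href attribute, which is why collecting hrefs is enough to get the address of every video.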
To access the HTML on the page, all we have to do is right-click and click Inspect, and this opens the browser's developer tools so that we can inspect the HTML on the page. As you move your cursor around, you'll see that it highlights each HTML element. For our case we want to see the a tags for each of these videos, and when I move my mouse here I see that the stuff that I want is highlighted. So let's expand this: click the arrow to expand. Now, as you can see, there are two divs. The first one is for the Videos and Liked tabs, and the second one is for all of the videos. So now let's expand this, and let's expand this one, and here we go: we've got each individual video.

Now let's open the first one and try to find the a tag. I'm going to expand this, expand this, expand this, and look at that, we found the a tag. I'm just going to click on it to make sure that it works, and this looks like the correct link, so let's close this. Now that we've confirmed that this works, the next thing we need to do is find a way to get this HTML document so that we can grab all the hrefs for each video.
We can achieve this very easily by using Python, Selenium and Beautiful Soup. So now open a new file, give it a name and save it. Inside here type from selenium import webdriver. This imports the Selenium library, which basically allows you to write automation code that opens a web browser and then clicks stuff, searches stuff, etc. You can kind of think of this as a bot. Cool.

Next, type driver = webdriver.Chrome(), which specifies the browser that we want to use; in this case we want to use Chrome. Next type driver.get, open the parentheses, and inside open quotation marks and paste the link of the website that you want to get; in this case we want the TikTok page. Next we want to add a delay to the code so that it waits for the website to load, so we can do time.sleep, open the parentheses and put 1 for one second. In order to use time in Python we have to import it, so let's type import time. Cool, so now we have loaded the web page.
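Put together, the Selenium steps described so far could look something like the sketch below. The function name load_page_html and its wait_seconds and driver parameters are assumptions added for illustration; the video simply writes these lines at the top level of the script. It also assumes Selenium is installed and Chrome is available on the machine.

```python
import time


def load_page_html(url, wait_seconds=1.0, driver=None):
    """Open url in a browser, wait briefly, and return the page's HTML.

    Hypothetical helper: if no driver is passed in, a Chrome driver is
    created, which is what the tutorial does.
    """
    if driver is None:
        # Imported here so the helper can also be tried with a stand-in driver
        from selenium import webdriver

        driver = webdriver.Chrome()

    driver.get(url)           # navigate to the page
    time.sleep(wait_seconds)  # crude fixed delay so the page can load
    return driver.page_source


if __name__ == "__main__":
    # Placeholder URL: replace with the profile page you want to load
    html = load_page_html("https://www.tiktok.com/")
    print(len(html))
```

A fixed one-second sleep is the simplest way to wait, and it is what the video uses; a slow connection may need a longer pause.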
Next we want to parse the HTML inside it, and a great library for this is Beautiful Soup. To use Beautiful Soup all we have to do is import it, so we type from bs4 import BeautifulSoup. Now in our code we can do soup = BeautifulSoup, open the parentheses and type driver.page_source, which gives us the HTML source of the page, then a comma, a space, and in quotes type html.parser to specify that we want to parse HTML. Cool. On the next line type print, open the parentheses, and put soup.prettify(). This line lets us test whether our code works; prettify just makes the HTML look pretty so that it's easy for us to read.

Now hit save and save this code somewhere on your computer, then open your terminal. In VS Code you can click Terminal at the top and click New Terminal, and this will open a terminal. To run this code you type the command python (I'm using Python 3, so I'll type python3), a space, and then the name of
your file, so for me it's scrape_video.py. If you don't have Python on your computer, make sure to install it. Installing Python is out of scope for this tutorial, so I won't be covering that. If you run into errors related to missing dependencies, you can fix that very easily by typing pip (in my case, since I'm using Python 3, I have to put a 3 at the end), then install and the name of the module. So here you can type selenium, and then we want to install Beautiful Soup, which is bs4, like this, and then hit enter. This will install the libraries that we need to run this code.

Once everything is set up, feel free to run the code. As you can see, it opened a web browser, loaded the page, and after one second the page closed because the code finished executing. Now if you look inside my terminal, I basically got the HTML for the page. Cool. So now we have this HTML within our Python code, and the cool thing about Beautiful Soup is that it's able to find elements within the HTML.
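As a rough sketch of the parsing step just described, the snippet below feeds an HTML string to Beautiful Soup and pretty-prints it. The function name prettify_html and the sample markup are made up for illustration; in the video the HTML comes from driver.page_source and the two Beautiful Soup lines sit directly in the script. It assumes the bs4 package has been installed with pip.

```python
def prettify_html(html):
    """Parse an HTML string and return an indented, readable version of it."""
    # Third-party import kept inside the function (an illustrative choice)
    # so the file still loads on a machine where bs4 is not installed yet
    from bs4 import BeautifulSoup

    soup = BeautifulSoup(html, "html.parser")  # use Python's built-in HTML parser
    return soup.prettify()


if __name__ == "__main__":
    # Made-up markup standing in for the page source
    print(prettify_html("<div><a href='/v/1'>clip</a></div>"))
```

Printing the prettified HTML is only a quick check that the page was fetched and parsed; the tutorial removes that print once it starts searching for specific elements.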
For example, each video is within a div with a class called tiktok-x6y etc., and they all have the same class name. So with Beautiful Soup we can specify that we only want divs with the class name tiktok-x6y etc., and it will select all of them and return them as a list. To make our lives easier, we should try to get the div closest to the a tag, and the div that contains the a tag is tiktok-yz6i etc. Just to be safe, we should double-check that all the other divs have the same naming. So let's open the second video, open the first div, open this div, and open this div as well, and here we see the a tag. If we look at the class we see tiktok-yz6ijl etc., which matches the one at the top. This makes our lives very easy: now we can just look for the div with this class name.

So let's copy this, go back to our code, remove this line and type videos = soup.find_all, open the parentheses, open quotation marks and type div, and then
add a comma and a space, then open the curly brackets. Inside here open quotation marks and type class, then a colon, then open quotation marks and paste that class name. All we're doing here is saying that we want to find all the divs with the class name tiktok-yz etc. Then I'll do print and get the length of videos so we know how many videos we got back.

Next we can loop through each video: for video in videos, add a colon, go to the next line and type print, open the parentheses and type video, then a dot, then a, which means it's getting the a tag within the div, then open square brackets, open quotation marks and type href, which means we want to grab the href value from inside the a tag. Now let's hit save, go back to your terminal and run this command again. Cool, so it opens the page and then one second later it closes, and now if you look at my terminal again, we basically got all
the links of the videos inside the page. If you look closely, we only got 29 videos. This doesn't really add up, because we actually have around 500 videos, so it looks like something's not working properly. Let's go back to this page. One interesting thing about this page is that as you scroll, more videos get loaded. This is going to be a problem for our code, because we're only loading whatever we can see initially. If we want to see more videos, we have to tell Selenium to scroll through the page until it reaches the bottom, so that we can get all of the videos.

Now that sounds super complicated, and that's where Google comes in: we should use Google as a resource to help us find a way to achieve this. I came across this Medium post by Quan Wei, and it does exactly what we want. Here's the code that handles scrolling all the way to the bottom of the page. We only care about this part of the code, so let's copy it, go back to our code, and paste it in the middle, before we grab
the page source from the driver. I'm missing an s here, so let's add it back. Basically this code just runs a while loop that executes JavaScript code to scroll down, and we have a variable i to keep track of how many times we have scrolled. Here we tell the program to sleep for scroll_pause_time, which is just one second, and here we execute another JavaScript snippet which grabs the new scroll height of the page. Then we take the height of the screen multiplied by the number of times that we have scrolled and check whether this value is greater than the current scroll height. If it's greater, that just means we've reached the end of the page.

Cool, so now let's hit save and run our code again. Hit enter, and here it opens the browser, and let's watch some magic. You see that it's scrolling by itself, isn't that awesome? Now I'll just speed up the video and let it scroll all the way to the bottom. Five minutes later... That took a while because we had a lot of videos.
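The scrolling loop described above can be sketched roughly as follows. This is a reconstruction from the narration, not a copy of the Medium post's code: the function name scroll_to_bottom, the pause_seconds parameter and the return value are assumptions, and driver is expected to be a Selenium WebDriver (anything with an execute_script method will do).

```python
import time


def scroll_to_bottom(driver, pause_seconds=1.0):
    """Scroll one screen height at a time until the bottom of the page.

    Returns the number of scroll steps performed (illustrative extra).
    """
    # Height of the screen, used as the size of each scroll step
    screen_height = driver.execute_script("return window.screen.height;")
    i = 1  # how many screen heights down the next scroll should go

    while True:
        # Scroll down to i screen heights from the top of the page
        driver.execute_script(f"window.scrollTo(0, {screen_height * i});")
        i += 1
        # Give the page time to load more videos
        time.sleep(pause_seconds)
        # The page grows as more content loads, so re-read its height
        scroll_height = driver.execute_script("return document.body.scrollHeight;")
        # If the next target is past the end of the page, we are at the bottom
        if screen_height * i > scroll_height:
            break

    return i - 1
```

In the script this would be called after driver.get(...) and before reading driver.page_source, so that the parsed HTML contains every video that was loaded while scrolling.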
Look at our console: we actually have a lot of videos, and when I scroll all the way to the top you're going to see that we have 457 videos in total. I guess I was off by like 50, but still, that's a lot of videos. Cool.

Now that we have all these links, the next thing we need to do is go to this website, ssstik.io, which allows you to download TikTok videos without a watermark. Let's paste the link to one of the videos, and before we hit download, right-click on the page, go to Inspect and then go to the Network tab. This tab records network activity, so let me show you what that means. Let's go back to the page and click download. If you look at the right, you're going to see that three network requests happened when we clicked the download button. If you click Response, you're going to see that it returns HTML, and if you look here there's an a tag. If I scroll to the right, you're going to see that this a tag is for the one without watermark. So basically we only care about this href from
this a tag. If you right-click on this network request, go to Copy and then click Copy as cURL, then go to the site called curlconverter.com and paste what we copied into the text field, this will turn the curl command into Python. Now scroll to the bottom and click Copy to clipboard.

Let's go back to the code and create a function called download_video, open the parentheses (it can take link as a parameter), add the colon, hit enter, and paste the code that we just copied. Now let's move this import to the top of the file: delete this line, scroll to the top, paste it on line four, fix the spacing and hit enter. Now scroll down, indent all the code that we just copied and fix the spacing. Basically all the code here just represents the data that gets sent over when we make the request to that website. If you scroll all the way to the bottom, you're going to see that we make a request to post data to this link, and we pass in the parameters and the
cookies, headers and the data. If you look in the data field, you're going to see this id, which has a link to the TikTok video. So instead of hard-coding a link, let's replace this with link, which is the parameter that we added to this function. Now let's copy the function name, and here, instead of printing this value, let's just call our download_video function.

Now let's go back to the bottom of the page. After we make this request we'll get back a response, and since the response is HTML, all we've got to do is use Beautiful Soup to parse the HTML data. So let's create a new variable called download_soup, and this will equal BeautifulSoup, open the parentheses and type response, and if we do .text this will give us the HTML inside the response, then a comma, open quotation marks and type html.parser, and hit enter. Like I mentioned before, we only care about the first a tag, so we can get that very easily: download_link = download_soup.a["href"]. Now the last thing that we need to
do is download the file. We have to import another library, so let's scroll to the top and on line five type from urllib.request import urlopen and hit enter. We'll use this library to download the raw data of the file. Now let's go back to the bottom of the page, and after the download link let's create a new variable called mp4_file = urlopen, and pass it the download link; this will download the file as raw data.

Now all we have to do is save this file onto our computer. So let's do with open, open the parentheses, and use an f-string here: f, quotation marks, and I want to put it in a folder called videos, so videos/, then open the curly brackets and put the id of the video here, then .mp4, then add a comma and in quotation marks wb, which stands for writing in binary (we need this in order to write to the file), then type as output, put a colon and hit enter. Now we're going to write a while loop: while True, all we're going
to do is data = mp4_file.read(4096). So now let's get back to the code: hit enter and type if data. Basically, if we're able to read data, we want to write this data to the output file, so do output.write, open the parentheses, put data and hit enter. Then add an else statement: if no data comes back, that just means we've finished reading from the file, so we can add a break statement, which means we're done with the while loop.

Now hit save, and before you run the code make sure you make a folder called videos in the folder where your script is located, because in the example here I'm creating the video inside a folder called videos and I'm giving it an id. And that just reminded me, I forgot to pass an id parameter. So let's scroll up, and here inside download_video add a comma and put id, and in here let's just pass the index as the id. If we want the index, all we have to do is add index here and a comma, and here on videos we do enumerate, open the
parenthesis and close the 445 00:16:27,660 --> 00:16:29,459 parenthesis and this will basically give 446 00:16:29,459 --> 00:16:31,800 us the index along with the video and 447 00:16:31,800 --> 00:16:33,839 now let's hit save and now let's run our 448 00:16:33,839 --> 00:16:36,600 script cool so it opens the page and now 449 00:16:36,600 --> 00:16:39,000 it scrolls to the bottom and yeah of 450 00:16:39,000 --> 00:16:41,040 course I got an error live coding is 451 00:16:41,040 --> 00:16:43,440 just too hard so basically the error is 452 00:16:43,440 --> 00:16:45,300 saying that download video is not 453 00:16:45,300 --> 00:16:48,600 defined so let's look at my code and I'm 454 00:16:48,600 --> 00:16:50,579 assuming the error is because I declared 455 00:16:50,579 --> 00:16:53,459 the function after it was called so 456 00:16:53,459 --> 00:16:55,620 that's why the code can't find it so we 457 00:16:55,620 --> 00:16:57,540 can fix this very easily so let's copy 458 00:16:57,540 --> 00:16:59,820 this whole function and let's delete it 459 00:16:59,820 --> 00:17:01,620 and now scroll to the top of the file 460 00:17:01,620 --> 00:17:04,140 and let's paste it above everything else 461 00:17:04,140 --> 00:17:07,199 and hit save and let's try this again oh 462 00:17:07,199 --> 00:17:09,179 crap looks like I got another error I 463 00:17:09,179 --> 00:17:10,919 just had a spelling mistake so I was 464 00:17:10,919 --> 00:17:13,079 supposed to capitalize this F so let me 465 00:17:13,079 --> 00:17:15,720 do that and let me try this again all 466 00:17:15,720 --> 00:17:17,220 right let's cross our fingers and hope 467 00:17:17,220 --> 00:17:19,799 that everything works fine 468 00:17:19,799 --> 00:17:22,859 ah crap we got another error hmm it 469 00:17:22,859 --> 00:17:24,299 looks like our script wasn't able to 470 00:17:24,299 --> 00:17:27,839 read the href inside the a tag but on 471 00:17:27,839 --> 00:17:29,520 the bright side it looks like we did 472
00:17:31,860 download one video so let's open this 473 00:17:31,860 --> 00:17:34,520 video and make sure that it works 474 00:17:34,520 --> 00:17:37,080 nice it looks like it works 475 00:17:37,080 --> 00:17:38,580 but technically we're supposed to 476 00:17:38,580 --> 00:17:40,559 download all the videos right and the 477 00:17:40,559 --> 00:17:42,059 issue here is that we're spamming a 478 00:17:42,059 --> 00:17:43,919 server with a lot of requests very 479 00:17:43,919 --> 00:17:45,900 quickly so the server probably thinks 480 00:17:45,900 --> 00:17:47,280 that we're a bot and we can actually 481 00:17:47,280 --> 00:17:49,080 get around this very easily by just 482 00:17:49,080 --> 00:17:51,179 adding an arbitrary delay so now let's 483 00:17:51,179 --> 00:17:52,980 scroll down and go to where we called 484 00:17:52,980 --> 00:17:55,320 the download video so right here so 485 00:17:55,320 --> 00:17:57,539 after we download one video let's do 486 00:17:57,539 --> 00:17:59,760 time dot sleep and let's just sleep for 487 00:17:59,760 --> 00:18:02,280 10 seconds just to be safe so let's save 488 00:18:02,280 --> 00:18:04,440 this and let's run the code again 489 00:18:04,440 --> 00:18:07,080 third time's the charm right crossing my 490 00:18:07,080 --> 00:18:09,600 fingers let's go and look at that we got 491 00:18:09,600 --> 00:18:11,340 zero and one 492 00:18:11,340 --> 00:18:13,559 and let's wait another 10 seconds and 493 00:18:13,559 --> 00:18:14,940 you're gonna see the next video get 494 00:18:14,940 --> 00:18:17,400 downloaded and look the third video just 495 00:18:17,400 --> 00:18:19,559 came in so because there's a 10 second 496 00:18:19,559 --> 00:18:21,600 delay it's gonna be a long while before 497 00:18:21,600 --> 00:18:23,880 we download all the videos but basically 498 00:18:23,880 --> 00:18:25,919 look it's working and this is awesome 499 00:18:25,919 --> 00:18:28,500 and unfortunately YouTube has a cap and 500 00:18:28,500 --> 00:18:30,480 I can't
upload more than 10 videos at a 501 00:18:30,480 --> 00:18:32,100 time so I guess what we have right now 502 00:18:32,100 --> 00:18:34,200 is pretty good anyways that's it for 503 00:18:34,200 --> 00:18:36,000 this video I hope that you guys learned 504 00:18:36,000 --> 00:18:37,860 something new I wanted to share my 505 00:18:37,860 --> 00:18:39,780 thought process of how I approach this 506 00:18:39,780 --> 00:18:42,000 problem and how I was able to come up 507 00:18:42,000 --> 00:18:44,100 with the solution as you can see we only 508 00:18:44,100 --> 00:18:45,780 scratched the surface of automation 509 00:18:45,780 --> 00:18:47,700 there are so many things that you guys 510 00:18:47,700 --> 00:18:50,460 can do if you ever get stuck try to read 511 00:18:50,460 --> 00:18:53,039 the documentation for Beautiful Soup and 512 00:18:53,039 --> 00:18:55,200 also Selenium and if you guys are up for 513 00:18:55,200 --> 00:18:57,000 the challenge try to build your own web 514 00:18:57,000 --> 00:18:59,880 scraper to grab some useful data maybe 515 00:18:59,880 --> 00:19:02,520 you guys can even build a sneaker bot or 516 00:19:02,520 --> 00:19:04,200 even some application that will notify 517 00:19:04,200 --> 00:19:05,880 you when an item that you want to buy 518 00:19:05,880 --> 00:19:08,280 goes on sale anyways the world is your 519 00:19:08,280 --> 00:19:10,080 oyster so try to build something cool 520 00:19:10,080 --> 00:19:11,700 and let me know what you guys built in 521 00:19:11,700 --> 00:19:13,260 the comments below thank you guys I'll 522 00:19:13,260 --> 00:19:14,390 see you later 523 00:19:14,390 --> 00:19:17,939 [Music] 524 00:19:21,370 --> 00:19:24,530 [Music]
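
The download steps described in the transcript (import urlopen, read the response in 4096-byte chunks inside a while True loop, write each chunk to videos/<id>.mp4 opened in "wb" mode, then loop over the videos with enumerate and sleep 10 seconds between downloads) can be sketched roughly as below. This is a minimal sketch and not the exact code typed in the video. The names download_video, download_all, mp4_file, CHUNK_SIZE, and the folder and delay_seconds parameters are assumptions chosen for illustration, and `link` is assumed to already be the direct download URL (the page-scraping step that produces it is left out). Defining the functions above the code that calls them also avoids the "not defined" error hit during the live run.

```python
import time
from pathlib import Path
from urllib.request import urlopen

CHUNK_SIZE = 4096  # bytes per read, the value used in the video


def download_video(link, video_id, folder="videos"):
    """Stream the file at `link` into <folder>/<video_id>.mp4 and return its path."""
    target_dir = Path(folder)
    # The video creates this folder by hand; creating it here is an added convenience.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{video_id}.mp4"

    # "wb" opens the output for writing in binary mode.
    with urlopen(link) as mp4_file, open(target, "wb") as output:
        while True:
            data = mp4_file.read(CHUNK_SIZE)
            if data:
                output.write(data)
            else:
                break  # an empty read means the whole file has been consumed
    return target


def download_all(links, delay_seconds=10, folder="videos"):
    """Download each link in turn, using its index as the ID, pausing between requests."""
    saved = []
    for index, link in enumerate(links):
        saved.append(download_video(link, index, folder))
        # Arbitrary delay so the server is not hit with rapid back-to-back requests.
        time.sleep(delay_seconds)
    return saved
```

Calling download_all with a list of direct download links would write 0.mp4, 1.mp4, and so on into the videos folder, one file every delay_seconds seconds.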
