English transcript for 004 什么是迁移学习 (What is Transfer Learning?)

What is transfer learning?

The idea of transfer learning is to leverage the knowledge acquired by a model trained with lots of data on another task. A model A is trained specifically for task A. Now, say you want to train a model B for a different task. One option would be to train model B from scratch, but that could take lots of computation, time, and data. Instead, we can initialize model B with the same weights as model A, transferring the knowledge of model A to task B.

When training from scratch, all of the model's weights are initialized randomly. In this example, we are training a BERT model on the task of recognizing whether two sentences are similar or not. On the left, it is trained from scratch; on the right, a pretrained model is fine-tuned. As we can see, using transfer learning and the pretrained model yields better results. And it doesn't matter how long we train: training from scratch is capped at around 70% accuracy, while the pretrained model easily passes 86%. This is because pretrained models are usually trained on large amounts of data that give the model a statistical understanding of the language used during pretraining.

In computer vision, transfer learning has been applied successfully for almost ten years. Models are frequently pretrained on ImageNet, a dataset containing 1.2 million images, each classified with one of 1,000 labels. Training like this, on labeled data, is called supervised learning.

In natural language processing, transfer learning is a bit more recent. A key difference from ImageNet is that the pretraining is usually self-supervised, which means it doesn't require human annotations for the labels. A very common pretraining objective is to guess the next word in a sentence, which only requires lots and lots of text. GPT-2, for instance, was pretrained this way using the content of 45 million links posted by users on Reddit. Another example of a self-supervised pretraining objective is to predict the values of randomly masked words, which is similar to the fill-in-the-blank tests you may have done in school. BERT was pretrained this way using English Wikipedia and 11,000 unpublished books.
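As an illustration of those two pretraining objectives (not part of the original video), here is a minimal sketch using the Hugging Face transformers pipelines; the checkpoints "gpt2" and "bert-base-uncased" are the usual public ones, and the prompts are invented for the example:

    from transformers import pipeline

    # Causal language modeling: guess the next word(s), the objective GPT-2 was pretrained with.
    generator = pipeline("text-generation", model="gpt2")
    print(generator("Transfer learning is", max_new_tokens=10))

    # Masked language modeling: fill in a masked word, the objective BERT was pretrained with.
    fill_mask = pipeline("fill-mask", model="bert-base-uncased")
    print(fill_mask("Transfer learning reuses the [MASK] of a pretrained model."))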
In practice, transfer learning is applied to a given model by throwing away its head, that is, the last layers focused on the pretraining objective, and replacing it with a new, randomly initialized head suitable for the task at hand. For instance, when we fine-tuned a BERT model earlier, we removed the head that classified masked words and replaced it with a classifier with two outputs, since our task had two labels. To be as efficient as possible, the pretrained model used should be as similar as possible to the task it's fine-tuned on. For instance, if the task is to classify German sentences, it's best to use a German pretrained model.

But with the good comes the bad. The pretrained model does not only transfer its knowledge, but also any bias it may contain. ImageNet mostly contains images coming from the United States and Western Europe, so models fine-tuned on it usually perform better on images from these regions. OpenAI also studied the bias in the predictions of its GPT-3 model (which was pretrained using the guess-the-next-word objective). Changing the gender of the prompt from "He was very" to "She was very" changed the predictions from mostly neutral adjectives to almost only physical ones. In the model card of the GPT-2 model, OpenAI also acknowledges its bias and discourages its use in systems that interact with humans.
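As a hedged sketch of what "replacing the head" looks like in code (assuming the transformers library and the public "bert-base-uncased" checkpoint; for a German task you would pick a German checkpoint instead):

    from transformers import AutoModelForSequenceClassification

    # Load the pretrained BERT body and attach a new, randomly initialized
    # classification head with two outputs (similar / not similar).
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-uncased",
        num_labels=2,
    )
    # The library typically warns that the classifier weights are newly initialized:
    # they have to be fine-tuned on the downstream task before the model is useful.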
