019 Batching inputs together (TensorFlow)

How to batch inputs together? In this video, we will see how to batch input sequences together.

In general, the sentences we want to pass through our model won't all have the same length. Here we are using the model we saw in the sentiment analysis pipeline and want to classify two sentences. When tokenizing them and mapping each token to its corresponding input IDs, we get two lists of different lengths.

Trying to create a tensor or a NumPy array from those two lists will result in an error, because all arrays and tensors must be rectangular. One way to overcome this limitation is to make the second sentence the same length as the first by adding a special token as many times as necessary. Another way would be to truncate the first sequence to the length of the second, but we would then lose a lot of information that might be necessary to properly classify the sentence. In general, we only truncate sentences when they are longer than the maximum length the model can handle. The value used to pad the second sentence should not be picked randomly: the model has been pretrained with a certain padding ID, which you can find in tokenizer.pad_token_id.

Now that we have padded our sentences, we can make a batch with them. If we pass the two sentences to the model separately and then batched together, however, we notice that we don't get the same results for the sentence that is padded (here, the second one).

If you remember that Transformer models make heavy use of attention layers, this should not come as a total surprise: when computing the contextual representation of each token, the attention layers look at all the other words in the sentence. When the input is just the sentence versus the sentence with several padding tokens added, it's logical that we don't get the same values.

To get the same results with or without padding, we need to indicate to the attention layers that they should ignore those padding tokens. This is done by creating an attention mask, a tensor with the same shape as the input IDs, filled with zeros and ones. Ones indicate the tokens the attention layers should consider in the context and zeros the tokens they should ignore. Passing this attention mask along with the input IDs then gives us the same results as when we sent the two sentences individually to the model, as in the sketch below.
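Here is a minimal sketch of those manual steps in TensorFlow. The checkpoint and the two example sentences are assumptions for illustration (the default checkpoint behind the sentiment-analysis pipeline), not necessarily the exact ones shown on screen:

```python
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

# Assumed checkpoint: the default model behind the sentiment-analysis pipeline.
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)

# Two example sentences of different lengths (illustrative inputs).
ids1 = tokenizer("I've been waiting for a HuggingFace course my whole life.")["input_ids"]
ids2 = tokenizer("I hate this so much!")["input_ids"]

# Pad the shorter sequence with the tokenizer's padding ID...
padding_length = len(ids1) - len(ids2)
ids2_padded = ids2 + [tokenizer.pad_token_id] * padding_length

# ...and build the attention mask: 1 for real tokens, 0 for padding.
mask = [[1] * len(ids1), [1] * len(ids2) + [0] * padding_length]

batch = tf.constant([ids1, ids2_padded])

print(model(tf.constant([ids2])).logits)                      # second sentence alone
print(model(batch).logits)                                    # padded row differs
print(model(batch, attention_mask=tf.constant(mask)).logits)  # padded row matches again
```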
This is all done behind the scenes by the tokenizer when you apply it to several sentences with the flag padding=True: it will apply the padding with the proper value to the smaller sentences and create the appropriate attention mask.
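For instance, assuming the same checkpoint and example sentences as in the sketch above, a single tokenizer call with padding=True produces both the padded input IDs and the attention mask:

```python
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

# Same assumed checkpoint and example sentences as in the previous sketch.
checkpoint = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)

inputs = tokenizer(
    ["I've been waiting for a HuggingFace course my whole life.", "I hate this so much!"],
    padding=True,
    return_tensors="tf",
)
print(inputs["input_ids"].shape)   # both rows padded to the same length
print(inputs["attention_mask"])    # 0s mark the padding tokens
print(model(**inputs).logits)      # same logits as the manual version above
```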
