Lecture 1, Segment 8: What can AI do? Language

Alright, so we're not yet to the point where we can tell intentionally funny stories. What can we do with language? Well, I mentioned Siri. Siri may not be the best at telling bedtime stories, but Siri does some amazing things, and the pieces that make it up are actually used in many places in industry. There's automatic speech recognition, where you go from speech to text. There's text-to-speech synthesis, which is an easier problem, where you go from the text to the speech. And then there are dialogue systems that integrate all of this together with linguistic analysis.

Let me show you what a speech recognition system looks like when you just point it at the TV. This is not customized to a specific speaker, and it's not over some great microphone like the really sophisticated microphones your phones have these days. This is just plugged straight into the TV as essentially automatic transcription. Let's see how well it does, and in particular watch the errors.

[clip plays]

So, what's interesting about this? First of all, is it good? Is it bad? It does a lot of things right, and it makes some mistakes, and the mistakes are of multiple kinds. For example, here: "The classmates said their final goodbyes." The system wrote "good buys," like Best Buy, and that is exactly the sounds the reporter said. The failing in this case was not in the acoustic modeling, which tries to connect the waveforms to the underlying linguistic sounds. The failing here is that multiple things sound the same, and you've got to figure out which one the reporter could possibly mean in the context. This is a sad story: somebody died, people are not going shopping. We humans know that, but the system does not, so in this case the problem is in the language model. There are other cases here where the problem is more in the acoustics. Putting all this together in some probabilistic framework that lets you reconcile it all is a big part of how speech recognition works. We'll have more discussion of that later.
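To make that "probabilistic framework" concrete, here is a minimal sketch of noisy-channel scoring, the standard way an acoustic model and a language model get reconciled. It is not the system shown in the TV demo, and every number in it is invented for illustration: the two homophone candidates tie on the acoustic side, so the language model's preference for "goodbyes" in this context decides the output.

```python
# Toy noisy-channel scoring: choose the transcript w that maximizes
# P(w | audio) ~ P(audio | w) * P(w), i.e. acoustic score times language-model score.
# All log-probabilities below are made up for illustration.

candidates = [
    "the classmates said their final goodbyes",
    "the classmates said their final good buys",
]

# Acoustic model: both candidates match the sounds equally well (homophones).
acoustic_logprob = {
    candidates[0]: -42.0,
    candidates[1]: -42.0,
}

# Language model: in a story about a death, "final goodbyes" is a far more
# probable word sequence than "final good buys".
language_logprob = {
    candidates[0]: -18.0,
    candidates[1]: -27.5,
}

def total_score(w):
    # Log-space product of the two models.
    return acoustic_logprob[w] + language_logprob[w]

best = max(candidates, key=total_score)
for w in candidates:
    print(f"{total_score(w):7.1f}  {w}")
print("chosen:", best)
```

A real recognizer scores enormous lattices of candidate word sequences this way rather than two hand-picked strings, but the reconciliation step is the same idea.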
We can do more with language than just move the signal between speech and text. This is actually my research area. We can do things like question answering. We talked a little about Watson, and we'll have a lot more about Watson later in the course. Watson is basically a question answering system. Yes, there's the layer of remembering to phrase your response as a question because you're on Jeopardy, and making sure you wager the right amount on the Daily Double, and that kind of stuff. But to a first approximation, a question comes in, Watson has to dig through a lot of information, largely Wikipedia, and connect up some answer to the question so that it knows how to respond. It's basically a question answering system, although an amazingly cool demonstration of a very good one.

Another thing we can do is machine translation. How many of you have used a tool like Google Translate? So, again, C-3PO. How good is machine translation? Well, it depends on the language pair. If I'm looking at a page, say in Chinese, and I don't speak any Chinese, the machine translation is pretty good, because I was starting with nothing. But if I actually speak the language, maybe I'm better off reading it in its original form. You can see some of these problems if you do a round trip from, say, English to Chinese and back, and look at how good what comes back is. Actually, that's a good way to make an unintentionally funny story.

What else can we do? Things like web search, which is really about a lot of things: it has something to do with the words, but also with click-stream information, local search, and things like that. So there's a lot that goes into web search, and a big part of it is the language. Text classification, spam filtering: again, spam filtering is a case where it's part language, part not language. We'll talk more about spam filtering later. And so on. These are the kinds of things you can do in the domain of natural language. We're no longer trying so hard to tell stories, funny or otherwise; we're trying to build things like this. And there has been a lot of traction. There's a lot of stuff we can build.
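As a small illustration of the text classification item just mentioned, here is a minimal naive Bayes spam filter. It is not the filter discussed later in the course; the tiny training set, the word-count features, and the smoothing constant are all invented for illustration. A real filter would also fold in the non-language signals the lecture alludes to (sender reputation, links, and so on).

```python
from collections import Counter
import math

# Tiny made-up training set: (label, message) pairs.
train = [
    ("spam", "win a free prize now click here"),
    ("spam", "free money click now"),
    ("ham",  "are we still meeting for lunch tomorrow"),
    ("ham",  "please send the lecture notes when you can"),
]

def tokenize(text):
    return text.lower().split()

# Per-class word counts and class frequencies.
word_counts = {"spam": Counter(), "ham": Counter()}
class_counts = Counter()
for label, msg in train:
    class_counts[label] += 1
    word_counts[label].update(tokenize(msg))

vocab = {w for counts in word_counts.values() for w in counts}

def log_posterior(label, msg, alpha=1.0):
    # log P(label) + sum over words of log P(word | label), with add-alpha smoothing.
    logp = math.log(class_counts[label] / sum(class_counts.values()))
    total = sum(word_counts[label].values())
    for w in tokenize(msg):
        logp += math.log((word_counts[label][w] + alpha) /
                         (total + alpha * len(vocab)))
    return logp

def classify(msg):
    return max(("spam", "ham"), key=lambda label: log_posterior(label, msg))

print(classify("click here for a free prize"))       # spam
print(classify("lunch tomorrow after the lecture"))  # ham
```

Naive Bayes is only one of many classifiers used for this, but it shows how far word counts alone can get you, which is the "part language" part of the problem.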
We're not yet at C-3PO, but we actually can now translate Russian, which we couldn't do in the fifties even though they thought they would be able to do it by the sixties. But now, today, we can. It only took something like twelve times longer than they thought it would.
