1
00:00:11,070 --> 00:00:16,140
So in this lecture we will be introducing the next section of this course, which is all about a very
2
00:00:16,140 --> 00:00:18,840
interesting topic known as question answering.
3
00:00:18,990 --> 00:00:23,580
Specifically, we will look at how to fine-tune a transformer for this task.
4
00:00:24,660 --> 00:00:30,120
As you recall, this is the task where we give the transformer a piece of text and ask it a question
5
00:00:30,120 --> 00:00:31,500
based on that text.
6
00:00:32,580 --> 00:00:38,160
Currently, the state of the art allows us to do extractive question answering, meaning that the correct
7
00:00:38,160 --> 00:00:41,700
answer is simply a substring of the given text.
8
00:00:42,740 --> 00:00:47,960
As a little exercise, it's worth thinking about how a neural network would be able to do that.
9
00:00:48,260 --> 00:00:53,750
Consider whether this is a problem of classification or regression and what the outputs and loss function
10
00:00:53,750 --> 00:00:54,710
might look like.
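As a hint for that exercise: extractive question answering is usually framed as two classification problems over token positions, one for the start of the answer span and one for the end. Below is a minimal PyTorch sketch; the shapes and names are illustrative, not the actual Hugging Face implementation.

    import torch
    import torch.nn as nn

    batch, seq_len, hidden_size = 8, 384, 768
    hidden_states = torch.randn(batch, seq_len, hidden_size)  # stand-in for encoder output

    # One linear head maps each token's hidden vector to two scores: start and end.
    qa_head = nn.Linear(hidden_size, 2)
    start_logits, end_logits = qa_head(hidden_states).unbind(dim=-1)  # each (batch, seq_len)

    # Targets are token positions, so this is ordinary cross-entropy over
    # sequence positions, averaged across the start and end heads.
    start_positions = torch.randint(0, seq_len, (batch,))
    end_positions = torch.randint(0, seq_len, (batch,))
    loss_fn = nn.CrossEntropyLoss()
    loss = (loss_fn(start_logits, start_positions) + loss_fn(end_logits, end_positions)) / 2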
11
00:00:55,940 --> 00:01:01,400
In order to prime your mind to learn the content of this section, you may find it beneficial to review
12
00:01:01,400 --> 00:01:06,800
the beginner's corner, where we looked at the question answering pipeline, which applies a pre-trained
13
00:01:06,800 --> 00:01:08,780
model with just one line of code.
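As a refresher, that one-liner looks roughly like this. The checkpoint the pipeline downloads by default depends on your transformers version, and the example strings here are made up.

    from transformers import pipeline

    # The one-liner from the beginner's corner: a default pre-trained extractive QA model.
    qa = pipeline("question-answering")

    result = qa(
        question="Where was Einstein born?",
        context="Albert Einstein was a theoretical physicist born in Ulm, Germany.",
    )
    print(result)  # {'score': ..., 'start': ..., 'end': ..., 'answer': 'Ulm, Germany'}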
14
00:01:13,350 --> 00:01:16,890
So let's discuss a brief outline for this section of the course.
15
00:01:17,430 --> 00:01:22,860
As with the other sections of the course on fine-tuning, we will follow the same high-level steps.
16
00:01:24,180 --> 00:01:30,420
As usual, we begin by tokenizing and processing the inputs so that they can be passed into
17
00:01:30,420 --> 00:01:31,140
the model.
18
00:01:31,950 --> 00:01:35,220
In this section, this will be our most laborious task.
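To give a taste of what that involves: for question answering, the tokenizer takes the question and the context as a pair and joins them into a single sequence. A minimal sketch, assuming an illustrative checkpoint and made-up strings:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased")  # illustrative checkpoint
    inputs = tokenizer(
        "Where was Einstein born?",                                  # question
        "Albert Einstein was a theoretical physicist born in Ulm.",  # context
        truncation="only_second",     # only the context may be truncated, never the question
        max_length=384,
        return_offsets_mapping=True,  # character spans per token, needed later for decoding
    )
    print(tokenizer.decode(inputs["input_ids"]))
    # [CLS] Where was Einstein born? [SEP] Albert Einstein ... [SEP]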
19
00:01:36,720 --> 00:01:42,480
We'll then look at how to compute metrics, which, like the previous steps, will require a large amount
20
00:01:42,480 --> 00:01:43,200
of work.
21
00:01:44,040 --> 00:01:48,870
This will look very different from the previous sections, since we'll need to do quite a bit of work
22
00:01:48,870 --> 00:01:52,680
to convert the model outputs into an actual string of text.
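Roughly, that decoding step looks like the sketch below: pick the most likely start and end tokens, then use the tokenizer's offset mapping to slice the answer out of the context string. The real version in this section also has to reject spans that land in the question, run backwards, or are too long; this simplified helper is hypothetical.

    import numpy as np

    def decode_answer(start_logits, end_logits, offset_mapping, context):
        # Most likely start and end token positions.
        start_idx = int(np.argmax(start_logits))
        end_idx = int(np.argmax(end_logits))
        # offset_mapping[i] holds the (char_start, char_end) span of token i
        # in the original context string.
        char_start = offset_mapping[start_idx][0]
        char_end = offset_mapping[end_idx][1]
        return context[char_start:char_end]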
23
00:01:54,020 --> 00:01:59,750
After these preliminary steps, we can finally move on to training the model and evaluating it
24
00:01:59,750 --> 00:02:01,340
after training is complete.
25
00:02:01,610 --> 00:02:03,980
As usual, this portion will be brief.
26
00:02:08,330 --> 00:02:14,120
So as a general theme, this section involves a little more API hunting, which means
27
00:02:14,120 --> 00:02:20,000
figuring out the right functions to call and what they do, but also a lot more in terms of getting down
28
00:02:20,000 --> 00:02:23,930
into the weeds, much more so than the previous sections of the course.
29
00:02:24,260 --> 00:02:26,780
This is primarily for two reasons.
30
00:02:27,740 --> 00:02:33,440
Reason number one is that, as you recall, inputs for question answering come in the form of context
31
00:02:33,440 --> 00:02:34,730
and question pairs.
32
00:02:35,450 --> 00:02:42,710
You can imagine a context as something like a Wikipedia page on some topic. Because of this, contexts can
33
00:02:42,710 --> 00:02:48,590
be very long and we'll need some way to handle this, along with any complications that arise from how
34
00:02:48,590 --> 00:02:49,880
we choose to do that.
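The usual way to handle this, which we'll see in detail later, is a sliding window: the tokenizer splits one long context into several overlapping windows. A sketch, with made-up inputs and an illustrative checkpoint:

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-cased")  # illustrative checkpoint
    question = "Where was Einstein born?"
    long_context = "Albert Einstein was a theoretical physicist... " * 200  # pretend Wikipedia page

    inputs = tokenizer(
        question,
        long_context,
        max_length=384,
        truncation="only_second",        # only the context gets truncated
        stride=128,                      # overlap between consecutive windows
        return_overflowing_tokens=True,  # return every window, not just the first
    )
    print(len(inputs["input_ids"]))              # number of windows for this one example
    print(inputs["overflow_to_sample_mapping"])  # which original example each window came from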
35
00:02:51,480 --> 00:02:57,000
The second issue is that it's going to take quite a bit of work to convert our model outputs into
36
00:02:57,000 --> 00:02:59,670
an actual answer represented as text.
37
00:03:00,150 --> 00:03:05,460
At a high level, this is because the neural network outputs numbers, while what we want is text.
38
00:03:06,420 --> 00:03:12,090
In my opinion, issue number one may simply be due to the fact that this is still a new library and
39
00:03:12,090 --> 00:03:17,820
the developers haven't yet had a chance to encapsulate these steps into a more convenient API.
40
00:03:18,300 --> 00:03:23,910
In any case, what this does mean is that, like in some of the previous sections, you will need to put
41
00:03:23,910 --> 00:03:26,880
on your programming hat and write actual code.
42
00:03:27,090 --> 00:03:32,520
This is not basic code like you'd see in a typical Udemy course, but real code that will require you
43
00:03:32,520 --> 00:03:34,230
to think algorithmically.
44
00:03:38,670 --> 00:03:44,160
Now, just as a heads-up, there is one quirk with the Hugging Face API that will become apparent in
45
00:03:44,160 --> 00:03:51,570
this section and this is that they tend to call everything an ID, so you'll have sequence IDs, example
46
00:03:51,570 --> 00:03:55,920
IDs, token type IDs, token IDs, all kinds of IDs.
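Here is a quick way to see a few of these side by side; the checkpoint and input strings are made up. Note that example IDs, like the "id" field in SQuAD, live in the dataset rather than in the tokenizer output.

    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")  # illustrative checkpoint
    enc = tokenizer("What is ML?", "ML is machine learning.")

    print(enc["input_ids"])       # token IDs: one vocabulary index per token
    print(enc["token_type_ids"])  # token type IDs: 0 = question segment, 1 = context segment
    print(enc.sequence_ids())     # sequence IDs: None for special tokens, else 0 or 1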
47
00:03:56,760 --> 00:04:02,970
These become very annoying to keep track of, since "sequence" and "example" are such generic words.
48
00:04:03,840 --> 00:04:07,290
I've done my best to give variables less insane-looking names.
49
00:04:07,290 --> 00:04:12,780
But do keep in mind that one of the challenging things in this section is keeping track of what each
50
00:04:12,780 --> 00:04:14,340
variable actually is.
51
00:04:15,450 --> 00:04:20,850
To be honest with you, when I first encountered this code, I found it to be quite boring and overwhelming.
52
00:04:20,850 --> 00:04:25,050
But in fact the code is quite interesting, so I encourage you to stick with it.
53
00:04:25,830 --> 00:04:31,320
Put an honest amount of effort into understanding each step, and it will become an interesting problem
54
00:04:31,320 --> 00:04:35,040
to solve if you are the type of person who likes to code.
55
00:04:35,190 --> 00:04:41,040
Plus, it's a very cool application of NLP, so if you want to train your own question answering system
56
00:04:41,040 --> 00:04:45,060
on a custom data set, this is something you'll have to know how to do.