Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated:
1
00:00:00,420 --> 00:00:06,500
In these first few videos, I want to lay the\n
2
00:00:06,500 --> 00:00:13,809
need throughout these video tutorials. Let's\n
3
00:00:13,808 --> 00:00:21,160
data structure? One definition that I really\n
4
00:00:21,160 --> 00:00:27,460
data so that it can be used efficiently. And\n
5
00:00:27,460 --> 00:00:33,980
a way of organizing data in some fashion so\n
6
00:00:33,979 --> 00:00:45,158
or perhaps even updated quickly and easily.\n
7
00:00:45,158 --> 00:00:53,588
Well, they are essential ingredients in creating\n
8
00:00:53,588 --> 00:01:01,100
reason might be that they help us manage and\n
9
00:01:01,100 --> 00:01:09,390
this last point, is more of my own making.\n
10
00:01:09,390 --> 00:01:14,939
to understand. As a side note, one of the\n
11
00:01:14,938 --> 00:01:21,419
bad, mediocre to excellent programmers is that\n
12
00:01:21,420 --> 00:01:27,700
fundamentally understand how and when to use\n
13
00:01:27,700 --> 00:01:32,170
they're trying to finish. Data structures\n
14
00:01:32,170 --> 00:01:38,890
okay product and an outstanding one. It's\n
15
00:01:38,890 --> 00:01:46,950
student is required to take a course in data\n
16
00:01:46,950 --> 00:01:53,100
begin talking about data structures that we\n
17
00:01:53,099 --> 00:02:01,438
structures. What I'm talking about is the\n
18
00:02:01,438 --> 00:02:06,728
an abstract data type and how does it differ\n
19
00:02:06,728 --> 00:02:12,759
that an abstract data type is an abstraction\n
20
00:02:12,759 --> 00:02:19,759
interface to which that data structure must\n
21
00:02:22,750 --> 00:02:29,479
a specific data structure should be implemented,\n
22
00:02:29,479 --> 00:02:38,339
example I like to give is to suppose that\n
23
00:02:38,340 --> 00:02:45,590
to get from point A to point B. Well, as we\n
24
00:02:45,590 --> 00:02:51,719
to get from one place to another. So which\n
25
00:02:51,719 --> 00:02:57,908
transportation might be walking or biking,\n
26
00:02:57,908 --> 00:03:03,959
specific modes of transportation would be\n
27
00:03:03,959 --> 00:03:10,370
We want to get from one place to another through\n
28
00:03:10,370 --> 00:03:17,759
abstract data type. How did we do that, exactly?\n
29
00:03:17,759 --> 00:03:25,968
some examples of abstract data types on the\n
30
00:03:25,968 --> 00:03:32,968
on the right hand side. As you can see, a\n
31
00:03:32,968 --> 00:03:39,489
have a dynamic array or a linked list. They\n
32
00:03:39,489 --> 00:03:47,530
indexing elements in the list. Next, we have\n
33
00:03:47,530 --> 00:03:54,019
themselves can be implemented in a variety\n
34
00:03:54,019 --> 00:04:02,079
for a queue, I put a stack based queue because\n
35
00:04:02,079 --> 00:04:09,659
using only stacks. This may not be the most\n
36
00:04:09,658 --> 00:04:16,800
it does work and it is possible. The point\n
37
00:04:16,800 --> 00:04:22,860
how a data structure should behave, and what\n
38
00:04:22,860 --> 00:04:31,069
surrounding how those methods are implemented.\n
39
00:04:31,069 --> 00:04:37,819
data types, we need to have a quick look at\n
40
00:04:37,819 --> 00:04:45,829
to understand the performance that our data\n
41
00:04:45,829 --> 00:04:53,389
we often find ourselves asking the same two\n
42
00:04:53,389 --> 00:05:01,279
is, how much time does this algorithm need\n
43
00:05:01,279 --> 00:05:08,529
algorithm need for my computation. So, if\n
44
00:05:08,529 --> 00:05:15,819
to finish, then it's no good. Similarly, if\n
45
00:05:15,819 --> 00:05:21,639
a space equal to the sum of all the bytes\n
46
00:05:21,639 --> 00:05:30,300
your algorithm is also useless. So to standardize\n
47
00:05:30,300 --> 00:05:35,759
much space is required for an algorithm to\n
48
00:05:35,759 --> 00:05:43,949
invented big O notation, amongst other things\n
49
00:05:43,949 --> 00:05:50,909
we're interested in Big O because it tells\n
50
00:05:50,910 --> 00:05:57,450
cares about the worst case. So if your algorithm\n
51
00:05:57,449 --> 00:06:05,579
possible arrangement of numbers for your particular\n
52
00:06:05,579 --> 00:06:10,649
suppose you have an unordered list of unique\n
53
00:06:10,649 --> 00:06:16,669
seven, or the position where seven occurs\n
54
00:06:16,670 --> 00:06:21,629
seven, that is at the beginning. Or in the\n
55
00:06:21,629 --> 00:06:29,339
very last element of the list. For that particular\n
56
00:06:29,339 --> 00:06:35,119
with respect to the number of elements in\n
57
00:06:35,120 --> 00:06:43,550
every single element until you find seven.\n
58
00:06:43,550 --> 00:06:48,740
just consider the worst possible amount of\n
59
00:06:48,740 --> 00:06:55,990
that particular input. There's also the fact\n
60
00:06:55,990 --> 00:07:01,410
when your input becomes arbitrarily large.\n
61
00:07:01,410 --> 00:07:07,170
the input is small. For this reason, you'll\n
62
00:07:07,170 --> 00:07:18,240
multiplicative factors. So in our big O notation,\n
63
00:07:21,639 --> 00:07:27,819
when I say n, n is usually always want to\n
64
00:07:27,819 --> 00:07:35,459
there's always going to be some limitation\n
65
00:07:35,459 --> 00:07:42,911
a one wrapped around a big O. If your algorithm\n
66
00:07:42,911 --> 00:07:50,180
we say that's big O of log of n. If it's\n
67
00:07:50,180 --> 00:07:57,470
quadratic time or cubic time, then we say\n
68
00:07:57,470 --> 00:08:03,030
Usually, this is going to be something like\n
69
00:08:03,029 --> 00:08:10,019
than one to the n. And then we also have n\n
70
00:08:10,019 --> 00:08:17,509
these, like square root of n, log log of n,\n
71
00:08:17,509 --> 00:08:24,379
any mathematical expression containing n can\n
72
00:08:24,379 --> 00:08:33,448
valid. Now, we want to talk about some properties\n
73
00:08:33,448 --> 00:08:38,929
in the last two slides, Big O only really\n
74
00:08:38,929 --> 00:08:45,128
really big. So we're not interested when n\nis small, only
75
00:08:45,129 --> 00:08:54,149
what happens when n goes to infinity. So this\n
76
00:08:54,149 --> 00:09:00,568
The first is that we can simply remove constant\n
77
00:09:00,568 --> 00:09:06,389
if you're adding a constant to infinity, well,\n
78
00:09:06,389 --> 00:09:14,220
a constant by infinity, yeah, that's still\n
79
00:09:14,220 --> 00:09:20,810
of course, this is all theoretical. In the\n
80
00:09:20,809 --> 00:09:28,000
billion, probably, that's going to have a\n
81
00:09:29,620 --> 00:09:38,578
However, let us look at a function f, which\n
82
00:09:38,578 --> 00:09:49,500
f of n is seven log of n cubed plus 15 n squared\n
83
00:09:49,500 --> 00:10:01,169
of n is just n cubed, because n cubed is the\n
84
00:10:02,990 --> 00:10:12,579
look at some concrete examples of how big\n
85
00:10:12,578 --> 00:10:19,748
constant time with respect to the input size,\n
86
00:10:21,070 --> 00:10:27,350
So on the left, when we're just adding or\n
87
00:10:27,350 --> 00:10:33,060
constant time. And on the right, okay, we're\n
88
00:10:33,059 --> 00:10:39,669
depend on n. So it runs also in a constant\n
89
00:10:39,669 --> 00:10:46,958
arbitrarily large, well, that loop is still\n
90
00:10:46,958 --> 00:10:51,239
Now let's look at a linear example. So both\n
91
00:10:51,240 --> 00:10:57,680
linear time with respect to the input size,\n
92
00:11:00,620 --> 00:11:07,740
So on the left, we're incrementing, the counter\n
93
00:11:07,740 --> 00:11:14,180
clearly, when we wrap this in a big O, we get\n
94
00:11:14,179 --> 00:11:21,349
complicated, we're not incrementing by one,\n
95
00:11:21,350 --> 00:11:28,329
to finish that loop three times faster. So\n
96
00:11:28,328 --> 00:11:36,599
is two algorithms that run in quadratic time.\n
97
00:11:36,600 --> 00:11:45,319
n times. So n times n is big O of n squared.\n
98
00:11:45,318 --> 00:11:52,938
zero with an i. So pause the video and try\n
99
00:11:52,938 --> 00:12:02,088
squared. Okay, let's go over the solution.\n
100
00:12:02,089 --> 00:12:10,660
first loop isn't as important. So since i\n
101
00:12:10,659 --> 00:12:16,039
is going to be directly related to what i\n
102
00:12:16,039 --> 00:12:24,429
loop. So if we fix i to be zero, we do n\n
103
00:12:24,429 --> 00:12:32,938
If we fix it to two, we do n minus two work\n
104
00:12:32,938 --> 00:12:39,659
what is n plus n minus one plus n minus two\n
105
00:12:39,659 --> 00:12:47,958
a well known identity, and turns out to be\n
106
00:12:47,958 --> 00:12:56,599
wrap this in a big O, we split our equation,\n
107
00:12:56,600 --> 00:13:03,870
Now let's look at perhaps a more complicated\n
108
00:13:03,870 --> 00:13:10,339
how do we ever get these logarithmic or\n
109
00:13:10,339 --> 00:13:18,410
a classic algorithm of doing a binary search,\n
110
00:13:18,409 --> 00:13:26,179
So what this algorithm does is it starts by\n
111
00:13:26,179 --> 00:13:31,628
and one at the very end of the array, then\n
112
00:13:31,629 --> 00:13:38,610
if the value we're looking for was found at\n
113
00:13:38,610 --> 00:13:46,269
or not, if it has found it, it stops. Otherwise,\n
114
00:13:46,269 --> 00:13:52,649
either the high or the low pointer. Remark\n
115
00:13:52,649 --> 00:14:00,078
half of the array, each iteration. So very,\n
116
00:14:00,078 --> 00:14:06,458
range check. So if you do the math, the worst\n
117
00:14:06,458 --> 00:14:14,528
N iterations, meaning that the binary search\n
118
00:14:14,528 --> 00:14:21,860
a very powerful algorithm. Here's a slightly\n
119
00:14:21,860 --> 00:14:28,789
notice that there is an outer loop with a\n
120
00:14:28,789 --> 00:14:35,528
that there are two inner loops, one that does\n
121
00:14:35,528 --> 00:14:40,499
work. So the rule we use to determine the\n
122
00:14:40,499 --> 00:14:49,439
loops on different levels and add those that\n
123
00:14:49,438 --> 00:14:56,889
using the rule above, we can see that it takes\n
124
00:14:56,889 --> 00:15:09,149
n plus two n for both inner loops, which gives\n
125
00:15:09,149 --> 00:15:15,919
All right, so this next one looks very similar,\n
126
00:15:15,919 --> 00:15:24,969
outer loop with i, we have i going from\n
127
00:15:24,970 --> 00:15:29,910
on the outside. But we have to multiply that\n
128
00:15:31,198 --> 00:15:39,808
j goes from 10 to 50. So that does 40 loops\n
129
00:15:39,808 --> 00:15:48,278
the amount of work. Plus however, the second\n
130
00:15:48,278 --> 00:15:54,999
j equals j plus two, so it's accelerated a\n
131
00:15:54,999 --> 00:16:04,060
a little faster. So we're gonna get on the\n
132
00:16:04,059 --> 00:16:11,318
we have to multiply that by three n. So we\n
133
00:16:11,318 --> 00:16:18,039
is going to give us big O of n to the fourth,\n
134
00:16:18,039 --> 00:16:31,078
in our function, f of n. For some other classic\n
135
00:16:31,078 --> 00:16:36,479
the subsets of a set, that takes an exponential\n
136
00:16:36,480 --> 00:16:42,870
N subsets, finding all permutations of a string\n
137
00:16:42,870 --> 00:16:51,948
one is merge sort. So we have n times log\n
138
00:16:51,948 --> 00:16:59,688
iterate over all the cells of an array of\n
139
00:16:59,688 --> 00:17:07,519
right, let's talk about arrays, probably the\n
140
00:17:07,519 --> 00:17:13,910
of two in the array videos. The reason the\n
141
00:17:13,910 --> 00:17:21,029
a fundamental building block for all other\n
142
00:17:21,029 --> 00:17:27,579
with arrays and pointers alone, I'm pretty\n
143
00:17:27,579 --> 00:17:35,409
structure. So an outline for today's video.\n
144
00:17:35,410 --> 00:17:44,250
about arrays and answer some fundamental questions\n
145
00:17:44,250 --> 00:17:50,619
Next, I will explain the basic structure of\n
146
00:17:50,619 --> 00:17:58,549
able to perform on them. Lastly, we will go\n
147
00:17:58,549 --> 00:18:06,319
look at some source code on how to construct\n
148
00:18:06,319 --> 00:18:14,669
discussion and examples. So what is a static\n
149
00:18:14,670 --> 00:18:22,259
containing elements, which are indexable,\n
150
00:18:22,259 --> 00:18:31,940
n minus one also inclusive. So a follow up\n
151
00:18:31,940 --> 00:18:39,500
So the answer is that this means that each slot\n
152
00:18:39,500 --> 00:18:47,519
a number. Furthermore, I would like to add\n
153
00:18:47,519 --> 00:18:55,250
chunks of memory. Meaning that your chunk\n
154
00:18:55,250 --> 00:19:04,230
cheese with a bunch of holes and gaps. It's\n
155
00:19:04,230 --> 00:19:14,130
in your static array. Okay, so when and where\n
156
00:19:14,130 --> 00:19:21,410
everywhere, absolutely everywhere. It's hard\n
157
00:19:21,410 --> 00:19:29,509
fact, here are a few places you may or may\n
158
00:19:29,509 --> 00:19:36,359
the first simple example is to temporarily\n
159
00:19:36,359 --> 00:19:42,429
you're probably familiar with. Next is that\n
160
00:19:42,430 --> 00:19:48,259
from an input or an output stream. Suppose\n
161
00:19:48,259 --> 00:19:55,180
that you need to process but that file is\n
162
00:19:55,180 --> 00:20:02,390
we use a buffer to read small chunks of the\n
163
00:20:02,390 --> 00:20:08,150
a time. And so eventually we're able to read\n
164
00:20:08,150 --> 00:20:15,160
as lookup tables because of their indexing\n
165
00:20:15,160 --> 00:20:20,960
retrieve data from the lookup table if you\n
166
00:20:20,960 --> 00:20:27,361
and at what offset. Next, we also use arrays\n
167
00:20:27,361 --> 00:20:30,410
that only allows one return value.
168
00:20:30,410 --> 00:20:38,279
So the hack we use then is to return a pointer\n
169
00:20:38,279 --> 00:20:45,190
all the return values that we want. This last\n
170
00:20:45,190 --> 00:20:51,400
are heavily used in the programming technique\n
171
00:20:51,400 --> 00:20:57,880
to cache already computed subproblems. So\n
172
00:20:57,880 --> 00:21:03,820
problem, or the coin change problem. All right,\n
173
00:21:03,819 --> 00:21:11,819
access time for static array and a dynamic\n
174
00:21:11,819 --> 00:21:17,980
arrays are indexable. Searching, however,\n
175
00:21:17,980 --> 00:21:24,589
we potentially have to traverse all the elements\n
176
00:21:24,589 --> 00:21:31,929
the element you're looking for does not exist.\n
177
00:21:31,930 --> 00:21:38,730
array doesn't really make sense. The static\n
178
00:21:38,730 --> 00:21:46,490
grow larger or smaller. When inserting with\n
179
00:21:46,490 --> 00:21:51,849
linear time, because you potentially have\n
180
00:21:51,849 --> 00:21:57,559
recopy all the elements into the new static\n
181
00:21:57,559 --> 00:22:05,149
a dynamic array using static arrays. However,\n
182
00:22:05,150 --> 00:22:13,920
seem a little strange? Well, when we append\n
183
00:22:13,920 --> 00:22:21,310
the internal static array containing all those\n
184
00:22:21,309 --> 00:22:28,779
appending becomes constant time. Deletions\n
185
00:22:28,779 --> 00:22:36,700
are linear, you have to shift all of the elements\n
186
00:22:36,700 --> 00:22:45,370
into your static array. Okay, so we have a\n
187
00:22:45,369 --> 00:22:56,739
A contains the values 44, 12, minus 5, 17, 6, 0, 3, 9\n
188
00:22:56,740 --> 00:23:06,970
but this is not a requirement of an array.\n
189
00:23:06,970 --> 00:23:15,940
has index position zero in the array, not\n
190
00:23:15,940 --> 00:23:23,880
science students, you have no idea. The confusing\n
191
00:23:23,880 --> 00:23:36,880
is one based, while computer science is zero based.\n
192
00:23:36,880 --> 00:23:49,850
the values 44, 12, minus 5, 17, 6, 0, 3, 9, and 100. Currently,\n
193
00:23:49,849 --> 00:23:58,199
is not at all a requirement of the array.\n
194
00:23:58,200 --> 00:24:09,200
is indexed, or positioned at index of zero\n
195
00:24:09,200 --> 00:24:17,039
a lot of intro computer science students.\n
196
00:24:17,039 --> 00:24:24,599
mathematics is one based while computer science\n
197
00:24:24,599 --> 00:24:30,719
But worst of all is quantum computing. I\n
198
00:24:30,720 --> 00:24:38,710
during my undergrad, and the field is a mess.\n
199
00:24:38,710 --> 00:24:45,740
scientists and physicists all at the same\n
200
00:24:45,740 --> 00:24:53,420
Anyways, back to arrays. I should also note\n
201
00:24:53,420 --> 00:24:59,350
for-each loop, something that's offered in\n
202
00:24:59,349 --> 00:25:06,409
you to explicitly reference the indices of\n
203
00:25:06,410 --> 00:25:14,850
internally, behind the scenes, the notation\n
204
00:25:14,849 --> 00:25:23,449
array A square bracket zero, close square\n
205
00:25:24,839 --> 00:25:35,319
is equal to the value 44. Similarly, a position\n
206
00:25:35,319 --> 00:25:43,289
But A index nine is out of bounds. And our\n
207
00:25:43,289 --> 00:25:51,819
in C, it doesn't always throw an exception,\n
208
00:25:51,819 --> 00:26:02,649
zero to be minus one, that happens. If we assign\n
209
00:26:02,650 --> 00:26:11,250
be 25. Let's look at operations on dynamic\n
210
00:26:11,250 --> 00:26:18,259
in size as needed. So the dynamic array can\n
211
00:26:18,259 --> 00:26:25,400
can do but unlike the static array, it grows\n
212
00:26:25,400 --> 00:26:33,180
have A containing 34 and 4 there, if\n
213
00:26:33,180 --> 00:26:41,070
end. If we add 34, again, then it will add\n
214
00:26:41,069 --> 00:26:49,109
So you see here, our dynamic array shrunk\n
215
00:26:49,109 --> 00:26:54,279
talked about this a little bit. But how do\n
216
00:26:54,279 --> 00:26:59,589
the answer is typically this is done with\n
217
00:26:59,589 --> 00:27:07,109
of course. So first, we create a static\n
218
00:27:07,109 --> 00:27:14,259
non zero. So as we add elements, we add elements\n
219
00:27:14,259 --> 00:27:21,250
of the number of elements added. Once, we\n
220
00:27:21,250 --> 00:27:26,569
of our internal static array, what we can\n
221
00:27:26,569 --> 00:27:31,500
elements into this new static array, and add\n
222
00:27:31,500 --> 00:27:39,509
an example. So suppose we create a dynamic\n
223
00:27:39,509 --> 00:27:46,730
we begin adding elements to it. So the little\n
224
00:27:46,730 --> 00:27:52,089
for an empty position. Okay, so we add seven,\n
225
00:27:52,089 --> 00:27:56,839
fine. But once we add three, it doesn't fit\n
226
00:27:56,839 --> 00:28:03,059
the size of the array, copy all the elements\n
227
00:28:03,059 --> 00:28:09,539
12, everything's still okay, we're doing good.\n
228
00:28:09,539 --> 00:28:16,339
resize again. So double the size of the container,\n
229
00:28:16,339 --> 00:28:22,849
array, and then finish off by adding five.\n
230
00:28:22,849 --> 00:28:31,609
issues. All right, time to look at the dynamic\n
231
00:28:31,609 --> 00:28:41,639
in the array series. So the source code for\n
232
00:28:41,640 --> 00:28:49,110
github.com, slash my username slash data dash\n
233
00:28:49,109 --> 00:28:55,719
video so you know what's going on with this\n
234
00:28:55,720 --> 00:29:04,680
in the array class. So I've designed an array\n
235
00:29:04,680 --> 00:29:13,160
type of data we want put in this array, that's\n
236
00:29:13,160 --> 00:29:18,050
that we care about: arr, which is our internal\nstatic array.
237
00:29:19,140 --> 00:29:26,080
which is the length the user thinks the array\n
238
00:29:26,079 --> 00:29:34,210
array, because sometimes our array might have\n
239
00:29:34,210 --> 00:29:42,340
the user that there's extra free slots that\n
240
00:29:42,339 --> 00:29:50,089
So there's two constructors. The first one\n
241
00:29:50,089 --> 00:29:55,859
The other one you give it a capacity, the\n
242
00:29:55,859 --> 00:30:06,299
or equal to zero. And then we initialize\n
243
00:30:06,299 --> 00:30:17,730
I put this suppress warnings and unchecked\n
244
00:30:17,730 --> 00:30:22,940
So here are two simple methods size, get the\n
245
00:30:22,940 --> 00:30:28,850
empty, pretty self explanatory. Similarly\n
246
00:30:28,849 --> 00:30:34,971
if we want the value or set it if we want\n
247
00:30:34,971 --> 00:30:45,380
be doing a bounds check for both of these,\n
248
00:30:45,380 --> 00:30:54,240
we just remove all the data in our array and\n
249
00:30:54,240 --> 00:31:00,529
the Add method where things actually get a\n
250
00:31:02,369 --> 00:31:10,689
plus one is greater than or equal to the capacity,\n
251
00:31:10,690 --> 00:31:16,610
I'm resizing is I'm doubling the size of the\n
252
00:31:16,609 --> 00:31:25,299
size, but I've decided that doubling the size\n
253
00:31:25,299 --> 00:31:34,079
I have to create a new array with the new\n
254
00:31:34,079 --> 00:31:41,559
this line or these lines are doing, it's copying\n
255
00:31:41,559 --> 00:31:52,319
then it sets the old array to be the new array.\n
256
00:31:52,319 --> 00:32:04,319
into our array. So this removeAt method will\n
257
00:32:04,319 --> 00:32:11,519
First, we check if the index is valid. If\n
258
00:32:11,519 --> 00:32:20,720
otherwise, grab the data at the Remove index.\n
259
00:32:20,720 --> 00:32:28,950
one. Now copy everything into the new array
260
00:32:30,250 --> 00:32:36,849
for when it's at that remove index. Now I'm\n
261
00:32:36,849 --> 00:32:45,819
decided to do the following maintain two indices\n
262
00:32:45,819 --> 00:32:54,399
when i is equal to the remove index, then\n
263
00:32:54,400 --> 00:33:07,540
temporarily and using j to lag behind, if\n
264
00:33:07,539 --> 00:33:13,210
So I guess pretty clever overall. And then\n
265
00:33:13,210 --> 00:33:23,509
generated. Reset the capacity and return the\n
266
00:33:23,509 --> 00:33:31,049
remove, we scan through the array. If we find\n
267
00:33:31,049 --> 00:33:38,759
the index and return true otherwise return\n
268
00:33:38,759 --> 00:33:48,650
it return I otherwise return minus one. Contains\n
269
00:33:48,650 --> 00:34:00,430
one. All right, this next one, I return an\n
270
00:34:00,430 --> 00:34:08,730
us to iterate over the array providing an\n
271
00:34:08,730 --> 00:34:15,440
two methods. And this is hasNext. So there\n
272
00:34:15,440 --> 00:34:21,500
is less than the length of the array. I should\n
273
00:34:21,500 --> 00:34:27,480
here, just in case someone decides to change\n
274
00:34:27,480 --> 00:34:34,539
it might add that later might not.
275
00:34:37,028 --> 00:34:45,869
there's the next method, which just returns\n
276
00:34:45,869 --> 00:34:51,250
the iterator. Okay, and lastly is the toString\n
277
00:34:51,250 --> 00:34:56,829
Nothing too complicated. Alright, so this\n
278
00:34:56,829 --> 00:35:02,700
a dynamic array. If you look at Java's ArrayList\n
279
00:35:02,699 --> 00:35:08,189
we're going to talk about singly and doubly\n
280
00:35:08,190 --> 00:35:12,960
out there. This is part one of two and the\n
281
00:35:12,960 --> 00:35:21,150
code on how to implement a doubly linked list.\n
282
00:35:21,150 --> 00:35:26,010
we're going to answer some basic questions\n
283
00:35:26,010 --> 00:35:31,970
namely, what are they and where are they used.\n
284
00:35:31,969 --> 00:35:37,139
linked lists, so everyone knows what I mean\n
285
00:35:37,139 --> 00:35:43,039
the tail of the list. Then last in\n
286
00:35:43,039 --> 00:35:49,759
and cons of using singly and doubly linked\n
287
00:35:49,760 --> 00:35:57,830
from both singly and doubly linked lists as\n
288
00:35:57,829 --> 00:36:04,018
discussion. So what is a linked list? A linked\n
289
00:36:04,018 --> 00:36:11,078
data which point to other nodes also containing\n
290
00:36:11,079 --> 00:36:17,670
list containing some arbitrary data. Notice\n
291
00:36:17,670 --> 00:36:25,269
node. Also notice that the last node points\n
292
00:36:29,170 --> 00:36:37,260
always has a null reference to the next node,\n
293
00:36:37,260 --> 00:36:45,000
slides. Okay, so where are linked lists used?\n
294
00:36:45,000 --> 00:36:52,239
actually in the abstract data type implementation\n
295
00:36:52,239 --> 00:36:58,649
great time complexity for adding and removing\n
296
00:36:58,650 --> 00:37:05,889
like circular lists, making the pointer of\n
297
00:37:05,889 --> 00:37:13,029
linked lists are used to model repeating events\n
298
00:37:13,030 --> 00:37:18,190
on a bunch of elements or representing corners\n
299
00:37:18,190 --> 00:37:24,380
linked lists can also be used to model real\n
300
00:37:24,380 --> 00:37:31,019
that could be useful. And moving on some more\n
301
00:37:31,018 --> 00:37:38,239
lists and hash table separate chaining, and\n
302
00:37:38,239 --> 00:37:45,568
to those in a later video. Okay, a bit of\n
303
00:37:45,568 --> 00:37:50,369
thing you need to know when creating a linked\n
304
00:37:50,369 --> 00:37:56,068
to the head of the linked list. This is because\n
305
00:37:56,068 --> 00:38:03,019
our list. We give a name to the last element\n
306
00:38:03,019 --> 00:38:09,719
tail of the list. Then there are also the\n
307
00:38:09,719 --> 00:38:15,858
pointers are also sometimes called references.\n
308
00:38:15,858 --> 00:38:22,259
You should also know that the nodes themselves\n
309
00:38:22,259 --> 00:38:28,809
when actually implemented. We'll get to this\n
310
00:38:28,809 --> 00:38:34,750
versus doubly linked lists, sort of concerning\n
311
00:38:34,750 --> 00:38:42,349
are two types, singly linked and doubly linked.\n
312
00:38:42,349 --> 00:38:48,740
to the next node, while doubly linked lists\n
313
00:38:48,739 --> 00:38:54,819
but also to the previous node, which makes\n
314
00:38:54,820 --> 00:38:59,769
say we cannot have triple or quadruple the\n
315
00:38:59,768 --> 00:39:06,858
place additional pointers. Pros and cons of\n
316
00:39:06,858 --> 00:39:12,400
Between picking a singly and a doubly linked\n
317
00:39:12,400 --> 00:39:18,900
linked lists, we observe that they use less memory.\n
318
00:39:18,900 --> 00:39:25,230
up a lot of memory. If you're running on a\n
319
00:39:25,230 --> 00:39:31,298
on a 32 bit machine four bytes each. So having\n
320
00:39:31,298 --> 00:39:38,759
pointer for each node, hence, twice as much\n
321
00:39:38,759 --> 00:39:44,681
you cannot access previous elements because\n
322
00:39:44,681 --> 00:39:49,239
to traverse from the head of a linked list\n
323
00:39:49,239 --> 00:39:55,219
it. Now concerning doubly linked lists with\n
324
00:39:55,219 --> 00:39:59,649
we can easily traverse the list backwards,\n
325
00:39:59,650 --> 00:40:05,338
list. Also, having a reference to a node you\n
326
00:40:05,338 --> 00:40:11,949
time and patch the hole you just created.\n
327
00:40:11,949 --> 00:40:17,460
previous and next nodes. This is something\n
328
00:40:17,460 --> 00:40:25,181
would leave the list severed in two. A downside\n
329
00:40:25,181 --> 00:40:31,489
use twice as much memory. Okay, let's go into\n
330
00:40:31,489 --> 00:40:37,818
create linked lists and remove elements from\n
331
00:40:37,818 --> 00:40:45,429
list. So here is a singly linked list. I've\n
332
00:40:45,429 --> 00:40:52,108
we want to insert 11. At the third position\n
333
00:40:52,108 --> 00:40:59,038
an example. So the first thing we do is we\n
334
00:40:59,039 --> 00:41:05,049
This is almost always the first step in all\n
335
00:41:05,048 --> 00:41:12,230
to do is seek up to but not including the\n
336
00:41:12,230 --> 00:41:17,289
and we advanced our traverser pointers, setting\n
337
00:41:17,289 --> 00:41:25,588
23. And now we're actually already where\n
338
00:41:25,588 --> 00:41:30,068
we create the next node, that's the green\nnode 11.
339
00:41:30,068 --> 00:41:38,460
And we make 11's next pointer point\n
340
00:41:38,460 --> 00:41:45,289
is next pointer to be 11. Remember, we have\n
341
00:41:45,289 --> 00:41:52,950
a reference to it with the traverser. Okay,\n
342
00:41:52,949 --> 00:42:01,460
that we've correctly inserted 11 at the right\n
343
00:42:01,460 --> 00:42:08,119
time to insert with a doubly linked list.\n
344
00:42:08,119 --> 00:42:15,891
flying around. But it's the exact same concept.\n
345
00:42:15,891 --> 00:42:22,220
only has pointers to the next node, but also\n
346
00:42:22,219 --> 00:42:30,649
those in the insertion phase. Okay, create\n
347
00:42:30,650 --> 00:42:38,639
the head is, and advance it until you are\n
348
00:42:38,639 --> 00:42:43,618
advance the traverser by one and now we're\n
349
00:42:43,619 --> 00:42:53,240
let's create the new node which is node 11.\n
350
00:42:53,239 --> 00:43:02,969
Also point 11's previous pointer to be\n
351
00:43:02,969 --> 00:43:13,248
traverser. Now we make 7's previous pointer\n
352
00:43:13,248 --> 00:43:23,778
to 11. And the last step, make 23's next\n
353
00:43:23,778 --> 00:43:33,141
go forwards from 23 to 11. So in total, remark\n
354
00:43:33,141 --> 00:43:40,590
we've flattened out the list, you can see\n
355
00:43:40,590 --> 00:43:48,200
All right now how to remove elements from\n
356
00:43:48,199 --> 00:43:55,558
remove the node with a value nine. How do\n
357
00:43:55,559 --> 00:44:02,950
use is not to use one pointer but two, you\n
358
00:44:02,949 --> 00:44:11,909
to show you how it is done by using two. So\n
359
00:44:11,909 --> 00:44:19,828
for traverser one and traverser two respectively.\n
360
00:44:19,829 --> 00:44:30,150
or two points to the head's next node. Now\n
361
00:44:30,150 --> 00:44:41,490
we find the node we want to remove while also\n
362
00:44:41,489 --> 00:44:48,838
nine. So this is the stopping point. I'm going\n
363
00:44:48,838 --> 00:44:57,268
to remove so we can deallocate its memory\n
364
00:44:57,268 --> 00:45:04,818
trav two to the next node. Node nine has\n
365
00:45:04,818 --> 00:45:12,880
this will that at this point, node nine is\n
366
00:45:12,880 --> 00:45:23,180
for the visual effect. Okay, so now set trav\n
367
00:45:23,179 --> 00:45:30,659
And now is an appropriate time to remove the\n
368
00:45:30,659 --> 00:45:37,980
it and their temp has been deallocated. Make\n
369
00:45:37,980 --> 00:45:44,588
memory leaks. This is especially important\n
370
00:45:44,588 --> 00:45:50,690
where you manage your memory. Now you can\n
371
00:45:50,690 --> 00:45:58,259
list is shorter. Okay, now for the last bit\n
372
00:45:58,259 --> 00:46:04,130
nodes from a doubly linked list, which is\n
373
00:46:04,130 --> 00:46:11,430
from singly linked lists. The idea is the\n
374
00:46:11,429 --> 00:46:18,239
But this time, we only need one pointer. I\n
375
00:46:18,239 --> 00:46:24,318
node in a doubly linked list has a reference\n
376
00:46:24,318 --> 00:46:28,768
maintain it like we did with the singly linked\nlist.
377
00:46:28,768 --> 00:46:39,318
So let's start trav at the very beginning\n
378
00:46:39,318 --> 00:46:47,699
reached nine. And we want to remove it from\n
379
00:46:47,699 --> 00:46:56,528
be equal to 15. We have access to four and 15\n
380
00:46:56,528 --> 00:47:06,498
pointer respectively. Similarly, set 15's previous\n
381
00:47:06,498 --> 00:47:15,189
now red, meaning it is ready to be removed.\n
382
00:47:15,190 --> 00:47:23,019
out the doubly linked lists, we see that it\n
383
00:47:23,018 --> 00:47:31,769
complexity analysis on linked lists. How good\n
384
00:47:31,769 --> 00:47:38,429
we have singly linked lists. And on the right\n
385
00:47:38,429 --> 00:47:44,328
in a linked list is linear in the worst case,\n
386
00:47:44,329 --> 00:47:52,420
not there, we have to traverse all of the\n
387
00:47:52,420 --> 00:47:58,940
is constant time, because we always maintain\n
388
00:47:58,940 --> 00:48:09,858
hence we can add it in constant time. Similarly\n
389
00:48:09,858 --> 00:48:14,818
linked lists, and a doubly linked list is\n
390
00:48:14,818 --> 00:48:22,358
a reference to it, so we can just move it\n
391
00:48:22,358 --> 00:48:27,768
removing from the tail is another story. It\n
392
00:48:27,768 --> 00:48:35,498
a singly linked list. Can you think of why?\n
393
00:48:35,498 --> 00:48:43,549
tail in a singly linked list, we can remove\n
394
00:48:43,550 --> 00:48:50,568
value of what the tail is. So we had to seek\n
395
00:48:50,568 --> 00:48:57,869
new tail is equal to. Doubly linked lists, however,\n
396
00:48:57,869 --> 00:49:04,329
a pointer to the previous node. So we can\n
397
00:49:04,329 --> 00:49:10,450
finally, removing somewhere in the middle\n
398
00:49:10,449 --> 00:49:15,189
we would need to seek through n minus one\n
399
00:49:16,400 --> 00:49:23,088
to look at some doubly linked list source\n
400
00:49:23,088 --> 00:49:30,998
list series. So the link for the source code\n
401
00:49:30,998 --> 00:49:38,329
williamfiset slash data dash structures.\n
402
00:49:38,329 --> 00:49:43,828
source code helpful so that others may also\n
403
00:49:43,829 --> 00:49:49,880
first part of the linked list series before continuing.\n
404
00:49:49,880 --> 00:49:59,818
at the implementation of a doubly linked list\n
405
00:49:59,818 --> 00:50:06,380
a few instance variables. So we are keeping\n
406
00:50:06,380 --> 00:50:13,160
as what the head and the tail currently are.\n
407
00:50:13,159 --> 00:50:20,489
meaning the linked list is empty. Furthermore, we\n
408
00:50:20,489 --> 00:50:27,259
extensively, because it contains the data\n
409
00:50:27,259 --> 00:50:33,880
and next pointers for each node since this\n
410
00:50:33,880 --> 00:50:39,809
for the node, namely the data and the previous\n
411
00:50:39,809 --> 00:50:47,150
both; otherwise, we can't do much. So this first\n
412
00:50:47,150 --> 00:50:54,079
list, it does so in linear time by going through\n
413
00:50:54,079 --> 00:51:02,269
time deallocates them by setting them equal\n
414
00:51:02,268 --> 00:51:08,838
head, we loop while the traverser is not equal\n
415
00:51:08,838 --> 00:51:13,599
and then we do our deallocation business.\n
416
00:51:13,599 --> 00:51:20,760
and reset the head and tail. Perfect. These\n
417
00:51:20,760 --> 00:51:30,720
get the size and check if the size of our\n
418
00:51:30,719 --> 00:51:36,419
a public method to add an element by default,\n
419
00:51:36,420 --> 00:51:44,639
list or at the tail. But I also support adding\n
420
00:51:44,639 --> 00:51:51,808
do we do this, if this is the first element,\n
421
00:51:51,809 --> 00:52:02,739
and the tail to be equal to the new node,\n
422
00:52:02,739 --> 00:52:14,130
and next pointers set to null. Otherwise, if\n
423
00:52:14,130 --> 00:52:22,729
previous pointer is equal to this new node.\n
424
00:52:22,728 --> 00:52:29,728
to be whatever head's previous is. So we back up\n
425
00:52:29,728 --> 00:52:36,998
forget to increment the size. A very similar\n
426
00:52:36,998 --> 00:52:42,608
linked list, except we're moving the tail\npointer around.
427
00:52:44,009 --> 00:52:50,960
move to peek. So peeking is just looking at\n
428
00:52:50,960 --> 00:52:56,329
linked list or at the end of the linked list.\n
429
00:52:56,329 --> 00:53:04,650
is empty, because it doesn't make sense to peek\n
430
00:53:04,650 --> 00:53:11,200
more complex method, which is remove first.\n
431
00:53:11,199 --> 00:53:19,808
the linked list. So we can't do much if the\n
432
00:53:19,809 --> 00:53:24,680
we extract the data at the head, and then\n
433
00:53:24,679 --> 00:53:33,230
the size by one. So if the list is empty,\n
434
00:53:33,230 --> 00:53:42,009
the head and the tail are now null. Otherwise,\n
435
00:53:42,009 --> 00:53:49,719
that we just removed. This is especially important\n
436
00:53:49,719 --> 00:53:58,848
delete pointers, then at the end, we return\n
437
00:53:58,849 --> 00:54:05,470
except we're using the tail this time to remove\n
438
00:54:05,469 --> 00:54:15,929
head. Okay, and here's a generic method to\n
439
00:54:15,929 --> 00:54:23,058
this to private because the node class itself\n
440
00:54:23,059 --> 00:54:28,700
to the node. That's just something we're using\n
441
00:54:28,699 --> 00:54:36,528
to manage the list. So if the node that we're\n
442
00:54:36,528 --> 00:54:45,838
detect that and call our methods either remove\n
443
00:54:45,838 --> 00:54:52,108
somewhere in the middle of the linked list. And\n
444
00:54:52,108 --> 00:54:57,960
adjacent to our current node equal to each other.\n
445
00:54:57,960 --> 00:55:05,000
node. And of course, don't forget to clean\n
446
00:55:05,000 --> 00:55:10,088
have to temporarily store the data. Of course,\n
447
00:55:10,088 --> 00:55:18,068
deleted the node and the data is already gone.\n
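
The node removal just described can be sketched roughly as follows. This is a minimal sketch, not the repository's actual code: the names MiniDoublyLinkedList, Node, addLast, remove and unlink are illustrative assumptions. It stores the data before cleaning up the node, and it patches the neighbours' pointers so that they skip over the removed node.

```java
import java.util.Objects;

// Minimal doubly linked list sketch; all names here are hypothetical.
final class MiniDoublyLinkedList<T> {
  private static final class Node<T> {
    T data;
    Node<T> prev, next;
    Node(T data, Node<T> prev, Node<T> next) {
      this.data = data;
      this.prev = prev;
      this.next = next;
    }
  }

  private Node<T> head, tail;
  private int size;

  int size() { return size; }

  // Append at the tail in constant time.
  void addLast(T elem) {
    Node<T> node = new Node<>(elem, tail, null);
    if (tail == null) head = node; else tail.next = node;
    tail = node;
    size++;
  }

  // Linear scan for the first matching value (null values are supported).
  boolean remove(Object elem) {
    for (Node<T> trav = head; trav != null; trav = trav.next) {
      if (Objects.equals(trav.data, elem)) {
        unlink(trav);
        return true;
      }
    }
    return false;
  }

  // Unlink a node: handles the head, the tail and the middle case.
  private T unlink(Node<T> node) {
    T data = node.data; // store the data before cleaning up the node
    if (node.prev == null) head = node.next; else node.prev.next = node.next;
    if (node.next == null) tail = node.prev; else node.next.prev = node.prev;
    node.data = null;
    node.prev = node.next = null;
    size--;
    return data;
  }

  @Override public String toString() {
    StringBuilder sb = new StringBuilder("[");
    for (Node<T> trav = head; trav != null; trav = trav.next) {
      sb.append(trav.data);
      if (trav.next != null) sb.append(", ");
    }
    return sb.append("]").toString();
  }

  public static void main(String[] args) {
    MiniDoublyLinkedList<Integer> list = new MiniDoublyLinkedList<>();
    for (int x : new int[] {4, 9, 15}) list.addLast(x);
    list.remove(9); // 4's next now points at 15, and 15's previous at 4
    System.out.println(list);
  }
}
```

The values 4, 9 and 15 mirror the removal example from earlier in the video; everything else is made up for illustration.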
448
00:55:18,068 --> 00:55:25,920
particular index in our linked list. Yes,\n
449
00:55:25,920 --> 00:55:32,829
not explicitly indexed, we can pretend that\n
450
00:55:32,829 --> 00:55:40,068
valid, otherwise throw an illegal argument\n
451
00:55:40,068 --> 00:55:45,469
bit smarter than just naively going through\n
452
00:55:45,469 --> 00:55:51,429
from the front of the linked list to find\n
453
00:55:51,429 --> 00:56:00,169
the index is closer to the front or to the\n
454
00:56:00,170 --> 00:56:04,568
So for the Remove method, we want to be able\n
455
00:56:04,568 --> 00:56:13,690
list, which is object. So we're going to also\n
456
00:56:13,690 --> 00:56:20,690
someone decided that the value of the node\n
457
00:56:20,690 --> 00:56:27,229
special case. Otherwise, we traverse through\n
458
00:56:27,228 --> 00:56:34,338
and then remove that node and return true\n
459
00:56:34,338 --> 00:56:40,929
we want to remove. Otherwise, we return false\n
460
00:56:40,929 --> 00:56:48,199
for the element we want to remove, we use\n
461
00:56:48,199 --> 00:56:53,179
the element. If so, remove that node and return\ntrue.
462
00:56:54,710 --> 00:57:01,940
here we have a related method which is index\n
463
00:57:01,940 --> 00:57:08,950
remove value, but get whatever index this\n
464
00:57:08,949 --> 00:57:14,598
null. So even if our value is null, we'll just\n
465
00:57:14,599 --> 00:57:23,579
So again, traverse the linked list. Otherwise, search\n
466
00:57:23,579 --> 00:57:35,099
the index as we go. We can use the index of\n
467
00:57:35,099 --> 00:57:40,329
is contained within a linked list because\n
468
00:57:40,329 --> 00:57:47,250
found. Something that's useful sometimes is\n
469
00:57:47,250 --> 00:57:56,880
is also trivial to implement, just start a\n
470
00:57:56,880 --> 00:58:03,940
until you reach the end. Notice I'm not checking\n
471
00:58:03,940 --> 00:58:15,880
you want to, it's pretty easy to do that.\n
472
00:58:15,880 --> 00:58:23,568
the toString method to print a string or\n
473
00:58:23,568 --> 00:58:32,978
linked list. May I begin by saying that the\n
474
00:58:32,978 --> 00:58:40,710
data structure, one of my favorites. In fact,\n
475
00:58:40,710 --> 00:58:46,318
Part Two will consist of looking at a stack\n
476
00:58:46,318 --> 00:58:54,639
source code for how a stack is implemented\n
477
00:58:54,639 --> 00:59:00,818
that we'll be covering in this video as well\n
478
00:59:00,818 --> 00:59:08,548
about what is a stack and where is it used?\n
479
00:59:08,548 --> 00:59:16,338
of how to solve problems using stacks. Afterwards,\n
480
00:59:16,338 --> 00:59:23,190
internally and the time complexity associated\n
481
00:59:23,190 --> 00:59:31,519
some source code. Moving on to the discussion\n
482
00:59:31,518 --> 00:59:37,899
is a one ended linear data structure which\n
483
00:59:37,900 --> 00:59:46,219
primary operations, namely, push and pop.\n
484
00:59:46,219 --> 00:59:52,550
constructed. There is one data member again\n
485
00:59:52,550 --> 00:59:59,249
data member getting added to the stack. Notice\n
486
00:59:59,248 --> 01:00:06,558
block at the top of the stack. This is because\n
487
01:00:06,559 --> 01:00:16,309
added to the top of the pile. This behavior\n
488
01:00:16,309 --> 01:00:23,548
out. Let's look at a more detailed example\n
489
01:00:23,548 --> 01:00:32,099
and removed from a stack. So let's walk through\n
490
01:00:32,099 --> 01:00:40,360
on what we need to add and remove to the stack.\n
491
01:00:40,360 --> 01:00:49,068
the top element from the stack, which is Apple.\n
492
01:00:49,068 --> 01:00:56,298
onto the stack. So we add onion to the top\n
493
01:00:56,298 --> 01:01:05,268
celery onto the stack. Next is watermelon,\n
494
01:01:05,268 --> 01:01:10,118
says to pop so we remove the element at the\n
495
01:01:10,119 --> 01:01:17,219
just added. The next operation also a pop.\n
496
01:01:17,219 --> 01:01:24,849
is celery. And last operation push lettuce\n
497
01:01:24,849 --> 01:01:30,200
top of the stack. So as you can see, everything\n
498
01:01:30,199 --> 01:01:35,598
have access to anything else but the top of\n
499
01:01:36,778 --> 01:01:46,329
a stack works. So when and where is a stack\n
500
01:01:46,329 --> 01:01:52,859
everywhere. They're used in text editors to\n
501
01:01:52,858 --> 01:01:58,170
backwards or forwards. They're used in compilers\n
502
01:01:58,170 --> 01:02:05,739
matching braces, and in the right order. Stacks\n
503
01:02:05,739 --> 01:02:10,949
books, plates, and even games like the Tower\n
504
01:02:13,329 --> 01:02:17,859
stacks are also used behind the scenes to\n
505
01:02:17,858 --> 01:02:23,630
function calls. When a function returns it\n
506
01:02:23,630 --> 01:02:32,539
and rewinds to the next function that is on\n
507
01:02:32,539 --> 01:02:38,359
stacks all the time in programming and never\n
508
01:02:38,358 --> 01:02:43,900
stacks for us to perform a depth first search\n
509
01:02:43,900 --> 01:02:50,130
manually by maintaining your own stack, or\n
510
01:02:50,130 --> 01:02:58,440
stacks, as we have just discussed. Complexity\n
511
01:02:58,440 --> 01:03:04,579
assumes that you implemented a stack using\n
512
01:03:04,579 --> 01:03:10,339
because we have a reference to the top of\n
513
01:03:10,338 --> 01:03:19,039
goes for popping and peeking. Searching however,\n
514
01:03:19,039 --> 01:03:23,489
searching for isn't necessarily at the top\n
515
01:03:23,489 --> 01:03:31,268
elements in the stack, hence require a linear\n
516
01:03:31,268 --> 01:03:39,449
of a problem using stacks. Problem: given\n
517
01:03:39,449 --> 01:03:45,458
round brackets, square brackets, curly brackets\n
518
01:03:45,458 --> 01:03:52,288
So analyze the examples below to understand\n
519
01:03:52,289 --> 01:03:59,079
ones are invalid. So before I show you the\n
520
01:04:04,460 --> 01:04:10,499
in this first example, consider the following\n
521
01:04:10,498 --> 01:04:16,738
string from left to right, I will be displaying\n
522
01:04:16,739 --> 01:04:27,150
bracket. So let's begin. For every left bracket\n
523
01:04:27,150 --> 01:04:33,450
So this is the left square bracket that I\n
524
01:04:33,449 --> 01:04:42,798
this one on the stack. Same goes for the next\n
525
01:04:46,079 --> 01:04:55,280
this is a right square bracket. So we encountered\n
526
01:04:55,280 --> 01:05:01,630
checks. First we check if the stack is empty.\n
527
01:05:01,630 --> 01:05:08,420
if there are still things in the stack that\n
528
01:05:08,420 --> 01:05:15,740
is equal to the reversed current bracket.\n
529
01:05:15,739 --> 01:05:25,088
reversed bracket. So we are good. Next is\n
530
01:05:25,088 --> 01:05:32,338
empty. No, it isn't. So we're good. Is the\n
531
01:05:32,338 --> 01:05:40,190
bracket? Yes, it is. So let's keep going around\n
532
01:05:40,190 --> 01:05:48,349
A right bracket. Is the stack empty? No. Okay,\n
533
01:05:48,349 --> 01:05:57,999
equal to the reverse bracket? Yes. Okay. So\n
534
01:05:57,998 --> 01:06:04,471
stack empty? No. Okay, good. And there's the\n
535
01:06:04,471 --> 01:06:10,748
bracket. Yes. Okay, good. And now we're done\n
536
01:06:10,748 --> 01:06:17,018
the stack is empty. Now. Why is that? Well,\n
537
01:06:17,018 --> 01:06:23,838
sequence were left brackets, they would still\n
538
01:06:23,838 --> 01:06:31,989
So we can conclude that this bracket sequence\n
539
01:06:31,989 --> 01:06:39,568
example with another bracket sequence. So\n
540
01:06:39,568 --> 01:06:45,619
bracket is a left bracket, so we push onto\n
541
01:06:45,619 --> 01:06:51,880
bracket. So we push onto the stack. This next\n
542
01:06:51,880 --> 01:06:58,720
if the stack is empty. No, it's good. And\n
543
01:06:58,719 --> 01:07:05,449
reverse bracket? Yes, it is. This next bracket\n
544
01:07:05,449 --> 01:07:11,568
So we're good. And is the reverse bracket\n
545
01:07:11,568 --> 01:07:21,679
No, it isn't. So this bracket sequence is\n
546
01:07:21,679 --> 01:07:29,149
we just ran through. So if we let S be a\n
547
01:07:29,150 --> 01:07:36,410
string, we can get the reverse bracket for\n
548
01:07:36,409 --> 01:07:44,210
is a left bracket, push it onto the stack.\n
549
01:07:44,210 --> 01:07:50,179
And if the element at the top of the stack\n
550
01:07:50,179 --> 01:07:56,338
those conditions are true, then we return\n
551
01:07:56,338 --> 01:08:03,048
is empty or not. And if it is empty, then\n
552
01:08:03,048 --> 01:08:09,440
we do not. I want to take a moment and look\n
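
The bracket check just walked through can be sketched in Java as follows. This is a minimal sketch under stated assumptions: the names BracketCheck, isValid and reverse are illustrative, and any character that is not one of the six brackets is simply treated as invalid. Left brackets are pushed; for a right bracket, the stack must be non-empty and its top must equal the reversed bracket; at the end the stack must be empty.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper class for the bracket matching problem.
final class BracketCheck {
  // Returns the left bracket matching a right bracket, or 0 for any other character.
  private static char reverse(char c) {
    switch (c) {
      case ')': return '(';
      case ']': return '[';
      case '}': return '{';
      default: return 0;
    }
  }

  static boolean isValid(String brackets) {
    Deque<Character> stack = new ArrayDeque<>();
    for (char c : brackets.toCharArray()) {
      if (c == '(' || c == '[' || c == '{') {
        stack.push(c); // left bracket: push it onto the stack
      } else if (stack.isEmpty() || stack.pop() != reverse(c)) {
        return false; // nothing to match, or the top is not the reversed bracket
      }
    }
    return stack.isEmpty(); // leftover left brackets mean the sequence is invalid
  }

  public static void main(String[] args) {
    System.out.println(isValid("[[{}]()]"));
    System.out.println(isValid("[{]}"));
  }
}
```

The two strings in main are made-up inputs in the spirit of the two walkthroughs: the first is balanced, the second closes brackets in the wrong order.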
553
01:08:09,440 --> 01:08:14,920
amongst mathematicians and computer scientists.\n
554
01:08:14,920 --> 01:08:21,529
is played as follows. You start with a pile\n
555
01:08:21,529 --> 01:08:26,020
the objective of the game is to move all the\n
556
01:08:26,020 --> 01:08:33,480
this pile, and each move, you can move the\n
557
01:08:33,479 --> 01:08:37,658
a restriction that no disk be placed on top\nof
558
01:08:37,658 --> 01:08:44,848
a smaller disk. So we can think of each peg\n
559
01:08:44,849 --> 01:08:53,760
top element in a peg and placing it on another\n
560
01:08:53,760 --> 01:09:18,469
run. And you will see how each peg acts like\n
561
01:09:18,469 --> 01:09:25,380
So you just saw how transferring elements\n
562
01:09:25,380 --> 01:09:33,270
as popping a disk from one stack and pushing\n
563
01:09:33,270 --> 01:09:41,680
the disk you're placing on top is smaller.\n
564
01:09:41,680 --> 01:09:48,670
of three in the stack series. This is going\n
565
01:09:48,670 --> 01:09:57,449
a stack. So stacks are often implemented\n
566
01:09:57,448 --> 01:10:04,609
sometimes doubly linked lists. Here we'll cover\n
567
01:10:04,609 --> 01:10:10,269
linked list. Later on, we will look at\n
568
01:10:10,270 --> 01:10:18,751
using a doubly linked list. Okay, to begin\n
569
01:10:18,751 --> 01:10:26,980
our linked list, so we're going to point the\n
570
01:10:26,979 --> 01:10:35,679
is initially empty. Then the trick to creating\n
571
01:10:35,680 --> 01:10:43,270
the new elements before the head and not at\n
572
01:10:43,270 --> 01:10:50,480
pointing in the correct direction when we\n
573
01:10:50,479 --> 01:10:56,299
we will soon see, the next element however,\n
574
01:10:56,300 --> 01:11:02,310
let's do that. To create a new node, adjust\n
575
01:11:02,310 --> 01:11:09,719
then hook on the node's next pointer to where\n
576
01:11:09,719 --> 01:11:21,179
for five and also 13. Now let's have a look\n
577
01:11:21,179 --> 01:11:28,800
just move the head pointer to the next node\n
578
01:11:28,800 --> 01:11:35,639
the first node off the stack and set the node's\n
579
01:11:35,639 --> 01:11:41,460
up by the garbage collector if you're coding\n
580
01:11:41,460 --> 01:11:46,929
references pointing to it. If you're in another\n
581
01:11:46,929 --> 01:11:53,849
explicitly deallocate free memory yourself\n
582
01:11:53,849 --> 01:11:59,840
Or you will get memory leaks. Getting a memory\n
583
01:11:59,840 --> 01:12:06,329
kinds of memory leaks, especially if it's\n
584
01:12:06,329 --> 01:12:13,019
reusing. So keep watching out for that not\n
585
01:12:13,020 --> 01:12:19,900
that we will be covering. If you see in an\n
586
01:12:19,899 --> 01:12:27,618
up my memory, please point it out to me,\n
587
01:12:27,618 --> 01:12:34,769
so we can patch that. Okay, so we keep proceeding\n
588
01:12:34,770 --> 01:12:43,980
pointer down to the next node. Pop again,\n
589
01:12:43,979 --> 01:12:52,199
popping, we've reached the last node and the stack\n
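
The pushing and popping described here can be sketched as a stack on top of a singly linked list. This is a minimal sketch, assuming a hand-rolled node type; the names LinkedStackSketch and Node are illustrative and do not come from the repository. New nodes are inserted before the head, and popping moves the head to the next node and drops the old node's reference.

```java
import java.util.EmptyStackException;

// Hypothetical stack backed by a singly linked list.
final class LinkedStackSketch<T> {
  private static final class Node<T> {
    final T data;
    Node<T> next;
    Node(T data, Node<T> next) {
      this.data = data;
      this.next = next;
    }
  }

  private Node<T> head; // top of the stack; null means the stack is empty
  private int size;

  boolean isEmpty() { return head == null; }

  int size() { return size; }

  // Push: create a node whose next pointer is the old head, then move the head.
  void push(T elem) {
    head = new Node<>(elem, head);
    size++;
  }

  // Pop: move the head to the next node and return the old head's data.
  T pop() {
    if (isEmpty()) throw new EmptyStackException();
    Node<T> old = head;
    head = head.next;
    old.next = null; // drop the reference so the garbage collector can reclaim the node
    size--;
    return old.data;
  }

  T peek() {
    if (isEmpty()) throw new EmptyStackException();
    return head.data;
  }

  public static void main(String[] args) {
    LinkedStackSketch<Integer> stack = new LinkedStackSketch<>();
    stack.push(5);
    stack.push(13);
    System.out.println(stack.pop());  // the most recently pushed element
    System.out.println(stack.peek()); // the element underneath it
  }
}
```

Because only the head pointer is ever touched, push, pop and peek all run in constant time, which matches the complexity analysis from part one.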
590
01:12:52,199 --> 01:12:59,050
in the stack series videos. Today we'll be\n
591
01:12:59,050 --> 01:13:06,260
stack. So the source code can be found on\n
592
01:13:06,260 --> 01:13:13,550
structures. Make sure you understood part\n
593
01:13:13,550 --> 01:13:19,840
So you actually know how we implement a stack\n
594
01:13:19,840 --> 01:13:25,400
video series, and the implementation of the\n
595
01:13:25,399 --> 01:13:31,859
to you then please star this repository on\n
596
01:13:31,859 --> 01:13:40,049
finding it as well. Here we are in the stack\n
597
01:13:40,050 --> 01:13:48,239
Java programming language. So the first thing\n
598
01:13:48,238 --> 01:13:56,109
variable of a linked list. This is the linked\n
599
01:13:56,109 --> 01:14:02,000
linked list provided by Java. This is a little\n
600
01:14:02,000 --> 01:14:09,649
Java dot util that I will be using today,\n
601
01:14:10,770 --> 01:14:17,090
videos, this is just for portability, in case\n
602
01:14:17,090 --> 01:14:22,170
So we have two constructors, we can create\n
603
01:14:22,170 --> 01:14:29,800
one initial element. This is occasionally\n
604
01:14:29,800 --> 01:14:36,940
the stack. So to do that, we return\n
605
01:14:36,939 --> 01:14:42,698
the elements of our stack easy. We also check\n
606
01:14:42,698 --> 01:14:51,460
is zero. So this next one is just push so\n
607
01:14:51,460 --> 01:14:59,501
append that element as the last element in\n
608
01:14:59,501 --> 01:15:07,780
also pop elements off the stack. So to do this,\n
609
01:15:07,779 --> 01:15:14,189
then we throw an empty stack exception because\n
610
01:15:14,189 --> 01:15:24,819
That doesn't make sense. Similarly, the same\n
611
01:15:24,819 --> 01:15:32,738
top element of the stack is, if the stack\n
612
01:15:32,738 --> 01:15:40,099
the last element of our list. And lastly,\n
613
01:15:40,100 --> 01:15:48,570
to iterate through our stack. This iterator\n
614
01:15:48,569 --> 01:15:54,359
supports concurrent modification errors. So\n
615
01:15:54,359 --> 01:16:00,880
that was the stack. It's only like 50 lines of\n
616
01:16:00,880 --> 01:16:06,010
one of the most useful data structures in\n
617
01:16:06,010 --> 01:16:13,130
one of three in the queue series. So the outline\n
618
01:16:13,130 --> 01:16:18,090
going to begin by talking about queues and\n
619
01:16:18,090 --> 01:16:24,110
some complexity analysis concerning queues.\n
620
01:16:24,109 --> 01:16:29,529
of enqueuing and dequeuing elements from a\n
621
01:16:29,529 --> 01:16:37,800
very end in the last video. So a discussion\n
622
01:16:37,800 --> 01:16:43,110
So below you can see an image of a queue.\n
623
01:16:43,109 --> 01:16:50,089
that models a real world queue. Having two\n
624
01:16:50,090 --> 01:16:59,739
and dequeuing. So every queue has a front and\n
625
01:16:59,738 --> 01:17:05,299
back and remove through the front. Adding\n
626
01:17:05,300 --> 01:17:12,989
enqueuing, and removing elements from the\n
627
01:17:12,988 --> 01:17:21,888
there's a bit of terminology surrounding queues\n
628
01:17:21,889 --> 01:17:28,560
or what we refer to as enqueuing and dequeuing,\n
629
01:17:28,560 --> 01:17:37,119
So enqueuing is also called adding, but also\n
630
01:17:37,118 --> 01:17:42,389
we're talking about dequeuing. So this is\n
631
01:17:42,390 --> 01:17:48,780
queue. This is also called polling elements.\n
632
01:17:48,779 --> 01:17:55,109
as removing an element from the queue. But\n
633
01:17:55,109 --> 01:18:02,880
some ambiguity, did they mean removing from\n
634
01:18:02,880 --> 01:18:08,969
the entire queue? Make note that if I say\n
635
01:18:08,969 --> 01:18:16,989
from the front of the queue unless I say otherwise.\n
636
01:18:16,989 --> 01:18:23,399
in detail. However, first, notice I have labeled\n
637
01:18:23,399 --> 01:18:29,359
where I'm going to be enqueuing and dequeuing\n
638
01:18:29,359 --> 01:18:36,449
instruction says enqueue 12, so we add 12\n
639
01:18:36,449 --> 01:18:44,720
the first element from the front of the queue,\n
640
01:18:44,720 --> 01:18:52,320
we removed minus one from the front of the\n
641
01:18:55,359 --> 01:19:05,009
dequeue, so remove the front element being 33.\n
642
01:19:08,039 --> 01:19:13,380
So now that we know what a queue is, where\n
643
01:19:13,380 --> 01:19:19,050
Well, a classic example of where queues get\n
644
01:19:19,050 --> 01:19:26,989
waiting in line at a movie theater or in the\n
645
01:19:26,988 --> 01:19:33,919
ever been to say McDonald's, where all the\n
646
01:19:33,920 --> 01:19:40,340
freed, the next person in line gets to order\n
647
01:19:40,340 --> 01:19:47,170
also be really useful if you have a sequence\n
648
01:19:47,170 --> 01:19:54,800
of say, the x most recent elements, while\n
649
01:19:54,800 --> 01:20:03,650
your queue gets larger than x elements, just\n
650
01:20:03,649 --> 01:20:11,039
in server management. So, suppose for a moment\n
651
01:20:11,039 --> 01:20:18,198
for requests from people to use your website,\n
652
01:20:18,198 --> 01:20:24,839
serve up to five people. But no more. If 12\n
653
01:20:24,840 --> 01:20:30,949
you're not going to be able to process all\n
654
01:20:30,948 --> 01:20:36,118
is you process the five that you're able to,\n
655
01:20:36,118 --> 01:20:43,408
queue waiting to be served. And whenever you\n
656
01:20:43,408 --> 01:20:49,179
next request, and then you start processing\n
657
01:20:49,179 --> 01:20:53,920
While you're doing this, more requests come\n
658
01:20:53,920 --> 01:21:01,539
add them to the end of the queue. Queues are\n
659
01:21:01,539 --> 01:21:05,380
first search traversal on a graph, which is\n
660
01:21:05,380 --> 01:21:14,630
this example in the next video. All right\n
661
01:21:14,630 --> 01:21:19,510
So as we're seeing, it's pretty obvious that\n
662
01:21:19,510 --> 01:21:25,010
time. There's also another operation on a\n
663
01:21:25,010 --> 01:21:30,760
peeking. Peeking means that we're looking\n
664
01:21:30,760 --> 01:21:37,489
removing it; this is also constant time.\n
665
01:21:37,488 --> 01:21:44,019
within the queue, is linear time since we\n
666
01:21:44,020 --> 01:21:50,139
the elements. There's also element removal\n
667
01:21:50,139 --> 01:21:56,539
or polling, but actually removing an element\n
668
01:21:56,539 --> 01:22:03,389
linear time, since we would have to scan through\n
669
01:22:03,389 --> 01:22:08,590
this video, we're going to have a look at\n
670
01:22:08,590 --> 01:22:14,369
search. And then we're going to look at the\n
671
01:22:14,369 --> 01:22:22,948
and dequeuing elements works. Okay, onto the\n
672
01:22:22,948 --> 01:22:29,678
search is an operation we can do on the graph\n
673
01:22:29,679 --> 01:22:35,679
what I mean, when I say graph, I mean a network\n
674
01:22:35,679 --> 01:22:41,170
like that. But first, I should explain the\n
675
01:22:41,170 --> 01:22:47,279
search. The objective is to start at a node and\n
676
01:22:47,279 --> 01:22:54,090
all the neighbors of the starting node, and\n
677
01:22:54,090 --> 01:22:59,670
node you visited and then all the neighbors\n
678
01:22:59,670 --> 01:23:07,559
so forth, expanding through all the neighbors\n
679
01:23:07,559 --> 01:23:14,929
of the breadth first search as expanding the\n
680
01:23:14,929 --> 01:23:23,260
as you go on. So let's begin our breadth first\n
681
01:23:23,260 --> 01:23:32,119
node zero as yellow and put it in the frontier\n
682
01:23:32,118 --> 01:23:38,380
visit all the neighbors of zero being one\n
683
01:23:38,380 --> 01:23:45,631
then we visit all the neighbors of one and nine\n
684
01:23:45,631 --> 01:23:51,410
seven, and visit all the neighbors of seven.\n
685
01:23:51,409 --> 01:23:59,689
here, and now visit all the neighbors of the\nyellow nodes.
686
01:23:59,689 --> 01:24:06,149
And now we're done with our breadth first search\n
687
01:24:06,149 --> 01:24:14,198
frontier. Notice that there's 12 that is the\n
688
01:24:14,198 --> 01:24:20,169
island all by itself. So we are not able to\n
689
01:24:20,170 --> 01:24:26,719
is fine. Suppose you want to actually code\n
690
01:24:26,719 --> 01:24:36,529
done? Well, the idea is to use a queue. So\n
691
01:24:36,529 --> 01:24:43,639
And then we mark the starting node as visited.\n
692
01:24:43,639 --> 01:24:52,819
an element from our queue, or dequeuing. And\n
693
01:24:52,819 --> 01:24:58,799
dequeued, if the neighbor has not been visited\n
694
01:24:58,800 --> 01:25:07,429
to the queue. So now we have a way of processing\n
695
01:25:07,429 --> 01:25:15,658
search order. Really, really useful, very,\n
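
The queue-based breadth first search just outlined can be sketched like this. It is a minimal sketch, assuming the graph is given as adjacency lists indexed by node number; the names BfsSketch, bfs and sampleGraph, and the small five-node graph, are illustrative assumptions and not the graph from the slides. The starting node is enqueued and marked as visited, then each dequeued node enqueues its unvisited neighbours.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.List;

// Hypothetical breadth first search over an adjacency-list graph.
final class BfsSketch {
  // Returns the nodes reachable from start, in the order they are dequeued.
  static List<Integer> bfs(List<List<Integer>> graph, int start) {
    boolean[] visited = new boolean[graph.size()];
    Deque<Integer> queue = new ArrayDeque<>();
    List<Integer> order = new ArrayList<>();

    queue.offer(start);    // enqueue the starting node
    visited[start] = true; // and mark it as visited

    while (!queue.isEmpty()) {
      int node = queue.poll(); // dequeue from the front
      order.add(node);
      for (int neighbor : graph.get(node)) {
        if (!visited[neighbor]) {
          visited[neighbor] = true;
          queue.offer(neighbor); // enqueue at the back
        }
      }
    }
    return order;
  }

  // Made-up undirected graph: edges 0-1, 0-2, 1-3; node 4 is isolated.
  static List<List<Integer>> sampleGraph() {
    return Arrays.asList(
        Arrays.asList(1, 2),
        Arrays.asList(0, 3),
        Arrays.asList(0),
        Arrays.asList(1),
        Collections.<Integer>emptyList());
  }

  public static void main(String[] args) {
    System.out.println(bfs(sampleGraph(), 0));
  }
}
```

Node 4 in the sample graph plays the same role as the lonely node 12 in the walkthrough: it is never enqueued because no visited node has it as a neighbour.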
696
01:25:15,658 --> 01:25:23,219
now let's look at implementation of queues.\n
697
01:25:23,219 --> 01:25:29,439
out that you can implement the queue abstract\n
698
01:25:29,439 --> 01:25:36,219
the most popular methods are to either use\n
699
01:25:36,219 --> 01:25:42,800
lists. If you're using an array, you have\n
700
01:25:44,319 --> 01:25:48,960
array, if it's a dynamic array, then you'll\n
701
01:25:50,710 --> 01:25:54,730
a singly linked list and the source code,\n
702
01:25:54,729 --> 01:26:01,439
tuned for that. In a singly linked list, we're\n
703
01:26:01,439 --> 01:26:11,089
So initially, they're both null. But as we enqueue\n
704
01:26:11,090 --> 01:26:16,630
so nothing really interesting is going on\n
705
01:26:16,630 --> 01:26:23,109
can see that we're pushing the tail pointer\n
706
01:26:23,109 --> 01:26:32,170
the tail pointer point to the next node. Now\n
707
01:26:32,170 --> 01:26:38,279
the tail forward, we're going to be pushing\n
708
01:26:38,279 --> 01:26:44,050
one, and then the element that was left over\n
709
01:26:44,050 --> 01:26:50,159
user. So when we push the head pointer\n
710
01:26:50,158 --> 01:26:55,399
so that it can be picked up by the garbage\n
711
01:26:55,399 --> 01:27:00,799
in another programming language, which requires\n
712
01:27:00,800 --> 01:27:08,210
yourself like C or C++, now's the time to\n
713
01:27:08,210 --> 01:27:16,448
we're just pushing the head forward and forward\n
714
01:27:16,448 --> 01:27:22,219
of elements, then remove them all then the\n
715
01:27:22,219 --> 01:27:31,789
is where we started. All right, now it's time\n
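
The head and tail bookkeeping for a queue on a singly linked list can be sketched as follows. This is a minimal sketch with illustrative names (LinkedQueueSketch, Node, offer, poll, peek); it is not the java.util.LinkedList-based code shown in the video. Enqueuing pushes the tail pointer forward, dequeuing pushes the head pointer forward, and both pointers end up null again once the queue is emptied.

```java
import java.util.NoSuchElementException;

// Hypothetical queue backed by a singly linked list with head and tail pointers.
final class LinkedQueueSketch<T> {
  private static final class Node<T> {
    final T data;
    Node<T> next;
    Node(T data) { this.data = data; }
  }

  private Node<T> head, tail; // both null while the queue is empty
  private int size;

  boolean isEmpty() { return size == 0; }

  int size() { return size; }

  // Enqueue: link the new node after the tail, then move the tail forward.
  void offer(T elem) {
    Node<T> node = new Node<>(elem);
    if (tail == null) head = node; else tail.next = node;
    tail = node;
    size++;
  }

  // Dequeue: move the head forward and return the old head's data.
  T poll() {
    if (isEmpty()) throw new NoSuchElementException("queue is empty");
    Node<T> old = head;
    head = head.next;
    if (head == null) tail = null; // the queue is empty again
    old.next = null;               // let the garbage collector reclaim the node
    size--;
    return old.data;
  }

  T peek() {
    if (isEmpty()) throw new NoSuchElementException("queue is empty");
    return head.data;
  }

  public static void main(String[] args) {
    LinkedQueueSketch<Integer> queue = new LinkedQueueSketch<>();
    queue.offer(12);
    queue.offer(7);
    System.out.println(queue.poll()); // first in, first out
    System.out.println(queue.peek());
  }
}
```

Since offer only touches the tail and poll only touches the head, both stay constant time no matter how many elements are in the queue.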
716
01:27:31,789 --> 01:27:37,948
So I implemented a queue and you can find\n
717
01:27:37,948 --> 01:27:47,808
slash my user name slash data dash structures.\n
718
01:27:47,809 --> 01:27:55,320
parts one and two from the queue series before\n
719
01:27:55,319 --> 01:28:03,380
some source code for a queue. So this source\n
720
01:28:03,380 --> 01:28:09,828
although you can probably translate it into\n
721
01:28:09,828 --> 01:28:17,009
the first thing to remark is I have an instance\n
722
01:28:17,010 --> 01:28:24,760
Java's implementation of a doubly linked\n
723
01:28:24,760 --> 01:28:33,930
as you'll see the queue and the stack implementations\n
724
01:28:33,930 --> 01:28:40,920
constructors, one to create just an empty queue.\n
725
01:28:40,920 --> 01:28:47,670
a queue but with a first element. In fact,\n
726
01:28:47,670 --> 01:28:56,210
we might want to allow null elements. So\n
727
01:28:56,210 --> 01:29:00,149
method is the size, it just gets the size\n
728
01:29:00,149 --> 01:29:08,210
if the linked list is empty. Those are both\n
729
01:29:08,210 --> 01:29:14,618
method is the peek method. The peek method\n
730
01:29:14,618 --> 01:29:23,000
the queue, but it will throw an error if your\n
731
01:29:23,000 --> 01:29:31,090
when your queue is empty. Similarly, for poll,\n
732
01:29:31,090 --> 01:29:41,510
the queue, but unlike peek, will actually remove\n
733
01:29:41,510 --> 01:29:50,170
down a little bit, I have offer which adds\n
734
01:29:50,170 --> 01:29:55,949
I am allowing for null elements. So if you don't\n
735
01:29:55,948 --> 01:30:04,848
throw an error or something. So poll removes\n
736
01:30:04,849 --> 01:30:11,840
to the back. So remove first and add last.\n
737
01:30:11,840 --> 01:30:18,889
in case you want to be able to iterate through\n
738
01:30:18,889 --> 01:30:26,579
and very simple implementation just under\n
739
01:30:26,579 --> 01:30:30,260
Although there are faster ways of creating\n
740
01:30:30,260 --> 01:30:39,780
The idea with arrays, especially static arrays,\n
741
01:30:39,779 --> 01:30:47,139
that will be in your queue at any given time,\n
742
01:30:47,139 --> 01:30:54,599
size and have pointers to the front and the\n
743
01:30:54,599 --> 01:31:03,659
remove elements based on the relative position\n
744
01:31:03,658 --> 01:31:10,939
where you're running off the edge of your\n
745
01:31:10,939 --> 01:31:16,559
of the array and keep processing elements\n
746
01:31:16,560 --> 01:31:24,969
maintain references to the next node, such\n
747
01:31:24,969 --> 01:31:33,059
your homework is to create a static array\n
748
01:31:33,059 --> 01:31:39,949
everything to do with priority queues from\n
749
01:31:39,948 --> 01:31:45,529
And towards the end, we'll also have a look\n
750
01:31:45,529 --> 01:31:52,090
queue stuff. We're also going to talk about\n
751
01:31:52,090 --> 01:32:00,159
although not the same. So the outline for\n
752
01:32:00,158 --> 01:32:06,170
start with the basics talking about what are\n
753
01:32:06,170 --> 01:32:10,859
then we'll move on to some common operations\n
754
01:32:10,859 --> 01:32:16,699
how we can turn min priority queues into max\n
755
01:32:16,699 --> 01:32:23,019
analysis. And we'll talk about common ways\n
756
01:32:23,020 --> 01:32:27,820
people think heaps are the only way we can\n
757
01:32:27,819 --> 01:32:34,658
queues somehow are heaps, I want to dispel\n
758
01:32:34,658 --> 01:32:40,348
some great detail about how to implement the\n
759
01:32:40,349 --> 01:32:48,310
we'll look at methods of sinking and swimming\n
760
01:32:48,310 --> 01:32:55,310
used to get and shuffle around elements in\n
761
01:32:55,310 --> 01:33:04,699
explanation, I also go over how to pull and\n
762
01:33:04,698 --> 01:33:12,638
let's get started. discussion and examples.\n
763
01:33:12,639 --> 01:33:19,630
priority queue series. So what is a priority\n
764
01:33:19,630 --> 01:33:25,809
type that operates similar to a normal queue\n
765
01:33:25,809 --> 01:33:34,360
a certain priority. So elements with a higher\n
766
01:33:34,359 --> 01:33:42,649
As a side note, I'd like to remark that priority\n
767
01:33:42,649 --> 01:33:48,379
meaning that the data we insert into the priority\n
768
01:33:48,380 --> 01:33:54,270
from least to greatest or greatest to least. This\n
769
01:33:54,270 --> 01:34:01,380
elements. Okay, let's go into an example.\n
770
01:34:01,380 --> 01:34:06,170
inserted into a priority queue on the right,\n
771
01:34:06,170 --> 01:34:12,840
such that we want to order them from least\n
772
01:34:12,840 --> 01:34:21,630
higher priority than the bigger ones. So they\n
773
01:34:21,630 --> 01:34:27,868
Suppose we have now a list of instructions.\n
774
01:34:27,868 --> 01:34:35,328
the element that has the highest priority\n
775
01:34:35,328 --> 01:34:42,380
works. So if I say poll, then I remove the\n
776
01:34:42,380 --> 01:34:51,059
to be one. Now I say add two, so we add two\n
777
01:34:51,059 --> 01:34:56,510
of smallest elements in our priority queue,\n
778
01:34:56,510 --> 01:35:06,619
Next, we add four. Poll the smallest; this is\n
779
01:35:06,618 --> 01:35:12,479
pull the rest. So as I pull the rest, I'm\n
780
01:35:14,000 --> 01:35:21,109
priority queue. So it turns out that as we\n
781
01:35:21,109 --> 01:35:26,738
sequence. This is a coincidence. Actually,\n
782
01:35:26,738 --> 01:35:32,238
queue, we do not necessarily end up getting\n
783
01:35:32,238 --> 01:35:38,049
that the next number that is removed from\n
784
01:35:38,050 --> 01:35:45,260
that was currently in the priority queue.\n
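
The add and poll instructions above can be replayed with Java's built-in java.util.PriorityQueue, which is a min priority queue by default. This is a minimal sketch: the class name PriorityQueueSketch, the helpers drain and runExample, and the starting values 14, 3, 1, 8 and 22 are assumptions made for illustration, since the exact numbers on the slide are not in the transcript.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical replay of the poll/add instructions with java.util.PriorityQueue.
final class PriorityQueueSketch {
  // Polls every remaining element and returns them in the order they came out.
  static List<Integer> drain(PriorityQueue<Integer> pq) {
    List<Integer> out = new ArrayList<>();
    while (!pq.isEmpty()) out.add(pq.poll());
    return out;
  }

  static List<Integer> runExample() {
    // Assumed starting contents; smaller numbers have higher priority.
    PriorityQueue<Integer> pq = new PriorityQueue<>(Arrays.asList(14, 3, 1, 8, 22));
    List<Integer> polled = new ArrayList<>();
    polled.add(pq.poll());    // poll: removes 1, the smallest element
    pq.add(2);                // add 2
    polled.add(pq.poll());    // poll: removes the 2 that was just added
    pq.add(4);                // add 4
    polled.add(pq.poll());    // poll: removes 3
    polled.addAll(drain(pq)); // poll the rest
    return polled;
  }

  public static void main(String[] args) {
    System.out.println(runExample());
  }
}
```

Each poll only guarantees the smallest element that is currently in the queue. With these particular values the overall output happens to be sorted as well, but inserting a small value late would break that.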
785
01:35:45,260 --> 01:35:51,489
is the next smallest number to remove? As\n
786
01:35:51,488 --> 01:35:57,828
inside the priority queue and look and know\n
787
01:35:57,828 --> 01:36:04,210
was going to return. But fundamentally, how\n
788
01:36:04,210 --> 01:36:09,429
all the elements inside a priority queue before\n
789
01:36:09,429 --> 01:36:19,980
be highly ineffective. Instead, it uses what\n
790
01:36:19,979 --> 01:36:26,589
then is what is a heap? Usually I make up\n
791
01:36:26,590 --> 01:36:34,210
one from wiki. A heap is a tree based data\n
792
01:36:34,210 --> 01:36:41,800
also called the heap property. If A is a parent\n
793
01:36:41,800 --> 01:36:49,800
to B for all nodes A and B in the heap. What\n
794
01:36:49,800 --> 01:36:55,360
is always greater than or equal to the value\n
795
01:36:55,359 --> 01:37:01,269
way around that the value of the parent node\n
796
01:37:01,270 --> 01:37:08,810
child node for all nodes. This means we end\n
797
01:37:08,810 --> 01:37:15,620
and min heaps. So max heaps are the ones with\n
798
01:37:15,619 --> 01:37:22,328
its children. And the min heap is the opposite.\n
799
01:37:22,328 --> 01:37:28,090
heaps binary because every node has exactly\n
800
01:37:28,090 --> 01:37:37,369
or null values I have not drawn in. So why are\n
801
01:37:37,368 --> 01:37:44,738
underlying data structure for priority queues?\n
802
01:37:44,738 --> 01:37:49,829
called heaps, although this isn't technically\n
803
01:37:49,829 --> 01:37:57,010
is an abstract data type, meaning it can be\n
804
01:37:57,010 --> 01:38:03,429
Okay, we're going to play a little game, I'm\n
805
01:38:03,429 --> 01:38:08,949
need to tell me whether it is a heap or not.\n
806
01:38:08,948 --> 01:38:13,069
to determine whether it's a heap or not, you\n
807
01:38:13,069 --> 01:38:21,799
just going to give you a short moment here.\n
808
01:38:21,800 --> 01:38:38,110
and this tree. So it's not a heap. Is this\n
809
01:38:38,109 --> 01:38:44,569
it satisfies a heap invariant, and it is a\n
810
01:38:44,569 --> 01:38:50,000
what we call binomial heaps. Note that\n
811
01:38:50,000 --> 01:39:01,710
can have any number of branches. On to our\n
812
01:39:01,710 --> 01:39:07,969
a valid heap. Because even though this one\n
813
01:39:07,969 --> 01:39:15,340
free to move around the visual representation\n
814
01:39:15,340 --> 01:39:27,110
valid heap. How about this one? No, this structure\n
815
01:39:27,109 --> 01:39:40,849
the cycles. All heaps must be trees. What\n
816
01:39:40,850 --> 01:39:51,079
this one? Also a heap, because it satisfies the\n
817
01:39:51,078 --> 01:39:59,698
than or equal to or greater than or equal\n
818
01:39:59,698 --> 01:40:05,118
not a heap, because it does not\n
819
01:40:05,118 --> 01:40:14,679
do change the root to be 10, then we can satisfy\n
820
01:40:14,680 --> 01:40:27,761
Or rather, sorry, a max heap. So when and where\n
821
01:40:27,761 --> 01:40:34,020
of the most popular places we see priority\n
822
01:40:34,020 --> 01:40:42,250
to fetch the next nodes we explore. Priority\n
823
01:40:42,250 --> 01:40:49,810
a behavior in which you need to dynamically\n
824
01:40:49,810 --> 01:40:59,520
They're also used in Huffman encoding, which\n
825
01:40:59,520 --> 01:41:04,880
Many best-first search algorithms use priority\n
826
01:41:04,880 --> 01:41:11,140
grab the next most promising node in\n
827
01:41:11,140 --> 01:41:18,340
We also see priority queues in Prim's minimum\n
828
01:41:18,340 --> 01:41:24,119
So it seems priority queues are really important,\n
829
01:41:24,118 --> 01:41:32,469
is where we see them often. Okay, on to some\n
830
01:41:32,469 --> 01:41:40,630
as a binary heap. To begin with, there exists\n
831
01:41:40,630 --> 01:41:46,250
unordered array in linear time, we're not\n
832
01:41:46,250 --> 01:41:54,559
cool. And it forms the basis for the sorting\n
833
01:41:54,559 --> 01:42:00,850
rather, polling or removing an element\n
834
01:42:00,850 --> 01:42:06,480
time, because you need to restore the heap\n
835
01:42:06,479 --> 01:42:13,698
time. So peeking or seeing the value at the\n
836
01:42:13,698 --> 01:42:20,269
is really nice. Adding an element to our heap\n
837
01:42:20,270 --> 01:42:26,950
we possibly have to reshuffle the heap by bubbling\n
838
01:42:26,949 --> 01:42:36,109
Then there are a few more operations we can\n
839
01:42:36,109 --> 01:42:43,189
which is not the root element. So the naive\n
840
01:42:43,189 --> 01:42:50,349
do a linear scan to find the item's position\n
841
01:42:50,350 --> 01:42:57,260
is it can be extremely slow in some situations,\n
842
01:42:57,260 --> 01:43:02,389
don't do this. And it's not a problem, which\n
843
01:43:02,389 --> 01:43:10,029
lazy and do the linear scan solution. However,\n
844
01:43:10,029 --> 01:43:17,130
time complexity, which I will go over later\n
845
01:43:17,130 --> 01:43:23,020
series. So stay tuned for that. This method\n
846
01:43:23,020 --> 01:43:27,789
complexity to be logarithmic, which is super\n
847
01:43:27,788 --> 01:43:37,630
much as you are adding. Now the naive method\n
848
01:43:37,630 --> 01:43:43,590
heap is linear. Again, you just scan through\n
849
01:43:43,590 --> 01:43:51,340
table, we can reduce this to be a constant\n
850
01:43:51,340 --> 01:43:58,510
use the hash table implementation for the\n
851
01:43:58,510 --> 01:44:04,289
The downside, however, to using a hash table\n
852
01:44:04,288 --> 01:44:11,329
extra linear space factor. And it does add\n
853
01:44:11,329 --> 01:44:18,569
your table a lot during swaps. Today, we're\n
854
01:44:18,569 --> 01:44:24,939
into max priority queues. This is part two,\n
855
01:44:24,939 --> 01:44:29,799
may already be asking yourself, why is it\n
856
01:44:29,800 --> 01:44:36,400
priority queue into a max priority queue?\n
857
01:44:36,399 --> 01:44:42,519
library of most programming languages, they\n
858
01:44:42,520 --> 01:44:48,070
queue or a min priority queue. Usually it's\n
859
01:44:48,069 --> 01:44:53,639
by the smallest element first, but sometimes\n
860
01:44:53,639 --> 01:44:59,788
what we're programming. So how do we do this?\n
861
01:44:59,788 --> 01:45:07,750
queue to another type. Well, a hack we can\n
862
01:45:07,750 --> 01:45:10,050
in a priority queue must implement some sort
863
01:45:10,050 --> 01:45:17,220
of comparable interface, which we can simply\n
864
01:45:17,220 --> 01:45:22,110
heap. Let's look at some examples. Suppose\n
865
01:45:22,109 --> 01:45:28,349
consisting of elements that are on the right\n
866
01:45:28,350 --> 01:45:35,719
that min priority queue. So if x and y are\n
867
01:45:35,719 --> 01:45:41,989
than or equal to y, then x will come out of\n
868
01:45:41,988 --> 01:45:50,510
of this is x is greater than or equal to y.\n
869
01:45:50,511 --> 01:45:56,510
all these elements are still in the priority\n
870
01:45:56,510 --> 01:46:05,199
of x is less than or equal to y just x greater\n
871
01:46:05,198 --> 01:46:12,859
to y? Well, not for comparators. You see, if\n
872
01:46:12,859 --> 01:46:20,799
is negated, x should still equal y. So now\n
873
01:46:20,800 --> 01:46:30,061
elements out of the priority queue with our negated\n
874
01:46:30,060 --> 01:46:41,179
it's the greatest. Next come 11, 7, 5, 3. And\n
875
01:46:41,180 --> 01:46:48,060
to negate the number before you insert it\n
876
01:46:48,060 --> 01:46:55,360
is a hack specific to numbers, but it's pretty\n
877
01:46:55,359 --> 01:47:01,019
negate all the numbers inside a priority queue.\n
878
01:47:01,020 --> 01:47:09,489
is the smallest so should come out first.\n
879
01:47:09,488 --> 01:47:17,948
now we have to re-negate the data, and we get 13.\n
880
01:47:17,948 --> 01:47:26,779
So really, positive 11. And so on: minus seven,\n
881
01:47:26,779 --> 01:47:34,488
then re-negate the value to get it out of\n
882
01:47:34,488 --> 01:47:42,729
Okay, now let's look at another example\n
883
01:47:42,729 --> 01:47:49,419
for strings, which sorts strings in lexicographic.\n
884
01:47:49,420 --> 01:47:59,679
languages. Then let nlex be the negation\n
885
01:47:59,679 --> 01:48:10,800
two to be some non-null strings. So below,\n
886
01:48:10,800 --> 01:48:18,650
one if s one is less than s two lexicographically\n
887
01:48:18,649 --> 01:48:28,609
one if s one is greater than s two lexicographically.\n
888
01:48:28,609 --> 01:48:35,939
So just to break it down, lex sorts strings,\n
889
01:48:35,939 --> 01:48:41,979
in negating lex so that longer strings appear\n
890
01:48:41,979 --> 01:48:49,319
strings with letters at the end of the alphabet\n
891
01:48:49,319 --> 01:48:55,840
the beginning of the alphabet, I think I said\n
892
01:48:55,840 --> 01:49:04,069
heap into a max heap. Let's look at a concrete\n
893
01:49:04,069 --> 01:49:09,859
right to a priority queue with the lexicographic\n
894
01:49:09,859 --> 01:49:18,710
expect. First we get A, because it's the shortest\n
895
01:49:18,710 --> 01:49:27,179
closest to the start of the alphabet, then\n
896
01:49:27,179 --> 01:49:38,270
and XX. So now let's do the same thing with\n
897
01:49:38,270 --> 01:49:43,069
opposite sequence in reverse order.
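The negated-comparator trick described above can be sketched in Java with the standard `java.util.PriorityQueue`. This is a minimal sketch, not the code from the video: the class name `MaxPqSketch`, the helper `pollAll`, the sample values, and the shortest-first definition of `LEX` are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

class MaxPqSketch {
    // Assumed "lex" ordering: shorter strings first, then alphabetical order.
    static final Comparator<String> LEX = (a, b) -> a.length() != b.length()
            ? Integer.compare(a.length(), b.length())
            : a.compareTo(b);

    // Builds a priority queue with the given comparator and returns the poll order.
    static <T> List<T> pollAll(List<T> items, Comparator<T> cmp) {
        PriorityQueue<T> pq = new PriorityQueue<>(cmp);
        pq.addAll(items);
        List<T> out = new ArrayList<>();
        while (!pq.isEmpty()) out.add(pq.poll());
        return out;
    }

    public static void main(String[] args) {
        List<Integer> nums = Arrays.asList(3, 13, 5, 11, 7);
        Comparator<Integer> min = Comparator.naturalOrder();
        System.out.println(pollAll(nums, min));            // smallest first
        System.out.println(pollAll(nums, min.reversed())); // negated: largest first

        List<String> words = Arrays.asList("X", "A", "XX", "FZ", "AB", "XR");
        System.out.println(pollAll(words, LEX));            // lex order
        System.out.println(pollAll(words, LEX.reversed())); // nlex: reverse order
    }
}
```

`Comparator.reversed()` swaps the arguments rather than flipping the sign of the result, which avoids an overflow corner case when a comparator returns `Integer.MIN_VALUE`.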
898
01:49:43,069 --> 01:49:55,018
And then we get x x x r x f, Zed mi N A. So\n
899
01:49:55,019 --> 01:50:00,090
we're going to talk about adding elements\n
900
01:50:00,090 --> 01:50:07,840
the priority queue series. We'll get to adding\n
901
01:50:07,840 --> 01:50:13,810
there is some important terminology and concepts\n
902
01:50:13,810 --> 01:50:24,889
prior to adding elements effectively to our priority\n
903
01:50:24,889 --> 01:50:31,060
a priority queue is to use some kind of heap.\n
904
01:50:31,060 --> 01:50:37,579
which give us the best possible time complexity\n
905
01:50:37,578 --> 01:50:44,009
a priority queue. However, I want to make\n
906
01:50:44,010 --> 01:50:49,480
A priority queue is an abstract data type\n
907
01:50:49,479 --> 01:50:56,359
should have. The heap just lets us actually\n
908
01:50:56,359 --> 01:51:02,559
could use an unordered list to achieve the\n
909
01:51:02,560 --> 01:51:09,610
But this would not give us the best possible\n
910
01:51:09,609 --> 01:51:15,639
are many different types of heaps including\n
911
01:51:15,639 --> 01:51:26,550
pairing heaps, and so on. But for simplicity,\n
912
01:51:26,550 --> 01:51:33,969
heap is a binary tree that supports the heap\n
913
01:51:33,969 --> 01:51:40,880
exactly two children. So the following structure\n
914
01:51:40,880 --> 01:51:47,510
property that every parent's value is greater\n
915
01:51:47,510 --> 01:51:54,489
has exactly two children. Well, no, you may\n
916
01:51:54,489 --> 01:52:00,429
leaves don't have children. Well, actually,\n
917
01:52:00,429 --> 01:52:09,649
children in gray. But for simplicity, I won't\n
918
01:52:09,649 --> 01:52:17,848
Okay. The next important bit of terminology,\n
919
01:52:17,849 --> 01:52:26,179
tree property. The complete binary tree property\n
920
01:52:26,179 --> 01:52:35,480
the last is completely filled, and that all\n
921
01:52:35,479 --> 01:52:44,029
binary tree. As you will see, when we insert\n
922
01:52:44,029 --> 01:52:52,509
row, as far left to meet this complete binary\n
923
01:52:52,510 --> 01:52:59,670
tree property is very, very important, because\n
924
01:52:59,670 --> 01:53:07,000
what the heap looks like, or what values are\n
925
01:53:07,000 --> 01:53:14,578
the hollow circle, and the next one will\n
926
01:53:14,578 --> 01:53:20,029
we fill up the row, at which point we need\n
927
01:53:20,029 --> 01:53:29,238
is very important. One last thing before\n
928
01:53:29,238 --> 01:53:36,589
a binary heap is that we need to understand how\n
929
01:53:36,590 --> 01:53:44,989
there is a canonical way of doing this, which\n
930
01:53:44,988 --> 01:53:54,189
is very convenient, actually, because when\n
931
01:53:54,189 --> 01:54:02,939
the insertion position is just the last position\n
932
01:54:02,939 --> 01:54:07,889
way we can represent the heap, we can also\n
933
01:54:07,890 --> 01:54:16,219
and recursively add and remove nodes as needed.\n
934
01:54:16,219 --> 01:54:24,019
and also very, very fast. So on the left is\n
935
01:54:24,019 --> 01:54:30,489
the position of each node in the array. And\n
936
01:54:30,488 --> 01:54:36,198
as you read elements in the array from left\n
937
01:54:36,198 --> 01:54:39,678
the heap, one layer at a time.
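To make the layer-by-layer layout concrete, here is a minimal sketch that splits a heap's backing array into tree levels. The class name `HeapLayoutSketch`, the helper names, and the sample array are hypothetical; the index helpers use the usual zero-based convention (left child at 2i + 1, right child at 2i + 2).

```java
import java.util.ArrayList;
import java.util.List;

class HeapLayoutSketch {
    // Zero-based index arithmetic for a binary heap stored in an array.
    static int left(int i)   { return 2 * i + 1; }
    static int right(int i)  { return 2 * i + 2; }
    static int parent(int i) { return (i - 1) / 2; }

    // Reads the array left to right and groups values by tree level:
    // level 0 has 1 slot, level 1 has 2 slots, level 2 has 4 slots, and so on.
    static List<List<Integer>> levels(int[] heap) {
        List<List<Integer>> out = new ArrayList<>();
        int start = 0, width = 1;
        while (start < heap.length) {
            List<Integer> level = new ArrayList<>();
            int end = Math.min(start + width, heap.length);
            for (int i = start; i < end; i++) level.add(heap[i]);
            out.add(level);
            start += width;
            width *= 2;
        }
        return out;
    }

    public static void main(String[] args) {
        int[] heap = {9, 8, 7, 6, 5, 1, 2}; // sample max heap, values are illustrative
        System.out.println(levels(heap));
        System.out.println("children of index 2: " + left(2) + " and " + right(2));
    }
}
```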
938
01:54:39,679 --> 01:54:46,880
So if we're at node nine, which is index zero,\n
939
01:54:46,880 --> 01:54:54,190
position one. And as I keep moving along,\n
940
01:54:54,189 --> 01:55:02,118
the array going from left to right. So it's\n
941
01:55:02,118 --> 01:55:08,308
interesting property of storing a binary\n
942
01:55:08,309 --> 01:55:17,560
all the children and parent nodes. So suppose\n
943
01:55:17,560 --> 01:55:25,389
left child is going to be at index two times\n
944
01:55:25,389 --> 01:55:31,469
is going to be at two i plus two, this is\n
945
01:55:31,469 --> 01:55:40,059
just subtract one. So suppose we have a node\n
946
01:55:40,059 --> 01:55:46,820
its index is two. So by our formula, the left\n
947
01:55:46,819 --> 01:55:58,038
two, plus one, or five. If we look at\n
948
01:55:58,038 --> 01:56:03,658
look at the right child, we should expect\n
949
01:56:03,658 --> 01:56:10,488
look in our array, this gives us the value\n
950
01:56:10,488 --> 01:56:16,308
we need to manipulate the nodes in our array\n
951
01:56:16,309 --> 01:56:22,529
part five of the series, we will see that\n
952
01:56:22,529 --> 01:56:29,590
All right. So now we want to know, how do\n
953
01:56:29,590 --> 01:56:34,779
the heap invariant, because if we add nodes\n
954
01:56:34,779 --> 01:56:41,420
heap property, well, the binary heap is useless.\n
955
01:56:41,420 --> 01:56:47,029
some instructions, which tell us what values\n
956
01:56:47,029 --> 01:56:52,279
value is a one, which we can see, which should\n
957
01:56:52,279 --> 01:56:58,690
with a min heap. But instead of inserting\n
958
01:56:58,690 --> 01:57:04,809
put one at the bottom left of the tree in\n
959
01:57:04,809 --> 01:57:11,489
and perform what's called bubbling up, as my undergrad\n
960
01:57:11,488 --> 01:57:20,000
swimming or even sifting up all really cool\n
961
01:57:20,000 --> 01:57:26,210
one in the insertion position. But now we're\n
962
01:57:26,210 --> 01:57:32,800
is less than seven, but one is found below\n
963
01:57:32,800 --> 01:57:38,779
swap one and seven, like so. But now we're\n
964
01:57:38,779 --> 01:57:47,979
one is a child of six, but one is less than\n
965
01:57:47,979 --> 01:57:53,319
again in violation of the property. So we swap,\n
966
01:57:53,319 --> 01:57:59,269
be. And now the heap invariant is satisfied,\n
967
01:57:59,270 --> 01:58:05,210
whatever you want to call it. So the next\n
968
01:58:05,210 --> 01:58:13,520
in the insertion position. And now, we need\n
969
01:58:13,520 --> 01:58:19,530
13. Notice that we're no longer in violation\n
970
01:58:19,529 --> 01:58:25,679
13, and 13 is greater than 12. So 13 is actually\n
971
01:58:25,679 --> 01:58:32,619
have to bubble up our elements that much.\n
972
01:58:32,619 --> 01:58:39,130
zero and 10. Try seeing where these end up,\n
973
01:58:39,130 --> 01:58:47,659
exercise. But I will keep going for now. So\n
974
01:58:47,658 --> 01:58:54,460
the nodes, it's there, and we bubble it up\n
975
01:58:54,460 --> 01:59:01,420
the property is satisfied. Next zero, my favorite\n
976
01:59:01,420 --> 01:59:07,538
be at the top of the tree as you will see\n
977
01:59:07,538 --> 01:59:18,689
So let us bubble up and, like magic, zero is at\n
978
01:59:18,689 --> 01:59:24,198
number is 10. So we put it in the insertion position.\n
979
01:59:24,198 --> 01:59:26,269
invariant, so we do nothing.
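The insertion steps walked through above (place the value in the insertion position, then bubble it up while it is smaller than its parent) can be sketched as follows. This is a minimal array-backed min heap written for illustration, not the implementation covered later in the series; the class name `MinHeapAddSketch`, the method names, and the starting values are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class MinHeapAddSketch {
    final List<Integer> heap = new ArrayList<>();

    // Put the value in the insertion position (end of the array), then bubble up.
    void add(int value) {
        heap.add(value);
        swim(heap.size() - 1);
    }

    // Bubble up: swap with the parent while the node is smaller than its parent.
    private void swim(int k) {
        int parent = (k - 1) / 2;
        while (k > 0 && heap.get(k) < heap.get(parent)) {
            int tmp = heap.get(k);
            heap.set(k, heap.get(parent));
            heap.set(parent, tmp);
            k = parent;
            parent = (k - 1) / 2;
        }
    }

    // Checks the min-heap invariant: no node is smaller than its parent.
    boolean isMinHeap() {
        for (int k = 1; k < heap.size(); k++) {
            if (heap.get(k) < heap.get((k - 1) / 2)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        MinHeapAddSketch h = new MinHeapAddSketch();
        for (int v : new int[]{5, 6, 12, 8, 7, 14}) h.add(v); // illustrative starting values
        for (int v : new int[]{1, 13, 4, 0, 10}) h.add(v);    // values inserted one by one
        System.out.println(h.heap + " valid=" + h.isMinHeap());
    }
}
```

Each `add` does at most one swap per level of the tree, which is where the logarithmic insertion cost comes from.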
980
01:59:26,270 --> 01:59:31,761
Today we're going to look at how to remove\n
981
01:59:31,761 --> 01:59:38,380
four of five in the priority queue series.\n
982
01:59:38,380 --> 01:59:42,480
the underlying structure of the binary heap.
983
01:59:45,420 --> 01:59:52,480
In general, with heaps we always want to remove\n
984
01:59:52,479 --> 02:00:00,129
It's the one with the highest priority; it's the\n
985
02:00:00,130 --> 02:00:06,190
root, we call it polling. A special thing\n
986
02:00:06,189 --> 02:00:14,269
to search for its index. Because in an array\n
987
02:00:14,270 --> 02:00:25,670
zero. So when I say poll, in red we have the\n
988
02:00:25,670 --> 02:00:32,840
one we're going to swap it with. So the node\n
989
02:00:32,840 --> 02:00:42,730
the end of our array, which we also have its\n
990
02:00:42,729 --> 02:00:52,109
one. And now, since 10 is at the top, well,\n
991
02:00:52,109 --> 02:00:57,429
we need to make sure that the heap invariant\n
992
02:00:57,429 --> 02:01:03,429
bubbling down now instead of bubbling up.\n
993
02:01:03,429 --> 02:01:11,679
five and one, and we select the smallest,\n
994
02:01:11,679 --> 02:01:20,449
go to one. So make sure you default to selecting\n
995
02:01:20,448 --> 02:01:29,098
as you can see, 10's children are two and\n
996
02:01:29,099 --> 02:01:39,909
select the left node to break the tie. And now\n
997
02:01:39,908 --> 02:01:50,868
invariant is satisfied. Now we want to remove\n
998
02:01:50,868 --> 02:01:57,659
element at the root. But 12 is not at the\n
999
02:01:57,659 --> 02:02:07,550
remove 12. So what we do is we have to search\n
1000
02:02:07,550 --> 02:02:13,980
its position yet. So we start at one, and\n
1001
02:02:13,979 --> 02:02:25,348
until we find 12. So five is not 12, two is\n
1002
02:02:25,349 --> 02:02:32,801
found 12. And now we know where its position\n
1003
02:02:32,801 --> 02:02:41,920
to remove, and also swap it with the purple\n
1004
02:02:41,920 --> 02:02:50,250
them, remove the 12. And now we're in violation\n
1005
02:02:50,250 --> 02:03:00,059
up three, until the heap invariant is satisfied.\n
1006
02:03:00,059 --> 02:03:06,000
so we can stop. Now we want to remove three,\n
1007
02:03:06,000 --> 02:03:17,988
the tree. Three wasn't far; it was just two\n
1008
02:03:17,988 --> 02:03:28,919
swap it with the last node in the tree. Drop\n
1009
02:03:28,920 --> 02:03:37,369
up or bubble down the value because you don't\n
1010
02:03:37,369 --> 02:03:43,599
last position is when you're swapping it in.\n
1011
02:03:43,599 --> 02:03:53,750
we already satisfy that heap invariant from\n
1012
02:03:53,750 --> 02:03:59,050
five was smaller, so we swapped it with five,\n
1013
02:03:59,050 --> 02:04:08,400
eight. And again, the heap invariants are\n
1014
02:04:08,399 --> 02:04:19,340
root node in red, swap it, remove the one. And\n
1015
02:04:19,341 --> 02:04:25,860
is satisfied. Now we want to remove six. So\n
1016
02:04:25,859 --> 02:04:36,479
Okay, we have found six and do the swap. Remove\n
1017
02:04:36,479 --> 02:04:43,158
answer is neither; the heap invariant is already\n
1018
02:04:43,158 --> 02:04:49,109
We got lucky. So from all this polling and\n
1019
02:04:49,109 --> 02:04:53,920
polling takes logarithmic time since we're\n
1020
02:04:53,920 --> 02:05:00,329
to find it. And also that removing a random\n
1021
02:05:00,329 --> 02:05:06,908
to actually find the index of that node we\n
1022
02:05:06,908 --> 02:05:12,488
if you're as dissatisfied with this linear\n
1023
02:05:12,488 --> 02:05:19,039
has to be a better way. And indeed there is.\n
1024
02:05:19,039 --> 02:05:28,090
this complexity to be logarithmic in the general\n
1025
02:05:28,090 --> 02:05:34,039
at how to remove nodes in our heap with the\n
1026
02:05:34,039 --> 02:05:40,738
need to make use of a hash table, a data structure\n
1027
02:05:40,738 --> 02:05:47,198
are about to get a little wild. I promise\n
1028
02:05:47,198 --> 02:05:53,299
later video. But right now, it's going to\n
1029
02:05:53,300 --> 02:05:58,320
we have a bunch of nodes scattered across our\n
1030
02:05:58,319 --> 02:06:03,058
a linear scan to find out where the node we\n
1031
02:06:03,059 --> 02:06:09,690
to do a lookup and figure that out. The way\n
1032
02:06:09,689 --> 02:06:17,229
is going to be mapped to the indexes found\n
1033
02:06:17,229 --> 02:06:24,968
node, just look up its index and start doing\n
1034
02:06:24,969 --> 02:06:32,899
sounds great, except for one caveat. What\n
1035
02:06:32,899 --> 02:06:42,029
the same value? What problems will that cause?\n
1036
02:06:42,029 --> 02:06:48,259
To begin with, let's talk about how we can\n
1037
02:06:48,260 --> 02:06:56,210
of mapping one value to one position, we will\n
1038
02:06:56,210 --> 02:07:04,420
can do this by maintaining a set or tree set\n
1039
02:07:04,420 --> 02:07:14,899
or key, if you want, maps to. So, an example.\n
1040
02:07:14,899 --> 02:07:25,349
has repeated values. Namely, we can see that\n
1041
02:07:25,350 --> 02:07:33,130
twice, 11 and 13 once. Below I have drawn\n
1042
02:07:33,130 --> 02:07:42,819
determine the index position of a node in\n
1043
02:07:42,819 --> 02:07:52,460
at index five, and the first two at index zero.\n
1044
02:07:52,460 --> 02:07:59,269
value pairs. Notice that two is found in three\n
1045
02:07:59,269 --> 02:08:04,280
two positions, one and four, and so on. So\n
1046
02:08:04,279 --> 02:08:10,288
positions of the values in the tree. If nodes\n
1047
02:08:10,288 --> 02:08:16,618
keep track of that. For example, if a bubble\n
1048
02:08:16,618 --> 02:08:23,609
movements, and where the swaps go to, so we\n
1049
02:08:23,609 --> 02:08:32,618
we swap 13 and the last seven, for example,\n
1050
02:08:32,618 --> 02:08:42,328
where seven and 13 are in our table. And then\n
1051
02:08:42,328 --> 02:08:49,518
red for the seven and yellow for the 13. And\n
1052
02:08:49,519 --> 02:08:58,429
do a swap in the tree but also in the table.\n
1053
02:08:58,429 --> 02:09:05,199
We keep track of repeated values by maintaining\n
1054
02:09:05,198 --> 02:09:12,649
value was found at. But now let's ask a further\n
1055
02:09:12,649 --> 02:09:18,118
node in our heap, which node do we remove\n
1056
02:09:18,118 --> 02:09:24,219
our heap right here, there are three possible\n
1057
02:09:27,099 --> 02:09:35,110
The answer is no, it does not matter. As long\n
1058
02:09:35,109 --> 02:09:41,670
and that's the most important thing. So let's\n
1059
02:09:41,670 --> 02:09:48,440
but also of adding and polling elements with\n
1060
02:09:48,439 --> 02:09:56,288
that hard, trust me. So first one, we want\n
1061
02:09:56,288 --> 02:10:03,788
bottom of the heap in the insertion position.\n
1062
02:10:03,788 --> 02:10:11,139
so we add three to our table along with its\n
1063
02:10:11,139 --> 02:10:18,029
Look at the index tree in gray to confirm\n
1064
02:10:18,029 --> 02:10:24,170
we need to make sure the heap invariant is satisfied,\n
1065
02:10:24,170 --> 02:10:31,179
up three, the parent of three is 11, which\n
1066
02:10:31,179 --> 02:10:38,868
those two nodes. I have highlighted the seven\n
1067
02:10:38,868 --> 02:10:46,899
in the heap and three in the index three,\n
1068
02:10:46,899 --> 02:10:55,029
those both in the tree and in the table. Awesome.\n
1069
02:10:55,029 --> 02:11:00,960
So do a similar thing for the node above,\n
1070
02:11:00,960 --> 02:11:10,960
in the table. And now the heap invariant is\n
1071
02:11:10,960 --> 02:11:17,649
The next instruction is to remove two from\n
1072
02:11:17,649 --> 02:11:24,379
Well, as I said, it doesn't matter as long\n
1073
02:11:24,380 --> 02:11:30,480
If we remove the last two, we can immediately\n
1074
02:11:30,479 --> 02:11:35,468
But for learning purposes, I will simply remove\n
1075
02:11:35,469 --> 02:11:47,448
to be located at index zero. So we want to\n
1076
02:11:47,448 --> 02:11:52,678
we remove a node again, so we did a lookup.\n
1077
02:11:52,679 --> 02:12:00,868
which was nice. And now we swap it with the\n
1078
02:12:00,868 --> 02:12:08,509
we remove the last node. Now we need to satisfy\n
1079
02:12:08,510 --> 02:12:16,699
11. So we look at 11's children, which happens\n
1080
02:12:16,698 --> 02:12:26,569
the one we're going to swap with. So swap\n
1081
02:12:26,569 --> 02:12:33,130
are still not in satisfaction of the heap\n
1082
02:12:33,130 --> 02:12:42,190
and two is smaller, so swap it with two. And\n
1083
02:12:42,189 --> 02:12:50,339
says to poll, so we get the value of the\n
1084
02:12:50,340 --> 02:12:57,501
of the two and bubble down 11. So as you can\n
1085
02:12:57,501 --> 02:13:04,079
but still doing the same operations. This\n
1086
02:13:04,078 --> 02:13:10,448
series. And we're going to have a look at\n
1087
02:13:10,448 --> 02:13:16,069
So if you want the source code, here's the\n
1088
02:13:16,069 --> 02:13:20,889
in the series, the priority queue is one of\n
1089
02:13:20,889 --> 02:13:27,609
one to four, so you can actually understand what's\n
1090
02:13:27,609 --> 02:13:37,198
Alright, here we are inside the source code.\n
1091
02:13:37,198 --> 02:13:43,019
types of elements I'm allowing inside my priority\n
1092
02:13:43,020 --> 02:13:47,769
talked about. So if they implement the Comparable\n
1093
02:13:47,769 --> 02:13:53,300
inside our queue. So this is anything like\n
1094
02:13:53,300 --> 02:13:58,690
interface. So let's have a look at some of\n
1095
02:13:58,689 --> 02:14:05,210
size. So this is the number of elements currently\n
1096
02:14:05,210 --> 02:14:12,550
instance variable, which is the heap capacity.\n
1097
02:14:12,550 --> 02:14:22,369
that we have for elements, which may be larger\n
1098
02:14:22,368 --> 02:14:28,408
heap. And we're going to be maintaining it\n
1099
02:14:30,599 --> 02:14:39,019
Next, for our log(n) removals, I'm\n
1100
02:14:39,019 --> 02:14:45,639
want to map an element to a tree set of integers.\n
1101
02:14:45,639 --> 02:14:55,859
our heap at which we can find this element T.\n
1102
02:14:55,859 --> 02:15:02,198
constructors for our priority queue. We can\n
1103
02:15:02,198 --> 02:15:11,939
I'm creating an initially empty priority\n
1104
02:15:11,939 --> 02:15:19,469
you to create a priority queue with a defined\n
1105
02:15:19,469 --> 02:15:24,969
because then we don't have to keep expanding\n
1106
02:15:24,969 --> 02:15:32,050
this. But also, even better, is if you know\n
1107
02:15:32,050 --> 02:15:40,599
your heap, you can actually construct the\n
1108
02:15:40,599 --> 02:15:47,761
called heapify, which I didn't talk about\n
1109
02:15:47,761 --> 02:15:55,420
useful. So this just has all the usual\n
1110
02:15:55,420 --> 02:16:03,219
to the map, but also to the heap. And\n
1111
02:16:03,219 --> 02:16:09,349
at halfway through the heap size, and then\n
1112
02:16:09,349 --> 02:16:16,550
the elements. And you're like, Wait a second,\n
1113
02:16:16,550 --> 02:16:22,409
Well, yes, it is in the general case, but\n
1114
02:16:22,408 --> 02:16:31,018
a link to this paper up here, just because the\n
1115
02:16:31,019 --> 02:16:40,148
it has this linear complexity. And if you\n
1116
02:16:40,148 --> 02:16:46,439
end up seeing that the complexity boils down\n
1117
02:16:46,439 --> 02:16:53,489
constantly say it's linear time. But in general,\n
1118
02:16:53,489 --> 02:16:58,760
If you're given a collection of elements,\n
1119
02:16:58,760 --> 02:17:04,068
would just use our add method to add the elements\n
1120
02:17:04,068 --> 02:17:12,769
log N bound. But definitely use the heapify\n
1121
02:17:12,769 --> 02:17:21,159
methods we have: isEmpty just returns true\n
1122
02:17:21,159 --> 02:17:26,500
clear. So when we clear the heap, we remove\n
1123
02:17:26,500 --> 02:17:35,478
also inside our map. So that's why I call\n
1124
02:17:35,478 --> 02:17:46,000
simple. Peek, the first really useful method\n
1125
02:17:46,000 --> 02:17:53,189
queue. And if it's empty, it returns null. Otherwise,\n
1126
02:17:53,189 --> 02:17:58,359
our heap and return it because it's the root\nnode
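The peek just described, together with the polling steps from the removal walkthrough earlier (swap the root with the last node, remove it, then bubble the new root down, defaulting to the left child on ties), can be sketched like this. It is a simplified illustration without the hash table; the class name `MinHeapPollSketch` and the sample values are assumptions, and the constructor assumes its arguments already form a valid min heap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MinHeapPollSketch {
    final List<Integer> heap;

    // Assumes the given values already satisfy the min-heap invariant.
    MinHeapPollSketch(Integer... values) {
        heap = new ArrayList<>(Arrays.asList(values));
    }

    // The root of a min heap is the smallest element; null if the heap is empty.
    Integer peek() {
        return heap.isEmpty() ? null : heap.get(0);
    }

    // Swap the root with the last element, remove it, then restore the invariant.
    Integer poll() {
        if (heap.isEmpty()) return null;
        int last = heap.size() - 1;
        swap(0, last);
        Integer root = heap.remove(last); // List.remove(int index)
        sink(0);
        return root;
    }

    // Bubble down: swap with the smaller child until neither child is smaller.
    private void sink(int k) {
        int n = heap.size();
        while (true) {
            int left = 2 * k + 1, right = 2 * k + 2;
            int smallest = left; // default to the left child when the children tie
            if (right < n && heap.get(right) < heap.get(left)) smallest = right;
            if (left >= n || heap.get(k) <= heap.get(smallest)) break;
            swap(k, smallest);
            k = smallest;
        }
    }

    private void swap(int i, int j) {
        Integer tmp = heap.get(i);
        heap.set(i, heap.get(j));
        heap.set(j, tmp);
    }

    public static void main(String[] args) {
        MinHeapPollSketch h = new MinHeapPollSketch(1, 3, 2, 7, 4, 5);
        System.out.println("peek: " + h.peek());
        while (h.peek() != null) System.out.print(h.poll() + " ");
        System.out.println();
    }
}
```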
1127
02:17:59,359 --> 02:18:09,170
Similar to peek, except that we're going\n
1128
02:18:09,170 --> 02:18:16,808
And we're also going to return it because\n
1129
02:18:16,808 --> 02:18:23,929
because we have a map with our elements, we\n
1130
02:18:23,929 --> 02:18:31,500
inside the heap. And this reduces our complexity\n
1131
02:18:31,500 --> 02:18:37,409
through a linear scan through all elements\n
1132
02:18:37,409 --> 02:18:44,619
is remarkable. But you have a job in case\n
1133
02:18:44,620 --> 02:18:50,559
just wanted to do it. Just to show you guys\n
1134
02:18:50,558 --> 02:18:59,590
a lot of constant overhead and you may or\n
1135
02:18:59,590 --> 02:19:08,279
is quite a lot. And I usually don't really\n
1136
02:19:08,279 --> 02:19:14,159
it might not be entirely worth it. But it's\n
1137
02:19:14,159 --> 02:19:20,079
as you do removals, then it's definitely worth\n
1138
02:19:20,079 --> 02:19:29,138
Add method. So this element, sorry, this\n
1139
02:19:29,138 --> 02:19:39,849
And that element cannot be null. So what we\n
1140
02:19:39,850 --> 02:19:47,620
less than capacity. Otherwise we have to expand\n
1141
02:19:47,620 --> 02:19:54,829
we add it to the map. So we keep track of\n
1142
02:19:54,829 --> 02:20:01,690
we have to swim the node up, because we add\n
1143
02:20:01,690 --> 02:20:09,600
have to, like, adjust where it goes inside our\n
1144
02:20:09,600 --> 02:20:17,250
called less is a helper method, which helps\n
1145
02:20:17,250 --> 02:20:26,879
node j. And this uses the fact that both elements\n
1146
02:20:26,879 --> 02:20:34,269
can invoke the compareTo method. If we go\n
1147
02:20:34,270 --> 02:20:44,210
Comparable interface, which we needed. So,\n
1148
02:20:44,209 --> 02:20:54,159
if i is less than or equal to J. Awesome.\n
1149
02:20:54,159 --> 02:21:02,409
are going to try to swim node k. So first,\n
1150
02:21:02,409 --> 02:21:12,939
that by solving for the parent. So remember\n
1151
02:21:12,939 --> 02:21:19,309
some people like to start the heap's index,\n
1152
02:21:19,309 --> 02:21:25,689
So I get the parent, which is at this position\n
1153
02:21:25,689 --> 02:21:35,510
going upwards. And while K is still greater\n
1154
02:21:35,510 --> 02:21:41,159
less than our parent, then we want to swim\n
1155
02:21:41,159 --> 02:21:47,738
parent and K. And then K is going to become\n
1156
02:21:47,738 --> 02:21:54,129
parent of K once more. And then we'll keep\n
1157
02:21:54,129 --> 02:22:01,489
nodes. So that's how you do the swim. So the\n
1158
02:22:01,489 --> 02:22:12,681
top down node sink. And here, we want to sink\n
1159
02:22:12,681 --> 02:22:18,840
node, but I also grab the right node. Remember,\n
1160
02:22:18,840 --> 02:22:29,110
two instead of plus zero, plus one. And then\n
1161
02:22:29,110 --> 02:22:34,360
is going to be the left one or the right one.\n
1162
02:22:34,360 --> 02:22:39,790
one is going to be smaller than the right\n
1163
02:22:39,790 --> 02:22:47,360
case it was false. So I checked that the right\n
1164
02:22:47,360 --> 02:22:54,550
the right node is less than the left node,\n
1165
02:22:55,760 --> 02:23:03,818
And our stopping condition is that we are\n
1166
02:23:03,818 --> 02:23:12,219
sink any more. And we can do a similar thing,\n
1167
02:23:12,219 --> 02:23:21,539
like we did in the last method also.\n
1168
02:23:21,540 --> 02:23:30,590
to swap because I also have to swap things\n
1169
02:23:30,590 --> 02:23:36,139
And this is really what adds a lot of overhead\n
1170
02:23:36,139 --> 02:23:41,809
call the swap method, we also have to swap\n
1171
02:23:41,809 --> 02:23:48,139
a lot of overhead really, it technically maps\n
1172
02:23:48,139 --> 02:23:57,180
you're doing all this internal hashing and\n
1173
02:23:57,180 --> 02:24:04,729
So remove. So if the element is null, return\n
1174
02:24:04,728 --> 02:24:11,849
inside our heap. So this is how you would\n
1175
02:24:11,850 --> 02:24:17,329
out in case you want to revert back and remove\n
1176
02:24:17,329 --> 02:24:23,100
all the elements. And once you find the element\n
1177
02:24:23,100 --> 02:24:31,559
index and return true. Otherwise, we're going\n
1178
02:24:31,559 --> 02:24:39,989
the element, or one of the elements, is. And if\n
1179
02:24:39,989 --> 02:24:49,629
now let's have a look at the removeAt method.\n
1180
02:24:49,629 --> 02:24:58,469
So if our heap is empty, well, can't really\n
1181
02:24:58,469 --> 02:25:09,898
swap the index we want to remove with the last element,\n
1182
02:25:09,898 --> 02:25:19,139
we're going to kill off that node and also\n
1183
02:25:19,139 --> 02:25:27,340
that i was equal to the heap size, meaning\n
1184
02:25:27,340 --> 02:25:34,170
heap, just remove, return the removed data.\n
1185
02:25:34,170 --> 02:25:41,770
either sink that node up or down. And I'm\n
1186
02:25:41,770 --> 02:25:48,950
sink or swim. So I just try both. So first\n
1187
02:25:48,950 --> 02:25:59,800
then I try swimming upwards. And in either\n
1188
02:25:59,799 --> 02:26:06,379
this just readjusts where, where the swap\n
1189
02:26:06,379 --> 02:26:17,379
down. This method is just a method I use in\n
1190
02:26:17,379 --> 02:26:24,889
is good. So it checks essentially the integrity\n
1191
02:26:24,889 --> 02:26:29,349
this method with K equals zero, and that starts\n
1192
02:26:29,350 --> 02:26:37,800
down the tree and check are we maintaining\n
1193
02:26:37,799 --> 02:26:44,250
So our base case is going to be that k is outside\n
1194
02:26:44,250 --> 02:26:52,370
to return true. Otherwise, get our children\n
1195
02:26:52,370 --> 02:27:03,140
sure that k is less than both our children.\n
1196
02:27:03,139 --> 02:27:08,439
And if we ever returned false, because we\n
1197
02:27:12,209 --> 02:27:18,389
that gets propagated throughout the recursion,\n
1198
02:27:18,389 --> 02:27:24,680
if everything returns true and hits the base\n
1199
02:27:24,680 --> 02:27:34,479
heap. Okay, these last few methods are just\n
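The recursive integrity check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the video: the class name MinHeapCheckSketch and the plain int array are assumptions made here, and the check treats a child equal to its parent as valid.

```java
final class MinHeapCheckSketch {
    // Returns true if the subtree rooted at index k satisfies the min-heap invariant.
    // The heap is assumed to be stored in an array, children of k at 2k+1 and 2k+2.
    static boolean isMinHeap(int[] heap, int k) {
        int n = heap.length;
        if (k >= n) return true;              // base case: k is outside the heap
        int left = 2 * k + 1;
        int right = 2 * k + 2;
        // The node at k must not be greater than either of its children.
        if (left < n && heap[left] < heap[k]) return false;
        if (right < n && heap[right] < heap[k]) return false;
        // A false anywhere below propagates up through the recursion.
        return isMinHeap(heap, left) && isMinHeap(heap, right);
    }

    public static void main(String[] args) {
        System.out.println(isMinHeap(new int[] {1, 3, 2, 7, 4}, 0)); // true
        System.out.println(isMinHeap(new int[] {5, 3, 8}, 0));       // false
    }
}
```

Calling it with k equal to zero starts the check at the root, as in the description.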
1200
02:27:34,478 --> 02:27:42,148
into the map, things, how to remove elements\n
1201
02:27:42,148 --> 02:27:48,799
here, as I'm using a TreeSet to add and remove\n
1202
02:27:48,799 --> 02:27:56,059
Java has a Balanced Binary Search Tree. So\n
1203
02:27:56,059 --> 02:28:03,799
which is really nice. So you guys can have\n
1204
02:28:03,799 --> 02:28:14,429
values removes values. And lastly, do a map\n
1205
02:28:14,430 --> 02:28:21,760
or in the map rather. So yes, have a look\n
1206
02:28:21,760 --> 02:28:29,219
covered everything about the priority queue.\n
1207
02:28:29,219 --> 02:28:36,829
also sometimes called the disjoint set, this\n
1208
02:28:36,829 --> 02:28:42,271
started. So an outline of things we'll be\n
1209
02:28:42,271 --> 02:28:49,689
be going over a motivating example, magnets.\n
1210
02:28:49,689 --> 02:28:56,818
be. Then we'll go over a classic example of\n
1211
02:28:56,818 --> 02:29:04,309
Kruskal's minimum spanning tree algorithm,\n
1212
02:29:04,309 --> 02:29:11,549
it needs the union find to get the complexity\n
1213
02:29:11,549 --> 02:29:17,769
concerning the find and the union operations,\n
1214
02:29:17,770 --> 02:29:24,540
And finally, we'll have a look at path compression.\n
1215
02:29:24,540 --> 02:29:33,479
time the union find provides. OK, let's dive\n
1216
02:29:33,478 --> 02:29:40,238
union find. So what is the union find?\n
1217
02:29:40,238 --> 02:29:46,478
elements which are split into one or more\n
1218
02:29:46,478 --> 02:29:55,228
primary operations: find and union. A word\n
1219
02:29:55,228 --> 02:30:06,358
will tell you what group that element belongs\n
1220
02:30:06,359 --> 02:30:14,470
So if we have this example with magnets, suppose\n
1221
02:30:14,469 --> 02:30:19,889
are magnets. And also suppose that the magnets\n
1222
02:30:19,889 --> 02:30:27,079
meaning they want to merge together to form\n
1223
02:30:27,079 --> 02:30:32,379
all the magnets and give them numbers, and\n
1224
02:30:32,379 --> 02:30:40,319
attraction, first, we're going to merge six\n
1225
02:30:40,319 --> 02:30:49,270
our union find, we would say union six, and\n
1226
02:30:49,270 --> 02:30:56,861
out which groups six and eight belong to,\n
1227
02:30:56,861 --> 02:31:02,399
two, three, and three and four are highly\n
1228
02:31:02,398 --> 02:31:10,559
a group. So they would form the yellow group.\n
1229
02:31:10,559 --> 02:31:21,129
And this keeps on going, and we unify magnets\n
1230
02:31:21,129 --> 02:31:29,959
onto already existing groups. So we unify\n
1231
02:31:29,959 --> 02:31:38,898
own group, and add to an already existing\n
1232
02:31:38,898 --> 02:31:45,648
which are different colors, and we assign\n
1233
02:31:45,648 --> 02:31:51,478
everything in the yellow group went into the\n
1234
02:31:51,478 --> 02:31:58,688
in our union find to determine which group\n
1235
02:31:58,689 --> 02:32:06,300
in the blue group, and the union find does\n
1236
02:32:06,299 --> 02:32:12,099
manner, which is why it's so handy to have\naround.
1237
02:32:12,100 --> 02:32:15,772
Now explaining currently how that works. We'll\n
1238
02:32:15,772 --> 02:32:25,300
just a motivating example. So where are other\n
1239
02:32:25,299 --> 02:32:34,228
we will. Well, we see the union find again in\n
1240
02:32:34,228 --> 02:32:40,000
In another problem called grid percolation,\n
1241
02:32:40,000 --> 02:32:45,540
we're trying to see if there's a path from\n
1242
02:32:45,540 --> 02:32:52,069
or vice versa, then the union find lets us\n
1243
02:32:52,068 --> 02:32:59,539
Also, a similar kind of problem in network connectivity\n
1244
02:32:59,540 --> 02:33:04,771
each other through a series of edges. And\n
1245
02:33:04,771 --> 02:33:11,850
like the least common ancestor in a tree,\n
1246
02:33:11,850 --> 02:33:20,149
complexity can we attribute to the union find?\n
1247
02:33:20,148 --> 02:33:28,439
is linear time, which isn't actually bad at\n
1248
02:33:28,439 --> 02:33:35,010
check if connected operations all happen\n
1249
02:33:35,010 --> 02:33:44,460
So almost constant time, although not quite\n
1250
02:33:44,459 --> 02:33:49,188
where we can determine how many components\n
1251
02:33:49,189 --> 02:33:54,829
groups of magnets we have. And we can\n
1252
02:33:54,829 --> 02:34:03,250
really great. Okay, let's talk about a really\n
1253
02:34:03,250 --> 02:34:09,750
is Kruskal's minimum spanning tree algorithm.\n
1254
02:34:09,750 --> 02:34:17,020
minimum spanning tree? So if we're given some\n
1255
02:34:17,020 --> 02:34:22,979
minimum spanning tree is a subset of the edges\n
1256
02:34:22,978 --> 02:34:34,590
so at a minimal cost. So, if this is our graph,\n
1257
02:34:34,590 --> 02:34:42,500
minimum spanning tree is the following and\n
1258
02:34:42,500 --> 02:34:47,770
Note that the minimum spanning tree is not\n
1259
02:34:47,770 --> 02:34:56,850
minimum spanning tree, it will also have a\n
1260
02:34:56,850 --> 02:35:02,689
we can break it up into three steps essentially.\n
1261
02:35:02,689 --> 02:35:09,470
sort them by ascending edge weight. Next\n
1262
02:35:09,469 --> 02:35:16,670
the sorted edges and compare the two nodes\n
1263
02:35:16,670 --> 02:35:22,850
already belong to the same group, then we\n
1264
02:35:22,850 --> 02:35:27,270
cycle in our minimum spanning tree, which\n
1265
02:35:27,270 --> 02:35:36,720
the two groups those nodes belong to.\n
1266
02:35:36,719 --> 02:35:43,699
until either we run out of edges, or all the\n
1267
02:35:43,700 --> 02:35:49,550
And you'll soon see what I mean by a group,\n
1268
02:35:49,549 --> 02:35:57,789
is going to come into play. So if this is\n
1269
02:35:57,790 --> 02:36:04,330
on it, first, let's take the edges and sort\n
1270
02:36:04,329 --> 02:36:12,689
all the edges and their edge weights sort\n
1271
02:36:12,689 --> 02:36:20,850
processing the edges one at a time, starting\n
1272
02:36:20,850 --> 02:36:28,000
highlighted the edge I to J in orange. And\n
1273
02:36:28,000 --> 02:36:37,829
i and j currently don't belong to any group.\n
1274
02:36:37,829 --> 02:36:45,950
orange. Next is edge A to E, so A and E don't\n
1275
02:36:45,950 --> 02:36:57,329
them together into group purple. Next is C to I.\n
1276
02:36:57,329 --> 02:37:04,889
have a group yet. So C can go into group\n
1277
02:37:04,889 --> 02:37:07,969
to a group. So F can go to group purple.
1278
02:37:07,969 --> 02:37:16,608
Next, H and G. Neither H nor G belong\n
1279
02:37:16,609 --> 02:37:26,450
to the red group. And next we have D to B.\n
1280
02:37:26,450 --> 02:37:34,190
them their own group, let's say group green.\n
1281
02:37:34,190 --> 02:37:41,220
get interesting. Now, we're trying to connect\n
1282
02:37:41,219 --> 02:37:47,438
belong to group orange. So we don't want to\n
1283
02:37:47,439 --> 02:37:54,389
a cycle, so ignore it. And to check that they\n
1284
02:37:54,389 --> 02:38:00,889
find operation in our union find to check\n
1285
02:38:00,889 --> 02:38:12,170
union find really comes into play. Next is edge\n
1286
02:38:12,170 --> 02:38:17,180
and D belongs to group green. So now we want\n
1287
02:38:17,180 --> 02:38:21,680
belong to the same group. So either the purple\n
1288
02:38:21,680 --> 02:38:26,190
green groups and the purple group. And it\n
1289
02:38:26,190 --> 02:38:31,850
them. And this is when the union operation\n
1290
02:38:31,850 --> 02:38:39,930
us to merge groups of colors together very\n
1291
02:38:39,930 --> 02:38:51,139
Next edge would be D to H. H belongs to group\n
1292
02:38:51,139 --> 02:39:00,099
let's say they both become group purple. Next\n
1293
02:39:00,100 --> 02:39:04,530
belong to the same group. So that would create\n
1294
02:39:04,530 --> 02:39:15,530
edge. So skip. Next rounds include edge B to C\n
1295
02:39:15,530 --> 02:39:22,029
two groups into one larger group. So we have\n
1296
02:39:22,029 --> 02:39:27,979
minimum spanning tree algorithm. Pretty neat,\n
1297
02:39:27,978 --> 02:39:33,938
allows us to do this is the union find. It\n
1298
02:39:33,939 --> 02:39:40,729
efficiently, but also to find out which groups\n
1299
02:39:40,729 --> 02:39:49,100
cycle. So that's Kruskal's algorithm. It's\n
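The three steps walked through above (sort the edges by weight, skip an edge whose endpoints are already in the same group, otherwise include it and unify the two groups) might be sketched like this. It is a simplified illustration under stated assumptions: nodes are numbered 0 to n - 1, the names KruskalSketch, Edge, and minimumSpanningTree are hypothetical, and the union find here is a bare parent array with no path compression or size tracking.

```java
import java.util.ArrayList;
import java.util.List;

final class KruskalSketch {
    // An undirected weighted edge between nodes 'from' and 'to'.
    static final class Edge {
        final int from, to, weight;
        Edge(int from, int to, int weight) {
            this.from = from;
            this.to = to;
            this.weight = weight;
        }
    }

    // Follow parent links until reaching a root (a node that is its own parent).
    static int find(int[] parent, int p) {
        while (parent[p] != p) p = parent[p];
        return p;
    }

    // Returns the edges chosen by Kruskal's algorithm. If the graph is
    // disconnected, the result is a spanning forest with fewer than n - 1 edges.
    static List<Edge> minimumSpanningTree(int n, List<Edge> edges) {
        List<Edge> sorted = new ArrayList<>(edges);
        sorted.sort((a, b) -> Integer.compare(a.weight, b.weight)); // step 1: ascending weight

        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i; // every node starts in its own group

        List<Edge> chosen = new ArrayList<>();
        for (Edge e : sorted) {
            if (chosen.size() == n - 1) break;     // all nodes already unified
            int rootA = find(parent, e.from);
            int rootB = find(parent, e.to);
            if (rootA == rootB) continue;          // same group: edge would create a cycle
            parent[rootA] = rootB;                 // step 2: unify the two groups
            chosen.add(e);
        }
        return chosen;
    }

    public static void main(String[] args) {
        List<Edge> edges = new ArrayList<>();
        edges.add(new Edge(0, 1, 1));
        edges.add(new Edge(1, 2, 2));
        edges.add(new Edge(0, 2, 3)); // skipped: 0 and 2 are already connected
        edges.add(new Edge(2, 3, 4));
        int total = 0;
        for (Edge e : minimumSpanningTree(4, edges)) total += e.weight;
        System.out.println(total); // 7
    }
}
```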
1300
02:39:49,100 --> 02:39:55,279
union find works. So I'm going to go into\n
1301
02:39:55,279 --> 02:40:02,729
the find and the union operations work internally,\n
1302
02:40:02,728 --> 02:40:09,289
useful way. Okay, so now we're going to talk\n
1303
02:40:09,290 --> 02:40:17,270
do on the union find, or the disjoint set.\n
1304
02:40:17,270 --> 02:40:24,680
actually work internally. So to create our\n
1305
02:40:24,680 --> 02:40:32,420
do is we're going to construct a bijection,\n
1306
02:40:32,420 --> 02:40:40,850
the integers in the range zero inclusive to\n
1307
02:40:40,850 --> 02:40:47,079
So this step in general is actually not necessary.\n
1308
02:40:47,079 --> 02:40:55,969
based union find, which is very efficient,\n
1309
02:40:55,969 --> 02:41:03,809
have some random objects, and we want to assign\n
1310
02:41:03,809 --> 02:41:15,389
as long as each element maps to exactly one\n
1311
02:41:15,389 --> 02:41:20,170
And we want to store these mappings perhaps\n
1312
02:41:20,170 --> 02:41:30,109
them and determine what everything is mapped\n
1313
02:41:30,109 --> 02:41:39,449
And each index is going to have an associated\n
1314
02:41:39,449 --> 02:41:49,350
So for instance, in the last slide, a was\n
1315
02:41:55,228 --> 02:42:01,920
So what you see in this picture is at the\n
1316
02:42:01,920 --> 02:42:10,949
our mapping. And in the center is just a visual\n
1317
02:42:10,949 --> 02:42:19,120
in the array for each position is currently\n
1318
02:42:19,120 --> 02:42:30,140
originally, every node is a root node, meaning\n
1319
02:42:30,139 --> 02:42:38,978
on the left of unifying groups together, or\n
1320
02:42:38,978 --> 02:42:49,148
to find that we're going to change the values\n
1321
02:42:49,148 --> 02:42:57,059
specifically, the way we're going to do it\n
1322
02:42:57,059 --> 02:43:08,269
i's parent is going to be whatever index\n
1323
02:43:08,270 --> 02:43:19,601
to unify C and K, we look at C and K. And\n
1324
02:43:19,601 --> 02:43:28,620
and K has a root node of nine. So either C is going to become\n
1325
02:43:28,620 --> 02:43:37,220
And I chose that K's parent is going to be\n
1326
02:43:37,219 --> 02:43:44,219
position, I'm going to put a four, because\n
1327
02:43:44,219 --> 02:43:50,510
are going to do a similar type of thing. And\n
1328
02:43:50,510 --> 02:43:59,670
is going to be E. So at F position, which\n
1329
02:43:59,670 --> 02:44:07,050
is zero. Similar thing for a and J. But here's\n
1330
02:44:07,050 --> 02:44:20,519
now we want to unify A and B. So if I look\n
1331
02:44:20,520 --> 02:44:30,630
I know that A's root node for group greens\n
1332
02:44:30,629 --> 02:44:40,170
loop. And in general, I'm going to merge smaller\n
1333
02:44:40,170 --> 02:44:52,850
are point to J, because the green groups root\n
1334
02:44:52,850 --> 02:44:58,750
find the root node of D, which is D, and find\n
1335
02:44:58,750 --> 02:45:04,670
to merge the smaller component into the\n
1336
02:45:04,670 --> 02:45:11,439
Now these want to be part of the orange group.\n
1337
02:45:11,439 --> 02:45:23,930
happens. And I now points at C. Now, I want\n
1338
02:45:23,930 --> 02:45:33,939
going to merge L and B into the red group.\n
1339
02:45:33,939 --> 02:45:40,738
interesting example. So I find C's root\n
1340
02:45:40,738 --> 02:45:50,260
node which is J. Now, component, orange has\n
1341
02:45:50,260 --> 02:45:58,909
three. So I'm going to merge the green component\n
1342
02:45:58,909 --> 02:46:09,439
to point to C. So I want to unify A and B.\n
1343
02:46:09,439 --> 02:46:18,600
nodes until I reach a root node. A's parent\n
1344
02:46:18,600 --> 02:46:23,760
belongs to the orange group. And if I do a\n
1345
02:46:23,760 --> 02:46:29,158
B's parent is also C, which is the orange\n
1346
02:46:29,158 --> 02:46:40,260
already unified together. So H and G, they\n
1347
02:46:40,260 --> 02:46:47,728
to arbitrarily merge them into a new group.\n
1348
02:46:47,728 --> 02:46:53,679
if I look, H's parent is G, and F's parent\nis E
1349
02:46:53,680 --> 02:47:03,139
the right component is larger, so I'm going\n
1350
02:47:03,139 --> 02:47:08,849
g was the root node, I make it point to E,\n
1351
02:47:08,850 --> 02:47:19,930
merge H and B. So H's root node is E, if\n
1352
02:47:19,930 --> 02:47:26,779
B's root node is C, because we go from B to\n
1353
02:47:26,779 --> 02:47:31,760
is larger than the red component, we're going\n
1354
02:47:31,760 --> 02:47:38,100
the root of the orange component. So E now\n
1355
02:47:38,100 --> 02:47:44,520
example, I'm not using a technique called\n
1356
02:47:44,520 --> 02:47:52,909
going to look at in the next video, which\n
1357
02:47:52,909 --> 02:47:59,600
if we want to find out which component a particular\n
1358
02:47:59,600 --> 02:48:06,988
the root of that component by following all\n
1359
02:48:06,988 --> 02:48:13,969
or a node whose parent is itself and that\n
1360
02:48:13,969 --> 02:48:22,969
element belongs to. And to unify two components\n
1361
02:48:22,969 --> 02:48:29,889
of each component. And then if the root nodes\n
1362
02:48:29,889 --> 02:48:35,278
then they belong to the same component already.\n
1363
02:48:35,279 --> 02:48:44,699
nodes point to, or become the parent of, the\n
1364
02:48:44,699 --> 02:48:50,899
union find data structure. So in general,\n
1365
02:48:50,898 --> 02:48:54,939
because this would be inefficient as we'd\n
1366
02:48:54,939 --> 02:49:01,430
to that node, we don't have access to those.\n
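The find and unify operations summarized above, before any path compression is added, could look something like the following. This is a sketch only: the class name BasicUnionFindSketch and its field names are assumptions for illustration, and elements are assumed to have already been mapped to the integers 0 to n - 1.

```java
final class BasicUnionFindSketch {
    private final int[] parent; // parent[i] == i means i is a root node
    private final int[] size;   // size[root] is the number of elements in that component
    private int numComponents;  // number of root nodes remaining

    BasicUnionFindSketch(int n) {
        parent = new int[n];
        size = new int[n];
        for (int i = 0; i < n; i++) {
            parent[i] = i; // originally every node is its own root
            size[i] = 1;
        }
        numComponents = n;
    }

    // Follow parent links until reaching a node whose parent is itself.
    int find(int p) {
        while (parent[p] != p) p = parent[p];
        return p;
    }

    boolean connected(int p, int q) {
        return find(p) == find(q);
    }

    // Merge the smaller component into the larger one.
    void unify(int p, int q) {
        int rootP = find(p);
        int rootQ = find(q);
        if (rootP == rootQ) return; // already in the same component
        if (size[rootP] < size[rootQ]) {
            parent[rootP] = rootQ;
            size[rootQ] += size[rootP];
        } else {
            parent[rootQ] = rootP;
            size[rootP] += size[rootQ];
        }
        numComponents--; // two roots became one
    }

    int components() {
        return numComponents;
    }

    public static void main(String[] args) {
        BasicUnionFindSketch uf = new BasicUnionFindSketch(5);
        uf.unify(0, 1);
        uf.unify(2, 3);
        System.out.println(uf.connected(1, 2)); // false
        uf.unify(1, 3);
        System.out.println(uf.connected(0, 2)); // true
        System.out.println(uf.components());    // 2
    }
}
```

There is deliberately no operation to split a component apart in this sketch.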
1367
02:49:01,430 --> 02:49:09,090
of that. I just don't see any application\n
1368
02:49:09,090 --> 02:49:15,189
that the number of components in our union\n
1369
02:49:15,189 --> 02:49:21,790
root nodes remaining. Because each root node\n
1370
02:49:21,790 --> 02:49:26,770
that the number of root nodes never increases,\n
1371
02:49:26,770 --> 02:49:33,109
unify components, so components only get bigger\n
1372
02:49:33,109 --> 02:49:39,109
about the complexity of the Union find. So\n
1373
02:49:39,109 --> 02:49:46,649
has an amortized time complexity. However,\n
1374
02:49:46,648 --> 02:49:52,189
have an amortized time complexity. Not yet\n
1375
02:49:52,189 --> 02:49:57,920
something we're going to look at in the next\n
1376
02:49:57,920 --> 02:50:05,920
an absolute beast of it. structure, you must\n
1377
02:50:05,920 --> 02:50:12,790
if we need to check, if H and B belong to\n
1378
02:50:12,790 --> 02:50:20,260
going to take five hops in the worst case,\n
1379
02:50:20,260 --> 02:50:26,260
find the root node, which is C, and then we\n
1380
02:50:26,260 --> 02:50:35,100
also C. So this takes quite a few hops. Let's\n
1381
02:50:35,100 --> 02:50:41,930
is really what makes the union find one of\n
1382
02:50:41,930 --> 02:50:51,790
it's how the union find gets to boast in its\n
1383
02:50:51,790 --> 02:50:57,010
we get started, it's critical that you watch\n
1384
02:50:57,010 --> 02:51:04,350
the find and the union operations work. Otherwise,\n
1385
02:51:04,350 --> 02:51:10,149
what's up with path compression, and how we're\n
1386
02:51:11,779 --> 02:51:19,100
Alright, suppose we have this hypothetical\n
1387
02:51:19,100 --> 02:51:24,590
path compression, I'm almost certain it's\n
1388
02:51:24,590 --> 02:51:31,340
a structure that looks like this. Nonetheless,\n
1389
02:51:31,340 --> 02:51:42,449
to unify nodes, E and L. Or just unify groups,\n
1390
02:51:42,449 --> 02:51:48,500
and L. And that's what we're calling the unify\n
1391
02:51:48,500 --> 02:51:54,949
that start on E and L. And where we would\n
1392
02:51:54,949 --> 02:52:02,390
find the root node of L, and then get one\n
1393
02:52:02,389 --> 02:52:09,129
compression, we do that, but we're also going\n
1394
02:52:09,129 --> 02:52:18,849
the parent node of E. So E's parent is D,\n
1395
02:52:18,850 --> 02:52:24,988
F. So we found the root node of E. But with\n
1396
02:52:24,988 --> 02:52:30,689
to do. Now that we have a reference to the\n
1397
02:52:30,690 --> 02:52:38,899
the root node. And similarly DS are important\n
1398
02:52:38,898 --> 02:52:47,039
everything along the path got compressed,\n
1399
02:52:47,040 --> 02:52:55,500
so, at every time we do a lookup, on either\n
1400
02:52:55,500 --> 02:53:04,209
be able to find out what the parent or the\n
1401
02:53:04,209 --> 02:53:10,669
immediately point to it, we don't have to\n
1402
02:53:10,670 --> 02:53:17,040
can do this because in a union find, we're\n
1403
02:53:17,040 --> 02:53:22,979
them more and more compressed. We're never\n
1404
02:53:22,978 --> 02:53:27,898
we do the same thing for L, we find L's parent.\n
1405
02:53:27,898 --> 02:53:36,170
find the root. And then we compress the path.\n
1406
02:53:36,170 --> 02:53:45,020
to G. And so we compress that path. But we\n
1407
02:53:45,020 --> 02:53:52,430
point to the other. And we've unified both\n
1408
02:53:52,430 --> 02:53:58,360
E, and the one with L, have now been merged into\n
1409
02:53:58,360 --> 02:54:06,630
is we've compressed along the way as we've\n
1410
02:54:06,629 --> 02:54:13,769
Now let's have a look at another example.\n
1411
02:54:13,770 --> 02:54:20,430
the regular union find operations where we're\n
1412
02:54:20,430 --> 02:54:29,479
version, we now know. So if I run all those\n
1413
02:54:29,478 --> 02:54:38,028
So it's the beginning all these pairs of components,\n
1414
02:54:38,029 --> 02:54:51,710
right. And this is the final state of our\n
1415
02:54:51,709 --> 02:55:00,278
determine what groups, say, A and J are in, then\n
1416
02:55:00,279 --> 02:55:09,600
nodes. So J goes to I, I goes to H, H goes to E. But\n
1417
02:55:09,600 --> 02:55:18,680
what happens. So I still have all those components.\n
1418
02:55:18,680 --> 02:55:23,710
the right hand side, this is what happens.\n
1419
02:55:23,709 --> 02:55:32,329
of path compression, that j merged into the\n
1420
02:55:32,329 --> 02:55:39,110
And then I keep executing more instructions.\n
1421
02:55:39,110 --> 02:55:50,940
dynamically. So so I'm getting more and more\n
1422
02:55:50,940 --> 02:55:57,069
So on the last example, we haven't even finished\n
1423
02:55:57,068 --> 02:56:01,359
the final state. But with path compression,\n
1424
02:56:01,359 --> 02:56:08,159
our path, we get to compress the path along\n
1425
02:56:08,159 --> 02:56:13,299
now, we only have one root, being E, and
1426
02:56:13,299 --> 02:56:21,000
almost everything in constant time, points\n
1427
02:56:21,000 --> 02:56:26,029
And we know that the root is E. So we\n
1428
02:56:26,029 --> 02:56:32,800
becomes very stable eventually, because of\n
1429
02:56:32,799 --> 02:56:40,398
find with path compression is so efficient.\n
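To make the idea concrete, here is one way a find with path compression might be written. It is a hedged sketch: the class name PathCompressionSketch and the bare parent array are assumptions for illustration, and the two-pass loop (first locate the root, then repoint every node on the path at it) is one common way to do the compression.

```java
import java.util.Arrays;

final class PathCompressionSketch {
    // Returns the root of p's component and compresses the path leading to it.
    static int find(int[] parent, int p) {
        // First pass: follow parent links until reaching the root.
        int root = p;
        while (root != parent[root]) root = parent[root];

        // Second pass: make every node along the path point directly at the root.
        while (p != root) {
            int next = parent[p];
            parent[p] = root;
            p = next;
        }
        return root;
    }

    public static void main(String[] args) {
        // A worst-case chain: 4 -> 3 -> 2 -> 1 -> 0, where 0 is the root.
        int[] parent = {0, 0, 1, 2, 3};
        System.out.println(find(parent, 4));         // 0
        System.out.println(Arrays.toString(parent)); // [0, 0, 0, 0, 0]
    }
}
```

After the first lookup, every node that was on the path reaches the root in a single hop.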
1430
02:56:40,398 --> 02:56:49,318
source code. So here's the link to the source\n
1431
02:56:49,318 --> 02:56:55,459
github.com slash williamfiset slash data\n
1432
02:56:55,459 --> 02:57:00,858
structures from past videos. And before we\n
1433
02:57:00,859 --> 02:57:07,220
the other videos pertaining to the union find,\n
1434
02:57:07,219 --> 02:57:17,528
Okay, let's dig in. Here we are inside the\n
1435
02:57:17,529 --> 02:57:23,488
and see a few instance variables. So let's\n
1436
02:57:23,488 --> 02:57:30,478
many elements we have in our union find. There\n
1437
02:57:30,478 --> 02:57:38,159
one called size. So the interest, while the\n
1438
02:57:38,159 --> 02:57:47,859
array I talked about which at index i points\n
1439
02:57:47,859 --> 02:57:55,609
is equal to i, then we know that i is a\n
1440
02:57:55,609 --> 02:58:04,000
of all these like tree like structures right\n
1441
02:58:04,000 --> 02:58:10,790
because we create a bijection between our\n
1442
02:58:10,790 --> 02:58:18,110
able to access them through this ID array.\n
1443
02:58:18,110 --> 02:58:24,470
of components, that's sometimes some useful\n
1444
02:58:24,469 --> 02:58:28,898
a union find, well, you need to know how many\n
1445
02:58:28,898 --> 02:58:37,930
find. And I make sure that we have a positive\n
1446
02:58:37,930 --> 02:58:48,220
Now go ahead and initialize some instance\n
1447
02:58:48,219 --> 02:59:00,959
So initially, everyone is a root node, and\n
1448
02:59:00,959 --> 02:59:10,259
is pretty simple. It's given a a node, it\n
1449
02:59:10,260 --> 02:59:17,591
does path compression along the way. So if\n
1450
02:59:17,591 --> 02:59:22,760
of P, what we're going to do is we're going\n
1451
02:59:22,760 --> 02:59:29,659
loop. So we initialize a new variable called\n
1452
02:59:29,659 --> 02:59:37,029
is not equal to ID at root. So aka This is\n
1453
02:59:37,029 --> 02:59:42,989
so we can stop and the root is stored in the\n
1454
02:59:42,989 --> 02:59:49,898
is we do the path compression. This is what\n
1455
02:59:49,898 --> 02:59:59,219
back at p, we assign everything from id[p]\n
1456
02:59:59,219 --> 03:00:05,659
the path gives us that nice amortized time\n
1457
03:00:05,659 --> 03:00:12,939
But I don't like having the overhead and doing\n
1458
03:00:12,939 --> 03:00:25,950
Okay, so now, we have these simple methods,\n
1459
03:00:25,950 --> 03:00:35,409
same component, this will return true, because\n
1460
03:00:35,409 --> 03:00:43,010
this will return false. And just calling find\n
1461
03:00:43,010 --> 03:00:49,620
just checking if two components are connected,\n
1462
03:00:49,620 --> 03:00:56,560
the path, same thing here, if we decide to\n
1463
03:00:56,559 --> 03:01:05,818
index p, then when we index into the size\n
1464
03:01:05,818 --> 03:01:11,409
the root but at the same time, we'll also\n
1465
03:01:11,409 --> 03:01:20,238
And I would just like to note that the\n
1466
03:01:20,238 --> 03:01:24,010
the size because they're the ones that are\n
1467
03:01:24,010 --> 03:01:31,309
at the end of the chain. Size just returns\n
1468
03:01:31,309 --> 03:01:37,840
disjoint, set components number components\n
1469
03:01:37,840 --> 03:01:45,069
method is the last interesting method. So\n
1470
03:01:45,069 --> 03:01:54,909
together. So first, what we do is we find\n
1471
03:01:54,909 --> 03:02:01,159
node for Q is. And if the root nodes are equal,\n
1472
03:02:01,159 --> 03:02:08,549
we don't do anything. Otherwise, by convention,\n
1473
03:02:08,549 --> 03:02:19,259
group. Although I know some people like to\n
1474
03:02:19,260 --> 03:02:26,489
and then merge according to that, and that may\n
1475
03:02:26,489 --> 03:02:33,318
work. So I just like to merge the smaller\n
1476
03:02:33,318 --> 03:02:40,409
the roots are different, and we're merging,\n
1477
03:02:40,409 --> 03:02:47,000
must have decreased by one. So that's why\n
1478
03:02:47,000 --> 03:02:55,350
components, subtract that by one, because\n
1479
03:02:55,350 --> 03:03:03,859
So this whole time, inside this class, I've\n
1480
03:03:03,859 --> 03:03:12,130
as elements, like letters that I that we saw\n
1481
03:03:12,129 --> 03:03:18,709
bijection, I would do a lookup to find out\n
1482
03:03:18,709 --> 03:03:25,919
give me an integer, and what maps to the element\n
1483
03:03:25,920 --> 03:03:31,100
union find data structure created, and turn\n
1484
03:03:31,100 --> 03:03:40,840
of dealing with objects and having all this\n
1485
03:03:40,840 --> 03:03:48,079
an array based union find. You could also\n
1486
03:03:48,079 --> 03:03:53,760
objects. But this is really nice, and it's\n
1487
03:03:53,760 --> 03:04:01,110
I want to start talking about a very exciting\n
1488
03:04:01,110 --> 03:04:08,750
start to realize that there are tons and tons\n
1489
03:04:08,750 --> 03:04:16,399
I want to focus on a very popular kind of tree\n
1490
03:04:16,399 --> 03:04:23,648
trees, we must talk about binary search trees\n
1491
03:04:23,648 --> 03:04:31,010
on binary trees and binary search trees where\n
1492
03:04:31,010 --> 03:04:36,540
tutorials, we're going to cover how to insert\n
1493
03:04:36,540 --> 03:04:44,740
and also do some of the more popular tree\n
1494
03:04:44,739 --> 03:04:53,549
trees. Also, not just binary trees. Okay.\n
1495
03:04:53,549 --> 03:05:00,090
course on trees before we get started. So\n
1496
03:05:00,090 --> 03:05:05,210
can satisfy either of the following definitions,\n
1497
03:05:05,209 --> 03:05:15,619
these are the most popular ones. So trees\n
1498
03:05:15,620 --> 03:05:24,590
acyclic; acyclic means there are no cycles.\n
1499
03:05:24,590 --> 03:05:33,909
we have n minus one edges. And lastly, for\n
1500
03:05:33,909 --> 03:05:41,549
those two vertices, you can have two different\n
1501
03:05:41,549 --> 03:05:49,059
a tree because there's another route to get\n
1502
03:05:49,059 --> 03:05:58,488
Okay, and in the context of trees, we can have something\n
1503
03:05:58,488 --> 03:06:04,359
node of our tree, you can think of it that\n
1504
03:06:04,359 --> 03:06:10,140
and you don't have a root yet, it doesn't\n
1505
03:06:10,139 --> 03:06:19,409
because any node you pick can become the root\n
1506
03:06:19,409 --> 03:06:27,139
that node. And suddenly, it's the new root.\n
1507
03:06:27,139 --> 03:06:33,920
child and parent nodes. So child node is a\n
1508
03:06:33,920 --> 03:06:39,978
of it as going down, and a parent node\n
1509
03:06:39,978 --> 03:06:47,789
towards the root. So we have an interesting\n
1510
03:06:47,790 --> 03:06:56,689
of the root node? The answer is that the root\n
1511
03:06:56,689 --> 03:07:04,439
may be useful to say that the parent of the\n
1512
03:07:04,439 --> 03:07:11,739
when programming, for instance, a file system,\n
1513
03:07:11,739 --> 03:07:19,199
command line, I'm in some directory, so I\n
1514
03:07:19,200 --> 03:07:27,750
somewhere in the file system tree. And if\n
1515
03:07:27,750 --> 03:07:36,238
dot dot slash, and now I'm up in another directory,\n
1516
03:07:36,238 --> 03:07:43,898
doing this and going up and up in the file\n
1517
03:07:43,898 --> 03:07:50,670
directly to the root node, which is slash,\n
1518
03:07:50,670 --> 03:07:57,989
very top at the root of the directory, and\n
1519
03:07:57,989 --> 03:08:06,119
I am, I'm again at the root. So in this context,\n
1520
03:08:06,120 --> 03:08:13,710
is the root. Pretty cool. So just as an example,\n
1521
03:08:13,709 --> 03:08:20,579
three and two and a parent four, we also have\n
1522
03:08:20,579 --> 03:08:26,271
this a node which has no children, and these\n
1523
03:08:26,271 --> 03:08:32,360
just at the very bottom of your tree. Think\n
1524
03:08:32,360 --> 03:08:38,329
which is sub tree, this is the tree entirely\n
1525
03:08:38,329 --> 03:08:45,329
use triangles to denote sub trees. It's possible\n
1526
03:08:45,329 --> 03:08:56,969
node, so that's fine. So if this so tree with\n
1527
03:08:56,969 --> 03:09:02,459
particular sub tree and look what's inside\n
1528
03:09:02,459 --> 03:09:08,698
nodes and more sub trees. Then we pick another\n
1529
03:09:08,699 --> 03:09:14,870
we get another tree. And eventually, we're\n
1530
03:09:14,870 --> 03:09:22,120
the question, what is a binary tree? And this\n
1531
03:09:22,120 --> 03:09:30,760
node has at most, two children. So both those\n
1532
03:09:30,760 --> 03:09:38,829
have at most two children. You can see that\n
1533
03:09:38,829 --> 03:09:45,569
and that's fine, because the criteria is at\n
1534
03:09:45,569 --> 03:09:50,879
I'm going to give you some various structures,\n
1535
03:09:50,879 --> 03:10:02,379
it is a binary tree or not. So is this a binary\n
1536
03:10:02,379 --> 03:10:10,929
two children. How about this one? No, you\n
1537
03:10:10,930 --> 03:10:17,809
it's not a binary tree. How about this one?\n
1538
03:10:17,809 --> 03:10:25,698
Yes, this is a binary tree. It may be a degenerate\n
1539
03:10:25,699 --> 03:10:30,790
let's move on to binary search trees. So what\n
1540
03:10:30,790 --> 03:10:36,489
it's a binary tree. But Furthermore, it also\n
1541
03:10:36,488 --> 03:10:43,989
tree invariant. And that is that the left\n
1542
03:10:43,989 --> 03:10:50,389
value of the current node. And the right subtree\n
1543
03:10:50,389 --> 03:10:56,639
node. So below are a few binary search trees.\n
1544
03:10:56,639 --> 03:11:01,099
and I'm going to give you some trees, and\n
1545
03:11:01,100 --> 03:11:11,000
binary search trees or not. What about this\n
1546
03:11:11,000 --> 03:11:16,469
on whether you want to allow duplicate values\n
1547
03:11:16,469 --> 03:11:21,770
search tree operations allow for duplicate\n
1548
03:11:21,771 --> 03:11:27,659
most of the time, we're only interested in\n
1549
03:11:27,658 --> 03:11:36,129
particular tree depends on what your definition\n
1550
03:11:36,129 --> 03:11:44,969
tree? Yes, this is a binary search tree. How\n
1551
03:11:44,969 --> 03:11:54,719
the elements within the tree. Yes, this is\n
1552
03:11:54,719 --> 03:12:00,409
to only having numbers within our binary search\n
1553
03:12:00,409 --> 03:12:13,398
is comparable and can be ordered. How about\n
1554
03:12:13,398 --> 03:12:21,219
tree. And the reason is the the node nine\n
1555
03:12:21,219 --> 03:12:30,408
inserting nine, we would have to place it\n
1556
03:12:30,408 --> 03:12:41,810
is larger than eight so belongs in its right\n
1557
03:12:41,810 --> 03:12:47,799
isn't even a tree actually, because it contains\n
1558
03:12:47,799 --> 03:12:55,679
search tree is that you must be a tree. And\n
1559
03:12:55,680 --> 03:13:02,790
more time to look at this one. Because it's\n
1560
03:13:02,790 --> 03:13:11,680
think of as a binary search tree. And the\n
1561
03:13:11,680 --> 03:13:19,170
the binary search tree invariant that every\n
1562
03:13:19,170 --> 03:13:25,579
will you'll see that that is true. And also\n
1563
03:13:25,579 --> 03:13:32,799
It doesn't look like a tree, but it satisfies\n
1564
03:13:32,799 --> 03:13:38,349
tree. Okay, so we've been talking about binary\n
1565
03:13:38,350 --> 03:13:45,870
used? Why are they useful? So in particular,\n
1566
03:13:45,870 --> 03:13:53,930
of abstract data types for sets, and maps\n
1567
03:13:53,930 --> 03:13:59,439
balanced binary search trees, which we'll\n
1568
03:13:59,439 --> 03:14:05,738
see binary search, or sorry, binary trees\n
1569
03:14:05,738 --> 03:14:12,770
priority queues when we're making a binary\n
1570
03:14:12,770 --> 03:14:20,319
like syntax trees, so you're parsing an arithmetic\n
1571
03:14:20,318 --> 03:14:28,539
syntax tree. And then you can simplify expression.\n
1572
03:14:28,540 --> 03:14:34,790
expressions. So wherever you punch in your\n
1573
03:14:34,790 --> 03:14:41,949
evaluated. And lastly, I just threw in a treap,\n
1574
03:14:41,949 --> 03:14:47,590
structure. So now let's look at the complexity\n
1575
03:14:47,590 --> 03:14:54,540
looked very interesting and also very useful.\n
1576
03:14:54,540 --> 03:14:58,149
when you're just given some random data
1577
03:14:58,148 --> 03:15:04,939
the time complexity is going to be logarithmic,\n
1578
03:15:04,939 --> 03:15:11,120
nodes, deleting nodes, removing nodes searching\n
1579
03:15:11,120 --> 03:15:17,000
complexity is going to be logarithmic.\n
1580
03:15:17,000 --> 03:15:24,930
are very easy to implement. So this is really\n
1581
03:15:24,930 --> 03:15:34,590
tree degenerates to being a line, then\n
1582
03:15:34,590 --> 03:15:41,100
which is really bad. So there's some trade\n
1583
03:15:41,100 --> 03:15:47,100
search tree, that it's going to be easy to\n
1584
03:15:47,100 --> 03:15:52,210
this logarithmic behavior. But in the worst\n
1585
03:15:52,209 --> 03:16:00,778
stuff, which is not so good. Okay, let's have\n
1586
03:16:00,779 --> 03:16:10,180
a binary search tree. So let's dive right\n
1587
03:16:10,180 --> 03:16:15,500
search tree, we need to make sure that the\n
1588
03:16:15,500 --> 03:16:22,309
meaning that we can order them in some way\n
1589
03:16:22,309 --> 03:16:29,420
know whether we need to place the element\n
1590
03:16:29,420 --> 03:16:36,370
And we're going to encounter essentially four\n
1591
03:16:36,370 --> 03:16:43,630
to compare the value to the value of the current\n
1592
03:16:43,629 --> 03:16:51,000
things. Either, we're going to recurse down\n
1593
03:16:51,000 --> 03:16:55,000
than the current element, or we're going to\n
1594
03:16:55,000 --> 03:17:01,500
element is greater than the current element.\n
1595
03:17:01,500 --> 03:17:10,850
has the same value as the one we're considering.\n
1596
03:17:10,850 --> 03:17:18,829
if we're deciding to add duplicate values\n
1597
03:17:18,829 --> 03:17:22,850
we have the case that we've hit a null node,\n
1598
03:17:22,850 --> 03:17:29,470
and insert it in our tree. Let's look at some\n
1599
03:17:29,469 --> 03:17:35,969
of insert instructions. So we have all these\n
1600
03:17:35,969 --> 03:17:40,228
tree. And currently the search tree or the\n
1601
03:17:40,228 --> 03:17:50,059
want to insert seven. So seven becomes the\n
1602
03:17:50,059 --> 03:17:59,750
Next, we want to insert 20. So 20 is greater\n
1603
03:17:59,750 --> 03:18:04,840
we want to insert five. So we always start at\n
1604
03:18:04,840 --> 03:18:09,630
an important point. So you start at the root,\n
1605
03:18:09,629 --> 03:18:14,778
figure out where you want to insert the node.\n
1606
03:18:14,779 --> 03:18:22,370
oh, five is less than seven. So we're going\n
1607
03:18:22,370 --> 03:18:30,390
go to the right, because 15 is greater than\n
1608
03:18:30,389 --> 03:18:44,879
20 at 10. Now four, so four is less than seven,\n
1609
03:18:44,879 --> 03:18:51,500
create the new node. Now we have four again.\n
1610
03:18:51,500 --> 03:18:57,620
to the left and moved to the left. Now we've\n
1611
03:18:57,620 --> 03:19:04,040
our tree. So as I said before, if your tree\n
1612
03:19:04,040 --> 03:19:10,370
to add another node. And you would either\n
1613
03:19:10,370 --> 03:19:16,779
or on the right. Otherwise, you'd do nothing.\n
1614
03:19:16,779 --> 03:19:24,010
33. So start at the root, go to the right,\n
1615
03:19:24,010 --> 03:19:31,950
Now insert two, so two is smaller than everything\n
1616
03:19:31,950 --> 03:19:41,390
the left. Now try and see where 25 would go.\n
1617
03:19:41,389 --> 03:19:48,108
to go to the right again, because it's greater\n
1618
03:19:50,840 --> 03:19:59,949
And finally six, so once left, once right,\n
1619
03:19:59,949 --> 03:20:05,790
search tree. So on average, the insertion\n
1620
03:20:05,790 --> 03:20:12,449
worst case, this behavior could degrade to\n
1621
03:20:12,449 --> 03:20:21,229
if our instructions are the following insert\n
1622
03:20:21,228 --> 03:20:26,159
and insert two, so it's to the right. Okay, now\n
1623
03:20:26,159 --> 03:20:31,449
than everything. So I have to place the right\n
1624
03:20:31,449 --> 03:20:38,960
greater than everything. Oh, looks like we're\n
1625
03:20:38,959 --> 03:20:44,289
still greater than everything. So as you can\n
1626
03:20:44,290 --> 03:20:51,229
bad. And we don't want to create lines like\n
1627
03:20:51,228 --> 03:20:56,020
is in the tree, or if we want to remove five,\n
1628
03:20:56,021 --> 03:21:01,420
thing to find the node, that's going to take\n
1629
03:21:01,420 --> 03:21:07,329
one of the reasons why people have invented\n
1630
03:21:07,329 --> 03:21:13,680
or self balancing trees, which balance themselves\n
1631
03:21:13,680 --> 03:21:21,510
But that's it for insertion. It's really simple.\n
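The insertion cases described above (recurse left, recurse right, handle a duplicate, or create a node at a null position) can be sketched roughly as follows. This is a minimal illustration, not the source code from the video: the names `Node`, `insert` and `depth` are hypothetical, duplicates are simply ignored, and the ascending sequence used for the degenerate tree is an assumed example.

```python
class Node:
    """A binary search tree node (hypothetical helper for this sketch)."""

    def __init__(self, value):
        self.value = value
        self.left = None
        self.right = None


def insert(node, value):
    """Insert value into the subtree rooted at node and return its root."""
    if node is None:                      # hit a null node: create the new node here
        return Node(value)
    if value < node.value:                # smaller: recurse down the left subtree
        node.left = insert(node.left, value)
    elif value > node.value:              # larger: recurse down the right subtree
        node.right = insert(node.right, value)
    # equal: the value already exists, so this sketch does nothing
    return node


def depth(node):
    """Number of nodes on the longest path from node down to a leaf."""
    return 0 if node is None else 1 + max(depth(node.left), depth(node.right))


# A mixed insertion order keeps the tree fairly shallow.
balanced = None
for v in [7, 20, 5, 15, 10, 4, 4, 33, 2, 25, 6]:
    balanced = insert(balanced, v)

# Ascending values always go right, so the tree degenerates into a line.
line = None
for v in [1, 2, 3, 4, 5, 6]:
    line = insert(line, v)

print(depth(balanced), depth(line))
```

The second tree is as deep as it has nodes, which is the linear worst case that self-balancing trees are meant to avoid.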
1632
03:21:21,510 --> 03:21:26,809
we know how to insert elements into a binary\n
1633
03:21:26,809 --> 03:21:32,250
elements from a binary search tree. And this\n
1634
03:21:32,250 --> 03:21:38,750
to make it very simple for you guys. So when\n
1635
03:21:38,750 --> 03:21:44,889
you can think of it as a two step process.\n
1636
03:21:44,889 --> 03:21:51,559
to remove within the binary search tree, if\n
1637
03:21:51,559 --> 03:21:58,859
we want to replace the node we're removing\n
1638
03:21:58,860 --> 03:22:04,810
to maintain the binary search tree invariant,\n
1639
03:22:04,809 --> 03:22:11,978
invariant is it's that the left subtree has\n
1640
03:22:11,978 --> 03:22:18,459
the right subtree has larger elements than\n
1641
03:22:18,459 --> 03:22:26,629
phase one the find phase. So if we're searching\n
1642
03:22:26,629 --> 03:22:32,879
one of four things is going to happen. The\n
1643
03:22:32,879 --> 03:22:38,579
we've gone all the way down our binary search\n
1644
03:22:38,579 --> 03:22:44,920
So the value does not exist inside our binary\n
1645
03:22:44,920 --> 03:22:54,500
is that the comparator value is equal to zero.\n
1646
03:22:54,500 --> 03:23:01,430
function that will return minus one if it's\n
1647
03:23:01,430 --> 03:23:07,139
it's greater than the current value. So it\n
1648
03:23:07,139 --> 03:23:12,579
to go down, or if we found value that we're\n
1649
03:23:12,579 --> 03:23:19,350
we found the value. If it's less than zero,\n
1650
03:23:19,350 --> 03:23:26,770
to be in the left subtree. If the comparator returns\n
1651
03:23:26,770 --> 03:23:33,819
exists, it's going to be the right subtree.\n
1652
03:23:33,818 --> 03:23:42,859
So suppose we have four or five queries, find\n
1653
03:23:42,859 --> 03:23:49,329
tree there on the right. So if we're trying\n
1654
03:23:49,329 --> 03:23:59,000
the root and 14 is less than so go left. 14\n
1655
03:23:59,000 --> 03:24:07,520
than 15. Go left. 14 is greater than 12. So\n
1656
03:24:07,520 --> 03:24:14,750
value that we are looking for. Alright, and\n
1657
03:24:14,750 --> 03:24:25,450
25 is less than 31. And now we found 25.\n
1658
03:24:27,090 --> 03:24:37,270
Okay, here's her go, go right, go right, your\n
1659
03:24:37,270 --> 03:24:43,940
look at 17. So 17 should be on the left. Now\n
1660
03:24:43,940 --> 03:24:53,630
again. And, oh, we've hit a point where we\n
1661
03:24:53,629 --> 03:25:00,379
that's another possibility the value simply\n
1662
03:25:00,379 --> 03:25:07,608
Left of 19, and the left of 19 is a null node.\n
1663
03:25:07,609 --> 03:25:13,739
value we're looking for. So now that we found\n
1664
03:25:13,739 --> 03:25:18,969
the Remove phase. And in the Remove phase,\n
1665
03:25:18,969 --> 03:25:24,608
first case is that the node we want to remove\n
1666
03:25:24,609 --> 03:25:32,790
two, and three, as I like to call them, is\n
1667
03:25:32,790 --> 03:25:38,930
right subtree, but no left subtree, or those\n
1668
03:25:38,930 --> 03:25:44,270
finally case four is that we have both a left\n
1669
03:25:44,270 --> 03:25:51,600
you guys how to handle each of these cases,\n
1670
03:25:51,600 --> 03:25:57,430
one, we have a leaf node. So if you have a\n
1671
03:25:57,430 --> 03:26:03,699
you can do so without side effect, which is\n
1672
03:26:03,699 --> 03:26:11,979
search tree on the right, and we want to remove\n
1673
03:26:11,978 --> 03:26:20,760
eight is. Oh, and it is a case one, because\n
1674
03:26:20,760 --> 03:26:28,719
it without side effect. So we remove it. Perfect.\n
1675
03:26:28,719 --> 03:26:38,019
two and three. Meaning that either the left\n
1676
03:26:38,020 --> 03:26:45,449
the successor of the node we're trying to\n
1677
03:26:45,449 --> 03:26:54,250
left or right subtree. Let's look at an example.\n
1678
03:26:54,250 --> 03:27:04,870
let's find nine. Okay, we found nine. Now\n
1679
03:27:04,870 --> 03:27:12,050
is nine doesn't have a right subtree. So the\n
1680
03:27:12,049 --> 03:27:20,179
that left subtree. So seven. So now I can\n
1681
03:27:20,180 --> 03:27:29,180
of nine. Perfect. Now let's do another example\n
1682
03:27:29,180 --> 03:27:39,568
find four. So we find four. And this is our\n
1683
03:27:39,568 --> 03:27:46,609
But no right subtree. So what do we do?\n
1684
03:27:46,609 --> 03:27:52,800
node of that left subtree. So three, so we\n
1685
03:27:52,799 --> 03:28:01,369
successor. Alright, that wasn't so bad, was\n
1686
03:28:01,370 --> 03:28:09,819
want to remove a node which has both a left\n
1687
03:28:09,818 --> 03:28:16,260
is, in which subtree will the successor of\n
1688
03:28:16,260 --> 03:28:26,829
And the answer is both. The successor can either\n
1689
03:28:26,829 --> 03:28:34,469
the smallest value in the right subtree and\n
1690
03:28:34,469 --> 03:28:42,809
there can be two successors. So the largest\n
1691
03:28:42,809 --> 03:28:48,478
would satisfy the binary search tree invariant,\n
1692
03:28:48,478 --> 03:28:54,799
be larger than everything in the left subtree\n
1693
03:28:54,799 --> 03:29:00,369
left subtree and also it would be smaller\n
1694
03:29:00,370 --> 03:29:08,010
we had found it in the left subtree. Similarly,\n
1695
03:29:08,010 --> 03:29:12,510
subtree, it would also satisfy the binary search tree\n
1696
03:29:12,510 --> 03:29:19,000
be smaller than everything in the right subtree\n
1697
03:29:19,000 --> 03:29:24,680
than the right subtree and also larger than\n
1698
03:29:24,680 --> 03:29:31,540
was found in the right subtree and we know\n
1699
03:29:31,540 --> 03:29:38,439
than everything in the left subtree. So we\n
1700
03:29:38,439 --> 03:29:47,380
be two possible successors. So we can choose\n
1701
03:29:47,379 --> 03:29:55,389
and we want to remove seven, well seven is\n
1702
03:29:55,389 --> 03:30:05,269
and right subtree also seven, so find seven\n
1703
03:30:05,270 --> 03:30:15,770
the successor in our left subtree or our right\n
1704
03:30:15,770 --> 03:30:26,550
in the right subtree what we do is we go into\n
1705
03:30:26,549 --> 03:30:35,478
as possible, go left and go left again. And\n
1706
03:30:35,478 --> 03:30:44,829
stop. And this node is going to be the successor,\n
1707
03:30:44,829 --> 03:30:50,590
And you can see that quite clearly 11 is smaller\n
1708
03:30:50,590 --> 03:30:58,969
now what we want to do is, we want to copy\n
1709
03:30:58,969 --> 03:31:05,238
subtree 11 into the node we want to originally\n
1710
03:31:05,238 --> 03:31:12,689
seven with 11. Now we have a problem, which\n
1711
03:31:12,689 --> 03:31:21,199
want to now remove that element 11 which is\n
1712
03:31:21,199 --> 03:31:28,030
should no longer be inside the tree. And\n
1713
03:31:28,030 --> 03:31:33,399
to remove is always going to be either a case\n
1714
03:31:33,398 --> 03:31:40,478
it would be the case where there's a right\n
1715
03:31:40,478 --> 03:31:47,658
just do this recursively. So just call\n
1716
03:31:47,658 --> 03:31:57,000
cases. Okay, so it's right subtree. So I want\n
1717
03:31:57,000 --> 03:32:06,648
right subtree and then remove it. So remove\n
1718
03:32:06,648 --> 03:32:14,219
to rebalance the tree like that. Alright,\n
1719
03:32:14,219 --> 03:32:22,158
this example, let's remove 14. So first, let's\n
1720
03:32:22,158 --> 03:32:30,279
go right. Alright, we found 14. Now we either\n
1721
03:32:30,279 --> 03:32:35,640
subtree like we did last time, or the largest\n
1722
03:32:35,639 --> 03:32:42,898
latter this time and find the largest value\n
1723
03:32:42,898 --> 03:32:49,369
is we would go into the left subtree and dig\n
1724
03:32:49,370 --> 03:33:01,760
the left subtree and dig as far right: 9, 13.\n
1725
03:33:01,760 --> 03:33:12,219
And now what we want to do as before is we\n
1726
03:33:12,219 --> 03:33:19,899
into the node we want to remove which is 14.\n
1727
03:33:19,899 --> 03:33:25,770
the remaining 13. So now just remove it and\n
1728
03:33:25,770 --> 03:33:33,060
I just want to go over some more examples.\n
1729
03:33:33,059 --> 03:33:39,260
removing is not quite so obvious. Alright,\n
1730
03:33:39,260 --> 03:33:46,370
see if we can remove 18 from this strange\n
1731
03:33:46,370 --> 03:33:54,880
root, find 18, so dig all the way down.\n
1732
03:33:54,879 --> 03:34:03,059
be it's one of those case two or threes. It's\n
1733
03:34:03,059 --> 03:34:10,299
successor is just going to be the root node\n
1734
03:34:10,299 --> 03:34:17,579
replace it with 17. So 17 is the new successor. Perfect.\n
1735
03:34:17,579 --> 03:34:24,709
want to remove minus two. So now first find\n
1736
03:34:24,709 --> 03:34:31,789
it found it. Now there's two subtrees pick\n
1737
03:34:31,790 --> 03:34:38,470
and we're going to go to the right. Alright,\n
1738
03:34:38,469 --> 03:34:48,559
one. And that's it. Okay, and that is removals\n
1739
03:34:48,559 --> 03:34:55,699
off binary trees and binary search trees with\n
1740
03:34:55,700 --> 03:35:04,069
in order post order and level order. You see\n
1741
03:35:04,068 --> 03:35:11,079
they're good to know. I want to focus on pre\n
1742
03:35:11,079 --> 03:35:20,579
because they're very similar. They're also\n
1743
03:35:20,579 --> 03:35:30,408
sort of get a feel for why they have their\n
1744
03:35:30,408 --> 03:35:38,609
before the two recursive calls, in order will\n
1745
03:35:38,609 --> 03:35:47,689
will print after the recursive calls. So if\n
1746
03:35:47,689 --> 03:35:54,340
the only thing that's different between them\n
1747
03:35:54,340 --> 03:36:01,049
move on to some detail on how preorder works.\n
1748
03:36:01,049 --> 03:36:07,479
stack of what gets called. So when we're recursing\n
1749
03:36:07,479 --> 03:36:14,909
node to go to. And what you need to know about\n
1750
03:36:14,909 --> 03:36:21,440
current node and then we traverse the left\n
1751
03:36:21,440 --> 03:36:30,101
for order where we're going to do is insert\n
1752
03:36:30,101 --> 03:36:41,800
we go down to D, go down to H. And now we\n
1753
03:36:41,799 --> 03:36:51,849
call stack and go to D and then we go to I\n
1754
03:36:51,850 --> 03:37:00,180
so we recurse back up, so we push I off the\n
1755
03:37:01,180 --> 03:37:10,670
go back to B. We've also already processed B, but now\n
1756
03:37:10,670 --> 03:37:20,360
and explore E. Now we've explored E. So\n
1757
03:37:20,360 --> 03:37:30,540
off the stack. And now a now we need to explore\n
1758
03:37:30,540 --> 03:37:42,830
C, then F, then J, and then right at the bottom,\n
1759
03:37:42,829 --> 03:37:51,049
are recursive push node k off the stack, push\n
1760
03:37:51,049 --> 03:38:03,329
C's right subtree. So G, now L, and now we're\n
1761
03:38:03,329 --> 03:38:11,659
would exit our function. And at the bottom,\n
1762
03:38:11,659 --> 03:38:20,020
Okay, now let's cover inorder traversal. So,\n
1763
03:38:20,020 --> 03:38:29,050
the left subtree, then we print the value.\n
1764
03:38:29,049 --> 03:38:34,728
for this example, I'm going to be using a\n
1765
03:38:34,728 --> 03:38:43,670
binary tree. And you'll see something interesting\n
1766
03:38:43,670 --> 03:38:51,500
it's our root. Then we go left, then we go\n
1767
03:38:51,500 --> 03:38:58,779
I was going left, I would push those on to\n
1768
03:38:58,779 --> 03:39:06,949
because when I call in order, the very first\n
1769
03:39:06,949 --> 03:39:15,140
I only print once I've traversed the entire\n
1770
03:39:15,139 --> 03:39:22,420
now one is a leaf node, then I've already\n
1771
03:39:22,420 --> 03:39:30,579
I can print the current value. Then I recurse\n
1772
03:39:30,579 --> 03:39:39,978
three's left subtree. Now we go right, and\n
1773
03:39:39,978 --> 03:39:51,010
recurse. Now I can print six because I've\n
1774
03:39:51,010 --> 03:40:01,318
then recurse then print 11. Now we need to\n
1775
03:40:01,318 --> 03:40:12,840
left, go left, now explore 12, recurse, and we're\n
1776
03:40:12,840 --> 03:40:22,719
14. Also, because 14 has no sub trees up.\n
1777
03:40:22,719 --> 03:40:34,159
stack, print 15, because we've explored 15's\n
1778
03:40:34,159 --> 03:40:42,350
last thing we need to do is finish our function\n
1779
03:40:42,350 --> 03:40:49,250
So go back up. And now, did you notice something\n
1780
03:40:49,250 --> 03:41:00,119
traversal? Well, what happened was we printed\n
1781
03:41:00,119 --> 03:41:05,819
which is why it's called an inorder traversal.\n
1782
03:41:05,818 --> 03:41:11,818
the values in increasing order, which is really\n
1783
03:41:11,818 --> 03:41:19,709
inorder traversal. Now, let's look at the\n
1784
03:41:19,709 --> 03:41:24,459
says, okay, traverse the left subtree, then\n
1785
03:41:24,459 --> 03:41:31,599
done doing both of those, only then print\n
1786
03:41:31,600 --> 03:41:38,550
look at our tree right now, the last value\n
1787
03:41:38,550 --> 03:41:46,350
we need to process 11's entire left subtree\n
1788
03:41:46,350 --> 03:41:54,988
So let's start at 11 and explore its left\n
1789
03:41:54,988 --> 03:42:01,158
one, because we've explored both its left\n
1790
03:42:01,158 --> 03:42:06,238
we haven't explored its right subtree yet,\n
1791
03:42:06,238 --> 03:42:12,448
trees which don't exist. Now we can print\n
1792
03:42:12,449 --> 03:42:20,739
trees. And then similarly, they go down to\n
1793
03:42:20,738 --> 03:42:30,199
don't print 11, because we still need to do\n
1794
03:42:30,200 --> 03:42:38,590
Go up to 13, print 14, go back up to 13. And\n
1795
03:42:38,590 --> 03:42:47,510
we haven't explored all of its right subtree\n
1796
03:42:47,510 --> 03:42:54,840
the stack and print on the way back up. And\n
1797
03:42:54,840 --> 03:43:02,170
node we have visited. And that's pre order\n
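The three depth-first traversals differ only in where the "print" happens relative to the two recursive calls, which a short sketch can make concrete. The `Node` class, the function names and the example tree here are assumptions made for illustration; each function collects values into a list instead of printing them.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def pre_order(node, out):
    """Record the node before the two recursive calls."""
    if node is not None:
        out.append(node.value)
        pre_order(node.left, out)
        pre_order(node.right, out)
    return out


def in_order(node, out):
    """Record the node between the two recursive calls."""
    if node is not None:
        in_order(node.left, out)
        out.append(node.value)
        in_order(node.right, out)
    return out


def post_order(node, out):
    """Record the node after the two recursive calls."""
    if node is not None:
        post_order(node.left, out)
        post_order(node.right, out)
        out.append(node.value)
    return out


# A small binary search tree rooted at 11 (an assumed example).
tree = Node(11,
            Node(6, Node(3, Node(1), Node(5)), Node(8)),
            Node(15, Node(13, Node(12), Node(14)), Node(17, None, Node(19))))

print(pre_order(tree, []))
print(in_order(tree, []))    # increasing order, because the tree is a BST
print(post_order(tree, []))
```

On a binary search tree the in-order result comes out sorted, and the root is always first in pre-order and last in post-order.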
1798
03:43:02,170 --> 03:43:07,920
to look at level order traversal, which is\n
1799
03:43:07,920 --> 03:43:15,699
other two, a level order traversal is we want\n
1800
03:43:15,699 --> 03:43:24,930
start with 11. Then we want to print six and\n
1801
03:43:24,930 --> 03:43:35,158
And you're like oh, how am I going to do that.\n
1802
03:43:35,158 --> 03:43:40,939
is by doing something called a breadth first\n
1803
03:43:40,939 --> 03:43:46,420
to the leaf node. So if you know what a breadth\n
1804
03:43:46,420 --> 03:43:53,658
the same thing, a tree is a type of graph,\n
1805
03:43:53,658 --> 03:43:58,579
do to do our breadth first search is we're\n
1806
03:43:58,579 --> 03:44:05,430
left to explore. And how it's going to work\n
1807
03:44:05,430 --> 03:44:13,639
only the root node. And we're going to keep\n
1808
03:44:13,639 --> 03:44:24,199
in our queue until our queue is empty. A bit\n
1809
03:44:24,200 --> 03:44:32,210
queue. On the right, I've inserted node 11.\n
1810
03:44:32,209 --> 03:44:37,719
add 11's left child and right child to\n
1811
03:44:37,719 --> 03:44:44,988
the queue. And I've also removed 11. So\n
1812
03:44:44,988 --> 03:44:55,368
by 15. And then I would keep adding children\n
1813
03:44:55,369 --> 03:45:04,040
So let's have a look. So I've pulled 11 from\n
1814
03:45:04,040 --> 03:45:12,279
Now the next thing on the top of the queue\n
1815
03:45:12,279 --> 03:45:22,359
three and eight to the queue. Then 15. Next up,\n
1816
03:45:22,359 --> 03:45:31,380
queue, and next up in the queue is three. So\n
1817
03:45:31,379 --> 03:45:39,148
and then move on, explore eight, eight has\n
1818
03:45:39,148 --> 03:45:47,358
As you can see that as I'm exploring nodes,\n
1819
03:45:47,359 --> 03:45:55,350
most recent thing in the queue. And this gives\n
1820
03:45:55,350 --> 03:46:01,500
And this is how you do a breadth first search\n
1821
03:46:01,500 --> 03:46:09,430
we had to use that special trick of using\n
1822
03:46:09,430 --> 03:46:17,550
do level order traversals recursively, we\n
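The queue-based breadth first search described above can be sketched like this. It is a minimal illustration: `Node`, `level_order` and the example tree are assumed names and data, and Python's `collections.deque` stands in for the queue.

```python
from collections import deque


class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def level_order(root):
    """Return the values one layer at a time, from left to right."""
    result = []
    if root is None:
        return result
    queue = deque([root])            # the queue starts with only the root node
    while queue:                     # keep going until the queue is empty
        node = queue.popleft()       # pull the node at the front of the queue
        result.append(node.value)
        if node.left is not None:    # add the children to the back of the queue
            queue.append(node.left)
        if node.right is not None:
            queue.append(node.right)
    return result


tree = Node(11,
            Node(6, Node(3, Node(1), Node(5)), Node(8)),
            Node(15, Node(13, Node(12), Node(14)), Node(17, None, Node(19))))

print(level_order(tree))
```

Because children are always appended behind nodes that are already waiting, every node of one layer is processed before any node of the next.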
1823
03:46:17,549 --> 03:46:22,679
Okay, finally time to look at some source\n
1824
03:46:23,680 --> 03:46:29,568
code I'm about to show you can be found at\n
1825
03:46:29,568 --> 03:46:37,059
in the description at the bottom of this video,\n
1826
03:46:37,059 --> 03:46:45,859
can also find it more easily. And now let's\n
1827
03:46:45,859 --> 03:46:55,939
source code for the binary search tree. The\n
1828
03:46:55,939 --> 03:47:05,648
thing you will notice is I have a class representing\n
1829
03:47:05,648 --> 03:47:12,948
anything that is comparable, we need things\n
1830
03:47:12,949 --> 03:47:21,869
them accordingly within the binary search\n
1831
03:47:21,869 --> 03:47:28,300
variables, actually only two, in fact, one\n
1832
03:47:28,299 --> 03:47:35,219
binary search tree, and another one, which\n
1833
03:47:35,219 --> 03:47:48,129
this binary search tree is a rooted tree.\n
1834
03:47:48,129 --> 03:47:56,429
a left node and a right node, as well as some\n
1835
03:47:56,430 --> 03:48:07,380
here. So it's some comparable type T. Okay,\n
1836
03:48:07,379 --> 03:48:13,589
search tree is empty. It simply checks if\n
1837
03:48:13,590 --> 03:48:21,100
the node count, which gets either incremented\n
1838
03:48:21,100 --> 03:48:30,180
Okay, here's the public method to add elements\n
1839
03:48:30,180 --> 03:48:37,568
private method down here, as you will notice.\n
1840
03:48:37,568 --> 03:48:45,868
business and the public method to just check\n
1841
03:48:45,869 --> 03:48:54,869
the binary search tree. This insertion method\n
1842
03:48:54,869 --> 03:48:59,760
a new element into the binary search tree\n
1843
03:48:59,760 --> 03:49:05,949
something inside the binary search tree. So\n
1844
03:49:05,949 --> 03:49:14,050
search tree, okay, so supposing this branch\n
1845
03:49:14,049 --> 03:49:19,799
does not already exist in the tree, then we're\n
1846
03:49:19,799 --> 03:49:27,269
add this new element to the binary search\n
1847
03:49:27,270 --> 03:49:34,460
by one and return true because well, this\n
1848
03:49:34,459 --> 03:49:46,459
at the recursive method now. So our base case\n
1849
03:49:46,459 --> 03:49:56,769
insert our element at so we will create a\n
1850
03:49:56,770 --> 03:50:05,699
with the value of the element we want to insert\n
1851
03:50:05,699 --> 03:50:14,390
which sub tree we want to place our element\n
1852
03:50:14,389 --> 03:50:22,680
subtree of the first branch, or the right\n
1853
03:50:22,680 --> 03:50:30,648
look at removing. So here's the public method\n
1854
03:50:30,648 --> 03:50:38,059
the method if it exists within the tree. So\n
1855
03:50:38,059 --> 03:50:43,510
it is going to return false, meaning we have\n
1856
03:50:43,510 --> 03:50:51,228
string. And if it is contained, I'm also going\n
1857
03:50:51,228 --> 03:51:04,799
at this recursive method to remove the node.\n
1858
03:51:04,799 --> 03:51:14,090
base case, if it's null, return null. And in the\n
1859
03:51:14,090 --> 03:51:20,329
to find it. And we know it exists. Because\n
1860
03:51:20,329 --> 03:51:28,738
within the tree so we can remove it. And that\n
1861
03:51:28,738 --> 03:51:36,489
Phase I was talking about in the later video.\n
1862
03:51:36,489 --> 03:51:44,430
if it's less than, so we're going in the left\n
1863
03:51:44,430 --> 03:51:51,908
It's going to be one of these two cases, otherwise,\n
1864
03:51:51,908 --> 03:52:01,219
finds the node. And here's where we do the\n
1865
03:52:01,219 --> 03:52:08,929
in my slides. But in fact, you can think of\n
1866
03:52:08,930 --> 03:52:16,420
because two of the cases are very similar. So\n
1867
03:52:16,420 --> 03:52:24,978
node, it's really that that can also be thought\n
1868
03:52:24,978 --> 03:52:33,750
case is the case where the left subtree is\n
1869
03:52:33,750 --> 03:52:43,430
case, the right subtree is null, but the left\n
1870
03:52:43,430 --> 03:52:52,488
I'll get to later, we have both subtrees.\n
1871
03:52:52,488 --> 03:53:01,639
have a right subtree, then I'm going to say\n
1872
03:53:01,639 --> 03:53:14,028
the root node of that right subtree. So node\n
1873
03:53:14,029 --> 03:53:23,810
destroy the data within this node and the\n
1874
03:53:23,809 --> 03:53:32,988
where I only have a left subtree. What I'm\n
1875
03:53:32,988 --> 03:53:40,520
left subtree and grab the root node. And I'm\n
1876
03:53:40,520 --> 03:53:46,630
to destroy this node because we know we don't\n
1877
03:53:46,629 --> 03:53:52,988
easy. Now let's look at the key. So we have\n
1878
03:53:52,988 --> 03:54:03,020
subtree. So as I mentioned in my slides, we\n
1879
03:54:03,020 --> 03:54:13,250
subtree, or the smallest node in the right\n
1880
03:54:13,250 --> 03:54:24,100
in the right subtree. So I go down to the\n
1881
03:54:24,100 --> 03:54:32,020
the node or the successor node if you will.\n
1882
03:54:32,020 --> 03:54:39,960
and call ourselves to remove the successor\n
1883
03:54:39,959 --> 03:54:46,019
in the left subtree then you can just uncomment\n
1884
03:54:46,020 --> 03:54:53,010
removing in a nutshell. And I also have these\n
1885
03:54:53,010 --> 03:55:00,579
dig right. Moving on, I also have this method\nthat checks
1886
03:55:00,579 --> 03:55:08,699
contains an element. So given an element return\n
1887
03:55:08,699 --> 03:55:15,510
is within this binary search tree. And this is\n
1888
03:55:15,510 --> 03:55:22,059
phase, if we reach a null node, we would definitely\n
1889
03:55:22,059 --> 03:55:27,760
get our comparative value, which is either\n
1890
03:55:27,760 --> 03:55:35,449
the left subtree, meaning this case, or greater\n
1891
03:55:35,449 --> 03:55:42,630
Or if we found the element, then that's zero\n
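The find logic just described (null node, comparator less than zero, greater than zero, or equal to zero) might look like the following sketch. The names `Node`, `compare` and `contains` and the example tree are hypothetical, and `compare` mimics a comparator that returns minus one, zero or one.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def compare(a, b):
    """Return -1 if a < b, 0 if a == b, and 1 if a > b."""
    return (a > b) - (a < b)


def contains(node, value):
    """Return True if value is somewhere in the subtree rooted at node."""
    if node is None:                         # reached a null node: not found
        return False
    cmp = compare(value, node.value)
    if cmp < 0:                              # value is smaller: go left
        return contains(node.left, value)
    if cmp > 0:                              # value is larger: go right
        return contains(node.right, value)
    return True                              # comparator is zero: found it


# An assumed example tree that satisfies the binary search tree invariant.
root = Node(20,
            Node(10, Node(5), Node(15, Node(12, None, Node(14)), Node(18, None, Node(19)))),
            Node(31, Node(25), Node(37)))

print(contains(root, 14), contains(root, 25), contains(root, 17))
```

The search for 17 runs off the tree at the null left child of 18, which is how the sketch reports that a value does not exist.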
1892
03:55:42,629 --> 03:55:49,639
Just as a bonus, I also threw in a height\n
1893
03:55:49,639 --> 03:55:58,299
of the tree, it will do so in linear time,\n
1894
03:55:58,299 --> 03:56:09,159
method. And all this does is it's fairly simple.\n
1895
03:56:09,159 --> 03:56:16,818
return zero. Otherwise, we're going to return\n
1896
03:56:16,818 --> 03:56:22,549
the right subtree. Because one of our sub\n
1897
03:56:22,549 --> 03:56:28,129
that's going to be the one that we want the\n
1898
03:56:28,129 --> 03:56:36,609
add plus one. So this corresponds to a depth.\n
1899
03:56:36,610 --> 03:56:47,930
the height of the tree is, you want to go\n
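The recursive height computation can be sketched in a few lines. This is an assumed version: `Node` and `height` are illustrative names, and a null node is given height zero, so a tree with a single node has height one.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def height(node):
    """Visit every node once, so the running time is linear in the tree size."""
    if node is None:                 # base case: an empty subtree has height zero
        return 0
    # The taller of the two subtrees decides the height; add one for this node.
    return max(height(node.left), height(node.right)) + 1


tree = Node(11,
            Node(6, Node(3, Node(1), Node(5)), Node(8)),
            Node(15))

print(height(tree))
```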
1900
03:56:47,930 --> 03:56:59,109
created this method called traverse. And what\n
1901
03:56:59,109 --> 03:57:06,829
which is an enum type, I'm going to show\n
1902
03:57:06,829 --> 03:57:13,398
then I pick whichever order you give me and\n
1903
03:57:13,398 --> 03:57:19,858
I want to traverse. So if you tell me I want\n
1904
03:57:19,859 --> 03:57:28,630
order fashion that I'm going to return you\n
1905
03:57:28,629 --> 03:57:36,420
want to traverse the tree in order for it\n
1906
03:57:36,420 --> 03:57:47,879
Let's have a look at what this tree traversal\n
1907
03:57:47,879 --> 03:57:56,279
So that is simply an enum type you can see\n
1908
03:57:56,280 --> 03:58:06,390
things, it's pre order in order post order.\n
1909
03:58:06,389 --> 03:58:14,648
these traversals iteratively. So that you\n
1910
03:58:14,648 --> 03:58:20,340
be slightly slower, and perhaps less convenient,\n
1911
03:58:20,340 --> 03:58:26,648
want to dive into the code because it is fairly\n
1912
03:58:26,648 --> 03:58:33,109
traversal, then it does look pretty gross,\n
1913
03:58:33,110 --> 03:58:39,569
have to check for concurrent modification\n
1914
03:58:39,568 --> 03:58:45,949
great interview questions like how do I do\n
1915
03:58:45,950 --> 03:58:53,020
I do a post order traversal, iteratively,\n
1916
03:58:53,020 --> 03:58:59,390
great to practice. If you want to actually\n
1917
03:58:59,389 --> 03:59:09,469
go on the GitHub repository and have a look.\n
1918
03:59:09,469 --> 03:59:16,108
in the last slides. So you should be good\n
1919
03:59:16,109 --> 03:59:22,510
Anyways, I just want to be a bit more fancy\n
1920
03:59:22,510 --> 03:59:24,639
it for the binary search tree.
1921
03:59:24,639 --> 03:59:30,689
Today we're going to be talking about hash\n
1922
03:59:30,689 --> 03:59:38,550
of all times, if I had a subtitle, it would\n
1923
03:59:38,549 --> 03:59:43,519
let's get started. There's going to be a lot\n
1924
03:59:43,520 --> 03:59:48,729
We're gonna start off with what a hash table\n
1925
03:59:48,728 --> 03:59:55,760
why do we need one. Then we're going to talk\n
1926
03:59:55,760 --> 04:00:00,719
In particular, separate chaining, open addressing\n
1927
04:00:00,719 --> 04:00:06,448
there's a ton more, there, we're going to\n
1928
04:00:06,449 --> 04:00:12,220
is a really popular implementation. And there's\n
1929
04:00:12,219 --> 04:00:16,679
covered. So we're going to be talking about\n
1930
04:00:16,680 --> 04:00:21,300
that's done. I'm not even giving lots and\n
1931
04:00:21,299 --> 04:00:28,250
not super obvious how they work. And to finish\n
1932
04:00:28,250 --> 04:00:34,399
finally removing elements from the open addressing\n
1933
04:00:34,399 --> 04:00:39,148
All right. So to begin with, what is a hash\ntable.
1934
04:00:40,369 --> 04:00:47,270
a data structure that lets us construct a\n
1935
04:00:47,270 --> 04:00:56,590
a technique called hashing, which we'll talk\n
1936
04:00:56,590 --> 04:01:06,119
as long as it's unique, which maps to a value.\n
1937
04:01:06,119 --> 04:01:12,529
names, and the value could be someone's favorite\n
1938
04:01:12,529 --> 04:01:19,800
Micah's is purple, Katherine's is yellow,\n
1939
04:01:19,799 --> 04:01:25,438
so each key is associated with the value.\n
1940
04:01:25,439 --> 04:01:31,350
don't have to be unique. For example, Micah's\n
1941
04:01:31,350 --> 04:01:41,239
favorite color is purple. We often use hash\n
1942
04:01:41,239 --> 04:01:48,520
Below is a frequency table of the number of\n
1943
04:01:48,520 --> 04:01:56,850
Hamlet, which I parsed to obtain this table.\n
1944
04:01:56,850 --> 04:02:04,109
word, Lord 223. But the word cabbage did not\n
1945
04:02:04,109 --> 04:02:13,090
track frequencies, which is really handy.\n
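Tracking frequencies with a hash table can be illustrated with a short sketch. Python's built-in `dict` plays the role of the hash table here; the function name `word_frequencies` and the sample sentence are assumptions made for the example, not data from the video.

```python
def word_frequencies(text):
    """Map each word (the key) to the number of times it occurs (the value)."""
    counts = {}
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1   # a missing key starts at zero
    return counts


freq = word_frequencies("to be or not to be")
print(freq)
print(freq.get("cabbage", 0))   # a word that never appears has frequency zero
```

Each word is a unique key, while the counts are values and are free to repeat across keys.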
1946
04:02:13,090 --> 04:02:19,380
any key value pairs given that the keys are\n
1947
04:02:19,379 --> 04:02:25,519
about shortly. So to be able to understand\n
1948
04:02:25,520 --> 04:02:33,251
table, we need to understand what a hash function\n
1949
04:02:33,251 --> 04:02:41,639
will denote from now on as h of x, for some\n
1950
04:02:41,639 --> 04:02:47,158
number in some fixed range. So that's pretty\n
1951
04:02:47,158 --> 04:02:54,118
hash function. So if our hash function is\n
1952
04:02:54,119 --> 04:03:01,880
6x, plus nine modulo 10, well, all integer\n
1953
04:03:01,879 --> 04:03:09,778
inclusive. No matter what integer number I\n
1954
04:03:09,779 --> 04:03:17,790
certain values, or keys if you will, into the hash\n
1955
04:03:17,790 --> 04:03:26,140
that the output is not unique. In that there\n
1956
04:03:26,139 --> 04:03:32,500
the hash function which yield the same value,\n
1957
04:03:32,500 --> 04:03:39,369
the same thing, and that's completely fine.\n
1958
04:03:39,369 --> 04:03:46,100
not just on integers, but on arbitrary objects,\n
1959
04:03:46,100 --> 04:03:52,870
suppose we have a string s, and we're gonna\n
1960
04:03:52,870 --> 04:04:05,590
defined below. So this hash function, all\n
1961
04:04:05,590 --> 04:04:13,460
characters within the string, and then at\n
1962
04:04:13,459 --> 04:04:22,039
any given string just output a number. And\n
1963
04:04:22,040 --> 04:04:30,890
simple inputs, so h of BB gets 66 plus 66.\n
1964
04:04:30,889 --> 04:04:37,019
The empty string gets zero because we don't\n
1965
04:04:37,020 --> 04:04:43,489
effectively converted a string to a number.\n
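A string hash of the kind described, which adds up the ASCII values of the characters, might be sketched as follows. The function name `h` follows the h of x notation used above, but the modulus of 50 is an assumption chosen for illustration, since the exact value is not stated in the text here.

```python
def h(s, modulus=50):
    """Sum the ASCII values of the characters of s, reduced modulo `modulus`.

    The default modulus of 50 is an assumed value for this sketch.
    """
    total = 0
    for ch in s:
        total = (total + ord(ch)) % modulus
    return total


print(h("BB"))   # 'B' has ASCII value 66, so this is (66 + 66) mod 50
print(h(""))     # the empty string has no characters, so the sum is zero
print(h("ab"), h("ba"))   # different strings can hash to the same number
```

The last line shows that the output is not unique: reordering the characters gives a different key with the same hash value.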
1966
04:04:43,489 --> 04:04:52,189
you. Suppose we have a database of people\n
1967
04:04:52,189 --> 04:05:00,109
have an age, a name and a sex. I want you\n
1968
04:05:00,109 --> 04:05:07,479
going to be given a person object as an argument.\n
1969
04:05:07,478 --> 04:05:13,969
going to map a person to the set of values,\n
1970
04:05:13,969 --> 04:05:20,658
the video, just create any hash function that\n
1971
04:05:20,658 --> 04:05:28,250
so here's my attempt at creating such a hash\n
1972
04:05:28,250 --> 04:05:32,879
of possible hash functions we could define\n
1973
04:05:32,879 --> 04:05:39,170
And here's a simple one. So I'm gonna say,\n
1974
04:05:39,170 --> 04:05:44,939
whatever there is, that's the starting value\n
1975
04:05:44,939 --> 04:05:51,270
of the person's name, again, an arbitrary\n
1976
04:05:51,270 --> 04:05:58,180
males, and then mod by six. So as you can\n
1977
04:05:58,180 --> 04:06:02,630
defined. At this point, we're going to get\n
1978
04:06:02,629 --> 04:06:12,270
later. So this particular hash function yields\n
1979
04:06:12,271 --> 04:06:18,770
very, very important that we need to talk\n
1980
04:06:18,770 --> 04:06:27,310
if we have a hash function, and two objects,\n
1981
04:06:27,309 --> 04:06:37,618
then those two objects might be equal. We\n
1982
04:06:37,619 --> 04:06:45,300
check x against y. Yes, the hash function\n
1983
04:06:45,299 --> 04:06:52,978
there. But if the hash values are not equal,\n
1984
04:06:52,978 --> 04:06:59,609
a good question. How can we use this to our\n
1985
04:06:59,610 --> 04:07:07,540
The answer is that instead of comparing X\n
1986
04:07:07,540 --> 04:07:16,260
hash values, and first let's compare the hash\n
1987
04:07:16,260 --> 04:07:22,658
and Y explicitly, and in this next example,\n
1988
04:07:22,658 --> 04:07:29,359
the problem of trying to determine if two\n
1989
04:07:29,359 --> 04:07:35,100
is something you want to do in an operating\n
1990
04:07:35,100 --> 04:07:42,790
the hash values for file one and file two,\n
1991
04:07:42,790 --> 04:07:50,211
to see if they match or don't match, because\n
1992
04:07:50,210 --> 04:07:58,778
super fast. So if possible, we don't want\n
1993
04:07:58,779 --> 04:08:05,060
X and Y directly or file one against file\n
1994
04:08:05,059 --> 04:08:13,409
per byte if their hash values are equal. So\n
1995
04:08:13,409 --> 04:08:20,510
more sophisticated than those that we use\n
1996
04:08:20,510 --> 04:08:30,000
hashing algorithms called cryptographic hash\n
1997
04:08:30,000 --> 04:08:37,260
property of hash functions is that they must\n
1998
04:08:37,260 --> 04:08:46,510
if h of x produces a y, then h of x must always,\n
1999
04:08:46,510 --> 04:08:54,880
value. This is super critical, because this\n
2000
04:08:54,879 --> 04:09:02,769
this happens, we do not want this. So an example\n
2001
04:09:02,770 --> 04:09:10,390
that introduces say, a global variable or\n
2002
04:09:10,389 --> 04:09:16,198
So in this particular hash function, the first\n
2003
04:09:16,199 --> 04:09:21,380
six. But then if I call it again, we've incremented\n
2004
04:09:21,379 --> 04:09:30,488
Oh, not good. Something else about hash functions\n
2005
04:09:30,488 --> 04:09:35,139
To minimize the number of hash collisions,\n
2006
04:09:35,139 --> 04:09:41,439
shortly. But a hash collision is when two\n
2007
04:09:41,439 --> 04:09:52,479
is h of x equals h of y. And the reason why\n
2008
04:09:52,478 --> 04:10:00,698
hash function are hit, so that we can fill\n
2009
04:10:00,699 --> 04:10:10,970
Table we generated earlier, William and Kate\n
2010
04:10:10,969 --> 04:10:16,129
able to answer a central question about the\n
2011
04:10:16,129 --> 04:10:24,969
our hash table. So what makes a key of type\n
2012
04:10:24,969 --> 04:10:31,049
going to implement our hash table using these\n
2013
04:10:33,280 --> 04:10:40,430
And to enforce that behavior, you'll see a\n
2014
04:10:40,430 --> 04:10:48,460
the keys you use, be immutable, meaning you\n
2015
04:10:48,459 --> 04:10:55,429
constants. So they're things like immutable\n
2016
04:10:55,430 --> 04:11:03,189
or lists or things you can add or remove things\n
2017
04:11:03,189 --> 04:11:10,069
condition, and we have a hash function that\n
2018
04:11:10,069 --> 04:11:19,199
say that that type is hashable. Okay.\n
2019
04:11:19,199 --> 04:11:27,609
details. How does the hash table work? Well,\n
2020
04:11:27,609 --> 04:11:35,578
really quick insertion look up, and the removal\n
2021
04:11:35,578 --> 04:11:44,029
that using the hash function as a way to index\n
2022
04:11:44,029 --> 04:11:49,649
a fancy word for an array. Think of it like\n
2023
04:11:49,648 --> 04:11:58,238
So again, we use the hash function as a way\n
2024
04:11:58,238 --> 04:12:05,250
function gives us an index to go look up inside\n
2025
04:12:05,250 --> 04:12:14,379
time. Given that we have a uniform hash function,\n
2026
04:12:14,379 --> 04:12:21,379
on the right as an indexable block of memory.\n
2027
04:12:21,379 --> 04:12:28,670
entries with the hash function, h of x. So\n
2028
04:12:28,670 --> 04:12:35,238
string key value pairs into the table that\n
2029
04:12:35,238 --> 04:12:41,420
their usernames, say in an online programming\n
2030
04:12:41,420 --> 04:12:49,090
function, I chose x squared plus three mod 10.\n
2031
04:12:49,090 --> 04:12:58,790
insert the key value pair, three, byte eater,\n
2032
04:12:58,790 --> 04:13:06,659
is byte eater got a rank of three, and we want\n
2033
04:13:06,658 --> 04:13:13,670
we do is we hash the key, which is three,\n
2034
04:13:13,670 --> 04:13:21,350
plus three mod 10 is two, so in our hash\n
2035
04:13:21,350 --> 04:13:29,779
put the key to be three, and the value to be byte\n
2036
04:13:29,779 --> 04:13:36,680
say, which is usually what I call myself an\n
2037
04:13:36,680 --> 04:13:44,540
one. And if we hash one, then we get four.\nSo at index four
2038
04:13:44,540 --> 04:13:54,100
then we're going to put the key in the value\n
2039
04:13:54,100 --> 04:14:02,250
want to insert Lauren 425, who's got rank\n
2040
04:14:02,250 --> 04:14:11,238
table where she goes, then we can do the same\n
2041
04:14:11,238 --> 04:14:17,789
insert orange Knight, again begin by hashing\n
2042
04:14:17,790 --> 04:14:22,100
keep doing this, you keep filling the table,\n
2043
04:14:22,100 --> 04:14:29,500
Now, in the event that we want to do a lookup,\n
2044
04:14:29,500 --> 04:14:37,180
to do is compute the hash value for R and\n
2045
04:14:37,180 --> 04:14:42,658
this user is. So say I want to find the user\nwith rank 10.
2046
04:14:42,658 --> 04:14:50,139
And I hash 10. Figure out its index is three\n
2047
04:14:53,510 --> 04:15:01,770
However, how do we handle hash collisions?\n
2048
04:15:01,770 --> 04:15:09,279
and eight have the same value. This is problematic.\n
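A small sketch of the rank example, reading the hash function as h(x) = (x^2 + 3) mod 10, which matches the indices quoted above (3 goes to 2, 1 goes to 4, 10 goes to 3). The class and method names are illustrative, and non-negative ranks are assumed.

```java
// Sketch only: the hash function from the rank example, h(x) = (x^2 + 3) mod 10.
final class RankHashSketch {
    static int h(int x) {
        return (x * x + 3) % 10; // assumes x >= 0 and small enough not to overflow
    }

    public static void main(String[] args) {
        System.out.println(h(3));  // 2
        System.out.println(h(1));  // 4
        System.out.println(h(10)); // 3
        // Two different keys can land on the same index, which is a hash collision:
        System.out.println(h(2));  // 7
        System.out.println(h(8));  // 7
    }
}
```

Keys 2 and 8 are one pair that collides under this function: both map to index 7, so the table needs some way to store them both.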
2049
04:15:09,279 --> 04:15:17,290
them inside our table, they would index into\n
2050
04:15:17,290 --> 04:15:23,710
we use one of many hash collision resolution\n
2051
04:15:23,709 --> 04:15:29,469
but the two most popular ones are separate\n
2052
04:15:29,469 --> 04:15:35,908
separate chaining is that we deal with hash\n
2053
04:15:35,908 --> 04:15:42,109
usually a linked list to hold all the different\n
2054
04:15:42,110 --> 04:15:49,399
value. Open addressing is slightly different.\n
2055
04:15:49,398 --> 04:15:55,898
finding other places in the hash table offset\n
2056
04:15:55,898 --> 04:16:00,959
So in open addressing, everything is kept\n
2057
04:16:00,959 --> 04:16:07,099
you have multiple auxiliary data structures.\n
2058
04:16:07,100 --> 04:16:13,050
actually pretty remarkable. In fact, it's\n
2059
04:16:13,049 --> 04:16:20,429
time insertion, removal and search. But if\n
2060
04:16:20,430 --> 04:16:27,279
uniform, then you can get linear time, which\n
2061
04:16:27,279 --> 04:16:35,140
something super, super cool. And that is hash\n
2062
04:16:35,139 --> 04:16:43,590
dive right in what is separate chaining. Separate\n
2063
04:16:43,590 --> 04:16:51,408
resolution techniques. And how it works is\n
2064
04:16:51,408 --> 04:16:56,430
keys hash to the same value, we need to have\n
2065
04:16:56,430 --> 04:17:01,540
hash table. So it's still functional. Or what\n
2066
04:17:01,540 --> 04:17:09,319
auxiliary data structure to essentially hold\n
2067
04:17:09,318 --> 04:17:17,209
and look up inside that bucket or that data\n
2068
04:17:17,209 --> 04:17:23,898
for. And usually, we use a linked list for\n
2069
04:17:23,898 --> 04:17:29,978
lists, we can use arrays, binary trees, self\n
2070
04:17:29,978 --> 04:17:36,948
Okay, so suppose we have the following hash\n
2071
04:17:36,949 --> 04:17:49,460
of key value pairs of age and names. And we\n
2072
04:17:49,459 --> 04:17:55,509
been computed with some hash function. So\n
2073
04:17:55,510 --> 04:18:01,250
For now, we're just going to see how we can\n
2074
04:18:01,250 --> 04:18:09,850
Okay, so on the left is our hash table. So\n
2075
04:18:09,850 --> 04:18:17,649
of these key value pairs into this hash table\n
2076
04:18:17,648 --> 04:18:27,778
easy it is. Okay, so our first person is Will\n
2077
04:18:27,779 --> 04:18:36,550
going to put him in that slot three. Leah,\n
2078
04:18:36,549 --> 04:18:43,039
we're going to put her at index four. So the\n
2079
04:18:43,040 --> 04:18:52,260
Rick, age 61, hashed to two, so put him there. And\n
2080
04:18:52,260 --> 04:18:58,350
get a little bit full in our hash table here.\n
2081
04:18:58,350 --> 04:19:06,600
Okay. Lara, age 34, hashed to four, but we say\n
2082
04:19:06,600 --> 04:19:16,949
we do? Well, in separate chaining, we just\n
2083
04:19:16,949 --> 04:19:25,180
in the array is actually a linked list data\n
2084
04:19:25,180 --> 04:19:32,300
the linked list and see if Lara exists and\n
2085
04:19:32,299 --> 04:19:36,278
add Lara at the very end of the chain.
2086
04:19:38,158 --> 04:19:45,469
so Ryan also hashed to one but then we look\n
2087
04:19:45,469 --> 04:19:59,579
add a new entry at position one. Lara, age\n
2088
04:19:59,579 --> 04:20:08,068
exists in our hash table, so we're good.\n
2089
04:20:08,068 --> 04:20:17,769
to update it. So Finn, age 21, hashed to three.\n
2090
04:20:17,770 --> 04:20:23,158
hash to three. So what we're going to do is\n
2091
04:20:23,158 --> 04:20:30,680
linked list chain. So note that even though\n
2092
04:20:30,680 --> 04:20:36,790
value, that is index three, and they have\n
2093
04:20:36,790 --> 04:20:44,010
we store both the key and the value as an\n
2094
04:20:44,010 --> 04:20:49,488
how we're able to tell them apart. Okay, now\n
2095
04:20:49,488 --> 04:20:56,750
hashed to four. So scan through the linked list\n
2096
04:20:56,750 --> 04:21:01,728
we have to append mark at the very end. All\n
2097
04:21:01,728 --> 04:21:10,760
do lookups in this structure. So it's basically\n
2098
04:21:10,760 --> 04:21:18,800
a name, we want to find what the person's\n
2099
04:21:18,799 --> 04:21:26,068
when we hash him, we get one. So we suspect\n
2100
04:21:26,068 --> 04:21:33,850
say a bucket, I just mean, whatever data structure\n
2101
04:21:33,850 --> 04:21:38,579
linked lists. So if you scan this linked list\n
2102
04:21:38,579 --> 04:21:44,959
comparing the key. So we're comparing the\n
2103
04:21:44,959 --> 04:21:50,868
a match. So keep going. Compare Ryan, Ryan,\n
2104
04:21:50,869 --> 04:21:58,210
inside in that entry, say, oh, his age is\n
2105
04:21:58,209 --> 04:22:05,209
the age of Mark hash mark. And since our hash\n
2106
04:22:05,209 --> 04:22:12,408
if there's a mark, then it's going to be found\n
2107
04:22:12,408 --> 04:22:21,439
bucket four, scan through, oh, last one is a\n
2108
04:22:21,439 --> 04:22:28,488
that the value or the key we're looking for\n
2109
04:22:28,488 --> 04:22:34,770
turn now. Okay, so here's a good question.\n
2110
04:22:34,770 --> 04:22:42,710
and lookup time complexity? If the hash table\n
2111
04:22:42,709 --> 04:22:48,809
list chains? Good question. And the answer\n
2112
04:22:48,809 --> 04:22:53,409
your hash table, you'll actually want to create\n
2113
04:22:53,409 --> 04:23:02,020
rehash all your items and re insert them into\n
2114
04:23:02,020 --> 04:23:10,828
fixed size. Okay, and another good question,\n
2115
04:23:10,828 --> 04:23:16,770
table with separate chaining? Well, the answer\n
2116
04:23:16,770 --> 04:23:23,260
your key. And instead of doing a lookup while\n
2117
04:23:23,260 --> 04:23:30,828
linked list. That's another question. How\n
2118
04:23:30,828 --> 04:23:37,119
to model the bucket behavior? Yes, of course.\n
2119
04:23:37,119 --> 04:23:43,319
linked lists include arrays, binary tree,\n
2120
04:23:43,318 --> 04:23:49,760
approach in their hash map. So once they get\n
2121
04:23:49,760 --> 04:23:57,010
a binary tree, or maybe a self-balancing binary\n
2122
04:23:57,010 --> 04:24:02,590
methods are a bit more memory intensive and\n
2123
04:24:02,590 --> 04:24:09,728
be less popular, but they might be a lot faster\n
2124
04:24:09,728 --> 04:24:14,858
All right, it's time to have a look at some\n
2125
04:24:14,859 --> 04:24:21,920
table. So here I am on my GitHub repository\n
2126
04:24:21,920 --> 04:24:27,309
find the source code here under the hash table\n
2127
04:24:27,309 --> 04:24:33,039
the hash table. Today, we're going to be looking\n
2128
04:24:33,040 --> 04:24:38,970
in the later videos, probably one of these\n
2129
04:24:38,969 --> 04:24:46,719
So let's dive into the code. I have it here\n
2130
04:24:46,719 --> 04:24:55,238
So first things first, I have two classes,\n
2131
04:24:55,238 --> 04:25:02,618
chaining hash table. So let's have a look\n
2132
04:25:02,619 --> 04:25:10,130
individual items or key value pairs, you would\n
2133
04:25:10,129 --> 04:25:20,309
So in Java, we have generics, so a generic\n
2134
04:25:20,309 --> 04:25:27,689
So when I create an entry, I give it the key\n
2135
04:25:27,689 --> 04:25:34,040
So there's a built in method in Java to compute\n
2136
04:25:34,040 --> 04:25:38,660
you can override it to specify the hash code\n
2137
04:25:38,659 --> 04:25:44,289
convenient. So compute the hash code and cache\n
2138
04:25:44,290 --> 04:25:49,720
don't have to re compute this thing multiple\n
2139
04:25:49,719 --> 04:25:55,049
for something like a string, hash code can\n
2140
04:25:55,049 --> 04:26:01,250
good. So here, I have an equals method, which\n
2141
04:26:01,250 --> 04:26:06,189
because I don't want to have to do any casting.\n
2142
04:26:06,189 --> 04:26:11,970
If the hashes are not equal, we know from\n
2143
04:26:11,969 --> 04:26:17,608
equal, so we can return false. Otherwise,\n
2144
04:26:17,609 --> 04:26:23,630
about it for the entry class. Very simple,\n
2145
04:26:23,629 --> 04:26:32,289
thing. So the hash table itself. Okay, so\n
2146
04:26:32,290 --> 04:26:40,579
holds three, three items, and the load factor\n
2147
04:26:40,578 --> 04:26:49,068
or be 0.75. So that's the maximum capacity\n
2148
04:26:49,068 --> 04:26:52,819
important instance variables, we need to go\n
2149
04:26:52,819 --> 04:27:00,799
the load factor goes above this value, then\n
2150
04:27:00,799 --> 04:27:07,429
so the actual maximum number of items that\n
2151
04:27:07,430 --> 04:27:15,328
So the threshold. So this is computed to be\n
2152
04:27:15,328 --> 04:27:22,449
us, hey, you're above the threshold to resize\n
2153
04:27:22,450 --> 04:27:32,510
in the table. And this is the table itself.\n
2154
04:27:32,510 --> 04:27:38,340
have entries, pretty simple. So there's a\n
2155
04:27:38,340 --> 04:27:45,648
table, just using the default settings with\n
2156
04:27:45,648 --> 04:27:54,409
load factor. So this is a designated constructor,\n
2157
04:27:54,409 --> 04:28:01,969
factor is compute default capacity. And make\n
2158
04:28:01,969 --> 04:28:08,028
and capacity just so that I know you don't\n
2159
04:28:08,029 --> 04:28:14,790
weird things happening if the capacity is\n
2160
04:28:14,790 --> 04:28:21,409
threshold and then finally initialize the\n
2161
04:28:21,408 --> 04:28:26,219
a look at all these methods right here. So\n
2162
04:28:26,219 --> 04:28:33,350
table empty, is the hash table empty. So this\n
2163
04:28:33,351 --> 04:28:41,109
normalized index. And it's used when you want\n
2164
04:28:41,109 --> 04:28:44,729
it says in the comments here, essentially,\n
2165
04:28:44,728 --> 04:28:52,459
hash value in a domain, zero to capacity.\n
2166
04:28:52,459 --> 04:28:59,809
can be anywhere in the domain of an integer,\n
2167
04:28:59,809 --> 04:29:09,889
to positive two to the 31, around that. So what\n
2168
04:29:09,889 --> 04:29:16,250
sign from the hash value, and then modified\n
2169
04:29:16,250 --> 04:29:23,200
so we can actually use it as a lookup index.\n
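A hedged sketch of the index normalization step just described: strip the sign from the hash value, then reduce it into the range zero to capacity. The class name and the bit-mask approach are assumptions about one reasonable way to do it, not a copy of the repository code.

```java
// Sketch only: maps any int hash value into the range [0, capacity).
final class NormalizeIndexSketch {
    static int normalizeIndex(int keyHash, int capacity) {
        // Masking with 0x7FFFFFFF clears the sign bit, so the result is never negative.
        // Math.abs is avoided because Math.abs(Integer.MIN_VALUE) is still negative.
        return (keyHash & 0x7FFFFFFF) % capacity;
    }

    public static void main(String[] args) {
        System.out.println(normalizeIndex(13, 10));                // 3
        System.out.println(normalizeIndex(-1, 10));                // 7 (2147483647 % 10)
        System.out.println(normalizeIndex(Integer.MIN_VALUE, 10)); // 0
    }
}
```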
2170
04:29:23,200 --> 04:29:32,449
how the table that's straightforward. Contains\n
2171
04:29:32,449 --> 04:29:36,890
going to do is compute given a key. So we\n
2172
04:29:36,889 --> 04:29:45,078
the hash table. Right? So we're going to do\n
2173
04:29:45,078 --> 04:29:55,289
then now give us the bucket index. So which\n
2174
04:29:55,290 --> 04:30:01,100
in a hash table? I'm just going to seek to\n
2175
04:30:01,100 --> 04:30:07,340
if the entry is not equal to null, it exists; if\n
2176
04:30:07,340 --> 04:30:16,369
for add an insert are all common names for\n
2177
04:30:16,369 --> 04:30:21,899
or updating a value inside of the hash table\n
2178
04:30:21,898 --> 04:30:27,250
something we absolutely don't want. So just\n
2179
04:30:27,250 --> 04:30:33,790
going to create a new entry, find the bucket\n
2180
04:30:33,790 --> 04:30:42,609
method we'll get to. Okay, get. So given a\n
2181
04:30:42,609 --> 04:30:48,670
that key. Again, don't allow null keys. And\n
2182
04:30:48,670 --> 04:30:54,969
don't want to find which bucket this particular\n
2183
04:30:54,969 --> 04:31:02,578
find the entry. Assuming it's not null, then\n
2184
04:31:02,578 --> 04:31:13,250
value. If it is null, well, the key doesn't\n
2185
04:31:13,250 --> 04:31:21,068
the key now from the hash table. So key's not\n
2186
04:31:21,068 --> 04:31:28,028
call this private remove entry method, which\n
2187
04:31:28,029 --> 04:31:35,140
so which bucket does this key belong\n
2188
04:31:35,139 --> 04:31:44,049
to seek for the entry inside the linked list\n
2189
04:31:44,049 --> 04:31:51,750
we're going to extract the actual linked list\n
2190
04:31:51,750 --> 04:31:58,420
in Java. So this removes from that linked list\n
2191
04:31:58,420 --> 04:32:07,020
the actual value, that's all we have to do.\n
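The insert, get and remove operations walked through so far can be sketched roughly as follows. This is a simplified illustration, not the repository code: the names `ChainingSketch`, `seek` and `bucketIndex` are made up here, the capacity is fixed, and resizing on the load factor threshold is left out.

```java
// Sketch only: a fixed-capacity hash table using separate chaining with linked lists.
final class ChainingSketch<K, V> {
    private static final class Entry<K, V> {
        final int hash; // cached so the hash code is computed once per entry
        final K key;
        V value;
        Entry(K key, V value) {
            this.key = key;
            this.value = value;
            this.hash = key.hashCode();
        }
    }

    private final java.util.LinkedList<Entry<K, V>>[] table;
    private int size = 0;

    @SuppressWarnings("unchecked")
    ChainingSketch(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        table = new java.util.LinkedList[capacity];
    }

    // Strips the sign bit and maps the hash code into [0, capacity).
    private int bucketIndex(K key) {
        if (key == null) throw new IllegalArgumentException("null keys are not allowed");
        return (key.hashCode() & 0x7FFFFFFF) % table.length;
    }

    // Scans one bucket; compares cached hashes first, then the keys themselves.
    private Entry<K, V> seek(int index, K key) {
        java.util.LinkedList<Entry<K, V>> bucket = table[index];
        if (bucket == null) return null;
        int h = key.hashCode();
        for (Entry<K, V> e : bucket) {
            if (e.hash == h && e.key.equals(key)) return e;
        }
        return null;
    }

    // Inserts or updates; returns the previous value, or null if the key is new.
    V put(K key, V value) {
        int i = bucketIndex(key);
        Entry<K, V> existing = seek(i, key);
        if (existing != null) {
            V old = existing.value;
            existing.value = value;
            return old;
        }
        if (table[i] == null) table[i] = new java.util.LinkedList<>();
        table[i].add(new Entry<>(key, value)); // append at the end of the chain
        size++;
        return null;
    }

    V get(K key) {
        Entry<K, V> e = seek(bucketIndex(key), key);
        return e == null ? null : e.value;
    }

    // Removes the key if present; returns its value, or null if it was absent.
    V remove(K key) {
        int i = bucketIndex(key);
        Entry<K, V> e = seek(i, key);
        if (e == null) return null;
        table[i].remove(e);
        size--;
        return e.value;
    }

    int size() { return size; }
}
```

With a small capacity, several keys end up sharing a bucket, and `put` on an existing key updates the value in place instead of adding a second entry.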
2192
04:32:07,020 --> 04:32:14,680
So insert bucket insert entry is a given a\n
2193
04:32:14,680 --> 04:32:22,859
inside of it. Okay. So first, since we know\n
2194
04:32:22,859 --> 04:32:27,790
automatically get the linked list structure.\n
2195
04:32:27,790 --> 04:32:36,540
have to create a new linked list. So we're\n
2196
04:32:36,540 --> 04:32:41,989
list data structures, which is good, because\n
2197
04:32:41,988 --> 04:32:50,469
to. So next up, I find the entry that already\n
2198
04:32:50,469 --> 04:32:58,618
an update, for instance. So if the existence\n
2199
04:32:58,619 --> 04:33:04,439
a new entry to the end of the linked list.\n
2200
04:33:04,438 --> 04:33:12,340
the size, and check if we're above the threshold,\n
2201
04:33:12,340 --> 04:33:18,069
null to indicate that there was no previous\n
2202
04:33:18,069 --> 04:33:24,370
So then just update the value in the existing\n
2203
04:33:26,330 --> 04:33:32,770
So seek entry this method we've been using\n
2204
04:33:32,770 --> 04:33:38,909
particular entry at a given bucket index,\n
2205
04:33:38,909 --> 04:33:45,778
You probably know what's going on by now.\n
2206
04:33:45,778 --> 04:33:52,708
bucket index. Otherwise return null if it doesn't\n
2207
04:33:52,708 --> 04:34:00,569
the entries in the linked list and compare\n
2208
04:34:00,569 --> 04:34:08,020
was a match, return that entry otherwise return\n
2209
04:34:08,020 --> 04:34:14,569
method called resize table. So this resizes\n
2210
04:34:14,569 --> 04:34:22,420
the table. First we double the capacity. We\n
2211
04:34:22,420 --> 04:34:30,340
have a higher threshold because we have increasing\n
2212
04:34:30,340 --> 04:34:36,368
new capacity. So this new table is bigger\n
2213
04:34:36,368 --> 04:34:43,530
current table. Look for linked list data structures\n
2214
04:34:43,530 --> 04:34:51,458
loop through all these entries, calculate\n
2215
04:34:51,458 --> 04:35:02,409
insert it into this new table and after that\n
2216
04:35:02,409 --> 04:35:13,490
the old table at the end, set the table to\n
2217
04:35:13,490 --> 04:35:20,830
these last two methods, so just return all\n
2218
04:35:20,830 --> 04:35:28,270
fairly simple. And these last two methods\n
2219
04:35:28,270 --> 04:35:34,300
need to go over. So that's essentially, separate\n
2220
04:35:34,300 --> 04:35:42,169
with the linked lists much more difficult to\n
2221
04:35:42,169 --> 04:35:47,618
or something like that. I'm pretty excited,\n
2222
04:35:47,618 --> 04:35:54,989
collision resolution technique for hash tables.\n
2223
04:35:54,990 --> 04:36:00,320
quick recap on hash tables so that everyone's\n
2224
04:36:00,319 --> 04:36:07,250
table is to construct a mapping from a set\n
2225
04:36:07,250 --> 04:36:14,708
to be hashable. Now, what we do is we define\n
2226
04:36:14,708 --> 04:36:22,278
into numbers, then we use the number obtained\n
2227
04:36:22,278 --> 04:36:29,147
into the array or the hash table. However,\n
2228
04:36:29,148 --> 04:36:38,260
time to time, we're going to have hash collisions,\n
2229
04:36:38,259 --> 04:36:44,458
So we need a way to resolve this and open\n
2230
04:36:44,458 --> 04:36:50,039
so what we're going to be using the open addressing\n
2231
04:36:50,039 --> 04:36:56,650
to keep in mind is the actual key value pairs\n
2232
04:36:56,650 --> 04:37:04,819
itself. So as opposed to say, an auxiliary\n
2233
04:37:04,819 --> 04:37:12,359
method we saw in the last video. So this means\n
2234
04:37:12,359 --> 04:37:17,618
the hash tables, and how many elements are\n
2235
04:37:17,618 --> 04:37:22,329
there are too many elements inside of the\n
2236
04:37:22,330 --> 04:37:29,160
hard to find an open slot or a position to\n
2237
04:37:29,159 --> 04:37:34,169
of terminology, we say that the load factor\n
2238
04:37:34,169 --> 04:37:41,708
table and the size of the table. So this means\n
2239
04:37:41,708 --> 04:37:48,430
Here's a neat chart from Wikipedia. So on\n
2240
04:37:48,430 --> 04:37:53,900
methods. One of them is chaining, that is\n
2241
04:37:53,900 --> 04:38:01,459
open addressing technique. And we can see\n
2242
04:38:01,458 --> 04:38:09,298
it gets to a certain threshold, it gets exponentially\n
2243
04:38:09,298 --> 04:38:15,118
that say point eight mark. In fact, we're\n
2244
04:38:15,118 --> 04:38:20,868
usually. And what this says is, we always\n
2245
04:38:20,868 --> 04:38:27,649
by the Greek letter alpha, below a certain\n
2246
04:38:27,650 --> 04:38:36,730
our table once that threshold is met. Right,\n
2247
04:38:36,729 --> 04:38:42,791
into our hash table, here's what we do, we\n
2248
04:38:42,791 --> 04:38:47,789
on our key and we hash the value and this\n
2249
04:38:47,789 --> 04:38:54,969
table for where the key should go. But suppose\n
2250
04:38:54,969 --> 04:39:02,289
a key in that slot, well, we can't have two\n
2251
04:39:02,289 --> 04:39:12,719
work. So what we do is we use a probing sequence,\n
2252
04:39:12,719 --> 04:39:19,888
tell us where to go next. So we hashed to\n
2253
04:39:19,888 --> 04:39:27,548
And now we're going to probe along using this\n
2254
04:39:27,548 --> 04:39:36,298
going to eventually find an open spot along\n
2255
04:39:36,298 --> 04:39:43,189
an infinite amount of probing sequences to\n
2256
04:39:43,189 --> 04:39:53,479
have linear probing, which probes via a linear\n
2257
04:39:53,479 --> 04:40:02,029
when we're probing we start, usually x at\n
2258
04:40:02,029 --> 04:40:07,309
slot, then we just increment x by one. And\n
2259
04:40:07,310 --> 04:40:12,909
functions. For linear probing, we use a linear\n
2260
04:40:12,909 --> 04:40:19,387
function. And then there's double hashing,\n
2261
04:40:19,387 --> 04:40:26,968
is we define a secondary hash function on\n
2262
04:40:26,968 --> 04:40:32,878
inside the probing function. And the last\n
2263
04:40:32,878 --> 04:40:39,299
probing function that we can use. So given\n
2264
04:40:39,299 --> 04:40:46,610
seed it using the hash value of our key, which\n
2265
04:40:46,610 --> 04:40:53,750
to be the same thing. And then we can use\n
2266
04:40:53,750 --> 04:40:58,637
pretty neat, and increment by x each time\n
2267
04:40:58,637 --> 04:41:03,309
just getting the next number in the random\n
2268
04:41:03,310 --> 04:41:09,210
that. Alright, so here's a general insertion\n
2269
04:41:09,209 --> 04:41:18,797
a table of size n. And here's how the algorithm\n
2270
04:41:18,797 --> 04:41:25,989
x is a constant, or sorry, a variable that\n
2271
04:41:25,990 --> 04:41:34,420
going to increment x each time we fail to\n
2272
04:41:34,419 --> 04:41:39,217
just by hashing our key. And that is actually\n
2273
04:41:39,218 --> 04:41:46,952
look in a table first. So while the table\n
2274
04:41:46,952 --> 04:41:57,218
to No, we're going to say our new index is\n
2275
04:41:57,218 --> 04:42:06,387
hash to plus the probing function, mod n,\n
2276
04:42:06,387 --> 04:42:11,759
and then we're going to increment x, so that\n
2277
04:42:11,759 --> 04:42:19,949
at a different position. And then eventually,\n
2278
04:42:19,950 --> 04:42:25,680
set up our probing function in such a way\n
2279
04:42:25,680 --> 04:42:32,740
we will always find a free slot, because we\n
2280
04:42:35,759 --> 04:42:44,969
so here's the big issue with open addressing.\n
2281
04:42:44,970 --> 04:42:51,878
we choose modulo n, are going to end up producing\n
2282
04:42:51,878 --> 04:43:00,639
size itself. So imagine your probing sequence\n
2283
04:43:00,639 --> 04:43:07,479
cycles. And your table is of size 10. But\n
2284
04:43:07,479 --> 04:43:13,649
because it's stuck in a cycle. And all of\n
2285
04:43:13,650 --> 04:43:19,770
in an infinite loop. So this is very problematic,\n
2286
04:43:19,770 --> 04:43:27,869
to handle. Right, so let's have a look at\n
2287
04:43:27,869 --> 04:43:34,968
And using open addressing, it's got some key\n
2288
04:43:34,968 --> 04:43:40,490
the circle with a bar through it is the null\n
2289
04:43:40,490 --> 04:43:49,120
probing sequence p of x equals 4x. And suppose\n
2290
04:43:49,119 --> 04:43:57,307
and that the key hashes to eight. So that\n
2291
04:43:57,308 --> 04:44:03,049
at position eight, but Oh, it's already occupied,\n
2292
04:44:03,049 --> 04:44:13,489
there. So what we do well, we probe, so we\n
2293
04:44:13,490 --> 04:44:19,791
we get eight plus four mod 12. Well, that's\n
2294
04:44:19,791 --> 04:44:26,760
Oh, that is also occupied, because the key\n
2295
04:44:26,759 --> 04:44:36,599
we compute P of two, and then that gives us\n
2296
04:44:36,599 --> 04:44:42,840
already occupied, and then we keep probing.\n
2297
04:44:42,840 --> 04:44:48,779
So we'll keep probing and probing and probing,\n
2298
04:44:48,779 --> 04:44:56,110
position. So although we have a probing function,\n
2299
04:44:56,110 --> 04:45:03,208
The probing function is flawed. So that's\n
2300
04:45:03,207 --> 04:45:08,779
functions are viable. They produce cycles\n
2301
04:45:08,779 --> 04:45:16,137
handle this? And in general, the consensus\n
2302
04:45:16,137 --> 04:45:23,729
we try to avoid it all together by restricting\n
2303
04:45:23,729 --> 04:45:30,399
to be those which produce a cycle of exactly\n
2304
04:45:30,400 --> 04:45:37,480
I have a little asterisk here that says there\n
2305
04:45:37,479 --> 04:45:43,169
are some probing functions we can use, which\n
2306
04:45:43,169 --> 04:45:49,707
And we're going to have a look at I think,\n
2307
04:45:49,707 --> 04:45:55,909
Alright, so just to recap, techniques such\n
2308
04:45:55,909 --> 04:46:02,740
and double hashing, they're all subject to\n
2309
04:46:02,740 --> 04:46:09,670
to do is redefine probing functions, which\n
2310
04:46:09,669 --> 04:46:16,000
length and to avoid not being able to insert\n
2311
04:46:16,000 --> 04:46:21,718
loop. So this is a bit of an issue with the\n
2312
04:46:21,718 --> 04:46:27,250
we can handle. Although notice that this isn't\n
2313
04:46:27,250 --> 04:46:32,630
in the separate chaining world, just because\n
2314
04:46:32,630 --> 04:46:38,670
just captures all our collisions. Okay, this\n
2315
04:46:38,669 --> 04:46:47,250
talking about hash tables, and the linear\n
2316
04:46:49,919 --> 04:46:55,969
So in general, if we have a table of size\n
2317
04:46:55,970 --> 04:46:59,840
what your probing function is. So we start\nour constant
2318
04:46:59,840 --> 04:47:07,150
or sorry, variable x at one, the key hash\n
2319
04:47:07,150 --> 04:47:13,510
gives us for our key. And our first index we're\n
2320
04:47:13,509 --> 04:47:20,637
hash position. And while the table at the\n
2321
04:47:20,637 --> 04:47:30,489
is already occupied, then we're going to offset\n
2322
04:47:30,490 --> 04:47:37,048
probing function mod n. And every time we do\n
2323
04:47:37,047 --> 04:47:44,729
our probing function pushes us along one extra\n
2324
04:47:44,729 --> 04:47:47,797
then we can insert the key value pair into\nthe table
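The general open addressing insertion just recapped can be sketched like this. It is an illustration under stated assumptions: the class and method names are made up, only keys are stored (no values), and the cap of n probes is an addition made here so that a flawed probing function returns -1 instead of looping forever.

```java
// Sketch only: open addressing insertion driven by a probing function p(x).
final class OpenAddressingSketch {
    // Stores `key` in `keys` and returns the slot used,
    // or -1 if no free slot was reached within n probes.
    static int insert(Integer[] keys, int key, int keyHash,
                      java.util.function.IntUnaryOperator p) {
        int n = keys.length;
        int start = Math.floorMod(keyHash, n); // original hash position
        int index = start;
        int x = 1;
        for (int attempts = 0; attempts < n; attempts++) {
            if (keys[index] == null || keys[index] == key) {
                keys[index] = key; // free slot, or the key is already there
                return index;
            }
            index = Math.floorMod(start + p.applyAsInt(x), n); // offset by p(x), mod n
            x++;
        }
        return -1; // the probe sequence never reached a free slot
    }

    public static void main(String[] args) {
        Integer[] keys = new Integer[12];
        keys[8] = 100; keys[0] = 101; keys[4] = 102; // slots 8, 0 and 4 are taken
        // p(x) = 4x with a hash of 8 only ever visits 8, 0, 4, so it fails:
        System.out.println(insert(keys, 99, 8, x -> 4 * x)); // -1
        // p(x) = x moves one slot at a time and finds slot 9:
        System.out.println(insert(keys, 99, 8, x -> x));     // 9
    }
}
```

The first call reproduces the earlier example with p(x) = 4x on a table of size 12, where the probe sequence cycles through 8, 0 and 4 even though other slots are free.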
2325
04:47:49,790 --> 04:47:55,990
Alright, so what is linear probing? So linear\n
2326
04:47:55,990 --> 04:48:04,548
according to some linear formula, specifically,\n
2327
04:48:04,547 --> 04:48:09,180
b. And we have to make sure that a is not\n
2328
04:48:09,180 --> 04:48:14,900
a constant which does nothing. Now have a\n
2329
04:48:14,900 --> 04:48:22,830
constant b is obsolete. And if you know why,\n
2330
04:48:22,830 --> 04:48:32,340
with the others. And as we saw in the last\n
2331
04:48:32,340 --> 04:48:39,229
currently, and it's that some linear functions\n
2332
04:48:39,229 --> 04:48:46,159
n. And we might end up getting stuck in a\n
2333
04:48:46,159 --> 04:48:56,958
our linear function to be p of x equals 3x,\n
2334
04:48:56,958 --> 04:49:04,387
some reason our table size was nine, then\n
2335
04:49:04,387 --> 04:49:13,250
Assuming that positions, four, seven and one\n
2336
04:49:13,250 --> 04:49:18,520
So the fact that we're only probing at those\n
2337
04:49:18,520 --> 04:49:24,529
all the other buckets, which is really bad.\n
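To see the cycle concretely, the sketch below lists the slots a linear probing function p(x) = a * x can reach from a given starting position. The names `ProbeCycleSketch` and `visitedSlots` are made up for illustration, and the `gcd` helper is included only to compare against the number of slots reached.

```java
// Sketch only: which slots does p(x) = a * x reach on a table of size n?
final class ProbeCycleSketch {
    // Euclid's algorithm for the greatest common divisor.
    static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    // Distinct slots visited from `start`, in the order they are first reached.
    static java.util.List<Integer> visitedSlots(int a, int n, int start) {
        java.util.LinkedHashSet<Integer> seen = new java.util.LinkedHashSet<>();
        for (int x = 0; x < n; x++) {
            seen.add(Math.floorMod(start + a * x, n));
        }
        return new java.util.ArrayList<>(seen);
    }

    public static void main(String[] args) {
        System.out.println(visitedSlots(3, 9, 4));        // [4, 7, 1]
        System.out.println(gcd(3, 9));                    // 3
        System.out.println(visitedSlots(1, 9, 4).size()); // 9, every slot is reached
    }
}
```

With a = 3 and a table of size 9, only three of the nine slots are ever probed; with a = 1, all nine are.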
2338
04:49:24,529 --> 04:49:30,207
loop, we cannot get stuck in this situation\n
2339
04:49:30,207 --> 04:49:38,967
question which values of the constant A and\n
2340
04:49:38,968 --> 04:49:48,549
n. It turns out that this happens when the\n
2341
04:49:48,549 --> 04:49:58,599
prime to each other. The two numbers are relatively\n
2342
04:49:58,599 --> 04:50:09,329
is equal to one. So that is, a and n have a GCD\n
2343
04:50:09,330 --> 04:50:15,580
function will always be able to generate a\n
2344
04:50:15,580 --> 04:50:24,400
to find an empty bucket. Awesome. Alright,\n
2345
04:50:24,400 --> 04:50:30,290
suppose we have an originally empty hash table,\n
2346
04:50:30,290 --> 04:50:37,378
And we selected our probing function to be\n
2347
04:50:37,378 --> 04:50:44,860
equals nine. And then we also selected a max\n
2348
04:50:44,860 --> 04:50:54,458
and the threshold will then be six. So we\n
2349
04:50:54,458 --> 04:51:01,189
based on the probing function we chose at\n
2350
04:51:01,189 --> 04:51:07,840
infinite loop while inserting? Based on what\n
2351
04:51:07,840 --> 04:51:14,040
answer is, yes, the greatest common divisor\n
2352
04:51:14,040 --> 04:51:23,218
not one. So let's go ahead and attempt to\n
2353
04:51:23,218 --> 04:51:31,360
may not hit any problems. Okay. So first,\n
2354
04:51:31,360 --> 04:51:38,790
some key value pairs, I want to insert, and\n
2355
04:51:38,790 --> 04:51:46,043
that the hash value for K one is equal to\n
2356
04:51:46,043 --> 04:51:51,980
of K one plus the probing sequence at zero,\n
2357
04:51:51,979 --> 04:52:01,819
insert that key value pair at position two.\n
2358
04:52:01,819 --> 04:52:08,191
is equal to two again. So we're going to try\n
2359
04:52:08,191 --> 04:52:14,270
two. But oh snap, we have a hash collision.\n
2360
04:52:14,270 --> 04:52:21,260
to offset the probing function at one and\n
2361
04:52:21,259 --> 04:52:29,289
of inserting it now at two, we're going to\n
2362
04:52:29,290 --> 04:52:37,170
in because that slot was free. Now, let's\n
2363
04:52:37,169 --> 04:52:44,099
that hashes to three, then we can insert\n
2364
04:52:44,099 --> 04:52:51,349
Now, notice that we're trying to re insert\n
2365
04:52:51,349 --> 04:52:55,579
table. So instead of inserting it, we're actually\n
2366
04:52:55,580 --> 04:53:02,638
exists in the hash table. Alright, so from\n
2367
04:53:02,637 --> 04:53:10,718
two is two. So So then we look at position\n
2368
04:53:11,819 --> 04:53:19,889
So we increment x offset by P of one. And\n
2369
04:53:19,889 --> 04:53:28,430
update the value there. Let's go to K five.\n
2370
04:53:28,430 --> 04:53:37,150
eight. So eight is taken. So we're going to\n
2371
04:53:37,150 --> 04:53:43,950
we're going to insert the key value pair there.\n
2372
04:53:43,950 --> 04:53:56,280
six hashes to five, then let's probe once\n
2373
04:53:56,279 --> 04:54:05,649
a hash collision, let's keep probing. So now,\n
2374
04:54:05,650 --> 04:54:11,240
right, another hash collision, so we have\n
2375
04:54:11,240 --> 04:54:19,280
back to five. So we've hit a cycle. Alright,\n
2376
04:54:19,279 --> 04:54:26,439
expected this to happen because we knew that\n
2377
04:54:26,439 --> 04:54:33,877
to three and not one. So if we look at all\n
2378
04:54:33,878 --> 04:54:41,490
instead of six, we see that the ones that\n
2379
04:54:41,490 --> 04:54:51,808
with and the table size are 1, 2, 4, 5, 7, and 8,\n
2380
04:54:51,808 --> 04:55:01,360
something else. So this comes to the realization\n
2381
04:55:01,360 --> 04:55:09,119
p of x to be one times x, then the greatest\n
2382
04:55:09,119 --> 04:55:16,029
going to be one no matter what our choice\n
2383
04:55:16,029 --> 04:55:23,729
one times x is a very popular probing function\n
2384
04:55:23,729 --> 04:55:29,819
hash table, and we wish to insert more key\n
2385
04:55:29,819 --> 04:55:36,121
going to pick a probing function that works,\n
2386
04:55:36,121 --> 04:55:42,229
I'm going to pick the table size to be 12\n
2387
04:55:42,229 --> 04:55:53,110
should occur. All right, so let's go with\n
2388
04:55:53,110 --> 04:56:00,869
one has a hash value of 10, then at index\n
2389
04:56:00,869 --> 04:56:08,229
a hash value of eight, then slot eight is\n
2390
04:56:08,229 --> 04:56:18,887
now, suppose k three is equal to 10, hash\n
2391
04:56:18,887 --> 04:56:26,450
to keep probing. Alright, so if we use our\n
2392
04:56:26,450 --> 04:56:35,271
So that'll give us three modulo N when we\n
2393
04:56:35,271 --> 04:56:44,720
k four. Now suppose the hash value for K four\n
2394
04:56:44,720 --> 04:56:55,958
we hit k three, which we inserted last time.\n
2395
04:56:55,957 --> 04:57:04,229
able to pull out eventually when we hit the\n
2396
04:57:04,229 --> 04:57:11,950
we've actually reached the threshold of our\n
2397
04:57:11,950 --> 04:57:21,468
that I picked alpha to be 0.35. So n, which\n
2398
04:57:21,468 --> 04:57:31,069
And we just finished inserting the fourth\n
2399
04:57:31,069 --> 04:57:37,957
So how we usually resize the table is via some\n
2400
04:57:37,957 --> 04:57:49,297
or so on. But we need to double in such a\n
2401
04:57:49,297 --> 04:57:59,557
a doubling, N is equal to 24, and the GCD\n
2402
04:57:59,558 --> 04:58:05,298
so it's still 0.35. So our new threshold is\n
2403
04:58:05,297 --> 04:58:08,457
the probing function. Alright
2404
04:58:08,457 --> 04:58:14,789
so let's allocate a new chunk of memory for\n
2405
04:58:14,790 --> 04:58:26,990
the old elements in our old table into this\n
2406
04:58:26,990 --> 04:58:33,860
right. So we scan across all these elements,\n
2407
04:58:33,860 --> 04:58:39,350
along. So from before we knew that hash value\n
2408
04:58:39,349 --> 04:58:45,439
is going to go at position 10. Scan along\n
2409
04:58:45,439 --> 04:58:52,459
three was 10. So it should go in position\n
2410
04:58:52,459 --> 04:59:00,340
so we have to keep probing. So if we add our\n
2411
04:59:00,340 --> 04:59:06,790
we get 10 plus five, which is 15. So we're\n
2412
04:59:06,790 --> 04:59:11,878
table, keep probing nothing here and nothing\n
2413
04:59:11,878 --> 04:59:18,878
k two. So we know from before k two is equal\n
2414
04:59:18,878 --> 04:59:27,100
eight. Now we know k one is equal to 10. So\n
2415
04:59:27,099 --> 04:59:32,729
that's taken, so we probe. So the next position\n
2416
04:59:32,729 --> 04:59:36,957
us 20. So insert k one v one at 20.
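The two linear probing walkthroughs above can be checked with a short Java sketch. The multipliers are assumptions inferred from the narration rather than stated outright in this excerpt: p(x) = 6x for the table of size 9 (which matches the cycle 5, 2, 8) and p(x) = 5x for the table of size 12 (which matches "10 plus five"). The class and method names are hypothetical.

```java
import java.util.LinkedHashSet;
import java.util.Set;

class LinearProbeCycle {
    // Greatest common divisor via Euclid's algorithm.
    static int gcd(int a, int b) {
        return b == 0 ? a : gcd(b, a % b);
    }

    // Slots reached by (hash + a * x) mod n for x = 0, 1, 2, ...
    // Stops as soon as a slot repeats, i.e. once the sequence starts cycling.
    static Set<Integer> reachableSlots(int hash, int a, int n) {
        Set<Integer> seen = new LinkedHashSet<>();
        for (int x = 0; x < n; x++) {
            if (!seen.add((hash + a * x) % n)) break;
        }
        return seen;
    }

    // Number of used slots at which the table gets resized.
    static int threshold(double maxLoadFactor, int capacity) {
        return (int) (maxLoadFactor * capacity);
    }

    public static void main(String[] args) {
        // Assumed p(x) = 6x, N = 9: GCD is 3, so only three slots are reachable.
        System.out.println(gcd(6, 9) + " " + reachableSlots(5, 6, 9)); // 3 [5, 2, 8]
        // Assumed p(x) = 5x, N = 12: GCD is 1, so every slot is reachable.
        System.out.println(gcd(5, 12) + " " + reachableSlots(10, 5, 12).size()); // 1 12
        // After doubling to N = 24 the GCD is still 1, and the threshold is 8.
        System.out.println(gcd(5, 24) + " " + threshold(0.35, 24)); // 1 8
    }
}
```

When the GCD of the multiplier and the table size is 1, the probe sequence is a full cycle, which is why the second example never gets stuck.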
2417
04:59:38,069 --> 04:59:43,409
so now we throw away the old table and we\n
2418
04:59:43,409 --> 04:59:52,939
table we're working with and we were at inserting\n
2419
04:59:52,939 --> 05:00:00,039
And that spot is free. So we are good. So\n
2420
05:00:00,040 --> 05:00:06,388
I know how insertion works. Now how do I remove\n
2421
05:00:06,387 --> 05:00:14,069
open addressing? And my answer to this is that\n
2422
05:00:14,069 --> 05:00:19,389
And we're going to do it after we see all\n
2423
05:00:19,389 --> 05:00:25,968
it's actually non trivial. All right, let's\n
2424
05:00:25,968 --> 05:00:34,619
works. Let's dive right in. So let's recall\n
2425
05:00:34,619 --> 05:00:41,569
of size N using the open addressing collision\n
2426
05:00:41,569 --> 05:00:47,759
a variable called x to be one, which we're\n
2427
05:00:47,759 --> 05:00:57,319
to find a free slot, then we compute the key\n
2428
05:00:57,319 --> 05:01:04,279
going to check and we're going to the loop.\n
2429
05:01:04,279 --> 05:01:10,090
the table at that index is not equal to null,\n
2430
05:01:10,090 --> 05:01:16,880
happens, we're going to offset the key hash\n
2431
05:01:16,880 --> 05:01:23,650
in our case is going to be a quadratic function.\n
2432
05:01:23,650 --> 05:01:31,530
we will find an open slot to insert our key\n
2433
05:01:31,529 --> 05:01:39,029
probing? So quadratic probing is simply probing\n
2434
05:01:39,029 --> 05:01:45,638
when our probing function looks something\n
2435
05:01:45,638 --> 05:01:52,128
c, and a, b, and c are all constants. And\n
2436
05:01:52,128 --> 05:02:01,540
we degrade to linear probing. But as we saw\n
2437
05:02:01,540 --> 05:02:06,781
functions are viable because they don't produce\n
2438
05:02:06,781 --> 05:02:13,628
an infinite loop. So as it turns out, most\n
2439
05:02:13,628 --> 05:02:20,340
will end up producing a cycle. Here's an example.\n
2440
05:02:20,340 --> 05:02:29,150
to be p of x equals 2x squared plus two, the\n
2441
05:02:29,150 --> 05:02:38,170
the current table size was nine, then we would\n
2442
05:02:38,169 --> 05:02:45,869
we would probe at position zero, we would\n
2443
05:02:45,869 --> 05:02:54,259
suppose those two entries are full, and then\n
2444
05:02:54,259 --> 05:03:00,639
function is only ever able to hit the buckets,\n
2445
05:03:00,639 --> 05:03:06,739
to reach all the other buckets, 0, 1, 2, 3, 5, 6, and\n
2446
05:03:06,740 --> 05:03:13,100
when four and seven are already occupied.\n
2447
05:03:13,099 --> 05:03:20,769
then is, how do we pick a probing function,\n
2448
05:03:20,770 --> 05:03:27,450
numerous ways. But here are the three most\n
2449
05:03:27,450 --> 05:03:36,031
is to select the probing function to be p\n
2450
05:03:36,031 --> 05:03:43,878
size a prime number greater than three, and\n
2451
05:03:43,878 --> 05:03:50,378
one half or less than or equal to one half.\n
2452
05:03:50,378 --> 05:03:57,270
equals x squared plus x divided by two, and\n
2453
05:03:57,270 --> 05:04:07,128
And the last and final one says that p of\n
2454
05:04:07,128 --> 05:04:13,610
and keep the table size a prime number where\n
2455
05:04:13,610 --> 05:04:22,137
we can say that our table size was 23,\n
2456
05:04:22,137 --> 05:04:29,669
to three mod four. So any of these will work.\n
2457
05:04:29,669 --> 05:04:35,119
how they work and whether table size should\n
2458
05:04:37,349 --> 05:04:42,840
So we're going to focus on the second one\n
2459
05:04:42,840 --> 05:04:51,637
by two and the table size is a power of two.\n
2460
05:04:51,637 --> 05:04:59,979
hash table, and we want to insert some key\n
2461
05:04:59,979 --> 05:05:07,229
probing function, p of x equals x squared\n
2462
05:05:07,229 --> 05:05:14,878
is a power of two, so it's eight. And that's\n
2463
05:05:14,878 --> 05:05:22,409
the table threshold is going to be three.\n
2464
05:05:22,409 --> 05:05:29,360
absolutely be a power of two, otherwise, this\n
2465
05:05:29,360 --> 05:05:41,319
guy. So suppose that k one hashes to six, then\n
2466
05:05:41,319 --> 05:05:46,718
Right, next k two, suppose k two is equal\n
2467
05:05:46,718 --> 05:05:54,860
five, no collision there. Suppose k threes\n
2468
05:05:54,860 --> 05:06:00,569
need to handle that. So we're going to try\n
2469
05:06:00,569 --> 05:06:08,369
to six. So we probe again, and that brings\n
2470
05:06:08,369 --> 05:06:17,159
going to insert the k three and v three key value\n
2471
05:06:17,159 --> 05:06:21,919
before we can do that, we've reached the table\n
2472
05:06:21,919 --> 05:06:28,529
first. Okay, so let's allocate a new block\n
2473
05:06:28,529 --> 05:06:37,430
table to keep it a power of two. So our new\n
2474
05:06:37,430 --> 05:06:48,128
However, a new threshold is six, and the probing\n
2475
05:06:48,128 --> 05:06:54,630
the entries in the old hash table into the\n
2476
05:06:54,630 --> 05:07:02,208
k three hashed to five, so we're going to\n
2477
05:07:02,207 --> 05:07:08,717
there. And no element at position 1, 2, 3, or 4,\n
2478
05:07:08,718 --> 05:07:16,479
right there. So we know from before that key\n
2479
05:07:16,479 --> 05:07:23,457
at position five, there's a hash collision,\n
2480
05:07:23,457 --> 05:07:33,207
one equals six, position six, or insert k\n
2481
05:07:33,207 --> 05:07:38,557
from before, k one hashed to six, but we can't\n
2482
05:07:38,558 --> 05:07:46,590
So we're going to probe along. So we're going\n
2483
05:07:46,590 --> 05:07:51,619
that does it for resizing the table. So let's\n
2484
05:07:51,619 --> 05:08:03,539
to insert inside our table. So suppose that\n
2485
05:08:03,540 --> 05:08:11,680
by 16 gives us position two, so we're going\n
2486
05:08:11,680 --> 05:08:17,920
we've already seen k three, and we know its\n
2487
05:08:17,919 --> 05:08:23,637
is already in our hash table, we're going\n
2488
05:08:23,637 --> 05:08:30,250
probing functions, zero gives us five. So\n
2489
05:08:30,250 --> 05:08:46,218
v3, which it was before. So suppose that the\n
2490
05:08:46,218 --> 05:08:55,440
to three mod 16. So that's why it's free.\n
2491
05:08:55,439 --> 05:09:03,180
seven suppose hashes to well, we have a collision\n
2492
05:09:03,180 --> 05:09:08,540
we probe, our probing function gives us an\n
2493
05:09:08,540 --> 05:09:20,069
can so now we are at position five, but that's\n
2494
05:09:20,069 --> 05:09:26,770
for a fourth time scan offset of six. So that\ngives us eight.
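The probe positions in this walkthrough can be reproduced with a small Java sketch of p(x) = (x^2 + x) / 2. It assumes the key being inserted hashes to slot 2 in the table of size 16, a value inferred from the positions 5 and 8 mentioned above and not stated in this excerpt. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class QuadraticProbeDemo {
    // Probing function p(x) = (x^2 + x) / 2, i.e. offsets 0, 1, 3, 6, 10, ...
    static int probe(int x) {
        return (x * x + x) / 2;
    }

    // First `count` slots visited when starting from `hash`.
    static int[] probeSequence(int hash, int capacity, int count) {
        int[] slots = new int[count];
        for (int x = 0; x < count; x++) {
            slots[x] = (hash + probe(x)) % capacity;
        }
        return slots;
    }

    // True if the first `capacity` probes visit every slot exactly once.
    static boolean coversAllSlots(int hash, int capacity) {
        Set<Integer> seen = new HashSet<>();
        for (int slot : probeSequence(hash, capacity, capacity)) {
            seen.add(slot);
        }
        return seen.size() == capacity;
    }

    public static void main(String[] args) {
        // Assumed hash of 2, table size 16: offsets 0, 1, 3, 6.
        System.out.println(Arrays.toString(probeSequence(2, 16, 4))); // [2, 3, 5, 8]
        System.out.println(coversAllSlots(2, 16)); // true: 16 is a power of two
        System.out.println(coversAllSlots(2, 12)); // false: 12 is not
    }
}
```

The last two lines illustrate why the narration insists on a power-of-two table size for this probing function: with size 12, some slots are never visited within the first 12 probes.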
2495
05:09:26,770 --> 05:09:34,659
That slot is free. We're going to insert that\n
2496
05:09:34,659 --> 05:09:43,110
and the double hashing, open addressing collision\n
2497
05:09:43,110 --> 05:09:50,069
for those of you who don't know how we do\n
2498
05:09:50,069 --> 05:09:59,259
we start with a variable x initialized to\n
2499
05:09:59,259 --> 05:10:06,759
We set that to be the first index that we're\n
2500
05:10:06,759 --> 05:10:15,637
it's not null. So the goal is to find an empty\n
2501
05:10:15,637 --> 05:10:23,000
still hitting spots where there are already\n
2502
05:10:23,000 --> 05:10:31,878
to offset our key hash using a probing function.\n
2503
05:10:31,878 --> 05:10:38,781
probing function. And we're also going to\n
2504
05:10:38,781 --> 05:10:44,009
along further and further. Once you find a\n
2505
05:10:44,009 --> 05:10:51,329
pair into the hash table. Okay, so what's\n
2506
05:10:51,330 --> 05:10:57,290
is just a probing method like any other. But\n
2507
05:10:57,290 --> 05:11:03,590
to a constant multiple of another hash function.\n
2508
05:11:03,590 --> 05:11:10,009
something like this, we give it as input,\n
2509
05:11:10,009 --> 05:11:19,967
x, and we compute x times h sub two of k,\n
2510
05:11:19,968 --> 05:11:28,409
here's an important note, h two of k must hash\n
2511
05:11:28,409 --> 05:11:38,941
your key is a string, well, h two of K must\n
2512
05:11:38,941 --> 05:11:46,129
then h two of k must also hash integers. So\n
2513
05:11:46,130 --> 05:11:53,048
that double hashing actually reduces to linear\n
2514
05:11:53,047 --> 05:12:01,529
until runtime, because we dynamically compute\n
2515
05:12:01,529 --> 05:12:07,020
reduces to linear probing at runtime, we may\n
2516
05:12:07,020 --> 05:12:15,670
probing, which is that we get stuck in an\n
2517
05:12:15,669 --> 05:12:23,199
our secondary hash function at runtime calculate\n
2518
05:12:23,200 --> 05:12:29,840
h one of k was four, and the table size was nine,\n
2519
05:12:29,840 --> 05:12:38,540
occurring. So the cycle produces values of\n
2520
05:12:38,540 --> 05:12:47,720
to reach any of the buckets 0, 2, 3, 5, 6, and 8.\n
2521
05:12:47,720 --> 05:12:51,208
that means we're stuck in an infinite loop\n
2522
05:12:51,207 --> 05:12:56,770
to insert our key value pair because we're\n
2523
05:12:56,770 --> 05:13:04,171
issue we have to deal with. So to fix the\n
2524
05:13:04,170 --> 05:13:10,269
one strategy, we're going to pick our table\n
2525
05:13:10,270 --> 05:13:20,029
to compute a value called delta. So delta\n
2526
05:13:20,029 --> 05:13:25,599
occasionally Delta might be zero. And if that's\n
2527
05:13:25,599 --> 05:13:29,930
to be stuck in a cycle because we're not going\n
2528
05:13:29,930 --> 05:13:35,957
going to be multiplying by zero. So when this\n
2529
05:13:35,957 --> 05:13:43,319
All right. So here's the justification why\n
2530
05:13:43,319 --> 05:13:49,547
going to be between one inclusive and N\n
2531
05:13:49,547 --> 05:13:58,869
between delta and n is going to be one, since\n
2532
05:13:58,869 --> 05:14:05,950
conditions, we know that the probing sequence\n
2533
05:14:05,950 --> 05:14:12,790
to be able to hit every single slot in our hash\n
2534
05:14:12,790 --> 05:14:17,620
free slot in the hash table, which there will\n
2535
05:14:17,619 --> 05:14:21,861
a certain threshold, that we're going to be\n
2536
05:14:21,862 --> 05:14:29,450
Okay, so here's a core question, how do we\n
2537
05:14:29,450 --> 05:14:37,240
the keys we're using have type T, and whenever\n
2538
05:14:37,240 --> 05:14:44,909
h two of k, to hash keys that are also of type\n
2539
05:14:44,909 --> 05:14:50,779
systematic way of generating these new hash\n
2540
05:14:50,779 --> 05:14:58,039
we might be dealing with multiple different\n
2541
05:14:58,040 --> 05:15:04,600
computer science, almost every object we ever\n
2542
05:15:04,599 --> 05:15:10,489
building blocks, in particular integers, strings,\n
2543
05:15:10,490 --> 05:15:18,270
on. So we can use this to our advantage. Luckily,\n
2544
05:15:18,270 --> 05:15:24,887
these fundamental data types. And we can combine\n
2545
05:15:24,887 --> 05:15:33,409
function h two of K. Frequently, when we compose\n
2546
05:15:33,409 --> 05:15:38,630
functions called Universal hash functions,\n
2547
05:15:38,630 --> 05:15:44,468
data types, which is quite convenient. Alright,\n
2548
05:15:44,468 --> 05:15:51,218
hashing. So suppose we have an originally\n
2549
05:15:51,218 --> 05:15:57,792
probing function to be p of x equals x squared\n
2550
05:15:57,792 --> 05:16:04,260
be in our table size v n equals seven. Notice\n
2551
05:16:04,259 --> 05:16:10,769
max load factor to be alpha equals point seven,\n
2552
05:16:10,770 --> 05:16:17,700
So once we hit five elements, we need to grow\n
2553
05:16:17,700 --> 05:16:23,430
all these key value pairs on the left into\n
2554
05:16:23,430 --> 05:16:30,650
k one and v one. Now suppose that the hash\n
2555
05:16:30,650 --> 05:16:39,468
is 67, and h two of k one is 34. And the first thing\n
2556
05:16:39,468 --> 05:16:48,110
two of K one modulo seven, which is the table\n
2557
05:16:48,110 --> 05:16:59,292
where this key value pair should go and should\n
2558
05:16:59,292 --> 05:17:14,308
h one of K two is two, and H, two of K two\n
2559
05:17:14,308 --> 05:17:21,229
just going to insert at position two, because\n
2560
05:17:21,229 --> 05:17:32,250
is two, and h two of k three is 10. These are just\n
2561
05:17:32,250 --> 05:17:40,560
So then delta would be three in this\n
2562
05:17:40,560 --> 05:17:49,708
have a hash collision, because we're trying\n
2563
05:17:49,707 --> 05:17:56,251
two is already there. So what we need to do\n
2564
05:17:56,251 --> 05:18:04,729
the position to be our original hash function\n
2565
05:18:04,729 --> 05:18:10,790
plus one times our delta value, mod seven,\n
2566
05:18:10,790 --> 05:18:20,990
to position five right there. Now, right,\n
2567
05:18:20,990 --> 05:18:28,860
h one of K four is equal to two, and h two\n
2568
05:18:28,860 --> 05:18:35,430
delta. So h two of K four modulo seven is\n
2569
05:18:35,430 --> 05:18:41,939
is zero. So when this happens, we know that\n
2570
05:18:41,939 --> 05:18:50,779
we don't get stuck in an infinite loop. So\n
2571
05:18:50,779 --> 05:18:57,807
mod seven gives us two. So we have a hash\n
2572
05:18:57,808 --> 05:19:08,170
you keep probing. So now we probed by multiplying\n
2573
05:19:10,819 --> 05:19:16,669
So now we're going to try insert k three,\n
2574
05:19:16,669 --> 05:19:22,859
but with a new value. So we're going to be\n
2575
05:19:22,860 --> 05:19:30,990
h one of k three is equal to two. Actually, we should\n
2576
05:19:30,990 --> 05:19:41,020
is 10. So compute delta. So we have a collision.\n
2577
05:19:41,020 --> 05:19:50,468
and update its value. Now suppose the first\n
2578
05:19:50,468 --> 05:19:58,010
secondary hash function of k six is 23. Then\n
2579
05:19:58,009 --> 05:20:04,090
we try to insert it, it goes to position So\n
2580
05:20:04,090 --> 05:20:11,349
delta mod seven, that gives us five. There's\n
2581
05:20:11,349 --> 05:20:17,127
offset at two times delta mod seven, which\n
2582
05:20:17,128 --> 05:20:24,920
we're able to insert our key value pair there.\n
2583
05:20:24,919 --> 05:20:30,899
so it's time to resize and grow the table,\n
2584
05:20:30,900 --> 05:20:35,590
So one strategy when we're trying to resize\n
2585
05:20:35,590 --> 05:20:41,950
need to keep our table size to be a prime\n
2586
05:20:41,950 --> 05:20:47,270
find the next prime number above this value.\n
2587
05:20:47,270 --> 05:20:56,128
and the next prime number above 14 is 17. So 17\n
2588
05:20:56,128 --> 05:21:05,690
a new table of size 17, and go through the\n
2589
05:21:05,689 --> 05:21:15,079
into the new table. So from before, a 12k,\n
2590
05:21:15,080 --> 05:21:22,150
compute delta, and we know we're going to\n
2591
05:21:22,150 --> 05:21:29,580
collision. Next up, and nothing in position\n
2592
05:21:29,580 --> 05:21:37,420
value for K to two and the secondary hash\n
2593
05:21:37,419 --> 05:21:46,079
to compute delta to be six. So we're going\n
2594
05:21:46,080 --> 05:21:55,240
Next one, k four. So we know h one of K four is\n
2595
05:21:55,240 --> 05:22:02,231
compute our delta value. And notice that our\n
2596
05:22:02,230 --> 05:22:13,957
before, but seven because our mod is a 17\n
2597
05:22:13,957 --> 05:22:21,250
But we need to keep probing. So compute the\n
2598
05:22:21,250 --> 05:22:32,119
nine mod 17. Next one, insert k one, suppose\n
2599
05:22:32,119 --> 05:22:42,590
then compute delta. And that gives us zero.\n
2600
05:22:42,590 --> 05:22:49,811
h one of K one plus zero times delta gives\n
2601
05:22:49,811 --> 05:22:57,780
probing. now compute the offset at one times\n
2602
05:22:57,779 --> 05:23:04,649
the x value. So now two times delta t is four,\n
2603
05:23:04,650 --> 05:23:12,620
value pair there. And the last one, k three. So\n
2604
05:23:12,619 --> 05:23:22,218
K three is 10. Delta is then 10. And we have\n
2605
05:23:22,218 --> 05:23:30,940
us 12. And that slot is free. And we reached\n
2606
05:23:30,939 --> 05:23:36,669
old table and replace it with a new table.\n
2607
05:23:36,669 --> 05:23:45,489
pair from before which is k seven. Suppose\n
2608
05:23:45,490 --> 05:23:54,878
is three, then our delta value is three. And\n
2609
05:23:54,878 --> 05:24:02,718
right, I know a lot of you have been anticipating\n
2610
05:24:02,718 --> 05:24:08,909
from a hash table using the open addressing\nscheme.
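Before moving on to removal, the double hashing probe used in the walkthrough above can be summarized in a short Java sketch: delta is the secondary hash modulo the table size, replaced by 1 when it comes out as zero, and the probe position is h1(k) + x * delta modulo the table size. The class and method names are hypothetical, and the secondary hash value 7 in the zero-delta case is an illustrative assumption (any multiple of the table size behaves the same way).

```java
class DoubleHashingDemo {
    // delta = h2(k) mod N, forced to 1 when it would be 0 so probing still advances.
    static int delta(int h2, int capacity) {
        int d = Math.floorMod(h2, capacity);
        return d == 0 ? 1 : d;
    }

    // Slot visited on the x-th probe: (h1(k) + x * delta) mod N.
    static int probeIndex(int h1, int h2, int x, int capacity) {
        return Math.floorMod(h1 + x * delta(h2, capacity), capacity);
    }

    public static void main(String[] args) {
        // k three from the walkthrough: h1 = 2, h2 = 10, table size 7.
        System.out.println(delta(10, 7));            // 3
        System.out.println(probeIndex(2, 10, 1, 7)); // 5
        // Assumed secondary hash that is a multiple of 7: delta falls back to 1.
        System.out.println(delta(7, 7));             // 1
        // After growing to the prime size 17, the same key probes differently.
        System.out.println(probeIndex(2, 10, 1, 17)); // 12
    }
}
```

Because the table size is prime and delta lies between 1 and N - 1, the GCD of delta and N is 1, so the sequence can reach every slot.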
2611
05:24:08,909 --> 05:24:16,520
So let's have a look first at what issues\n
2612
05:24:16,520 --> 05:24:22,308
it naively, I think this is valuable. Because\n
2613
05:24:22,308 --> 05:24:29,458
hash table, and we're using a linear probing\n
2614
05:24:29,457 --> 05:24:37,637
equal to x. And we want to perform the following\n
2615
05:24:37,637 --> 05:24:47,557
one, k two, and k three, and then remove k two\n
2616
05:24:47,558 --> 05:24:53,900
for the sake of argument, let's assume that\n
2617
05:24:53,900 --> 05:25:03,100
and K three, all equal to one. This is a possible\n
2618
05:25:03,099 --> 05:25:10,377
hashes to one. So we're going to insert at\n
2619
05:25:10,378 --> 05:25:15,560
has a hash collision with K one which is already\n
2620
05:25:15,560 --> 05:25:23,830
so insert it in the next slot over, that's\n
2621
05:25:23,830 --> 05:25:29,410
let's probe Okay, another hash collision.\n
2622
05:25:34,520 --> 05:25:40,909
we are going to remove k two, and we're going\n
2623
05:25:40,909 --> 05:25:51,878
just going to clear the contents of the bucket\n
2624
05:25:51,878 --> 05:25:59,208
was equal to k two. So we haven't found the\n
2625
05:25:59,207 --> 05:26:06,569
have found k two. So we're just going to remove\n
2626
05:26:06,569 --> 05:26:13,729
table and obtain the value for K three. Let's\n
2627
05:26:13,729 --> 05:26:20,140
we get one and K one cycles here three. So\n
2628
05:26:20,140 --> 05:26:29,739
no element. So what does this mean? So since\n
2629
05:26:29,740 --> 05:26:34,940
we're forced to conclude that k three does\n
2630
05:26:34,939 --> 05:26:41,289
we would have encountered it before reaching\n
2631
05:26:41,290 --> 05:26:50,058
works inside a hash table. So this method\n
2632
05:26:50,058 --> 05:26:55,790
doesn't work. Because k three clearly exists\n
2633
05:26:55,790 --> 05:27:01,659
index three. So here's a solution to the removing.
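The failure just described can be reproduced in a few lines of Java, alongside the tombstone approach that the following passage introduces. This is a minimal sketch under stated assumptions, not the implementation from the video: the class name, the fixed capacity of 8, and the hash function that sends every key to slot 1 are all made up to force the same collisions as in the example.

```java
class TombstoneDemo {
    // Unique marker object; compared by identity, so it never matches a real key.
    static final String TOMBSTONE = new String("<tombstone>");

    final String[] keys = new String[8];
    final boolean useTombstones;

    TombstoneDemo(boolean useTombstones) {
        this.useTombstones = useTombstones;
    }

    // Assumption for the demo only: every key hashes to slot 1, forcing collisions.
    int hash(String key) {
        return 1;
    }

    void insert(String key) {
        int target = -1; // first reusable slot seen along the probe sequence
        for (int x = 0; x < keys.length; x++) {
            int i = (hash(key) + x) % keys.length; // linear probing, p(x) = x
            if (keys[i] == null) {
                if (target == -1) target = i;
                break;
            }
            if (keys[i] == TOMBSTONE) {
                if (target == -1) target = i;
            } else if (keys[i].equals(key)) {
                return; // key already present
            }
        }
        if (target == -1) throw new IllegalStateException("table is full");
        keys[target] = key;
    }

    // Returns the slot holding key, or -1 if an empty slot ends the search first.
    int find(String key) {
        for (int x = 0; x < keys.length; x++) {
            int i = (hash(key) + x) % keys.length;
            if (keys[i] == null) return -1;
            if (keys[i] != TOMBSTONE && keys[i].equals(key)) return i;
        }
        return -1;
    }

    void remove(String key) {
        int i = find(key);
        // Naive removal clears the slot; the fix leaves a tombstone behind.
        if (i != -1) keys[i] = useTombstones ? TOMBSTONE : null;
    }

    static boolean k3FoundAfterRemovingK2(boolean useTombstones) {
        TombstoneDemo table = new TombstoneDemo(useTombstones);
        table.insert("k1"); // slot 1
        table.insert("k2"); // slot 2
        table.insert("k3"); // slot 3
        table.remove("k2");
        return table.find("k3") != -1;
    }

    public static void main(String[] args) {
        System.out.println(k3FoundAfterRemovingK2(false)); // false: search stops at the cleared slot
        System.out.println(k3FoundAfterRemovingK2(true));  // true: search skips the tombstone
    }
}
```

With naive removal the lookup for k3 stops at the emptied slot 2 and wrongly reports the key as missing; with a tombstone in slot 2 the probe continues to slot 3 and finds it.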
2634
05:27:02,659 --> 05:27:08,090
an element, we're going to place a unique\n
2635
05:27:08,090 --> 05:27:14,650
element to indicate that a specific key value\n
2636
05:27:14,650 --> 05:27:20,000
we're doing a search, we're just going to\n
2637
05:27:20,000 --> 05:27:25,718
the deleted bucket with a tombstone like we\n
2638
05:27:25,718 --> 05:27:36,770
we now search for the key k three. Okay, so\n
2639
05:27:36,770 --> 05:27:42,100
equal to k three. So keep probing. Alright,\n
2640
05:27:42,100 --> 05:27:49,449
was deleted. So keep probing. All right, we\n
2641
05:27:49,450 --> 05:27:58,128
value v three as our answer for the search.\n
2642
05:27:58,128 --> 05:28:05,900
I have a lot of tombstones cluttering my hash\n
2643
05:28:05,900 --> 05:28:11,520
with tombstones is that we're actually going\n
2644
05:28:11,520 --> 05:28:17,727
table. So they're going to increase the load\n
2645
05:28:17,727 --> 05:28:25,047
resize the hash table. But there's also another\n
2646
05:28:25,047 --> 05:28:31,599
a new key value pair, then we can replace\n
2647
05:28:31,599 --> 05:28:38,557
key value pair. And I want to give you guys\n
2648
05:28:38,558 --> 05:28:44,510
Suppose this is our hash table of size eight,\n
2649
05:28:44,509 --> 05:28:53,849
p of x equals x squared plus x divided by\n
2650
05:28:53,849 --> 05:29:03,079
play doing this when we want to do a lookup.\n
2651
05:29:03,080 --> 05:29:10,530
inside the hash table and the hash value for\n
2652
05:29:10,529 --> 05:29:20,717
key k seven. So k seven hashes to five. So\n
2653
05:29:20,718 --> 05:29:27,968
four. So let's keep probing. So we probe quadratically.\n
2654
05:29:27,968 --> 05:29:36,790
and that's position six. Now, position six\n
2655
05:29:36,790 --> 05:29:42,500
we encounter which has a tombstone in it.\n
2656
05:29:42,500 --> 05:29:48,700
for later to perform an optimization. Okay,\n
2657
05:29:48,700 --> 05:29:54,968
seven yet. So when we probe at position two,\n
2658
05:29:54,968 --> 05:30:01,680
because we have a tombstone, so we have to keep\n
2659
05:30:01,680 --> 05:30:11,128
So let's keep probing, and Aha, we have found\n
2660
05:30:11,128 --> 05:30:20,058
v seven. Now, however, we can do an optimization,\n
2661
05:30:20,058 --> 05:30:28,298
four times to find k seven, because we just\n
2662
05:30:28,297 --> 05:30:35,887
an optimization we can do is to relocate the\n
2663
05:30:35,887 --> 05:30:42,128
position where there was a tombstone. So that\n
2664
05:30:42,128 --> 05:30:49,619
We call this lazy deletion or lazy relocation.\n
2665
05:30:49,619 --> 05:30:55,279
there with K seven v seven. And now we have\n
2666
05:30:55,279 --> 05:31:02,797
going to want to replace the old one with\n
2667
05:31:02,797 --> 05:31:07,957
going to be having a look at some source code\n
2668
05:31:07,957 --> 05:31:14,877
as a collision resolution scheme. And you\n
2669
05:31:14,878 --> 05:31:22,297
williamfiset slash data dash structures,\n
2670
05:31:22,297 --> 05:31:28,887
to find a whole bunch of hash table implementations.\n
2671
05:31:28,887 --> 05:31:34,399
here. In particular, we are quadratic probing,\n
2672
05:31:34,400 --> 05:31:40,458
all very similar to each other. So I will\n
2673
05:31:40,457 --> 05:31:47,557
curious, you can go on GitHub and check them\n
2674
05:31:47,558 --> 05:31:50,420
or slightly more different is the double hashing.
2675
05:31:50,419 --> 05:31:57,619
But other than that, they are really essentially\n
2676
05:31:57,619 --> 05:32:04,489
we're going to have a look at the quadratic\n
2677
05:32:04,490 --> 05:32:13,100
Alright, here we are inside the code for the\n
2678
05:32:13,099 --> 05:32:19,717
let's dive right in. So I have a class called\n
2679
05:32:19,718 --> 05:32:25,208
it takes in two generic types K and V. So\n
2680
05:32:25,207 --> 05:32:31,779
type. So you're gonna have to specify these\n
2681
05:32:31,779 --> 05:32:35,957
hash table for quadratic probing. So I have\n
2682
05:32:35,957 --> 05:32:42,467
to need. The first is the load factor, this\n
2683
05:32:42,468 --> 05:32:49,580
that we're willing to tolerate the current\n
2684
05:32:49,580 --> 05:32:56,298
we're willing to tolerate the modification\n
2685
05:32:58,860 --> 05:33:04,260
two instance variables that keep track of\n
2686
05:33:04,259 --> 05:33:08,489
the key count, which tracks the number of\n
2687
05:33:08,490 --> 05:33:15,370
table. Now, since we're doing open addressing,\n
2688
05:33:15,369 --> 05:33:22,450
inside an array. So instead of having one\n
2689
05:33:22,450 --> 05:33:28,119
decided to just allocate two different arrays,\n
2690
05:33:28,119 --> 05:33:35,110
the code a lot easier. And shorter. Actually,\n
2691
05:33:35,110 --> 05:33:41,690
to be using, or rather setting when we call\n
2692
05:33:41,689 --> 05:33:45,989
was talking about in the last video. This\n
2693
05:33:45,990 --> 05:33:52,030
deletions. So every time we delete an entry,\n
2694
05:33:52,029 --> 05:33:58,509
we know this tombstone object is unique. Alright,\n
2695
05:33:58,509 --> 05:34:03,619
So whenever you want to initialize a hash\n
2696
05:34:03,619 --> 05:34:09,329
these constants. So this is a default load\n
2697
05:34:09,330 --> 05:34:17,430
it up as you like. So you can initialize it\n
2698
05:34:17,430 --> 05:34:26,000
constructor. So let's have a look. So the\n
2699
05:34:26,000 --> 05:34:31,900
check if the user passed in some sort of a weird\n
2700
05:34:31,900 --> 05:34:40,138
then set the max value for the load factor,\n
2701
05:34:40,137 --> 05:34:49,899
that the capacity is a power of two, I need\n
2702
05:34:49,900 --> 05:34:58,930
going to be with this method, next power of two.\n
2703
05:34:58,930 --> 05:35:05,750
That is just Above this current number, or\n
2704
05:35:05,750 --> 05:35:11,009
itself is a power of two, so we don't have\n
2705
05:35:11,009 --> 05:35:15,599
to be a power of two, then we compute the\n
2706
05:35:15,599 --> 05:35:25,877
the capacity and initialize our tables. Alright,\n
2707
05:35:25,878 --> 05:35:35,297
probing function I chose. So P of x, so we\n
2708
05:35:35,297 --> 05:35:42,949
x squared plus x divided by two. So this is\n
2709
05:35:42,950 --> 05:35:50,049
So given a hash value, it essentially strips\n
2710
05:35:50,049 --> 05:35:59,000
So it dumps our hash value inside the domain,\n
2711
05:35:59,000 --> 05:36:05,128
a clear method. And this is pretty self explanatory.\n
2712
05:36:05,128 --> 05:36:14,792
hash table and start fresh. Some helper methods\n
2713
05:36:14,792 --> 05:36:21,490
table's empty. And put, add, and insert are essentially\n
2714
05:36:21,490 --> 05:36:27,950
insert method. This inserts a key value pair\n
2715
05:36:27,950 --> 05:36:34,869
the key already exists. All right, we don't\n
2716
05:36:34,869 --> 05:36:40,180
an exception. If the number of buckets used\n
2717
05:36:40,180 --> 05:36:46,689
we're tolerating, we're going to resize the\n
2718
05:36:46,689 --> 05:36:52,590
we want to calculate the hash value from the\n
2719
05:36:52,590 --> 05:36:59,040
you can override this for your particular\n
2720
05:36:59,040 --> 05:37:04,100
j, and x. So i is going to be the current index\n
2721
05:37:04,099 --> 05:37:11,899
to be bouncing around this I value is going\n
2722
05:37:11,900 --> 05:37:17,540
the first tombstone we encounter if we encounter\n
2723
05:37:17,540 --> 05:37:25,308
going to be using this for an optimization\n
2724
05:37:25,308 --> 05:37:32,450
to one, initially. Okay, so this is a do while\n
2725
05:37:34,529 --> 05:37:40,860
Alright, so first, we check in the key table,\n
2726
05:37:40,860 --> 05:37:46,819
is equal to minus one, that means we haven't\n
2727
05:37:46,819 --> 05:37:58,360
this tombstone. Okay, so this next check checks\n
2728
05:37:58,360 --> 05:38:03,797
meaning there's a key inside of it. So we\n
2729
05:38:03,797 --> 05:38:11,451
table. So that's what this does. It compares\n
2730
05:38:11,452 --> 05:38:22,409
trying to insert this key. And if j is equal\n
2731
05:38:22,409 --> 05:38:28,628
then just update the value. If we've hit a\n
2732
05:38:28,628 --> 05:38:36,708
the tombstone. And at the modification, count,\n
2733
05:38:36,707 --> 05:38:44,819
was there before, just in case you use\n
2734
05:38:44,819 --> 05:38:54,540
so we can do an insertion. So j is equal to\n
2735
05:38:54,540 --> 05:39:01,420
so far. So increment number of used buckets\n
2736
05:39:01,419 --> 05:39:13,679
pair. Otherwise, we have seen a tombstone\n
2737
05:39:13,680 --> 05:39:20,430
where the element is inserted where the deleted\n
2738
05:39:20,430 --> 05:39:27,297
of i. So here we're inserting at i, but\n
2739
05:39:27,297 --> 05:39:35,759
we're gonna return null because there was\n
2740
05:39:35,759 --> 05:39:41,807
a loop, so we get through all these if statements\n
2741
05:39:41,808 --> 05:39:49,620
need to keep probing we had a hash collision,\n
2742
05:39:49,619 --> 05:39:57,419
in it. So we need to probe so we need to offset\n
2743
05:39:57,419 --> 05:40:03,159
probing index or the probe. Addition and increment\n
2744
05:40:03,159 --> 05:40:08,290
to the next spot. And we'll do this while\n
2745
05:40:12,419 --> 05:40:20,679
so contains key and has key, just check if\n
2746
05:40:20,680 --> 05:40:27,430
do this, I'm being pretty lazy. And I'm just\n
2747
05:40:27,430 --> 05:40:33,797
instance variable in there called contains\n
2748
05:40:33,797 --> 05:40:40,201
key is inside our hash table or not. Because\n
2749
05:40:40,202 --> 05:40:46,978
have essentially the same code. So that's\n
2750
05:40:46,977 --> 05:40:53,739
at the get method since it's getting used\n
2751
05:40:53,740 --> 05:41:02,440
the original hash index is equal to the hash\n
2752
05:41:02,439 --> 05:41:10,127
do all the same stuff, or mostly except set\n
2753
05:41:10,128 --> 05:41:20,659
flag to be true, when we identify that the\n
2754
05:41:20,659 --> 05:41:26,968
our else condition is just shorter, we return\n
2755
05:41:26,968 --> 05:41:34,887
a new element, and set the contains flag\n
2756
05:41:34,887 --> 05:41:42,959
Remove method is actually quite a bit shorter.\n
2757
05:41:42,959 --> 05:41:49,319
is now find the hash set x to be equal to\n
2758
05:41:49,319 --> 05:41:57,529
too much. So we don't have a j position. So\n
2759
05:41:57,529 --> 05:42:07,878
probe until we find a spot. So for every loop,\n
2760
05:42:07,878 --> 05:42:15,279
position if this loop gets completed, so here's\n
2761
05:42:15,279 --> 05:42:22,610
skip over those. So if this happens, if the\n
2762
05:42:22,610 --> 05:42:29,690
return null. Otherwise, the key we want to\n
2763
05:42:29,689 --> 05:42:33,957
this check, because we check if it's null\n
2764
05:42:33,957 --> 05:42:45,569
So decrement, the key, count up the modification\n
2765
05:42:45,569 --> 05:42:52,619
here, and just wipe whatever value is in there.\n
2766
05:42:52,619 --> 05:42:59,180
the Remove method, and then just return the\n
2767
05:42:59,180 --> 05:43:04,330
okay, these two methods are pretty self explanatory,\n
2768
05:43:04,330 --> 05:43:09,718
that are contained within our hash table.\n
2769
05:43:09,718 --> 05:43:17,159
resize table method. So this gets\n
2770
05:43:17,159 --> 05:43:24,452
I mean, to grow the table size. And remember\n
2771
05:43:24,452 --> 05:43:30,308
implementation, we always need the capacity\n
2772
05:43:30,308 --> 05:43:35,540
the capacity is already a power of two, multiplying\n
2773
05:43:35,540 --> 05:43:46,920
fine. So we compute the new threshold, allocate\n
2774
05:43:46,919 --> 05:43:56,169
table, but it's actually going to be the new\n
2775
05:43:56,169 --> 05:44:03,899
of interesting maneuver here, I swap the current\n
2776
05:44:03,900 --> 05:44:10,670
table, which I call an old table. In order\n
2777
05:44:10,669 --> 05:44:17,569
here, we'll get to that. So swap the key tables,\n
2778
05:44:17,569 --> 05:44:25,759
count and the bucket count. And the reason\n
2779
05:44:25,759 --> 05:44:37,069
the swap, well, then the new table is\n
2780
05:44:37,069 --> 05:44:46,319
the old table. That might sound confusing,\n
2781
05:44:46,319 --> 05:44:53,270
insertions on or the pointer to it. Alright,\n
2782
05:44:53,270 --> 05:45:01,159
if we encounter a token or a pointer that's\n
2783
05:45:02,159 --> 05:45:08,000
So because we're avoiding reinserting tombstones,\n
2784
05:45:08,000 --> 05:45:12,707
even though our table might have been cluttered\n
2785
05:45:12,707 --> 05:45:20,090
all of them here. Alright, so that's that\n
2786
05:45:20,090 --> 05:45:26,659
at this yourself. It's just looping through\n
2787
05:45:26,659 --> 05:45:34,430
That's a pretty standard toString method.\n
2788
05:45:34,430 --> 05:45:39,457
open addressing. Today, I want to talk about\n
2789
05:45:39,457 --> 05:45:45,349
binary indexed tree, and you'll see why very\n
2790
05:45:45,349 --> 05:45:51,709
because it's such a powerful data structure.\n
2791
05:45:51,709 --> 05:45:58,529
really simple to code. So let's dive right\n
2792
05:45:58,529 --> 05:46:04,129
video series, and just some standard stuff.\n
2793
05:46:04,130 --> 05:46:11,250
structure exists, analyze the time complexity,\n
2794
05:46:11,250 --> 05:46:16,869
So in this video, we'll get to the range query,\n
2795
05:46:16,869 --> 05:46:24,128
and how to construct the Fenwick tree in linear\n
2796
05:46:24,128 --> 05:46:33,128
but I'm not going to be covering that in this\n
2797
05:46:33,128 --> 05:46:40,208
the motivation behind the Fenwick tree. So\n
2798
05:46:40,207 --> 05:46:49,637
and we want to query a range and find the\n
2799
05:46:49,637 --> 05:47:00,250
do would be to start at the position and scan\n
2800
05:47:00,250 --> 05:47:05,909
all the individual values between that range.\n
2801
05:47:05,909 --> 05:47:12,970
it'll soon get pretty slow, because we're\n
2802
05:47:12,970 --> 05:47:21,110
However, if we do something like compute all\n
2803
05:47:21,110 --> 05:47:30,292
do queries in constant time, which is really,\n
2804
05:47:30,292 --> 05:47:39,720
zero to be zero, and then we go in our array\n
2805
05:47:39,720 --> 05:47:46,900
sum, we get five and then five, plus or minus\n
2806
05:47:46,900 --> 05:47:54,470
is eight, and so on. So this is an elementary\n
2807
05:47:54,470 --> 05:48:02,600
out prefix sums. And then if we want to find\n
2808
05:48:02,599 --> 05:48:11,430
then we can get the difference between those\n
2809
05:48:11,430 --> 05:48:18,099
time thing to compute. The sum of the values\n
2810
05:48:18,099 --> 05:48:25,489
really great. However, there's a slight flaw\n
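[Editor's sketch] The prefix sum idea described in these captions can be illustrated roughly as follows. This is a minimal sketch, not the narrator's code: the class name `PrefixSumSketch`, the zero-based half-open range convention, and the sample values are assumptions made for illustration.

```java
// Illustrative sketch only; names and conventions are hypothetical.
final class PrefixSumSketch {
    // Returns p where p[0] = 0 and p[k] = a[0] + ... + a[k-1].
    static long[] build(int[] a) {
        long[] p = new long[a.length + 1];
        for (int k = 0; k < a.length; k++) {
            p[k + 1] = p[k] + a[k];
        }
        return p;
    }

    // Sum of a[i..j) in constant time: the difference of two prefix sums.
    static long rangeSum(long[] p, int i, int j) {
        return p[j] - p[i];
    }
}
```

Building the table is linear and each query is constant time, but changing a single value of the original array forces every later prefix sum to be recomputed, which is the flaw the narration goes on to describe.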
2811
05:48:25,490 --> 05:48:32,990
to update a value in our original array A,\n
2812
05:48:32,990 --> 05:48:40,659
Well, now we have to recompute all the prefix\n
2813
05:48:40,659 --> 05:48:46,169
recalculate all those prefix sums. And to\n
2814
05:48:46,169 --> 05:48:54,039
tree was essentially created. So what is the\n
2815
05:48:54,040 --> 05:49:04,040
that supports range queries on arrays and\n
2816
05:49:04,040 --> 05:49:13,090
we won't be covering that in this video. So\n
2817
05:49:13,090 --> 05:49:20,229
dates are logarithmic. Range sum queries\n
2818
05:49:20,229 --> 05:49:31,250
but you can't say add elements to the array\n
2819
05:49:34,457 --> 05:49:38,797
so let's look at how we can do range queries\n
2820
05:49:38,797 --> 05:49:46,319
is that unlike a regular array, in a Fenwick tree,\n
2821
05:49:46,319 --> 05:49:56,759
rather for a range of other cells as well.\n
2822
05:49:56,759 --> 05:50:04,739
for other cells depending on what the value\n
2823
05:50:04,740 --> 05:50:13,090
representation. So on the left, I have a one\n
2824
05:50:13,090 --> 05:50:19,029
very important. And I, on the side of that,\n
2825
05:50:19,029 --> 05:50:26,057
of the numbers, you can clearly see what they\n
2826
05:50:26,058 --> 05:50:35,690
index 12, its binary representation is 1100.\n
2827
05:50:35,689 --> 05:50:44,329
most bit. So that is at position three, and\n
2828
05:50:44,330 --> 05:50:51,280
responsible for we're going to say two to\n
2829
05:50:51,279 --> 05:51:03,009
cells below itself. Similarly, 10, has a binary\n
2830
05:51:03,009 --> 05:51:09,279
least significant bit is at position two.\n
2831
05:51:09,279 --> 05:51:18,378
And 11 has its least significant bit at position\n
2832
05:51:18,378 --> 05:51:29,378
So here, I've outlined the least significant\n
2833
05:51:29,378 --> 05:51:36,110
for all the odd numbers, which are just responsible\n
2834
05:51:36,110 --> 05:51:42,529
indicates, the blue bars don't represent value,\n
2835
05:51:42,529 --> 05:51:51,399
And that's really important for you to keep\n
2836
05:51:51,400 --> 05:51:59,100
a range of responsibility of two. Now the cells have\n
2837
05:51:59,099 --> 05:52:08,920
all these ranges of responsibilities are powers\n
2838
05:52:08,920 --> 05:52:18,479
16 for 16 cells. So now, how do we do a range\n
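[Editor's sketch] The range of responsibility described above is determined by the least significant set bit of a cell's one-based index. A minimal sketch, assuming 32-bit positive indices and a hypothetical class name `LsbSketch`:

```java
// Illustrative sketch only; the class name is hypothetical.
final class LsbSketch {
    // Isolates the lowest set bit of i using two's complement:
    // 12 (1100) -> 4, 10 (1010) -> 2, 11 (1011) -> 1, 16 (10000) -> 16.
    static int lsb(int i) {
        return i & -i;
    }
}
```

The value returned is the number of cells the index is responsible for. Java's standard library offers `Integer.lowestOneBit(i)`, which computes the same value.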
2839
05:52:18,479 --> 05:52:26,360
in this standard array, but rather this weird\n
2840
05:52:26,360 --> 05:52:32,770
answer is, we're going to calculate the prefix\n
2841
05:52:32,770 --> 05:52:37,220
eventually going to allow us to do a range\n
2842
05:52:37,220 --> 05:52:44,878
sums, just like we did for a regular array,\n
2843
05:52:44,878 --> 05:52:52,860
going to start at some index and cascade downwards\n
2844
05:52:52,860 --> 05:52:59,930
I mean. So for example, let's find the prefix\n
2845
05:52:59,930 --> 05:53:06,240
the prefix sum from index one to seven,\n
2846
05:53:06,240 --> 05:53:17,190
tree is inclusive. So if we look at where\n
2847
05:53:17,189 --> 05:53:25,419
the array at position seven. And then we want\n
2848
05:53:25,419 --> 05:53:34,709
us is six, and then four. Notice that we were\n
2849
05:53:34,709 --> 05:53:43,279
then we move down to six. And then from six,\n
2850
05:53:43,279 --> 05:53:48,759
responsibility two, and then we're at four\n
2851
05:53:48,759 --> 05:53:53,279
that brings us all the way down to zero. And\n
2852
05:53:53,279 --> 05:54:03,520
we are all the way down to zero. So the prefix\n
2853
05:54:03,520 --> 05:54:10,420
plus the array index six plus the array index\n
2854
05:54:10,419 --> 05:54:16,649
for index 11. So we always start at where\n
2855
05:54:16,650 --> 05:54:24,569
to cascade down. So the cell directly below\n
2856
05:54:24,569 --> 05:54:27,450
two, so we're gonna go down to
2857
05:54:27,450 --> 05:54:34,139
so that's eight and an eight brings us all\n
2858
05:54:34,139 --> 05:54:41,509
And one last one, let's find the prefix sum\n
2859
05:54:41,509 --> 05:54:47,659
four has a range of responsibility of exactly\n
2860
05:54:47,659 --> 05:54:54,869
so we can stop. Okay, let's pull this all\n
2861
05:54:54,869 --> 05:55:05,000
i and j. So let's calculate the interval sum\n
2862
05:55:05,000 --> 05:55:14,137
going to calculate the prefix sum of 15 and\n
2863
05:55:14,137 --> 05:55:19,369
of 15. And then we're going to calculate the\n
2864
05:55:19,369 --> 05:55:26,878
not going to calculate up to 11. inclusive,\n
2865
05:55:26,878 --> 05:55:36,308
a lot. Okay, so if we start at 15, then we\n
2866
05:55:36,308 --> 05:55:44,218
a range of responsibility of one, subtract one\n
2867
05:55:44,218 --> 05:55:53,159
responsibility of two, because the least significant\n
2868
05:55:53,159 --> 05:56:02,599
and then keep cascading down. So the prefix\n
2869
05:56:02,599 --> 05:56:11,457
plus 12 plus eight. All right, now the prefix\n
2870
05:56:11,457 --> 05:56:17,539
to start at 10. Now we want to cascade down\n
2871
05:56:17,540 --> 05:56:25,128
to subtract two from 10, we get to eight.\n
2872
05:56:25,128 --> 05:56:33,310
of eight, so cascade down, so eight minus\n
2873
05:56:33,310 --> 05:56:43,210
sum of all the indices of 15 minus those of\n
2874
05:56:43,209 --> 05:56:53,399
range sum. So notice that in the worst case,\n
2875
05:56:53,400 --> 05:57:00,069
which is all the ones and these are the numbers\n
2876
05:57:00,069 --> 05:57:08,590
of two, minus one. So a power of two has one\n
2877
05:57:08,590 --> 05:57:15,680
one, then your whole bunch of ones here. So\n
2878
05:57:15,680 --> 05:57:24,670
are ones. And those are the worst cases. So\n
2879
05:57:24,669 --> 05:57:34,429
say 15, and seven, both of which have a lot\n
2880
05:57:34,430 --> 05:57:40,670
base two of N operations. But in the average\n
2881
05:57:40,669 --> 05:57:46,269
going to implement this in such a way that\n
2882
05:57:46,270 --> 05:57:54,860
this is like super fast. So the range query\n
2883
05:57:54,860 --> 05:58:02,477
like literally no code, the range query from\n
2884
05:58:02,477 --> 05:58:09,439
so I'm going to define a function called prefix\n
2885
05:58:09,439 --> 05:58:17,619
down operation. So we start at i and while\n
2886
05:58:17,619 --> 05:58:28,899
up the values in our Fenwick tree. And we're\n
2887
05:58:28,900 --> 05:58:34,990
bit. And we're going to keep doing\n
2888
05:58:34,990 --> 05:58:42,740
can return the sum. So the range query manages\n
2889
05:58:42,740 --> 05:58:50,792
really neat little algorithm. I want to talk\n
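[Editor's sketch] The cascading-down prefix sum and the range query built on it might look roughly like this. It is a sketch under assumptions: the tree is a one-based `long[]` whose slot 0 is unused, and the names `FenwickQuerySketch`, `prefixSum`, and `rangeSum` are hypothetical rather than taken from the narrated source.

```java
// Illustrative sketch only; names are hypothetical.
final class FenwickQuerySketch {
    // Sum of the underlying values at one-based positions 1..i.
    static long prefixSum(long[] tree, int i) {
        long sum = 0;
        while (i != 0) {
            sum += tree[i];
            i -= i & -i; // remove the least significant bit to cascade down
        }
        return sum;
    }

    // Sum over the inclusive one-based interval [i, j].
    static long rangeSum(long[] tree, int i, int j) {
        return prefixSum(tree, j) - prefixSum(tree, i - 1);
    }
}
```

For the one-based values 1, 2, 3, 4 the Fenwick tree array is {0, 1, 3, 3, 10}, so `prefixSum(tree, 3)` adds cells 3 and 2 to give 6, and `rangeSum(tree, 2, 4)` gives 9.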
2890
05:58:50,792 --> 05:58:56,650
let's dive right in. But before we get to\n
2891
05:58:56,650 --> 05:59:05,140
the Fenwick tree range query video. And I\n
2892
05:59:05,139 --> 05:59:14,291
the Fenwick tree is set up and how we're doing\n
2893
05:59:14,292 --> 05:59:22,990
your brain on how we actually did a prefix\n
2894
05:59:22,990 --> 05:59:30,770
what we did was, we started at a value at\n
2895
05:59:30,770 --> 05:59:38,740
continuously removed the least significant\n
2896
05:59:41,330 --> 05:59:51,320
so let's say we started at 13 or 13, at least\n
2897
05:59:51,319 --> 06:00:00,459
and we got 12. And then we found out that\n
2898
06:00:00,459 --> 06:00:06,129
So we remove four and then at least thing,\n
2899
06:00:06,130 --> 06:00:14,530
zero. And once we reach zero, we know that\n
2900
06:00:14,529 --> 06:00:20,717
analogous to this, instead of removing, we're\n
2901
06:00:20,718 --> 06:00:29,170
as you'll see. So for instance, if we want\n
2902
06:00:29,169 --> 06:00:38,799
to find out all the cells which are responsible\n
2903
06:00:38,799 --> 06:00:45,540
for a range of responsibility. So if we start\n
2904
06:00:45,540 --> 06:00:54,628
and add it to nine, and we get 10. So 10,\n
2905
06:00:54,628 --> 06:01:03,501
two, then we add it to 10, then find the least\n
2906
06:01:03,501 --> 06:01:10,540
it's 16. And then we would do the same thing,\n
2907
06:01:10,540 --> 06:01:20,130
know to stop. So if I draw a line outwards\n
2908
06:01:20,130 --> 06:01:28,049
ones I have to update. So remember, those\n
2909
06:01:28,049 --> 06:01:36,399
So the lines that I hit are the ones that\n
2910
06:01:36,400 --> 06:01:45,058
Okay. So 14, add some constant x, at position\n
2911
06:01:45,058 --> 06:01:52,969
need to modify? So we start at six, and we find\n
2912
06:01:52,969 --> 06:02:02,479
we get eight, find the least significant bit\n
2913
06:02:02,479 --> 06:02:10,450
out from six, then indeed the cells that I\n
2914
06:02:10,450 --> 06:02:25,250
updates for our Fenwick tree are that we need\n
2915
06:02:25,250 --> 06:02:31,069
the algorithm is really, really simple. So\n
2916
06:02:31,069 --> 06:02:39,957
array of size n, then while i, so i is the position\n
2917
06:02:39,957 --> 06:02:47,229
less than n, we're going to add x to the tree\n
2918
06:02:47,229 --> 06:02:56,520
significant bit of i. And that's it, where\n
2919
06:02:56,520 --> 06:03:01,850
i. And there are built in functions to do\n
2920
06:03:01,849 --> 06:03:07,319
at some source code. Alright, so that was\n
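[Editor's sketch] The point update described above mirrors the prefix sum loop, adding the least significant bit instead of removing it. A minimal sketch, assuming a one-based `long[]` tree with slot 0 unused (so the loop bound is the array length); the names `FenwickUpdateSketch` and `add` are hypothetical.

```java
// Illustrative sketch only; names are hypothetical.
final class FenwickUpdateSketch {
    // Adds x to the value at one-based position i by updating every
    // cell whose range of responsibility covers position i.
    static void add(long[] tree, int i, long x) {
        while (i < tree.length) {
            tree[i] += x;
            i += i & -i; // jump to the next cell responsible for position i
        }
    }
}
```

With a tree of four usable cells, `add(tree, 3, 5)` touches cell 3 and then cell 4 and stops, because the next candidate, 8, is out of bounds.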
2921
06:03:07,319 --> 06:03:13,169
construct the Fenwick tree. Let's talk about\n
2922
06:03:13,169 --> 06:03:19,039
how to do range queries and point updates\n
2923
06:03:19,040 --> 06:03:24,708
seen how to construct the Fenwick tree yet.\n
2924
06:03:24,707 --> 06:03:32,599
you can't understand the Fenwick tree construction\n
2925
06:03:32,599 --> 06:03:39,451
updates work. Alright, so let's dive right\n
2926
06:03:39,452 --> 06:03:47,610
of a Fenwick tree. So if we're given an array\n
2927
06:03:47,610 --> 06:03:54,930
into a Fenwick tree, what we could do is initialize\n
2928
06:03:54,930 --> 06:04:01,808
all zeros and add the values into the Fenwick\n
2929
06:04:01,808 --> 06:04:10,260
get a total time complexity of order n log\n
2930
06:04:10,259 --> 06:04:20,299
do this in linear time. So why bother with\n
2931
06:04:20,299 --> 06:04:24,790
we're going to be given an array of values\n
2932
06:04:24,790 --> 06:04:33,120
a legitimate Fenwick tree, not just the array\n
2933
06:04:33,119 --> 06:04:40,409
going to propagate the values throughout our\n
2934
06:04:40,409 --> 06:04:47,121
And we're going to do this by updating the\n
2935
06:04:47,121 --> 06:04:52,169
as we pass through the entire tree, everyone's\n
2936
06:04:52,169 --> 06:04:59,069
a fully functional Fenwick tree at the end\n
2937
06:04:59,069 --> 06:05:05,650
idea. So you propagate something to the\n
2938
06:05:05,650 --> 06:05:11,690
that parent propagates its value to its parent\n
2939
06:05:11,689 --> 06:05:19,439
almost delegating the value. So let's see\n
2940
06:05:19,439 --> 06:05:28,930
that. So if the current position is position\n
2941
06:05:30,580 --> 06:05:39,150
our parent, let's say that is j, and j is\n
2942
06:05:39,150 --> 06:05:49,490
of i. Alright, so if we start at one, well,\n
2943
06:05:49,490 --> 06:05:57,930
the parent is at position two. So notice that\n
2944
06:05:57,930 --> 06:06:03,569
going to add two for the value of i, which\n
2945
06:06:03,569 --> 06:06:13,319
seven. Now, we want a bit position two. So\n
2946
06:06:13,319 --> 06:06:20,400
two. So two plus the least significant bit\n
2947
06:06:20,400 --> 06:06:28,819
for two are immediately responsible for two.\n
2948
06:06:28,819 --> 06:06:34,409
who is responsible for three? Well, that's\n
2949
06:06:34,409 --> 06:06:43,279
value at index three. Now, who's responsible\n
2950
06:06:43,279 --> 06:06:50,457
So go to position eight, and add the value\n
2951
06:06:50,457 --> 06:06:59,297
So in our five, and then you see how we keep\n
2952
06:06:59,297 --> 06:07:04,799
cell responsible for us, now seven is updating\neight.
2953
06:07:05,799 --> 06:07:12,329
now, nobody, oh, eight doesn't have a parent.\n
2954
06:07:12,330 --> 06:07:19,420
only has 12 cells, but the parent that would\n
2955
06:07:19,419 --> 06:07:27,911
out of bounds. So we just ignore it. It's\n
2956
06:07:27,911 --> 06:07:35,707
I is nine nines, least significant bit is\n
2957
06:07:35,707 --> 06:07:44,889
So keep propagating that value. 10's parent\n
2958
06:07:44,889 --> 06:07:50,849
the same sort of situation we had with eight\n
2959
06:07:50,849 --> 06:07:59,579
we ignore it. So the values that are there\n
2960
06:07:59,580 --> 06:08:07,728
And with these values, we can do range queries\n
2961
06:08:07,727 --> 06:08:14,520
that we had. So let's look at the construction\n
2962
06:08:14,520 --> 06:08:19,600
this, we will have a look at some source code\n
2963
06:08:19,599 --> 06:08:26,189
language that I'm not using, this can be helpful.\n
2964
06:08:26,189 --> 06:08:32,230
into a Fenwick tree. Let's get it's about\n
2965
06:08:32,230 --> 06:08:41,069
actually clone or make a deep copy of the\n
2966
06:08:41,069 --> 06:08:46,878
manipulate the values array while you're constructing\n
2967
06:08:46,878 --> 06:08:53,170
because we're doing all this stuff in place.\n
2968
06:08:53,169 --> 06:09:02,379
I at one and go up to n and then compute j\n
2969
06:09:02,380 --> 06:09:12,560
significant bit of i. Do an if statement to\n
2970
06:09:12,560 --> 06:09:16,521
be less than or equal to and actually not\n
2971
06:09:16,521 --> 06:09:22,610
based and in a Fenwick tree. Yeah, I'm pretty\n
2972
06:09:22,610 --> 06:09:28,340
let's have a look at some Fenwick tree source\n
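[Editor's sketch] The linear-time construction described above, where each cell adds its value into its immediate parent, could be sketched as follows. Assumptions: the input is one-based with slot 0 unused, the input array is cloned so the caller's data is left intact, and the names `FenwickBuildSketch` and `build` are hypothetical.

```java
// Illustrative sketch only; names are hypothetical.
final class FenwickBuildSketch {
    // values[0] is unused; values[1..n] hold the original data.
    static long[] build(long[] values) {
        long[] tree = values.clone(); // work on a copy, not the caller's array
        for (int i = 1; i < tree.length; i++) {
            int j = i + (i & -i);     // immediate parent of cell i
            if (j < tree.length) {    // ignore parents that fall out of bounds
                tree[j] += tree[i];
            }
        }
        return tree;
    }
}
```

Each cell is visited once and does a constant amount of work, so construction is linear, compared with n log n when performing n separate point updates.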
2973
06:09:28,340 --> 06:09:34,659
can find it at this link which I'll put in\n
2974
06:09:34,659 --> 06:09:43,840
data dash structures, and the Fenwick trees\n
2975
06:09:43,840 --> 06:09:53,457
tree folder. So let's dive right in. I have\n
2976
06:09:53,457 --> 06:10:00,387
Alright, so this source code is provided to\n
2977
06:10:00,387 --> 06:10:07,360
layer two, any language you're working. So\n
2978
06:10:07,360 --> 06:10:15,110
constructors. One that'll create an empty\n
2979
06:10:15,110 --> 06:10:21,718
populate yourself. And another one, which\n
2980
06:10:21,718 --> 06:10:29,319
in the last video, and constructs the Fenwick\n
2981
06:10:29,319 --> 06:10:33,047
the constructor you want to use, and not the\n
2982
06:10:33,047 --> 06:10:41,699
either or. So one thing that you guys should\n
2983
06:10:41,700 --> 06:10:50,100
thing needs to be one based. In the last video,\n
2984
06:10:50,099 --> 06:10:57,009
go less than or less than or equal to the\n
2985
06:10:57,009 --> 06:11:02,957
on whether your array is one based or zero\n
2986
06:11:02,957 --> 06:11:11,877
one based in which case, it would be less\n
2987
06:11:11,878 --> 06:11:17,887
right. But other than that, so this is just\n
2988
06:11:17,887 --> 06:11:24,869
propagate the value to the parent. So that's\n
2989
06:11:24,869 --> 06:11:30,539
So pretty simple stuff. So this is probably\n
2990
06:11:30,540 --> 06:11:36,639
least significant bit method. And it's going\n
2991
06:11:36,639 --> 06:11:45,957
bit for some integer i. So this bit magic\n
2992
06:11:45,957 --> 06:11:54,389
the least significant bit value. Something\n
2993
06:11:54,389 --> 06:11:58,930
here, which uses Java's built in method find\n
2994
06:11:58,930 --> 06:12:06,968
However, using raw bit manipulation, like this\n
2995
06:12:06,968 --> 06:12:14,170
faster. Okay, so the prefix sums, this is\n
2996
06:12:14,169 --> 06:12:20,469
allows you to compute the prefix sum from\n
2997
06:12:20,470 --> 06:12:29,960
done one based. So this would do the cascading\n
2998
06:12:29,959 --> 06:12:35,887
a sum equal to zero, and add the values of\n
2999
06:12:35,887 --> 06:12:46,579
cascading down. And this line, line 55, is equivalent\n
3000
06:12:46,580 --> 06:12:52,888
of i, which is a lot more readable than this\n
3001
06:12:52,887 --> 06:12:58,649
clears that bit. But you want to use as much\n
3002
06:12:58,650 --> 06:13:04,000
tree fast, even though it's already really,\n
3003
06:13:04,000 --> 06:13:12,680
you use, the less operation or machine level\n
3004
06:13:12,680 --> 06:13:21,029
program is going to be so much faster. Okay,\n
3005
06:13:21,029 --> 06:13:31,069
then we can call the prefix sum method right\n
3006
06:13:31,069 --> 06:13:38,900
So that's easy. So adding, so this is from\n
3007
06:13:38,900 --> 06:13:43,740
So k can be positive or negative, that\n
3008
06:13:43,740 --> 06:13:52,290
to do is, for i, you're going to update everyone\n
3009
06:13:52,290 --> 06:13:56,850
that are responsible for you. And for each\n
3010
06:13:56,849 --> 06:14:05,669
K. And then you're going to propagate up to\n
3011
06:14:05,669 --> 06:14:10,579
least significant bit, and you're going to\n
3012
06:14:10,580 --> 06:14:19,208
at some valid index. And this additional\n
3013
06:14:19,207 --> 06:14:29,180
the index is equal to k, this might sometimes\n
3014
06:14:29,180 --> 06:14:38,650
and then call the Add method. So pretty simple\n
3015
06:14:38,650 --> 06:14:48,240
and half of it is comments. So this is a really\n
3016
06:14:48,240 --> 06:14:54,290
And interesting topic I want to talk about\n
3017
06:14:54,290 --> 06:14:59,218
powerful data structure to have in your toolbox\n
3018
06:14:59,218 --> 06:15:04,569
Suffix arrays are a relatively new data\n
3019
06:15:04,569 --> 06:15:11,239
due to the heavy memory consumption needs\n
3020
06:15:11,240 --> 06:15:19,468
and talk about just what a suffix is. For\n
3021
06:15:19,468 --> 06:15:27,810
at the end of a string. For example, if we\n
3022
06:15:27,810 --> 06:15:36,370
of the string horse are, we are able to come\n
3023
06:15:36,369 --> 06:15:49,369
e, se, rse, and so on. Now we can answer\n
3024
06:15:49,369 --> 06:15:57,539
is a suffix array is the array containing\n
3025
06:15:57,540 --> 06:16:06,090
see an example of this. Suppose, you want\n
3026
06:16:06,090 --> 06:16:14,090
On the left, I constructed a table with all\n
3027
06:16:14,090 --> 06:16:22,957
that particular suffix started in a string\n
3028
06:16:22,957 --> 06:16:29,279
all the suffixes in lexicographic order in\na table.
3029
06:16:29,279 --> 06:16:37,137
The actual suffix array is the array of sorted\n
3030
06:16:37,137 --> 06:16:42,378
to actually store the suffixes themselves\n
3031
06:16:42,378 --> 06:16:49,270
original string. This is an ingenious idea,\n
3032
06:16:49,270 --> 06:16:56,590
of the sorted suffixes without actually needing\n
3033
06:16:56,590 --> 06:17:04,869
All we need is the original string and the\n
3034
06:17:04,869 --> 06:17:12,159
suffix array is an array of indices which\n
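[Editor's sketch] To make the definition concrete, here is a deliberately naive way to build a suffix array by sorting suffix start indices. It is not one of the efficient construction algorithms, and the class name `SuffixArraySketch` and the sample string "horse" are assumptions chosen for illustration.

```java
import java.util.Arrays;

// Illustrative sketch only; a naive construction, not an efficient one.
final class SuffixArraySketch {
    // Returns the start indices of all suffixes of s in lexicographic order.
    static int[] build(String s) {
        Integer[] idx = new Integer[s.length()];
        for (int i = 0; i < idx.length; i++) {
            idx[i] = i;
        }
        // Compare two indices by the suffixes that start at them.
        Arrays.sort(idx, (a, b) -> s.substring(a).compareTo(s.substring(b)));
        int[] sa = new int[idx.length];
        for (int i = 0; i < sa.length; i++) {
            sa[i] = idx[i];
        }
        return sa;
    }
}
```

For "horse" the sorted suffixes are e, horse, orse, rse, se, so the suffix array is {4, 0, 1, 2, 3}. Only the indices are stored; each suffix can be recovered from the original string.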
3035
06:17:12,159 --> 06:17:17,329
for a bit of history on the suffix array,\n
3036
06:17:17,330 --> 06:17:24,900
to be a space efficient alternative to a suffix\n
3037
06:17:24,900 --> 06:17:32,069
a compressed version of another data structure\n
3038
06:17:32,069 --> 06:17:38,119
the suffix array is very different from\n
3039
06:17:38,119 --> 06:17:45,529
about virtually anything, the suffix tree\n
3040
06:17:45,529 --> 06:17:53,600
such as a longest common prefix array, which\n
3041
06:17:53,600 --> 06:17:58,659
video, we're going to talk about perhaps the\n
3042
06:17:58,659 --> 06:18:06,040
with the suffix array. And that is the longest\n
3043
06:18:06,040 --> 06:18:14,208
array. The LCP array is an array where each\n
3044
06:18:14,207 --> 06:18:21,759
suffixes have in common with each other. Let's\n
3045
06:18:21,759 --> 06:18:29,477
show what the LCP array is, is to do an example.\n
3046
06:18:29,477 --> 06:18:37,139
the LCP array is, for the string, A, B, A,\n
3047
06:18:37,139 --> 06:18:45,349
do is construct the suffix array for our string\n
3048
06:18:45,349 --> 06:18:52,019
Notice that the very first entry that we placed\n
3049
06:18:52,020 --> 06:19:00,369
is zero. This is because this index is undefined,\n
3050
06:19:00,369 --> 06:19:06,457
our LCP array, let's begin by looking at the\n
3051
06:19:06,457 --> 06:19:14,680
they have in common with each other. We noticed\n
3052
06:19:14,680 --> 06:19:23,189
index of our LCP array. Now we move on to\n
3053
06:19:23,189 --> 06:19:32,869
have an LCP value of two and the next two\n
3054
06:19:32,869 --> 06:19:40,829
zero and the next to only have one character\n
3055
06:19:40,830 --> 06:19:51,080
and lastly, only one character in common.\n
3056
06:19:51,080 --> 06:20:00,208
highlighted in purple. In summary, the LCP\n
3057
06:20:00,207 --> 06:20:07,579
two sorted suffixes have in common with each\n
3058
06:20:07,580 --> 06:20:16,040
how much information can be derived from such\n
3059
06:20:16,040 --> 06:20:24,870
noting is that the very first index in our\n
3060
06:20:24,869 --> 06:20:32,169
LCP array as an integer array, by convention,\n
3061
06:20:32,169 --> 06:20:38,439
So that doesn't interfere with any operations\n
3062
06:20:38,439 --> 06:20:48,419
And this is fine for most purposes. Lastly,\n
3063
06:20:48,419 --> 06:20:54,639
to construct it very efficiently. There are\n
3064
06:20:54,639 --> 06:21:01,819
how to construct the LCP array, which run\n
3065
06:21:01,819 --> 06:21:11,029
in n log n time, and even in linear time.\n
3066
06:21:11,029 --> 06:21:19,610
of suffix arrays and LCP arrays, and that\n
3067
06:21:19,610 --> 06:21:24,860
There are a variety of interesting problems\n
3068
06:21:24,860 --> 06:21:30,727
that require you to find all the unique substrings\n
3069
06:21:30,727 --> 06:21:38,590
time complexity of n squared, which requires\n
3070
06:21:38,590 --> 06:21:45,860
the substrings of the string and dump them\n
3071
06:21:45,860 --> 06:21:52,529
the information stored inside the LCP array.\n
3072
06:21:52,529 --> 06:21:59,119
space efficient solution. I'm not saying that\n
3073
06:21:59,119 --> 06:22:05,779
substrings because there exist other notable\n
3074
06:22:05,779 --> 06:22:14,079
with Bloom filters. Let's now look at an example\n
3075
06:22:14,080 --> 06:22:21,320
look at an example of how to find all the\n
3076
06:22:21,319 --> 06:22:27,869
A. For every string there are exactly n times\n
3077
06:22:27,869 --> 06:22:34,509
of this I will leave as an exercise to the listener,\n
3078
06:22:34,509 --> 06:22:42,259
notice that all the substrings here, there\n
3079
06:22:42,259 --> 06:22:50,079
the repeated substrings there are exactly\n
3080
06:22:50,080 --> 06:22:55,370
use the information inside the LCP array to\n
3081
06:22:55,369 --> 06:23:03,039
were the duplicate ones in the table on the\n
3082
06:23:03,040 --> 06:23:11,780
AZAZA. Remember what the LCP array represents,\n
3083
06:23:11,779 --> 06:23:18,590
the original string share a certain amount\n
3084
06:23:18,590 --> 06:23:25,240
value at a certain index is say five, and\n
3085
06:23:25,240 --> 06:23:32,558
those two suffixes, in other words, there\n
3086
06:23:32,558 --> 06:23:39,270
two suffixes, since they come from the same\n
3087
06:23:39,270 --> 06:23:47,830
LCP position at index one, we see that has\n
3088
06:23:47,830 --> 06:23:55,590
string is the first character in a so we know\n
3089
06:23:55,590 --> 06:24:05,279
value is three, so there are three repeated\n
3090
06:24:08,457 --> 06:24:17,779
AZA. The next interesting LCP value is two, for\n
3091
06:24:17,779 --> 06:24:26,977
substrings. Here we can eliminate z, and z\n
3092
06:24:26,977 --> 06:24:32,680
way of counting all unique substrings. We\n
3093
06:24:32,680 --> 06:24:40,409
is n times n plus one over two. And we also\n
3094
06:24:40,409 --> 06:24:46,387
is the sum of all the LCP values. If this\n
3095
06:24:46,387 --> 06:24:51,159
examples and play around with it. If we go\n
3096
06:24:51,159 --> 06:24:58,779
AZAZA, and we set n equal to five, which\n
3097
06:24:58,779 --> 06:25:08,719
the correct answer of nine by punching\n
3098
06:25:08,720 --> 06:25:15,871
substring values summed up in the LCP array.\n
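[Editor's sketch] The counting formula described above, n(n+1)/2 minus the sum of the LCP values, can be checked with a small program. This sketch builds the suffix array and the LCP array naively for clarity rather than speed; the class and method names are hypothetical.

```java
import java.util.Arrays;

// Illustrative sketch only; naive suffix array and LCP construction.
final class UniqueSubstringsSketch {
    // Start indices of all suffixes of s in lexicographic order.
    static int[] suffixArray(String s) {
        Integer[] idx = new Integer[s.length()];
        for (int i = 0; i < idx.length; i++) idx[i] = i;
        Arrays.sort(idx, (a, b) -> s.substring(a).compareTo(s.substring(b)));
        int[] sa = new int[idx.length];
        for (int i = 0; i < sa.length; i++) sa[i] = idx[i];
        return sa;
    }

    // lcp[k] = number of leading characters shared by the sorted suffixes
    // at ranks k-1 and k; lcp[0] is 0 by convention.
    static int[] lcpArray(String s, int[] sa) {
        int[] lcp = new int[sa.length];
        for (int k = 1; k < sa.length; k++) {
            int a = sa[k - 1], b = sa[k], len = 0;
            while (a + len < s.length() && b + len < s.length()
                    && s.charAt(a + len) == s.charAt(b + len)) {
                len++;
            }
            lcp[k] = len;
        }
        return lcp;
    }

    // All substrings minus the duplicated ones counted by the LCP array.
    static long countUnique(String s) {
        long n = s.length();
        long duplicates = 0;
        for (int v : lcpArray(s, suffixArray(s))) duplicates += v;
        return n * (n + 1) / 2 - duplicates;
    }
}
```

For "AZAZA" the LCP array is {0, 1, 3, 0, 2}, whose sum is 6, so the count is 15 - 6 = 9, matching the value worked out in the narration.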
3099
06:25:15,871 --> 06:25:20,369
challenge concerning counting substrings,\n
3100
06:25:20,369 --> 06:25:27,569
to the Kattis online judge for some practice.\n
3101
06:25:27,569 --> 06:25:37,308
array and LCP array available on GitHub github.com\n
3102
06:25:37,308 --> 06:25:44,280
There is a really neat problem called longest\n
3103
06:25:44,279 --> 06:25:51,939
the K common substring problem, which is really\n
3104
06:25:51,939 --> 06:25:58,590
state the problem and then discuss multiple\n
3105
06:25:58,590 --> 06:26:04,630
How do we find the longest common substring\n
3106
06:26:04,630 --> 06:26:13,208
being anywhere from two to n, the number of\n
3107
06:26:13,207 --> 06:26:23,169
strings, s one, s two, and s three, with the\n
3108
06:26:23,169 --> 06:26:28,789
a minimum of two strings from our pool of\n
3109
06:26:28,790 --> 06:26:36,620
substring between them. In this situation,\n
3110
06:26:36,619 --> 06:26:41,541
know that the longest common substring is\n
3111
06:26:41,542 --> 06:26:48,170
be multiple. The traditional approach to solving\n
3112
06:26:48,169 --> 06:26:54,919
dynamic programming, which can solve the problem\n
3113
06:26:54,919 --> 06:27:03,707
of the string lengths. Obviously, this method\n
3114
06:27:03,707 --> 06:27:11,199
avoid using it whenever possible, a far superior\n
3115
06:27:11,200 --> 06:27:16,977
is to use a suffix array, which can find the\n
3116
06:27:16,977 --> 06:27:24,457
sum of the string lengths. So how do we do this?\n
3117
06:27:24,457 --> 06:27:32,409
longest common substring problem? Let's consider\n
3118
06:27:32,409 --> 06:27:39,619
two and s three. What we will first want to\n
3119
06:27:39,619 --> 06:27:47,137
larger string, which I will call t, which\n
3120
06:27:47,137 --> 06:27:53,878
we must be careful and place unique Sentinel\n
3121
06:27:53,878 --> 06:27:59,510
this for multiple reasons. But the main one\n
3122
06:27:59,509 --> 06:28:06,707
of suffixes when we construct the suffix array.\n
3123
06:28:06,707 --> 06:28:13,409
sentinel values need to be lexicographically\n
3124
06:28:13,409 --> 06:28:20,450
in any of our strings. So in the ASCII table,\n
3125
06:28:20,450 --> 06:28:27,637
sign are all less than any alphabetic character\n
3126
06:28:27,637 --> 06:28:35,307
so we're good in doing the concatenation.\n
3127
06:28:35,308 --> 06:28:41,270
construct the suffix array for t. This procedure\n
3128
06:28:41,270 --> 06:28:49,240
linear suffix array construction algorithm.\n
3129
06:28:49,240 --> 06:28:57,510
both the LCP array values on the leftmost\n
3130
06:28:57,509 --> 06:29:03,689
would appear in the suffix array on the right\n
3131
06:29:03,689 --> 06:29:09,967
color to match it with the original string\nit belongs to.
3132
06:29:09,968 --> 06:29:14,398
In this slide, you can see that the suffixes\n
3133
06:29:14,398 --> 06:29:19,319
the top because they were lexicographically\n
3134
06:29:19,319 --> 06:29:26,470
And this is to be expected. For our purposes,\n
3135
06:29:26,470 --> 06:29:34,430
them into the string t ourselves. So back\n
3136
06:29:34,430 --> 06:29:41,420
longest common substring of K strings? Given\n
3137
06:29:41,419 --> 06:29:50,189
LCP array constructed? The answer is, we're\n
3138
06:29:50,189 --> 06:29:58,467
colors which share the largest LCP value. Let's\n
3139
06:29:58,468 --> 06:30:08,190
equals three. We have three strings, this means\n
3140
06:30:08,189 --> 06:30:15,099
can achieve a maximum of two, if we select the\n
3141
06:30:15,099 --> 06:30:21,457
that each of them is a different color, meaning\n
3142
06:30:21,457 --> 06:30:28,297
And that the minimum LCP value in the selected\n
3143
06:30:28,297 --> 06:30:35,727
first entry in the window. This means that\n
3144
06:30:35,727 --> 06:30:46,659
string ca of length two, which is shared amongst\n
3145
06:30:46,659 --> 06:30:54,400
do another example. But this time, let's change\n
3146
06:30:54,400 --> 06:31:01,208
equals two, we want to have two suffixes of\n
3147
06:31:01,207 --> 06:31:10,029
prefix value between them. In this case, there\n
3148
06:31:10,029 --> 06:31:21,049
B CA, with a length of three shared between\n
3149
06:31:21,049 --> 06:31:28,110
we've covered only some trivial cases, things\n
3150
06:31:28,110 --> 06:31:35,819
colors you need are exactly adjacent with\n
3151
06:31:35,819 --> 06:31:42,628
this will be to use a sliding window technique\n
3152
06:31:42,628 --> 06:31:48,458
Here's what we'll do at each step, we'll adjust\n
3153
06:31:48,457 --> 06:31:55,739
window contains exactly k suffixes of different\n
3154
06:31:55,740 --> 06:32:03,560
correct number of colors, we'll want to query\n
3155
06:32:03,560 --> 06:32:08,899
In the picture below, you can see that this\n
3156
06:32:08,899 --> 06:32:17,250
the LCP array for that window is two, again,\n
3157
06:32:17,250 --> 06:32:23,360
the minimum value for that window is two,\n
3158
06:32:23,360 --> 06:32:32,128
the prefix AG, which has a length of two, as\n
3159
06:32:32,128 --> 06:32:39,069
on how we are actually going to perform the\n
3160
06:32:39,069 --> 06:32:46,378
current window we are considering. Turns out\n
3161
06:32:46,378 --> 06:32:51,580
it can be solved in a variety of ways. Since\n
3162
06:32:51,580 --> 06:32:58,290
just any arbitrary range query, we can use\n
3163
06:32:58,290 --> 06:33:05,542
range query problem to obtain the value we\n
3164
06:33:05,542 --> 06:33:11,940
a minimum range query data structure such\n
3165
06:33:11,939 --> 06:33:19,227
queries on the LCP array. This is theoretically\n
3166
06:33:19,227 --> 06:33:26,430
implement in my opinion. So to implement the\n
3167
06:33:26,430 --> 06:33:27,939
data structure to keep track of
3168
06:33:27,939 --> 06:33:34,409
the colors in our window, I recommend using\n
3169
06:33:34,409 --> 06:33:40,378
left, I drew a table to indicate how much\n
3170
06:33:40,378 --> 06:33:48,069
window, a valid window will require at least\n
3171
06:33:48,069 --> 06:33:56,819
than zero. In the example that we'll follow,\n
3172
06:33:56,819 --> 06:34:04,520
all three colors will need to be present in\n
3173
06:34:04,520 --> 06:34:13,279
query I mean, querying the LCP array for the\n
3174
06:34:13,279 --> 06:34:19,659
possibly updating the best longest common\n
3175
06:34:19,659 --> 06:34:25,930
can see that our window is missing some blue.\n
3176
06:34:25,930 --> 06:34:32,590
is to expand the window down. And when we\n
3177
06:34:32,590 --> 06:34:41,279
down. So let's expand our window downwards\n
3178
06:34:41,279 --> 06:34:48,009
and enough of each color to do a valid query.\n
3179
06:34:48,009 --> 06:34:55,369
the longest common substring for the window\n
3180
06:34:55,369 --> 06:35:05,718
the string AG. Now since we have enough features\n
3181
06:35:05,718 --> 06:35:12,708
one green suffix, and we still have at least\n
3182
06:35:12,707 --> 06:35:21,779
a query to find out that we get the same result\n
3183
06:35:21,779 --> 06:35:29,489
more. And now we don't have enough green.\n
3184
06:35:29,490 --> 06:35:36,520
we need to do is expand the window downwards\n
3185
06:35:36,520 --> 06:35:42,909
was a green suffix right there. And we can\n
3186
06:35:42,909 --> 06:35:48,360
longest common substring value found so far.\n
3187
06:35:48,360 --> 06:35:54,700
is only of length one. So the longest common\n
3188
06:35:54,700 --> 06:36:02,069
which is not as long as the longest common\n
3189
06:36:02,069 --> 06:36:09,090
length three, so we can ignore this one and\n
3190
06:36:09,090 --> 06:36:15,670
and we're short one color, and that is red.\n
3191
06:36:15,669 --> 06:36:21,679
suffix. So we expand and find a blue suffix.\n
3192
06:36:21,680 --> 06:36:28,218
searching, so expand. Finally, a green suffix,\n
3193
06:36:28,218 --> 06:36:33,790
reached. This video is getting a little long,\n
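The expand-and-shrink procedure described above can be sketched in Java. This is a minimal sketch, not the author's implementation: the class name `LcsWindowSketch`, the method `bestWindowLcp`, and the input arrays are hypothetical. It assumes `colors[i]` is the color of the i-th suffix in sorted order, `lcp[i]` is the common prefix length between sorted suffixes i-1 and i, and K is at least two. The window minimum is found with a plain scan here, which is slower than the linear sliding-range-minimum technique mentioned in the video.

```java
import java.util.HashMap;
import java.util.Map;

final class LcsWindowSketch {
  // Returns the largest window LCP over all windows holding at least k colors.
  static int bestWindowLcp(int[] colors, int[] lcp, int k) {
    Map<Integer, Integer> colorCount = new HashMap<>(); // color -> count in window
    int best = 0;
    int lo = 0;
    for (int hi = 0; hi < colors.length; hi++) {
      // Expand the window downwards by one suffix.
      colorCount.merge(colors[hi], 1, Integer::sum);
      // While the color requirement is met: query, then shrink from the top.
      while (colorCount.size() >= k) {
        // Window LCP is the minimum LCP value, excluding the first entry.
        int windowMin = Integer.MAX_VALUE;
        for (int i = lo + 1; i <= hi; i++) {
          windowMin = Math.min(windowMin, lcp[i]);
        }
        if (windowMin != Integer.MAX_VALUE) {
          best = Math.max(best, windowMin);
        }
        int removed = colors[lo];
        if (colorCount.merge(removed, -1, Integer::sum) == 0) {
          colorCount.remove(removed); // color no longer present in the window
        }
        lo++;
      }
    }
    return best;
  }
}
```

The hash table only stores colors with a count above zero, so its size is the number of distinct colors currently in the window.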
3194
06:36:33,790 --> 06:36:40,548
next video, we'll look at a full example of\n
3195
06:36:40,547 --> 06:36:45,419
meantime, if you're looking for a challenge\n
3196
06:36:45,419 --> 06:36:49,949
make sure to check out the Life Forms problem\n
3197
06:36:49,950 --> 06:36:55,229
for an implementation of longest common substring\n
3198
06:36:55,229 --> 06:37:03,500
to check out my algorithms repository at github.com\n
3199
06:37:03,500 --> 06:37:10,369
going to finish where we left off in the last\n
3200
06:37:10,369 --> 06:37:16,919
example solving the longest common substring\n
3201
06:37:16,919 --> 06:37:25,019
For this example, we're going to have four\n
3202
06:37:25,020 --> 06:37:32,490
I have also selected the value of K to be\n
3203
06:37:32,490 --> 06:37:38,000
of two strings of our pool of four to share\n
3204
06:37:38,000 --> 06:37:44,639
them. I have also provided you with a concatenated\n
3205
06:37:44,639 --> 06:37:50,308
solution at the bottom of the screen. In case\n
3206
06:37:50,308 --> 06:37:57,620
out for yourself. The first step in finding\n
3207
06:37:57,619 --> 06:38:05,387
of our four strings is to build the suffix\n
3208
06:38:05,387 --> 06:38:12,869
on the right side and the left side, respectively.\n
3209
06:38:12,869 --> 06:38:19,610
substring algorithm, notice the variables\n
3210
06:38:19,610 --> 06:38:27,029
and a window LCS values will track the longest\n
3211
06:38:27,029 --> 06:38:33,967
values for the current window. And the LCS\n
3212
06:38:33,968 --> 06:38:41,260
values so far. So let's get started. Initially,\n
3213
06:38:41,259 --> 06:38:46,959
the window to contain two different colors.\n
3214
06:38:46,959 --> 06:38:54,069
do not meet this criteria. As I expand down,\n
3215
06:38:54,069 --> 06:39:03,340
one color, I expand down again and still one\n
3216
06:39:03,340 --> 06:39:10,420
and now we arrive at a blue suffix. And here\n
3217
06:39:10,419 --> 06:39:16,429
window. However, our query isn't fruitful\n
3218
06:39:16,430 --> 06:39:24,628
is zero. So there's no longest common substring\n
3219
06:39:24,628 --> 06:39:32,878
like we do now, we decrease the window size\n
3220
06:39:32,878 --> 06:39:40,770
nothing interesting. Decrease the window size\n
3221
06:39:40,770 --> 06:39:46,600
The current window contains a longest common\n
3222
06:39:46,599 --> 06:39:53,840
common substring BC and add it to our solution\n
3223
06:39:53,840 --> 06:40:01,650
because we meet the color requirement. Our\n
3224
06:40:01,650 --> 06:40:11,280
is two, so we need two different color strings.\n
3225
06:40:11,279 --> 06:40:18,659
has happened because we find an LCP value\n
3226
06:40:18,659 --> 06:40:25,319
best value. So we update the solution set\n
3227
06:40:25,319 --> 06:40:32,119
BC, which is one character longer. Now we\n
3228
06:40:32,119 --> 06:40:42,500
was now too small, so we expand the LCP value\n
3229
06:40:42,500 --> 06:40:51,270
the window size. Now expand to meet the color\n
3230
06:40:51,270 --> 06:40:58,950
that doesn't beat our current best, which\n
3231
06:40:58,950 --> 06:41:06,458
to meet the color requirements, so expand,\n
3232
06:41:06,457 --> 06:41:15,627
an LCP value of one for this window range.\n
3233
06:41:15,628 --> 06:41:22,819
an LCP value of two we're getting closer to\n
3234
06:41:22,819 --> 06:41:29,599
have to shrink and like let go. Now expand.\n
3235
06:41:29,599 --> 06:41:38,180
We have a window LCP value of three which\n
3236
06:41:38,180 --> 06:41:47,799
saying that CDE, our newfound longest\n
3237
06:41:47,799 --> 06:41:56,887
of the same length, we keep both in the solution\n
3238
06:41:56,887 --> 06:42:03,579
interval because we meet the color requirement.\n
3239
06:42:03,580 --> 06:42:15,910
expand again, our LCP window value is zero.\n
3240
06:42:15,909 --> 06:42:25,009
value of one here, that's not good enough.\n
3241
06:42:25,009 --> 06:42:31,340
Okay, we might be getting closer, but we meet\n
3242
06:42:31,340 --> 06:42:39,849
to meet the color requirement. These two strings\n
3243
06:42:39,849 --> 06:42:45,829
now shrink. Now we've reached the end and\n
3244
06:42:45,830 --> 06:42:54,440
problem with four strings and a K value\n
3245
06:42:54,439 --> 06:43:03,539
and shrinking, I want you to notice that each\n
3246
06:43:03,540 --> 06:43:12,069
I only ever moved one of the endpoints downwards.\n
3247
06:43:12,069 --> 06:43:19,637
know that the number of windows has to be\n
3248
06:43:19,637 --> 06:43:27,297
that we have. And the number of suffixes that\n
3249
06:43:27,297 --> 06:43:35,069
come to the conclusion that there must be\n
3250
06:43:35,069 --> 06:43:42,227
which is really good. Because we want our\n
3251
06:43:42,227 --> 06:43:49,110
going to be on an efficient way to solve the\n
3252
06:43:49,110 --> 06:43:55,490
The longest repeated substring problem is\n
3253
06:43:55,490 --> 06:44:00,450
in computer science, lots of problems can\n
3254
06:44:00,450 --> 06:44:06,409
important that we have an efficient way of\n
3255
06:44:06,409 --> 06:44:12,740
squared time and lots of space. What we want\n
3256
06:44:12,740 --> 06:44:21,558
inside the longest common prefix array to\n
3257
06:44:21,558 --> 06:44:26,878
Let's do an example what is the longest repeated\n
3258
06:44:26,878 --> 06:44:36,898
free to pause the video and figure it out\n
3259
06:44:36,898 --> 06:44:43,478
which is the longest substring that appears\n
3260
06:44:43,477 --> 06:44:51,840
the longest repeated substring. Here you can\n
3261
06:44:51,840 --> 06:44:58,409
And now you can see the second repeated instance\n
3262
06:44:58,409 --> 06:45:05,689
are disjoint and do not overlap. In general,\n
3263
06:45:05,689 --> 06:45:11,559
substring. Now let's solve the problem using\n
3264
06:45:11,560 --> 06:45:17,540
generated on the right hand side. I'll give\n
3265
06:45:17,540 --> 06:45:22,850
the LCP array in case you notice anything\n
3266
06:45:22,849 --> 06:45:30,627
substring. Now that you already know what\n
3267
06:45:30,628 --> 06:45:37,470
for in the LCP array is the maximum value.\n
3268
06:45:37,470 --> 06:45:43,819
know that the suffixes are already sorted.\n
3269
06:45:43,819 --> 06:45:49,520
longest common prefix value, then they share\n
3270
06:45:49,520 --> 06:45:55,930
We also know that if the LCP value at a certain\n
3271
06:45:55,930 --> 06:46:02,069
shared between the two adjacent suffixes is\n
3272
06:46:02,069 --> 06:46:09,340
between two suffixes, each of which start\n
3273
06:46:09,340 --> 06:46:16,650
again, is Abracadabra. Since our LCP value\n
3274
06:46:16,650 --> 06:46:24,970
from the suffix abracadabra forms one part\n
3275
06:46:24,970 --> 06:46:31,478
the suffix above it which shares the LCP value\n
3276
06:46:31,477 --> 06:46:39,599
in that longest repeated substring. Now, I\n
3277
06:46:39,599 --> 06:46:49,779
can you find the longest repeated substring\n
3278
06:46:49,779 --> 06:46:55,957
you did, you will find out that you not only\n
3279
06:46:55,957 --> 06:47:02,477
longest repeated substrings. Since there can\n
3280
06:47:02,477 --> 06:47:10,628
for a single largest value, but all largest\n
3281
06:47:10,628 --> 06:47:17,920
you can see that we'll want the first three\n
3282
06:47:17,919 --> 06:47:23,839
ba, and for the second maximum, we'll want\n
3283
06:47:23,840 --> 06:47:33,409
ba, ba. Visually, for this first, longest\n
3284
06:47:33,409 --> 06:47:41,720
at the start, and then a BA again, closer\n
3285
06:47:41,720 --> 06:47:48,960
found using the second largest common prefix\n
3286
06:47:48,959 --> 06:47:57,409
and the second one just next to it. That is\n
3287
06:47:57,409 --> 06:48:04,378
the suffix array and the LCP array have already\n
3288
06:48:04,378 --> 06:48:10,029
on Kattis if you want to tackle a longest\n
3289
06:48:10,029 --> 06:48:15,110
you to have an efficient solution. Today,\n
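The idea above (take every position holding the maximum LCP value and read that many characters from the corresponding suffix) can be sketched as follows. This is a minimal sketch under stated assumptions: the class and method names are hypothetical, and the suffix array and LCP array are built naively for brevity rather than with the efficient construction the video assumes.

```java
import java.util.Arrays;
import java.util.TreeSet;

final class LrsSketch {
  // Returns every longest repeated substring of s, in sorted order.
  static TreeSet<String> longestRepeatedSubstrings(String s) {
    int n = s.length();
    // Naive suffix array: sort the suffix start indices by suffix text.
    Integer[] sa = new Integer[n];
    for (int i = 0; i < n; i++) sa[i] = i;
    Arrays.sort(sa, (a, b) -> s.substring(a).compareTo(s.substring(b)));

    // lcp[i] = common prefix length of sorted suffixes i-1 and i.
    int[] lcp = new int[n];
    int max = 0;
    for (int i = 1; i < n; i++) {
      int a = sa[i - 1], b = sa[i], len = 0;
      while (a + len < n && b + len < n && s.charAt(a + len) == s.charAt(b + len)) {
        len++;
      }
      lcp[i] = len;
      max = Math.max(max, len);
    }

    // Collect the prefix of every suffix whose LCP value equals the maximum.
    TreeSet<String> result = new TreeSet<>();
    if (max == 0) return result; // nothing repeats
    for (int i = 1; i < n; i++) {
      if (lcp[i] == max) result.add(s.substring(sa[i], sa[i] + max));
    }
    return result;
  }
}
```

For "abracadabra" this yields the single answer "abra"; a string with ties yields every tied substring, which is why the loop does not stop at the first maximum.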
3290
06:48:15,110 --> 06:48:22,470
most important types of trees in computer\n
3291
06:48:22,470 --> 06:48:29,430
trees. Balanced Binary search trees are very\n
3292
06:48:29,430 --> 06:48:35,740
tree because they not only conform to the\n
3293
06:48:35,740 --> 06:48:42,080
also balanced. What I mean by balanced is\n
3294
06:48:42,080 --> 06:48:47,980
logarithmic height in proportion to the number\n
3295
06:48:47,979 --> 06:48:55,389
because it keeps operations such as insertion\n
3296
06:48:55,389 --> 06:49:04,000
is much more squashed. In terms of complexity,\n
3297
06:49:04,000 --> 06:49:10,240
operations, which is quite good. However,\n
3298
06:49:10,240 --> 06:49:18,330
the tree could degrade into a chain for some\n
3299
06:49:18,330 --> 06:49:24,398
numbers. To avoid this linear complexity,\n
3300
06:49:24,398 --> 06:49:33,040
in which the worst case is logarithmic for\n
3301
06:49:33,040 --> 06:49:37,888
Central to how nearly all Balanced Binary\n
3302
06:49:37,887 --> 06:49:44,189
balanced is the concept of tree rotations,\n
3303
06:49:44,189 --> 06:49:49,659
video. Later, we'll actually look at some\n
3304
06:49:49,659 --> 06:49:56,430
to see how these rotations come into play.\n
3305
06:49:56,430 --> 06:50:03,398
binary search tree implementations is the\n
3306
06:50:03,398 --> 06:50:12,100
tree invariant and tree rotations. A tree\n
3307
06:50:12,099 --> 06:50:19,079
you impose on your tree, such that it must\n
3308
06:50:19,080 --> 06:50:24,680
that the invariant is always satisfied a series\n
3309
06:50:24,680 --> 06:50:29,398
get back to this concept and fixing invariants\n
3310
06:50:29,398 --> 06:50:36,468
so much for now. Right now we're going to\n
3311
06:50:36,468 --> 06:50:43,430
invariant is not satisfied. And to fix it,\n
3312
06:50:43,430 --> 06:50:50,878
A. Assuming node A has a left child B, we\n
3313
06:50:50,878 --> 06:50:59,650
node A was and push node A down to become\n
3314
06:50:59,650 --> 06:51:06,990
this, in my undergrad, I was mind-blown. Literally,\n
3315
06:51:06,990 --> 06:51:13,398
of being illegal that we should be allowed\n
3316
06:51:13,398 --> 06:51:20,520
tree. But what I've realized since is that\n
3317
06:51:20,520 --> 06:51:25,340
Since we're not breaking the binary search\n
3318
06:51:25,340 --> 06:51:31,889
the left tree, you'll discover that in terms\n
3319
06:51:31,889 --> 06:51:37,648
than node B is less than E is less than A\n
3320
06:51:37,648 --> 06:51:45,580
tree and remark that well, this is also true.\n
3321
06:51:45,580 --> 06:51:52,760
is a valid operation. First, you have to remember\n
3322
06:51:52,759 --> 06:51:58,709
binary search trees, meaning that the binary\n
3323
06:51:58,709 --> 06:52:04,869
node and the values in the left subtree are\n
3324
06:52:04,869 --> 06:52:11,727
the right subtree are all greater than the\n
3325
06:52:11,727 --> 06:52:16,939
what the tree structure itself looks like.\n
3326
06:52:16,939 --> 06:52:24,109
binary search tree invariant holds. This means\n
3327
06:52:24,110 --> 06:52:29,290
rotate the values and nodes in our tree as\n
3328
06:52:29,290 --> 06:52:33,479
invariant remains satisfied.
3329
06:52:33,479 --> 06:52:40,439
Now let's look at how these rotations are\n
3330
06:52:40,439 --> 06:52:45,727
the rotations are symmetric, I will only\n
3331
06:52:45,727 --> 06:52:51,270
to figure out the left rotation on your own.\n
3332
06:52:51,270 --> 06:52:59,308
Notice that there are directed edges pointing\n
3333
06:52:59,308 --> 06:53:06,909
may or may not exist. This is why there is\n
3334
06:53:06,909 --> 06:53:12,669
have a parent node P, then it is important\n
3335
06:53:12,669 --> 06:53:19,369
rotation. In either case, we start with a\n
3336
06:53:19,369 --> 06:53:29,509
arrow, then we'll want a pointer to node B.\n
3337
06:53:29,509 --> 06:53:38,819
to B's right child, then change B's right\n
3338
06:53:38,819 --> 06:53:46,259
done a right rotation. If we rearrange the\n
3339
06:53:46,259 --> 06:53:54,739
However, notice that there's a slight problem.\n
3340
06:53:54,740 --> 06:54:01,370
parent's left or right pointer would still\n
3341
06:54:01,369 --> 06:54:09,250
B is A's successor after the rotation. So\n
3342
06:54:09,250 --> 06:54:15,750
step is usually done on the recursive call\n
3343
06:54:15,750 --> 06:54:22,457
function. We just finished looking at the\n
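The right rotation just described, for nodes that only hold left and right child references, can be sketched like this. The class name `RotationSketch` and the integer node values are assumptions for illustration. The method returns B so that the caller can relink the parent, matching the remark that this step is usually done by the caller.

```java
final class RotationSketch {
  static final class Node {
    int value;
    Node left, right;
    Node(int value) { this.value = value; }
  }

  // Rotates right about node a and returns the new root of this subtree.
  // Assumes a.left is not null.
  static Node rightRotate(Node a) {
    Node b = a.left;    // pointer to node B
    a.left = b.right;   // A's left pointer now references B's right child
    b.right = a;        // B's right pointer now references A
    return b;           // caller must point A's old parent at B
  }
}
```

A typical call site would be `parent.left = RotationSketch.rightRotate(parent.left);` so the parent link is repaired as the recursion unwinds.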
3344
06:54:22,457 --> 06:54:29,477
and the right child nodes. But in some Balanced\n
3345
06:54:29,477 --> 06:54:35,919
convenient for nodes to also have a reference\n
3346
06:54:35,919 --> 06:54:42,959
rotations because now instead of updating\n
3347
06:54:42,959 --> 06:54:49,968
Let's have a look. In this case, where we\n
3348
06:54:49,968 --> 06:54:56,569
a sense, doubly linked. We start off with\n
3349
06:54:56,569 --> 06:55:04,119
we'll want to do is also reference node B\n
3350
06:55:04,119 --> 06:55:12,159
around pointers. Next we'll adjust the left\n
3351
06:55:12,159 --> 06:55:20,669
B's right subtree. Of course, throughout this\n
3352
06:55:20,669 --> 06:55:26,390
you can add an extra if statement to check\n
3353
06:55:26,390 --> 06:55:33,770
mistake to assume B's right subtree is not\n
3354
06:55:33,770 --> 06:55:43,958
before setting B's right child's parent to\n
3355
06:55:43,958 --> 06:55:53,967
of B. So set B's right pointer to reference\n
3356
06:55:53,968 --> 06:56:00,010
B. The last thing we need to do is adjust\n
3357
06:56:00,009 --> 06:56:10,717
B's parent pointer to reference P. And now\n
3358
06:56:10,718 --> 06:56:19,700
left or right pointer reference the successor\n
3359
06:56:19,700 --> 06:56:27,569
is not null because it might not exist. This\n
3360
06:56:27,569 --> 06:56:34,659
readjust the tree, you will see that we correctly\n
3361
06:56:34,659 --> 06:56:40,457
in doing this right rotation. And it's a very\n
3362
06:56:40,457 --> 06:56:47,659
to do it in such detail. Today we're going\n
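When nodes also carry a parent reference, the same right rotation needs the extra pointer updates walked through above. Here is a minimal sketch, assuming a hypothetical `Node` class with `left`, `right`, and `parent` fields; the null checks cover the cases where B's right subtree or the parent P does not exist.

```java
final class ParentRotationSketch {
  static final class Node {
    int value;
    Node left, right, parent;
    Node(int value) { this.value = value; }
  }

  // Rotates right about node a, fixing every parent link along the way.
  // Assumes a.left is not null. Returns B, the new root of the subtree.
  static Node rightRotate(Node a) {
    Node p = a.parent;          // may be null if A is the root
    Node b = a.left;
    a.left = b.right;           // A adopts B's right subtree
    if (b.right != null) {
      b.right.parent = a;       // only if that subtree exists
    }
    b.right = a;                // A becomes B's right child
    a.parent = b;
    b.parent = p;               // B takes A's old place under P
    if (p != null) {            // P might not exist
      if (p.left == a) p.left = b;
      else p.right = b;
    }
    return b;
  }
}
```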
3363
06:56:47,659 --> 06:56:54,137
tree in great detail. We'll be making use\n
3364
06:56:54,137 --> 06:57:00,718
in the last video. So if you didn't watch\n
3365
06:57:00,718 --> 06:57:06,159
before we get too far, I should mention\n
3366
06:57:06,159 --> 06:57:12,549
of many types of balanced binary search trees,\n
3367
06:57:12,549 --> 06:57:19,218
and search operations. something really special\n
3368
06:57:19,218 --> 06:57:23,500
type of Balanced Binary Search Tree to be\ndiscovered.
3369
06:57:23,500 --> 06:57:29,898
Then, soon after a whole bunch of other types\n
3370
06:57:29,898 --> 06:57:37,370
emerge, including the 2-3 tree, the\n
3371
06:57:37,369 --> 06:57:43,680
main rival, the red-black tree. What you need\n
3372
06:57:43,680 --> 06:57:50,270
the AVL tree balanced. And this is the\n
3373
06:57:50,270 --> 06:57:57,159
of a node is the difference between the height\n
3374
06:57:57,159 --> 06:58:03,308
I'm pretty sure the balance factor can also\n
3375
06:58:03,308 --> 06:58:08,218
the right subtree height. But don't quote\n
3376
06:58:08,218 --> 06:58:15,450
what I'm about to say, and may also be the\n
3377
06:58:15,450 --> 06:58:22,830
about what way to do tree rotations on various\n
3378
06:58:22,830 --> 06:58:28,670
let's keep the balance factor right subtree\n
3379
06:58:28,669 --> 06:58:35,637
because people get this wrong or define it\n
3380
06:58:35,637 --> 06:58:42,700
as the number of edges between x and the furthest\n
3381
06:58:42,700 --> 06:58:50,080
tree has height zero, not height one because\n
3382
06:58:50,080 --> 06:58:58,190
AVL tree that keeps it balanced is forcing\n
3383
06:58:58,189 --> 06:59:04,770
minus one, zero, or plus one. If the balance\n
3384
06:59:04,770 --> 06:59:12,378
to resolve that with tree rotations. In terms\n
3385
06:59:12,378 --> 06:59:18,940
to actually make the AVL tree work. What\n
3386
06:59:18,939 --> 06:59:24,057
stores. This value must be comparable. So\n
3387
06:59:24,058 --> 06:59:30,280
it goes to the tree. Then we'll also need\n
3388
06:59:30,279 --> 06:59:37,297
as well as the left and the right child pointers.\n
3389
06:59:37,297 --> 06:59:44,849
these values. So keep that in mind. So a slider\n
3390
06:59:44,849 --> 06:59:52,397
node must always be minus one, zero, or plus one.\n
3391
06:59:52,398 --> 06:59:58,670
the case where this is not true? The answer\n
3392
06:59:58,669 --> 07:00:05,829
factor must either be plus two or minus\n
3393
07:00:05,830 --> 07:00:11,410
rotations. The rotations we need to perform\n
3394
07:00:11,409 --> 07:00:19,529
be broken down into four distinct cases. The\n
3395
07:00:19,529 --> 07:00:27,169
call left heavy, and there are two left child\n
3396
07:00:27,169 --> 07:00:35,839
all we need to do is perform a right rotation\n
3397
07:00:35,840 --> 07:00:43,477
case is the left right case where you have\n
3398
07:00:43,477 --> 07:00:53,159
child. To fix this, you do a left rotation\n
3399
07:00:53,159 --> 07:00:59,227
the left most image, what happens then is\n
3400
07:00:59,227 --> 07:01:08,128
we just saw, which we can resolve with a right\n
3401
07:01:08,128 --> 07:01:14,218
right case, which is symmetric to the left\n
3402
07:01:14,218 --> 07:01:23,208
we do a left rotation about the green node.\n
3403
07:01:23,207 --> 07:01:30,957
which is symmetric to the left right case.\n
3404
07:01:30,957 --> 07:01:37,090
about the yellow node on the left most image\n
3405
07:01:37,090 --> 07:01:43,080
case, and then do a left rotation about the\n
3406
07:01:43,080 --> 07:01:50,058
Next, I want to show you some pseudocode for\n
3407
07:01:50,058 --> 07:01:56,770
not all that obvious or easy. This first method\n
3408
07:01:56,770 --> 07:02:03,569
method, which returns true or false depending\n
3409
07:02:03,569 --> 07:02:11,180
or not. For simplicity, we're going to ban\n
3410
07:02:11,180 --> 07:02:18,060
value already exists, or the value is null,\n
3411
07:02:18,060 --> 07:02:23,968
is not null, and it doesn't already exist\n
3412
07:02:23,968 --> 07:02:30,119
insert method, where we pass in a pointer\n
3413
07:02:30,119 --> 07:02:36,399
insert. The private recursive method is also\n
3414
07:02:36,400 --> 07:02:41,760
we simply return a new instance of the node\n
3415
07:02:41,759 --> 07:02:46,519
we get to compare to a value with the value\n
3416
07:02:46,520 --> 07:02:52,387
node to determine if we should go on the left\n
3417
07:02:52,387 --> 07:02:58,207
call back, we call the update method which\n
3418
07:02:58,207 --> 07:03:04,819
for this node, and lastly, we rebalance the\n
3419
07:03:04,819 --> 07:03:11,898
a look at the update and balance methods. And what\n
3420
07:03:11,898 --> 07:03:17,610
updates the balance factor and height values\n
3421
07:03:17,610 --> 07:03:25,148
the node, we get the maximum height of the\n
3422
07:03:25,148 --> 07:03:30,400
Notice that I initialize the left and the\n
3423
07:03:30,400 --> 07:03:36,520
this is because it will cancel out with a\n
3424
07:03:36,520 --> 07:03:43,860
the node has no sub trees giving the correct\n
3425
07:03:43,860 --> 07:03:49,558
the balance factor for this node. By finding\n
3426
07:03:52,659 --> 07:03:57,727
balance method is slightly more involved\n
3427
07:03:57,727 --> 07:04:05,840
factor has an illegal value of minus two or\n
3428
07:04:05,840 --> 07:04:12,700
two, then we know that the node is left heavy,\n
3429
07:04:12,700 --> 07:04:18,869
To determine if we're dealing with a left\n
3430
07:04:18,869 --> 07:04:24,450
thing if the balance factor is plus two, except\n
3431
07:04:24,450 --> 07:04:30,218
right left case. If the balance factor is not\n
3432
07:04:30,218 --> 07:04:37,610
balance factor is going to be either plus one,\n
3433
07:04:37,610 --> 07:04:44,830
cases, we don't need to do anything inside\n
3434
07:04:44,830 --> 07:04:51,570
all we do here are calls to the left rotation\n
3435
07:04:51,569 --> 07:04:58,898
the last video. Also notice that the left\n
3436
07:04:58,898 --> 07:05:06,920
left and right-right case methods respectively,\n
3437
07:05:06,919 --> 07:05:14,137
rotation. In the last video, we looked at\n
3438
07:05:14,137 --> 07:05:20,939
dealing with an AVL tree in this video, we\n
3439
07:05:20,939 --> 07:05:28,057
the height and balance factor values for the\n
3440
07:05:28,058 --> 07:05:33,898
this is a subtle detail you must not forget,\n
3441
07:05:33,898 --> 07:05:39,680
will be inconsistent with the left rotation\n
3442
07:05:39,680 --> 07:05:44,450
able to figure it out pretty easily. In this\n
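The insert, update, and balance methods described above fit together roughly as follows. This is a compact sketch, not the author's source: it stores plain `int` values instead of a generic comparable type, so the null check is dropped, and the four case methods are inlined into `balance`. The balance factor is taken as right subtree height minus left subtree height, so minus two means left heavy. Note that both rotations call `update` on the two nodes they move.

```java
final class AvlInsertSketch {
  static final class Node {
    int value, height, bf; // bf = height(right) - height(left)
    Node left, right;
    Node(int value) { this.value = value; }
  }

  Node root;
  int nodeCount;

  // Returns true only if the value was inserted (duplicates are rejected).
  boolean insert(int value) {
    if (contains(root, value)) return false;
    root = insert(root, value);
    nodeCount++;
    return true;
  }

  private static boolean contains(Node node, int value) {
    while (node != null) {
      if (value < node.value) node = node.left;
      else if (value > node.value) node = node.right;
      else return true;
    }
    return false;
  }

  private static Node insert(Node node, int value) {
    if (node == null) return new Node(value); // base case
    if (value < node.value) node.left = insert(node.left, value);
    else node.right = insert(node.right, value);
    update(node);          // on the recursive callback
    return balance(node);
  }

  // Missing subtrees count as height -1 so a leaf ends up with height 0.
  private static void update(Node node) {
    int lh = (node.left == null) ? -1 : node.left.height;
    int rh = (node.right == null) ? -1 : node.right.height;
    node.height = 1 + Math.max(lh, rh);
    node.bf = rh - lh;
  }

  private static Node balance(Node node) {
    if (node.bf == -2) {                                  // left heavy
      if (node.left.bf <= 0) return rightRotate(node);    // left-left case
      node.left = leftRotate(node.left);                  // left-right case
      return rightRotate(node);
    }
    if (node.bf == 2) {                                   // right heavy
      if (node.right.bf >= 0) return leftRotate(node);    // right-right case
      node.right = rightRotate(node.right);               // right-left case
      return leftRotate(node);
    }
    return node; // bf is -1, 0, or +1: nothing to do
  }

  private static Node rightRotate(Node a) {
    Node b = a.left;
    a.left = b.right;
    b.right = a;
    update(a); // the lower node first, then its new parent
    update(b);
    return b;
  }

  private static Node leftRotate(Node a) {
    Node b = a.right;
    a.right = b.left;
    b.left = a;
    update(a);
    update(b);
    return b;
  }
}
```

Inserting the values 1 through 7 in increasing order, which would degrade a plain binary search tree into a chain, leaves this tree with root 4 and height 2.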
3443
07:05:44,450 --> 07:05:50,637
elements from an AVL tree. And what you'll\n
3444
07:05:50,637 --> 07:05:58,770
AVL tree is almost identical to removing\n
3445
07:05:58,770 --> 07:06:02,939
So for the majority of this video, we're going\n
3446
07:06:02,939 --> 07:06:10,887
from a binary search tree in the very end\n
3447
07:06:10,887 --> 07:06:17,430
So let's get started. So just for review,\n
3448
07:06:17,430 --> 07:06:25,648
binary search tree in detail. So we can generally\n
3449
07:06:25,648 --> 07:06:31,240
In the find phase, you find the element you\n
3450
07:06:31,240 --> 07:06:38,490
and then replace it with a successor node,\n
3451
07:06:38,490 --> 07:06:45,128
maintain the binary search tree invariant.\n
3452
07:06:45,128 --> 07:06:50,148
we're searching for the element in the tree\n
3453
07:06:50,148 --> 07:06:55,250
can happen. First, we hit a null node, in\n
3454
07:06:55,250 --> 07:07:02,919
for doesn't exist. Our comparator returns\n
3455
07:07:02,919 --> 07:07:08,319
we want to remove, the comparator value is\n
3456
07:07:08,319 --> 07:07:14,579
for, if it exists, is going to be found in\n
3457
07:07:14,580 --> 07:07:20,420
is greater than zero, in which case the value\n
3458
07:07:20,419 --> 07:07:24,859
let's do an example of finding nodes in a\n
3459
07:07:24,860 --> 07:07:31,319
for a 14, well, we should have a reference\n
3460
07:07:31,319 --> 07:07:37,489
So we compare 20 and 14, and we know 14 is\n
3461
07:07:37,490 --> 07:07:43,320
We know 14 is greater than 10. So we go on\n
3462
07:07:43,319 --> 07:07:50,340
15. So you're on the left subtree. 14 is greater\n
3463
07:07:50,340 --> 07:07:56,659
there we found it, the node we were looking\n
3464
07:07:56,659 --> 07:08:03,579
that doesn't exist. So let's try and find\n
3465
07:08:03,580 --> 07:08:09,860
we go to the right subtree, because 26 is\n
3466
07:08:09,860 --> 07:08:18,659
because 26 is less than 31. And once we're\n
3467
07:08:18,659 --> 07:08:25,990
then discovered that 26 does not exist in\n
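The find phase can be sketched with a comparator value driving the walk down the tree. This is a minimal sketch with hypothetical names. The usage note below builds a small tree around the values spoken in the examples (20, 10, 15, 14, 31); the other values and the exact shape are assumptions, since the slide itself is not part of the transcript.

```java
final class BstFindSketch {
  static final class Node<T> {
    T value;
    Node<T> left, right;
    Node(T value, Node<T> left, Node<T> right) {
      this.value = value;
      this.left = left;
      this.right = right;
    }
  }

  // Walks down from the given node until the value is found or a null is hit.
  static <T extends Comparable<T>> boolean contains(Node<T> node, T value) {
    while (node != null) {
      int cmp = value.compareTo(node.value);
      if (cmp < 0) node = node.left;        // value, if present, is on the left
      else if (cmp > 0) node = node.right;  // value, if present, is on the right
      else return true;                     // cmp == 0: found it
    }
    return false;                           // hit a null node: not in the tree
  }
}
```

With a tree rooted at 20 whose left child is 10 (right child 15, whose left child 12 has right child 14) and whose right child is 31 (left child 25), `contains(root, 14)` follows 20, 10, 15, 12, 14 and succeeds, while `contains(root, 26)` follows 20, 31, 25, then hits null and fails.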
3468
07:08:25,990 --> 07:08:33,308
it exists, we need to replace that node with\n
3469
07:08:33,308 --> 07:08:40,040
that if we just remove the node without finding\n
3470
07:08:40,040 --> 07:08:44,128
tree. And when we're looking for a successor\n
3471
07:08:44,128 --> 07:08:51,020
will happen. Either we're at a leaf node, in which\n
3472
07:08:51,020 --> 07:08:56,790
has no left subtree the node to remove has\n
3473
07:08:56,790 --> 07:09:02,700
both the left subtree and right subtree. We'll\n
3474
07:09:02,700 --> 07:09:08,920
first case where the node to remove is a leaf\n
3475
07:09:08,919 --> 07:09:15,349
side effects. The successor node in this case\n
3476
07:09:15,349 --> 07:09:21,739
to remove node eight from this tree, the first\n
3477
07:09:21,740 --> 07:09:28,600
the tree. So we'd go down the tree and then\n
3478
07:09:28,599 --> 07:09:37,109
Oh, it's a leaf node, so we can just remove\n
3479
07:09:37,110 --> 07:09:45,047
where there's only a left or a right subtree.\n
3480
07:09:45,047 --> 07:09:51,397
immediate child of that left or right subtree.\n
3481
07:09:51,398 --> 07:09:58,850
node down from the node we're removing is\n
3482
07:09:58,849 --> 07:10:06,529
than it in the case of a right subtree, or less than\n
3483
07:10:06,529 --> 07:10:13,057
do an example, suppose we want to remove node\n
3484
07:10:13,058 --> 07:10:20,120
nine is in the tree. So start the route and\n
3485
07:10:20,119 --> 07:10:25,669
want to remove, which is nine. And then we\n
3486
07:10:25,669 --> 07:10:33,309
a left subtree. So the successor node is its\n
3487
07:10:33,310 --> 07:10:36,850
so seven. So what we do is we get a reference\nto seven
3488
07:10:36,849 --> 07:10:43,789
and then get ready to remove nine. And then\n
3489
07:10:43,790 --> 07:10:51,720
by linking it back up to five. And if we rebalance\n
3490
07:10:51,720 --> 07:10:58,600
nine has been removed. So the last case, is\n
3491
07:10:58,599 --> 07:11:06,069
a left subtree and a right subtree. So the\n
3492
07:11:06,069 --> 07:11:13,810
will we find the successor of the node we're\n
3493
07:11:13,810 --> 07:11:19,690
surprisingly, the answer is both. The successor\n
3494
07:11:19,689 --> 07:11:27,539
subtree, or the smallest value in the right\n
3495
07:11:27,540 --> 07:11:33,770
found in either left or right subtree, we\n
3496
07:11:33,770 --> 07:11:39,387
a value in the successor node. However, the\n
3497
07:11:39,387 --> 07:11:45,569
to remove the duplicate value of the successor\n
3498
07:11:45,569 --> 07:11:51,009
strategy to resolve this is to recursively\n
3499
07:11:51,009 --> 07:11:57,639
to remove as the value in the successor node.\n
3500
07:11:57,639 --> 07:12:04,220
trivial. Let's remove node seven from this\n
3501
07:12:04,220 --> 07:12:09,930
start at the root node and discover that in\n
3502
07:12:09,930 --> 07:12:16,387
notice that it has two non empty sub trees,\n
3503
07:12:16,387 --> 07:12:22,739
successor we either pick the smallest value\n
3504
07:12:22,740 --> 07:12:28,440
in the left subtree. Let's find the smallest\n
3505
07:12:28,439 --> 07:12:36,887
we would go to the right once, and then dig\n
3506
07:12:36,887 --> 07:12:43,759
the successor node 11, we will copy its value\n
3507
07:12:43,759 --> 07:12:51,207
is the root node seven. Notice that now there\n
3508
07:12:51,207 --> 07:12:58,689
want unique values in our tree. So to remove\n
3509
07:12:58,689 --> 07:13:06,029
recursively call our remove method, but on\n
3510
07:13:06,029 --> 07:13:13,759
always result in a case one two or three removal.\n
3511
07:13:13,759 --> 07:13:23,099
a right subtree. So its successor is its immediate\n
3512
07:13:23,099 --> 07:13:32,189
get ready to remove 11. So we remove 11 and\n
3513
07:13:32,189 --> 07:13:38,279
the tree, then we can see that the duplicate\n
3514
07:13:38,279 --> 07:13:43,797
we've been waiting for how do we augment the\n
3515
07:13:43,797 --> 07:13:49,159
trees. The solution is simple, you only need\n
3516
07:13:49,159 --> 07:13:54,990
tree remains balanced and that the balance\n
3517
07:13:54,990 --> 07:13:59,940
On the recursive callback, you invoke the\n
3518
07:13:59,939 --> 07:14:07,449
insert video, which ensure that when the node\n
3519
07:14:07,450 --> 07:14:13,250
tree remains balanced. It's as easy as that.\n
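The removal cases above can be sketched for a plain binary search tree as follows. This is a minimal sketch with hypothetical names and `int` values. It picks the smallest value in the right subtree as the successor, which is one of the two valid choices described. For an AVL tree, the only change would be to update and rebalance the node before returning it, as noted in the comment.

```java
final class BstRemoveSketch {
  static final class Node {
    int value;
    Node left, right;
    Node(int value) { this.value = value; }
  }

  static Node insert(Node node, int value) {
    if (node == null) return new Node(value);
    if (value < node.value) node.left = insert(node.left, value);
    else if (value > node.value) node.right = insert(node.right, value);
    return node;
  }

  // Removes value from the subtree and returns the subtree's new root.
  static Node remove(Node node, int value) {
    if (node == null) return null;               // value not in the tree
    if (value < node.value) {
      node.left = remove(node.left, value);
    } else if (value > node.value) {
      node.right = remove(node.right, value);
    } else {
      // Leaf or single-subtree cases: the successor is the remaining child.
      if (node.left == null) return node.right;
      if (node.right == null) return node.left;
      // Two subtrees: copy the smallest value of the right subtree here,
      // then recursively remove the duplicate from the right subtree.
      Node successor = node.right;
      while (successor.left != null) successor = successor.left;
      node.value = successor.value;
      node.right = remove(node.right, successor.value);
    }
    // An AVL tree would call its update and balance methods on node here.
    return node;
  }

  static void inOrder(Node node, StringBuilder out) {
    if (node == null) return;
    inOrder(node.left, out);
    out.append(node.value).append(' ');
    inOrder(node.right, out);
  }
}
```

The recursive removal of the duplicate always lands in one of the simpler cases, because the smallest node of a subtree cannot have a left child.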
3520
07:14:13,250 --> 07:14:19,200
of the source code for the AVL tree. The\n
3521
07:14:19,200 --> 07:14:26,340
in this video can be found on GitHub github.com\n
3522
07:14:26,340 --> 07:14:34,542
Make sure you have watched the last three\n
3523
07:14:34,542 --> 07:14:39,920
insertions and removals in AVL trees before\n
3524
07:14:39,919 --> 07:14:47,647
code I'm presenting. I don't normally do this,\n
3525
07:14:47,648 --> 07:14:53,380
AVL tree in action. So I'm here in my\n
3526
07:14:53,380 --> 07:15:01,040
of the Java code with the AVL tree and then\n
3527
07:15:01,040 --> 07:15:10,860
a random tree with some values and notice\n
3528
07:15:10,860 --> 07:15:20,137
for the number of nodes that are in it. So\n
3529
07:15:20,137 --> 07:15:29,878
expect the tree to be a bit more sloppy if\n
3530
07:15:29,878 --> 07:15:35,319
But the AVL tree really keeps the tree\nquite rigid.
3531
07:15:35,319 --> 07:15:41,878
So I think we're ready to dive into the source\n
3532
07:15:41,878 --> 07:15:49,458
of a recursive AVL tree implementation\n
3533
07:15:49,457 --> 07:15:56,770
get started. If you look at the class definition\n
3534
07:15:56,770 --> 07:16:08,290
this class takes a generic type argument,\n
3535
07:16:08,290 --> 07:16:16,620
And this generic type I'm defining, basically\n
3536
07:16:16,619 --> 07:16:21,549
to be inserting inside the tree need to be\n
3537
07:16:21,549 --> 07:16:29,558
to be able to insert them and know how to\n
3538
07:16:29,558 --> 07:16:36,450
node, which I've created, you can see that\n
3539
07:16:36,450 --> 07:16:44,619
node is of type T. So it's a comparable. And\n
3540
07:16:44,619 --> 07:16:51,039
variables I'm storing inside the, the node\n
3541
07:16:51,040 --> 07:16:59,159
the balance factor as an integer, the height\n
3542
07:16:59,159 --> 07:17:05,808
query the height of a node in constant time,\n
3543
07:17:05,808 --> 07:17:12,090
of course, there's going to be the left and\n
3544
07:17:12,090 --> 07:17:20,200
notice that this, this node class implements\n
3545
07:17:20,200 --> 07:17:27,058
And that's just an interface I have somewhere\n
3546
07:17:27,058 --> 07:17:32,520
tree I did on the terminal. So this isn't\n
3547
07:17:32,520 --> 07:17:39,310
an AVL tree. Nor are these overrides,\n
3548
07:17:39,310 --> 07:17:46,840
tree in the terminal, which is really handy\n
3549
07:17:46,840 --> 07:17:55,189
variables at the AVL tree class level,\n
3550
07:17:55,189 --> 07:18:03,989
be private, although I'm using it for testing,\n
3551
07:18:03,990 --> 07:18:12,530
keeping track of the number of nodes inside\n
3552
07:18:12,529 --> 07:18:17,279
Something else I'm also using for testing\n
3553
07:18:17,279 --> 07:18:23,289
good to just sanity check yourself to\n
3554
07:18:23,290 --> 07:18:30,430
relatively low. Then there are these methods\n
3555
07:18:30,430 --> 07:18:38,292
explanatory. This is the display method I call\n
3556
07:18:38,292 --> 07:18:45,280
The set contains method to check if a certain\n
3557
07:18:45,279 --> 07:18:52,329
is the public facing method, which calls the\n
3558
07:18:52,330 --> 07:18:59,610
the initial node to start off with. So to\n
3559
07:18:59,610 --> 07:19:07,319
hit the base case, then we know that that\n
3560
07:19:07,319 --> 07:19:12,207
the current value to the value inside the\n
3561
07:19:12,207 --> 07:19:19,659
of either a value less than zero, which means\n
3562
07:19:19,659 --> 07:19:26,029
in the left subtree, or the comparator is greater\n
3563
07:19:26,029 --> 07:19:31,860
is in the right subtree. Otherwise, that means\n
3564
07:19:31,860 --> 07:19:40,360
And that means we found the node inside the\n
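The recursive search just described can be sketched roughly as follows. This is a minimal illustration under assumptions: the class name `BstContainsSketch` and its `Node` type are made up for this example and are not the code shown in the video.

```java
// A minimal sketch of the recursive contains check; all names are hypothetical.
class BstContainsSketch {
  static class Node<T extends Comparable<T>> {
    T value;
    Node<T> left, right;
    Node(T value) { this.value = value; }
  }

  static <T extends Comparable<T>> boolean contains(Node<T> node, T value) {
    if (node == null) return false;          // base case: the value is not in the tree
    int cmp = value.compareTo(node.value);   // compare the target to the current node
    if (cmp < 0) return contains(node.left, value);   // target is in the left subtree
    if (cmp > 0) return contains(node.right, value);  // target is in the right subtree
    return true;                             // cmp == 0: found the node
  }

  public static void main(String[] args) {
    Node<Integer> root = new Node<>(5);
    root.left = new Node<>(3);
    root.right = new Node<>(8);
    System.out.println(contains(root, 3)); // prints "true"
    System.out.println(contains(root, 7)); // prints "false"
  }
}
```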
3565
07:19:40,360 --> 07:19:48,530
to insert this value variable, if it's null,\n
3566
07:19:48,530 --> 07:19:54,779
just return false. And we return a Boolean\n
3567
07:19:54,779 --> 07:20:01,817
an insertion was successful or not. So if\n
3568
07:20:01,817 --> 07:20:13,090
tree, then we're going to call the private\n
3569
07:20:13,090 --> 07:20:20,398
the root. And also increment the node count,\n
3570
07:20:20,398 --> 07:20:24,468
insert something if we're inside this block,\n
3571
07:20:25,599 --> 07:20:34,049
Alright, let's have a look at the private\n
3572
07:20:34,049 --> 07:20:40,770
meaning we traverse all the way down our tree,\n
3573
07:20:40,770 --> 07:20:46,540
this is the position where we need to insert\n
3574
07:20:46,540 --> 07:20:54,840
it the value. Otherwise, we're searching for\n
3575
07:20:54,840 --> 07:21:03,378
comparator function, then do value compareTo\n
3576
07:21:03,378 --> 07:21:10,250
from the comparable interface at the class\n
3577
07:21:10,250 --> 07:21:18,371
dot compareTo on a generic type here and\n
3578
07:21:18,371 --> 07:21:25,829
is less than zero, then insert in left subtree.\n
3579
07:21:25,830 --> 07:21:33,080
here is the extra two lines, you need to add\n
3580
07:21:33,080 --> 07:21:40,638
update method to update the balance factor\n
3581
07:21:40,637 --> 07:21:51,520
rebalance the tree if necessary. Well rebalance\n
3582
07:21:51,520 --> 07:21:59,279
look at the update method. So this is to update\n
3583
07:21:59,279 --> 07:22:06,430
here are two variables that grab the height\n
3584
07:22:06,430 --> 07:22:16,628
Then I update the height for this node. So\n
3585
07:22:16,628 --> 07:22:25,218
the left subtree or the right subtree height.\n
3586
07:22:25,218 --> 07:22:31,080
left and right, we can of course update the\n
3587
07:22:31,080 --> 07:22:38,340
the right and the left subtree heights. Okay,\n
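As a rough sketch of the update step described above, assuming a missing child counts as height -1 and the balance factor is the right height minus the left height (so a negative value means left heavy). The names `AvlUpdateSketch`, `Node`, and `bf` are assumptions for this example.

```java
// A minimal sketch of the height and balance factor update; names are hypothetical.
class AvlUpdateSketch {
  static class Node {
    int value;
    int height; // number of edges on the longest path down to a leaf
    int bf;     // balance factor: right subtree height minus left subtree height
    Node left, right;
    Node(int value) { this.value = value; }
  }

  // Recompute a node's height and balance factor from its children's stored heights.
  static void update(Node node) {
    int leftHeight = (node.left == null) ? -1 : node.left.height;
    int rightHeight = (node.right == null) ? -1 : node.right.height;
    node.height = 1 + Math.max(leftHeight, rightHeight);
    node.bf = rightHeight - leftHeight;
  }

  public static void main(String[] args) {
    Node root = new Node(10);
    root.left = new Node(5);
    root.left.left = new Node(2);
    update(root.left.left); // children must be updated before their parents
    update(root.left);
    update(root);
    System.out.println(root.height + " " + root.bf); // prints "2 -2"
  }
}
```

Storing the height on each node is what makes the height query constant time: the update only reads the two child fields instead of walking the subtree.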
3588
07:22:38,340 --> 07:22:46,420
the balance method. And inside the balance\n
3589
07:22:46,419 --> 07:22:53,979
balance factors are either minus two or plus\n
3590
07:22:53,979 --> 07:23:02,898
that our tree is left heavy. And inside the\n
3591
07:23:02,898 --> 07:23:10,580
left case or the left right case. And inside\n
3592
07:23:10,580 --> 07:23:17,270
case and the right left case. And to identify\n
3593
07:23:17,270 --> 07:23:28,610
factor, but on either node dot left or node dot\n
3594
07:23:28,610 --> 07:23:33,409
factor of the node is not minus two or plus\n
3595
07:23:33,409 --> 07:23:40,189
is either zero plus one or minus one, which\n
3596
07:23:40,189 --> 07:23:45,279
We don't have to rebalance the tree if either\n
3597
07:23:45,279 --> 07:23:53,919
the node. So now we can look at the individual\n
3598
07:23:53,919 --> 07:24:04,199
we'll just perform a right rotation. The left\n
3599
07:24:04,200 --> 07:24:12,477
and then call the left left case. So we're\n
3600
07:24:12,477 --> 07:24:19,477
this case degrades to that case, after one\n
3601
07:24:19,477 --> 07:24:24,739
the right right case and the right left case,\n
3602
07:24:24,740 --> 07:24:33,350
one right rotation over here. Then here are\n
3603
07:24:33,349 --> 07:24:42,259
methods for the AVL tree. It's very important\n
3604
07:24:42,259 --> 07:24:52,079
the height and balance factor values after\n
3605
07:24:52,080 --> 07:24:58,870
those values will undoubtedly change. And\n
3606
07:24:58,869 --> 07:25:04,250
you do them and you can't do them in inverse\n
3607
07:25:04,250 --> 07:25:11,159
node first to get the correct balance factor\n
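The four rebalancing cases and the two rotations can be sketched as below. This is an illustration under assumptions rather than the code from the video: the class and method names are invented, values are plain ints, and duplicates are ignored on insert. Note the order of the two update calls inside each rotation: the node that moved down is updated first, then its new parent.

```java
// A minimal sketch of AVL rebalancing; all names are hypothetical.
class AvlBalanceSketch {
  static class Node {
    int value, height, bf;
    Node left, right;
    Node(int value) { this.value = value; }
  }

  static void update(Node n) {
    int lh = (n.left == null) ? -1 : n.left.height;
    int rh = (n.right == null) ? -1 : n.right.height;
    n.height = 1 + Math.max(lh, rh);
    n.bf = rh - lh;
  }

  static Node rightRotation(Node node) {
    Node newParent = node.left;
    node.left = newParent.right;
    newParent.right = node;
    update(node);      // the child must be updated first
    update(newParent); // then the new parent, which depends on the child
    return newParent;
  }

  static Node leftRotation(Node node) {
    Node newParent = node.right;
    node.right = newParent.left;
    newParent.left = node;
    update(node);
    update(newParent);
    return newParent;
  }

  static Node balance(Node node) {
    if (node.bf == -2) {                                  // left heavy
      if (node.left.bf <= 0) return rightRotation(node);  // left-left case
      node.left = leftRotation(node.left);                // left-right case reduces to left-left
      return rightRotation(node);
    } else if (node.bf == 2) {                            // right heavy
      if (node.right.bf >= 0) return leftRotation(node);  // right-right case
      node.right = rightRotation(node.right);             // right-left case reduces to right-right
      return leftRotation(node);
    }
    return node; // balance factor is -1, 0 or +1: nothing to do
  }

  static Node insert(Node node, int value) {
    if (node == null) return new Node(value);
    if (value < node.value) node.left = insert(node.left, value);
    else if (value > node.value) node.right = insert(node.right, value);
    else return node; // ignore duplicates in this sketch
    update(node);     // the two extra lines compared to a plain binary search tree
    return balance(node);
  }

  public static void main(String[] args) {
    Node root = null;
    for (int v = 1; v <= 7; v++) root = insert(root, v);
    System.out.println(root.value + " " + root.height); // prints "4 2"
  }
}
```

Inserting 1 through 7 in sorted order would produce a height-6 chain in a plain binary search tree; here the rotations keep the height at 2.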
3608
07:25:13,139 --> 07:25:23,378
now we're at the remove method. So in this\n
3609
07:25:23,378 --> 07:25:28,898
or I should have called it value rather. But\n
3610
07:25:28,898 --> 07:25:35,058
it doesn't exist in the tree. So just return\n
3611
07:25:35,058 --> 07:25:45,590
in the tree, and then remove it by calling\n
3612
07:25:45,590 --> 07:25:55,000
node count and simply return true on the remove\n
3613
07:25:55,000 --> 07:25:59,009
So let's look at this private remove method
3614
07:25:59,009 --> 07:26:06,057
where all the action is happening. So if we\n
3615
07:26:06,058 --> 07:26:13,200
then we get our comparator value. And we\n
3616
07:26:13,200 --> 07:26:18,727
is less than zero. And if so we dig into the\n
3617
07:26:18,727 --> 07:26:26,369
we're looking for is smaller than the current\n
3618
07:26:26,369 --> 07:26:35,020
method. But passing down node dot left, so we're\n
3619
07:26:35,020 --> 07:26:43,629
passing the element as well. Otherwise do\n
3620
07:26:43,629 --> 07:26:52,270
down the right subtree. Otherwise, if this\n
3621
07:26:52,270 --> 07:26:59,020
know that the comparator value is equal to\n
3622
07:26:59,020 --> 07:27:07,939
to remove. And we know from the slides that\n
3623
07:27:07,939 --> 07:27:20,009
dot left is equal to null, then we know that\n
3624
07:27:20,009 --> 07:27:27,989
right is null, meaning the right subtree is\n
3625
07:27:27,990 --> 07:27:38,860
the final tricky case, where we're trying\n
3626
07:27:38,860 --> 07:27:42,637
so the left subtree is there and the right\n
3627
07:27:42,637 --> 07:27:50,789
that node. And we can either pick the smallest\n
3628
07:27:50,790 --> 07:27:56,280
smallest value in the right subtree of the\n
3629
07:27:56,279 --> 07:28:03,840
using a heuristic right here to actually determine\n
3630
07:28:03,840 --> 07:28:10,750
it's a good heuristic. And here's what it\n
3631
07:28:10,750 --> 07:28:19,727
I want to remove nodes from that subtree.\n
3632
07:28:19,727 --> 07:28:29,297
larger height, then I want to remove nodes\n
3633
07:28:29,297 --> 07:28:37,709
I'm using to choose which subtree I remove\n
3634
07:28:37,709 --> 07:28:45,180
I think in general, it'll work pretty well.\n
3635
07:28:45,180 --> 07:28:52,308
the left subtree has larger height, then what\n
3636
07:28:52,308 --> 07:29:01,540
value by going down the left subtree once\n
3637
07:29:01,540 --> 07:29:08,940
to find the max value, and then swapping that\n
3638
07:29:08,939 --> 07:29:19,329
is the recursive part, we recurse on removing\n
3639
07:29:19,330 --> 07:29:27,870
we passed in, or rather the element we passed\n
3640
07:29:27,869 --> 07:29:32,227
the element, but now we also have to remove\n
3641
07:29:32,227 --> 07:29:38,707
duplicate values in our tree, then we basically\n
3642
07:29:38,707 --> 07:29:47,359
If the condition goes the other way. Here's\n
3643
07:29:47,360 --> 07:29:54,090
you want to call the update and rebalance\n
3644
07:29:54,090 --> 07:30:00,297
so that the tree remains balanced even though\n
3645
07:30:00,297 --> 07:30:08,159
min and find max method I was calling right\n
3646
07:30:08,159 --> 07:30:13,509
depending on the case. And here's just an\n
3647
07:30:13,509 --> 07:30:18,500
I need to go over this. And here's another\n
3648
07:30:18,500 --> 07:30:24,909
search tree invariant, also not particularly\n
3649
07:30:24,909 --> 07:30:32,360
in the main function that will just randomly\n
3650
07:30:32,360 --> 07:30:37,159
it. So when I invoked this file on the terminal
3651
07:30:37,159 --> 07:30:46,968
this is what executed. So that is an AVL tree.\n
3652
07:30:46,968 --> 07:30:53,740
enjoyed writing it up. Today's data structure\n
3653
07:30:53,740 --> 07:30:59,230
to prove to be a very useful data structure\n
3654
07:30:59,229 --> 07:31:05,387
time ago. So just before we get started, this\n
3655
07:31:05,387 --> 07:31:10,779
priority queue videos, which simply go over\n
3656
07:31:10,779 --> 07:31:15,919
get by without watching all those videos,\n
3657
07:31:15,919 --> 07:31:20,369
those of you who want to know, priority queues\n
3658
07:31:20,369 --> 07:31:24,409
for links to those. So what exactly is an\n
3659
07:31:24,409 --> 07:31:32,707
priority queue variant, which on top of having\n
3660
07:31:32,707 --> 07:31:38,029
also supports quick updates and deletions\n
3661
07:31:38,029 --> 07:31:43,509
that the indexed priority queue solves is being\n
3662
07:31:43,509 --> 07:32:08,039
the values in your priority queue on the fly,\n
3663
07:32:08,040 --> 07:32:37,727
an example. Suppose a hospital has a waiting\n
3664
07:32:37,727 --> 07:33:05,450
of attention. Each person in the waiting room\n
3665
07:33:05,450 --> 07:33:35,450
with. For instance, Mary is in labor. So she\n
3666
07:33:35,450 --> 07:33:55,659
cut, he has a priority of one. James has an\n
3667
07:33:55,659 --> 07:34:20,069
Naida's stomach hurts, she gets priority of\n
3668
07:34:20,069 --> 07:34:33,779
priority is five. And lastly, Leah also has\n
3669
07:34:33,779 --> 07:34:47,469
patients by highest priority first. This means\n
3670
07:34:47,470 --> 07:34:57,970
by James. However, then something happens\n
3671
07:34:57,970 --> 07:35:11,970
she starts vomiting. Her priority needs to\n
3672
07:35:11,970 --> 07:35:24,220
served next once they're finished with James.\n
3673
07:35:24,220 --> 07:35:40,569
leaves he goes to another clinic down the\n
3674
07:35:40,569 --> 07:36:04,128
for. Further suppose that Akarsh goes\n
3675
07:36:04,128 --> 07:36:27,430
and as a result cracks his head and needs\n
3676
07:36:27,430 --> 07:36:50,500
to 10. Once Naida is dealt with, Akarsh\n
3677
07:36:50,500 --> 07:37:12,750
by Leah. As we saw in the hospital example,\n
3678
07:37:12,750 --> 07:37:28,930
update the priority of certain people. The\n
3679
07:37:28,930 --> 07:37:46,529
lets us do this efficiently. The first step\n
3680
07:37:46,529 --> 07:37:59,809
index values to all the keys thus forming\n
3681
07:37:59,810 --> 07:38:16,420
priority queue to track who should get served next\n
3682
07:38:16,419 --> 07:38:30,029
a unique key index value between zero and\n
3683
07:38:30,029 --> 07:39:01,489
intended to be bidirectional. So I would\n
3684
07:39:01,490 --> 07:39:18,200
be able to flip back and forth between the\n
3685
07:39:18,200 --> 07:39:28,990
on the indexed priority queue will require the associated\n
3686
07:39:28,990 --> 07:39:36,080
wondering why I'm saying that we need to map\n
3687
07:39:36,080 --> 07:39:41,010
inclusive. The reason for this is that typically\n
3688
07:39:41,009 --> 07:39:48,789
under the hood are actually arrays. So we\n
3689
07:39:48,790 --> 07:39:54,780
those arrays. This will become apparent shortly.\n
3690
07:39:54,779 --> 07:40:01,869
often, the keys themselves are already integers\n
3691
07:40:01,869 --> 07:40:07,067
to actually construct this bidirectional\n
3692
07:40:07,067 --> 07:40:15,637
it is handy to be able to support any type\n
3693
07:40:15,637 --> 07:40:23,058
can think of an indexed priority queue as an abstract\n
3694
07:40:23,059 --> 07:40:31,830
it to support here are about a dozen or so\n
3695
07:40:31,830 --> 07:40:36,910
These are deleting keys, getting the value\n
3696
07:40:36,909 --> 07:40:42,750
in the priority queue, getting the key index\n
3697
07:40:42,750 --> 07:40:46,707
value in the indexed priority queue, being able to insert\n
3698
07:40:46,707 --> 07:40:49,149
specialized update operations increase and\n
3699
07:40:49,150 --> 07:40:51,958
end. For all these operations, you need the\n
3700
07:40:51,957 --> 07:40:57,039
that you're dealing with. Throughout these\n
3701
07:40:57,040 --> 07:41:03,229
as the variable ki to distinguish it from\n
3702
07:41:03,229 --> 07:41:09,759
that. An indexed priority queue can be implemented\n
3703
07:41:09,759 --> 07:41:14,797
time complexity is using specialized heap\n
3704
07:41:14,797 --> 07:41:22,409
the binary heap implementation for simplicity,\n
3705
07:41:22,409 --> 07:41:27,520
all these operations are either constant or\n
3706
07:41:27,520 --> 07:41:33,950
priority queue. The remove and update operations\n
3707
07:41:33,950 --> 07:41:43,010
a mapping to the position of where our values\n
3708
07:41:43,009 --> 07:41:52,590
the indexed priority queue per se, I want to spend\n
3709
07:41:52,591 --> 07:42:02,547
priority queue data structure which only supports\n
3710
07:42:02,547 --> 07:42:07,628
priority queue. Still, both data structures are\n
3711
07:42:07,628 --> 07:42:17,580
them the same. Although there are key differences\n
3712
07:42:17,580 --> 07:42:23,370
to represent a binary heap is with an array\n
3713
07:42:23,369 --> 07:42:34,349
If we were to represent the following binary\n
3714
07:42:34,349 --> 07:42:46,430
of values. If we know the index of node i,\n
3715
07:42:46,430 --> 07:42:53,637
child nodes are by using simple formulas,\n
3716
07:42:53,637 --> 07:43:03,079
the right child is two times i plus two, assuming\n
3717
07:43:03,080 --> 07:43:09,708
the children of the node at index four? Well,\n
3718
07:43:09,707 --> 07:43:16,199
I just gave you to obtain the indices, nine\n
3719
07:43:16,200 --> 07:43:23,378
math backwards and figure out what the parent\n
3720
07:43:23,378 --> 07:43:30,740
useful if you're either walking up or down\n
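The index arithmetic for an array-backed binary heap can be written out as a small sketch, assuming zero-based indexing as in the formulas above. The class and method names are assumptions for this example.

```java
// A minimal sketch of the child and parent index formulas for a zero-based binary heap.
class HeapIndexSketch {
  static int leftChild(int i) { return 2 * i + 1; }
  static int rightChild(int i) { return 2 * i + 2; }

  // Running the math backwards; only meaningful for i > 0 since the root has no parent.
  static int parent(int i) { return (i - 1) / 2; }

  public static void main(String[] args) {
    System.out.println(leftChild(4) + " " + rightChild(4)); // prints "9 10"
    System.out.println(parent(9) + " " + parent(10));       // prints "4 4"
  }
}
```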
3721
07:43:30,740 --> 07:43:35,830
value into the priority queue, you insert\n
3722
07:43:35,830 --> 07:43:44,250
the bottom right of the binary tree. Suppose\n
3723
07:43:44,250 --> 07:43:49,849
violate the heap invariant, so we need to\n
3724
07:43:49,849 --> 07:43:54,977
met. So swap nodes five and 12. The heap invariant\n
3725
07:43:54,977 --> 07:44:00,659
nodes two and five, and now the tree is balanced.\n
3726
07:44:00,659 --> 07:44:10,409
a traditional priority queue. To remove items\n
3727
07:44:10,409 --> 07:44:13,898
and then swap it with the last node, perform\n
3728
07:44:13,898 --> 07:44:22,208
the swapped value. For this example, suppose\n
3729
07:44:22,207 --> 07:44:31,180
five, we don't know where the node value five\n
3730
07:44:31,180 --> 07:44:36,240
search for it. This is one of the major differences\n
3731
07:44:36,240 --> 07:44:42,780
queue. So start at node zero and process each\n
3732
07:44:44,849 --> 07:44:49,449
So we found a node with a value five to actually\n
3733
07:44:49,450 --> 07:44:56,889
rightmost bottom node. Once this is done,\n
3734
07:44:56,889 --> 07:45:02,227
node we swapped into five's position may\n
3735
07:45:02,227 --> 07:45:08,159
to either move it up or down the tree. In\n
3736
07:45:08,159 --> 07:45:15,939
of one, which is smaller than its children,\n
3737
07:45:15,939 --> 07:45:26,369
to move the node down. That was a quick recap\n
3738
07:45:26,369 --> 07:45:33,459
about a traditional priority queue. Now let's\n
3739
07:45:33,459 --> 07:45:43,317
queue with a binary heap. For the following\n
3740
07:45:43,317 --> 07:45:49,040
with different priorities that we need to\n
3741
07:45:49,040 --> 07:45:55,440
a queue at a hospital, a waiting line at a\n
3742
07:45:55,439 --> 07:45:59,797
is, we'll assume that the values can dynamically\n
3743
07:45:59,797 --> 07:46:15,779
person with the lowest priority to figure\n
3744
07:46:15,779 --> 07:46:20,520
priority queue to sort by lowest value first.\n
3745
07:46:20,520 --> 07:46:28,628
assign each person a unique index value. between\n
3746
07:46:28,628 --> 07:46:34,760
index values in the second column beside each\n
3747
07:46:34,759 --> 07:46:39,599
person an initial value to place inside the\n
3748
07:46:39,599 --> 07:46:44,779
by the index priority queue once inserted,\n
3749
07:46:44,779 --> 07:46:51,779
value, you want not only integers as shown\n
3750
07:46:51,779 --> 07:46:57,819
or whatever type of data we want. If I was\n
3751
07:46:57,819 --> 07:47:04,020
of the key value pairs I have in the last\n
3752
07:47:04,020 --> 07:47:08,930
that unlike the previous example, we're sorting\n
3753
07:47:08,930 --> 07:47:16,110
with a min heap. If I want to access the value\n
3754
07:47:16,110 --> 07:47:21,430
out what its key index is. And then you will\n
3755
07:47:21,430 --> 07:47:26,270
the index priority queue. Here's a good question,\n
3756
07:47:26,270 --> 07:47:30,218
index priority queue? Well, first, find Bella's\n
3757
07:47:30,218 --> 07:47:33,930
into the values array at position one to find\n
3758
07:47:33,930 --> 07:47:38,939
indexed priority queue. So Bella has a value\n
3759
07:47:38,939 --> 07:47:44,250
value for a particular key in the index priority\n
3760
07:47:44,250 --> 07:47:50,990
node for a particular key? To do that, we'll\n
3761
07:47:50,990 --> 07:47:59,388
namely a position map we can use to tell us\n
3762
07:47:59,387 --> 07:48:06,237
key index. For convenience, I will store the\n
3763
07:48:06,238 --> 07:48:13,319
the priority queue. As an example, let's find\n
3764
07:48:13,319 --> 07:48:20,619
find the key index for Dylan, which happens\n
3765
07:48:20,619 --> 07:48:25,599
position map to tell you where Dylan is in\n
3766
07:48:25,599 --> 07:48:28,399
and the index seven highlighted in orange.\n
3767
07:48:28,400 --> 07:48:33,790
George in the heap? I'll give you a quick\n
3768
07:48:34,790 --> 07:48:40,659
All right, so with just about every operation,\n
3769
07:48:40,659 --> 07:48:48,797
the key we care about, which is George, then\n
3770
07:48:48,797 --> 07:49:00,770
position map and find out the node for George,\n
3771
07:49:00,770 --> 07:49:12,680
know how to look up the node for a given key.\n
3772
07:49:12,680 --> 07:49:18,047
This inverse lookup will prove to be a very\n
3773
07:49:18,047 --> 07:49:22,180
key is associated with the root node at index\n
3774
07:49:22,180 --> 07:49:27,229
lookup to figure that out. To do the inverse\n
3775
07:49:27,229 --> 07:49:33,637
need to maintain an inverse lookup table.\n
3776
07:49:33,637 --> 07:49:38,340
for inverse map. Let's see if we can figure\n
3777
07:49:38,340 --> 07:49:45,020
at index two. To do that, simply do a lookup\n
3778
07:49:45,020 --> 07:49:50,930
us information about which key index is associated\n
3779
07:49:50,930 --> 07:49:56,689
us to retrieve the actual key by doing a lookup\n
3780
07:49:56,689 --> 07:50:07,387
case, the node at position two represents\n
3781
07:50:07,387 --> 07:50:11,669
make sure you're still paying attention. Which\n
3782
07:50:11,669 --> 07:50:18,459
position three. Same as before, find the key\n
3783
07:50:18,459 --> 07:50:25,079
key next, figure out the actual key from the\n
3784
07:50:25,080 --> 07:50:30,350
that the node at position three represents\n
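The three lookups discussed here (key to value, key to heap node, and heap node back to key) can be sketched with four parallel arrays. The data below is hypothetical and is not the table from the slides; the class name and helper names are also assumptions.

```java
import java.util.Arrays;
import java.util.List;

// A minimal sketch of the lookups in an indexed priority queue; data and names are hypothetical.
class IpqLookupSketch {
  static final List<String> KEYS = Arrays.asList("Anna", "Bella", "Carly", "Dylan"); // ki -> key
  static final int[] VALS = {3, 15, 11, 17}; // ki -> value
  static final int[] PM = {0, 3, 1, 2};      // position map: ki -> heap node index
  static final int[] IM = {0, 2, 3, 1};      // inverse map: heap node index -> ki

  // Key -> key index -> value.
  static int valueOf(String key) {
    int ki = KEYS.indexOf(key);
    return VALS[ki];
  }

  // Key -> key index -> heap node index, via the position map.
  static int positionOf(String key) {
    int ki = KEYS.indexOf(key);
    return PM[ki];
  }

  // Heap node index -> key index -> key, via the inverse map.
  static String keyAt(int node) {
    return KEYS.get(IM[node]);
  }

  public static void main(String[] args) {
    System.out.println(valueOf("Bella"));    // prints "15"
    System.out.println(positionOf("Bella")); // prints "3"
    System.out.println(keyAt(1));            // prints "Carly"
  }
}
```

The two maps are inverses of each other: for every occupied heap node `n`, `PM[IM[n]] == n`.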
3785
07:50:30,349 --> 07:50:37,547
an indexed priority queue is structured internally\n
3786
07:50:37,547 --> 07:50:41,207
want to actually do some useful operations\n
3787
07:50:41,207 --> 07:50:48,609
new key value pairs, removing key value pairs\n
3788
07:50:48,610 --> 07:50:52,898
associated with a key. These are all possible\n
3789
07:50:52,898 --> 07:50:58,978
do it with a regular priority queue insertion\n
3790
07:50:58,977 --> 07:51:10,567
update the position map pm and the inverse\n
3791
07:51:10,567 --> 07:51:18,411
pairs. Suppose we want to insert the key Mary\n
3792
07:51:18,411 --> 07:51:21,560
What we first have to do is assign Mary a\n
3793
07:51:21,560 --> 07:51:24,690
insert the new key value pair at the insertion\n
3794
07:51:24,689 --> 07:51:29,180
our arrays at index 12. To reflect that the\n
3795
07:51:29,180 --> 07:51:34,878
the heap invariant is not satisfied since\n
3796
07:51:34,878 --> 07:51:40,420
than the one at node five. To resolve this,\n
3797
07:51:40,419 --> 07:51:45,259
upwards until the heap invariant is satisfied\n
3798
07:51:45,259 --> 07:51:51,090
map and the inverse map. By swapping the values\n
3799
07:51:51,090 --> 07:51:55,599
array does not need to be touched since it\n
3800
07:51:55,599 --> 07:52:00,359
get from the map and not the node index per\n
3801
07:52:00,360 --> 07:52:04,440
still not satisfied, so we need to keep swapping\nupwards.
3802
07:52:04,439 --> 07:52:12,189
Let's have a look at some pseudocode for insertions\n
3803
07:52:12,189 --> 07:52:17,539
needs to provide a valid key index ki, as\n
3804
07:52:17,540 --> 07:52:24,308
The first thing I do is store the value associated\n
3805
07:52:24,308 --> 07:52:31,530
update the position map and the inverse map\n
3806
07:52:31,529 --> 07:52:39,529
has been inserted into the priority queue.\n
3807
07:52:39,529 --> 07:52:42,579
the heap invariant is satisfied. Let's take\n
3808
07:52:42,580 --> 07:52:47,120
that happens. Here we are looking at the swim\n
3809
07:52:47,119 --> 07:52:52,520
functions swap and less. Swap simply exchanges\n
3810
07:52:52,520 --> 07:53:00,159
if node i has a value less than node j. The\n
3811
07:53:00,159 --> 07:53:08,169
It begins by finding the index of the parent\n
3812
07:53:08,169 --> 07:53:16,739
we walk up one layer in the tree. If the index\n
3813
07:53:16,740 --> 07:53:22,048
the value of the current node is less than\n
3814
07:53:22,047 --> 07:53:29,619
with a min heap, and want small values to\n
3815
07:53:29,619 --> 07:53:35,020
issue a node exchange simply call the swap\n
3816
07:53:35,020 --> 07:53:40,810
node and the parent node, and then update\n
3817
07:53:40,810 --> 07:53:50,530
values. I also want to talk about swapping\n
3818
07:53:50,529 --> 07:53:56,369
slightly different than a traditional priority\n
3819
07:53:56,369 --> 07:54:03,289
moving around the values in the array, we're\n
3820
07:54:03,290 --> 07:54:10,790
that the values array is indexed by the key\n
3821
07:54:10,790 --> 07:54:18,780
the values array can remain constant while\n
3822
07:54:18,779 --> 07:54:25,849
map. First, we update the positions of where\n
3823
07:54:25,849 --> 07:54:35,449
queue. Remember what the position map is,\n
3824
07:54:35,450 --> 07:54:42,520
key index is found at. So we can do a straightforward\n
3825
07:54:42,520 --> 07:54:46,479
the key index values and swap indices i and\n
3826
07:54:46,479 --> 07:54:54,979
index values associated with nodes i and j and\n
3827
07:54:54,979 --> 07:55:01,292
a simple straightforward exchange. Next up,\n
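Insertion, swim, swap, and less for a min indexed binary heap can be sketched as follows. This is an illustration under assumptions: values are plain ints, there is no bounds or duplicate checking, and the class name is invented. Note that swap only touches the position map and inverse map; the values array stays put because it is indexed by key index.

```java
import java.util.Arrays;

// A minimal sketch of insertion into a min indexed binary heap; names are hypothetical.
class IpqInsertSketch {
  final int[] vals; // key index -> value
  final int[] pm;   // position map: key index -> heap node index (-1 if absent)
  final int[] im;   // inverse map: heap node index -> key index (-1 if empty)
  int sz = 0;       // current number of elements

  IpqInsertSketch(int maxSize) {
    vals = new int[maxSize];
    pm = new int[maxSize];
    im = new int[maxSize];
    Arrays.fill(pm, -1);
    Arrays.fill(im, -1);
  }

  void insert(int ki, int value) {
    vals[ki] = value;
    pm[ki] = sz; // the new node goes at the insertion position
    im[sz] = ki;
    swim(sz);    // move it up until the heap invariant holds
    sz++;
  }

  // Walk up one layer at a time while the node is smaller than its parent.
  void swim(int i) {
    for (int p = (i - 1) / 2; i > 0 && less(i, p); p = (i - 1) / 2) {
      swap(i, p);
      i = p;
    }
  }

  // Exchange nodes i and j by updating the two maps only.
  void swap(int i, int j) {
    pm[im[j]] = i;
    pm[im[i]] = j;
    int tmp = im[i];
    im[i] = im[j];
    im[j] = tmp;
  }

  // True if node i holds a smaller value than node j.
  boolean less(int i, int j) {
    return vals[im[i]] < vals[im[j]];
  }

  public static void main(String[] args) {
    IpqInsertSketch h = new IpqInsertSketch(8);
    h.insert(0, 5);
    h.insert(1, 3);
    h.insert(2, 4);
    h.insert(3, 1);
    System.out.println(h.im[0]); // prints "3": key index 3 holds the smallest value
  }
}
```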
3828
07:55:01,292 --> 07:55:06,760
elements from an indexed priority queue. Polling\n
3829
07:55:06,759 --> 07:55:13,371
removing is improved from a linear time complexity\n
3830
07:55:13,371 --> 07:55:18,450
position lookups are now constant time but\n
3831
07:55:18,450 --> 07:55:25,869
is why the heap invariant is still logarithmic.\n
3832
07:55:25,869 --> 07:55:32,909
want to poll the root node. This is something\n
3833
07:55:32,909 --> 07:55:40,509
key value pair with the lowest value in the\n
3834
07:55:40,509 --> 07:55:50,647
node with the bottom right node. As we do\n
3835
07:55:50,648 --> 07:55:55,620
we can remove the red node from the tree.\n
3836
07:55:55,619 --> 07:56:03,137
store the key value pair we're removing so\n
3837
07:56:03,137 --> 07:56:13,539
Then clean up the removed node. Finally, restore\n
3838
07:56:13,540 --> 07:56:19,690
node down. Since the left and the right child\n
3839
07:56:22,779 --> 07:56:29,270
Let's do a slightly more involved removal\n
3840
07:56:29,270 --> 07:56:34,898
priority queue. In this example, let's remove the\n
3841
07:56:34,898 --> 07:56:43,610
get the key index for Laura, which is the\n
3842
07:56:43,610 --> 07:56:49,840
key index is equal to 11. Once we know the\n
3843
07:56:49,840 --> 07:56:55,797
locate the node within the heap by looking\n
3844
07:56:55,797 --> 07:57:02,579
we want to remove that is the node which contains\n
3845
07:57:02,580 --> 07:57:08,000
it with the bottom rightmost node. Store the\n
3846
07:57:08,000 --> 07:57:16,299
clean up the removed node. And finally restore\n
3847
07:57:16,299 --> 07:57:24,819
node up or down, we're going to make the purple\n
3848
07:57:24,819 --> 07:57:31,779
Alright, let's look at some pseudocode for\n
3849
07:57:31,779 --> 07:57:39,897
very short only five lines of implementation\n
3850
07:57:39,898 --> 07:57:50,740
to do is exchange the position of the node\n
3851
07:57:50,740 --> 07:57:58,320
node, which is always at index position sz\n
3852
07:57:58,319 --> 07:58:01,878
exchanged the nodes, the rightmost node is\n
3853
07:58:01,878 --> 07:58:08,637
to remove was. So we need to move it either\n
3854
07:58:08,637 --> 07:58:14,229
we don't know which it will be so we try to\n
3855
07:58:14,229 --> 07:58:20,457
swim. Lastly, I just clean up the values associated\n
3856
07:58:20,457 --> 07:58:26,637
return the key value pair being removed. But\n
3857
07:58:26,637 --> 07:58:34,509
look at the sink method. So we understand\n
3858
07:58:34,509 --> 07:58:40,419
select the child with the smallest value and\n
3859
07:58:40,419 --> 07:58:46,797
tie. In this next block, I try to update\n
3860
07:58:46,797 --> 07:58:51,430
But first I need to check if the right child\n
3861
07:58:51,430 --> 07:58:57,000
and its value is actually less than the one\n
3862
07:58:57,000 --> 07:59:03,090
if we're outside the size of the heap, or\n
3863
07:59:03,090 --> 07:59:09,067
Lastly, we want to make sure we swap the current\n
3864
07:59:09,067 --> 07:59:18,829
value. Lastly, we want to make sure we swap\n
3865
07:59:18,830 --> 07:59:25,750
with the smallest value. The last core operation\n
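Removal and the sink method can be sketched on top of the same three arrays. This is a hedged illustration, not the pseudocode from the slides: values are ints, there are no existence checks, and this version decrements the size before the swap so that the removed node sits just outside the live heap while the swapped-in node is sunk and swum.

```java
import java.util.Arrays;

// A minimal sketch of removal from a min indexed binary heap; names are hypothetical.
class IpqRemoveSketch {
  final int[] vals; // key index -> value
  final int[] pm;   // position map: key index -> heap node index (-1 if absent)
  final int[] im;   // inverse map: heap node index -> key index (-1 if empty)
  int sz = 0;

  IpqRemoveSketch(int maxSize) {
    vals = new int[maxSize];
    pm = new int[maxSize];
    im = new int[maxSize];
    Arrays.fill(pm, -1);
    Arrays.fill(im, -1);
  }

  void insert(int ki, int value) {
    vals[ki] = value;
    pm[ki] = sz;
    im[sz] = ki;
    swim(sz++);
  }

  // Removes the pair with key index ki and returns its value.
  int remove(int ki) {
    int i = pm[ki]; // constant time lookup instead of a linear scan
    sz--;
    swap(i, sz);    // exchange with the bottom rightmost node
    sink(i);        // we don't know which way the swapped-in node must go,
    swim(i);        // so try both directions
    int value = vals[ki];
    pm[ki] = -1;    // clean up the removed node
    im[sz] = -1;
    return value;
  }

  void sink(int i) {
    while (true) {
      int left = 2 * i + 1, right = 2 * i + 2;
      int smallest = left; // default to the left child on a tie
      if (right < sz && less(right, left)) smallest = right;
      if (left >= sz || less(i, smallest)) break; // outside the heap or cannot sink further
      swap(smallest, i);
      i = smallest;
    }
  }

  void swim(int i) {
    for (int p = (i - 1) / 2; i > 0 && less(i, p); p = (i - 1) / 2) {
      swap(i, p);
      i = p;
    }
  }

  void swap(int i, int j) {
    pm[im[j]] = i;
    pm[im[i]] = j;
    int tmp = im[i];
    im[i] = im[j];
    im[j] = tmp;
  }

  boolean less(int i, int j) {
    return vals[im[i]] < vals[im[j]];
  }

  public static void main(String[] args) {
    IpqRemoveSketch h = new IpqRemoveSketch(8);
    h.insert(0, 5);
    h.insert(1, 3);
    h.insert(2, 4);
    h.insert(3, 1);
    System.out.println(h.remove(3)); // prints "1", the value stored under key index 3
    System.out.println(h.im[0]);     // prints "1": key index 1 (value 3) is the new root
  }
}
```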
3866
07:59:25,750 --> 07:59:33,430
key value pair updates similar to remove those\n
3867
07:59:33,430 --> 07:59:40,099
logarithmic time due to the constant time\n
3868
07:59:40,099 --> 07:59:45,159
logarithmic time to adjust where the key value\n
3869
07:59:45,159 --> 07:59:53,689
want to update the value of the key Carly\n
3870
07:59:53,689 --> 08:00:03,049
find the key index of the key we want to work\n
3871
08:00:03,049 --> 08:00:07,860
two, then we can use that key index value\n
3872
08:00:07,860 --> 08:00:13,819
Of course, the heap invariant may not be satisfied,\n
3873
08:00:17,330 --> 08:00:26,250
the pseudocode for updating the value of a\n
3874
08:00:26,250 --> 08:00:32,520
in the values array and move the node either\n
3875
08:00:32,520 --> 08:00:33,520
so there was nothing too special about updates,\n
3876
08:00:33,520 --> 08:00:35,727
increase and decrease key. In many applications\n
3877
08:00:35,727 --> 08:00:37,259
often useful to be able to update a given\n
3878
08:00:37,259 --> 08:00:41,567
or always larger. In the event that a worse\n
3879
08:00:41,567 --> 08:00:45,808
should not be updated. In such situations,\n
3880
08:00:45,808 --> 08:00:46,808
form of update operation called increase\n
3881
08:00:46,808 --> 08:00:49,708
on whether you want to increase the value\n
3882
08:00:49,707 --> 08:00:50,707
the value associated with the key. Both of\n
3883
08:00:50,707 --> 08:00:51,707
consist of doing an if statement before performing\n
3884
08:00:51,707 --> 08:00:52,707
or decreases the value associated with the\n
3885
08:00:52,707 --> 08:00:53,707
just convenience methods that wrap a get operation\n
3886
08:00:53,707 --> 08:00:54,707
we're going to look at some source code for\n
3887
08:00:54,707 --> 08:00:55,707
we get started, make sure you watch my video\n
3888
08:00:55,707 --> 08:00:56,770
the implementation details and why an index\n
3889
08:00:56,770 --> 08:00:58,049
All the source code for this video can be\n
3890
08:00:58,049 --> 08:00:59,049
slash williamfiset slash data-structures, the\n
3891
08:00:59,049 --> 08:01:02,529
Here we are in the source code for a min indexed\n
3892
08:01:02,529 --> 08:01:04,128
in the Java programming language. To get started,\n
3893
08:01:04,128 --> 08:01:07,000
that we pass in a type of object which is\n
3894
08:01:07,000 --> 08:01:08,909
value pairs within the heap. You'll also notice\n
3895
08:01:08,909 --> 08:01:11,919
This is just to be more generic. And all I\n
3896
08:01:11,919 --> 08:01:12,957
this heap to have at most two children for\n
3897
08:01:12,957 --> 08:01:14,069
general, D children. So let's look\n
3898
08:01:14,069 --> 08:01:16,128
the fun stuff is happening. So let's go over\n
3899
08:01:16,128 --> 08:01:18,909
is just the number of elements in the heap,\n
3900
08:01:18,909 --> 08:01:23,270
number of elements in the heap, D is the degree\n
3901
08:01:23,270 --> 08:01:28,930
number is two. The two arrays, child and parent\n
3902
08:01:28,930 --> 08:01:34,207
node, so we don't have to compute them dynamically.\n
3903
08:01:34,207 --> 08:01:37,779
maps, which we're going to use to track the\n
3904
08:01:37,779 --> 08:01:42,467
values array, which is the array that contains\n
3905
08:01:42,468 --> 08:01:49,137
Note that it's very important to notice that\n
3906
08:01:49,137 --> 08:01:51,468
not by the nodes, indices, per se. So in the\n
3907
08:01:51,468 --> 08:01:56,280
maximum size for our heap. Then we just initialize\n
3908
08:01:56,279 --> 08:01:57,547
size of the heap, then I initialize all our\n
3909
08:01:57,547 --> 08:01:58,547
parent indices should be. And I also initialize\n
3910
08:01:58,547 --> 08:01:59,547
negative one values, then we have a few straightforward\n
3911
08:01:59,547 --> 08:02:02,759
So you'll notice that for contains, we don't\n
3912
08:02:02,759 --> 08:02:03,759
we pass in the key index. And we're going\n
3913
08:02:03,759 --> 08:02:04,759
have a few convenience bounds checking methods\n
3914
08:02:04,759 --> 08:02:05,949
such as key has to be in the bounds or throw.\n
3915
08:02:05,950 --> 08:02:09,680
is valid. And after this check is done, we\n
3916
08:02:09,680 --> 08:02:13,040
by looking inside the position map and checking\n
3917
08:02:13,040 --> 08:02:16,138
Then we have things like peek min key index,\n
3918
08:02:16,137 --> 08:02:17,137
of the heap. And similarly, poll min key index
3919
08:02:17,137 --> 08:02:18,137
and also peek min value and poll min value.\n
3920
08:02:18,137 --> 08:02:19,137
we want to just look at the value at the top\n
3921
08:02:19,137 --> 08:02:20,137
the poll version will actually remove it.\n
3922
08:02:20,137 --> 08:02:21,137
value pair, we need to make sure that that\n
3923
08:02:21,137 --> 08:02:22,137
heap. Otherwise, we're going to throw an exception\n
3924
08:02:22,137 --> 08:02:23,137
null. And if it is we throw an exception.\n
3925
08:02:23,137 --> 08:02:24,137
the position map and inverse map. Then we\n
3926
08:02:24,137 --> 08:02:25,137
we passed in. And then we swim up that node\n
3927
08:02:25,137 --> 08:02:26,137
we also increment the size variable so that\n
3928
08:02:26,137 --> 08:02:27,137
of pretty straightforward, just do a look\n
3929
08:02:27,137 --> 08:02:28,137
slightly more interesting. We make sure the\n
3930
08:02:28,137 --> 08:02:29,137
that key exists within the heap. we swap the\n
3931
08:02:29,137 --> 08:02:30,137
heap. Then we reposition the new node we swapped\n
3932
08:02:30,137 --> 08:02:31,137
or down the heap, we capture the value in\n
3933
08:02:31,137 --> 08:02:32,137
can return it later, we clean up the node\n
3934
08:02:32,137 --> 08:02:33,137
value. update is also pretty easy. Just make\n
3935
08:02:33,137 --> 08:02:34,137
not null, then we get the index for the node,\n
3936
08:02:34,137 --> 08:02:35,137
new value, then move it within the heap. And\n
3937
08:02:35,137 --> 08:02:36,137
and decrease are just short for decrease key\n
3938
08:02:36,137 --> 08:02:37,137
D-ary heap here. So make sure you take that into\n
3939
08:02:37,137 --> 08:02:38,137
sure the key exists and it's not null, then\n
3940
08:02:38,137 --> 08:02:39,137
in the heap, the values array at the key index,\n
3941
08:02:39,137 --> 08:02:40,137
that I didn't call the update method here,\n
3942
08:02:40,137 --> 08:02:41,137
the update method, we sink and we swim. So\n
3943
08:02:41,137 --> 08:02:42,137
way the node will go whether it's going to\n
3944
08:02:42,137 --> 08:02:43,137
key method, we do. Same thing for the increase\n
3945
08:02:43,137 --> 08:02:44,137
the less comparator so that the values array\n
3946
08:02:44,137 --> 08:02:45,137
passed in is on the right. These are just\n
3947
08:02:45,137 --> 08:02:46,137
the slides, we can go over quickly. So to\n
3948
08:02:46,137 --> 08:02:47,137
i, since we're working with a D-ary heap, we\n
3949
08:02:47,137 --> 08:02:48,137
the one with the least value, this is going\n
3950
08:02:48,137 --> 08:02:49,137
sure that i is equal to the value of j and\n
3951
08:02:49,137 --> 08:02:50,137
repeat this until we can't sink the node anymore.\n
3952
08:02:50,137 --> 08:02:51,137
find the parent of node i, which we can just\n
3953
08:02:51,137 --> 08:02:52,137
i, swap with the parent and keep doing this\n
3954
08:02:52,137 --> 08:02:53,137
just looks at all the children of node i\n
3955
08:02:53,137 --> 08:02:54,137
returns its index. Also pretty straightforward.\n
3956
08:02:54,137 --> 08:02:55,137
swap the indices in the position map and\n
3957
08:02:55,137 --> 08:02:56,137
have the less function which simply compares\n
3958
08:02:56,137 --> 08:02:57,137
convenience methods just because I didn't\n
3959
08:02:57,137 --> 08:02:58,137
exceptions everywhere. Just kind of wrap them\n
3960
08:02:58,137 --> 08:02:59,137
to make sure that our heap is indeed a min\n
3961
08:02:59,137 --> 08:03:03,637
D-ary heap. I hope there wasn't too much\n
3962
08:03:03,637 --> 08:03:10,700
give this video a thumbs up if you learned something\n
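
The walkthrough above can be condensed into a small sketch. This is not the code shown in the video; it is a minimal Python approximation, assuming a position map `pm` (key index to heap position), an inverse map `im` (heap position to key index), and a `values` list indexed by key index. The class name, method names, and exception types are illustrative choices.

```python
class IndexedMinDHeap:
    """Sketch of a min indexed D-ary heap (names are illustrative)."""

    def __init__(self, degree, max_size):
        self.d = max(2, degree)          # children per node
        self.n = max_size                # key indices are 0..n-1
        self.size = 0
        self.pm = [-1] * max_size        # key index -> heap position
        self.im = [-1] * max_size        # heap position -> key index
        self.values = [None] * max_size  # key index -> value

    def contains(self, ki):
        self._check_bounds(ki)
        return self.pm[ki] != -1

    def peek_min_key_index(self):
        if self.size == 0:
            raise IndexError("heap is empty")
        return self.im[0]

    def poll_min_key_index(self):
        ki = self.peek_min_key_index()
        self.delete(ki)
        return ki

    def insert(self, ki, value):
        if self.contains(ki):
            raise ValueError("key index already in heap")
        if value is None:
            raise ValueError("value cannot be None")
        self.pm[ki] = self.size
        self.im[self.size] = ki
        self.values[ki] = value
        self._swim(self.size)
        self.size += 1

    def delete(self, ki):
        self._require_key(ki)
        i = self.pm[ki]
        self.size -= 1
        self._swap(i, self.size)         # move the last node into slot i
        if i < self.size:                # reposition the swapped-in node
            self._sink(i)
            self._swim(i)
        value = self.values[ki]          # capture the value, then clean up
        self.values[ki] = None
        self.pm[ki] = -1
        self.im[self.size] = -1
        return value

    def update(self, ki, value):
        self._require_key(ki)
        if value is None:
            raise ValueError("value cannot be None")
        i, old = self.pm[ki], self.values[ki]
        self.values[ki] = value
        self._sink(i)                    # direction unknown, so try both
        self._swim(i)
        return old

    def decrease(self, ki, value):
        self._require_key(ki)
        if value < self.values[ki]:      # smaller values can only move up
            self.values[ki] = value
            self._swim(self.pm[ki])

    def increase(self, ki, value):
        self._require_key(ki)
        if self.values[ki] < value:      # larger values can only move down
            self.values[ki] = value
            self._sink(self.pm[ki])

    def _sink(self, i):
        j = self._min_child(i)
        while j != -1:
            self._swap(i, j)
            i = j
            j = self._min_child(i)

    def _swim(self, i):
        while i > 0 and self._less(i, (i - 1) // self.d):
            parent = (i - 1) // self.d
            self._swap(i, parent)
            i = parent

    def _min_child(self, i):
        # Position of the smallest child that is less than node i, or -1.
        index = -1
        start = i * self.d + 1
        for j in range(start, min(self.size, start + self.d)):
            if self._less(j, i):
                index = i = j
        return index

    def _swap(self, i, j):
        self.pm[self.im[j]] = i
        self.pm[self.im[i]] = j
        self.im[i], self.im[j] = self.im[j], self.im[i]

    def _less(self, i, j):
        return self.values[self.im[i]] < self.values[self.im[j]]

    def _check_bounds(self, ki):
        if not 0 <= ki < self.n:
            raise IndexError("key index out of bounds")

    def _require_key(self, ki):
        if not self.contains(ki):
            raise KeyError("key index not in heap")
```

In this sketch `decrease` only swims and `increase` only sinks, because the direction of travel is known in advance, while `update` does both.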