All language subtitles for Data-Structures-Easy-to-Advanced-Course-Full-Tutorial-from-a-Google-Engineer_en

1 00:00:00,420 --> 00:00:06,500 In these first few videos, I want to lay the\n 2 00:00:06,500 --> 00:00:13,809 need throughout these video tutorials. Let's\n 3 00:00:13,808 --> 00:00:21,160 data structure? One definition that I really\n 4 00:00:21,160 --> 00:00:27,460 data so that it can be used efficiently. And\n 5 00:00:27,460 --> 00:00:33,980 a way of organizing data in some fashion so\n 6 00:00:33,979 --> 00:00:45,158 or perhaps even updated quickly and easily.\n 7 00:00:45,158 --> 00:00:53,588 Well, they are essential ingredients in creating\n 8 00:00:53,588 --> 00:01:01,100 reason might be that they help us manage and\n 9 00:01:01,100 --> 00:01:09,390 this last point is more of my own making.\n 10 00:01:09,390 --> 00:01:14,939 to understand. As a side note, one of the\n 11 00:01:14,938 --> 00:01:21,419 bad, mediocre to excellent programmers is that\n 12 00:01:21,420 --> 00:01:27,700 fundamentally understand how and when to use\n 13 00:01:27,700 --> 00:01:32,170 they're trying to finish. Data structures\n 14 00:01:32,170 --> 00:01:38,890 okay product and an outstanding one. It's\n 15 00:01:38,890 --> 00:01:46,950 student is required to take a course in data\n 16 00:01:46,950 --> 00:01:53,100 begin talking about data structures that we\n 17 00:01:53,099 --> 00:02:01,438 structures. What I'm talking about is the\n 18 00:02:01,438 --> 00:02:06,728 an abstract data type and how does it differ\n 19 00:02:06,728 --> 00:02:12,759 that an abstract data type is an abstraction\n 20 00:02:12,759 --> 00:02:19,759 interface to which that data structure must\n 21 00:02:22,750 --> 00:02:29,479 a specific data structure should be implemented,\n 22 00:02:29,479 --> 00:02:38,339 example I like to give is to suppose that\n 23 00:02:38,340 --> 00:02:45,590 to get from point A to point B. Well, as we\n 24 00:02:45,590 --> 00:02:51,719 to get from one place to another. 
So which\n 25 00:02:51,719 --> 00:02:57,908 transportation might be walking or biking,\n 26 00:02:57,908 --> 00:03:03,959 specific modes of transportation would be\n 27 00:03:03,959 --> 00:03:10,370 We want to get from one place to another through\n 28 00:03:10,370 --> 00:03:17,759 abstract data type. How did we do that? Exactly?\n 29 00:03:17,759 --> 00:03:25,968 some examples of abstract data types on the\n 30 00:03:25,968 --> 00:03:32,968 on the right hand side. As you can see, a\n 31 00:03:32,968 --> 00:03:39,489 have a dynamic array or a linked list. They\n 32 00:03:39,489 --> 00:03:47,530 indexing elements in the list. Next, we have\n 33 00:03:47,530 --> 00:03:54,019 themselves can be implemented in a variety\n 34 00:03:54,019 --> 00:04:02,079 for a queue, I put a stack based queue because\n 35 00:04:02,079 --> 00:04:09,659 using only stacks. This may not be the most\n 36 00:04:09,658 --> 00:04:16,800 it does work and it is possible. The point\n 37 00:04:16,800 --> 00:04:22,860 how a data structure should behave, and what\n 38 00:04:22,860 --> 00:04:31,069 surrounding how those methods are implemented.\n 39 00:04:31,069 --> 00:04:37,819 data types, we need to have a quick look at\n 40 00:04:37,819 --> 00:04:45,829 to understand the performance that our data\n 41 00:04:45,829 --> 00:04:53,389 we often find ourselves asking the same two\n 42 00:04:53,389 --> 00:05:01,279 is, how much time does this algorithm need\n 43 00:05:01,279 --> 00:05:08,529 algorithm need for my computation. So, if\n 44 00:05:08,529 --> 00:05:15,819 to finish, then it's no good. Similarly, if\n 45 00:05:15,819 --> 00:05:21,639 a space equal to the sum of all the bytes\n 46 00:05:21,639 --> 00:05:30,300 your algorithm is also useless. 
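To make the earlier point about abstract data types concrete, here is a minimal Java sketch of a queue ADT together with the stack-based queue mentioned in the examples. The names `SimpleQueue` and `StackBasedQueue` are hypothetical and are not taken from the course's source code; this is one possible implementation under those assumptions, not the most efficient one.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.NoSuchElementException;

// Hypothetical ADT: it only states what a queue can do, not how it does it.
interface SimpleQueue<T> {
  void enqueue(T item);
  T dequeue();
  boolean isEmpty();
}

// One possible data structure behind that interface: a queue built from two stacks.
class StackBasedQueue<T> implements SimpleQueue<T> {
  private final Deque<T> inbox = new ArrayDeque<>();  // receives new items
  private final Deque<T> outbox = new ArrayDeque<>(); // serves items in FIFO order

  @Override
  public void enqueue(T item) {
    inbox.push(item);
  }

  @Override
  public T dequeue() {
    // Refill the outbox only when it runs dry; this reverses the inbox order.
    if (outbox.isEmpty()) {
      while (!inbox.isEmpty()) outbox.push(inbox.pop());
    }
    if (outbox.isEmpty()) throw new NoSuchElementException("queue is empty");
    return outbox.pop();
  }

  @Override
  public boolean isEmpty() {
    return inbox.isEmpty() && outbox.isEmpty();
  }
}
```

Code written against `SimpleQueue` would not need to change if the stack-based version were swapped for a linked-list-based one, which is the separation the ADT idea is after.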
So just standardize\n 47 00:05:30,300 --> 00:05:35,759 much space is required for an algorithm to\n 48 00:05:35,759 --> 00:05:43,949 invented big O notation, amongst other things\n 49 00:05:43,949 --> 00:05:50,909 we're interested in Big O because it tells\n 50 00:05:50,910 --> 00:05:57,450 cares about the worst case. So if your algorithm\n 51 00:05:57,449 --> 00:06:05,579 possible arrangement of numbers for your particular\n 52 00:06:05,579 --> 00:06:10,649 suppose you have an unordered list of unique\n 53 00:06:10,649 --> 00:06:16,669 seven, or the position where seven occurs\n 54 00:06:16,670 --> 00:06:21,629 seven, that is at the beginning. Or in the\n 55 00:06:21,629 --> 00:06:29,339 very last element of the list. for that particular\n 56 00:06:29,339 --> 00:06:35,119 with respect to the number of elements in\n 57 00:06:35,120 --> 00:06:43,550 every single element until you find seven.\n 58 00:06:43,550 --> 00:06:48,740 just consider the worst possible amount of\n 59 00:06:48,740 --> 00:06:55,990 that particular input. There's also the fact\n 60 00:06:55,990 --> 00:07:01,410 when your input becomes arbitrarily large.\n 61 00:07:01,410 --> 00:07:07,170 the input is small. For this reason, you'll\n 62 00:07:07,170 --> 00:07:18,240 multiplicative factors. So in our big O notation,\n 63 00:07:21,639 --> 00:07:27,819 when I say n, n is usually always want to\n 64 00:07:27,819 --> 00:07:35,459 there's always going to be some limitation\n 65 00:07:35,459 --> 00:07:42,911 a one wrapped around a big O. If your algorithm\n 66 00:07:42,911 --> 00:07:50,180 we say that's big O of a log event. If it's\n 67 00:07:50,180 --> 00:07:57,470 quadratic time or cubic time, then we say\n 68 00:07:57,470 --> 00:08:03,030 Usually, this is going to be something like\n 69 00:08:03,029 --> 00:08:10,019 than one to the n. 
And then we also have n\n 70 00:08:10,019 --> 00:08:17,509 these like square root of n log log of n,\n 71 00:08:17,509 --> 00:08:24,379 any mathematical expression containing n can\n 72 00:08:24,379 --> 00:08:33,448 valid. Now, we want to talk about some properties\n 73 00:08:33,448 --> 00:08:38,929 in the last two slides, Big O only really\n 74 00:08:38,929 --> 00:08:45,128 really big. So we're not interested when n\nis small, only 75 00:08:45,129 --> 00:08:54,149 what happens when n goes to infinity. So this\n 76 00:08:54,149 --> 00:09:00,568 The first that we can simply remove constant\n 77 00:09:00,568 --> 00:09:06,389 if you're adding a constant to infinity, well,\n 78 00:09:06,389 --> 00:09:14,220 a constant by infinity, yeah, that's still\n 79 00:09:14,220 --> 00:09:20,810 of course, this is all theoretical. In the\n 80 00:09:20,809 --> 00:09:28,000 billion, probably, that's going to have a\n 81 00:09:29,620 --> 00:09:38,578 However, let us look at a function f, which\n 82 00:09:38,578 --> 00:09:49,500 f of n is seven log of n cubed plus 15 n squared\n 83 00:09:49,500 --> 00:10:01,169 of n is just n cubed, because n cubed is the\n 84 00:10:02,990 --> 00:10:12,579 look at some concrete examples of how big\n 85 00:10:12,578 --> 00:10:19,748 constant time with respect to the input size,\n 86 00:10:21,070 --> 00:10:27,350 So on the left, when we're just adding or\n 87 00:10:27,350 --> 00:10:33,060 constant time. And on the right, okay, we're\n 88 00:10:33,059 --> 00:10:39,669 depend on n. So it runs also in a constant\n 89 00:10:39,669 --> 00:10:46,958 arbitrarily large, well, that loop is still\n 90 00:10:46,958 --> 00:10:51,239 Now let's look at a linear example. 
So both\n 91 00:10:51,240 --> 00:10:57,680 linear time with respect to the input size,\n 92 00:11:00,620 --> 00:11:07,740 So on the left, we're incrementing the counter\n 93 00:11:07,740 --> 00:11:14,180 clearly, when we wrap this in a big O, we get\n 94 00:11:14,179 --> 00:11:21,349 complicated, we're not incrementing by one,\n 95 00:11:21,350 --> 00:11:28,329 to finish that loop three times faster. So\n 96 00:11:28,328 --> 00:11:36,599 is two algorithms that run in quadratic time.\n 97 00:11:36,600 --> 00:11:45,319 n times. So n times n is big O of n squared.\n 98 00:11:45,318 --> 00:11:52,938 zero with an i. So pause the video and try\n 99 00:11:52,938 --> 00:12:02,088 squared. Okay, let's go over the solution.\n 100 00:12:02,089 --> 00:12:10,660 first loop isn't as important. So since i\n 101 00:12:10,659 --> 00:12:16,039 is going to be directly related to what i\n 102 00:12:16,039 --> 00:12:24,429 loop. So if we fix i to be zero, we do n\n 103 00:12:24,429 --> 00:12:32,938 If we fix it to two, we do n minus two work\n 104 00:12:32,938 --> 00:12:39,659 what is n plus n minus one plus n minus two\n 105 00:12:39,659 --> 00:12:47,958 a well known identity, and turns out to be\n 106 00:12:47,958 --> 00:12:56,599 wrap this in a big O, we split our equation,\n 107 00:12:56,600 --> 00:13:03,870 Now let's look at perhaps a more complicated\n 108 00:13:03,870 --> 00:13:10,339 how do we ever get these logarithmic or\n 109 00:13:10,339 --> 00:13:18,410 a classic algorithm of doing a binary search,\n 110 00:13:18,409 --> 00:13:26,179 So what this algorithm does is it starts by\n 111 00:13:26,179 --> 00:13:31,628 and one at the very end of the array, then\n 112 00:13:31,629 --> 00:13:38,610 if the value we're looking for was found at\n 113 00:13:38,610 --> 00:13:46,269 or not, if it has found it, it stops. Otherwise,\n 114 00:13:46,269 --> 00:13:52,649 either the high or the low pointer. Remark\n 115 00:13:52,649 --> 00:14:00,078 half of the array each iteration. 
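The binary search just described can be sketched in Java roughly as follows. The class name `BinarySearchSketch` and the sample values are assumptions made for illustration; the slide code from the video is not reproduced here.

```java
class BinarySearchSketch {
  // Returns the index of target in a sorted array, or -1 if it is absent.
  static int binarySearch(int[] sorted, int target) {
    int low = 0;                  // pointer at the start of the range
    int high = sorted.length - 1; // pointer at the very end of the range
    while (low <= high) {
      int mid = low + (high - low) / 2; // midpoint, written to avoid int overflow
      if (sorted[mid] == target) return mid;
      if (sorted[mid] < target) {
        low = mid + 1;  // discard the lower half
      } else {
        high = mid - 1; // discard the upper half
      }
    }
    return -1; // the range became empty without finding the target
  }

  public static void main(String[] args) {
    int[] values = {1, 3, 5, 7, 9, 11}; // made-up sorted input
    System.out.println(binarySearch(values, 7)); // prints 3
    System.out.println(binarySearch(values, 4)); // prints -1
  }
}
```

Each pass through the loop shrinks the candidate range by about half, which is where the logarithmic iteration count comes from.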
So very,\n 116 00:14:00,078 --> 00:14:06,458 range check. So if you do the math, the worst\n 117 00:14:06,458 --> 00:14:14,528 N iterations, meaning that the binary search\n 118 00:14:14,528 --> 00:14:21,860 a very powerful algorithm. Here's a slightly\n 119 00:14:21,860 --> 00:14:28,789 notice that there is an outer loop with a\n 120 00:14:28,789 --> 00:14:35,528 that there are two inner loops, one that does\n 121 00:14:35,528 --> 00:14:40,499 work. So the rule we use to determine the\n 122 00:14:40,499 --> 00:14:49,439 loops on different levels and add those that\n 123 00:14:49,438 --> 00:14:56,889 using the rule above, we can see that it takes\n 124 00:14:56,889 --> 00:15:09,149 n plus two n for both inner loop Which gives\n 125 00:15:09,149 --> 00:15:15,919 All right, so this next one looks very similar,\n 126 00:15:15,919 --> 00:15:24,969 outer loop with AI, we have AI going from\n 127 00:15:24,970 --> 00:15:29,910 on the outside. But we have to multiply that\n 128 00:15:31,198 --> 00:15:39,808 j goes from 10, to 50. So that does 40 loops\n 129 00:15:39,808 --> 00:15:48,278 the amount of work. Plus however, the second\n 130 00:15:48,278 --> 00:15:54,999 J equals j plus two, so it's accelerated a\n 131 00:15:54,999 --> 00:16:04,060 a little faster. So we're gonna get on the\n 132 00:16:04,059 --> 00:16:11,318 we have to multiply that by three n. So we\n 133 00:16:11,318 --> 00:16:18,039 is going to give us big of enter the force,\n 134 00:16:18,039 --> 00:16:31,078 in our function, f of n. For some other classic\n 135 00:16:31,078 --> 00:16:36,479 the subsets of a set, that takes an exponential\n 136 00:16:36,480 --> 00:16:42,870 N subsets, finding all permutations of a string\n 137 00:16:42,870 --> 00:16:51,948 one is merge sort. 
So we have n times log\n 138 00:16:51,948 --> 00:16:59,688 iterate over all the cells of an array of\n 139 00:16:59,688 --> 00:17:07,519 right, let's talk about arrays, probably the\n 140 00:17:07,519 --> 00:17:13,910 of two in the array videos. The reason the\n 141 00:17:13,910 --> 00:17:21,029 a fundamental building block for all other\n 142 00:17:21,029 --> 00:17:27,579 with arrays and pointers alone, I'm pretty\n 143 00:17:27,579 --> 00:17:35,409 structure. So an outline for today's video.\n 144 00:17:35,410 --> 00:17:44,250 about arrays and answer some fundamental questions\n 145 00:17:44,250 --> 00:17:50,619 Next, I will explain the basic structure of\n 146 00:17:50,619 --> 00:17:58,549 able to perform on them. Lastly, we will go\n 147 00:17:58,549 --> 00:18:06,319 look at some source code on how to construct\n 148 00:18:06,319 --> 00:18:14,669 discussion and examples. So what is a static\n 149 00:18:14,670 --> 00:18:22,259 containing elements, which are indexable,\n 150 00:18:22,259 --> 00:18:31,940 n minus one also inclusive. So a follow up\n 151 00:18:31,940 --> 00:18:39,500 So answer this is this means that each slot\n 152 00:18:39,500 --> 00:18:47,519 a number. Furthermore, I would like to add\n 153 00:18:47,519 --> 00:18:55,250 chunks of memory. Meaning that your chunk\n 154 00:18:55,250 --> 00:19:04,230 cheese with a bunch of holes and gaps. It's\n 155 00:19:04,230 --> 00:19:14,130 in your static array. Okay, so when and where\n 156 00:19:14,130 --> 00:19:21,410 everywhere, absolutely everywhere. It's hard\n 157 00:19:21,410 --> 00:19:29,509 fact, here are a few places you may or may\n 158 00:19:29,509 --> 00:19:36,359 the first simple example is to temporarily\n 159 00:19:36,359 --> 00:19:42,429 you're probably familiar with. Next is that\n 160 00:19:42,430 --> 00:19:48,259 from an input or an output stream. 
Suppose\n 161 00:19:48,259 --> 00:19:55,180 that you need to process but that file is\n 162 00:19:55,180 --> 00:20:02,390 we use a buffer to read small chunks of the\n 163 00:20:02,390 --> 00:20:08,150 a time. And so eventually we're able to read\n 164 00:20:08,150 --> 00:20:15,160 as lookup tables because of their indexing\n 165 00:20:15,160 --> 00:20:20,960 retrieve data from the lookup table if you\n 166 00:20:20,960 --> 00:20:27,361 and at what offset. Next, we also use arrays\n 167 00:20:27,361 --> 00:20:30,410 that only allows one return value. 168 00:20:30,410 --> 00:20:38,279 So the hack we use then is to return a pointer\n 169 00:20:38,279 --> 00:20:45,190 all the return values that we want. This last\n 170 00:20:45,190 --> 00:20:51,400 are heavily used the programming technique\n 171 00:20:51,400 --> 00:20:57,880 to cache already computed subproblems. So\n 172 00:20:57,880 --> 00:21:03,820 problem, or the coin change problem. All right,\n 173 00:21:03,819 --> 00:21:11,819 access time for static array and a dynamic\n 174 00:21:11,819 --> 00:21:17,980 arrays are indexable. Searching, however,\n 175 00:21:17,980 --> 00:21:24,589 we potentially have to traverse all the elements\n 176 00:21:24,589 --> 00:21:31,929 the element you're looking for does not exist.\n 177 00:21:31,930 --> 00:21:38,730 array doesn't really make sense. The static\n 178 00:21:38,730 --> 00:21:46,490 grow larger or smaller. When inserting with\n 179 00:21:46,490 --> 00:21:51,849 linear time, because you potentially have\n 180 00:21:51,849 --> 00:21:57,559 recopy all the elements into the new static\n 181 00:21:57,559 --> 00:22:05,149 a dynamic array using static arrays. However,\n 182 00:22:05,150 --> 00:22:13,920 seem a little strange? Well, when we append\n 183 00:22:13,920 --> 00:22:21,310 the internal static array containing all those\n 184 00:22:21,309 --> 00:22:28,779 appending becomes constant time. 
Deletions\n 185 00:22:28,779 --> 00:22:36,700 are linear, you have to shift all of the elements\n 186 00:22:36,700 --> 00:22:45,370 into your static array. Okay, so we have a\n 187 00:22:45,369 --> 00:22:56,739 A contains the values 44, 12, -5, 17, 6, 0, 3, 9\n 188 00:22:56,740 --> 00:23:06,970 but this is not a requirement of an array.\n 189 00:23:06,970 --> 00:23:15,940 has index position zero in the array, not\n 190 00:23:15,940 --> 00:23:23,880 science students, you have no idea. The confusing\n 191 00:23:23,880 --> 00:23:36,880 is one based, while computer science is one.\n 192 00:23:36,880 --> 00:23:49,850 the values 44, 12, -5, 17, 6, 0, 3, 9, and 100. Currently,\n 193 00:23:49,849 --> 00:23:58,199 is not at all a requirement of the array.\n 194 00:23:58,200 --> 00:24:09,200 is indexed, or positioned at index of zero\n 195 00:24:09,200 --> 00:24:17,039 a lot of intro computer science students.\n 196 00:24:17,039 --> 00:24:24,599 mathematics is one based while computer science\n 197 00:24:24,599 --> 00:24:30,719 But worst of all is quantum computing. I\n 198 00:24:30,720 --> 00:24:38,710 during my undergrad, and the field is a mess.\n 199 00:24:38,710 --> 00:24:45,740 scientists and physicists all at the same\n 200 00:24:45,740 --> 00:24:53,420 Anyways, back to arrays. I should also note\n 201 00:24:53,420 --> 00:24:59,350 for-each loop, something that's offered in\n 202 00:24:59,349 --> 00:25:06,409 you to explicitly reference the indices of\n 203 00:25:06,410 --> 00:25:14,850 internally, behind the scenes, the notation\n 204 00:25:14,849 --> 00:25:23,449 array A square bracket zero, close square\n 205 00:25:24,839 --> 00:25:35,319 is equal to the value 44. Similarly, A position\n 206 00:25:35,319 --> 00:25:43,289 But A index nine is out of bounds. 
And our\n 207 00:25:43,289 --> 00:25:51,819 in C, it doesn't always throw an exception,\n 208 00:25:51,819 --> 00:26:02,649 zero to B minus one that happens if we assign\n 209 00:26:02,650 --> 00:26:11,250 be 25, let's look at operations on dynamic\n 210 00:26:11,250 --> 00:26:18,259 in size as needed. So the dynamic array can\n 211 00:26:18,259 --> 00:26:25,400 can do but unlike the static array, it grows\n 212 00:26:25,400 --> 00:26:33,180 have a containing 34, and four, there, if\n 213 00:26:33,180 --> 00:26:41,070 end. If we add 34, again, then it will add\n 214 00:26:41,069 --> 00:26:49,109 So you see here, our dynamic array shrink\n 215 00:26:49,109 --> 00:26:54,279 talked about this a little bit. But how do\n 216 00:26:54,279 --> 00:26:59,589 the answer is typically this is done with\n 217 00:26:59,589 --> 00:27:07,109 of course. So first, we create a stack if\n 218 00:27:07,109 --> 00:27:14,259 nine zero. So as we add elements, we add elements\n 219 00:27:14,259 --> 00:27:21,250 of the number of elements added. Once, we\n 220 00:27:21,250 --> 00:27:26,569 of our internal static array, what we can\n 221 00:27:26,569 --> 00:27:31,500 elements into this new static array, and add\n 222 00:27:31,500 --> 00:27:39,509 an example. So suppose we create a dynamic\n 223 00:27:39,509 --> 00:27:46,730 we begin adding elements to it. So the little\n 224 00:27:46,730 --> 00:27:52,089 for an empty position. Okay, so we add seven,\n 225 00:27:52,089 --> 00:27:56,839 fine. But once we add three, it doesn't fit\n 226 00:27:56,839 --> 00:28:03,059 the size of the array, copy all the elements\n 227 00:28:03,059 --> 00:28:09,539 12, everything's still okay, we're doing good.\n 228 00:28:09,539 --> 00:28:16,339 resize again. So double the size of the container,\n 229 00:28:16,339 --> 00:28:22,849 array, and then finish off by adding five.\n 230 00:28:22,849 --> 00:28:31,609 issues. All right, time to look at the dynamic\n 231 00:28:31,609 --> 00:28:41,639 in the array series. 
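As a rough illustration of the doubling strategy walked through above, here is a small int-only sketch. The class `IntDynamicArraySketch` is hypothetical and deliberately simpler than the generic class discussed in the video; it also assumes a strictly positive initial capacity so that doubling always grows the array.

```java
import java.util.Arrays;

class IntDynamicArraySketch {
  private int[] data;   // internal static array
  private int size = 0; // number of elements the user has actually added

  IntDynamicArraySketch(int initialCapacity) {
    // Assumption for this sketch: capacity must be positive so doubling works.
    if (initialCapacity <= 0) throw new IllegalArgumentException("capacity must be positive");
    data = new int[initialCapacity];
  }

  void add(int value) {
    if (size == data.length) {
      // Out of room: allocate an array twice as large and copy the old elements over.
      data = Arrays.copyOf(data, data.length * 2);
    }
    data[size++] = value;
  }

  int get(int index) {
    if (index < 0 || index >= size) throw new IndexOutOfBoundsException("index: " + index);
    return data[index];
  }

  int size() { return size; }

  int capacity() { return data.length; }

  public static void main(String[] args) {
    IntDynamicArraySketch arr = new IntDynamicArraySketch(2);
    for (int v : new int[] {7, -9, 3, 12, 5, -6}) arr.add(v);
    // The capacity grows 2 -> 4 -> 8 while six elements are appended.
    System.out.println("size=" + arr.size() + " capacity=" + arr.capacity()); // prints size=6 capacity=8
  }
}
```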
So the source code for\n 232 00:28:41,640 --> 00:28:49,110 github.com, slash my username slash data dash\n 233 00:28:49,109 --> 00:28:55,719 video so you know what's going on with this\n 234 00:28:55,720 --> 00:29:04,680 in the array class. So I've designed an array\n 235 00:29:04,680 --> 00:29:13,160 type of data we want put in this array, that's\n 236 00:29:13,160 --> 00:29:18,050 that we care about our, which is our internal\nstatic array. 237 00:29:19,140 --> 00:29:26,080 which is the length the user thinks the array\n 238 00:29:26,079 --> 00:29:34,210 array, because sometimes our array might have\n 239 00:29:34,210 --> 00:29:42,340 the user that there's extra free slots that\n 240 00:29:42,339 --> 00:29:50,089 So there's two constructors. The first one\n 241 00:29:50,089 --> 00:29:55,859 The other one you give it a capacity, the\n 242 00:29:55,859 --> 00:30:06,299 or equal to zero. And then we finish Why's\n 243 00:30:06,299 --> 00:30:17,730 I put this suppress warnings and unchecked\n 244 00:30:17,730 --> 00:30:22,940 So here are two simple methods size, get the\n 245 00:30:22,940 --> 00:30:28,850 empty, pretty self explanatory. 
Similarly\n 246 00:30:28,849 --> 00:30:34,971 if we want the value or set it if we want\n 247 00:30:34,971 --> 00:30:45,380 be doing a bounds check for both of these,\n 248 00:30:45,380 --> 00:30:54,240 we just remove all the data in our array and\n 249 00:30:54,240 --> 00:31:00,529 the Add method where things actually get a\n 250 00:31:02,369 --> 00:31:10,689 plus one is greater than or equal to the capacity,\n 251 00:31:10,690 --> 00:31:16,610 I'm resizing is I'm doubling the size of the\n 252 00:31:16,609 --> 00:31:25,299 size, but I've decided that doubling the size\n 253 00:31:25,299 --> 00:31:34,079 I have to create a new array with the new\n 254 00:31:34,079 --> 00:31:41,559 this line or these lines are doing, it's copying\n 255 00:31:41,559 --> 00:31:52,319 then it sets the old array to be the new array.\n 256 00:31:52,319 --> 00:32:04,319 into our right. So this remove AZ method will\n 257 00:32:04,319 --> 00:32:11,519 First, we check if the index is valid. If\n 258 00:32:11,519 --> 00:32:20,720 otherwise, grab the data at the Remove index.\n 259 00:32:20,720 --> 00:32:28,950 one. Now copy everything into the new array 260 00:32:30,250 --> 00:32:36,849 for when it's at that remove index. Now I'm\n 261 00:32:36,849 --> 00:32:45,819 decided to do the following maintain two indices\n 262 00:32:45,819 --> 00:32:54,399 when i is equal to the Remove index, then\n 263 00:32:54,400 --> 00:33:07,540 temporarily and using j to lag behind, if\n 264 00:33:07,539 --> 00:33:13,210 So I guess pretty clever overall. And then\n 265 00:33:13,210 --> 00:33:23,509 generated. Reset the capacity and return the\n 266 00:33:23,509 --> 00:33:31,049 remove, we scan through the array. If we find\n 267 00:33:31,049 --> 00:33:38,759 the index and return true otherwise return\n 268 00:33:38,759 --> 00:33:48,650 it return I otherwise return minus one. Contains\n 269 00:33:48,650 --> 00:34:00,430 one. 
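The remove-at-index idea above, with one index walking the old array and a second one lagging behind in the new array, might look something like this. It is a simplified variation on plain int arrays, and `RemoveAtSketch` and `removeAt` are illustrative names, not the method from the course repository.

```java
class RemoveAtSketch {
  // Returns a new array with the element at rmIndex left out.
  static int[] removeAt(int[] arr, int rmIndex) {
    if (rmIndex < 0 || rmIndex >= arr.length) {
      throw new IndexOutOfBoundsException("index: " + rmIndex);
    }
    int[] result = new int[arr.length - 1];
    // i walks the old array, j walks the new one.
    for (int i = 0, j = 0; i < arr.length; i++) {
      if (i == rmIndex) continue; // skip this slot; from here on j lags one behind i
      result[j++] = arr[i];
    }
    return result;
  }

  public static void main(String[] args) {
    int[] after = removeAt(new int[] {44, 12, -5, 17}, 1);
    System.out.println(java.util.Arrays.toString(after)); // prints [44, -5, 17]
  }
}
```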
All right, this next one, I return an\n 270 00:34:00,430 --> 00:34:08,730 us to iterate over the array providing an\n 271 00:34:08,730 --> 00:34:15,440 two methods. And this is has next. So there\n 272 00:34:15,440 --> 00:34:21,500 is less than the length of the array. I should\n 273 00:34:21,500 --> 00:34:27,480 here, just in case someone decides to change\n 274 00:34:27,480 --> 00:34:34,539 it might add that later might not. 275 00:34:37,028 --> 00:34:45,869 there's the next method, which just returns\n 276 00:34:45,869 --> 00:34:51,250 the iterator. Okay, and lastly is the to string\n 277 00:34:51,250 --> 00:34:56,829 Nothing too complicated. Alright, so this\n 278 00:34:56,829 --> 00:35:02,700 a dynamic array. If you look at Java's ArrayList\n 279 00:35:02,699 --> 00:35:08,189 we're going to talk about singly and doubly\n 280 00:35:08,190 --> 00:35:12,960 out there. This is part one of two and the\n 281 00:35:12,960 --> 00:35:21,150 code on how to implement a doubly linked list.\n 282 00:35:21,150 --> 00:35:26,010 we're going to answer some basic questions\n 283 00:35:26,010 --> 00:35:31,970 namely, what are they and where are they used.\n 284 00:35:31,969 --> 00:35:37,139 linked lists, so everyone knows what I mean\n 285 00:35:37,139 --> 00:35:43,039 the tail of the weight class. Then last in\n 286 00:35:43,039 --> 00:35:49,759 and cons of using singly and doubly linked\n 287 00:35:49,760 --> 00:35:57,830 from both singly and doubly linked lists as\n 288 00:35:57,829 --> 00:36:04,018 discussion. So what is the link list linked\n 289 00:36:04,018 --> 00:36:11,078 data which point to other nodes also containing\n 290 00:36:11,079 --> 00:36:17,670 list containing some arbitrary data. Notice\n 291 00:36:17,670 --> 00:36:25,269 node. Also notice that the last node points\n 292 00:36:29,170 --> 00:36:37,260 always has a null reference to the next note,\n 293 00:36:37,260 --> 00:36:45,000 slides. 
Okay, so where are linked lists use.\n 294 00:36:45,000 --> 00:36:52,239 actually in the abstract data type implementation\n 295 00:36:52,239 --> 00:36:58,649 great time complexity for adding and removing\n 296 00:36:58,650 --> 00:37:05,889 like circular lists, making the pointer of\n 297 00:37:05,889 --> 00:37:13,029 linked lists are used to model repeating events\n 298 00:37:13,030 --> 00:37:18,190 on a bunch of elements or representing corners\n 299 00:37:18,190 --> 00:37:24,380 linked lists can also be used to model real\n 300 00:37:24,380 --> 00:37:31,019 that could be useful. And moving on some more\n 301 00:37:31,018 --> 00:37:38,239 lists and hash table separate chaining, and\n 302 00:37:38,239 --> 00:37:45,568 to those in a later video. Okay, a bit of\n 303 00:37:45,568 --> 00:37:50,369 thing you need to know when creating a linked\n 304 00:37:50,369 --> 00:37:56,068 to the head of the link lists. This is because\n 305 00:37:56,068 --> 00:38:03,019 our list. We give a name to the last element\n 306 00:38:03,019 --> 00:38:09,719 tail of the list. Then there are also the\n 307 00:38:09,719 --> 00:38:15,858 pointers are also sometimes called references.\n 308 00:38:15,858 --> 00:38:22,259 You should also know that the nodes themselves\n 309 00:38:22,259 --> 00:38:28,809 when actually implemented. 
We'll get to this\n 310 00:38:28,809 --> 00:38:34,750 versus doubly linked lists, sort of concerning\n 311 00:38:34,750 --> 00:38:42,349 are two types, singly linked and doubly length.\n 312 00:38:42,349 --> 00:38:48,740 to the next node, while doubly linked lists\n 313 00:38:48,739 --> 00:38:54,819 but also to the previous node, which makes\n 314 00:38:54,820 --> 00:38:59,769 say we cannot have triple or quadruple the\n 315 00:38:59,768 --> 00:39:06,858 place additional pointers, pros and cons of\n 316 00:39:06,858 --> 00:39:12,400 Between picking a singly and a doubly linked\n 317 00:39:12,400 --> 00:39:18,900 linked lists we observed that uses less memory.\n 318 00:39:18,900 --> 00:39:25,230 up a lot of memory. If you're running on a\n 319 00:39:25,230 --> 00:39:31,298 on a 32 bit machine four bytes each. So having\n 320 00:39:31,298 --> 00:39:38,759 pointer for each node, hence, twice as much\n 321 00:39:38,759 --> 00:39:44,681 you cannot access previous elements because\n 322 00:39:44,681 --> 00:39:49,239 to traverse from the head of a linked lists\n 323 00:39:49,239 --> 00:39:55,219 it. Now concerning doubly linked lists with\n 324 00:39:55,219 --> 00:39:59,649 we can easily traverse the list backwards,\n 325 00:39:59,650 --> 00:40:05,338 list. Also having a reference to know Do you\n 326 00:40:05,338 --> 00:40:11,949 time and patch the hole you just created.\n 327 00:40:11,949 --> 00:40:17,460 previous and an ex notes, this is something\n 328 00:40:17,460 --> 00:40:25,181 would leave the list severed into a downside\n 329 00:40:25,181 --> 00:40:31,489 use twice as much memory. Okay, let's go into\n 330 00:40:31,489 --> 00:40:37,818 create linked lists and remove elements from\n 331 00:40:37,818 --> 00:40:45,429 list. So here is a singly linked list. I've\n 332 00:40:45,429 --> 00:40:52,108 we want to insert 11. At the third position\n 333 00:40:52,108 --> 00:40:59,038 an example. 
So the first thing we do is we\n 334 00:40:59,039 --> 00:41:05,049 This is almost always the first step in all\n 335 00:41:05,048 --> 00:41:12,230 to do is seek up to but not including the\n 336 00:41:12,230 --> 00:41:17,289 and we advanced our traverser pointers, setting\n 337 00:41:17,289 --> 00:41:25,588 23. And now we're actually ready already where\n 338 00:41:25,588 --> 00:41:30,068 we create the next node, that's the green\nnode 11. 339 00:41:30,068 --> 00:41:38,460 And we make 11 elevens. Next pointer, point\n 340 00:41:38,460 --> 00:41:45,289 is next pointer to be 11. Remember, we have\n 341 00:41:45,289 --> 00:41:52,950 a reference to it with the traverser. Okay,\n 342 00:41:52,949 --> 00:42:01,460 that we've correctly inserted 11 at the right\n 343 00:42:01,460 --> 00:42:08,119 time to insert with a doubly linked list.\n 344 00:42:08,119 --> 00:42:15,891 flying around. But it's the exact same concept.\n 345 00:42:15,891 --> 00:42:22,220 only has pointers to the next node, but also\n 346 00:42:22,219 --> 00:42:30,649 those in the insertion phase. Okay, create\n 347 00:42:30,650 --> 00:42:38,639 the head is, and advance it until you are\n 348 00:42:38,639 --> 00:42:43,618 advanced the traversal by one and now we're\n 349 00:42:43,619 --> 00:42:53,240 let's create the new node which is node 11.\n 350 00:42:53,239 --> 00:43:02,969 Also point leptons previous pointer to be\n 351 00:43:02,969 --> 00:43:13,248 traverser. How we make sevens previous pointer\n 352 00:43:13,248 --> 00:43:23,778 to 11. And the last step, make 20 threes next\n 353 00:43:23,778 --> 00:43:33,141 go forwards from 23 to 11. So in total remarked\n 354 00:43:33,141 --> 00:43:40,590 we've flattened out the list, you can see\n 355 00:43:40,590 --> 00:43:48,200 All right now how to remove elements from\n 356 00:43:48,199 --> 00:43:55,558 remove the node with a value nine. 
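For reference, the singly linked insertion walked through above (seek up to, but not including, the target position, then rewire two pointers) could be sketched like this. The `Node` class, the `insertAt` helper, and the list values other than 23, 11, and 7 are assumptions for illustration.

```java
class SinglyInsertSketch {
  static final class Node {
    int data;
    Node next;

    Node(int data, Node next) {
      this.data = data;
      this.next = next;
    }
  }

  // Inserts value at the given zero-based position and returns the (possibly new) head.
  static Node insertAt(Node head, int position, int value) {
    if (position < 0) throw new IndexOutOfBoundsException("position: " + position);
    if (position == 0) return new Node(value, head);
    Node trav = head; // traverser starts at the head
    for (int i = 0; i < position - 1; i++) {
      if (trav == null) throw new IndexOutOfBoundsException("position: " + position);
      trav = trav.next; // stop on the node just before the insertion point
    }
    if (trav == null) throw new IndexOutOfBoundsException("position: " + position);
    // The new node points at the old successor first, then the predecessor points at the new node.
    trav.next = new Node(value, trav.next);
    return head;
  }

  static String render(Node head) {
    StringBuilder sb = new StringBuilder();
    for (Node n = head; n != null; n = n.next) {
      if (sb.length() > 0) sb.append(" -> ");
      sb.append(n.data);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Node head = new Node(5, new Node(23, new Node(7, new Node(13, null))));
    head = insertAt(head, 2, 11);
    System.out.println(render(head)); // prints 5 -> 23 -> 11 -> 7 -> 13
  }
}
```

The order of the two pointer updates matters: pointing 23 at 11 before 11 points at 7 would lose the rest of the list unless another reference to 7 is kept.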
How do\n 357 00:43:55,559 --> 00:44:02,950 use is not to use one pointer but two. You\n 358 00:44:02,949 --> 00:44:11,909 to show you how it is done by using two. So\n 359 00:44:11,909 --> 00:44:19,828 for traverser one and traverser two, respectively.\n 360 00:44:19,829 --> 00:44:30,150 trav2 points to the head's next node. Now\n 361 00:44:30,150 --> 00:44:41,490 we find the node we want to remove while also\n 362 00:44:41,489 --> 00:44:48,838 nine. So this is the stopping point. I'm going\n 363 00:44:48,838 --> 00:44:57,268 to remove so we can deallocate its memory\n 364 00:44:57,268 --> 00:45:04,818 trav2 to the next node. Node nine has\n 365 00:45:04,818 --> 00:45:12,880 this will that at this point, node nine is\n 366 00:45:12,880 --> 00:45:23,180 for the visual effect. Okay, so now set trav\n 367 00:45:23,179 --> 00:45:30,659 And now is an appropriate time to remove the\n 368 00:45:30,659 --> 00:45:37,980 it, and there, temp has been deallocated. Make\n 369 00:45:37,980 --> 00:45:44,588 memory leaks. This is especially important\n 370 00:45:44,588 --> 00:45:50,690 where you manage your memory. Now you can\n 371 00:45:50,690 --> 00:45:58,259 list is shorter. Okay, now for the last bit\n 372 00:45:58,259 --> 00:46:04,130 nodes from a doubly linked list, which is\n 373 00:46:04,130 --> 00:46:11,430 from singly linked lists. The idea is the\n 374 00:46:11,429 --> 00:46:18,239 But this time, we only need one pointer. I\n 375 00:46:18,239 --> 00:46:24,318 node in a doubly linked list has a reference\n 376 00:46:24,318 --> 00:46:28,768 maintain it like we did with the singly linked\nlist. 377 00:46:28,768 --> 00:46:39,318 So let's start trav at the very beginning\n 378 00:46:39,318 --> 00:46:47,699 reached nine. And we want to remove it from\n 379 00:46:47,699 --> 00:46:56,528 be equal to 15 with access to four and 15\n 380 00:46:56,528 --> 00:47:06,498 pointer respectively. 
Similarly set 15 previous\n 381 00:47:06,498 --> 00:47:15,189 now read, meaning it is ready to be removed.\n 382 00:47:15,190 --> 00:47:23,019 out the doubly linked lists, we see that it\n 383 00:47:23,018 --> 00:47:31,769 complexity analysis on linked lists how good\n 384 00:47:31,769 --> 00:47:38,429 we have singly linked lists. And on the right\n 385 00:47:38,429 --> 00:47:44,328 in a linked list is linear in the worst case,\n 386 00:47:44,329 --> 00:47:52,420 not there, we have to traverse all of the\n 387 00:47:52,420 --> 00:47:58,940 is constant time, because we always maintain\n 388 00:47:58,940 --> 00:48:09,858 hence we can add it in constant time. Similarly\n 389 00:48:09,858 --> 00:48:14,818 linked lists, and a doubly linked list is\n 390 00:48:14,818 --> 00:48:22,358 a reference to it, so we can just move it\n 391 00:48:22,358 --> 00:48:27,768 removing from the tail is another story. It\n 392 00:48:27,768 --> 00:48:35,498 a singly linked list. Can you think of Why?\n 393 00:48:35,498 --> 00:48:43,549 tail in a singly linked lists, we can remove\n 394 00:48:43,550 --> 00:48:50,568 value of what the tail is. So we had to seek\n 395 00:48:50,568 --> 00:48:57,869 new tail is equal to. W linked list however,\n 396 00:48:57,869 --> 00:49:04,329 a pointer to the previous node. So we can\n 397 00:49:04,329 --> 00:49:10,450 finally, removing somewhere in the middle\n 398 00:49:10,449 --> 00:49:15,189 we would need to seek through n minus one\n 399 00:49:16,400 --> 00:49:23,088 to look at some double e linked list source\n 400 00:49:23,088 --> 00:49:30,998 list series. 
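Before moving on to the doubly linked list source code, here is a small sketch of the two-traverser removal from a singly linked list described earlier. The names `trav1`, `trav2`, and `temp` follow the narration; the `Node` class, the `remove` helper, and the sample values other than 9 are assumptions for illustration.

```java
class SinglyRemoveSketch {
  static final class Node {
    int data;
    Node next;

    Node(int data, Node next) {
      this.data = data;
      this.next = next;
    }
  }

  // Removes the first node holding value and returns the (possibly new) head.
  static Node remove(Node head, int value) {
    if (head == null) return null;
    if (head.data == value) return head.next; // removing the head needs no predecessor
    Node trav1 = head;      // trails one node behind
    Node trav2 = head.next; // looks for the node to remove
    while (trav2 != null && trav2.data != value) {
      trav1 = trav2;
      trav2 = trav2.next;
    }
    if (trav2 == null) return head; // value not present, list unchanged
    Node temp = trav2;   // remember the node being removed
    trav2 = trav2.next;  // step past it
    trav1.next = trav2;  // patch the hole
    temp.next = null;    // in Java the garbage collector reclaims temp afterwards
    return head;
  }

  static int length(Node head) {
    int count = 0;
    for (Node n = head; n != null; n = n.next) count++;
    return count;
  }
}
```

Because the predecessor has to be found by walking from the head, removing the tail this way takes linear time, which matches the complexity table above.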
So the link for the source code\n 401 00:49:30,998 --> 00:49:38,329 williamfiset slash data dash structures.\n 402 00:49:38,329 --> 00:49:43,828 source code helpful so that others may also\n 403 00:49:43,829 --> 00:49:49,880 first part of the linked list series before continuing.\n 404 00:49:49,880 --> 00:49:59,818 at the implementation of a doubly linked list\n 405 00:49:59,818 --> 00:50:06,380 a few instance variables. So we are keeping\n 406 00:50:06,380 --> 00:50:13,160 as what the head and the tail currently are.\n 407 00:50:13,159 --> 00:50:20,489 meaning the linked list is empty. Furthermore, we\n 408 00:50:20,489 --> 00:50:27,259 extensively, because it contains the data\n 409 00:50:27,259 --> 00:50:33,880 and next pointers for each node since this\n 410 00:50:33,880 --> 00:50:39,809 for the node, namely the data and the previous\n 411 00:50:39,809 --> 00:50:47,150 both otherwise, we can't do much. So this first\n 412 00:50:47,150 --> 00:50:54,079 list, it does so in linear time by going through\n 413 00:50:54,079 --> 00:51:02,269 time deallocates them by setting them equal\n 414 00:51:02,268 --> 00:51:08,838 head, we loop while the traverser is likely\n 415 00:51:08,838 --> 00:51:13,599 and then we do our deallocation business.\n 416 00:51:13,599 --> 00:51:20,760 and reset the head and tail. Perfect. These\n 417 00:51:20,760 --> 00:51:30,720 get the size and check if the size of our\n 418 00:51:30,719 --> 00:51:36,419 a public method to add an element by default,\n 419 00:51:36,420 --> 00:51:44,639 list or at the tail. But I also support adding\n 420 00:51:44,639 --> 00:51:51,808 do we do this, if this is the first element,\n 421 00:51:51,809 --> 00:52:02,739 and the tail to be equal to the new node,\n 422 00:52:02,739 --> 00:52:14,130 and next pointers set to null. Otherwise, if\n 423 00:52:14,130 --> 00:52:22,729 previous pointer is equal to this new node.\n 424 00:52:22,728 --> 00:52:29,728 to be whatever head's previous is.
So we backup\n 425 00:52:29,728 --> 00:52:36,998 forget to increment the size. A very similar\n 426 00:52:36,998 --> 00:52:42,608 length list, except we're moving the tail\npointer around. 427 00:52:44,009 --> 00:52:50,960 move to peak. So peaking is just looking at\n 428 00:52:50,960 --> 00:52:56,329 linked list or at the end of the linked list.\n 429 00:52:56,329 --> 00:53:04,650 is empty, because doesn't make sense to peek\n 430 00:53:04,650 --> 00:53:11,200 more complex method, which is remove first.\n 431 00:53:11,199 --> 00:53:19,808 the linked list. So we can't do much if the\n 432 00:53:19,809 --> 00:53:24,680 we extract the data at the head, and then\n 433 00:53:24,679 --> 00:53:33,230 the size by one. So if the list is empty,\n 434 00:53:33,230 --> 00:53:42,009 the head and the tail are now No. Otherwise,\n 435 00:53:42,009 --> 00:53:49,719 that we just removed. This is especially important\n 436 00:53:49,719 --> 00:53:58,848 delete pointers, then at the end, we return\n 437 00:53:58,849 --> 00:54:05,470 except we're using the tail this time to remove\n 438 00:54:05,469 --> 00:54:15,929 head. Okay, and here's a generic method to\n 439 00:54:15,929 --> 00:54:23,058 this to private because the node class itself\n 440 00:54:23,059 --> 00:54:28,700 to the node. That's just something we're using\n 441 00:54:28,699 --> 00:54:36,528 to manage the list. So if the node that we're\n 442 00:54:36,528 --> 00:54:45,838 detect that and call our methods either remove\n 443 00:54:45,838 --> 00:54:52,108 somewhere in the middle of linked list. And\n 444 00:54:52,108 --> 00:54:57,960 the to our current node equal to each other.\n 445 00:54:57,960 --> 00:55:05,000 node and And of course, don't forget to clean\n 446 00:55:05,000 --> 00:55:10,088 have to temporarily store the data. Of course,\n 447 00:55:10,088 --> 00:55:18,068 deleted the node and the data is already gone.\n 448 00:55:18,068 --> 00:55:25,920 particular index and our linked list. 
Yes,\n 449 00:55:25,920 --> 00:55:32,829 not explicitly indexed, we can pretend that\n 450 00:55:32,829 --> 00:55:40,068 valid, otherwise throw an illegal argument\n 451 00:55:40,068 --> 00:55:45,469 bit smarter than just naively going through\n 452 00:55:45,469 --> 00:55:51,429 from the front of the linked list to find\n 453 00:55:51,429 --> 00:56:00,169 the index is closer to the front or to the\n 454 00:56:00,170 --> 00:56:04,568 So for the Remove method, we want to be able\n 455 00:56:04,568 --> 00:56:13,690 list, which is object. So we're going to also\n 456 00:56:13,690 --> 00:56:20,690 someone decided that the value of the node\n 457 00:56:20,690 --> 00:56:27,229 special case. Otherwise, we traverse through\n 458 00:56:27,228 --> 00:56:34,338 and then remove that node and return true\n 459 00:56:34,338 --> 00:56:40,929 we want to remove. Otherwise, we return false\n 460 00:56:40,929 --> 00:56:48,199 for the element we want to remove, we use\n 461 00:56:48,199 --> 00:56:53,179 the element. If so, remove that node and return\ntrue. 462 00:56:54,710 --> 00:57:01,940 here we have a related method which is index\n 463 00:57:01,940 --> 00:57:08,950 remove value, but get whatever index this\n 464 00:57:08,949 --> 00:57:14,598 null. So even if our values No, we'll just\n 465 00:57:14,599 --> 00:57:23,579 So again, first link list. Otherwise, search\n 466 00:57:23,579 --> 00:57:35,099 the index as we go. We can use the index of\n 467 00:57:35,099 --> 00:57:40,329 is contained within a linked list because\n 468 00:57:40,329 --> 00:57:47,250 found. Something that's useful sometimes is\n 469 00:57:47,250 --> 00:57:56,880 is also trivial to implement, just start a\n 470 00:57:56,880 --> 00:58:03,940 until you reach the end. Notice I'm not checking\n 471 00:58:03,940 --> 00:58:15,880 you want to, it's pretty easy to do that.\n 472 00:58:15,880 --> 00:58:23,568 the to string method to print a string or\n 473 00:58:23,568 --> 00:58:32,978 linked list. 
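To tie the walkthrough above together, here is a condensed sketch of a doubly linked list with the operations discussed: adding at either end, removing from either end, removing an arbitrary node, and searching with support for null values. It is an illustration under assumptions, not the repository's actual file; the class name `DoublyLinkedListSketch` and the helper `removeNode` are made up for this example.

```java
import java.util.NoSuchElementException;

// Hypothetical condensed doubly linked list, for illustration only.
final class DoublyLinkedListSketch<T> {
    private static final class Node<T> {
        T data;
        Node<T> prev, next;
        Node(T data, Node<T> prev, Node<T> next) {
            this.data = data;
            this.prev = prev;
            this.next = next;
        }
    }

    private int size = 0;
    private Node<T> head = null, tail = null;

    int size() { return size; }
    boolean isEmpty() { return size == 0; }

    void addFirst(T elem) {
        if (isEmpty()) {
            head = tail = new Node<>(elem, null, null);
        } else {
            head.prev = new Node<>(elem, null, head);
            head = head.prev;              // back the head pointer up
        }
        size++;
    }

    void addLast(T elem) {
        if (isEmpty()) {
            head = tail = new Node<>(elem, null, null);
        } else {
            tail.next = new Node<>(elem, tail, null);
            tail = tail.next;              // move the tail pointer forward
        }
        size++;
    }

    T removeFirst() {
        if (isEmpty()) throw new NoSuchElementException("empty list");
        T data = head.data;
        head = head.next;
        size--;
        if (isEmpty()) tail = null;        // list became empty
        else head.prev = null;             // drop the link to the removed node
        return data;
    }

    T removeLast() {
        if (isEmpty()) throw new NoSuchElementException("empty list");
        T data = tail.data;
        tail = tail.prev;
        size--;
        if (isEmpty()) head = null;
        else tail.next = null;
        return data;
    }

    // Removing an arbitrary node is O(1) once we hold a reference to it.
    private T removeNode(Node<T> node) {
        if (node.prev == null) return removeFirst();
        if (node.next == null) return removeLast();
        node.next.prev = node.prev;        // make the neighbours skip over node
        node.prev.next = node.next;
        T data = node.data;                // store the data before cleaning up
        node.data = null;
        node.prev = node.next = null;
        size--;
        return data;
    }

    // Remove the first occurrence of obj; null is a legal value to search for.
    boolean remove(Object obj) {
        for (Node<T> trav = head; trav != null; trav = trav.next) {
            if (obj == null ? trav.data == null : obj.equals(trav.data)) {
                removeNode(trav);
                return true;
            }
        }
        return false;
    }

    // Index of the first occurrence of obj, or -1 if it is absent.
    int indexOf(Object obj) {
        int index = 0;
        for (Node<T> trav = head; trav != null; trav = trav.next, index++) {
            if (obj == null ? trav.data == null : obj.equals(trav.data)) return index;
        }
        return -1;
    }

    boolean contains(Object obj) { return indexOf(obj) != -1; }
}
```

Searching and removing by value are linear, while the head and tail operations are constant time, which matches the complexity table discussed earlier.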
May I begin by saying that the\n 474 00:58:32,978 --> 00:58:40,710 data structure, one of my favorites. In fact,\n 475 00:58:40,710 --> 00:58:46,318 Part Two will consist of looking at a stack\n 476 00:58:46,318 --> 00:58:54,639 source code for how a stack is implemented\n 477 00:58:54,639 --> 00:59:00,818 that we'll be covering in this video as well\n 478 00:59:00,818 --> 00:59:08,548 about what is a stack and where is it used?\n 479 00:59:08,548 --> 00:59:16,338 of how to solve problems using stacks. Afterwards,\n 480 00:59:16,338 --> 00:59:23,190 internally and the time complexity associated\n 481 00:59:23,190 --> 00:59:31,519 some source code. Moving on to the discussion\n 482 00:59:31,518 --> 00:59:37,899 is a one ended linear data structure which\n 483 00:59:37,900 --> 00:59:46,219 primary operations, namely, push and pop.\n 484 00:59:46,219 --> 00:59:52,550 constructed. There is one data member again\n 485 00:59:52,550 --> 00:59:59,249 data member getting added to the stack. Notice\n 486 00:59:59,248 --> 01:00:06,558 block at the top The stack. This is because\n 487 01:00:06,559 --> 01:00:16,309 added to the top of the pile. This behavior\n 488 01:00:16,309 --> 01:00:23,548 out. Let's look at a more detailed example\n 489 01:00:23,548 --> 01:00:32,099 and removed from a stack. So let's walk through\n 490 01:00:32,099 --> 01:00:40,360 on what we need to add and remove to the stack.\n 491 01:00:40,360 --> 01:00:49,068 the top element from the stack, which is Apple.\n 492 01:00:49,068 --> 01:00:56,298 onto the stack. So we add onion to the top\n 493 01:00:56,298 --> 01:01:05,268 celery onto the stack. Next is watermelon,\n 494 01:01:05,268 --> 01:01:10,118 says to pop so we remove the element at the\n 495 01:01:10,119 --> 01:01:17,219 just added. The next operation also a pop.\n 496 01:01:17,219 --> 01:01:24,849 is celery. And last operation push lettuce\n 497 01:01:24,849 --> 01:01:30,200 top of the stack. 
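The instruction list from this walkthrough can be replayed in code. The sketch below uses `java.util.ArrayDeque` as the stack and assumes, purely for illustration, that the stack initially holds only Apple; the class name `StackWalkthrough` is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical replay of the push/pop instructions from the walkthrough.
final class StackWalkthrough {
    // Assumption: the stack starts with just "Apple" on it.
    static Deque<String> initialStack() {
        Deque<String> stack = new ArrayDeque<>();
        stack.push("Apple");
        return stack;
    }

    // Applies the instructions to the given stack and returns the popped elements in order.
    static List<String> run(Deque<String> stack) {
        List<String> popped = new ArrayList<>();
        popped.add(stack.pop());   // pop: removes Apple, the current top
        stack.push("Onion");
        stack.push("Celery");
        stack.push("Watermelon");
        popped.add(stack.pop());   // pop: removes Watermelon, the element just added
        popped.add(stack.pop());   // pop: removes Celery
        stack.push("Lettuce");     // Lettuce is now on top, above Onion
        return popped;
    }
}
```

Every operation touches only the top of the stack, which is the last in, first out behavior described above.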
So as you can see, everything\n 498 01:01:30,199 --> 01:01:35,598 have access to anything else but the top of\n 499 01:01:36,778 --> 01:01:46,329 a stack works. So when in Where is a stack\n 500 01:01:46,329 --> 01:01:52,859 everywhere. They're using text editors to\n 501 01:01:52,858 --> 01:01:58,170 backwards or forwards. They use some compilers\n 502 01:01:58,170 --> 01:02:05,739 matching braces, and in the right order. stacks\n 503 01:02:05,739 --> 01:02:10,949 books, plates, and even games like the Tower\n 504 01:02:13,329 --> 01:02:17,859 stacks are also used behind the scenes to\n 505 01:02:17,858 --> 01:02:23,630 function calls. When a function returns it\n 506 01:02:23,630 --> 01:02:32,539 and rewinds to the next function that is on\n 507 01:02:32,539 --> 01:02:38,359 stacks all the time in programming and never\n 508 01:02:38,358 --> 01:02:43,900 stacks for us to perform a depth first search\n 509 01:02:43,900 --> 01:02:50,130 manually by maintaining your own stack, or\n 510 01:02:50,130 --> 01:02:58,440 stacks as we have just discussed complexity\n 511 01:02:58,440 --> 01:03:04,579 assumes that you implemented a stack using\n 512 01:03:04,579 --> 01:03:10,339 because we have a reference at the top of\n 513 01:03:10,338 --> 01:03:19,039 goes for popping and peeking. Searching however,\n 514 01:03:19,039 --> 01:03:23,489 searching for isn't necessarily at the top\n 515 01:03:23,489 --> 01:03:31,268 elements in the stack, hence require a linear\n 516 01:03:31,268 --> 01:03:39,449 of a problem using stacks problem. So given\n 517 01:03:39,449 --> 01:03:45,458 round brackets square brackets curly brackets\n 518 01:03:45,458 --> 01:03:52,288 So analyzing examples below to understand\n 519 01:03:52,289 --> 01:03:59,079 ones are invalid. 
So before I show you the\n 520 01:04:04,460 --> 01:04:10,499 in this first example, consider the following\n 521 01:04:10,498 --> 01:04:16,738 string from left to right, I will be displaying\n 522 01:04:16,739 --> 01:04:27,150 bracket. So let's begin. For every left bracket\n 523 01:04:27,150 --> 01:04:33,450 So this is the left square bracket that I\n 524 01:04:33,449 --> 01:04:42,798 this one on the stack. Same goes for the next\n 525 01:04:46,079 --> 01:04:55,280 this is a right square bracket. So we encountered\n 526 01:04:55,280 --> 01:05:01,630 checks. First we check if the stack is empty.\n 527 01:05:01,630 --> 01:05:08,420 if there are still things in the stack that\n 528 01:05:08,420 --> 01:05:15,740 is equal to the reversed current bracket.\n 529 01:05:15,739 --> 01:05:25,088 reversed bracket. So we are good. Next is\n 530 01:05:25,088 --> 01:05:32,338 empty. No, it isn't. So we're good. Is the\n 531 01:05:32,338 --> 01:05:40,190 bracket? Yes, it is. So let's keep going around\n 532 01:05:40,190 --> 01:05:48,349 A right bracket. Is the stack empty? No. Okay,\n 533 01:05:48,349 --> 01:05:57,999 equal to the reverse bracket? Yes. Okay. So\n 534 01:05:57,998 --> 01:06:04,471 stack empty? No. Okay, good. And there's the\n 535 01:06:04,471 --> 01:06:10,748 bracket. Yes. Okay, good. And now we're done\n 536 01:06:10,748 --> 01:06:17,018 the stack is empty. Now. Why is that? Well,\n 537 01:06:17,018 --> 01:06:23,838 sequence were left brackets, they would still\n 538 01:06:23,838 --> 01:06:31,989 So we can conclude that this bracket sequence\n 539 01:06:31,989 --> 01:06:39,568 example with another bracket sequence. So\n 540 01:06:39,568 --> 01:06:45,619 bracket is a left bracket, so we push onto\n 541 01:06:45,619 --> 01:06:51,880 bracket. So we push onto the stack. This next\n 542 01:06:51,880 --> 01:06:58,720 if the stack is empty. No, it's good. And\n 543 01:06:58,719 --> 01:07:05,449 reverse bracket? Yes, it is. 
This next bracket\n 544 01:07:05,449 --> 01:07:11,568 So we're good. And is the reverse bracket\n 545 01:07:11,568 --> 01:07:21,679 No, it isn't. So this bracket sequence is\n 546 01:07:21,679 --> 01:07:29,149 we just ran through. So if we let us be a\n 547 01:07:29,150 --> 01:07:36,410 string, we can get the reverse bracket for\n 548 01:07:36,409 --> 01:07:44,210 is a left bracket, push it on to the stack.\n 549 01:07:44,210 --> 01:07:50,179 And if the element at the top of the stack\n 550 01:07:50,179 --> 01:07:56,338 those conditions are true, then we return\n 551 01:07:56,338 --> 01:08:03,048 is empty or not. And if it is empty, then\n 552 01:08:03,048 --> 01:08:09,440 we do not. I want to take a moment and look\n 553 01:08:09,440 --> 01:08:14,920 amongst mathematicians and computer scientists.\n 554 01:08:14,920 --> 01:08:21,529 is played as follows. You start with a pile\n 555 01:08:21,529 --> 01:08:26,020 the objective of the game is to move all the\n 556 01:08:26,020 --> 01:08:33,480 this pile, and each move, you can move the\n 557 01:08:33,479 --> 01:08:37,658 a restriction that no disk be placed on top\nof 558 01:08:37,658 --> 01:08:44,848 a smaller desk. So we can think of each peg\n 559 01:08:44,849 --> 01:08:53,760 top element in a peg and placing it on another\n 560 01:08:53,760 --> 01:09:18,469 run. And you will see how each peg acts like\n 561 01:09:18,469 --> 01:09:25,380 So you just saw how transferring elements\n 562 01:09:25,380 --> 01:09:33,270 as popping a disk from one stack and pushing\n 563 01:09:33,270 --> 01:09:41,680 the disk you're placing on top is smaller.\n 564 01:09:41,680 --> 01:09:48,670 of three in the stack series. This is going\n 565 01:09:48,670 --> 01:09:57,449 a stack. So those stacks are often implemented\n 566 01:09:57,448 --> 01:10:04,609 sometimes double linked lists here We'll cover\n 567 01:10:04,609 --> 01:10:10,269 linked list. 
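The bracket-matching procedure walked through above might look like this in Java. It is a minimal sketch; the names `BracketValidator`, `isValid`, and `reverse` are assumptions for the example, and characters that are not brackets are simply treated as invalid.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical sketch of the stack-based bracket check.
final class BracketValidator {
    // Returns the matching left bracket for a right bracket, or 0 if c is not a right bracket.
    private static char reverse(char c) {
        switch (c) {
            case ')': return '(';
            case ']': return '[';
            case '}': return '{';
            default:  return 0;
        }
    }

    static boolean isValid(String s) {
        Deque<Character> stack = new ArrayDeque<>();
        for (char c : s.toCharArray()) {
            if (c == '(' || c == '[' || c == '{') {
                stack.push(c);                 // left bracket: push it onto the stack
            } else {
                char rev = reverse(c);
                if (rev == 0) return false;    // assumption: reject non-bracket characters
                // Right bracket: the stack must be non-empty and its top must match.
                if (stack.isEmpty() || stack.pop() != rev) return false;
            }
        }
        // Any left brackets still on the stack were never closed.
        return stack.isEmpty();
    }
}
```

For example, `BracketValidator.isValid("[{}]()")` accepts the sequence, while `BracketValidator.isValid("[{})")` rejects it because the round bracket does not match the square bracket on top of the stack.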
Later on, and we will look at\n 568 01:10:10,270 --> 01:10:18,751 using a doubly linked list. Okay, to begin\n 569 01:10:18,751 --> 01:10:26,980 our link place, so we're going to point the\n 570 01:10:26,979 --> 01:10:35,679 is initially empty. Then the trick to creating\n 571 01:10:35,680 --> 01:10:43,270 the new elements before the head and not at\n 572 01:10:43,270 --> 01:10:50,480 pointing in the correct direction when we\n 573 01:10:50,479 --> 01:10:56,299 we will soon see, the next element however,\n 574 01:10:56,300 --> 01:11:02,310 let's do that. To create a new node, adjust\n 575 01:11:02,310 --> 01:11:09,719 then hook on the nodes next pointer to where\n 576 01:11:09,719 --> 01:11:21,179 for five and also 13. Now let's have a look\n 577 01:11:21,179 --> 01:11:28,800 just move the head pointer to the next node\n 578 01:11:28,800 --> 01:11:35,639 the first node off the stack and set the nodes\n 579 01:11:35,639 --> 01:11:41,460 up by the garbage collector if you're coding\n 580 01:11:41,460 --> 01:11:46,929 references pointing to it. If you're in another\n 581 01:11:46,929 --> 01:11:53,849 explicitly deallocate free memory yourself\n 582 01:11:53,849 --> 01:11:59,840 Or you will get memory leaks. Getting a memory\n 583 01:11:59,840 --> 01:12:06,329 kinds of memory leaks, especially if it's\n 584 01:12:06,329 --> 01:12:13,019 reusing. So keep watching out for that not\n 585 01:12:13,020 --> 01:12:19,900 that we will be covering. If you see in an\n 586 01:12:19,899 --> 01:12:27,618 up my memory, please please point out to me,\n 587 01:12:27,618 --> 01:12:34,769 so we can patch that. Okay, so we keep proceeding\n 588 01:12:34,770 --> 01:12:43,980 pointer down to the next node. Pop again,\n 589 01:12:43,979 --> 01:12:52,199 popping we've reached last note and the stack\n 590 01:12:52,199 --> 01:12:59,050 in the stack series videos. Today we'll be\n 591 01:12:59,050 --> 01:13:06,260 stack. 
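The push and pop steps just described, inserting before the head and moving the head pointer forward, can be sketched with a singly linked list like this. The class name `LinkedStack` is hypothetical, and this is an illustration rather than the implementation shown in the video.

```java
import java.util.EmptyStackException;

// Hypothetical stack backed by a singly linked list; the head is the top of the stack.
final class LinkedStack<T> {
    private static final class Node<T> {
        final T data;
        Node<T> next;
        Node(T data, Node<T> next) {
            this.data = data;
            this.next = next;
        }
    }

    private Node<T> head;   // null means the stack is empty
    private int size;

    // Push: create a node whose next pointer is the old head, then move the head to it.
    void push(T elem) {
        head = new Node<>(elem, head);
        size++;
    }

    // Pop: move the head to the next node and unlink the old one.
    T pop() {
        if (head == null) throw new EmptyStackException();
        Node<T> old = head;
        head = head.next;
        old.next = null;    // no references left, so the garbage collector can reclaim it
        size--;
        return old.data;
    }

    T peek() {
        if (head == null) throw new EmptyStackException();
        return head.data;
    }

    boolean isEmpty() { return head == null; }
    int size() { return size; }
}
```

Both push and pop are constant time because they only touch the head. In C or C++ the unlinking step in `pop` is where the node would have to be freed explicitly.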
So the source code can be found on\n 592 01:13:06,260 --> 01:13:13,550 structures. Make sure you understood part\n 593 01:13:13,550 --> 01:13:19,840 So you actually know how we implement a stack\n 594 01:13:19,840 --> 01:13:25,400 video series, and the implementation of the\n 595 01:13:25,399 --> 01:13:31,859 to you then please start this repository on\n 596 01:13:31,859 --> 01:13:40,049 finding it as well. Here we are in the stack\n 597 01:13:40,050 --> 01:13:48,239 Java programming language. So the first thing\n 598 01:13:48,238 --> 01:13:56,109 variable of a length list. This is the linked\n 599 01:13:56,109 --> 01:14:02,000 linked list provided by Java. This is a little\n 600 01:14:02,000 --> 01:14:09,649 Java dot util that I will be using today,\n 601 01:14:10,770 --> 01:14:17,090 videos, this is just for portability, in case\n 602 01:14:17,090 --> 01:14:22,170 So we have two constructors, we can create\n 603 01:14:22,170 --> 01:14:29,800 one initial element. This is occasionally\n 604 01:14:29,800 --> 01:14:36,940 the stack. So to get to do that, we return\n 605 01:14:36,939 --> 01:14:42,698 the elements of our stack easy. We also check\n 606 01:14:42,698 --> 01:14:51,460 is zero. So this next one is just push so\n 607 01:14:51,460 --> 01:14:59,501 append that element as the last element in\n 608 01:14:59,501 --> 01:15:07,780 also pull element of the stack. So to do this,\n 609 01:15:07,779 --> 01:15:14,189 then we throw an empty stack exception because\n 610 01:15:14,189 --> 01:15:24,819 That doesn't make sense. Similarly, the same\n 611 01:15:24,819 --> 01:15:32,738 top element of the stack is, if the stack\n 612 01:15:32,738 --> 01:15:40,099 the last element of our list. And lastly,\n 613 01:15:40,100 --> 01:15:48,570 to iterate through our stack. This iterator\n 614 01:15:48,569 --> 01:15:54,359 supports concurrent modification errors. So\n 615 01:15:54,359 --> 01:16:00,880 that was static. 
It's only like 50 lines of\n 616 01:16:00,880 --> 01:16:06,010 one of the most useful data structures in\n 617 01:16:06,010 --> 01:16:13,130 one of three in the queue series. So the outline\n 618 01:16:13,130 --> 01:16:18,090 going to begin by talking about queues and\n 619 01:16:18,090 --> 01:16:24,110 some complexity analysis concerning queues.\n 620 01:16:24,109 --> 01:16:29,529 of enqueuing and dequeuing elements from a\n 621 01:16:29,529 --> 01:16:37,800 very end in the last video. So a discussion\n 622 01:16:37,800 --> 01:16:43,110 So below you can see an image of a queue.\n 623 01:16:43,109 --> 01:16:50,089 that models a real-world queue. Having two\n 624 01:16:50,090 --> 01:16:59,739 and dequeuing. So every queue has a front and\n 625 01:16:59,738 --> 01:17:05,299 back and remove through the front. Adding\n 626 01:17:05,300 --> 01:17:12,989 enqueuing, and removing elements from the\n 627 01:17:12,988 --> 01:17:21,888 there's a bit of terminology surrounding queues\n 628 01:17:21,889 --> 01:17:28,560 or what we refer to as enqueuing and dequeuing,\n 629 01:17:28,560 --> 01:17:37,119 So enqueuing is also called adding, but also\n 630 01:17:37,118 --> 01:17:42,389 we're talking about dequeuing. So this is\n 631 01:17:42,390 --> 01:17:48,780 queue. This is also called polling elements.\n 632 01:17:48,779 --> 01:17:55,109 as removing an element from the queue. But\n 633 01:17:55,109 --> 01:18:02,880 some ambiguity, did they mean removing from\n 634 01:18:02,880 --> 01:18:08,969 the entire queue? Make note that if I say\n 635 01:18:08,969 --> 01:18:16,989 from the front of the queue unless I say otherwise.\n 636 01:18:16,989 --> 01:18:23,399 in detail.
However, first, notice I have labeled\n 637 01:18:23,399 --> 01:18:29,359 where I'm going to be enqueuing and dequeuing\n 638 01:18:29,359 --> 01:18:36,449 instruction says enqueue 12, so we add 12\n 639 01:18:36,449 --> 01:18:44,720 the first element from the front of the queue,\n 640 01:18:44,720 --> 01:18:52,320 we removed minus one from the front of the\n 641 01:18:55,359 --> 01:19:05,009 dequeue, so remove the front element being 33.\n 642 01:19:08,039 --> 01:19:13,380 So now that we know what a queue is, where\n 643 01:19:13,380 --> 01:19:19,050 Well, a classic example of where queues get\n 644 01:19:19,050 --> 01:19:26,989 waiting in line at a movie theater or in the\n 645 01:19:26,988 --> 01:19:33,919 ever been to, say, McDonald's, where all the\n 646 01:19:33,920 --> 01:19:40,340 fried, the next person in line gets to order\n 647 01:19:40,340 --> 01:19:47,170 also be really useful if you have a sequence\n 648 01:19:47,170 --> 01:19:54,800 of say, the x most recent elements, while\n 649 01:19:54,800 --> 01:20:03,650 your queue gets larger than x elements, just\n 650 01:20:03,649 --> 01:20:11,039 in server management. So, suppose for a moment\n 651 01:20:11,039 --> 01:20:18,198 for requests from people to use your website,\n 652 01:20:18,198 --> 01:20:24,839 serve up to five people. But no more. If 12\n 653 01:20:24,840 --> 01:20:30,949 you're not going to be able to process all\n 654 01:20:30,948 --> 01:20:36,118 is you process the five that you're able to,\n 655 01:20:36,118 --> 01:20:43,408 queue waiting to be served. And whenever you\n 656 01:20:43,408 --> 01:20:49,179 next request, and then you start processing\n 657 01:20:49,179 --> 01:20:53,920 While you're doing this, more requests come\n 658 01:20:53,920 --> 01:21:01,539 add them to the end of the queue. Queues are\n 659 01:21:01,539 --> 01:21:05,380 first search traversal on a graph, which is\n 660 01:21:05,380 --> 01:21:14,630 this example in the next video.
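One of the uses mentioned above, keeping track of the x most recent elements of a sequence, is easy to sketch with a queue. The class name `RecentElements` and its methods are assumptions for this example; it relies on `java.util.ArrayDeque` through the `Queue` interface.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical helper that remembers only the most recent `capacity` elements.
final class RecentElements<T> {
    private final int capacity;
    private final Queue<T> queue = new ArrayDeque<>();

    RecentElements(int capacity) {
        if (capacity <= 0) throw new IllegalArgumentException("capacity must be positive");
        this.capacity = capacity;
    }

    // Enqueue the new element; if the queue grew past the limit, dequeue the oldest one.
    void add(T elem) {
        queue.offer(elem);
        if (queue.size() > capacity) queue.poll();
    }

    // The oldest element still remembered (the front of the queue), or null if empty.
    T oldest() { return queue.peek(); }

    int size() { return queue.size(); }
}
```

Enqueuing, dequeuing, and peeking are all constant time here, so each call to `add` is constant time as well.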
All right\n 661 01:21:14,630 --> 01:21:19,510 So as we're seeing, it's pretty obvious that\n 662 01:21:19,510 --> 01:21:25,010 time. There's also another operation on a\n 663 01:21:25,010 --> 01:21:30,760 peaking. peaking means that we're looking\n 664 01:21:30,760 --> 01:21:37,489 removing it, the source or cost and time.\n 665 01:21:37,488 --> 01:21:44,019 within the queue, is linear time since we\n 666 01:21:44,020 --> 01:21:50,139 the elements. There's also element removal\n 667 01:21:50,139 --> 01:21:56,539 or polling, but in actually removing an element\n 668 01:21:56,539 --> 01:22:03,389 linear time, since we would have to scan through\n 669 01:22:03,389 --> 01:22:08,590 this video, we're going to have a look at\n 670 01:22:08,590 --> 01:22:14,369 search. And then we're going to look at the\n 671 01:22:14,369 --> 01:22:22,948 and D queuing elements works. Okay, onto the\n 672 01:22:22,948 --> 01:22:29,678 search is an operation we can do on the graph\n 673 01:22:29,679 --> 01:22:35,679 what I mean, when I say graph, I mean a network\n 674 01:22:35,679 --> 01:22:41,170 like that. But first, I should explain the\n 675 01:22:41,170 --> 01:22:47,279 search. The objective is to start a node and\n 676 01:22:47,279 --> 01:22:54,090 all the neighbors of the starting node, and\n 677 01:22:54,090 --> 01:22:59,670 node you visited and then all the neighbors\n 678 01:22:59,670 --> 01:23:07,559 so forth, expanding through all the neighbors\n 679 01:23:07,559 --> 01:23:14,929 of the breadth first search as expanding the\n 680 01:23:14,929 --> 01:23:23,260 as you go on. 
So let's begin our breadth first\n 681 01:23:23,260 --> 01:23:32,119 node zero as yellow and put it in the frontier\n 682 01:23:32,118 --> 01:23:38,380 visit all the neighbors of zero being one\n 683 01:23:38,380 --> 01:23:45,631 then we resolve the neighbors of one and nine\n 684 01:23:45,631 --> 01:23:51,410 seven, and visit all the neighbors of seven.\n 685 01:23:51,409 --> 01:23:59,689 here, and now visit all the neighbors of the\nyellow nodes. 686 01:23:59,689 --> 01:24:06,149 And now we're done with our breadth first search\n 687 01:24:06,149 --> 01:24:14,198 frontier. Notice that there's 12 that is the\n 688 01:24:14,198 --> 01:24:20,169 island all by itself. So we are not able to\n 689 01:24:20,170 --> 01:24:26,719 is fine. Suppose you want to actually code\n 690 01:24:26,719 --> 01:24:36,529 done? Well, the idea is to use a queue. So\n 691 01:24:36,529 --> 01:24:43,639 And then we mark the starting node as visited.\n 692 01:24:43,639 --> 01:24:52,819 an element from our queue or dequeuing. And\n 693 01:24:52,819 --> 01:24:58,799 dequeued, if the neighbor has not been visited\n 694 01:24:58,800 --> 01:25:07,429 to the queue. So now we have a way of processing\n 695 01:25:07,429 --> 01:25:15,658 search order. Really, really useful, very,\n 696 01:25:15,658 --> 01:25:23,219 now let's look at implementation of queues.\n 697 01:25:23,219 --> 01:25:29,439 out that you can implement the queue abstract\n 698 01:25:29,439 --> 01:25:36,219 the most popular methods are to either use\n 699 01:25:36,219 --> 01:25:42,800 lists. If you're using an array, you have\n 700 01:25:44,319 --> 01:25:48,960 array, if it's a dynamic array, then you'll\n 701 01:25:50,710 --> 01:25:54,730 a singly linked list and the source code,\n 702 01:25:54,729 --> 01:26:01,439 tuned for that. In a singly linked list, we're\n 703 01:26:01,439 --> 01:26:11,089 So initially, they're both null.
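The queue-based breadth first search outlined a moment ago (enqueue the start node and mark it visited, then repeatedly dequeue a node and enqueue its unvisited neighbors) can be sketched as below. The adjacency-list representation, the class name `BreadthFirstSearch`, and the returned visit order are assumptions made for this illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical BFS over a graph given as adjacency lists: graph.get(i) lists the neighbors of node i.
final class BreadthFirstSearch {
    // Returns the nodes reachable from `start` in the order they are dequeued.
    static List<Integer> bfs(List<List<Integer>> graph, int start) {
        boolean[] visited = new boolean[graph.size()];
        List<Integer> order = new ArrayList<>();
        Queue<Integer> queue = new ArrayDeque<>();

        queue.offer(start);          // add the starting node to the queue
        visited[start] = true;       // and mark it as visited

        while (!queue.isEmpty()) {
            int node = queue.poll(); // dequeue the next node to process
            order.add(node);
            for (int neighbor : graph.get(node)) {
                if (!visited[neighbor]) {
                    visited[neighbor] = true;
                    queue.offer(neighbor);
                }
            }
        }
        return order;
    }
}
```

A node that sits on an island by itself, like node 12 in the example, never enters the queue and so never appears in the result.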
But as we n\n 704 01:26:11,090 --> 01:26:16,630 so nothing really interesting is going on\n 705 01:26:16,630 --> 01:26:23,109 can see that we're pushing the tail pointer\n 706 01:26:23,109 --> 01:26:32,170 the tail pointer point to the next node. Now\n 707 01:26:32,170 --> 01:26:38,279 the tail forward, we're going to be pushing\n 708 01:26:38,279 --> 01:26:44,050 one, and then the element that was left over\n 709 01:26:44,050 --> 01:26:50,159 user. So why don't we push the head pointer\n 710 01:26:50,158 --> 01:26:55,399 so that it can be picked up by the garbage\n 711 01:26:55,399 --> 01:27:00,799 in another programming language, which requires\n 712 01:27:00,800 --> 01:27:08,210 yourself like C or c++, now's the time to\n 713 01:27:08,210 --> 01:27:16,448 we're just pushing the head forward and forward\n 714 01:27:16,448 --> 01:27:22,219 of elements, then remove them all then the\n 715 01:27:22,219 --> 01:27:31,789 is where we started. All right, now it's time\n 716 01:27:31,789 --> 01:27:37,948 So I implemented a queue and you can find\n 717 01:27:37,948 --> 01:27:47,808 slash my user name slash data dash structures.\n 718 01:27:47,809 --> 01:27:55,320 parts one and two from the Q series before\n 719 01:27:55,319 --> 01:28:03,380 some source code for a queue. So this source\n 720 01:28:03,380 --> 01:28:09,828 although you can probably translate it into\n 721 01:28:09,828 --> 01:28:17,009 the first thing to remark is I have an instance\n 722 01:28:17,010 --> 01:28:24,760 a Java's implementation of a doubly linked\n 723 01:28:24,760 --> 01:28:33,930 as you'll see the queue and the stack implementations\n 724 01:28:33,930 --> 01:28:40,920 constructors, one create just an empty queue.\n 725 01:28:40,920 --> 01:28:47,670 a queue but with a first element. In fact,\n 726 01:28:47,670 --> 01:28:56,210 we we might want to allow no elements. 
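The head and tail pointer movements described for the singly linked list queue can be sketched as follows. This is a hand-rolled illustration with a hypothetical class name `LinkedQueue`; the source code discussed in the video wraps Java's built-in linked list instead.

```java
import java.util.NoSuchElementException;

// Hypothetical queue backed by a singly linked list with head (front) and tail (back) pointers.
final class LinkedQueue<T> {
    private static final class Node<T> {
        final T data;
        Node<T> next;
        Node(T data) { this.data = data; }
    }

    private Node<T> head;   // front of the queue; null when empty
    private Node<T> tail;   // back of the queue; null when empty
    private int size;

    // Enqueue: hook the new node after the tail, then push the tail pointer forward.
    void offer(T elem) {
        Node<T> node = new Node<>(elem);
        if (tail == null) {
            head = tail = node;
        } else {
            tail.next = node;
            tail = node;
        }
        size++;
    }

    // Dequeue: push the head pointer forward and unlink the old front node.
    T poll() {
        if (head == null) throw new NoSuchElementException("queue is empty");
        Node<T> old = head;
        head = head.next;
        if (head == null) tail = null;  // removed the last element: back to where we started
        old.next = null;                // lets the garbage collector reclaim the node
        size--;
        return old.data;
    }

    T peek() {
        if (head == null) throw new NoSuchElementException("queue is empty");
        return head.data;
    }

    boolean isEmpty() { return size == 0; }
    int size() { return size; }
}
```

Both `offer` and `poll` are constant time because each touches only one end of the list.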
So\n 727 01:28:56,210 --> 01:29:00,149 method is the size, it just gets the size\n 728 01:29:00,149 --> 01:29:08,210 if the linked list is empty. Those are both\n 729 01:29:08,210 --> 01:29:14,618 method is the peek method, the peek method\n 730 01:29:14,618 --> 01:29:23,000 the queue, but it will throw an error if your\n 731 01:29:23,000 --> 01:29:31,090 when your queue is empty. Similarly, for poll,\n 732 01:29:31,090 --> 01:29:41,510 the queue, but unlike peek will actually remove\n 733 01:29:41,510 --> 01:29:50,170 down a little bit, I have offer which adds\n 734 01:29:50,170 --> 01:29:55,949 I am allowing for null elements. So if you don't\n 735 01:29:55,948 --> 01:30:04,848 throw an error or something. So the poll removed\n 736 01:30:04,849 --> 01:30:11,840 to the back. So remove first and add last.\n 737 01:30:11,840 --> 01:30:18,889 in case you want to be able to iterate through\n 738 01:30:18,889 --> 01:30:26,579 and very simple implementation just under\n 739 01:30:26,579 --> 01:30:30,260 Although there are faster ways of creating\n 740 01:30:30,260 --> 01:30:39,780 The idea with arrays, especially static arrays,\n 741 01:30:39,779 --> 01:30:47,139 that will be in your queue at any given time,\n 742 01:30:47,139 --> 01:30:54,599 size and have pointers to the front and the\n 743 01:30:54,599 --> 01:31:03,659 remove elements based on the relative position\n 744 01:31:03,658 --> 01:31:10,939 where you're running off the edge of your\n 745 01:31:10,939 --> 01:31:16,559 of the array and keep processing elements\n 746 01:31:16,560 --> 01:31:24,969 maintain references to the next node, such\n 747 01:31:24,969 --> 01:31:33,059 your homework is to create a static array\n 748 01:31:33,059 --> 01:31:39,949 everything to do with priority queues from\n 749 01:31:39,948 --> 01:31:45,529 And towards the end, we'll also have a look\n 750 01:31:45,529 --> 01:31:52,090 queue stuff. We're also going to talk about\n 751 01:31:52,090 --> 01:32:00,159 although not the same.
So the outline for\n 752 01:32:00,158 --> 01:32:06,170 start with the basics talking about what are\n 753 01:32:06,170 --> 01:32:10,859 then we'll move on to some common operations\n 754 01:32:10,859 --> 01:32:16,699 how we can turn min priority queues into max\n 755 01:32:16,699 --> 01:32:23,019 analysis. And we'll talk about common ways\n 756 01:32:23,020 --> 01:32:27,820 people think heaps are the only way we can\n 757 01:32:27,819 --> 01:32:34,658 queues somehow are heaps, I want to dispel\n 758 01:32:34,658 --> 01:32:40,348 some great detail about how to implement the\n 759 01:32:40,349 --> 01:32:48,310 we'll look at methods of sinking and swimming\n 760 01:32:48,310 --> 01:32:55,310 used to get and shuffle around elements in\n 761 01:32:55,310 --> 01:33:04,699 explanation, I also go over how to pull and\n 762 01:33:04,698 --> 01:33:12,638 let's get started. discussion and examples.\n 763 01:33:12,639 --> 01:33:19,630 priority queue series. So what is a priority\n 764 01:33:19,630 --> 01:33:25,809 type that operates similar to a normal queue\n 765 01:33:25,809 --> 01:33:34,360 a certain priority. So elements with a higher\n 766 01:33:34,359 --> 01:33:42,649 As a side note, I'd like to remark that priority\n 767 01:33:42,649 --> 01:33:48,379 meaning that the data we insert into the priority\n 768 01:33:48,380 --> 01:33:54,270 from least to greatest or raised lease. This\n 769 01:33:54,270 --> 01:34:01,380 elements. Okay, let's go into an example.\n 770 01:34:01,380 --> 01:34:06,170 inserted into a priority queue on the right,\n 771 01:34:06,170 --> 01:34:12,840 such that we want to order them from least\n 772 01:34:12,840 --> 01:34:21,630 higher priority than the bigger ones. So they\n 773 01:34:21,630 --> 01:34:27,868 Suppose we have now a list of instructions.\n 774 01:34:27,868 --> 01:34:35,328 the element that has the highest priority\n 775 01:34:35,328 --> 01:34:42,380 works. 
So if I say poll, then I remove the\n 776 01:34:42,380 --> 01:34:51,059 to be one. Now I say add two, so we add two\n 777 01:34:51,059 --> 01:34:56,510 of smallest elements in our priority queue,\n 778 01:34:56,510 --> 01:35:06,619 Next, we add four. Poll the smallest, this is\n 779 01:35:06,618 --> 01:35:12,479 poll the rest. So as I poll the rest, I'm\n 780 01:35:14,000 --> 01:35:21,109 priority queue. So it turns out that as we\n 781 01:35:21,109 --> 01:35:26,738 sequence. This is a coincidence. Actually,\n 782 01:35:26,738 --> 01:35:32,238 queue, we do not necessarily end up getting\n 783 01:35:32,238 --> 01:35:38,049 that the next number that is removed from\n 784 01:35:38,050 --> 01:35:45,260 that was currently in the priority queue.\n 785 01:35:45,260 --> 01:35:51,489 is the next smallest number to remove? As\n 786 01:35:51,488 --> 01:35:57,828 inside the priority queue and look and know\n 787 01:35:57,828 --> 01:36:04,210 was going to return. But fundamentally, how\n 788 01:36:04,210 --> 01:36:09,429 all the elements inside a priority queue before\n 789 01:36:09,429 --> 01:36:19,980 be highly inefficient. Instead, it uses what\n 790 01:36:19,979 --> 01:36:26,589 then is what is a heap? Usually I make up\n 791 01:36:26,590 --> 01:36:34,210 one from wiki. A heap is a tree-based data\n 792 01:36:34,210 --> 01:36:41,800 also called the heap property. If A is a parent\n 793 01:36:41,800 --> 01:36:49,800 to B for all nodes A and B in the heap. What\n 794 01:36:49,800 --> 01:36:55,360 is always greater than or equal to the value\n 795 01:36:55,359 --> 01:37:01,269 way around that the value of the parent node\n 796 01:37:01,270 --> 01:37:08,810 child node for all nodes. This means we end\n 797 01:37:08,810 --> 01:37:15,620 and min heaps. So max heaps are the one with\n 798 01:37:15,619 --> 01:37:22,328 its children.
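The point made above, that polling always returns the smallest element currently in the priority queue but the overall output need not be sorted, can be checked with Java's built-in `java.util.PriorityQueue`. The values below are chosen for this illustration and are not the ones from the video; the class name `PriorityQueueWalkthrough` is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical walkthrough of interleaved add and poll operations on a min priority queue.
final class PriorityQueueWalkthrough {
    // Returns the polled values in the order they came out.
    static List<Integer> run() {
        // Assumed starting contents; smaller numbers have higher priority.
        PriorityQueue<Integer> pq = new PriorityQueue<>(Arrays.asList(8, 3, 14));
        List<Integer> polled = new ArrayList<>();

        polled.add(pq.poll());   // 3 is the smallest of {8, 3, 14}
        pq.add(1);
        polled.add(pq.poll());   // 1 is now the smallest, although 3 already came out
        pq.add(4);
        polled.add(pq.poll());   // 4 is the smallest of {8, 14, 4}
        while (!pq.isEmpty()) {
            polled.add(pq.poll()); // poll the rest: 8, then 14
        }
        return polled;
    }
}
```

The result is 3, 1, 4, 8, 14: each poll returned the minimum at that moment, yet the sequence as a whole is not sorted because elements were added in between.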
And the min heap is the opposite.\n 799 01:37:22,328 --> 01:37:28,090 heaps binary because every node has exactly\n 800 01:37:28,090 --> 01:37:37,369 or no values I have not drawn in. So why are\n 801 01:37:37,368 --> 01:37:44,738 underlying data structure for priority queues?\n 802 01:37:44,738 --> 01:37:49,829 called heaps, although this isn't technically\n 803 01:37:49,829 --> 01:37:57,010 is an abstract data type, meaning it can be\n 804 01:37:57,010 --> 01:38:03,429 Okay, we're going to play a little game, I'm\n 805 01:38:03,429 --> 01:38:08,949 need to tell me whether it is a heap or not.\n 806 01:38:08,948 --> 01:38:13,069 to determine whether it's a heap or not, you\n 807 01:38:13,069 --> 01:38:21,799 just going to give you a short moment here.\n 808 01:38:21,800 --> 01:38:38,110 and this tree. So it's not a heap. Is this\n 809 01:38:38,109 --> 01:38:44,569 it satisfies a heap invariant, and it is a\n 810 01:38:44,569 --> 01:38:50,000 why we're called binomial heaps. Note that\n 811 01:38:50,000 --> 01:39:01,710 can have any number of branches. On to our\n 812 01:39:01,710 --> 01:39:07,969 a valid heap. Because even though this one\n 813 01:39:07,969 --> 01:39:15,340 free to move around the visual representation\n 814 01:39:15,340 --> 01:39:27,110 valid heap. How about this one? No, this structure\n 815 01:39:27,109 --> 01:39:40,849 the cycles. All heaps must be trees. What\n 816 01:39:40,850 --> 01:39:51,079 this one? Also heap because it satisfies the\n 817 01:39:51,078 --> 01:39:59,698 than or equal to or greater than or equal\n 818 01:39:59,698 --> 01:40:05,118 not onto the heap and because it does not\n 819 01:40:05,118 --> 01:40:14,679 do change the root to be 10, then we can satisfy\n 820 01:40:14,680 --> 01:40:27,761 Or rather sorry, a max heap. So when and where\n 821 01:40:27,761 --> 01:40:34,020 of the most popular places we see priority\n 822 01:40:34,020 --> 01:40:42,250 to fetch the next nodes we explore. 
priority\n 823 01:40:42,250 --> 01:40:49,810 a behavior in which you need to dynamically\n 824 01:40:49,810 --> 01:40:59,520 They're also used in Huffman encoding, which\n 825 01:40:59,520 --> 01:41:04,880 Many best first search algorithms use priority\n 826 01:41:04,880 --> 01:41:11,140 gain grab the next most promising node in\n 827 01:41:11,140 --> 01:41:18,340 we also see priority queues in prims minimum\n 828 01:41:18,340 --> 01:41:24,119 So it seems priority queues are really important,\n 829 01:41:24,118 --> 01:41:32,469 is where we see them often. Okay, on to some\n 830 01:41:32,469 --> 01:41:40,630 as a binary heap. To begin with, there exists\n 831 01:41:40,630 --> 01:41:46,250 unordered array in linear time, we're not\n 832 01:41:46,250 --> 01:41:54,559 cool. And it forms the basis for the sorting\n 833 01:41:54,559 --> 01:42:00,850 rather removing pulling or removing an element\n 834 01:42:00,850 --> 01:42:06,480 time, because you need to restore the heap\n 835 01:42:06,479 --> 01:42:13,698 time. So peaking or seeing the value at the\n 836 01:42:13,698 --> 01:42:20,269 is really nice. Adding an element to our heap\n 837 01:42:20,270 --> 01:42:26,950 we possibly have to reshuffled heap by bubbling\n 838 01:42:26,949 --> 01:42:36,109 Then there are a few more operations we can\n 839 01:42:36,109 --> 01:42:43,189 which is not the root element. So the naive\n 840 01:42:43,189 --> 01:42:50,349 do a linear scan to find the items position\n 841 01:42:50,350 --> 01:42:57,260 is it can be extremely slow in some situations,\n 842 01:42:57,260 --> 01:43:02,389 don't do this. And it's not a problem, which\n 843 01:43:02,389 --> 01:43:10,029 lazy and do the linear scan solution. However,\n 844 01:43:10,029 --> 01:43:17,130 time complexity, which I will go over later\n 845 01:43:17,130 --> 01:43:23,020 series. 
So stay tuned for that this method\n 846 01:43:23,020 --> 01:43:27,789 complexity to be logarithmic, which is super\n 847 01:43:27,788 --> 01:43:37,630 much as you are adding. Now the naive method\n 848 01:43:37,630 --> 01:43:43,590 heap is linear. Again, you just scan through\n 849 01:43:43,590 --> 01:43:51,340 table, we can reduce this to be a constant\n 850 01:43:51,340 --> 01:43:58,510 use the hash table implementation for the\n 851 01:43:58,510 --> 01:44:04,289 The downside however, to using hash table\n 852 01:44:04,288 --> 01:44:11,329 extra linear space factor. And it does add\n 853 01:44:11,329 --> 01:44:18,569 your table a lot during swaps. Today, we're\n 854 01:44:18,569 --> 01:44:24,939 into max priority queues. This is part two,\n 855 01:44:24,939 --> 01:44:29,799 may already be asking yourself, why is it\n 856 01:44:29,800 --> 01:44:36,400 priority queue into a max priority queue?\n 857 01:44:36,399 --> 01:44:42,519 library, most programming languages, they\n 858 01:44:42,520 --> 01:44:48,070 queue or a min priority queue. Usually it's\n 859 01:44:48,069 --> 01:44:53,639 by the smallest element first, but sometimes\n 860 01:44:53,639 --> 01:44:59,788 what we're programming. So how do we do this?\n 861 01:44:59,788 --> 01:45:07,750 queue To another type. Well, a hack we can\n 862 01:45:07,750 --> 01:45:10,050 in a priority queue must implement some sort 863 01:45:10,050 --> 01:45:17,220 of comparable interface, which we can simply\n 864 01:45:17,220 --> 01:45:22,110 heap. Let's look at some examples. Suppose\n 865 01:45:22,109 --> 01:45:28,349 consisting of elements that are on the right\n 866 01:45:28,350 --> 01:45:35,719 that min priority queue. 
So if x and y are\n 867 01:45:35,719 --> 01:45:41,989 than or equal to y, then x will come out of\n 868 01:45:41,988 --> 01:45:50,510 of this is x is greater than or equal to y.\n 869 01:45:50,511 --> 01:45:56,510 all these elements are still in the priority\n 870 01:45:56,510 --> 01:46:05,199 of x is less than or equal to y, just x greater\n 871 01:46:05,198 --> 01:46:12,859 to y? Well, not for competitors. You see if\n 872 01:46:12,859 --> 01:46:20,799 is negated, x should still equal y. So now\n 873 01:46:20,800 --> 01:46:30,061 elements out of priority queue with our negated\n 874 01:46:30,060 --> 01:46:41,179 it's the greatest. Next comes 11 753. And\n 875 01:46:41,180 --> 01:46:48,060 to negate the number before you insert it\n 876 01:46:48,060 --> 01:46:55,360 is a hack specific to numbers, but it's pretty\n 877 01:46:55,359 --> 01:47:01,019 negate all the numbers inside a priority queue.\n 878 01:47:01,020 --> 01:47:09,489 is the smallest so should come out first.\n 879 01:47:09,488 --> 01:47:17,948 now we have to remake the data and we 13.\n 880 01:47:17,948 --> 01:47:26,779 So really positive 11. And so on my seven,\n 881 01:47:26,779 --> 01:47:34,488 then arena, get the value to get it out of\n 882 01:47:34,488 --> 01:47:42,729 Okay, now let's look at my other examples\n 883 01:47:42,729 --> 01:47:49,419 for strings, which sorts strings in lexicographic.\n 884 01:47:49,420 --> 01:47:59,679 languages, then, let's call n Lex be the negation\n 885 01:47:59,679 --> 01:48:10,800 two to be some non null strings. 
So below,\n 886 01:48:10,800 --> 01:48:18,650 one if s1 is less than s2 lexicographically\n 887 01:48:18,649 --> 01:48:28,609 one if s1 is greater than s2 lexicographically.\n 888 01:48:28,609 --> 01:48:35,939 So just to break it down, lex sorts strings,\n 889 01:48:35,939 --> 01:48:41,979 and nlex so that longer strings appear\n 890 01:48:41,979 --> 01:48:49,319 strings with letters at the end of the alphabet\n 891 01:48:49,319 --> 01:48:55,840 the beginning of the alphabet, I think I said\n 892 01:48:55,840 --> 01:49:04,069 heap into a max heap. Let's look at a concrete\n 893 01:49:04,069 --> 01:49:09,859 right to a priority queue with the lexicographic\n 894 01:49:09,859 --> 01:49:18,710 expect. First we get a because it's the shortest\n 895 01:49:18,710 --> 01:49:27,179 closest to the start of the alphabet, then\n 896 01:49:27,179 --> 01:49:38,270 and x x. So now let's do the same thing with\n 897 01:49:38,270 --> 01:49:43,069 opposite sequence in reverse order. 898 01:49:43,069 --> 01:49:55,018 And then we get x x x r x f, Zed mi N A. So\n 899 01:49:55,019 --> 01:50:00,090 we're going to talk about adding elements\n 900 01:50:00,090 --> 01:50:07,840 the priority queue series, we'll get to adding\n 901 01:50:07,840 --> 01:50:13,810 there is some important terminology and concepts\n 902 01:50:13,810 --> 01:50:24,889 prior to add elements effectively to our priority\n 903 01:50:24,889 --> 01:50:31,060 a priority queue is to use some kind of heap.\n 904 01:50:31,060 --> 01:50:37,579 which give us the best possible time complexity\n 905 01:50:37,578 --> 01:50:44,009 a priority queue. However, I want to make\n 906 01:50:44,010 --> 01:50:49,480 A priority queue is an abstract data type\n 907 01:50:49,479 --> 01:50:56,359 should have. 
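The two tricks covered earlier for getting max-first behavior out of a min priority queue (negating numbers on the way in and out, and negating the comparator for strings) can be sketched as follows. This is a Python illustration under assumptions: `poll_all_max_first`, `Reversed`, and `poll_all_strings_max_first` are hypothetical names, and the video itself works with Java's comparable interface.

```python
import heapq

def poll_all_max_first(numbers):
    """Poll numbers largest-first from a min-heap by storing negated values."""
    heap = [-x for x in numbers]          # negate before inserting
    heapq.heapify(heap)
    # negate again on the way out to recover the original values
    return [-heapq.heappop(heap) for _ in range(len(heap))]

class Reversed:
    """Wraps a value and inverts its ordering (a negated comparator)."""
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        # "less than" for the wrapper means "greater than" for the wrapped value
        return other.value < self.value

def poll_all_strings_max_first(strings):
    """Poll strings in reverse lexicographic order from a min-heap."""
    heap = [Reversed(s) for s in strings]
    heapq.heapify(heap)
    return [heapq.heappop(heap).value for _ in range(len(heap))]
```

The number trick only works for values that can be negated; the wrapper approach works for anything that already has an ordering.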
The heap just lets us actually\n 908 01:50:56,359 --> 01:51:02,559 could use an unordered list to achieve the\n 909 01:51:02,560 --> 01:51:09,610 But this would not give us the best possible\n 910 01:51:09,609 --> 01:51:15,639 are many different types of heaps including\n 911 01:51:15,639 --> 01:51:26,550 pairing heaps, and so on, so on. But for simplicity,\n 912 01:51:26,550 --> 01:51:33,969 heap is a binary tree that supports the heap\n 913 01:51:33,969 --> 01:51:40,880 exactly two children. So the following structure\n 914 01:51:40,880 --> 01:51:47,510 property that every parent's value is greater\n 915 01:51:47,510 --> 01:51:54,489 has exactly two children. Well, no, you may\n 916 01:51:54,489 --> 01:52:00,429 leafs don't have children. Well, actually,\n 917 01:52:00,429 --> 01:52:09,649 children in gray. But for simplicity, I won't\n 918 01:52:09,649 --> 01:52:17,848 Okay. The next important bit of terminology,\n 919 01:52:17,849 --> 01:52:26,179 tree property. The complete binary tree property\n 920 01:52:26,179 --> 01:52:35,480 the last is completely filled, and that all\n 921 01:52:35,479 --> 01:52:44,029 binary tree. As you will see, when we insert\n 922 01:52:44,029 --> 01:52:52,509 row. As far left to meet this complete binary\n 923 01:52:52,510 --> 01:52:59,670 tree property is very, very important, because\n 924 01:52:59,670 --> 01:53:07,000 what the heap looks like, or what values are\n 925 01:53:07,000 --> 01:53:14,578 the hollow circle that and the next one will\n 926 01:53:14,578 --> 01:53:20,029 we fill up the row, at which point we need\n 927 01:53:20,029 --> 01:53:29,238 is a very important. 
One last thing before\n 928 01:53:29,238 --> 01:53:36,589 a binary heap, is we need to understand how\n 929 01:53:36,590 --> 01:53:44,989 there is a canonical way of doing this, which\n 930 01:53:44,988 --> 01:53:54,189 is very convenient actually, because when\n 931 01:53:54,189 --> 01:54:02,939 the insertion position is just the last position\n 932 01:54:02,939 --> 01:54:07,889 way we can represent the heap, we can also\n 933 01:54:07,890 --> 01:54:16,219 and recursively add and remove nodes as needed.\n 934 01:54:16,219 --> 01:54:24,019 and also very, very fast. So on the left is\n 935 01:54:24,019 --> 01:54:30,489 the position of each node in the array. And\n 936 01:54:30,488 --> 01:54:36,198 as you read elements in the array from left\n 937 01:54:36,198 --> 01:54:39,678 the heap, one layer at a time. 938 01:54:39,679 --> 01:54:46,880 So if we're at node nine, which is index zero,\n 939 01:54:46,880 --> 01:54:54,190 position one. And as I keep moving along,\n 940 01:54:54,189 --> 01:55:02,118 the array going from left to right. So it's\n 941 01:55:02,118 --> 01:55:08,308 interesting property of storing a binary\n 942 01:55:08,309 --> 01:55:17,560 all the children and parent nodes. So suppose\n 943 01:55:17,560 --> 01:55:25,389 left child is going to be at index two times\n 944 01:55:25,389 --> 01:55:31,469 is going to be at 2i + 2, this is\n 945 01:55:31,469 --> 01:55:40,059 just subtract one. So suppose we have a node\n 946 01:55:40,059 --> 01:55:46,820 its index is two. So by our formula, the left\n 947 01:55:46,819 --> 01:55:58,038 two, plus one, or five. If we look at\n 948 01:55:58,038 --> 01:56:03,658 look at the right child, we should expect\n 949 01:56:03,658 --> 01:56:10,488 look in our array, this gives us the value\n 950 01:56:10,488 --> 01:56:16,308 we need to manipulate the nodes in our array\n 951 01:56:16,309 --> 01:56:22,529 part five of the series, we will see that\n 952 01:56:22,529 --> 01:56:29,590 All right. 
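The zero-based index formulas just described can be written down directly. This is a small Python sketch: the function names are hypothetical, and the array is an illustrative max heap rather than the exact one on the slide.

```python
def left(i):
    """Index of the left child of the node stored at index i (zero-based)."""
    return 2 * i + 1

def right(i):
    """Index of the right child of the node stored at index i (zero-based)."""
    return 2 * i + 2

def parent(i):
    """Index of the parent of the node at index i; the root (i = 0) has none."""
    return (i - 1) // 2

# Illustrative heap stored level by level, left to right.
heap = [9, 8, 7, 6, 5, 1, 2]

node = 2                           # the node at index 2
left_value = heap[left(node)]      # stored at index 2 * 2 + 1 = 5
right_value = heap[right(node)]    # stored at index 2 * 2 + 2 = 6
```

With a one-based layout the formulas become 2i and 2i + 1 for the children, which is the "just subtract one" remark above.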
So now we want to know, how do\n 953 01:56:29,590 --> 01:56:34,779 the heap invariant, because if we I noticed\n 954 01:56:34,779 --> 01:56:41,420 heap property? Well, the binary heap is useless.\n 955 01:56:41,420 --> 01:56:47,029 some instructions, which tell us what values\n 956 01:56:47,029 --> 01:56:52,279 value is a one, which we can see, which should\n 957 01:56:52,279 --> 01:56:58,690 with a min heap. But instead of inserting\n 958 01:56:58,690 --> 01:57:04,809 put one at the bottom left of the tree in\n 959 01:57:04,809 --> 01:57:11,489 and performance call bubbling up as my undergrad\n 960 01:57:11,488 --> 01:57:20,000 swimming or even sifting up all really cool\n 961 01:57:20,000 --> 01:57:26,210 one and the insertion position. But now we're\n 962 01:57:26,210 --> 01:57:32,800 is less than seven, but one is found below\n 963 01:57:32,800 --> 01:57:38,779 swap one and seven, like so. But now we're\n 964 01:57:38,779 --> 01:57:47,979 one is a child of six, but one is less than\n 965 01:57:47,979 --> 01:57:53,319 again, violation the property. So we swap,\n 966 01:57:53,319 --> 01:57:59,269 be. And now the heap invariant to satisfy\n 967 01:57:59,270 --> 01:58:05,210 whatever you want to call it. So the next\n 968 01:58:05,210 --> 01:58:13,520 in the insertion position. And now, we need\n 969 01:58:13,520 --> 01:58:19,530 13. Notice that we're no longer in violation\n 970 01:58:19,529 --> 01:58:25,679 13, and 13 is less than 12. So 13 is actually\n 971 01:58:25,679 --> 01:58:32,619 have to bubble up our elements that much.\n 972 01:58:32,619 --> 01:58:39,130 zero and 10. Try seeing where these end up,\n 973 01:58:39,130 --> 01:58:47,659 exercise. But I will keep going for now. So\n 974 01:58:47,658 --> 01:58:54,460 the nodes, it's there, and we bubble it up\n 975 01:58:54,460 --> 01:59:01,420 the property is satisfied. 
Next zero, my favorite\n 976 01:59:01,420 --> 01:59:07,538 be at the top of the tree as you will see\n 977 01:59:07,538 --> 01:59:18,689 So let us bubble up and like magic zeros at\n 978 01:59:18,689 --> 01:59:24,198 numbers 10. So we put out an insertion position.\n 979 01:59:24,198 --> 01:59:26,269 invariant, so we do nothing. 980 01:59:26,270 --> 01:59:31,761 Today we're going to look at how to remove\n 981 01:59:31,761 --> 01:59:38,380 Four or five in the priority queue series.\n 982 01:59:38,380 --> 01:59:42,480 the underlying structure of the binary heap. 983 01:59:45,420 --> 01:59:52,480 In general, with heaps we always want to remove\n 984 01:59:52,479 --> 02:00:00,129 It's the one of the highest priority is the\n 985 02:00:00,130 --> 02:00:06,190 Route, we call it polling, a special thing\n 986 02:00:06,189 --> 02:00:14,269 to search for its index. Because in an array\n 987 02:00:14,270 --> 02:00:25,670 zero. So when I say pull in red, we have the\n 988 02:00:25,670 --> 02:00:32,840 one we're going to swap it with. So the note\n 989 02:00:32,840 --> 02:00:42,730 the end of our array, which we also have its\n 990 02:00:42,729 --> 02:00:52,109 one. And now, since 10 is at the top, well,\n 991 02:00:52,109 --> 02:00:57,429 we need to make sure that the heap invariant\n 992 02:00:57,429 --> 02:01:03,429 bubbling down now instead of bubbling up.\n 993 02:01:03,429 --> 02:01:11,679 five and one, and we select the smallest,\n 994 02:01:11,679 --> 02:01:20,449 go to one. So make sure you default, selecting\n 995 02:01:20,448 --> 02:01:29,098 as you can see, 1010s children are two and\n 996 02:01:29,099 --> 02:01:39,909 select the left node to break tight. And now\n 997 02:01:39,908 --> 02:01:50,868 invariant is satisfied. Now we want to remove\n 998 02:01:50,868 --> 02:01:57,659 element at the root. But 12 is not at the\n 999 02:01:57,659 --> 02:02:07,550 remove 12. So what we do is we have to search\n 1000 02:02:07,550 --> 02:02:13,980 its position yet. 
So we start at one, and\n 1001 02:02:13,979 --> 02:02:25,348 until we find 12. So five is not 12, two is\n 1002 02:02:25,349 --> 02:02:32,801 found 12. And now we know where its position\n 1003 02:02:32,801 --> 02:02:41,920 to remove, and also swap it with the purple\n 1004 02:02:41,920 --> 02:02:50,250 them remove the 12. And now we're in violation\n 1005 02:02:50,250 --> 02:03:00,059 up three, until the heap invariant is satisfied.\n 1006 02:03:00,059 --> 02:03:06,000 so we can start. Now we want to remove three,\n 1007 02:03:06,000 --> 02:03:17,988 the tree. Three wasn't far it was just two\n 1008 02:03:17,988 --> 02:03:28,919 swap it with the last node in the tree. Drop\n 1009 02:03:28,920 --> 02:03:37,369 up or bubble down the value because you don't\n 1010 02:03:37,369 --> 02:03:43,599 last position is when you're swapping it in.\n 1011 02:03:43,599 --> 02:03:53,750 we already satisfy that heap invariant from\n 1012 02:03:53,750 --> 02:03:59,050 five was smaller, so we swapped it with five,\n 1013 02:03:59,050 --> 02:04:08,400 eight. And again, the heap invariants are\n 1014 02:04:08,399 --> 02:04:19,340 root node, red swap it, remove the one. And\n 1015 02:04:19,341 --> 02:04:25,860 is satisfied. Now we want to remove six. So\n 1016 02:04:25,859 --> 02:04:36,479 Okay, we have found six and do the swap. Remove\n 1017 02:04:36,479 --> 02:04:43,158 answer is neither the heap invariant is already\n 1018 02:04:43,158 --> 02:04:49,109 We got lucky. So from all this polling and\n 1019 02:04:49,109 --> 02:04:53,920 polling takes logarithmic time since we're\n 1020 02:04:53,920 --> 02:05:00,329 to find it. And also that removing a random\n 1021 02:05:00,329 --> 02:05:06,908 to actually find the index of that node we\n 1022 02:05:06,908 --> 02:05:12,488 if you're as dissatisfied with this linear\n 1023 02:05:12,488 --> 02:05:19,039 has to be a better way. 
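The polling and linear-scan removal steps above can be sketched as follows for a min heap stored in a list. This is a Python illustration under assumptions: all function names are hypothetical, and ties between children are broken in favor of the left child, as suggested in the walkthrough.

```python
def swim(heap, k):
    """Bubble the element at index k up while it is smaller than its parent."""
    while k > 0 and heap[k] < heap[(k - 1) // 2]:
        p = (k - 1) // 2
        heap[k], heap[p] = heap[p], heap[k]
        k = p

def sink(heap, k):
    """Bubble the element at index k down while a child is smaller."""
    n = len(heap)
    while True:
        left, right = 2 * k + 1, 2 * k + 2
        if left >= n:                      # no children: cannot sink further
            break
        smallest = left                    # default to the left child on ties
        if right < n and heap[right] < heap[left]:
            smallest = right
        if heap[k] <= heap[smallest]:      # heap invariant already holds here
            break
        heap[k], heap[smallest] = heap[smallest], heap[k]
        k = smallest

def remove_at(heap, i):
    """Swap index i with the last node, drop the last node, then repair."""
    last = len(heap) - 1
    heap[i], heap[last] = heap[last], heap[i]
    removed = heap.pop()
    if i == last:                          # removed the final node: nothing to fix
        return removed
    moved = heap[i]
    sink(heap, i)
    if heap[i] == moved:                   # sinking did nothing, so try swimming
        swim(heap, i)
    return removed

def poll(heap):
    """Remove and return the root, or None if the heap is empty."""
    return remove_at(heap, 0) if heap else None

def remove(heap, value):
    """Naive removal: linear scan for the value, then remove_at."""
    for i, v in enumerate(heap):
        if v == value:
            remove_at(heap, i)
            return True
    return False
```

Polling needs no search because the root is always at index zero; the scan in `remove` is what makes removing an arbitrary element linear.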
And indeed there is.\n 1024 02:05:19,039 --> 02:05:28,090 this complexity to be logarithmic in the general\n 1025 02:05:28,090 --> 02:05:34,039 at how to remove nodes on my heap with the\n 1026 02:05:34,039 --> 02:05:40,738 need to make use of a hash table, a data structure\n 1027 02:05:40,738 --> 02:05:47,198 are about to get a little wild. I promise\n 1028 02:05:47,198 --> 02:05:53,299 later video. But right now, it's going to\n 1029 02:05:53,300 --> 02:05:58,320 we have a bunch of nodes scatter across our\n 1030 02:05:58,319 --> 02:06:03,058 a linear scan to find out where the node we\n 1031 02:06:03,059 --> 02:06:09,690 to do a lookup and figure that out. The way\n 1032 02:06:09,689 --> 02:06:17,229 is going to be mapped to the indexes found\n 1033 02:06:17,229 --> 02:06:24,968 node, just look up its index and started doing\n 1034 02:06:24,969 --> 02:06:32,899 sounds great, except for one caveat. What\n 1035 02:06:32,899 --> 02:06:42,029 the same value? What problems will that cause?\n 1036 02:06:42,029 --> 02:06:48,259 To begin with, let's talk about how we can\n 1037 02:06:48,260 --> 02:06:56,210 of mapping one value to one position, we will\n 1038 02:06:56,210 --> 02:07:04,420 can do this by maintaining a set or tree set\n 1039 02:07:04,420 --> 02:07:14,899 or key if you want maps to. So can I example.\n 1040 02:07:14,899 --> 02:07:25,349 has repeated values. Namely, we can see that\n 1041 02:07:25,350 --> 02:07:33,130 twice, 11 and 13. Once the low I have drawn\n 1042 02:07:33,130 --> 02:07:42,819 determine the index position of a node in\n 1043 02:07:42,819 --> 02:07:52,460 at index five, and the first to index zero.\n 1044 02:07:52,460 --> 02:07:59,269 value pairs. Notice that two is found in three\n 1045 02:07:59,269 --> 02:08:04,280 two positions, one and four, and so on. So\n 1046 02:08:04,279 --> 02:08:10,288 positions of the values in the tree. If notes\n 1047 02:08:10,288 --> 02:08:16,618 keep track of that. 
For example, if a bubble\n 1048 02:08:16,618 --> 02:08:23,609 movements, and where the swabs go to so we\n 1049 02:08:23,609 --> 02:08:32,618 we swap 13. And the last seven, for example,\n 1050 02:08:32,618 --> 02:08:42,328 where seven and 13 are in our table. And then\n 1051 02:08:42,328 --> 02:08:49,518 red for the seven and yellow for the 13. And\n 1052 02:08:49,519 --> 02:08:58,429 do a swap in the tree but also in the table.\n 1053 02:08:58,429 --> 02:09:05,199 We keep track of repeated values by maintaining\n 1054 02:09:05,198 --> 02:09:12,649 value was found out. But now let's ask a further\n 1055 02:09:12,649 --> 02:09:18,118 node in our heap, which node do we remove\n 1056 02:09:18,118 --> 02:09:24,219 our heap right here, there's three possible\n 1057 02:09:27,099 --> 02:09:35,110 The answer is no, it does not matter. As long\n 1058 02:09:35,109 --> 02:09:41,670 and that's the most important thing. So let's\n 1059 02:09:41,670 --> 02:09:48,440 but also of adding and pulling elements with\n 1060 02:09:48,439 --> 02:09:56,288 that hard, trust me. So first one, we want\n 1061 02:09:56,288 --> 02:10:03,788 bottom of the heap in the insertion position.\n 1062 02:10:03,788 --> 02:10:11,139 so we add three to our table long with its\n 1063 02:10:11,139 --> 02:10:18,029 Look at the index, tree and grade confirm\n 1064 02:10:18,029 --> 02:10:24,170 we need to make sure the heap invariant satisfied,\n 1065 02:10:24,170 --> 02:10:31,179 up three, the parent of three is 11, which\n 1066 02:10:31,179 --> 02:10:38,868 those two notes, I have highlighted the seven\n 1067 02:10:38,868 --> 02:10:46,899 in the heap and three in the index three,\n 1068 02:10:46,899 --> 02:10:55,029 those both in the tree and in the table. Awesome.\n 1069 02:10:55,029 --> 02:11:00,960 So do a similar thing. For the note above,\n 1070 02:11:00,960 --> 02:11:10,960 on the table. 
And now the bat invariants are\n 1071 02:11:10,960 --> 02:11:17,649 The next instruction is to remove to from\n 1072 02:11:17,649 --> 02:11:24,379 Well, as I said, it doesn't matter as long\n 1073 02:11:24,380 --> 02:11:30,480 If we remove the last two, we can immediately\n 1074 02:11:30,479 --> 02:11:35,468 But for learning purposes, I will simply remove\n 1075 02:11:35,469 --> 02:11:47,448 to be located at index zero. So we want to\n 1076 02:11:47,448 --> 02:11:52,678 we remove a note again, so we did a look up.\n 1077 02:11:52,679 --> 02:12:00,868 which was nice. And now we swap it with the\n 1078 02:12:00,868 --> 02:12:08,509 we remove the last node. Now we need to satisfy\n 1079 02:12:08,510 --> 02:12:16,699 11. So we look at 11 children, which happens\n 1080 02:12:16,698 --> 02:12:26,569 the one we're going to swap with. So swap\n 1081 02:12:26,569 --> 02:12:33,130 are still not in satisfaction of the human\n 1082 02:12:33,130 --> 02:12:42,190 into two smaller, so swap it with two. And\n 1083 02:12:42,189 --> 02:12:50,339 says the poll, so we get the value of the\n 1084 02:12:50,340 --> 02:12:57,501 of the two and bubble down 11. So as you can\n 1085 02:12:57,501 --> 02:13:04,079 but still doing the same operations. This\n 1086 02:13:04,078 --> 02:13:10,448 series. And we're going to have a look at\n 1087 02:13:10,448 --> 02:13:16,069 So if you want the source code, here's the\n 1088 02:13:16,069 --> 02:13:20,889 in the series, the priority queue is one of\n 1089 02:13:20,889 --> 02:13:27,609 124, so you can actually understand what's\n 1090 02:13:27,609 --> 02:13:37,198 Alright, here we are inside the source code.\n 1091 02:13:37,198 --> 02:13:43,019 types of elements I'm allowing inside my priority\n 1092 02:13:43,020 --> 02:13:47,769 talked about. So if they implement the comparable\n 1093 02:13:47,769 --> 02:13:53,300 inside our queue. So this is anything like\n 1094 02:13:53,300 --> 02:13:58,690 interface. 
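The hash table idea from the walkthrough above (map each value to the set of indices where it currently sits, and update the table on every swap) can be sketched as below. This is a Python illustration under assumptions: `IndexedMinHeap` and its members are hypothetical names, and a plain `set` stands in for the tree set used in the video's Java code.

```python
from collections import defaultdict

class IndexedMinHeap:
    def __init__(self):
        self.heap = []
        self.positions = defaultdict(set)    # value -> indices holding that value

    def _swap(self, i, j):
        vi, vj = self.heap[i], self.heap[j]
        if vi != vj:                          # equal values: index sets stay the same
            self.positions[vi].discard(i)
            self.positions[vi].add(j)
            self.positions[vj].discard(j)
            self.positions[vj].add(i)
        self.heap[i], self.heap[j] = vj, vi

    def _swim(self, k):
        while k > 0 and self.heap[k] < self.heap[(k - 1) // 2]:
            self._swap(k, (k - 1) // 2)
            k = (k - 1) // 2

    def _sink(self, k):
        n = len(self.heap)
        while 2 * k + 1 < n:
            s = 2 * k + 1                     # left child by default
            if s + 1 < n and self.heap[s + 1] < self.heap[s]:
                s += 1                        # right child is strictly smaller
            if self.heap[k] <= self.heap[s]:
                break
            self._swap(k, s)
            k = s

    def add(self, value):
        self.heap.append(value)
        self.positions[value].add(len(self.heap) - 1)
        self._swim(len(self.heap) - 1)

    def remove(self, value):
        if not self.positions.get(value):     # table lookup instead of a linear scan
            return False
        i = next(iter(self.positions[value])) # any occurrence will do
        last = len(self.heap) - 1
        self._swap(i, last)
        self.heap.pop()
        self.positions[value].discard(last)
        if not self.positions[value]:
            del self.positions[value]
        if i < len(self.heap):                # repair the heap around index i
            self._sink(i)
            self._swim(i)
        return True
```

Which copy of a repeated value gets removed is arbitrary here, which matches the point made above that it does not matter as long as the heap invariant is restored.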
So let's have a look at some of\n 1095 02:13:58,689 --> 02:14:05,210 size. So this is the number of elements currently\n 1096 02:14:05,210 --> 02:14:12,550 instance variable, which is the heap capacity.\n 1097 02:14:12,550 --> 02:14:22,369 that we have for elements which may be larger\n 1098 02:14:22,368 --> 02:14:28,408 heap. And we're going to be maintaining it\n 1099 02:14:30,599 --> 02:14:39,019 Next, for our log(n) removals, I'm\n 1100 02:14:39,019 --> 02:14:45,639 want to map an element to a tree set of integers.\n 1101 02:14:45,639 --> 02:14:55,859 our heap, which we can find this element T.\n 1102 02:14:55,859 --> 02:15:02,198 constructors for our priority queue. We can\n 1103 02:15:02,198 --> 02:15:11,939 I'm creating an initially empty priority\n 1104 02:15:11,939 --> 02:15:19,469 you to create a priority queue with a defined\n 1105 02:15:19,469 --> 02:15:24,969 because then we don't have to keep expanding\n 1106 02:15:24,969 --> 02:15:32,050 this. But also, even better, is if you know\n 1107 02:15:32,050 --> 02:15:40,599 your heap, you can actually construct the\n 1108 02:15:40,599 --> 02:15:47,761 called heapify, which I didn't talk about\n 1109 02:15:47,761 --> 02:15:55,420 useful. So this just has all the usual\n 1110 02:15:55,420 --> 02:16:03,219 to the map but also to the heap. And\n 1111 02:16:03,219 --> 02:16:09,349 at halfway through the heap size, and then\n 1112 02:16:09,349 --> 02:16:16,550 the elements. And you're like, wait a second,\n 1113 02:16:16,550 --> 02:16:22,409 Well, yes, it is in the general case, but\n 1114 02:16:22,408 --> 02:16:31,018 a link to this paper up here just because the\n 1115 02:16:31,019 --> 02:16:40,148 it has this linear complexity. And if you\n 1116 02:16:40,148 --> 02:16:46,439 end up seeing that the complexity boils down\n 1117 02:16:46,439 --> 02:16:53,489 constantly say it's linear time. 
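The heapify construction mentioned above (start about halfway through the array and sink each element while walking back to the root) can be sketched as below. This is a Python illustration under assumptions: `sink` and `heapify` are hypothetical names, and the bookkeeping for the element-to-index map is left out.

```python
def sink(heap, k):
    """Bubble the element at index k down while a child is smaller."""
    n = len(heap)
    while 2 * k + 1 < n:
        s = 2 * k + 1                         # left child by default
        if s + 1 < n and heap[s + 1] < heap[s]:
            s += 1                            # right child is strictly smaller
        if heap[k] <= heap[s]:
            break
        heap[k], heap[s] = heap[s], heap[k]
        k = s

def heapify(items):
    """Build a min heap from an arbitrary collection in linear time."""
    heap = list(items)
    # Indices past the halfway point are leaves, so only the first half needs sinking.
    for i in range(max(0, len(heap) // 2 - 1), -1, -1):
        sink(heap, i)
    return heap
```

The loop looks like n sinks of logarithmic cost each, but most nodes sit near the bottom and sink only a level or two, which is why the total work converges to linear time.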
But in general,\n 1118 02:16:53,489 --> 02:16:58,760 If you're given a collection of elements,\n 1119 02:16:58,760 --> 02:17:04,068 would just use our add method to add the elements\n 1120 02:17:04,068 --> 02:17:12,769 log N bound. But definitely use the heapify\n 1121 02:17:12,769 --> 02:17:21,159 methods we have is empty, just returns true\n 1122 02:17:21,159 --> 02:17:26,500 clear. So when we clear the heap, we remove\n 1123 02:17:26,500 --> 02:17:35,478 also inside our map. So that's why he called\n 1124 02:17:35,478 --> 02:17:46,000 simple. peek, the first really useful method\n 1125 02:17:46,000 --> 02:17:53,189 queue. And if it's empty returns No. Otherwise,\n 1126 02:17:53,189 --> 02:17:58,359 our heap and return it because it's the root\nnode 1127 02:17:59,359 --> 02:18:09,170 Similar to pique, except that we're going\n 1128 02:18:09,170 --> 02:18:16,808 And we're also going to return it because\n 1129 02:18:16,808 --> 02:18:23,929 because we have a map with our elements, we\n 1130 02:18:23,929 --> 02:18:31,500 inside heap. And this reduces our complexity\n 1131 02:18:31,500 --> 02:18:37,409 through a linear scan through all elements\n 1132 02:18:37,409 --> 02:18:44,619 is remarkable. But you have a job in case\n 1133 02:18:44,620 --> 02:18:50,559 just wanted to do it. Just to show you guys\n 1134 02:18:50,558 --> 02:18:59,590 a lot of constant overhead and you may or\n 1135 02:18:59,590 --> 02:19:08,279 is quite a lot. And I usually don't really\n 1136 02:19:08,279 --> 02:19:14,159 it might not be entirely worth it. But it's\n 1137 02:19:14,159 --> 02:19:20,079 as you are removals then definitely worth\n 1138 02:19:20,079 --> 02:19:29,138 Add method. So, so this element, sorry, this\n 1139 02:19:29,138 --> 02:19:39,849 And that element cannot be no. So what we\n 1140 02:19:39,850 --> 02:19:47,620 less than capacity. Otherwise we have to expand\n 1141 02:19:47,620 --> 02:19:54,829 we add it to the map. 
So we keep track of\n 1142 02:19:54,829 --> 02:20:01,690 we have to swim. I know Rob because we add\n 1143 02:20:01,690 --> 02:20:09,600 had to, like adjust where it goes inside our\n 1144 02:20:09,600 --> 02:20:17,250 called less is is a helper method, which helps\n 1145 02:20:17,250 --> 02:20:26,879 node j. And this uses the fact that both elements\n 1146 02:20:26,879 --> 02:20:34,269 can invoke the Compare to method. If we go\n 1147 02:20:34,270 --> 02:20:44,210 comparable interface, which we needed. So,\n 1148 02:20:44,209 --> 02:20:54,159 if i is less than or equal to J. Awesome.\n 1149 02:20:54,159 --> 02:21:02,409 are going to try to swim node k. So first,\n 1150 02:21:02,409 --> 02:21:12,939 that by solving for the parent. So remember\n 1151 02:21:12,939 --> 02:21:19,309 some people like to start the heaps index,\n 1152 02:21:19,309 --> 02:21:25,689 So I get the parent, which is that this position\n 1153 02:21:25,689 --> 02:21:35,510 going upwards. And while K is still greater\n 1154 02:21:35,510 --> 02:21:41,159 less than our parent, then we want to swim\n 1155 02:21:41,159 --> 02:21:47,738 parent and K. And then K is when we can become\n 1156 02:21:47,738 --> 02:21:54,129 parent of K once more. And then we'll keep\n 1157 02:21:54,129 --> 02:22:01,489 notes. So that's how you do the swim. So the\n 1158 02:22:01,489 --> 02:22:12,681 top down node sink. And here, we want to sync\n 1159 02:22:12,681 --> 02:22:18,840 node, but I also grab the right node. Remember,\n 1160 02:22:18,840 --> 02:22:29,110 two instead of a plus zero plus one. And then\n 1161 02:22:29,110 --> 02:22:34,360 is going to be the left one or the right one.\n 1162 02:22:34,360 --> 02:22:39,790 one is going to be smaller than the right\n 1163 02:22:39,790 --> 02:22:47,360 case it was false. 
So I checked that the right\n 1164 02:22:47,360 --> 02:22:54,550 the right node is less than the left node,\n 1165 02:22:55,760 --> 02:23:03,818 And our stopping condition is that we are\n 1166 02:23:03,818 --> 02:23:12,219 sink any more. And we can do a similar thing,\n 1167 02:23:12,219 --> 02:23:21,539 like like we did in the last method also.\n 1168 02:23:21,540 --> 02:23:30,590 to swap because I also have to swap things\n 1169 02:23:30,590 --> 02:23:36,139 And this is really what adds a lot of overhead\n 1170 02:23:36,139 --> 02:23:41,809 call the swap method, we also have to swap\n 1171 02:23:41,809 --> 02:23:48,139 a lot of overhead really, it technically maps\n 1172 02:23:48,139 --> 02:23:57,180 you're doing all this internal hashing and\n 1173 02:23:57,180 --> 02:24:04,729 So remove. So if the element is now returned\n 1174 02:24:04,728 --> 02:24:11,849 inside our heap. So this is how you would\n 1175 02:24:11,850 --> 02:24:17,329 out in case you want to revert back and remove\n 1176 02:24:17,329 --> 02:24:23,100 all the elements. And once you find the element\n 1177 02:24:23,100 --> 02:24:31,559 index and return true. Otherwise, we're going\n 1178 02:24:31,559 --> 02:24:39,989 the element one of the elements are. And if\n 1179 02:24:39,989 --> 02:24:49,629 now let's have a look at the Remove add method.\n 1180 02:24:49,629 --> 02:24:58,469 So if our heap is empty, well, can't really\n 1181 02:24:58,469 --> 02:25:09,898 swap the index Why remove with the last element,\n 1182 02:25:09,898 --> 02:25:19,139 we're going to kill off that node and also\n 1183 02:25:19,139 --> 02:25:27,340 that I was equal to the heap size, meaning\n 1184 02:25:27,340 --> 02:25:34,170 heap, just remove, return the removed data.\n 1185 02:25:34,170 --> 02:25:41,770 either sink that node up or down. And I'm\n 1186 02:25:41,770 --> 02:25:48,950 sink or swim. So I just tried both. So first\n 1187 02:25:48,950 --> 02:25:59,800 then I try swimming downwards. 
And in either\n 1188 02:25:59,799 --> 02:26:06,379 this just readjusts where, where the swap\n 1189 02:26:06,379 --> 02:26:17,379 down. This method is just a method I use in\n 1190 02:26:17,379 --> 02:26:24,889 is good. So it checks essentially the integrity\n 1191 02:26:24,889 --> 02:26:29,349 this method with K equals zero, and that starts\n 1192 02:26:29,350 --> 02:26:37,800 down the tree and check are we maintaining\n 1193 02:26:37,799 --> 02:26:44,250 So our basis, you want to be that k is outside\n 1194 02:26:44,250 --> 02:26:52,370 to return true. Otherwise, get our children\n 1195 02:26:52,370 --> 02:27:03,140 sure that k is less than both our children.\n 1196 02:27:03,139 --> 02:27:08,439 And if we ever returned false, because we\n 1197 02:27:12,209 --> 02:27:18,389 that gets propagated throughout the recursion,\n 1198 02:27:18,389 --> 02:27:24,680 if everything returns true and hits the base\n 1199 02:27:24,680 --> 02:27:34,479 heap. Okay, these last few methods are just\n 1200 02:27:34,478 --> 02:27:42,148 into the map, things, how to remove elements\n 1201 02:27:42,148 --> 02:27:48,799 here, as I'm using a tree set to add and remove\n 1202 02:27:48,799 --> 02:27:56,059 Java has a Balanced Binary Search Tree. So\n 1203 02:27:56,059 --> 02:28:03,799 which is really nice. So you guys can have\n 1204 02:28:03,799 --> 02:28:14,429 values removes values. And lastly, do a map\n 1205 02:28:14,430 --> 02:28:21,760 or in the map rather. So yes, have a look\n 1206 02:28:21,760 --> 02:28:29,219 covered everything about the priority queue.\n 1207 02:28:29,219 --> 02:28:36,829 also sometimes called the disjoint set, this\n 1208 02:28:36,829 --> 02:28:42,271 started. So an outline of things we'll be\n 1209 02:28:42,271 --> 02:28:49,689 be going over a motivating example, magnets.\n 1210 02:28:49,689 --> 02:28:56,818 be. 
Then we'll go over a classic example of\n 1211 02:28:56,818 --> 02:29:04,309 Kruskal's minimum spanning tree algorithm,\n 1212 02:29:04,309 --> 02:29:11,549 it needs the union find to get the complexity\n 1213 02:29:11,549 --> 02:29:17,769 concerning the find and the union operations,\n 1214 02:29:17,770 --> 02:29:24,540 And finally, we'll have a look at path compression.\n 1215 02:29:24,540 --> 02:29:33,479 time the union find provides. Ok, let's dive\n 1216 02:29:33,478 --> 02:29:40,238 union find. So what is the union find?\n 1217 02:29:40,238 --> 02:29:46,478 elements which are split into one or more\n 1218 02:29:46,478 --> 02:29:55,228 primary operations: find and union. What find\n 1219 02:29:55,228 --> 02:30:06,358 will tell you what group that element belongs\n 1220 02:30:06,359 --> 02:30:14,470 So if we have this example with magnets, suppose\n 1221 02:30:14,469 --> 02:30:19,889 are magnets. And also suppose that the magnets\n 1222 02:30:19,889 --> 02:30:27,079 meaning they want to merge together to form\n 1223 02:30:27,079 --> 02:30:32,379 all the magnets and give them numbers, and\n 1224 02:30:32,379 --> 02:30:40,319 attraction, first, we're going to merge six\n 1225 02:30:40,319 --> 02:30:49,270 our union find, we would say union six, and\n 1226 02:30:49,270 --> 02:30:56,861 out which groups six and eight belong to,\n 1227 02:30:56,861 --> 02:31:02,399 two, three, and three and four are highly\n 1228 02:31:02,398 --> 02:31:10,559 a group. So they would form the yellow group.\n 1229 02:31:10,559 --> 02:31:21,129 And this keeps on going, and we unify magnets\n 1230 02:31:21,129 --> 02:31:29,959 onto already existing groups. 
So we unify\n 1231 02:31:29,959 --> 02:31:38,898 own group, and add to an already existing\n 1232 02:31:38,898 --> 02:31:45,648 which are different colors, and we assign\n 1233 02:31:45,648 --> 02:31:51,478 everything in the yellow group went into the\n 1234 02:31:51,478 --> 02:31:58,688 in our union find to determine which group\n 1235 02:31:58,689 --> 02:32:06,300 in the blue group, and the union find does\n 1236 02:32:06,299 --> 02:32:12,099 manner, which is why it's so handy to have\naround. 1237 02:32:12,100 --> 02:32:15,772 I'm not explaining currently how that works. We'll\n 1238 02:32:15,772 --> 02:32:25,300 just a motivating example. So where are other\n 1239 02:32:25,299 --> 02:32:34,228 we will. Well, we see the union find again in\n 1240 02:32:34,228 --> 02:32:40,000 In another problem called grid percolation,\n 1241 02:32:40,000 --> 02:32:45,540 we're trying to see if there's a path from\n 1242 02:32:45,540 --> 02:32:52,069 or vice versa, then the union find lets us\n 1243 02:32:52,068 --> 02:32:59,539 Also, similar kind of problem in network connectivity\n 1244 02:32:59,540 --> 02:33:04,771 each other through a series of edges. And\n 1245 02:33:04,771 --> 02:33:11,850 like the least common ancestor in a tree,\n 1246 02:33:11,850 --> 02:33:20,149 complexity can we attribute to the union find?\n 1247 02:33:20,148 --> 02:33:28,439 is linear time, which isn't actually bad at\n 1248 02:33:28,439 --> 02:33:35,010 check if connected operations all happen\n 1249 02:33:35,010 --> 02:33:44,460 So almost constant time, although not quite\n 1250 02:33:44,459 --> 02:33:49,188 where we can determine how many components\n 1251 02:33:49,189 --> 02:33:54,829 groups of magnets we have. And we can\n 1252 02:33:54,829 --> 02:34:03,250 really great. Okay, let's talk about a really\n 1253 02:34:03,250 --> 02:34:09,750 is Kruskal's minimum spanning tree algorithm.\n 1254 02:34:09,750 --> 02:34:17,020 minimum spanning tree? 
So if we're given some\n 1255 02:34:17,020 --> 02:34:22,979 minimum spanning tree is a subset of the edges\n 1256 02:34:22,978 --> 02:34:34,590 so at a minimal cost. So, if this is our graph,\n 1257 02:34:34,590 --> 02:34:42,500 minimum spanning tree is the following and\n 1258 02:34:42,500 --> 02:34:47,770 Note that the minimum spanning tree is not\n 1259 02:34:47,770 --> 02:34:56,850 minimum spanning tree, it will also have a\n 1260 02:34:56,850 --> 02:35:02,689 we can break it up into three steps essentially.\n 1261 02:35:02,689 --> 02:35:09,470 sort them by ascending edge edge weight. Next\n 1262 02:35:09,469 --> 02:35:16,670 the sorted edges and compare the two nodes\n 1263 02:35:16,670 --> 02:35:22,850 already belong to the same group, then we\n 1264 02:35:22,850 --> 02:35:27,270 cycle in our minimum spanning tree, which\n 1265 02:35:27,270 --> 02:35:36,720 the, the two groups those nodes belong to.\n 1266 02:35:36,719 --> 02:35:43,699 until either we run out of edges, or all the\n 1267 02:35:43,700 --> 02:35:49,550 And you'll soon see what I mean by a group,\n 1268 02:35:49,549 --> 02:35:57,789 is going to come into play. So if this is\n 1269 02:35:57,790 --> 02:36:04,330 on it, first, let's scale the edges and sort\n 1270 02:36:04,329 --> 02:36:12,689 all the edges and their edge weights sort\n 1271 02:36:12,689 --> 02:36:20,850 processing the edges one at a time, started\n 1272 02:36:20,850 --> 02:36:28,000 highlighted the edge, it Jane, orange. And\n 1273 02:36:28,000 --> 02:36:37,829 i and j currently don't belong to any group.\n 1274 02:36:37,829 --> 02:36:45,950 orange. Next is edge eight, he, so he don't\n 1275 02:36:45,950 --> 02:36:57,329 them together into group purple. Next is CGI.\n 1276 02:36:57,329 --> 02:37:04,889 have a group yet. So see can go into group\n 1277 02:37:04,889 --> 02:37:07,969 to a group. So F can go to group purple. 
1278 02:37:07,969 --> 02:37:16,608 Next, H and G. Neither H nor G belong\n 1279 02:37:16,609 --> 02:37:26,450 to the red group. And next we have D to B.\n 1280 02:37:26,450 --> 02:37:34,190 them their own group, let's say group green.\n 1281 02:37:34,190 --> 02:37:41,220 get interesting. Now, we're trying to connect\n 1282 02:37:41,219 --> 02:37:47,438 belong to group orange. So we don't want to\n 1283 02:37:47,439 --> 02:37:54,389 a cycle, so ignore it. And to check that they\n 1284 02:37:54,389 --> 02:38:00,889 find operation in our union find to check\n 1285 02:38:00,889 --> 02:38:12,170 union find really comes into play. Next is edge\n 1286 02:38:12,170 --> 02:38:17,180 and D belongs to group green. So now we want\n 1287 02:38:17,180 --> 02:38:21,680 belong to the same group. So either the purple\n 1288 02:38:21,680 --> 02:38:26,190 green groups and the purple group. And it\n 1289 02:38:26,190 --> 02:38:31,850 them. And this is when the union operation\n 1290 02:38:31,850 --> 02:38:39,930 us to merge groups of colors together very\n 1291 02:38:39,930 --> 02:38:51,139 Next edge would be D to H. H belongs to group\n 1292 02:38:51,139 --> 02:39:00,099 let's say they both become group purple. Next\n 1293 02:39:00,100 --> 02:39:04,530 belong to the same group. So that would create\n 1294 02:39:04,530 --> 02:39:15,530 edge. So skip. Next rounds include edge B to C\n 1295 02:39:15,530 --> 02:39:22,029 two groups into one larger group. So we have\n 1296 02:39:22,029 --> 02:39:27,979 minimum spanning tree algorithm. Pretty neat,\n 1297 02:39:27,978 --> 02:39:33,938 allows us to do this is the union find. It\n 1298 02:39:33,939 --> 02:39:40,729 efficiently, but also to find out which groups\n 1299 02:39:40,729 --> 02:39:49,100 cycle. So that's Kruskal's algorithm. It's\n 1300 02:39:49,100 --> 02:39:55,279 union find works. 
So I'm going to go into\n 1301 02:39:55,279 --> 02:40:02,729 the find and the union operations work internally,\n 1302 02:40:02,728 --> 02:40:09,289 useful way. Okay, so now we're going to talk\n 1303 02:40:09,290 --> 02:40:17,270 do on the union find, or the disjoint set.\n 1304 02:40:17,270 --> 02:40:24,680 actually work internally. So to create our\n 1305 02:40:24,680 --> 02:40:32,420 do is we're going to construct a bijection,\n 1306 02:40:32,420 --> 02:40:40,850 the integers in the range zero inclusive to\n 1307 02:40:40,850 --> 02:40:47,079 So this step in general is actually not necessary.\n 1308 02:40:47,079 --> 02:40:55,969 based union find, which is very efficient,\n 1309 02:40:55,969 --> 02:41:03,809 have some random objects, and we want to assign\n 1310 02:41:03,809 --> 02:41:15,389 as long as each element maps to exactly one\n 1311 02:41:15,389 --> 02:41:20,170 And we want to store these mappings perhaps\n 1312 02:41:20,170 --> 02:41:30,109 them and determine what everything is mapped\n 1313 02:41:30,109 --> 02:41:39,449 And each index is going to have an associated\n 1314 02:41:39,449 --> 02:41:49,350 So for instance, in the last slide, A was\n 1315 02:41:55,228 --> 02:42:01,920 So what you see in this picture is at the\n 1316 02:42:01,920 --> 02:42:10,949 our mapping. And in the center is just a visual\n 1317 02:42:10,949 --> 02:42:19,120 in the array for each position is currently\n 1318 02:42:19,120 --> 02:42:30,140 originally, every node is a root node, meaning\n 1319 02:42:30,139 --> 02:42:38,978 on the left of unifying groups together, or\n 1320 02:42:38,978 --> 02:42:49,148 to find that we're going to change the values\n 1321 02:42:49,148 --> 02:42:57,059 specifically, the way we're going to do it\n 1322 02:42:57,059 --> 02:43:08,269 i's parent is going to be whatever index\n 1323 02:43:08,270 --> 02:43:19,601 to unify C and K, we look at C and K. And\n 1324 02:43:19,601 --> 02:43:28,620 and K as of nine. 
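The three steps of Kruskal's algorithm walked through above (sort the edges, skip an edge whose endpoints are already in the same group, otherwise merge the two groups) can be sketched as follows. This is a minimal illustration, not the video's code: it assumes nodes are numbered 0 to n-1, edges are given as {u, v, weight} triples, and it uses a deliberately bare union find (no path compression, no merging by size). The names KruskalSketch, find and mstWeight are hypothetical.

```java
import java.util.Arrays;

class KruskalSketch {
    // Follows parent links until reaching a node that is its own parent (the root).
    static int find(int[] parent, int x) {
        while (parent[x] != x) x = parent[x];
        return x;
    }

    // edges[i] = {u, v, weight}. Returns the total weight of a minimum spanning
    // tree, or -1 if the graph is not connected (an assumption of this sketch).
    static long mstWeight(int n, int[][] edges) {
        int[][] sorted = edges.clone();
        Arrays.sort(sorted, (a, b) -> Integer.compare(a[2], b[2])); // ascending weight

        int[] parent = new int[n];
        for (int i = 0; i < n; i++) parent[i] = i; // every node starts in its own group

        long total = 0;
        int used = 0;
        for (int[] e : sorted) {
            int rootU = find(parent, e[0]);
            int rootV = find(parent, e[1]);
            if (rootU == rootV) continue; // same group: this edge would create a cycle
            parent[rootU] = rootV;        // union: merge the two groups
            total += e[2];
            if (++used == n - 1) break;   // a spanning tree has n - 1 edges
        }
        return used == n - 1 ? total : -1;
    }
}
```

With a union find that also uses path compression and merging by size, the sort becomes the dominant cost of the algorithm.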
So either C's won't come\n 1325 02:43:28,620 --> 02:43:37,220 And I chose that case parent is going to be\n 1326 02:43:37,219 --> 02:43:44,219 position, I'm going to put a four, because\n 1327 02:43:44,219 --> 02:43:50,510 are going to do a similar type of thing. And\n 1328 02:43:50,510 --> 02:43:59,670 is going to be E. So at F position, which\n 1329 02:43:59,670 --> 02:44:07,050 is zero. Similar thing for a and J. But here's\n 1330 02:44:07,050 --> 02:44:20,519 now we want to unify A and B. So if I look\n 1331 02:44:20,520 --> 02:44:30,630 I know that A's root node for group greens\n 1332 02:44:30,629 --> 02:44:40,170 loop. And in general, I'm going to merge smaller\n 1333 02:44:40,170 --> 02:44:52,850 are point to J, because the green groups root\n 1334 02:44:52,850 --> 02:44:58,750 find the root node of D which is the and find\n 1335 02:44:58,750 --> 02:45:04,670 to merge the smartcompany into the into the\n 1336 02:45:04,670 --> 02:45:11,439 Now these want to be part of the orange group.\n 1337 02:45:11,439 --> 02:45:23,930 happens. And I now points that C. Now, I want\n 1338 02:45:23,930 --> 02:45:33,939 going to merge L and B into the red group.\n 1339 02:45:33,939 --> 02:45:40,738 interesting example. So I find a C's root\n 1340 02:45:40,738 --> 02:45:50,260 node which is J. Now, component, orange has\n 1341 02:45:50,260 --> 02:45:58,909 three. So I'm going to merge the green component\n 1342 02:45:58,909 --> 02:46:09,439 to point to C. So I want to unify A and B.\n 1343 02:46:09,439 --> 02:46:18,600 nodes until I reach a root node, as parents\n 1344 02:46:18,600 --> 02:46:23,760 belongs to the orange group. And if I do a\n 1345 02:46:23,760 --> 02:46:29,158 B's parent is also C, which is the orange\n 1346 02:46:29,158 --> 02:46:40,260 already unified together. 
So H and J, G, they\n 1347 02:46:40,260 --> 02:46:47,728 to arbitrarily merge them into a new group.\n 1348 02:46:47,728 --> 02:46:53,679 if I look, h is parent ID, G, and s parent\nis he 1349 02:46:53,680 --> 02:47:03,139 the right component is larger, so I'm going\n 1350 02:47:03,139 --> 02:47:08,849 g was the root node, I make it point to E,\n 1351 02:47:08,850 --> 02:47:19,930 merge H and B. So H is root node is E, if\n 1352 02:47:19,930 --> 02:47:26,779 B's root node is C, because we go from B to\n 1353 02:47:26,779 --> 02:47:31,760 is larger than the red component, we're going\n 1354 02:47:31,760 --> 02:47:38,100 the root of the orange component. So he now\n 1355 02:47:38,100 --> 02:47:44,520 example, I'm not using a technique called\n 1356 02:47:44,520 --> 02:47:52,909 going to look at in the next video, which\n 1357 02:47:52,909 --> 02:47:59,600 if we want to find out which component a particular\n 1358 02:47:59,600 --> 02:48:06,988 the root of that component by following all\n 1359 02:48:06,988 --> 02:48:13,969 or a node whose parent is itself and that\n 1360 02:48:13,969 --> 02:48:22,969 element belongs to. And to unify two components\n 1361 02:48:22,969 --> 02:48:29,889 of each component. And then if the root nodes\n 1362 02:48:29,889 --> 02:48:35,278 then they belong to the same component already.\n 1363 02:48:35,279 --> 02:48:44,699 nodes point to the become the parent of the\n 1364 02:48:44,699 --> 02:48:50,899 union find data structure. So in general,\n 1365 02:48:50,898 --> 02:48:54,939 because this would be inefficient as we'd\n 1366 02:48:54,939 --> 02:49:01,430 to that note, we don't have access to those.\n 1367 02:49:01,430 --> 02:49:09,090 of that. I just don't see any application\n 1368 02:49:09,090 --> 02:49:15,189 that the number of components in our union\n 1369 02:49:15,189 --> 02:49:21,790 root nodes remaining. 
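The find and unify operations summarized above, before path compression is introduced, could look roughly like this. It is a sketch under stated assumptions: elements have already been mapped to the integers 0 to n-1, the smaller component is merged into the larger one as in the walkthrough, and the class name SimpleUnionFind and its members are hypothetical rather than the video's source.

```java
class SimpleUnionFind {
    private final int[] parent; // parent[i] == i means i is a root node
    private final int[] size;   // size[root] = number of elements in that component
    private int components;     // equals the number of root nodes remaining

    SimpleUnionFind(int n) {
        parent = new int[n];
        size = new int[n];
        components = n;
        for (int i = 0; i < n; i++) {
            parent[i] = i; // originally every node is its own root
            size[i] = 1;
        }
    }

    // Follow parent links until a self loop (a root) is reached.
    int find(int p) {
        while (parent[p] != p) p = parent[p];
        return p;
    }

    // Merge the components containing p and q, smaller into larger.
    void unify(int p, int q) {
        int rootP = find(p), rootQ = find(q);
        if (rootP == rootQ) return; // already in the same component
        if (size[rootP] < size[rootQ]) {
            parent[rootP] = rootQ;
            size[rootQ] += size[rootP];
        } else {
            parent[rootQ] = rootP;
            size[rootP] += size[rootQ];
        }
        components--; // one fewer root node after a successful merge
    }

    int components() { return components; }
}
```

There is no operation that splits a component, so the component count can only stay the same or decrease.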
Because each root node\n 1370 02:49:21,790 --> 02:49:26,770 that the number of root nodes never increases,\n 1371 02:49:26,770 --> 02:49:33,109 unify components, so components only get bigger\n 1372 02:49:33,109 --> 02:49:39,109 about the complexity of the Union find. So\n 1373 02:49:39,109 --> 02:49:46,649 has an amortized time complexity. However,\n 1374 02:49:46,648 --> 02:49:52,189 have an amortized time complexity. Not yet\n 1375 02:49:52,189 --> 02:49:57,920 something we're going to look at in the next\n 1376 02:49:57,920 --> 02:50:05,920 an absolute beast of it. structure, you must\n 1377 02:50:05,920 --> 02:50:12,790 if we need to check, if H and B belong to\n 1378 02:50:12,790 --> 02:50:20,260 going to take five hops in the worst case,\n 1379 02:50:20,260 --> 02:50:26,260 find the root node, which is C, and then we\n 1380 02:50:26,260 --> 02:50:35,100 also C. So this takes quite a few hops. Let's\n 1381 02:50:35,100 --> 02:50:41,930 is really what makes the union find one of\n 1382 02:50:41,930 --> 02:50:51,790 it's how the union find gets to boast in its\n 1383 02:50:51,790 --> 02:50:57,010 we get started, it's critical that you watch\n 1384 02:50:57,010 --> 02:51:04,350 the find in the Union operation work. Otherwise,\n 1385 02:51:04,350 --> 02:51:10,149 what's up with path compression, and how we're\n 1386 02:51:11,779 --> 02:51:19,100 Alright, suppose we have this hypothetical\n 1387 02:51:19,100 --> 02:51:24,590 path compression, I'm almost certain it's\n 1388 02:51:24,590 --> 02:51:31,340 a structure that looks like this. Nonetheless,\n 1389 02:51:31,340 --> 02:51:42,449 to unify nodes, E and L. Or just unify groups,\n 1390 02:51:42,449 --> 02:51:48,500 and L. And that's what we're calling the unify\n 1391 02:51:48,500 --> 02:51:54,949 that start on E and L. 
And where we would\n 1392 02:51:54,949 --> 02:52:02,390 find the root node of L, and then get one\n 1393 02:52:02,389 --> 02:52:09,129 compression, we do that, but we're also going\n 1394 02:52:09,129 --> 02:52:18,849 the parent note of E. So E's parent is D,\n 1395 02:52:18,850 --> 02:52:24,988 F. So we found the root node of E. But with\n 1396 02:52:24,988 --> 02:52:30,689 to do. Now that we have a reference to the\n 1397 02:52:30,690 --> 02:52:38,899 the root node. And similarly DS are important\n 1398 02:52:38,898 --> 02:52:47,039 everything along the path got compressed,\n 1399 02:52:47,040 --> 02:52:55,500 so, at every time we do a lookup, on either\n 1400 02:52:55,500 --> 02:53:04,209 be able to find out what the parent or the\n 1401 02:53:04,209 --> 02:53:10,669 immediately point to it, we don't have to\n 1402 02:53:10,670 --> 02:53:17,040 can do this because in a union find, we're\n 1403 02:53:17,040 --> 02:53:22,979 them more and more compressed. We're never\n 1404 02:53:22,978 --> 02:53:27,898 we do the same thing for L, we find LS parent.\n 1405 02:53:27,898 --> 02:53:36,170 find the root. And then we compress the path.\n 1406 02:53:36,170 --> 02:53:45,020 to G. And so we compress that path. But we\n 1407 02:53:45,020 --> 02:53:52,430 point to the other. And we've unified both\n 1408 02:53:52,430 --> 02:53:58,360 E, and once with l have now been merged into\n 1409 02:53:58,360 --> 02:54:06,630 is we've compressed along the way as we've\n 1410 02:54:06,629 --> 02:54:13,769 Now let's have a look at another example.\n 1411 02:54:13,770 --> 02:54:20,430 the regular union find operations where we're\n 1412 02:54:20,430 --> 02:54:29,479 version, we now know. So if I run all those\n 1413 02:54:29,478 --> 02:54:38,028 So it's the beginning all these pairs of components,\n 1414 02:54:38,029 --> 02:54:51,710 right. 
And this is the final state of our\n 1415 02:54:51,709 --> 02:55:00,278 determine what groups say a and j or n, then\n 1416 02:55:00,279 --> 02:55:09,600 nodes. So j goes, I use h h goes to eat. But\n 1417 02:55:09,600 --> 02:55:18,680 what happens. So I still have all those components.\n 1418 02:55:18,680 --> 02:55:23,710 the right hand side, this is what happens.\n 1419 02:55:23,709 --> 02:55:32,329 of path compression, that j merged into the\n 1420 02:55:32,329 --> 02:55:39,110 And then I keep executing more instructions.\n 1421 02:55:39,110 --> 02:55:50,940 dynamically. So so I'm getting more and more\n 1422 02:55:50,940 --> 02:55:57,069 So on the last example, we haven't even finish\n 1423 02:55:57,068 --> 02:56:01,359 the final state. But with path compression,\n 1424 02:56:01,359 --> 02:56:08,159 our path, we get to compress the path along\n 1425 02:56:08,159 --> 02:56:13,299 now, we only have one, root, B, E, and 1426 02:56:13,299 --> 02:56:21,000 almost everything in constant time, points\n 1427 02:56:21,000 --> 02:56:26,029 And we know that the route is easy. So we\n 1428 02:56:26,029 --> 02:56:32,800 becomes very stable eventually, because of\n 1429 02:56:32,799 --> 02:56:40,398 find with path compression is so efficient.\n 1430 02:56:40,398 --> 02:56:49,318 source code. So here's the link to the source\n 1431 02:56:49,318 --> 02:56:55,459 github.com slash William fees, slash data\n 1432 02:56:55,459 --> 02:57:00,858 structures from past videos. And before we\n 1433 02:57:00,859 --> 02:57:07,220 the other videos pertaining to the union find,\n 1434 02:57:07,219 --> 02:57:17,528 Okay, let's dig in. Here we are inside the\n 1435 02:57:17,529 --> 02:57:23,488 and see a few instance variables. So let's\n 1436 02:57:23,488 --> 02:57:30,478 many elements we have in our union find. There\n 1437 02:57:30,478 --> 02:57:38,159 one called size. 
So the interest, while the\n 1438 02:57:38,159 --> 02:57:47,859 array I talked about which at index i points\n 1439 02:57:47,859 --> 02:57:55,609 is equal to AI, then we know that AI is a\n 1440 02:57:55,609 --> 02:58:04,000 of all these like tree like structures right\n 1441 02:58:04,000 --> 02:58:10,790 because we create a by ejection between our\n 1442 02:58:10,790 --> 02:58:18,110 able to access them through this ID array.\n 1443 02:58:18,110 --> 02:58:24,470 of components, that's sometimes some useful\n 1444 02:58:24,469 --> 02:58:28,898 a union find, well, you need to know how many\n 1445 02:58:28,898 --> 02:58:37,930 find. And I make sure that we have a positive\n 1446 02:58:37,930 --> 02:58:48,220 Now go ahead and initialize some instance\n 1447 02:58:48,219 --> 02:59:00,959 So initially, everyone is a root node, and\n 1448 02:59:00,959 --> 02:59:10,259 is pretty simple. It's given a a node, it\n 1449 02:59:10,260 --> 02:59:17,591 does path compression along the way. So if\n 1450 02:59:17,591 --> 02:59:22,760 of P, what we're going to do is we're going\n 1451 02:59:22,760 --> 02:59:29,659 loop. So we initialize a new variable called\n 1452 02:59:29,659 --> 02:59:37,029 is not equal to ID at root. So aka This is\n 1453 02:59:37,029 --> 02:59:42,989 so we can stop and the root is stored in the\n 1454 02:59:42,989 --> 02:59:49,898 is we do the path compression. This is what\n 1455 02:59:49,898 --> 02:59:59,219 back at p, we assign everything from idmp\n 1456 02:59:59,219 --> 03:00:05,659 the path gives us that nice amortized time\n 1457 03:00:05,659 --> 03:00:12,939 But I don't like having the overhead and doing\n 1458 03:00:12,939 --> 03:00:25,950 Okay, so now, we have these simple methods,\n 1459 03:00:25,950 --> 03:00:35,409 same component, this will return true, because\n 1460 03:00:35,409 --> 03:00:43,010 this will return false. 
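The two-pass find described here, first walking up to the root and then pointing every node on the path directly at it, can be sketched as below. The array name id follows the transcript; the class name PathCompressionSketch and the static-method layout are assumptions made to keep the example self-contained.

```java
class PathCompressionSketch {
    // id[i] is the parent of i; id[i] == i marks a root node.
    static int find(int[] id, int p) {
        // First pass: walk up until we hit a self loop, which is the root.
        int root = p;
        while (root != id[root]) root = id[root];

        // Second pass: go back to p and point every node on the path at the root.
        while (p != root) {
            int next = id[p];
            id[p] = root;
            p = next;
        }
        return root;
    }

    // Two elements are connected when they share a root.
    static boolean connected(int[] id, int p, int q) {
        return find(id, p) == find(id, q);
    }
}
```

Because find rewrites the array as a side effect, even a plain connected check leaves the structure flatter for later calls.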
And just calling find\n 1461 03:00:43,010 --> 03:00:49,620 just checking if two components are connected,\n 1462 03:00:49,620 --> 03:00:56,560 the path, same thing here, if we decide to\n 1463 03:00:56,559 --> 03:01:05,818 index p, then when we index into the size\n 1464 03:01:05,818 --> 03:01:11,409 the root but at the same time, we'll also\n 1465 03:01:11,409 --> 03:01:20,238 And I would just like to note that the the\n 1466 03:01:20,238 --> 03:01:24,010 the size because they're the ones that are\n 1467 03:01:24,010 --> 03:01:31,309 at the end of the chain. Size just returns\n 1468 03:01:31,309 --> 03:01:37,840 disjoint, set components number components\n 1469 03:01:37,840 --> 03:01:45,069 method is the last interesting method. So\n 1470 03:01:45,069 --> 03:01:54,909 together. So so first of all we do is we find\n 1471 03:01:54,909 --> 03:02:01,159 node for Q is. And if the root nodes are equal,\n 1472 03:02:01,159 --> 03:02:08,549 we don't do anything. Otherwise, by convention,\n 1473 03:02:08,549 --> 03:02:19,259 group. Although I know some people like to\n 1474 03:02:19,260 --> 03:02:26,489 and then merge according to not and that may\n 1475 03:02:26,489 --> 03:02:33,318 work. So I just like to merge the smaller\n 1476 03:02:33,318 --> 03:02:40,409 the roots are different, and we're emerging,\n 1477 03:02:40,409 --> 03:02:47,000 must have decreased by one. 
So that's why\n 1478 03:02:47,000 --> 03:02:55,350 components, subtract that by one, because\n 1479 03:02:55,350 --> 03:03:03,859 So this whole time, inside this class, I've\n 1480 03:03:03,859 --> 03:03:12,130 as elements, like letters that I that we saw\n 1481 03:03:12,129 --> 03:03:18,709 by ejection, I would do a lookup to find out\n 1482 03:03:18,709 --> 03:03:25,919 give me an integer, and what maps to the element\n 1483 03:03:25,920 --> 03:03:31,100 union find data structure created, and turn\n 1484 03:03:31,100 --> 03:03:40,840 of dealing with objects and having all this\n 1485 03:03:40,840 --> 03:03:48,079 an array based union find. You could also\n 1486 03:03:48,079 --> 03:03:53,760 objects. But this is really nice, and it's\n 1487 03:03:53,760 --> 03:04:01,110 I want to start talking about a very exciting\n 1488 03:04:01,110 --> 03:04:08,750 start to realize that there are tons and tons\n 1489 03:04:08,750 --> 03:04:16,399 I want to focus on a very popular country\n 1490 03:04:16,399 --> 03:04:23,648 trees, we must talk about binary search trees\n 1491 03:04:23,648 --> 03:04:31,010 on binary trees and binary search trees where\n 1492 03:04:31,010 --> 03:04:36,540 tutorials, we're going to cover how to insert\n 1493 03:04:36,540 --> 03:04:44,740 and also do some of the more popular tree\n 1494 03:04:44,739 --> 03:04:53,549 trees. Also, not just binary trees. Okay.\n 1495 03:04:53,549 --> 03:05:00,090 course on trees before we get started. So\n 1496 03:05:00,090 --> 03:05:05,210 can satisfy either of the following definitions,\n 1497 03:05:05,209 --> 03:05:15,619 these are the most popular ones. So trees\n 1498 03:05:15,620 --> 03:05:24,590 a cyclic, a cyclic means there are no cycles.\n 1499 03:05:24,590 --> 03:05:33,909 we have n minus one edges. 
And lastly, for\n 1500 03:05:33,909 --> 03:05:41,549 those two vertices, you can have two different\n 1501 03:05:41,549 --> 03:05:49,059 a tree because there's another route to get\n 1502 03:05:49,059 --> 03:05:58,488 Okay, and context is trees, we can have something\n 1503 03:05:58,488 --> 03:06:04,359 node of our tree, you can think of it that\n 1504 03:06:04,359 --> 03:06:10,140 and you don't have a route yet, it doesn't\n 1505 03:06:10,139 --> 03:06:19,409 because any node you pick can become the root\n 1506 03:06:19,409 --> 03:06:27,139 that node. And suddenly, it's the new root.\n 1507 03:06:27,139 --> 03:06:33,920 child and parent nodes. So child node is a\n 1508 03:06:33,920 --> 03:06:39,978 of it as going down or it's an A parent node\n 1509 03:06:39,978 --> 03:06:47,789 towards the root. So we have an interesting\n 1510 03:06:47,790 --> 03:06:56,689 of the root node? The answer is that the root\n 1511 03:06:56,689 --> 03:07:04,439 may be useful to say that the parent of the\n 1512 03:07:04,439 --> 03:07:11,739 when programming, for instance, a file system,\n 1513 03:07:11,739 --> 03:07:19,199 command line, I'm in some directory, so I\n 1514 03:07:19,200 --> 03:07:27,750 somewhere in the file system tree. And if\n 1515 03:07:27,750 --> 03:07:36,238 dot dot slash, and now I'm up in another directory,\n 1516 03:07:36,238 --> 03:07:43,898 doing this and going up and up in the file\n 1517 03:07:43,898 --> 03:07:50,670 directly to the root node, which is slash,\n 1518 03:07:50,670 --> 03:07:57,989 very top at the root of the directory, and\n 1519 03:07:57,989 --> 03:08:06,119 I am, I'm again at the root. So in this context,\n 1520 03:08:06,120 --> 03:08:13,710 is the root. Pretty cool. So just as an example,\n 1521 03:08:13,709 --> 03:08:20,579 three and two and a parent four, we also have\n 1522 03:08:20,579 --> 03:08:26,271 this a node which has no children, and these\n 1523 03:08:26,271 --> 03:08:32,360 just at the very bottom of your tree. 
Think\n 1524 03:08:32,360 --> 03:08:38,329 which is sub tree, this is the tree entirely\n 1525 03:08:38,329 --> 03:08:45,329 use triangles to denote sub trees. It's possible\n 1526 03:08:45,329 --> 03:08:56,969 node, so that's fine. So if this so tree with\n 1527 03:08:56,969 --> 03:09:02,459 particular sub tree and look what's inside\n 1528 03:09:02,459 --> 03:09:08,698 nodes and more sub trees. Then we pick another\n 1529 03:09:08,699 --> 03:09:14,870 we get another tree. And eventually, we're\n 1530 03:09:14,870 --> 03:09:22,120 the question, what is a binary tree? And this\n 1531 03:09:22,120 --> 03:09:30,760 node has at most, two children. So both those\n 1532 03:09:30,760 --> 03:09:38,829 have at most two children. You can see that\n 1533 03:09:38,829 --> 03:09:45,569 and that's fine, because the criteria is at\n 1534 03:09:45,569 --> 03:09:50,879 I'm going to give you some various structures,\n 1535 03:09:50,879 --> 03:10:02,379 it is a binary tree or not. So is this a binary\n 1536 03:10:02,379 --> 03:10:10,929 two children. How about this one? No, you\n 1537 03:10:10,930 --> 03:10:17,809 it's not a binary tree. How about this one?\n 1538 03:10:17,809 --> 03:10:25,698 Yes, this is a binary tree. It may be a degenerate\n 1539 03:10:25,699 --> 03:10:30,790 let's move on to binary search trees. So what\n 1540 03:10:30,790 --> 03:10:36,489 it's a binary tree. But Furthermore, it also\n 1541 03:10:36,488 --> 03:10:43,989 tree invariant. And that is that the less\n 1542 03:10:43,989 --> 03:10:50,389 value of the current node. And the right subtree\n 1543 03:10:50,389 --> 03:10:56,639 node. So below are a few binary search trees.\n 1544 03:10:56,639 --> 03:11:01,099 and I'm going to give you some trees, and\n 1545 03:11:01,100 --> 03:11:11,000 binary search trees or not. 
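The binary search tree invariant just stated can also be checked mechanically, which is a handy way to verify answers to the examples that follow. The sketch below assumes integer values and treats duplicates as invalid, which is only one of the possible conventions; the names BstCheckSketch, Node and isBst are hypothetical.

```java
class BstCheckSketch {
    static final class Node {
        final int value;
        final Node left, right;

        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Every value in the subtree must lie strictly between lo and hi.
    // A null bound means "unbounded" on that side.
    static boolean isBst(Node node, Integer lo, Integer hi) {
        if (node == null) return true; // an empty subtree satisfies the invariant
        if (lo != null && node.value <= lo) return false;
        if (hi != null && node.value >= hi) return false;
        // Left subtree must stay below this value, right subtree above it.
        return isBst(node.left, lo, node.value) && isBst(node.right, node.value, hi);
    }
}
```

Passing the bounds down matters: comparing each node only with its direct children would miss a value that sits in the wrong subtree of a more distant ancestor.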
What about this\n 1546 03:11:11,000 --> 03:11:16,469 on whether you want to allow duplicate values\n 1547 03:11:16,469 --> 03:11:21,770 search tree operations allow for duplicate\n 1548 03:11:21,771 --> 03:11:27,659 most of the time, we're only interested in\n 1549 03:11:27,658 --> 03:11:36,129 particular tree depends on what your definition\n 1550 03:11:36,129 --> 03:11:44,969 tree? Yes, this is a binary search tree. How\n 1551 03:11:44,969 --> 03:11:54,719 the elements within the tree. Yes, this is\n 1552 03:11:54,719 --> 03:12:00,409 to only having numbers within our binary search\n 1553 03:12:00,409 --> 03:12:13,398 is comparable and can be ordered. How about\n 1554 03:12:13,398 --> 03:12:21,219 tree. And the reason is the the node nine\n 1555 03:12:21,219 --> 03:12:30,408 inserting nine, we would have to place it\n 1556 03:12:30,408 --> 03:12:41,810 is larger than eight so belongs in its right\n 1557 03:12:41,810 --> 03:12:47,799 isn't even a tree actually, because it contains\n 1558 03:12:47,799 --> 03:12:55,679 search tree is that you must be a tree. And\n 1559 03:12:55,680 --> 03:13:02,790 more time to look at this one. Because it's\n 1560 03:13:02,790 --> 03:13:11,680 think of as a binary search tree. And the\n 1561 03:13:11,680 --> 03:13:19,170 the binary search tree invariant that every\n 1562 03:13:19,170 --> 03:13:25,579 will you'll see that that is true. And also\n 1563 03:13:25,579 --> 03:13:32,799 It doesn't look like a tree, but it satisfies\n 1564 03:13:32,799 --> 03:13:38,349 tree. Okay, so we've been talking about binary\n 1565 03:13:38,350 --> 03:13:45,870 used? Why are they useful? 
So in particular,\n 1566 03:13:45,870 --> 03:13:53,930 of abstract data types for sets, and maps\n 1567 03:13:53,930 --> 03:13:59,439 balanced binary search trees, which we'll\n 1568 03:13:59,439 --> 03:14:05,738 see binary search, or sorry, binary trees\n 1569 03:14:05,738 --> 03:14:12,770 priority queues when we're making a binary\n 1570 03:14:12,770 --> 03:14:20,319 like syntax trees, so you're parsing an arithmetic\n 1571 03:14:20,318 --> 03:14:28,539 syntax tree. And then you can simplify expression.\n 1572 03:14:28,540 --> 03:14:34,790 expressions. So wherever you punch in your\n 1573 03:14:34,790 --> 03:14:41,949 evaluated. And lastly, I just threw in a trip,\n 1574 03:14:41,949 --> 03:14:47,590 structure. So now let's look at the complexity\n 1575 03:14:47,590 --> 03:14:54,540 looked very interesting and also very useful.\n 1576 03:14:54,540 --> 03:14:58,149 when you're just given some random data 1577 03:14:58,148 --> 03:15:04,939 the time complexity is growing. Be logarithmic,\n 1578 03:15:04,939 --> 03:15:11,120 nodes, deleting nodes, removing nodes searching\n 1579 03:15:11,120 --> 03:15:17,000 complexity is going to be the logarithmic.\n 1580 03:15:17,000 --> 03:15:24,930 are very easy to implement. So this is really\n 1581 03:15:24,930 --> 03:15:34,590 tree and D generates to being a line, then\n 1582 03:15:34,590 --> 03:15:41,100 which is really bad. So there's some trade\n 1583 03:15:41,100 --> 03:15:47,100 search tree, that it's going to be easy to\n 1584 03:15:47,100 --> 03:15:52,210 this logarithmic behavior. But in the worst\n 1585 03:15:52,209 --> 03:16:00,778 stuff, which is not so good. Okay, let's have\n 1586 03:16:00,779 --> 03:16:10,180 a binary search tree. 
So let's dive right\n 1587 03:16:10,180 --> 03:16:15,500 search tree, we need to make sure that the\n 1588 03:16:15,500 --> 03:16:22,309 meaning that we can order them in some way\n 1589 03:16:22,309 --> 03:16:29,420 know whether we need to place the element\n 1590 03:16:29,420 --> 03:16:36,370 And we're going to encounter essentially four\n 1591 03:16:36,370 --> 03:16:43,630 to compare the value to the value of the current\n 1592 03:16:43,629 --> 03:16:51,000 things. Either, we're going to recurse down\n 1593 03:16:51,000 --> 03:16:55,000 than the current element, or we're going to\n 1594 03:16:55,000 --> 03:17:01,500 element is greater than the current element.\n 1595 03:17:01,500 --> 03:17:10,850 has the same value as the one we're considering.\n 1596 03:17:10,850 --> 03:17:18,829 if we're deciding to add duplicate values\n 1597 03:17:18,829 --> 03:17:22,850 we have the case that we've hit a null node,\n 1598 03:17:22,850 --> 03:17:29,470 and insert it in our tree. Let's look at some\n 1599 03:17:29,469 --> 03:17:35,969 of insert instructions. So we have all these\n 1600 03:17:35,969 --> 03:17:40,228 tree. And currently the search tree or the\n 1601 03:17:40,228 --> 03:17:50,059 want to insert seven. So seven becomes the\n 1602 03:17:50,059 --> 03:17:59,750 Next, we want to insert 20. So 20 is greater\n 1603 03:17:59,750 --> 03:18:04,840 we want insert five. So we always start at\n 1604 03:18:04,840 --> 03:18:09,630 an important point. So you start at the root,\n 1605 03:18:09,629 --> 03:18:14,778 figure out where you want to insert the node.\n 1606 03:18:14,779 --> 03:18:22,370 oh, five is less than seven. So we're going\n 1607 03:18:22,370 --> 03:18:30,390 go to the right, because 15 is greater than\n 1608 03:18:30,389 --> 03:18:44,879 20 at 10. Now four, so four is less than seven,\n 1609 03:18:44,879 --> 03:18:51,500 create the new node. Now we have four again.\n 1610 03:18:51,500 --> 03:18:57,620 to the left and moved to the left. 
Now we've\n 1611 03:18:57,620 --> 03:19:04,040 our tree. So as I said before, if your tree\n 1612 03:19:04,040 --> 03:19:10,370 to add another node. And you would either\n 1613 03:19:10,370 --> 03:19:16,779 or on the right. Otherwise, you'd do nothing.\n 1614 03:19:16,779 --> 03:19:24,010 33. So start at the root, go to the right,\n 1615 03:19:24,010 --> 03:19:31,950 Now insert two, so two is smaller than everything\n 1616 03:19:31,950 --> 03:19:41,390 the left. Now try and see where 25 would go.\n 1617 03:19:41,389 --> 03:19:48,108 to go to the right again, because it's greater\n 1618 03:19:50,840 --> 03:19:59,949 And finally six, so once left, once right,\n 1619 03:19:59,949 --> 03:20:05,790 search tree. So on average, the insertion\n 1620 03:20:05,790 --> 03:20:12,449 worst case, this behavior could degrade to\n 1621 03:20:12,449 --> 03:20:21,229 if our instructions are the following: insert\n 1622 03:20:21,228 --> 03:20:26,159 and insert two, so it's to the right. Okay, now\n 1623 03:20:26,159 --> 03:20:31,449 than everything. So I have to place it to the right\n 1624 03:20:31,449 --> 03:20:38,960 greater than everything. Oh, looks like we're\n 1625 03:20:38,959 --> 03:20:44,289 still greater than everything. So as you can\n 1626 03:20:44,290 --> 03:20:51,229 bad. And we don't want to create lines like\n 1627 03:20:51,228 --> 03:20:56,020 is in the tree, or if we want to remove five,\n 1628 03:20:56,021 --> 03:21:01,420 thing to find the node, that's going to take\n 1629 03:21:01,420 --> 03:21:07,329 one of the reasons why people have invented\n 1630 03:21:07,329 --> 03:21:13,680 or self-balancing trees, which balance themselves\n 1631 03:21:13,680 --> 03:21:21,510 But that's it for insertion. It's really simple.\n 1632 03:21:21,510 --> 03:21:26,809 we know how to insert elements into a binary\n 1633 03:21:26,809 --> 03:21:32,250 elements from a binary search tree. And this\n 1634 03:21:32,250 --> 03:21:38,750 to make it very simple for you guys. 
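The insertion rules walked through above (compare with the current node, recurse left or right, ignore a duplicate, create the node on reaching null) can be sketched roughly as follows. This is a minimal illustration and not the code from the video: the class name BstInsertSketch, the int-valued Node and the build helper are assumptions made for the example. The first sequence uses the values mentioned in the walkthrough; the sorted sequence 1, 2, 3, 4 is a made-up input that shows the tree degenerating into a line.

```java
// Illustrative sketch of BST insertion; class and method names are hypothetical.
class BstInsertSketch {
    static final class Node {
        final int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    // Inserts value into the subtree rooted at node and returns the subtree's root.
    static Node insert(Node node, int value) {
        if (node == null) return new Node(value);    // null node reached: create the new node here
        if (value < node.value) {
            node.left = insert(node.left, value);    // smaller: recurse down the left subtree
        } else if (value > node.value) {
            node.right = insert(node.right, value);  // larger: recurse down the right subtree
        }
        // Equal: the value is already present; this sketch ignores duplicates.
        return node;
    }

    // Every insertion starts at the root (null for an empty tree).
    static Node build(int... values) {
        Node root = null;
        for (int v : values) root = insert(root, v);
        return root;
    }

    public static void main(String[] args) {
        Node root = build(7, 20, 5, 15, 10, 4, 4, 33, 2, 25, 6);
        System.out.println(root.value + " " + root.left.value + " " + root.right.value);

        // Sorted input: every node only ever gets a right child, so the tree is a line.
        Node line = build(1, 2, 3, 4);
        System.out.println(line.left == null && line.right.left == null);
    }
}
```

Returning the subtree root from insert keeps the empty-tree case and the general case in one code path, which is one common way to write it; the narrated implementation may be organized differently.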
So when\n 1635 03:21:38,750 --> 03:21:44,889 you can think of it as a two step process.\n 1636 03:21:44,889 --> 03:21:51,559 to remove within the binary search tree, if\n 1637 03:21:51,559 --> 03:21:58,859 we want to replace the node we're removing\n 1638 03:21:58,860 --> 03:22:04,810 to maintain the binary search tree invariant,\n 1639 03:22:04,809 --> 03:22:11,978 invariant is it's that the left subtree has\n 1640 03:22:11,978 --> 03:22:18,459 the right subtree has larger elements than\n 1641 03:22:18,459 --> 03:22:26,629 phase one the find phase. So if we're searching\n 1642 03:22:26,629 --> 03:22:32,879 one of four things is going to happen. The\n 1643 03:22:32,879 --> 03:22:38,579 we've went all the way down our binary search\n 1644 03:22:38,579 --> 03:22:44,920 So the value does not exist inside our binary\n 1645 03:22:44,920 --> 03:22:54,500 is the competitor value is equal to zero.\n 1646 03:22:54,500 --> 03:23:01,430 function that will return minus one if it's\n 1647 03:23:01,430 --> 03:23:07,139 it's greater than the current value. So it\n 1648 03:23:07,139 --> 03:23:12,579 to go down, or if we found value that we're\n 1649 03:23:12,579 --> 03:23:19,350 we found the value. If it's less than zero,\n 1650 03:23:19,350 --> 03:23:26,770 to be the left subtree if I compared to returns\n 1651 03:23:26,770 --> 03:23:33,819 exists, it's going to be the right subtree.\n 1652 03:23:33,818 --> 03:23:42,859 So suppose we have four or five queries, find\n 1653 03:23:42,859 --> 03:23:49,329 tree there on the right. So if we're trying\n 1654 03:23:49,329 --> 03:23:59,000 the root and 14 is less than so go left. 14\n 1655 03:23:59,000 --> 03:24:07,520 than 15. Go left. 14 is greater than 12. So\n 1656 03:24:07,520 --> 03:24:14,750 value that we are looking for. Alright, and\n 1657 03:24:14,750 --> 03:24:25,450 Wi Fi is less than 31. 
And now we found 25.\n 1658 03:24:27,090 --> 03:24:37,270 Okay, here's her go, go right, go right, your\n 1659 03:24:37,270 --> 03:24:43,940 look at 17. So 17 should be on the left. Now\n 1660 03:24:43,940 --> 03:24:53,630 again. And, oh, we've hit a point where we\n 1661 03:24:53,629 --> 03:25:00,379 that's another possibility the value simply\n 1662 03:25:00,379 --> 03:25:07,608 Left of 90, and the left of 19 is a null node.\n 1663 03:25:07,609 --> 03:25:13,739 value we're looking for. So now that we found\n 1664 03:25:13,739 --> 03:25:18,969 the Remove phase. And in the Remove phase,\n 1665 03:25:18,969 --> 03:25:24,608 first case is that the node we want to remove\n 1666 03:25:24,609 --> 03:25:32,790 two, and three, as I like to call them, is\n 1667 03:25:32,790 --> 03:25:38,930 right subtree, but no left subtree, or those\n 1668 03:25:38,930 --> 03:25:44,270 finally case for is that we have both a left\n 1669 03:25:44,270 --> 03:25:51,600 you guys how to handle each of these cases,\n 1670 03:25:51,600 --> 03:25:57,430 one, we have a leaf node. So if you have a\n 1671 03:25:57,430 --> 03:26:03,699 you can do so with a side effect, which is\n 1672 03:26:03,699 --> 03:26:11,979 search tree on the right, and we want to remove\n 1673 03:26:11,978 --> 03:26:20,760 eight is. Oh, and it is a case one, because\n 1674 03:26:20,760 --> 03:26:28,719 it without side effect. So we remove it. Perfect.\n 1675 03:26:28,719 --> 03:26:38,019 two and three. Meaning that either the left\n 1676 03:26:38,020 --> 03:26:45,449 the successor of the node we're trying to\n 1677 03:26:45,449 --> 03:26:54,250 left or right subtree. Let's look at an example.\n 1678 03:26:54,250 --> 03:27:04,870 let's find nine. Okay, we found nine. Now\n 1679 03:27:04,870 --> 03:27:12,050 is nine doesn't have a right subtree. So the\n 1680 03:27:12,049 --> 03:27:20,179 that left subtree. So seven. So now I can\n 1681 03:27:20,180 --> 03:27:29,180 of nine. Perfect. 
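The find phase described earlier, where the comparator value decides whether to stop, go left, or go right, might look like the sketch below. It assumes a generic node whose values implement Comparable; the class name BstFindSketch and the small sample tree are made up for illustration and are not the tree from the slides.

```java
// Illustrative sketch of the find phase; the tree below is a made-up example.
class BstFindSketch {
    static final class Node<T extends Comparable<T>> {
        final T value;
        final Node<T> left, right;
        Node(T value, Node<T> left, Node<T> right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Returns true if target occurs in the subtree rooted at node.
    static <T extends Comparable<T>> boolean contains(Node<T> node, T target) {
        while (node != null) {
            int cmp = target.compareTo(node.value);
            if (cmp == 0) return true;                  // comparator value 0: found it
            node = (cmp < 0) ? node.left : node.right;  // negative: go left, positive: go right
        }
        return false;                                   // reached a null node: value is absent
    }

    // Hypothetical tree: 20 has children 10 and 30; 10 has children 5 and 15.
    static Node<Integer> sample() {
        Node<Integer> five = new Node<>(5, null, null);
        Node<Integer> fifteen = new Node<>(15, null, null);
        Node<Integer> ten = new Node<>(10, five, fifteen);
        Node<Integer> thirty = new Node<>(30, null, null);
        return new Node<>(20, ten, thirty);
    }

    public static void main(String[] args) {
        System.out.println(contains(sample(), 15));
        System.out.println(contains(sample(), 17));
    }
}
```

Searching for 17 in this sample walks 20, 10, 15 and then falls off the tree at a null right child, which is the "value does not exist" outcome.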
Now let's do another example\n 1682 03:27:29,180 --> 03:27:39,568 find four. So we find four. And this is our\n 1683 03:27:39,568 --> 03:27:46,609 But no right subtree. So what do we do?\n 1684 03:27:46,609 --> 03:27:52,800 node of that left subtree. So three, so we\n 1685 03:27:52,799 --> 03:28:01,369 successor. Alright, that wasn't so bad, was\n 1686 03:28:01,370 --> 03:28:09,819 want to remove a node which has both a left\n 1687 03:28:09,818 --> 03:28:16,260 is, in which subtree will the successor of\n 1688 03:28:16,260 --> 03:28:26,829 And the answer is both: the successor can either\n 1689 03:28:26,829 --> 03:28:34,469 the smallest value in the right subtree and\n 1690 03:28:34,469 --> 03:28:42,809 there can be two successors. So the largest\n 1691 03:28:42,809 --> 03:28:48,478 would satisfy the binary search tree invariant,\n 1692 03:28:48,478 --> 03:28:54,799 be larger than everything in the left subtree\n 1693 03:28:54,799 --> 03:29:00,369 left subtree and also it would be smaller\n 1694 03:29:00,370 --> 03:29:08,010 we had found it in the left subtree. Similarly,\n 1695 03:29:08,010 --> 03:29:12,510 subtree, it would also satisfy the binary search tree\n 1696 03:29:12,510 --> 03:29:19,000 be smaller than everything in the right subtree\n 1697 03:29:19,000 --> 03:29:24,680 than the right subtree and also larger than\n 1698 03:29:24,680 --> 03:29:31,540 was found in the right subtree and we know\n 1699 03:29:31,540 --> 03:29:38,439 than everything in the left subtree. So we\n 1700 03:29:38,439 --> 03:29:47,380 be two possible successors. So we can choose\n 1701 03:29:47,379 --> 03:29:55,389 and we want to remove seven, well seven is\n 1702 03:29:55,389 --> 03:30:05,269 and right subtree also seven, so find seven\n 1703 03:30:05,270 --> 03:30:15,770 the successor in our left subtree or our right\n 1704 03:30:15,770 --> 03:30:26,550 in the right subtree what we do is we go into\n 1705 03:30:26,549 --> 03:30:35,478 as possible, go left and go left again. 
And\n 1706 03:30:35,478 --> 03:30:44,829 stop. And this node is going to be the successor,\n 1707 03:30:44,829 --> 03:30:50,590 And you can see that quite clearly 11 is smaller\n 1708 03:30:50,590 --> 03:30:58,969 now what we want to do is, we want to copy\n 1709 03:30:58,969 --> 03:31:05,238 subtree 11 into the node we want to originally\n 1710 03:31:05,238 --> 03:31:12,689 seven with 11. Now we have a problem, which\n 1711 03:31:12,689 --> 03:31:21,199 want to now remove that element 11 which is\n 1712 03:31:21,199 --> 03:31:28,030 shouldn't no longer be inside the tree. And\n 1713 03:31:28,030 --> 03:31:33,399 to remove is always going to be either a case\n 1714 03:31:33,398 --> 03:31:40,478 it would be the case where there's a right\n 1715 03:31:40,478 --> 03:31:47,658 just do this recursively. So so just call\n 1716 03:31:47,658 --> 03:31:57,000 cases. Okay, so it's right subtree. So I want\n 1717 03:31:57,000 --> 03:32:06,648 right subtree and then remove it. So remove\n 1718 03:32:06,648 --> 03:32:14,219 to rebalance the tree like that. Alright,\n 1719 03:32:14,219 --> 03:32:22,158 this example, let's remove 14. So first, let's\n 1720 03:32:22,158 --> 03:32:30,279 go right. Alright, we found 14. Now we either\n 1721 03:32:30,279 --> 03:32:35,640 subtree like we did last time, or the largest\n 1722 03:32:35,639 --> 03:32:42,898 ladder this time and find the largest value\n 1723 03:32:42,898 --> 03:32:49,369 is we would go into the left subtree and dig\n 1724 03:32:49,370 --> 03:33:01,760 the left subtree in digges far right 913.\n 1725 03:33:01,760 --> 03:33:12,219 And now what we want to do as before is we\n 1726 03:33:12,219 --> 03:33:19,899 into the node we want to remove which is 14.\n 1727 03:33:19,899 --> 03:33:25,770 the remaining 13. So now just remove it and\n 1728 03:33:25,770 --> 03:33:33,060 I just want to go over some more examples.\n 1729 03:33:33,059 --> 03:33:39,260 removing is not quite so obvious. 
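The removal cases discussed above (leaf, a single subtree, and both subtrees with a successor taken from the right subtree) can be put together in one recursive sketch. This assumes int values and picks the smallest node of the right subtree as the successor; the class name BstRemoveSketch, its helpers and the sample values are hypothetical and not taken from the slides.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch of BST removal; names and sample values are hypothetical.
class BstRemoveSketch {
    static final class Node {
        int value;
        Node left, right;
        Node(int value) { this.value = value; }
    }

    static Node insert(Node node, int value) {
        if (node == null) return new Node(value);
        if (value < node.value) node.left = insert(node.left, value);
        else if (value > node.value) node.right = insert(node.right, value);
        return node;
    }

    // Removes value from the subtree rooted at node and returns the new subtree root.
    static Node remove(Node node, int value) {
        if (node == null) return null;                         // find phase failed: nothing to remove
        if (value < node.value) {
            node.left = remove(node.left, value);              // keep searching on the left
        } else if (value > node.value) {
            node.right = remove(node.right, value);            // keep searching on the right
        } else if (node.left == null) {
            return node.right;                                 // leaf, or only a right subtree
        } else if (node.right == null) {
            return node.left;                                  // only a left subtree
        } else {
            Node successor = node.right;                       // both subtrees: dig left in the right subtree
            while (successor.left != null) successor = successor.left;
            node.value = successor.value;                      // copy the successor's value up
            node.right = remove(node.right, successor.value);  // then remove the duplicated value below
        }
        return node;
    }

    // In-order listing, used here only to inspect the result.
    static List<Integer> inOrder(Node node) {
        List<Integer> out = new ArrayList<>();
        collect(node, out);
        return out;
    }

    private static void collect(Node node, List<Integer> out) {
        if (node == null) return;
        collect(node.left, out);
        out.add(node.value);
        collect(node.right, out);
    }

    static Node build(int... values) {
        Node root = null;
        for (int v : values) root = insert(root, v);
        return root;
    }

    public static void main(String[] args) {
        Node root = build(7, 5, 20, 4, 6, 15, 33, 10);
        root = remove(root, 7);   // both subtrees: 10 is the smallest value on the right
        System.out.println(root.value + " " + inOrder(root));
    }
}
```

Choosing the largest value in the left subtree instead would be the mirror image: start at node.left, follow right pointers, and recurse into node.left to delete the copied value.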
Alright,\n 1730 03:33:39,260 --> 03:33:46,370 see if we can remove 18 from this strange\n 1731 03:33:46,370 --> 03:33:54,880 root find a teen so dig all the way down.\n 1732 03:33:54,879 --> 03:34:03,059 be it's one of those case two or threes. It's\n 1733 03:34:03,059 --> 03:34:10,299 successor is just going to be the root node\n 1734 03:34:10,299 --> 03:34:17,579 replace it 17. So 17 is a new successor. Perfect.\n 1735 03:34:17,579 --> 03:34:24,709 want to remove minus two. So now first find\n 1736 03:34:24,709 --> 03:34:31,789 it found it. Now there's two subtrees pick\n 1737 03:34:31,790 --> 03:34:38,470 and we're going to go to the right. Alright,\n 1738 03:34:38,469 --> 03:34:48,559 one. And that's it. Okay, and that is removals\n 1739 03:34:48,559 --> 03:34:55,699 off binary trees and binary search trees with\n 1740 03:34:55,700 --> 03:35:04,069 in order post order and level order. You see\n 1741 03:35:04,068 --> 03:35:11,079 they're good to know. I want to focus on pre\n 1742 03:35:11,079 --> 03:35:20,579 because they're very similar. They're also\n 1743 03:35:20,579 --> 03:35:30,408 sort of get a feel for why they have their\n 1744 03:35:30,408 --> 03:35:38,609 before the two recursive calls, in order will\n 1745 03:35:38,609 --> 03:35:47,689 will print after the recursive calls. So if\n 1746 03:35:47,689 --> 03:35:54,340 the only thing that's different between them\n 1747 03:35:54,340 --> 03:36:01,049 move on to some detail on how preorder works.\n 1748 03:36:01,049 --> 03:36:07,479 stack of what gets called. So when we're recursing\n 1749 03:36:07,479 --> 03:36:14,909 node to go to. And what you need to know about\n 1750 03:36:14,909 --> 03:36:21,440 current node and then we traverse the left\n 1751 03:36:21,440 --> 03:36:30,101 for order where we're going to do is insert\n 1752 03:36:30,101 --> 03:36:41,800 we go down to D, go down to H. 
And now we\n 1753 03:36:41,799 --> 03:36:51,849 call stack and go to D and then we go to I\n 1754 03:36:51,850 --> 03:37:00,180 so we recurse back up, so we push AI off the\n 1755 03:37:01,180 --> 03:37:10,670 go back to B we also are a process B but now\n 1756 03:37:10,670 --> 03:37:20,360 and explore II. Now we've explored ease. So\n 1757 03:37:20,360 --> 03:37:30,540 off the stack. And now a now we need to explore\n 1758 03:37:30,540 --> 03:37:42,830 C then F then j and then right at the bottom,\n 1759 03:37:42,829 --> 03:37:51,049 are recursive push node k off the stack, push\n 1760 03:37:51,049 --> 03:38:03,329 C's right subtree. So g now l and now we're\n 1761 03:38:03,329 --> 03:38:11,659 would exit our function. And at the bottom,\n 1762 03:38:11,659 --> 03:38:20,020 Okay, now let's cover inorder traversal. So,\n 1763 03:38:20,020 --> 03:38:29,050 the left subtree that we print the value.\n 1764 03:38:29,049 --> 03:38:34,728 for this example, I'm going to be using a\n 1765 03:38:34,728 --> 03:38:43,670 binary tree. And you'll see something interesting\n 1766 03:38:43,670 --> 03:38:51,500 it's our route. Then we go left, then we go\n 1767 03:38:51,500 --> 03:38:58,779 I was going left, I would push those on to\n 1768 03:38:58,779 --> 03:39:06,949 because when I call in order, the very first\n 1769 03:39:06,949 --> 03:39:15,140 I only print once I've traversed the entire\n 1770 03:39:15,139 --> 03:39:22,420 now one is a leaf node, then I've already\n 1771 03:39:22,420 --> 03:39:30,579 I can print the current value. Then I recurse\n 1772 03:39:30,579 --> 03:39:39,978 threes left subtree now we go right now and\n 1773 03:39:39,978 --> 03:39:51,010 recurse. Now I can print six because I've\n 1774 03:39:51,010 --> 03:40:01,318 then recurse then print 11. Now we need to\n 1775 03:40:01,318 --> 03:40:12,840 left, go left may explore 12 recurs and we're\n 1776 03:40:12,840 --> 03:40:22,719 14. 
Also, because 14 has no sub trees up.\n 1777 03:40:22,719 --> 03:40:34,159 stack, print 15, because we will explode 15th\n 1778 03:40:34,159 --> 03:40:42,350 last thing we need to do is finish our function\n 1779 03:40:42,350 --> 03:40:49,250 So go back up. And now, did you notice something\n 1780 03:40:49,250 --> 03:41:00,119 traversal? Well, what happened was we printed\n 1781 03:41:00,119 --> 03:41:05,819 which is why it's called an inorder traversal.\n 1782 03:41:05,818 --> 03:41:11,818 the values in increasing order, which is really\n 1783 03:41:11,818 --> 03:41:19,709 inorder traversal. Now, let's look at the\n 1784 03:41:19,709 --> 03:41:24,459 says, okay, traverse the left subtree, then\n 1785 03:41:24,459 --> 03:41:31,599 done doing both of those, only then print\n 1786 03:41:31,600 --> 03:41:38,550 look at our tree right now, the last value\n 1787 03:41:38,550 --> 03:41:46,350 we need to process a lemon's entire left subtree\n 1788 03:41:46,350 --> 03:41:54,988 So let's start at 11 and explore its left\n 1789 03:41:54,988 --> 03:42:01,158 one, because we've explored both its left\n 1790 03:42:01,158 --> 03:42:06,238 we haven't explored its right subtree yet,\n 1791 03:42:06,238 --> 03:42:12,448 trees which don't exist. Now we can print\n 1792 03:42:12,449 --> 03:42:20,739 trees. And then similarly, they go down to\n 1793 03:42:20,738 --> 03:42:30,199 don't print 11, because we still need to do\n 1794 03:42:30,200 --> 03:42:38,590 Go up to 13, print 14, go back up to 13. And\n 1795 03:42:38,590 --> 03:42:47,510 we haven't explored all of its right subtree\n 1796 03:42:47,510 --> 03:42:54,840 the stack and print on the way back up. And\n 1797 03:42:54,840 --> 03:43:02,170 node we have visited. And that's pre order\n 1798 03:43:02,170 --> 03:43:07,920 to look at level order traversal, which is\n 1799 03:43:07,920 --> 03:43:15,699 other two, a level order traversal is we want\n 1800 03:43:15,699 --> 03:43:24,930 start with 11. 
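The three depth-first traversals differ only in where the "print" happens relative to the two recursive calls, which a short sketch makes visible. The tree here is a small made-up binary search tree rooted at 11, not the exact tree from the slides, and the class name is an assumption; values are collected into a list instead of being printed so the order is easy to inspect.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: pre-, in- and post-order differ only in where the visit happens.
class DepthFirstTraversalSketch {
    static final class Node {
        final int value;
        final Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static void preOrder(Node n, List<Integer> out) {
        if (n == null) return;
        out.add(n.value);          // visit before the two recursive calls
        preOrder(n.left, out);
        preOrder(n.right, out);
    }

    static void inOrder(Node n, List<Integer> out) {
        if (n == null) return;
        inOrder(n.left, out);
        out.add(n.value);          // visit between the two recursive calls
        inOrder(n.right, out);
    }

    static void postOrder(Node n, List<Integer> out) {
        if (n == null) return;
        postOrder(n.left, out);
        postOrder(n.right, out);
        out.add(n.value);          // visit after the two recursive calls
    }

    // Hypothetical tree: 11 has children 6 and 15; 6 has 3 and 8; 15 has 13 and 17.
    static Node sample() {
        Node six = new Node(6, new Node(3, null, null), new Node(8, null, null));
        Node fifteen = new Node(15, new Node(13, null, null), new Node(17, null, null));
        return new Node(11, six, fifteen);
    }

    public static void main(String[] args) {
        List<Integer> pre = new ArrayList<>();
        List<Integer> in = new ArrayList<>();
        List<Integer> post = new ArrayList<>();
        preOrder(sample(), pre);
        inOrder(sample(), in);
        postOrder(sample(), post);
        System.out.println(pre);
        System.out.println(in);    // increasing order, because the sample is a binary search tree
        System.out.println(post);  // the root comes last
    }
}
```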
They want to print six and\n 1801 03:43:24,930 --> 03:43:35,158 And you're like oh, how am I going to do that.\n 1802 03:43:35,158 --> 03:43:40,939 is by doing something called a breadth first\n 1803 03:43:40,939 --> 03:43:46,420 to the leaf node. So she knows what a breadth\n 1804 03:43:46,420 --> 03:43:53,658 the same thing, a tree is a type of graph,\n 1805 03:43:53,658 --> 03:43:58,579 do to do our breadth first search is we're\n 1806 03:43:58,579 --> 03:44:05,430 left to explore. And how it's going to work\n 1807 03:44:05,430 --> 03:44:13,639 only the root node. And we're going to keep\n 1808 03:44:13,639 --> 03:44:24,199 in our queue until our queue is over. A bit\n 1809 03:44:24,200 --> 03:44:32,210 queue. On the right, I've inserted node 11.\n 1810 03:44:32,209 --> 03:44:37,719 add elevens left child and right child to\n 1811 03:44:37,719 --> 03:44:44,988 the queue. And I've also removed 11. So so\n 1812 03:44:44,988 --> 03:44:55,368 by 15. And then I would keep adding children\n 1813 03:44:55,369 --> 03:45:04,040 So let's have a look. So I've pulled 11 from\n 1814 03:45:04,040 --> 03:45:12,279 Now the next thing on the top of the queue\n 1815 03:45:12,279 --> 03:45:22,359 three and eight the queue, then 15. Next up,\n 1816 03:45:22,359 --> 03:45:31,380 queue, and next up in the queues three. So\n 1817 03:45:31,379 --> 03:45:39,148 and then move on, explore eight, eight has\n 1818 03:45:39,148 --> 03:45:47,358 As you can see that as I'm exploring nodes,\n 1819 03:45:47,359 --> 03:45:55,350 most recent thing in the queue. 
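The queue-based procedure for level order traversal (seed the queue with the root, then repeatedly pull a node and add its children) could be sketched like this. The class name and sample tree are made up for illustration, and java.util.ArrayDeque is used as the queue, which is an implementation choice of this sketch rather than something stated in the video.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Illustrative sketch of level order traversal via breadth first search.
class LevelOrderSketch {
    static final class Node {
        final int value;
        final Node left, right;
        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    static List<Integer> levelOrder(Node root) {
        List<Integer> out = new ArrayList<>();
        if (root == null) return out;
        Queue<Node> queue = new ArrayDeque<>();
        queue.offer(root);                       // the queue starts with only the root
        while (!queue.isEmpty()) {
            Node node = queue.poll();            // take the oldest node still waiting
            out.add(node.value);
            if (node.left != null) queue.offer(node.left);    // children go to the back of the queue
            if (node.right != null) queue.offer(node.right);
        }
        return out;
    }

    // Hypothetical tree: 11 has children 6 and 15; 6 has 3 and 8; 15 has 13 and 17.
    static Node sample() {
        Node six = new Node(6, new Node(3, null, null), new Node(8, null, null));
        Node fifteen = new Node(15, new Node(13, null, null), new Node(17, null, null));
        return new Node(11, six, fifteen);
    }

    public static void main(String[] args) {
        System.out.println(levelOrder(sample()));
    }
}
```

Because a queue hands nodes back in the order they were added, all nodes of one layer are processed before any node of the next layer.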
And this gives\n 1820 03:45:55,350 --> 03:46:01,500 And this is how you do a breadth first search\n 1821 03:46:01,500 --> 03:46:09,430 we had to use that special trick of using\n 1822 03:46:09,430 --> 03:46:17,550 do level order traversals recursively, we\n 1823 03:46:17,549 --> 03:46:22,679 Okay, finally time to look at some source\n 1824 03:46:23,680 --> 03:46:29,568 code I'm about to show you can be found at\n 1825 03:46:29,568 --> 03:46:37,059 in the description at the bottom of this video,\n 1826 03:46:37,059 --> 03:46:45,859 can also find it more easily. And now let's\n 1827 03:46:45,859 --> 03:46:55,939 source code for the binary search tree. The\n 1828 03:46:55,939 --> 03:47:05,648 thing you will notice is I have a class representing\n 1829 03:47:05,648 --> 03:47:12,948 anything that is comparable, we need things\n 1830 03:47:12,949 --> 03:47:21,869 them accordingly within the binary search\n 1831 03:47:21,869 --> 03:47:28,300 variables, actually only two, in fact, one\n 1832 03:47:28,299 --> 03:47:35,219 binary search tree, and another one, which\n 1833 03:47:35,219 --> 03:47:48,129 this binary search tree is a rooted tree.\n 1834 03:47:48,129 --> 03:47:56,429 a left node and a right node, as well as some\n 1835 03:47:56,430 --> 03:48:07,380 here. So it's some comparable type T. Okay,\n 1836 03:48:07,379 --> 03:48:13,589 search tree is empty. It simply checks if\n 1837 03:48:13,590 --> 03:48:21,100 the node count, which gets either incremented\n 1838 03:48:21,100 --> 03:48:30,180 Okay, here's the public method to add elements\n 1839 03:48:30,180 --> 03:48:37,568 private method down here, as you will notice.\n 1840 03:48:37,568 --> 03:48:45,868 business and the public method to just check\n 1841 03:48:45,869 --> 03:48:54,869 the binary search tree. This insertion method\n 1842 03:48:54,869 --> 03:48:59,760 a new element into the binary search tree\n 1843 03:48:59,760 --> 03:49:05,949 something inside the binary search tree. 
So\n 1844 03:49:05,949 --> 03:49:14,050 search tree, okay, so supposing this branch\n 1845 03:49:14,049 --> 03:49:19,799 does not already exist in the tree, then we're\n 1846 03:49:19,799 --> 03:49:27,269 add this new element to the binary search\n 1847 03:49:27,270 --> 03:49:34,460 by one and return true because well, this\n 1848 03:49:34,459 --> 03:49:46,459 at the recursive method now. So our base case\n 1849 03:49:46,459 --> 03:49:56,769 insert our element at so we will create a\n 1850 03:49:56,770 --> 03:50:05,699 with the value of the element, we want insert\n 1851 03:50:05,699 --> 03:50:14,390 which sub tree we want to place our element\n 1852 03:50:14,389 --> 03:50:22,680 subtree of the first branch, or the right\n 1853 03:50:22,680 --> 03:50:30,648 look at removing. So here's the public method\n 1854 03:50:30,648 --> 03:50:38,059 the method if it exists within the tree. So\n 1855 03:50:38,059 --> 03:50:43,510 it is going to return false, meaning we have\n 1856 03:50:43,510 --> 03:50:51,228 string. And if it is contained, I'm also going\n 1857 03:50:51,228 --> 03:51:04,799 at this recursive method to remove the node.\n 1858 03:51:04,799 --> 03:51:14,090 base case, it's now returned now. And in the\n 1859 03:51:14,090 --> 03:51:20,329 to find it. And we know it exists. Because\n 1860 03:51:20,329 --> 03:51:28,738 within the tree so we can remove it. And that\n 1861 03:51:28,738 --> 03:51:36,489 Phase I was talking about in the later video.\n 1862 03:51:36,489 --> 03:51:44,430 if it's less than, so we're going in the left\n 1863 03:51:44,430 --> 03:51:51,908 It's going to be one of these two cases, otherwise,\n 1864 03:51:51,908 --> 03:52:01,219 finds the node. And here's where we do the\n 1865 03:52:01,219 --> 03:52:08,929 in my slides. But in fact, you can think of\n 1866 03:52:08,930 --> 03:52:16,420 because two the cases are very similar. 
So\n 1867 03:52:16,420 --> 03:52:24,978 node, it's really that that can also be thought\n 1868 03:52:24,978 --> 03:52:33,750 case is the case where the left subtree is\n 1869 03:52:33,750 --> 03:52:43,430 case, the right subtree is now but the left\n 1870 03:52:43,430 --> 03:52:52,488 I'll get to later, we have both subsidiaries.\n 1871 03:52:52,488 --> 03:53:01,639 have a right subtree, then I'm going to say\n 1872 03:53:01,639 --> 03:53:14,028 the root node of that right subtree. So node\n 1873 03:53:14,029 --> 03:53:23,810 destroy the data within this node and the\n 1874 03:53:23,809 --> 03:53:32,988 where I only have a left subtree. What I'm\n 1875 03:53:32,988 --> 03:53:40,520 left subtree and grab the root node. And I'm\n 1876 03:53:40,520 --> 03:53:46,630 to destroy this node because we know we don't\n 1877 03:53:46,629 --> 03:53:52,988 easy. Now let's look at the key. So we have\n 1878 03:53:52,988 --> 03:54:03,020 subtree. So as I mentioned in my slides, we\n 1879 03:54:03,020 --> 03:54:13,250 subtree, or the smallest node in the right\n 1880 03:54:13,250 --> 03:54:24,100 in the right subtree. So I go down to the\n 1881 03:54:24,100 --> 03:54:32,020 the node or the successor node if you will.\n 1882 03:54:32,020 --> 03:54:39,960 and call ourselves to remove the successor\n 1883 03:54:39,959 --> 03:54:46,019 in the left subtree then you can just uncomment\n 1884 03:54:46,020 --> 03:54:53,010 removing a nutshell and they also had these\n 1885 03:54:53,010 --> 03:55:00,579 dig right. Moving on, I also have this method\nthat checks 1886 03:55:00,579 --> 03:55:08,699 contains an element. So given an element return\n 1887 03:55:08,699 --> 03:55:15,510 is within this binary subtree. 
And this is\n 1888 03:55:15,510 --> 03:55:22,059 phase, if we reach a null node, we would definitely\n 1889 03:55:22,059 --> 03:55:27,760 get our comparator value, which is either\n 1890 03:55:27,760 --> 03:55:35,449 the left subtree, meaning this case, or greater\n 1891 03:55:35,449 --> 03:55:42,630 Or if we found the element, then that's zero\n 1892 03:55:42,629 --> 03:55:49,639 Just as a bonus, I also threw in a height\n 1893 03:55:49,639 --> 03:55:58,299 of the tree, it will do so in linear time,\n 1894 03:55:58,299 --> 03:56:09,159 method. And all this does is it's fairly simple.\n 1895 03:56:09,159 --> 03:56:16,818 return zero. Otherwise, we're going to return\n 1896 03:56:16,818 --> 03:56:22,549 the right subtree. Because one of our sub\n 1897 03:56:22,549 --> 03:56:28,129 that's going to be the one that we want the\n 1898 03:56:28,129 --> 03:56:36,609 add plus one. So this corresponds to a depth.\n 1899 03:56:36,610 --> 03:56:47,930 the height of the tree is, you want to go\n 1900 03:56:47,930 --> 03:56:59,109 created this method called traverse. And what\n 1901 03:56:59,109 --> 03:57:06,829 which is an enum type, I'm going to show\n 1902 03:57:06,829 --> 03:57:13,398 then I pick whichever order you give me and\n 1903 03:57:13,398 --> 03:57:19,858 I want to traverse. So if you tell me I want\n 1904 03:57:19,859 --> 03:57:28,630 order fashion, then I'm going to return you\n 1905 03:57:28,629 --> 03:57:36,420 want to traverse the tree in order for it\n 1906 03:57:36,420 --> 03:57:47,879 Let's have a look at what this tree traversal\n 1907 03:57:47,879 --> 03:57:56,279 So that is simply an enum type you can see\n 1908 03:57:56,280 --> 03:58:06,390 things, it's pre order, in order, post order.\n 1909 03:58:06,389 --> 03:58:14,648 these traversals iteratively. 
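The height computation described above (zero for a null node, otherwise the larger of the two subtree heights plus one) is short enough to sketch directly. The class name and the two sample shapes are assumptions for illustration; the nodes carry no values because only the shape matters for the height.

```java
// Illustrative sketch of the recursive height computation; shapes below are made up.
class BstHeightSketch {
    static final class Node {
        final Node left, right;
        Node(Node left, Node right) {
            this.left = left;
            this.right = right;
        }
    }

    // Visits every node once, so the running time is linear in the number of nodes.
    static int height(Node node) {
        if (node == null) return 0;                                  // empty subtree
        return Math.max(height(node.left), height(node.right)) + 1;  // taller subtree, plus this node
    }

    static Node leaf() {
        return new Node(null, null);
    }

    public static void main(String[] args) {
        // Seven nodes arranged in three full levels.
        Node balanced = new Node(new Node(leaf(), leaf()), new Node(leaf(), leaf()));
        // Four nodes arranged as a line of right children.
        Node line = new Node(null, new Node(null, new Node(null, leaf())));
        System.out.println(height(balanced));
        System.out.println(height(line));
    }
}
```

With this convention a single node has height 1; some texts count edges instead, which would give 0 for a single node.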
So that you\n 1910 03:58:14,648 --> 03:58:20,340 be slightly slower, and perhaps less convenient,\n 1911 03:58:20,340 --> 03:58:26,648 want to dive into the code because it is fairly\n 1912 03:58:26,648 --> 03:58:33,109 traversal, then it does look pretty gross,\n 1913 03:58:33,110 --> 03:58:39,569 have to check for concurrent modification\n 1914 03:58:39,568 --> 03:58:45,949 great interview questions like how do i do\n 1915 03:58:45,950 --> 03:58:53,020 I do a post order traversal, iteratively,\n 1916 03:58:53,020 --> 03:58:59,390 great to practice. If you want to actually\n 1917 03:58:59,389 --> 03:59:09,469 go on the GitHub repository and have a look.\n 1918 03:59:09,469 --> 03:59:16,108 in the last slides. So you should be good\n 1919 03:59:16,109 --> 03:59:22,510 Anyways, I just want to be a bit more fancy\n 1920 03:59:22,510 --> 03:59:24,639 it for the binary search tree. 1921 03:59:24,639 --> 03:59:30,689 Today we're going to be talking about hash\n 1922 03:59:30,689 --> 03:59:38,550 of all times, if I had a subtitle, it would\n 1923 03:59:38,549 --> 03:59:43,519 let's get started. There's going to be a lot\n 1924 03:59:43,520 --> 03:59:48,729 We're gonna start off with with a hash table\n 1925 03:59:48,728 --> 03:59:55,760 why do we need one. Then we're going to talk\n 1926 03:59:55,760 --> 04:00:00,719 In particular, separate chaining, open addressing\n 1927 04:00:00,719 --> 04:00:06,448 there's a ton more, there, we're going to\n 1928 04:00:06,449 --> 04:00:12,220 is a really popular implementation. And there's\n 1929 04:00:12,219 --> 04:00:16,679 covered. So we're going to be talking about\n 1930 04:00:16,680 --> 04:00:21,300 that's done. I'm not even giving lots and\n 1931 04:00:21,299 --> 04:00:28,250 not super obvious how they work. And to finish\n 1932 04:00:28,250 --> 04:00:34,399 finally removing elements from the open addressing\n 1933 04:00:34,399 --> 04:00:39,148 All right. So to begin with, what is a hash\ntable. 
1934 04:00:40,369 --> 04:00:47,270 a data structure that lets us construct a\n 1935 04:00:47,270 --> 04:00:56,590 a technique called hashing, which we'll talk\n 1936 04:00:56,590 --> 04:01:06,119 as long as it's unique, which maps to a value.\n 1937 04:01:06,119 --> 04:01:12,529 names, and the value could be someone's favorite\n 1938 04:01:12,529 --> 04:01:19,800 Mike is this purple, Katherine's is yellow,\n 1939 04:01:19,799 --> 04:01:25,438 so each key is associated with the value.\n 1940 04:01:25,439 --> 04:01:31,350 don't have to be unique. For example, Micah's\n 1941 04:01:31,350 --> 04:01:41,239 favorite color is purple. We often use hash\n 1942 04:01:41,239 --> 04:01:48,520 Below is a frequency table of the number of\n 1943 04:01:48,520 --> 04:01:56,850 Hamlet, which I Parson obtained this table.\n 1944 04:01:56,850 --> 04:02:04,109 word, Lord 223. But the word cabbage did not\n 1945 04:02:04,109 --> 04:02:13,090 track frequencies, which is really handy.\n 1946 04:02:13,090 --> 04:02:19,380 any key value pairs given that the keys are\n 1947 04:02:19,379 --> 04:02:25,519 about shortly. So to be able to understand\n 1948 04:02:25,520 --> 04:02:33,251 table, we need to understand what a hash function\n 1949 04:02:33,251 --> 04:02:41,639 will denote from now on as h of x, for some\n 1950 04:02:41,639 --> 04:02:47,158 number in some fixed range. So that's pretty\n 1951 04:02:47,158 --> 04:02:54,118 hash function. So if our hash function is\n 1952 04:02:54,119 --> 04:03:01,880 6x, plus nine modulo 10, well, all integer\n 1953 04:03:01,879 --> 04:03:09,778 inclusive. No matter what integer number I\n 1954 04:03:09,779 --> 04:03:17,790 certain values, or keys fuel into the hash\n 1955 04:03:17,790 --> 04:03:26,140 that the output is not unique. 
In that there\n 1956 04:03:26,139 --> 04:03:32,500 the hash function which yield the same value,\n 1957 04:03:32,500 --> 04:03:39,369 the same thing, and that's completely fine.\n 1958 04:03:39,369 --> 04:03:46,100 not just on integers, but on arbitrary objects,\n 1959 04:03:46,100 --> 04:03:52,870 suppose we have a string s, and we're gonna\n 1960 04:03:52,870 --> 04:04:05,590 defined below. So this hash function, all\n 1961 04:04:05,590 --> 04:04:13,460 characters within the string, and then at\n 1962 04:04:13,459 --> 04:04:22,039 any given string just output a number. And\n 1963 04:04:22,040 --> 04:04:30,890 simple inputs, so h of BB gets 66, or 66.\n 1964 04:04:30,889 --> 04:04:37,019 The empty string gets zero because we don't\n 1965 04:04:37,020 --> 04:04:43,489 effectively converted a string to a number.\n 1966 04:04:43,489 --> 04:04:52,189 you. Suppose we have a database of people\n 1967 04:04:52,189 --> 04:05:00,109 have an age, a name and a sex. I want you\n 1968 04:05:00,109 --> 04:05:07,479 going to be given a person object as an argument.\n 1969 04:05:07,478 --> 04:05:13,969 going to map a person to the set of values,\n 1970 04:05:13,969 --> 04:05:20,658 the video, just create any hash function that\n 1971 04:05:20,658 --> 04:05:28,250 so here's my attempt at creating such a hash\n 1972 04:05:28,250 --> 04:05:32,879 of possible hash functions we could define\n 1973 04:05:32,879 --> 04:05:39,170 And here's a simple one. So I'm gonna say,\n 1974 04:05:39,170 --> 04:05:44,939 whatever there is, that's the starting value\n 1975 04:05:44,939 --> 04:05:51,270 of the person's name, again, an arbitrary\n 1976 04:05:51,270 --> 04:05:58,180 males, and then mod by six. So as you can\n 1977 04:05:58,180 --> 04:06:02,630 defined. At this point, we're going to get\n 1978 04:06:02,629 --> 04:06:12,270 later. 
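A string hash of the kind described above, which adds up the character codes and reduces the total with a modulus, might be sketched as follows. The modulus is passed in as a parameter, and the value 50 used in main is an assumption of this sketch, since the transcript does not show which modulus the slide uses; the class and method names are likewise made up.

```java
// Illustrative sketch of a character-sum string hash; the modulus 50 is an assumption.
class StringHashSketch {
    // Adds up the character codes of s, keeping the running sum in [0, modulus).
    static int hash(String s, int modulus) {
        int sum = 0;
        for (int i = 0; i < s.length(); i++) {
            sum = (sum + s.charAt(i)) % modulus;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(hash("BB", 50));  // 'B' is 66, so (66 + 66) mod 50
        System.out.println(hash("", 50));    // no characters to add
        System.out.println(hash("AB", 50));
        System.out.println(hash("BA", 50));  // same characters as "AB", so the same hash
    }
}
```

Because addition ignores order, any two strings made of the same characters collide under this function, which is one reason real string hashes weight each position differently.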
So this particular hash function yields\n 1979 04:06:12,271 --> 04:06:18,770 very, very important that we need to talk\n 1980 04:06:18,770 --> 04:06:27,310 if we have a hash function, and two objects,\n 1981 04:06:27,309 --> 04:06:37,618 then those two objects might be equal. We\n 1982 04:06:37,619 --> 04:06:45,300 check x against y. Yes, the hash function\n 1983 04:06:45,299 --> 04:06:52,978 there. But if the hash functions are not equal,\n 1984 04:06:52,978 --> 04:06:59,609 a good question. How can we use this to our\n 1985 04:06:59,610 --> 04:07:07,540 The answer is that instead of comparing X\n 1986 04:07:07,540 --> 04:07:16,260 hash values, and first let's compare the hash\n 1987 04:07:16,260 --> 04:07:22,658 and Y explicitly, and in this next example,\n 1988 04:07:22,658 --> 04:07:29,359 the problem of trying to determine if two\n 1989 04:07:29,359 --> 04:07:35,100 is something you want to do in an operating\n 1990 04:07:35,100 --> 04:07:42,790 the hash values for file one and file two,\n 1991 04:07:42,790 --> 04:07:50,211 to see if they match or don't match, because\n 1992 04:07:50,210 --> 04:07:58,778 super fast. So if possible, we don't want\n 1993 04:07:58,779 --> 04:08:05,060 X and Y directly or file one against file\n 1994 04:08:05,059 --> 04:08:13,409 per byte if their hash values are equal. So\n 1995 04:08:13,409 --> 04:08:20,510 more sophisticated than those that we use\n 1996 04:08:20,510 --> 04:08:30,000 hashing algorithms called cryptographic hash\n 1997 04:08:30,000 --> 04:08:37,260 property of hash functions is that they must\n 1998 04:08:37,260 --> 04:08:46,510 if h of x produces a y, then h of x must always,\n 1999 04:08:46,510 --> 04:08:54,880 value. This is super critical, because this\n 2000 04:08:54,879 --> 04:09:02,769 this happens, we do not want this. 
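The idea of comparing hash values first and only comparing the contents explicitly when the hashes match can be sketched with byte arrays standing in for file contents. This is an illustration under assumptions: the class name is made up, java.util.Arrays.hashCode is used as a simple stand-in for whatever hash a real file comparison would use, and in practice the hashes would be computed once and stored rather than recomputed on every comparison.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch: compare cheap hash values first, contents only if the hashes match.
class HashFirstEqualitySketch {
    static boolean sameContent(byte[] a, byte[] b) {
        if (Arrays.hashCode(a) != Arrays.hashCode(b)) {
            return false;             // different hashes: the contents are certainly different
        }
        return Arrays.equals(a, b);   // equal hashes: contents might be equal, so check explicitly
    }

    static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(sameContent(bytes("hello"), bytes("hello")));
        System.out.println(sameContent(bytes("hello"), bytes("world")));
        // "Aa" and "BB" happen to collide under this hash, so only the explicit check tells them apart.
        System.out.println(Arrays.hashCode(bytes("Aa")) == Arrays.hashCode(bytes("BB")));
        System.out.println(sameContent(bytes("Aa"), bytes("BB")));
    }
}
```

The "Aa" and "BB" pair shows why equal hashes only mean the inputs might be equal: the shortcut can rule equality out, but it can never confirm it on its own.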
So an example\n 2001 04:09:02,770 --> 04:09:10,390 that introduces say, a global variable or\n 2002 04:09:10,389 --> 04:09:16,198 So in this particular hash function, the first\n 2003 04:09:16,199 --> 04:09:21,380 six. But then if I call it again, we've incremented\n 2004 04:09:21,379 --> 04:09:30,488 Oh, not good. Something else but hash functions\n 2005 04:09:30,488 --> 04:09:35,139 To minimize the number of hash collisions,\n 2006 04:09:35,139 --> 04:09:41,439 shortly. But a hash collision is when two\n 2007 04:09:41,439 --> 04:09:52,479 is h of x equals h of y. And the reason why\n 2008 04:09:52,478 --> 04:10:00,698 hash function are hit, so that we can fill\n 2009 04:10:00,699 --> 04:10:10,970 Table we generated earlier, William and key\n 2010 04:10:10,969 --> 04:10:16,129 able to answer a central question about the\n 2011 04:10:16,129 --> 04:10:24,969 our hash table. So what makes a key of type\n 2012 04:10:24,969 --> 04:10:31,049 going to implement our hash table using these\n 2013 04:10:33,280 --> 04:10:40,430 And to enforce that behavior, you'll see a\n 2014 04:10:40,430 --> 04:10:48,460 the keys you use, be immutable, meaning you\n 2015 04:10:48,459 --> 04:10:55,429 constants. So they're things like immutable\n 2016 04:10:55,430 --> 04:11:03,189 or lists or things you can add or remove things\n 2017 04:11:03,189 --> 04:11:10,069 condition, and we have a hash function that\n 2018 04:11:10,069 --> 04:11:19,199 say that, that that type is hashable. Okay.\n 2019 04:11:19,199 --> 04:11:27,609 details. How does the hash table work? Well,\n 2020 04:11:27,609 --> 04:11:35,578 really quick insertion look up, and the removal\n 2021 04:11:35,578 --> 04:11:44,029 that using the hash function as a way to index\n 2022 04:11:44,029 --> 04:11:49,649 a fancy word for an array. 
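A hash function that depends on hidden, changing state is the kind of non-deterministic function warned about above. The sketch below contrasts one with a deterministic counterpart; the class names, the counter and the modulus 13 are all made up for illustration and are not taken from the slide.

```java
// Illustrative sketch of a non-deterministic hash next to a deterministic one.
class HashDeterminismSketch {
    // Broken on purpose: the result depends on how many times hash was called before.
    static final class DriftingHasher {
        private int counter = 0;

        int hash(int x) {
            counter++;                                  // hidden state changes on every call
            return Math.floorMod(x + counter, 13);
        }
    }

    // Deterministic: the result depends only on the input.
    static int stableHash(int x) {
        return Math.floorMod(x, 13);
    }

    public static void main(String[] args) {
        DriftingHasher drifting = new DriftingHasher();
        System.out.println(drifting.hash(5));   // first call
        System.out.println(drifting.hash(5));   // same input, different result
        System.out.println(stableHash(5) == stableHash(5));
    }
}
```

A table built on the drifting version would store a key under one index and then look for it under another, so lookups would miss entries that are actually present.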
Think of it like\n 2023 04:11:49,648 --> 04:11:58,238 So again, we use the hash function as a way\n 2024 04:11:58,238 --> 04:12:05,250 function gives us an index to go look up inside\n 2025 04:12:05,250 --> 04:12:14,379 time. Given that we have a uniform hash function,\n 2026 04:12:14,379 --> 04:12:21,379 on the right as an indexable block of memory.\n 2027 04:12:21,379 --> 04:12:28,670 entries with the hash function, h of x. So\n 2028 04:12:28,670 --> 04:12:35,238 string key value pairs into the table that\n 2029 04:12:35,238 --> 04:12:41,420 their usernames, say in an online programming\n 2030 04:12:41,420 --> 04:12:49,090 function, I chose x squared plus three mod 10.\n 2031 04:12:49,090 --> 04:12:58,790 insert the key value pair, three, byte eater,\n 2032 04:12:58,790 --> 04:13:06,659 is byte eater got a rank of three, and we want\n 2033 04:13:06,658 --> 04:13:13,670 we do is we hash the key, which is three,\n 2034 04:13:13,670 --> 04:13:21,350 plus three mod 10, which is two, so in our hash\n 2035 04:13:21,350 --> 04:13:29,779 put the key to be three, and the value to be byte\n 2036 04:13:29,779 --> 04:13:36,680 say, which is usually what I call myself in\n 2037 04:13:36,680 --> 04:13:44,540 one. And if we hash one, then we get four.\nSo at index four 2038 04:13:44,540 --> 04:13:54,100 then we're going to put the key in the value\n 2039 04:13:54,100 --> 04:14:02,250 want to insert Lauren 425, we've got rank\n 2040 04:14:02,250 --> 04:14:11,238 table where she goes, then we can do the same\n 2041 04:14:11,238 --> 04:14:17,789 insert orange Knight, again begin by hashing\n 2042 04:14:17,790 --> 04:14:22,100 keep doing this, you keep filling the table,\n 2043 04:14:22,100 --> 04:14:29,500 Now, in the event that we want to do a lookup,\n 2044 04:14:29,500 --> 04:14:37,180 to do is compute the hash value for R and\n 2045 04:14:37,180 --> 04:14:42,658 this user is. So say I want to find the user\nwith rank 10. 2046 04:14:42,658 --> 04:14:50,139 And I hash 10. 
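The rank-to-username example can be sketched in Python as follows. The hash function h(x) = (x² + 3) mod 10 and the ranks 3, 1 and 10 come from the narration; the usernames are placeholders (the real ones are garbled in the subtitles), the table size of 10 is assumed from the modulus, and collisions are deliberately ignored here.

```python
def h(x: int) -> int:
    """Hash function from the example: x squared plus three, mod 10."""
    return (x * x + 3) % 10


table = [None] * 10  # the hash table is just an indexable array


def insert(rank: int, username: str) -> None:
    # The hash value of the key is the index where the pair is stored.
    table[h(rank)] = (rank, username)


def lookup(rank: int):
    entry = table[h(rank)]
    if entry is not None and entry[0] == rank:
        return entry[1]
    return None  # nothing stored for this key


insert(3, "user_rank_3")    # placeholder username, stored at index 2
insert(1, "user_rank_1")    # placeholder username, stored at index 4
insert(10, "user_rank_10")  # placeholder username, stored at index 3
```

Looking up rank 10 hashes to the same index that was used on insertion, so no scan of the table is needed.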
Figure out its index is three\n 2047 04:14:53,510 --> 04:15:01,770 However, how do we handle hash collisions?\n 2048 04:15:01,770 --> 04:15:09,279 and eight have the same value. This is problematic.\n 2049 04:15:09,279 --> 04:15:17,290 them inside our table, they would index into\n 2050 04:15:17,290 --> 04:15:23,710 we use one of many hash collision resolution\n 2051 04:15:23,709 --> 04:15:29,469 but the two most popular ones are separate\n 2052 04:15:29,469 --> 04:15:35,908 separate chaining is that we deal with hash\n 2053 04:15:35,908 --> 04:15:42,109 usually a link lists to hold all the different\n 2054 04:15:42,110 --> 04:15:49,399 value. Open addressing is slightly different.\n 2055 04:15:49,398 --> 04:15:55,898 finding other places in the hash table offset\n 2056 04:15:55,898 --> 04:16:00,959 So an open addressing, everything is kept\n 2057 04:16:00,959 --> 04:16:07,099 you have multiple auxiliary data structures.\n 2058 04:16:07,100 --> 04:16:13,050 actually pretty remarkable. In fact, it's\n 2059 04:16:13,049 --> 04:16:20,429 time insertion, removal and search. But if\n 2060 04:16:20,430 --> 04:16:27,279 uniform, then you can get linear time, which\n 2061 04:16:27,279 --> 04:16:35,140 something super, super cool. And that is hash\n 2062 04:16:35,139 --> 04:16:43,590 dive right in what is separate chaining. Separate\n 2063 04:16:43,590 --> 04:16:51,408 resolution techniques. And how it works is\n 2064 04:16:51,408 --> 04:16:56,430 keys hash of the same value, we need to have\n 2065 04:16:56,430 --> 04:17:01,540 hash table. So it's still functional. Or what\n 2066 04:17:01,540 --> 04:17:09,319 auxilary data structure to essentially hold\n 2067 04:17:09,318 --> 04:17:17,209 and look up inside that bucket or that data\n 2068 04:17:17,209 --> 04:17:23,898 for. 
And usually, we use the length list for\n 2069 04:17:23,898 --> 04:17:29,978 lists, we can use arrays, binary trees, self\n 2070 04:17:29,978 --> 04:17:36,948 Okay, so suppose we have the following hash\n 2071 04:17:36,949 --> 04:17:49,460 of key value pairs of age and names. And we\n 2072 04:17:49,459 --> 04:17:55,509 been computed with some hash function. So\n 2073 04:17:55,510 --> 04:18:01,250 For now, we're just going to see how we can\n 2074 04:18:01,250 --> 04:18:09,850 Okay, so on the left is our hash table. So\n 2075 04:18:09,850 --> 04:18:17,649 of these key value pairs into this hash table\n 2076 04:18:17,648 --> 04:18:27,778 easy it is. Okay, so our first person is will\n 2077 04:18:27,779 --> 04:18:36,550 going to put in, in that slot three, Leah,\n 2078 04:18:36,549 --> 04:18:43,039 we're going to put her at index four. So the\n 2079 04:18:43,040 --> 04:18:52,260 rig age 61 hash two and put them there. And\n 2080 04:18:52,260 --> 04:18:58,350 get a little bit full in our hash table here.\n 2081 04:18:58,350 --> 04:19:06,600 Okay. Lera, age 34, hash to four, but we say\n 2082 04:19:06,600 --> 04:19:16,949 we do? Well, in separate chaining, we just\n 2083 04:19:16,949 --> 04:19:25,180 in the array is actually a linked list data\n 2084 04:19:25,180 --> 04:19:32,300 the link list and see if alera exists and\n 2085 04:19:32,299 --> 04:19:36,278 add lira at the very end of the chain. 2086 04:19:38,158 --> 04:19:45,469 so Ryan also hash to one but then we look\n 2087 04:19:45,469 --> 04:19:59,579 add a new entry at position one. alera age\n 2088 04:19:59,579 --> 04:20:08,068 exists. So in our hash table, so we're good.\n 2089 04:20:08,068 --> 04:20:17,769 to update it. So fin age 21, hash to three.\n 2090 04:20:17,770 --> 04:20:23,158 hash to three. So what we're going to do is\n 2091 04:20:23,158 --> 04:20:30,680 length list chain. 
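The separate chaining insertions above can be replayed with a short Python sketch. The names and precomputed hash values follow the narration; the ages for Rick, Lera and Finn are the ones stated, while the other ages and the table size of 5 are placeholders assumed for illustration. Python lists stand in for the linked lists.

```python
# Hash values as given in the walkthrough (they are precomputed there too).
HASHES = {"Will": 3, "Leah": 4, "Rick": 2, "Lera": 4,
          "Ryan": 1, "Finn": 3, "Mark": 4}

buckets = [[] for _ in range(5)]  # assumed size; one chain per slot


def put(name: str, age: int) -> None:
    chain = buckets[HASHES[name]]
    for entry in chain:
        if entry[0] == name:   # key already in the chain: update the value
            entry[1] = age
            return
    chain.append([name, age])  # otherwise append at the end of the chain


put("Will", 0)    # placeholder age
put("Leah", 0)    # placeholder age
put("Rick", 61)
put("Lera", 34)   # collides with Leah at index 4, so it is chained
put("Ryan", 0)    # placeholder age
put("Lera", 34)   # already present: updated, not duplicated
put("Finn", 21)   # collides with Will at index 3
put("Mark", 0)    # placeholder age, appended after Leah and Lera
```

Because each entry stores the key as well as the value, entries that share a bucket can still be told apart.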
So note that even though\n 2092 04:20:30,680 --> 04:20:36,790 value, that is index three, and they have\n 2093 04:20:36,790 --> 04:20:44,010 we store both the key and the value as an\n 2094 04:20:44,010 --> 04:20:49,488 how we're able to tell them apart. Okay, now\n 2095 04:20:49,488 --> 04:20:56,750 has to four. So scan through the linked list\n 2096 04:20:56,750 --> 04:21:01,728 we have to append mark at the very end. All\n 2097 04:21:01,728 --> 04:21:10,760 do lookups in this structure. So it's basically\n 2098 04:21:10,760 --> 04:21:18,800 a name, we want to find what the person's\n 2099 04:21:18,799 --> 04:21:26,068 when we hash him, we get one. So we suspect\n 2100 04:21:26,068 --> 04:21:33,850 say a bucket, I just mean, whatever data structure\n 2101 04:21:33,850 --> 04:21:38,579 link lists. So if you scan this linked list\n 2102 04:21:38,579 --> 04:21:44,959 comparing the key. So we're comparing the\n 2103 04:21:44,959 --> 04:21:50,868 a match. So keep going. Compare Ryan, Ryan,\n 2104 04:21:50,869 --> 04:21:58,210 inside in that entry, say, oh, his age is\n 2105 04:21:58,209 --> 04:22:05,209 the age of Mark hash mark. And since our hash\n 2106 04:22:05,209 --> 04:22:12,408 if there's a mark, then it's going to be found\n 2107 04:22:12,408 --> 04:22:21,439 bucket for scan through, oh, last one is a\n 2108 04:22:21,439 --> 04:22:28,488 that the value or the key looking forward\n 2109 04:22:28,488 --> 04:22:34,770 turn now. Okay, so here's a good question.\n 2110 04:22:34,770 --> 04:22:42,710 and lookup time complexity? If the hash table\n 2111 04:22:42,709 --> 04:22:48,809 list chains? Good question. And the answer\n 2112 04:22:48,809 --> 04:22:53,409 your hash table, you'll actually want to create\n 2113 04:22:53,409 --> 04:23:02,020 rehash all your items and re insert them into\n 2114 04:23:02,020 --> 04:23:10,828 fixed size. Okay, and another good question,\n 2115 04:23:10,828 --> 04:23:16,770 table with separate chaining? 
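A rough Python sketch of lookup and of growing the table when the chains get long. Integer keys that hash to themselves, the doubling policy, and the function names are all assumptions made to keep the example small.

```python
def bucket_index(key: int, capacity: int) -> int:
    return key % capacity  # assumed hash: integers hash to themselves


def put(buckets, key, value):
    chain = buckets[bucket_index(key, len(buckets))]
    for entry in chain:
        if entry[0] == key:
            entry[1] = value
            return
    chain.append([key, value])


def get(buckets, key):
    # Hash the key, then scan only the one bucket it can live in.
    for k, v in buckets[bucket_index(key, len(buckets))]:
        if k == key:
            return v
    return None  # key is not in the table


def grow(buckets):
    # Allocate a bigger table and rehash every item into it.
    bigger = [[] for _ in range(2 * len(buckets))]
    for chain in buckets:
        for key, value in chain:
            put(bigger, key, value)
    return bigger


small = [[] for _ in range(2)]
for key in (1, 3, 5, 7):
    put(small, key, f"value-{key}")
longest_before = max(len(chain) for chain in small)

large = grow(grow(small))  # capacity 2 -> 4 -> 8
longest_after = max(len(chain) for chain in large)
```

With capacity 2 all four keys share one chain; after rehashing into 8 buckets each key has its own, which is what restores the near constant-time behavior.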
Well, the answer\n 2116 04:23:16,770 --> 04:23:23,260 your key. And instead of doing a lookup while\n 2117 04:23:23,260 --> 04:23:30,828 length list, that's another question. How\n 2118 04:23:30,828 --> 04:23:37,119 to model the bucket behavior? Yes, of course.\n 2119 04:23:37,119 --> 04:23:43,319 linked lists include arrays, binary tree,\n 2120 04:23:43,318 --> 04:23:49,760 approach and our hash map. So once they get\n 2121 04:23:49,760 --> 04:23:57,010 a binary tree, or maybe a cell balance spanning\n 2122 04:23:57,010 --> 04:24:02,590 methods are a bit more memory intensive and\n 2123 04:24:02,590 --> 04:24:09,728 be less popular, but they might be a lot faster\n 2124 04:24:09,728 --> 04:24:14,858 All right, it's time to have a look at some\n 2125 04:24:14,859 --> 04:24:21,920 table. So here I am on my GitHub repository\n 2126 04:24:21,920 --> 04:24:27,309 find the source code here under the hash table\n 2127 04:24:27,309 --> 04:24:33,039 the hash table. Today, we're going to be looking\n 2128 04:24:33,040 --> 04:24:38,970 in the later videos, probably one of these\n 2129 04:24:38,969 --> 04:24:46,719 So let's dive into the code. I have it here\n 2130 04:24:46,719 --> 04:24:55,238 So first things first, I have two classes,\n 2131 04:24:55,238 --> 04:25:02,618 chaining hash table. So let's have a look\n 2132 04:25:02,619 --> 04:25:10,130 individual items or key value pairs, you would\n 2133 04:25:10,129 --> 04:25:20,309 So in Java, we have generics, so a generic\n 2134 04:25:20,309 --> 04:25:27,689 So when I create an entry, I give it the key\n 2135 04:25:27,689 --> 04:25:34,040 So there's a built in method in Java to compute\n 2136 04:25:34,040 --> 04:25:38,660 you can override it to specify the hash code\n 2137 04:25:38,659 --> 04:25:44,289 convenient. 
So compute the hash code and cache\n 2138 04:25:44,290 --> 04:25:49,720 don't have to re compute this thing multiple\n 2139 04:25:49,719 --> 04:25:55,049 for something like a string, hash code can\n 2140 04:25:55,049 --> 04:26:01,250 good. So here, I have an equals method, which\n 2141 04:26:01,250 --> 04:26:06,189 because I don't want to have to do any casting.\n 2142 04:26:06,189 --> 04:26:11,970 If the hashes are not equal, we know from\n 2143 04:26:11,969 --> 04:26:17,608 equal, so we can return false. Otherwise,\n 2144 04:26:17,609 --> 04:26:23,630 about it for the entry class. Very simple,\n 2145 04:26:23,629 --> 04:26:32,289 thing. So the hash table itself. Okay, so\n 2146 04:26:32,290 --> 04:26:40,579 holds three, three items, and the load factor\n 2147 04:26:40,578 --> 04:26:49,068 or be 0.75. So that's the maximum capacity\n 2148 04:26:49,068 --> 04:26:52,819 important instance variables, we need to go\n 2149 04:26:52,819 --> 04:27:00,799 the load factor goes above this value, then\n 2150 04:27:00,799 --> 04:27:07,429 so the actual maximum number of items that\n 2151 04:27:07,430 --> 04:27:15,328 So the threshold. So this is computed to be\n 2152 04:27:15,328 --> 04:27:22,449 us, hey, you're above the threshold to resize\n 2153 04:27:22,450 --> 04:27:32,510 in the table. And this is the table itself.\n 2154 04:27:32,510 --> 04:27:38,340 have entries, pretty simple. So there's a\n 2155 04:27:38,340 --> 04:27:45,648 table, just using the default settings with\n 2156 04:27:45,648 --> 04:27:54,409 load factor. So this is a designated constructor,\n 2157 04:27:54,409 --> 04:28:01,969 factor is compute default capacity. And make\n 2158 04:28:01,969 --> 04:28:08,028 and capacity just so that I know you don't\n 2159 04:28:08,029 --> 04:28:14,790 weird things happening if the capacity is\n 2160 04:28:14,790 --> 04:28:21,409 threshold and then finally initialize the\n 2161 04:28:21,408 --> 04:28:26,219 a look at all these methods right here. 
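The entry class and the default settings described here could look roughly like this in Python. The video's code is Java; this is only a sketch of the same ideas, and the names `Entry`, `equals`, `DEFAULT_CAPACITY` and `DEFAULT_LOAD_FACTOR` are assumptions modeled on the narration.

```python
class Entry:
    """One key-value pair, with the key's hash computed once and cached."""

    def __init__(self, key, value):
        self.key = key
        self.value = value
        self.hash = hash(key)  # cached so it is not recomputed on every use

    def equals(self, other: "Entry") -> bool:
        # Unequal hashes mean the keys cannot be equal: cheap early exit.
        if self.hash != other.hash:
            return False
        return self.key == other.key


DEFAULT_CAPACITY = 3
DEFAULT_LOAD_FACTOR = 0.75

# The table is resized once the item count goes above this value.
threshold = int(DEFAULT_CAPACITY * DEFAULT_LOAD_FACTOR)
```

Caching matters most for keys such as long strings, where computing the hash takes time proportional to the key's length.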
So\n 2162 04:28:26,219 --> 04:28:33,350 table empty, is the hash table empty. So this\n 2163 04:28:33,351 --> 04:28:41,109 normalized index. And it's used when you want\n 2164 04:28:41,109 --> 04:28:44,729 it says in the comments here, essentially,\n 2165 04:28:44,728 --> 04:28:52,459 hash value in a domain, zero to capacity.\n 2166 04:28:52,459 --> 04:28:59,809 can be anywhere in the domain of an integer,\n 2167 04:28:59,809 --> 04:29:09,889 to positive to the 31. around that. So what\n 2168 04:29:09,889 --> 04:29:16,250 sign from the hash value, and then modified\n 2169 04:29:16,250 --> 04:29:23,200 so we can actually use it as a lookup index.\n 2170 04:29:23,200 --> 04:29:32,449 how the table that's straightforward. Contains\n 2171 04:29:32,449 --> 04:29:36,890 going to do is compute given a key. So we\n 2172 04:29:36,889 --> 04:29:45,078 the hash table. Right? So we're going to do\n 2173 04:29:45,078 --> 04:29:55,289 then now give us the bucket index. So which\n 2174 04:29:55,290 --> 04:30:01,100 in a hash table? I'm just going to seek to\n 2175 04:30:01,100 --> 04:30:07,340 if the entry is not equal to no exists, if\n 2176 04:30:07,340 --> 04:30:16,369 for add an insert are all common names for\n 2177 04:30:16,369 --> 04:30:21,899 or updating a value inside of the hash table\n 2178 04:30:21,898 --> 04:30:27,250 something we absolutely don't want. So just\n 2179 04:30:27,250 --> 04:30:33,790 going to create a new entry, find the bucket\n 2180 04:30:33,790 --> 04:30:42,609 method we'll get to. Okay, get. So given a\n 2181 04:30:42,609 --> 04:30:48,670 that key. Again, though, allow no keys. And\n 2182 04:30:48,670 --> 04:30:54,969 don't want to find which bucket this particular\n 2183 04:30:54,969 --> 04:31:02,578 find the entry. assuming it's not no, then\n 2184 04:31:02,578 --> 04:31:13,250 value. If it is no, well, the key doesn't\n 2185 04:31:13,250 --> 04:31:21,068 the key now from the hash table. 
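The index normalization step described above (strip the sign, then take the value modulo the capacity) can be sketched in Python like this. The mask `0x7FFFFFFF` assumes 32-bit Java-style hash codes, and the function name is an assumption.

```python
def normalize_index(key_hash: int, capacity: int) -> int:
    """Map any hash value into the range [0, capacity)."""
    # Clearing the sign bit makes the value non-negative,
    # and the modulo brings it into the table's index range.
    return (key_hash & 0x7FFFFFFF) % capacity
```

For example, `normalize_index(10, 8)` gives 2, and a negative hash such as -17 still lands on a valid index.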
So he's not\n 2186 04:31:21,068 --> 04:31:28,028 call his private remove entry method, which\n 2187 04:31:28,029 --> 04:31:35,140 so which, which bucket does this keyboard\n 2188 04:31:35,139 --> 04:31:44,049 to seek for the entry inside the link list\n 2189 04:31:44,049 --> 04:31:51,750 we're going to extract the actual link list\n 2190 04:31:51,750 --> 04:31:58,420 in Java. So this removed from that link this\n 2191 04:31:58,420 --> 04:32:07,020 the actual value, that's all we have to do.\n 2192 04:32:07,020 --> 04:32:14,680 So insert bucket insert entry is a given a\n 2193 04:32:14,680 --> 04:32:22,859 inside of it. Okay. So first, since we know\n 2194 04:32:22,859 --> 04:32:27,790 automatically get the linked list structure.\n 2195 04:32:27,790 --> 04:32:36,540 have to create a new linked list. So we're\n 2196 04:32:36,540 --> 04:32:41,989 list data structures, which is good, because\n 2197 04:32:41,988 --> 04:32:50,469 to. So next up, I find the entry that already\n 2198 04:32:50,469 --> 04:32:58,618 an update, for instance. So if the existence\n 2199 04:32:58,619 --> 04:33:04,439 a new entry to the end of the blank last.\n 2200 04:33:04,438 --> 04:33:12,340 the size, and check if we're above the threshold,\n 2201 04:33:12,340 --> 04:33:18,069 now to indicate that there was no previous\n 2202 04:33:18,069 --> 04:33:24,370 So then just update the value in the existing\n 2203 04:33:26,330 --> 04:33:32,770 So seek entry this method we've been using\n 2204 04:33:32,770 --> 04:33:38,909 particular entry at a given bucket index,\n 2205 04:33:38,909 --> 04:33:45,778 They probably know what's going on by now.\n 2206 04:33:45,778 --> 04:33:52,708 bucket index. Otherwise return now if it doesn't\n 2207 04:33:52,708 --> 04:34:00,569 the entries in the linked list and compare\n 2208 04:34:00,569 --> 04:34:08,020 was a match, return that entry otherwise return\n 2209 04:34:08,020 --> 04:34:14,569 method called resize table. 
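Putting the pieces together, here is a compact Python sketch of the put, get and remove operations with a shared seek helper. It is not the Java code from the repository: Python lists replace the linked lists, resizing is left out to keep it short, and all method names are assumptions.

```python
class ChainedTable:
    def __init__(self, capacity: int = 3):
        self.capacity = capacity
        self.size = 0
        self.table = [None] * capacity  # buckets are created lazily

    def _index(self, key) -> int:
        return (hash(key) & 0x7FFFFFFF) % self.capacity

    def _seek(self, index: int, key):
        """Return the entry for key in the given bucket, or None."""
        bucket = self.table[index]
        if bucket is None:
            return None
        for entry in bucket:
            if entry[0] == key:
                return entry
        return None

    def put(self, key, value):
        if key is None:
            raise ValueError("null keys are not allowed")
        index = self._index(key)
        if self.table[index] is None:
            self.table[index] = []       # first entry for this bucket
        existing = self._seek(index, key)
        if existing is None:
            self.table[index].append([key, value])
            self.size += 1
            return None                  # there was no previous value
        old, existing[1] = existing[1], value
        return old                       # previous value that was replaced

    def get(self, key):
        entry = self._seek(self._index(key), key)
        return None if entry is None else entry[1]

    def remove(self, key):
        index = self._index(key)
        entry = self._seek(index, key)
        if entry is None:
            return None
        self.table[index].remove(entry)
        self.size -= 1
        return entry[1]
```

Every operation starts the same way: hash the key to find the bucket, then scan that one bucket for a matching key.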
So this resizes\n 2210 04:34:14,569 --> 04:34:22,420 the table. First we double the capacity. We\n 2211 04:34:22,420 --> 04:34:30,340 have a higher threshold because we have increasing\n 2212 04:34:30,340 --> 04:34:36,368 new capacity. So this new table is bigger\n 2213 04:34:36,368 --> 04:34:43,530 current table. Look for linkless data structures\n 2214 04:34:43,530 --> 04:34:51,458 loop through all these entries, calculate\n 2215 04:34:51,458 --> 04:35:02,409 insert it into this new table and after that\n 2216 04:35:02,409 --> 04:35:13,490 the old table at the end, set the table to\n 2217 04:35:13,490 --> 04:35:20,830 these last two methods, so just return all\n 2218 04:35:20,830 --> 04:35:28,270 fairly simple. And these last two methods\n 2219 04:35:28,270 --> 04:35:34,300 need to go over. So that's essentially, separate\n 2220 04:35:34,300 --> 04:35:42,169 with the link lists much more difficult to\n 2221 04:35:42,169 --> 04:35:47,618 or something like that. I'm pretty excited,\n 2222 04:35:47,618 --> 04:35:54,989 collision resolution technique for hash tables.\n 2223 04:35:54,990 --> 04:36:00,320 quick recap on hash tables so that everyone's\n 2224 04:36:00,319 --> 04:36:07,250 table is to construct a mapping from a set\n 2225 04:36:07,250 --> 04:36:14,708 to be hashable. Now, what we do is we define\n 2226 04:36:14,708 --> 04:36:22,278 into numbers, then we use the number obtained\n 2227 04:36:22,278 --> 04:36:29,147 into the array or the hash table. However,\n 2228 04:36:29,148 --> 04:36:38,260 time to time, we're going to have hash collisions,\n 2229 04:36:38,259 --> 04:36:44,458 So we need a way to resolve this and open\n 2230 04:36:44,458 --> 04:36:50,039 so what we're going to be using the open addressing\n 2231 04:36:50,039 --> 04:36:56,650 to keep in mind is the actual key value pairs\n 2232 04:36:56,650 --> 04:37:04,819 itself. So as opposed to say, an auxilary\n 2233 04:37:04,819 --> 04:37:12,359 method we saw in the last video. 
So this means\n 2234 04:37:12,359 --> 04:37:17,618 the hash tables, and how many elements are\n 2235 04:37:17,618 --> 04:37:22,329 there are too many elements inside of the\n 2236 04:37:22,330 --> 04:37:29,160 hard to find an open slot or a position to\n 2237 04:37:29,159 --> 04:37:34,169 of terminology, we say that the load factor\n 2238 04:37:34,169 --> 04:37:41,708 table and the size of the table. So this means\n 2239 04:37:41,708 --> 04:37:48,430 Here's a neat chart from Wikipedia. So on\n 2240 04:37:48,430 --> 04:37:53,900 methods. One of them is chaining, that is\n 2241 04:37:53,900 --> 04:38:01,459 open addressing technique. And we can see\n 2242 04:38:01,458 --> 04:38:09,298 it gets to a certain threshold, it gets exponentially\n 2243 04:38:09,298 --> 04:38:15,118 that say point eight mark. In fact, we're\n 2244 04:38:15,118 --> 04:38:20,868 usually. And what this says is, we always\n 2245 04:38:20,868 --> 04:38:27,649 by the Greek letter alpha, below a certain\n 2246 04:38:27,650 --> 04:38:36,730 our table once that threshold is met. Right,\n 2247 04:38:36,729 --> 04:38:42,791 into our hash table, here's what we do, we\n 2248 04:38:42,791 --> 04:38:47,789 on our key and we hash the value and this\n 2249 04:38:47,789 --> 04:38:54,969 table for where the key should go. But suppose\n 2250 04:38:54,969 --> 04:39:02,289 a key in that slot, well, we can't have two\n 2251 04:39:02,289 --> 04:39:12,719 work. So what we do is we use a probing sequence,\n 2252 04:39:12,719 --> 04:39:19,888 tell us where to go next. 
So we hashed to\n 2253 04:39:19,888 --> 04:39:27,548 And now we're going to probe along using this\n 2254 04:39:27,548 --> 04:39:36,298 going to eventually find an open spots along\n 2255 04:39:36,298 --> 04:39:43,189 an infinite amount of probing sequences to\n 2256 04:39:43,189 --> 04:39:53,479 have linear probing, which probes via a linear\n 2257 04:39:53,479 --> 04:40:02,029 when we're probing we start, usually x at\n 2258 04:40:02,029 --> 04:40:07,309 slot, then we just increment x by one. And\n 2259 04:40:07,310 --> 04:40:12,909 functions. for linear probing, we use a linear\n 2260 04:40:12,909 --> 04:40:19,387 function. And then there's double hashing,\n 2261 04:40:19,387 --> 04:40:26,968 is we define a secondary hash function on\n 2262 04:40:26,968 --> 04:40:32,878 inside the probing function. And the last\n 2263 04:40:32,878 --> 04:40:39,299 probing function that we can use. So given\n 2264 04:40:39,299 --> 04:40:46,610 seed it using the hash value of our key, which\n 2265 04:40:46,610 --> 04:40:53,750 to be the same thing. And then we can use\n 2266 04:40:53,750 --> 04:40:58,637 pretty neat, and increment by x each time\n 2267 04:40:58,637 --> 04:41:03,309 just getting the next number in the random\n 2268 04:41:03,310 --> 04:41:09,210 that. Alright, so here's a general insertion\n 2269 04:41:09,209 --> 04:41:18,797 a table of size n. And here's how the algorithm\n 2270 04:41:18,797 --> 04:41:25,989 x is a constant, or sorry, a variable that\n 2271 04:41:25,990 --> 04:41:34,420 going to increment x each time we fail to\n 2272 04:41:34,419 --> 04:41:39,217 just by hashing our key. And that is actually\n 2273 04:41:39,218 --> 04:41:46,952 look in a table first. 
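The probing functions listed above can be written down as small Python functions. The default coefficients, the helper `probe_positions`, and the double hashing form x times a secondary hash are assumptions chosen for illustration; the random number generator variant is omitted.

```python
def linear(x: int, a: int = 1, b: int = 0) -> int:
    return a * x + b


def quadratic(x: int, a: int = 1, b: int = 0, c: int = 0) -> int:
    return a * x * x + b * x + c


def make_double_hash_probe(secondary_hash: int):
    # Double hashing: the step size comes from a second hash of the key.
    return lambda x: x * secondary_hash


def probe_positions(key_hash: int, probe, n: int, count: int):
    """First `count` slots visited for a key in a table of size n."""
    return [(key_hash + probe(x)) % n for x in range(count)]
```

With a key that hashes to 8 in a table of size 12, `linear` visits 8, 9, 10 while `quadratic` visits 8, 9, 0, so the choice of probing function changes where colliding keys end up.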
So while the table\n 2274 04:41:46,952 --> 04:41:57,218 to No, we're going to say our new index is\n 2275 04:41:57,218 --> 04:42:06,387 hash to plus the probing function, mod n,\n 2276 04:42:06,387 --> 04:42:11,759 and then we're going to increment x, so that\n 2277 04:42:11,759 --> 04:42:19,949 at a different position. And then eventually,\n 2278 04:42:19,950 --> 04:42:25,680 set up our probing function in such a way\n 2279 04:42:25,680 --> 04:42:32,740 we will always find a free slot, because we\n 2280 04:42:35,759 --> 04:42:44,969 so here's the big issue with open addressing.\n 2281 04:42:44,970 --> 04:42:51,878 we choose modulo n, are going to end up producing\n 2282 04:42:51,878 --> 04:43:00,639 size itself. So imagine your probing sequence\n 2283 04:43:00,639 --> 04:43:07,479 cycles. And your table is of size 10. But\n 2284 04:43:07,479 --> 04:43:13,649 because it's stuck in a cycle. And all of\n 2285 04:43:13,650 --> 04:43:19,770 in an infinite loop. So this is very problematic,\n 2286 04:43:19,770 --> 04:43:27,869 to handle. Right, so let's have a look at\n 2287 04:43:27,869 --> 04:43:34,968 And using open addressing, it's got some key\n 2288 04:43:34,968 --> 04:43:40,490 the circle with a bar through it is the no\n 2289 04:43:40,490 --> 04:43:49,120 probing sequence p of x equals 4x. And suppose\n 2290 04:43:49,119 --> 04:43:57,307 and that the key hashes to eight. So that\n 2291 04:43:57,308 --> 04:44:03,049 at position eight, but Oh, it's already occupied,\n 2292 04:44:03,049 --> 04:44:13,489 there. So what we do well, we probe, so we\n 2293 04:44:13,490 --> 04:44:19,791 we get eight plus for my 12. 
Well, that's\n 2294 04:44:19,791 --> 04:44:26,760 Oh, that is also occupied, because the key\n 2295 04:44:26,759 --> 04:44:36,599 we compute P of two, and then they gives us\n 2296 04:44:36,599 --> 04:44:42,840 already occupied, and then we keep probing.\n 2297 04:44:42,840 --> 04:44:48,779 So we'll keep probing and probing and probing,\n 2298 04:44:48,779 --> 04:44:56,110 position. So although we have a proving function,\n 2299 04:44:56,110 --> 04:45:03,208 The probing function is flawed. So that's\n 2300 04:45:03,207 --> 04:45:08,779 functions are viable. They produce cycles\n 2301 04:45:08,779 --> 04:45:16,137 handle this? And in general, the consensus\n 2302 04:45:16,137 --> 04:45:23,729 we try to avoid it all together by restricting\n 2303 04:45:23,729 --> 04:45:30,399 to be those which produce a cycle of exactly\n 2304 04:45:30,400 --> 04:45:37,480 I have a little Asterix here and says there\n 2305 04:45:37,479 --> 04:45:43,169 are some probing functions we can use, which\n 2306 04:45:43,169 --> 04:45:49,707 And we're going to have a look at I think,\n 2307 04:45:49,707 --> 04:45:55,909 Alright, so just to recap, techniques such\n 2308 04:45:55,909 --> 04:46:02,740 and double hashing, they're all subject to\n 2309 04:46:02,740 --> 04:46:09,670 to do is redefine probing functions, which\n 2310 04:46:09,669 --> 04:46:16,000 length and to avoid not being able to insert\n 2311 04:46:16,000 --> 04:46:21,718 loop. So this is a bit of an issue with the\n 2312 04:46:21,718 --> 04:46:27,250 we can handle. Although notice that this isn't\n 2313 04:46:27,250 --> 04:46:32,630 in the separate chaining world, just because\n 2314 04:46:32,630 --> 04:46:38,670 just captures all our collisions. Okay, this\n 2315 04:46:38,669 --> 04:46:47,250 talking about hash tables, and the linear\n 2316 04:46:49,919 --> 04:46:55,969 So in general, if we have a table of size\n 2317 04:46:55,970 --> 04:46:59,840 what your probing function is. 
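The cycle in the p(x) = 4x example can be checked with a few lines of Python. The table size 12 and the key hash 8 are from the walkthrough; the helper name `visited_slots` is an assumption.

```python
def visited_slots(key_hash: int, probe, n: int):
    """Distinct slots a probe sequence reaches, in order of first visit."""
    order = []
    for x in range(n):
        index = (key_hash + probe(x)) % n
        if index not in order:
            order.append(index)
    return order


stuck = visited_slots(8, lambda x: 4 * x, 12)  # only slots 8, 0 and 4
full = visited_slots(8, lambda x: x, 12)       # every slot in the table
```

If slots 8, 0 and 4 are all occupied, p(x) = 4x can never reach one of the nine free slots, which is exactly the infinite loop described above.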
So we start\nour constant 2318 04:46:59,840 --> 04:47:07,150 or sorry, variable x at one, the key hash\n 2319 04:47:07,150 --> 04:47:13,510 gives us four key. And our first index we're\n 2320 04:47:13,509 --> 04:47:20,637 hash position. And while the table at the\n 2321 04:47:20,637 --> 04:47:30,489 is already occupied, then we're going to offset\n 2322 04:47:30,490 --> 04:47:37,048 probing function mode. And every time we do\n 2323 04:47:37,047 --> 04:47:44,729 our probing function pushes us along one extra\n 2324 04:47:44,729 --> 04:47:47,797 then we can insert the key value pair into\nthe table 2325 04:47:49,790 --> 04:47:55,990 Alright, so what is linear probing? So linear\n 2326 04:47:55,990 --> 04:48:04,548 according to some linear formula, specifically,\n 2327 04:48:04,547 --> 04:48:09,180 b. And we have to make sure that a is not\n 2328 04:48:09,180 --> 04:48:14,900 a constant which does nothing. Now have a\n 2329 04:48:14,900 --> 04:48:22,830 constant b is obsolete. And if you know why,\n 2330 04:48:22,830 --> 04:48:32,340 with the others. And as we saw in the last\n 2331 04:48:32,340 --> 04:48:39,229 currently, and it's that some linear functions\n 2332 04:48:39,229 --> 04:48:46,159 n. And we might end up getting stuck in a\n 2333 04:48:46,159 --> 04:48:56,958 our linear function to be p of x equals 3x,\n 2334 04:48:56,958 --> 04:49:04,387 some reason our table size was nine, then\n 2335 04:49:04,387 --> 04:49:13,250 Assuming that positions, four, seven and one\n 2336 04:49:13,250 --> 04:49:18,520 So the fact that we're only probing at those\n 2337 04:49:18,520 --> 04:49:24,529 all the other buckets, which is really bad.\n 2338 04:49:24,529 --> 04:49:30,207 loop, we cannot get stuck in this situation\n 2339 04:49:30,207 --> 04:49:38,967 question which values of the constant A and\n 2340 04:49:38,968 --> 04:49:48,549 M. It turns out that this happens when the\n 2341 04:49:48,549 --> 04:49:58,599 prime to each other. 
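The general insertion loop recapped above, sketched in Python. The structure (start x at one, begin at the key hash, offset by the probing function while the slot is occupied) follows the narration; the guard that raises after N failed probes and the update-on-equal-key check are additions assumed here so the example cannot hang.

```python
def insert(table, key, value, key_hash: int, probe) -> int:
    n = len(table)
    x = 1
    index = key_hash % n
    # Keep probing while the slot holds a different key.
    while table[index] is not None and table[index][0] != key:
        if x > n:  # assumed safety guard, not part of the narrated algorithm
            raise RuntimeError("probe sequence cycled without a free slot")
        index = (key_hash + probe(x)) % n
        x += 1
    table[index] = (key, value)
    return index


# p(x) = 3x with table size 9 and a key hashing to 4 only reaches 4, 7, 1.
blocked = [None] * 9
for slot in (4, 7, 1):
    blocked[slot] = ("other", None)
try:
    insert(blocked, "k", "v", 4, lambda x: 3 * x)
    stuck = False
except RuntimeError:
    stuck = True
```

With p(x) = x on the same table the insert succeeds, because that probing function eventually reaches every slot.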
The two numbers are relatively\n 2342 04:49:58,599 --> 04:50:09,329 is equal to one. So that is, a and N have a GCD\n 2343 04:50:09,330 --> 04:50:15,580 function will always be able to generate a\n 2344 04:50:15,580 --> 04:50:24,400 to find an empty bucket. Awesome. Alright,\n 2345 04:50:24,400 --> 04:50:30,290 suppose we have an originally empty hash table,\n 2346 04:50:30,290 --> 04:50:37,378 And we selected our probing function to be\n 2347 04:50:37,378 --> 04:50:44,860 equals nine. And then we also selected a max\n 2348 04:50:44,860 --> 04:50:54,458 and the threshold will then be six. So we\n 2349 04:50:54,458 --> 04:51:01,189 based on the probing function we chose at\n 2350 04:51:01,189 --> 04:51:07,840 infinite loop while inserting? Based on what\n 2351 04:51:07,840 --> 04:51:14,040 answer is, yes, the greatest common divisor\n 2352 04:51:14,040 --> 04:51:23,218 not one. So let's go ahead and attempt to\n 2353 04:51:23,218 --> 04:51:31,360 may not hit any problems. Okay. So first,\n 2354 04:51:31,360 --> 04:51:38,790 some key value pairs, I want to insert, and\n 2355 04:51:38,790 --> 04:51:46,043 that the hash value for K one is equal to\n 2356 04:51:46,043 --> 04:51:51,980 of K one plus the probing sequence, at zero,\n 2357 04:51:51,979 --> 04:52:01,819 insert that key value pair at position two.\n 2358 04:52:01,819 --> 04:52:08,191 is equal to two again. So we're going to try\n 2359 04:52:08,191 --> 04:52:14,270 two. But oh snap, we have a hash collision.\n 2360 04:52:14,270 --> 04:52:21,260 to offset the probing function at one and\n 2361 04:52:21,259 --> 04:52:29,289 of inserting it now at two, we're going to\n 2362 04:52:29,290 --> 04:52:37,170 in because that slot was free. Now, let's\n 2363 04:52:37,169 --> 04:52:44,099 that hashes to three, then we can insert\n 2364 04:52:44,099 --> 04:52:51,349 Now, notice that we're trying to reinsert\n 2365 04:52:51,349 --> 04:52:55,579 table. 
So instead of inserting it, we're actually\n 2366 04:52:55,580 --> 04:53:02,638 exists in the hash table. Alright, so from\n 2367 04:53:02,637 --> 04:53:10,718 two is two. So then we look at position\n 2368 04:53:11,819 --> 04:53:19,889 So we increment x offset by P of one. And\n 2369 04:53:19,889 --> 04:53:28,430 update the value there. Let's go to K five.\n 2370 04:53:28,430 --> 04:53:37,150 eight. So eight is taken. So we're going to\n 2371 04:53:37,150 --> 04:53:43,950 we're going to insert the key value pair there.\n 2372 04:53:43,950 --> 04:53:56,280 six hashes to five, then let's probe once\n 2373 04:53:56,279 --> 04:54:05,649 a hash collision, let's keep probing. So now,\n 2374 04:54:05,650 --> 04:54:11,240 right, another hash collision, so we have\n 2375 04:54:11,240 --> 04:54:19,280 back to five. So we've hit a cycle. Alright,\n 2376 04:54:19,279 --> 04:54:26,439 expected this to happen because we knew that\n 2377 04:54:26,439 --> 04:54:33,877 to three and not one. So if we look at all\n 2378 04:54:33,878 --> 04:54:41,490 instead of six, we see that the ones that\n 2379 04:54:41,490 --> 04:54:51,808 with and the table size are 1, 2, 4, 5, 7 and 8,\n 2380 04:54:51,808 --> 04:55:01,360 something else. So this comes to the realization\n 2381 04:55:01,360 --> 04:55:09,119 p of x to be one times x, then the greatest\n 2382 04:55:09,119 --> 04:55:16,029 going to be one no matter what our choice\n 2383 04:55:16,029 --> 04:55:23,729 one times x is a very popular probing function\n 2384 04:55:23,729 --> 04:55:29,819 hash table, and we wish to insert more key\n 2385 04:55:29,819 --> 04:55:36,121 going to pick a probing function that works,\n 2386 04:55:36,121 --> 04:55:42,229 I'm going to pick the table size to be 12\n 2387 04:55:42,229 --> 04:55:53,110 should occur. 
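Both observations from this example can be checked in Python: which multipliers a are relatively prime to the table size 9, and how the key that hashes to 5 gets stuck under p(x) = 6x. The occupied slots 2, 8, 3 and 5 are where the earlier keys landed in the walkthrough; the function names are assumptions.

```python
from math import gcd


def full_cycle_multipliers(n: int):
    """Values of a for which p(x) = a*x can reach every slot of size n."""
    return [a for a in range(1, n) if gcd(a, n) == 1]


def probe_until_repeat(key_hash: int, a: int, n: int, occupied):
    """Probe with p(x) = a*x; return (slots tried, free slot or None)."""
    tried = []
    x = 0
    while True:
        index = (key_hash + a * x) % n
        if index in tried:
            return tried, None       # back at a slot we already tried: cycle
        tried.append(index)
        if index not in occupied:
            return tried, index      # found a free slot
        x += 1


good_choices = full_cycle_multipliers(9)
tried, free_slot = probe_until_repeat(5, 6, 9, occupied={2, 8, 3, 5})
```

The probe sequence visits 5, 2 and 8 and then returns to 5, so no free slot is ever found even though five of the nine slots are empty.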
All right, so let's go with\n 2388 04:55:53,110 --> 04:56:00,869 one has a hash value of 10, then at index\n 2389 04:56:00,869 --> 04:56:08,229 a hash value of eight, then slot eight is\n 2390 04:56:08,229 --> 04:56:18,887 now, suppose k three is equal to 10, hash\n 2391 04:56:18,887 --> 04:56:26,450 to keep probing. Alright, so if we use our\n 2392 04:56:26,450 --> 04:56:35,271 So they'll give us three modulo N when we\n 2393 04:56:35,271 --> 04:56:44,720 k four. Now suppose the hash value for K four\n 2394 04:56:44,720 --> 04:56:55,958 we hit k three, which we inserted last time.\n 2395 04:56:55,957 --> 04:57:04,229 able to pull out eventually when we hit the\n 2396 04:57:04,229 --> 04:57:11,950 we've actually reached the threshold of our\n 2397 04:57:11,950 --> 04:57:21,468 that I picked alpha to be 0.35. So n, which\n 2398 04:57:21,468 --> 04:57:31,069 And we just finished inserting the fourth\n 2399 04:57:31,069 --> 04:57:37,957 So how we usually resize the table is via some,\n 2400 04:57:37,957 --> 04:57:49,297 or so on. But we need to double in such a\n 2401 04:57:49,297 --> 04:57:59,557 a doubling, N is equal to 24, and the GCD\n 2402 04:57:59,558 --> 04:58:05,298 so it's still 0.35. So our new threshold is\n 2403 04:58:05,297 --> 04:58:08,457 the probing function. Alright 2404 04:58:08,457 --> 04:58:14,789 so let's allocate a new chunk of memory for\n 2405 04:58:14,790 --> 04:58:26,990 the old elements in our old table into this\n 2406 04:58:26,990 --> 04:58:33,860 right. So we scan across all these elements,\n 2407 04:58:33,860 --> 04:58:39,350 along. So from before we knew that hash value\n 2408 04:58:39,349 --> 04:58:45,439 is going to go at position 10. Scan along\n 2409 04:58:45,439 --> 04:58:52,459 three was 10. So it should go in position\n 2410 04:58:52,459 --> 04:59:00,340 so we have to keep probing. So if we add our\n 2411 04:59:00,340 --> 04:59:06,790 we get 10 plus five, which is 15. 
So we're\n 2412 04:59:06,790 --> 04:59:11,878 table, keep probing nothing here and nothing\n 2413 04:59:11,878 --> 04:59:18,878 k two. So we know from before k two is equal\n 2414 04:59:18,878 --> 04:59:27,100 eight. Now we know k one is equal to 10. So\n 2415 04:59:27,099 --> 04:59:32,729 that's taken so to probe so the next position\n 2416 04:59:32,729 --> 04:59:36,957 us 20. So insert k one v one at 20. 2417 04:59:38,069 --> 04:59:43,409 so now we throw away the old table and we\n 2418 04:59:43,409 --> 04:59:52,939 table we're working with and we were at inserting\n 2419 04:59:52,939 --> 05:00:00,039 And that spot is free. So we are good. So\n 2420 05:00:00,040 --> 05:00:06,388 I know how insertion works. Now how do I remove\n 2421 05:00:06,387 --> 05:00:14,069 open addressing? And my answer this is that\n 2422 05:00:14,069 --> 05:00:19,389 And we're going to do it after we see all\n 2423 05:00:19,389 --> 05:00:25,968 it's actually non trivial. All right, let's\n 2424 05:00:25,968 --> 05:00:34,619 works. Let's dive right in. So let's recall\n 2425 05:00:34,619 --> 05:00:41,569 of size and using the open addressing collision\n 2426 05:00:41,569 --> 05:00:47,759 a variable called x to be one, which we're\n 2427 05:00:47,759 --> 05:00:57,319 to find a free slot, then we compute the key\n 2428 05:00:57,319 --> 05:01:04,279 going to check and we're going to the loop.\n 2429 05:01:04,279 --> 05:01:10,090 the table at that index is not equal to null,\n 2430 05:01:10,090 --> 05:01:16,880 happens, we're going to offset the key hash\n 2431 05:01:16,880 --> 05:01:23,650 in our case is going to be a quadratic function.\n 2432 05:01:23,650 --> 05:01:31,530 we will find an open slot to insert our key\n 2433 05:01:31,529 --> 05:01:39,029 probing? So quadratic probing is simply probing\n 2434 05:01:39,029 --> 05:01:45,638 when our probing function looks something\n 2435 05:01:45,638 --> 05:01:52,128 c, and a, b, and c are all constants. 
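The linear probing walkthrough with p(x) = 5x, table size 12 and alpha 0.35 can be replayed with the Python sketch below. Hash values are passed in explicitly because the walkthrough gives them precomputed; the class and method names, and resizing right after the insertion that reaches the threshold, are assumptions based on the narration.

```python
class LinearProbingTable:
    def __init__(self, capacity: int = 12, a: int = 5, max_load: float = 0.35):
        self.a = a
        self.max_load = max_load
        self.table = [None] * capacity
        self.size = 0

    def _slot(self, table, key, key_hash: int) -> int:
        # Terminates because gcd(a, len(table)) == 1 and the table is not full.
        n, x = len(table), 0
        while True:
            index = (key_hash + self.a * x) % n
            if table[index] is None or table[index][0] == key:
                return index
            x += 1

    def put(self, key, value, key_hash: int) -> None:
        index = self._slot(self.table, key, key_hash)
        if self.table[index] is None:
            self.size += 1
        self.table[index] = (key, value, key_hash)
        if self.size >= int(self.max_load * len(self.table)):
            self._resize()

    def _resize(self) -> None:
        bigger = [None] * (2 * len(self.table))
        for entry in self.table:  # scan the old table from left to right
            if entry is not None:
                bigger[self._slot(bigger, entry[0], entry[2])] = entry
        self.table = bigger


def positions(t: LinearProbingTable):
    return {e[0]: i for i, e in enumerate(t.table) if e is not None}


t = LinearProbingTable()
t.put("k1", "v1", 10)
t.put("k2", "v2", 8)
t.put("k3", "v3", 10)
before = positions(t)   # still in the table of size 12
t.put("k4", "v4", 10)   # fourth item reaches the threshold and triggers a resize
after = positions(t)    # now in the table of size 24
```

Doubling keeps this particular probing function valid because 5 and 24 are still relatively prime; with an even multiplier that would not hold.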
And\n 2436 05:01:52,128 --> 05:02:01,540 we degrade to linear probing. But as we saw\n 2437 05:02:01,540 --> 05:02:06,781 functions are viable because they don't produce\n 2438 05:02:06,781 --> 05:02:13,628 an infinite loop. So as it turns out, most\n 2439 05:02:13,628 --> 05:02:20,340 will end up producing a cycle. Here's an example.\n 2440 05:02:20,340 --> 05:02:29,150 to be p of x equals 2x squared plus two, the\n 2441 05:02:29,150 --> 05:02:38,170 the current table size was nine, then we would\n 2442 05:02:38,169 --> 05:02:45,869 we would probe at position zero, we would\n 2443 05:02:45,869 --> 05:02:54,259 suppose those two entries are full, and then\n 2444 05:02:54,259 --> 05:03:00,639 function is only ever able to hit the buckets,\n 2445 05:03:00,639 --> 05:03:06,739 to reach all the other buckets, 012356, and\n 2446 05:03:06,740 --> 05:03:13,100 when four and seven are already occupied.\n 2447 05:03:13,099 --> 05:03:20,769 then is, how do we pick a probing function,\n 2448 05:03:20,770 --> 05:03:27,450 numerous ways. But here are the three most\n 2449 05:03:27,450 --> 05:03:36,031 is to select the probing function to be p\n 2450 05:03:36,031 --> 05:03:43,878 size a prime number greater than three, and\n 2451 05:03:43,878 --> 05:03:50,378 one half or less than or equal to one half.\n 2452 05:03:50,378 --> 05:03:57,270 equals x squared plus x divided by two, and\n 2453 05:03:57,270 --> 05:04:07,128 And the last and final one says that p of\n 2454 05:04:07,128 --> 05:04:13,610 and keep the table size a prime number where\n 2455 05:04:13,610 --> 05:04:22,137 we can say that I were table size was 23,\n 2456 05:04:22,137 --> 05:04:29,669 to three mod four. 
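The claim that only some probing functions avoid cycles can be checked by brute force. This is a rough Python sketch, not a proof: it assumes the second option above, p(x) = (x^2 + x) / 2, and compares a power-of-two table size with one that is not. The helper names `reachable_slots` and `triangular` are invented for this example.

```python
def reachable_slots(probe, key_hash, n):
    """Collect the slots that (key_hash + probe(x)) mod n lands on.

    Two full passes over the table (x < 2n) are enough for the
    probing function used below to start repeating itself.
    """
    return {(key_hash + probe(x)) % n for x in range(2 * n)}

def triangular(x):
    # p(x) = (x^2 + x) / 2
    return (x * x + x) // 2

print(sorted(reachable_slots(triangular, 0, 8)))  # table size is a power of two
print(sorted(reachable_slots(triangular, 0, 9)))  # table size is not a power of two
```

With a table size of 8 every slot shows up, while with a table size of 9 only a handful do, so a full cluster on those few slots would trap the probe in a cycle.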
So any of these will work.\n 2457 05:04:29,669 --> 05:04:35,119 how they work and whether table size should\n 2458 05:04:37,349 --> 05:04:42,840 So we're going to focus on the second one\n 2459 05:04:42,840 --> 05:04:51,637 by two and the table size is a power of two.\n 2460 05:04:51,637 --> 05:04:59,979 hash table, and we want to insert some key\n 2461 05:04:59,979 --> 05:05:07,229 probing function, p of x equals x squared\n 2462 05:05:07,229 --> 05:05:14,878 is a power of two, so it's eight. And that's\n 2463 05:05:14,878 --> 05:05:22,409 the table threshold is going to be three.\n 2464 05:05:22,409 --> 05:05:29,360 absolutely be a power of two, otherwise, this\n 2465 05:05:29,360 --> 05:05:41,319 guy. So suppose that k one hashes six, then\n 2466 05:05:41,319 --> 05:05:46,718 Right, next k two, suppose k two is equal\n 2467 05:05:46,718 --> 05:05:54,860 five, no collision there. Suppose k threes\n 2468 05:05:54,860 --> 05:06:00,569 need to handle that. So we're going to try\n 2469 05:06:00,569 --> 05:06:08,369 to six. So we probe again, and that brings\n 2470 05:06:08,369 --> 05:06:17,159 going cert, k three, and V three key value\n 2471 05:06:17,159 --> 05:06:21,919 before we can do that, we've reached the table\n 2472 05:06:21,919 --> 05:06:28,529 first. Okay, so let's allocate a new block\n 2473 05:06:28,529 --> 05:06:37,430 table to keep it a power of two. So our new\n 2474 05:06:37,430 --> 05:06:48,128 However, a new threshold is six, and the probing\n 2475 05:06:48,128 --> 05:06:54,630 the entries in the old hash table into the\n 2476 05:06:54,630 --> 05:07:02,208 k three hashed to five, so we're going to\n 2477 05:07:02,207 --> 05:07:08,717 there. And no element at position 123 or four,\n 2478 05:07:08,718 --> 05:07:16,479 right there. 
So we know from before that key\n 2479 05:07:16,479 --> 05:07:23,457 at position five, there's a hash collision,\n 2480 05:07:23,457 --> 05:07:33,207 one equals six, position six, so insert k\n 2481 05:07:33,207 --> 05:07:38,557 them before k one hashed to six, but we can't\n 2482 05:07:38,558 --> 05:07:46,590 So we're going to probe along. So we're going\n 2483 05:07:46,590 --> 05:07:51,619 that does it for resizing the table. So let's\n 2484 05:07:51,619 --> 05:08:03,539 to insert inside our table. So suppose that\n 2485 05:08:03,540 --> 05:08:11,680 by 16 gives us position two, so we're going\n 2486 05:08:11,680 --> 05:08:17,920 we've already seen k three, and we know its\n 2487 05:08:17,919 --> 05:08:23,637 is already in our hash table, we're going\n 2488 05:08:23,637 --> 05:08:30,250 probing function at zero gives us five. So\n 2489 05:08:30,250 --> 05:08:46,218 v3, which it was before. So suppose that the\n 2490 05:08:46,218 --> 05:08:55,440 to three mod 16. So that's why it's free.\n 2491 05:08:55,439 --> 05:09:03,180 seven suppose hashes to well, we have a collision\n 2492 05:09:03,180 --> 05:09:08,540 we probe, our probing function gives us an\n 2493 05:09:08,540 --> 05:09:20,069 can so now we are at position five, but that's\n 2494 05:09:20,069 --> 05:09:26,770 for a fourth time scan offset of six. So that\ngives us eight. 2495 05:09:26,770 --> 05:09:34,659 That slot is free. We're going to insert that\n 2496 05:09:34,659 --> 05:09:43,110 and the double hashing, open addressing collision\n 2497 05:09:43,110 --> 05:09:50,069 for those of you who don't know how we do\n 2498 05:09:50,069 --> 05:09:59,259 we start with a variable x initialized to\n 2499 05:09:59,259 --> 05:10:06,759 We set that to be the first index that we're\n 2500 05:10:06,759 --> 05:10:15,637 it's not null. 
So the goal is to find an empty\n 2501 05:10:15,637 --> 05:10:23,000 still hitting spots where there are already\n 2502 05:10:23,000 --> 05:10:31,878 to offset our key hash using a probing function.\n 2503 05:10:31,878 --> 05:10:38,781 probing function. And we're also going to\n 2504 05:10:38,781 --> 05:10:44,009 along further and further. Once you find a\n 2505 05:10:44,009 --> 05:10:51,329 pair into the hash table. Okay, so what's\n 2506 05:10:51,330 --> 05:10:57,290 is just a probing method like any other. But\n 2507 05:10:57,290 --> 05:11:03,590 to a constant multiple of another hash function.\n 2508 05:11:03,590 --> 05:11:10,009 something like this, we give it as input,\n 2509 05:11:10,009 --> 05:11:19,967 x, and we compute x times h sub two of k,\n 2510 05:11:19,968 --> 05:11:28,409 here's an important note, H 2k, must hash\n 2511 05:11:28,409 --> 05:11:38,941 your key is a string, well, h two of K must\n 2512 05:11:38,941 --> 05:11:46,129 the nature of K must also hash integers. So\n 2513 05:11:46,130 --> 05:11:53,048 that double hashing actually reduces to linear\n 2514 05:11:53,047 --> 05:12:01,529 until runtime, because we dynamically compute\n 2515 05:12:01,529 --> 05:12:07,020 reduces to linear probing at runtime, we may\n 2516 05:12:07,020 --> 05:12:15,670 probing, which is that we get stuck in an\n 2517 05:12:15,669 --> 05:12:23,199 our secondary hash function at runtime calculate\n 2518 05:12:23,200 --> 05:12:29,840 h1 of K was four, at the table size was nine,\n 2519 05:12:29,840 --> 05:12:38,540 occurring. So the cycle produces values of\n 2520 05:12:38,540 --> 05:12:47,720 to reach any of the buckets, 02356, and eight.\n 2521 05:12:47,720 --> 05:12:51,208 that means we're stuck in an infinite loop\n 2522 05:12:51,207 --> 05:12:56,770 to insert our key value pair because we're\n 2523 05:12:56,770 --> 05:13:04,171 issue we have to deal with. 
So to fix the\n 2524 05:13:04,170 --> 05:13:10,269 one strategy, we're going to pick our table\n 2525 05:13:10,270 --> 05:13:20,029 to compute a value called delta. So delta\n 2526 05:13:20,029 --> 05:13:25,599 occasionally Delta might be zero. And if that's\n 2527 05:13:25,599 --> 05:13:29,930 to be stuck in a cycle because we're not going\n 2528 05:13:29,930 --> 05:13:35,957 going to be multiplying by zero. So when this\n 2529 05:13:35,957 --> 05:13:43,319 All right. So here's the justification why\n 2530 05:13:43,319 --> 05:13:49,547 going to be between one inclusive and non\n 2531 05:13:49,547 --> 05:13:58,869 between delta and n is going to be one, since\n 2532 05:13:58,869 --> 05:14:05,950 conditions, we know that the probing sequence\n 2533 05:14:05,950 --> 05:14:12,790 to able to hit every single slot in our hash\n 2534 05:14:12,790 --> 05:14:17,620 free slot in the hash table, which there will\n 2535 05:14:17,619 --> 05:14:21,861 a certain threshold, that we're going to be\n 2536 05:14:21,862 --> 05:14:29,450 Okay, so here's a core question, how do we\n 2537 05:14:29,450 --> 05:14:37,240 the keys we're using have type T, and whenever\n 2538 05:14:37,240 --> 05:14:44,909 h 2k, two hash keys that are also of type\n 2539 05:14:44,909 --> 05:14:50,779 systematic way of generating these new hash\n 2540 05:14:50,779 --> 05:14:58,039 we might be dealing with multiple different\n 2541 05:14:58,040 --> 05:15:04,600 computer science, Almost every object we ever\n 2542 05:15:04,599 --> 05:15:10,489 building blocks, in particular integers, strings,\n 2543 05:15:10,490 --> 05:15:18,270 on. So we can use this to our advantage. Luckily,\n 2544 05:15:18,270 --> 05:15:24,887 these fundamental data types. And we can combine\n 2545 05:15:24,887 --> 05:15:33,409 function h two of K. Frequently, when we compose\n 2546 05:15:33,409 --> 05:15:38,630 functions called Universal hash functions,\n 2547 05:15:38,630 --> 05:15:44,468 data types, which is quite convenient. 
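The delta fix described above can be sketched in a few lines of Python. The assumptions here are that the probe position is (h1 + x * delta) mod n with delta = h2 mod n, and that delta is bumped to one when it comes out as zero. The secondary hash value 3 in the first two calls is an assumed number chosen to reproduce a three-slot cycle on a table of size nine, and `double_hash_probes` is a hypothetical helper name.

```python
def double_hash_probes(h1, h2, n):
    """Slots visited by (h1 + x*delta) mod n, with delta = h2 mod n."""
    delta = h2 % n
    if delta == 0:
        delta = 1  # otherwise every probe multiplies by zero and never moves
    return [(h1 + x * delta) % n for x in range(n)]

print(sorted(set(double_hash_probes(4, 3, 9))))   # table size 9 is not prime
print(sorted(set(double_hash_probes(4, 3, 7))))   # table size 7 is prime
print(double_hash_probes(2, 14, 7)[:3])           # delta would be 0, so 1 is used
```

With the non-prime size the probes only ever touch three slots, while with the prime size the same hash values visit every slot once before repeating.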
Alright,\n 2548 05:15:44,468 --> 05:15:51,218 hashing. So suppose we have an originally\n 2549 05:15:51,218 --> 05:15:57,792 probing function to be p of x equals x squared\n 2550 05:15:57,792 --> 05:16:04,260 be in our table size v n equals seven. Notice\n 2551 05:16:04,259 --> 05:16:10,769 max load factor to be alpha equals point seven,\n 2552 05:16:10,770 --> 05:16:17,700 So once we hit five elements, we need to grow\n 2553 05:16:17,700 --> 05:16:23,430 all these key value pairs on the left into\n 2554 05:16:23,430 --> 05:16:30,650 keyword and v one. Now suppose that the hash\n 2555 05:16:30,650 --> 05:16:39,468 is 67, and H 2k. One is 34. And first thing\n 2556 05:16:39,468 --> 05:16:48,110 two of K one modulo seven, which is the table\n 2557 05:16:48,110 --> 05:16:59,292 where this key value pair should go and should\n 2558 05:16:59,292 --> 05:17:14,308 h one of K two is two, and H, two of K two\n 2559 05:17:14,308 --> 05:17:21,229 just going to insert a position two, because\n 2560 05:17:21,229 --> 05:17:32,250 is two, and H 2k. Three is 10. These are just\n 2561 05:17:32,250 --> 05:17:40,560 So then delta would be three in this, in this\n 2562 05:17:40,560 --> 05:17:49,708 have a hash collision, because we're trying\n 2563 05:17:49,707 --> 05:17:56,251 two is already there. So what we need to do\n 2564 05:17:56,251 --> 05:18:04,729 the position to be our original hash function\n 2565 05:18:04,729 --> 05:18:10,790 plus one times our delta value, mod seven,\n 2566 05:18:10,790 --> 05:18:20,990 to position five right there. Now, right,\n 2567 05:18:20,990 --> 05:18:28,860 h one of K four is equal to two, and h two\n 2568 05:18:28,860 --> 05:18:35,430 delta. So h two of K four modulo seven is\n 2569 05:18:35,430 --> 05:18:41,939 is zero. So when this happens, we know that\n 2570 05:18:41,939 --> 05:18:50,779 we don't get stuck in an infinite loop. So\n 2571 05:18:50,779 --> 05:18:57,807 mod seven gives us two. 
So we have a hash\n 2572 05:18:57,808 --> 05:19:08,170 you keep probing. So now we probed by multiplying\n 2573 05:19:10,819 --> 05:19:16,669 So now we're going to try insert k three,\n 2574 05:19:16,669 --> 05:19:22,859 but with a new value. So we're going to be\n 2575 05:19:22,860 --> 05:19:30,990 h 1k three is equal to two. Actually, we should\n 2576 05:19:30,990 --> 05:19:41,020 is 10. So compute delta. So we have a collision.\n 2577 05:19:41,020 --> 05:19:50,468 and update its value. Now suppose the first\n 2578 05:19:50,468 --> 05:19:58,010 secondary hash function of k six is 23. Then\n 2579 05:19:58,009 --> 05:20:04,090 we try to insert it, it goes to position So\n 2580 05:20:04,090 --> 05:20:11,349 delta mod seven, that gives us five. There's\n 2581 05:20:11,349 --> 05:20:17,127 offset at two times delta minus seven, which\n 2582 05:20:17,128 --> 05:20:24,920 we're able to insert our key value pair there.\n 2583 05:20:24,919 --> 05:20:30,899 so it's time to resize and grow the table,\n 2584 05:20:30,900 --> 05:20:35,590 So one strategy when we're trying to resize\n 2585 05:20:35,590 --> 05:20:41,950 need to keep our table size to be a prime\n 2586 05:20:41,950 --> 05:20:47,270 find the next prime number above this value.\n 2587 05:20:47,270 --> 05:20:56,128 and the next prime number 14 is 17. So 17\n 2588 05:20:56,128 --> 05:21:05,690 a new table of size 17, and go through the\n 2589 05:21:05,689 --> 05:21:15,079 into the new table. So from before, a 12k,\n 2590 05:21:15,080 --> 05:21:22,150 compute delta, and we know we're going to\n 2591 05:21:22,150 --> 05:21:29,580 collision. Next up, and nothing in position\n 2592 05:21:29,580 --> 05:21:37,420 value for K to two and the secondary hash\n 2593 05:21:37,419 --> 05:21:46,079 to compute delta to be six. So we're going\n 2594 05:21:46,080 --> 05:21:55,240 Next 1k four. So we know h one of K four is\n 2595 05:21:55,240 --> 05:22:02,231 compute our delta value. 
And notice that our\n 2596 05:22:02,230 --> 05:22:13,957 before, but seven because our mod is a 17\n 2597 05:22:13,957 --> 05:22:21,250 But we need to keep probing. So compute the\n 2598 05:22:21,250 --> 05:22:32,119 nine months 17. Next one, insert k one, suppose\n 2599 05:22:32,119 --> 05:22:42,590 then compute delta. And that gives us zero.\n 2600 05:22:42,590 --> 05:22:49,811 h one of K one plus zero times delta gives\n 2601 05:22:49,811 --> 05:22:57,780 probing. now compute the offset at one times\n 2602 05:22:57,779 --> 05:23:04,649 the x value. So now two times delta t is four,\n 2603 05:23:04,650 --> 05:23:12,620 value pair there. And the last 1k three. So\n 2604 05:23:12,619 --> 05:23:22,218 K three is 10. Delta is then 10. And we have\n 2605 05:23:22,218 --> 05:23:30,940 us 12. And that slot is free. And we reached\n 2606 05:23:30,939 --> 05:23:36,669 old table and replace it with a new table.\n 2607 05:23:36,669 --> 05:23:45,489 pair from before which is k seven. Suppose\n 2608 05:23:45,490 --> 05:23:54,878 is three, then our delta value is three. And\n 2609 05:23:54,878 --> 05:24:02,718 right, I know a lot of you have been anticipating\n 2610 05:24:02,718 --> 05:24:08,909 from a hash table using the open addressing\nscheme. 2611 05:24:08,909 --> 05:24:16,520 So let's have a look first at what issues\n 2612 05:24:16,520 --> 05:24:22,308 it naively, I think this is valuable. Because\n 2613 05:24:22,308 --> 05:24:29,458 hash table, and we're using a linear probing\n 2614 05:24:29,457 --> 05:24:37,637 equal to x. And we want to perform the following\n 2615 05:24:37,637 --> 05:24:47,557 1k, two and K three, and then remove k two\n 2616 05:24:47,558 --> 05:24:53,900 for the sake of argument, let's assume that\n 2617 05:24:53,900 --> 05:25:03,100 and K three, all equal to one. This is a possible\n 2618 05:25:03,099 --> 05:25:10,377 hashes to one. 
So we're going to insert at\n 2619 05:25:10,378 --> 05:25:15,560 has a hash collision with K one which is already\n 2620 05:25:15,560 --> 05:25:23,830 so insert it in the next slot over, that's\n 2621 05:25:23,830 --> 05:25:29,410 let's probe Okay, another hash collision.\n 2622 05:25:34,520 --> 05:25:40,909 we are going to remove k two, and we're going\n 2623 05:25:40,909 --> 05:25:51,878 just going to clear the contents of the bucket\n 2624 05:25:51,878 --> 05:25:59,208 was equal to k two. So we haven't found the\n 2625 05:25:59,207 --> 05:26:06,569 have found k two. So we're just going to remove\n 2626 05:26:06,569 --> 05:26:13,729 table and obtain the value for K three. Let's\n 2627 05:26:13,729 --> 05:26:20,140 we get one and K one cycles here three. So\n 2628 05:26:20,140 --> 05:26:29,739 no element. So what does this mean? So since\n 2629 05:26:29,740 --> 05:26:34,940 we're forced to conclude that he three does\n 2630 05:26:34,939 --> 05:26:41,289 we would have encountered it before reaching\n 2631 05:26:41,290 --> 05:26:50,058 works inside a hash table. So this method\n 2632 05:26:50,058 --> 05:26:55,790 doesn't work. Because k three clearly exists\n 2633 05:26:55,790 --> 05:27:01,659 index three. So here's a solution to the removing. 2634 05:27:02,659 --> 05:27:08,090 an element, we're going to place a unique\n 2635 05:27:08,090 --> 05:27:14,650 element to indicate that a specific key value\n 2636 05:27:14,650 --> 05:27:20,000 we're doing a search, we're just going to\n 2637 05:27:20,000 --> 05:27:25,718 the deleted bucket with a tombstone like we\n 2638 05:27:25,718 --> 05:27:36,770 we now search for the key k three. Okay, so\n 2639 05:27:36,770 --> 05:27:42,100 equal to k three. So keep probing. Alright,\n 2640 05:27:42,100 --> 05:27:49,449 was deleted. So keep probing. 
All right, we\n 2641 05:27:49,450 --> 05:27:58,128 value v three as our answer for the search.\n 2642 05:27:58,128 --> 05:28:05,900 I have a lot of tombstones cluttering my hash\n 2643 05:28:05,900 --> 05:28:11,520 with tombstones is that we're actually going\n 2644 05:28:11,520 --> 05:28:17,727 table. So they're going to increase the load\n 2645 05:28:17,727 --> 05:28:25,047 resize the hash table. But there's also another\n 2646 05:28:25,047 --> 05:28:31,599 a new key value pair, then we can replace\n 2647 05:28:31,599 --> 05:28:38,557 key value pair. And I want to give you guys\n 2648 05:28:38,558 --> 05:28:44,510 Suppose this is our hash table of size eight,\n 2649 05:28:44,509 --> 05:28:53,849 p of x equals x squared plus x divided by\n 2650 05:28:53,849 --> 05:29:03,079 play doing this when we want to do a lookup.\n 2651 05:29:03,080 --> 05:29:10,530 inside the hash table and the hash value for\n 2652 05:29:10,529 --> 05:29:20,717 key k seven. So k seven, hash to five. So\n 2653 05:29:20,718 --> 05:29:27,968 four. So let's keep probing. So we probe quadratically.\n 2654 05:29:27,968 --> 05:29:36,790 and that's position six. Now, position six\n 2655 05:29:36,790 --> 05:29:42,500 we encounter which has a tombstone in it.\n 2656 05:29:42,500 --> 05:29:48,700 for later to perform an optimization. Okay,\n 2657 05:29:48,700 --> 05:29:54,968 seven yet. So when we probe at position two,\n 2658 05:29:54,968 --> 05:30:01,680 because we have a tombstone so we have keep\n 2659 05:30:01,680 --> 05:30:11,128 So let's keep probing, and Aha, we have found\n 2660 05:30:11,128 --> 05:30:20,058 v seven. Now, however, we can do an optimization,\n 2661 05:30:20,058 --> 05:30:28,298 four times to find k seven, because we just\n 2662 05:30:28,297 --> 05:30:35,887 an optimization we can do is to relocate the\n 2663 05:30:35,887 --> 05:30:42,128 position where there was a tombstone. 
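The tombstone idea can be tried out on a toy table. This is a simplified Python sketch, assuming linear probing with p(x) = x, a fixed table size with no resizing, and an insert that only handles keys not already present. The class name `TinyTable` and the constant hash function are made up so that k1, k2 and k3 all collide at index one, as in the removal example.

```python
TOMBSTONE = object()  # unique marker object for deleted slots

class TinyTable:
    """Fixed-size open addressing table with linear probing p(x) = x."""

    def __init__(self, size, hash_fn):
        self.keys = [None] * size
        self.values = [None] * size
        self.hash_fn = hash_fn

    def _slots(self, key):
        start = self.hash_fn(key) % len(self.keys)
        for x in range(len(self.keys)):
            yield (start + x) % len(self.keys)

    def insert(self, key, value):
        # Simplified: assumes the key is not already in the table.
        for i in self._slots(key):
            if self.keys[i] is None or self.keys[i] is TOMBSTONE:
                self.keys[i], self.values[i] = key, value
                return
        raise RuntimeError("table is full")

    def _find(self, key):
        for i in self._slots(key):
            if self.keys[i] is None:
                return -1      # a truly empty slot ends the search
            if self.keys[i] is not TOMBSTONE and self.keys[i] == key:
                return i       # tombstones are skipped, never matched
        return -1

    def get(self, key):
        i = self._find(key)
        return self.values[i] if i >= 0 else None

    def remove(self, key):
        i = self._find(key)
        if i >= 0:
            self.keys[i] = TOMBSTONE   # mark the slot instead of clearing it
            self.values[i] = None

table = TinyTable(8, lambda key: 1)    # every key hashes to 1
for k, v in [("k1", "v1"), ("k2", "v2"), ("k3", "v3")]:
    table.insert(k, v)
table.remove("k2")
print(table.get("k3"))
```

If `remove` set the slot back to `None` instead of the marker, the search for k3 would stop at index two and wrongly report that the key is missing.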
So that\n 2664 05:30:42,128 --> 05:30:49,619 We call this lazy deletion or lazy relocation.\n 2665 05:30:49,619 --> 05:30:55,279 there with k seven, v seven. And now we have\n 2666 05:30:55,279 --> 05:31:02,797 going to want to replace the old one with\n 2667 05:31:02,797 --> 05:31:07,957 going to be having a look at some source code\n 2668 05:31:07,957 --> 05:31:14,877 as a collision resolution scheme. And you\n 2669 05:31:14,878 --> 05:31:22,297 williamfiset/data-structures,\n 2670 05:31:22,297 --> 05:31:28,887 to find a whole bunch of hash table implementations.\n 2671 05:31:28,887 --> 05:31:34,399 here. In particular, we have quadratic probing,\n 2672 05:31:34,400 --> 05:31:40,458 all very similar to each other. So I will\n 2673 05:31:40,457 --> 05:31:47,557 curious, you can go on GitHub and check them\n 2674 05:31:47,558 --> 05:31:50,420 or slightly more different is the double hashing. 2675 05:31:50,419 --> 05:31:57,619 But other than that, they are really essentially\n 2676 05:31:57,619 --> 05:32:04,489 we're going to have a look at the quadratic\n 2677 05:32:04,490 --> 05:32:13,100 Alright, here we are inside the code for the\n 2678 05:32:13,099 --> 05:32:19,717 let's dive right in. So I have a class called\n 2679 05:32:19,718 --> 05:32:25,208 it takes in two generic types K and V. So\n 2680 05:32:25,207 --> 05:32:31,779 type. So you're gonna have to specify these\n 2681 05:32:31,779 --> 05:32:35,957 hash table for quadratic probing. So I have\n 2682 05:32:35,957 --> 05:32:42,467 to need. The first is the load factor, this\n 2683 05:32:42,468 --> 05:32:49,580 that we're willing to tolerate. The current\n 2684 05:32:49,580 --> 05:32:56,298 we're willing to tolerate. The modification\n 2685 05:32:58,860 --> 05:33:04,260 two instance variables that keep track of\n 2686 05:33:04,259 --> 05:33:08,489 the key count, which tracks the number of\n 2687 05:33:08,490 --> 05:33:15,370 table. 
Now, since we're doing open addressing,\n 2688 05:33:15,369 --> 05:33:22,450 inside an array. So instead of having one\n 2689 05:33:22,450 --> 05:33:28,119 decided just allocate two different arrays,\n 2690 05:33:28,119 --> 05:33:35,110 the code a lot easier. And shorter. Actually,\n 2691 05:33:35,110 --> 05:33:41,690 to be using, or rather setting when we call\n 2692 05:33:41,689 --> 05:33:45,989 was talking about in the last video. This\n 2693 05:33:45,990 --> 05:33:52,030 deletions. So every time we delete an entry,\n 2694 05:33:52,029 --> 05:33:58,509 we know this tombstone object is unique. Alright,\n 2695 05:33:58,509 --> 05:34:03,619 So whenever you want to initialize a hash\n 2696 05:34:03,619 --> 05:34:09,329 these constants. So this is a default load\n 2697 05:34:09,330 --> 05:34:17,430 it up as you like. So you can initialize it\n 2698 05:34:17,430 --> 05:34:26,000 constructor. So let's have a look. So the\n 2699 05:34:26,000 --> 05:34:31,900 check if the user pass in some sort of a weird\n 2700 05:34:31,900 --> 05:34:40,138 then set the max value for the load factor,\n 2701 05:34:40,137 --> 05:34:49,899 that the capacity is a power of two, I need\n 2702 05:34:49,900 --> 05:34:58,930 going to be with this method next to power.\n 2703 05:34:58,930 --> 05:35:05,750 That is just Above this current number, or\n 2704 05:35:05,750 --> 05:35:11,009 itself is a power of two, so we don't have\n 2705 05:35:11,009 --> 05:35:15,599 to be a power of two, then we compute the\n 2706 05:35:15,599 --> 05:35:25,877 the capacity and initialize our tables. Alright,\n 2707 05:35:25,878 --> 05:35:35,297 probing function I chose. So P of x, so we\n 2708 05:35:35,297 --> 05:35:42,949 x squared plus x divided by two. So this is\n 2709 05:35:42,950 --> 05:35:50,049 So given a hash value, it essentially strips\n 2710 05:35:50,049 --> 05:35:59,000 So it dumps our hash value inside the domain,\n 2711 05:35:59,000 --> 05:36:05,128 a clear method. 
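The helper steps mentioned here (rounding the capacity up to a power of two, the probing function, and normalizing a hash into the table's index range) might look roughly like this in Python. The function names, the 0.45 load factor and the 32-bit sign mask are assumptions for illustration and are not taken from the repository being walked through.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n (n itself if it already is one)."""
    p = 1
    while p < n:
        p *= 2
    return p

def probe(x):
    # Quadratic probing function p(x) = (x^2 + x) / 2
    return (x * x + x) // 2

def normalize_index(key_hash, capacity):
    # Drop the sign of a 32-bit style hash, then wrap into [0, capacity)
    return (key_hash & 0x7FFFFFFF) % capacity

capacity = next_power_of_two(10)
threshold = int(capacity * 0.45)   # 0.45 is an assumed max load factor
print(capacity, threshold, normalize_index(-7, capacity))
```

Keeping the capacity a power of two is what lets this particular probing function reach every slot, so the rounding step is not optional for this scheme.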
And this is pretty self explanatory.\n 2712 05:36:05,128 --> 05:36:14,792 hash table and start fresh. Some helper methods\n 2713 05:36:14,792 --> 05:36:21,490 tables empty. And put add an insert or essentially\n 2714 05:36:21,490 --> 05:36:27,950 insert method. This inserts a key value pair\n 2715 05:36:27,950 --> 05:36:34,869 the key already exists. All right, we don't\n 2716 05:36:34,869 --> 05:36:40,180 an exception. If the number of buckets use\n 2717 05:36:40,180 --> 05:36:46,689 we're tolerating, we're going to resize the\n 2718 05:36:46,689 --> 05:36:52,590 we want to calculate the hash value from the\n 2719 05:36:52,590 --> 05:36:59,040 you can override this for your particular\n 2720 05:36:59,040 --> 05:37:04,100 Jane xR. So I is going to be the current index\n 2721 05:37:04,099 --> 05:37:11,899 to be bouncing around this I value is going\n 2722 05:37:11,900 --> 05:37:17,540 the first tombstone we encounter if we encounter\n 2723 05:37:17,540 --> 05:37:25,308 going to be using this for an optimization\n 2724 05:37:25,308 --> 05:37:32,450 to one, initially. Okay, so this is a do while\n 2725 05:37:34,529 --> 05:37:40,860 Alright, so first, we check in the key table,\n 2726 05:37:40,860 --> 05:37:46,819 is equal to minus one, that means we haven't\n 2727 05:37:46,819 --> 05:37:58,360 this tombstone. Okay, so this next check checks\n 2728 05:37:58,360 --> 05:38:03,797 meaning there's a key inside of it. So we\n 2729 05:38:03,797 --> 05:38:11,451 table. So that's what this does. It compares\n 2730 05:38:11,452 --> 05:38:22,409 trying to insert this key. And if j is equal\n 2731 05:38:22,409 --> 05:38:28,628 then just update the value. If we've hit a\n 2732 05:38:28,628 --> 05:38:36,708 the tombstone. And at the modification, count,\n 2733 05:38:36,707 --> 05:38:44,819 was there before just just in case why use\n 2734 05:38:44,819 --> 05:38:54,540 so we can do an insertion. So j is equal to\n 2735 05:38:54,540 --> 05:39:01,420 so far. 
So increment number of use buckets\n 2736 05:39:01,419 --> 05:39:13,679 pair. Otherwise, we have seen a tombstone\n 2737 05:39:13,680 --> 05:39:20,430 where the element is inserted where the deleted\n 2738 05:39:20,430 --> 05:39:27,297 of AI. So here we're inserting an AI, but\n 2739 05:39:27,297 --> 05:39:35,759 we're gonna return null because there was\n 2740 05:39:35,759 --> 05:39:41,807 a loop, so we get through all these if statements\n 2741 05:39:41,808 --> 05:39:49,620 need to keep probing we had a hash collision,\n 2742 05:39:49,619 --> 05:39:57,419 in it. So we need to probe so we need to offset\n 2743 05:39:57,419 --> 05:40:03,159 probing index or the probe. Addition and increment\n 2744 05:40:03,159 --> 05:40:08,290 to the next spot. And we'll do this while\n 2745 05:40:12,419 --> 05:40:20,679 so contains key and has key, just check if\n 2746 05:40:20,680 --> 05:40:27,430 do this, I'm being pretty lazy. And I'm just\n 2747 05:40:27,430 --> 05:40:33,797 instance variable in there called contains\n 2748 05:40:33,797 --> 05:40:40,201 key is inside our hash table or not. Because\n 2749 05:40:40,202 --> 05:40:46,978 have essentially the same code. So that's\n 2750 05:40:46,977 --> 05:40:53,739 at the get method since it's getting used\n 2751 05:40:53,740 --> 05:41:02,440 the original hash index is equal to the hash\n 2752 05:41:02,439 --> 05:41:10,127 do all the same stuff, or mostly except set\n 2753 05:41:10,128 --> 05:41:20,659 flag to be true, when we identify that the\n 2754 05:41:20,659 --> 05:41:26,968 our else condition is just shorter, we return\n 2755 05:41:26,968 --> 05:41:34,887 a new element, and set the contains slide\n 2756 05:41:34,887 --> 05:41:42,959 Remove method is actually quite a bit shorter.\n 2757 05:41:42,959 --> 05:41:49,319 is now find the hash set x to be equal to\n 2758 05:41:49,319 --> 05:41:57,529 too much. So we don't have a j position. So\n 2759 05:41:57,529 --> 05:42:07,878 probe until we find a spot. 
So for every loop,\n 2760 05:42:07,878 --> 05:42:15,279 position if this loop gets completed, so here's\n 2761 05:42:15,279 --> 05:42:22,610 skip over those. So if this happens, if the\n 2762 05:42:22,610 --> 05:42:29,690 return now. Otherwise, the key we want to\n 2763 05:42:29,689 --> 05:42:33,957 this check, because we check if it's null\n 2764 05:42:33,957 --> 05:42:45,569 So decrement, the key, count up the modification\n 2765 05:42:45,569 --> 05:42:52,619 here, and just wipe whatever value is in there.\n 2766 05:42:52,619 --> 05:42:59,180 the Remove method, and then just return the\n 2767 05:42:59,180 --> 05:43:04,330 okay, these two methods are pretty self explanatory,\n 2768 05:43:04,330 --> 05:43:09,718 that are contained within our hash table.\n 2769 05:43:09,718 --> 05:43:17,159 resize table method. So this is this gets\n 2770 05:43:17,159 --> 05:43:24,452 I mean, to grow the table size. And remember\n 2771 05:43:24,452 --> 05:43:30,308 implementation, we always need the capacity\n 2772 05:43:30,308 --> 05:43:35,540 the capacity is already a power of two, multiplying\n 2773 05:43:35,540 --> 05:43:46,920 fine. So we compute the new threshold allocates\n 2774 05:43:46,919 --> 05:43:56,169 table, but it's actually going to be the new\n 2775 05:43:56,169 --> 05:44:03,899 of interesting maneuver here, I swap the current\n 2776 05:44:03,900 --> 05:44:10,670 table, which I call an old table. In order\n 2777 05:44:10,669 --> 05:44:17,569 here, we'll get to that. So swap the key tables,\n 2778 05:44:17,569 --> 05:44:25,759 count and the bucket count. And the reason\n 2779 05:44:25,759 --> 05:44:37,069 the swap, well, the then the new table is\n 2780 05:44:37,069 --> 05:44:46,319 the old table. That might sound confusing,\n 2781 05:44:46,319 --> 05:44:53,270 insertions on or the pointer to it. 
Alright,\n 2782 05:44:53,270 --> 05:45:01,159 if we encounter a token or a pointer that's\n 2783 05:45:02,159 --> 05:45:08,000 So because we're avoiding reinserting tombstones,\n 2784 05:45:08,000 --> 05:45:12,707 even though our table might have been cluttered\n 2785 05:45:12,707 --> 05:45:20,090 all of them here. Alright, so that's that\n 2786 05:45:20,090 --> 05:45:26,659 at this yourself. It's just looping through\n 2787 05:45:26,659 --> 05:45:34,430 That's a pretty standard to string method.\n 2788 05:45:34,430 --> 05:45:39,457 open addressing. Today, I want to talk about\n 2789 05:45:39,457 --> 05:45:45,349 binary index tree, and you'll see why very\n 2790 05:45:45,349 --> 05:45:51,709 because it's such a powerful data structure.\n 2791 05:45:51,709 --> 05:45:58,529 really simple to code. So let's dive right\n 2792 05:45:58,529 --> 05:46:04,129 video series, and just some standard stuff.\n 2793 05:46:04,130 --> 05:46:11,250 structure exists, analyze the time complexity,\n 2794 05:46:11,250 --> 05:46:16,869 So in this video, we'll get to the range query,\n 2795 05:46:16,869 --> 05:46:24,128 and how to construct the Fenwick tree in linear\n 2796 05:46:24,128 --> 05:46:33,128 but I'm not going to be covering that in this\n 2797 05:46:33,128 --> 05:46:40,208 the motivation behind the Fenwick tree. 
So\n 2798 05:46:40,207 --> 05:46:49,637 and we want to query a range and find the\n 2799 05:46:49,637 --> 05:47:00,250 do would be to start at the position and scan\n 2800 05:47:00,250 --> 05:47:05,909 all the individual values between that range.\n 2801 05:47:05,909 --> 05:47:12,970 it'll soon get pretty slow, because we're\n 2802 05:47:12,970 --> 05:47:21,110 However, if we do something like compute all\n 2803 05:47:21,110 --> 05:47:30,292 do queries in constant time, which is really,\n 2804 05:47:30,292 --> 05:47:39,720 zero to be zero, and then we go in our array\n 2805 05:47:39,720 --> 05:47:46,900 sum, we get five and then five, plus or minus\n 2806 05:47:46,900 --> 05:47:54,470 is eight, and so on. So this is an elementary\n 2807 05:47:54,470 --> 05:48:02,600 out prefix sums. And then if we want to find\n 2808 05:48:02,599 --> 05:48:11,430 then we can get the difference between those\n 2809 05:48:11,430 --> 05:48:18,099 time thing to compute. The sum of the values\n 2810 05:48:18,099 --> 05:48:25,489 really great. However, there's a slight flaw\n 2811 05:48:25,490 --> 05:48:32,990 to update a value in our original array A,\n 2812 05:48:32,990 --> 05:48:40,659 Well, now we have to recompute all the prefix\n 2813 05:48:40,659 --> 05:48:46,169 recalculate all those prefix sums. And to\n 2814 05:48:46,169 --> 05:48:54,039 Chu was essentially created. So what is the\n 2815 05:48:54,040 --> 05:49:04,040 that supports range queries on arrays and\n 2816 05:49:04,040 --> 05:49:13,090 we won't be covering that in this video. So\n 2817 05:49:13,090 --> 05:49:20,229 dates are logarithmic. 
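The prefix sum idea that motivates the Fenwick tree can be sketched as follows. The array values are hypothetical, picked only so that the running sums start 5, 2, 8 like the ones read out above, and the function names are made up for this example.

```python
def build_prefix_sums(a):
    """P[0] = 0 and P[i] = a[0] + ... + a[i-1]."""
    p = [0]
    for value in a:
        p.append(p[-1] + value)
    return p

def range_sum(p, i, j):
    # Sum of a[i], ..., a[j-1] in constant time
    return p[j] - p[i]

a = [5, -3, 6, 1]            # hypothetical values
p = build_prefix_sums(a)
print(p)
print(range_sum(p, 1, 3))    # sum of a[1] and a[2]
```

The weakness is visible in `build_prefix_sums`: changing a single entry of `a` means every later entry of `p` has to be recomputed, which takes linear time.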
Range sum queries\n 2818 05:49:20,229 --> 05:49:31,250 but you can't say add elements to the array\n 2819 05:49:34,457 --> 05:49:38,797 so let's look at how we can do range queries\n 2820 05:49:38,797 --> 05:49:46,319 is that unlike a regular array, a Fenwick tree,\n 2821 05:49:46,319 --> 05:49:56,759 rather for a range of other cells as well.\n 2822 05:49:56,759 --> 05:50:04,739 for other cells depending on what the value\n 2823 05:50:04,740 --> 05:50:13,090 representation. So on the left, I have a one\n 2824 05:50:13,090 --> 05:50:19,029 very important. And, on the side of that,\n 2825 05:50:19,029 --> 05:50:26,057 of the numbers, you can clearly see what they\n 2826 05:50:26,058 --> 05:50:35,690 index 12, its binary representation is 1100.\n 2827 05:50:35,689 --> 05:50:44,329 most bit. So that is at position three, and\n 2828 05:50:44,330 --> 05:50:51,280 responsible for we're going to say two to\n 2829 05:50:51,279 --> 05:51:03,009 cells below itself. Similarly, 10 has a binary\n 2830 05:51:03,009 --> 05:51:09,279 least significant bit is at position two.\n 2831 05:51:09,279 --> 05:51:18,378 And 11 has its least significant bit at position\n 2832 05:51:18,378 --> 05:51:29,378 So here, I've outlined the least significant\n 2833 05:51:29,378 --> 05:51:36,110 for all the odd numbers, which are just responsible\n 2834 05:51:36,110 --> 05:51:42,529 indicates, the blue bars don't represent value,\n 2835 05:51:42,529 --> 05:51:51,399 And that's really important for you to keep\n 2836 05:51:51,400 --> 05:51:59,100 a range of responsibility of two. Now the cells have\n 2837 05:51:59,099 --> 05:52:08,920 all these ranges of responsibilities are powers\n 2838 05:52:08,920 --> 05:52:18,479 16 for 16 cells. 
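The range of responsibility of a cell follows directly from its least significant set bit. Here is a small Python sketch, assuming one-based indices as in the diagram described above. The helper names `lsb` and `responsibility_range` are invented for this example.

```python
def lsb(i):
    """Value of the least significant set bit of i (for i > 0)."""
    return i & -i

def responsibility_range(i):
    # Cell i covers the one-based indices from i - lsb(i) + 1 up to i
    return (i - lsb(i) + 1, i)

for i in (10, 11, 12, 16):
    print(i, format(i, "b"), lsb(i), responsibility_range(i))
```

For index 12 (binary 1100) the lowest set bit has value 4, so the cell covers four cells ending at itself, while every odd index has a lowest set bit of 1 and covers only itself.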
So now, how do we do a range\n 2839 05:52:18,479 --> 05:52:26,360 in this standard array, but rather this weird\n 2840 05:52:26,360 --> 05:52:32,770 answer is, we're going to calculate the prefix\n 2841 05:52:32,770 --> 05:52:37,220 eventually going to allow us to do a range\n 2842 05:52:37,220 --> 05:52:44,878 sums, just like we did for a regular array,\n 2843 05:52:44,878 --> 05:52:52,860 going to start at some index and cascade downwards\n 2844 05:52:52,860 --> 05:52:59,930 I mean. So for example, let's find the prefix\n 2845 05:52:59,930 --> 05:53:06,240 the prefix sum from index one to seven,\n 2846 05:53:06,240 --> 05:53:17,190 tree is inclusive. So if we look at where\n 2847 05:53:17,189 --> 05:53:25,419 the array at position seven. And then we want\n 2848 05:53:25,419 --> 05:53:34,709 us is six, and then four. Notice that we were\n 2849 05:53:34,709 --> 05:53:43,279 then we move down to six. And then from six,\n 2850 05:53:43,279 --> 05:53:48,759 responsibility two, and then we're at four\n 2851 05:53:48,759 --> 05:53:53,279 that brings us all the way down to zero. And\n 2852 05:53:53,279 --> 05:54:03,520 we are all the way down to zero. So the prefix\n 2853 05:54:03,520 --> 05:54:10,420 plus the array at index six plus the array at index\n 2854 05:54:10,419 --> 05:54:16,649 for index 11. So we always start at where\n 2855 05:54:16,650 --> 05:54:24,569 to cascade down. So the cell directly below\n 2856 05:54:24,569 --> 05:54:27,450 to so we're gonna go down to 2857 05:54:27,450 --> 05:54:34,139 so that's eight, and eight brings us all\n 2858 05:54:34,139 --> 05:54:41,509 And one last one, let's find the prefix sum\n 2859 05:54:41,509 --> 05:54:47,659 four has a range of responsibility of exactly\n 2860 05:54:47,659 --> 05:54:54,869 so we can stop. Okay, let's pull this all\n 2861 05:54:54,869 --> 05:55:05,000 i and j. 
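The cascading walk in the examples above can be traced with a few lines of Python. This sketch only lists which one-based cells are visited; `prefix_path` is a hypothetical helper name.

```python
def prefix_path(i):
    """Cells visited when computing the prefix sum up to one-based index i."""
    path = []
    while i > 0:
        path.append(i)
        i -= i & -i  # drop the least significant set bit
    return path


print(prefix_path(7))   # 7, then 6, then 4
print(prefix_path(11))  # 11, then 10, then 8
print(prefix_path(4))   # 4 alone covers cells 1..4
```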
So let's calculate the interval sum\n 2862 05:55:05,000 --> 05:55:14,137 going to calculate the prefix sum of 15 and\n 2863 05:55:14,137 --> 05:55:19,369 of 15. And then we're going to calculate the\n 2864 05:55:19,369 --> 05:55:26,878 not going to calculate up to 11 inclusive,\n 2865 05:55:26,878 --> 05:55:36,308 a lot. Okay, so if we start at 15, then we\n 2866 05:55:36,308 --> 05:55:44,218 a range of responsibility of one, subtract one\n 2867 05:55:44,218 --> 05:55:53,159 responsibility of two, because the least significant\n 2868 05:55:53,159 --> 05:56:02,599 and then keep cascading down. So the prefix\n 2869 05:56:02,599 --> 05:56:11,457 plus 12 plus eight. All right, now the prefix\n 2870 05:56:11,457 --> 05:56:17,539 to start at 10. Now we want to cascade down\n 2871 05:56:17,540 --> 05:56:25,128 to subtract two from 10, we get to eight.\n 2872 05:56:25,128 --> 05:56:33,310 of eight, so cascade down, so eight minus\n 2873 05:56:33,310 --> 05:56:43,210 sum of all the indices of 15 minus those of\n 2874 05:56:43,209 --> 05:56:53,399 range sum. So notice that in the worst case,\n 2875 05:56:53,400 --> 05:57:00,069 which is all the ones and these are the numbers\n 2876 05:57:00,069 --> 05:57:08,590 of two, minus one. So a power of two has one\n 2877 05:57:08,590 --> 05:57:15,680 one, then you get a whole bunch of ones here. So\n 2878 05:57:15,680 --> 05:57:24,670 are ones. And those are the worst cases. So\n 2879 05:57:24,669 --> 05:57:34,429 say 15, and seven, both of which have a lot\n 2880 05:57:34,430 --> 05:57:40,670 base two of n operations. But in the average\n 2881 05:57:40,669 --> 05:57:46,269 going to implement this in such a way that\n 2882 05:57:46,270 --> 05:57:54,860 this is like super fast. So the range query\n 2883 05:57:54,860 --> 05:58:02,477 like literally no code, the range query from\n 2884 05:58:02,477 --> 05:58:09,439 so I'm going to define a function called prefix\n 2885 05:58:09,439 --> 05:58:17,619 down operation. 
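A minimal Python sketch of the prefix-sum and range-query functions being introduced here. The video's source is Java; the names below and the slow `build_by_definition` helper (used only so the example is self-contained) are assumptions.

```python
def lsb(i):
    return i & -i


def build_by_definition(values):
    """Slow reference build: values[0] is unused, cells are one-based."""
    n = len(values) - 1
    tree = [0] * (n + 1)
    for i in range(1, n + 1):
        # cell i stores the sum of its range of responsibility
        tree[i] = sum(values[i - lsb(i) + 1 : i + 1])
    return tree


def prefix_sum(tree, i):
    """Sum of cells 1..i, cascading down by removing the lowest set bit."""
    total = 0
    while i > 0:
        total += tree[i]
        i -= lsb(i)
    return total


def range_sum(tree, i, j):
    """Sum of cells i..j inclusive, both one-based."""
    return prefix_sum(tree, j) - prefix_sum(tree, i - 1)


tree = build_by_definition([0] + list(range(1, 17)))  # cells hold 1..16
print(range_sum(tree, 11, 15))  # 11 + 12 + 13 + 14 + 15
```

The query for 11 to 15 is computed as the prefix sum of 15 minus the prefix sum of 10, matching the walkthrough.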
So we start at i and while\n 2886 05:58:17,619 --> 05:58:28,899 up the values in our Fenwick tree. And we're\n 2887 05:58:28,900 --> 05:58:34,990 bit. And we're going to keep doing\n 2888 05:58:34,990 --> 05:58:42,740 can return the sum. So the range query manages\n 2889 05:58:42,740 --> 05:58:50,792 really neat little algorithm. I want to talk\n 2890 05:58:50,792 --> 05:58:56,650 let's dive right in. But before we get to\n 2891 05:58:56,650 --> 05:59:05,140 the Fenwick tree range query video. And I\n 2892 05:59:05,139 --> 05:59:14,291 the Fenwick tree is set up and how we're doing\n 2893 05:59:14,292 --> 05:59:22,990 your brain on how we actually did a prefix\n 2894 05:59:22,990 --> 05:59:30,770 what we did was, we started at a value at\n 2895 05:59:30,770 --> 05:59:38,740 continuously removed the least significant\n 2896 05:59:41,330 --> 05:59:51,320 so let's say we started at 13. For 13, the least\n 2897 05:59:51,319 --> 06:00:00,459 and we got 12. And then we found out that\n 2898 06:00:00,459 --> 06:00:06,129 So we remove four, and then the least significant\n 2899 06:00:06,130 --> 06:00:14,530 zero. And once we reach zero, we know that\n 2900 06:00:14,529 --> 06:00:20,717 analogous to this, instead of removing, we're\n 2901 06:00:20,718 --> 06:00:29,170 as you'll see. So for instance, if we want\n 2902 06:00:29,169 --> 06:00:38,799 to find out all the cells which are responsible\n 2903 06:00:38,799 --> 06:00:45,540 for a range of responsibility. So if we start\n 2904 06:00:45,540 --> 06:00:54,628 and add it to nine, and we get 10. So 10,\n 2905 06:00:54,628 --> 06:01:03,501 two, then we add it to 10, then find the least\n 2906 06:01:03,501 --> 06:01:10,540 it's 16. And then we would do the same thing,\n 2907 06:01:10,540 --> 06:01:20,130 know to stop. So if I draw a line outwards\n 2908 06:01:20,130 --> 06:01:28,049 ones I have to update. 
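The upward walk for a point update, as in the example starting at nine, can be traced like this. The sketch assumes a one-based tree with 16 cells; `update_path` is a hypothetical helper name.

```python
def update_path(i, n):
    """Cells that must change when one-based position i changes (tree has n cells)."""
    path = []
    while i <= n:
        path.append(i)
        i += i & -i  # add the least significant set bit
    return path


print(update_path(9, 16))  # 9, 10, 12, 16
print(update_path(6, 16))  # 6, 8, 16
```

These are the cells whose range of responsibility includes the updated position.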
So remember, those\n 2909 06:01:28,049 --> 06:01:36,399 So the lines that I hit are the ones that\n 2910 06:01:36,400 --> 06:01:45,058 Okay. So 14, add some constant x, at position\n 2911 06:01:45,058 --> 06:01:52,969 need to modify? So we start at six, and we find\n 2912 06:01:52,969 --> 06:02:02,479 we get eight, find the least significant bit\n 2913 06:02:02,479 --> 06:02:10,450 out from six, then indeed the cells that I\n 2914 06:02:10,450 --> 06:02:25,250 updates for our Fenwick tree are that we need\n 2915 06:02:25,250 --> 06:02:31,069 the algorithm is really, really simple. So\n 2916 06:02:31,069 --> 06:02:39,957 array of size n, then while i, so i is the position\n 2917 06:02:39,957 --> 06:02:47,229 less than n, we're going to add x to the tree\n 2918 06:02:47,229 --> 06:02:56,520 significant bit of i. And that's it, where\n 2919 06:02:56,520 --> 06:03:01,850 i. And there are built-in functions to do\n 2920 06:03:01,849 --> 06:03:07,319 at some source code. Alright, so that was\n 2921 06:03:07,319 --> 06:03:13,169 construct the Fenwick tree. Let's talk about\n 2922 06:03:13,169 --> 06:03:19,039 how to do range queries and point updates\n 2923 06:03:19,040 --> 06:03:24,708 seen how to construct the Fenwick tree yet.\n 2924 06:03:24,707 --> 06:03:32,599 you can't understand the Fenwick tree construction\n 2925 06:03:32,599 --> 06:03:39,451 updates work. Alright, so let's dive right\n 2926 06:03:39,452 --> 06:03:47,610 of a Fenwick tree. So if we're given an array\n 2927 06:03:47,610 --> 06:03:54,930 into a Fenwick tree, what we could do is initialize\n 2928 06:03:54,930 --> 06:04:01,808 all zeros and add the values into the Fenwick\n 2929 06:04:01,808 --> 06:04:10,260 get a total time complexity of order n log\n 2930 06:04:10,259 --> 06:04:20,299 do this in linear time. 
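A small Python sketch of the point update and of the naive construction mentioned above (start from zeros and add each value). The names are assumptions. The tree list here has a spare slot at index 0, so the loop bound is written as `i <= n` with `n = len(tree) - 1`; whether the bound reads "less than" or "less than or equal" depends on that indexing convention.

```python
def add(tree, i, x):
    """Add x at one-based position i, updating every cell responsible for i."""
    n = len(tree) - 1
    while i <= n:
        tree[i] += x
        i += i & -i


def build_naive(values):
    """O(n log n) construction: one point update per value."""
    tree = [0] * (len(values) + 1)
    for position, value in enumerate(values, start=1):
        add(tree, position, value)
    return tree


tree = build_naive([1, 2, 3, 4, 5])
add(tree, 3, 10)  # cell 3 and its parent cell 4 both change
print(tree)
```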
So why bother with\n 2931 06:04:20,299 --> 06:04:24,790 we're going to be given an array of values\n 2932 06:04:24,790 --> 06:04:33,120 a legitimate Fenwick tree, not just the array\n 2933 06:04:33,119 --> 06:04:40,409 going to propagate the values throughout our\n 2934 06:04:40,409 --> 06:04:47,121 And we're going to do this by updating the\n 2935 06:04:47,121 --> 06:04:52,169 as we pass through the entire tree, everyone's\n 2936 06:04:52,169 --> 06:04:59,069 a fully functional Fenwick tree at the end\n 2937 06:04:59,069 --> 06:05:05,650 idea. So you You propagate some thing to the\n 2938 06:05:05,650 --> 06:05:11,690 that parent propagate its value to its parent\n 2939 06:05:11,689 --> 06:05:19,439 almost delegating the value. So let's see\n 2940 06:05:19,439 --> 06:05:28,930 that. So if the current position is position\n 2941 06:05:30,580 --> 06:05:39,150 our parent, let's say that is J, and j is\n 2942 06:05:39,150 --> 06:05:49,490 of AI. Alright, so if we start at one, well,\n 2943 06:05:49,490 --> 06:05:57,930 the parent is at position two. So notice that\n 2944 06:05:57,930 --> 06:06:03,569 going to add two for the value of i, which\n 2945 06:06:03,569 --> 06:06:13,319 seven. Now, we want a bit position two. So\n 2946 06:06:13,319 --> 06:06:20,400 two. So two, plus at least significant bit\n 2947 06:06:20,400 --> 06:06:28,819 for two are immediately responsible for two.\n 2948 06:06:28,819 --> 06:06:34,409 who is responsible for three? Well, that's\n 2949 06:06:34,409 --> 06:06:43,279 value at index three. Now, who's responsible\n 2950 06:06:43,279 --> 06:06:50,457 So go to position eight, and add the value\n 2951 06:06:50,457 --> 06:06:59,297 So in our five, and then you see how we keep\n 2952 06:06:59,297 --> 06:07:04,799 cell responsible for us, now seven is updating\neight. 
2953 06:07:05,799 --> 06:07:12,329 now, nobody, oh, eight doesn't have a parent.\n 2954 06:07:12,330 --> 06:07:19,420 only has 12 cells, but the parent that would\n 2955 06:07:19,419 --> 06:07:27,911 out of bounds. So we just ignore it. It's\n 2956 06:07:27,911 --> 06:07:35,707 I is nine nines, least significant bit is\n 2957 06:07:35,707 --> 06:07:44,889 So keep propagating that value 10s, parent\n 2958 06:07:44,889 --> 06:07:50,849 the same sort of situation we had with eight\n 2959 06:07:50,849 --> 06:07:59,579 we ignore it. So the values that are there\n 2960 06:07:59,580 --> 06:08:07,728 And with these values, we can do range queries\n 2961 06:08:07,727 --> 06:08:14,520 that we had. So let's look at the construction\n 2962 06:08:14,520 --> 06:08:19,600 this, we will have a look at some source code\n 2963 06:08:19,599 --> 06:08:26,189 language that I'm not using, this can be helpful.\n 2964 06:08:26,189 --> 06:08:32,230 into a Fenwick tree. Let's get it's about\n 2965 06:08:32,230 --> 06:08:41,069 actually clone or make a deep copy of the\n 2966 06:08:41,069 --> 06:08:46,878 manipulate the values array while you're constructing\n 2967 06:08:46,878 --> 06:08:53,170 because we're doing all this stuff in place.\n 2968 06:08:53,169 --> 06:09:02,379 I at one and go up to n and then compute j\n 2969 06:09:02,380 --> 06:09:12,560 significant bit of I do an if statement to\n 2970 06:09:12,560 --> 06:09:16,521 be less than or equal to and actually not\n 2971 06:09:16,521 --> 06:09:22,610 based and in a Fenwick tree. Yeah, I'm pretty\n 2972 06:09:22,610 --> 06:09:28,340 let's have a look at some Fenwick tree source\n 2973 06:09:28,340 --> 06:09:34,659 can find it at this link which I'll put in\n 2974 06:09:34,659 --> 06:09:43,840 data dash structures, and the Fenwick trees\n 2975 06:09:43,840 --> 06:09:53,457 tree folder. So let's dive right in. 
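The linear-time construction described above can be sketched in Python as follows. It copies the input so the caller's list is not modified, then lets each cell add its value to its immediate parent. The function name is an assumption, and the `j <= n` bound reflects the one-based layout used here.

```python
def build_linear(values):
    """O(n) Fenwick tree construction; returns a one-based tree list."""
    n = len(values)
    tree = [0] + list(values)  # copy, with an unused slot at index 0
    for i in range(1, n + 1):
        j = i + (i & -i)       # immediate parent of cell i
        if j <= n:             # parents past the end are ignored
            tree[j] += tree[i]
    return tree


print(build_linear([1, 2, 3, 4, 5]))
```

The result matches what repeated point updates would produce, but each cell is touched a constant number of times.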
I have\n 2976 06:09:53,457 --> 06:10:00,387 Alright, so this source code is provided to\n 2977 06:10:00,387 --> 06:10:07,360 layer two, any language you're working. So\n 2978 06:10:07,360 --> 06:10:15,110 constructors. One, they'll create an empty\n 2979 06:10:15,110 --> 06:10:21,718 populate yourself. And another one, which\n 2980 06:10:21,718 --> 06:10:29,319 in the last video, and constructs the Fenwick\n 2981 06:10:29,319 --> 06:10:33,047 the constructor you want to use, and not the\n 2982 06:10:33,047 --> 06:10:41,699 either or. So one thing that you guys should\n 2983 06:10:41,700 --> 06:10:50,100 thing needs to be one based. In the last video,\n 2984 06:10:50,099 --> 06:10:57,009 go less than or less than or equal to the\n 2985 06:10:57,009 --> 06:11:02,957 on whether your array is one based or zero\n 2986 06:11:02,957 --> 06:11:11,877 one based in which case, it would be less\n 2987 06:11:11,878 --> 06:11:17,887 right. But other than that, so this is just\n 2988 06:11:17,887 --> 06:11:24,869 propagate the value to the parent. So that's\n 2989 06:11:24,869 --> 06:11:30,539 So pretty simple stuff. So this is probably\n 2990 06:11:30,540 --> 06:11:36,639 least significant bit method. And it's going\n 2991 06:11:36,639 --> 06:11:45,957 bit for some integer i. So this bit magic\n 2992 06:11:45,957 --> 06:11:54,389 the least significant bit value. Something\n 2993 06:11:54,389 --> 06:11:58,930 here, which uses Java's built in method find\n 2994 06:11:58,930 --> 06:12:06,968 However, using a Rob manipulation, like this\n 2995 06:12:06,968 --> 06:12:14,170 faster. Okay, so the prefix sums, this is\n 2996 06:12:14,169 --> 06:12:20,469 allows you to compute the prefix sum from\n 2997 06:12:20,470 --> 06:12:29,960 done one based. So this would do the cascading\n 2998 06:12:29,959 --> 06:12:35,887 a sum equal to zero, and add the values of\n 2999 06:12:35,887 --> 06:12:46,579 cascading down. 
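To make the "bit magic" remark concrete, here is a Python sketch comparing the two's-complement trick with a slower, more readable loop. Both helper names are assumptions; the video's code is Java and this is only an analogue.

```python
def lsb(i):
    """Isolate the lowest set bit using two's-complement arithmetic."""
    return i & -i


def lsb_reference(i):
    """Readable reference: scan upward until a set bit is found (i > 0)."""
    bit = 1
    while (i & bit) == 0:
        bit <<= 1
    return bit


# The two agree on every positive integer tried here.
print(all(lsb(i) == lsb_reference(i) for i in range(1, 1025)))
```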
And this line, line 55, is equivalent\n 3000 06:12:46,580 --> 06:12:52,888 of i, which is a lot more readable than this\n 3001 06:12:52,887 --> 06:12:58,649 clears that bit. But you want to use as much\n 3002 06:12:58,650 --> 06:13:04,000 tree fast, even though it's already really,\n 3003 06:13:04,000 --> 06:13:12,680 you use, the less operation or machine level\n 3004 06:13:12,680 --> 06:13:21,029 program is going to be so much faster. Okay,\n 3005 06:13:21,029 --> 06:13:31,069 then we can call the prefix sum method right\n 3006 06:13:31,069 --> 06:13:38,900 So that's easy. So adding, so this is from\n 3007 06:13:38,900 --> 06:13:43,740 So k can be positive or negative, that\n 3008 06:13:43,740 --> 06:13:52,290 to do is, for i, you're going to update everyone\n 3009 06:13:52,290 --> 06:13:56,850 that are responsible for you. And for each\n 3010 06:13:56,849 --> 06:14:05,669 k. And then you're going to propagate up to\n 3011 06:14:05,669 --> 06:14:10,579 least significant bit, and you're going to\n 3012 06:14:10,580 --> 06:14:19,208 at some valid index. And this additional\n 3013 06:14:19,207 --> 06:14:29,180 the index is equal to k, this might sometimes\n 3014 06:14:29,180 --> 06:14:38,650 and then call the add method. So pretty simple\n 3015 06:14:38,650 --> 06:14:48,240 and half of it is comments. So this is a really\n 3016 06:14:48,240 --> 06:14:54,290 An interesting topic I want to talk about\n 3017 06:14:54,290 --> 06:14:59,218 powerful data structure to have in your toolbox\n 3018 06:14:59,218 --> 06:15:04,569 Suffix arrays are a relatively new data\n 3019 06:15:04,569 --> 06:15:11,239 due to the heavy memory consumption needs\n 3020 06:15:11,240 --> 06:15:19,468 and talk about just what a suffix is. For\n 3021 06:15:19,468 --> 06:15:27,810 at the end of a string. For example, if we\n 3022 06:15:27,810 --> 06:15:36,370 of the string horse are, we are able to come\n 3023 06:15:36,369 --> 06:15:49,369 e, se, rse, and so on. 
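As a quick illustration of what a suffix is, this Python sketch lists every non-empty suffix of a string. The function name is an assumption.

```python
def suffixes(s):
    """All non-empty suffixes of s, longest first."""
    return [s[i:] for i in range(len(s))]


print(suffixes("horse"))
```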
Now we can answer\n 3024 06:15:49,369 --> 06:15:57,539 is a suffix array is the array containing\n 3025 06:15:57,540 --> 06:16:06,090 see an example of this. Suppose, you want\n 3026 06:16:06,090 --> 06:16:14,090 On the left, I constructed a table with all\n 3027 06:16:14,090 --> 06:16:22,957 that particular suffix started in a string\n 3028 06:16:22,957 --> 06:16:29,279 all the suffixes in lexicographic order in\na table. 3029 06:16:29,279 --> 06:16:37,137 The actual suffix array is the array of sorted\n 3030 06:16:37,137 --> 06:16:42,378 to actually store the suffixes themselves\n 3031 06:16:42,378 --> 06:16:49,270 original string. This is an ingenious idea,\n 3032 06:16:49,270 --> 06:16:56,590 of the sort of suffixes without actually needing\n 3033 06:16:56,590 --> 06:17:04,869 All we need is the original string and the\n 3034 06:17:04,869 --> 06:17:12,159 suffix array is an array of indices which\n 3035 06:17:12,159 --> 06:17:17,329 for a bit of history on the suffix array,\n 3036 06:17:17,330 --> 06:17:24,900 to be a space efficient alternative to a suffix\n 3037 06:17:24,900 --> 06:17:32,069 a compressed version of another data structure\n 3038 06:17:32,069 --> 06:17:38,119 the suffix array is a very different from\n 3039 06:17:38,119 --> 06:17:45,529 about virtually anything, the suffix tree\n 3040 06:17:45,529 --> 06:17:53,600 such as a longest common prefix array, which\n 3041 06:17:53,600 --> 06:17:58,659 video, we're going to talk about perhaps the\n 3042 06:17:58,659 --> 06:18:06,040 with the suffix array. And that is the longest\n 3043 06:18:06,040 --> 06:18:14,208 array. The LCP array is an array where each\n 3044 06:18:14,207 --> 06:18:21,759 suffixes have in common with each other. 
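A naive way to build a suffix array, sketched in Python: sort the starting indices by the suffix each one begins. The example word is an assumption chosen for illustration, and this quadratic-or-worse approach stands in for the faster construction algorithms used in practice.

```python
def suffix_array(s):
    """Starting indices of the suffixes of s, in lexicographic order of the suffixes."""
    return sorted(range(len(s)), key=lambda i: s[i:])


word = "camel"  # illustrative input
sa = suffix_array(word)
print(sa)
print([word[i:] for i in sa])  # the sorted suffixes recovered from the indices
```

Only the indices are stored; each suffix can be recovered from the original string when needed.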
Let's\n 3045 06:18:21,759 --> 06:18:29,477 show what the LCP array is, is to do an example.\n 3046 06:18:29,477 --> 06:18:37,139 the LCP array is, for the string, A, B, A,\n 3047 06:18:37,139 --> 06:18:45,349 do is construct the suffix array for our string\n 3048 06:18:45,349 --> 06:18:52,019 Notice that the very first entry that we placed\n 3049 06:18:52,020 --> 06:19:00,369 is zero. This is because this index is undefined,\n 3050 06:19:00,369 --> 06:19:06,457 our LCP array, let's begin by looking at the\n 3051 06:19:06,457 --> 06:19:14,680 they have in common with each other. We notice\n 3052 06:19:14,680 --> 06:19:23,189 index of our LCP array. Now we move on to\n 3053 06:19:23,189 --> 06:19:32,869 have an LCP value of two and the next two\n 3054 06:19:32,869 --> 06:19:40,829 zero and the next two only have one character\n 3055 06:19:40,830 --> 06:19:51,080 and lastly, only one character in common.\n 3056 06:19:51,080 --> 06:20:00,208 highlighted in purple. In summary, the LCP\n 3057 06:20:00,207 --> 06:20:07,579 two sorted suffixes have in common with each\n 3058 06:20:07,580 --> 06:20:16,040 how much information can be derived from such\n 3059 06:20:16,040 --> 06:20:24,870 noting is that the very first index in our\n 3060 06:20:24,869 --> 06:20:32,169 LCP array as an integer array, by convention,\n 3061 06:20:32,169 --> 06:20:38,439 so that it doesn't interfere with any operations\n 3062 06:20:38,439 --> 06:20:48,419 And this is fine for most purposes. Lastly,\n 3063 06:20:48,419 --> 06:20:54,639 to construct it very efficiently. 
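The LCP array described above can be computed directly from its definition. This Python sketch is a slow reference version, not one of the efficient algorithms mentioned in the video; the function names are assumptions, and the input string is chosen so the values agree with the ones read out in the walkthrough.

```python
def suffix_array(s):
    return sorted(range(len(s)), key=lambda i: s[i:])


def lcp_array(s, sa):
    """lcp[r] = characters shared by the suffixes at ranks r-1 and r; lcp[0] = 0 by convention."""
    lcp = [0] * len(sa)
    for r in range(1, len(sa)):
        a, b = s[sa[r - 1]:], s[sa[r]:]
        while lcp[r] < min(len(a), len(b)) and a[lcp[r]] == b[lcp[r]]:
            lcp[r] += 1
    return lcp


text = "ABABBAB"
sa = suffix_array(text)
print(sa)
print(lcp_array(text, sa))
```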
There are\n 3064 06:20:54,639 --> 06:21:01,819 how to construct the LCP array, which run\n 3065 06:21:01,819 --> 06:21:11,029 in n log n time, and even in linear time.\n 3066 06:21:11,029 --> 06:21:19,610 of suffix arrays and LCP arrays, and that\n 3067 06:21:19,610 --> 06:21:24,860 There are a variety of interesting problems\n 3068 06:21:24,860 --> 06:21:30,727 that require you to find all the unique substrings\n 3069 06:21:30,727 --> 06:21:38,590 time complexity of n squared, which requires\n 3070 06:21:38,590 --> 06:21:45,860 the substrings of the string and dump them\n 3071 06:21:45,860 --> 06:21:52,529 the information stored inside the LCP array.\n 3072 06:21:52,529 --> 06:21:59,119 space efficient solution. I'm not saying that\n 3073 06:21:59,119 --> 06:22:05,779 substrings because there exist other notable\n 3074 06:22:05,779 --> 06:22:14,079 with Bloom filters. Let's now look at an example\n 3075 06:22:14,080 --> 06:22:21,320 look at an example of how to find all the\n 3076 06:22:21,319 --> 06:22:27,869 A. For every string there are exactly n times\n 3077 06:22:27,869 --> 06:22:34,509 of this I will leave as an exercise to the listener,\n 3078 06:22:34,509 --> 06:22:42,259 notice that all the substrings here, there\n 3079 06:22:42,259 --> 06:22:50,079 the repeated substrings there are exactly\n 3080 06:22:50,080 --> 06:22:55,370 use the information inside the LCP array to\n 3081 06:22:55,369 --> 06:23:03,039 were the duplicate ones in the table on the\n 3082 06:23:03,040 --> 06:23:11,780 AZAZA. 
Remember what the LCP array represents,\n 3083 06:23:11,779 --> 06:23:18,590 the original string share a certain amount\n 3084 06:23:18,590 --> 06:23:25,240 value at a certain index is say five, and\n 3085 06:23:25,240 --> 06:23:32,558 those two suffixes, in other words, there\n 3086 06:23:32,558 --> 06:23:39,270 two suffixes, since they come from the same\n 3087 06:23:39,270 --> 06:23:47,830 LCP position at index one, we see that has\n 3088 06:23:47,830 --> 06:23:55,590 string is the first character in a so we know\n 3089 06:23:55,590 --> 06:24:05,279 values three, so there are three repeated\n 3090 06:24:08,457 --> 06:24:17,779 ACA, the next interesting LCP values to for\n 3091 06:24:17,779 --> 06:24:26,977 substrings. Here we can eliminate z, and z\n 3092 06:24:26,977 --> 06:24:32,680 way of counting all unique substrings. We\n 3093 06:24:32,680 --> 06:24:40,409 is n times n plus one over two. And we also\n 3094 06:24:40,409 --> 06:24:46,387 is the sum of all the LCP values. If this\n 3095 06:24:46,387 --> 06:24:51,159 examples and play around with it. If we go\n 3096 06:24:51,159 --> 06:24:58,779 az, az a and we set n equal to five, which\n 3097 06:24:58,779 --> 06:25:08,719 the correct answer. 
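The counting formula stated above, n(n+1)/2 minus the sum of the LCP values, can be checked with a short Python sketch. The suffix array and LCP array are built naively here, and all names are assumptions for the example.

```python
def count_unique_substrings(s):
    """Number of distinct non-empty substrings: n(n+1)/2 minus the sum of LCP values."""
    n = len(s)
    sa = sorted(range(n), key=lambda i: s[i:])
    duplicates = 0
    for r in range(1, n):
        a, b = s[sa[r - 1]:], s[sa[r]:]
        k = 0
        while k < min(len(a), len(b)) and a[k] == b[k]:
            k += 1
        duplicates += k  # each shared prefix character is one repeated substring
    return n * (n + 1) // 2 - duplicates


def count_unique_brute_force(s):
    """Reference answer: collect every substring in a set."""
    return len({s[i:j] for i in range(len(s)) for j in range(i + 1, len(s) + 1)})


print(count_unique_substrings("AZAZA"), count_unique_brute_force("AZAZA"))
```

For AZAZA the LCP values sum to six, so the count is 15 minus 6, which is nine.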
have nine by punching\n 3098 06:25:08,720 --> 06:25:15,871 substring values summed up in the LCP array.\n 3099 06:25:15,871 --> 06:25:20,369 challenge concerning counting substrings,\n 3100 06:25:20,369 --> 06:25:27,569 to the Kattis online judge for some practice.\n 3101 06:25:27,569 --> 06:25:37,308 array and LCP array available on GitHub at github.com\n 3102 06:25:37,308 --> 06:25:44,280 There is a really neat problem called longest\n 3103 06:25:44,279 --> 06:25:51,939 the k common substring problem, which is really\n 3104 06:25:51,939 --> 06:25:58,590 state the problem and then discuss multiple\n 3105 06:25:58,590 --> 06:26:04,630 How do we find the longest common substring\n 3106 06:26:04,630 --> 06:26:13,208 being anywhere from two to n, the number of\n 3107 06:26:13,207 --> 06:26:23,169 strings, s one, s two, and s three, with the\n 3108 06:26:23,169 --> 06:26:28,789 a minimum of two strings from our pool of\n 3109 06:26:28,790 --> 06:26:36,620 substring between them. In this situation,\n 3110 06:26:36,619 --> 06:26:41,541 know that the longest common substring is\n 3111 06:26:41,542 --> 06:26:48,170 be multiple. The traditional approach to solving\n 3112 06:26:48,169 --> 06:26:54,919 dynamic programming, which can solve the problem\n 3113 06:26:54,919 --> 06:27:03,707 of the string lengths. Obviously, this method\n 3114 06:27:03,707 --> 06:27:11,199 avoid using it whenever possible. A far superior\n 3115 06:27:11,200 --> 06:27:16,977 is to use a suffix array, which can find the\n 3116 06:27:16,977 --> 06:27:24,457 sum of the string lengths. So how do we do this?\n 3117 06:27:24,457 --> 06:27:32,409 longest common substring problem? Let's consider\n 3118 06:27:32,409 --> 06:27:39,619 two and s three. What we will first want to\n 3119 06:27:39,619 --> 06:27:47,137 larger string, which I will call t, which\n 3120 06:27:47,137 --> 06:27:53,878 we must be careful and place unique sentinel\n 3121 06:27:53,878 --> 06:27:59,510 this for multiple reasons. 
But the main one\n 3122 06:27:59,509 --> 06:28:06,707 of suffixes when we construct the suffix array.\n 3123 06:28:06,707 --> 06:28:13,409 sentinel values need to be lexicographically\n 3124 06:28:13,409 --> 06:28:20,450 in any of our strings. So in the ASCII table,\n 3125 06:28:20,450 --> 06:28:27,637 sign are all less than any alphabetic character\n 3126 06:28:27,637 --> 06:28:35,307 so we're good in doing the concatenation.\n 3127 06:28:35,308 --> 06:28:41,270 construct the suffix array for t. This procedure\n 3128 06:28:41,270 --> 06:28:49,240 linear suffix array construction algorithm.\n 3129 06:28:49,240 --> 06:28:57,510 both the LCP array values on the leftmost\n 3130 06:28:57,509 --> 06:29:03,689 would appear in the suffix array on the right\n 3131 06:29:03,689 --> 06:29:09,967 color to match it with the original string\nit belongs to. 3132 06:29:09,968 --> 06:29:14,398 In this slide, you can see that the suffixes\n 3133 06:29:14,398 --> 06:29:19,319 the top because they were lexicographically\n 3134 06:29:19,319 --> 06:29:26,470 And this is to be expected. For our purposes,\n 3135 06:29:26,470 --> 06:29:34,430 them into the string t ourselves. So back\n 3136 06:29:34,430 --> 06:29:41,420 longest common substring of k strings? Given\n 3137 06:29:41,419 --> 06:29:50,189 LCP array constructed? The answer is, we're\n 3138 06:29:50,189 --> 06:29:58,467 colors which share the largest LCP value. Let's\n 3139 06:29:58,468 --> 06:30:08,190 equals three. We have three strings, this means\n 3140 06:30:08,189 --> 06:30:15,099 can achieve a maximum of two, if we select the\n 3141 06:30:15,099 --> 06:30:21,457 that each of them is a different color, meaning\n 3142 06:30:21,457 --> 06:30:28,297 And that the minimum LCP value in the selected\n 3143 06:30:28,297 --> 06:30:35,727 first entry in the window. This means that\n 3144 06:30:35,727 --> 06:30:46,659 string ca of length two, which is shared amongst\n 3145 06:30:46,659 --> 06:30:54,400 do another example. 
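The concatenation step with unique sentinels can be sketched in Python as below. The specific sentinel characters ('#', '$', '%', '&') and the three input strings are assumptions chosen for illustration; the only property relied on is that each sentinel is unique and sorts before every letter. The `owner` list plays the role of the colors in the slides.

```python
def concatenate_with_sentinels(strings, sentinels="#$%&"):
    """Join strings with unique sentinels; owner[p] is the index of the string position p came from."""
    if len(strings) > len(sentinels):
        raise ValueError("not enough distinct sentinel characters")
    parts, owner = [], []
    for index, s in enumerate(strings):
        parts.append(s + sentinels[index])
        owner.extend([index] * (len(s) + 1))  # the sentinel is attributed to its string
    return "".join(parts), owner


t, owner = concatenate_with_sentinels(["abca", "bcad", "daca"])
print(t)
print(owner)
```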
But this time, let's change\n 3146 06:30:54,400 --> 06:31:01,208 equals two, we want to have two suffixes of\n 3147 06:31:01,207 --> 06:31:10,029 prefix value between them. In this case, there\n 3148 06:31:10,029 --> 06:31:21,049 B CA, with a length of three shared between\n 3149 06:31:21,049 --> 06:31:28,110 we've covered only some trivial cases, things\n 3150 06:31:28,110 --> 06:31:35,819 colors you need are exactly adjacent with\n 3151 06:31:35,819 --> 06:31:42,628 this will be to use a sliding window technique\n 3152 06:31:42,628 --> 06:31:48,458 Here's what we'll do at each step, we'll adjust\n 3153 06:31:48,457 --> 06:31:55,739 window contains exactly k suffixes of different\n 3154 06:31:55,740 --> 06:32:03,560 correct amount of colors, we'll want to query\n 3155 06:32:03,560 --> 06:32:08,899 In the picture below, you can see that this\n 3156 06:32:08,899 --> 06:32:17,250 the LCP array for that window is two, again,\n 3157 06:32:17,250 --> 06:32:23,360 the minimum value for that window is two,\n 3158 06:32:23,360 --> 06:32:32,128 the prefix a G, which has length of two, as\n 3159 06:32:32,128 --> 06:32:39,069 on how we are actually going to perform the\n 3160 06:32:39,069 --> 06:32:46,378 current window we are considering. Turns out\n 3161 06:32:46,378 --> 06:32:51,580 it can be solved in a variety of ways. Since\n 3162 06:32:51,580 --> 06:32:58,290 just any arbitrary range query, we can use\n 3163 06:32:58,290 --> 06:33:05,542 range query problem to obtain the value we\n 3164 06:33:05,542 --> 06:33:11,940 a minimum range query data structure such\n 3165 06:33:11,939 --> 06:33:19,227 queries on the LCP array. This is theoretically\n 3166 06:33:19,227 --> 06:33:26,430 implement in my opinion. 
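The sliding-window search described above might look like the following in Python. This is a simplified sketch under several assumptions: the suffix array and LCP array are built naively, the window minimum is recomputed with `min` on every step rather than with a dedicated range-minimum structure, colors are tracked with a `Counter`, and the sentinels and sample strings are illustrative choices.

```python
from collections import Counter


def longest_common_substrings(strings, k, sentinels="#$%&"):
    """Longest substrings shared by at least k of the given strings (k >= 2)."""
    if k < 2 or len(strings) > len(sentinels):
        raise ValueError("need k >= 2 and one sentinel per string")
    t, owner = "", []
    for index, s in enumerate(strings):
        t += s + sentinels[index]
        owner += [index] * (len(s) + 1)

    sa = sorted(range(len(t)), key=lambda i: t[i:])
    lcp = [0] * len(sa)
    for r in range(1, len(sa)):
        a, b = t[sa[r - 1]:], t[sa[r]:]
        while lcp[r] < min(len(a), len(b)) and a[lcp[r]] == b[lcp[r]]:
            lcp[r] += 1

    best_len, best = 0, set()
    colors = Counter()
    lo = 0
    for hi in range(len(sa)):
        colors[owner[sa[hi]]] += 1               # expand the window downwards
        while len(colors) >= k:                  # enough colors: query, then shrink
            window_lcp = min(lcp[lo + 1 : hi + 1])
            candidate = t[sa[hi] : sa[hi] + window_lcp]
            if window_lcp > best_len:
                best_len, best = window_lcp, {candidate}
            elif window_lcp == best_len and window_lcp > 0:
                best.add(candidate)
            color = owner[sa[lo]]
            colors[color] -= 1
            if colors[color] == 0:
                del colors[color]
            lo += 1
    return best


print(longest_common_substrings(["abca", "bcad", "daca"], 3))
print(longest_common_substrings(["abca", "bcad", "daca"], 2))
```

Each step moves one endpoint of the window downwards, so the number of windows examined is linear in the number of suffixes; only the brute-force minimum keeps this sketch from being linear overall.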
So to implement the\n 3167 06:33:26,430 --> 06:33:27,939 data structure to keep track of 3168 06:33:27,939 --> 06:33:34,409 the colors in our window, I recommend using\n 3169 06:33:34,409 --> 06:33:40,378 left, I drew a table to indicate how much\n 3170 06:33:40,378 --> 06:33:48,069 window, a valid window will require at least\n 3171 06:33:48,069 --> 06:33:56,819 than zero. In the example that we'll follow,\n 3172 06:33:56,819 --> 06:34:04,520 all three colors will need to be present in\n 3173 06:34:04,520 --> 06:34:13,279 query I mean, querying the LCP array for the\n 3174 06:34:13,279 --> 06:34:19,659 possibly updating the best longest common\n 3175 06:34:19,659 --> 06:34:25,930 can see that our window is missing some blue.\n 3176 06:34:25,930 --> 06:34:32,590 is to expand the window down. And when we\n 3177 06:34:32,590 --> 06:34:41,279 down. So let's expand our window downwards\n 3178 06:34:41,279 --> 06:34:48,009 and enough of each color to do a valid query.\n 3179 06:34:48,009 --> 06:34:55,369 the longest common substring for the window\n 3180 06:34:55,369 --> 06:35:05,718 the string a G. Now since we have enough features\n 3181 06:35:05,718 --> 06:35:12,708 one green suffix, and we still have at least\n 3182 06:35:12,707 --> 06:35:21,779 a query to find out that we get the same result\n 3183 06:35:21,779 --> 06:35:29,489 more. And now we don't have enough green.\n 3184 06:35:29,490 --> 06:35:36,520 we need to do is expand the window downwards\n 3185 06:35:36,520 --> 06:35:42,909 was a green suffix right there. And we can\n 3186 06:35:42,909 --> 06:35:48,360 longest common substring value found so far.\n 3187 06:35:48,360 --> 06:35:54,700 is only of length one. 
So the longest common\n 3188 06:35:54,700 --> 06:36:02,069 which is not as long as the longest common\n 3189 06:36:02,069 --> 06:36:09,090 length three, so we can ignore this one and\n 3190 06:36:09,090 --> 06:36:15,670 and we're short one color, and that is red.\n 3191 06:36:15,669 --> 06:36:21,679 suffix. So we expand and find a blue suffix.\n 3192 06:36:21,680 --> 06:36:28,218 searching the expand Finally, green suffix,\n 3193 06:36:28,218 --> 06:36:33,790 reached. This video is getting a little long,\n 3194 06:36:33,790 --> 06:36:40,548 next video, we'll look at a full example of\n 3195 06:36:40,547 --> 06:36:45,419 meantime, if you're looking for a challenge\n 3196 06:36:45,419 --> 06:36:49,949 make sure to check out the life forms problem\n 3197 06:36:49,950 --> 06:36:55,229 for an implementation of longest common substring\n 3198 06:36:55,229 --> 06:37:03,500 to check out my algorithms repository@github.com\n 3199 06:37:03,500 --> 06:37:10,369 going to finish where we left off in the last\n 3200 06:37:10,369 --> 06:37:16,919 example solving the longest common substring\n 3201 06:37:16,919 --> 06:37:25,019 For this example, we're going to have four\n 3202 06:37:25,020 --> 06:37:32,490 I have also selected the value of K to be\n 3203 06:37:32,490 --> 06:37:38,000 of two strings of our pool of four to share\n 3204 06:37:38,000 --> 06:37:44,639 them. I have also provided you with a concatenated\n 3205 06:37:44,639 --> 06:37:50,308 solution at the bottom of the screen. In case\n 3206 06:37:50,308 --> 06:37:57,620 out for yourself. The first step in finding\n 3207 06:37:57,619 --> 06:38:05,387 of our four strings is to build the suffix\n 3208 06:38:05,387 --> 06:38:12,869 on the right side and the left side, respectively.\n 3209 06:38:12,869 --> 06:38:19,610 substring algorithm, notice the variables\n 3210 06:38:19,610 --> 06:38:27,029 and a window LCS values will track the longest\n 3211 06:38:27,029 --> 06:38:33,967 values for the current window. 
And the LCS\n 3212 06:38:33,968 --> 06:38:41,260 values so far. So let's get started. Initially,\n 3213 06:38:41,259 --> 06:38:46,959 the window to contain two different colors.\n 3214 06:38:46,959 --> 06:38:54,069 do not meet this criteria. As I expand down,\n 3215 06:38:54,069 --> 06:39:03,340 one color, I expand on again and still one\n 3216 06:39:03,340 --> 06:39:10,420 and now we arrive at a blue suffix. And here\n 3217 06:39:10,419 --> 06:39:16,429 window. However, our query isn't fruitful\n 3218 06:39:16,430 --> 06:39:24,628 is zero. So there's no longest common substring\n 3219 06:39:24,628 --> 06:39:32,878 like we do now, we decrease the window size\n 3220 06:39:32,878 --> 06:39:40,770 nothing interesting. Decrease the window size\n 3221 06:39:40,770 --> 06:39:46,600 The current window contains a longest common\n 3222 06:39:46,599 --> 06:39:53,840 common substring BC and added to our solution\n 3223 06:39:53,840 --> 06:40:01,650 because we meet the color requirement. Our\n 3224 06:40:01,650 --> 06:40:11,280 is two, so we need two different color strings.\n 3225 06:40:11,279 --> 06:40:18,659 has happened because we find an LCP value\n 3226 06:40:18,659 --> 06:40:25,319 best value. So we update the solution set\n 3227 06:40:25,319 --> 06:40:32,119 BC, which is one character longer. Now we\n 3228 06:40:32,119 --> 06:40:42,500 was now too small, so we expand the LCP value\n 3229 06:40:42,500 --> 06:40:51,270 the window size. Now expand to meet the color\n 3230 06:40:51,270 --> 06:40:58,950 that doesn't beat our current best, which\n 3231 06:40:58,950 --> 06:41:06,458 to meet the core requirements, so expand,\n 3232 06:41:06,457 --> 06:41:15,627 and LCP value of one for this window range.\n 3233 06:41:15,628 --> 06:41:22,819 an LCP value of two we're getting closer to\n 3234 06:41:22,819 --> 06:41:29,599 have to shrink and like let go. 
Now expand.\n 3235 06:41:29,599 --> 06:41:38,180 We have a window LCP value of three which\n 3236 06:41:38,180 --> 06:41:47,799 saying that CDE, our newfound longest\n 3237 06:41:47,799 --> 06:41:56,887 of the same length, we keep both in the solution\n 3238 06:41:56,887 --> 06:42:03,579 interval because we meet the color requirement.\n 3239 06:42:03,580 --> 06:42:15,910 expand again, our LCP window value is zero.\n 3240 06:42:15,909 --> 06:42:25,009 value of one here, that's not good enough.\n 3241 06:42:25,009 --> 06:42:31,340 Okay, we might be getting closer, but we meet\n 3242 06:42:31,340 --> 06:42:39,849 to meet the color requirement. These two strings\n 3243 06:42:39,849 --> 06:42:45,829 now shrink. Now we've reached the end and\n 3244 06:42:45,830 --> 06:42:54,440 problem with four strings and a k value\n 3245 06:42:54,439 --> 06:43:03,539 and shrinking, I want you to notice that each\n 3246 06:43:03,540 --> 06:43:12,069 I only ever moved one of the endpoints downwards.\n 3247 06:43:12,069 --> 06:43:19,637 know that the number of windows has to be\n 3248 06:43:19,637 --> 06:43:27,297 that we have. And the number of suffixes that\n 3249 06:43:27,297 --> 06:43:35,069 come to the conclusion that there must be\n 3250 06:43:35,069 --> 06:43:42,227 which is really good. Because we want our\n 3251 06:43:42,227 --> 06:43:49,110 going to be on an efficient way to solve the\n 3252 06:43:49,110 --> 06:43:55,490 The longest repeated substring problem is\n 3253 06:43:55,490 --> 06:44:00,450 in computer science, lots of problems can\n 3254 06:44:00,450 --> 06:44:06,409 important that we have an efficient way of\n 3255 06:44:06,409 --> 06:44:12,740 squared time and lots of space. 
What we want\n 3256 06:44:12,740 --> 06:44:21,558 inside the longest common prefix array to\n 3257 06:44:21,558 --> 06:44:26,878 Let's do an example what is the longest repeated\n 3258 06:44:26,878 --> 06:44:36,898 free to pause the video and figure it out\n 3259 06:44:36,898 --> 06:44:43,478 which is the longest substring that appears\n 3260 06:44:43,477 --> 06:44:51,840 the longest repeated substring. Here you can\n 3261 06:44:51,840 --> 06:44:58,409 And now you can see the second repeated instance\n 3262 06:44:58,409 --> 06:45:05,689 are disjoint and do not overlap. In general,\n 3263 06:45:05,689 --> 06:45:11,559 substring. Now let's solve the problem using\n 3264 06:45:11,560 --> 06:45:17,540 generated on the right hand side. I'll give\n 3265 06:45:17,540 --> 06:45:22,850 the LCP array in case you notice anything\n 3266 06:45:22,849 --> 06:45:30,627 substring. Now that you already know what\n 3267 06:45:30,628 --> 06:45:37,470 for in the LCP array is the maximum value.\n 3268 06:45:37,470 --> 06:45:43,819 know that the suffixes are already sorted.\n 3269 06:45:43,819 --> 06:45:49,520 longest common prefix value, then they share\n 3270 06:45:49,520 --> 06:45:55,930 We also know that if the LCP value at a certain\n 3271 06:45:55,930 --> 06:46:02,069 shared between the two adjacent suffixes is\n 3272 06:46:02,069 --> 06:46:09,340 between two suffixes, each of which start\n 3273 06:46:09,340 --> 06:46:16,650 again, is Abracadabra. Since our LCP value\n 3274 06:46:16,650 --> 06:46:24,970 from the suffix abracadabra forms one part\n 3275 06:46:24,970 --> 06:46:31,478 the suffix above it which shares the LCP value\n 3276 06:46:31,477 --> 06:46:39,599 in that longest repeated substring. Now, I\n 3277 06:46:39,599 --> 06:46:49,779 can you find the longest repeated substring\n 3278 06:46:49,779 --> 06:46:55,957 you did, you will find out that you not only\n 3279 06:46:55,957 --> 06:47:02,477 longest repeated substrings. 
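A minimal Java sketch of the idea described above: sort the suffixes, compute the LCP value of each pair of adjacent suffixes, and read off every substring whose LCP value equals the maximum. The class name `LongestRepeatedSubstringSketch`, the naive sorting-based suffix array, and the character-by-character LCP computation are assumptions made for brevity; they are only reasonable for short strings and are not the construction used in the course repository.

```java
import java.util.Arrays;
import java.util.TreeSet;

final class LongestRepeatedSubstringSketch {

  // Returns every longest repeated substring of text (empty set if nothing repeats).
  static TreeSet<String> longestRepeatedSubstrings(String text) {
    int n = text.length();

    // Naive suffix array: sort the suffix start positions by the suffix text.
    Integer[] sa = new Integer[n];
    for (int i = 0; i < n; i++) sa[i] = i;
    Arrays.sort(sa, (a, b) -> text.substring(a).compareTo(text.substring(b)));

    // lcp[i] = length of the common prefix of sorted suffixes i - 1 and i.
    int[] lcp = new int[n];
    int max = 0;
    for (int i = 1; i < n; i++) {
      int a = sa[i - 1], b = sa[i], len = 0;
      while (a + len < n && b + len < n && text.charAt(a + len) == text.charAt(b + len)) len++;
      lcp[i] = len;
      max = Math.max(max, len);
    }

    // Collect the prefix of every suffix whose LCP value ties for the maximum.
    TreeSet<String> result = new TreeSet<>();
    if (max == 0) return result;
    for (int i = 1; i < n; i++) {
      if (lcp[i] == max) result.add(text.substring(sa[i], sa[i] + max));
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(longestRepeatedSubstrings("ABRACADABRA"));
  }
}
```

Collecting into a set rather than returning a single string covers inputs where several substrings tie for the maximum LCP value.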
Since there can\n 3280 06:47:02,477 --> 06:47:10,628 for a single largest value, but all largest\n 3281 06:47:10,628 --> 06:47:17,920 you can see that we'll want the first three\n 3282 06:47:17,919 --> 06:47:23,839 ba, and for the second maximum, we'll want\n 3283 06:47:23,840 --> 06:47:33,409 ba, ba. Visually, for this first, longest\n 3284 06:47:33,409 --> 06:47:41,720 at the start, and then a BA again, closer\n 3285 06:47:41,720 --> 06:47:48,960 found using the second largest common prefix\n 3286 06:47:48,959 --> 06:47:57,409 and the second one just next to it. That is\n 3287 06:47:57,409 --> 06:48:04,378 the suffix array and the LCP array have already\n 3288 06:48:04,378 --> 06:48:10,029 on campus if you want to tackle a longest\n 3289 06:48:10,029 --> 06:48:15,110 you to have an efficient solution. Today,\n 3290 06:48:15,110 --> 06:48:22,470 most important types of trees in computer\n 3291 06:48:22,470 --> 06:48:29,430 trees. Balanced Binary search trees are very\n 3292 06:48:29,430 --> 06:48:35,740 tree because they not only conform to the\n 3293 06:48:35,740 --> 06:48:42,080 also balanced. What I mean by balanced is\n 3294 06:48:42,080 --> 06:48:47,980 logarithmic height in proportion to the number\n 3295 06:48:47,979 --> 06:48:55,389 because it keeps operations such as insertion\n 3296 06:48:55,389 --> 06:49:04,000 is much more squashed. In terms of complexity,\n 3297 06:49:04,000 --> 06:49:10,240 operations, which is quite good. However,\n 3298 06:49:10,240 --> 06:49:18,330 the tree could degrade into a chain for some\n 3299 06:49:18,330 --> 06:49:24,398 numbers. To avoid this linear complexity,\n 3300 06:49:24,398 --> 06:49:33,040 in which the worst case is logarithmic for\n 3301 06:49:33,040 --> 06:49:37,888 Central to how nearly all Balanced Binary\n 3302 06:49:37,887 --> 06:49:44,189 balanced is the concept of tree rotations,\n 3303 06:49:44,189 --> 06:49:49,659 video. 
Later, we'll actually look at some\n 3304 06:49:49,659 --> 06:49:56,430 to see how these rotations come into play.\n 3305 06:49:56,430 --> 06:50:03,398 binary search tree implementations is the\n 3306 06:50:03,398 --> 06:50:12,100 tree invariant and tree rotations. A tree\n 3307 06:50:12,099 --> 06:50:19,079 you impose on your tree, such that it must\n 3308 06:50:19,080 --> 06:50:24,680 that the invariant is always satisfied a series\n 3309 06:50:24,680 --> 06:50:29,398 get back to this concept and fitting variants\n 3310 06:50:29,398 --> 06:50:36,468 so much for now. Right now we're going to\n 3311 06:50:36,468 --> 06:50:43,430 invariant is not satisfied. And to fix it,\n 3312 06:50:43,430 --> 06:50:50,878 A. Assuming node A has a left child B, we\n 3313 06:50:50,878 --> 06:50:59,650 node A was and push node A down to become\n 3314 06:50:59,650 --> 06:51:06,990 this, in my undergrad, I was Mind blown. Literally,\n 3315 06:51:06,990 --> 06:51:13,398 of being illegal that we should be allowed\n 3316 06:51:13,398 --> 06:51:20,520 tree. But what I've realized since is that\n 3317 06:51:20,520 --> 06:51:25,340 Since we're not breaking the binary search\n 3318 06:51:25,340 --> 06:51:31,889 the left tree, you'll discover that in terms\n 3319 06:51:31,889 --> 06:51:37,648 than node B is less than E is less than a\n 3320 06:51:37,648 --> 06:51:45,580 tree and remark that well, this is also true.\n 3321 06:51:45,580 --> 06:51:52,760 is a valid operation. First, you have to remember\n 3322 06:51:52,759 --> 06:51:58,709 binary search trees, meaning that the binary\n 3323 06:51:58,709 --> 06:52:04,869 node and the values in the left subtree are\n 3324 06:52:04,869 --> 06:52:11,727 the right subtree are all greater than the\n 3325 06:52:11,727 --> 06:52:16,939 what the tree structure itself looks like.\n 3326 06:52:16,939 --> 06:52:24,109 binary search tree invariant holds. 
This means\n 3327 06:52:24,110 --> 06:52:29,290 rotate the values and nodes in our tree as\n 3328 06:52:29,290 --> 06:52:33,479 invariant remains satisfied. 3329 06:52:33,479 --> 06:52:40,439 Now let's look at how these rotations are\n 3330 06:52:40,439 --> 06:52:45,727 the rotations are symmetric, so we'll only\n 3331 06:52:45,727 --> 06:52:51,270 to figure out the left rotation on your own.\n 3332 06:52:51,270 --> 06:52:59,308 Notice that there are directed edges pointing\n 3333 06:52:59,308 --> 06:53:06,909 may or may not exist. This is why there is\n 3334 06:53:06,909 --> 06:53:12,669 have a parent node P, then it is important\n 3335 06:53:12,669 --> 06:53:19,369 rotation. In either case, we start with a\n 3336 06:53:19,369 --> 06:53:29,509 arrow, then we'll want a pointer to node B.\n 3337 06:53:29,509 --> 06:53:38,819 to B's right child, then change B's right\n 3338 06:53:38,819 --> 06:53:46,259 done a right rotation. If we rearrange the\n 3339 06:53:46,259 --> 06:53:54,739 However, notice that there's a slight problem.\n 3340 06:53:54,740 --> 06:54:01,370 parent's left or right pointer would still\n 3341 06:54:01,369 --> 06:54:09,250 B is A's successor after the rotation. So\n 3342 06:54:09,250 --> 06:54:15,750 step is usually done on the recursive call\n 3343 06:54:15,750 --> 06:54:22,457 function. We just finished looking at the\n 3344 06:54:22,457 --> 06:54:29,477 and the right child nodes. But in some balanced\n 3345 06:54:29,477 --> 06:54:35,919 convenient for nodes to also have a reference\n 3346 06:54:35,919 --> 06:54:42,959 rotations because now instead of updating\n 3347 06:54:42,959 --> 06:54:49,968 Let's have a look. In this case, where we\n 3348 06:54:49,968 --> 06:54:56,569 a sense, doubly linked. We start off with\n 3349 06:54:56,569 --> 06:55:04,119 we'll want to do is also reference node B\n 3350 06:55:04,119 --> 06:55:12,159 around pointers. Next we'll adjust the left\n 3351 06:55:12,159 --> 06:55:20,669 B's right subtree. 
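The right rotation without parent links, as walked through above, could be sketched in Java as follows. The `Node` class and the names `RotationSketch`, `rightRotate`, and `inOrder` are illustrative assumptions. The method returns the new subtree root so that the caller can relink the parent, which matches the remark that this step is usually done by the recursive caller.

```java
final class RotationSketch {

  static final class Node {
    int value;
    Node left, right;
    Node(int value) { this.value = value; }
  }

  // Right rotation about node a. Assumes a has a left child b.
  // Returns b, which takes a's old position in the tree.
  static Node rightRotate(Node a) {
    Node b = a.left;
    a.left = b.right;  // b's right subtree becomes a's left subtree
    b.right = a;       // a is pushed down to become b's right child
    return b;          // the caller must point a's old parent at b
  }

  // In-order traversal as a space-separated string, used to check the ordering.
  static String inOrder(Node node) {
    StringBuilder sb = new StringBuilder();
    inOrder(node, sb);
    return sb.toString();
  }

  private static void inOrder(Node node, StringBuilder sb) {
    if (node == null) return;
    inOrder(node.left, sb);
    if (sb.length() > 0) sb.append(' ');
    sb.append(node.value);
    inOrder(node.right, sb);
  }

  public static void main(String[] args) {
    Node a = new Node(4);
    a.left = new Node(2);
    a.right = new Node(5);
    a.left.left = new Node(1);
    a.left.right = new Node(3);
    System.out.println(inOrder(a));
    Node root = rightRotate(a);
    System.out.println(inOrder(root));  // same ordering, different shape
  }
}
```

The in-order traversal is the same before and after the rotation, which is the sense in which the binary search tree invariant is preserved.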
Of course, throughout this\n 3352 06:55:20,669 --> 06:55:26,390 you can add an extra if statement to check\n 3353 06:55:26,390 --> 06:55:33,770 mistake to assume B's right subtree is not\n 3354 06:55:33,770 --> 06:55:43,958 before setting B's right child's parent to\n 3355 06:55:43,958 --> 06:55:53,967 of B. So set B's right pointer to reference\n 3356 06:55:53,968 --> 06:56:00,010 B. The last thing we need to do is adjust\n 3357 06:56:00,009 --> 06:56:10,717 B's parent pointer reference P. And now\n 3358 06:56:10,718 --> 06:56:19,700 left or right pointer reference the successor\n 3359 06:56:19,700 --> 06:56:27,569 is not null because it might not exist. This\n 3360 06:56:27,569 --> 06:56:34,659 readjust the tree, you will see that we correctly\n 3361 06:56:34,659 --> 06:56:40,457 in doing this right rotation. And it's a very\n 3362 06:56:40,457 --> 06:56:47,659 to do it in such detail. Today we're going\n 3363 06:56:47,659 --> 06:56:54,137 tree in great detail. We'll be making use\n 3364 06:56:54,137 --> 06:57:00,718 in the last video. So if you didn't watch\n 3365 06:57:00,718 --> 06:57:06,159 before we get too far, I should mention\n 3366 06:57:06,159 --> 06:57:12,549 of many types of balanced binary search trees,\n 3367 06:57:12,549 --> 06:57:19,218 and search operations. Something really special\n 3368 06:57:19,218 --> 06:57:23,500 type of balanced binary search tree to be\ndiscovered. 3369 06:57:23,500 --> 06:57:29,898 Then, soon after a whole bunch of other types\n 3370 06:57:29,898 --> 06:57:37,370 emerge, including the 2-3 tree, the\n 3371 06:57:37,369 --> 06:57:43,680 main rival, the red-black tree. What you need\n 3372 06:57:43,680 --> 06:57:50,270 the AVL tree balanced. And this is the\n 3373 06:57:50,270 --> 06:57:57,159 of a node is the difference between the height\n 3374 06:57:57,159 --> 06:58:03,308 I'm pretty sure the balance factor can also\n 3375 06:58:03,308 --> 06:58:08,218 the right subtree height. 
But don't quote\n 3376 06:58:08,218 --> 06:58:15,450 what I'm about to say, and may also be the\n 3377 06:58:15,450 --> 06:58:22,830 about what way to do tree rotations on various\n 3378 06:58:22,830 --> 06:58:28,670 let's keep the balance factor right subtree\n 3379 06:58:28,669 --> 06:58:35,637 because people get this wrong or define it\n 3380 06:58:35,637 --> 06:58:42,700 as the number of edges between x and the furthest\n 3381 06:58:42,700 --> 06:58:50,080 tree has height zero, not height one because\n 3382 06:58:50,080 --> 06:58:58,190 AVL tree that keeps it balanced is forcing\n 3383 06:58:58,189 --> 06:59:04,770 minus one, zero, or plus one. If the balance\n 3384 06:59:04,770 --> 06:59:12,378 to resolve that with tree rotations. In terms\n 3385 06:59:12,378 --> 06:59:18,940 to actually make the AVL tree work. What\n 3386 06:59:18,939 --> 06:59:24,057 stores. This value must be comparable. So\n 3387 06:59:24,058 --> 06:59:30,280 it goes to the tree. Then we'll also need\n 3388 06:59:30,279 --> 06:59:37,297 as well as the left and the right child pointers.\n 3389 06:59:37,297 --> 06:59:44,849 these values. So keep that in mind. So a slide\n 3390 06:59:44,849 --> 06:59:52,397 node must always be minus one, zero, or plus one.\n 3391 06:59:52,398 --> 06:59:58,670 the case where this is not true? The answer\n 3392 06:59:58,669 --> 07:00:05,829 factor must either be plus two or minus\n 3393 07:00:05,830 --> 07:00:11,410 rotations. The rotations we need to perform\n 3394 07:00:11,409 --> 07:00:19,529 be broken down into four distinct cases. The\n 3395 07:00:19,529 --> 07:00:27,169 call left heavy, and there are two left child\n 3396 07:00:27,169 --> 07:00:35,839 all we need to do is perform a right rotation\n 3397 07:00:35,840 --> 07:00:43,477 case is the left-right case where you have\n 3398 07:00:43,477 --> 07:00:53,159 child. 
To fix this, you do a left rotation\n 3399 07:00:53,159 --> 07:00:59,227 the leftmost image, what happens then is\n 3400 07:00:59,227 --> 07:01:08,128 we just saw, which we can resolve with a right\n 3401 07:01:08,128 --> 07:01:14,218 right case, which is symmetric to the left\n 3402 07:01:14,218 --> 07:01:23,208 we do a left rotation about the green node.\n 3403 07:01:23,207 --> 07:01:30,957 which is symmetric to the left-right case.\n 3404 07:01:30,957 --> 07:01:37,090 about the yellow node on the leftmost image\n 3405 07:01:37,090 --> 07:01:43,080 case, and then do a left rotation about the\n 3406 07:01:43,080 --> 07:01:50,058 Next, I want to show you some pseudocode for\n 3407 07:01:50,058 --> 07:01:56,770 not all that obvious or easy. This first method\n 3408 07:01:56,770 --> 07:02:03,569 method, which returns true or false depending\n 3409 07:02:03,569 --> 07:02:11,180 or not. For simplicity, we're going to ban\n 3410 07:02:11,180 --> 07:02:18,060 value already exists, or the value is null,\n 3411 07:02:18,060 --> 07:02:23,968 is not null, and it doesn't already exist\n 3412 07:02:23,968 --> 07:02:30,119 insert method, where we pass in a pointer\n 3413 07:02:30,119 --> 07:02:36,399 insert. The private recursive method is also\n 3414 07:02:36,400 --> 07:02:41,760 we simply return a new instance of the node\n 3415 07:02:41,759 --> 07:02:46,519 we get the compareTo value with the value\n 3416 07:02:46,520 --> 07:02:52,387 node to determine if we should go on the left\n 3417 07:02:52,387 --> 07:02:58,207 callback, we call the update method which\n 3418 07:02:58,207 --> 07:03:04,819 for this node, and lastly, we rebalance the\n 3419 07:03:04,819 --> 07:03:11,898 a look at the update and balance methods. 
And what\n 3420 07:03:11,898 --> 07:03:17,610 updates the balance factor and height values\n 3421 07:03:17,610 --> 07:03:25,148 the node, we get the maximum height of the\n 3422 07:03:25,148 --> 07:03:30,400 Notice that I initialize the left and the\n 3423 07:03:30,400 --> 07:03:36,520 this is because it will cancel out with a\n 3424 07:03:36,520 --> 07:03:43,860 the node has no subtrees giving the correct\n 3425 07:03:43,860 --> 07:03:49,558 the balance factor for this node by finding\n 3426 07:03:52,659 --> 07:03:57,727 balance method is slightly more involved\n 3427 07:03:57,727 --> 07:04:05,840 factor has an illegal value of minus two or\n 3428 07:04:05,840 --> 07:04:12,700 two, then we know that the node is left heavy,\n 3429 07:04:12,700 --> 07:04:18,869 To determine if we're dealing with a left\n 3430 07:04:18,869 --> 07:04:24,450 thing if the balance factor is plus two, except\n 3431 07:04:24,450 --> 07:04:30,218 right-left case. If the balance factor is not\n 3432 07:04:30,218 --> 07:04:37,610 balance factor is going to be either plus one,\n 3433 07:04:37,610 --> 07:04:44,830 cases, we don't need to do anything inside\n 3434 07:04:44,830 --> 07:04:51,570 all we do here are calls to the left rotation\n 3435 07:04:51,569 --> 07:04:58,898 the last video. Also notice that the left\n 3436 07:04:58,898 --> 07:05:06,920 left and right-right case methods respectively,\n 3437 07:05:06,919 --> 07:05:14,137 rotation. In the last video, we looked at\n 3438 07:05:14,137 --> 07:05:20,939 dealing with an AVL tree. In this video, we\n 3439 07:05:20,939 --> 07:05:28,057 the height and balance factor values for the\n 3440 07:05:28,058 --> 07:05:33,898 this is a subtle detail you must not forget,\n 3441 07:05:33,898 --> 07:05:39,680 will be inconsistent with the left rotation\n 3442 07:05:39,680 --> 07:05:44,450 able to figure it out pretty easily. In this\n 3443 07:05:44,450 --> 07:05:50,637 elements from an AVL tree. 
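The update and balance steps from the insertion pseudocode could look roughly like this in Java. This is a sketch under stated assumptions, not the course's source: the names `AvlBalanceSketch`, `Node`, `bf`, `update`, and `balance` are illustrative, values are plain `int`s, and duplicates are silently ignored. It follows the narration's convention that the balance factor is the right subtree height minus the left subtree height, so a value of minus two means left heavy.

```java
final class AvlBalanceSketch {

  static final class Node {
    int value;
    int height;  // edges on the longest downward path; a lone node has height 0
    int bf;      // balance factor = right subtree height - left subtree height
    Node left, right;
    Node(int value) { this.value = value; }
  }

  // Recomputes height and balance factor from the children's stored heights.
  static void update(Node node) {
    int lh = (node.left == null) ? -1 : node.left.height;   // -1 cancels the +1 below
    int rh = (node.right == null) ? -1 : node.right.height;
    node.height = 1 + Math.max(lh, rh);
    node.bf = rh - lh;
  }

  // Restores the AVL invariant at node and returns the new subtree root.
  static Node balance(Node node) {
    if (node.bf == -2) {                                   // left heavy
      if (node.left.bf <= 0) return rightRotate(node);     // left-left case
      node.left = leftRotate(node.left);                   // left-right case
      return rightRotate(node);
    }
    if (node.bf == 2) {                                    // right heavy
      if (node.right.bf >= 0) return leftRotate(node);     // right-right case
      node.right = rightRotate(node.right);                // right-left case
      return leftRotate(node);
    }
    return node;                                           // bf is -1, 0, or +1
  }

  static Node rightRotate(Node a) {
    Node b = a.left;
    a.left = b.right;
    b.right = a;
    update(a);  // the child must be updated before its new parent
    update(b);
    return b;
  }

  static Node leftRotate(Node a) {
    Node b = a.right;
    a.right = b.left;
    b.left = a;
    update(a);
    update(b);
    return b;
  }

  static Node insert(Node node, int value) {
    if (node == null) return new Node(value);
    if (value < node.value) node.left = insert(node.left, value);
    else if (value > node.value) node.right = insert(node.right, value);
    else return node;                                      // duplicate: nothing to do
    update(node);                                          // the two extra AVL steps
    return balance(node);
  }

  public static void main(String[] args) {
    Node root = null;
    for (int v = 1; v <= 7; v++) root = insert(root, v);
    System.out.println("root=" + root.value + " height=" + root.height);
  }
}
```

Inserting sorted values is the input that turns a plain binary search tree into a chain, so it is a convenient check that the rotations keep the height logarithmic.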
And what you'll\n 3444 07:05:50,637 --> 07:05:58,770 avielle tree is almost identical to removing\n 3445 07:05:58,770 --> 07:06:02,939 So for the majority of this video, we're going\n 3446 07:06:02,939 --> 07:06:10,887 from a binary search tree in the very end\n 3447 07:06:10,887 --> 07:06:17,430 So let's get started. So just for review,\n 3448 07:06:17,430 --> 07:06:25,648 binary search tree in detail. So we can generally\n 3449 07:06:25,648 --> 07:06:31,240 In the find phase, you find the element you\n 3450 07:06:31,240 --> 07:06:38,490 and then replace it with a successor node,\n 3451 07:06:38,490 --> 07:06:45,128 maintain the binary search tree invariant.\n 3452 07:06:45,128 --> 07:06:50,148 we're searching for the element in the tree\n 3453 07:06:50,148 --> 07:06:55,250 in happen. First is we hit a null node, in\n 3454 07:06:55,250 --> 07:07:02,919 for doesn't exist. Our comparateur returns\n 3455 07:07:02,919 --> 07:07:08,319 we want to remove, the comparative value is\n 3456 07:07:08,319 --> 07:07:14,579 for, if it exists, is going to be found in\n 3457 07:07:14,580 --> 07:07:20,420 is greater than zero, in which case the value\n 3458 07:07:20,419 --> 07:07:24,859 let's do an example of finding nodes in a\n 3459 07:07:24,860 --> 07:07:31,319 for a 14, well, we should have a reference\n 3460 07:07:31,319 --> 07:07:37,489 So we compare 20 and 14, and we know 14 is\n 3461 07:07:37,490 --> 07:07:43,320 We know 14 is greater than 10. So we go on\n 3462 07:07:43,319 --> 07:07:50,340 15. So you're on the left subtree. 14 is greater\n 3463 07:07:50,340 --> 07:07:56,659 there we found it, the node we were looking\n 3464 07:07:56,659 --> 07:08:03,579 that doesn't exist. So let's try and find\n 3465 07:08:03,580 --> 07:08:09,860 we go to the right subtree, because 26 is\n 3466 07:08:09,860 --> 07:08:18,659 because 26 is less than 31. 
And once we're\n 3467 07:08:18,659 --> 07:08:25,990 then discovered that 26 does not exist in\n 3468 07:08:25,990 --> 07:08:33,308 it exists, we need to replace that node with\n 3469 07:08:33,308 --> 07:08:40,040 that if we just remove the node without finding\n 3470 07:08:40,040 --> 07:08:44,128 tree. And when we're looking for a successor\n 3471 07:08:44,128 --> 07:08:51,020 will happen. Either were a leaf node, in which\n 3472 07:08:51,020 --> 07:08:56,790 has no left subtree the node to remove has\n 3473 07:08:56,790 --> 07:09:02,700 both the left subtree and right subtree. We'll\n 3474 07:09:02,700 --> 07:09:08,920 first case where the node to remove is a leaf\n 3475 07:09:08,919 --> 07:09:15,349 side effects. The successor node in this case\n 3476 07:09:15,349 --> 07:09:21,739 a remove node eight from this tree, the first\n 3477 07:09:21,740 --> 07:09:28,600 the tree. So we'd go down the tree and then\n 3478 07:09:28,599 --> 07:09:37,109 Oh, it's a leaf node, so we can just remove\n 3479 07:09:37,110 --> 07:09:45,047 where there's only a left or a right subtree.\n 3480 07:09:45,047 --> 07:09:51,397 immediate child of that left or right subtree.\n 3481 07:09:51,398 --> 07:09:58,850 node down from the node we're removing is\n 3482 07:09:58,849 --> 07:10:06,529 than it the case right subtree or less than\n 3483 07:10:06,529 --> 07:10:13,057 do an example, suppose we want to remove node\n 3484 07:10:13,058 --> 07:10:20,120 nine is in the tree. So start the route and\n 3485 07:10:20,119 --> 07:10:25,669 want to remove, which is nine. And then we\n 3486 07:10:25,669 --> 07:10:33,309 a left subtree. So the successor node is its\n 3487 07:10:33,310 --> 07:10:36,850 so seven. So what we do is we get a reference\nto seven 3488 07:10:36,849 --> 07:10:43,789 and then get ready to remove nine. And then\n 3489 07:10:43,790 --> 07:10:51,720 by linking it back up to five. And if we rebalance\n 3490 07:10:51,720 --> 07:10:58,600 nine has been removed. 
So the last case, is\n 3491 07:10:58,599 --> 07:11:06,069 a left subtree and a right subtree. So the\n 3492 07:11:06,069 --> 07:11:13,810 will we find the successor of the node we're\n 3493 07:11:13,810 --> 07:11:19,690 surprisingly, the answer is both. The successor\n 3494 07:11:19,689 --> 07:11:27,539 subtree, or the smallest value in the right\n 3495 07:11:27,540 --> 07:11:33,770 found in either left or right subtree, we\n 3496 07:11:33,770 --> 07:11:39,387 a value in the successor node. However, the\n 3497 07:11:39,387 --> 07:11:45,569 to remove the duplicate value of the successor\n 3498 07:11:45,569 --> 07:11:51,009 strategy to resolve this is to recursively\n 3499 07:11:51,009 --> 07:11:57,639 to remove as the value in the successor node.\n 3500 07:11:57,639 --> 07:12:04,220 trivial. Let's remove node seven from this\n 3501 07:12:04,220 --> 07:12:09,930 start at the root node and discover that in\n 3502 07:12:09,930 --> 07:12:16,387 notice that it has two non empty sub trees,\n 3503 07:12:16,387 --> 07:12:22,739 successor we either pick the smallest value\n 3504 07:12:22,740 --> 07:12:28,440 in the left subtree. Let's find the smallest\n 3505 07:12:28,439 --> 07:12:36,887 you will go to the right ones, and then dig\n 3506 07:12:36,887 --> 07:12:43,759 the successor node 11, we will copy its value\n 3507 07:12:43,759 --> 07:12:51,207 is the root node seven. Notice that now there\n 3508 07:12:51,207 --> 07:12:58,689 want unique values in our tree. So to remove\n 3509 07:12:58,689 --> 07:13:06,029 recursively call our remove method, but on\n 3510 07:13:06,029 --> 07:13:13,759 always result in a case one two or three removal.\n 3511 07:13:13,759 --> 07:13:23,099 a right subtree. So its successor is its immediate\n 3512 07:13:23,099 --> 07:13:32,189 get ready to remove 11. 
So we remove 11 and\n 3513 07:13:32,189 --> 07:13:38,279 the tree, then we can see that the duplicate\n 3514 07:13:38,279 --> 07:13:43,797 we've been waiting for: how do we augment the\n 3515 07:13:43,797 --> 07:13:49,159 trees. The solution is simple, you only need\n 3516 07:13:49,159 --> 07:13:54,990 tree remains balanced and that the balance\n 3517 07:13:54,990 --> 07:13:59,940 On the recursive callback, you invoke the\n 3518 07:13:59,939 --> 07:14:07,449 insert video, which ensure that when the node\n 3519 07:14:07,450 --> 07:14:13,250 tree remains balanced. It's as easy as that.\n 3520 07:14:13,250 --> 07:14:19,200 of the source code for the AVL tree. The\n 3521 07:14:19,200 --> 07:14:26,340 in this video can be found on GitHub at github.com\n 3522 07:14:26,340 --> 07:14:34,542 Make sure you have watched the last three\n 3523 07:14:34,542 --> 07:14:39,920 insertions and removals in AVL trees before\n 3524 07:14:39,919 --> 07:14:47,647 code I'm presenting. I don't normally do this,\n 3525 07:14:47,648 --> 07:14:53,380 AVL tree in action. So I'm here in my\n 3526 07:14:53,380 --> 07:15:01,040 of the Java code with AVL tree and then\n 3527 07:15:01,040 --> 07:15:10,860 a random tree with some values and notice\n 3528 07:15:10,860 --> 07:15:20,137 for the number of nodes that are in it. So\n 3529 07:15:20,137 --> 07:15:29,878 expect the tree to be a bit more sloppy if\n 3530 07:15:29,878 --> 07:15:35,319 But the AVL tree really keeps the tree\nquite rigid. 3531 07:15:35,319 --> 07:15:41,878 So I think we're ready to dive into the source\n 3532 07:15:41,878 --> 07:15:49,458 of a recursive AVL tree implementation\n 3533 07:15:49,457 --> 07:15:56,770 get started. 
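Before the source walkthrough, the removal cases from the slides can be sketched for a plain binary search tree in Java. The names `BstRemoveSketch`, `Node`, `insert`, `remove`, and `inOrder` are illustrative assumptions, and this sketch always takes the smallest value in the right subtree as the successor, which is only one of the two valid choices mentioned above. For an AVL tree, the narration says the only change is to invoke the update and rebalance methods on the recursive callback, that is, just before `remove` returns `node`.

```java
final class BstRemoveSketch {

  static final class Node {
    int value;
    Node left, right;
    Node(int value) { this.value = value; }
  }

  static Node insert(Node node, int value) {
    if (node == null) return new Node(value);
    if (value < node.value) node.left = insert(node.left, value);
    else if (value > node.value) node.right = insert(node.right, value);
    return node;
  }

  // Removes value from the subtree rooted at node and returns the new subtree root.
  static Node remove(Node node, int value) {
    if (node == null) return null;                   // find phase hit null: not present
    if (value < node.value) {
      node.left = remove(node.left, value);          // continue in the left subtree
    } else if (value > node.value) {
      node.right = remove(node.right, value);        // continue in the right subtree
    } else {
      if (node.left == null) return node.right;      // leaf, or only a right subtree
      if (node.right == null) return node.left;      // only a left subtree
      Node successor = node.right;                   // two subtrees: smallest value
      while (successor.left != null) successor = successor.left;  // in the right subtree
      node.value = successor.value;                  // copy the successor's value up
      node.right = remove(node.right, successor.value);  // then remove the duplicate
    }
    return node;  // an AVL tree would update and rebalance node here
  }

  static String inOrder(Node node) {
    StringBuilder sb = new StringBuilder();
    inOrder(node, sb);
    return sb.toString();
  }

  private static void inOrder(Node node, StringBuilder sb) {
    if (node == null) return;
    inOrder(node.left, sb);
    if (sb.length() > 0) sb.append(' ');
    sb.append(node.value);
    inOrder(node.right, sb);
  }

  public static void main(String[] args) {
    Node root = null;
    for (int v : new int[] {7, 5, 20, 4, 6, 11, 33}) root = insert(root, v);
    root = remove(root, 7);
    System.out.println(inOrder(root));
  }
}
```

The recursive call that removes the duplicate always lands on a node with at most one subtree, because the smallest node in a subtree has no left child.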
If you look at the class definition\n 3534 07:15:56,770 --> 07:16:08,290 this class takes a generic type argument,\n 3535 07:16:08,290 --> 07:16:16,620 And this generic type I'm defining, basically\n 3536 07:16:16,619 --> 07:16:21,549 to be inserting inside the tree need to be\n 3537 07:16:21,549 --> 07:16:29,558 to be able to insert them and know how to\n 3538 07:16:29,558 --> 07:16:36,450 node, which I've created, you can see that\n 3539 07:16:36,450 --> 07:16:44,619 node is of type T. So it's a Comparable. And\n 3540 07:16:44,619 --> 07:16:51,039 variables I'm storing inside the node\n 3541 07:16:51,040 --> 07:16:59,159 the balance factor as an integer, the height\n 3542 07:16:59,159 --> 07:17:05,808 query the height of a node in constant time,\n 3543 07:17:05,808 --> 07:17:12,090 of course, there's going to be the left and\n 3544 07:17:12,090 --> 07:17:20,200 notice that this Node class implements\n 3545 07:17:20,200 --> 07:17:27,058 And that's just an interface I have somewhere\n 3546 07:17:27,058 --> 07:17:32,520 tree I did on the terminal. So this isn't\n 3547 07:17:32,520 --> 07:17:39,310 an AVL tree. Nor are these overrides,\n 3548 07:17:39,310 --> 07:17:46,840 tree in the terminal, which is really handy\n 3549 07:17:46,840 --> 07:17:55,189 variables at the AVL tree class level,\n 3550 07:17:55,189 --> 07:18:03,989 be private, although I'm using it for testing,\n 3551 07:18:03,990 --> 07:18:12,530 keeping track of the number of nodes inside\n 3552 07:18:12,529 --> 07:18:17,279 Something else I'm also using for testing\n 3553 07:18:17,279 --> 07:18:23,289 good to just sanity check yourself to\n 3554 07:18:23,290 --> 07:18:30,430 relatively low. Then there are these methods\n 3555 07:18:30,430 --> 07:18:38,292 explanatory. 
This is the display method I call\n 3556 07:18:38,292 --> 07:18:45,280 The contains method to check if a certain\n 3557 07:18:45,279 --> 07:18:52,329 is the public-facing method, which calls the\n 3558 07:18:52,330 --> 07:18:59,610 the initial node to start off with. So to\n 3559 07:18:59,610 --> 07:19:07,319 hit the base case, then we know that that\n 3560 07:19:07,319 --> 07:19:12,207 the current value to the value inside the\n 3561 07:19:12,207 --> 07:19:19,659 of either a value less than zero, which means\n 3562 07:19:19,659 --> 07:19:26,029 in the left subtree or comparator is greater\n 3563 07:19:26,029 --> 07:19:31,860 is in the right subtree. Otherwise, that means\n 3564 07:19:31,860 --> 07:19:40,360 And that means we found the node inside the\n 3565 07:19:40,360 --> 07:19:48,530 to insert this value variable, if it's null,\n 3566 07:19:48,530 --> 07:19:54,779 just return false. And we return a boolean\n 3567 07:19:54,779 --> 07:20:01,817 an insertion was successful or not. So if\n 3568 07:20:01,817 --> 07:20:13,090 tree, then we're going to call the private\n 3569 07:20:13,090 --> 07:20:20,398 the root. And also increment the node count,\n 3570 07:20:20,398 --> 07:20:24,468 insert something if we're inside this block,\n 3571 07:20:25,599 --> 07:20:34,049 Alright, let's have a look at the private\n 3572 07:20:34,049 --> 07:20:40,770 meaning we traverse all the way down our tree,\n 3573 07:20:40,770 --> 07:20:46,540 this is the position where we need to insert\n 3574 07:20:46,540 --> 07:20:54,840 it the value. 
Otherwise, we're searching for\n 3575 07:20:54,840 --> 07:21:03,378 comparator function, then do value compared\n 3576 07:21:03,378 --> 07:21:10,250 from the comparable interface at the class\n 3577 07:21:10,250 --> 07:21:18,371 dot compare to on a generic type here and\n 3578 07:21:18,371 --> 07:21:25,829 is less than zero, then insert in left subtree.\n 3579 07:21:25,830 --> 07:21:33,080 here is the extra two lines, you need to add\n 3580 07:21:33,080 --> 07:21:40,638 update method to update the bounce factor\n 3581 07:21:40,637 --> 07:21:51,520 rebalance the tree if necessary. Well rebalance\n 3582 07:21:51,520 --> 07:21:59,279 look at the update method. So this is to update\n 3583 07:21:59,279 --> 07:22:06,430 here are two variables that grab the height\n 3584 07:22:06,430 --> 07:22:16,628 Then I update the height for this note. So\n 3585 07:22:16,628 --> 07:22:25,218 the left subtree or the right subtree height.\n 3586 07:22:25,218 --> 07:22:31,080 left and right, we can of course update the\n 3587 07:22:31,080 --> 07:22:38,340 the right and the left subtree heights. Okay,\n 3588 07:22:38,340 --> 07:22:46,420 the balance method. And inside the balance\n 3589 07:22:46,419 --> 07:22:53,979 balance factors are either minus two or plus\n 3590 07:22:53,979 --> 07:23:02,898 that our tree is left heavy. And inside the\n 3591 07:23:02,898 --> 07:23:10,580 left case or the left right case. And inside\n 3592 07:23:10,580 --> 07:23:17,270 case and the right left case. And to identify\n 3593 07:23:17,270 --> 07:23:28,610 factor, but on either no dot left or no dot\n 3594 07:23:28,610 --> 07:23:33,409 factor of the node is not minus two or plus\n 3595 07:23:33,409 --> 07:23:40,189 is either zero plus one or minus one, which\n 3596 07:23:40,189 --> 07:23:45,279 We don't have to rebalance the tree if either\n 3597 07:23:45,279 --> 07:23:53,919 the note. So now we can look at the individual\n 3598 07:23:53,919 --> 07:24:04,199 we'll just perform a right rotation. 
The left\n 3599 07:24:04,200 --> 07:24:12,477 and then call the left left case. So we're\n 3600 07:24:12,477 --> 07:24:19,477 this case degrades to that case, after one\n 3601 07:24:19,477 --> 07:24:24,739 the right right case and the right left case,\n 3602 07:24:24,740 --> 07:24:33,350 one right rotation over here. then here are\n 3603 07:24:33,349 --> 07:24:42,259 methods. For the avielle tree. It's very important\n 3604 07:24:42,259 --> 07:24:52,079 the height and balance factor values after\n 3605 07:24:52,080 --> 07:24:58,870 those values will undoubtedly change. And\n 3606 07:24:58,869 --> 07:25:04,250 you do them and you can't Do them in inverse\n 3607 07:25:04,250 --> 07:25:11,159 node first to get the correct balance factor\n 3608 07:25:13,139 --> 07:25:23,378 now we're at the Remove method. So in this\n 3609 07:25:23,378 --> 07:25:28,898 or I should have called it value rather. But\n 3610 07:25:28,898 --> 07:25:35,058 it doesn't exist in the tree. So just return\n 3611 07:25:35,058 --> 07:25:45,590 in the tree, and then remove it by calling\n 3612 07:25:45,590 --> 07:25:55,000 node count and simply return true on the Remove\n 3613 07:25:55,000 --> 07:25:59,009 So let's look at this private remove method 3614 07:25:59,009 --> 07:26:06,057 where all the action is happening. So if we\n 3615 07:26:06,058 --> 07:26:13,200 then we get our comparative value. And we\n 3616 07:26:13,200 --> 07:26:18,727 is less than zero. And if so we dig into the\n 3617 07:26:18,727 --> 07:26:26,369 we're looking for is smaller than the current\n 3618 07:26:26,369 --> 07:26:35,020 method. But passing down no dot left, so we're\n 3619 07:26:35,020 --> 07:26:43,629 passing the element as well. Otherwise do\n 3620 07:26:43,629 --> 07:26:52,270 down the right subtree. Otherwise, if this\n 3621 07:26:52,270 --> 07:26:59,020 know that the comparative value is equal to\n 3622 07:26:59,020 --> 07:27:07,939 to remove. 
And we know from the slides that\n 3623 07:27:07,939 --> 07:27:20,009 dot left is equal to null, then we know that\n 3624 07:27:20,009 --> 07:27:27,989 right is null, meaning the right subtree is\n 3625 07:27:27,990 --> 07:27:38,860 the final tricky case, where we're trying\n 3626 07:27:38,860 --> 07:27:42,637 so the left subtree is there and the right\n 3627 07:27:42,637 --> 07:27:50,789 that node. And we can either pick the smallest\n 3628 07:27:50,790 --> 07:27:56,280 smallest value in the right subtree of the\n 3629 07:27:56,279 --> 07:28:03,840 using a heuristic right here to actually determine\n 3630 07:28:03,840 --> 07:28:10,750 it's a good heuristic. And here's what it\n 3631 07:28:10,750 --> 07:28:19,727 I want to remove nodes from that subtree.\n 3632 07:28:19,727 --> 07:28:29,297 larger height, than I want to remove nodes\n 3633 07:28:29,297 --> 07:28:37,709 I'm using to choose which subtree I remove\n 3634 07:28:37,709 --> 07:28:45,180 I think in general, it'll work pretty well.\n 3635 07:28:45,180 --> 07:28:52,308 the left subtree has larger height, than what\n 3636 07:28:52,308 --> 07:29:01,540 value by going down the left subtree once\n 3637 07:29:01,540 --> 07:29:08,940 to find the max value, and then swapping that\n 3638 07:29:08,939 --> 07:29:19,329 is the recursive part, we recurse. on removing\n 3639 07:29:19,330 --> 07:29:27,870 we passed in, or rather the element we passed\n 3640 07:29:27,869 --> 07:29:32,227 the element, but now we also have to remove\n 3641 07:29:32,227 --> 07:29:38,707 duplicate values in our tree, then we basically\n 3642 07:29:38,707 --> 07:29:47,359 If the condition goes the other way. Here's\n 3643 07:29:47,360 --> 07:29:54,090 you want to call the update and rebalance\n 3644 07:29:54,090 --> 07:30:00,297 so that the tree remains balanced even though\n 3645 07:30:00,297 --> 07:30:08,159 men and find max method I was calling right\n 3646 07:30:08,159 --> 07:30:13,509 depending on the case. 
And here's just an\n 3647 07:30:13,509 --> 07:30:18,500 I need to go over this. And here's another\n 3648 07:30:18,500 --> 07:30:24,909 search tree invariant, also not particularly\n 3649 07:30:24,909 --> 07:30:32,360 in the main function that will just randomly\n 3650 07:30:32,360 --> 07:30:37,159 it. So when I invoked this file on the terminal 3651 07:30:37,159 --> 07:30:46,968 this is what executed. So that is an AVL tree.\n 3652 07:30:46,968 --> 07:30:53,740 enjoyed writing it up. Today's data structure\n 3653 07:30:53,740 --> 07:30:59,230 to prove to be a very useful data structure\n 3654 07:30:59,229 --> 07:31:05,387 time ago. So just before we get started, this\n 3655 07:31:05,387 --> 07:31:10,779 priority queue videos, which simply go over\n 3656 07:31:10,779 --> 07:31:15,919 get by without watching all those videos,\n 3657 07:31:15,919 --> 07:31:20,369 those of you who want to know, priority queues\n 3658 07:31:20,369 --> 07:31:24,409 for links to those. So what exactly is an\n 3659 07:31:24,409 --> 07:31:32,707 priority queue variant, which on top of having\n 3660 07:31:32,707 --> 07:31:38,029 also supports quick updates and deletions\n 3661 07:31:38,029 --> 07:31:43,509 that the indexed priority queue solves is being\n 3662 07:31:43,509 --> 07:32:08,039 the values in your priority queue on the fly,\n 3663 07:32:08,040 --> 07:32:37,727 an example. Suppose a hospital has a waiting\n 3664 07:32:37,727 --> 07:33:05,450 of attention. Each person in the waiting room\n 3665 07:33:05,450 --> 07:33:35,450 with. For instance, Mary is in labor. So she\n 3666 07:33:35,450 --> 07:33:55,659 cut, he has a priority of one. James has an\n 3667 07:33:55,659 --> 07:34:20,069 Naida's stomach hurts, she gets priority of\n 3668 07:34:20,069 --> 07:34:33,779 priority is five. And lastly, Leah also has\n 3669 07:34:33,779 --> 07:34:47,469 patients by highest priority first. This means\n 3670 07:34:47,470 --> 07:34:57,970 by James. 
However, then something happens\n 3671 07:34:57,970 --> 07:35:11,970 she starts vomiting. Her priority needs to\n 3672 07:35:11,970 --> 07:35:24,220 served next once they're finished with James.\n 3673 07:35:24,220 --> 07:35:40,569 leaves he goes to another clinic down the\n 3674 07:35:40,569 --> 07:36:04,128 for. Further suppose that Akarsh goes\n 3675 07:36:04,128 --> 07:36:27,430 and as a result cracks his head and needs\n 3676 07:36:27,430 --> 07:36:50,500 to 10. Once Naida is dealt with, Akarsh\n 3677 07:36:50,500 --> 07:37:12,750 by Leah. As we saw in the hospital example,\n 3678 07:37:12,750 --> 07:37:28,930 update the priority of certain people. The\n 3679 07:37:28,930 --> 07:37:46,529 lets us do this efficiently. The first step\n 3680 07:37:46,529 --> 07:37:59,809 index values to all the keys thus forming\n 3681 07:37:59,810 --> 07:38:16,420 priority queue to track who should get served next\n 3682 07:38:16,419 --> 07:38:30,029 a unique key index value between zero and\n 3683 07:38:30,029 --> 07:39:01,489 intended to be bidirectional. So I would\n 3684 07:39:01,490 --> 07:39:18,200 be able to flip back and forth between the\n 3685 07:39:18,200 --> 07:39:28,990 on the indexed priority queue will require the associated\n 3686 07:39:28,990 --> 07:39:36,080 wondering why I'm saying that we need to map\n 3687 07:39:36,080 --> 07:39:41,010 inclusive. The reason for this is that typically\n 3688 07:39:41,009 --> 07:39:48,789 under the hood are actually arrays. 
So we\n 3689 07:39:48,790 --> 07:39:54,780 those arrays. This will become apparent shortly.\n 3690 07:39:54,779 --> 07:40:01,869 often, the keys themselves are already integers\n 3691 07:40:01,869 --> 07:40:07,067 to actually construct this bidirectional\n 3692 07:40:07,067 --> 07:40:15,637 it is handy to be able to support any type\n 3693 07:40:15,637 --> 07:40:23,058 can think of an indexed priority queue as an abstract\n 3694 07:40:23,059 --> 07:40:31,830 it to support here are about a dozen or so\n 3695 07:40:31,830 --> 07:40:36,910 These are deleting keys, getting the value\n 3696 07:40:36,909 --> 07:40:42,750 in the priority queue, getting the key index\n 3697 07:40:42,750 --> 07:40:46,707 value in the indexed priority queue, being able to insert\n 3698 07:40:46,707 --> 07:40:49,149 specialized update operations increase in\n 3699 07:40:49,150 --> 07:40:51,958 end. For all these operations, you need the\n 3700 07:40:51,957 --> 07:40:57,039 that you're dealing with. Throughout these\n 3701 07:40:57,040 --> 07:41:03,229 as the variable ki to distinguish it from\n 3702 07:41:03,229 --> 07:41:09,759 that. An indexed priority queue can be implemented\n 3703 07:41:09,759 --> 07:41:14,797 time complexity is using specialized heap\n 3704 07:41:14,797 --> 07:41:22,409 the binary heap implementation for simplicity,\n 3705 07:41:22,409 --> 07:41:27,520 all these operations are either constant or\n 3706 07:41:27,520 --> 07:41:33,950 priority queue. The remove and update operations\n 3707 07:41:33,950 --> 07:41:43,010 a mapping to the position of where our values\n 3708 07:41:43,009 --> 07:41:52,590 the indexed priority queue per se, I want to spend\n 3709 07:41:52,591 --> 07:42:02,547 priority queue data structure which only supports\n 3710 07:42:02,547 --> 07:42:07,628 priority queue. Still, both data structures are\n 3711 07:42:07,628 --> 07:42:17,580 them the same. 
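The bidirectional mapping between keys and key indexes mentioned above could be sketched like this. The class name KeyIndexer and its methods are hypothetical and not part of the video's source code; the sketch simply hands out the next free integer, starting at zero, to each new key and remembers the association in both directions.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: maps arbitrary keys to the dense range [0, N) and back.
class KeyIndexer<K> {
  private final Map<K, Integer> indexOf = new HashMap<>(); // key -> key index
  private final List<K> keyOf = new ArrayList<>();         // key index -> key

  // Returns the existing key index for a key, or assigns the next free one.
  int indexFor(K key) {
    Integer ki = indexOf.get(key);
    if (ki == null) {
      ki = keyOf.size();
      indexOf.put(key, ki);
      keyOf.add(key);
    }
    return ki;
  }

  // Inverse direction: recover the key from its key index.
  K keyFor(int ki) {
    return keyOf.get(ki);
  }

  int size() {
    return keyOf.size();
  }
}
```

When the keys are already integers in the range zero to N, non-inclusive, a helper like this is unnecessary, which matches the remark in the talk.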
Although there are key differences\n 3712 07:42:17,580 --> 07:42:23,370 to represent a binary heap is with an array\n 3713 07:42:23,369 --> 07:42:34,349 If we were to represent the following binary\n 3714 07:42:34,349 --> 07:42:46,430 of values. If we know the index of node i,\n 3715 07:42:46,430 --> 07:42:53,637 child nodes are by using simple formulas,\n 3716 07:42:53,637 --> 07:43:03,079 the right child is two times i plus two, assuming\n 3717 07:43:03,080 --> 07:43:09,708 the children of the node at index four? Well,\n 3718 07:43:09,707 --> 07:43:16,199 I just gave you to obtain the indices, nine\n 3719 07:43:16,200 --> 07:43:23,378 math backwards and figure out what the parent\n 3720 07:43:23,378 --> 07:43:30,740 useful if you're either walking up or down\n 3721 07:43:30,740 --> 07:43:35,830 value into the priority queue, you insert\n 3722 07:43:35,830 --> 07:43:44,250 the bottom right of the binary tree. Suppose\n 3723 07:43:44,250 --> 07:43:49,849 violate the heap invariant, so we need to\n 3724 07:43:49,849 --> 07:43:54,977 met. So swap nodes five and 12. The heap invariant\n 3725 07:43:54,977 --> 07:44:00,659 nodes two and five, and now the tree is balanced.\n 3726 07:44:00,659 --> 07:44:10,409 a traditional priority queue. To remove items\n 3727 07:44:10,409 --> 07:44:13,898 and then swap it with the last node, perform\n 3728 07:44:13,898 --> 07:44:22,208 the swapped value. For this example, suppose\n 3729 07:44:22,207 --> 07:44:31,180 five, we don't know where the node value five\n 3730 07:44:31,180 --> 07:44:36,240 search for it. This is one of the major differences\n 3731 07:44:36,240 --> 07:44:42,780 queue. So start at node zero and process each\n 3732 07:44:44,849 --> 07:44:49,449 So we found a node with a value five to actually\n 3733 07:44:49,450 --> 07:44:56,889 rightmost bottom node. 
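The index arithmetic from the array representation described earlier can be written down in a few lines. The class name HeapIndex is illustrative only. The right child formula and the node 4 example come from the talk; the left child and parent formulas are the standard zero-based counterparts, with the parent obtained by doing the math backwards.

```java
// Zero-based index arithmetic for a binary heap stored in an array.
class HeapIndex {
  static int left(int i) {
    return 2 * i + 1;
  }

  static int right(int i) {
    return 2 * i + 2;
  }

  // Integer division undoes both child formulas; only meaningful for i > 0,
  // since the root at index 0 has no parent.
  static int parent(int i) {
    return (i - 1) / 2;
  }
}
```

For the node at index four this gives children at indices nine and ten, and asking for the parent of either child leads back to index four.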
Once this is done,\n 3734 07:44:56,889 --> 07:45:02,227 node we swapped into five's position may\n 3735 07:45:02,227 --> 07:45:08,159 to either move it up or down the tree. In\n 3736 07:45:08,159 --> 07:45:15,939 of one, which is smaller than its children,\n 3737 07:45:15,939 --> 07:45:26,369 to move the node down. That was a quick recap\n 3738 07:45:26,369 --> 07:45:33,459 about a traditional priority queue. Now let's\n 3739 07:45:33,459 --> 07:45:43,317 queue with a binary heap. For the following\n 3740 07:45:43,317 --> 07:45:49,040 with different priorities that we need to\n 3741 07:45:49,040 --> 07:45:55,440 a queue at a hospital, a waiting line at a\n 3742 07:45:55,439 --> 07:45:59,797 is, we'll assume that the values can dynamically\n 3743 07:45:59,797 --> 07:46:15,779 person with the lowest priority to figure\n 3744 07:46:15,779 --> 07:46:20,520 priority queue to sort by lowest value first.\n 3745 07:46:20,520 --> 07:46:28,628 assign each person a unique index value between\n 3746 07:46:28,628 --> 07:46:34,760 index values in the second column beside each\n 3747 07:46:34,759 --> 07:46:39,599 person an initial value to place inside the\n 3748 07:46:39,599 --> 07:46:44,779 by the index priority queue once inserted,\n 3749 07:46:44,779 --> 07:46:51,779 value, you want not only integers as shown\n 3750 07:46:51,779 --> 07:46:57,819 or whatever type of data we want. If I was\n 3751 07:46:57,819 --> 07:47:04,020 of the key value pairs I have in the last\n 3752 07:47:04,020 --> 07:47:08,930 that unlike the previous example, we're sorting\n 3753 07:47:08,930 --> 07:47:16,110 with a min heap. If I want to access the value\n 3754 07:47:16,110 --> 07:47:21,430 out what its key index is. And then you will\n 3755 07:47:21,430 --> 07:47:26,270 the index priority queue. Here's a good question,\n 3756 07:47:26,270 --> 07:47:30,218 index priority queue? 
Well, first, find Bella's\n 3757 07:47:30,218 --> 07:47:33,930 into the values array at position one to find\n 3758 07:47:33,930 --> 07:47:38,939 indexed priority queue. So Bella has a value\n 3759 07:47:38,939 --> 07:47:44,250 value for a particular key in the index priority\n 3760 07:47:44,250 --> 07:47:50,990 node for a particular key? To do that, we'll\n 3761 07:47:50,990 --> 07:47:59,388 namely a position map we can use to tell us\n 3762 07:47:59,387 --> 07:48:06,237 key index. For convenience, I will store the\n 3763 07:48:06,238 --> 07:48:13,319 the priority queue. As an example, let's find\n 3764 07:48:13,319 --> 07:48:20,619 find the key index for Dylan, which happens\n 3765 07:48:20,619 --> 07:48:25,599 position map to tell you where Dylan is in\n 3766 07:48:25,599 --> 07:48:28,399 and the index seven highlighted in orange.\n 3767 07:48:28,400 --> 07:48:33,790 George in the heap? I'll give you a quick\n 3768 07:48:34,790 --> 07:48:40,659 All right, so with just about every operation,\n 3769 07:48:40,659 --> 07:48:48,797 the key we care about, which is George, then\n 3770 07:48:48,797 --> 07:49:00,770 position map and find out the node for George,\n 3771 07:49:00,770 --> 07:49:12,680 know how to look up the node for a given key.\n 3772 07:49:12,680 --> 07:49:18,047 This inverse lookup will prove to be a very\n 3773 07:49:18,047 --> 07:49:22,180 key is associated with the root node at index\n 3774 07:49:22,180 --> 07:49:27,229 lookup to figure that out. To do the inverse\n 3775 07:49:27,229 --> 07:49:33,637 need to maintain an inverse lookup table.\n 3776 07:49:33,637 --> 07:49:38,340 for inverse map. Let's see if we can figure\n 3777 07:49:38,340 --> 07:49:45,020 at index two. 
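To make the three kinds of lookup concrete, here is a small sketch with made-up data. The four names, their values and the heap layout are assumptions chosen for illustration and do not reproduce the table on the slides; only the roles of the arrays (a values array indexed by key index, a position map pm, and an inverse map im) follow the talk.

```java
// Sketch of the lookup tables behind an indexed priority queue (hypothetical data).
class IpqLookups {
  // keys[ki] is the key that was assigned key index ki.
  static final String[] KEYS = {"Anna", "Bella", "Carly", "Dylan"};
  // vals[ki] is the value associated with key index ki.
  static final int[] VALS = {9, 15, 4, 17};
  // pm[ki] is the heap node (array position) where key index ki currently sits.
  static final int[] PM = {1, 2, 0, 3};
  // im[node] is the key index stored at that heap node (inverse of pm).
  static final int[] IM = {2, 0, 1, 3};

  // Value lookup: index the values array by key index.
  static int valueOf(int ki) {
    return VALS[ki];
  }

  // Position lookup: which heap node represents this key index?
  static int nodeOf(int ki) {
    return PM[ki];
  }

  // Inverse lookup: which key does this heap node represent?
  static String keyAtNode(int node) {
    return KEYS[IM[node]];
  }
}
```

Here the min heap has Carly (value 4) at the root, so the inverse lookup for node 0 yields Carly, while looking up key index 1 in the values array yields Bella's value directly without touching the heap order.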
To do that, simply do a lookup\n 3778 07:49:45,020 --> 07:49:50,930 us information about which key index is associated\n 3779 07:49:50,930 --> 07:49:56,689 us to retrieve the actual key by doing a lookup\n 3780 07:49:56,689 --> 07:50:07,387 case, the node at position two represents\n 3781 07:50:07,387 --> 07:50:11,669 make sure you're still paying attention. Which\n 3782 07:50:11,669 --> 07:50:18,459 position three. Same as before, find the key\n 3783 07:50:18,459 --> 07:50:25,079 key next, figure out the actual key from the\n 3784 07:50:25,080 --> 07:50:30,350 that the node at position three represents\n 3785 07:50:30,349 --> 07:50:37,547 an indexed priority queue is structured internally\n 3786 07:50:37,547 --> 07:50:41,207 want to actually do some useful operations\n 3787 07:50:41,207 --> 07:50:48,609 new key value pairs, removing key value pairs\n 3788 07:50:48,610 --> 07:50:52,898 associated with a key. These are all possible\n 3789 07:50:52,898 --> 07:50:58,978 do it with a regular priority queue insertion\n 3790 07:50:58,977 --> 07:51:10,567 update the position map pm and the inverse\n 3791 07:51:10,567 --> 07:51:18,411 pairs. Suppose we want to insert the key Mary\n 3792 07:51:18,411 --> 07:51:21,560 What we first have to do is assign Mary a\n 3793 07:51:21,560 --> 07:51:24,690 insert the new key value pair at the insertion\n 3794 07:51:24,689 --> 07:51:29,180 our arrays at index 12. To reflect that the\n 3795 07:51:29,180 --> 07:51:34,878 the heap invariant is not satisfied since\n 3796 07:51:34,878 --> 07:51:40,420 than the one at node five. To resolve this,\n 3797 07:51:40,419 --> 07:51:45,259 upwards until the heap invariant is satisfied\n 3798 07:51:45,259 --> 07:51:51,090 map and the inverse map. 
By swapping the values\n 3799 07:51:51,090 --> 07:51:55,599 array does not need to be touched since it\n 3800 07:51:55,599 --> 07:52:00,359 get from the map and not the node index per\n 3801 07:52:00,360 --> 07:52:04,440 still not satisfied, so we need to keep swapping\nupwards. 3802 07:52:04,439 --> 07:52:12,189 Let's have a look at some pseudocode for insertions\n 3803 07:52:12,189 --> 07:52:17,539 needs to provide a valid key index ki, as\n 3804 07:52:17,540 --> 07:52:24,308 The first thing I do is store the value associated\n 3805 07:52:24,308 --> 07:52:31,530 update the position map and the inverse map\n 3806 07:52:31,529 --> 07:52:39,529 has been inserted into the priority queue.\n 3807 07:52:39,529 --> 07:52:42,579 the heap invariant is satisfied. Let's take\n 3808 07:52:42,580 --> 07:52:47,120 that happens. Here we are looking at the swim\n 3809 07:52:47,119 --> 07:52:52,520 functions swap and less. Swap simply exchanges\n 3810 07:52:52,520 --> 07:53:00,159 if node i has a value less than node j. The\n 3811 07:53:00,159 --> 07:53:08,169 It begins by finding the index of the parent\n 3812 07:53:08,169 --> 07:53:16,739 we walk up one layer in the tree. If the index\n 3813 07:53:16,740 --> 07:53:22,048 the value of the current node is less than\n 3814 07:53:22,047 --> 07:53:29,619 with a min heap, and want small values to\n 3815 07:53:29,619 --> 07:53:35,020 issue a node exchange simply call the swap\n 3816 07:53:35,020 --> 07:53:40,810 node and the parent node, and then update\n 3817 07:53:40,810 --> 07:53:50,530 values. I also want to talk about swapping\n 3818 07:53:50,529 --> 07:53:56,369 slightly different than a traditional priority\n 3819 07:53:56,369 --> 07:54:03,289 moving around the values in the array, we're\n 3820 07:54:03,290 --> 07:54:10,790 that the values array is indexed by the key\n 3821 07:54:10,790 --> 07:54:18,780 the values array can remain constant while\n 3822 07:54:18,779 --> 07:54:25,849 map. 
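The point about swapping, that only the position map and the inverse map change while the values array stays put, can be illustrated with a short sketch. The class name IndexSwap is made up; the array names pm and im follow the talk.

```java
// Sketch of the node exchange in an indexed priority queue.
// pm[ki]  = heap node currently holding key index ki
// im[node] = key index stored at that heap node
class IndexSwap {
  static void swap(int[] pm, int[] im, int i, int j) {
    // First fix up the positions: the key index at node j will live at node i,
    // and the key index at node i will live at node j.
    pm[im[j]] = i;
    pm[im[i]] = j;
    // Then exchange the key indexes stored at the two nodes.
    int tmp = im[i];
    im[i] = im[j];
    im[j] = tmp;
  }
}
```

Because the values array is indexed by key index rather than by node position, it never appears in this routine at all.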
First, we update the positions of where\n 3823 07:54:25,849 --> 07:54:35,449 queue. Remember what the position map is,\n 3824 07:54:35,450 --> 07:54:42,520 key index is found at. So we can do a straightforward\n 3825 07:54:42,520 --> 07:54:46,479 the key index values and swap indices i and\n 3826 07:54:46,479 --> 07:54:54,979 index values associated with nodes i and j and\n 3827 07:54:54,979 --> 07:55:01,292 a simple straightforward exchange. Next up,\n 3828 07:55:01,292 --> 07:55:06,760 elements from an indexed priority queue. Polling\n 3829 07:55:06,759 --> 07:55:13,371 removing is improved from a linear time complexity\n 3830 07:55:13,371 --> 07:55:18,450 position lookups are now constant time but\n 3831 07:55:18,450 --> 07:55:25,869 is why the heap invariant is still logarithmic.\n 3832 07:55:25,869 --> 07:55:32,909 want to poll the root node. This is something\n 3833 07:55:32,909 --> 07:55:40,509 key value pair with the lowest value in the\n 3834 07:55:40,509 --> 07:55:50,647 node with the bottom right node. As we do\n 3835 07:55:50,648 --> 07:55:55,620 we can remove the read node from the tree.\n 3836 07:55:55,619 --> 07:56:03,137 store the key value pair we're removing so\n 3837 07:56:03,137 --> 07:56:13,539 Then clean up the removed node. Finally, restore\n 3838 07:56:13,540 --> 07:56:19,690 node down. Since the left and the right child\n 3839 07:56:22,779 --> 07:56:29,270 Let's do a slightly more involved removal\n 3840 07:56:29,270 --> 07:56:34,898 priority queue. In this example, let's remove the\n 3841 07:56:34,898 --> 07:56:43,610 get the key index for Laura, which is the\n 3842 07:56:43,610 --> 07:56:49,840 key index is equal to 11. Once we know the\n 3843 07:56:49,840 --> 07:56:55,797 locate the node within the heap by looking\n 3844 07:56:55,797 --> 07:57:02,579 we want to remove that is the node which contains\n 3845 07:57:02,580 --> 07:57:08,000 it with the bottom rightmost node. Store the\n 3846 07:57:08,000 --> 07:57:16,299 clean up the removed node. 
And finally restore\n 3847 07:57:16,299 --> 07:57:24,819 node up or down, we're going to make the purple\n 3848 07:57:24,819 --> 07:57:31,779 Alright, let's look at some pseudocode for\n 3849 07:57:31,779 --> 07:57:39,897 very short only five lines of implementation\n 3850 07:57:39,898 --> 07:57:50,740 to do is exchange the position of the node\n 3851 07:57:50,740 --> 07:57:58,320 node, which is always at index position as\n 3852 07:57:58,319 --> 07:58:01,878 exchanged the nodes, the rightmost node is\n 3853 07:58:01,878 --> 07:58:08,637 to remove was. So we need to move it either\n 3854 07:58:08,637 --> 07:58:14,229 we don't know which it will be so we try to\n 3855 07:58:14,229 --> 07:58:20,457 swim. Lastly, I just clean up the values associated\n 3856 07:58:20,457 --> 07:58:26,637 return the key value pair being removed. But\n 3857 07:58:26,637 --> 07:58:34,509 look at the sink method. So we understand\n 3858 07:58:34,509 --> 07:58:40,419 select the child with the smallest value and\n 3859 07:58:40,419 --> 07:58:46,797 tie. In this next block, I tried to update\n 3860 07:58:46,797 --> 07:58:51,430 But first I need to check if the right child\n 3861 07:58:51,430 --> 07:58:57,000 and its value is actually less than the one\n 3862 07:58:57,000 --> 07:59:03,090 if we're outside the size of the heap, or\n 3863 07:59:03,090 --> 07:59:09,067 Lastly, we want to make sure we swap the current\n 3864 07:59:09,067 --> 07:59:18,829 value. Lastly, we want to make sure we swap\n 3865 07:59:18,830 --> 07:59:25,750 with the smallest value. 
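Pulling the insertion, swim, removal and sink steps together, a minimal min indexed binary heap might look like the sketch below. It is an illustration under stated assumptions rather than the code shown in the video: values are plain integers, the class name MinIndexedBinaryHeap and its method names are made up for this example, and the guard around sink and swim for the case of removing the last node is an addition of this sketch.

```java
import java.util.Arrays;

// Sketch of a min indexed binary heap over integer values.
class MinIndexedBinaryHeap {
  private final Integer[] vals; // vals[ki] = value for key index ki, null if absent
  private final int[] pm;       // pm[ki] = heap node of key index ki, or -1
  private final int[] im;       // im[node] = key index at that heap node, or -1
  private int sz;               // number of elements currently in the heap

  MinIndexedBinaryHeap(int capacity) {
    vals = new Integer[capacity];
    pm = new int[capacity];
    im = new int[capacity];
    Arrays.fill(pm, -1);
    Arrays.fill(im, -1);
  }

  int size() { return sz; }
  boolean contains(int ki) { return pm[ki] != -1; }
  int peekMinKeyIndex() { return im[0]; }

  void insert(int ki, int value) {
    if (contains(ki)) throw new IllegalArgumentException("key index already present: " + ki);
    vals[ki] = value;
    pm[ki] = sz;   // the new node goes to the bottom right insertion position
    im[sz] = ki;
    swim(sz++);
  }

  int remove(int ki) {
    if (!contains(ki)) throw new IllegalArgumentException("key index not present: " + ki);
    int i = pm[ki];
    swap(i, --sz);   // exchange with the bottom rightmost node
    if (i < sz) {    // we don't know the direction, so try both
      sink(i);
      swim(i);
    }
    int value = vals[ki];
    vals[ki] = null; // clean up the removed node
    pm[ki] = -1;
    im[sz] = -1;
    return value;
  }

  private void swim(int i) {
    for (int p = (i - 1) / 2; i > 0 && less(i, p); p = (i - 1) / 2) {
      swap(i, p);
      i = p;
    }
  }

  private void sink(int i) {
    while (true) {
      int left = 2 * i + 1, right = 2 * i + 2, smallest = left; // default to left on a tie
      if (right < sz && less(right, left)) smallest = right;
      if (left >= sz || !less(smallest, i)) break;
      swap(smallest, i);
      i = smallest;
    }
  }

  private boolean less(int i, int j) { return vals[im[i]] < vals[im[j]]; }

  private void swap(int i, int j) {
    pm[im[j]] = i;
    pm[im[i]] = j;
    int tmp = im[i];
    im[i] = im[j];
    im[j] = tmp;
  }
}
```

Removal runs in logarithmic time here because finding the node is a single pm lookup, leaving only the sink or swim along one root to leaf path.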
The last core operation\n 3866 07:59:25,750 --> 07:59:33,430 key value pair updates similar to remove those\n 3867 07:59:33,430 --> 07:59:40,099 logarithmic time due to the constant time\n 3868 07:59:40,099 --> 07:59:45,159 logarithmic time to adjust where the key value\n 3869 07:59:45,159 --> 07:59:53,689 want to update the value of the key Carly\n 3870 07:59:53,689 --> 08:00:03,049 find the key index of the key we want to work\n 3871 08:00:03,049 --> 08:00:07,860 two, then we can use that key index value\n 3872 08:00:07,860 --> 08:00:13,819 Of course, the heap invariant may not be satisfied,\n 3873 08:00:17,330 --> 08:00:26,250 the pseudocode for updating the value of a\n 3874 08:00:26,250 --> 08:00:32,520 in the values array and move the node either\n 3875 08:00:32,520 --> 08:00:33,520 so there was nothing too special about updates,\n 3876 08:00:33,520 --> 08:00:35,727 increase and decrease key. In many applications\n 3877 08:00:35,727 --> 08:00:37,259 often useful to be able to update a given\n 3878 08:00:37,259 --> 08:00:41,567 or always larger. In the event that a worse\n 3879 08:00:41,567 --> 08:00:45,808 should not be updated. In such situations,\n 3880 08:00:45,808 --> 08:00:46,808 form of update operation called increase\n 3881 08:00:46,808 --> 08:00:49,708 on whether you want to increase the value\n 3882 08:00:49,707 --> 08:00:50,707 the value associated with the key. 
Both of\n 3883 08:00:50,707 --> 08:00:51,707 consist of doing an if statement before performing\n 3884 08:00:51,707 --> 08:00:52,707 or decreases the value associated with the\n 3885 08:00:52,707 --> 08:00:53,707 just convenience methods that wrap a get operation\n 3886 08:00:53,707 --> 08:00:54,707 we're going to look at some source code for\n 3887 08:00:54,707 --> 08:00:55,707 we get started, make sure you watch my video\n 3888 08:00:55,707 --> 08:00:56,770 the implementation details and why an index\n 3889 08:00:56,770 --> 08:00:58,049 All the source code for this video can be\n 3890 08:00:58,049 --> 08:00:59,049 slash williamfiset slash data-structures, the\n 3891 08:00:59,049 --> 08:01:02,529 Here we are in the source code for a min indexed\n 3892 08:01:02,529 --> 08:01:04,128 in the Java programming language. To get started,\n 3893 08:01:04,128 --> 08:01:07,000 that we pass in a type of object which is\n 3894 08:01:07,000 --> 08:01:08,909 value pairs within the heap. You'll also notice\n 3895 08:01:08,909 --> 08:01:11,919 This is just to be more generic. And all I\n 3896 08:01:11,919 --> 08:01:12,957 this heap to have at most two children for\n 3897 08:01:12,957 --> 08:01:14,069 general D children. So let's look\n 3898 08:01:14,069 --> 08:01:16,128 the fun stuff is happening. So let's go over\n 3899 08:01:16,128 --> 08:01:18,909 is just the number of elements in the heap,\n 3900 08:01:18,909 --> 08:01:23,270 number of elements in the heap, D is the degree\n 3901 08:01:23,270 --> 08:01:28,930 number is two. The two arrays, child and parent\n 3902 08:01:28,930 --> 08:01:34,207 node, so we don't have to compute them dynamically.\n 3903 08:01:34,207 --> 08:01:37,779 maps, which we're going to use to track the\n 3904 08:01:37,779 --> 08:01:42,467 values array, which is the array that contains\n 3905 08:01:42,468 --> 08:01:49,137 Note that it's very important to notice that\n 3906 08:01:49,137 --> 08:01:51,468 not by the nodes, indices, per se. 
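The if statement at the heart of increase key and decrease key, mentioned a little earlier, can be shown in isolation. This sketch only covers the guard on a bare values array indexed by key index; the class name GuardedUpdate is hypothetical, and the step of moving the node within the heap afterwards is deliberately left out.

```java
// Sketch of the guard used by the restricted update operations.
class GuardedUpdate {
  // Lowers vals[ki] only if the candidate is strictly smaller.
  // In a min heap the node would then be moved up (swim) from its position.
  static boolean decrease(int[] vals, int ki, int candidate) {
    if (candidate < vals[ki]) {
      vals[ki] = candidate;
      return true;
    }
    return false; // a worse value is ignored
  }

  // Raises vals[ki] only if the candidate is strictly larger.
  // In a min heap the node would then be moved down (sink).
  static boolean increase(int[] vals, int ki, int candidate) {
    if (vals[ki] < candidate) {
      vals[ki] = candidate;
      return true;
    }
    return false;
  }
}
```

Since the direction of the change is known in advance, each operation only needs one of swim or sink, unlike a general update which has to try both.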
So in the\n 3907 08:01:51,468 --> 08:01:56,280 maximum size for our heap. Then we just initialize\n 3908 08:01:56,279 --> 08:01:57,547 size of the heap, then I initialize all our\n 3909 08:01:57,547 --> 08:01:58,547 parent indices should be. And I also initialize\n 3910 08:01:58,547 --> 08:01:59,547 negative one values, then we have a few straightforward\n 3911 08:01:59,547 --> 08:02:02,759 So you'll notice that for contains, we don't\n 3912 08:02:02,759 --> 08:02:03,759 we pass in the key index. And we're going\n 3913 08:02:03,759 --> 08:02:04,759 have a few convenience bounds checking methods\n 3914 08:02:04,759 --> 08:02:05,949 such as key has to be in the bounds or throw.\n 3915 08:02:05,950 --> 08:02:09,680 is valid. And after this check is done, we\n 3916 08:02:09,680 --> 08:02:13,040 by looking inside the position map and checking\n 3917 08:02:13,040 --> 08:02:16,138 Then we have things like peek min key index,\n 3918 08:02:16,137 --> 08:02:17,137 of the heap. And similarly, poll min key index 3919 08:02:17,137 --> 08:02:18,137 and also peek min value and poll min value.\n 3920 08:02:18,137 --> 08:02:19,137 we want to just look at the value at the top\n 3921 08:02:19,137 --> 08:02:20,137 the poll version will actually remove it.\n 3922 08:02:20,137 --> 08:02:21,137 value pair, we need to make sure that that\n 3923 08:02:21,137 --> 08:02:22,137 heap. Otherwise, we're going to throw an exception\n 3924 08:02:22,137 --> 08:02:23,137 null. And if it is we throw an exception.\n 3925 08:02:23,137 --> 08:02:24,137 the position map and inverse map. Then we\n 3926 08:02:24,137 --> 08:02:25,137 we passed in. And then we swim up that node\n 3927 08:02:25,137 --> 08:02:26,137 we also increment the size variable so that\n 3928 08:02:26,137 --> 08:02:27,137 of pretty straightforward, just do a look\n 3929 08:02:27,137 --> 08:02:28,137 slightly more interesting. We make sure the\n 3930 08:02:28,137 --> 08:02:29,137 that key exists within the heap. 
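The constructor step described above, precomputing the child and parent indices and filling the maps with negative one, could look roughly like this for a heap of degree D. The class name DaryLayout is hypothetical, and the formulas used are the standard zero-based ones for a D-ary heap, assumed here rather than quoted from the video's source file.

```java
// Sketch of the index tables a D-ary indexed heap might precompute.
class DaryLayout {
  final int degree;   // D, the number of children per node (at least 2)
  final int[] child;  // child[i] = index of the first (leftmost) child of node i
  final int[] parent; // parent[i] = index of the parent of node i
  final int[] pm;     // position map, initialised to -1 (key index not present)
  final int[] im;     // inverse map, initialised to -1 (node not occupied)

  DaryLayout(int degree, int maxSize) {
    if (maxSize <= 0) throw new IllegalArgumentException("maxSize <= 0");
    this.degree = Math.max(2, degree);
    child = new int[maxSize];
    parent = new int[maxSize];
    pm = new int[maxSize];
    im = new int[maxSize];
    for (int i = 0; i < maxSize; i++) {
      parent[i] = (i - 1) / this.degree;
      child[i] = i * this.degree + 1;
      pm[i] = im[i] = -1;
    }
  }
}
```

With a degree of two these tables reduce to the binary heap formulas, and the children of node i always occupy the contiguous range starting at child[i].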
We swap the\n 3931 08:02:29,137 --> 08:02:30,137 heap. Then we reposition the new node we swapped\n 3932 08:02:30,137 --> 08:02:31,137 or down the heap, we capture the value in\n 3933 08:02:31,137 --> 08:02:32,137 can return it later, we clean up the node\n 3934 08:02:32,137 --> 08:02:33,137 value. Update is also pretty easy. Just make\n 3935 08:02:33,137 --> 08:02:34,137 not null, then we get the index for the node,\n 3936 08:02:34,137 --> 08:02:35,137 new value, then move it within the heap. And\n 3937 08:02:35,137 --> 08:02:36,137 and decrease are just short for decrease key\n 3938 08:02:36,137 --> 08:02:37,137 D-ary heap here. So make sure you take that into\n 3939 08:02:37,137 --> 08:02:38,137 sure the key exists and it's not null, then\n 3940 08:02:38,137 --> 08:02:39,137 in the heap, the values array at the key index,\n 3941 08:02:39,137 --> 08:02:40,137 that I didn't call the update method here,\n 3942 08:02:40,137 --> 08:02:41,137 the update method, we sink and we swim. So\n 3943 08:02:41,137 --> 08:02:42,137 way the node will go whether it's going to\n 3944 08:02:42,137 --> 08:02:43,137 key method, we do. Same thing for the increase\n 3945 08:02:43,137 --> 08:02:44,137 the less comparator so that the values array\n 3946 08:02:44,137 --> 08:02:45,137 passed in is on the right. These are just\n 3947 08:02:45,137 --> 08:02:46,137 the slides, we can go over quickly. 
So to\n 3948 08:02:46,137 --> 08:02:47,137 i since we're working with a D-ary heap, we\n 3949 08:02:47,137 --> 08:02:48,137 the one with the least value, this is going\n 3950 08:02:48,137 --> 08:02:49,137 sure that i is equal to the value of j and\n 3951 08:02:49,137 --> 08:02:50,137 repeat this until we can't sink the node anymore.\n 3952 08:02:50,137 --> 08:02:51,137 find the parent of node i which we can just\n 3953 08:02:51,137 --> 08:02:52,137 I swap with the parent and keep doing this\n 3954 08:02:52,137 --> 08:02:53,137 just looks at all the children of node i\n 3955 08:02:53,137 --> 08:02:54,137 returns its index. Also pretty straightforward.\n 3956 08:02:54,137 --> 08:02:55,137 swap the indices in the position map and\n 3957 08:02:55,137 --> 08:02:56,137 have the less function which simply compares\n 3958 08:02:56,137 --> 08:02:57,137 convenience methods just because I didn't\n 3959 08:02:57,137 --> 08:02:58,137 exceptions everywhere. Just kind of wrap them\n 3960 08:02:58,137 --> 08:02:59,137 to make sure that our heap is indeed a min\n 3961 08:02:59,137 --> 08:03:03,637 D-ary heap. I hope there wasn't too much\n 3962 08:03:03,637 --> 08:03:10,700 give this video a thumbs up if you learned something\n