Would you like to inspect the original subtitles? These are the user uploaded subtitles that are being translated:
1
00:00:10,000 --> 00:00:17,000
The primary focus here is on intuition. There will be some math and formulas, but focus in the course is on
2
00:00:17,000 --> 00:00:26,000
understanding the main concepts, so you should be able to follow along, even if math is not among your primary interests.
3
00:00:26,000 --> 00:00:33,000
Before I explain anything about the course, let's head straight into a demo, which sole purpose is to
4
00:00:33,000 --> 00:00:43,000
illustrate that the way we implement a solution to a problem can have quite a bit of effect on how the solution performs.
5
00:00:43,000 --> 00:00:52,000
So, consider a popular website and assume that we are provided with a chunk of a log file from the web
6
00:00:52,000 --> 00:00:58,000
servers and are asked to report the number of unique visitors seen in that log file.
7
00:00:58,000 --> 00:01:06,000
In our case, visitors are identified solely by their IP address, so in order to find the answer, we will need
8
00:01:06,000 --> 00:01:11,000
to traverse the log and count the number of different IP addresses.
9
00:01:11,000 --> 00:01:15,000
We will try two approaches.
10
00:01:15,000 --> 00:01:22,000
In the first approach, we will solve the problem on a modern laptop produced in the year of this recording.
11
00:01:22,000 --> 00:01:29,000
I have put some specs in the table here. The lowest Geekbench 3 is a benchmark that I downloaded and
12
00:01:29,000 --> 00:01:35,000
executed on the machine, and the benchmark runs a large variety of tests and produces a final score,
13
00:01:35,000 --> 00:01:40,000
which can be used to measure the actual performance of the machine.
14
00:01:40,000 --> 00:01:49,000
And that machine is going to compete against a relatively old mobile phone equipped with much less capable
15
00:01:49,000 --> 00:01:56,000
hardware, and as you can see, it didn't score nearly as good in the Geekbench test.
16
00:01:56,000 --> 00:02:04,000
So, judging by the numbers, the mobile phone should have no chance in solving our challenge faster than a laptop.
17
00:02:04,000 --> 00:02:11,000
It seems like an unfair competition. Let's see if we can do something about that.
18
00:02:11,000 --> 00:02:21,000
In this example, I have implemented a solution for the iPhone and laptop using the languages Swift and C# respectively.
19
00:02:21,000 --> 00:02:30,000
In pseudo code, the solution looks like this. First, I read the whole log in order to be able to time that part separately.
20
00:02:30,000 --> 00:02:39,000
Next, I run the real solution, which in addition to the log traversal, also counts the IP addresses.
21
00:02:39,000 --> 00:02:46,000
So, first, let's look at how to traverse the log file in Swift. I create an instance of LogReader,
22
00:02:46,000 --> 00:02:54,000
which can provide me with all the log lines from the log file. I set a counter to 0, and then traverse all
23
00:02:54,000 --> 00:03:01,000
lines provided by the LogReader. For each logLine, I extract the IP address without using it, but then this
24
00:03:01,000 --> 00:03:10,000
action be measured as well, and increase the counter, which is returned after the loop.
25
00:03:10,000 --> 00:03:19,000
Here's the C# imitation. Consider to pause the video to convince yourself that these two implementations are practically identical.
26
00:03:19,000 --> 00:03:26,000
One subtle difference, however, is how all the logLines are returned, but as we shall see in a moment,
27
00:03:26,000 --> 00:03:33,000
that difference is insignificant in the total picture. Let's proceed to the actual solution that also counts
28
00:03:33,000 --> 00:03:39,000
the number of unique IP addresses. As you can see, the two methods are very alike.
29
00:03:39,000 --> 00:03:46,000
After creating a logReader, I initialize the container in which I will store all the different IP addresses
30
00:03:46,000 --> 00:03:54,000
I have seen so far. In this case, I use a container called NSMutableSet, which is an implementation of a
31
00:03:54,000 --> 00:04:02,000
so-called HashSet. If the IPs extracted are not already present in the container, they are added,
32
00:04:02,000 --> 00:04:11,000
and otherwise we just proceed with the next logLine. Finally, the function returns the number of elements in the container.
33
00:04:11,000 --> 00:04:17,000
Here is the C# implementation. Again, consider to pause the video to convince yourself that the two
34
00:04:17,000 --> 00:04:25,000
implementations are practically identical. One difference, however, is the choice of container.
35
00:04:25,000 --> 00:04:34,000
Even though that C# also supports HashSets, I have chosen to use a List in C#, which is basically an array
36
00:04:34,000 --> 00:04:41,000
with non-static size. The two containers are very different and we will dive deeper into both of them later
37
00:04:41,000 --> 00:04:50,000
on, but for now, just notice that the only practical difference between the two solutions is the choice of container.
38
00:04:50,000 --> 00:04:59,000
Okay, here are two frames, which in a moment will execute our two solutions simultaneously,
39
00:04:59,000 --> 00:05:05,000
the iPhone app in the left frame, and with Visual Studio loaded with the implementation in the right frame.
40
00:05:05,000 --> 00:05:18,000
Let's start the experiment. Both solutions finished reading the log file very quickly, but the modern
41
00:05:18,000 --> 00:05:27,000
machine won using only 0.02 seconds. The iPhone, however, won the counting using only 5.0 seconds.
42
00:05:27,000 --> 00:05:32,000
I'll pause the recording and return once the laptop finishes.
43
00:05:32,000 --> 00:05:36,000
36.6 seconds.
44
00:05:36,000 --> 00:05:43,000
So the iPhone won this competition, despite being much less capable in terms of hardware.
45
00:05:43,000 --> 00:05:48,000
We used an NSMutableSet on the iPhone, and a List on the laptop.
46
00:05:48,000 --> 00:05:57,000
If we, however, use a HashSet on the laptop instead, which is equivalent to the NSMutableSet on the iPhone,
47
00:05:57,000 --> 00:06:07,000
the laptop solves the problem using only 0.04 seconds. So the main point here is that the choice of
48
00:06:07,000 --> 00:06:15,000
container or data structure really matters when designing a problem solution.
49
00:06:15,000 --> 00:06:20,000
Data structures are structures in which we organize data, and we have just seen how the choice of data
50
00:06:20,000 --> 00:06:27,000
structures can affect performance quite drastically. The term data structure is often mentioned in
51
00:06:27,000 --> 00:06:33,000
conjunction with the term algorithms, which covers ways to do things.
52
00:06:33,000 --> 00:06:39,000
Data structures and algorithms are heavily linked. Data structures typically use some sort of algorithm to
53
00:06:39,000 --> 00:06:48,000
perform their inner organization, and algorithms typically use data structures to store internal states.
54
00:06:48,000 --> 00:06:54,000
In the remainder of this course, we will spend a moment on establishing a kind of language to describe how
55
00:06:54,000 --> 00:07:01,000
data structures and algorithms will perform and to help compare different performance characteristics without
56
00:07:01,000 --> 00:07:05,000
having to actually implement and execute the solutions to compare.
57
00:07:05,000 --> 00:07:11,000
Next, we will take a look at some essential data structures that are good to know, and proceed with
58
00:07:11,000 --> 00:07:18,000
discussing some essential algorithms, and I'll end the course by introducing you to an interesting category
59
00:07:18,000 --> 00:07:24,000
of problems that may seem simple at first, but that are really hard to solve, no matter how clever algorithms
60
00:07:24,000 --> 00:07:27,000
or data structures we use.
7813
Can't find what you're looking for?
Get subtitles in any language from opensubtitles.com, and translate them here.