1
00:00:01,080 --> 00:00:02,880
Instructor: You are probably watching this course
2
00:00:02,880 --> 00:00:05,130
because you wanna learn the appropriate statistics
3
00:00:05,130 --> 00:00:07,200
to perform different tests.
4
00:00:07,200 --> 00:00:10,080
Maybe you wanna use this knowledge as a stepping stone
5
00:00:10,080 --> 00:00:12,330
to a career in data science.
6
00:00:12,330 --> 00:00:14,520
Either way, before we can start testing,
7
00:00:14,520 --> 00:00:16,770
we have to get acquainted with the types of variables
8
00:00:16,770 --> 00:00:18,570
we usually encounter.
9
00:00:18,570 --> 00:00:21,060
Different types of variables require different types
10
00:00:21,060 --> 00:00:23,820
of statistical and visualization approaches.
11
00:00:23,820 --> 00:00:25,470
Therefore, to be able to classify
12
00:00:25,470 --> 00:00:27,783
the data you are working with is key.
13
00:00:28,860 --> 00:00:31,800
We can classify data in two main ways
14
00:00:31,800 --> 00:00:35,040
based on its type and on its measurement level.
15
00:00:35,040 --> 00:00:37,560
Let's start from the types of data we can have.
16
00:00:37,560 --> 00:00:40,980
There is categorical and numerical data.
17
00:00:40,980 --> 00:00:44,460
Categorical data describes categories or groups.
18
00:00:44,460 --> 00:00:48,990
One example is car brands like Mercedes, BMW and Audi.
19
00:00:48,990 --> 00:00:50,613
They show different categories.
20
00:00:51,510 --> 00:00:55,230
Another instance is answers to yes-or-no questions.
21
00:00:55,230 --> 00:00:57,918
If I ask questions like, "Are you currently enrolled
22
00:00:57,918 --> 00:01:02,190
in a university?" or "Do you own a car?"
23
00:01:02,190 --> 00:01:04,410
Yes and no would be the two groups of answers
24
00:01:04,410 --> 00:01:05,459
that can be obtained.
25
00:01:06,450 --> 00:01:08,493
This is categorical data.
26
00:01:09,420 --> 00:01:11,340
Numerical data, on the other hand,
27
00:01:11,340 --> 00:01:14,700
as its name suggests, represents numbers.
28
00:01:14,700 --> 00:01:17,370
It is further divided into two subsets.
29
00:01:17,370 --> 00:01:19,353
Discrete and continuous.
30
00:01:20,460 --> 00:01:23,940
Discrete data can usually be counted in a finite manner.
31
00:01:23,940 --> 00:01:25,230
A good example would be the number
32
00:01:25,230 --> 00:01:27,330
of children that you want to have.
33
00:01:27,330 --> 00:01:29,640
Even if you don't know exactly how many,
34
00:01:29,640 --> 00:01:32,550
you are absolutely sure that the value will be an integer
35
00:01:32,550 --> 00:01:35,763
such as zero, one, two, or even 10.
36
00:01:36,900 --> 00:01:40,290
Another instance is grades on the SAT exam.
37
00:01:40,290 --> 00:01:45,290
You may get 1,000, 1,560, 1,570 or 2,400.
38
00:01:47,130 --> 00:01:50,190
What is important for a variable to be defined as discrete
39
00:01:50,190 --> 00:01:53,070
is that you can imagine each member of the data set.
40
00:01:53,070 --> 00:01:57,024
Knowing that SAT scores range from 600 to 2,400 and that 10 points
41
00:01:57,024 --> 00:02:00,783
separate all possible scores that can be obtained is key.
42
00:02:02,430 --> 00:02:04,470
It's easier to understand discrete data
43
00:02:04,470 --> 00:02:07,530
by saying it's the opposite of continuous data.
44
00:02:07,530 --> 00:02:11,009
Continuous data is infinite and impossible to count.
45
00:02:11,009 --> 00:02:13,020
For instance, your weight can take on
46
00:02:13,020 --> 00:02:15,510
every value in some range.
47
00:02:15,510 --> 00:02:18,030
Let's dig a bit deeper into this.
48
00:02:18,030 --> 00:02:21,810
You get on the scale and the screen shows 150 pounds
49
00:02:21,810 --> 00:02:26,810
or 68.0389 kilograms, but this is just an approximation.
50
00:02:27,780 --> 00:02:31,650
If you gain 0.01 pound, the figure on the scale
51
00:02:31,650 --> 00:02:34,290
is unlikely to change but your new weight
52
00:02:34,290 --> 00:02:39,290
will be 150.01 pounds or 68.0434 kilograms.
53
00:02:41,580 --> 00:02:43,860
Now, think about sweating.
54
00:02:43,860 --> 00:02:46,050
Every drop of sweat reduces your weight
55
00:02:46,050 --> 00:02:48,360
by the weight of that drop, but once again,
56
00:02:48,360 --> 00:02:51,270
a scale is unlikely to capture that change.
57
00:02:51,270 --> 00:02:53,820
Your exact weight is a continuous variable.
58
00:02:53,820 --> 00:02:55,920
It can take on an infinite number of values
59
00:02:55,920 --> 00:02:58,420
no matter how many digits there are after the decimal point.
60
00:03:00,030 --> 00:03:01,980
To sum up, your weight can vary
61
00:03:01,980 --> 00:03:05,580
by incomprehensibly small amounts and is continuous
62
00:03:05,580 --> 00:03:08,158
while the number of children you want to have
63
00:03:08,158 --> 00:03:10,308
is directly understandable and is discrete.
64
00:03:11,280 --> 00:03:14,010
Just to make sure, here are some other examples
65
00:03:14,010 --> 00:03:16,770
of discrete and continuous data.
66
00:03:16,770 --> 00:03:21,770
Grades at university are discrete: A, B, C, D, E, F
67
00:03:21,810 --> 00:03:24,840
or zero to 100%.
68
00:03:24,840 --> 00:03:26,940
The number of objects in general,
69
00:03:26,940 --> 00:03:30,330
no matter if bottles, glasses, tables, or cars,
70
00:03:30,330 --> 00:03:32,553
can only take integer values.
71
00:03:33,780 --> 00:03:36,270
Money can be considered both, but physical money
72
00:03:36,270 --> 00:03:39,690
like banknotes and coins is definitely discrete.
73
00:03:39,690 --> 00:03:44,203
You can't pay $1.243; you can only pay $1.24.
74
00:03:45,270 --> 00:03:47,790
That's because the difference between two sums of money
75
00:03:47,790 --> 00:03:49,443
can't be smaller than 1 cent.
76
00:03:50,970 --> 00:03:53,280
What else is continuous?
77
00:03:53,280 --> 00:03:57,030
Apart from weight, other measurements are also continuous.
78
00:03:57,030 --> 00:04:02,030
Examples are height, area, distance and time.
79
00:04:03,000 --> 00:04:05,910
All of these can vary by infinitely small amounts,
80
00:04:05,910 --> 00:04:10,260
incomprehensible for a human. Time on a clock is discrete
81
00:04:10,260 --> 00:04:12,330
but time in general isn't.
82
00:04:12,330 --> 00:04:16,800
It can be anything like 72.123456 seconds.
83
00:04:16,800 --> 00:04:20,339
We are constrained in measuring weight, height, area,
84
00:04:20,339 --> 00:04:24,540
distance and time by our technology, but in general
85
00:04:24,540 --> 00:04:27,060
they can take on any value.
86
00:04:27,060 --> 00:04:29,640
All right, these were the types of data.
87
00:04:29,640 --> 00:04:31,380
In our next lesson, we will explore
88
00:04:31,380 --> 00:04:32,763
the levels of measurement.