0
1
00:00:06,912 --> 00:00:07,680
Hello everyone
1
2
00:00:07,936 --> 00:00:11,776
Welcome to our bonus lecture on pattern matching in PostgreSQL
2
3
00:00:13,056 --> 00:00:18,176
In this lecture we will start by discussing the different types of pattern matching and their syntax
3
4
00:00:18,432 --> 00:00:21,248
And then we will discuss a few examples of each type
4
5
00:00:22,784 --> 00:00:26,880
At the end, we will also give you some tips on the usage of different methods
5
6
00:00:28,416 --> 00:00:29,440
So let's start
6
7
00:00:30,464 --> 00:00:32,000
We can define pattern matching
7
8
00:00:32,256 --> 00:00:36,352
As a method to identify strings that comply with a given format
8
9
00:00:37,376 --> 00:00:42,240
Suppose we want to identify all the customer names that start with the letter A
9
10
00:00:42,496 --> 00:00:47,104
Or we want to identify the customers who have provided an email ID instead of a name
10
11
00:00:47,872 --> 00:00:48,384
Or
11
12
00:00:48,640 --> 00:00:53,248
To identify the customers who have not provided the domain name in their email ID
12
13
00:00:53,760 --> 00:00:58,368
We can find all such customers using pattern matching in PostgreSQL
13
14
00:01:00,672 --> 00:01:05,536
This lecture will also lay the foundation for other advanced string functions
14
15
00:01:06,048 --> 00:01:07,840
Such as string split
15
16
00:01:08,352 --> 00:01:09,120
Etc.
16
17
00:01:10,400 --> 00:01:15,776
So understand the concept thoroughly and you will be able to use other functions as well
17
18
00:01:18,336 --> 00:01:20,128
In PostgreSQL there are
18
19
00:01:20,384 --> 00:01:22,432
Three methods to perform pattern matching
19
20
00:01:23,200 --> 00:01:24,992
First is the LIKE statement
20
21
00:01:25,760 --> 00:01:27,808
Second is the SIMILAR TO statement
21
22
00:01:28,320 --> 00:01:32,416
And third is using tilde operators with regular expressions
22
23
00:01:32,672 --> 00:01:34,720
Which are also known as regex
23
24
00:01:36,000 --> 00:01:39,584
We have already discussed LIKE statements in our previous videos
24
25
00:01:40,352 --> 00:01:42,656
And we will only be covering a few examples
25
26
00:01:42,912 --> 00:01:44,192
To refresh our memory
26
27
00:01:44,960 --> 00:01:47,008
Now moving on to the SIMILAR TO statement
27
28
00:01:47,520 --> 00:01:49,568
It is part of the SQL standard
28
29
00:01:50,080 --> 00:01:56,224
And the only reason PostgreSQL supports it is to stay compliant with the SQL standard
29
30
00:01:57,504 --> 00:01:58,272
And internally
30
31
00:01:58,528 --> 00:02:03,136
Every SIMILAR TO expression is rewritten in the form of a regular expression
31
32
00:02:03,648 --> 00:02:08,512
And therefore there will always be a regular expression that does the same job faster
32
33
00:02:08,768 --> 00:02:10,304
Than the SIMILAR TO statement
33
34
00:02:11,072 --> 00:02:14,400
So there is no point in discussing SIMILAR TO expressions
34
35
00:02:14,912 --> 00:02:19,264
And you should also avoid them in your queries and use regular expressions instead
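As a quick sketch of that advice, the query below puts a SIMILAR TO pattern next to a regular expression that does the same job. The names are made-up sample values supplied inline, not data from the course.

```sql
-- Hypothetical sample names, supplied inline so no table is needed.
SELECT name,
       name SIMILAR TO '(A|B)%' AS similar_to_match,  -- whole name must start with A or B
       name ~ '^(A|B)'          AS regex_match        -- same test written as a regular expression
FROM (VALUES ('Adam'), ('Brian'), ('Carl')) AS t(name);
```

Both columns should agree on every row, which is why the regular expression can stand in for the SIMILAR TO pattern.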
35
36
00:02:20,544 --> 00:02:26,688
Regular expressions with the tilde operator provide us a very powerful and flexible tool to perform pattern
36
37
00:02:26,944 --> 00:02:27,456
Matching
37
38
00:02:28,992 --> 00:02:34,368
And one thing to note here is that the wildcards of LIKE statements and the wildcards
38
39
00:02:34,624 --> 00:02:36,928
Of regular expressions are different
39
40
00:02:37,696 --> 00:02:39,744
And you should try to learn them separately
40
41
00:02:40,512 --> 00:02:44,608
In this video, we will mainly focus on regular expressions
41
42
00:02:45,376 --> 00:02:49,216
Another major difference between LIKE statements and regular expressions is
42
43
00:02:49,728 --> 00:02:53,312
LIKE statements perform pattern matching on the whole string
43
44
00:02:53,824 --> 00:02:58,688
Whereas regular expressions can also match just a part of the string
44
45
00:02:59,200 --> 00:03:03,040
So suppose I want to find customer names with just
45
46
00:03:03,296 --> 00:03:04,320
The character A
46
47
00:03:04,832 --> 00:03:08,160
The LIKE statement will find the customer names where
47
48
00:03:08,416 --> 00:03:11,232
There is only one character, which is A
48
49
00:03:11,488 --> 00:03:14,560
Whereas the regular expression will find all the customers
49
50
00:03:14,816 --> 00:03:17,120
Where the name contains the character A
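A minimal sketch of this difference, using made-up sample names supplied inline rather than the course's customer table:

```sql
-- Hypothetical sample names, supplied inline so no table is needed.
SELECT name,
       name LIKE 'A' AS like_match,   -- true only when the whole name is exactly 'A'
       name ~ 'A'    AS regex_match   -- true when an uppercase 'A' appears anywhere in the name
FROM (VALUES ('A'), ('Adam'), ('MARIA'), ('Brian')) AS t(name);
```

Only the first row should pass the LIKE test, while the regular expression should also accept 'Adam' and 'MARIA'. 'Brian' fails both because the tilde operator is case sensitive. To make the regular expression behave like LIKE here, it can be anchored as '^A$'.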
50
51
00:03:17,888 --> 00:03:20,192
So let's start with the LIKE operator
51
52
00:03:20,960 --> 00:03:23,264
There are two wildcards in the LIKE operator
52
53
00:03:24,544 --> 00:03:26,336
First is the percentage symbol
53
54
00:03:26,592 --> 00:03:28,640
And second is the underscore symbol
54
55
00:03:30,176 --> 00:03:34,016
The percentage symbol allows you to match any string of any length, including an empty one
55
56
00:03:34,528 --> 00:03:37,344
Whereas the underscore symbol allows you to match
56
57
00:03:37,600 --> 00:03:38,880
Exactly one character
57
58
00:03:39,904 --> 00:03:41,440
Let's look at some examples
58
59
00:03:43,488 --> 00:03:46,048
We have already discussed this in our previous videos
59
60
00:03:46,304 --> 00:03:51,680
So in this video we will only discuss it and not execute it in pgAdmin
60
61
00:03:52,448 --> 00:03:56,032
So suppose we want to find all the customers
61
62
00:03:56,800 --> 00:03:58,848
Whose first name starts with
62
63
00:03:59,104 --> 00:04:00,640
The characters J and o
63
64
00:04:01,152 --> 00:04:02,944
We will write SELECT star
64
65
00:04:03,712 --> 00:04:04,736
From the customer table
65
66
00:04:05,248 --> 00:04:06,272
Where first name
66
67
00:04:07,040 --> 00:04:07,552
Is like
67
68
00:04:08,576 --> 00:04:14,720
J o and then the percentage symbol. We are using the percentage symbol because we don't know the length
68
69
00:04:14,976 --> 00:04:21,119
Of the name and we want to identify all the customer names where the starting characters are J and o
69
70
00:04:21,375 --> 00:04:24,191
Now suppose you want to find all the customers
70
71
00:04:24,703 --> 00:04:27,263
Whose first name contains the letters O and D
71
72
00:04:28,799 --> 00:04:33,407
So we will write SELECT star from the customer table where first name
72
73
00:04:33,663 --> 00:04:34,175
Like
73
74
00:04:34,431 --> 00:04:35,455
Percentage symbol
74
75
00:04:35,967 --> 00:04:37,247
Then OD
75
76
00:04:37,503 --> 00:04:41,087
Then the percentage symbol. This means that the first name should contain
76
77
00:04:41,343 --> 00:04:43,391
O and D adjacent to each other
77
78
00:04:44,415 --> 00:04:47,487
Now in the next example, suppose I want a customer name
78
79
00:04:47,743 --> 00:04:50,303
Which will start with J A S
79
80
00:04:50,815 --> 00:04:53,631
And then there should be exactly one character
80
81
00:04:54,143 --> 00:04:55,679
That can be anything and
81
82
00:04:55,935 --> 00:05:00,543
Then there should be the character N. In that case I will use an underscore
82
83
00:05:01,055 --> 00:05:04,639
The underscore will match exactly one character
83
84
00:05:05,663 --> 00:05:11,807
We can also use NOT with the LIKE statement, and in the next example you can see that we have
84
85
00:05:12,063 --> 00:05:16,159
Used NOT LIKE J percent to identify all the customers
85
86
00:05:16,415 --> 00:05:18,463
Whose names don't start with J
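The LIKE examples discussed so far can be sketched in a single query. The column name first_name and the sample rows are assumptions for illustration. The rows are supplied inline, so the query does not need the real customer table.

```sql
-- Hypothetical stand-in for the customer table, built inline with VALUES.
WITH customer(first_name) AS (
    VALUES ('John'), ('Jody'), ('Rodney'), ('Jason'), ('Jasmine'), ('Alice')
)
SELECT first_name,
       first_name LIKE 'Jo%'    AS starts_with_jo,       -- J, o, then anything
       first_name LIKE '%od%'   AS contains_od,          -- o and d next to each other, anywhere
       first_name LIKE 'Jas_n'  AS jas_one_char_n,       -- J a s, exactly one character, then n
       first_name NOT LIKE 'J%' AS not_starting_with_j   -- anything that does not begin with J
FROM customer;
```

Note that LIKE is case sensitive in PostgreSQL, so the pattern '%od%' looks for lowercase o and d. 'Jasmine' fails the 'Jas_n' test because the underscore stands for exactly one character.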
86
87
00:05:19,231 --> 00:05:20,255
Now suppose
87
88
00:05:20,511 --> 00:05:22,815
I want to identify all the customers
88
89
00:05:23,071 --> 00:05:24,607
Whose names start with
89
90
00:05:25,119 --> 00:05:26,399
A B
90
91
00:05:26,655 --> 00:05:27,935
C or E
91
92
00:05:28,959 --> 00:05:34,079
For this I have to write four different LIKE statements with OR keywords in between
92
93
00:05:35,359 --> 00:05:38,431
Now consider a more complicated case
93
94
00:05:38,687 --> 00:05:39,199
Where
94
95
00:05:39,455 --> 00:05:41,759
I want my first name to start with
95
96
00:05:42,783 --> 00:05:43,807
A, B, C,
96
97
00:05:44,063 --> 00:05:45,343
D or E
97
98
00:05:46,111 --> 00:05:48,415
And my last name should start with
98
99
00:05:48,927 --> 00:05:50,207
F or G
99
100
00:05:51,487 --> 00:05:52,255
In this case
100
101
00:05:52,767 --> 00:05:54,559
I have to write 5 times 2
101
102
00:05:55,071 --> 00:06:00,191
So a total of 10 LIKE statements with a combination of OR and AND keywords
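To make the count concrete, here is a sketch that assumes the first and last name sit in one column separated by a space. The column name full_name and the sample rows are hypothetical. The regular expression in the last column is one possible compact equivalent, shown only as a preview.

```sql
-- Hypothetical sample data: first and last name in one column, separated by a space.
WITH customer(full_name) AS (
    VALUES ('Alice Ford'), ('Brian Gray'), ('Emma Hill'), ('Frank Green')
)
SELECT full_name,
       (   full_name LIKE 'A% F%' OR full_name LIKE 'A% G%'
        OR full_name LIKE 'B% F%' OR full_name LIKE 'B% G%'
        OR full_name LIKE 'C% F%' OR full_name LIKE 'C% G%'
        OR full_name LIKE 'D% F%' OR full_name LIKE 'D% G%'
        OR full_name LIKE 'E% F%' OR full_name LIKE 'E% G%'
       ) AS like_match,                                  -- ten LIKE patterns, one per combination
       full_name ~ '^[A-E].* [FG]' AS regex_match        -- one regular expression for the same test
FROM customer;
```

With this sample data, both columns should be true for 'Alice Ford' and 'Brian Gray' and false for the other two rows.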
102
103
00:06:00,703 --> 00:06:06,079
Now let's suppose I also want to put a constraint on the length of my first name and last name
103
104
00:06:07,359 --> 00:06:08,383
In this case
104
105
00:06:08,639 --> 00:06:14,783
I first have to identify the separator between the first name and the last name, that is, the position of the
105
106
00:06:15,039 --> 00:06:15,551
Space
106
107
00:06:15,807 --> 00:06:16,575
In my name
107
108
00:06:17,599 --> 00:06:19,135
And then I have to write
108
109
00:06:19,391 --> 00:06:20,159
Two more
109
110
00:06:20,415 --> 00:06:23,231
Conditions on the length of first name and last name
110
111
00:06:25,791 --> 00:06:28,607
So you can see that with the increase in the number of conditions
111
112
00:06:28,863 --> 00:06:30,143
And their complexity
112
113
00:06:30,399 --> 00:06:32,959
The number of LIKE statements in SQL
113
114
00:06:33,471 --> 00:06:35,007
Multiplies very quickly
114
115
00:06:36,799 --> 00:06:41,151
In general, LIKE statements provide a quick and easy way to solve
115
116
00:06:41,407 --> 00:06:43,199
Simple pattern matching problems
116
117
00:06:43,455 --> 00:06:47,295
But for complex matching problems we have to use regex
117
118
00:06:50,367 --> 00:06:55,743
You must be thinking that we will hardly encounter any such situation in our professional career
118
119
00:06:56,255 --> 00:06:57,791
So let me give you an example
119
120
00:06:58,047 --> 00:07:02,143
Suppose you want to filter out invalid email IDs from your data
120
121
00:07:03,167 --> 00:07:05,983
A valid email ID should contain
121
122
00:07:06,751 --> 00:07:09,055
A string of alphanumeric characters
122
123
00:07:09,567 --> 00:07:10,079
With
123
124
00:07:10,335 --> 00:07:11,103
Either dot
124
125
00:07:11,359 --> 00:07:11,871
Or
125
126
00:07:12,383 --> 00:07:13,407
Underscore symbol
126
127
00:07:13,663 --> 00:07:14,175
Then
127
128
00:07:14,431 --> 00:07:16,479
There should be an @ sign
128
129
00:07:17,503 --> 00:07:20,831
And then again there should be an alphanumeric string
129
130
00:07:21,343 --> 00:07:23,647
For example Google or Yahoo
130
131
00:07:24,159 --> 00:07:26,207
Then there should be a dot
131
132
00:07:26,975 --> 00:07:31,327
And then after that dot, there should be 2 to 8 letters
132
133
00:07:31,583 --> 00:07:35,167
Such as .com or .in etc
133
134
00:07:36,447 --> 00:07:40,287
A valid email ID should contain all of these parts
134
135
00:07:41,055 --> 00:07:42,847
And we will learn how to write this using regular expressions
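As a preview, here is one possible sketch of such a pattern, built directly from the parts listed above. It is an illustration under those stated rules, not necessarily the expression developed later in the course, and the sample addresses are made up.

```sql
-- Hypothetical sample addresses, supplied inline so no table is needed.
SELECT email,
       email ~ '^[A-Za-z0-9._]+@[A-Za-z0-9]+\.[A-Za-z]{2,8}$' AS looks_valid
       -- ^[A-Za-z0-9._]+  letters, digits, dots or underscores before the @ sign
       -- @[A-Za-z0-9]+    the @ sign, then an alphanumeric string such as google
       -- \.[A-Za-z]{2,8}$ a literal dot, then 2 to 8 letters up to the end of the string
FROM (VALUES ('john.doe@google.com'),
             ('jane_doe@yahoo.in'),
             ('no-at-sign.com'),
             ('bob@google'),
             ('amy@site.c')) AS t(email);
```

With these rules, the first two addresses should be accepted and the last three rejected: one has no @ sign, one has no dot after the domain, and one has only a single letter after the dot. Real-world email validation is usually looser than this, so treat the pattern as a teaching example.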