0
1
00:00:00,768 --> 00:00:04,864
Now let's discuss the different wildcards used in regular expressions
1
2
00:00:05,632 --> 00:00:11,520
There are many wildcards; we have selected only a few important ones
2
3
00:00:11,776 --> 00:00:13,824
The first one is the or operator
3
4
00:00:14,080 --> 00:00:17,920
The pipe symbol (|) denotes the OR operator
4
5
00:00:18,688 --> 00:00:20,992
The second is the asterisk sign
5
6
00:00:21,504 --> 00:00:26,624
The asterisk denotes repetition of the previous item zero or more times
6
7
00:00:27,392 --> 00:00:29,184
Then there is a plus symbol
7
8
00:00:29,440 --> 00:00:35,584
Plus denotes repetition of the previous item one or more times. Don't worry, we will discuss this
8
9
00:00:35,840 --> 00:00:38,144
In more detail in our example
9
10
00:00:39,168 --> 00:00:40,704
Next is the question mark (?)
10
11
00:00:41,216 --> 00:00:45,568
The question mark denotes repetition of the previous item zero or one time
11
12
00:00:46,336 --> 00:00:46,848
Then
12
13
00:00:47,104 --> 00:00:53,248
There are curly brackets, and inside the curly brackets we can mention any number m, and this denotes
13
14
00:00:53,504 --> 00:00:57,600
repetition of the previous item exactly m times
14
15
00:00:58,880 --> 00:01:02,208
So for example if you want a character a
15
16
00:01:02,720 --> 00:01:04,512
To come exactly three times
16
17
00:01:05,280 --> 00:01:06,048
You will write a
17
18
00:01:06,304 --> 00:01:08,608
followed by 3 in curly brackets, that is a{3}
18
19
00:01:10,912 --> 00:01:17,056
And if you want a to come m times or more, then inside the curly brackets you will write
19
20
00:01:17,312 --> 00:01:17,824
M
20
21
00:01:18,080 --> 00:01:18,592
Comma
21
22
00:01:19,104 --> 00:01:23,712
This denotes repetition of the previous item m or more times
22
23
00:01:24,480 --> 00:01:29,600
Similarly, if you want to apply both a lower limit and an upper limit on the repetition you will write
23
24
00:01:30,112 --> 00:01:32,416
Curly bracket m, n
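
The quantifiers described so far can be tried out quickly outside the database. The following is a minimal sketch that uses Python's `re` module as a stand-in for the PostgreSQL regex engine, which is an assumption of this example and not something used in the lecture. The helper name `whole_match` and the sample strings are hypothetical.

```python
import re


def whole_match(pattern: str, text: str) -> bool:
    """Return True when the entire text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# Each entry: (pattern, text, what the wildcard is expected to do)
checks = [
    (r"a|b", "b", "pipe: either a or b"),
    (r"ab*", "a", "asterisk: b repeated zero times"),
    (r"ab*", "abbb", "asterisk: b repeated many times"),
    (r"ab+", "ab", "plus: b repeated at least once"),
    (r"ab?", "ab", "question mark: b zero or one time"),
    (r"a{3}", "aaa", "{m}: exactly three a's"),
    (r"a{2,}", "aaaaa", "{m,}: two or more a's"),
    (r"a{2,3}", "aa", "{m,n}: between two and three a's"),
]

for pattern, text, note in checks:
    print(f"{pattern!r} on {text!r}: {whole_match(pattern, text)}  ({note})")
```

Every line of this listing should report `True`; changing a sample string (for example `"aaaa"` for `a{3}`) is an easy way to see where each quantifier stops matching.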
24
25
00:01:33,184 --> 00:01:38,560
Now, as I told you earlier, the tilde operator will match even if only a part of the string fits the pattern
25
26
00:01:39,072 --> 00:01:39,584
Now
26
27
00:01:39,840 --> 00:01:44,448
To denote the start of a string you have to write the caret (^) or hat symbol
27
28
00:01:45,216 --> 00:01:48,800
And to denote the end of the string you have to write the dollar ($) symbol
28
29
00:01:50,336 --> 00:01:54,176
In regular expressions you can also use character sets
29
30
00:01:54,432 --> 00:01:59,296
So you can mention all the possible characters in square brackets
30
31
00:02:00,320 --> 00:02:05,184
And SQL will treat it as or operations, so for example if I write
31
32
00:02:05,696 --> 00:02:07,488
abc in square brackets, [abc]
32
33
00:02:07,744 --> 00:02:09,024
That means that
33
34
00:02:09,280 --> 00:02:13,632
That place can contain a or b or c
34
35
00:02:14,400 --> 00:02:20,544
Similarly, there are shorthands as well, so for example if you want to mention all the characters
35
36
00:02:20,800 --> 00:02:24,384
From A to Z you can just write square bracket
36
37
00:02:25,664 --> 00:02:27,712
A then dash then Z
37
38
00:02:27,968 --> 00:02:32,832
Similarly if you want to mention numeric characters you can write
38
39
00:02:33,088 --> 00:02:35,392
0-9 in square brackets, [0-9]
39
40
00:02:36,160 --> 00:02:40,768
Now the operator to match using a regular expression is the tilde sign (~)
40
41
00:02:41,536 --> 00:02:43,584
But the tilde sign is case sensitive
41
42
00:02:44,352 --> 00:02:47,680
You can also use tilde followed by an asterisk sign (~*)
42
43
00:02:48,192 --> 00:02:50,240
To make it case insensitive
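
Anchors, character sets and case sensitivity can be illustrated in the same way. This is a rough sketch, again assuming Python's `re` module as a stand-in: `re.search` plays the role of the `~` operator (it succeeds when any part of the string matches) and the `re.IGNORECASE` flag plays the role of `~*`. The helper name `found` and the sample strings are hypothetical.

```python
import re


def found(pattern: str, text: str, ignore_case: bool = False) -> bool:
    """Return True when the pattern matches anywhere in the text."""
    flags = re.IGNORECASE if ignore_case else 0
    return re.search(pattern, text, flags) is not None


# Without anchors a match on just part of the string is enough.
print(found(r"bc", "abcd"))

# ^ pins the match to the start, $ pins it to the end.
print(found(r"^bc", "abcd"))
print(found(r"cd$", "abcd"))

# [abc] means "a or b or c" at that position.
print(found(r"^[abc]", "cat"))
print(found(r"^[abc]", "dog"))

# [a-z] and [0-9] are range shorthands.
print(found(r"^[0-9]+$", "2024"))

# Case-sensitive by default, like ~; case-insensitive with the flag, like ~*.
print(found(r"^[a-z]+$", "Hello"))
print(found(r"^[a-z]+$", "Hello", ignore_case=True))
```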
43
44
00:02:51,776 --> 00:02:54,336
Now let us look at some examples
44
45
00:02:55,104 --> 00:03:00,736
Suppose we want to identify all the customers whose customer name starts with a
45
46
00:03:03,040 --> 00:03:04,064
In this case
46
47
00:03:04,832 --> 00:03:10,976
We will write tilde and then the * sign, which means we are matching using the case-insensitive
47
48
00:03:11,232 --> 00:03:12,512
Operator
48
49
00:03:14,560 --> 00:03:20,704
Then, like any other string, we will start writing the rules inside single quotes, then we have to
49
50
00:03:20,960 --> 00:03:25,056
mention the start of the string, that's why we will write the ^ symbol
50
51
00:03:25,824 --> 00:03:28,128
And my first character should be A
51
52
00:03:28,384 --> 00:03:32,480
That's why I am writing ^ symbol then character a
52
53
00:03:33,248 --> 00:03:35,808
And this A can repeat itself
53
54
00:03:36,064 --> 00:03:38,880
Either one time or more times
54
55
00:03:39,136 --> 00:03:41,440
That's why I have to write plus symbol
55
56
00:03:42,208 --> 00:03:46,048
Now after my first character there can be anything
56
57
00:03:46,304 --> 00:03:52,448
So I am mentioning the character set; my character set contains all the letters of the English alphabet, that is
57
58
00:03:52,704 --> 00:03:53,728
A to Z
58
59
00:03:53,984 --> 00:03:59,360
And it also contains a space, since there is a space between first name and last name
59
60
00:03:59,616 --> 00:04:02,432
That's why I have to write space also
60
61
00:04:04,224 --> 00:04:09,600
So what this means is my second character can be anything from A to Z
61
62
00:04:10,624 --> 00:04:12,928
Or it can also be space
62
63
00:04:13,696 --> 00:04:18,815
Now I want to apply the same rule to the other positions as well
63
64
00:04:19,071 --> 00:04:21,375
That's why I have to use plus symbol
64
65
00:04:22,911 --> 00:04:25,727
And then I have to mention the end of the string
65
66
00:04:25,983 --> 00:04:28,287
That's why I am writing $ symbol
66
67
00:04:29,311 --> 00:04:33,919
Do note that the symbol for a space is backslash s (\s)
67
68
00:04:34,175 --> 00:04:38,271
Similarly, the symbol for a dot is backslash dot (\.)
68
69
00:04:38,527 --> 00:04:40,063
And symbol for
69
70
00:04:40,575 --> 00:04:46,719
a dash is backslash dash (\-). So for all the special characters you have to write a backslash before
70
71
00:04:46,975 --> 00:04:47,487
Them
71
72
00:04:48,511 --> 00:04:51,327
So let's go to pgAdmin to write this
72
73
00:04:52,607 --> 00:04:56,959
So you will write select star from customer
73
74
00:05:01,823 --> 00:05:04,127
Where
74
75
00:05:04,383 --> 00:05:06,943
Customer name
75
76
00:05:09,247 --> 00:05:14,367
Then the tilde operator with an asterisk sign to make it case insensitive
76
77
00:05:14,879 --> 00:05:18,463
And then inside single quotes I will write the hat symbol
77
78
00:05:19,231 --> 00:05:19,743
A
78
79
00:05:20,511 --> 00:05:24,095
Just to denote that the string starts with a
79
80
00:05:24,607 --> 00:05:28,447
Then this A can repeat itself n number of times
80
81
00:05:28,959 --> 00:05:31,263
So I am mentioning + sign
81
82
00:05:31,519 --> 00:05:36,127
Then I am defining the character set, which contains all the characters from a to z
82
83
00:05:36,383 --> 00:05:37,407
And a space
83
84
00:05:37,663 --> 00:05:40,223
Now the characters of this character set
84
85
00:05:40,991 --> 00:05:43,551
Can repeat any number of
85
86
00:05:43,807 --> 00:05:47,135
times, that's why I am adding a plus sign
86
87
00:05:47,647 --> 00:05:50,207
And then I have to mention the end of the string
87
88
00:05:50,463 --> 00:05:53,023
That's why I am adding the $ sign
88
89
00:05:55,071 --> 00:05:57,119
Let's run this query
89
90
00:05:57,887 --> 00:06:01,471
So right now we are not getting our desired result
90
91
00:06:02,751 --> 00:06:06,335
So we have to change this forward slash (/) to a backslash (\)
91
92
00:06:07,359 --> 00:06:10,431
And then again rerun the code
92
93
00:06:13,503 --> 00:06:17,599
Now you can see that all the customer names start with A
93
94
00:06:18,879 --> 00:06:25,023
And after that there can be any number of characters from a to z, or spaces
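
Put together, the rule built in this first example is `^a+[a-z\s]+$`. In pgAdmin the query would look roughly like `SELECT * FROM customer WHERE customer_name ~* '^a+[a-z\s]+$';`, where the column name `customer_name` is an assumption based on how it is spoken in the lecture. The sketch below checks the same pattern with Python's `re` module as a stand-in for PostgreSQL; the sample names and the function name `starts_with_a` are made up for illustration.

```python
import re

# ^      start of the string
# a+     the letter a, one or more times
# [a-z\s]+  letters or whitespace, one or more times
# $      end of the string
PATTERN_A = re.compile(r"^a+[a-z\s]+$", re.IGNORECASE)  # IGNORECASE mimics ~*


def starts_with_a(name: str) -> bool:
    """Return True when the name fits the first example's rule."""
    return PATTERN_A.search(name) is not None


# Hypothetical sample names, not taken from the lecture's customer table.
for name in ["Aaron Smith", "alex jones", "Brian Adams", "Anne-Marie Lee"]:
    print(name, "->", starts_with_a(name))
```

Note that "Anne-Marie Lee" is rejected even though it starts with A, because the hyphen is not part of the character set.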
94
95
00:06:26,815 --> 00:06:27,839
Now suppose
95
96
00:06:28,863 --> 00:06:35,007
In the second example, we want all the customers whose first name starts with a, b, c
96
97
00:06:35,263 --> 00:06:36,287
Or D
97
98
00:06:38,335 --> 00:06:41,663
So in this example we will use the or operator
98
99
00:06:42,687 --> 00:06:48,063
We will put the pipe operator between a, b, c and d and put this
99
100
00:06:48,319 --> 00:06:49,855
In parentheses
100
101
00:06:51,135 --> 00:06:57,279
We can also use the character set [abcd] instead of this OR operator and parentheses
101
102
00:06:59,071 --> 00:07:01,631
So let's write this query in pgAdmin
102
103
00:07:02,655 --> 00:07:05,727
We will write select star from customer
103
104
00:07:11,615 --> 00:07:13,663
Where
104
105
00:07:13,919 --> 00:07:16,223
Customer name
105
106
00:07:17,759 --> 00:07:19,295
Then the tilde operator
106
107
00:07:19,551 --> 00:07:22,623
With asterisk sign to make it case insensitive
107
108
00:07:23,391 --> 00:07:26,975
And then in single quotes I will write
108
109
00:07:31,327 --> 00:07:37,471
The start of the string, then a, b, c, d in parentheses with the
109
110
00:07:37,727 --> 00:07:38,495
Or operator
110
111
00:07:40,543 --> 00:07:45,407
Then this can repeat any number of times, that's why I will put the
111
112
00:07:46,431 --> 00:07:47,711
Plus symbol
112
113
00:07:47,967 --> 00:07:52,063
Then the next character set can take any value from a to z
113
114
00:07:52,575 --> 00:07:56,927
And a space as well; remember to put backslash s for the space
114
115
00:07:57,183 --> 00:07:58,719
And then the + sign
115
116
00:07:58,975 --> 00:08:00,511
For the repetition
116
117
00:08:01,023 --> 00:08:02,559
Then I will end the string
117
118
00:08:03,071 --> 00:08:04,095
And
118
119
00:08:04,607 --> 00:08:05,375
Run the code
119
120
00:08:10,751 --> 00:08:16,895
You can see we have all the customers whose first name starts with either A or B or
120
121
00:08:17,151 --> 00:08:18,175
C or d
121
122
00:08:18,687 --> 00:08:24,575
Remember, if you want to write the same query using LIKE, you have to use four LIKE conditions
122
123
00:08:26,367 --> 00:08:29,439
With regex we can do this in a single query
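
The second example's rule is `^(a|b|c|d)+[a-z\s]+$`, and the character-set form `^[abcd]+[a-z\s]+$` mentioned earlier should accept the same names. A small sketch to compare the two, again assuming Python's `re` module as a stand-in for the `~*` operator; the sample names and the helper `fits` are hypothetical.

```python
import re

# OR operator with parentheses, as written in the lecture's query
WITH_PIPE = re.compile(r"^(a|b|c|d)+[a-z\s]+$", re.IGNORECASE)
# Equivalent form using a character set instead of the OR operator
WITH_SET = re.compile(r"^[abcd]+[a-z\s]+$", re.IGNORECASE)


def fits(pattern: re.Pattern, name: str) -> bool:
    """Return True when the name matches the given compiled pattern."""
    return pattern.search(name) is not None


# Hypothetical sample names for illustration only.
names = ["Aaron Smith", "Brian Adams", "Cathy Lee", "David Ray", "Erik Stone"]

for name in names:
    print(name, "->", fits(WITH_PIPE, name), fits(WITH_SET, name))
```

Both columns should agree for every name, with only "Erik Stone" rejected.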
123
124
00:08:30,463 --> 00:08:30,975
Now
124
125
00:08:31,231 --> 00:08:32,767
in the third example
125
126
00:08:33,535 --> 00:08:39,679
I want the first name and the last name of my customer to be exactly 4 characters each, and the first name
126
127
00:08:39,935 --> 00:08:44,031
should start with either a, b, c or d
127
128
00:08:44,799 --> 00:08:45,567
To do this
128
129
00:08:45,823 --> 00:08:51,967
I will write the start of the string, then I will mention a, b, c, d using the OR operator
129
130
00:08:53,247 --> 00:08:58,623
Then here I will not write the plus symbol because I want exactly one character
130
131
00:08:59,391 --> 00:09:01,951
After that I will write a to z
131
132
00:09:02,719 --> 00:09:04,255
This is the character set
132
133
00:09:05,023 --> 00:09:08,607
And I want this character set to repeat exactly three times
133
134
00:09:09,119 --> 00:09:12,447
So my first character should be a, b, c or d
134
135
00:09:12,703 --> 00:09:16,287
And the next three characters can be anything from a to z
135
136
00:09:17,567 --> 00:09:21,151
And after this first name I want a space in between
136
137
00:09:21,919 --> 00:09:24,991
That's why I will mention backslash s
137
138
00:09:26,015 --> 00:09:32,159
And after that I want my last name to contain exactly four characters, that's why I will
138
139
00:09:32,415 --> 00:09:38,559
write a dash z; I am mentioning the character set and I want this character set to repeat
139
140
00:09:38,815 --> 00:09:40,095
itself 4 times
140
141
00:09:41,631 --> 00:09:42,655
So let's
141
142
00:09:42,911 --> 00:09:45,471
Write this example in pgAdmin
142
143
00:10:07,231 --> 00:10:13,375
To mention the start of the string I will write the ^ symbol; after that I want my first name to start with
143
144
00:10:13,631 --> 00:10:19,775
a, b, c or d, that's why I will write all these characters with the
144
145
00:10:20,031 --> 00:10:21,567
OR symbol, that is the pipe symbol
145
146
00:10:21,823 --> 00:10:27,967
After that I want any character from a to z and I want it to repeat 3 times
146
147
00:10:28,223 --> 00:10:32,575
Therefore I will put 3 in the curly brackets; after that I want a space
147
148
00:10:33,855 --> 00:10:38,207
So I write backslash s; after that I want a to z
148
149
00:10:39,231 --> 00:10:43,839
4 times, to ensure that my last name contains exactly 4 characters
149
150
00:10:44,351 --> 00:10:46,911
So I will mention four in the
150
151
00:10:47,167 --> 00:10:50,239
Curly brackets after that I want my string
151
152
00:10:51,007 --> 00:10:54,335
to end there, so that's why I will put the $ symbol
152
153
00:10:57,919 --> 00:10:59,967
Now let's run this
153
154
00:11:05,087 --> 00:11:05,855
You can see
154
155
00:11:06,111 --> 00:11:07,647
We are getting all the customers
155
156
00:11:08,415 --> 00:11:11,743
whose first name and last name contain exactly four characters each
156
157
00:11:12,767 --> 00:11:13,279
And
157
158
00:11:13,535 --> 00:11:14,303
Customers
158
159
00:11:14,559 --> 00:11:15,071
Where
159
160
00:11:15,327 --> 00:11:16,095
the first names are
160
161
00:11:16,607 --> 00:11:19,679
Starting with A B C or d
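
The third example combines an exact first character with fixed repetition counts: `^(a|b|c|d)[a-z]{3}\s[a-z]{4}$`. The following sketch tests that pattern with Python's `re` module, used here only as an assumed stand-in for the PostgreSQL `~*` operator. The sample names and the function name `four_by_four` are hypothetical.

```python
import re

# (a|b|c|d)  exactly one leading character: a, b, c or d (no + this time)
# [a-z]{3}   exactly three more letters, so the first name has 4 characters
# \s         one whitespace character between first and last name
# [a-z]{4}   exactly four letters for the last name
FOUR_BY_FOUR = re.compile(r"^(a|b|c|d)[a-z]{3}\s[a-z]{4}$", re.IGNORECASE)


def four_by_four(name: str) -> bool:
    """Return True for a 4-letter first name starting with a-d and a 4-letter last name."""
    return FOUR_BY_FOUR.search(name) is not None


# Hypothetical sample names for illustration only.
for name in ["Dave Rice", "Beth Cole", "Carl Jones", "Erin Cole", "Aaron Smith"]:
    print(name, "->", four_by_four(name))
```

"Carl Jones" fails on the five-letter last name, "Erin Cole" on the first letter, and "Aaron Smith" on both lengths.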
161
162
00:11:22,751 --> 00:11:23,775
Now suppose
162
163
00:11:24,031 --> 00:11:27,103
We have some email IDs in our customer name
163
164
00:11:27,871 --> 00:11:34,015
And for this example we are using another table, the users table; we have already loaded
164
165
00:11:34,271 --> 00:11:35,039
data into this table
165
166
00:11:35,295 --> 00:11:41,183
For your practice we have provided a Notepad file containing the queries to create this table
166
167
00:11:42,207 --> 00:11:44,511
In this table we want to identify
167
168
00:11:45,535 --> 00:11:47,327
Valid email ids
168
169
00:11:47,583 --> 00:11:49,119
That are present in my name column
169
170
00:11:50,655 --> 00:11:52,959
So first let's go to
170
171
00:11:53,471 --> 00:11:55,263
pgAdmin and view the table
171
172
00:11:55,775 --> 00:12:00,639
Select star from users
172
173
00:12:08,575 --> 00:12:11,647
You can see that we have only one column that is
173
174
00:12:11,903 --> 00:12:12,927
Column for name
174
175
00:12:13,183 --> 00:12:16,511
And in that column we have a few email IDs as well
175
176
00:12:18,047 --> 00:12:23,423
So now we want to identify these email IDs, but only the valid email IDs
176
177
00:12:23,935 --> 00:12:24,959
From this data
177
178
00:12:30,079 --> 00:12:32,383
So a valid email ID
178
179
00:12:32,639 --> 00:12:37,247
Should begin with an alphanumeric string, which can also contain a dot
179
180
00:12:37,503 --> 00:12:38,271
Or a Dash
180
181
00:12:39,295 --> 00:12:42,111
Then there should be an @ sign
181
182
00:12:43,903 --> 00:12:45,439
And then after that
182
183
00:12:45,695 --> 00:12:48,255
There can be an alphanumeric string
183
184
00:12:48,511 --> 00:12:49,791
For the domain name
184
185
00:12:50,047 --> 00:12:52,095
It can contain a dash as well
185
186
00:12:53,375 --> 00:12:56,191
And after that there should be a dot
186
187
00:12:56,959 --> 00:12:59,007
For example in gmail.com
187
188
00:12:59,519 --> 00:13:02,079
There is a .com coming at the end
188
189
00:13:02,591 --> 00:13:03,615
That's why
189
190
00:13:03,871 --> 00:13:10,015
We are putting a condition on the dot and after that there can be 2 to 5 characters
190
191
00:13:10,271 --> 00:13:12,831
That's why we are putting a to z
191
192
00:13:13,087 --> 00:13:14,879
It should be between two
192
193
00:13:15,135 --> 00:13:15,903
Or 5
193
194
00:13:16,671 --> 00:13:21,279
So we are not accepting a single character and we are not accepting a length of
194
195
00:13:21,791 --> 00:13:22,815
More than 5
195
196
00:13:23,583 --> 00:13:25,119
So let's
196
197
00:13:25,375 --> 00:13:27,423
Write this in pgAdmin
197
198
00:13:33,567 --> 00:13:36,383
Select star from users
198
199
00:13:39,199 --> 00:13:40,479
Where
199
200
00:13:41,503 --> 00:13:45,087
To make it case insensitive I will write tilde star
200
201
00:13:47,135 --> 00:13:48,927
Then inside single quotes
201
202
00:13:50,207 --> 00:13:52,255
To mention the first part of our
202
203
00:13:52,511 --> 00:13:53,791
Email ID
203
204
00:13:54,047 --> 00:13:58,911
We'll mention all the possible characters, that is a to z, 0 to 9
204
205
00:13:59,423 --> 00:14:00,447
Dot
205
206
00:14:00,959 --> 00:14:01,727
or Dash
206
207
00:14:03,007 --> 00:14:07,359
Or underscore, and we want this to repeat any number of times
207
208
00:14:10,943 --> 00:14:12,223
After that we want
208
209
00:14:12,479 --> 00:14:17,087
the @ symbol, and after that we want another character set
209
210
00:14:18,111 --> 00:14:20,415
Of A to Z or 0 to 9
210
211
00:14:20,671 --> 00:14:26,815
Or dash, since a domain name contains only alphanumeric characters and can also contain a dash
211
212
00:14:28,095 --> 00:14:30,655
And we want this to repeat any number of times
212
213
00:14:31,679 --> 00:14:32,959
That's why we are putting
213
214
00:14:33,215 --> 00:14:33,727
+
214
215
00:14:33,983 --> 00:14:37,567
After that we want a dot to come after the domain name
215
216
00:14:38,079 --> 00:14:43,711
That's why we write backslash dot, and after that another character set of a to z
216
217
00:14:43,967 --> 00:14:47,551
We want it to repeat between 2 and 5 times
217
218
00:14:48,575 --> 00:14:53,183
So this query will retrieve only the valid email IDs from the table
218
219
00:14:53,439 --> 00:14:55,231
Let's run this query
219
220
00:14:57,535 --> 00:14:59,583
You can see we are only getting
220
221
00:14:59,839 --> 00:15:01,375
Two valid email ids
221
222
00:15:01,631 --> 00:15:05,471
There was an email ID that started with @
222
223
00:15:06,239 --> 00:15:09,567
And we have filtered out that email ID using this query
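
The email rule described above can be written as `[a-z0-9._-]+@[a-z0-9-]+\.[a-z]{2,5}`. The sketch below wraps it in `^` and `$`, which is an assumption of this example since the anchors are not spoken aloud for this query, and tests it with Python's `re` module as a stand-in for the `~*` operator. The sample values and the function name `looks_like_email` are hypothetical, and the rule is a teaching simplification, not a complete email validator.

```python
import re

# [a-z0-9._-]+   local part: letters, digits, dot, underscore or dash
# @              the literal @ sign
# [a-z0-9-]+     domain name: letters, digits or dash
# \.             a literal dot
# [a-z]{2,5}     ending of 2 to 5 letters, such as com
EMAIL = re.compile(r"^[a-z0-9._-]+@[a-z0-9-]+\.[a-z]{2,5}$", re.IGNORECASE)


def looks_like_email(value: str) -> bool:
    """Return True when the value fits the lecture's simplified email rule."""
    return EMAIL.search(value) is not None


# Hypothetical sample values, not the rows of the lecture's users table.
samples = [
    "john.doe@gmail.com",
    "jane_smith-1@my-site.org",
    "@gmail.com",
    "bob@site.c",
    "bob@site.abcdef",
    "Alice Brown",
]

for value in samples:
    print(value, "->", looks_like_email(value))
```

Only the first two samples pass: the third has nothing before the @ sign, and the next two have endings that are too short or too long.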
223
224
00:15:11,103 --> 00:15:17,247
So regular expressions are very useful and you can use them in other advanced string functions
224
225
00:15:17,503 --> 00:15:18,015
As well
225
226
00:15:18,271 --> 00:15:20,319
So that's all for this lecture