A few weeks ago, I created an episode on transcribing with artificial intelligence, and it was a really neat application because it essentially used a Python script which ran as a service, and with that, we were able to make requests from a Rails application over to this Python service to transcribe audio and video files. However, there were several pieces of that implementation that I really did not like, and so in this episode, we're going to revisit this and see how we can make it not only more stable but also thread safe. We're also going to look at creating text-to-image with Stable Diffusion, and again, all of this is hosted on our own local machine; we're not using any APIs to a third-party image generation service.
So we'll just use this prompt, "tiger walking through a forest," and I'll create the image. It will go from pending to processing, and then it'll be completed, automatically showing us the image. And just for fun, so we can create multiple of these, we're also going to have a regenerate button, which will just create a new record with the same prompt, run it through again, and recreate that image. And so you can see we'll get a different kind of image each time, all based off of that prompt.
And so I have it set up here. You'll see on the bottom right-hand side, this is a program called nvtop, and it gives us an overview of the GPU. Currently you see we have about four gigabytes of VRAM being used, because we do have one instance of our model loaded. On the left-hand side, I have four instances where we can then regenerate the image. And on the top right, we have the Python script running as a microservice.
So I'm going to go through and quickly click on generate on each one of these, and you'll see that it is generating multiple images at the same time. It is then going to be able to complete all four images concurrently. And so here you see that we have four different images generated each time. This does work pretty well. Again, I'll just regenerate the images, and you'll see each one goes from pending to processing. And now we're not limited to just one thread on our Sidekiq worker to handle these. We're not even going to use Sidekiq in this case; it's just going to use plain Active Job to queue them all up and then process them. And the nice thing about this approach is that it's going to be even easier than the previous episode, because of the approach that we're taking with the Python script.
So to get started, I am SSH'd into my development machine that has an Nvidia GPU. I already have Python installed on here, and I'm going to create two different files: a web.py and a main.py. And we are actually going to need one more file, and that's going to be the requirements.txt, because on a new machine, we want to be able to quickly install all of the requirements for this particular application.
So just to go through what we're going to need: we're going to need Flask, which is basically a Sinatra-like web framework that we're going to be able to use, and it's going to allow us to create an endpoint. We'll then need diffusers and also transformers; these are the Hugging Face libraries that we use to pull down and interact with the Stable Diffusion model. And then we're also going to be using torch.
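Those four dependencies can be captured in a requirements.txt along these lines (left unpinned here as in the episode; in practice you may want to pin versions):

```text
flask
diffusers
transformers
torch
```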
Once you have these, we can go into our terminal and run pip install -r requirements.txt, and it'll go through and make sure that all of those are installed. Again, I am using asdf to manage the Python interpreter, and the Python version that I'm using in this episode is 3.10.11.
And so let's start out with the Python script that's actually going to be doing the image generation. We need to bring in, from the diffusers library, the StableDiffusionPipeline. We also need to import torch, and then we need to import io, because we're going to be generating an image and we're not going to be saving it on this machine; instead, we're just going to return the contents of that image. We then set a model_id, which is just a variable, and we need to point it to a Hugging Face model. There is a company, Stability AI, which has created the stable-diffusion model, and we're going to be using version 2.1. We can then set a pipe equal to the StableDiffusionPipeline, and we need to load in that model, so we can call from_pretrained and pass in that model ID. Then we can call pipe.to and pass in "cuda", saying that we want to push this to our Nvidia GPU, so this model is going to be loaded into our VRAM.
We then create a function, and we're just going to call it generate_image. It's going to take in some keywords, and it's also going to take in a number of steps, which basically controls the accuracy of the image that we're going to be generating. We set an image equal to a call to our model through the pipe, passing in the keywords; we set num_inference_steps to the number of steps that we're passing into this function, call .images on the result, and grab the first item in that array. We can set another variable, image_bytes, equal to an io.BytesIO(), and then we can call image.save, saving into this image_bytes. So we're not saving it to our hard disk or anything; instead it's just going to be in memory for a moment. We also set the format to PNG. We can then rewind image_bytes with a seek to zero, and then we can call torch.cuda.empty_cache(), which is just going to free up any memory that may have been used or reserved when generating this image. And then finally we can return the image_bytes.
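Putting those pieces together, main.py comes out roughly like this. This is a sketch reconstructed from the narration, not the episode's exact file: I've split the PNG serialization into its own helper, made the heavy imports lazy so the module can load without the GPU stack present, and passed the pipeline in explicitly rather than using a module-level global — those choices, and the load_pipeline name, are my own.

```python
import io

def load_pipeline(model_id="stabilityai/stable-diffusion-2-1"):
    # Lazy imports: diffusers/torch are only needed when actually generating.
    from diffusers import StableDiffusionPipeline
    # Download (or load from the local cache) the model, then push it to VRAM.
    pipe = StableDiffusionPipeline.from_pretrained(model_id)
    return pipe.to("cuda")

def image_to_png_bytes(image):
    # Serialize an image into an in-memory PNG and rewind it, so the caller
    # can stream it back without ever touching the disk.
    image_bytes = io.BytesIO()
    image.save(image_bytes, format="PNG")
    image_bytes.seek(0)
    return image_bytes

def generate_image(pipe, keywords, num_steps=10):
    import torch
    image = pipe(keywords, num_inference_steps=num_steps).images[0]
    image_bytes = image_to_png_bytes(image)
    torch.cuda.empty_cache()  # free VRAM reserved while generating
    return image_bytes
```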
And so essentially what we have is this main Python file, and it's all we need to write — just around 15 lines of code. But we then need to interface with that function, and that's where this web.py file comes into play. Essentially we're going to create an endpoint, and with that endpoint we're then going to be able to make requests to it.
It'll then call that generate_image function. To do that, we need to import os, and from flask we also need to import a few different things: we need Flask, the request, and also send_file. From main, we need to import generate_image; this is coming from that main.py, where we have the generate_image function, so we can pull it in and interact with it. We'll set app equal to Flask, and then we can create a route. So we'll have app.route, and let's just make this route /generate_image — it does not have to be the same name as our function, I just happen to have it as the same name. And we're going to have another parameter, methods, equal to an array, because we want to allow a POST request to this endpoint.
We can then define a function, and we're just going to call it handle_request. It really doesn't have to be named this; you can call it almost anything, except generate_image, since we've already imported that name. Within this handle_request, we're going to set a prompt equal to a parameter from the request, which is what's coming into our application; it's a form request that we're taking in, and let's just call the parameter keywords. If the form was posted with no keywords value, we can check that with "if not prompt" and then return a hash saying there was an error, that keywords is required, and give it a status of 400.
However, if we did receive a prompt, then we can set image_bytes equal to the result of the generate_image function. We'll pass in our prompt, and then we need to pass in the number of inference steps; just so this happens quickly, I'll say 10. If you want it to be a lot more accurate you can set it to 50, but the higher the number you use, the longer it is going to take to generate.
Once we have our image bytes, we can return send_file with that image_bytes, and we'll set the mimetype equal to image/png. And lastly, we just have to call app.run. We can set which host we want to bind this on; because this is not only going to be accessed over localhost, we can listen on any interface, and then let's give it a port equal to maybe something like 8000. And so that's all the Python we have to write. We have our web interface, which imports the generate_image function from our main file, and when we accept a POST request to this endpoint that has a form parameter keywords with some value, it calls generate_image with that prompt. That generate_image function just passes the prompt into our Stable Diffusion model, generates the image, temporarily saves it to memory, and returns it.
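web.py, then, looks roughly like this. Again a sketch, not the exact file from the episode: I've pulled the form handling into a small helper and wrapped the app construction in a create_app function so the module can be imported without the model loaded — those names, and the app-factory shape, are my own.

```python
def keywords_from_form(form):
    # Return the "keywords" field from a posted form, or None when it is
    # missing or blank (which the endpoint turns into a 400).
    prompt = (form.get("keywords") or "").strip()
    return prompt or None

def create_app():
    from flask import Flask, request, send_file
    from main import load_pipeline, generate_image

    app = Flask(__name__)
    pipe = load_pipeline()  # load the model into VRAM once, at boot

    @app.route("/generate_image", methods=["POST"])
    def handle_request():
        prompt = keywords_from_form(request.form)
        if prompt is None:
            return {"error": "keywords is required"}, 400
        image_bytes = generate_image(pipe, prompt, num_steps=10)
        return send_file(image_bytes, mimetype="image/png")

    return app

# Run directly:   create_app().run(host="0.0.0.0", port=8000)
# Under gunicorn: gunicorn --bind 0.0.0.0:8000 "web:create_app()"
```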
So we can go ahead and start up our web server with python web.py, and it'll now load the model into memory. You will get a warning that this is a development server — "don't use it in a production deployment" — and so if you did want to deploy this, I would recommend using something like gunicorn. You could just run that instead, saying that we basically just want one worker. But if you are going to do that, you'll want to disable this app.run, so that we are solely using gunicorn. It'll load the model up into the VRAM as we would expect, and then once it's up and running, we can make our request to the IP address of this machine on port 8000, and I'm going to use Paw here to do that.
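The gunicorn invocation would look something like this; the web:app module target and the timeout flag are assumptions on my part (web.py needs to expose the Flask app object for this to work):

```shell
# One worker, bound to every interface on port 8000; the longer timeout gives
# slow image generations room to finish before the worker is recycled.
gunicorn --workers 1 --bind 0.0.0.0:8000 --timeout 120 web:app
```

Later in the episode the worker count gets bumped to two (--workers 2) to handle two generations in parallel, at the cost of a second copy of the model in VRAM.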
I'll make this request just so we can test and make sure it's working. So I can type http://, put in the IP address of this machine, and then specify port 8000 — I made this a bit wider so we can see it all. Then we want to go to the endpoint, /generate_image. Because it is a POST request, we do need to pass into the body, form-URL-encoded, the keywords, and for the keywords let's just use "planet earth" as an example. Oh, and I do have a few little mistypes here, so I need to fix the spelling on the inference steps, and also this is supposed to be image_bytes. Once we fix those issues, we can restart our application and make our request again; we'll see that it's generating the image, and then we get our image back. The nice thing about using something like Paw, or the RapidAPI client now, is that we can just keep making this same request over and over, and we should see a new image being generated each time. And so I would say that this works.
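If you don't have Paw handy, the same smoke test works from the command line; the IP address here is a placeholder for your GPU machine:

```shell
# POST the form-encoded keywords and save the returned PNG.
curl -X POST \
  --data-urlencode "keywords=planet earth" \
  -o planet_earth.png \
  http://192.168.1.50:8000/generate_image
```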
Now we can handle the Ruby on Rails side, where we can make requests to this endpoint, as well as look at some of the gotchas around it. Because this is up and running and working, I'm going to just leave it here. I will copy the gunicorn command and paste it in here, just so we have a reference to it in the show notes. But for the most part, we have under 30 lines of Python code, and it's going to do some really cool stuff.
So now we can start with a fresh Rails 7 application. I'll generate a scaffold; we'll just call it images, with a prompt, which is a text column, and a status, which is an integer. We'll go ahead and generate that, and then we'll also run rails active_storage:install, because we are going to be using Active Storage to store these images. Before we run the migrations, I am going to go into the images migration, and I just want to set a default value for the status; we'll set it equal to zero, because we are going to be using an enum for the status to keep track of it.
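The generated migration, with that default added, would look roughly like this (the migration version is whatever your Rails generated, and the null: false is my own addition):

```ruby
class CreateImages < ActiveRecord::Migration[7.0]
  def change
    create_table :images do |t|
      t.text :prompt
      # A default of 0 maps to the first enum value, :pending.
      t.integer :status, default: 0, null: false

      t.timestamps
    end
  end
end
```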
Then, in the image model, we can do has_one_attached, and let's just call it our file. We'll have the enum for the status, and again, just like we did in the previous episode, we're going to have pending, processing, and then completed. Similarly, we're going to have a broadcast, so as we're getting updates from the background job, it's going to broadcast them out to our end user.
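Pulled together, the model might look like this; the episode doesn't spell out the exact broadcasting code, so treat the after_update_commit line as one assumed way to wire it up:

```ruby
class Image < ApplicationRecord
  has_one_attached :file

  enum status: { pending: 0, processing: 1, completed: 2 }

  # Push status/image changes to subscribers of `turbo_stream_from @image`.
  after_update_commit -> { broadcast_replace_to self }
end
```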
Then, in the form, I'm just going to remove the status, because that's going to be set automatically, and we don't need to take in anything except the prompt. In the images controller, we don't want to take in the status either, so we'll just permit the prompt there.
Then let's go into the show page. Within the show page we're rendering the image, which means we need to look into the image partial, and there we see the prompt and the status, so that's good. Let's also display the image itself: I'll just create a new paragraph, and within there we can call image_tag with url_for and image.file — and we only want to do this if image.file is attached. We still need to listen for the Action Cable channel, so I'm just going to put that on the show page, and we can do that with an ERB tag for turbo_stream_from; we just want to stream from the image.
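As a sketch, those show-page and partial additions come out something like this (file names and variables assume the scaffold defaults):

```erb
<%# app/views/images/show.html.erb %>
<%= turbo_stream_from @image %>
<%= render @image %>

<%# app/views/images/_image.html.erb — the added paragraph %>
<% if image.file.attached? %>
  <p><%= image_tag url_for(image.file) %></p>
<% end %>
```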
184
00:14:09,200 --> 00:14:13,900
go ahead and add in that regenerate image button so just under the status I'm just going
185
00:14:13,900 --> 00:14:19,540
to create a button to and if you're unfamiliar with the button to that's basically going
186
00:14:19,540 --> 00:14:25,740
to create a form and around that form we can then have a URL passing some parameters and
187
00:14:25,740 --> 00:14:31,579
it's going to make a post request so I'm just going to have this called the regenerate and
188
00:14:31,579 --> 00:14:36,939
we can send this to the images underscore path and then we can also pass something into
189
00:14:36,939 --> 00:14:43,300
here and so just to keep it in the same format that our images controller is going to expect
190
00:14:43,300 --> 00:14:48,780
where we have the prams we need to have an image and then within that image hash we can
191
00:14:48,780 --> 00:14:54,219
have the prompt and we just want to set that equal to the image dot prompt and of course
192
00:14:54,219 --> 00:14:59,300
we would only want to do this if that image is completed so that's going to look at the
193
00:14:59,300 --> 00:15:05,099
enum to see if it's completed and if it is then we can regenerate the image and so in
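That button, as described, could be sketched as:

```erb
<%# Only offer a regenerate once the image has finished. %>
<% if image.completed? %>
  <%= button_to "Regenerate", images_path(image: { prompt: image.prompt }) %>
<% end %>
```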
So in the images controller, whenever we are creating an image, let's call a job: ProcessImageJob.perform_later, and let's just pass in the image. Typically I'll pass in the image ID, and I do like doing that, but I'm just going to pass in the image here, because Active Job does support passing in more complex objects like an Active Record object, and it will use Global ID to return the record. If you're uncomfortable with doing that, or if you're using a background processor that doesn't support it, then you can pass in the image ID and look up that record within the background job. So we need to generate that job with rails generate job process_image, and that'll give us the ProcessImageJob.
Within this job, we need to take in the image. We could always set up some guard clauses: return unless the image is there, and also return if the image is already completed. We can then set the image to processing, and by setting it to processing, that's going to broadcast out and update our view. We can get a prompt equal to image.prompt, and let's just make sure that it is present; if it is present, then we use image.prompt, and if it's not, then let's give it a default prompt — in this case I'm just going to use "children's book art cover, monkey hanging in a tree". We give it the same URL that we used within the RapidAPI client, and that is a local IP address, port 8000, and /generate_image. Of course, you probably don't want to hard-code an IP address in here; I would probably extract that out into an environment variable, but in this case I'm just going to leave it. And so we're then going to expect some kind of image data back.
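The prompt fallback can be isolated as a tiny plain-Ruby helper; this exact method is my own framing of the present-check described above, and the default string is the one from the episode:

```ruby
DEFAULT_PROMPT = "children's book art cover, monkey hanging in a tree"

# Use the record's prompt when it has one; otherwise fall back to the default.
def resolve_prompt(image)
  prompt = image.prompt.to_s.strip
  prompt.empty? ? DEFAULT_PROMPT : prompt
end
```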
I'm going to make a separate class for this, because I don't want to put all of that logic in the job. So we're just going to have a class, we'll call it HTTP, and we'll make a POST request with it. Let's set up this post class method to take in a few things: we need the URL, and we also need some form data — in this case the keywords, so I'm just going to pass in the prompt. Before we go any further, let's go ahead and create this HTTP class and this post method. Under our models I have the http.rb, and we're just going to use the standard Net::HTTP. It's a class named HTTP, and we'll have our class-level post method, taking in the URL and also the form data; it creates a new instance of this class with the URL and form data and then calls the instance method post. So we initialize this with our URL and the form data, set instance variables for the URL and the form data equal to those local variables, and then we create our post instance method. Within this post method, we just call our post_request — a method that we'll have to create — and then we expect to call the attribute body on what it returns. So we can create this post_request: we set our URI equal to URI.parse of our URL, then set http equal to Net::HTTP, creating a new instance with uri.host and uri.port. We set use_ssl equal to true if uri.scheme equals https; otherwise it'll use plain HTTP. Depending on your application and the server that you're hosting this on, you may need to set the open timeout, which I'm going to set to 10 seconds, and you may also need to set the read timeout, because if you do have a lot of traffic coming into the server and it's going to take a while to generate the images, you may need a higher number here. We then make our request object equal to Net::HTTP::Post, creating a new instance with uri.request_uri, and we set its form data equal to our form data. We finally make our request with http.request, passing in the request. The nice thing about this is that now we're going to be able to reuse this multiple times, for basically any kind of HTTP POST that's simple enough to just have a URL and form data.
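Reconstructed as a sketch, the class comes out like this. I've split the request building into its own method so it can be exercised without a network, and the 120-second read timeout is an illustrative value — pick one that fits your hardware:

```ruby
require "net/http"
require "uri"

# A small reusable POST client around the standard library's Net::HTTP.
class HTTP
  def self.post(url, form_data)
    new(url, form_data).post
  end

  def initialize(url, form_data)
    @url = url
    @form_data = form_data
  end

  # Performs the request and returns the response body (the PNG bytes).
  def post
    uri = URI.parse(@url)
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = uri.scheme == "https"
    http.open_timeout = 10   # fail fast when the service is unreachable
    http.read_timeout = 120  # image generation can take a while under load
    http.request(build_request(uri)).body
  end

  # Builds the form-encoded POST; separate so it is testable offline.
  def build_request(uri)
    request = Net::HTTP::Post.new(uri.request_uri)
    request.set_form_data(@form_data)
    request
  end
end
```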
So back in our image-processing job, we're sending our form data to that HTTP.post. If we are getting some image data back, then we can call image.file.attach, and we want to attach with an io of StringIO, creating a new instance around that image_data. We do need to give it a filename, which let's just call image.png, and we also need to set the content type, which is image/png. Once that attaches, we can call image.completed!, which should then broadcast a new change.
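In the job, that attach step looks roughly like this, with image_data being the response body returned by the HTTP class:

```ruby
if image_data.present?
  image.file.attach(
    io: StringIO.new(image_data),
    filename: "image.png",
    content_type: "image/png"
  )
  image.completed!
end
```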
However, there are a couple of thoughts that you may want to take into consideration. One: if you want to throttle the number of requests coming into the server to generate the images, you can create a separate queue, and for that separate queue have maybe only one or two workers handling its requests. There's also the situation where, if the Python service is not running, the job is going to error out. So we can add a retry on that error — it's basically a connection refused — and because we don't want this to keep hammering that service, we can just wait five seconds, and let's say we'll give it an unlimited number of attempts. We can also retry for the situation where our timeout wasn't long enough, and again we'll just wait five seconds on that one.
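Those retries, sketched on the job; the specific exception classes are my reading of "connection refused" and a read timeout, so check what your HTTP layer actually raises:

```ruby
class ProcessImageJob < ApplicationJob
  queue_as :default

  # The Python service is down: back off instead of hammering it.
  retry_on Errno::ECONNREFUSED, wait: 5.seconds, attempts: Float::INFINITY

  # Generation outlasted our read timeout: try again shortly.
  retry_on Net::ReadTimeout, wait: 5.seconds
end
```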
So now we can test this out. We can give it a prompt, or we can leave the prompt blank, and then it'll show us a monkey hanging from a tree. We see that it went to processing and then completed, and then we got our image. We can try to regenerate this; it'll create a new record, and again we get the processing and completed, and so this all works. So I'm going to kill the server, and I'm going to change the inference steps, because I want this to take a little bit longer: I'll set it to 50, and it should take about 14 seconds to generate an image on this graphics card. I can then set this up to run again, so I'll just use gunicorn, and it'll load the model up into memory, and I can get nvtop running again just so we can see this. Then I'll have two browsers open where we can generate images. Let's just test it with one first: we can see the load spiking on the GPU, and it's going through, and it's taking a lot longer now to generate the image, but that is going to make it a bit more detailed and accurate to what we are looking for. So now let's test it with two simultaneously, where we're generating two images at a time. Here you can see that it's going to generate one image, and then once it generates that first image, it's going to work on the second one. That's one of the drawbacks with gunicorn here: it's only handling one request at a time. But we are able to sort of fix that.
So I'll kill the server, and we'll just change the number of workers and set it equal to two. You do want to be careful doing that, because you'll notice that the memory spikes a lot higher, since we now have two separate processes running. So again, I'll click generate on both of these real quick, and you'll see that it's going through and generating them, and it is kind of jumping back and forth because it is running two separate processes to generate them. But one thing you will notice is that it is going to take longer to generate both of those images, simply because it is now having to do two calculations at the same time. So we got the one image, and the second one will appear shortly after.
I really enjoy doing this image generation and the transcribing, and I'm coming up with all new kinds of ideas for this. I don't think I'll really make any more episodes around the AI unless there is something a bit more of a breakthrough, but I do just want to reiterate the approach that we took with the Python script, because I do think it makes this very usable, and I do think this is a great case for a microservice: essentially what we have is a very small script that's doing one thing, and it does that one thing very well, and we also have one simple script that is our main entry point into this AI image generation. You could also dockerize this if you want, but the one thing that I would recommend, if you are going to dockerize it, is to copy the model and the pipe out into a preload script, so you can bundle the model that you're going to be using into that Docker image. I have found that if you try to start the gunicorn process within Docker, the health checks will fail, and then you're just going to have some quirks there. But if you preload that model in, then it's going to deploy and run very fast, though it is going to result in a very large Docker image.
I have also created a website where I've gone through all of the different AI things that I've been creating and playing around with, so you can go to ai.railsapp.dev to check some of these out. You can do different things like text-to-speech, where you can generate spoken audio clips. You can also play around with the text-to-image, which has a bunch of the different community-generated images; it works in a very similar process, where we have a pending, it'll then go into processing, and it completes once it's finished. Once it completes, it'll hide the progress bar, and it'll take a moment, but then it'll load in the image. You can also use this with closed captions, so you can upload a file and it'll generate the closed captions for you. Because this is just an example, not really meant to be used as a utility, and my resources are limited, I do have just the first 1,000 characters being displayed from that transcription and closed-caption generation. It's similar for the transcriptions, where it is just doing the transcription; you can see it's displaying the first 1,000 characters, and this is really just so you can get a taste of how it works. Well, that's all for this episode. Thanks for watching.