A few weeks ago, I created an episode on transcribing with artificial intelligence, and it was a really neat application because it essentially used a Python script which ran as a service, and with that, we were able to make requests from a Rails application over to this Python service to transcribe audio and video files. However, there were several pieces of that implementation that I really did not like, and so in this episode, we're going to revisit this and see how we can make it not only more stable but also thread safe. We're also going to look at creating text-to-image with Stable Diffusion, and again, all of this is hosted on our own local machine; we're not using any APIs to a third-party image generation service.
So we'll just use this prompt, "tiger walking through a forest," and I'll create the image. It will go from pending to processing, and then it'll be completed, automatically showing us the image. And just for fun, so we can create multiple of these, we're also going to have a regenerate button, which will just create a new record with the same prompt, run it through again, and recreate that image. And so you can see we'll get a different kind of image each time, all based off of that prompt.
And so I have it set up here. You'll see on the bottom right-hand side, this is a program called nvtop, and it gives us an overview of the GPU. Currently you see we have about four gigabytes of VRAM being used, because we do have one instance of our model loaded. On the left-hand side, I have four instances where we can then regenerate the image. And on the top right, we have the Python script running as a microservice.
So I'm going to go through and quickly click on generate on each one of these, and you'll see that it is generating multiple images at the same time. It is then going to be able to complete all four images concurrently. And so here you see that we have four different images generated each time. This does work pretty well. Again, I'll just regenerate the images, and you'll see each one goes from pending to processing. And now we're not limited to just one thread on our Sidekiq worker to handle these. We're not even going to use Sidekiq in this case; it's just going to use plain Active Job to queue them all up and then process them. And the nice thing about this approach is that it's going to be even easier than the previous episode, because of the approach that we're taking with the Python script.
So to get started, I am SSH'd into my development machine that has an Nvidia GPU. I already have Python installed on here, and I'm going to create two different files: a web.py and a main.py. And we are actually going to need one more file, and that's going to be the requirements.txt, because on a new machine, we want to be able to quickly install all of the requirements for this particular application.
So just to go through what we're going to need: we're going to need Flask, which is basically a Sinatra-like web framework that we're going to be able to use, and it's going to allow us to create an endpoint. We'll then need diffusers and also transformers; these are the Hugging Face libraries that we use to pull down and interact with the Stable Diffusion model. And then we're also going to be using torch.
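Those four dependencies can be captured in a requirements.txt along these lines (left unpinned here as in the episode; in practice you may want to pin versions):

```text
flask
diffusers
transformers
torch
```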
Once you have these, we can go into our terminal and run pip install -r requirements.txt, and it'll go through and make sure that all of those are installed. Again, I am using asdf to manage the Python interpreter, and the Python version that I'm using in this episode is 3.10.11.
And so let's start out with the Python script that's actually going to be doing the image generation. We need to bring in, from the diffusers library, the StableDiffusionPipeline. We also need to import torch, and then we need to import io, because we're going to be generating an image and we're not going to be saving it on this machine; instead, we're just going to return the contents of that image. We then set a model_id, which is just a variable, and we need to point it to a Hugging Face model. There is a company, Stability AI, which has created the stable-diffusion model, and we're going to be using version 2.1. We can then set a pipe equal to the StableDiffusionPipeline, and we need to load in that model, so we can call from_pretrained and pass in that model ID. Then we can call pipe.to and pass in "cuda", saying that we want to push this to our Nvidia GPU, so this model is going to be loaded into our VRAM.
We then create a function, and we're just going to call it generate_image. It's going to take in some keywords, and it's also going to take in a number of steps, which basically controls the accuracy of the image that we're going to be generating. We set an image equal to a call to our model through the pipe, passing in the keywords; we set num_inference_steps to the number of steps that we're passing into this function, call .images on the result, and grab the first item in that array. We can set another variable, image_bytes, equal to an io.BytesIO(), and then we can call image.save, saving into this image_bytes. So we're not saving it to our hard disk or anything; instead it's just going to be in memory for a moment. We also set the format to PNG. We can then rewind image_bytes with a seek to zero, and then we can call torch.cuda.empty_cache(), which is just going to free up any memory that may have been used or reserved when generating this image. And then finally we can return the image_bytes.
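Putting those pieces together, main.py comes out roughly like this. This is a sketch reconstructed from the narration, not the episode's exact file: I've split the PNG serialization into its own helper, made the heavy imports lazy so the module can load without the GPU stack present, and passed the pipeline in explicitly rather than using a module-level global — those choices, and the load_pipeline name, are my own.

```python
import io

def load_pipeline(model_id="stabilityai/stable-diffusion-2-1"):
    # Lazy imports: diffusers/torch are only needed when actually generating.
    from diffusers import StableDiffusionPipeline
    # Download (or load from the local cache) the model, then push it to VRAM.
    pipe = StableDiffusionPipeline.from_pretrained(model_id)
    return pipe.to("cuda")

def image_to_png_bytes(image):
    # Serialize an image into an in-memory PNG and rewind it, so the caller
    # can stream it back without ever touching the disk.
    image_bytes = io.BytesIO()
    image.save(image_bytes, format="PNG")
    image_bytes.seek(0)
    return image_bytes

def generate_image(pipe, keywords, num_steps=10):
    import torch
    image = pipe(keywords, num_inference_steps=num_steps).images[0]
    image_bytes = image_to_png_bytes(image)
    torch.cuda.empty_cache()  # free VRAM reserved while generating
    return image_bytes
```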
And so essentially what we have is this main Python file, and it's all we need to write — just around 15 lines of code. But we then need to interface with that function, and that's where this web.py file comes into play. Essentially we're going to create an endpoint, and with that endpoint we're then going to be able to make requests to it.
It'll then call that generate_image function. To do that, we need to import os, and from flask we also need to import a few different things: we need Flask, the request, and also send_file. From main, we need to import generate_image; this is coming from that main.py, where we have the generate_image function, so we can pull it in and interact with it. We'll set app equal to Flask, and then we can create a route. So we'll have app.route, and let's just make this route /generate_image — it does not have to be the same name as our function, I just happen to have it as the same name. And we're going to have another parameter, methods, equal to an array, because we want to allow a POST request to this endpoint.
We can then define a function, and we're just going to call it handle_request. It really doesn't have to be named this; you can call it almost anything, except generate_image, since we've already imported that name. Within this handle_request, we're going to set a prompt equal to a parameter from the request, which is what's coming into our application; it's a form request that we're taking in, and let's just call the parameter keywords. If the form was posted with no keywords value, we can check that with "if not prompt" and then return a hash saying there was an error, that keywords is required, and give it a status of 400.
However, if we did receive a prompt, then we can set image_bytes equal to the result of the generate_image function. We'll pass in our prompt, and then we need to pass in the number of inference steps; just so this happens quickly, I'll say 10. If you want it to be a lot more accurate you can set it to 50, but the higher the number you use, the longer it is going to take to generate.
Once we have our image bytes, we can return send_file with that image_bytes, and we'll set the mimetype equal to image/png. And lastly, we just have to call app.run. We can set which host we want to bind this on; because this is not only going to be accessed over localhost, we can listen on any interface, and then let's give it a port equal to maybe something like 8000. And so that's all the Python we have to write. We have our web interface, which imports the generate_image function from our main file, and when we accept a POST request to this endpoint that has a form parameter keywords with some value, it calls generate_image with that prompt. That generate_image function just passes the prompt into our Stable Diffusion model, generates the image, temporarily saves it to memory, and returns it.
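web.py, then, looks roughly like this. Again a sketch, not the exact file from the episode: I've pulled the form handling into a small helper and wrapped the app construction in a create_app function so the module can be imported without the model loaded — those names, and the app-factory shape, are my own.

```python
def keywords_from_form(form):
    # Return the "keywords" field from a posted form, or None when it is
    # missing or blank (which the endpoint turns into a 400).
    prompt = (form.get("keywords") or "").strip()
    return prompt or None

def create_app():
    from flask import Flask, request, send_file
    from main import load_pipeline, generate_image

    app = Flask(__name__)
    pipe = load_pipeline()  # load the model into VRAM once, at boot

    @app.route("/generate_image", methods=["POST"])
    def handle_request():
        prompt = keywords_from_form(request.form)
        if prompt is None:
            return {"error": "keywords is required"}, 400
        image_bytes = generate_image(pipe, prompt, num_steps=10)
        return send_file(image_bytes, mimetype="image/png")

    return app

# Run directly:   create_app().run(host="0.0.0.0", port=8000)
# Under gunicorn: gunicorn --bind 0.0.0.0:8000 "web:create_app()"
```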
So we can go ahead and start up our web server with python web.py, and it'll now load the model into memory. You will get a warning that this is a development server — "don't use it in a production deployment" — and so if you did want to deploy this, I would recommend using something like gunicorn. You could just run that instead, saying that we basically just want one worker. But if you are going to do that, you'll want to disable this app.run, so that we are solely using gunicorn. It'll load the model up into the VRAM as we would expect, and then once it's up and running, we can make our request to the IP address of this machine on port 8000, and I'm going to use Paw here to do that.
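The gunicorn invocation would look something like this; the web:app module target and the timeout flag are assumptions on my part (web.py needs to expose the Flask app object for this to work):

```shell
# One worker, bound to every interface on port 8000; the longer timeout gives
# slow image generations room to finish before the worker is recycled.
gunicorn --workers 1 --bind 0.0.0.0:8000 --timeout 120 web:app
```

Later in the episode the worker count gets bumped to two (--workers 2) to handle two generations in parallel, at the cost of a second copy of the model in VRAM.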
I'll make this request just so we can test and make sure it's working. So I can type http://, put in the IP address of this machine, and then specify port 8000 — I made this a bit wider so we can see it all. Then we want to go to the endpoint, /generate_image. Because it is a POST request, we do need to pass into the body, form-URL-encoded, the keywords, and for the keywords let's just use "planet earth" as an example. Oh, and I do have a few little mistypes here, so I need to fix the spelling on the inference steps, and also this is supposed to be image_bytes. Once we fix those issues, we can restart our application and make our request again; we'll see that it's generating the image, and then we get our image back. The nice thing about using something like Paw, or the RapidAPI client now, is that we can just keep making this same request over and over, and we should see a new image being generated each time. And so I would say that this works.
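If you don't have Paw handy, the same smoke test works from the command line; the IP address here is a placeholder for your GPU machine:

```shell
# POST the form-encoded keywords and save the returned PNG.
curl -X POST \
  --data-urlencode "keywords=planet earth" \
  -o planet_earth.png \
  http://192.168.1.50:8000/generate_image
```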
Now we can handle the Ruby on Rails side, where we can make requests to this endpoint, as well as look at some of the gotchas around it. Because this is up and running and working, I'm going to just leave it here. I will copy the gunicorn command and paste it in here, just so we have a reference to it in the show notes. But for the most part, we have under 30 lines of Python code, and it's going to do some really cool stuff.
So now we can start with a fresh Rails 7 application. I'll generate a scaffold; we'll just call it images, with a prompt, which is a text column, and a status, which is an integer. We'll go ahead and generate that, and then we'll also run rails active_storage:install, because we are going to be using Active Storage to store these images. Before we run the migrations, I am going to go into the images migration, and I just want to set a default value for the status; we'll set it equal to zero, because we are going to be using an enum for the status to keep track of it.
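The generated migration, with that default added, would look roughly like this (the migration version is whatever your Rails generated, and the null: false is my own addition):

```ruby
class CreateImages < ActiveRecord::Migration[7.0]
  def change
    create_table :images do |t|
      t.text :prompt
      # A default of 0 maps to the first enum value, :pending.
      t.integer :status, default: 0, null: false

      t.timestamps
    end
  end
end
```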
Then, in the image model, we can do has_one_attached, and let's just call it our file. We'll have the enum for the status, and again, just like we did in the previous episode, we're going to have pending, processing, and then completed. Similarly, we're going to have a broadcast, so as we're getting updates from the background job, it's going to broadcast them out to our end user.
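Pulled together, the model might look like this; the episode doesn't spell out the exact broadcasting code, so treat the after_update_commit line as one assumed way to wire it up:

```ruby
class Image < ApplicationRecord
  has_one_attached :file

  enum status: { pending: 0, processing: 1, completed: 2 }

  # Push status/image changes to subscribers of `turbo_stream_from @image`.
  after_update_commit -> { broadcast_replace_to self }
end
```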
Then, in the form, I'm just going to remove the status, because that's going to be set automatically, and we don't need to take in anything except the prompt. In the images controller, we don't want to take in the status either, so we'll just permit the prompt there.
Then let's go into the show page. Within the show page we're rendering the image, which means we need to look into the image partial, and there we see the prompt and the status, so that's good. Let's also display the image itself: I'll just create a new paragraph, and within there we can call image_tag with url_for and image.file — and we only want to do this if image.file is attached. We still need to listen for the Action Cable channel, so I'm just going to put that on the show page, and we can do that with an ERB tag for turbo_stream_from; we just want to stream from the image.
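As a sketch, those show-page and partial additions come out something like this (file names and variables assume the scaffold defaults):

```erb
<%# app/views/images/show.html.erb %>
<%= turbo_stream_from @image %>
<%= render @image %>

<%# app/views/images/_image.html.erb — the added paragraph %>
<% if image.file.attached? %>
  <p><%= image_tag url_for(image.file) %></p>
<% end %>
```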
184
00:14:09,200 --> 00:14:13,900
go ahead and add in that regenerate image button so just under the status I'm just going
185
00:14:13,900 --> 00:14:19,540
to create a button to and if you're unfamiliar with the button to that's basically going
186
00:14:19,540 --> 00:14:25,740
to create a form and around that form we can then have a URL passing some parameters and
187
00:14:25,740 --> 00:14:31,579
it's going to make a post request so I'm just going to have this called the regenerate and
188
00:14:31,579 --> 00:14:36,939
we can send this to the images underscore path and then we can also pass something into
189
00:14:36,939 --> 00:14:43,300
here and so just to keep it in the same format that our images controller is going to expect
190
00:14:43,300 --> 00:14:48,780
where we have the prams we need to have an image and then within that image hash we can
191
00:14:48,780 --> 00:14:54,219
have the prompt and we just want to set that equal to the image dot prompt and of course
192
00:14:54,219 --> 00:14:59,300
we would only want to do this if that image is completed so that's going to look at the
193
00:14:59,300 --> 00:15:05,099
enum to see if it's completed and if it is then we can regenerate the image and so in
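That button, as described, could be sketched as:

```erb
<%# Only offer a regenerate once the image has finished. %>
<% if image.completed? %>
  <%= button_to "Regenerate", images_path(image: { prompt: image.prompt }) %>
<% end %>
```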
So in the images controller, whenever we are creating an image, let's call a job: ProcessImageJob.perform_later, and let's just pass in the image. Typically I'll pass in the image ID, and I do like doing that, but I'm just going to pass in the image here, because Active Job does support passing in more complex objects like an Active Record object, and it will use Global ID to return the record. If you're uncomfortable with doing that, or if you're using a background processor that doesn't support it, then you can pass in the image ID and look up that record within the background job. So we need to generate that job with rails generate job process_image, and that'll give us the ProcessImageJob.
Within this job, we need to take in the image. We could always set up some guard clauses: return unless the image is there, and also return if the image is already completed. We can then set the image to processing, and by setting it to processing, that's going to broadcast out and update our view. We can get a prompt equal to image.prompt, and let's just make sure that it is present; if it is present, then we use image.prompt, and if it's not, then let's give it a default prompt — in this case I'm just going to use "children's book art cover, monkey hanging in a tree". We give it the same URL that we used within the RapidAPI client, and that is a local IP address, port 8000, and /generate_image. Of course, you probably don't want to hard-code an IP address in here; I would probably extract that out into an environment variable, but in this case I'm just going to leave it. And so we're then going to expect some kind of image data back.
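The prompt fallback can be isolated as a tiny plain-Ruby helper; this exact method is my own framing of the present-check described above, and the default string is the one from the episode:

```ruby
DEFAULT_PROMPT = "children's book art cover, monkey hanging in a tree"

# Use the record's prompt when it has one; otherwise fall back to the default.
def resolve_prompt(image)
  prompt = image.prompt.to_s.strip
  prompt.empty? ? DEFAULT_PROMPT : prompt
end
```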
I'm going to make a separate class for this, because I don't want to put all of that logic in the job. So we're just going to have a class, we'll call it HTTP, and we'll make a POST request with it. Let's set up this post class method to take in a few things: we need the URL, and we also need some form data — in this case the keywords, so I'm just going to pass in the prompt. Before we go any further, let's go ahead and create this HTTP class and this post method. Under our models I have the http.rb, and we're just going to use the standard Net::HTTP. It's a class named HTTP, and we'll have our class-level post method, taking in the URL and also the form data; it creates a new instance of this class with the URL and form data and then calls the instance method post. So we initialize this with our URL and the form data, set instance variables for the URL and the form data equal to those local variables, and then we create our post instance method. Within this post method, we just call our post_request — a method that we'll have to create — and then we expect to call the attribute body on what it returns. So we can create this post_request: we set our URI equal to URI.parse of our URL, then set http equal to Net::HTTP, creating a new instance with uri.host and uri.port. We set use_ssl equal to true if uri.scheme equals https; otherwise it'll use plain HTTP. Depending on your application and the server that you're hosting this on, you may need to set the open timeout, which I'm going to set to 10 seconds, and you may also need to set the read timeout, because if you do have a lot of traffic coming into the server and it's going to take a while to generate the images, you may need a higher number here. We then make our request object equal to Net::HTTP::Post, creating a new instance with uri.request_uri, and we set its form data equal to our form data. We finally make our request with http.request, passing in the request. The nice thing about this is that now we're going to be able to reuse this multiple times, for basically any kind of HTTP POST that's simple enough to just have a URL and form data.
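Reconstructed as a sketch, the class comes out like this. I've split the request building into its own method so it can be exercised without a network, and the 120-second read timeout is an illustrative value — pick one that fits your hardware:

```ruby
require "net/http"
require "uri"

# A small reusable POST client around the standard library's Net::HTTP.
class HTTP
  def self.post(url, form_data)
    new(url, form_data).post
  end

  def initialize(url, form_data)
    @url = url
    @form_data = form_data
  end

  # Performs the request and returns the response body (the PNG bytes).
  def post
    uri = URI.parse(@url)
    http = Net::HTTP.new(uri.host, uri.port)
    http.use_ssl = uri.scheme == "https"
    http.open_timeout = 10   # fail fast when the service is unreachable
    http.read_timeout = 120  # image generation can take a while under load
    http.request(build_request(uri)).body
  end

  # Builds the form-encoded POST; separate so it is testable offline.
  def build_request(uri)
    request = Net::HTTP::Post.new(uri.request_uri)
    request.set_form_data(@form_data)
    request
  end
end
```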
So back in our image-processing job, we're sending our form data to that HTTP.post. If we are getting some image data back, then we can call image.file.attach, and we want to attach with an io of StringIO, creating a new instance around that image_data. We do need to give it a filename, which let's just call image.png, and we also need to set the content type, which is image/png. Once that attaches, we can call image.completed!, which should then broadcast a new change.
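In the job, that attach step looks roughly like this, with image_data being the response body returned by the HTTP class:

```ruby
if image_data.present?
  image.file.attach(
    io: StringIO.new(image_data),
    filename: "image.png",
    content_type: "image/png"
  )
  image.completed!
end
```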
However, there are a couple of thoughts that you may want to take into consideration. One: if you want to throttle the number of requests coming into the server to generate the images, you can create a separate queue, and for that separate queue have maybe only one or two workers handling its requests. There's also the situation where, if the Python service is not running, the job is going to error out. So we can add a retry on that error — it's basically a connection refused — and because we don't want this to keep hammering that service, we can just wait five seconds, and let's say we'll give it an unlimited number of attempts. We can also retry for the situation where our timeout wasn't long enough, and again we'll just wait five seconds on that one.
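Those retries, sketched on the job; the specific exception classes are my reading of "connection refused" and a read timeout, so check what your HTTP layer actually raises:

```ruby
class ProcessImageJob < ApplicationJob
  queue_as :default

  # The Python service is down: back off instead of hammering it.
  retry_on Errno::ECONNREFUSED, wait: 5.seconds, attempts: Float::INFINITY

  # Generation outlasted our read timeout: try again shortly.
  retry_on Net::ReadTimeout, wait: 5.seconds
end
```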
So now we can test this out. We can give it a prompt, or we can leave the prompt blank, and then it'll show us a monkey hanging from a tree. We see that it went to processing and then completed, and then we got our image. We can try to regenerate this; it'll create a new record, and again we get the processing and completed, and so this all works. So I'm going to kill the server, and I'm going to change the inference steps, because I want this to take a little bit longer: I'll set it to 50, and it should take about 14 seconds to generate an image on this graphics card. I can then set this up to run again, so I'll just use gunicorn, and it'll load the model up into memory, and I can get nvtop running again just so we can see this. Then I'll have two browsers open where we can generate images. Let's just test it with one first: we can see the load spiking on the GPU, and it's going through, and it's taking a lot longer now to generate the image, but that is going to make it a bit more detailed and accurate to what we are looking for. So now let's test it with two simultaneously, where we're generating two images at a time. Here you can see that it's going to generate one image, and then once it generates that first image, it's going to work on the second one. That's one of the drawbacks with gunicorn here: it's only handling one request at a time. But we are able to sort of fix that.
So I'll kill the server, and we'll just change the number of workers and set it equal to two. You do want to be careful doing that, because you'll notice that the memory spikes a lot higher, since we now have two separate processes running. So again, I'll click generate on both of these real quick, and you'll see that it's going through and generating them, and it is kind of jumping back and forth because it is running two separate processes to generate them. But one thing you will notice is that it is going to take longer to generate both of those images, simply because it is now having to do two calculations at the same time. So we got the one image, and the second one will appear shortly after.
I really enjoy doing this image generation and the transcribing, and I'm coming up with all new kinds of ideas for this. I don't think I'll really make any more episodes around the AI unless there is something a bit more of a breakthrough, but I do just want to reiterate the approach that we took with the Python script, because I do think it makes this very usable, and I do think this is a great case for a microservice: essentially what we have is a very small script that's doing one thing, and it does that one thing very well, and we also have one simple script that is our main entry point into this AI image generation. You could also dockerize this if you want, but the one thing that I would recommend, if you are going to dockerize it, is to copy the model and the pipe out into a preload script, so you can bundle the model that you're going to be using into that Docker image. I have found that if you try to start the gunicorn process within Docker, the health checks will fail, and then you're just going to have some quirks there. But if you preload that model in, then it's going to deploy and run very fast, though it is going to result in a very large Docker image.
I have also created a website where I've gone through all of the different AI things that I've been creating and playing around with, so you can go to ai.railsapp.dev to check some of these out. You can do different things like text-to-speech, where you can generate spoken audio clips. You can also play around with the text-to-image, which has a bunch of the different community-generated images; it works in a very similar process, where we have a pending, it'll then go into processing, and it completes once it's finished. Once it completes, it'll hide the progress bar, and it'll take a moment, but then it'll load in the image. You can also use this with closed captions, so you can upload a file and it'll generate the closed captions for you. Because this is just an example, not really meant to be used as a utility, and my resources are limited, I do have just the first 1,000 characters being displayed from that transcription and closed-caption generation. It's similar for the transcriptions, where it is just doing the transcription; you can see it's displaying the first 1,000 characters, and this is really just so you can get a taste of how it works. Well, that's all for this episode. Thanks for watching.