miraj.cloud: ai image generator

i really like ai image generators. i signed up for midjourney as soon as it came on my radar, and after paying for a few months and trying some free services, i wanted to see how i could put one together myself.

i was also interested in the message queue architecture these services use, both for future projects and because an image generator felt like a great test case for it.

prerequisites

i first needed to acquire a new GPU. i found one from the last generation and chose Nvidia because many projects treat CUDA as a first-class target and i did not want to get sidetracked.

i also needed a clean way to manage the GPU, and what worked best for me was the Nvidia container toolkit. i always prefer containerizing any service i make so the build is easy to reproduce. this was a bit of an extra step, but worth it for a good developer experience.

initially

when i first wanted to test a script that could generate images the way stable diffusion does, i was forced to use python after years away from it. i was familiar with HuggingFace, and they had a new python library they were touting called diffusers that seemed modular enough for my liking.

i set up a proof of concept using diffusers first

test/txt2img.py
import diffusers

once that worked, i had a very basic way to generate images from hardcoded prompts.

using a server

naturally, i then wanted to send dynamic input, possibly from other devices rather than just a cli. to accomplish this, i integrated my project with a flask server at the time since it seemed similar enough.

once i had a server that could take my params and return a pointer to my images, i felt like i could then try integrating it at a basic level.

src/server.py
import flask
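
to make the "take my params and return a pointer" idea concrete, here is a minimal sketch of that request flow. it is not the flask code from the project: it uses python's built-in http.server so it runs with no dependencies, and `fake_generate`, `start_generation`, `OUT_DIR` and the `<id>.png` naming are all made up for illustration. the only things taken from the post are the `prompt` param, the `genId` response field and port 5000.

```python
import json
import threading
import uuid
from http.server import BaseHTTPRequestHandler, HTTPServer
from pathlib import Path

OUT_DIR = Path("outputs")  # hypothetical folder the generator writes into


def fake_generate(prompt: str, out_path: Path) -> None:
    """Stand-in for the real diffusers call: writes a placeholder file."""
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_bytes(prompt.encode("utf-8"))


def start_generation(params: dict, generate=fake_generate, out_dir: Path = OUT_DIR) -> dict:
    """Kick off a generation in the background and return a pointer to it."""
    prompt = params.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt is required")
    gen_id = uuid.uuid4().hex
    out_path = out_dir / f"{gen_id}.png"
    # run in a thread so the http response is not blocked by the slow image work
    threading.Thread(target=generate, args=(prompt, out_path), daemon=True).start()
    return {"genId": gen_id}


class GenerateHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            params = json.loads(self.rfile.read(length) or b"{}")
            status, payload = 200, start_generation(params)
        except (ValueError, AttributeError) as err:
            # bad json, a non-object body, or a missing prompt
            status, payload = 400, {"error": str(err)}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 5000) -> None:
    """Blocks forever; call this from a script to try the sketch."""
    HTTPServer(("127.0.0.1", port), GenerateHandler).serve_forever()
```

calling `serve()` and posting `{"prompt": "a red fox"}` to http://localhost:5000 should return a json body with a `genId`. the caller only ever gets the id back, never the image itself, which is what makes the later polling step necessary.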

my main use case had been generating images from my phone or computer through Discord or public bots, and i wanted a similar experience.

adding a basic ui

since i love astro for quick ideas, i used it to create my ui. it supports a server backend that i could use to make api requests to my flask service and forward the resulting image to the web. it essentially worked like this:

src/pages/api/generate.ts
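import type { APIRoute } from "astro";

// forward the prompt to the flask service and hand the generation id back to the ui
export const POST: APIRoute = async ({ request }) => {
  const { prompt } = await request.json();
  const genRequest = await fetch("http://localhost:5000", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ prompt }),
  });
  const { genId } = await genRequest.json();

  // Response takes the body first, then the status and headers
  return new Response(JSON.stringify({ id: genId }), {
    status: 200,
    headers: { "Content-Type": "application/json" },
  });
};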

my backend would then receive a url to the final image, which it would find using the id. for initial simplicity i just polled until the file existed, and tada.
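
the polling step can be sketched roughly like this. it is written in python here to keep the examples in one language, and `wait_for_image`, the `outputs` folder, the `<id>.png` naming and the timeout values are all assumptions for illustration rather than what the project actually uses.

```python
import time
from pathlib import Path


def wait_for_image(gen_id, out_dir="outputs", timeout=120.0, interval=0.5):
    """Poll until <out_dir>/<gen_id>.png exists.

    Returns the path once the file shows up, or None if the timeout passes.
    """
    target = Path(out_dir) / f"{gen_id}.png"
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if target.exists():
            return target
        time.sleep(interval)  # avoid hammering the filesystem
    return None
```

one caveat with checking for existence: a file can exist before it has been fully written. having the generator write to a temporary name and rename it when done is a common way around that.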

after the proof of concept

since the proof of concept, i’ve iterated heavily on the architecture and it’s pretty robust. there is now a discord bot, a graphql server, and a pubsub queue to manage concurrent requests. it’s turned into a fun production stack to experiment with.
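
to give a feel for why a queue helps with concurrent requests, here is a small in-process stand-in. it is not the pubsub setup from the real stack: it just uses python's `queue.Queue` with a single worker thread so only one job runs at a time, and `run_worker`, `fake_generate` and the job ids are invented for the example.

```python
import queue
import threading


def run_worker(jobs, results, generate):
    """Process jobs one at a time so only one generation runs at once."""
    while True:
        job = jobs.get()
        if job is None:  # sentinel value: shut the worker down
            jobs.task_done()
            break
        gen_id, prompt = job
        try:
            results[gen_id] = ("done", generate(prompt))
        except Exception as err:
            # a failed job should not take the whole worker down
            results[gen_id] = ("failed", str(err))
        finally:
            jobs.task_done()


def fake_generate(prompt):
    """Stand-in for the real image generation call."""
    return f"image for: {prompt}"


jobs = queue.Queue()
results = {}
worker = threading.Thread(
    target=run_worker, args=(jobs, results, fake_generate), daemon=True
)
worker.start()

# requests can arrive at the same time; the queue lines them up in order
for i, prompt in enumerate(["a red fox", "a neon city"]):
    jobs.put((f"job-{i}", prompt))

jobs.join()     # wait until every queued job has been processed
jobs.put(None)  # then ask the worker to stop
worker.join()
print(results)
```

requests from the bot or the web ui only need to enqueue a job and remember its id, while the worker decides when the GPU is actually free. a real pubsub queue adds persistence and lets the worker live on a different machine, which this sketch does not cover.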