DreamPixel Forge: Local AI Image Generator


Announcing DreamPixelForge, a cross-platform GUI for running Stable Diffusion models locally. Multi-model support, LLM prompt enhancement via Ollama, GPU acceleration, and app icon generation.

Aditya Bawankule

AI · Python · Stable Diffusion · Ollama · Open Source

DreamPixelForge is a desktop GUI for running Stable Diffusion models locally. It supports SD 1.5, 2.1, XL, Dreamlike Diffusion, Kandinsky 2.2, Pony Diffusion V6 XL, and custom CivitAI models, all managed from a single interface with built-in LLM prompt enhancement via Ollama. The full source is on GitHub.


Why Build Another Stable Diffusion Frontend?

ComfyUI is powerful but node-based. Great for complex workflows, bad for someone who just wants to type a prompt and get an image. Automatic1111 covers the basics but feels bolted together, and switching between model architectures (say, SD 1.5 to Kandinsky) usually means different UIs or manual configuration.

I wanted something simpler: one app where you pick a model from a dropdown, type a prompt, and hit generate. No nodes, no YAML files, no hunting for the right command-line flags. And I wanted the whole thing to stay local: no cloud API keys, no data leaving your machine.


Choosing Tkinter (Yes, Really)

I went with Tkinter for the GUI. The obvious question is why not Electron or a web-based frontend. The answer is dependency weight. DreamPixelForge already pulls in PyTorch, diffusers, transformers, and model weights that can be multiple gigabytes. Adding Electron on top of that felt wrong. Tkinter ships with Python, zero extra dependencies, and it runs on Windows, macOS, and Linux without any platform-specific packaging for the UI layer.

The trade-off is obvious: Tkinter looks dated, and advanced layouts are painful. But for a form-based app (text fields, dropdowns, buttons, image preview), it does the job. The time I didn't spend fighting Electron went into features that actually matter.


Dealing with VRAM Across Models

The hardest part of supporting multiple model architectures in one app is VRAM. SD 1.5 runs comfortably on 4GB of VRAM. SDXL wants 8GB+. Kandinsky sits somewhere in between. If you load an SDXL model on a 4GB card with the same settings you'd use for SD 1.5, you get an out-of-memory crash with no useful error message.

DreamPixelForge handles this with per-model configuration: each model gets its own resolution presets, default negative prompts, and VRAM warnings. When you switch models, the UI updates to show sensible defaults for that architecture. If you import a custom model from CivitAI, the app inspects the file to detect whether it's SD 1.5, SD 2.1, or SDXL and applies the right configuration automatically.
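A per-model configuration registry can be as simple as a lookup table. The sketch below is a minimal version of that idea; the preset values, key names, and the `config_for` helper are hypothetical stand-ins, not the app's actual numbers:

```python
# Hypothetical per-architecture presets; DreamPixelForge's real values differ.
MODEL_CONFIGS = {
    "sd15": {"resolutions": [(512, 512), (512, 768), (768, 512)],
             "negative_prompt": "lowres, bad anatomy, watermark",
             "min_vram_gb": 4},
    "sd21": {"resolutions": [(768, 768)],
             "negative_prompt": "lowres, watermark",
             "min_vram_gb": 6},
    "sdxl": {"resolutions": [(1024, 1024), (896, 1152), (1152, 896)],
             "negative_prompt": "lowres, watermark",
             "min_vram_gb": 8},
}

def config_for(arch: str, vram_gb: float) -> dict:
    """Return the preset for an architecture, plus a VRAM warning flag
    so the UI can warn before loading rather than crash mid-generation."""
    cfg = dict(MODEL_CONFIGS[arch])
    cfg["vram_warning"] = vram_gb < cfg["min_vram_gb"]
    return cfg
```

Switching models in the dropdown then reduces to one dictionary lookup, and the same table drives both the resolution presets and the pre-load VRAM warning.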

For GPU acceleration, it's CUDA on Windows and Linux, Metal Performance Shaders on Apple Silicon. The app detects the available backend at startup and configures PyTorch accordingly. On machines with no supported GPU, it falls back to CPU. Slow, but it works.
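The startup check described above can be sketched in a few lines; this is a minimal version under my own assumptions, not the app's exact detection code:

```python
def pick_device() -> str:
    """Pick the best available PyTorch backend, preferring GPU paths."""
    try:
        import torch
    except ImportError:        # no PyTorch installed: nothing to accelerate
        return "cpu"
    if torch.cuda.is_available():                  # NVIDIA on Windows/Linux
        return "cuda"
    mps = getattr(torch.backends, "mps", None)     # Metal on Apple Silicon
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"                                   # slow but universal fallback
```

A diffusers pipeline would then be moved with `pipe.to(pick_device())`, typically with half-precision weights on CUDA to cut VRAM use.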


The Ollama Prompt Enhancement Pipeline

This is the feature I'm most satisfied with. Most people don't write good image generation prompts. They type something like “a cat in space” and get mediocre results, then assume the model is bad.

DreamPixelForge connects to a locally running Ollama instance and uses whatever LLM you have installed to rewrite prompts. It works in two modes: tag conversion (turning a natural language sentence into comma-separated tags optimized for the selected model) and creative expansion (telling the LLM to build on your idea and add detail). The app queries Ollama's API to list available models, so you pick which LLM to use from a dropdown, no configuration files.
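The model-listing and rewrite calls map onto two Ollama HTTP endpoints: `GET /api/tags` for installed models and `POST /api/generate` for the rewrite itself. A sketch using only the standard library follows; the instruction wording in `TAG_SYSTEM_PROMPT` is illustrative, not the app's actual text:

```python
import json
import urllib.request

# Illustrative instruction for tag-conversion mode; the app's wording differs.
TAG_SYSTEM_PROMPT = (
    "Rewrite the user's description as comma-separated image generation "
    "tags. Reply with the tags only."
)

def list_models(base_url: str = "http://localhost:11434") -> list:
    """Ask a running Ollama instance which models are installed."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        data = json.load(resp)
    return [m["name"] for m in data.get("models", [])]

def build_enhance_request(model: str, user_prompt: str) -> dict:
    """Payload for a one-shot (non-streaming) /api/generate call."""
    return {
        "model": model,
        "prompt": f"{TAG_SYSTEM_PROMPT}\n\n{user_prompt}",
        "stream": False,
    }
```

If `list_models` raises a connection error, Ollama isn't running, which is exactly the signal the UI uses to hide the enhancement options.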

The key decision was keeping this entirely local. Cloud-based prompt enhancement would be simpler to implement, but it defeats the purpose of a privacy-focused local tool. Ollama makes this painless: if it's running, the feature is available; if not, the UI simply hides the enhancement options.


App Icon Generation

This started as a personal itch. I kept generating icons for side projects and then manually resizing them in an image editor, applying corner radii, and exporting at every required size for iOS and Android. So I built it into the app.

The icon preset configures square output (512x512 or 1024x1024 depending on model), 25 inference steps, guidance scale 7.0, a batch size of 4 for options, and a negative prompt tuned to avoid text artifacts. After generation, a post-processing step applies corner radii with proper transparency, generates all platform-required sizes, and names files according to iOS/Android conventions.
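The export step boils down to downscaling one square image to a table of platform sizes and file names. The tables below follow the usual Android density and iOS icon conventions, but they are my illustration of the idea, not the app's exact export list:

```python
# Standard Android launcher icon sizes per density bucket (pixels).
ANDROID_DENSITIES = {"mdpi": 48, "hdpi": 72, "xhdpi": 96,
                     "xxhdpi": 144, "xxxhdpi": 192}

# A few common iOS icon slots (60pt at 2x/3x, plus the App Store icon).
IOS_ICONS = {"Icon-App-60x60@2x.png": 120,
             "Icon-App-60x60@3x.png": 180,
             "Icon-App-1024x1024@1x.png": 1024}

def export_plan() -> list:
    """Flatten both platform tables into (relative path, pixel size) pairs
    that a resize loop can walk after the rounded corners are applied."""
    plan = [(f"mipmap-{d}/ic_launcher.png", px)
            for d, px in ANDROID_DENSITIES.items()]
    plan += list(IOS_ICONS.items())
    return plan
```

The corner-radius pass itself is a Pillow mask operation (draw a rounded rectangle into an alpha channel), run once at full resolution before the downscaling loop so edges stay clean at every size.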


What I'd Do Differently

The prompt enhancement pipeline works surprisingly well. It's the single feature that most improves output quality for casual users. But if I were starting over, I'd split the model management into a separate background service instead of handling downloads and loading in the GUI thread. Right now, downloading a multi-gigabyte model blocks the UI in ways that Tkinter's threading model makes awkward to fix cleanly.
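The fix I have in mind is the standard worker-thread-plus-queue pattern: the download runs off the main thread and reports progress through a `queue.Queue`, which the UI drains on a timer. The sketch below simulates the download and drives the poll with a plain loop where Tkinter would use `root.after`:

```python
import queue
import threading
import time

def download_model(progress: queue.Queue) -> None:
    """Stand-in for a multi-gigabyte download, reporting percent complete."""
    for pct in (25, 50, 75, 100):
        time.sleep(0.01)      # simulated network work
        progress.put(pct)

progress = queue.Queue()
threading.Thread(target=download_model, args=(progress,), daemon=True).start()

# In Tkinter this poll would be rescheduled via root.after(100, poll) so the
# main loop stays responsive; a plain loop stands in for the UI timer here.
seen = []
while len(seen) < 4:
    try:
        seen.append(progress.get(timeout=1.0))
    except queue.Empty:
        break
```

The same pattern extends naturally to a separate background service: only the transport changes (a local socket or pipe instead of an in-process queue), while the UI-side polling stays identical.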

The project is open source. Installation instructions and setup details are in the README.