Upload images and get enterprise-grade dataset preparation: intelligent multi-API captioning and checkpoint-specific optimization for SDXL, Flux, SD-1.5, Qwen-Image, and more. Trainer-ready exports with complete control over caption quality and style.
Built from the ground up for professional LoRA training with intelligence at every step.
Switch between Gemini, OpenAI, and Grok vision APIs. Optimize for cost (Gemini at ~$0.14 per 1,000 images) or quality. The tool recommends an API for your checkpoint type, and you can override it at any time.
Model-specific captioning strategies for SDXL, Flux, SD-1.5, Qwen-Image, and WAN-2.2. Each checkpoint receives tailored prompts for maximum training effectiveness.
Fine-tune quality levels, temperature settings, token limits, and caption scoring. Create perfectly balanced captions for character, style, concept, and product LoRAs.
Files are automatically renamed for Kohya format with bucket/repeat awareness. Compatible with all popular training applications. No manual renaming needed.
Generates a trainer-friendly .txt file for each image. Captions use goal-aligned phrasing and consistent token ordering, and are ready for immediate training use.
Full transparency into every step of the process. See API responses, caption generation, and scoring. Debug and optimize with confidence.
Automatically avoid captioning "baked features" like hair/eye color to prevent training conflicts.
Optimized prompting for artistic styles, visual concepts, and aesthetic training.
Detailed captioning for product design, interior architecture, and technical subjects.
Precise garment descriptions and style metadata for fashion-focused training.
Automatic pose detection and cinematic element tagging for dynamic subjects.
Handle multiple model types in one dataset with automatic optimization per checkpoint.
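The "baked features" idea for character LoRAs can be pictured as a filter over comma-separated caption tags. The sketch below is only an illustration under stated assumptions: the function name `strip_baked_features` and the keyword pattern are hypothetical, and the tool's actual mechanism is not described here.

```python
import re

# Hypothetical pattern for attributes treated as fixed traits of the character.
# Word boundaries keep unrelated words such as "hairpin" from matching.
BAKED_PATTERN = re.compile(r"\b(hair|eyes?)\b", re.IGNORECASE)


def strip_baked_features(caption: str) -> str:
    """Drop comma-separated tags that mention hair or eyes; keep the rest in order."""
    tags = [tag.strip() for tag in caption.split(",")]
    kept = [tag for tag in tags if tag and not BAKED_PATTERN.search(tag)]
    return ", ".join(kept)


if __name__ == "__main__":
    print(strip_baked_features("woman, long blonde hair, blue eyes, red jacket"))
```

A keyword filter like this is crude: it would also drop variable details such as a changed hairstyle, so a real pipeline would likely need a more careful rule.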
Gemini is the most cost-effective option at ~$0.14 per 1,000 images. OpenAI costs ~$2.16 per 1,000 images, and Grok ~$4.05 per 1,000 images. The tool recommends the best API for your checkpoint type, but you can override it manually. For budget-conscious workflows, Gemini is unbeatable; for maximum quality, try OpenAI.
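As a rough illustration of the arithmetic behind these figures, the sketch below estimates captioning cost from the approximate per-1,000-image rates quoted above. The name `estimate_cost` and the rate table are hypothetical and not part of the tool; real provider pricing can vary with image size and model.

```python
# Approximate rates in USD per 1,000 images, taken from the figures above.
RATES_PER_1K = {"gemini": 0.14, "openai": 2.16, "grok": 4.05}


def estimate_cost(provider: str, image_count: int) -> float:
    """Return the estimated captioning cost in USD, rounded to cents."""
    if image_count < 0:
        raise ValueError("image_count must be non-negative")
    rate = RATES_PER_1K[provider.lower()]  # KeyError for unknown providers
    return round(rate * image_count / 1000, 2)


if __name__ == "__main__":
    for name in RATES_PER_1K:
        print(f"{name}: ${estimate_cost(name, 1000):.2f} per 1,000 images")
```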
The tool includes optimized configurations for SDXL, Flux, SD-1.5, Qwen-Image, and WAN-2.2. Each checkpoint receives model-specific captioning strategies and prompt engineering. If you need support for additional checkpoints, you can customize the settings or reach out with a feature request.
The tool exports in Kohya format with proper filename patterns for bucket and repeat awareness. Files are compatible with all popular training applications including Kohya, AI Toolkit, and Civitai Trainer. Each image gets a matching .txt caption file ready for immediate training.
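Kohya-style trainers conventionally read the repeat count from the image folder name, in the form `<repeats>_<name>`, and expect a .txt caption with the same base name next to each image. The sketch below assumes that layout; `export_kohya` and the sequential file naming are hypothetical, since the tool's exact filename pattern is not specified here.

```python
from pathlib import Path
import shutil


def export_kohya(pairs, out_dir, repeats, concept):
    """Copy images into '<repeats>_<concept>/' and write one .txt caption per image.

    pairs: iterable of (image_path, caption) tuples.
    Returns the list of caption file paths that were written.
    """
    target = Path(out_dir) / f"{repeats}_{concept}"
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for index, (image_path, caption) in enumerate(pairs, start=1):
        src = Path(image_path)
        stem = f"{concept}_{index:04d}"  # hypothetical naming scheme
        shutil.copy2(src, target / f"{stem}{src.suffix.lower()}")
        caption_file = target / f"{stem}.txt"
        caption_file.write_text(caption.strip() + "\n", encoding="utf-8")
        written.append(caption_file)
    return written
```

For example, `export_kohya([("photo.png", "a red chair")], "dataset", 10, "chair")` would create `dataset/10_chair/chair_0001.png` alongside `dataset/10_chair/chair_0001.txt`, provided `photo.png` exists.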
Yes. The advanced caption controls let you adjust quality levels (standard, detailed, expert), temperature for creative variation, token limits for conciseness, and quality/token scoring to balance detail vs. efficiency. You can also set different controls per LoRA training goal (character, style, concept, etc.).
Your images are sent to your chosen API (Gemini, OpenAI, or Grok) for captioning, and that processing is governed by your API provider's terms. The tool itself doesn't store or log your images; apart from those API calls, everything runs in your browser session.
Start building professional LoRA datasets today. No signup required.