Image Generation Pipeline (GPU Version)
2026-02-15
1. Overview
The Image Generation Pipeline generates target images from user-provided text, providing image data for subsequent tasks such as image understanding and image editing.
This version uses local GPU models for text-to-image generation, supporting local deployment of models such as FLUX.1-dev.
💡 Tip: If you want to use cloud API models for text-to-image generation, please see Image Generation Pipeline (API Version)
2. Quick Start
Step 1: Create a New DataFlow Working Directory
```bash
mkdir run_dataflow_mm
cd run_dataflow_mm
```
Step 2: Configure Model Path
Configure the model path in the pipeline code. Two methods are supported:
(1) Method 1: Use a Hugging Face model path (auto-download)
```python
hf_model_name_or_path = "black-forest-labs/FLUX.1-dev"
```
(2) Method 2: Use a local model path (already-downloaded model)
```python
hf_model_name_or_path = "/path/to/your/local/FLUX.1-dev"
```
Modify the hf_model_name_or_path parameter of LocalImageGenServing in text_to_image_generation_pipeline.py:
```python
self.serving = LocalImageGenServing(
    image_io=ImageIO(save_path=image_save_path),
    batch_size=4,
    hf_model_name_or_path="black-forest-labs/FLUX.1-dev",  # Model path
    hf_cache_dir="./cache_local",     # Hugging Face model cache directory
    hf_local_dir="./ckpt/models/",    # Local model storage directory
    diffuser_num_inference_steps=20,  # Inference steps; adjust to balance speed and quality
    diffuser_image_height=512,        # Generated image height
    diffuser_image_width=512,         # Generated image width
)
```
Step 3: Prepare Text Data
We use jsonl files to store text data, with one sample per line. Here is a simple example of input data:
```jsonl
{"conversations": [{"content": "a fox darting between snow-covered pines at dusk", "role": "user"}]}
{"conversations": [{"content": "a kite surfer riding emerald waves under a cloudy sky", "role": "user"}]}
```
The conversations field holds a list of dialogue turns describing the image to generate; the content field of each turn is the text prompt.
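If you need to build the input file programmatically, a minimal sketch is shown below. The file name and prompt strings are placeholders; only the record layout (a conversations list with content and role fields) matches the format above.

```python
import json

# Placeholder prompts; replace with your own image descriptions.
prompts = [
    "a fox darting between snow-covered pines at dusk",
    "a kite surfer riding emerald waves under a cloudy sky",
]

# Write one JSON object per line (jsonl), matching the expected schema.
with open("prompts.jsonl", "w", encoding="utf-8") as f:
    for text in prompts:
        record = {"conversations": [{"content": text, "role": "user"}]}
        f.write(json.dumps(record, ensure_ascii=False) + "\n")
```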
Step 4: Run the Pipeline
```bash
python dataflow/statics/gpu_pipelines/text_to_image_generation_pipeline.py
```
Generated files are saved by default under the ./cache_local/text2image_local directory.
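After the run finishes, you can sanity-check the outputs by scanning the cache directory. This is an optional sketch: it assumes the cache path and file prefix from the storage configuration, and the exact cache file names produced by FileStorage may differ.

```python
import glob
import json
import os

# Assumed cache location and prefix from the pipeline's FileStorage config.
cache_files = sorted(
    glob.glob("./cache_local/text2image_local/dataflow_cache_step*.jsonl")
)

# For each cached record, report whether every generated image file exists.
for path in cache_files:
    with open(path, encoding="utf-8") as f:
        for line in f:
            record = json.loads(line)
            for image_path in record.get("images", []):
                status = "ok" if os.path.exists(image_path) else "MISSING"
                print(status, image_path)
```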
3. Data Flow and Pipeline Logic
1. Input Data
The input data for this pipeline includes the following fields:
- conversations: Dialogue format data containing text prompts.
This input data is stored in jsonl files and managed and read through the FileStorage object:
```python
self.storage = FileStorage(
    first_entry_file_name="<your_jsonl_file_path>",
    cache_path="./cache_local/text2image_local",
    file_name_prefix="dataflow_cache_step",
    cache_type="jsonl"
)
```
2. Text-to-Image Generation (PromptedImageGenerator)
The core step of the pipeline uses the PromptedImageGenerator operator with a local GPU model to generate an image for each text prompt.
Features:
- Generate images from text prompts using local GPU models (e.g., FLUX.1-dev)
- Support configuration of inference steps, image dimensions, and other parameters
- Adjustable batch size to optimize GPU utilization
- Automatically save generated images to specified paths
Input: Dialogue format data (containing text prompts)
Output: Generated image file paths
Local GPU Service Configuration:
```python
self.serving = LocalImageGenServing(
    image_io=ImageIO(save_path=image_save_path),  # Image save path
    batch_size=4,                                 # Batch size
    hf_model_name_or_path="black-forest-labs/FLUX.1-dev",  # Model path
    hf_cache_dir="./cache_local",     # Hugging Face model cache directory
    hf_local_dir="./ckpt/models/",    # Local model storage directory
    diffuser_num_inference_steps=20,  # Diffusion model inference steps
    diffuser_image_height=512,        # Generated image height
    diffuser_image_width=512,         # Generated image width
)
```
Operator Initialization:
```python
self.text_to_image_generator = PromptedImageGenerator(
    t2i_serving=self.serving,  # Text-to-image service
    save_interval=10           # Save interval
)
```
Operator Execution:
```python
self.text_to_image_generator.run(
    storage=self.storage.step(),
    input_conversation_key="conversations",  # Input dialogue field
    output_image_key="images",               # Output image field
)
```
3. Output Data
Finally, the output data generated by the pipeline will include the following:
- conversations: Original dialogue data (containing text prompts)
- images: List of generated image file paths
Output Data Example:
```jsonl
{"conversations": [{"content": "a fox darting between snow-covered pines at dusk", "role": "user"}], "images": ["./cache_local/text2image_local/sample0_condition0/sample0_condition0_0.png"]}
```
4. Pipeline Example
Below is a complete text-to-image generation pipeline using a local FLUX model:
```python
from pathlib import Path

from dataflow.operators.core_vision import PromptedImageGenerator
from dataflow.serving.local_image_gen_serving import LocalImageGenServing
from dataflow.utils.storage import FileStorage
from dataflow.io import ImageIO


class ImageGenerationPipeline():
    def __init__(self):
        current_file = Path(__file__).resolve()
        project_root = current_file.parent.parent.parent.parent.parent
        prompts_file = project_root / "dataflow" / "example" / "image_gen" / "text2image" / "prompts.jsonl"

        # -------- Storage Configuration --------
        self.storage = FileStorage(
            first_entry_file_name=str(prompts_file),
            cache_path="./cache_local/text2image_local",
            file_name_prefix="dataflow_cache_step",
            cache_type="jsonl"
        )

        image_save_path = str(project_root / "cache_local" / "text2image_local")

        # -------- Local GPU Image Generation Service --------
        self.serving = LocalImageGenServing(
            image_io=ImageIO(save_path=image_save_path),
            batch_size=4,
            hf_model_name_or_path="black-forest-labs/FLUX.1-dev",  # Or a local model path
            hf_cache_dir="./cache_local",
            hf_local_dir="./ckpt/models/",
            diffuser_num_inference_steps=20,
            diffuser_image_height=512,
            diffuser_image_width=512,
        )

        # -------- Text-to-Image Generation Operator --------
        self.text_to_image_generator = PromptedImageGenerator(
            t2i_serving=self.serving,
            save_interval=10
        )

    def forward(self):
        # Call PromptedImageGenerator to generate images
        self.text_to_image_generator.run(
            storage=self.storage.step(),
            input_conversation_key="conversations",
            output_image_key="images",
        )


if __name__ == "__main__":
    # -------- Pipeline Entry Point --------
    model = ImageGenerationPipeline()
    model.forward()
```
