Text-Guided Image Editing
22 papers with code • 0 benchmarks • 0 datasets
Editing images using text prompts.
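Most of the methods listed below build on text-conditional diffusion models steered with classifier-free guidance: at each denoising step, the model's unconditional noise prediction is pushed toward its text-conditioned one. A minimal sketch of that guidance arithmetic, using random arrays as stand-ins for the model's outputs (the `guided_noise` helper and shapes are illustrative, not any particular paper's code):

```python
import numpy as np

def guided_noise(eps_uncond, eps_text, guidance_scale):
    """Classifier-free guidance: move the predicted noise away from the
    unconditional estimate, in the direction of the text-conditional one."""
    return eps_uncond + guidance_scale * (eps_text - eps_uncond)

rng = np.random.default_rng(0)
eps_u = rng.normal(size=(4, 4))  # stand-in for the unconditional U-Net output
eps_t = rng.normal(size=(4, 4))  # stand-in for the text-conditioned output

# Typical editing setups use a scale well above 1 to amplify the text signal.
eps = guided_noise(eps_u, eps_t, guidance_scale=7.5)
```

At scale 1 this reduces to the plain text-conditional prediction, and at scale 0 to the unconditional one; editing methods tune the scale (and often add extra guidance terms) to trade prompt fidelity against faithfulness to the source image.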
Benchmarks
These leaderboards track progress in text-guided image editing.
Most implemented papers
InstructEdit: Improving Automatic Masks for Diffusion-based Image Editing With User Instructions
In this work, we propose a framework termed InstructEdit that can do fine-grained editing based on user instructions.
MagicBrush: A Manually Annotated Dataset for Instruction-Guided Image Editing
To address this issue, we introduce MagicBrush (https://osu-nlp-group.github.io/MagicBrush/), the first large-scale, manually annotated dataset for instruction-guided real image editing that covers diverse scenarios: single-turn, multi-turn, mask-provided, and mask-free editing.
Masked-Attention Diffusion Guidance for Spatially Controlling Text-to-Image Generation
To address this issue, we propose masked-attention guidance, which can generate images more faithful to semantic masks via indirect control of attention to each word and pixel by manipulating noise images fed to diffusion models.
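The core idea of restricting each word's attention to its assigned spatial region can be illustrated with a toy cross-attention step. This is a simplified sketch, not the paper's implementation: the `masked_cross_attention` helper and the tiny 4-pixel example are invented for illustration.

```python
import numpy as np

def masked_cross_attention(q, k, region_mask):
    """Toy cross-attention where a binary spatial mask forbids pixels
    from attending to words outside their semantic region.

    q: (n_pixels, d) pixel queries
    k: (n_words, d) word keys
    region_mask: (n_pixels, n_words), 1 where attention is allowed
    """
    scores = q @ k.T / np.sqrt(q.shape[1])            # (n_pixels, n_words)
    scores = np.where(region_mask > 0, scores, -1e9)  # suppress forbidden pairs
    scores -= scores.max(axis=1, keepdims=True)       # numerically stable softmax
    weights = np.exp(scores)
    weights /= weights.sum(axis=1, keepdims=True)
    return weights

# 4 pixels, 2 words; each word is confined to two pixels of the "image".
rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))
k = rng.normal(size=(2, 8))
mask = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])
w = masked_cross_attention(q, k, mask)
```

After masking and renormalization, each pixel's attention weight concentrates entirely on the word(s) permitted by its region, which is the lever such methods use for spatial control.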
Forgedit: Text Guided Image Editing via Learning and Forgetting
Text-guided image editing on real or synthetic images, given only the original image itself and the target text prompt as inputs, is a very general and challenging task.
CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation
Our conditional-task learning and distillation approach outperforms previous distillation methods, achieving a new state-of-the-art in producing high-quality images with very few steps (e.g., 1-4) across multiple tasks, including super-resolution, text-guided image editing, and depth-to-image generation.
DeltaSpace: A Semantic-aligned Feature Space for Flexible Text-guided Image Editing
Based on DeltaSpace, we propose a novel framework called DeltaEdit, which maps the CLIP visual feature differences to the latent space directions of a generative model during the training phase, and predicts the latent space directions from the CLIP textual feature differences during the inference phase.
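The delta-mapping idea can be sketched in a few lines: learn a map from feature differences to latent edit directions on image pairs, then reuse it at inference with text-feature differences, relying on CLIP's (approximately) aligned image-text space. Everything below is a toy illustration under strong assumptions: the relationship is taken to be exactly linear, a least-squares fit stands in for DeltaEdit's learned network, and all features are random stand-ins rather than real CLIP embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: assume CLIP feature deltas relate linearly to latent-space
# edit directions through an unknown matrix W_true.
d_clip, d_latent, n_pairs = 16, 8, 200
W_true = rng.normal(size=(d_clip, d_latent))

# "Training": visual feature differences between image pairs, paired with
# the latent directions that produced those edits.
delta_visual = rng.normal(size=(n_pairs, d_clip))
delta_latent = delta_visual @ W_true

# Fit a linear mapper (a stand-in for the learned network) by least squares.
W_fit, *_ = np.linalg.lstsq(delta_visual, delta_latent, rcond=None)

# "Inference": a textual feature difference (target text minus source text)
# is mapped to a latent edit direction without any per-edit training.
delta_text = rng.normal(size=(d_clip,))
edit_direction = delta_text @ W_fit
```

The point of the construction is that no paired text supervision is needed at training time: image-pair deltas train the mapper, and the aligned feature space lets text deltas drive it at inference.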
Lasagna: Layered Score Distillation for Disentangled Object Relighting
Although generative editing methods now enable some forms of image editing, relighting is still beyond today's capabilities; existing methods struggle to keep other aspects of the image -- colors, shapes, and textures -- consistent after the edit.
Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models
While several powerful distillation methods were recently proposed, the overall quality of student samples is typically lower compared to the teacher ones, which hinders their practical usage.
SpecRef: A Fast Training-free Baseline of Specific Reference-Condition Real Image Editing
To increase user freedom, we propose a new task called Specific Reference Condition Real Image Editing, which allows the user to provide a reference image to further control the outcome, such as replacing an object with a particular one.
Diffusion Model-Based Image Editing: A Survey
In this survey, we provide an exhaustive overview of existing methods using diffusion models for image editing, covering both theoretical and practical aspects in the field.