Collections

Collections including paper arxiv:2505.19297

Text2Image - Dataset (how to filter bad image)

Alchemist: Turning Public Text-to-Image Data into Generative Gold

Paper • 2505.19297 • Published May 25, 2025 • 84

One-Minute Video Generation with Test-Time Training

Paper • 2504.05298 • Published Apr 7, 2025 • 110

MoCha: Towards Movie-Grade Talking Character Synthesis

Paper • 2503.23307 • Published Mar 30, 2025 • 138

Towards Understanding Camera Motions in Any Video

Paper • 2504.15376 • Published Apr 21, 2025 • 155

Antidistillation Sampling

Paper • 2504.13146 • Published Apr 17, 2025 • 59

Generation Quality Enhancement

VMix: Improving Text-to-Image Diffusion Model with Cross-Attention Mixing Control

Paper • 2412.20800 • Published Dec 30, 2024 • 11

Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models

Paper • 2501.06751 • Published Jan 12, 2025 • 32

Inference-Time Scaling for Diffusion Models beyond Scaling Denoising Steps

Paper • 2501.09732 • Published Jan 16, 2025 • 71

Learnings from Scaling Visual Tokenizers for Reconstruction and Generation

Paper • 2501.09755 • Published Jan 16, 2025 • 35

Gen AI Diffusion

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Paper • 2410.10306 • Published Oct 14, 2024 • 56

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

Paper • 2411.05003 • Published Nov 7, 2024 • 71

TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation

Paper • 2411.04709 • Published Nov 5, 2024 • 26

IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Paper • 2410.07171 • Published Oct 9, 2024 • 43

📊 Dataset and 🏆 checkpoints for paper 📝 "Alchemist: Turning Public Text-to-Image Data into Generative Gold"

Alchemist: Turning Public Text-to-Image Data into Generative Gold

Paper • 2505.19297 • Published May 25, 2025 • 84

yandex/alchemist

Viewer • Updated Jun 6, 2025 • 3.35k • 171 • 48

yandex/stable-diffusion-3.5-large-alchemist

Text-to-Image • Updated May 16, 2025 • 7 • 9

yandex/stable-diffusion-3.5-medium-alchemist

Text-to-Image • Updated May 16, 2025 • 2 • 6

CoRAG: Collaborative Retrieval-Augmented Generation

Paper • 2504.01883 • Published Apr 2, 2025 • 9

VL-Rethinker: Incentivizing Self-Reflection of Vision-Language Models with Reinforcement Learning

Paper • 2504.08837 • Published Apr 10, 2025 • 43

Mavors: Multi-granularity Video Representation for Multimodal Large Language Model

Paper • 2504.10068 • Published Apr 14, 2025 • 30

xVerify: Efficient Answer Verifier for Reasoning Model Evaluations

Paper • 2504.10481 • Published Apr 14, 2025 • 85

Data and other things

MegaPairs: Massive Data Synthesis For Universal Multimodal Retrieval

Paper • 2412.14475 • Published Dec 19, 2024 • 55

How to Synthesize Text Data without Model Collapse?

Paper • 2412.14689 • Published Dec 19, 2024 • 52

Token-Budget-Aware LLM Reasoning

Paper • 2412.18547 • Published Dec 24, 2024 • 46

WavePulse: Real-time Content Analytics of Radio Livestreams

Paper • 2412.17998 • Published Dec 23, 2024 • 11
