Yiyuan Zhang

Yiyuan

https://invictus717.github.io/

invictus717

AI & ML interests

None yet

Recent Activity

updated a model 13 days ago

Yiyuan/t2i_dit

published a model 13 days ago

Yiyuan/t2i_dit

upvoted a paper about 1 month ago

OneThinker: All-in-one Reasoning Model for Image and Video

upvoted a paper about 1 month ago

OneThinker: All-in-one Reasoning Model for Image and Video

Paper • 2512.03043 • Published Dec 2, 2025 • 32

upvoted a paper about 2 months ago

NaTex: Seamless Texture Generation as Latent Color Diffusion

Paper • 2511.16317 • Published Nov 20, 2025 • 15

upvoted a paper 2 months ago

FARMER: Flow AutoRegressive Transformer over Pixels

Paper • 2510.23588 • Published Oct 27, 2025 • 58

upvoted 2 papers 4 months ago

Understand Before You Generate: Self-Guided Training for Autoregressive Image Generation

Paper • 2509.15185 • Published Sep 18, 2025 • 29

Transition Models: Rethinking the Generative Learning Objective

Paper • 2509.04394 • Published Sep 4, 2025 • 28

upvoted 2 papers 7 months ago

InterActHuman: Multi-Concept Human Animation with Layout-Aligned Audio Conditions

Paper • 2506.09984 • Published Jun 11, 2025 • 14

Native-Resolution Image Synthesis

Paper • 2506.03131 • Published Jun 3, 2025 • 18

upvoted a paper 8 months ago

Seed1.5-VL Technical Report

Paper • 2505.07062 • Published May 11, 2025 • 154

upvoted a paper 10 months ago

Unleashing Vecset Diffusion Model for Fast Shape Generation

Paper • 2503.16302 • Published Mar 20, 2025 • 43

upvoted 2 papers about 1 year ago

Vision Search Assistant: Empower Vision-Language Models as Multimodal Search Engines

Paper • 2410.21220 • Published Oct 28, 2024 • 11

Scaling Up Your Kernels: Large Kernel Design in ConvNets towards Universal Representations

Paper • 2410.08049 • Published Oct 10, 2024 • 8

upvoted 2 papers over 1 year ago

SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1, 2024 • 120

Explore the Limits of Omni-modal Pretraining at Scale

Paper • 2406.09412 • Published Jun 13, 2024 • 11

upvoted 2 papers almost 2 years ago

InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions

Paper • 2402.03040 • Published Feb 5, 2024 • 19

Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities

Paper • 2401.14405 • Published Jan 25, 2024 • 13
