Harold Chen

Harold328

https://haroldchen19.github.io/

HaroldChen19

AI & ML interests

Computer Vision

Recent Activity

updated a dataset about 1 hour ago

Harold328/0124

published a dataset about 1 hour ago

Harold328/0124

upvoted a paper 2 days ago

Rethinking Video Generation Model for the Embodied World

Organizations

None yet

upvoted 2 papers 2 days ago

Rethinking Video Generation Model for the Embodied World

Paper • 2601.15282 • Published 3 days ago • 41

Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published 6 days ago • 163

upvoted a paper 4 days ago

Future Optical Flow Prediction Improves Robot Control & Video Generation

Paper • 2601.10781 • Published 9 days ago • 19

upvoted 4 papers 7 days ago

Inference-time Physics Alignment of Video Generative Models with Latent World Models

Paper • 2601.10553 • Published 9 days ago • 12

Action100M: A Large-scale Video Action Dataset

Paper • 2601.10592 • Published 9 days ago • 27

Molmo2: Open Weights and Data for Vision-Language Models with Video Understanding and Grounding

Paper • 2601.10611 • Published 9 days ago • 26

CoF-T2I: Video Models as Pure Visual Reasoners for Text-to-Image Generation

Paper • 2601.10061 • Published 10 days ago • 30

upvoted a paper 10 days ago

ShowUI-π: Flow-based Generative Models as GUI Dexterous Hands

Paper • 2512.24965 • Published 24 days ago • 41

upvoted 4 papers 12 days ago

VideoAR: Autoregressive Video Generation via Next-Frame & Scale Prediction

Paper • 2601.05966 • Published 15 days ago • 23

AgentOCR: Reimagining Agent History via Optical Self-Compression

Paper • 2601.04786 • Published 16 days ago • 28

The Molecular Structure of Thought: Mapping the Topology of Long Chain-of-Thought Reasoning

Paper • 2601.06002 • Published 15 days ago • 49

Thinking with Map: Reinforced Parallel Map-Augmented Agent for Geolocalization

Paper • 2601.05432 • Published 16 days ago • 160

upvoted 2 papers 15 days ago

Plenoptic Video Generation

Paper • 2601.05239 • Published 16 days ago • 12

Choreographing a World of Dynamic Objects

Paper • 2601.04194 • Published 17 days ago • 13

upvoted a paper 17 days ago

GARDO: Reinforcing Diffusion Models without Reward Hacking

Paper • 2512.24138 • Published 25 days ago • 29

upvoted a paper 19 days ago

Taming Hallucinations: Boosting MLLMs' Video Understanding via Counterfactual Video Generation

Paper • 2512.24271 • Published 25 days ago • 62

upvoted a paper 29 days ago

Spatia: Video Generation with Updatable Spatial Memory

Paper • 2512.15716 • Published Dec 17, 2025 • 33

liked a model 29 days ago

FastVideo/FastWan2.2-TI2V-5B-FullAttn-Diffusers

Text-to-Video • Updated Nov 25, 2025 • 13.1k • 59
