In a Training Loop 🔄

Yasunori Ozaki PRO

alfredplpl

https://alfredplpl.github.io/en/index.html

AI & ML interests

Computer Vision, LLM

Recent Activity

liked a model about 14 hours ago

stabilityai/stable-audio-3-small-sfx

liked a model about 14 hours ago

stabilityai/stable-audio-3-small-sfx-base

liked a model about 14 hours ago

stabilityai/stable-audio-3-medium

Organizations

upvoted a paper 6 days ago

Asymmetric Flow Models

Paper • 2605.12964 • Published 9 days ago • 21

upvoted a paper 7 days ago

Qwen-Image-VAE-2.0 Technical Report

Paper • 2605.13565 • Published 9 days ago • 58

upvoted a paper 9 days ago

Qwen-Image-2.0 Technical Report

Paper • 2605.10730 • Published 11 days ago • 106

upvoted a paper 10 days ago

STARFlow2: Bridging Language Models and Normalizing Flows for Unified Multimodal Generation

Paper • 2605.08029 • Published 14 days ago • 11

upvoted 2 papers 13 days ago

Continuous-Time Distribution Matching for Few-Step Diffusion Distillation

Paper • 2605.06376 • Published 15 days ago • 26

Continuous Latent Diffusion Language Model

Paper • 2605.06548 • Published 15 days ago • 78

upvoted a collection 14 days ago

SenseNova-U1

SenseNova-U1: Unifying Multimodal Understanding and Generation with NEO-Unify Architecture • 8 items • Updated 6 days ago • 66

upvoted 2 collections 17 days ago

GenLIP

Model weights of paper "Let ViT Speak: Generative Language-Image Pre-training" • 6 items • Updated 16 days ago • 5

imabari-dialect-models

Imabari dialect models • 6 items • Updated 29 days ago • 2

upvoted a paper 23 days ago

World-R1: Reinforcing 3D Constraints for Text-to-Video Generation

Paper • 2604.24764 • Published 25 days ago • 118

upvoted a collection 24 days ago

MiMo-V2.5

4 items • Updated 24 days ago • 86

upvoted a paper 24 days ago

AVControl: Efficient Framework for Training Audio-Visual Controls

Paper • 2603.24793 • Published Mar 25 • 28

upvoted a collection 25 days ago

MiDashengLM-7B-1021

4 items • Updated Oct 27, 2025 • 2

upvoted a collection 27 days ago

DeepSeek-V4

4 items • Updated 27 days ago • 651

upvoted a collection 29 days ago

Qwen3.6

4 items • Updated 29 days ago • 368

upvoted 2 collections about 1 month ago

MOSS-Audio

An open-source audio understanding model supporting speech recognition, environmental sound analysis, music understanding, time-aware QA, and complex • 7 items • Updated 19 days ago • 55

AI Images

Collection of AI-generated/assisted images • 14 items • Updated Oct 18, 2025 • 3

upvoted 3 papers about 1 month ago

HY-World 2.0: A Multi-Modal World Model for Reconstructing, Generating, and Simulating 3D Worlds

Paper • 2604.14268 • Published Apr 15 • 121

Lyra 2.0: Explorable Generative 3D Worlds

Paper • 2604.13036 • Published Apr 14 • 41

Seedance 2.0: Advancing Video Generation for World Complexity

Paper • 2604.14148 • Published Apr 15 • 162