University of Texas at Austin

university

Verified

https://www.utexas.edu

cotran2

authored a paper 5 months ago

Arch-Router: Aligning LLM Routing with Human Preferences

Paper • 2506.16655 • Published Jun 19, 2025 • 17

gdhe17

authored 2 papers 8 months ago

Noise Contrastive Alignment of Language Models with Explicit Rewards

Paper • 2402.05369 • Published Feb 8, 2024 • 2

Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models

Paper • 2405.04233 • Published May 7, 2024 • 3

ajd12342

authored a paper 8 months ago

Rhapsody: A Dataset for Highlight Detection in Podcasts

Paper • 2505.19429 • Published May 26, 2025 • 1

gdhe17

authored a paper 8 months ago

Self Forcing: Bridging the Train-Test Gap in Autoregressive Video Diffusion

Paper • 2506.08009 • Published Jun 9, 2025 • 30

ajd12342

authored a paper 10 months ago

Scaling Rich Style-Prompted Text-to-Speech Datasets

Paper • 2503.04713 • Published Mar 6, 2025 • 1

gdhe17

authored 2 papers 11 months ago

Direct Discriminative Optimization: Your Likelihood-Based Visual Generative Model is Secretly a GAN Discriminator

Paper • 2503.01103 • Published Mar 3, 2025 • 5

RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers

Paper • 2502.15894 • Published Feb 21, 2025 • 20

XCLiu

authored a paper 12 months ago

TripoSG: High-Fidelity 3D Shape Synthesis using Large-Scale Rectified Flow Models

Paper • 2502.06608 • Published Feb 10, 2025 • 39

ajd12342

authored 6 papers about 1 year ago

Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks

Paper • 2411.05361 • Published Nov 8, 2024 • 4

Textless Speech-to-Speech Translation With Limited Parallel Data

Paper • 2305.15405 • Published May 24, 2023

Why is Winoground Hard? Investigating Failures in Visuolinguistic Compositionality

Paper • 2211.00768 • Published Nov 1, 2022

Continual Learning for On-Device Speech Recognition using Disentangled Conformers

Paper • 2212.01393 • Published Dec 2, 2022 • 1

Reduce and Reconstruct: ASR for Low-Resource Phonetic Languages

Paper • 2010.09322 • Published Oct 19, 2020

Multilingual and code-switching ASR challenges for low resource Indian languages

Paper • 2104.00235 • Published Apr 1, 2021

gdhe17

authored a paper about 1 year ago

Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis

Paper • 2312.03491 • Published Dec 6, 2023 • 34

XCLiu

authored a paper about 1 year ago

JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation

Paper • 2411.07975 • Published Nov 12, 2024 • 31

XCLiu

authored 3 papers over 1 year ago

Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation

Paper • 2410.13848 • Published Oct 17, 2024 • 35

SlimFlow: Training Smaller One-Step Diffusion Models with Rectified Flow

Paper • 2407.12718 • Published Jul 17, 2024

Consistency Flow Matching: Defining Straight Flows with Velocity Consistency

Paper • 2407.02398 • Published Jul 2, 2024 • 18