Text to image papers
UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs
Paper
• 2311.09257
• Published
• 47
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Paper
• 2312.14125
• Published
• 47
TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones
Paper
• 2312.16862
• Published
• 31
VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM
Paper
• 2401.01256
• Published
• 22
DocGraphLM: Documental Graph Language Model for Information Extraction
Paper
• 2401.02823
• Published
• 36
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
Paper
• 2402.10379
• Published
• 31
Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models
Paper
• 2402.13064
• Published
• 50
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement
Paper
• 2402.14658
• Published
• 83
Design2Code: How Far Are We From Automating Front-End Engineering?
Paper
• 2403.03163
• Published
• 98
SaulLM-7B: A pioneering Large Language Model for Law
Paper
• 2403.03883
• Published
• 90
Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward
Paper
• 2404.01258
• Published
• 12
CameraCtrl: Enabling Camera Control for Text-to-Video Generation
Paper
• 2404.02101
• Published
• 24
LVLM-Intrepret: An Interpretability Tool for Large Vision-Language Models
Paper
• 2404.03118
• Published
• 25
MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation
Paper
• 2404.05674
• Published
• 15
Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs
Paper
• 2404.05719
• Published
• 83
BLINK: Multimodal Large Language Models Can See but Not Perceive
Paper
• 2404.12390
• Published
• 26
Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models
Paper
• 2405.01535
• Published
• 124
AgentInstruct: Toward Generative Teaching with Agentic Flows
Paper
• 2407.03502
• Published
• 51
SpreadsheetLLM: Encoding Spreadsheets for Large Language Models
Paper
• 2407.09025
• Published
• 139
DataDream: Few-shot Guided Dataset Generation
Paper
• 2407.10910
• Published
• 10
Sibyl: Simple yet Effective Agent Framework for Complex Real-world Reasoning
Paper
• 2407.10718
• Published
• 19
Case2Code: Learning Inductive Reasoning with Synthetic Data
Paper
• 2407.12504
• Published
• 8
EVLM: An Efficient Vision-Language Model for Visual Understanding
Paper
• 2407.14177
• Published
• 45
LazyLLM: Dynamic Token Pruning for Efficient Long Context LLM Inference
Paper
• 2407.14057
• Published
• 46
LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA
Paper
• 2409.02897
• Published
• 48
Attention Heads of Large Language Models: A Survey
Paper
• 2409.03752
• Published
• 92
OmniGen: Unified Image Generation
Paper
• 2409.11340
• Published
• 115
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Paper
• 2410.08261
• Published
• 52