Yuqing Yang
20 papers · 1568 citations
Most-cited papers
- LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression · 2023 · 407 citations
- MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention · 2024 · 309 citations
- LLMLingua: Compressing Prompts for Accelerated Inference of Large Language Models · 2023 · 227 citations
- Parrot: Efficient Serving of LLM-based Applications with Semantic Variable · 2024 · 105 citations
- RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval · 2024 · 1 citation
Topics