Llama
Emerging (14 papers using it)
First seen: 2024
Papers using Llama (14)
- Reasoning-preserved Efficient Distillation of Large Language Models via Activation-aware Initialization
- GAMMA: Global Bit Allocation for Mixed-Precision Models under Arbitrary Budgets
- Access Sets Matter: Budgeting Expert Reads for Scalable Weight-Space Model Merging
- 1+1>2: A Synergistic Sparse and Low-Rank Compression Method for Large Language Models
- Chunks as Arms: Multi-Armed Bandit-Guided Sampling for Long-Context LLM Preference Optimization
- PhantomHunter: Detecting Unseen Privately-Tuned LLM-Generated Text via Family-Aware Learning
- TransMLA: Multi-head Latent Attention Is All You Need
- Quartet: Native FP4 Training Can Be Optimal for Large Language Models
- Spectral Scaling Laws in Language Models: How Effectively Do Feed-Forward Networks Use Their Latent Space?
- ARMOR: High-Performance Semi-Structured Pruning via Adaptive Matrix Factorization
- Accelerating Large Language Model Reasoning via Speculative Search
- SlimLLM: Accurate Structured Pruning for Large Language Models
- SlimGPT: Layer-wise Structured Pruning for Large Language Models
- DAQ: Density-Aware Post-Training Weight-Only Quantization For LLMs