Awesome Papers

Papers

Verus-SpecGym: An Agentic Environment for Evaluating Specification Autoformalization (2026)
Anmol Agarwal et al.
6.98
AgentAtlas: Beyond Outcome Leaderboards for LLM Agents (2026)
Parsa Mazaheri et al.
3.91
AI Agent for Reverse-Engineering Legacy Finite-Difference Code and Translating to Devito (2026)
Yinghan Hou et al.
0.00
Version Control of Speaker Recognition Systems (2024)
Quan Wang et al.
—
Catch Me If You Can: Blackbox Adversarial Attacks on Automatic Speech Recognition using Frequency Masking (2022)
Xiaoliang Wu et al.
—
ASDF: A Differential Testing Framework for Automatic Speech Recognition Systems (2023)
Daniel Hao Xian Yuen et al.
—
AutoDroid: LLM-powered Task Automation in Android (2024)
Hao Wen et al.
—
ASTER: Automatic Speech Recognition System Accessibility Testing for Stutterers (2023)
Yi Liu et al.
—
Mi-Go: Test Framework which uses YouTube as Data Source for Evaluating Speech Recognition Models like OpenAI's Whisper (2023)
Tomasz Wojnar et al.
—
The Impact of Large Language Models on Open-source Innovation: Evidence from GitHub Copilot (2026)
Doron Yeverechyahu et al.
—
Evolution of IVR building techniques: from code writing to AI-powered automation (2024)
Khushbu Mehboob Shaikh et al.
—
Improving Requirements Classification with SMOTE-Tomek Preprocessing (2026)
Barak Or
—
LoCoML: A Framework for Real-World ML Inference Pipelines (2025)
Kritin Maddireddy et al.
—
Pragmatic Reasoning improves LLM Code Generation (2026)
Zhuchen Cao et al.
—
SCALAR: A Part-of-speech Tagger for Identifiers (2025)
Christian D. Newman et al.
—
Understanding Automated Program Repair Agents Through the Lens of Traceability: An Empirical Study (2026)
Ira Ceka et al.
—
ToolRegistry: A Protocol-Agnostic Tool Management Library for Function-Calling LLMs (2026)
Peng Ding et al.
—
Neutone SDK: An Open Source Framework for Neural Audio Processing (2025)
Christopher Mitcheltree et al.
—
MCPXKIT: The Unified Toolkit for Analyzing Model Context Protocol Security (2026)
Yongjian Guo et al.
—
Metamorphic Testing for Audio Content Moderation Software (2025)
Wenxuan Wang et al.
—
Regression Language Models for Code (2026)
Yash Akhauri et al.
—
A Low-Resource Speech-Driven NLP Pipeline for Sinhala Dyslexia Assistance (2025)
Peshala Perera et al.
—
Which Is Better For Reducing Outdated and Vulnerable Dependencies: Pinning or Floating? (2026)
Imranur Rahman et al.
—
HW/SW Co-design of a PCM/PWM converter: a System Level Approach based in the SpecC Methodology (2025)
Daniel G. P. Petrini and Braz Izaias da Silva Junior
—
SynthTools: A Framework for Scaling Synthetic Tools for Agent Development (2026)
Tommaso Castellani et al.
—
VULPO: Context-Aware Vulnerability Detection via On-Policy LLM Optimization (2026)
Youpeng Li et al.
—
From Signal to Turn: Interactional Friction in Modular Speech-to-Speech Pipelines (2025)
Tittaya Mairittha et al.
—
Revisiting the Reliability of Language Models in Instruction-Following (2026)
Jianshuo Dong et al.
—
RiskBridge: Turning CVEs into Business-Aligned Patch Priorities (2026)
Yelena Mujibur Sheikh et al.
—
Lost in Transcription: How Speech-to-Text Errors Derail Code Understanding (2026)
Jayant Havare et al.
—
Sink or SWIM: Tackling Real-Time ASR at Scale (2026)
Federico Bruzzone et al.
—
HE-SNR: Uncovering Latent Logic via Entropy for Guiding Mid-Training on SWE-bench (2026)
Yueyang Wang et al.
—
MedBeads: An Agent-Native, Immutable Data Substrate for Trustworthy Medical AI (2026)
Takahito Nakajima
—
Scalable Explainability-as-a-Service (XaaS) for Edge AI Systems (2026)
Samaresh Kumar Singh et al.
—
SWE-Adept: An LLM-Based Agentic Framework for Deep Codebase Analysis and Structured Issue Resolution (2026)
Kang He et al.
—
BeyondSWE: Can Current Code Agent Survive Beyond Single-Repo Bug Fixing? (2026)
Guoxin Chen et al.
—
Procedural Refinement by LLM-driven Algorithmic Debugging for ARC-AGI-2 (2026)
Yu-Ning Qiu et al.
—
Coherence Collapse: Diagnosing Why Code Agents Fail After Reaching the Right Code (2026)
Myeongsoo Kim et al.
—
Where Code Meets Natural Language: Taxonomy-Driven Information Flow Analysis for LLM-Integrated Applications (2026)
Zihao Xu et al.
—
ORACLE-SWE: Quantifying the Contribution of Oracle Information Signals on SWE Agents (2026)
Kenan Li et al.
—
sciwrite-lint: Verification Infrastructure for the Age of Science Vibe-Writing (2026)
Sergey V Samsonau
—
The A-R Behavioral Space: Execution-Level Profiling of Tool-Using Language Model Agents in Organizational Deployment (2026)
Shasha Yu et al.
—
SWE-Edit: Rethinking Code Editing for Efficient SWE-Agent (2026)
Yikai Zhang et al.
—
Cryptographic Registry Provenance: Structural Defense Against Dependency Confusion in AI Package Ecosystems (2026)
Alan L. McCann
—
The AI-Native Large-Scale Agile Software Development Manifesto (2026)
Ricardo Britto et al.
—
Tool Calling is Linearly Readable and Steerable in Language Models (2026)
Zekun Wu (University College London) et al.
—
CUDABeaver: Benchmarking LLM-Based Automated CUDA Debugging (2026)
Shiyang Li et al.
—
Shepherd: A Runtime Substrate Empowering Meta-Agents with a Formalized Execution Trace (2026)
Simon Yu et al.
—
AgentLens: Revealing The Lucky Pass Problem in SWE-Agent Evaluation (2026)
Priyam Sahoo et al.
—
A Two-Dimensional Framework for AI Agent Design Patterns: Cognitive Function and Execution Topology (2026)
Jia Huang et al.
—