HumanEval
Emerging
25 papers using it
First seen: 2024
Papers using HumanEval (25)
- How Generation Architecture Shapes Code Complexity in Multi-Agent LLM Systems: A Paired Study on HumanEval
- FASE: Fast Adaptive Semantic Entropy for Code Quality
- Smarter Saboteurs, Better Fixers: Scaling & Security in Linear Multi-Agent Workflows
- Strategies for Guiding LLMs to Use Software Design Patterns: A Case of Singleton
- Poison with Style: A Practical Poisoning Attack on Code Large Language Models
- Honest Lying: Understanding Memory Confabulation in Reflexive Agents
- Leveraging Metamemory Mechanisms for Enhanced Data-Free Code Generation in LLMs
- G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks
- Voting Protocols as Coordination Mechanisms for Role-Constrained Multi-Agent Tutoring Systems
- Safesieve: From Heuristics To Experience In Progressive Pruning For Llm-based Multi-agent Communication
- OmniCode: A Benchmark for Evaluating Software Engineering Agents
- R2V Agent: Teaching SLMs When to Ask for Help
- TDPGen: Optimizing Agentic Code Generation via Test-Driven Planning and Hierarchical ReAct
- Adaptive Confidence Gating In Multi-agent Collaboration For Efficient And Optimized Code Generation
- Graph-of-Agents: A Graph-based Framework for Multi-Agent LLM Collaboration
- CARD: Towards Conditional Design of Multi-agent Topological Structures
- MAR:Multi-Agent Reflexion Improves Reasoning Abilities in LLMs
- A Multi-Agent Framework for Stateful Inference-Time Search
- AgentGroupChat-V2: Divide-and-Conquer Is What LLM-Based Multi-Agent System Need
- Rethinking Verification For LLM Code Generation: From Generation To Testing
- Nexus: A Lightweight and Scalable Multi-Agent Framework for Complex Tasks Automation
- Guided Code Generation with LLMs: A Multi-Agent Framework for Complex Code Tasks
- FALCON: Feedback-driven Adaptive Long/short-term memory reinforced Coding Optimization system
- Divide-and-Conquer Meets Consensus: Unleashing the Power of Functions in Code Generation
- CPL: Critical Plan Step Learning Boosts LLM Generalization in Reasoning Tasks