Python benchmarks
Emerging, 43 papers using it
First seen: 2018
Papers using Python benchmarks (39)
- Benchmarking LLM for Code Smells Detection: OpenAI GPT-4.0 vs DeepSeek-V3
- ASTER: Natural and Multi-language Unit Test Generation with LLMs
- Unify and Triumph: Polyglot, Diverse, and Self-Consistent Generation of Unit Tests with LLMs
- Marking Code Without Breaking It: Code Watermarking for Detecting LLM-Generated Code
- Babbling Suppression: Making LLMs Greener One Token at a Time
- A framework for assessing the capabilities of code generation of constraint domain-specific languages with large language models
- ChipBench: A Next-Step Benchmark for Evaluating LLM Performance in AI-Aided Chip Design
- Neuron-Guided Interpretation of Code LLMs: Where, Why, and How?
- BRIDGE: Building Representations In Domain Guided Program Synthesis
- GramTrans: A Better Code Representation Approach in Code Generation
- Challenge on Optimization of Context Collection for Code Completion
- Evaluating Large Language Models for Code Translation: Effects of Prompt Language and Prompt Design
- When Retriever Meets Generator: A Joint Model for Code Comment Generation
- Exploring Generalizable Automated Program Repair with Large Language Models
- Benchmarking Large Language Models for Multi-Language Software Vulnerability Detection
- Understanding the Effectiveness of LLMs in Automated Self-Admitted Technical Debt Repayment
- A General Path-Based Representation for Predicting Program Properties
- Big Code != Big Vocabulary: Open-Vocabulary Models for Source Code
- Unsupervised Translation of Programming Languages
- Leveraging Code Generation to Improve Code Retrieval and Summarization via Dual Learning
- I Know What You Are Searching For: Code Snippet Recommendation from Stack Overflow Posts
- Exploiting Method Names to Improve Code Summarization: A Deliberation Multi-Task Learning Approach
- Syntax and Domain Aware Model for Unsupervised Program Translation
- Code Execution with Pre-trained Language Models
- PromSec: Prompt Optimization for Secure Generation of Functional Source Code with Large Language Models (LLMs)
- MMF3: Neural Code Summarization Based on Multi-Modal Fine-Grained Feature Fusion
- A Controlled Experiment on the Energy Efficiency of the Source Code Generated by Code Llama
- CoTran: An LLM-based Code Translator using Reinforcement Learning with Feedback from Compiler and Symbolic Execution
- Are Human Rules Necessary? Generating Reusable APIs with CoT Reasoning and In-Context Learning
- Automated Source Code Generation and Auto-completion Using Deep Learning: Comparing and Discussing Current Language-Model-Related Approaches
- COSEA: Convolutional Code Search with Layer-wise Attention
- CodePlan: Repository-level Coding using LLMs and Planning
- SynCode: LLM Generation with Grammar Augmentation
- CodeFusion: A Pre-trained Diffusion Model for Code Generation
- CodeShell Technical Report
- Generating Adversarial Computer Programs using Optimized Obfuscations
- Automated Transpilation of Imperative to Functional Code using Neural-Guided Program Synthesis (Extended Version)
- TASTY: A Transformer based Approach to Space and Time complexity
- Precision or Peril: A PoC of Python Code Quality from Quantized Large Language Models