DS-1000
Canonical · 16 papers using it
First seen: 2022
DS-1000 is a code generation benchmark of 1,000 data science problems, drawn from StackOverflow questions and spanning seven Python libraries such as NumPy and pandas. It is used to evaluate how well code generation models solve realistic, library-specific programming tasks.
Papers using DS-1000 (16)
- Knowledge-Enhanced Program Repair for Data Science Code
- Deep-Bench: Deep Learning Benchmark Dataset for Code Generation
- DomAgent: Leveraging Knowledge Graphs and Case-Based Reasoning for Domain-Specific Code Generation
- WizardCoder: Empowering Code Large Language Models with Evol-Instruct
- DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation
- SelfEvolve: A Code Evolution Framework via Large Language Models
- Grounding Data Science Code Generation with Input-Output Specifications
- NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts
- Uncovering Weaknesses in Neural Code Generation
- An Empirical Study on Self-correcting Large Language Models for Data Science Code Generation
- Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
- CoCoST: Automatic Complex Code Generation with Online Searching and Correctness Testing
- InfiBench: Evaluating the Question-Answering Capabilities of Code Large Language Models
- XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts
- InverseCoder: Self-improving Instruction-Tuned Code LLMs with Inverse-Instruct
- WizardCoder: Empowering Code Large Language Models with Evol-Instruct