Awesome Reinforcement Learning
Peter Henderson

12 papers · 11545 citations
Most-cited papers
  • On The Opportunities And Risks Of Foundation Models
    2021 · 6272 citations
  • Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!
    2023 · 1086 citations
  • LegalBench: A Collaboratively Built Benchmark For Measuring Legal Reasoning In Large Language Models
    2023 · 365 citations
  • Safety Alignment Should Be Made More Than Just A Few Tokens Deep
    2024 · 360 citations
Topics
  • Safety & Alignment
  • Survey Paper
  • Fine-Tuning
  • Model Architecture
  • Training Techniques
  • RAG
  • Evaluation


© 2026 Awesome Papers.