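| # | Model | Average Score | Paper |
|---|-------|---------------|-------|
| 1 | GPT-4 | 69.60 | - |
| 2 | GPT-3.5-Turbo | 62.60 | - |
| 3 | Mistral-7B-v0.1 | 58.10 | - |
| 4 | Zephyr-7B-beta | 57.70 | - |
| 5 | Vicuna-13B-v1.5 | 57.30 | - |
| 6 | Mistral-7B-Instruct-v0.1 | 55.00 | - |
| 7 | Llama-2-13B | 54.10 | - |
| 8 | Vicuna-7B-v1.5 | 53.00 | - |
| 9 | Llama-2-7B | 50.60 | - |
| 10 | Llama-2-13B-Chat | 45.00 | - |
| 11 | Llama-2-7B-Chat | 44.60 | - |
| 12 | Falcon-7B | 39.40 | - |
| 13 | Falcon-7B-Instruct | 37.50 | - |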
CTIBench (Average) Leaderboard