# Benchmark Results - Overview
Here we collect the results of the living BioChatter benchmark. For an explanation of the benchmark, see the benchmarking documentation; for further reading, see the developer docs.
## Scores per model
The table is sorted by median accuracy in descending order; SD is the standard deviation of accuracy. A sketch of how this aggregation can be reproduced follows the table.
Model name | Size (billion parameters) | Median Accuracy | SD |
---|---|---|---|
gpt-3.5-turbo-0125 | 175 | 0.79 | 0.2 |
gpt-4o-2024-08-06 | Unknown | 0.78 | 0.24 |
claude-3-opus-20240229 | Unknown | 0.77 | 0.28 |
gpt-3.5-turbo-0613 | 175 | 0.76 | 0.21 |
claude-3-5-sonnet-20240620 | Unknown | 0.76 | 0.28 |
llama-3.1-instruct | 70 | 0.73 | 0.29 |
gpt-4-0613 | Unknown | 0.73 | 0.17 |
llama-3.1-instruct | 8 | 0.72 | 0.28 |
gpt-4-turbo-2024-04-09 | Unknown | 0.71 | 0.26 |
gpt-4-0125-preview | Unknown | 0.69 | 0.27 |
gpt-4o-mini-2024-07-18 | Unknown | 0.69 | 0.23 |
gpt-4o-2024-05-13 | Unknown | 0.68 | 0.31 |
llama-3-instruct | 8 | 0.65 | 0.36 |
openhermes-2.5 | 7 | 0.6 | 0.3 |
chatglm3 | 6 | 0.44 | 0.26 |
llama-2-chat | 70 | 0.42 | 0.34 |
mistral-instruct-v0.2 | 7 | 0.4 | 0.33 |
code-llama-instruct | 7 | 0.4 | 0.35 |
code-llama-instruct | 34 | 0.38 | 0.35 |
code-llama-instruct | 13 | 0.38 | 0.33 |
llama-2-chat | 13 | 0.38 | 0.33 |
mixtral-instruct-v0.1 | 46.7 | 0.34 | 0.28 |
llama-2-chat | 7 | 0.31 | 0.3 |
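Below is a minimal sketch of how this aggregation could be reproduced, assuming the raw benchmark scores are available as a long-format CSV with one row per evaluated case; the file name `benchmark_scores.csv` and the column names `model_name`, `size`, and `accuracy` are assumptions for illustration, not the benchmark's actual result layout.

```python
# Minimal sketch, assuming a long-format CSV of raw benchmark scores with
# hypothetical columns "model_name", "size", and "accuracy" (one row per
# evaluated case). File name and column names are assumptions.
import pandas as pd

scores = pd.read_csv("benchmark_scores.csv")

per_model = (
    scores.groupby(["model_name", "size"])["accuracy"]
    .agg(median_accuracy="median", sd="std")  # the two summary columns above
    .round(2)
    .reset_index()
    .sort_values("median_accuracy", ascending=False)  # descending, as in the table
)
print(per_model.to_string(index=False))
```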
## Scores per quantisation
The table is sorted by median accuracy in descending order. Proprietary API models (OpenAI, Anthropic) are not quantised, so their version and quantisation fields are shown as nan. A sketch of the per-quantisation aggregation follows the table.
Model name | Size (billion parameters) | Version | Quantisation | Median Accuracy | SD |
---|---|---|---|---|---|
gpt-3.5-turbo-0125 | 175 | nan | nan | 0.79 | 0.2 |
gpt-4o-2024-08-06 | Unknown | nan | nan | 0.78 | 0.24 |
claude-3-opus-20240229 | Unknown | nan | nan | 0.77 | 0.28 |
claude-3-5-sonnet-20240620 | Unknown | nan | nan | 0.76 | 0.28 |
gpt-3.5-turbo-0613 | 175 | nan | nan | 0.76 | 0.21 |
llama-3.1-instruct | 8 | ggufv2 | Q6_K | 0.74 | 0.28 |
llama-3.1-instruct | 70 | ggufv2 | IQ2_M | 0.74 | 0.29 |
llama-3.1-instruct | 8 | ggufv2 | Q5_K_M | 0.74 | 0.28 |
gpt-4-0613 | Unknown | nan | nan | 0.73 | 0.17 |
llama-3.1-instruct | 70 | ggufv2 | IQ4_XS | 0.73 | 0.29 |
llama-3.1-instruct | 8 | ggufv2 | Q8_0 | 0.72 | 0.3 |
gpt-4-turbo-2024-04-09 | Unknown | nan | nan | 0.71 | 0.26 |
llama-3.1-instruct | 8 | ggufv2 | Q3_K_L | 0.71 | 0.28 |
llama-3.1-instruct | 8 | ggufv2 | Q4_K_M | 0.7 | 0.26 |
llama-3.1-instruct | 8 | ggufv2 | IQ4_XS | 0.69 | 0.28 |
gpt-4-0125-preview | Unknown | nan | nan | 0.69 | 0.27 |
gpt-4o-mini-2024-07-18 | Unknown | nan | nan | 0.69 | 0.23 |
gpt-4o-2024-05-13 | Unknown | nan | nan | 0.68 | 0.31 |
llama-3.1-instruct | 70 | ggufv2 | Q3_K_S | 0.67 | 0.28 |
llama-3-instruct | 8 | ggufv2 | Q8_0 | 0.65 | 0.35 |
llama-3-instruct | 8 | ggufv2 | Q4_K_M | 0.65 | 0.38 |
llama-3-instruct | 8 | ggufv2 | Q6_K | 0.65 | 0.36 |
openhermes-2.5 | 7 | ggufv2 | Q6_K | 0.62 | 0.3 |
llama-3-instruct | 8 | ggufv2 | Q5_K_M | 0.62 | 0.36 |
openhermes-2.5 | 7 | ggufv2 | Q5_K_M | 0.6 | 0.29 |
openhermes-2.5 | 7 | ggufv2 | Q8_0 | 0.6 | 0.3 |
openhermes-2.5 | 7 | ggufv2 | Q4_K_M | 0.6 | 0.3 |
openhermes-2.5 | 7 | ggufv2 | Q3_K_M | 0.56 | 0.3 |
code-llama-instruct | 34 | ggufv2 | Q2_K | 0.5 | 0.33 |
openhermes-2.5 | 7 | ggufv2 | Q2_K | 0.5 | 0.28 |
code-llama-instruct | 7 | ggufv2 | Q3_K_M | 0.49 | 0.31 |
code-llama-instruct | 7 | ggufv2 | Q4_K_M | 0.47 | 0.39 |
mistral-instruct-v0.2 | 7 | ggufv2 | Q5_K_M | 0.46 | 0.34 |
mistral-instruct-v0.2 | 7 | ggufv2 | Q6_K | 0.45 | 0.34 |
code-llama-instruct | 34 | ggufv2 | Q3_K_M | 0.45 | 0.31 |
chatglm3 | 6 | ggmlv3 | q4_0 | 0.44 | 0.26 |
llama-2-chat | 70 | ggufv2 | Q4_K_M | 0.44 | 0.35 |
llama-2-chat | 70 | ggufv2 | Q5_K_M | 0.44 | 0.35 |
code-llama-instruct | 13 | ggufv2 | Q6_K | 0.44 | 0.35 |
code-llama-instruct | 13 | ggufv2 | Q8_0 | 0.44 | 0.33 |
code-llama-instruct | 13 | ggufv2 | Q5_K_M | 0.43 | 0.32 |
llama-2-chat | 70 | ggufv2 | Q3_K_M | 0.41 | 0.33 |
mistral-instruct-v0.2 | 7 | ggufv2 | Q3_K_M | 0.41 | 0.34 |
mistral-instruct-v0.2 | 7 | ggufv2 | Q8_0 | 0.4 | 0.33 |
llama-2-chat | 13 | ggufv2 | Q8_0 | 0.4 | 0.34 |
code-llama-instruct | 7 | ggufv2 | Q8_0 | 0.4 | 0.37 |
code-llama-instruct | 7 | ggufv2 | Q5_K_M | 0.39 | 0.34 |
llama-2-chat | 13 | ggufv2 | Q3_K_M | 0.39 | 0.33 |
llama-2-chat | 13 | ggufv2 | Q5_K_M | 0.39 | 0.33 |
code-llama-instruct | 7 | ggufv2 | Q2_K | 0.38 | 0.29 |
code-llama-instruct | 34 | ggufv2 | Q4_K_M | 0.38 | 0.35 |
code-llama-instruct | 7 | ggufv2 | Q6_K | 0.38 | 0.39 |
code-llama-instruct | 34 | ggufv2 | Q5_K_M | 0.38 | 0.38 |
llama-2-chat | 70 | ggufv2 | Q2_K | 0.38 | 0.35 |
llama-2-chat | 13 | ggufv2 | Q6_K | 0.37 | 0.34 |
code-llama-instruct | 34 | ggufv2 | Q8_0 | 0.37 | 0.35 |
mistral-instruct-v0.2 | 7 | ggufv2 | Q2_K | 0.37 | 0.29 |
mistral-instruct-v0.2 | 7 | ggufv2 | Q4_K_M | 0.37 | 0.35 |
code-llama-instruct | 34 | ggufv2 | Q6_K | 0.37 | 0.36 |
llama-2-chat | 13 | ggufv2 | Q4_K_M | 0.36 | 0.34 |
mixtral-instruct-v0.1 | 46.7 | ggufv2 | Q4_K_M | 0.35 | 0.3 |
mixtral-instruct-v0.1 | 46.7 | ggufv2 | Q5_K_M | 0.34 | 0.31 |
mixtral-instruct-v0.1 | 46.7 | ggufv2 | Q6_K | 0.34 | 0.29 |
llama-2-chat | 7 | ggufv2 | Q5_K_M | 0.34 | 0.29 |
mixtral-instruct-v0.1 | 46.7 | ggufv2 | Q3_K_M | 0.33 | 0.28 |
llama-2-chat | 7 | ggufv2 | Q2_K | 0.33 | 0.32 |
code-llama-instruct | 13 | ggufv2 | Q4_K_M | 0.33 | 0.31 |
mixtral-instruct-v0.1 | 46.7 | ggufv2 | Q8_0 | 0.33 | 0.25 |
mixtral-instruct-v0.1 | 46.7 | ggufv2 | Q2_K | 0.32 | 0.27 |
llama-2-chat | 7 | ggufv2 | Q8_0 | 0.31 | 0.28 |
llama-2-chat | 7 | ggufv2 | Q6_K | 0.31 | 0.29 |
llama-2-chat | 7 | ggufv2 | Q4_K_M | 0.3 | 0.29 |
llama-2-chat | 7 | ggufv2 | Q3_K_M | 0.3 | 0.33 |
llama-2-chat | 13 | ggufv2 | Q2_K | 0.28 | 0.29 |
code-llama-instruct | 13 | ggufv2 | Q2_K | 0.17 | 0.34 |
code-llama-instruct | 13 | ggufv2 | Q3_K_M | 0.15 | 0.34 |
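The per-quantisation view can be obtained from the same assumed long format by extending the grouping keys; `version` and `quantisation` are again hypothetical column names, and `dropna=False` keeps the proprietary models whose version and quantisation are missing.

```python
# Sketch of the per-quantisation aggregation under the same assumed long format.
import pandas as pd

scores = pd.read_csv("benchmark_scores.csv")

per_quant = (
    scores.groupby(
        ["model_name", "size", "version", "quantisation"],
        dropna=False,  # keep proprietary models with missing version/quantisation
    )["accuracy"]
    .agg(median_accuracy="median", sd="std")
    .round(2)
    .reset_index()
    .sort_values("median_accuracy", ascending=False)
)

# Example: compare quantisation levels for a single open-source model.
print(per_quant[per_quant["model_name"] == "openhermes-2.5"])
```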
## Scores of all tasks
This is a wide table; you may need to scroll horizontally to see all columns. The table is sorted by median accuracy in descending order; cells shown as nan have no score for that model/task combination. A sketch of how this wide layout can be derived from per-task scores follows the table.
Full model name | medical_exam | naive_query_generation_using_schema | implicit_relevance_of_multiple_fragments | property_exists | query_generation | relationship_selection | property_selection | explicit_relevance_of_single_fragments | multimodal_answer | end_to_end_query_generation | sourcedata_info_extraction | entity_selection | api_calling | Mean Accuracy | Median Accuracy | SD |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
gpt-3.5-turbo-0125 | 0.671322 | 0.514451 | 0.9 | 0.789474 | 0.953757 | 1 | 0.36747 | 1 | nan | 0.919075 | 0.510032 | 0.978261 | 0.746479 | 0.779193 | 0.789474 | 0.201291 |
gpt-4o-2024-08-06 | 0.850211 | 0.528302 | 0.666667 | 0.179894 | 0.874214 | 1 | 0.425439 | 1 | nan | 0.830189 | 0.711185 | 1 | nan | 0.733282 | 0.781735 | 0.242634 |
claude-3-opus-20240229 | 0.805556 | 0.733333 | 1 | 1 | 0.944444 | 0 | 0.421875 | 0.833333 | nan | 0.655556 | 0.691235 | 1 | nan | 0.73503 | 0.770293 | 0.276411 |
claude-3-5-sonnet-20240620 | 0.7737 | 0.633333 | 1 | 0.866667 | 0.966667 | 0 | 0.375 | 1 | nan | 0.733333 | 0.756088 | 1 | nan | 0.736799 | 0.764894 | 0.283502 |
gpt-3.5-turbo-0613 | nan | 0.5 | 1 | 0.755556 | 0.946667 | 0.5 | 0.3625 | 1 | nan | 0.833333 | 0.575381 | 0.888889 | nan | 0.736233 | 0.755556 | 0.211926 |
llama-3.1-instruct:8:ggufv2:Q6_K | 0.751748 | 0.633333 | 0.833333 | 0.833333 | 0.955556 | 0 | 0.46875 | 1 | nan | 0.733333 | 0.394469 | 1 | nan | 0.69126 | 0.742541 | 0.278015 |
llama-3.1-instruct:70:ggufv2:IQ2_M | 0.772881 | 0.633333 | 1 | 0.916667 | 0.955556 | 0 | 0.328125 | 1 | nan | 0.6 | 0.626498 | 1 | nan | 0.712096 | 0.742489 | 0.293676 |
llama-3.1-instruct:8:ggufv2:Q5_K_M | 0.749117 | 0.7 | 0.833333 | 0.833333 | 0.933333 | 0 | 0.4375 | 1 | nan | 0.733333 | 0.380477 | 1 | nan | 0.690948 | 0.741225 | 0.279278 |
gpt-4-0613 | 0.730912 | 0.682081 | 1 | 0.666667 | 0.959538 | 0.695652 | 0.38253 | 1 | nan | 0.878613 | 0.668903 | 0.920635 | 0.619048 | 0.767048 | 0.730912 | 0.172297 |
llama-3.1-instruct:70:ggufv2:IQ4_XS | 0.822581 | 0.566667 | 1 | 0.75 | 0.955556 | 0 | 0.375 | 1 | nan | 0.6 | 0.699238 | 1 | nan | 0.706276 | 0.728138 | 0.285226 |
llama-3.1-instruct:8:ggufv2:Q8_0 | 0.765734 | 0.660377 | 1 | 0.161765 | 0.937107 | 0.142857 | 0.565789 | 1 | nan | 0.773585 | 0.38907 | 1 | nan | 0.67239 | 0.719062 | 0.295232 |
gpt-4-turbo-2024-04-09 | 0.83701 | 0.508671 | 1 | 0.657143 | 0.83237 | 0.130435 | 0.325301 | 1 | 0.99 | 0.635838 | 0.650369 | 1 | nan | 0.713928 | 0.713928 | 0.262809 |
llama-3.1-instruct:8:ggufv2:Q3_K_L | 0.768421 | 0.622642 | 0.833333 | 0.24 | 0.943396 | 0.142857 | 0.486842 | 1 | nan | 0.811321 | 0.360379 | 1 | nan | 0.655381 | 0.711901 | 0.280463 |
llama-3.1-instruct:8:ggufv2:Q4_K_M | 0.741935 | 0.660377 | 0.833333 | 0.150538 | 0.924528 | 0.285714 | 0.513158 | 1 | nan | 0.735849 | 0.382027 | 0.92 | nan | 0.649769 | 0.698113 | 0.257418 |
llama-3.1-instruct:8:ggufv2:IQ4_XS | 0.756944 | 0.646018 | 0.833333 | 0.439394 | 0.946903 | 0 | 0.387255 | 1 | nan | 0.743363 | 0.414621 | 0.893939 | nan | 0.641979 | 0.69469 | 0.276449 |
gpt-4-0125-preview | 0.777159 | 0.456647 | 0.5 | 0.619048 | 0.83815 | 0.782609 | 0.0301205 | 1 | nan | 0.109827 | 0.689705 | 0.824561 | 0.793651 | 0.618456 | 0.689705 | 0.27372 |
gpt-4o-mini-2024-07-18 | 0.840796 | 0.537572 | 0.5 | 0.52439 | 0.953757 | 0.130435 | 0.388554 | 0.833333 | 0.98 | 0.687861 | 0.684553 | 0.921053 | 0.714286 | 0.668969 | 0.686207 | 0.229321 |
gpt-4o-2024-05-13 | 0.763501 | 0.537572 | 0.7 | 0.526316 | 0.809249 | 0.130435 | 0.0301205 | 1 | 0.96 | 0.115607 | 0.653946 | 1 | 0.809524 | 0.618175 | 0.676973 | 0.312259 |
llama-3.1-instruct:70:ggufv2:Q3_K_S | 0.8 | 0.633333 | 1 | 0.625 | 0.966667 | 0 | 0.375 | 1 | nan | 0.6 | 0.642336 | 1 | nan | 0.694758 | 0.668547 | 0.28438 |
llama-3-instruct:8:ggufv2:Q8_0 | 0.640669 | 0.666667 | 1 | 0.725 | 0.92 | 0 | 0.28125 | 1 | nan | 0 | 0.188555 | 0.875 | nan | 0.572467 | 0.653668 | 0.35454 |
llama-3-instruct:8:ggufv2:Q4_K_M | 0.624884 | 0.666667 | 1 | 0.775 | 0.92 | 0 | 0.109375 | 1 | nan | 0 | 0.116871 | 0.861111 | nan | 0.552173 | 0.645775 | 0.376754 |
llama-3-instruct:8:ggufv2:Q6_K | 0.623955 | 0.666667 | 1 | 0.775 | 0.926667 | 0 | 0.28125 | 1 | nan | 0 | 0.162657 | 0.875 | nan | 0.573745 | 0.645311 | 0.359165 |
openhermes-2.5:7:ggufv2:Q6_K | 0.57423 | 0.557078 | 1 | 0.113379 | 0.890411 | 0.896552 | 0.126404 | 1 | nan | 0.273973 | 0.619167 | 0.7675 | nan | 0.619881 | 0.619524 | 0.300867 |
llama-3-instruct:8:ggufv2:Q5_K_M | 0.635097 | 0.6 | 1 | 0.65 | 0.926667 | 0 | 0.1875 | 1 | nan | 0 | 0.166434 | 0.875 | nan | 0.549154 | 0.617549 | 0.360565 |
openhermes-2.5:7:ggufv2:Q5_K_M | 0.571429 | 0.56621 | 1 | 0.120988 | 0.917808 | 0.896552 | 0.196629 | 1 | nan | 0.26484 | 0.579916 | 0.758808 | nan | 0.624835 | 0.602375 | 0.293641 |
openhermes-2.5:7:ggufv2:Q8_0 | 0.577031 | 0.497717 | 1 | 0.100629 | 0.90411 | 0.896552 | 0.196629 | 1 | nan | 0.237443 | 0.600829 | 0.628118 | nan | 0.603551 | 0.60219 | 0.296591 |
openhermes-2.5:7:ggufv2:Q4_K_M | 0.586368 | 0.479452 | 1 | 0.140921 | 0.894977 | 0.896552 | 0.126404 | 1 | nan | 0.246575 | 0.597281 | 0.66967 | nan | 0.603473 | 0.600377 | 0.299216 |
openhermes-2.5:7:ggufv2:Q3_K_M | 0.563959 | 0.47032 | 0.5 | 0.156805 | 0.917808 | 1 | 0.171348 | 1 | nan | 0.287671 | 0.554488 | 1 | nan | 0.602036 | 0.559223 | 0.301462 |
code-llama-instruct:34:ggufv2:Q2_K | nan | 0.566667 | 1 | 0.75 | 0.686667 | 0.5 | 0 | 0.5 | nan | 0 | nan | 0 | nan | 0.444815 | 0.5 | 0.328199 |
openhermes-2.5:7:ggufv2:Q2_K | 0.539106 | 0.420091 | 0.5 | 0.18638 | 0.917808 | 0.655172 | 0.0168539 | 1 | nan | 0.159817 | 0.444054 | 0.604444 | nan | 0.494884 | 0.497442 | 0.27656 |
code-llama-instruct:7:ggufv2:Q3_K_M | nan | 0.426667 | 0.7 | 0.8 | 0.873333 | 0.25 | 0 | 0.833333 | nan | 0 | nan | 0.5 | nan | 0.487037 | 0.493519 | 0.307716 |
code-llama-instruct:7:ggufv2:Q4_K_M | nan | 0.653333 | 1 | 0.6 | 0.966667 | 0 | 0 | 1 | nan | 0 | 0.138732 | 0.333333 | nan | 0.469207 | 0.469207 | 0.38731 |
mistral-instruct-v0.2:7:ggufv2:Q5_K_M | 0.364146 | 0.466667 | 1 | 0.688889 | 0.826667 | 0 | 0 | 1 | nan | 0 | 0.385754 | 0.444444 | nan | 0.470597 | 0.455556 | 0.34385 |
mistral-instruct-v0.2:7:ggufv2:Q6_K | 0.366947 | 0.433333 | 1 | 0.65 | 0.833333 | 0 | 0.046875 | 1 | nan | 0 | 0.367412 | 0.5 | nan | 0.472536 | 0.452935 | 0.337974 |
code-llama-instruct:34:ggufv2:Q3_K_M | nan | 0.6 | 0.5 | 0.875 | 0.786667 | 0.25 | 0 | 0.5 | nan | 0 | nan | 0 | nan | 0.390185 | 0.445093 | 0.306514 |
chatglm3:6:ggmlv3:q4_0 | 0.426704 | 0.48 | 1 | 0.275 | 0.553333 | 0.4 | 0.2875 | 0.733333 | nan | 0 | 0.188284 | 0.75 | nan | 0.463105 | 0.444905 | 0.260423 |
llama-2-chat:70:ggufv2:Q4_K_M | nan | 0.42 | 1 | 0.755556 | 0.92 | 0.25 | 0 | 1 | nan | 0 | 0.240936 | 0.444444 | nan | 0.503094 | 0.444444 | 0.354692 |
llama-2-chat:70:ggufv2:Q5_K_M | nan | 0.36 | 0.9 | 0.777778 | 0.906667 | 0.25 | 0 | 1 | nan | 0 | 0.210166 | 0.444444 | nan | 0.484905 | 0.444444 | 0.346535 |
code-llama-instruct:13:ggufv2:Q6_K | nan | 0.54 | 0.5 | 0.825 | 0.793333 | 0 | 0 | 0.833333 | nan | 0 | nan | 0 | nan | 0.387963 | 0.443981 | 0.345581 |
code-llama-instruct:13:ggufv2:Q8_0 | nan | 0.566667 | 0.5 | 0.75 | 0.766667 | 0 | 0 | 0.833333 | nan | 0 | nan | 0 | nan | 0.37963 | 0.439815 | 0.334971 |
code-llama-instruct:13:ggufv2:Q5_K_M | nan | 0.566667 | 0.5 | 0.775 | 0.78 | 0 | 0 | 0.666667 | nan | 0 | nan | 0 | nan | 0.36537 | 0.432685 | 0.320506 |
llama-2-chat:70:ggufv2:Q3_K_M | nan | 0.413333 | 0.5 | 0.777778 | 0.906667 | 0 | 0.171875 | 1 | nan | 0 | 0.197898 | 0.333333 | nan | 0.430088 | 0.413333 | 0.327267 |
mistral-instruct-v0.2:7:ggufv2:Q3_K_M | 0.360411 | 0.466667 | 1 | 0.666667 | 0.773333 | 0 | 0.046875 | 1 | nan | 0 | 0.368974 | 0.333333 | nan | 0.456024 | 0.412499 | 0.335885 |
mistral-instruct-v0.2:7:ggufv2:Q8_0 | 0.366947 | 0.433333 | 0.9 | 0.644444 | 0.846667 | 0 | 0.0375 | 1 | nan | 0 | 0.351684 | 0.333333 | nan | 0.446719 | 0.40014 | 0.330107 |
llama-2-chat:13:ggufv2:Q8_0 | 0.431373 | 0.48 | 0.5 | 0.711111 | 0.786667 | 0 | 0 | 1 | nan | 0 | 0.0762457 | 0 | nan | 0.362309 | 0.396841 | 0.335904 |
code-llama-instruct:7:ggufv2:Q8_0 | nan | 0.4 | 0.5 | 0.666667 | 0.96 | 0 | 0 | 1 | nan | 0 | nan | 0 | nan | 0.391852 | 0.395926 | 0.37338 |
code-llama-instruct:7:ggufv2:Q5_K_M | nan | 0.4 | 0.5 | 0.688889 | 0.96 | 0 | 0 | 0.833333 | nan | 0 | nan | 0.111111 | nan | 0.388148 | 0.394074 | 0.340156 |
llama-2-chat:13:ggufv2:Q3_K_M | 0.428571 | 0.48 | 0.5 | 0.733333 | 0.68 | 0 | 0 | 1 | nan | 0 | 0.112631 | 0 | nan | 0.357685 | 0.393128 | 0.325419 |
llama-2-chat:13:ggufv2:Q5_K_M | 0.431373 | 0.433333 | 0.5 | 0.644444 | 0.746667 | 0 | 0 | 1 | nan | 0 | 0.0766167 | 0 | nan | 0.348403 | 0.389888 | 0.32518 |
code-llama-instruct:7:ggufv2:Q2_K | nan | 0.533333 | 0.7 | 0.8 | 0.92 | 0.25 | 0.0625 | 0.333333 | nan | 0 | nan | 0.25 | nan | 0.427685 | 0.380509 | 0.292686 |
code-llama-instruct:34:ggufv2:Q4_K_M | nan | 0.466667 | 0.4 | 0.975 | 0.906667 | 0 | 0 | 0.5 | nan | 0 | nan | 0 | nan | 0.360926 | 0.380463 | 0.350483 |
code-llama-instruct:7:ggufv2:Q6_K | nan | 0.333333 | 0.9 | 0.775 | 0.96 | 0 | 0 | 0.833333 | nan | 0 | nan | 0 | nan | 0.422407 | 0.37787 | 0.391629 |
code-llama-instruct:34:ggufv2:Q5_K_M | nan | 0.466667 | 1 | 0.95 | 0.9 | 0 | 0 | 0.333333 | nan | 0 | nan | 0.125 | nan | 0.419444 | 0.376389 | 0.384096 |
llama-2-chat:70:ggufv2:Q2_K | nan | 0.473333 | 0.5 | 0.666667 | 0.9 | 0 | 0 | 1 | nan | 0 | 0.215047 | 0 | nan | 0.375505 | 0.375505 | 0.352226 |
llama-2-chat:13:ggufv2:Q6_K | 0.428571 | 0.386667 | 0.5 | 0.775 | 0.813333 | 0 | 0 | 1 | nan | 0 | 0.0781337 | 0 | nan | 0.361973 | 0.37432 | 0.342819 |
code-llama-instruct:34:ggufv2:Q8_0 | nan | 0.466667 | 0.9 | 0.925 | 0.86 | 0 | 0 | 0.333333 | nan | 0 | nan | 0.25 | nan | 0.415 | 0.374167 | 0.353285 |
mistral-instruct-v0.2:7:ggufv2:Q2_K | 0.352941 | 0.573333 | 0.5 | 0.6 | 0.693333 | 0 | 0 | 1 | nan | 0 | 0.331261 | 0.222222 | nan | 0.388463 | 0.370702 | 0.294881 |
mistral-instruct-v0.2:7:ggufv2:Q4_K_M | 0.365079 | 0.366667 | 1 | 0.688889 | 0.826667 | 0 | 0 | 1 | nan | 0 | 0.347025 | 0.333333 | nan | 0.447969 | 0.365873 | 0.348328 |
code-llama-instruct:34:ggufv2:Q6_K | nan | 0.473333 | 0.9 | 0.9 | 0.853333 | 0 | 0 | 0.333333 | nan | 0 | nan | 0.125 | nan | 0.398333 | 0.365833 | 0.356636 |
llama-2-chat:13:ggufv2:Q4_K_M | 0.428571 | 0.366667 | 0.5 | 0.777778 | 0.76 | 0 | 0 | 1 | nan | 0 | 0.0888675 | 0 | nan | 0.356535 | 0.361601 | 0.336686 |
mixtral-instruct-v0.1:46_7:ggufv2:Q4_K_M | 0.368814 | 0.426667 | 1 | 0.755556 | 0.76 | 0 | 0.1625 | 0.166667 | nan | 0 | 0.193786 | 0.333333 | nan | 0.378848 | 0.351074 | 0.301567 |
mixtral-instruct-v0.1:46_7:ggufv2:Q5_K_M | 0.352941 | 0.333333 | 1 | 0.711111 | 0.84 | 0.25 | 0 | 0 | nan | 0 | 0.235659 | 0.422222 | nan | 0.376842 | 0.343137 | 0.313874 |
mixtral-instruct-v0.1:46_7:ggufv2:Q6_K | 0.34267 | 0.333333 | 0.7 | 0.85 | 0.826667 | 0.25 | 0 | 0 | nan | 0 | 0.225524 | 0.475 | nan | 0.363927 | 0.338002 | 0.289705 |
llama-2-chat:7:ggufv2:Q5_K_M | 0.40056 | 0.3379 | 0.6 | 0.155556 | 0.547945 | 0 | 0.0337079 | 1 | nan | 0 | 0.0697591 | 0.57037 | nan | 0.3378 | 0.33785 | 0.293711 |
mixtral-instruct-v0.1:46_7:ggufv2:Q3_K_M | nan | 0.38 | 0.5 | 0.777778 | 0.893333 | 0.25 | 0.065625 | 0 | nan | 0 | 0.229622 | 0.333333 | nan | 0.342969 | 0.333333 | 0.278279 |
llama-2-chat:7:ggufv2:Q2_K | 0.369748 | 0.164384 | 1 | 0.324786 | 0.611872 | 0 | 0 | 0.833333 | nan | 0 | 0.0361865 | 0.410256 | nan | 0.340961 | 0.332873 | 0.320018 |
code-llama-instruct:13:ggufv2:Q4_K_M | nan | 0.533333 | 0.5 | 0.775 | 0.833333 | 0 | 0 | 0.333333 | nan | 0 | nan | 0 | nan | 0.330556 | 0.331944 | 0.30939 |
mixtral-instruct-v0.1:46_7:ggufv2:Q8_0 | 0.358543 | 0.386667 | 0.6 | 0.666667 | 0.846667 | 0.25 | 0 | 0.133333 | nan | 0 | 0.189177 | 0.311111 | nan | 0.340197 | 0.325654 | 0.248216 |
mixtral-instruct-v0.1:46_7:ggufv2:Q2_K | 0.329599 | 0.48 | 0.6 | 0.733333 | 0.726667 | 0 | 0 | 0.333333 | nan | 0 | 0.157514 | 0 | nan | 0.305495 | 0.317547 | 0.269925 |
llama-2-chat:7:ggufv2:Q8_0 | 0.40056 | 0.292237 | 0.5 | 0.163399 | 0.589041 | 0.103448 | 0 | 1 | nan | 0 | 0.0847297 | 0.481481 | nan | 0.328627 | 0.310432 | 0.278624 |
llama-2-chat:7:ggufv2:Q6_K | 0.406162 | 0.292237 | 0.5 | 0.177778 | 0.561644 | 0 | 0 | 1 | nan | 0 | 0.0614608 | 0.553846 | nan | 0.323012 | 0.307625 | 0.290181 |
llama-2-chat:7:ggufv2:Q4_K_M | 0.40056 | 0.273973 | 0.5 | 0.251852 | 0.611872 | 0 | 0 | 1 | nan | 0 | 0.0852494 | 0.57037 | nan | 0.335807 | 0.30489 | 0.291028 |
llama-2-chat:7:ggufv2:Q3_K_M | 0.394958 | 0.228311 | 1 | 0.207407 | 0.589041 | 0.103448 | 0.0898876 | 1 | nan | 0 | 0.0650717 | 0.435897 | nan | 0.374002 | 0.301156 | 0.326263 |
llama-2-chat:13:ggufv2:Q2_K | 0.414566 | 0.366667 | 0.5 | 0.288889 | 0.433333 | 0 | 0 | 1 | nan | 0 | 0.0649389 | 0 | nan | 0.278945 | 0.283917 | 0.285171 |
code-llama-instruct:13:ggufv2:Q2_K | nan | 0.566667 | 0.4 | 0.875 | 0.82 | 0 | 0 | 0.0333333 | nan | 0 | nan | 0 | nan | 0.299444 | 0.166389 | 0.336056 |
code-llama-instruct:13:ggufv2:Q3_K_M | nan | 0.533333 | 0 | 0.85 | 0.833333 | 0 | 0 | 0 | nan | 0 | nan | 0.45 | nan | 0.296296 | 0.148148 | 0.336707 |
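Finally, a sketch of how the wide per-task layout above could be derived from the same assumed long format by pivoting. The `full_model_name` and `task` column names are assumptions, and the summary columns are illustrated here as row-wise statistics over the task columns; the benchmark itself may compute them from the underlying per-case scores, so exact values can differ.

```python
# Sketch of deriving the wide per-task table from the assumed long format.
import pandas as pd

scores = pd.read_csv("benchmark_scores.csv")

# One row per fully qualified model (name:size:version:quantisation, assumed
# to be precomputed in a "full_model_name" column), one column per task.
wide = scores.pivot_table(
    index="full_model_name",
    columns="task",
    values="accuracy",
    aggfunc="median",
)

# Summary columns, shown here as row-wise statistics over the task columns
# (NaN tasks are skipped); the benchmark may aggregate the raw scores instead.
task_cols = list(wide.columns)
wide["Mean Accuracy"] = wide[task_cols].mean(axis=1)
wide["Median Accuracy"] = wide[task_cols].median(axis=1)
wide["SD"] = wide[task_cols].std(axis=1)

wide = wide.sort_values("Median Accuracy", ascending=False)
print(wide.round(2).head())
```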