Oct 12, 2023
Many thanks for this. This morning our team was discussing exactly this issue and the need to verify it, and voilà, you've already done it. Why do you think MMLU created this confusing test format? We figured it was because the combined format would reduce the likelihood that a model had been trained on the question.