Under Review / In Preparation
- Quantifying the Impact of Translation Errors on Multilingual LLM Evaluation.
Under review. (arXiv forthcoming)
2026
- Diagnosing Translated Benchmarks: An Automated Quality Assurance Study of the EU20 Benchmark Suite.
Klaudia-Doris Thellmann, Bernhard Stadler, et al. — LREC 2026. (arXiv forthcoming)
2025
- Teuken-7B-Base & Teuken-7B-Instruct: Towards European LLMs.
Mehdi Ali, Michael Fromm, Klaudia-Doris Thellmann, et al. — EACL 2025.
arXiv: https://arxiv.org/abs/2410.03730
- Towards Multilingual LLM Evaluation for European Languages.
Klaudia-Doris Thellmann, Bernhard Stadler, Michael Fromm, et al. — arXiv (2025).
arXiv: https://arxiv.org/abs/2410.08928
2024
- Investigating Multilingual Instruction-Tuning: Do Polyglot Models Demand for Multilingual Instructions?
Alexander Arno Weber, Klaudia-Doris Thellmann, Jan Ebert, et al. — EMNLP 2024.
DOI: https://doi.org/10.18653/v1/2024.emnlp-main.1159
- Tokenizer Choice For LLM Training: Negligible or Crucial?
Mehdi Ali, Michael Fromm, Klaudia-Doris Thellmann, et al. — Findings of NAACL 2024.
DOI: https://doi.org/10.18653/v1/2024.findings-naacl.247