Kai Golan Hashiloni, Shay Cohen, Asaf Shina, Jingyi Yang, Orr Meir Zwebner, Nicola Bajetta, Guy Bilitski, Rebecca Sundén, Guy Maduel, Ryan Conlon, Ari Barzilai, Daniel Mass, Shanshan Jia, Aviv Naaman, Sonam Choden, Sonam Jamtsho, Yadi Qu, Harunaga Isaacson, Dorji Wangchuk, Shai Fine, Orna Almogi, Kfir Bar, 2025.
This collaborative research project between the RUNI and UHH teams developed 13 classification and detection tasks for evaluating language models on Buddhist texts in Sanskrit and Classical Tibetan. The work involved weekly coordination meetings, the development of annotation guidelines, systematic dataset annotation, and the creation of protocols for benchmarking both proprietary and open-weight models.
