Publications
You can also find my articles on my Google Scholar profile.
Monographs
Book Chapters
- Biemann, C., Bontcheva, K., Eckart de Castilho, R., Gurevych, I., Yimam, S.M. (2017): Collaborative Web-based Tools for Multi-layer Text Annotation. In: N. Ide and J. Pustejovsky (Eds.): Handbook of Linguistic Annotation, Springer (pdf)
Journal Publications
- Ayalew Kassahun and Seid Muhie Yimam and Yonas Seifu Muanenda and Beshir Melkaw Ali and Seleshi Getahun Yalew (2024): Uncovering the priorities of scientific research on sustainable development goals: A case study in Ethiopia, Sustainable Development, published by ERP Environment and John Wiley & Sons Ltd. 2024;1–26, DOI: 10.1002/sd.3020 (pdf)
- Jana, A., Venkatesh, G., Yimam, S.M., and Biemann, C., Hypernymy Detection for Low-Resource Languages: A Study for Hindi, Bengali, and Amharic, ACM Transactions on Asian and Low-Resource Language Information Processing (2022). (pdf)
- Yimam, S.M., Biemann, C., Majnaric, L., Šabanović, Š., Holzinger, A. (2016): An adaptive annotation approach for biomedical entity and relation recognition. Brain Informatics, (online first), 10.1007/s40708-016-0036-4 (pdf)
- Yimam, S.M., Ayele, A.A., Venkatesh, G., Gashaw, I., Biemann, C. (2021): Introducing Various Semantic Models for Amharic: Experimentation and Evaluation with Multiple Tasks and Datasets. Future Internet 2021, 13, 275. (pdf) https://doi.org/10.3390/fi13110275
Conference Proceedings
- Sewunetie, W., Beza, A., Abebe, H., Abuhay, T. M., Admass, W., Hassen, H., Haile, T., Hailemariam, H., Debebe, L., Moges, N., Bekele, N., Tilahun, S. L., Berta, M., Mammo, M., Yimam, S. M., and Laszlo, K. (2024): Large Language Models for Sexual, Reproductive, and Maternal Health Rights. 2024 IEEE 12th International Conference on Healthcare Informatics (ICHI). (pdf)
- Azime, I. A., Tonja, A. L., Belay, T. D., Fuge, M. Y., Wassie, A. K., Jada, E. S., Chanie, Y., Sewunetie, W. T., and Yimam, S. M. (2024): Walia-LLM: Enhancing Amharic-LLaMA by Integrating Task-Specific and Generative Datasets. EMNLP 2024, Miami, Florida, USA. (pdf)
- Abinew Ali Ayele, Nikolay Babakov, Janek Bevendorff, Xavier Bonet Casals, Berta Chulvi, Daryna Dementieva, Ashaf Elnagar, Dayne Freitag, Maik Fröbe, Damir Korenčić, Maximilian Mayerl, Daniil Moskovskiy, Animesh Mukherjee, Alexander Panchenko, Martin Potthast, Francisco Rangel, Naquee Rizwan, Paolo Rosso, Florian Schneider, Alisa Smirnova, Efstathios Stamatatos, Elisei Stakovskii, Benno Stein, Mariona Taulé, Dmitry Ustalov, Xintong Wang, Matti Wiegmann, Seid Muhie Yimam, Eva Zangerle. (authors are listed in alphabetical order) (2024): Overview of PAN 2024: Multi-Author Writing Style Analysis, Multilingual Text Detoxification, Oppositional Thinking Analysis, and Generative AI Authorship Verification, 15th International Conference of the Cross-Language Evaluation Forum for European Languages (CLEF 2024), Grenoble, France (pdf)
- Nedjma Ousidhoum, Shamsuddeen Hassan Muhammad, Mohamed Abdalla, Idris Abdulmumin, Ibrahim Said Ahmad, Sanchit Ahuja, Alham Fikri Aji, Vladimir Araujo, Abinew Ali Ayele, Pavan Baswani, Meriem Beloucif, Chris Biemann, Sofia Bourhim, Christine de Kock, Genet Shanko Dekebo, Oumaima Hourrane, Gopichand Kanumolu, Lokesh Madasu, Samuel Rutunda, Manish Shrivastava, Thamar Solorio, Nirmal Surange, Hailegnaw Getaneh Tilaye, Krishnapriya Vishnubhotla, Genta Indra Winata, Seid Muhie Yimam, Saif M. Mohammad (2024): SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 13 Languages. Findings of the Association for Computational Linguistics (ACL 2024), Bangkok Thailand. (pdf)
- Atnafu Lambebo Tonja, Israel Abebe Azime, Tadesse Destaw Belay, Mesay Gemeda Yigezu, Moges Ahmed Mehamed, Abinew Ali Ayele, Ebrahim Chekol Jibril, Michael Melese Woldeyohannis, Olga Kolesnikova, Philipp Slusallek, Dietrich Klakow and Yimam, S.M. (2024): EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation, The 2024 Joint International Conference on Computational Linguistics, Language and Evaluation (LREC-COLING 2024, Torino, Italy) (pdf)
- Ayele A.A., Yimam, S.M., Belay T.D., Asfaw T. and Biemann C. (2023): Exploring Amharic Hate Speech Data Collection and Classification Approaches, in the 14th Conference RECENT ADVANCES IN NATURAL LANGUAGE PROCESSING, Varna, Bulgaria (pdf)
- Ayele A.A., Dinter S., Yimam, S.M. and Biemann C. (2023): Multilingual Racial Hate Speech Detection Using Transfer Learning, in the 14th Conference RECENT ADVANCES IN NATURAL LANGUAGE PROCESSING, Varna, Bulgaria (pdf)
- Schneider, F., Yimam, S.M., Petersen-Frey, F., Biemann, C., von Nordheim, G., Kleinen-von Königslöw, K. (2023): CodeAnno: Extending WebAnno with Hierarchical Document Level Annotation and Automation. The 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023), System Demonstrations Track, Dubrovnik, Croatia (pdf)
- Belay T.D., Tonja A.L., Kolesnikova O., Yimam S. M., Ayele A.A., Haile S.B., Sidorov G., Gelbukh A. (2022): The Effect of Normalization for Bi-directional Amharic-English Neural Machine Translation, International Conference on Information and Communication Technology for Development for Africa (ICT4DA 2022), Bahir Dar, Ethiopia (pdf)
- Ayele A.A., Belay T.D., Asfaw, T.T., Dinter S., Yimam S. M., Biemann, C. (2022): The 5Js in Ethiopia: Amharic Hate Speech Data Annotation Using Toloka Crowdsourcing Platform, International Conference on Information and Communication Technology for Development for Africa (ICT4DA 2022), Bahir Dar (pdf)
- Remus S., Wiedemann G., Anwar S., Petersen-Frey F., Yimam S. M., Biemann C. (2022), More Like This: Semantic Retrieval with Linguistic Information, In Proceedings of the 18th Conference on Natural Language Processing (KONVENS 2022), pages 156–166, Potsdam, Germany (pdf).
- Beloucif M., Yimam, S.M., Stahlhacke S. and Biemann C. (2022): Elvis vs. M. Jackson: Who has More Albums? Classification and Identification of Elements in Comparative Question. In the 2022 International Conference on Language Resources and Evaluation (LREC 2022), Marseille, France (pdf).
- Belay, T. D., Ayele, A.A., Gelaye, G., Yimam, S.M., Biemann, C. (2021): Impacts of Homophone Normalization on Semantic Models for Amharic. Proceedings of the Third International Conference on ICT for Development for Africa (ICT4DA 2021), Bahir Dar, Ethiopia (pdf)
- von Boguszewski, N., Moin, S., Bhowmick, A., Yimam, S.M., Biemann, C. (2021): How Hateful are Movies? A Study and Prediction on Movie Subtitles. Proceedings of KONVENS, Düsseldorf, Germany (pdf)
- Wiechmann, M., Yimam S. M., Biemann, C. (2021): ActiveAnno: General-Purpose Document-Level Annotation Tool with Active Learning Integration. The 2021 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies - System Demonstrations, Mexico City, Mexico (online) (pdf)
- Gooding, S., Kochmar, E., Yimam S. M., Biemann, C. (2021): Word Complexity is in the Eye of the Beholder. The 2021 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT), Mexico City, Mexico. (pdf)
- Mathew, B., Saha, P., Yimam S. M., Biemann, C., Goyal, P., Mukherjee, A. (2021): HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection. Proceedings of AAAI-21, Virtual Conference. (pdf)
- Haase, C., Anwar, S., Yimam S. M., Friedrich, A., Biemann, C. (2021): SCoT: Sense Clustering over Time: a tool for the analysis of lexical change. The 2021 Conference of the European Chapter of the Association for Computational Linguistics - System Demonstrations. Kyiv, Ukraine (Online) (pdf)
- Yimam S. M., Alemayehu H. M., Ayele A. A. and Biemann C. (2020): Exploring Amharic Sentiment Analysis from Social Media Texts: Building Annotation Tools and Classification Models. The 28th International Conference on Computational Linguistics (COLING 2020), Barcelona, Spain (pdf) (poster)
- Yimam S. M., Venkatesh, G., Lee, J., Biemann, C. (2020): Automatic Compilation of Resources for Academic Writing and Evaluating with Informal Word Identification and Paraphrasing System, The International Conference on Language Resources and Evaluation (LREC 2020), Marseille, France. (pdf)
- Wiedemann G., Yimam S.M., and Biemann C. (2018): A Multilingual Information Extraction Pipeline for Investigative Journalism. In Proceedings of 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018). Brussels, Belgium (pdf)
- Yimam S. M., Biemann C. (2018): Demonstrating Par4Sem - A Semantic Writing Aid with Adaptive Paraphrasing. In Proceedings of 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP 2018). Brussels, Belgium (pdf).
- Wiedemann G., Yimam S.M., and Biemann C. (2018): New/s/leak 2.0 – Multilingual Information Extraction and Visualization for Investigative Journalism. In: Proceedings of the 10th International Conference on Social Informatics (SocInfo 2018). St. Petersburg, Russia (pdf)
- Yimam S.M, Biemann C. (2018): Par4Sim – Adaptive Paraphrasing for Text Simplification. In Proceedings of The 27th International Conference on Computational Linguistics (COLING 2018). Santa Fe, New Mexico, USA (pdf).
- Yimam S.M, Štajner S., Riedl M., Biemann C. (2017): CWIG3G2 - Complex Word Identification Task across Three Text Genres and Two User Groups. In Proceedings of The 8th International Joint Conference on Natural Language Processing (IJCNLP 2017). Taipei, Taiwan (pdf)
- Yimam S.M, Štajner S., Riedl Martin, Biemann C. (2017): Multilingual and Cross-Lingual Complex Word Identification. In Proceedings of The 2017 International Conference on Recent Advances in Natural Language Processing (RANLP). Varna, Bulgaria (pdf)
- Yimam, S.M., Ulrich, H., von Landesberger, T., Rosenbach, M., Regneri, M., Panchenko, A., Lehmann, F., Fahrer, U., Biemann, C. and Ballweg, K. (2016): new/s/leak – Information Extraction and Visualization for Investigative Data Journalists. ACL 2016 Demo Session, Berlin, Germany (pdf)
- Yimam, S.M., Biemann, C., Majnaric, L., Šabanović, Š., Holzinger, A. (2015): Interactive and Iterative Annotation for Biomedical Entity Recognition, International Conference on Brain Informatics and Health (BIH’15), London, UK (pdf)
- Benikova, D., Yimam, S.M., Biemann C. (2015). GermaNER: Free Open German Named Entity Recognition Tool. In: Proceedings of the GSCL 2015. Essen, Germany (pdf)
- Yimam, S.M., Eckart de Castilho, R., Gurevych, I., Biemann C. (2014): Automatic Annotation Suggestions and Custom Annotation Layers in WebAnno. Proceedings of ACL 2014 System Demonstrations, Baltimore, MD, USA (pdf)
- Yimam, S.M., Gurevych, I., Eckart de Castilho, R., and Biemann C. (2013): WebAnno: A Flexible, Web-based and Visually Supported System for Distributed Annotations. Proceedings of ACL-2013, demo session, Sofia, Bulgaria (pdf)
Workshop Proceedings
- Sewunetie, W., Tonja, A., Belay, T., Nigatu, H. H., Gebremeskel, G., Mossie, Z., Seid, H., and Yimam, S. (2024): Gender Bias Evaluation in Machine Translation for Amharic, Tigrigna, and Afaan Oromoo. Proceedings of the 2nd International Workshop on Gender-Inclusive Translation Technologies, Sheffield, United Kingdom. European Association for Machine Translation (EAMT). (pdf)
- Daryna Dementieva, Daniil Moskovskiy, Nikolay Babakov, Abinew Ali Ayele, Naquee Rizwan, Florian Schneider, Xintong Wang, Seid Muhie Yimam, Dmitry Ustalov, Elisei Stakovskii, Alisa Smirnova, A Elnagar, Animesh Mukherjee, Alexander Panchenko. (2024): Overview of the Multilingual Text Detoxification Task at PAN 2024, Working Notes of CLEF 2024, Grenoble, France (pdf)
- Melese Ayichlie Jigar, Abinew Ali Ayele, Yimam, S.M. and Chris Biemann (2024): Detecting Hate Speech in Amharic Using Multimodal Analysis of Social Media Memes. Proceedings of The Fourth Workshop on Threat, Aggression & Cyberbullying, Torino, Italy (pdf)
- Abinew Ali Ayele, Esubalew Alemneh Jalew, Adem Chanie Ali, Yimam, S.M., Chris Biemann (2024): Exploring Boundaries and Intensities in Offensive and Hate Speech: Unveiling the Complex Spectrum of Social Media Discourse. Proceedings of The Fourth Workshop on Threat, Aggression & Cyberbullying, Torino, Italy (pdf)
- Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Yimam, S.M., David Ifeoluwa Adelani, Ibrahim Sa'id Ahmad, Nedjma Ousidhoum, Abinew Ayele, Saif M Mohammad, Meriem BELOUCIF, Sebastian Ruder. SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval): arXiv preprint arXiv:2304.06845. 2023. (pdf)
- Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Abinew Ali Ayele, Nedjma Ousidhoum, David Ifeoluwa Adelani, Yimam, S.M., Ibrahim Sa'id Ahmad, Meriem BELOUCIF, Saif Mohammad, Sebastian Ruder, Oumaima Hourrane, Pavel Brazdil, Felermino Dário Mário António Ali, Davis Davis, Salomey Osei, Bello Shehu Bello, Falalu Ibrahim, Tajuddeen Gwadabe, Samuel Rutunda, Tadesse Belay, Wendimu Baye Messelle, Hailu Beshada Balcha, Sisay Adugna Chala, Hagos Tesfahun Gebremichael, Bernard Opoku, Steven Arthur. Afrisenti: A Twitter sentiment analysis benchmark for African languages: arXiv preprint arXiv:2302.08956. 2023. (pdf)
- Tonja A. L., Belay T. D., Azime I. A., Ayele A. A., Mehamed M. A., Kolesnikova O., Yimam S. M. (2023): Natural Language Processing in Ethiopian Languages: Current State, Challenges, and Opportunities, In the fourth workshop on Resources for African Indigenous Languages (RAIL) at EACL2023, Dubrovnik, Croatia (pdf)
- Banerjee D., Yimam S. M., Awale S. and Biemann C (2023), ARDIAS: AI-Enhanced Research Management, Discovery, and Advisory System, The AAAI-23 Workshop on Scientific Document Understanding at the Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI-23), Washington, DC, USA. (pdf)
- Ayele A.A., Belay T.D., Yimam S. M., Dinter S., Asfaw, T.T., Biemann C. (2022): The 5Js in Ethiopia: Amharic Hate Speech Data Annotation Using Toloka Crowdsourcing Platform, The Sixth Widening Natural Language Processing Workshop (WiNLP 2022) in conjunction with EMNLP 2022, Abu Dhabi, UAE (pdf)
- Belay T.D., Tonja A.L., Kolesnikova O., Yimam S. M., Ayele A.A., Haile S.B., Sidorov G., Gelbukh A. (2022): The Effect of Normalization for Bi-directional Amharic-English Neural Machine Translation, The Sixth Widening Natural Language Processing Workshop (WiNLP 2022) in conjunction with EMNLP 2022, Abu Dhabi, UAE (pdf)
- Belay, T. D., Yimam, S.M., Ayele, A. A., and Biemann, C. (2022): Question Answering Classification for Amharic Social Media Community Based Questions, The 1st Annual Meeting of the ELRA/ISCA Special Interest Group on Under-Resourced Languages (SIGUL 2022), Marseille, France (pdf).
- Destaw T., Ayele A.A. and Yimam, S.M. (2021): The Development of Pre-processing Tools and Pre-trained Embedding Models for Amharic. Proceedings of The fifth WiNLP (“Widening NLP”) Workshop held in conjunction with EMNLP 2021, Punta Cana, Dominican Republic. (pdf).
- David Ifeoluwa Adelani and Jade Abbott and Graham Neubig and Daniel D'souza and Julia Kreutzer and Constantine Lignos and Chester Palen-Michel and Happy Buzaaba and Shruti Rijhwani and Sebastian Ruder and Stephen Mayhew and Israel Abebe Azime and Shamsuddeen Muhammad and Chris Chinenye Emezue and Joyce Nakatumba-Nabende and Perez Ogayo and Anuoluwapo Aremu and Catherine Gitau and Derguene Mbaye and Jesujoba Alabi and Seid Muhie Yimam and Tajuddeen Gwadabe and Ignatius Ezeani and Rubungo Andre Niyongabo and Jonathan Mukiibi and Verrah Otiende and Iroro Orife and Davis David and Samba Ngom and Tosin Adewumi and Paul Rayson and Mofetoluwa Adeyemi and Gerald Muriuki and Emmanuel Anebi and Chiamaka Chukwuneke and Nkiruka Odu and Eric Peter Wairagala and Samuel Oyerinde and Clemencia Siro and Tobius Saul Bateesa and Temilola Oloyede and Yvonne Wambui and Victor Akinode and Deborah Nabagereka and Maurice Katusiime and Ayodele Awokoya and Mouhamadane MBOUP and Dibora Gebreyohannes and Henok Tilaye and Kelechi Nwaike and Degaga Wolde and Abdoulaye Faye and Blessing Sibanda and Orevaoghene Ahia and Bonaventure F. P. Dossou and Kelechi Ogueji and Thierno Ibrahima DIOP and Abdoulaye Diallo and Adewale Akinfaderin and Tendai Marengereke and Salomey Osei (2021): MasakhaNER: Named Entity Recognition for African Languages. Transactions of the Association for Computational Linguistics, 2021, 9, 1116–1131. (pdf).
- Wiedemann G., Yimam, S.M., Biemann C. (2020): UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection. Proceedings of The 14th International Workshop on Semantic Evaluation (SemEval), Barcelona, Spain. (pdf) (poster).
- Yimam, S.M., Ayele, A. A., Biemann C. (2019): Analysis of the Ethiopic Twitter Dataset for Abusive Speech in Amharic. In Proceedings of International Conference On Language Technologies For All: Enabling Linguistic Diversity And Multilingualism Worldwide (LT4ALL 2019). Paris, France p. 210-214 (pdf).
- Yimam, S.M., Biemann, C., Malmasi, S., Paetzold, G.H., Specia, L., Štajner, S., Tack, A., Zampieri, M. (2018): A Report on the Complex Word Identification Shared Task 2018. Proceedings of the 13th Workshop on Innovative Use of NLP for Building Educational Applications, New Orleans, LA, USA (pdf)
- Yimam S.M., Remus S., Panchenko A., Holzinger A., Biemann C. (2017): Entity-Centric Information Access with the Human-in-the-Loop for the Biomedical Domains. Biomedical NLP Workshop associated with RANLP 2017. Varna, Bulgaria (pdf)
- Müller, M., Ballweg, K., von Landesberger, T., Yimam, S.M., Fahrer, U., Biemann, C., Rosenbach, M., Regneri, M., Ulrich, H. (2017): Guidance for Multi-Type Entity Graphs from Text Collections. EuroVis Workshop on Visual Analytics 2017, Barcelona, Spain (pdf)
- Nandi, T., Biemann, C., Yimam, S.M., Gupta, D., Kohail, S., Ekbal, A., Bhattacharyya, P. (2017): IT-UHH at SemEval-2017 Task 3: Exploring Multiple Features for Community Question Answering and Implicit Dialogue Identification, In Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval 2017), Vancouver, Canada. (pdf)
- Eckart de Castilho, R., Mújdricza-Maydt, E., Yimam, S.M., Hartmann, S., Gurevych, I., Frank, A. and Biemann, C. (2016): A Web-based Tool for the Integrated Annotation of Semantic and Syntactic Structures. Proceedings of the COLING workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH), Osaka, Japan (pdf)
- Ballweg K., Zouhar F., Wilhelmi-Dworski P., von Landesberger T., Fahrer U., Panchenko A., Yimam S.M., Biemann C., Regneri M., Ulrich H. (2016): new/s/leak – A Tool for Visual Exploration of Large Text Document Collections in the Journalistic Domain, Baltimore, MD, USA (pdf)
- Yimam, S.M., Martínez Alonso, H., Riedl M. and Biemann, C. (2016): Learning Paraphrasing for Multiword Expressions. The 12th Workshop on Multiword Expressions (MWE 2016), co-located with ACL 2016, Berlin, Germany (pdf)
- Yimam, S.M. (2015): Narrowing the Loop: Integration of Resources and Linguistic Dataset Development with Interactive Machine Learning. NAACL 2015 Student Research Workshop, p. 88–95, Denver, Colorado (pdf)
- Eckart de Castilho, R., Biemann, C., Gurevych, I., Yimam, S.M. (2014): WebAnno: a flexible, web-based annotation tool for CLARIN. CLARIN Annual Conference 2014, Soesterberg, The Netherlands (pdf)
- Benikova, D., Fahrer, U., Gabriel, A., Kaufmann, M., Yimam, S.M., von Landesberger, T., Biemann, C. (2014): Network of the Day: Aggregating and Visualizing Entity Networks from Online Sources. KONVENS 2014 Workshop proceedings: NLP4CMC, pp. 48-52, Hildesheim, Germany (pdf)
- Yimam, S.M, Libse, M. (2009): TETEYEQ: Amharic Question Answering For Factoid Questions, SEPLN09. SALTMIL workshop - Information Retrieval and Information Extraction for Less Resourced Languages (IE-IR-LRL), p. 17-25 (pdf)
Posters
- Yimam, S.M, Biemann C. (2019): Current Status, Issues, and Future Directions for Ethiopian Natural Language Processing (NLP) Research. International Conference Language Technologies for All (LT4All), Paris, France (pdf)
- Indaba 2024 - African Datasets Poster - Sentiment and Hate Speech datasets for more than 14 African languages (pdf)
- Indaba 2024 - Publications Poster - AM-DETOX: Analyzing Amharic Text Detoxification Using Pre-trained Large Language Models (pdf)
- Indaba 2024 - General Poster-1 - AI4Democracy: Dynamic Dashboard for Analyzing Polarization and Extremism in Online Media (pdf)
- Indaba 2024 - General Poster-2 - SEMEVAL-2025: Bridging the Gap in Text-Based Emotion Detection (pdf)