Open Data Collection Playbook and Annotation for African Languages
Date:
Abstract:
This Birds‑of‑a‑Feather session introduces the Open Data Collection Playbook and Annotation for African Languages, a practical, community‑driven guide to creating high‑quality datasets for African languages. Building robust NLP resources for these languages remains a major challenge because annotated data is scarce, the languages have a limited digital presence, and the work requires culturally grounded methodologies.
The session is led by Dr. Shamsuddeen Muhammad Hassan (Imperial College London) and Dr. Seid Muhie Yimam (University of Hamburg), who present strategies for ethical and scalable data collection and annotation.
