Whose Words, Whose Rights? Rethinking CC Licensing in African Language Data Sets.

The rapid advancement of AI has turned nearly all content into potential training data. This shift poses unique challenges for African language datasets. While open licensing frameworks like Creative Commons were designed to democratise access, they often fail to address the extractive dynamics that emerge when data is divorced from its cultural and community context.

This was the context for a workshop hosted on 9th July 2025 by Creative Commons, the Centre for Intellectual Property and Information Technology Law (CIPIT) and the Data Science Law Lab (University of Pretoria). The workshop brought together AI researchers, legal scholars, AI developers and funders to address these emerging issues and to explore alternatives that balance openness and equity from an African perspective.

Central to the workshop were discussions on the limits of traditional open licensing. Sarah Pearson of Creative Commons opened with an assessment of the CC licences in the age of AI. It was noted that, while the CC BY (Attribution) licence and the CC0 public domain dedication have enabled widespread reuse, they were not created in anticipation of governing machine learning applications. Consequently, CC licences do not address data sovereignty, privacy or economic reciprocity. These are critical gaps when data is scraped, repurposed and monetised by entities that are unfamiliar with its cultural nuances or origins.

Notably, CC Signals, a new framework under development, represents an attempt by Creative Commons to bridge these gaps. Grounded in reciprocity, the framework proposes standardised, machine-readable signals that allow data stewards to attach conditions to AI training uses, such as attribution requirements, direct contributions to communities or ecosystem reinvestment. Further discussion acknowledged existing tensions, in particular how such signals could be both legally enforceable and respectful of copyright boundaries. Although there was no clear solution, Ms Pearson suggested that the answer lies in technical standardisation rather than legal expansion, primarily by leveraging existing opt-out protocols from global bodies.

The session also reflected on home-grown solutions, looking at innovative licensing models developed in Africa, by Africans, for Africans. Dr. Melissa Omino presented the Nwulite Obodo License (NOODL), a three-tiered licensing model that seeks to address the gaps in the licensing of African data sets that the traditional CC licensing framework cannot address. NOODL allows African data stewards to retain governance rights and impose conditions on commercial users. Distinctly, it explicitly centres African agency, ensuring that those who create data benefit from its value. In discussing open source data sets, particularly for African languages, Miguel Morachim, speaking on the Common Voice initiative by Mozilla, gave insights on the challenges involved and the need for licences as a governance framework. Miguel highlighted the challenges of crowd-sourcing, noting how projects like Common Voice rely on public contributions but face tensions between open access and community control. On attribution and consent, there was an emphasis on the need for dynamic consent mechanisms that allow contributors to update permissions as reuse contexts evolve. It was further stressed that licensing frameworks must balance standardisation for scalability with local adaptability to reflect community norms.

The Esethu License, discussed by Aremu Anuoluwapo, much like NOODL, presents a community-centric approach to data curation and governance that aims to ensure equitable sharing of the benefits from linguistic resources contributed by local communities. The licence features a six-step approach that includes data set creation, with native speakers controlling the data; data set licensing, where community representatives ensure fair benefit distribution; data set release with a suitable licence; contributions for research and commercial use; and lastly, licensing fees received from non-African commercial entities. Sustainability for long-term impact was a key component of the discussions. Chris Emezue of Naija Voices highlighted this as he touched on considerations for key stakeholders, including understanding which stakeholders are being considered and which are being left out. He also touched on balancing accessibility and sustainability. His approach would likewise centre the community: creating data sets with the community, giving the community co-ownership of the data set, applying a non-commercial default licence and using the data as given.

The discussion led by Prof Okorie brought out significant perspectives and underlying tensions in open licensing standards. While CC BY and similar frameworks promote accessibility, they inadvertently perpetuate extractive dynamics by failing to address existing power asymmetries, cultural context, benefits to the community and compensation for data creators. Dr. Vukosi Marivate pointed out these dynamics in reflecting on data philanthropy, underscoring how mandated openness from funders can undermine sustainability when it is separated from equitable and responsible considerations on data collection and the accessibility of open data. Funders in the room noted that, whereas the proposed solutions, i.e. NOODL, Esethu and CC Signals, attempt to mitigate these tensions by incorporating community-defined terms of use, key implementation challenges remain, particularly in balancing local data governance parameters with scalability while avoiding the bureaucracy likely to arise. Holistically, the discussion was clear that data governance ought to approach open data, especially African language datasets, in a contextualised manner, where openness is negotiated and not imposed, and where technical frameworks are developed to ensure African datasets are not commodified without reciprocity.

Image used is from Canva.
