We're a non-profit, building in public, for the long term.
The BantuLanguages Initiative is a non-profit founded in 2026 in Central Africa, building the AI infrastructure that Bantu languages are missing — datasets, models, and tools, released in the open.
- Founded: 2026
- Status: Non-profit
- Based: Brazzaville · Kinshasa
- License: CC-BY-4.0
How it started.
The BantuLanguages Initiative began as a quiet realization, repeated across kitchens, classrooms and code editors: the largest AI systems in the world were not built with us in mind. Not maliciously — just as a default. The data wasn't there. The benchmarks weren't there. The infrastructure wasn't there.
In early 2026, a small group of researchers, engineers and linguists decided that someone had to start building it. Not as a side project. Not under a corporate roadmap. As a genuine public good, governed transparently, owned by the community.
The initiative is rooted in Central Africa — with founding hubs in Brazzaville and Kinshasa — and works in the open with contributors and partners across the continent and the diaspora. Our first commitment is Lingala. Our long horizon is the entire Bantu family — and from there, the rest of the continent's languages.
Four rules we hold to.
Open by default
Datasets, code, models, governance — all public, all licensed for reuse. If we can't share it, we don't ship it.
Communities first
Languages belong to the people who speak them. Contributors are credited, communities consulted, value returned.
Quality over noise
We'd rather release one well-documented dataset than ten unusable scrapes. Reproducibility is the bar.
Long horizon
This is decade-scale work. We measure success in foundations laid, not in vanity benchmarks.
A community in formation.
We're in our founding phase. Names will appear on this page as the team forms in public.