Your language. Your stories.
Amplified by AI.
The world speaks 7,000+ languages. NLP serves barely 20.
We build language intelligence for the billions AI was never designed to serve.
Every language gets its own XENI — [X]ploration & E Native-language Intelligence. The first letter changes per language.
664K articles → LLM annotation → 91.7% accuracy → validated index correlating with CPI (r=−0.75). The pipeline works.
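The validation step at the end of that chain can be illustrated with a short sketch: computing a Pearson correlation between a monthly narrative index and CPI. The function name `pearson_r` and both series are hypothetical placeholders, not LILA Lab data or code; the reported r=−0.75 comes from the project's own validation, not from this example.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have equal length of at least 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Made-up monthly values for illustration only (assumption, not real data).
narrative_index = [0.62, 0.55, 0.48, 0.51, 0.40, 0.33]
cpi = [101.2, 102.0, 103.1, 102.8, 104.5, 105.9]

r = pearson_r(narrative_index, cpi)
print(f"r = {r:.2f}")
```

A negative r here means the index tends to fall as CPI rises. In practice the two series would need to be aligned on the same dates before comparison.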
Every pipeline, dataset, and paper stays open source. Built by researchers, for researchers — wherever your language is spoken.
The naming teaches itself on two axes — language and domain. Once you see the pattern, you instantly know how your language and research fit.
See BENI and AENI? The pattern is obvious: your language's initial takes the place of the first letter.
See BENI Economic Index and BENI Health Index? Same pattern: your research domain takes the place of the middle word.
A XENI name always refers to the pipeline (the language instrument). The domain is a plain-English qualifier on the index that pipeline produces, as in BENI Health Index. This keeps every XENI name pronounceable and makes the domain scope readable at a glance.
Each pipeline is named [Language initial] + ENI. The pattern teaches itself.
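As a minimal sketch, the two-axis naming rule can be written as two small functions. The names `pipeline_name` and `index_name` are hypothetical and chosen for illustration. The sketch assumes AENI stands for Assamese, and it does not handle two languages that share an initial.

```python
def pipeline_name(language: str) -> str:
    """Return the pipeline name: the language's initial followed by 'ENI'."""
    language = language.strip()
    if not language:
        raise ValueError("language name must be non-empty")
    return language[0].upper() + "ENI"


def index_name(language: str, domain: str) -> str:
    """Return the index name: pipeline name, plain-English domain, 'Index'."""
    return f"{pipeline_name(language)} {domain.strip().title()} Index"


print(pipeline_name("Bangla"))           # BENI
print(pipeline_name("Assamese"))         # AENI
print(index_name("Bangla", "economic"))  # BENI Economic Index
print(index_name("Bangla", "health"))    # BENI Health Index
```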
Active 265M speakers
Economic narrative index for Bangla news. Classifies 664K+ articles with 91.7% accuracy. Validated against CPI, FX, reserves.
git clone https://github.com/LilaLABx/LILA-LAB
cd LILA-LAB/pipelines/beni/experiment/beni_pilot/
python3 train.py --model-type tfidf
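For readers unfamiliar with the `tfidf` model type, here is a rough, standard-library sketch of what a TF-IDF baseline classifier can look like. It is not the repository's `train.py`, whose internals are not shown here. The nearest-centroid rule, the function names, the toy English headlines, and the labels `economic` and `other` are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase whitespace tokenizer (real pipelines need language-aware tokenization)."""
    return text.lower().split()


def fit_idf(docs):
    """Smoothed inverse document frequency for every term seen in training."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(tokenize(doc)))
    return {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}


def vectorize(text, idf):
    """L2-normalized TF-IDF vector as a sparse dict; unseen terms are ignored."""
    tf = Counter(t for t in tokenize(text) if t in idf)
    vec = {t: c * idf[t] for t, c in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / norm for t, v in vec.items()} if norm else {}


def train(docs, labels):
    """Sum the TF-IDF vectors of each class into one centroid per label."""
    idf = fit_idf(docs)
    centroids = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        centroids[label].update(vectorize(doc, idf))
    return idf, dict(centroids)


def cosine(vec, centroid):
    """Cosine similarity between a unit-length vector and a class centroid."""
    dot = sum(w * centroid.get(t, 0.0) for t, w in vec.items())
    norm = math.sqrt(sum(v * v for v in centroid.values()))
    return dot / norm if norm else 0.0


def predict(text, idf, centroids):
    """Return the label whose centroid is most similar to the text."""
    vec = vectorize(text, idf)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Toy training data for illustration only (assumption, not project data).
docs = [
    "rice prices rise again in dhaka markets",
    "fuel prices rise as inflation bites",
    "cricket team wins the series final",
    "football league final ends in a draw",
]
labels = ["economic", "economic", "other", "other"]

idf, centroids = train(docs, labels)
print(predict("onion prices rise sharply", idf, centroids))  # economic
print(predict("team wins the final", idf, centroids))        # other
```

A production baseline would more likely use an established library and a held-out test set to measure accuracy; this sketch only shows the moving parts.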
Seeking Contributors 15M speakers
First NLP pipeline for Assamese economic narratives. Pipeline ready — needs native speakers for annotation.
Seeking Contributors 25M speakers
Nepali economic narrative index. Ready for adaptation with native-language collaboration.
Seeking Contributors 11M speakers
Sylheti language narrative extraction. Currently no NLP infrastructure exists for this language.
Seeking Contributors 16M speakers
Chittagonian narrative index. One of the most underserved languages in NLP relative to speaker count.
Be First
The pipeline is language-agnostic. If you speak it, we can process it. Fork the repo, adapt, publish.
Start Your XENI →
A 6-paper research program on narrative measurement across underserved languages.
A quantitative framework for narrative-based economic analysis. Status: Complete.
Systematic review, replication study, and Bangla extension (2007–2025). Status: Submitted to arXiv.
From raw news to validated narrative index: the complete methodology. Status: Active (Jul 2026).
Local-language news as a high-frequency economic indicator. Status: Planned (Aug 2026).
110-year survey of language-based methods, from content analysis to LLMs. Status: Planned (Oct 2026).
Framework for narrative extraction in low-resource languages. Status: Proposed (Jan 2027).
Every contribution model leads to academic authorship. If you speak an underserved language, you are not a data source; you are a co-author.
Apply the pipeline to YOUR language. First-author paper.
Apply to health, climate, education. First-author paper.
Improve the classifier, reduce cost. Co-authorship.
Independently verify results. Published replication report.
Label articles in your language. Acknowledgement in papers.
Analyze narratives for policy. Co-authorship.
Build dashboards, APIs, tools. Tool paper co-authorship.
Create tutorials, course modules. Educational paper co-authorship.
"84% of NLP research is English-only. If your language isn't served, you're invisible in the data that shapes global decisions. We change that — one pipeline at a time."
Follow LILA Lab across platforms — all coordinated from the repository. Join the movement for language infrastructure.
All channels are documented and coordinated from the Communications Center.