LILA Lab

Your language. Your stories.
Amplified by AI.

The world speaks 7,000+ languages. NLP serves barely 20.
We build language intelligence for the billions AI was never designed to serve.

🌍

XENI Pipelines

Every language gets its own XENI — [X]ploration & E Native-language Intelligence. The first letter changes per language.

🔬

Proven in Bangla

664K articles → LLM annotation → 91.7% accuracy → validated index correlating with CPI (r=−0.75). The pipeline works.
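As a rough illustration of the last two steps in that chain, the sketch below turns per-article labels into a monthly index and computes a Pearson correlation against a reference series. It is a minimal sketch under stated assumptions: the function names, the "share of flagged articles per month" definition of the index, and all numbers are hypothetical toy values, not the BENI implementation or its data.

```python
from collections import defaultdict
from math import sqrt


def monthly_index(labeled_articles):
    """Share of articles per month that carry the target label (1 vs. 0).

    ASSUMPTION: the real index may be built differently; this is only
    one simple way to aggregate article-level labels.
    """
    counts = defaultdict(lambda: [0, 0])  # month -> [flagged, total]
    for month, label in labeled_articles:
        counts[month][0] += label
        counts[month][1] += 1
    return {m: flagged / total for m, (flagged, total) in sorted(counts.items())}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / (spread_x * spread_y)


if __name__ == "__main__":
    # Toy (month, label) pairs, invented purely for illustration.
    toy_articles = [
        ("2024-01", 1), ("2024-01", 0),
        ("2024-02", 1), ("2024-02", 1),
        ("2024-03", 0), ("2024-03", 0),
    ]
    # Toy reference series, one value per month, also invented.
    toy_reference = [3.0, 1.0, 5.0]

    index = monthly_index(toy_articles)
    print(index)
    print(pearson_r(list(index.values()), toy_reference))
```

A negative coefficient, as in the reported r=−0.75, means the index tends to fall when the reference series rises.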

🤝

Open & Community-Owned

Every pipeline, dataset, and paper stays open source. Built by researchers, for researchers — wherever your language is spoken.

How the XENI System Works

The naming teaches itself on two axes — language and domain. Once you see the pattern, you instantly know how your language and research fit.

Level 1

Language

See BENI and AENI? The pattern is obvious:

B + ENI = Bangla
A + ENI = Assamese
N + ENI = Nepali
Y + ENI = Yoruba
? + ENI = Your language

Your language's initial goes here →

Level 2

Domain

See BENI Economic Index and BENI Health Index? Same pattern:

BENI Economic = economic narratives
BENI Health = health discourse
BENI Climate = climate narratives
AENI Economic = Assamese economics
[X]ENI [Domain] = your research

Your research domain goes here →

The XENI name always refers to the pipeline (the language instrument). The domain is a plain-English qualifier on the index that pipeline produces, as in BENI Economic Index. This keeps every XENI name pronounceable while making the domain scope instantly readable.
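The two-level naming rule can be written down in a few lines. This is a minimal sketch, assuming the rule is exactly "language initial + ENI" followed by a title-cased domain word. The function names `pipeline_name` and `index_name` are hypothetical and are not taken from the LILA Lab repository.

```python
def pipeline_name(language: str) -> str:
    """Level 1: the language's initial followed by 'ENI' (e.g. Bangla -> BENI)."""
    initial = language.strip()[:1].upper()
    if not initial.isalpha():
        raise ValueError(f"cannot derive an initial from {language!r}")
    return initial + "ENI"


def index_name(language: str, domain: str) -> str:
    """Level 2: pipeline name plus a plain-English domain qualifier."""
    return f"{pipeline_name(language)} {domain.strip().title()} Index"


if __name__ == "__main__":
    for lang in ["Bangla", "Assamese", "Nepali", "Yoruba"]:
        print(lang, "->", pipeline_name(lang))
    print(index_name("Bangla", "economic"))
    print(index_name("Assamese", "health"))
```

Two languages that share an initial would map to the same name under this rule. The text above does not say how such collisions are resolved, so the sketch leaves that open.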

The XENI Pipeline Family

Each pipeline is named [Language initial] + ENI. The pattern teaches itself.

B

BENI — Bangla

Active · 265M speakers

Economic narrative index for Bangla news. Classifies 664K+ articles with 91.7% accuracy. Validated against CPI, FX, reserves.

git clone https://github.com/LilaLABx/LILA-LAB
cd LILA-LAB/pipelines/beni/experiment/beni_pilot/
python3 train.py --model-type tfidf

A

AENI — Assamese

Seeking Contributors · 15M speakers

First NLP pipeline for Assamese economic narratives. Pipeline ready — needs native speakers for annotation.

N

NENI — Nepali

Seeking Contributors · 25M speakers

Nepali economic narrative index. Ready for adaptation with native-language collaboration.

S

SENI — Sylheti

Seeking Contributors · 11M speakers

Sylheti language narrative extraction. Currently no NLP infrastructure exists for this language.

C

CENI — Chittagonian

Seeking Contributors · 16M speakers

Chittagonian narrative index. One of the most underserved languages in NLP relative to speaker count.

?

YOUR Language Here

Be First

The pipeline is language-agnostic. If you speak it, we can process it. Fork the repo, adapt, publish.

Start Your XENI →

LILA Technical Reports

A 6-paper research program on narrative measurement across underserved languages.

#1

Statistical Economics of Narrative

A quantitative framework for narrative-based economic analysis.

Complete

#2

Systematic Review of Economic Narrative Indices

Systematic review, replication study, and Bangla extension (2007–2025).

#3

Building BENI: A Replicable Pipeline

From raw news to validated narrative index — the complete methodology.

Active (Jul 2026)

#4

Nowcasting Inflation with BENI

Local-language news as a high-frequency economic indicator.

Planned (Aug 2026)

#5

Text as Data in Social Science

110-year survey of language-based methods: from content analysis to LLMs.

Planned (Oct 2026)

#6

LLMs as Measurement Devices

Framework for narrative extraction in low-resource languages.

Proposed (Jan 2027)

Eight Ways to Contribute

Most contribution models lead to academic authorship, and every contribution is credited in print. If you speak an underserved language, you are not a data source: you are a co-author.

🌍

Language Extension

Apply the pipeline to YOUR language. First-author paper.

🔬

Cross-Domain

Apply to health, climate, education. First-author paper.

⚙️

Methodological

Improve the classifier, reduce cost. Co-authorship.

Replication

Independently verify results. Published replication report.

🗣️

Citizen Annotation

Label articles in your language. Acknowledgement in papers.

📊

Policy Brief

Analyze narratives for policy. Co-authorship.

🛠️

Infrastructure

Build dashboards, APIs, tools. Tool paper co-authorship.

📖

Education

Create tutorials, course modules. Educational paper co-authorship.

"84% of NLP research is English-only. If your language isn't served, you're invisible in the data that shapes global decisions. We change that — one pipeline at a time."

Connect with LILA Lab

Follow LILA Lab across platforms and join the movement for language infrastructure.

All channels are documented and coordinated from the Communications Center.