LILA Lab Control Room

// PIPELINE METRICS

📊

664K+

Articles Analyzed

Bangla news corpus

🎯

91.7%

Classification Accuracy

TF-IDF baseline

📈

79

Months Indexed

2014–2020

🔗

r = −0.75

CPI Correlation

p < 0.001

// KEY RESEARCH FINDINGS

Breakthrough Results

Validated evidence that narrative indices work for low-resource languages

★ Breakthrough

Economic Narratives Track Real Indicators

Bangla-language economic news narratives show a strong negative correlation with CPI and exchange rate movements, the first evidence that narrative indices work for low-resource languages.

r = −0.75 CPI Correlation

r = −0.72 FX Correlation
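As a rough illustration of what an r value like the ones above measures, here is a minimal Pearson correlation sketch in plain Python. The two series are made-up placeholder numbers, not BENI or CPI data, and the function name `pearson_r` is ours rather than anything from the lab's codebase.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances are left unnormalised; the 1/n factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)

# Placeholder monthly values for illustration only (not real data).
narrative_index = [0.62, 0.58, 0.55, 0.49, 0.47, 0.40]
cpi_inflation = [5.1, 5.4, 5.3, 5.9, 6.0, 6.4]

r = pearson_r(narrative_index, cpi_inflation)
print(f"r = {r:.2f}")
```

A negative r means the index tends to fall in months when the indicator rises, which is the direction reported for both CPI and FX.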


Validation

TF-IDF Outperforms Deep Learning

A simple TF-IDF + Logistic Regression classifier reaches 91.7% accuracy, outperforming more complex deep learning models while requiring no GPU. Annotation costs $0.02 per article.

91.7% Accuracy

$0.02 Per Article
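To make the TF-IDF + Logistic Regression idea concrete, the following is a from-scratch sketch using only the Python standard library. It is not the lab's implementation: the whitespace tokeniser, the smoothed IDF formula, the learning rate, and the English toy sentences are all assumptions chosen for readability, and real Bangla text would need proper tokenisation.

```python
import math
from collections import Counter

def fit_idf(docs):
    """Smoothed inverse document frequency for every token seen in training."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc.split()))
    return {tok: math.log((1 + n) / (1 + count)) + 1.0 for tok, count in df.items()}

def tfidf(doc, idf):
    """L2-normalised TF-IDF vector as a sparse dict; unseen tokens are dropped."""
    tf = Counter(tok for tok in doc.split() if tok in idf)
    vec = {tok: count * idf[tok] for tok, count in tf.items()}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    return {tok: v / norm for tok, v in vec.items()}

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(vectors, labels, epochs=300, lr=0.5):
    """Binary logistic regression fitted with plain stochastic gradient descent."""
    weights, bias = {}, 0.0
    for _ in range(epochs):
        for vec, y in zip(vectors, labels):
            z = bias + sum(weights.get(t, 0.0) * v for t, v in vec.items())
            err = sigmoid(z) - y  # gradient of the log loss with respect to z
            bias -= lr * err
            for t, v in vec.items():
                weights[t] = weights.get(t, 0.0) - lr * err * v
    return weights, bias

def predict(doc, idf, weights, bias):
    """Return 1 for 'economic' and 0 for 'other'."""
    vec = tfidf(doc, idf)
    z = bias + sum(weights.get(t, 0.0) * v for t, v in vec.items())
    return 1 if sigmoid(z) >= 0.5 else 0

# Toy English stand-ins for labelled articles (1 = economic, 0 = other).
train_docs = [
    "rice price rises in market",
    "central bank raises interest rate",
    "taka falls against dollar",
    "cricket team wins final match",
    "new film released this week",
    "heavy rain floods city roads",
]
train_labels = [1, 1, 1, 0, 0, 0]

idf = fit_idf(train_docs)
vectors = [tfidf(d, idf) for d in train_docs]
weights, bias = train_logreg(vectors, train_labels)
print(predict("bank raises rate", idf, weights, bias))
```

Everything here runs on a CPU in well under a second, which is the practical appeal of this family of models. A production pipeline would more likely rely on an established library than on hand-written gradient descent.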

Scale

664K Articles Processed

Complete processing of 664,000+ Bangla news articles from the Potrika corpus, creating the largest low-resource language narrative dataset for economic analysis.

664K+ Articles

3.3 GB Corpus Size

Cost

LLM Annotation at Scale

An ensemble of Claude and GPT-4o achieves human-level annotation quality at $0.02–0.03 per article, making large-scale annotation feasible for low-resource languages.

3,200 Labels Created

$0.03 Max Cost
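The page does not describe how the two models' outputs are combined, so the sketch below assumes one simple rule: keep a label when both annotators agree and flag disagreements for review. The helper names, the label strings, and the one-label-per-article assumption behind the cost estimate are all hypothetical.

```python
def combine_labels(labels_a, labels_b):
    """Keep a label when both annotators agree; mark disagreements as None."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same articles")
    return [a if a == b else None for a, b in zip(labels_a, labels_b)]

def agreement_rate(labels_a, labels_b):
    """Fraction of articles on which the two annotators give the same label."""
    combined = combine_labels(labels_a, labels_b)
    return sum(label is not None for label in combined) / len(combined)

def annotation_cost(n_articles, cost_per_article):
    """Total spend in USD, assuming a flat per-article price."""
    return n_articles * cost_per_article

# Hypothetical outputs from two annotators on the same three articles.
model_a = ["economic", "other", "economic"]
model_b = ["economic", "economic", "economic"]
print(combine_labels(model_a, model_b))
print(f"agreement: {agreement_rate(model_a, model_b):.0%}")

# Figures from this page; one label per article is an assumption.
LABELS_CREATED = 3200
COST_RANGE_USD = (0.02, 0.03)
low, high = (annotation_cost(LABELS_CREATED, c) for c in COST_RANGE_USD)
print(f"estimated spend: ${low:.2f} to ${high:.2f}")
```

Under that one-label-per-article assumption, the quoted per-article prices imply roughly $64 to $96 for the 3,200 labels.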

// XENI PIPELINE FRAMEWORK

How the Pipeline Works

Every language gets its own XENI — proven in Bangla as BENI, ready for yours.

01

📰

Collect Native News

Aggregate millions of native-language articles from local sources, preserving linguistic authenticity from day one.

02

🤖

Annotate & Classify

An LLM ensemble (Claude, GPT-4o) annotates narratives across domains. A TF-IDF + Logistic Regression classifier then reaches 91.7% accuracy against gold-standard human annotations.

03

📊

Build Validated Index

Construct monthly narrative indices and validate them against macroeconomic indicators. For Bangla, the index correlates with CPI at r = −0.75.
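The third step can be sketched as a simple monthly aggregation. The exact definition of the BENI index is not given on this page, so the version below assumes the index is the share of a month's articles that the classifier flags as carrying an economic narrative. The function name and the sample records are illustrative.

```python
from collections import defaultdict
from datetime import date

def monthly_index(articles):
    """Map 'YYYY-MM' to the share of that month's articles flagged by the classifier.

    `articles` is an iterable of (publication_date, is_flagged) pairs.
    """
    flagged = defaultdict(int)
    total = defaultdict(int)
    for published, is_flagged in articles:
        month = published.strftime("%Y-%m")
        total[month] += 1
        flagged[month] += int(is_flagged)
    # Sorting the keys keeps the series in chronological order.
    return {month: flagged[month] / total[month] for month in sorted(total)}

# Made-up records for illustration only.
sample = [
    (date(2014, 1, 5), True),
    (date(2014, 1, 20), False),
    (date(2014, 2, 3), True),
    (date(2014, 2, 17), True),
    (date(2014, 2, 28), False),
    (date(2014, 2, 9), False),
]
for month, value in monthly_index(sample).items():
    print(month, f"{value:.2f}")
```

A series built this way can then be lined up month by month with an indicator such as CPI for validation.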

// DATA VISUALIZATIONS

Interactive Research Data

Live charts from the BENI pipeline analysis

BENI Index vs CPI (2014–2020)

Inverse correlation between narrative index and inflation

Classification Model Comparison

TF-IDF baseline vs. deep learning models

Monthly Article Volume

Consistent data collection across 79 months

Annotation Cost Breakdown

Cost per article by annotation method

// PIPELINE STATUS MONITOR

XENI Pipeline Status

Language pipeline development progress across all targets

BENI

Active

Bangla (বাংলা)

265M speakers

100% — Complete

✓ 664K articles processed
✓ 91.7% accuracy
✓ 79-month index
✓ CPI validated (r = −0.75)

AENI

Contributors Needed

Assamese (অসমীয়া)

15M speakers

5% — Seeking contributors

○ Pipeline code reusable
○ Annotation schema ready
○ Awaiting data collection
○ Extension template available

NENI

Contributors Needed

Nepali (नेपाली)

25M speakers

5% — Seeking contributors

○ Pipeline code reusable
○ Annotation schema ready
○ Awaiting data collection
○ Extension template available

SENI

Contributors Needed

Sylheti (ꠍꠤꠟꠐꠤ)

11M speakers

3% — Planned

○ Pipeline code reusable
○ Annotation schema ready
○ Awaiting data collection
○ Extension template available

// RESEARCH UPDATES

Latest from the Lab

Recent findings, publications, and community announcements

2026-06-10 Finding

First Evidence: Bangla Economic Narratives Track Real Indicators

Our validation shows that economic narratives in Bangla news articles correlate strongly and negatively with macroeconomic indicators (CPI r = −0.75, FX r = −0.72). This is the first evidence that narrative indices work for low-resource languages.

2026-06-08 Pipeline

BENI Pipeline Reaches 91.7% Accuracy

After optimizing the TF-IDF + Logistic Regression pipeline, we've achieved 91.7% classification accuracy on gold-standard human annotations — without any GPU.

2026-06-05 Community

Calling Assamese, Nepali & Sylheti Speakers

We're seeking native speakers to help build the next XENI pipelines. Each contributor receives co-authorship on the resulting paper. No coding required.

2026-06-01 Publication

Systematic Review Submitted to arXiv

Our systematic review of 20 years of economic narrative indices has been submitted to arXiv. The paper replicates and extends the full ENI methodology literature.

2026-05-28 Data

LILA-BENI v1.0 Dataset Released

The complete BENI dataset with 664K articles, annotations, and indices is now available on Zenodo with a permanent DOI for reproducible research.

2026-05-20 Method

LLM Annotation Costs: Claude vs GPT-4o

We compare annotation costs and quality between Claude and GPT-4o for Bangla economic relevance labeling. Claude achieves comparable quality at lower cost.
