
Current Initiatives
Explore our ongoing research programs:

AI and Education
Building AI-powered learning tools for low-resource languages.
We are developing adaptive, AI-driven platforms for language learning in low-resource contexts. Current pilots focus on Persian and Armenian, with tools designed to enhance learner engagement, comprehension, and accessibility.

The Art of Taarof
Modeling ritual politeness in Persian for culturally aware AI.
In collaboration with Brock University, this project explores how the Persian system of ritual politeness (Taarof) can be modeled computationally—for use in culturally aware AI, language learning, and pragmatic annotation.

Tajik Transliteration
Bridging Persian dialects across scripts and borders.
This initiative builds a bidirectional transliteration system between Tajik Persian (Cyrillic) and Iranian Persian (Perso-Arabic). We compare machine learning (ML) and generative models to support cross-script communication and resource development for Persian varieties.

Narrative Analytics
Uncovering timelines, events, and hidden meanings in Persian and Kurdish narratives using LLMs.
This project explores how large language models (LLMs) handle narrative structure in low-resource languages, investigating event extraction, timeline construction, and implicit meaning in Persian and Sorani Kurdish.

Benchmarking LLMs for Iranian Languages
Evaluating multilingual AI on linguistic features in low-resource Iranian languages.
This initiative develops a comprehensive evaluation suite to benchmark large language models across Persian, Kurdish, Balochi, Gilaki, and other Iranian languages. We focus on linguistic competence—morphology, syntax, and discourse-level understanding—to identify where LLMs succeed, where they fail, and how multilingual training can better represent low-resource linguistic diversity.

Scientometrics: Mapping the Fields of NLP and Linguistics
Tracking the evolution, divergence, and intersections of linguistic and AI research communities.
Using bibliometric and network analysis, this project maps the evolving relationship between computational linguistics and theoretical linguistics—analyzing citation patterns, thematic shifts, and authorship networks over time.

Complex Predicates in Persian and Beyond
A long-term research program on the structure, semantics, and computation of complex predicates.
Building on decades of linguistic inquiry, this initiative surveys the literature on Persian complex predicates and develops computational methods to automatically identify and analyze them—bridging theoretical linguistics and language technology.

Verbal Reduplication in Eastern Armenian
Unpacking event structure through reduplication and morphosyntactic form.
A theoretical and morphosyntactic investigation of how verbal reduplication encodes event structure, intensity, and distribution in Eastern Armenian. We examine patterns like կտրել (ktrel, “to cut”) vs. կտրտել (ktrtel, “to chop repeatedly”).

Recent Publications & Presentations
Our Work
Read the latest from the Zoorna Institute