🕰️ Temporal Reasoning in Persian NLP: A Closer Look at Expression Types and Timeline Generation

Karine Megerdoomian

In natural language, time is everywhere — embedded in verbs, hidden in idioms, and scattered across calendars, seasons, and routines. But while humans can intuitively understand when something happened (or is happening), teaching machines to do the same is a much harder challenge. This is the domain of temporal reasoning in Natural Language Processing (NLP).

At Zoorna Institute, one of our research goals is to build robust systems that can automatically construct timelines of events — a task that demands accurate temporal expression detection, classification, and temporal inference.

A Visual Case Study

In the slide above, we analyzed a short Persian narrative text and extracted multiple temporal expressions tied to life events like university studies, music classes, internships, and employment. These expressions are color-coded based on category (e.g., Date, Range, Periodic, Relative).

What Are Temporal Expressions?

Temporal expressions are phrases or words that refer to points or periods in time. These can include:

Specific dates: “April 2006,” “Mehr 1392”
Date ranges: “From 1392 to 1395”
Durations: “for six months”, “throughout her studies”
Relative times: “three days ago”, “four years later”
Recurring intervals: “every Tuesday”, “each summer”
Vague references: “post-internship”, “after graduation”, “in the meantime”

Each type requires a different strategy to interpret and anchor it in a real-world timeline.
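
To make the categories above concrete, here is a minimal rule-based sketch that sorts the example phrases into types. It is an illustration only, not our actual system: the `TimexType` names, the regular expressions, and the `classify` helper are all assumptions made for this post, and the patterns work on the English glosses rather than on Persian text. Anything the rules do not recognize falls through to the vague category, which is a deliberate simplification.

```python
import re
from enum import Enum


class TimexType(Enum):
    # Hypothetical labels mirroring the categories listed in the post.
    DATE = "date"
    RANGE = "range"
    DURATION = "duration"
    RELATIVE = "relative"
    PERIODIC = "periodic"
    VAGUE = "vague"


# Ordered (type, pattern) rules; the first match wins.
# These patterns only cover the English example phrases from the post.
RULES = [
    (TimexType.RANGE, re.compile(r"^from\b.+\bto\b", re.IGNORECASE)),
    (TimexType.PERIODIC, re.compile(r"^(every|each)\b", re.IGNORECASE)),
    (TimexType.RELATIVE, re.compile(r"\b(ago|later|earlier)$", re.IGNORECASE)),
    (TimexType.DURATION, re.compile(r"^(for|throughout|during)\b", re.IGNORECASE)),
    (TimexType.DATE, re.compile(r"\b\d{4}$")),
]


def classify(expression: str) -> TimexType:
    """Return the first matching type, or VAGUE when no rule applies."""
    text = expression.strip()
    for timex_type, pattern in RULES:
        if pattern.search(text):
            return timex_type
    return TimexType.VAGUE


if __name__ == "__main__":
    for phrase in ["Mehr 1392", "From 1392 to 1395", "for six months",
                   "three days ago", "every Tuesday", "after graduation"]:
        print(f"{phrase!r:24} -> {classify(phrase).value}")
```

A real tagger would need far richer patterns or a trained model, but even this toy version shows why each type calls for its own handling downstream.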

A human reader can usually map a narrative like this onto a coherent timeline, although it can get confusing even for people. For machines, the challenge lies in interpreting and linking temporal expressions that range from absolute calendar dates to vague relational cues.

Let’s break down a few key types:

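Absolute Dates & Ranges: "Farvardin 1385" (roughly April 2006) is relatively straightforward. But ranges like "Mehr 1392 to Shahrivar 1395" (roughly Oct 2013 – Sept 2016) require calendar conversion and interpretation of Persian months, whose boundaries do not line up with Gregorian months (Farvardin 1385, for instance, runs from 21 March to 20 April 2006).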
Relative Dates: Expressions like "three days ago" are meaningful only relative to the document’s creation date (here, 12 October 2016). They require temporal anchoring and inference.
Periodic Expressions: "Every Tuesday" is a repeating pattern, not a one-time event, which makes it tricky to pin to discrete timeline points.
Vague Event-Linked Times: "After that internship" or "post-graduation" depend on correctly resolving what "that" refers to, or which graduation is meant. This adds a layer of coreference resolution and event sequencing to the task.
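
The first three cases above can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not our production code: `jalali_to_gregorian` uses a commonly used 33-year-cycle arithmetic approximation of the Persian calendar, which we assume is adequate for contemporary years such as 1385 to 1395, and all function names here are hypothetical. Vague event-linked expressions are left out because they need coreference resolution rather than date arithmetic.

```python
from datetime import date, timedelta


def jalali_to_gregorian(jy: int, jm: int, jd: int) -> date:
    """Convert a Persian (Jalali) date to a Gregorian date.

    Assumption: 33-year-cycle approximation, intended for recent years only.
    """
    jy += 1595
    days = -355668 + 365 * jy + (jy // 33) * 8 + ((jy % 33) + 3) // 4 + jd
    # Months 1-6 have 31 days; months 7-12 start after 186 days.
    days += (jm - 1) * 31 if jm < 7 else (jm - 7) * 30 + 186
    gy = 400 * (days // 146097)
    days %= 146097
    if days > 36524:
        days -= 1
        gy += 100 * (days // 36524)
        days %= 36524
        if days >= 365:
            days += 1
    gy += 4 * (days // 1461)
    days %= 1461
    if days > 365:
        gy += (days - 1) // 365
        days = (days - 1) % 365
    # `days` is now a zero-based day offset within Gregorian year `gy`.
    return date(gy, 1, 1) + timedelta(days=days)


def jalali_month_span(jy: int, jm: int) -> tuple[date, date]:
    """First and last Gregorian day of a Persian month."""
    first = jalali_to_gregorian(jy, jm, 1)
    next_y, next_m = (jy + 1, 1) if jm == 12 else (jy, jm + 1)
    last = jalali_to_gregorian(next_y, next_m, 1) - timedelta(days=1)
    return first, last


def resolve_relative(anchor: date, offset_days: int) -> date:
    """Anchor a relative expression (e.g. -3 for 'three days ago')."""
    return anchor + timedelta(days=offset_days)


def weekly_occurrences(start: date, end: date, weekday: int) -> list[date]:
    """Dates in [start, end] on `weekday` (Monday=0 ... Sunday=6)."""
    current = start + timedelta(days=(weekday - start.weekday()) % 7)
    found = []
    while current <= end:
        found.append(current)
        current += timedelta(weeks=1)
    return found


if __name__ == "__main__":
    print(jalali_month_span(1385, 1))      # 2006-03-21 to 2006-04-20
    range_start = jalali_month_span(1392, 7)[0]
    range_end = jalali_month_span(1395, 6)[1]
    print(range_start, range_end)          # 2013-09-23 2016-09-21
    creation_date = date(2016, 10, 12)     # document creation date from the post
    print(resolve_relative(creation_date, -3))  # 2016-10-09
    print(weekly_occurrences(date(2016, 10, 1), creation_date, 1))  # Tuesdays
```

Note that a periodic expression only becomes a list of dates once it is bounded by some interval; the window used in the last line is arbitrary and chosen purely for the demonstration.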

What Is Timeline Generation?

Timeline generation is the process of automatically identifying, ordering, and anchoring events in time based on information in text. It involves extracting temporal expressions, linking them to specific events, and arranging those events chronologically to reconstruct a coherent narrative or history. The timeline of events for the above example is given below:
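
Independently of the example's own timeline, the ordering step itself can be sketched in code. The snippet below is a simplified illustration, assuming each expression has already been anchored to a Gregorian interval: the `AnchoredEvent` class and `build_timeline` function are hypothetical names, and the entries are labeled by their temporal expressions (taken from this post) rather than by the actual events in the narrative.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AnchoredEvent:
    label: str   # here, the temporal expression the event is tied to
    start: date
    end: date    # equal to `start` for point-like events


def build_timeline(events: list[AnchoredEvent]) -> list[AnchoredEvent]:
    """Order events chronologically by start date, then by end date."""
    return sorted(events, key=lambda event: (event.start, event.end))


if __name__ == "__main__":
    # Intervals assume the calendar conversions and the 12 October 2016
    # document creation date discussed earlier in the post.
    unordered = [
        AnchoredEvent("three days ago", date(2016, 10, 9), date(2016, 10, 9)),
        AnchoredEvent("Mehr 1392 to Shahrivar 1395",
                      date(2013, 9, 23), date(2016, 9, 21)),
        AnchoredEvent("Farvardin 1385", date(2006, 3, 21), date(2006, 4, 20)),
    ]
    for event in build_timeline(unordered):
        print(f"{event.start} .. {event.end}  {event.label}")
```

Sorting is the easy part; in practice most of the effort goes into producing trustworthy intervals and into handling events whose anchors are uncertain or only partially ordered.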

Why It’s Hard — and Why It Matters

Accurately interpreting temporal expressions is crucial for many applications:

Event timeline generation (for news, biographies, legal records)
Narrative understanding
Chronological search and question answering
Risk assessment tied to geopolitical events (e.g., financial risk)
Planning and decision support systems

However, the diversity, ambiguity, and contextual nature of temporal language make automatic processing a serious technical challenge, especially for low-resource languages such as Persian and in multilingual settings.

Our Work at Zoorna Institute

Through projects like our Persian temporal reasoning benchmark, we’re developing tools that:

Identify and normalize temporal expressions across languages
Resolve implicit and vague references using context and reasoning
Construct coherent, event-based timelines even in noisy or informal data

This work supports broader efforts in multilingual AI, educational technology, and computational social science.

Stay tuned as we continue building open datasets, tools, and models that help bridge the gap between human temporal reasoning and machine understanding.

✨