Research Article

Unified Temporal Tokenization: A Hybrid Semantic and Numeric Mapping for Time-Aware Large Language Models

Authors

  • Inesh Hettiarachchi, Independent Researcher, Wilmington, DE, USA

Abstract

This paper introduces the Unified Temporal Tokenization (UTT) framework, a hybrid encoding mechanism that bridges numeric and semantic time representations for time-aware Large Language Models (LLMs). UTT translates continuous time-series signals and symbolic temporal descriptors into unified hybrid tokens, preserving hierarchical periodicity while retaining contextual meaning. Using the UCI Electricity Load dataset, we demonstrate that the Hybrid Temporal Tokenizer (HTT) improves prediction stability, interpretability, and efficiency in CPU-only environments, establishing a foundation for temporal reasoning in LLMs. Building on this result, the paper shows how hybrid temporal embeddings increase the model's ability to capture cyclical and contextual relationships across multiple time scales. The proposed framework combines numeric continuity with semantic periodicity, enabling LLMs to reason about time-sensitive patterns such as daily consumption cycles, seasonal variations, and discrete events. Experiments with LSTM, TemporalConv, and Transformer variants show consistent improvements in prediction accuracy, interpretability, and computational efficiency over traditional numeric-only tokenization. The UTT architecture is also flexible across domains, offering a unified approach to enterprise analytics, internet data streams, and economic forecasting problems that require temporal reasoning. By integrating semantic and numeric temporal mapping, this study provides a scalable foundation for time-aware large language models that spans language-based and data-based temporal intelligence.
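The paper does not publish its tokenizer code, but the hybrid idea described in the abstract can be sketched as follows. This is a minimal illustration under assumptions: the function name, the `<period|bucket>` token format, the four-way period-of-day vocabulary, and the bin count are all hypothetical, not the authors' implementation. It pairs a semantic time-of-day label with a quantized numeric reading, and emits sine/cosine features that preserve hourly periodicity for downstream embeddings.

```python
import math

def hybrid_temporal_token(hour, value, vmin=0.0, vmax=1.0, bins=16):
    """Illustrative sketch of hybrid semantic-numeric tokenization.

    All names and the token format here are assumptions, not the
    paper's published implementation.
    """
    # Semantic component: a coarse period-of-day label (assumed vocabulary).
    periods = ["night", "morning", "afternoon", "evening"]
    semantic = periods[hour // 6]

    # Numeric component: quantize the continuous reading into discrete bins,
    # so the value can be represented as a token while keeping ordering.
    frac = (value - vmin) / (vmax - vmin)
    bucket = min(bins - 1, max(0, int(frac * bins)))

    # Cyclical features: sine/cosine of the hour preserve the 24-hour
    # periodicity (hour 23 sits next to hour 0 in this representation).
    sin_h = math.sin(2 * math.pi * hour / 24)
    cos_h = math.cos(2 * math.pi * hour / 24)

    token = f"<{semantic}|v{bucket}>"
    return token, (sin_h, cos_h)

# Example: an electricity-load reading of 0.62 (normalized) at 14:00.
token, cyc = hybrid_temporal_token(14, 0.62)
```

In this sketch the string token carries the semantic and quantized-numeric content for a language-model vocabulary, while the cyclical pair would feed a continuous embedding channel, mirroring the abstract's combination of semantic periodicity with numeric continuity.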

Article information

Journal

Frontiers in Computer Science and Artificial Intelligence

Volume (Issue)

4 (4)

Pages

25-42

Published

2025-12-11

How to Cite

Hettiarachchi, I. (2025). Unified Temporal Tokenization: A Hybrid Semantic and Numeric Mapping for Time-Aware Large Language Models. Frontiers in Computer Science and Artificial Intelligence, 4(4), 25-42. https://doi.org/10.32996/fcsai.2025.4.4.3


Keywords:

Temporal Tokenization, Hybrid Semantic-Numeric Encoding, Time-Aware Language Models, Temporal Reasoning, Unified Temporal Tokenization (UTT), Hybrid Temporal Tokenizer (HTT), Temporal Data Fusion