MaxToki: A 1B-Parameter Temporal Foundation Model for Cellular Aging Trajectories
These articles are AI-generated summaries. Please check the original sources for full details.
Meet MaxToki: The AI That Predicts How Your Cells Age — and What to Do About It
MaxToki is a transformer decoder model designed to predict temporal shifts in gene network states across the human lifespan. It was trained on nearly 1 trillion gene tokens across 175 million single-cell transcriptomes to overcome the “snapshot” limitation of current biological foundation models.
Why This Matters
Most foundation models in biology treat single-cell transcriptomes as frozen snapshots, failing to account for the slow, progressive shifts in gene network states that drive age-related diseases like Alzheimer’s and pulmonary fibrosis over decades. Because of this blind spot, researchers can see a cell’s current state but not where it is headed.
MaxToki addresses this with a temporal prompting strategy and continuous numerical tokenization, which let the model reason across trajectories. Scaled to 1 billion parameters, with RoPE-based context extension to 16,384 tokens, it achieves a median prediction error of 87 months for held-out ages, roughly half the 178-month error of a linear regression baseline.
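The comparison with the baseline comes down to a ratio of two median errors. The sketch below assumes the reported figure is a median absolute error in months; the summary does not define the metric exactly, and the function name and toy ages are hypothetical.

```python
import statistics

def median_abs_error_months(true_ages, predicted_ages):
    """Median absolute difference between true and predicted ages, in months.

    Assumption: this is one plausible reading of "median prediction error";
    the summary does not spell out the metric.
    """
    return statistics.median(abs(t - p) for t, p in zip(true_ages, predicted_ages))

# Toy held-out ages in months (hypothetical values, not data from the study).
toy_error = median_abs_error_months([120, 240, 360], [100, 300, 365])

# Figures reported in this summary.
maxtoki_error = 87    # months
baseline_error = 178  # months
error_ratio = baseline_error / maxtoki_error  # about 2.05
```

A ratio just above 2 means the baseline's median error is about twice MaxToki's.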
Key Insights
- Fact: MaxToki reduced the median prediction error for held-out cellular ages to 87 months, compared to 178 months for standard SGDRegressor baselines (2026).
- Concept: Rank value encoding orders genes by relative expression within a cell to amplify transcription factors and reduce technical batch effects.
- Tool: FlashAttention-2 via the NVIDIA BioNeMo stack enabled a 5x improvement in training throughput on H100 80GB GPUs.
- Fact: The model inferred 15 years of age acceleration in lung fibroblasts from patients with pulmonary fibrosis, despite being trained only on healthy donors.
- Concept: In-context learning allows the model to infer trajectory context (cell type and gender) from cell states without explicit labels.
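The rank value encoding idea from the list above can be sketched in a few lines. This is a minimal illustration, not MaxToki's actual tokenizer. It assumes each gene's count is divided by a per-gene scaling factor before ranking, which is a guess at how ubiquitous genes get deprioritized. The function name, gene names, and scaling values are all hypothetical.

```python
import numpy as np

def rank_value_encode(counts, gene_ids, gene_scale, max_len=None):
    """Return gene ids ordered by scaled expression, highest first.

    Assumption: `gene_scale` holds a per-gene typical expression level
    (hypothetical values here); undetected genes are dropped.
    """
    counts = np.asarray(counts, dtype=float)
    scaled = counts / np.asarray(gene_scale, dtype=float)
    detected = np.flatnonzero(counts > 0)
    # Stable sort on negated values keeps ties in original gene order.
    order = detected[np.argsort(-scaled[detected], kind="stable")]
    if max_len is not None:
        order = order[:max_len]
    return [gene_ids[i] for i in order]

# Toy cell: a housekeeping gene dominates raw counts, a transcription
# factor (TF_A, hypothetical) has few counts but is high for that gene.
gene_ids = ["ACTB", "GAPDH", "TF_A", "GENE_X"]
counts = [500, 300, 12, 0]
gene_scale = [400.0, 350.0, 3.0, 10.0]

tokens = rank_value_encode(counts, gene_ids, gene_scale)
# Multiplying every count by a constant (e.g. deeper sequencing) leaves the order unchanged.
tokens_deeper = rank_value_encode([c * 10 for c in counts], gene_ids, gene_scale)
```

Because only the ordering is kept, a uniform change in a cell's sequencing depth does not change the encoded sequence. This gives one intuition for why rank-based inputs may be less sensitive to some technical effects.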
Practical Applications
- In Silico Screening for Longevity: Researchers used MaxToki to nominate pro-aging drivers in cardiac cells, which were later validated in vivo to cause measurable cardiac dysfunction in mice. Pitfall: relying on raw transcript counts instead of rank encoding biases models toward ubiquitous housekeeping genes.
- Alzheimer’s Pathology Analysis: The model distinguished symptomatic Alzheimer’s patients, whose microglia showed a 3-year age acceleration, from resilient individuals, who showed no acceleration. Pitfall: treating time lapses as discrete categories rather than a numerical continuum significantly degrades prediction accuracy.
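The second pitfall above is easier to see with a toy comparison. The sketch below is only an illustration of the general point and does not represent MaxToki's tokenization or training loss. Both function names are hypothetical.

```python
def categorical_error(true_age, predicted_age):
    """0/1 error: every wrong label costs the same, however far off it is."""
    return 0 if true_age == predicted_age else 1

def numerical_error(true_age, predicted_age):
    """Absolute error in years: a near miss costs less than a far miss."""
    return abs(true_age - predicted_age)

true_age = 40
near_miss, far_miss = 41, 80

# Treated as categories, both guesses are equally wrong.
same_penalty = categorical_error(true_age, near_miss) == categorical_error(true_age, far_miss)

# Treated as numbers, the near miss is clearly the better guess.
near_cost = numerical_error(true_age, near_miss)
far_cost = numerical_error(true_age, far_miss)
```

With categorical labels, the model gets no signal that 41 is closer to 40 than 80 is. A numerical representation preserves that ordering.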