AI News
AI News | AI Infrastructure | Large Language Model
Sakana AI and NVIDIA Introduce TwELL: 20.5% Faster LLM Inference via Unstructured Sparsity
Sakana AI and NVIDIA introduced TwELL and custom CUDA kernels that exploit activation sparsity in LLMs, achieving a 20.5% inference speedup and a 21.9% training speedup.
AI News | Large Language Model | Software Engineering
Mastering LLM Distillation: Soft-Label, Hard-Label, and Co-distillation Strategies
LLM distillation transfers reasoning capabilities from a teacher model to a student model through techniques such as soft-label distillation, hard-label distillation, and co-distillation, reducing costs while maintaining performance.
AI News | Agentic AI | Software Engineering
NadirClaw: Building Cost-Aware LLM Routing with Local Prompt Classification
NadirClaw introduces an intelligent local routing layer that classifies prompts into simple and complex tiers, enabling dynamic switching between Gemini Flash and Pro to reduce LLM costs by up to 50%.