Tag: ai-agents
All the articles with the tag "ai-agents".
LHC v0.2: A Benchmark for Long-Horizon Agent Coherence (and the Methodology That Got It Honest)
Published: at 08:00 PM
I just published LHC v0.2, an open benchmark for long-horizon coherence in 8B-class agent models, plus a deterministic parser baseline that puts a useful floor on what fine-tuning is worth for structured-state tasks. This post explains what they're for, how to use them, and the methodology arc that produced them across five rounds of external review.
LHC v0.2: Um Benchmark para Coerência de Longo Horizonte em Agentes (e a Metodologia que Tornou os Resultados Honestos)
Published: at 08:00 PM
I just published LHC v0.2, an open benchmark for long-horizon coherence in 8B-class agent models, plus a deterministic parser baseline that puts a useful floor on what fine-tuning is worth for structured-state tasks. This post explains what they are for, how to use them, and the methodological arc that produced them over five rounds of external review.
We're Mistaking the Bootstrap Phase for the Future of AI Agents
Published: at 08:00 AM
The self-hosted AI agent movement is real and important. But we are confusing a bootstrap phase with a destination architecture. The long-term future of agents will be defined by platforms that make them reliable, governable, and operationally boring.
Estamos Confundindo a Fase de Bootstrap com o Futuro dos Agentes de IA
Published: at 08:00 AM
The self-hosted AI agent movement is real and important. But we are confusing a bootstrap phase with a destination architecture. The long-term future of agents will be defined by platforms that make them reliable, governable, and operationally boring.
Execution Is Cheap. Judgment Isn't: AI Agents and the Collapse of the CTO/CPO Divide
Published: at 10:00 AM
When execution becomes abundant through AI agents, judgment becomes the bottleneck. The traditional separation between technical and product leadership breaks down, creating space for the CPTO role.