For much of the AI era, intelligence has been on-demand: a user issues a prompt, and the model responds after reasoning through the request. But as AI systems grow more autonomous and expectations rise for real-time reasoning, low latency, and cost-efficiency, the definition of intelligence is shifting. We’re entering a new phase where AI is expected to stay ready for the next request—even during downtime.
The key to unlocking this proactive AI future may lie in an unexpected moment: when the AI is “asleep.” Putting that idle phase to work is now called sleep-time compute.
The term was coined in an April 2025 white paper by Letta, an AI startup spun out of UC Berkeley’s Sky Computing Lab and founded by researchers Charles Packer and Sarah Wooders. Developed in collaboration with Ion Stoica, a cofounder of Databricks and Anyscale, and others, the sleep-time compute framework aims to shift AI from reactive to proactive intelligence. Instead of waiting for prompts, AI agents use idle time to precompute answers, refine memory, and anticipate user needs.
Wooders says the idea draws inspiration from neuroscience: just as humans consolidate memories during sleep and reflect beyond immediate tasks, AI agents should be able to put their downtime to similar use.
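To make the idea concrete, here is a minimal Python sketch of an agent that precomputes answers while idle and falls back to on-demand work when its guess was wrong. It is an illustration only, not Letta's implementation or API: the `SleepTimeAgent` class, its methods, and the `toy_solve` and `toy_anticipate` helpers are all hypothetical names, and the "model" is a plain function standing in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SleepTimeAgent:
    """Toy agent that uses idle time to precompute likely answers.

    Every name here is hypothetical; this is not Letta's API.
    """
    solve: Callable[[str, str], str]        # (context, question) -> answer; stands in for a model call
    anticipate: Callable[[str], List[str]]  # context -> questions the user is likely to ask
    context: str = ""
    cache: Dict[str, str] = field(default_factory=dict)
    live_calls: int = 0  # model calls made while the user is waiting

    def update_context(self, context: str) -> None:
        # New context invalidates anything precomputed from the old one.
        self.context = context
        self.cache.clear()

    def sleep(self) -> None:
        # Idle phase: answer anticipated questions before anyone asks.
        for question in self.anticipate(self.context):
            self.cache[question] = self.solve(self.context, question)

    def answer(self, question: str) -> str:
        # Request phase: serve from the cache when the guess was right,
        # otherwise compute on demand.
        if question in self.cache:
            return self.cache[question]
        self.live_calls += 1
        return self.solve(self.context, question)


def toy_solve(context: str, question: str) -> str:
    # Stand-in for an expensive reasoning step.
    words = context.split()
    if question == "word count":
        return str(len(words))
    if question == "longest word":
        return max(words, key=len)
    return "unknown"


def toy_anticipate(context: str) -> List[str]:
    # Stand-in for predicting what the user will ask next.
    return ["word count"]


agent = SleepTimeAgent(solve=toy_solve, anticipate=toy_anticipate)
agent.update_context("agents can think while idle")
agent.sleep()                              # work done during downtime
print(agent.answer("word count"))          # served from the cache
print(agent.answer("longest word"))        # not anticipated, computed on demand
print(agent.live_calls)                    # only the second question cost live compute
```

The trade-off the sketch exposes is the one the framework bets on: idle-time work pays off only when the anticipated questions overlap with what the user actually asks, so the quality of the anticipation step matters as much as the precomputation itself.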