Case Study

How AfterQuery Replaced GKE with Daytona to Power Frontier-Grade Agent Simulations

86%

Reduction in sandbox creation times

40+

Hours of engineering time saved weekly

AfterQuery is an applied research lab curating high-quality human data for frontier AI development, working with domain experts from AI2 and Stanford to serve leading US AI and tech companies.

Headquarters

San Francisco, CA

Industry

Applied AI Research, Foundation Model Development

Department

Research Operations, Engineering and Tooling

Key Features

Sandbox Creation Speed, Long-Running Sandboxes, Infrastructure Scale

afterquery.com

Learn how this applied research lab partnered with Daytona to provision isolated, long-running sandboxes that power high-fidelity agent simulations at scale.

As a frontier applied AI research lab, the infrastructure we need often does not exist. Daytona broke that trend. Their efficient, intuitive runtime platform now powers the sandbox infrastructure for our entire environments program.

Sam Jung

Head of Research Operations at AfterQuery

01 -- CHALLENGE

Provisioning Sandboxes In-House Limited Agent Simulations

AfterQuery builds custom environments where AI agents are trained and evaluated against expert-designed tasks and criteria. Naturally, the quality of this training depends heavily on the infrastructure that provisions and manages these environments at scale.

At the time, Sam Jung, Head of Research Operations, and Agustin Garcinuno, Founding Engineer, had stood up a GKE cluster to generate sandboxes in-house. While this approach supported custom environment configurations, such as resource limits, file system paths, and API keys, handling this setup for hundreds of agent trials created significant overhead.

Building sandboxes was just the starting point. As agents completed trials, the AfterQuery team continuously refined task evaluation criteria and toolkits to improve training outcomes. To start trials from the latest sandbox state, they had to regularly version sandboxes after iteration cycles. As task types multiplied, doing this by hand quickly became unscalable.

“We had to configure resource limits, inject the right API keys and file paths, and save the environment state after every iteration cycle, all by hand,” shares Sam. “At a certain point, it was all we were doing.”

Trial demands only compounded that overhead. Agents could work on a task for hours, and they often behaved unpredictably during learning. To keep experiments moving safely, AfterQuery needed airtight sandboxes that could sustain long runs.

As the operational load began crowding out core research, the two decided to find a dedicated runtime provider. But none of the solutions they evaluated delivered on its infrastructure promises: each either required lengthy setup or failed under the load of long, concurrent runs.

That’s when Sam and Agustin discovered Daytona. Its agent-native runtime platform provided the long-running stability and concurrent provisioning AfterQuery needed to accelerate agentic research and training at scale.

Every hour we spend on infrastructure is time taken from creating high-quality simulations. We needed a dedicated runtime solution that integrated with our simulation workflows and scaled as demand grew. Daytona checked all those boxes.

Sam Jung

Head of Research Operations at AfterQuery

02 -- SOLUTION

Unlocking Fast, Scalable, and Stateful Sandbox Infrastructure

Thanks to Daytona's detailed documentation and simple API, AfterQuery replaced its GKE setup with fully managed sandbox infrastructure in a single day. This turnkey onboarding process not only accelerated time-to-value but also let the AfterQuery team keep its focus on building high-fidelity simulations.

Now, AfterQuery uses Daytona to provision hundreds of pre-configured sandboxes on demand. Through Daytona Snapshots, Sam and Agustin build, refine, and version every task’s sandbox, preserving updated agent toolkits, API keys, and evaluation criteria to sharpen simulation quality. With this workflow, sandboxes start instantly from warm, stored states, eliminating hours of setup time.
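To make the versioning workflow concrete, here is a minimal Python sketch of the idea: each task keeps an append-only history of sandbox states, and new trials start from the latest one. It does not use Daytona's SDK or reflect AfterQuery's code. `SandboxSpec`, `SnapshotRegistry`, the field names, and the `spreadsheet-task` label are all hypothetical, and the in-memory registry only stands in for a real snapshot store.

```python
from dataclasses import dataclass, field, replace
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class SandboxSpec:
    """Hypothetical description of one task's sandbox state (not a Daytona type)."""
    cpu_cores: int
    memory_gib: int
    workdir: str
    env_var_names: Tuple[str, ...]  # names only; secret values live elsewhere
    toolkit_version: str


@dataclass
class SnapshotRegistry:
    """In-memory stand-in for a snapshot store, keyed by task name."""
    _versions: Dict[str, List[SandboxSpec]] = field(default_factory=dict)

    def save(self, task: str, spec: SandboxSpec) -> str:
        """Append a new immutable version and return its label, e.g. 'task@v2'."""
        history = self._versions.setdefault(task, [])
        history.append(spec)
        return f"{task}@v{len(history)}"

    def latest(self, task: str) -> SandboxSpec:
        """Return the most recent version so new trials start from it."""
        return self._versions[task][-1]


registry = SnapshotRegistry()
base = SandboxSpec(2, 4, "/workspace", ("API_KEY",), "toolkit-1")
registry.save("spreadsheet-task", base)

# After an iteration cycle, record the refined toolkit as a new version.
label = registry.save("spreadsheet-task", replace(base, toolkit_version="toolkit-2"))
```

Because every version is frozen and appended rather than overwritten, an earlier sandbox state stays available for reruns while new trials pick up the latest one.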

Daytona's built-in Port Forwarding strengthens this iteration cycle, enabling Sam and Agustin to run development servers inside sandboxes and preview them in the browser. That way, they validate that simulations perform as designed before locking in a new snapshot version.

These sandboxes run fully isolated in parallel, so they don’t compete for resources or allow one agent’s actions to bleed into another trial. Because this process runs seamlessly in the cloud, AfterQuery’s team drives higher trial throughput with minimal overhead. At the same time, customers benefit from faster trials and model iterations.
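The isolation property described above can be illustrated with a small Python sketch: each trial owns its state, trials run concurrently, and one failing trial does not disturb the others. This is an assumption-laden stand-in rather than Daytona's API. `run_trial`, `run_all`, and the simulated failure are hypothetical, and a real trial would execute inside its own sandbox instead of a coroutine.

```python
import asyncio
from typing import Dict, List


async def run_trial(trial_id: int) -> Dict[str, object]:
    """Stand-in for one agent trial; a real trial would run in its own sandbox."""
    workspace = {"trial_id": trial_id, "files": []}  # private state, never shared
    await asyncio.sleep(0)  # placeholder for hours of agent work
    if trial_id == 2:
        raise RuntimeError("agent misbehaved")  # simulated unpredictable agent
    workspace["files"].append(f"result-{trial_id}.json")
    return workspace


async def run_all(trial_ids: List[int], max_concurrent: int = 100) -> list:
    """Run trials concurrently, capped at max_concurrent in flight."""
    limit = asyncio.Semaphore(max_concurrent)

    async def guarded(trial_id: int) -> Dict[str, object]:
        async with limit:
            return await run_trial(trial_id)

    # return_exceptions=True keeps one failed trial from cancelling the others.
    return await asyncio.gather(
        *(guarded(t) for t in trial_ids), return_exceptions=True
    )


results = asyncio.run(run_all([1, 2, 3]))
```

Here trial 2 fails, yet trials 1 and 3 still return their own untouched workspaces, which is the behavior per-trial isolation is meant to guarantee.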

As agents run commands and call APIs across hours-long trials, Daytona’s sandboxes support stable execution for the full duration of their run. That reliability gives Sam's team the confidence to focus on refining the next agent simulation.

While Daytona handles the runtime layer end-to-end, Sam and his team stay in full control of agent trials. Daytona's Pseudo Terminal capabilities give the team direct command-line access to any running sandbox. With this connection, they can tail logs, inspect agent state, or manually trigger a command to test a hypothesis, compressing the feedback loop between runs.

The depth of Daytona’s feature set is matched by an intuitive developer experience. When Sam and Agustin implement new tools or fine-tune their sandbox infrastructure, Daytona’s support team and comprehensive guides provide immediate answers. For a team pushing the frontier of autonomous agents’ capabilities, that clarity accelerates every step of the research cycle.

Our entire process is built around versioning environments. Having Daytona’s Snapshots handle this process has been a huge benefit for our team.

Agustin Garcinuno

Founding Engineer at AfterQuery

03 -- RESULT

AfterQuery Saves 40+ Engineering Hours Each Week With Daytona

With Daytona, AfterQuery now has the scalable sandbox infrastructure needed to accelerate its research roadmap. Sam and Agustin now consistently sharpen the quality of their simulations, scaling across domains while enhancing agent training outcomes for their clients.

86% reduction in sandbox creation times
40+ hours of engineering time saved weekly

Looking ahead, Sam and Agustin are already exploring how Daytona’s real-time CPU and storage resizing could help them dynamically optimize resources during long-running trials. They’re also eager to unlock faster iteration on simulated workflows running on iOS.