Staff acknowledgement: The following blog post is based on a recent talk Bharath Bhat, CEO of Relvy, gave to an audience of engineers and product managers at an e-commerce company’s AI talk series.
How Relvy’s AI agents plan, confirm, act, and summarize to keep investigations effective in complex enterprise deployments without losing context or control.
At Relvy, we build AI agents that analyze engineering data - logs, metrics, traces, internal events, code, and documentation - to help engineers debug faster. Our system is made up of multiple collaborating agents: a planner that interacts with engineers via notebooks, and specialized agents that integrate with the observability tools, code, documents, and internal data sources found in most enterprise-scale companies. (See Part 1 of this blog series for more.)
But one of our first pilots reminded us of a truth that cuts across every AI hype cycle: without human guidance, even the best AI agents can fall flat in the enterprise.
Early on, our AI lacked context out of the box.
One telling exchange:
Engineer: “Please look at logs for the frontend service (production).”
AI: “Here’s the query I ran → env:production service:frontend”
Engineer: “We don’t use the service field. We use k8s.container.name.”
In the enterprise, data is messy and schemas are inconsistent. AI agents don’t “just know” your logging conventions, dashboards, or workflows. At Relvy, we learned to solve this by starting agents in learning mode:
The key: plan → confirm → act. This keeps control in the user’s hands and allows the AI to learn the organization’s unique patterns without creating costly mistakes.
The most effective AI debugging isn’t magic—it’s preparation. At Relvy, we’ve learned that a well-prepared agent is a fast, accurate agent. That’s why our AI doesn’t sit idle, waiting for a prompt. Instead, it continuously explores your data in the background: identifying schemas and key fields, sampling values to understand normal patterns, and spotting anomalies to summarize for later use. By the time a question comes in, the AI already has a compact, relevant context at the ready.
The payoff of this preparation is clear: faster answers because there’s less need to scan raw data at query time, higher accuracy because reasoning happens over structured facts instead of noise, and smarter context use because only the most valuable details take up memory.
Think of it as having a research assistant who’s always updating your briefing notes so you can jump straight to decision-making.
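A minimal sketch of what such background preparation might look like, under the assumption that log records are flat dictionaries (the function and field names here are illustrative, not Relvy’s code): scan a sample, record which fields exist, and keep only a compact per-field summary for the agent to consult later.

```python
# Hypothetical background-preparation pass: build "briefing notes" from a
# sample of records so queries can reason over a summary, not raw data.
from collections import Counter, defaultdict

def build_briefing(records: list[dict], top_k: int = 3) -> dict:
    """Summarize field names and their most common values."""
    field_values = defaultdict(Counter)
    for rec in records:
        for field, value in rec.items():
            field_values[field][str(value)] += 1
    return {
        field: {
            "seen_in": sum(counts.values()),              # coverage of the field
            "top_values": [v for v, _ in counts.most_common(top_k)],
        }
        for field, counts in field_values.items()
    }

sample = [
    {"k8s.container.name": "frontend", "level": "info"},
    {"k8s.container.name": "frontend", "level": "error"},
    {"k8s.container.name": "checkout", "level": "info"},
]
briefing = build_briefing(sample)
# e.g. briefing reveals that services live under "k8s.container.name"
```

Even this crude pass would have answered the schema question from the earlier exchange without any human correction.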
Preparation alone isn’t enough. When you’re working with millions of log lines, high-cardinality metrics, and deep traces, managing model context with highly customized, domain-specific summarization becomes critical.
We are always trying to improve the signal-to-noise ratio of the inputs we send to a large language model. We rely on traditional statistical techniques whenever possible to summarize and compress data and make it easy for the LLM to focus on reasoning.
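One simple statistical technique of this kind (an illustrative stand-in, not necessarily what Relvy uses) is template extraction: mask the variable tokens in each log line, count the resulting templates, and hand the LLM a ranked frequency summary instead of raw lines.

```python
# Collapse log lines into templates by masking variable tokens, then
# send counts of the top templates instead of the raw lines.
import re
from collections import Counter

def to_template(line: str) -> str:
    line = re.sub(r"\b\d+(\.\d+)?\b", "<NUM>", line)   # mask numbers
    line = re.sub(r"\b[0-9a-f]{8,}\b", "<ID>", line)   # mask long hex ids
    return line

def compress(lines: list[str], top_k: int = 5) -> str:
    counts = Counter(to_template(l) for l in lines)
    return "\n".join(f"{n}x {t}" for t, n in counts.most_common(top_k))

logs = [
    "timeout calling payments after 5000 ms",
    "timeout calling payments after 5003 ms",
    "request 42 ok",
]
summary = compress(logs)
# "2x timeout calling payments after <NUM> ms" ranks first
```

Millions of lines collapse into a handful of counted templates, so the model’s context is spent on reasoning about patterns rather than re-reading near-duplicates.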
The same process is embedded into our agentic loop: decide the next step, analyze tool output, extract the key findings, and append only the most important summaries back into working memory. This act → analyze → summarize → repeat cycle is how agents stay coherent and efficient through long, complex investigations.
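The shape of that loop can be sketched in a few lines. Everything below is a toy stand-in (a canned plan and canned tool output rather than real tool calls or an LLM); the structural point it shows is that only the summary of each step is appended to working memory, so raw output never accumulates in the agent’s context.

```python
# Schematic act -> analyze -> summarize -> repeat loop with toy stand-ins.
STEPS = ["check error rate", "inspect frontend logs"]        # hypothetical plan
TOOL_OUTPUT = {                                              # canned, oversized results
    "check error rate": "errors: 0.1% baseline, 5.2% at 14:03 " + "x" * 500,
    "inspect frontend logs": "ConnectionError to payments svc " + "y" * 500,
}

def summarize(raw: str, limit: int = 40) -> str:
    """Stand-in for LLM/statistical summarization: keep only the head."""
    return raw[:limit]

def investigate(max_steps: int = 5) -> list[str]:
    memory: list[str] = []                  # compact findings only
    for step in STEPS[:max_steps]:          # act: take the next step
        raw = TOOL_OUTPUT[step]             # tool output can be huge
        memory.append(summarize(raw))       # analyze + summarize, then repeat
    return memory

findings = investigate()
```

However long the investigation runs, `memory` grows by one short finding per step rather than by the size of the tool output.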
Relvy’s core UX is similar to that of other data analysis and visualization tools like Jupyter notebooks. We’ve found that this shared workspace, where AI and engineers investigate incidents together, yields faster results and shorter ramp-up times than traditional chat-based AI interfaces. Unlike automated systems that jump to hypotheses or run linear investigations with limited user input, a notebook approach is interactive, transparent, and reusable.
This combination gives teams control, clarity, and a growing knowledge base, making AI a trusted partner rather than an opaque automation layer.
LLMs are incredibly powerful, but they don’t come with your org’s tribal knowledge built in. Human input, background preparation, thoughtful summarization, and an interface that fosters human-AI collaboration and transparency are what make LLMs work in real-world debugging.
Relvy provides AI-powered debugging notebooks that help engineers investigate and resolve incidents faster. It finds the root cause in over 70% of alerts automatically, and when it doesn’t, engineers can easily guide or correct the investigation—saving valuable time across the team.