Confidence in proxy AI: Why infrastructure must be evaluated first

When AI agents enter real-world deployments, organizations are under pressure to define where they belong, how to build them efficiently and how to operate them at scale. At VentureBeat’s 2025 Transform, technology leaders come together to talk about how they can transform their businesses with agents: Joanne Chen, general partner at Foundate Capital; Shailesh Nalawadi, vice president of project management at Sendbird; Thys Waanders, senior vice president of AI transformation in cognition; and Shawn Malhotra, chief technology officer, Rockets.

https://www.youtube.com/watch?v=dchzgcf1poo

Some advanced proxy AI use cases

“The initial attraction of these deployments to AI agents is often around saving human capital – math is very simple,” Naravadi said. “But this highlights the transformative power you gain with AI agents.”

At Rocket, AI proxy has proven to be a powerful tool to increase website conversion.

“We found that through agent-based experience, conversation experience on the website, customers are three times more likely to switch through the channel,” Malhotra said.

But it’s just scratching the surface. For example, a rocket engineer set up an agent in just two days to automate a highly professional task: calculating transfer taxes during mortgage underwriting.

“The two-day effort saved us a million dollars a year,” Malhotra said. “In 2024, we saved more than a million team members hours, mainly behind our AI solutions. It’s not just a cost saving. It also allows our team members to focus their time on people who make their lives the biggest financial transactions in their lives.”

Agents are essentially supercharged individual team members. The million hours saved are not the entire work that many people replicate. Things employees don’t like to do are part of the work, or do not add value to customers. And saving millions of hours, Rocket has the ability to handle more business.

“Last year, some of our team members were able to process 50% of their customers than they did the previous year,” Malhotra added. “That means we can have higher throughput, drive more business, and then see higher conversion rates because they spend time understanding what customers need, rather than doing more rote work that AI can do now.”

Complexity of coping agents

“Part of our journey of engineering teams is to shift from the way of thinking in software engineering – one time and test and run and give the same answer 1,000 times – a more probabilistic approach, in which you come up with the same LLM, which gives different answers through some probabilities,” Naravadi said. “A lot of them are always taking people. Not only software engineers, but product managers and UX designers.”

What helped is that LLMS has come a long way, Waanders said. If they built something 18 months or two years ago, they really had to choose the right model or the agent wouldn’t be able to perform as expected. Now, he said, we are at the stage where most mainstream models perform well. They are more predictable. But today, the challenge is to combine models, ensure responsiveness, perform the right model in the right order and weave it in the right data.

“Our customers drive tens of millions of conversations every year,” Waanders said. “If you have 30 million conversations automatically in a year, how about the scale in the LLM world? That’s all we have to discover, even with cloud providers to get the usability of the model.

Malhotra said that the agent network is being planned on the layer above the carefully planned LLM. The conversation experience has a network of agents under the hood, and the orchestrator is deciding which agent can farm requests from available agents.

“If you’re going forward and consider having hundreds or thousands of agents that can have different things, you’re having some really interesting technical issues,” he said. “This has become a bigger problem because latency and time matters. This proxy route will be a very interesting issue in the coming years.”

Utilize supplier relationships

So far, the first step for most companies launching proxy AI has been built internally because professional tools do not exist yet. However, you can’t distinguish and create value by building a common LLM infrastructure or AI infrastructure, and you need professional expertise to go beyond the initial build, debug, iterate and improve what is built and maintain the infrastructure.

“We often find that the most successful conversations with potential customers are often people who have built something inside,” Naravadi said. “They quickly realized that it’s OK to reach 1.0, but as the world grows and infrastructure grows, they don’t have the ability to plan all of these things when they need to swap technology for something new.”

Prepare for proxy AI complexity

In theory, proxy AI will only grow in complexity – the number of proxy in an organization will increase, and they will start learning from each other, and the number of use cases will explode. How do organizations prepare for challenges?

“This means that checks and balances in your system add stress,” Malhotra said. “For things that have regulatory processes, you have a loop of people to make sure someone signs on it. Do you have observability for critical internal processes or data access? Do you have the right alerts and surveillance to make sure something goes wrong, do you know it goes wrong? It will gradually drop in the case you find out because you need to be in the loop and you run into a certain range in these processes because it is at where and where it is, and you have to follow these issues because it is at where and where it is, and you have to follow these issues. Unlock, you have to do that.”

So, how do you have the confidence that AI agents will act reliably as AI agents develop?

“If you didn’t think about it in the first place, that part is really hard,” Naravadi said. “The short answer is that you should have an evaluation infrastructure before you even start building it. Make sure you have a strict environment that you can know from AI agents and that you have this test set. As you make improvements, go ahead and introduce it. A very simple way to think about it, unit testing your agent system.”

The problem, Waanders added, is that it is non-deterministic. Unit testing is crucial, but the biggest challenge is that you don’t know what you don’t know – the agent may show something incorrect, and it may react in any given situation.

“You can only find this by simulating the conversation at scale, pushing it in thousands of different situations, and then analyzing how it stays and how it reacts,” Waanders said.

What's Hot

AT&T launches wireless account lockout protection to curb Sim-S-Swap Scourge

Tesla delivery fell 14% in Q2

Confidence in proxy AI: Why infrastructure must be evaluated first

Confidence in proxy AI: Why infrastructure must be evaluated first

Download: Stumble with AI and stop the crawler robot

Unlock performance: Accelerate Pandas operation using Polars

CTGT’s AI platform is built to eliminate bias, hallucination in AI models

See blood clots before the strike

AI-controlled robot shows unstable driving, NHTSA problem Tesla

Estonia’s AI Leap brings chatbots to school

Smart Home Décor : Technology Offers a Slew of Options

Edifier W240TN Earbud Review: Fancy Specs Aren’t Everything

Review: Xiaomi’s New Mobile with Hi-fi and Home Cinema System

AT&T launches wireless account lockout protection to curb Sim-S-Swap Scourge

Tesla delivery fell 14% in Q2

Confidence in proxy AI: Why infrastructure must be evaluated first

Don’t wait for Prime Day PS5 deals – Here are the 13 best early savings I’ve found

Our Picks

AT&T launches wireless account lockout protection to curb Sim-S-Swap Scourge

Tesla delivery fell 14% in Q2

Confidence in proxy AI: Why infrastructure must be evaluated first

Top Reviews

Smart Home Décor : Technology Offers a Slew of Options

Edifier W240TN Earbud Review: Fancy Specs Aren’t Everything

Review: Xiaomi’s New Mobile with Hi-fi and Home Cinema System

Subscribe to Updates

What's Hot

Confidence in proxy AI: Why infrastructure must be evaluated first

Some advanced proxy AI use cases

Complexity of coping agents

Utilize supplier relationships

Prepare for proxy AI complexity

Related Posts