Can We Trust AI Agents in Time? | CCN.com


Key Takeaways

  • CCN speaks to Michael Sena, Co-Founder of Recall Labs.
  • Sena believes Recall Labs’ flagship Agent Rank platform is the answer to establishing trust in the rapidly expanding field of autonomous agents.
  • As AI agents become more capable, Sena argues that alignment with human values must be built in from the start.

AI agents, autonomous programs that can decide and act, are no longer the stuff of science fiction.

They’re now building crypto portfolios, executing trades and writing code, all while expanding at breakneck speed.

But with hundreds, maybe thousands, of agents emerging daily, the threats posed by AI cannot be ignored, and a central question looms: which ones can we trust?

Michael Sena, co-founder of Recall Labs, believes his company has found the answer.

Speaking to CCN, Sena explained why alignment and reputation will be critical as AI moves toward an era of autonomous swarms and potential AGI.

A Ranking System for Agent AI

Sena likened the current AI landscape to the early days of the internet:

“There’s an explosion of models, agents, tools, and workflows, but there’s not yet a good way to discover which are the most effective, high quality, high performance tools for your specific need or use case,” he told CCN.

Recall Labs’ answer is Agent Rank, a competition-driven system to evaluate AI agents.

Sena described it as “much like Google’s early PageRank system,” a reputation framework built through head-to-head contests.

Recall’s “on-chain AI arenas” pit agents against each other in controlled scenarios, with results transparently logged and ranked.
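The article does not describe how Agent Rank actually converts contest results into rankings. As a rough illustration of how a leaderboard can emerge from head-to-head results, here is a minimal sketch that uses an Elo-style rating update, a technique swapped in purely for illustration and not Recall's method. The agent names, the starting rating of 1000 and the K-factor of 32 are all hypothetical.

```python
# Illustrative only: an Elo-style update, NOT Recall's actual scoring method.
# Agent names, the starting rating and the K-factor are hypothetical.

def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400))


def update_ratings(ratings, winner, loser, k=32.0, start=1000.0):
    """Shift rating points from the loser to the winner after one contest."""
    winner_rating = ratings.get(winner, start)
    loser_rating = ratings.get(loser, start)
    # An upset (low-rated agent beating a high-rated one) moves more points.
    gain = k * (1.0 - expected_score(winner_rating, loser_rating))
    ratings[winner] = winner_rating + gain
    ratings[loser] = loser_rating - gain
    return ratings


# A made-up log of contest results as (winner, loser) pairs.
results = [
    ("agent_a", "agent_b"),
    ("agent_a", "agent_c"),
    ("agent_b", "agent_c"),
]

ratings = {}
for winner, loser in results:
    update_ratings(ratings, winner, loser)

# Highest-rated agent first.
leaderboard = sorted(ratings, key=ratings.get, reverse=True)
print(leaderboard)
```

Because every contest moves the same number of points from loser to winner, the ratings stay comparable across agents, and anyone holding the logged results can recompute the leaderboard.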

The idea is to go beyond marketing claims and measure real performance.

In crypto trading competitions, for example, “agents log their trades and their reasoning for why they’re making those trades,” Sena explained.

Metrics can be simple, like pure profit and loss, or more nuanced, like the Sharpe ratio, which measures risk-adjusted returns.
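To show why a risk-adjusted metric can rank agents differently from raw profit and loss, here is a minimal sketch of a per-period Sharpe ratio: the mean excess return divided by the standard deviation of those returns. The function name, the sample return series and the zero risk-free rate are assumptions made for illustration, not figures from Recall's competitions.

```python
import statistics

# Illustrative sketch: the return series below are made up, and the
# risk-free rate is assumed to be zero.

def sharpe_ratio(returns, risk_free_rate=0.0):
    """Mean excess return per period divided by its sample standard deviation."""
    if len(returns) < 2:
        raise ValueError("need at least two returns to measure volatility")
    excess = [r - risk_free_rate for r in returns]
    volatility = statistics.stdev(excess)
    if volatility == 0:
        raise ValueError("returns have zero volatility; ratio is undefined")
    return statistics.mean(excess) / volatility


# Two hypothetical agents whose per-period returns sum to the same total,
# but with very different swings along the way.
steady_agent = [0.010, 0.012, 0.009, 0.011, 0.008]
volatile_agent = [0.10, -0.08, 0.07, -0.06, 0.02]

steady_sharpe = sharpe_ratio(steady_agent)
volatile_sharpe = sharpe_ratio(volatile_agent)
print(f"steady: {steady_sharpe:.2f}, volatile: {volatile_sharpe:.2f}")
```

Both series average 1% per period, so a simple sum of returns would tie them, but the steadier agent earns a far higher Sharpe ratio because it took much less risk to get there.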

The goal, he said, is “to make the agent economy less of a ‘trust me, bro’ environment and more of a verifiable, auditable system.”

Open to Anyone, Not Just Big Players

One of the most striking aspects of Recall’s competitions is that participation is open.

“More than 70% of agents that competed in our last trading competition were built by people in our community that were non-developers,” Sena noted.

Some learned through short tutorials before going on to beat established winners.

He sees this inclusivity as essential to creating a “go-to repository for finding high-quality agents across a range of skills.”

Alignment, Transparency, and Guardrails

The conversation inevitably turned to the risks of AI agents, particularly in high-stakes industries like finance.

Sena frames the challenge in terms of alignment, ensuring that AI’s objectives match human values.

In Recall’s model, the community defines the goals and acceptable behaviors for agents, which are then embedded into the evaluation criteria.

Competitions require agents to record their “chain of thought” so observers can see not just what they did, but why.

“It’s early detection, it’s monitoring, and ultimately that’s what ensures alignment,” he said.

AGI Is Inevitable

When the conversation turned to artificial general intelligence (AGI), Sena acknowledged both its inevitability and the uncertainty surrounding it.

“With AGI and the inevitability of some kind of a superintelligence… we will be somewhere close to that,” he said, adding that it remains to be seen whether one system will dominate or whether multiple superintelligences will coexist.

What’s certain, Sena argues, is that it will “change the way we work, create, live, [and] the types of jobs and roles that people will play in the world.”

But that transformation won’t be evenly distributed: “it will continue to separate those that can best harness the capabilities of AI from those that can’t.”

For Sena, it is clear that the world must act now before AGI arrives.

“We should all be doing everything we can right now to figure out how to best use AI… and to think now, not 10 years in the future, about what are the frameworks and systems we have to set up to ensure that the development of that is as aligned with our goals as possible,” he said.

AI Agents and Human Values

As AI agents become ever more capable, the question of whether they will act in ways consistent with human values has become a pressing issue throughout society.

In sectors like finance and healthcare, the consequences of misaligned systems could be severe.

“Agents are really the opposite [of big foundational models]. They aim to be really great at one thing,” Sena explained.

Alignment in such systems, he said, requires clearly defining goals, measurable success criteria, and unacceptable behaviors, all backed by transparent reporting of what the agents do and why.

His vision for Agent Rank extends well beyond human hiring decisions.

Because the rankings are built on open crypto rails, they could become a universal dataset for performance across skills, something that could power marketplaces, apps, AR interfaces, or even voice assistants.

Eventually, he sees a world where “agents are actually contracting other agents on demand because that agent is more specialized or better at the skill” needed, creating what he calls a “liquid economy of skills between agents.”

Expanding Beyond Agents

Although Recall has already made a name for itself ranking autonomous agents, the company is now taking its reputation-driven approach to large AI models.

Sena and his team recently launched RecallPredict, an “un-gamable” crowdsourced benchmark designed to evaluate foundational models with the same transparency that underpins Agent Rank.

Instead of relying on lab-created benchmarks, RecallPredict invites the public to submit skills and test scenarios.

People can propose any skills they want tested, such as creative writing ability, alignment with specific instructions, or even more niche preferences.

As Sena put it, submissions can be as niche (and frankly important) as “respect my request to not use em dashes in my writing.”

According to Sena, the response has been overwhelming, with more than 19,000 submissions and millions of performance predictions so far.
