While llms.txt helps AI read the web and APIs help them connect, neither solves the infinite customization found in the economically important tasks in enterprise software. The real solution lies in computer-use agents that operate at the pixel level, learning from human demonstrations to navigate screens directly. This approach bypasses brittle connectors, allowing AI to handle complex workflows while humans remain in the loop for critical verification.While llms.txt helps AI read the web and APIs help them connect, neither solves the infinite customization found in the economically important tasks in enterprise software. The real solution lies in computer-use agents that operate at the pixel level, learning from human demonstrations to navigate screens directly. This approach bypasses brittle connectors, allowing AI to handle complex workflows while humans remain in the loop for critical verification.

The Screen Is the API

2025/12/10 15:36

"Why not just use llms.txt to understand the page?"

My friend was watching an AI agent work through a complex enterprise workflow. Clicking through menus, filling forms, handling the kind of nested configuration screens that were the definition of scope creep.

It was a reasonable question. Everyone is excited about llms.txt right now. A simple text file that tells AI systems what your website contains. Finally, the thinking goes, we have a standardized way for machines, or LLMs, to understand the web.

But my friend was confusing two very different problems. Reading is not doing.

The web did not become useful when machines learned to read it. It became useful when machines learned to act on it. And right now, the reading part is limited and we must shift focus to the doing.

Reading Isn’t the Hard Part

Let me be clear about what llms.txt actually does. It is a curated map for LLM inference. A structured way for language models to understand what exists on a website and where to find it. 

This is useful for bringing information to an LLM. But it is not a control mechanism. It does not let AI systems actually do anything. The gap between reading and acting is where the real work begins.

The Action Space

When people talk about AI automation, they usually mean APIs. Expose endpoints, let the AI call them, and you have automation. Simple.

Except it is not simple at all.

APIs expose only what developers choose to expose. They represent a curated subset of functionality that someone decided was worth the engineering effort to formalize. And in enterprise software, that subset is usually tiny compared to what users actually need to do.

Then came MCP, the Model Context Protocol. MCP tries to solve the connector problem. Instead of every AI system needing custom integrations with every application, you build one MCP connector and any MCP-compatible AI can use it.

This is an improvement. It solves the M×N problem where M AI systems need to integrate with N applications. But it assumes someone builds the connector in the first place.

Building these connectors is still hard. It requires understanding both the application and the MCP protocol. Most enterprise software will never get proper MCP support because the economics, I believe, are hard to justify. \n \n Attempts to automate API to MCP conversion have become popular, but they mostly produce brittle, low-level tools. As Han Lee and others point out, REST APIs are designed around nouns (resources with GET/PUT/POST/DELETE), while MCP works best when tools are verbs (deleteRow, createTask). Auto-wrapping one into the other hides that mismatch instead of solving it.

The M×N×P Problem

There is a deeper issue that neither APIs nor MCP can address. Call it the P variable: interface diversity.

P represents the number of unique ways the same software can be configured. And in enterprise software, P grows to enormous scale.

Consider SAP. A single SAP S/4HANA server contains tens of thousands of customizing tables. Every implementation is different. Every organization has its own approval chains, its own business rules, its own custom ABAP developments.

Here is a concrete example. Take something supposedly simple: a purchase order approval workflow. In a real SAP implementation, this involves parallel approval processes with all-or-nothing requirements. Custom rules like auto-approve if a contract covers the full purchase order amount. Multi-level approval chains where limits are maintained in custom tables. Dynamic role assignment based on cost center responsibility.

None of this is standard.

The approval chain requires both the Department Manager and Finance Department to approve simultaneously. Either rejection kills the whole workflow. 

Then come the rules. If the purchase order references a contract and the totals match, auto-approve. Otherwise, check approval limits in a custom table. If the first approver lacks sufficient authority, cascade to the next level.

And the approvers themselves? Assigned dynamically. Sometimes it is the Manager of Workflow Initiator. Sometimes the Cost Center Responsible. Sometimes specific users pulled from yet another custom table.

This is one workflow in one module. 

It requires domain-knowledge-specific consultants to implement because the out-of-the-box logic is too simple for how real organizations actually work.

This is the M×N×P problem. Even if you solved M×N with perfect connector protocols (like the MCP), you would still face the reality that every enterprise implementation is effectively a unique interface.

Computer-Use as the Universal Layer

There is one interface that is universal: the screen.

Computer-use agents operate at the pixel level. They see what humans see. They click where humans click. They navigate the same menus and fill out the same forms.

This sounds crude compared to elegant API calls. But it has one massive advantage: it works with everything. No connector required. No API exposure decisions. No MCP protocol adoption. If a human can do it, a computer-use agent can learn to do it.

The question is whether computer-use works well enough for production use. And here the research is early but encouraging.

The Demonstration Effect

The SCUBA benchmark tests AI agents on real Salesforce CRM workflows. In zero-shot settings, meaning no task-specific training, open-source models achieved less than 5% success rates. Even strong models that perform well on generic desktop benchmarks failed catastrophically when confronted with actual enterprise software.

But with demonstrations, meaning examples of humans completing the workflows, success rates jumped to 50%. Simultaneously, time and costs (of the agents) dropped by 13% and 16% respectively.

General capability is not enough. You need specific training on specific workflows.

Data Efficiency

In my experience, collecting computer-use trajectories is painful. Domain experts rarely understand what actually challenges a model. The infrastructure stacks on top of brittle web environments. Building those environments is pure tedium. When every example costs this much, data efficiency stops being nice-to-have.

Which is why the PC Agent-E research matters. Trained on just 312 trajectories, the model achieved a 141% improvement over the base model.

312 examples. Not millions. Not even thousands. A few hundred carefully chosen demonstrations of the exact workflows.

The model outperformed Claude 3.7 Sonnet with extended thinking on the WindowsAgentArena benchmark. And it generalized well to different operating systems, suggesting the learned behaviors were not brittle.

The economics of enterprise AI automation are simple: you do not need massive datasets. You need the right datasets from the right workflows.

The Honest Trade-Off

Now for the uncomfortable part. Generalization is necessary but not sufficient for high-stakes operations.

The same research that shows promising results also reveals gaps. Some agents that perform well on generic benchmarks like OSWorld achieve less than 5% success on specialized enterprise environments. Despite advances, today's RL systems struggle to generalize beyond narrow training contexts.

The sim-to-real gap persists. An agent that performs flawlessly in simulation may fail in production due to unmodeled variables. 

For high-volume, repetitive workflows like expense approvals, CRM updates, and standard procurement, trained computer-use agents are approaching production readiness. The error rate is acceptable because any single mistake is recoverable.

For one-off, high-stakes operations like schema migrations, financial reconciliations, and compliance configurations, the calculus is different. A database configuration error can cost millions. A compliance failure can trigger regulatory action.

The honest answer is that computer-use can handle navigation and execution for these tasks, but humans must remain at verification checkpoints. The agent does the clicking. The human confirms the consequences.

This is not a failure of the technology. It is appropriate risk management. And it still represents an enormous productivity gain. Navigating to the right screen, filling in the right fields, and preparing the right configurations is most of the work. Human verification at critical decision points is the remaining essential piece. At least for now.

Down The Middle: Agents and Humans

The path forward is not pure automation or pure human control. They are hybrid workflows where computer-use agents handle the interface complexity while humans handle the judgment calls. Human-in-the-loop is already the norm for production AI agents.

This requires new infrastructure. You need training pipelines for enterprise-specific demonstrations. You need simulation environments that match production configurations. You need checkpoint mechanisms that pause for human review at appropriate moments. Companies like Applied Compute, Theta, Osmosis, and Scale AI are starting to build this infrastructure.

But the hard technical problem, making computers reliably operate arbitrary interfaces, is being solved. The remaining problems are organizational and economic. Those problems have a tendency to get solved when the benefits are large enough.

The best agents still fail on most real enterprise tasks. But a few years ago they could barely hit a single submit button. The screen is the only universal interface. That's where the work should go.

\n

\

Piyasa Fırsatı
Tx24 Logosu
Tx24 Fiyatı(TXT)
$0.0052
$0.0052$0.0052
-10.34%
USD
Tx24 (TXT) Canlı Fiyat Grafiği
Sorumluluk Reddi: Bu sitede yeniden yayınlanan makaleler, halka açık platformlardan alınmıştır ve yalnızca bilgilendirme amaçlıdır. MEXC'nin görüşlerini yansıtmayabilir. Tüm hakları telif sahiplerine aittir. Herhangi bir içeriğin üçüncü taraf haklarını ihlal ettiğini düşünüyorsanız, kaldırılması için lütfen [email protected] ile iletişime geçin. MEXC, içeriğin doğruluğu, eksiksizliği veya güncelliği konusunda hiçbir garanti vermez ve sağlanan bilgilere dayalı olarak alınan herhangi bir eylemden sorumlu değildir. İçerik, finansal, yasal veya diğer profesyonel tavsiye niteliğinde değildir ve MEXC tarafından bir tavsiye veya onay olarak değerlendirilmemelidir.

Ayrıca Şunları da Beğenebilirsiniz

[OPINION] US National Security Strategy 2025: An iconoclastic document

[OPINION] US National Security Strategy 2025: An iconoclastic document

Trump's national security strategy signals a radical shift in US foreign policy, prioritizing economic power and regional interests over global commitments
Paylaş
Rappler2025/12/16 12:30
Crucial US Stock Market Update: What Wednesday’s Mixed Close Reveals

Crucial US Stock Market Update: What Wednesday’s Mixed Close Reveals

BitcoinWorld Crucial US Stock Market Update: What Wednesday’s Mixed Close Reveals The financial world often keeps us on our toes, and Wednesday was no exception. Investors watched closely as the US stock market concluded the day with a mixed performance across its major indexes. This snapshot offers a crucial glimpse into current investor sentiment and economic undercurrents, prompting many to ask: what exactly happened? Understanding the Latest US Stock Market Movements On Wednesday, the closing bell brought a varied picture for the US stock market. While some indexes celebrated gains, others registered slight declines, creating a truly mixed bag for investors. The Dow Jones Industrial Average showed resilience, climbing by a notable 0.57%. This positive movement suggests strength in some of the larger, more established companies. Conversely, the S&P 500, a broader benchmark often seen as a barometer for the overall market, experienced a modest dip of 0.1%. The technology-heavy Nasdaq Composite also saw a slight retreat, sliding by 0.33%. This particular index often reflects investor sentiment towards growth stocks and the tech sector. These divergent outcomes highlight the complex dynamics currently at play within the American economy. It’s not simply a matter of “up” or “down” for the entire US stock market; rather, it’s a nuanced landscape where different sectors and company types are responding to unique pressures and opportunities. Why Did the US Stock Market See Mixed Results? When the US stock market delivers a mixed performance, it often points to a tug-of-war between various economic factors. Several elements could have contributed to Wednesday’s varied closings. For instance, positive corporate earnings reports from certain industries might have bolstered the Dow. At the same time, concerns over inflation, interest rate policies by the Federal Reserve, or even global economic uncertainties could have pressured growth stocks, affecting the S&P 500 and Nasdaq. Key considerations often include: Economic Data: Recent reports on employment, manufacturing, or consumer spending can sway market sentiment. Corporate Announcements: Strong or weak earnings forecasts from influential companies can significantly impact their respective sectors. Interest Rate Expectations: The prospect of higher or lower interest rates directly influences borrowing costs for businesses and consumer spending, affecting future profitability. Geopolitical Events: Global tensions or trade policies can introduce uncertainty, causing investors to become more cautious. Understanding these underlying drivers is crucial for anyone trying to make sense of daily market fluctuations in the US stock market. Navigating Volatility in the US Stock Market A mixed close, while not a dramatic downturn, serves as a reminder that market volatility is a constant companion for investors. For those involved in the US stock market, particularly individuals managing their portfolios, these days underscore the importance of a well-thought-out strategy. It’s important not to react impulsively to daily movements. Instead, consider these actionable insights: Diversification: Spreading investments across different sectors and asset classes can help mitigate risk when one area underperforms. Long-Term Perspective: Focusing on long-term financial goals rather than short-term gains can help weather daily market swings. Stay Informed: Keeping abreast of economic news and company fundamentals provides context for market behavior. Consult Experts: Financial advisors can offer personalized guidance based on individual risk tolerance and objectives. Even small movements in major indexes can signal shifts that require attention, guiding future investment decisions within the dynamic US stock market. What’s Next for the US Stock Market? Looking ahead, investors will be keenly watching for further economic indicators and corporate announcements to gauge the direction of the US stock market. Upcoming inflation data, statements from the Federal Reserve, and quarterly earnings reports will likely provide more clarity. The interplay of these factors will continue to shape investor confidence and, consequently, the performance of the Dow, S&P 500, and Nasdaq. Remaining informed and adaptive will be key to understanding the market’s trajectory. Conclusion: Wednesday’s mixed close in the US stock market highlights the intricate balance of forces influencing financial markets. While the Dow showed strength, the S&P 500 and Nasdaq experienced slight declines, reflecting a nuanced economic landscape. This reminds us that understanding the ‘why’ behind these movements is as important as the movements themselves. As always, a thoughtful, informed approach remains the best strategy for navigating the complexities of the market. Frequently Asked Questions (FAQs) Q1: What does a “mixed close” mean for the US stock market? A1: A mixed close indicates that while some major stock indexes advanced, others declined. It suggests that different sectors or types of companies within the US stock market are experiencing varying influences, rather than a uniform market movement. Q2: Which major indexes were affected on Wednesday? A2: On Wednesday, the Dow Jones Industrial Average gained 0.57%, while the S&P 500 edged down 0.1%, and the Nasdaq Composite slid 0.33%, illustrating the mixed performance across the US stock market. Q3: What factors contribute to a mixed stock market performance? A3: Mixed performances in the US stock market can be influenced by various factors, including specific corporate earnings, economic data releases, shifts in interest rate expectations, and broader geopolitical events that affect different market segments uniquely. Q4: How should investors react to mixed market signals? A4: Investors are generally advised to maintain a long-term perspective, diversify their portfolios, stay informed about economic news, and avoid impulsive decisions. Consulting a financial advisor can also provide personalized guidance for navigating the US stock market. Q5: What indicators should investors watch for future US stock market trends? A5: Key indicators to watch include upcoming inflation reports, statements from the Federal Reserve regarding monetary policy, and quarterly corporate earnings reports. These will offer insights into the future direction of the US stock market. Did you find this analysis of the US stock market helpful? Share this article with your network on social media to help others understand the nuances of current financial trends! To learn more about the latest stock market trends, explore our article on key developments shaping the US stock market‘s future performance. This post Crucial US Stock Market Update: What Wednesday’s Mixed Close Reveals first appeared on BitcoinWorld.
Paylaş
Coinstats2025/09/18 05:30
Kevin Durant odzyskał Bitcoiny z Coinbase warte fortunę

Kevin Durant odzyskał Bitcoiny z Coinbase warte fortunę

Kevin Durant, jedna z największych aktywnych gwiazd NBA. Zawodnik Houston Rockets, ponownie znalazł się w nagłówkach gazet. Tym razem nie chodzi jednak o sportowe sukcesy lub transferowe plotki. Po latach Kevin Durant odzyskał Bitcoiny! KD na nowo ma dostęp do swojego dawno zapomnianego konta na Coinbase, gdzie trzymał Bitcoiny kupione niemal dekadę temu. Wartość tych […]
Paylaş
Bitcoinist2025/09/19 20:11