Agentic AI is not a prompt – Sii approach

25.06.2026

Almost every sales pitch about Agentic AI says the same thing: a revolution. Software that can plan, decide, and act on its own, taking on real work so people don’t have to – and the only risk is being too slow to adopt it. The market believes it too. Spending on AI agents is forecast to jump from around $7.8 billion in 2025 to roughly $52 billion by 2030, and Gartner expects a third of business software to include Agentic AI by 2028 (MarketsandMarkets, 2025; Gartner, 2025).

In practice, it is harder. Gartner also expects more than 40% of Agentic AI projects to be canceled by the end of 2027 – too costly, too unclear in value, too hard to control (Gartner, 2025). And a widely cited (if controversial) MIT study found that 95% of company AI pilots delivered no measurable return (MIT Project NANDA, 2025).

Most projects don’t fail because the AI isn’t smart enough. They fail because of everything around it.

What nobody puts in the brochure

When we take over Agentic AI projects – from our own clients or from other vendors – we keep seeing the same three problems. Most teams hit at least one. Many hit all three.

It automates the wrong thing. The hardest call comes before any code: Is this even a job for an agent? Plenty of projects fail right here – the task is too open-ended, the cost of a wrong answer is too high, or a simple rule or script would have done it cheaper and better. Pick the wrong goal, and no amount of engineering can save it.
No one can tell whether the agent is actually right. Often, a system that dazzles in a demo on simple scenarios falls apart on contact with reality. And because most teams deploy Agentic AI with no real evaluation, nobody spots the problem early enough. An agent you can’t measure is one you can’t trust or improve.
It’s built, but never trusted or used. Even a working agent delivers nothing if people can’t see why it did what it did, won’t change how they work, or simply don’t trust it. The technology is rarely the blocker here – the organization around it is.

To avoid these problems, we have to change how we think about Agentic AI.

Agentic AI is a system, not a prompt

The model is the commodity. Your edge is everything around it – the integrations, the data, the tools, and getting people to actually use it.

Real engineering starts with the goal. Defining precisely what the agent is for – and being willing to conclude that some tasks shouldn’t be handled by agents at all – is the first and most consequential decision. Focus first on finding where Agentic AI is genuinely needed in your organization, rather than inventing places to use it for its own sake.

Beyond that, Agentic AI is made up of many parts:

interaction surfaces,
an orchestrator that owns planning, retries, and hand-offs as an inspectable state machine,
tools and APIs exposed and permissioned through the Model Context Protocol (MCP),
retrieval over a semantic layer so the agent reads your business rather than raw data,
and memory to carry context.
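
To make "inspectable state machine" concrete, here is a minimal sketch in Python. It is an illustration under our own assumptions, not a reference to any particular framework: the names `State`, `Run`, `orchestrate`, `planner`, and `tool` are hypothetical, and the planner and tool are plain callables standing in for a model call and a real integration.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable


class State(Enum):
    PLAN = "plan"
    ACT = "act"
    HANDOFF = "handoff"   # give up and pass the run to a human
    DONE = "done"


@dataclass
class Run:
    goal: str
    max_retries: int = 2
    state: State = State.PLAN
    steps: list[str] = field(default_factory=list)
    attempts: int = 0
    # Every transition is recorded as (state, note) so the run can be inspected.
    history: list[tuple[str, str]] = field(default_factory=list)

    def log(self, note: str) -> None:
        self.history.append((self.state.value, note))


def orchestrate(run: Run,
                planner: Callable[[str], list[str]],
                tool: Callable[[str], str]) -> Run:
    """Drive a run through PLAN -> ACT -> DONE, or HANDOFF after too many failures."""
    while run.state not in (State.DONE, State.HANDOFF):
        if run.state is State.PLAN:
            run.steps = planner(run.goal)
            run.log(f"planned {len(run.steps)} step(s)")
            run.state = State.ACT
        elif run.state is State.ACT:
            if not run.steps:
                run.log("all steps finished")
                run.state = State.DONE
                continue
            step = run.steps[0]
            try:
                result = tool(step)
            except Exception as exc:  # hypothetical: any tool failure counts as retryable
                run.attempts += 1
                run.log(f"{step} failed ({exc}), attempt {run.attempts}")
                if run.attempts > run.max_retries:
                    run.log("retries exhausted, handing off")
                    run.state = State.HANDOFF
            else:
                run.log(f"{step} -> {result}")
                run.steps.pop(0)
                run.attempts = 0
    return run
```

The point of the sketch is the `history` list: because planning, retries, and the hand-off are explicit states rather than logic buried in a prompt, a reviewer can read exactly what the run did and where it stopped.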

The layer that determines implementation

Underneath all of it runs the layer most demos skip – and the one that decides whether an agent ever reaches production.

Evaluation – does it work? A versioned test suite that scores the system component by component: retrieval, tool calls, the reasoning trajectory, and the final answer, built with frameworks like RAGAS or DeepEval. It runs on every change as a regression gate with hard thresholds for accuracy, cost-per-task, and latency, and reliability is measured across repeated runs (pass@k), because the model is stochastic, and one good demo proves nothing.
Observability – what is it doing right now? Distributed tracing of every step, tool call, retry, and token (OpenTelemetry, surfaced in LangSmith or Langfuse), with cost and latency tracked per task and per user, and drift watched on the data and tools the agent depends on, not just the model.
Explainability – why did it decide that? Every answer can cite its sources and show the tools it used, with decisions logged in plain business language that an owner or an auditor can read.
Governance – what is it allowed to do? Least-privilege tool scopes, policy guardrails, human-in-the-loop approval for irreversible actions, and immutable audit trails – the same controls that satisfy regimes such as the EU AI Act.
Change management – will people use it? The people who own the workflow are in the room from day one, because a technically flawless agent still dies if the team it was built for doesn’t trust it or know how to work alongside it.
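
As an illustration of the evaluation layer, the sketch below shows what a regression gate over repeated runs could look like in plain Python. Everything here is an assumption made for the example: the `TaskRuns` record, the `regression_gate` function, and the threshold values are hypothetical and not taken from RAGAS or DeepEval. Note that pass@k, as commonly defined, estimates the chance that at least one of k runs passes; for reliability, a stricter "all k runs pass" estimate is often more telling, so the sketch computes both.

```python
from dataclasses import dataclass
from math import comb


@dataclass(frozen=True)
class TaskRuns:
    task_id: str
    passed: list[bool]       # one entry per repeated run of the same task
    cost_usd: list[float]    # cost of each run
    latency_s: list[float]   # latency of each run


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate that at least one of k runs passes, given c passes in n runs."""
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and n")
    return 1.0 - comb(n - c, k) / comb(n, k)


def all_pass_k(n: int, c: int, k: int) -> float:
    """Stricter estimate: all k sampled runs pass, given c passes in n runs."""
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and n")
    return comb(c, k) / comb(n, k)


def regression_gate(results: list[TaskRuns], k: int, min_reliability: float,
                    max_cost_usd: float, max_latency_s: float) -> list[str]:
    """Return human-readable threshold violations; an empty list means the gate passes."""
    failures: list[str] = []
    for r in results:
        n, c = len(r.passed), sum(r.passed)
        reliability = all_pass_k(n, c, k)
        mean_cost = sum(r.cost_usd) / n
        worst_latency = max(r.latency_s)
        if reliability < min_reliability:
            failures.append(f"{r.task_id}: reliability {reliability:.2f} below {min_reliability}")
        if mean_cost > max_cost_usd:
            failures.append(f"{r.task_id}: mean cost ${mean_cost:.3f} above ${max_cost_usd}")
        if worst_latency > max_latency_s:
            failures.append(f"{r.task_id}: worst latency {worst_latency:.1f}s above {max_latency_s}s")
    return failures
```

In a CI pipeline, a gate like this would run on every change and fail the build whenever the returned list is non-empty. A task that passes three runs out of four looks fine in a demo, yet its chance of passing two runs in a row is only about one in two, which is exactly the gap a single good demo hides.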

From isolated agents to an agentic platform

More and more of our customers have stopped building agents one at a time. Once you’ve put a single agent into production as a proper system – integrations, a semantic layer, evaluation, observability, governance – you’ve already built most of the hard parts of every agent that follows.

So the smart move is to treat that shared machinery as a platform, with each new agent as a thin layer on top: tools and connectors exposed once over MCP, a common retrieval and evaluation stack, and one observability plane. New use cases become assembly rather than greenfield builds, and time-to-production drops from quarters to weeks.

The bigger win is governance done once. Define access control, tool permissions, guardrails, audit, and EU AI Act compliance at the platform layer, and every agent inherits them by default. The first agent pays the cost of getting it right; every agent after it reuses that work, which is the difference between safely running one agent and safely running fifty.
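
A minimal sketch of what "governance done once" can mean in code, written in Python under our own assumptions: `Platform`, `Tool`, `Agent`, the scope strings, and the `approver` callback are all hypothetical names invented for this example, and the in-memory audit list only stands in for a real immutable store. The idea is that every tool call goes through one platform method, so scope checks, human approval for irreversible actions, and audit logging apply to every agent automatically.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass(frozen=True)
class Tool:
    name: str
    scope: str                    # permission an agent needs to call this tool
    irreversible: bool            # irreversible actions require human approval
    fn: Callable[[dict], str]


@dataclass(frozen=True)
class Agent:
    name: str
    scopes: frozenset[str]        # least privilege: only what this agent needs


@dataclass
class Platform:
    tools: dict[str, Tool] = field(default_factory=dict)
    # Append-only by convention here; a real system would use an immutable store.
    audit: list[dict] = field(default_factory=list)

    def register(self, tool: Tool) -> None:
        self.tools[tool.name] = tool

    def call(self, agent: Agent, tool_name: str, args: dict,
             approver: Optional[Callable[[str, str, dict], bool]] = None) -> str:
        """Single entry point for tool calls: check, record, then execute."""
        tool = self.tools[tool_name]
        if tool.scope not in agent.scopes:
            decision = "denied: missing scope"
        elif tool.irreversible and not (
            approver is not None and approver(agent.name, tool.name, args)
        ):
            decision = "denied: approval required"
        else:
            decision = "allowed"
        # Every attempt is logged, including the denied ones.
        self.audit.append({"agent": agent.name, "tool": tool.name,
                           "args": args, "decision": decision})
        if decision != "allowed":
            raise PermissionError(decision)
        return tool.fn(args)
```

A new agent added to such a platform is just another `Agent` with its own scopes. It gets the permission checks, the approval step, and the audit trail without any extra code, which is the inheritance the paragraph above describes.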

From the brochure to production

Built this way – as systems, on a shared platform – agents stop being demos and start creating serious, measurable value.

A few examples from our own backyard, anonymized; each earned its place on a number, not a demo:

Service desk automation (manufacturing). Wired into ServiceNow and Microsoft Teams, it cut service desk costs by 70% and lifted throughput by 30% – because it lived inside the systems people already used, not beside them.
Compliance monitoring (legal & compliance). Automated contract and compliance review that flags 94% of breaches autonomously, keeping people in the loop only on the genuine edge cases.
Edge AI diagnostics (semiconductors). A fully on-device troubleshooting assistant that reaches about 80% of a frontier model’s performance while staying compact enough to run on embedded hardware – proof that the biggest model isn’t always the right one.

Different industries, different stacks, one pattern: the model was interchangeable; the integration, the data, and the human-in-the-loop design were what made each one production-grade.

Don’t follow the trend – just build

The models are already good enough, and the tooling (orchestration, retrieval, evaluation, observability, and governance) is mature enough to deploy Agentic AI in production today. Teams that keep chasing the newest LLM or framework release rarely ship; teams that pick a solid stack and commit to it do.

The model was never the hard part. Almost every agent that fails, fails on the system around it – the integrations, the data, the governance, the people – and almost every agent that succeeds owes that success to the same thing. So stop optimizing for the cleverest model and start engineering the system around it. That’s the whole job, and it’s one you can start today, with us.

Sources

Gartner, “Gartner Predicts Over 40% of Agentic AI Projects Will Be Canceled by End of 2027,” press release, June 25, 2025
Gartner predictions on agentic AI adoption by 2028, reported by Reuters, 2025
MIT Project NANDA, “The GenAI Divide: State of AI in Business 2025,” July 2025; reported by Fortune
MarketsandMarkets, “AI Agents Market – Global Forecast to 2030,” 2025

About the author

Marcin Mosiołek

Marcin is AI Center Lead at Sii Poland, where he runs the company's AI business across engineering and consulting services. With over 14 years of experience in applied AI, he has both built and led AI solutions for organizations across life sciences, legal, retail, automotive, and public administration – taking projects from initial concept through to production-grade solutions operating at scale. Through this work, he has developed a practical understanding of why AI strategies succeed or fail in real organizational environments, and what it takes to move from ambition to measurable impact. Alongside his work at Sii, Marcin has served as an AI advisor to technology startups and scale-ups, supporting founders and technical teams
