The evolution towards Agentic Testing, illustrated by mabl

The software development industry is experiencing an unprecedented leap in productivity across multiple dimensions. First, there is the transformation from Quality Assurance (QA) to Quality Engineering (QE): a shift from reactive bug-finding to proactive quality design throughout the entire software lifecycle. Second, the adoption of AI-assisted coding – from GitHub Copilot to built-in IDE assistants – means code is being produced faster than ever before. This is great news for project teams, but it is also a tremendous challenge for Quality Engineers.

In my work, I have noticed that QE teams increasingly face a dilemma: how do you maintain test coverage when the pace of code delivery keeps accelerating while headcount remains the same?

The market is responding with Agentic Testing – a concept in which the testing tool is no longer a passive script executor but an autonomous digital team member (Digital Teammate), capable of interpretation, decision-making, and adaptation. I would like to explore in greater detail how this vision is realized on the mabl platform.

Before we examine specific capabilities, however, it is worth understanding how mabl defines the concept of Agentic Testing itself and what foundation it builds upon.

What is Agentic Testing according to mabl?

In my view, the crucial distinction lies between AI-enhanced testing (where AI assists humans in targeted areas – for instance, through self-healing locators) and Agentic Testing proper.

mabl defines Agentic Testing as an approach in which a digital co-worker interprets, decides, and adapts – much like a human.

The platform describes this capability as the result of fusing two pillars.

The first pillar

Sophisticated Automation – provides solid technical foundations. In practical terms, this means the platform supports not only standard web and mobile scenarios but also handles elements for which many competing tools require workarounds – Shadow DOM or content embedded in PDF files, to name just two. Equally important is the ability to run tests in parallel, which translates into real time savings for large regression suites.
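To put a number on the effect of parallel execution, here is a back-of-the-envelope sketch. The suite size, the test durations, and the function name wall_clock_minutes are my own assumptions for illustration; they are not mabl figures, and real runs add start-up and queuing overhead.

```python
import heapq


def wall_clock_minutes(durations, workers):
    """Estimate total run time when tests are spread over `workers` runners.

    Greedy longest-first scheduling: each test is handed to the runner
    that currently has the least work assigned.
    """
    loads = [0.0] * workers
    heapq.heapify(loads)
    for duration in sorted(durations, reverse=True):
        lightest = heapq.heappop(loads)
        heapq.heappush(loads, lightest + duration)
    return max(loads)


# Hypothetical regression suite: 120 tests of 3 minutes each.
suite = [3.0] * 120
sequential = wall_clock_minutes(suite, workers=1)   # 360.0 minutes
parallel = wall_clock_minutes(suite, workers=20)    # 18.0 minutes
```

Under these assumptions, the same suite drops from six hours to under twenty minutes of wall-clock time.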

The second pillar

Advanced Intelligence – is the layer mabl has been developing since the platform’s inception. Notably, this is not simply a large language model “bolted on” to an existing tool. Depending on the task context, the platform draws on different mechanisms – from classical machine learning, through expert rule-based systems, to generative AI. This hybrid approach avoids situations in which GenAI is applied where a simpler algorithm would be faster and cheaper.

According to mabl’s creators, the agent gradually builds a data profile of the system, enabling it to optimize its own work and support the team’s productivity.

Five Core Skills of the Agentic Tester in mabl

mabl defines its testing agent through the lens of five Core Skills – a set of competencies that mirror the way an experienced tester works. Below, I briefly discuss each one.

Acting – interaction like a real user

According to mabl’s documentation, the agent interacts with the application the way a human would. Under the Acting heading, the platform lists:

clicking, entering data, and navigating in web browsers;
scrolling and entering data in mobile applications (iOS, Android);
direct API invocation and validation.

Many automation tools struggle with more complex UI elements. As part of its automation foundation, mabl also supports Shadow DOM, e-mail, and PDF content.

Observing – intelligent perception of application state

As we all know, simply acting is not enough – the agent must understand what it sees. Under the Observing heading, the platform collects three types of data during test execution:

screenshots that later feed into Visual AI features,
DOM structure snapshots, which are invaluable for debugging (they allow comparison of the DOM at the point of success versus failure),
API call results, completing the picture of what is happening “under the hood” of the application.
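The value of comparing DOM snapshots can be shown with a small sketch that uses Python's standard difflib module. The two snapshots and the helper name dom_diff are hypothetical; this only illustrates the debugging idea and says nothing about how mabl stores or compares snapshots internally.

```python
import difflib


def dom_diff(passing_dom: str, failing_dom: str) -> list[str]:
    """Return only the added/removed lines between two DOM snapshots."""
    diff = list(
        difflib.unified_diff(
            passing_dom.splitlines(),
            failing_dom.splitlines(),
            fromfile="passing",
            tofile="failing",
            lineterm="",
            n=0,  # no context lines, only the changes themselves
        )
    )
    # The first two entries are the "---"/"+++" file headers; skip them
    # and drop the "@@" hunk markers.
    return [line for line in diff[2:] if line.startswith(("+", "-"))]


# Hypothetical snapshots taken at the same step of a passing and a failing run.
passing = """<form id="checkout">
  <button id="pay" class="btn primary">Pay now</button>
</form>"""
failing = """<form id="checkout">
  <button id="pay" class="btn primary" disabled>Pay now</button>
</form>"""

changes = dom_diff(passing, failing)
```

Here the diff immediately points at the disabled attribute, which is exactly the kind of clue that is tedious to find by reading logs.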

It is precisely this observational foundation that underpins one of the most compelling features – Visual AI, which I discuss in more detail later in this article.

Deciding & Reasoning – autonomous decision-making

This is the heart of the agentic concept. The mabl agent does not merely execute pre-programmed steps – it reasons and makes decisions. The agent is expected to independently analyze situations, draw conclusions, and react to changes in the application without human intervention.

In practice, this translates into specific features. Auto-Healing automatically repairs tests when the UI changes. Meanwhile, Visual Assist adds visual analysis of screenshots, enabling the agent to “see” the page layout as a human would. Combining two analytical methods – the classical (DOM) and the visual – makes tests far more resilient to frontend refactoring.
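The general idea behind self-healing locators can be sketched as follows: instead of relying on a single selector, the tool remembers several attributes of an element and, when the original selector breaks, picks the candidate that still matches most of them. This is an illustration of the technique only, not mabl's algorithm; the attribute set, the threshold, and all names are my assumptions.

```python
def similarity(recorded: dict, candidate: dict) -> float:
    """Fraction of recorded attributes that the candidate still matches."""
    if not recorded:
        return 0.0
    matches = sum(1 for key, value in recorded.items() if candidate.get(key) == value)
    return matches / len(recorded)


def heal_locator(recorded, candidates, threshold=0.6):
    """Pick the best-matching element, or None if nothing is similar enough."""
    best = max(candidates, key=lambda c: similarity(recorded, c), default=None)
    if best is None or similarity(recorded, best) < threshold:
        return None
    return best


# Attributes captured when the test was recorded (hypothetical).
recorded = {"tag": "button", "id": "submit-btn", "text": "Sign in",
            "type": "submit", "form": "login"}

# Elements found on the page after a frontend refactoring renamed the id.
page = [
    {"tag": "a", "id": "help", "text": "Need help?"},
    {"tag": "button", "id": "login-submit", "text": "Sign in",
     "type": "submit", "form": "login"},
]

healed = heal_locator(recorded, page)  # matches 4 of 5 recorded attributes
```

A visual layer, as described above, would add a second and independent signal on top of this kind of DOM-based matching.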

Auto TFA (Test Failure Analysis) – automated analysis of test and plan failures

mabl automates the failure-analysis process – based on run results, the agent independently forms a diagnosis, categorizes the problem (e.g., application bug vs. test instability), and indicates where to begin the fix. This works at both the individual-test and the entire-test-plan levels.

In my experience, Auto TFA is a feature with enormous potential to shorten the time from defect detection to resolution. In practice, we can skip manual log analysis and start from a ready-made diagnosis.
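To illustrate what "categorizing the problem" can mean, here is a deliberately simple rule-based sketch. It is my own simplification and not how Auto TFA works internally; the RunResult type, the category labels, and the rules are assumptions.

```python
from dataclasses import dataclass


@dataclass
class RunResult:
    passed: bool
    build_id: str


def categorize(history):
    """Classify the latest result from a test's recent runs (oldest first)."""
    latest = history[-1]
    if latest.passed:
        return "passing"
    same_build = [r for r in history if r.build_id == latest.build_id]
    if any(r.passed for r in same_build):
        # The same build both passed and failed: the test is unstable.
        return "test instability"
    earlier = [r for r in history if r.build_id != latest.build_id]
    if earlier and all(r.passed for r in earlier):
        # Failures started exactly when the build changed.
        return "likely application bug"
    return "needs manual review"


flaky = [RunResult(True, "b1"), RunResult(False, "b1")]
regression = [RunResult(True, "b1"), RunResult(True, "b1"),
              RunResult(False, "b2"), RunResult(False, "b2")]
```

With these inputs, categorize(flaky) points to the test itself, while categorize(regression) points to the application change introduced in build b2.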

On top of all this, there is Intelligent Waits – an algorithm that automatically adjusts wait times, reducing the “flaky tests” problem caused by network latency.
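The contrast with hard-coded sleeps is easy to show in code. The sketch below derives a timeout from previously observed timings and then polls until the condition holds. It is a generic illustration under my own assumptions (the function names, the safety margin, and the floor value), not mabl's algorithm.

```python
import time


def adaptive_timeout(past_load_times, floor=1.0, margin=1.5):
    """Derive a timeout in seconds from observed timings instead of a fixed sleep."""
    if not past_load_times:
        return floor
    return max(floor, max(past_load_times) * margin)


def wait_until(condition, timeout, poll_interval=0.05):
    """Poll `condition` until it returns True or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(poll_interval)
    return condition()


# Hypothetical element that becomes ready on the third check.
calls = {"n": 0}


def ready():
    calls["n"] += 1
    return calls["n"] >= 3


became_ready = wait_until(ready, timeout=adaptive_timeout([0.1, 0.2]))
```

The test proceeds as soon as the element is ready and waits longer only when the application has historically been slow.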

Learning & Remembering – continuous knowledge building

One of the more compelling differentiators is that the mabl Agent continuously learns from the tests it executes, allowing it to build sophisticated models of application behavior and performance. A key element introduced in October 2025 is AI Vectorization & Test Semantic Search.

The pivotal change lies in how tests are indexed: as the feature name suggests, each test is represented as a vector that captures its meaning rather than only its keywords. In practice, this means the agent is not limited to keyword identification but recognizes the actual purpose and operational function of each test within the system. This enables, for example, detecting that two tests with entirely different names in fact cover the same scenario.
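How vector-based search can spot tests that share a purpose but not a name is sketched below. The three-dimensional vectors are hand-written stand-ins that I made up; a real system would obtain much larger vectors from an embedding model, and I am not claiming this is how mabl computes or stores them.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical embeddings: the names share no keywords, but the first two
# tests both describe a guest checkout flow.
test_vectors = {
    "TC-104 Guest buys item": [0.9, 0.1, 0.2],
    "Checkout smoke v2": [0.85, 0.15, 0.25],
    "Password reset e-mail": [0.1, 0.9, 0.1],
}


def likely_duplicates(vectors, threshold=0.95):
    """Return pairs of test names whose vectors are nearly parallel."""
    names = sorted(vectors)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if cosine(vectors[a], vectors[b]) >= threshold
    ]


duplicates = likely_duplicates(test_vectors)
```

A keyword search would never link "TC-104 Guest buys item" with "Checkout smoke v2"; a similarity threshold on vectors does.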

Integrating & Collaborating – working within the team ecosystem

The final skill is integration with existing tools and workflows.

The mabl MCP Server plays a particularly interesting role. Personally, I consider it one of the most promising elements of the entire architecture. In practice, it works like this: a developer working in their IDE introduces a code change and, without leaving the editor, can ask the agent, “Which tests might be affected by my change?” (Test Impact Analysis). The agent returns a list of related tests and, if a failure is detected, suggests what might be causing it. This represents a significant shift compared with the traditional flow, in which such feedback usually arrives only after a pipeline run.
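The question "which tests might be affected by my change?" can be reduced, in its simplest form, to a lookup between changed files and the tests that exercise them. The sketch below shows only that core idea; the mapping, file paths, and function name are hypothetical, and the mabl MCP Server may well derive its answer in a completely different way.

```python
def impacted_tests(changed_files, coverage_map):
    """Return tests whose covered files overlap the change set, sorted by name."""
    changed = set(changed_files)
    return sorted(
        test for test, files in coverage_map.items() if changed & set(files)
    )


# Hypothetical mapping from test name to the source files it exercises.
coverage_map = {
    "login_happy_path": ["src/auth/login.ts", "src/ui/form.ts"],
    "checkout_guest": ["src/cart/cart.ts", "src/payment/pay.ts"],
    "profile_edit": ["src/ui/form.ts", "src/profile/edit.ts"],
}

affected = impacted_tests(["src/ui/form.ts"], coverage_map)
```

A change to the shared form component flags login_happy_path and profile_edit while leaving checkout_guest out of the run.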

The five Core Skills form a general competency map for the agent. In the following sections, we will take a closer look at selected capabilities that grow out of this foundation – starting with one of the most advanced: Visual AI.

Visual AI – contextual visual regression detection

In the traditional approach, visual regression tests compare pixels. In practice, however, this can lead to false positives.

The main sources of the problem include:

dynamic content,
font-rendering differences across environments,
and the variability of third-party components.

These trigger unjustified alerts despite the absence of actual bugs in the application’s logic.
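A tiny example shows how easily strict pixel comparison produces noise. The grids below are made-up grayscale values standing in for the same glyph rendered with slightly different anti-aliasing. The tolerance parameter is a common mitigation in classic tools and is shown only for contrast; it is not the approach Visual AI is described as taking.

```python
def changed_pixels(baseline, current, tolerance=0):
    """Count pixels whose grayscale value differs by more than `tolerance`."""
    return sum(
        1
        for row_a, row_b in zip(baseline, current)
        for a, b in zip(row_a, row_b)
        if abs(a - b) > tolerance
    )


baseline = [
    [255, 255, 255, 255],
    [255,   0,   0, 255],
    [255, 255, 255, 255],
]
# Hypothetical re-render of the same content on another environment.
current = [
    [255, 250, 252, 255],
    [253,   4,   2, 255],
    [255, 251, 255, 254],
]

strict = changed_pixels(baseline, current)                 # 7 of 12 pixels differ
lenient = changed_pixels(baseline, current, tolerance=8)   # 0 pixels differ
```

More than half of the pixels "changed" even though a human would see no difference, and a fixed tolerance only hides the symptom without understanding what the pixels represent.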

The question, then, is: how does Visual AI solve this problem?

According to mabl’s documentation, Visual AI does not compare pixels – it interprets visual meaning. This is described at three levels:

Semantic understanding of UI elements – the system recognizes what a given element is and evaluates changes in context. To use a simple but apt example: a button’s background changes (say, from blue #0066CC to navy #003366). Visual AI assesses the change from a usability perspective – whether the element is still recognizable as a button, whether it meets accessibility requirements, and whether it still stands out from the background. If the answer to all these questions is “yes,” the change is accepted.
Pattern recognition across states – recognizing patterns across different variants of the same view. Every modern web application can look different depending on context: logged-in users and guests see different layouts, the interface looks different on a phone versus a desktop, and then there are dark mode, loading spinners, or error messages. Rather than flagging each of these differences as potential issues, Visual AI builds a model of how the application should look in a given state and reports only deviations from that pattern.
Visual intent recognition – at the highest level, Visual AI evaluates visual changes not in terms of “what changed” but “whether the page still fulfills its purpose.” Consider a login form: the font, background color, or spacing may change, but the key question is whether the user can still log in. Are the fields visible? Does the user know where to click? Will they receive feedback if they make a mistake? This approach – evaluating the interface’s intent rather than pixel-level conformity – eliminates the majority of false positives that plague traditional tools.
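One part of the button example from the first level, the accessibility question, can be checked with plain arithmetic. The sketch below computes the WCAG contrast ratio for both background colors, assuming a white button label (my assumption; the example above does not specify the text color). It covers only this single check and is not a description of how Visual AI reaches its verdict.

```python
def _linear(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x formula)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def luminance(hex_color: str) -> float:
    """Relative luminance of a color given as '#RRGGBB'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """WCAG contrast ratio, from 1.0 (identical) to 21.0 (black on white)."""
    lighter, darker = sorted(
        (luminance(foreground), luminance(background)), reverse=True
    )
    return (lighter + 0.05) / (darker + 0.05)


AA_NORMAL_TEXT = 4.5  # WCAG AA threshold for normal-size text

old_ratio = contrast_ratio("#FFFFFF", "#0066CC")  # white label on blue
new_ratio = contrast_ratio("#FFFFFF", "#003366")  # white label on navy
```

Under that assumption, both variants clear the AA threshold and the navy background is the higher-contrast one, so from an accessibility standpoint the color change is harmless.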

Visual AI illustrates how the agent handles perception. But mabl’s agentic approach reaches further – it also encompasses the creation of tests from scratch.

Test Creation Agent – autonomous test authoring

One of the breakthrough features announced in mid-2025 and developed since then is the Test Creation Agent.

The process works as follows: a tester describes in natural language what the test should verify (e.g., “Verify that a user can add a product to the cart and proceed to payment”). On this basis, the agent independently plans the sequence of steps, identifies which parts of the test can be shared with other scenarios (e.g., the login flow), and generates a ready-to-run test in mabl’s cloud environment. The result naturally requires human verification – but the starting point is incomparably better than a blank canvas.
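The idea of spotting parts that can be shared between scenarios can be illustrated with a simple comparison of step lists. This is only a sketch with hypothetical step names; the Test Creation Agent works from a natural-language description and I am not suggesting it uses this method.

```python
def shared_prefix(*flows):
    """Return the leading steps that all given flows have in common."""
    prefix = []
    for steps in zip(*flows):
        if len(set(steps)) != 1:
            break
        prefix.append(steps[0])
    return prefix


# Hypothetical planned steps for two scenarios.
cart_flow = [
    "open login page",
    "enter credentials",
    "submit login",
    "add product to cart",
    "proceed to payment",
]
wishlist_flow = [
    "open login page",
    "enter credentials",
    "submit login",
    "add product to wishlist",
]

reusable = shared_prefix(cart_flow, wishlist_flow)
```

The three login steps come out as a candidate for a reusable flow, which is the kind of structure worth checking when reviewing a generated test.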

In my work, I have observed that creating new tests consumes the most time when delivering new features. If the agent can build a solid test skeleton that we can then manually verify and refine, we save time without sacrificing quality.

Application Summaries – context that changes the quality of generated tests

In January 2026, mabl introduced a new capability, Application Summaries, that strengthens the Test Creation Agent. This mechanism automatically generates a description of the web application under test, providing additional context when creating new tests.

One of the biggest challenges when using AI agents to generate tests is providing the right context.

How does Application Summaries work?

The mechanism operates automatically and requires no user intervention. When a test is created for a new web application, mabl begins generating an application summary within a few hours. Summaries are then periodically updated based on recent test activity. Most importantly, the Test Creation Agent uses this description as additional context when planning new test steps.

Two current limitations deserve mention: summaries are generated automatically and cannot be edited manually, and their processing and generation can take several hours.

Despite these two drawbacks, in my opinion, this is a significant step forward. Application Summaries address a problem familiar to anyone who has tried to generate tests with AI: the need to write detailed prompts that describe what the application under test actually is. This feature reduces the burden of crafting the perfect prompt and provides greater confidence in the agent’s planning process.

Summary – what does this mean for us as testers?

In my view, the key takeaway is that Agentic Testing does not replace the tester – it transforms the nature of their work. mabl consistently positions its agent as a digital co-worker that complements human expertise rather than eliminates it.

From my perspective, we must approach all the innovations in this space with curiosity, tempered by a healthy dose of skepticism. Let us not forget that technology is meant to help us perform our work better, often freeing us from repetitive, time-consuming tasks, so we have the time to view testing from the broader perspective of quality strategy and architecture.
