27.04.2026

From automation to autonomy: The evolution towards Agentic Testing, illustrated by mabl


The software development industry is experiencing a leap in productivity on two fronts. First, Quality Assurance (QA) is transforming into Quality Engineering (QE) – a shift from reactive bug-finding to proactive quality design throughout the entire software lifecycle. Second, the adoption of AI-assisted coding – from GitHub Copilot to built-in IDE assistants – means code is being produced faster than ever before. This is great news for project teams, but it is also a tremendous challenge for Quality Engineers.

In my work, I have noticed that QE teams increasingly face a dilemma: how do you maintain test coverage when the pace of code delivery keeps accelerating while headcount remains the same?

The market is responding with Agentic Testing – a concept in which the testing tool is no longer a passive script executor. Instead, it becomes an autonomous digital team member (Digital Teammate), capable of interpretation, decision-making, and adaptation. In this article, I explore in greater detail how the mabl platform realizes this vision.

Before we examine specific capabilities, however, it is worth understanding how mabl defines the concept of Agentic Testing itself and what foundation it builds upon.

What is Agentic Testing according to mabl?

In my view, the crucial distinction lies between AI-enhanced testing (where AI assists humans in targeted areas – for instance, through self-healing locators) and Agentic Testing proper.

mabl defines Agentic Testing as an approach in which our digital co-worker interprets, decides, and adapts – much like a human.

The platform describes this capability as the result of fusing two pillars.

The first pillar

Sophisticated Automation provides solid technical foundations. In practical terms, this means the platform supports not only standard web and mobile scenarios but also elements for which many competing tools require workarounds – Shadow DOM or content embedded in PDF files, to name just two. Equally important is the ability to run tests in parallel, which translates into real time savings for large regression suites.

The second pillar

Advanced Intelligence is the layer mabl has been developing since the platform’s inception. Notably, this is not simply a large language model “bolted on” to an existing tool. Depending on the task context, the platform draws on different mechanisms – from classical machine learning, through expert rule-based systems, to generative AI. This hybrid approach avoids situations in which GenAI is applied where a simpler algorithm would be faster and cheaper.

According to mabl’s creators, the agent gradually builds a data profile of the system, enabling it to optimize its own work and support the team’s productivity.

Five Core Skills of the Agentic Tester in mabl

mabl defines its testing agent through the lens of five Core Skills – a set of competencies that mirror the way an experienced tester works. Below, I briefly discuss each one.

Acting – interaction like a real user

According to mabl’s documentation, the agent interacts with the application the way a human would. Under the Acting heading, the platform lists:

  • clicking, entering data, and navigating in web browsers;
  • scrolling and entering data in mobile applications (iOS, Android);
  • direct API invocation and validation.

Automation tools often struggle with more complex UI elements. As part of its automation foundation, mabl also supports Shadow DOM, e-mail, and PDFs.

Observing – intelligent perception of application state

Simply acting is not enough – the agent must also understand what it sees. Under the Observing heading, the platform collects three types of data during test execution:

  • screenshots that later feed into Visual AI features,
  • DOM structure snapshots, which are invaluable for debugging (they allow comparison of the DOM at the point of success versus failure),
  • API call results, completing the picture of what is happening “under the hood” of the application.
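To show why DOM snapshots help with debugging, here is a minimal sketch using Python's standard `difflib`. The two snapshots are made-up HTML fragments, and the helper name `dom_diff` is my own; how mabl compares snapshots internally is not described in its documentation, so treat this only as an illustration of the idea.

```python
import difflib

# Hypothetical DOM snapshots captured at the same step of a passing and a failing run.
PASSING_DOM = """<form id="login">
  <input name="email">
  <input name="password">
  <button id="submit">Log in</button>
</form>"""

FAILING_DOM = """<form id="login">
  <input name="email">
  <input name="password">
  <button id="submit" disabled>Log in</button>
</form>"""


def dom_diff(passing: str, failing: str) -> list[str]:
    """Return only the lines that differ between two DOM snapshots."""
    diff = difflib.unified_diff(
        passing.splitlines(),
        failing.splitlines(),
        fromfile="passing",
        tofile="failing",
        lineterm="",
        n=0,  # no context lines, only the changes
    )
    # Keep added/removed lines; drop the "---"/"+++" file headers and "@@" hunk markers.
    return [
        line
        for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("---", "+++"))
    ]


changed_lines = dom_diff(PASSING_DOM, FAILING_DOM)
for line in changed_lines:
    print(line)
```

In this example the diff isolates the single changed line: the button gained a `disabled` attribute in the failing run, which is a far quicker lead than reading two full snapshots side by side.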

It is precisely this observational foundation that underpins one of the most compelling features – Visual AI, which I discuss in more detail later in this article.

Deciding & Reasoning – autonomous decision-making

This is the heart of the agentic concept. The mabl agent does not merely execute pre-programmed steps – it reasons and makes decisions. The agent is expected to independently analyze situations, draw conclusions, and react to changes in the application without human intervention.

In practice, this translates into specific features. Auto-Healing automatically repairs tests when the UI changes. Meanwhile, Visual Assist adds visual analysis of screenshots, enabling the agent to “see” the page layout as a human would. Combining two analytical methods – the classical (DOM) and the visual – makes tests far more resilient to frontend refactoring.
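The general idea behind self-healing locators can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not mabl's algorithm: the `Element` class, the `similarity` score, and the 0.5 threshold are all hypothetical. The sketch covers only the attribute-based (DOM) side; the visual analysis is out of scope here.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """A simplified stand-in for a DOM element."""
    tag: str
    attrs: dict[str, str] = field(default_factory=dict)
    text: str = ""


def similarity(recorded: Element, candidate: Element) -> float:
    """Fraction of the recorded properties that the candidate still matches."""
    checks = [recorded.tag == candidate.tag, recorded.text == candidate.text]
    checks += [candidate.attrs.get(k) == v for k, v in recorded.attrs.items()]
    return sum(checks) / len(checks)


def find_with_healing(recorded: Element, page: list[Element], threshold: float = 0.5):
    """Try the recorded id first; otherwise fall back to the most similar element.

    Returns (element, healed), where healed is True if the fallback was used.
    """
    recorded_id = recorded.attrs.get("id")
    if recorded_id is not None:
        for el in page:
            if el.attrs.get("id") == recorded_id:
                return el, False  # exact match, no healing needed
    best = max(page, key=lambda el: similarity(recorded, el), default=None)
    if best is not None and similarity(recorded, best) >= threshold:
        return best, True  # healed: located through its other properties
    return None, False


# The element as it looked when the test was recorded.
recorded = Element("button", {"id": "submit-btn", "class": "primary", "type": "submit"}, "Log in")

# The page after a refactoring that renamed the button's id.
page = [
    Element("a", {"href": "/help"}, "Help"),
    Element("button", {"id": "login-submit", "class": "primary", "type": "submit"}, "Log in"),
]

element, healed = find_with_healing(recorded, page)
print(element.attrs["id"], healed)
```

Although the id changed, the button still matches four of the five recorded properties, so the lookup succeeds and is flagged as healed, which lets a human review the repair later.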

Auto TFA (Test Failure Analysis) – automated analysis of test and plan failures

mabl automates the failure-analysis process – based on run results, the agent independently forms a diagnosis, categorizes the problem (e.g., application bug vs. test instability), and indicates where to begin the fix. This works at both the individual-test and the entire-test-plan levels.
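To illustrate what categorizing a failure can mean, below is a deliberately naive rule-based sketch in Python. The signals in `RunResult`, the category names, and the rules themselves are my assumptions for illustration; mabl's actual analysis is not documented at this level and is certainly richer.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunResult:
    """Hypothetical summary of one failed test run."""
    http_status: Optional[int]  # status of the last API call, if any
    element_found: bool         # was the target element located?
    passed_on_retry: bool       # did an immediate rerun pass?


def categorize(run: RunResult) -> str:
    """Assign a coarse category to a single failed run."""
    if run.passed_on_retry:
        return "test instability"
    if run.http_status is not None and run.http_status >= 500:
        return "application bug"
    if not run.element_found:
        return "test needs update"
    return "needs manual review"


def summarize_plan(runs: list[RunResult]) -> Counter:
    """Aggregate the categories of all failed runs in a test plan."""
    return Counter(categorize(run) for run in runs)


plan = [
    RunResult(http_status=500, element_found=True, passed_on_retry=False),
    RunResult(http_status=200, element_found=True, passed_on_retry=True),
    RunResult(http_status=None, element_found=False, passed_on_retry=False),
]
summary = summarize_plan(plan)
print(summary)
```

The point of the sketch is the two levels: `categorize` works on an individual test, while `summarize_plan` rolls the same diagnosis up to the whole plan.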

In my experience, Auto TFA is a feature with enormous potential for reducing the time from defect detection to resolution. In practice, we can skip manual log analysis and proceed directly to a ready-made diagnosis.

On top of all this, there is Intelligent Waits – an algorithm that automatically adjusts wait times, which reduces the “flaky tests” problem caused by network latency.
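The principle of adjusting wait times automatically can be sketched as follows. This is not mabl's algorithm, which is not public; the heuristic (mean plus three standard deviations of earlier load times), the function names, and the floor and ceiling values are assumptions chosen for illustration.

```python
import statistics
import time


def adaptive_timeout(past_durations, floor=1.0, ceiling=30.0):
    """Derive a timeout from earlier observations: mean plus three standard deviations."""
    if len(past_durations) < 2:
        return ceiling  # not enough history yet, stay conservative
    budget = statistics.mean(past_durations) + 3 * statistics.stdev(past_durations)
    return min(max(budget, floor), ceiling)


def wait_until(condition, timeout, poll_interval=0.05):
    """Poll `condition` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(poll_interval)
    return bool(condition())  # one final check at the deadline


# Load times (in seconds) observed for the same step in earlier runs.
history = [0.8, 1.0, 1.2]
timeout = adaptive_timeout(history)

# Simulated element that becomes available on the third poll.
polls = iter([False, False, True])
appeared = wait_until(lambda: next(polls), timeout=timeout, poll_interval=0.01)
print(appeared)
```

Compared with a fixed `sleep`, this approach returns as soon as the condition holds and allows a slow step extra time only in proportion to how slow it has been in the past.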

Learning & Remembering – continuous knowledge building

One of the more compelling differentiators is that the mabl Agent continuously learns from the tests it executes, allowing it to build sophisticated models of application behavior and performance. A key element introduced in October 2025 is AI Vectorization & Test Semantic Search.

The pivotal change lies in how tests are indexed: as the feature’s name suggests, they are vectorized. In practice, this means the agent is not limited to keyword matching but recognizes the actual purpose and operational function of each test within the system. This enables, for example, detecting that two tests with entirely different names in fact cover the same scenario.
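How vectors make it possible to spot two differently named tests that cover the same scenario can be shown with cosine similarity. In the sketch below, the three-dimensional vectors are invented by hand; in a real system they would come from an embedding model with far more dimensions. The test names and the 0.95 threshold are likewise hypothetical, and nothing here reflects mabl's internal implementation.

```python
import math
from itertools import combinations

# Hypothetical embedding vectors for three tests (hand-made for illustration).
TEST_VECTORS = {
    "checkout_smoke": [0.9, 0.1, 0.3],
    "TC-1042 buy item as guest": [0.88, 0.15, 0.28],
    "profile_avatar_upload": [0.1, 0.95, 0.2],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means the same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def likely_duplicates(vectors, threshold=0.95):
    """Return pairs of test names whose vectors are nearly parallel."""
    return [
        (name_a, name_b)
        for (name_a, vec_a), (name_b, vec_b) in combinations(vectors.items(), 2)
        if cosine_similarity(vec_a, vec_b) >= threshold
    ]


duplicates = likely_duplicates(TEST_VECTORS)
print(duplicates)
```

The two checkout tests share no keywords in their names, yet their vectors point in almost the same direction, so they are reported as a likely duplicate pair, while the avatar upload test is not.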

Integrating & Collaborating – working within the team ecosystem

The final skill is integration with existing tools and workflows.

The mabl MCP Server plays a particularly interesting role. Personally, I consider the MCP Server one of the most promising elements of the entire architecture. In practice, it works like this: a developer is working in their IDE, introduces a code change, and, without leaving the editor, can ask the agent, “Which tests might be affected by my change?” (Test Impact Analysis). The agent returns a list of related tests and, if a failure is detected, suggests what might be causing it. This represents a significant shift compared with the traditional flow.
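At its simplest, the logic behind Test Impact Analysis comes down to intersecting the changed files with what each test exercises. The sketch below assumes a hypothetical, hand-written coverage map and file paths; it does not use the MCP protocol or any mabl API, and the way mabl actually relates code changes to tests may differ considerably.

```python
# Hypothetical mapping from test name to the source files each test exercises.
COVERAGE_MAP = {
    "login_happy_path": {"src/auth/login.ts", "src/ui/LoginForm.tsx"},
    "checkout_with_card": {"src/cart/cart.ts", "src/payment/card.ts"},
    "password_reset": {"src/auth/reset.ts", "src/ui/LoginForm.tsx"},
}


def impacted_tests(changed_files, coverage_map):
    """Return the tests that touch at least one of the changed files."""
    changed = set(changed_files)
    return sorted(
        test for test, files in coverage_map.items() if files & changed
    )


affected = impacted_tests(["src/ui/LoginForm.tsx"], COVERAGE_MAP)
print(affected)
```

A change to the shared login form flags both tests that render it and leaves the checkout test alone, which is exactly the kind of answer a developer wants before pushing a commit.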

The five Core Skills form a general competency map for the agent. In the following sections, we will take a closer look at selected capabilities that grow out of this foundation – starting with one of the most advanced: Visual AI.

Visual AI – contextual visual regression detection

In the traditional approach, visual regression tests compare pixels. In practice, however, this can lead to false positives.

The main sources of the problem include:

  • dynamic content,
  • font-rendering differences across environments,
  • and the variability of third-party components.

These trigger unjustified alerts despite the absence of actual bugs in the application’s logic.

The question, then, is: how does Visual AI solve this problem?

According to mabl’s documentation, Visual AI does not compare pixels – it interprets visual meaning. This is described at three levels:

  1. Semantic understanding of UI elements – the system recognizes what a given element is and evaluates changes in context. To use a simple but apt example: a button’s background changes (say, from blue #0066CC to navy #003366). Visual AI assesses the change from a usability perspective – whether the element is still recognizable as a button, whether it meets accessibility requirements, and whether it still stands out from the background. If the answer to all these questions is “yes,” the change is accepted.
  2. Pattern recognition across states – recognizing patterns across different variants of the same view. Every modern web application can look different depending on context: logged-in users and guests see different layouts, the interface looks different on a phone versus a desktop, and then there are dark mode, loading spinners, or error messages. Rather than flagging each of these differences as potential issues, Visual AI builds a model of how the application should look in a given state and reports only deviations from that pattern.
  3. Visual intent recognition – at the highest level, Visual AI evaluates visual changes not in terms of “what changed” but “whether the page still fulfills its purpose.” Consider a login form: the font, background color, or spacing may change, but the key question is whether the user can still log in. Are the fields visible? Does the user know where to click? Will they receive feedback if they make a mistake? This approach – evaluating the interface’s intent rather than pixel-level conformity – eliminates the majority of false positives that plague traditional tools.
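One of the questions from the button example in the first point, whether the element still meets accessibility requirements, can be checked numerically. The sketch below computes the WCAG contrast ratio for the two background colors mentioned above, assuming white label text (my assumption; the article does not state the text color). It illustrates a single measurable criterion and is not a description of how Visual AI works internally.

```python
def relative_luminance(hex_color: str) -> float:
    """Relative luminance of an sRGB color, following the WCAG 2.x definition."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Convert each gamma-encoded sRGB channel to linear light.
    linear = [
        c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a: str, color_b: str) -> float:
    """WCAG contrast ratio between two colors, from 1.0 (identical) to 21.0."""
    lum_a = relative_luminance(color_a)
    lum_b = relative_luminance(color_b)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


AA_NORMAL_TEXT = 4.5  # WCAG AA minimum contrast for normal-size text

TEXT_COLOR = "#FFFFFF"  # assumed label color
for background in ("#0066CC", "#003366"):
    ratio = contrast_ratio(TEXT_COLOR, background)
    print(background, round(ratio, 2), ratio >= AA_NORMAL_TEXT)
```

Under this assumption, both backgrounds clear the 4.5:1 threshold (roughly 5.6:1 for the blue and 12.6:1 for the navy), so a pixel-level tool would report a difference while an accessibility-aware check would have no objection to the change.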

Visual AI illustrates how the agent handles perception. But mabl’s agentic approach reaches further – it also encompasses the creation of tests from scratch.

Test Creation Agent – autonomous test authoring

One of the breakthrough features announced in mid-2025 and developed since then is the Test Creation Agent.

The process works as follows: a tester describes in natural language what the test should verify (e.g., “Verify that a user can add a product to the cart and proceed to payment”). On this basis, the agent independently plans the sequence of steps, identifies which parts of the test can be shared with other scenarios (e.g., the login flow), and generates a ready-to-run test in mabl’s cloud environment. The result naturally requires human verification – but the starting point is incomparably better than a blank canvas.

In my work, I have observed that creating new tests consumes the most time when delivering new features. If the agent can build a solid test skeleton that we can then manually verify and refine, we save time without sacrificing quality.

Application Summaries – context that changes the quality of generated tests

In January 2026, mabl introduced a new capability, Application Summaries, that strengthens the Test Creation Agent. This mechanism automatically generates a description of the web application under test, providing additional context when creating new tests.

One of the biggest challenges when using AI agents to generate tests is providing the right context.

How does Application Summaries work?

The mechanism operates automatically and requires no user intervention. When a test is created for a new web application, mabl begins generating an application summary within a few hours. Summaries are then periodically updated based on recent test activity. Most importantly, the Test Creation Agent uses this description as additional context when planning new test steps.

Two current limitations deserve mention: summaries are generated automatically and cannot be edited manually, and their processing and generation can take several hours.

Despite these two drawbacks, in my opinion, this is a significant step forward. Application Summaries address a problem familiar to anyone who has tried to generate tests with AI: the need to write detailed prompts that describe what the application under test actually is. This feature reduces the burden of crafting the perfect prompt and provides greater confidence in the agent’s planning process.

Summary – what does this mean for us, testers?

In my view, the key takeaway is that Agentic Testing does not replace the tester – it transforms the nature of their work. mabl consistently positions its agent as a digital co-worker that complements human expertise rather than eliminates it.

From my perspective, we must approach all the innovations in this space with curiosity, tempered by a healthy dose of skepticism. Let us not forget that technology is meant to help us perform our work better, often freeing us from repetitive, time-consuming tasks, so we have the time to view testing from the broader perspective of quality strategy and architecture.


About the author

Michał Czyżowicz

A senior test and analysis engineer with extensive experience in project delivery, primarily for clients in the financial and pharmaceutical sectors. He combines a technical approach with a business mindset and a passion for innovation. He is actively involved in training initiatives and in his team's professional development. In his free time, he enjoys traveling and reading non-fiction.
