{"id":33356,"date":"2026-03-23T05:00:00","date_gmt":"2026-03-23T04:00:00","guid":{"rendered":"https:\/\/sii.pl\/blog\/zarzadzanie-danymi-testowymi-w-duchu-data-as-code-przy-uzyciu-synthesized\/"},"modified":"2026-03-18T16:05:38","modified_gmt":"2026-03-18T15:05:38","slug":"test-data-management-in-the-spirit-of-data-as-code-using-synthesized-platform","status":"publish","type":"post","link":"https:\/\/sii.pl\/blog\/en\/test-data-management-in-the-spirit-of-data-as-code-using-synthesized-platform\/","title":{"rendered":"Test Data Management in the Spirit of &#8220;Data as Code&#8221; using Synthesized Platform"},"content":{"rendered":"\n<p>In the world of modern software development, where CI\/CD and automation are industry standards, one piece of the puzzle remains an archaic bottleneck: test data. While application code can now be written with AI assistance in minutes and applications tested just as quickly on containers, preparing the proper data environment typically takes hours, days, or sometimes even weeks!<\/p>\n\n\n\n<p>Modern applications create and process ever-growing volumes of data, and business logic is fundamentally data-dependent \u2013 hence the growing need for large amounts of high-quality records. In this article, we will focus on common problems encountered in test data management and how the platform from our technology partner, Synthesized, can help address them.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>The value of Test Data Management (TDM) in the Quality Assurance process<\/strong><\/strong><\/h2>\n\n\n\n<p>Test Data Management (TDM) has been treated for years as a necessary, tedious obligation. 
However, in the era of GDPR\/HIPAA regulations and the growing complexity of microservices, TDM is becoming a critical element of any Quality Assurance strategy.<\/p>\n\n\n\n<p>The value of modern TDM rests on three pillars:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Speed (Time-to-Market):<\/strong> Developers and testers need data &#8220;here and now.&#8221; Waiting weeks for a slice of a production database is a waste of money and team momentum.<\/li>\n\n\n\n<li><strong>Quality (Shift-Left Testing):<\/strong> To detect bugs early, we need data that faithfully reflects production (including edge cases), not just the &#8220;happy path.&#8221;<\/li>\n\n\n\n<li><strong>Security (Compliance):<\/strong> Using live production data in test environments carries the risk of violating regulatory requirements. Under GDPR, fines can reach up to 20 million euros or 4% of global annual turnover, whichever is higher.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Typical problems encountered with static data management<\/strong><\/strong><\/h2>\n\n\n\n<p>The traditional approach to test data is typically static and plagued by a set of problems that QA engineers know all too well:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Unstable Environments:<\/strong> Shared databases are modified by multiple teams simultaneously. Test A overwrites the data needed by Test B, leading to so-called flaky tests \u2014 unreliable, non-deterministic tests.<\/li>\n\n\n\n<li><strong>PII Leakage Risk:<\/strong> Manual anonymization scripts are prone to human error. A single new column containing email addresses added during a sprint and overlooked in the masking script is enough to violate data protection regulations. 
Moreover, the anonymization process itself requires direct &#8220;contact&#8221; with sensitive data.<\/li>\n\n\n\n<li><strong>Enormous Storage Costs:<\/strong> Cloning full production databases (often terabytes in size) for every development or test environment is expensive and inefficient.<\/li>\n\n\n\n<li><strong>Lack of Consistency:<\/strong> Maintaining referential integrity (relationships between tables) during manual data generation or subsetting is highly error-prone. An unnoticed inconsistency can cause hard-to-trace defects or test result anomalies (false positives\/false negatives).<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>The Synthesized Platform and the &#8220;Data as Code&#8221; approach<\/strong><\/strong><\/h2>\n\n\n\n<p>This is where Synthesized enters the scene with its Data-as-Code philosophy. What does this actually mean?<\/p>\n\n\n\n<p>In this approach, test data definitions are treated as code \u2013 just like application source code. Instead of transferring copies of production databases as gigabytes of .sql or .dump files, we store lightweight configuration files in YAML format that describe what the data should look like.<\/p>\n\n\n\n<figure data-wp-context=\"{&quot;uploadedSrc&quot;:&quot;https:\\\/\\\/sii.pl\\\/blog\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/image1-4.png&quot;,&quot;figureClassNames&quot;:&quot;wp-block-image aligncenter size-full&quot;,&quot;figureStyles&quot;:null,&quot;imgClassNames&quot;:&quot;wp-image-33199&quot;,&quot;imgStyles&quot;:null,&quot;targetWidth&quot;:797,&quot;targetHeight&quot;:440,&quot;scaleAttr&quot;:false,&quot;ariaLabel&quot;:&quot;Enlarge image: The Synthesized platform&quot;,&quot;alt&quot;:&quot;The Synthesized platform&quot;}\" data-wp-interactive=\"core\/image\" class=\"wp-block-image aligncenter size-full wp-lightbox-container\"><img decoding=\"async\" width=\"797\" height=\"440\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on-async--click=\"actions.showLightbox\" 
data-wp-on-async--load=\"callbacks.setButtonStyles\" data-wp-on-async-window--resize=\"callbacks.setButtonStyles\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image1-4.png\" alt=\"The Synthesized platform\" class=\"wp-image-33199\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image1-4.png 797w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image1-4-300x166.png 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image1-4-768x424.png 768w\" sizes=\"(max-width: 797px) 100vw, 797px\" \/><button\n\t\t\tclass=\"lightbox-trigger\"\n\t\t\ttype=\"button\"\n\t\t\taria-haspopup=\"dialog\"\n\t\t\taria-label=\"Enlarge image: The Synthesized platform\"\n\t\t\tdata-wp-init=\"callbacks.initTriggerButton\"\n\t\t\tdata-wp-on-async--click=\"actions.showLightbox\"\n\t\t\tdata-wp-style--right=\"context.imageButtonRight\"\n\t\t\tdata-wp-style--top=\"context.imageButtonTop\"\n\t\t>\n\t\t\t<svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"12\" height=\"12\" fill=\"none\" viewBox=\"0 0 12 12\">\n\t\t\t\t<path fill=\"#fff\" d=\"M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z\" \/>\n\t\t\t<\/svg>\n\t\t<\/button><figcaption class=\"wp-element-caption\">Fig. 1 The Synthesized platform (<a href=\"https:\/\/docs.synthesized.io\/home\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >source<\/a>)<\/figcaption><\/figure>\n\n\n\n<p>The Synthesized platform offers:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Declarative Configuration:<\/strong> You describe in a configuration file what you want to achieve, and the platform&#8217;s engine takes care of the execution.<\/li>\n\n\n\n<li><strong>Versioning:<\/strong> The data generation configuration &#8220;lives&#8221; in the Git repository alongside the application code. 
As a result, data definitions evolve alongside the database schema.<\/li>\n\n\n\n<li><strong>Repeatability:<\/strong> Every team member can generate the same dataset on their local machine using a single command.<\/li>\n\n\n\n<li><strong>Referential Integrity:<\/strong> The tool &#8220;learns&#8221; the structure of your data \u2014 relationships, foreign keys, and statistical distributions \u2014 and then generates data while preserving proportions and, most importantly, consistency with the model.<\/li>\n<\/ul>\n\n\n\n<p>This approach fundamentally shifts the process from copying resources to maintaining lightweight, high-level descriptions of expected data.<\/p>\n\n\n\n<p>The technical capabilities described above are not everything \u2014 data security remains paramount. Synthesized does not fetch or transmit data outside the client&#8217;s network. The GenAI algorithm that inspects data structure, generates test data based on production, or anonymizes sensitive data never copies data from the production database. Data remains secure, with no possibility of leakage or unauthorized access.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Key features \u2013 what can this &#8220;Swiss Army knife&#8221; do?<\/strong><\/strong><\/h2>\n\n\n\n<p>The Synthesized platform delivers four key capabilities that address the problems described earlier.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong><strong>PII Scanning (Sensitive data detection)<\/strong><\/strong><\/h3>\n\n\n\n<p>Before you do anything with data, you need to know where the risks are hiding. Synthesized includes a scanning module that automatically analyzes the database schema and data samples to identify potential Personally Identifiable Information (PII).<\/p>\n\n\n\n<p><strong>How it works:<\/strong> The tool flags columns containing national ID numbers, email addresses, or credit card numbers, and suggests appropriate transformations. 
It works somewhat like a linter for data security.<\/p>\n\n\n\n<p>The data scanning mechanism can be easily customized to meet organizational requirements by adding new rules for identifying sensitive data, as well as modifying existing rules \u2013 for example, adding the name of a column containing sensitive data in a non-English language.<\/p>\n\n\n\n<figure data-wp-context=\"{&quot;uploadedSrc&quot;:&quot;https:\\\/\\\/sii.pl\\\/blog\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/image2-3.png&quot;,&quot;figureClassNames&quot;:&quot;wp-block-image aligncenter size-large&quot;,&quot;figureStyles&quot;:null,&quot;imgClassNames&quot;:&quot;wp-image-33201&quot;,&quot;imgStyles&quot;:null,&quot;targetWidth&quot;:2990,&quot;targetHeight&quot;:1712,&quot;scaleAttr&quot;:false,&quot;ariaLabel&quot;:&quot;Enlarge image: PII Scanning&quot;,&quot;alt&quot;:&quot;PII Scanning&quot;}\" data-wp-interactive=\"core\/image\" class=\"wp-block-image aligncenter size-large wp-lightbox-container\"><img decoding=\"async\" width=\"1024\" height=\"586\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on-async--click=\"actions.showLightbox\" data-wp-on-async--load=\"callbacks.setButtonStyles\" data-wp-on-async-window--resize=\"callbacks.setButtonStyles\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image2-3-1024x586.png\" alt=\"PII Scanning\" class=\"wp-image-33201\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image2-3-1024x586.png 1024w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image2-3-300x172.png 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image2-3-768x440.png 768w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image2-3-1536x879.png 1536w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image2-3-2048x1173.png 2048w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><button\n\t\t\tclass=\"lightbox-trigger\"\n\t\t\ttype=\"button\"\n\t\t\taria-haspopup=\"dialog\"\n\t\t\taria-label=\"Enlarge image: PII 
Scanning\"\n\t\t\tdata-wp-init=\"callbacks.initTriggerButton\"\n\t\t\tdata-wp-on-async--click=\"actions.showLightbox\"\n\t\t\tdata-wp-style--right=\"context.imageButtonRight\"\n\t\t\tdata-wp-style--top=\"context.imageButtonTop\"\n\t\t>\n\t\t\t<svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"12\" height=\"12\" fill=\"none\" viewBox=\"0 0 12 12\">\n\t\t\t\t<path fill=\"#fff\" d=\"M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z\" \/>\n\t\t\t<\/svg>\n\t\t<\/button><figcaption class=\"wp-element-caption\">Fig. 2 PII Scanning (<a href=\"https:\/\/docs.synthesized.io\/tdk\/latest\/user_guide\/030_core_concepts\/pii\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >source<\/a>)<\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><strong><strong>Masking (Data Masking and Anonymization)<\/strong><\/strong><\/h3>\n\n\n\n<p>Data masking with the Synthesized platform is not the classic &#8220;<em>replace every character with X.<\/em>&#8221;<\/p>\n\n\n\n<p>Synthesized offers advanced masking methods that preserve both data usability and consistency.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Format-preserving encryption: Anonymized data retains the original format, which is critical for front-end validations or API requests. For example, a masked phone number still appears to be a valid phone number, even though the actual data has been replaced with generated values.<\/li>\n\n\n\n<li>Deterministic consistency: Anonymized data preserves relationships between records. Masked data changes its surface representation without breaking logical relationships. 
When anonymizing customer data, if a record named &#8220;Jan Kowalski&#8221; appears in both the Users and Orders tables, after masking it as &#8220;Adam Nowak&#8221; (or cr56V^$ B%x#9G, depending on the chosen masking mode), that change will be reflected in both tables in the same way, preserving the original relationships between them (Referential Integrity).<\/li>\n\n\n\n<li>Choice of masking mode \u2013 depending on the selected configuration, data can be anonymized in one of two modes:\n<ul class=\"wp-block-list\">\n<li>Replacing the value of a sensitive record with a random set of characters. This allows us to distinguish between anonymized data and manually entered data in the test environment.<\/li>\n<\/ul>\n<\/li>\n<\/ul>\n\n\n\n<figure data-wp-context=\"{&quot;uploadedSrc&quot;:&quot;https:\\\/\\\/sii.pl\\\/blog\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/image3-2.png&quot;,&quot;figureClassNames&quot;:&quot;wp-block-image aligncenter size-full&quot;,&quot;figureStyles&quot;:null,&quot;imgClassNames&quot;:&quot;wp-image-33203&quot;,&quot;imgStyles&quot;:null,&quot;targetWidth&quot;:842,&quot;targetHeight&quot;:500,&quot;scaleAttr&quot;:false,&quot;ariaLabel&quot;:&quot;Enlarge image: Data Masking and Anonymization&quot;,&quot;alt&quot;:&quot;Data Masking and Anonymization&quot;}\" data-wp-interactive=\"core\/image\" class=\"wp-block-image aligncenter size-full wp-lightbox-container\"><img decoding=\"async\" width=\"842\" height=\"500\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on-async--click=\"actions.showLightbox\" data-wp-on-async--load=\"callbacks.setButtonStyles\" data-wp-on-async-window--resize=\"callbacks.setButtonStyles\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image3-2.png\" alt=\"Data Masking and Anonymization\" class=\"wp-image-33203\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image3-2.png 842w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image3-2-300x178.png 300w, 
https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image3-2-768x456.png 768w\" sizes=\"(max-width: 842px) 100vw, 842px\" \/><button\n\t\t\tclass=\"lightbox-trigger\"\n\t\t\ttype=\"button\"\n\t\t\taria-haspopup=\"dialog\"\n\t\t\taria-label=\"Enlarge image: Data Masking and Anonymization\"\n\t\t\tdata-wp-init=\"callbacks.initTriggerButton\"\n\t\t\tdata-wp-on-async--click=\"actions.showLightbox\"\n\t\t\tdata-wp-style--right=\"context.imageButtonRight\"\n\t\t\tdata-wp-style--top=\"context.imageButtonTop\"\n\t\t>\n\t\t\t<svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"12\" height=\"12\" fill=\"none\" viewBox=\"0 0 12 12\">\n\t\t\t\t<path fill=\"#fff\" d=\"M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z\" \/>\n\t\t\t<\/svg>\n\t\t<\/button><figcaption class=\"wp-element-caption\">Fig. 3 Data Masking and Anonymization<\/figcaption><\/figure>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Replacing the value of a sensitive record with a value generated by the data generator. For example, a customer named &#8220;Clint Eastwood&#8221; would be masked as &#8220;Walt Kowalsky&#8221; \u2013 a randomly generated but realistic-looking name. 
This allows us to use &#8220;production-like&#8221; data without concern about using sensitive information.<\/li>\n<\/ul>\n\n\n\n<figure data-wp-context=\"{&quot;uploadedSrc&quot;:&quot;https:\\\/\\\/sii.pl\\\/blog\\\/wp-content\\\/uploads\\\/2026\\\/02\\\/image4-1.png&quot;,&quot;figureClassNames&quot;:&quot;wp-block-image aligncenter size-full&quot;,&quot;figureStyles&quot;:null,&quot;imgClassNames&quot;:&quot;wp-image-33205&quot;,&quot;imgStyles&quot;:null,&quot;targetWidth&quot;:832,&quot;targetHeight&quot;:446,&quot;scaleAttr&quot;:false,&quot;ariaLabel&quot;:&quot;Enlarge image: Data Masking and Anonymization&quot;,&quot;alt&quot;:&quot;Data Masking and Anonymization&quot;}\" data-wp-interactive=\"core\/image\" class=\"wp-block-image aligncenter size-full wp-lightbox-container\"><img decoding=\"async\" width=\"832\" height=\"446\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on-async--click=\"actions.showLightbox\" data-wp-on-async--load=\"callbacks.setButtonStyles\" data-wp-on-async-window--resize=\"callbacks.setButtonStyles\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image4-1.png\" alt=\"Data Masking and Anonymization\" class=\"wp-image-33205\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image4-1.png 832w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image4-1-300x161.png 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/image4-1-768x412.png 768w\" sizes=\"(max-width: 832px) 100vw, 832px\" \/><button\n\t\t\tclass=\"lightbox-trigger\"\n\t\t\ttype=\"button\"\n\t\t\taria-haspopup=\"dialog\"\n\t\t\taria-label=\"Enlarge image: Data Masking and Anonymization\"\n\t\t\tdata-wp-init=\"callbacks.initTriggerButton\"\n\t\t\tdata-wp-on-async--click=\"actions.showLightbox\"\n\t\t\tdata-wp-style--right=\"context.imageButtonRight\"\n\t\t\tdata-wp-style--top=\"context.imageButtonTop\"\n\t\t>\n\t\t\t<svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"12\" height=\"12\" fill=\"none\" viewBox=\"0 0 12 
12\">\n\t\t\t\t<path fill=\"#fff\" d=\"M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z\" \/>\n\t\t\t<\/svg>\n\t\t<\/button><figcaption class=\"wp-element-caption\">Fig. 4 Data Masking and Anonymization<\/figcaption><\/figure>\n\n\n\n<h3 class=\"wp-block-heading\"><strong><strong>Subsetting (Data Subset Extraction)<\/strong><\/strong><\/h3>\n\n\n\n<p>A typical use case involves testing a regression or bug fix in a production-like environment \u2013 but without loading the full data volume.<\/p>\n\n\n\n<p>When we need to populate our environment with, say, thousands of records while the production database contains millions, we would usually take a sample, e.g., &#8220;SELECT the 1,000 most recent records from the table.&#8221; Data retrieved this way may not accurately reflect the distribution of production data, which can significantly impact test results.<\/p>\n\n\n\n<p>Using the Subsetting feature in the Synthesized platform, we apply a Smart Slicing approach that, within the expected data volume, keeps the structure and distribution as close as possible to those of the production environment. For instance, a developer does not need 10 TB of production data to fix a single bug. They need a representative slice (e.g., 1% of the database) that preserves all relationships. Synthesized allows you to extract a data subset by automatically scanning foreign keys. When pulling data for a single customer, the tool will also retrieve their orders, addresses, and login history, while ignoring all other customers&#8217; data. 
With Smart Slicing, we significantly reduce infrastructure costs and environment setup time, while also mitigating the risk of testing a change against non-representative data.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong><strong>Generate (Synthetic Data Generation)<\/strong><\/strong><\/h3>\n\n\n\n<p>This is the feature that sets Synthesized apart from traditional test data management tools. Using machine learning models (AI\/ML), the platform can &#8220;learn&#8221; the structure and statistical distribution from production data, and then generate entirely new records.<\/p>\n\n\n\n<p>Newly created records contain no real production data, yet they retain consistency and a distribution aligned with the production environment.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Use case: Ideal when you have too little production data (e.g., a new system) or when you need to perform load testing and must create new data that resembles production records in distribution and model, but in much greater quantities.<\/li>\n\n\n\n<li>Edge cases: Thanks to the flexibility of the Data as Code approach, the generator can be configured to create specific edge cases that rarely occur in production but are critical for system stability. Users can independently override the statistical distribution of any value. For example, if we need to test a new business rule that triggers when the list of receivables with an &#8220;OVERDUE&#8221; status exceeds 10%, but our production data contains only 5% such statuses, we can override the statistical distribution in the configuration file to hard-code the generation of 11% &#8220;OVERDUE&#8221; records. The records will be synthetically consistent \u2013 this is not a simple status replacement, but the generation of a new record along with all related entities.<\/li>\n\n\n\n<li>Data generation can also be used to clean our test data resources. 
By combining Subsetting with data generation, Synthesized makes it easy to create a configuration file that cleans a test environment of redundant data (you can configure the removal of 90% of data after a test suite) or wipes it entirely to zero. When using the &#8220;Data Cleanup&#8221; feature, you can choose to preserve or drop the database schema.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Supported formats<\/strong><\/strong><\/h2>\n\n\n\n<p>Synthesized.io offers a wide range of capabilities for working with the most commonly used database engines.<\/p>\n\n\n\n<p>As data sources, the platform natively supports:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>PostgreSQL<\/li>\n\n\n\n<li>MySQL<\/li>\n\n\n\n<li>MariaDB<\/li>\n\n\n\n<li>SQLite<\/li>\n\n\n\n<li>Oracle<\/li>\n\n\n\n<li>MSSQL<\/li>\n\n\n\n<li>H2<\/li>\n\n\n\n<li>DB2 (including mainframe)<\/li>\n\n\n\n<li>Snowflake<\/li>\n\n\n\n<li>SAP HANA<\/li>\n<\/ul>\n\n\n\n<p>As well as flat files such as .csv.<\/p>\n\n\n\n<p>An additional convenience for testing enterprise-class systems is the set of built-in connectors that enable seamless use of production data for anonymization, subsetting, or generation.<\/p>\n\n\n\n<p>Dedicated support is available for the following platforms:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>SAP<\/li>\n\n\n\n<li>ServiceNow<\/li>\n\n\n\n<li>Microsoft Dynamics 365<\/li>\n\n\n\n<li>Workday<\/li>\n\n\n\n<li>Salesforce<\/li>\n\n\n\n<li>and many others.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>CI\/CD integrations<\/strong><\/strong><\/h2>\n\n\n\n<p>Synthesized was designed for seamless CI\/CD integration. It enables test data generation &#8220;on the fly&#8221; during the execution of various test levels on remote resources. To integrate with CI\/CD systems, Synthesized provides the Synthesized CLI. 
This command-line tool allows configured data generation or anonymization processes to be triggered from the console, without a graphical interface.<\/p>\n\n\n\n<p>The Governor UI (user interface) allows you to manage projects, user access, and data sources, and to create and manage configuration files easily.<\/p>\n\n\n\n<p>Thanks to the Synthesized CLI, you can import a JSON schema into your IDE and create and manage Synthesized configuration files as part of your automated test project.<\/p>\n\n\n\n<p>A typical pipeline scenario using Synthesized looks as follows:<\/p>\n\n\n\n<ol class=\"wp-block-list\">\n<li><strong>Commit:<\/strong> A developer pushes a code change (e.g., a migration adding a new table).<\/li>\n\n\n\n<li><strong>Build:<\/strong> The CI server builds the application.<\/li>\n\n\n\n<li><strong>Data Provisioning:<\/strong> In the test step, the CI server runs the Synthesized TDK. The tool fetches the data definition (in YAML format), connects to the secure data source, applies masking, and provisions a lightweight database (e.g., a Docker container with PostgreSQL populated with synthetic data).<\/li>\n\n\n\n<li><strong>Test:<\/strong> Automated tests are executed against this fresh, ephemeral database.<\/li>\n\n\n\n<li><strong>Cleanup:<\/strong> Once testing is complete, the environment is cleaned up and the container is removed.<\/li>\n<\/ol>\n\n\n\n<p>This process guarantees that every test run is executed against a clean, predictable, and secure dataset. 
At no point is sensitive data used or transmitted outside the organization&#8217;s internal network.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Summary<\/strong><\/strong><\/h2>\n\n\n\n<p>Transitioning to the Data as Code model with tools like Synthesized is a natural step in DevOps evolution. It allows organizations to regain control over the chaos of test data, reduce cloud costs, and \u2013 most importantly \u2013 sleep soundly knowing that customer data is secure and the software being released has been thoroughly tested.<\/p>\n\n\n\n<p>If your organization is still manually copying production backups, perhaps it is time to treat data with the same seriousness as code. 
In the next articles in this series, we will look at how to configure the free version of the Synthesized platform to explore its capabilities hands-on.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>It&#8217;s worth checking out<\/strong><\/h2>\n\n\n\n<p>Additionally, I recommend checking out the official tool website and published documentation:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/www.synthesized.io\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >Enterprise-grade test data automation at the speed of AI<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/docs.synthesized.io\/tdk\/latest\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >Synthesized Platform Documentation<\/a><\/li>\n\n\n\n<li><a href=\"https:\/\/github.com\/synthesized-io\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >Synthesized<\/a><\/li>\n<\/ul>\n","protected":false},"excerpt":{"rendered":"<p>In the world of modern software development, where CI\/CD and automation are industry standards, one piece of the puzzle remains &hellip; <a class=\"continued-btn\" href=\"https:\/\/sii.pl\/blog\/en\/test-data-management-in-the-spirit-of-data-as-code-using-synthesized-platform\/\">Continued<\/a><\/p>\n","protected":false},"author":778,"featured_media":33210,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_editorskit_title_hidden":false,"_editorskit_reading_time":0,"_editorskit_is_block_options_detached":false,"_editorskit_block_options_position":"{}","inline_featured_image":false,"footnotes":""},"categories":[1317],"tags":[9691,9692,1590,1526,1421],"class_list":["post-33356","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-testowanie","tag-data-as-code-en","tag-synthesized-en","tag-tools","tag-guidebook","tag-testing-en"],"acf":[],"aioseo_notices":[],"republish_history":[],"featured_media_url":"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/02\/Update-1.jpg","category_names":["Testowanie"],"_links":{"self":[{"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/posts\/33356"}],"collection":[{"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/users\/778"}],"replies":[{"embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/comments?post=33356"}],"version-history":[{"count":1,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/posts\/33356\/revisions"}],"predecessor-version":[{"id":33360,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/posts\/33356\/revisions\/33360"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/media\/33210"}],"wp:attachment":[{"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/media?parent=33356"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/categories?post=33356"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/tags?post=33356"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}