{"id":33666,"date":"2026-04-24T05:00:00","date_gmt":"2026-04-24T03:00:00","guid":{"rendered":"https:\/\/sii.pl\/blog\/?p=33666"},"modified":"2026-04-23T15:59:45","modified_gmt":"2026-04-23T13:59:45","slug":"practical-application-of-ai-models-in-programming-a-comparison-of-gpt-5-4-and-claude-opus-4-7","status":"publish","type":"post","link":"https:\/\/sii.pl\/blog\/en\/practical-application-of-ai-models-in-programming-a-comparison-of-gpt-5-4-and-claude-opus-4-7\/","title":{"rendered":"Practical application of AI models in programming. A comparison of GPT-5.4 and Claude Opus 4.7"},"content":{"rendered":"\n<p>The days when artificial intelligence was used solely for code auto-completion are behind us. In 2026, AI agents like OpenAI&#8217;s GPT-5.4 and Anthropic&#8217;s Claude Opus 4.7 act as fully-fledged programming assistants. They can independently analyze bugs, refactor entire systems, and ensure application security.<\/p>\n\n\n\n<p>How do you choose the right model for a specific task? We will look at their capabilities, costs, and ecosystems to help you make that decision.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>The most popular AI models for developers in 2026<\/strong><\/h2>\n\n\n\n<p>Today&#8217;s programming tools ecosystem is dominated by two models that best illustrate the shift in software development:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>GPT-5.4<\/strong><\/li>\n\n\n\n<li><strong>Claude Opus 4.7<\/strong><\/li>\n<\/ul>\n\n\n\n<p>At first glance, their capabilities seem similar. Both models can efficiently generate code, analyze complex problems, perform refactoring, and support debugging.<\/p>\n\n\n\n<p>GPT-5.4 is the latest offer from OpenAI. This model combines Codex&#8217;s proven capabilities with advanced reasoning, native computer use, and a massive context window (<a href=\"https:\/\/openai.com\/index\/introducing-gpt-5-4\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >source<\/a>). 
It represents an evolution of the earlier GPT-5.3 Codex, which still works great as a cheaper alternative for purely terminal-based tasks.<\/p>\n\n\n\n<p>Claude Opus 4.7 was introduced by Anthropic as the most capable publicly available model for advanced programming and analytical reasoning (<a href=\"https:\/\/www.anthropic.com\/news\/claude-opus-4-7\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >source<\/a>).<\/p>\n\n\n\n<p>So where does the real difference lie? It lies in <strong>effectiveness for specific types of tasks<\/strong>, the <strong>tool ecosystem<\/strong>, and the <strong>quality-to-cost ratio<\/strong>.<\/p>\n\n\n\n<p>Currently, both models are widely available via API and integrated with the most popular tools, such as Cursor, Windsurf, and GitHub Copilot. As of February 2026, both OpenAI and Anthropic solutions operate as agents in GitHub Copilot (<a href=\"https:\/\/github.blog\/changelog\/2026-02-04-claude-and-codex-are-now-available-in-public-preview-on-github\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >source<\/a>).<\/p>\n\n\n\n<p>The market division, therefore, is not about where a given model is available, but rather in which tool ecosystem it operates most effectively:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>OpenAI<\/strong> offers deep integration with GitHub Copilot (both in the IDE and during PR review), has its own Codex IDE, extensions for popular editors (VS Code, Cursor, Windsurf), and ready-made libraries for CI\/CD. This ecosystem is deeply embedded in the daily developer pipeline.<\/li>\n\n\n\n<li><strong>Anthropic<\/strong> offers Claude Code (a CLI agent for terminal work), and its models are a frequently chosen option by Cursor users. 
Their approach is decidedly more agent-centric and API-first.<\/li>\n<\/ul>\n\n\n\n<p>Choosing a model often comes down to deciding which ecosystem you want to work in.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>How have these models changed?<\/strong><\/strong><\/h2>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><\/td><td><strong>Previously<\/strong><\/td><td><strong>Currently<\/strong><\/td><\/tr><tr><td><strong>OpenAI<\/strong> (5.3 Codex \u2192 5.4)<\/td><td>Generated code based on a prompt, worked on single files.<\/td><td>GPT-5.4 combines coding with native computer use, reasoning, and a massive context (1.05M tokens).<\/td><\/tr><tr><td><strong>Opus<\/strong> (4.6 \u2192 4.7)<\/td><td>Generated answers well but had limitations with more complex analyses.<\/td><td>Conducts multi-step reasoning, maintains the context of large systems, and generates and refactors code at a high level. Version 4.7 brought a jump on SWE-bench Pro from 53.4% to 64.3%.<\/td><\/tr><\/tbody><\/table><figcaption class=\"wp-element-caption\">Tab. 1 How have these models changed?<\/figcaption><\/figure>\n\n\n\n<p><strong>In short,<\/strong> both models have advanced from the role of mere &#8220;snippet generators&#8221; to the level of <strong>fully-fledged programming agents<\/strong>. 
Today, they differ in how they approach solving a problem, not whether they can solve it at all.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>How does Opus 4.6 differ from 4.7?<\/strong><\/strong><\/h2>\n\n\n\n<p>Released in mid-April, Opus 4.7 introduces several changes that, in practice, translate into better results in selected scenarios:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Area<\/strong><\/td><td><strong>Opus 4.6<\/strong><\/td><td><strong>Opus 4.7<\/strong><\/td><td><strong>Change<\/strong><\/td><\/tr><tr><td><strong>SWE-bench Pro<\/strong> (coding)<\/td><td>53.4%<\/td><td>64.3%<\/td><td>+10.9 pp<\/td><\/tr><tr><td><strong>CursorBench<\/strong><\/td><td>58%<\/td><td>70%<\/td><td>+12 pp<\/td><\/tr><tr><td><strong>OfficeQA Pro<\/strong> (document reasoning)<\/td><td>baseline<\/td><td>\u221221% errors<\/td><td>significant improvement<\/td><\/tr><tr><td><strong>Vision<\/strong> (max resolution)<\/td><td>1.15 MP<\/td><td>3.75 MP<\/td><td>3\u00d7 more<\/td><\/tr><\/tbody><\/table><figcaption class=\"wp-element-caption\">Tab. 2 How does Opus 4.6 differ from 4.7? 
<em>(Benchmark data comes from Anthropic materials and results published by partners, e.g., CursorBench, OfficeQA Pro; <\/em><a href=\"https:\/\/www.anthropic.com\/news\/claude-opus-4-7\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" ><em>source<\/em><\/a><em>).<\/em><\/figcaption><\/figure>\n\n\n\n<p>Among the most important new features in version 4.7 are:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>xhigh effort level<\/strong> \u2013 a new parameter created specifically for tasks requiring the deepest reasoning.<\/li>\n\n\n\n<li><strong>Task budgets (beta)<\/strong> \u2013 a feature allowing the model to manage its token budget in agentic loops.<\/li>\n\n\n\n<li><strong>New tokenizer<\/strong> \u2013 a small note here: it increases token usage by about 35%, which realistically raises operational costs while maintaining the existing pricing ($5\/M input, $25\/M output).<\/li>\n<\/ul>\n\n\n\n<p><strong>Important practical caveat:<\/strong> Opus 4.7 adheres much more strictly to instructions (i.e., stricter instruction-following). This means that prompts that worked perfectly in version 4.6 may require minor adjustments. It is worth testing them before a full migration.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>How was GPT-5.4 created?<\/strong><\/strong><\/h2>\n\n\n\n<p>It&#8217;s worth clarifying one thing: GPT-5.4 is not simply a new version of the Codex model. 
It is a <strong>comprehensive, universal model that absorbed all the programming skills of its predecessors<\/strong>, and then expanded them with native computer use, advanced research, and powerful context.<\/p>\n\n\n\n<p>To better understand this, let&#8217;s look at the evolution of OpenAI tools:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>GPT-5.2-Codex<\/strong> (late 2025) \u2013 the first standalone agent in the company&#8217;s lineup.<\/li>\n\n\n\n<li><strong>GPT-5.3-Codex<\/strong> (February 2026) \u2013 an update: a model 25% faster, performing excellently in the terminal (77.3% in Terminal-Bench), and the first to receive the highest rating in cybersecurity (<a href=\"https:\/\/openai.com\/index\/introducing-gpt-5-3-codex\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >source<\/a>).<\/li>\n\n\n\n<li><strong>GPT-5.4<\/strong> (March 2026) \u2013 OpenAI combined Codex&#8217;s programming precision with general reasoning and a context window accommodating up to 1.05M tokens (<a href=\"https:\/\/openai.com\/index\/introducing-gpt-5-4\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >source<\/a>).<\/li>\n<\/ul>\n\n\n\n<p>Importantly, the older GPT-5.3 Codex has not disappeared from the market. It remains available as the most affordable solution for <strong>purely terminal-based tasks and CLI work<\/strong>. If your daily workflow relies heavily on the console, scripts, and automation, Codex 5.3 will still hit the bullseye.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Work characteristics: GPT-5.4 and Claude Opus 4.7<\/strong><\/strong><\/h2>\n\n\n\n<p>Although both models can easily handle most daily programming tasks, their approaches to work differ. 
And this is what determines their effectiveness in specific scenarios.<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Area<\/strong><\/td><td><strong>GPT-5.4<\/strong><\/td><td><strong>Opus 4.7<\/strong><\/td><\/tr><tr><td><strong>Default style<\/strong><\/td><td>Quickly moves to implementation<\/td><td>Analyzes first, then acts<\/td><\/tr><tr><td><strong>Strength<\/strong><\/td><td>Speed, computer use, and broad capabilities<\/td><td>Depth of reasoning, maintaining context<\/td><\/tr><tr><td><a href=\"https:\/\/www.swebench.com\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" ><strong>SWE-bench Pro<\/strong><\/a><\/td><td>57.7% (per <a href=\"https:\/\/openai.com\/index\/introducing-gpt-5-4\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >OpenAI<\/a>)<\/td><td><strong>64.3%<\/strong> (per <a href=\"https:\/\/www.anthropic.com\/news\/claude-opus-4-7\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >Anthropic<\/a>)<\/td><\/tr><tr><td><strong>Context<\/strong><\/td><td><strong>1.05M tokens<\/strong><\/td><td>1M tokens<\/td><\/tr><tr><td><strong>Computer use<\/strong><\/td><td><strong>Native<\/strong> (<a href=\"https:\/\/os-world.github.io\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >OSWorld<\/a> 75%)<\/td><td>Supported (Computer Use API, vision 3.75 MP)<\/td><\/tr><tr><td><strong>Ecosystem (deepest integration)<\/strong><\/td><td>GitHub Copilot (PR review, Copilot in IDE), Codex IDE, CI\/CD extensions<\/td><td>Cursor, Claude Code (CLI), API-first<\/td><\/tr><tr><td><strong>Security<\/strong><\/td><td>Codex Security &#8211; proactive vulnerability scanning in the pipeline<\/td><td>Consciously narrowed capabilities in cybersecurity (per Anthropic, relative to Mythos)<\/td><\/tr><tr><td><strong>Approach to risk<\/strong><\/td><td>Broad capabilities + control layers<\/td><td>Limiting capabilities at the 
source<\/td><\/tr><\/tbody><\/table><figcaption class=\"wp-element-caption\">Tab. 3 Work characteristics: GPT-5.4 and Claude Opus 4.7<\/figcaption><\/figure>\n\n\n\n<p>The last row of the table clearly illustrates the fundamental difference in the two providers&#8217; philosophies. <\/p>\n\n\n\n<p>OpenAI provides a model with incredibly broad capabilities but imposes external control layers on it (such as Codex Security or Trusted Access for Cyber). Anthropic chooses a different path: some potentially dangerous capabilities are narrowed down even before the model is publicly released (according to the company&#8217;s declaration, Opus 4.7&#8217;s cybersecurity capabilities were consciously limited compared to the powerful Mythos model). Both approaches make sense and affect how you ultimately design your workflow.<\/p>\n\n\n\n<p>Simply put, the difference in their work style looks like this:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>GPT-5.4:<\/strong> <em>Receives a task \u2192 immediately generates a solution \u2192 iterates if tests fail.<\/em><\/li>\n\n\n\n<li><strong>Opus:<\/strong> <em>Receives a task \u2192 analyzes the context \u2192 plans the approach \u2192 implements the code along with a justification.<\/em><\/li>\n<\/ul>\n\n\n\n<p>Neither of these styles is objectively better. It all depends on the problem you are facing.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Where do these differences come from?<\/strong><\/strong><\/h2>\n\n\n\n<p>The different work style is not a coincidence &#8211; it stems from design decisions at the architecture level of the models themselves:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Explicit vs. Implicit Planning.<\/strong> Opus tends to explicitly break down a problem into smaller steps before generating the first line of code. GPT-5.4 plans more &#8220;on the fly&#8221; &#8211; it starts writing and corrects course as it goes. 
As a result, Opus rarely hits dead ends on complex tasks, but it can be slower on the simplest ones. This is a classic dilemma known from distributed systems: an execution-first (optimistic) approach versus a plan-first (deliberative) approach.<\/li>\n\n\n\n<li><strong>Context Management.<\/strong> Both models have a powerful context (around 1M tokens) and generate up to 128k output tokens. The difference lies in how they use it. Opus excels at maintaining consistency during long, multi-step interactions. GPT-5.4, on the other hand, is unrivaled when a task requires a wide range of capabilities (e.g., simultaneous coding, browser use, and research) within a single session.<\/li>\n\n\n\n<li><strong>Strategy Towards Uncertainty.<\/strong> When data is missing, GPT-5.4 tends to generate the most probable solution. Opus will more often stop and explicitly ask for the missing information. Depending on the situation, each of these traits can be a huge advantage or an irritating flaw.<\/li>\n<\/ul>\n\n\n\n<p>These differences translate into hard data. Opus 4.7&#8217;s score on <a href=\"https:\/\/www.swebench.com\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >SWE-bench Pro<\/a> (64.3% per Anthropic vs 57.7% for GPT-5.4) confirms that deeper planning translates into higher code quality in complex projects.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Practical applications of models in IT projects<\/strong><\/strong><\/h2>\n\n\n\n<p>Let&#8217;s step down from the level of abstraction and look at specific scenarios where the choice of model makes a real difference.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong><strong>When will GPT-5.4 work best?<\/strong><\/strong><\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Mass refactoring:<\/strong> Changing a pattern across 200 files or migrating a large API. 
GPT-5.4 offers significantly higher throughput and a lower price per token, which is crucial for large-scale repetitive changes.<\/li>\n\n\n\n<li><strong>Generating boilerplate:<\/strong> Writing repetitive tests, CRUD operations, or configuration files. Speed is what counts here.<\/li>\n\n\n\n<li><strong>Tasks requiring intensive computer use:<\/strong> Interacting with user interfaces, testing web applications, or working with desktop tools. GPT-5.4 features native computer use integrated directly into the model (<a href=\"https:\/\/os-world.github.io\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >OSWorld<\/a> 75%, <a href=\"https:\/\/openai.com\/index\/introducing-gpt-5-4\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >source<\/a>). Opus supports this via API, but with slightly less maturity.<\/li>\n\n\n\n<li><strong>Tight integration with GitHub:<\/strong> Automated PR review, suggestions in Copilot, CI\/CD support. OpenAI models are available by default in GitHub Copilot for business users (<a href=\"https:\/\/github.blog\/changelog\/2026-02-09-gpt-5-3-codex-is-now-generally-available-for-github-copilot\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >source<\/a>).<\/li>\n\n\n\n<li><strong>Security scanning:<\/strong> Introduced in March 2026, Codex Security is not just a feature, but a separate layer in the pipeline. It analyzes code before merging, acting like an intelligent SAST\/DAST. It understands the context of changes and reduces false alarms by about 50% (<a href=\"https:\/\/openai.com\/index\/introducing-gpt-5-3-codex\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >source<\/a>).<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><strong><strong>When should you reach for Claude Opus 4.7?<\/strong><\/strong><\/h3>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Debugging complex problems:<\/strong> Race conditions, memory leaks, intermittent failures. 
Opus focuses on deeper analysis and maintains context exceptionally well, which is reflected in the SWE-bench Pro results.<\/li>\n\n\n\n<li><strong>Architectural decisions:<\/strong> Evaluating trade-offs, choosing the right design pattern, or analyzing the impact of a single change on the entire system. Here, deeper reasoning definitely pays off.<\/li>\n\n\n\n<li><strong>Working with a large, interconnected context:<\/strong> Analyzing many interdependent files and understanding the data flow across multiple application layers.<\/li>\n\n\n\n<li><strong>Visual analysis:<\/strong> Opus 4.7 can analyze images with resolutions up to 3.75 MP. This is invaluable help when debugging UI, analyzing complex architectural diagrams, or reviewing error screenshots.<\/li>\n<\/ul>\n\n\n\n<h3 class=\"wp-block-heading\"><strong><strong>When more does not mean better<\/strong><\/strong><\/h3>\n\n\n\n<p>It&#8217;s worth remembering that deeper reasoning does not always mean a better outcome for the user. For very simple tasks, Opus can be slower and more expensive &#8211; not because it can&#8217;t handle it, but because it <em>analyzes more than necessary<\/em>. Instead of simply writing a basic endpoint, it starts considering edge cases that no one asked for. Additionally, the new tokenizer in version 4.7 increases the cost of such operations by ~35%.<\/p>\n\n\n\n<p>On the other hand, for some pesky bugs, GPT-5.4&#8217;s iterative approach can be surprisingly effective. 
The model instantly generates several fix variants that you can quickly verify using tests.<\/p>\n\n\n\n<p><strong>Both models excel when it comes to:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Daily coding (new features, simple fixes).<\/li>\n\n\n\n<li>Code review (both efficiently catch problems).<\/li>\n\n\n\n<li>Explaining and documenting existing code.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>The new developer workflow<\/strong><\/strong><\/h2>\n\n\n\n<p>What does working with these models look like in practice? Let&#8217;s imagine a classic problem: sudden timeouts on the \/api\/orders endpoint.<\/p>\n\n\n\n<p><strong>In the past (without AI):<\/strong><\/p>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li>The developer analyzes logs.<\/li>\n\n\n\n<li>Looks for the cause in the code.<\/li>\n\n\n\n<li>Implements a fix.<\/li>\n\n\n\n<li>Writes tests and deploys the change.<\/li>\n<\/ol>\n\n\n\n<p><strong>Today (with AI) \u2013 leveraging each model&#8217;s strengths:<\/strong><\/p>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li><strong>Specify:<\/strong> You write down a short specification of the problem (goal, acceptance criteria, constraints) \u2013 in a SPEC.md, a Jira issue, or a task description.<\/li>\n\n\n\n<li>You provide Opus with logs from the last 24 hours along with the specification \u2192 Opus identifies that the problem only occurs for orders with more than 50 items. 
It points to a lack of pagination and an N+1 query issue.<\/li>\n\n\n\n<li>You pass this analysis to GPT-5.4 \u2192 the model refactors the repository, adds pagination, eliminates the N+1 issue, and generates an integration test.<\/li>\n\n\n\n<li>Codex Security scans the fix for vulnerabilities (e.g., IDOR \u2013 Insecure Direct Object Reference, SQL injection) \u2192 confirms the security of this change.<\/li>\n\n\n\n<li>You perform the final review \u2192 check if the solution breaks the API contract, run tests, and deploy.<\/li>\n<\/ol>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>What actually changed in the code<\/strong><\/strong><\/h2>\n\n\n\n<p>Based on Opus&#8217;s analysis, GPT-5.4 generated a fix. The diff itself is short but requires understanding JPA, maintaining the API contract, and writing a regression test \u2013 which is exactly what we expect from an AI assistant.<\/p>\n\n\n\n<p><strong>Before:<\/strong><\/p>\n\n\n\n<figure data-wp-context=\"{&quot;uploadedSrc&quot;:&quot;https:\\\/\\\/sii.pl\\\/blog\\\/wp-content\\\/uploads\\\/2026\\\/04\\\/image1-1.png&quot;,&quot;figureClassNames&quot;:&quot;wp-block-image aligncenter size-full&quot;,&quot;figureStyles&quot;:null,&quot;imgClassNames&quot;:&quot;wp-image-33659&quot;,&quot;imgStyles&quot;:null,&quot;targetWidth&quot;:796,&quot;targetHeight&quot;:336,&quot;scaleAttr&quot;:false,&quot;ariaLabel&quot;:&quot;Enlarge image: code&quot;,&quot;alt&quot;:&quot;code&quot;}\" data-wp-interactive=\"core\/image\" class=\"wp-block-image aligncenter size-full wp-lightbox-container\"><img decoding=\"async\" width=\"796\" height=\"336\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on-async--click=\"actions.showLightbox\" data-wp-on-async--load=\"callbacks.setButtonStyles\" data-wp-on-async-window--resize=\"callbacks.setButtonStyles\" src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/04\/image1-1.png\" alt=\"code\" class=\"wp-image-33659\" 
srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/04\/image1-1.png 796w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/04\/image1-1-300x127.png 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/04\/image1-1-768x324.png 768w\" sizes=\"(max-width: 796px) 100vw, 796px\" \/><button\n\t\t\tclass=\"lightbox-trigger\"\n\t\t\ttype=\"button\"\n\t\t\taria-haspopup=\"dialog\"\n\t\t\taria-label=\"Enlarge image: code\"\n\t\t\tdata-wp-init=\"callbacks.initTriggerButton\"\n\t\t\tdata-wp-on-async--click=\"actions.showLightbox\"\n\t\t\tdata-wp-style--right=\"context.imageButtonRight\"\n\t\t\tdata-wp-style--top=\"context.imageButtonTop\"\n\t\t>\n\t\t\t<svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"12\" height=\"12\" fill=\"none\" viewBox=\"0 0 12 12\">\n\t\t\t\t<path fill=\"#fff\" d=\"M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z\" \/>\n\t\t\t<\/svg>\n\t\t<\/button><\/figure>\n\n\n\n<p>After:<\/p>\n\n\n\n<figure data-wp-context=\"{&quot;uploadedSrc&quot;:&quot;https:\\\/\\\/sii.pl\\\/blog\\\/wp-content\\\/uploads\\\/2026\\\/04\\\/image2-2.png&quot;,&quot;figureClassNames&quot;:&quot;wp-block-image aligncenter size-full&quot;,&quot;figureStyles&quot;:null,&quot;imgClassNames&quot;:&quot;wp-image-33661&quot;,&quot;imgStyles&quot;:null,&quot;targetWidth&quot;:796,&quot;targetHeight&quot;:369,&quot;scaleAttr&quot;:false,&quot;ariaLabel&quot;:&quot;Enlarge image: code&quot;,&quot;alt&quot;:&quot;code&quot;}\" data-wp-interactive=\"core\/image\" class=\"wp-block-image aligncenter size-full wp-lightbox-container\"><img decoding=\"async\" width=\"796\" height=\"369\" data-wp-init=\"callbacks.setButtonStyles\" data-wp-on-async--click=\"actions.showLightbox\" data-wp-on-async--load=\"callbacks.setButtonStyles\" data-wp-on-async-window--resize=\"callbacks.setButtonStyles\" 
src=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/04\/image2-2.png\" alt=\"code\" class=\"wp-image-33661\" srcset=\"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/04\/image2-2.png 796w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/04\/image2-2-300x139.png 300w, https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/04\/image2-2-768x356.png 768w\" sizes=\"(max-width: 796px) 100vw, 796px\" \/><button\n\t\t\tclass=\"lightbox-trigger\"\n\t\t\ttype=\"button\"\n\t\t\taria-haspopup=\"dialog\"\n\t\t\taria-label=\"Enlarge image: code\"\n\t\t\tdata-wp-init=\"callbacks.initTriggerButton\"\n\t\t\tdata-wp-on-async--click=\"actions.showLightbox\"\n\t\t\tdata-wp-style--right=\"context.imageButtonRight\"\n\t\t\tdata-wp-style--top=\"context.imageButtonTop\"\n\t\t>\n\t\t\t<svg xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"12\" height=\"12\" fill=\"none\" viewBox=\"0 0 12 12\">\n\t\t\t\t<path fill=\"#fff\" d=\"M2 0a2 2 0 0 0-2 2v2h1.5V2a.5.5 0 0 1 .5-.5h2V0H2Zm2 10.5H2a.5.5 0 0 1-.5-.5V8H0v2a2 2 0 0 0 2 2h2v-1.5ZM8 12v-1.5h2a.5.5 0 0 0 .5-.5V8H12v2a2 2 0 0 1-2 2H8Zm2-12a2 2 0 0 1 2 2v2h-1.5V2a.5.5 0 0 0-.5-.5H8V0h2Z\" \/>\n\t\t\t<\/svg>\n\t\t<\/button><\/figure>\n\n\n\n<p>Two key changes were introduced: the @EntityGraph annotation eliminates the N+1 problem (a single query with JOIN FETCH), and the Pageable object introduces pagination, limiting the payload. 
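<\/p>\n\n\n\n<p>Since the diff itself is shown as screenshots, here is an illustrative sketch of what such a change typically looks like in Spring Data JPA. The repository and field names (<code>OrderRepository<\/code>, <code>items<\/code>) are assumptions made for illustration \u2013 they are not taken from the actual project code:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>// Illustrative sketch with hypothetical names, not the actual project code.\n// @EntityGraph tells JPA to fetch the items collection in a single\n// JOIN FETCH query instead of issuing one extra query per order (N+1),\n// while Pageable caps how many rows a single request can return.\npublic interface OrderRepository extends JpaRepository&lt;Order, Long&gt; {\n\n    @EntityGraph(attributePaths = &quot;items&quot;)\n    Page&lt;Order&gt; findByCustomerId(Long customerId, Pageable pageable);\n}\n\n// Call site: the first page of 50 orders, matching the limit from the spec.\nPage&lt;Order&gt; page = orders.findByCustomerId(customerId, PageRequest.of(0, 50));<\/code><\/pre>\n\n\n\n<p>Note that in a sketch like this, pagination may also change the response shape (a page instead of a plain list), which is exactly why the final review checks the API contract.<\/p>\n\n\n\n<p>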
It&#8217;s worth remembering that in a real project, there is also a rollback plan, post-deployment monitoring, and team communication \u2013 <strong>AI does not replace these layers<\/strong>, but merely shortens the path between diagnosis and the fix code.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Prompts used in this scenario<\/strong><\/strong><\/h2>\n\n\n\n<p>Here is what you actually type into the model at each stage of working with AI.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong><strong>Step 1 \u2013 Specify (requirements specification)<\/strong><\/strong><\/h3>\n\n\n\n<p>Before you even engage the model, you write down the requirements, acceptance criteria, and constraints as a separate artifact \u2013 a SPEC.md file in the repository, a Jira issue, or a short block in the task description. This is the contract between you and the AI.<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Goal:<\/strong> The \/api\/orders endpoint should respond in &lt; 200 ms for orders with \u2264 100 items.<\/li>\n\n\n\n<li><strong>Requirements<\/strong>: Maintain backward compatibility of the API contract, write an integration test for pagination > 50 items.<\/li>\n\n\n\n<li><strong>Constraints:<\/strong> We do not change the database schema in this iteration, and we do not introduce a cache.<\/li>\n<\/ul>\n\n\n\n<p>This is the core of the <em>spec-driven development<\/em> approach. You provide the specification to the model in Step 2 along with the logs \u2013 the plan is then formed not based on <em>&#8220;what the model thinks you want&#8221;<\/em>, but <em>&#8220;what you explicitly wrote down&#8221;<\/em>. As a result, the model hallucinates intent much less frequently.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong><strong>Step 2 \u2013 Opus 4.7 in Ask\/Plan mode (analysis)<\/strong><\/strong><\/h3>\n\n\n\n<p><em>I am attaching the spec (SPEC.md), the last 24h of logs from the \/api\/orders endpoint, and the Sentry stack trace. 
Identify the root causes of the timeouts and propose a repair strategy aligned with the specification.<\/em><\/p>\n\n\n\n<p>It is crucial to run this step in <strong>read-only mode<\/strong> (e.g., Ask mode in Cursor or Plan mode in Claude Code). This forces the model to focus solely on diagnosis and solution planning, preventing premature file modifications. A plain-text request such as &#8220;do not write code yet&#8221; is sometimes ignored by models \u2013 a read-only tool mode enforces it at the technical level.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong><strong>Step 3 \u2013 GPT-5.4 in Agent mode (implementation)<\/strong><\/strong><\/h3>\n\n\n\n<p><em>Implement the fix according to this strategy: [paste Opus&#8217;s output]. Add pagination (limit 50), eliminate N+1 via eager loading, and write an integration test for an order with 100 items.<\/em><\/p>\n\n\n\n<p>At this stage, we switch to Agent mode (or its equivalent, e.g., accept edits in Claude Code). Speed and execution count here \u2013 GPT-5.4 takes the finished strategy and instantly turns it into working code and tests.<\/p>\n\n\n\n<h3 class=\"wp-block-heading\"><strong><strong>Step 4 \u2013 Codex Security (verification)<\/strong><\/strong><\/h3>\n\n\n\n<p><em>Scan the diff for IDOR, SQL injection, and permission issues. Pay special attention to the new parameters ?page= and ?size=.<\/em><\/p>\n\n\n\n<p>Specifying concrete attack vectors narrows the scope of scanning and reduces the number of false alarms (false positives). Instead of a generic &#8220;check security,&#8221; we get a targeted analysis.<\/p>\n\n\n\n<p><em>Preview: a full case study \u2013 with logs, model outputs, and cost measurements \u2013 will be described in the next article in the series.<\/em><\/p>\n\n\n\n<p>Could Opus be used exclusively for both steps? Yes. Would GPT-5.4 handle the analysis on its own? Most likely, yes. 
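<\/p>\n\n\n\n<p>The division of roles in the scenario above can be sketched as a simple routing rule. This is a hypothetical helper \u2013 the model labels follow this article, not any official API identifiers:<\/p>\n\n\n\n<pre class=\"wp-block-code\"><code>// Hypothetical sketch: one model per workflow step, as in the scenario above.\npublic final class ModelRouter {\n\n    public enum Step { SPECIFY, PLAN, IMPLEMENT, VERIFY }\n\n    // Maps each step to the model this article suggests for it.\n    public static String modelFor(Step step) {\n        switch (step) {\n            case PLAN:      return &quot;opus-4.7&quot;;       // deep analysis and planning\n            case IMPLEMENT: return &quot;gpt-5.4&quot;;        // fast, cheap execution\n            case VERIFY:    return &quot;codex-security&quot;; // targeted vulnerability scan\n            default:        return &quot;human&quot;;          // the spec is written by you\n        }\n    }\n\n    public static void main(String[] args) {\n        for (Step s : Step.values()) {\n            System.out.println(s + &quot; -&gt; &quot; + modelFor(s));\n        }\n    }\n}<\/code><\/pre>\n\n\n\n<p>In practice, such a rule can live in an IDE configuration or a thin orchestration script; the point is that the mapping is explicit rather than ad hoc.<\/p>\n\n\n\n<p>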
However, it is precisely <strong>matching the model to the specifics of the task<\/strong> that allows you to achieve an optimal quality-to-time-and-cost ratio.<\/p>\n\n\n\n<p><strong>The developer ceases to be solely responsible for implementation \u2013 they become the orchestrator of AI work:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>Selects the right model for the task.<\/li>\n\n\n\n<li>Controls the flow of information between different agents.<\/li>\n\n\n\n<li>Manages risk (security, regressions, hallucinations).<\/li>\n\n\n\n<li>Decides when AI can act autonomously and when it requires supervision.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Implementing AI in a team<\/strong><\/strong><\/h2>\n\n\n\n<p>Looking from a systems architect&#8217;s perspective, it&#8217;s worth treating AI models like microservices with different characteristics:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Opus 4.7<\/strong> \u2013 lower throughput, higher cost, optimized for complex reasoning.<\/li>\n\n\n\n<li><strong>Claude Sonnet <\/strong>\u2013 Opus&#8217;s cheaper sibling in the Anthropic ecosystem, good for execution and daily coding.<\/li>\n\n\n\n<li><strong>GPT-5.4 <\/strong>\u2013 higher throughput, lower cost, native computer use, broad spectrum of capabilities.<\/li>\n\n\n\n<li><strong>GPT-5.3 Codex<\/strong> \u2013 the most affordable option, specialized in terminal\/CLI work.<\/li>\n\n\n\n<li><strong>Codex Security<\/strong> \u2013 a specialized agent for security scanning.<\/li>\n<\/ul>\n\n\n\n<p>This is a situation analogous to choosing between a synchronous API call and batch processing \u2013 both approaches work, but their optimal use depends on the context.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Choosing the model for the step<\/strong><\/strong><\/h2>\n\n\n\n<p>Opus 4.7 is worth its price, where you pay for <em>more extensive reasoning<\/em>. 
Where you only need execution, it&#8217;s worth considering a cheaper sibling within the same ecosystem or a competing model:<\/p>\n\n\n\n<figure class=\"wp-block-table\"><table class=\"has-fixed-layout\"><tbody><tr><td><strong>Workflow Step<\/strong><\/td><td><strong>Recommendation<\/strong><\/td><td><strong>Why<\/strong><\/td><\/tr><tr><td><strong>Specify<\/strong><\/td><td>any model (or no AI)<\/td><td>You write the specification, not the model<\/td><\/tr><tr><td><strong>Plan \/ Analyze<\/strong><\/td><td><strong>Opus 4.7<\/strong><\/td><td>deeper reasoning realistically translates into plan quality<\/td><\/tr><tr><td><strong>Implement<\/strong><\/td><td><strong>Claude Sonnet<\/strong> or <strong>GPT-5.4<\/strong><\/td><td>cheaper, powerful enough to execute a ready plan<\/td><\/tr><tr><td><strong>Verify \/ Security<\/strong><\/td><td><strong>Codex Security<\/strong> or a specialized agent<\/td><td>narrow task scope, does not need a flagship model<\/td><\/tr><\/tbody><\/table><figcaption class=\"wp-element-caption\">Tab. 
4 Choosing the model for the step<\/figcaption><\/figure>\n\n\n\n<p>Why Sonnet instead of Opus for implementation?<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li>First, <strong>pricing<\/strong> \u2013 within the same generation, Sonnet is usually several times cheaper than Opus.<\/li>\n\n\n\n<li>Second, <strong>token usage<\/strong> \u2013 Opus 4.7 has a new tokenizer which (as mentioned earlier) increases the number of tokens by ~35%, and its deeper reasoning chains naturally lead to longer responses.<\/li>\n<\/ul>\n\n\n\n<p>The result: a single implementation iteration on Opus can cost several times as much as the same iteration on Sonnet, with a marginal quality difference at the point of executing a ready plan.<\/p>\n\n\n\n<p>In a mature work environment, AI plays three main roles:<\/p>\n\n\n\n<ol start=\"1\" class=\"wp-block-list\">\n<li><strong>Decision layer<\/strong> \u2013 analysis, planning, risk assessment (Opus excels here).<\/li>\n\n\n\n<li><strong>Execution layer<\/strong> \u2013 implementation, refactoring, test generation (GPT-5.4 offers higher performance and lower costs here).<\/li>\n\n\n\n<li><strong>Security layer<\/strong> \u2013 vulnerability scanning, verification of introduced changes (Codex Security).<\/li>\n<\/ol>\n\n\n\n<p>The architect&#8217;s task is to manage the flow between these layers. In practice, this means designing an <strong>orchestration layer<\/strong>. It is worth asking yourself a few key questions:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>Routing:<\/strong> How do you automatically route requests to the appropriate model? Should the criterion be the task type, its complexity, or perhaps cost?<\/li>\n\n\n\n<li><strong>Fallback:<\/strong> What happens when the &#8220;first choice&#8221; model returns an unsatisfactory result? Is there a mechanism for a smooth switch to an alternative?<\/li>\n\n\n\n<li><strong>Decision logging:<\/strong> Do you record which model proposed a given solution? 
This is crucial for later audits and team learning.<\/li>\n\n\n\n<li><strong>Quality metrics:<\/strong> How do you measure the effectiveness of a given model in your specific context? Do you measure time, costs, or the percentage of accepted changes?<\/li>\n<\/ul>\n\n\n\n<p>Even just asking these questions dramatically changes how we think about building processes with AI. The architect is no longer just designing the IT system itself, but also the <strong>collaboration system between AI models<\/strong>.<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>Claude Mythos \u2013 the future of Cybersecurity in AI<\/strong><\/strong><\/h2>\n\n\n\n<p>When talking about AI models in 2026, it is impossible to ignore the elephant in the room: <strong>Claude Mythos Preview<\/strong>.<\/p>\n\n\n\n<p>On April 7, Anthropic announced a model that is not just &#8220;another, more powerful version.&#8221; It represents an entirely new class of systems capable of conducting long, uninterrupted reasoning chains without any user intervention. According to benchmarks published by the creators, the results are impressive:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><a href=\"https:\/\/www.swebench.com\/\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" ><strong>SWE-bench Verified<\/strong><\/a><strong>:<\/strong> 93.9% (compared to 80.8% for Opus 4.6).<\/li>\n\n\n\n<li><strong>USAMO 2026<\/strong> (advanced mathematics): 97.6% (compared to 42.3% for Opus 4.6).<\/li>\n\n\n\n<li><strong>Cybersecurity CTF:<\/strong> 83.1% (compared to 66.6% for Opus 4.6).<\/li>\n<\/ul>\n\n\n\n<p>According to information shared under Project Glasswing, Mythos demonstrated the ability to autonomously identify previously unknown vulnerabilities (zero-days) in extremely complex systems \u2013 including a 27-year-old bug in OpenBSD. 
However, these results are from controlled test environments and have not yet been publicly verified by independent entities.<\/p>\n\n\n\n<p>Most importantly, though, Mythos <strong>is not publicly available<\/strong>. Anthropic, in the first such move on the market since the release of GPT-2 in 2019, decided to drastically restrict access to the model for security reasons. It was provided to only about 50 selected organizations (including Microsoft, AWS, Apple, Google, Nvidia, and Cisco) under the <strong>Project Glasswing<\/strong> program.<\/p>\n\n\n\n<p>What does this mean for Opus 4.7 users? According to Anthropic&#8217;s announcements, <strong>Opus 4.7&#8217;s cybersecurity capabilities were deliberately narrowed<\/strong> compared to the Mythos model. This is a clear design declaration: a publicly available model should be powerful, but above all, safe.<\/p>\n\n\n\n<p>Mythos sets a new direction for the entire industry: <strong>models are becoming so advanced that it is necessary to deliberately limit their capabilities<\/strong>. This fundamentally changes the architect&#8217;s perspective &#8211; we no longer just ask &#8220;which model is best for my task,&#8221; but &#8220;which models do I even have access to, and under what conditions.&#8221;<em> (<a href=\"http:\/\/red.anthropic.com\/2026\/mythos-preview\" target=\"_blank\" rel=\"noopener\" title=\"\" rel=\"nofollow\" >source<\/a>)<\/em><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>What is worth remembering?<\/strong><\/strong><\/h2>\n\n\n\n<p>The widely known warning that &#8220;AI sometimes makes mistakes&#8221; surprises no one in 2026. Let&#8217;s pay attention to more specific pitfalls:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>The speed of GPT-5.4 requires rigorous quality control.<\/strong> The model can instantly modify dozens of files. 
Therefore, automated tests that continuously verify the correctness of introduced changes are an essential element of the workflow.<\/li>\n\n\n\n<li><strong>Opus can generate an incredibly convincing but incorrect analysis.<\/strong> Always verify its assumptions, especially when it relies on incomplete logs or outdated documentation.<\/li>\n\n\n\n<li><strong>Migrating between model versions is not trivial.<\/strong> Prompts that worked perfectly with Opus 4.6 may stop working in version 4.7. Additionally, the new tokenizer changes the cost structure. Always test the model&#8217;s behavior before switching it to a production environment.<\/li>\n\n\n\n<li><strong>Choosing a model is a purely engineering decision, not an ideological one.<\/strong> There is no point in arguing about which model is generally &#8220;better.&#8221; The key is which one better fits the task you currently have in front of you.<\/li>\n\n\n\n<li><strong>Not every problem requires engaging AI.<\/strong> Sometimes you will write a simple bugfix yourself faster than it takes to formulate a precise prompt.<\/li>\n<\/ul>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Summary<\/strong><\/h2>\n\n\n\n<p>Although the scope of capabilities of both models is very similar, they differ in effectiveness depending on the specifics of the task:<\/p>\n\n\n\n<ul class=\"wp-block-list\">\n<li><strong>GPT-5.4<\/strong> is a model optimized for speed and costs. It offers native computer use and a powerful 1.05M token context. It is available directly in GitHub Copilot and the entire OpenAI Codex ecosystem. It works perfectly when you know exactly what you want to do and simply need efficient execution.<\/li>\n\n\n\n<li><strong>Claude Opus 4.7<\/strong> is geared towards deep reasoning (achieving 64.3% in SWE-bench Pro per Anthropic) and excels at maintaining complex context. It is immensely popular among Cursor users and API-first tools. 
It is the optimal choice when you first need to understand a complex problem and plan the solution architecture.<\/li>\n\n\n\n<li><strong>GPT-5.3 Codex<\/strong> remains the most affordable option, specialized in terminal and CLI work. It is a well-justified choice for automation and scripting.<\/li>\n<\/ul>\n\n\n\n<p>And somewhere in the background looms Claude Mythos Preview, reminding us that the models we use every day are <strong>deliberately limited versions<\/strong> of what is already technically possible.<\/p>\n\n\n\n<p><strong>The key to efficiency in 2026 is not searching for a single, universal model, but rather a conscious, flexible selection of the tool for a specific problem.<\/strong><\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong>Want to implement AI Agents in your team?<\/strong><\/h2>\n\n\n\n<p>Are you wondering how to optimize your organization&#8217;s development processes using the latest AI models? <strong>Contact the experts at Sii Poland.<\/strong> We will help you choose the right tools, design a secure architecture, and train your team to maximize the potential of GPT-5.4, Claude Opus, and other enterprise-class solutions.<\/p>\n\n\n\n<p><a href=\"https:\/\/sii.pl\/kontakt\/\" target=\"_blank\" rel=\"noopener\" title=\"\"><strong>Let&#8217;s talk about AI in your project.<\/strong><\/a><\/p>\n\n\n\n<p>***<\/p>\n\n\n\n<h2 class=\"wp-block-heading\"><strong><strong>About data and benchmarks<\/strong><\/strong><\/h2>\n\n\n\n<p><em><em>The article uses official materials from model producers (OpenAI, Anthropic), internal and partner benchmarks (e.g., SWE-bench, CursorBench), and data from test programs (e.g., Project Glasswing). 
Some results come from controlled test environments and may differ from results in production systems.<\/em><\/em><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The days when artificial intelligence was used solely for code auto-completion are behind us. 
In 2026, AI agents like OpenAI&#8217;s &hellip; <a class=\"continued-btn\" href=\"https:\/\/sii.pl\/blog\/en\/practical-application-of-ai-models-in-programming-a-comparison-of-gpt-5-4-and-claude-opus-4-7\/\">Continued<\/a><\/p>\n","protected":false},"author":788,"featured_media":33664,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"_editorskit_title_hidden":false,"_editorskit_reading_time":0,"_editorskit_is_block_options_detached":false,"_editorskit_block_options_position":"{}","inline_featured_image":false,"footnotes":""},"categories":[1320],"tags":[14195,13915,12235,11955,2622,1590,1501],"class_list":["post-33666","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-hard-development","tag-anthropic-en","tag-gpt-en","tag-claude-en","tag-codex","tag-digital-en","tag-tools","tag-artifiical-intelligence-en"],"acf":[],"aioseo_notices":[],"republish_history":[],"featured_media_url":"https:\/\/sii.pl\/blog\/wp-content\/uploads\/2026\/04\/AI_3.jpg","category_names":["Hard 
development"],"_links":{"self":[{"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/posts\/33666"}],"collection":[{"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/users\/788"}],"replies":[{"embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/comments?post=33666"}],"version-history":[{"count":1,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/posts\/33666\/revisions"}],"predecessor-version":[{"id":33668,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/posts\/33666\/revisions\/33668"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/media\/33664"}],"wp:attachment":[{"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/media?parent=33666"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/categories?post=33666"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/sii.pl\/blog\/en\/wp-json\/wp\/v2\/tags?post=33666"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}