How Nubank refactors millions of lines of code to improve engineering efficiency with Devin

8x engineering time efficiency gain
20x cost savings

Overview

One of Nubank’s most critical company-wide projects for 2023-2024 was migrating their core ETL, an 8-year-old monolith with millions of lines of code, to sub-modules. To handle such a large refactor, their only option was a multi-year effort that distributed repetitive refactoring work across more than one thousand engineers. With Devin, however, this changed: engineers were able to delegate their migrations to Devin and achieve an 8-12x efficiency improvement in engineering hours saved, along with over 20x cost savings. The Data, Collections, and Risk business units, among others, verified and completed their migrations in weeks instead of months or years.

The Problem

Nubank was born into the tradition of centralized ETL FinServ architectures. To date, the monolith architecture had worked well for Nubank — it enabled the developer autonomy and flexibility that carried them through their hypergrowth phases. After 8 years, however, Nubank’s sheer volume of customer growth, as well as geographic and product expansion beyond their original credit card business, led to an entangled, behemoth ETL with countless cross-dependencies and no clear path to continuing to scale.

For Nubankers, business-critical data transformations took increasingly long to run, with dependency chains as deep as 70 and too few formal agreements on who was responsible for maintaining what. As the company continued to grow, it became clear that the ETL would be a primary bottleneck to scaling.

Nubank concluded that there was an urgent need to split their monolithic ETL repository, which had grown to over 6 million lines of code, into smaller, more flexible sub-modules.

Nubank’s code migration was filled with the monotonous, repetitive work that engineers dread. Moving each data class implementation from one architecture to another while tracing imports correctly, performing multiple delicate refactoring steps, and accounting for any number of edge cases was highly tedious, even to do just once or twice. At Nubank’s scale, however, the total migration scope involved more than 1,000 engineers moving ~100,000 data class implementations over an expected timeline of 18 months.

In a world where engineering resources are scarce, such large-scale migrations and modernizations become massively expensive, time-consuming projects that distract from any engineering team’s core mission: building better products for customers. Unfortunately, this is the reality for many of the world’s largest organizations.

The Decision: an army of Devins to tackle subtasks in parallel

At project outset in 2023, Nubank had no choice but to rely on their engineers to perform code changes manually. Migrating one data class was a highly discretionary task, with multiple variations, edge cases, and ad hoc decision-making — far too complex to be scriptable, but high-volume enough to be a significant manual effort.

Within weeks of Devin’s launch, Nubank identified a clear opportunity to accelerate their refactor at a fraction of the engineering hours. Migration or large refactoring tasks are often fantastic projects for Devin: after investing a small, fixed cost to teach Devin how to approach sub-tasks, Devin can go and complete the migration autonomously. A human is kept in the loop just to manage the project and approve Devin’s changes.

The Solution: Custom ETL Migration Devin

A task of this magnitude, with the vast number of variations that it had, was a ripe opportunity for fine-tuning. The Nubank team helped to collect examples of previous migrations their engineers had done manually, some of which were fed to Devin for fine-tuning. The rest were used to create a benchmark evaluation set. Against this evaluation set, we observed a doubling of Devin’s task completion scores after fine-tuning, as well as a 4x improvement in task speed. Roughly 40 minutes per sub-task dropped to 10, which made the whole migration start to look much cheaper and less time-consuming, allowing the company to devote more energy to new business and new value creation instead.

Devin also contributed to its own speed improvements by building classical tools and scripts that it later used on the most common, mechanical components of the migration. For instance, detecting the country extension of a data class (either ‘br’, ‘co’, or ‘mx’) from its file path was a few-step process in each sub-task. Devin’s script turned this into a single-step executable, a saving that added up across tens of thousands of sub-tasks.
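As a rough illustration of the kind of helper described above, the following minimal sketch detects a country code from a file path. The function name, the matching rule (a path segment that exactly equals a country code), and the example paths are assumptions made for illustration; the source does not show Devin’s actual script or Nubank’s repository layout.

```python
from pathlib import PurePosixPath
from typing import Optional

# The three country codes named in the text.
COUNTRY_CODES = {"br", "co", "mx"}


def detect_country(file_path: str) -> Optional[str]:
    """Return the first path segment that is a known country code, else None.

    Assumption: the country appears as its own directory segment in the path.
    """
    for part in PurePosixPath(file_path).parts:
        if part.lower() in COUNTRY_CODES:
            return part.lower()
    return None


if __name__ == "__main__":
    # Hypothetical paths, not taken from Nubank's repository.
    for path in ("etl/datasets/br/billing/invoice.py", "etl/shared/utils.py"):
        print(path, "->", detect_country(path))
```

Matching whole path segments rather than substrings keeps a directory such as `common` from being mistaken for ‘co’.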

There is also a compounding advantage in Devin’s learning. In the first weeks, it was common to see outstanding errors to fix, or small things Devin wasn’t sure how to solve. But as Devin saw more examples and gained familiarity with the task, it started to avoid rabbit holes more often and find faster solutions to previously seen errors and edge cases. Much as with a human engineer, we observed clear speed and reliability improvements with every day Devin worked on the migration.

Results: Delivering an 8-12x faster migration, lifting a burden from every engineer, and slashing migration costs by 20x.

“Devin provided an easy way to reduce the number of engineering hours for the migration, in a way that was more stable and less prone to human error. Rather than engineers having to work across several files and complete an entire migration task 100%, they could just review Devin’s changes, make minor adjustments, then merge their PR”

Jose Carlos Castro, Senior Product Manager

8-12x efficiency gains: This is calculated by comparing the typical engineering hours required to complete a data class migration task against the total engineering hours spent prompting and reviewing Devin’s work on the same task.
Over 20x cost savings on the scope of the migration delegated to Devin: This is calculated by comparing the cost of running Devin with the cost of an engineer’s hours for the same task. The savings are driven largely by Devin’s speed of task execution and its cost effectiveness relative to human engineering time, and they do not even count the value captured by completing the entire project months ahead of schedule!
Fewer dreaded migration tasks for Nubank engineers
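
The two ratios described above can be sketched as simple divisions. This is only an illustration of the stated method: the function names are hypothetical, and the input values in the usage example are made-up placeholders, not Nubank’s measurements.

```python
def efficiency_gain(manual_hours: float, prompt_hours: float, review_hours: float) -> float:
    """Manual engineering hours divided by human hours spent prompting and reviewing."""
    human_hours_with_agent = prompt_hours + review_hours
    if human_hours_with_agent <= 0:
        raise ValueError("prompting plus review time must be positive")
    return manual_hours / human_hours_with_agent


def cost_savings(engineer_hourly_cost: float, manual_hours: float, agent_cost: float) -> float:
    """Cost of an engineer doing the task by hand divided by the cost of running the agent."""
    if agent_cost <= 0:
        raise ValueError("agent cost must be positive")
    return (engineer_hourly_cost * manual_hours) / agent_cost


if __name__ == "__main__":
    # Placeholder inputs chosen only to show the arithmetic.
    print(efficiency_gain(manual_hours=4.0, prompt_hours=0.25, review_hours=0.25))
    print(cost_savings(engineer_hourly_cost=100.0, manual_hours=4.0, agent_cost=20.0))
```

Note that the efficiency ratio counts only human time on the Devin side, while the cost ratio compares money, so the two figures can differ for the same task.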

How Evinova accelerates regulated software delivery with Devin

8x faster GxP documentation
3x faster code migrations
66% of clear-cut bugs fixed autonomously
2–3x faster test-writing

About the company

Evinova is a separate health tech company within the AstraZeneca group that delivers intelligently designed digital and AI-native solutions to biopharma companies and contract research organizations to optimize the entire clinical development lifecycle from end to end.

Industry: Healthcare & Life Sciences

Overview

“Evinova is helping life sciences customers deliver groundbreaking, industry-first outcomes for patients at record speed by harnessing the latest advances in AI. But we can’t credibly transform how our customers design and run clinical trials if we haven’t transformed how we build the software that runs them. Our promise to customers is simple: the software shaping the future of clinical development is built by a company already living in that AI-native future.”

Sean Connolly, Vice President, Chief Product and Technology Officer of Evinova

The Challenge

As an AI-native clinical trial technology company operating in a GxP-regulated environment, Evinova’s engineering organization holds itself to the highest standards: every line of code must be traceable, every change auditable, and every release defensible to regulators. Within that operating model, the team identified a clear opportunity to deploy AI agents against the structured, well-bounded work that surrounds product development — backlog triage, regulatory documentation such as User Requirement Specifications and Disaster Recovery Plans, and the coordinated migrations that come with running a modern platform at scale.

The opportunity was meaningful. Regulatory documents typically required 35 to 40 hours of cross-functional coordination across engineering, product, and QA to assemble the necessary context. Evinova saw a chance to compress that cycle significantly while raising consistency and traceability — freeing senior engineers to focus on the differentiated platform work that sets Evinova apart in the market.

The team had evaluated the leading AI coding tools and set a deliberately high bar: any solution would need to own a task end-to-end — read a ticket, explore the codebase, write the code, run the tests, and open a pull request — while preserving the full audit trail that GxP and 21 CFR Part 11 demand. That standard is what led Evinova to Cognition.

Why Devin

Pete Nellius, who leads Future Product Discovery and AI initiatives at Evinova, championed bringing Devin into the organization. During the evaluation, the team tested every major AI coding tool against real engineering challenges. One test involved a complex task with dependencies on third-party libraries where none of the underlying logic existed. Devin was the only tool that completed it.

“The way Devin approached the problem and was able to iterate on its solution until it resolved the problem was ultimately why we decided to go with this tool.”

Shaun Phillips, Director of Engineering for the Study Designer team

Other large regulated companies were already using Devin, which gave Evinova’s leadership confidence that the tool could meet their compliance and security requirements.

Use Case: GxP Documentation

User Requirement Specifications are foundational regulatory artifacts. Producing one means pulling requirements from Jira, tracing them to the underlying code, organizing them into testable acceptance criteria, demonstrating test coverage and evidence, and formatting everything to a standard that makes regulatory audits routine. For a major software module, that work has historically taken 35 to 40 hours of senior engineering, product, and QA time — with much of that effort spent assembling context that lives across multiple systems and teams.

Accuracy under regulatory scrutiny was the central evaluation criterion across AI development tools. Evinova pointed Devin at the codebase and Jira and tasked it with generating URS documents directly from source. Devin produced structured first drafts at roughly 90% accuracy, shifting senior engineers from primary authors to reviewers, the role where their regulatory judgment is most valuable.

They used the same approach for Disaster Recovery Plans.

“It was done in five to ten minutes. The cursory overview was 90% accurate.”

Sudha Panuganti, Senior Director of Product Engineering

The team was able to refocus its time on edge cases, more complex DR scenarios, and more effective DR test cases and end-to-end simulations.

The outcome was a deeper, more meaningful alignment with key life sciences software regulations and a stronger set of plans and evidence for Evinova customers to leverage for clinical use qualification.

Use Case: Automated Bug Triage and Resolution

Bug triage is exactly the kind of recurring, well-bounded engineering workflow where autonomous agents can compound value over time. Rather than treating it as a manual process to optimize, Phillips’s team designed it as a system: AutoFixer, a Devin playbook that runs four times a day and works through open Jira tickets end to end — reading the ticket, exploring the codebase, implementing a fix, running tests, and opening a pull request for human review.

The results validated the approach quickly. Within 22 days, AutoFixer had attempted 79 bugs — merging clean, review-ready fixes for half and getting engineers most of the way to resolution on another quarter. Phillips estimated a 100% return on his time investment: under two days of setup yielded three to four days of saved engineering effort, with compounding returns as the playbook continues to run. The pattern is portable by design — other squads can replicate the workflow against their own repositories with minimal configuration, turning a single team’s investment into a platform capability across Evinova engineering.
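
One way to read the return figure above is as net time saved relative to time invested. The sketch below makes that arithmetic explicit; treating "return" as a net ratio is an assumption, since the source does not define the formula, and the function name is hypothetical.

```python
def return_on_time(days_invested: float, days_saved: float) -> float:
    """Net return as a fraction of time invested: (saved - invested) / invested."""
    if days_invested <= 0:
        raise ValueError("days_invested must be positive")
    return (days_saved - days_invested) / days_invested


if __name__ == "__main__":
    # The text gives "under two days" of setup and "three to four days" saved,
    # so 2.0 is an upper bound on the investment.
    for saved in (3.0, 4.0):
        print(f"{saved} days saved -> {return_on_time(2.0, saved):.0%} return")
```

Under this reading, two days of setup and four days saved corresponds to the quoted 100%, while three days saved would be 50%. A setup time shorter than two days raises both figures.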

Use Case: Tech Stack Migration

Evinova’s engineering organization continuously evolves its technology stack to stay at the frontier of AI-native software development. As part of that ongoing investment, the team migrated existing product components to a modern TypeScript/Next.js architecture — a stack better suited to the agent-driven development workflows, rapid iteration cycles, and composable UI patterns that define how leading AI-native products are built today. The work spans backend logic translation, API updates, UI reconstruction, and full validation, and a single component has historically required about five days of focused engineering time.

The team set out to migrate one component of a legacy product to test the feasibility of an agent-led migration. Devin completed three components in roughly a day and a half — touching 58 files across three repositories and surfacing only nine points where it needed engineer input. The pattern is now repeatable across the remaining components, allowing Evinova to advance its modernization roadmap at a pace that matches the speed of innovation in the broader AI-native ecosystem.

Use Case: Test Automation

Comprehensive test automation is foundational to Evinova’s quality posture — the more coverage the team can build and maintain, the more confidently it can ship at the cadence regulated software demands. The team’s automation framework pairs Playwright with Cucumber-based BDD scenarios — a pattern that doubles as living regulatory documentation by mapping test coverage directly to user requirements — with test specifications managed in X-Ray and traceability anchored in Jira.
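
To illustrate how BDD scenarios can double as traceability evidence, here is a minimal sketch that maps requirement tags on Gherkin scenarios to scenario names and lists requirements with no scenario. The `@REQ-<number>` tag convention, the function names, and the sample feature text are all assumptions for illustration; the source does not describe Evinova’s tagging scheme or tooling integration.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, List

# Hypothetical convention: scenarios carry requirement tags such as @REQ-101.
REQ_TAG = re.compile(r"@(REQ-\d+)")


def map_requirements(feature_text: str) -> Dict[str, List[str]]:
    """Map each requirement tag to the names of the scenarios it is attached to."""
    coverage: Dict[str, List[str]] = defaultdict(list)
    pending: List[str] = []  # tags seen since the last scenario header
    for raw in feature_text.splitlines():
        line = raw.strip()
        if line.startswith("@"):
            pending.extend(REQ_TAG.findall(line))
        elif line.startswith(("Scenario:", "Scenario Outline:")):
            name = line.split(":", 1)[1].strip()
            for req in pending:
                coverage[req].append(name)
            pending = []
        elif line.startswith("Feature:"):
            pending = []  # feature-level tags are ignored in this sketch
    return dict(coverage)


def uncovered(required: Iterable[str], coverage: Dict[str, List[str]]) -> List[str]:
    """Return the required IDs that no scenario references, sorted."""
    return sorted(set(required) - set(coverage))


SAMPLE_FEATURE = """
Feature: Study setup

  @REQ-101
  Scenario: Create a study
    Given a signed-in user
    When they create a study
    Then the study appears in the list

  @REQ-102 @REQ-103
  Scenario: Archive a study
    Given an existing study
    When the user archives it
    Then it is hidden from the list
"""

if __name__ == "__main__":
    cov = map_requirements(SAMPLE_FEATURE)
    print(cov)
    print(uncovered(["REQ-101", "REQ-102", "REQ-103", "REQ-104"], cov))
```

A report like this only shows that a scenario references a requirement; whether the scenario adequately tests it still needs human review.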

Evinova reframed test script development as another well-bounded workflow ready for agent ownership. Before Devin, the team projected two to three quarters to reach near-complete test automation coverage. Now, Panuganti said, they’re looking at weeks.

What Comes Next

Evinova is scaling the agent-led playbook across the organization. AutoFixer is expanding from bug resolution into feature development, and the workflow patterns are being rolled out to engineering teams across Evinova — turning early wins into a shared platform capability.

The impact extends beyond throughput. For Shaun Phillips, Devin has changed what’s possible from the engineering leadership seat. Senior leaders at Evinova spend much of their day in strategy, architecture, and people work — by design. Devin gives them a way to stay close to the code without competing for the same hours. “I would previously ping one of my lead engineers with questions,” Phillips said. “That question now goes straight into Devin and I get an immediate answer, without requiring a human engineer to be online, available, and context switch.”

It has also let him ship again and keep his engineering skills sharp. “Devin has given me, as an engineering manager, the ability to balance my leadership role and directly contribute to our products,” Phillips said. “I just ping Devin and provide human oversight as Devin handles the ticket end to end, and spits out a pull request on the other side.”

Most software organizations are bolting AI onto how they already work. In partnership with Cognition, Evinova is rebuilding the operating model itself: leaders shipping code alongside their teams, engineers freed for the work only humans can do, and capabilities that compound across squads rather than getting trapped inside them. In an industry where the pace of AI progress is outrunning most enterprise software organizations, that operating model, coupled with tools like Devin, is what lets Evinova ship clinical trial technology at the speed the moment demands, and what makes it the kind of engineering organization the best builders want to join.

Evinova at a Glance

Company: Evinova (an AstraZeneca Group company)
Industry: Healthcare & Life Sciences
Scale: Clinical study design and management software for global life sciences organizations
Champion: Pete Nellius, Future Product Discovery and AI, Evinova
AI Use Cases:
  • GxP documentation generation (URS, Disaster Recovery Plans)
  • Automated bug triage and resolution via the AutoFixer playbook
  • Tech stack migrations
  • End-to-end test automation
Key Outcomes:
  • 8x faster GxP documentation (35–40 hours reduced to under 5)
  • 66% of bugs fixed autonomously in first 10 days
  • 3x faster code migrations with minimal engineer input
  • Test automation timeline compressed from quarters to weeks
  • 60%+ of engineering tickets routed through Devin workflows