How Nubank refactors millions of lines of code to improve engineering efficiency with Devin

8x engineering time efficiency gain
20x cost savings

Overview

One of Nubank’s most critical, company-wide projects for 2023-2024 was a migration of their core ETL, an 8-year-old monolith with millions of lines of code, to sub-modules. To handle such a large refactor, their only option was a multi-year effort that distributed repetitive refactoring work across more than one thousand of their engineers. With Devin, however, this changed: engineers were able to delegate their migrations to Devin and achieve an 8-12x efficiency improvement in terms of engineering hours saved, and over 20x cost savings. Among others, the Data, Collections, and Risk business units verified and completed their migrations in weeks instead of months or years.

The Problem

Nubank was born into the tradition of centralized ETL FinServ architectures. To date, the monolith architecture had worked well for Nubank — it enabled the developer autonomy and flexibility that carried them through their hypergrowth phases. After 8 years, however, Nubank’s sheer volume of customer growth, as well as geographic and product expansion beyond their original credit card business, led to an entangled, behemoth ETL with countless cross-dependencies and no clear path to continuing to scale.

For Nubankers, business-critical data transformations started taking increasingly long to run, with dependency chains up to 70 levels deep and insufficient formal agreements on who was responsible for maintaining what. As the company continued to grow, it became clear that the ETL would be a primary bottleneck to scale.

Nubank concluded that there was an urgent need to split up their monolithic ETL repository, which had grown to over 6 million lines of code, into smaller, more flexible sub-modules.

Nubank’s code migration was filled with the monotonous, repetitive work that engineers dread. Moving each data class implementation from one architecture to another while tracing imports correctly, performing multiple delicate refactoring steps, and accounting for any number of edge cases was highly tedious, even to do just once or twice. At Nubank’s scale, however, the total migration scope involved more than 1,000 engineers moving ~100,000 data class implementations over an expected timeline of 18 months.

In a world where engineering resources are scarce, such large-scale migrations and modernizations become massively expensive, time-consuming projects that distract from any engineering team’s core mission: building better products for customers. Unfortunately, this is the reality for many of the world’s largest organizations.

The Decision: an army of Devins to tackle subtasks in parallel

At project outset in 2023, Nubank had no choice but to rely on their engineers to perform code changes manually. Migrating one data class was a highly discretionary task, with multiple variations, edge cases, and ad hoc decision-making — far too complex to be scriptable, but high-volume enough to be a significant manual effort.

Within weeks of Devin’s launch, Nubank identified a clear opportunity to accelerate their refactor at a fraction of the engineering hours. Migration or large refactoring tasks are often fantastic projects for Devin: after investing a small, fixed cost to teach Devin how to approach sub-tasks, Devin can go and complete the migration autonomously. A human is kept in the loop just to manage the project and approve Devin’s changes.

The Solution: Custom ETL Migration Devin

A task of this magnitude, with the vast number of variations that it had, was a ripe opportunity for fine-tuning. The Nubank team helped to collect examples of previous migrations their engineers had done manually, some of which were fed to Devin for fine-tuning. The rest were used to create a benchmark evaluation set. Against this evaluation set, we observed a doubling of Devin’s task completion scores after fine-tuning, as well as a 4x improvement in task speed. Roughly 40 minutes per sub-task dropped to 10, which made the whole migration start to look much cheaper and less time-consuming, allowing the company to devote more energy to new business and new value creation instead.

Devin contributed to its own speed improvements by building itself classical tools and scripts it would later use on the most common, mechanical components of the migration. For instance, detecting the country extension of a data class (either ‘br’, ‘co’, or ‘mx’) based on its file path was a few-step process for each sub-task. Devin’s script turned this into a single-step executable, a saving that added up immensely across tens of thousands of sub-tasks.
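To make the idea concrete, here is a minimal sketch of what a path-based country detector could look like. It is not the script Devin wrote: the directory layout, the file paths, and the function name `detect_country` are all assumptions for illustration. Only the three country codes come from the text above.

```python
from pathlib import PurePosixPath
from typing import Optional

# Country codes named in the case study. Everything else here is hypothetical.
COUNTRY_CODES = {"br", "co", "mx"}


def detect_country(file_path: str) -> Optional[str]:
    """Return the first path component that is a known country code, else None.

    Assumes (hypothetically) that the country appears as its own directory
    name somewhere in the path of the data class file.
    """
    for part in PurePosixPath(file_path).parts:
        if part.lower() in COUNTRY_CODES:
            return part.lower()
    return None


# Hypothetical paths, for illustration only.
print(detect_country("etl/datasets/br/credit_card/statement.scala"))  # br
print(detect_country("etl/shared/utils.scala"))                       # None
```

Wrapping a lookup like this in one command removes a small manual step from every sub-task, which is where the cumulative saving comes from.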

There is also a compounding advantage to Devin’s learning. In the first weeks, it was common to see outstanding errors to fix, or small things Devin wasn’t sure how to solve. But as Devin saw more examples and gained familiarity with the task, it started to avoid rabbit holes more often and find faster solutions to previously seen errors and edge cases. Much as with a human engineer, we observed obvious speed and reliability improvements with every day Devin worked on the migration.

Results: Delivering an 8-12x faster migration, lifting a burden from every engineer, and slashing migration costs by 20x

“Devin provided an easy way to reduce the number of engineering hours for the migration, in a way that was more stable and less prone to human error. Rather than engineers having to work across several files and complete an entire migration task 100%, they could just review Devin’s changes, make minor adjustments, then merge their PR.”

Jose Carlos Castro, Senior Product Manager

8-12x efficiency gains: This is calculated by comparing the typical engineering hours required to complete a data class migration task against the total engineering hours spent prompting and reviewing Devin’s work on the same task.
Over 20x cost savings on the scope of the migration delegated to Devin: This is calculated by comparing the cost of running Devin versus the hourly cost of an engineer completing that task. The significant savings are heavily driven by speed of task execution and cost effectiveness of Devin relative to human engineering time. The figure does not even consider the value captured by completing the entire project months ahead of schedule!
Fewer dreaded migration tasks for Nubank engineers
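
The two calculations described in the results can be written as simple ratios. The sketch below only illustrates the arithmetic: the function names are hypothetical, and every input value is a made-up placeholder rather than a Nubank figure.

```python
def efficiency_gain(manual_hours: float, prompt_hours: float, review_hours: float) -> float:
    """Manual engineering hours divided by the hours spent prompting and reviewing Devin."""
    assisted_hours = prompt_hours + review_hours
    if assisted_hours <= 0:
        raise ValueError("assisted hours must be positive")
    return manual_hours / assisted_hours


def cost_savings(manual_hours: float, engineer_hourly_cost: float, devin_cost: float) -> float:
    """Cost of an engineer doing the task divided by the cost of running Devin on it."""
    if devin_cost <= 0:
        raise ValueError("Devin cost must be positive")
    return (manual_hours * engineer_hourly_cost) / devin_cost


# Placeholder inputs, NOT real data: 6 h manual vs. 0.25 h prompting + 0.5 h review.
print(efficiency_gain(6.0, 0.25, 0.5))   # 8.0
# Placeholder inputs, NOT real data: 6 h at 100 per hour vs. 30 to run Devin.
print(cost_savings(6.0, 100.0, 30.0))    # 20.0
```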

How Rivian and VW's Joint Venture Uses AI to Ship the Future of Vehicle Software

10-15x increase in test generation velocity
3-4 engineers redirected from customer support back to product

About the company

RV Tech (Rivian and Volkswagen Group Technologies) is a joint venture founded in November 2024 to develop state-of-the-art zonal electronic architecture and software for software-defined vehicles. The technology will power future Volkswagen Group models on the SSP platform—expected to support up to 30 million vehicles—as well as Rivian's R2, R3, and R3X. The joint venture has grown to more than 1,600 employees across the US, Canada, Sweden, Serbia, and Germany.

Industry: Automotive / Vehicle Software
Size: ~1,500 employees

Overview

Rivian and Volkswagen Group formed RV Tech to build and deploy a shared software-defined vehicle platform—combining Rivian’s proven SDV software and electrical architecture with Volkswagen Group’s global manufacturing scale and brand portfolio. The platform will power vehicles across Volkswagen Group and Rivian’s portfolios, at a scale that could reach 30 million vehicles.

With the R2 launching in 2026 and the first Volkswagen Group models to follow in 2027, RV Tech invested in AI tooling to accelerate delivery—automating high-volume engineering work and adding new layers of quality assurance to safety-critical code.

The results have been significant: substantial engineering capacity redirected to product work, and a 10-15x increase in test generation velocity on propulsion and dynamics software.

How the team evaluated AI tools

Arjuna Siva, VP of Infotainment & Connectivity, has been leading the initiative to use generative AI as a force multiplier for the engineering organization.

RV Tech had adopted several AI coding tools across its workflows, but as the team scaled, they needed a solution that could meet the full scope of enterprise requirements, including support for their existing IDE workflows, a context engine that could handle large legacy codebases, a mature security posture, and dedicated enterprise support.

“The first thing that really piqued our interest was the culture match with Cognition. We’re a very fast-moving team—we’re lean, very practical, product-focused, and customer-focused.”

Arjuna Siva, VP of Infotainment & Connectivity, RV Tech

After evaluation, they adopted two tools: Windsurf for interactive coding assistance, chosen for its enterprise readiness across all of these dimensions, and Devin for autonomous background work, meaning tasks that can run without constant human supervision.

“It’s not a vendor-customer relationship. It’s a true partnership, where we feel that our requests are heard and we’re really at the forefront of what the technology can offer today.”

Wassym Bensaid, Co-CEO & CTO, RV Tech

Results from the rollout

Vehicle Access Ticket Triage

Vivek Ravi is an engineering manager on the Vehicle Access team, which builds the software that enables vehicle lock, unlock, and start. The team integrates car keys into phone wallets using Bluetooth Low Energy and Ultra-Wideband technology. It is a full-stack team spanning firmware, wireless protocols, cloud infrastructure, mobile app interfaces, and infotainment integration.

One area where the team spent time was customer-reported vehicle access issues. Before AI tooling, the triage process was manual: an issue came in from front-line customer support, and an on-call engineer picked it up and triaged it. Each ticket was time-sensitive. At peak periods, this consumed the full capacity of multiple engineers every sprint.

RV Tech had been close to assigning dedicated engineers for exactly this workload. Instead, the team piloted Devin. A Devin Slackbot gets tagged on an incoming report, extracts the VIN and timestamp, pulls the relevant logs, runs the parsing scripts, and performs first-level triage. Engineers then review Devin’s analysis.
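
As an illustration of the first step in that flow, the sketch below pulls a VIN and a timestamp out of a free-text report. It is not RV Tech’s implementation: the function and type names are hypothetical, the timestamp format (ISO 8601 without a timezone) is an assumption, and the sample VIN is made up. The one grounded detail is that VINs are 17 characters long and never use the letters I, O, or Q.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional

# VINs are 17 characters and never contain I, O, or Q.
VIN_RE = re.compile(r"\b[A-HJ-NPR-Z0-9]{17}\b")
# Assumed timestamp format: ISO 8601 without timezone, e.g. 2025-03-14T08:15:00
TS_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\b")


class TriageKey(NamedTuple):
    """The two values needed to look up logs for a report (hypothetical type)."""
    vin: str
    timestamp: datetime


def extract_triage_key(report: str) -> Optional[TriageKey]:
    """Return the first VIN and timestamp found in the report, or None if either is missing."""
    vin = VIN_RE.search(report)
    ts = TS_RE.search(report)
    if vin is None or ts is None:
        return None
    return TriageKey(vin.group(0), datetime.fromisoformat(ts.group(0)))


# Made-up report text and VIN, for illustration only.
report = "Customer cannot unlock with phone key. VIN 7FCTGAAA0PN000001 at 2025-03-14T08:15:00"
print(extract_triage_key(report))
```

Returning `None` when either field is missing gives the caller a clear signal to ask a human for the missing detail instead of pulling logs for the wrong vehicle or time window.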

Devin triage enabled RV Tech to redirect 3-4 engineers from customer support back to product work, meaningfully expanding the team’s capacity for core development.

Beyond reactive triage, the team sees a second application: proactive fleet monitoring. RV Tech already tracks fleet-wide metrics and can see issues customers never report. Today, the team has no capacity to investigate those failures. With Devin, they plan to pull logs for affected vehicles and run first-level triage automatically, surfacing issues before they become customer complaints.

SIL Testing

Brian Harries leads the dynamics and propulsion controls team—all the application software governing how the car drives and handles.

In automotive, traceability from requirements to code to tests is essential. The traditional process for enforcing that traceability relies heavily on brute-force meetings—systems engineers sitting down with software engineers, going line by line on every requirement, mapping coverage, and reviewing repeatedly.

Brian’s team has been using Devin to generate SIL test cases. Devin ingests the simulation framework structure, the codebase architecture, and the relevant requirements, then generates test cases that validate the implementation, raises a merge request, iterates until the tests are functional and passing, and submits for review.

Devin writes close to 100% of the test code in most cases. The engineering review focuses on verifying that the tests are actually checking the right logic.

Where an engineer might write one or two tests per day manually, Brian estimates the team can reach 10 to 15 tests per day using Devin.

The team also appreciated how easily Devin’s setup could be shared. One person invested the time to build a playbook with the right context, and then the entire team could use it to generate test cases immediately.

The impact went beyond speed. In at least one case during the pilot, Devin identified conflicting logic between two requirements, suggesting the requirements themselves needed revision. A human reviewer confirmed the finding.

Brian described this as the broader opportunity: AI tooling lets the team catch quality issues across the whole V-model in systems engineering—requirements, code, test cases—much faster, without the traditional slow review cycles.

What's next

The auto industry’s shift to software-defined vehicles opens up something genuinely new—cars that improve after they leave the factory, new features delivered to millions of vehicles overnight, and a driving experience that gets better over time. RV Tech is pioneering what it takes to deliver on that promise: autonomous software engineering that triages issues, generates test coverage, and scans for regressions, so engineers can focus on building the features that make vehicles better.

“What really makes me excited is this will help us be way more predictable in our execution, deliver software with way higher quality, and dramatically increase the velocity of our engineering team so that we get more and more features to our customers.”

Wassym Bensaid, Co-CEO & CTO, RV Tech