How Nubank refactors millions of lines of code to improve engineering efficiency with Devin

12x engineering time efficiency gain

20x cost savings

Overview

One of Nubank’s most critical company-wide projects for 2023-2024 was the migration of their core ETL, an 8-year-old monolith with millions of lines of code, to sub-modules. Handling a refactor of that size initially meant a multi-year effort that spread repetitive refactoring work across more than one thousand engineers. Devin changed this: engineers delegated their migrations to Devin and achieved a 12x efficiency improvement in engineering hours saved, along with over 20x cost savings. The Data, Collections, and Risk business units, among others, verified and completed their migrations in weeks instead of months or years.

The Problem

Nubank was born into the tradition of centralized ETL architectures common in financial services. For years, the monolithic architecture worked well for Nubank: it enabled the developer autonomy and flexibility that carried them through their hypergrowth phases. After 8 years, however, Nubank’s customer growth, along with geographic and product expansion beyond their original credit card business, had produced an entangled, behemoth ETL with countless cross-dependencies and no clear path to continued scaling.

For Nubankers, business critical data transformations started taking increasingly long to run, with chains of dependencies as deep as 70 and insufficient formal agreements on who was responsible for maintaining what. As the company continued to grow, it became clear that the ETL would be a primary bottleneck to scale.

Nubank concluded that there was an urgent need to split their monolithic ETL repository, which had grown to over 6 million lines of code, into smaller, more flexible sub-modules.

Nubank’s code migration was filled with the monotonous, repetitive work that engineers dread. Moving each data class implementation from one architecture to another while tracing imports correctly, performing multiple delicate refactoring steps, and accounting for any number of edge cases was highly tedious, even to do just once or twice. At Nubank’s scale, however, the total migration scope involved more than 1,000 engineers moving ~100,000 data class implementations over an expected timeline of 18 months.

In a world where engineering resources are scarce, such large-scale migrations and modernizations become massively expensive, time-consuming projects that distract from any engineering team’s core mission: building better products for customers. Unfortunately, this is the reality for many of the world’s largest organizations.

The Decision: an army of Devins to tackle subtasks in parallel

At project outset in 2023, Nubank had no choice but to rely on their engineers to perform code changes manually. Migrating one data class was a highly discretionary task, with multiple variations, edge cases, and ad hoc decision-making — far too complex to be scriptable, but high-volume enough to be a significant manual effort.

Within weeks of Devin’s launch, Nubank identified a clear opportunity to accelerate their refactor at a fraction of the engineering hours. Migrations and large refactoring tasks are often fantastic projects for Devin: after a small, fixed investment in teaching Devin how to approach the sub-tasks, Devin can complete the migration autonomously. A human stays in the loop only to manage the project and approve Devin’s changes.

The Solution: Custom ETL Migration Devin

A task of this magnitude, with the vast number of variations that it had, was a ripe opportunity for fine-tuning. The Nubank team helped to collect examples of previous migrations their engineers had done manually, some of which were fed to Devin for fine-tuning. The rest were used to create a benchmark evaluation set. Against this evaluation set, we observed a doubling of Devin’s task completion scores after fine-tuning, as well as a 4x improvement in task speed. Roughly 40 minutes per sub-task dropped to 10, which made the whole migration start to look much cheaper and less time-consuming, allowing the company to devote more energy to new business and new value creation instead.
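The split between fine-tuning examples and a held-out benchmark can be sketched as follows. This is a minimal illustration, not Nubank's or Devin's actual tooling: the function names, the 20% hold-out fraction, and the `attempt` callback are all assumptions made for the example.

```python
import random
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_examples(
    examples: Sequence[T], eval_fraction: float = 0.2, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Shuffle deterministically, then hold out a fraction for evaluation.

    Returns (fine_tuning_examples, eval_examples). The fraction is an
    assumption; the article does not say how the examples were divided.
    """
    if not 0 < eval_fraction < 1:
        raise ValueError("eval_fraction must be between 0 and 1")
    shuffled = list(examples)
    random.Random(seed).shuffle(shuffled)
    n_eval = max(1, int(len(shuffled) * eval_fraction))
    return shuffled[n_eval:], shuffled[:n_eval]


def completion_rate(eval_set: Sequence[T], attempt: Callable[[T], bool]) -> float:
    """Fraction of held-out tasks for which `attempt` reports success."""
    if not eval_set:
        raise ValueError("eval_set must not be empty")
    return sum(1 for example in eval_set if attempt(example)) / len(eval_set)
```

Running `completion_rate` on the same held-out set before and after fine-tuning gives the kind of before/after comparison described above. The reported speed-up follows the same logic: 40 minutes per sub-task falling to 10 is a 40 / 10 = 4x improvement.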

Devin contributed to its own speed improvements by building classical tools and scripts that it later used on the most common, mechanical components of the migration. For instance, detecting the country extension of a data class (either ‘br’, ‘co’, or ‘mx’) based on its file path was a few-step process for each sub-task. Devin’s script turned this into a single-step executable, a saving that added up substantially across tens of thousands of sub-tasks.
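A helper of this kind might look like the sketch below. It is not Devin's actual script: the function name and the assumed path layout (a country code appearing as one directory component) are hypothetical, and only the three country codes come from the text above.

```python
from pathlib import PurePosixPath
from typing import Optional

# Country codes named in the article; everything else here is an assumption.
COUNTRY_CODES = {"br", "co", "mx"}


def detect_country(file_path: str) -> Optional[str]:
    """Return the first path component that is a known country code, if any."""
    for part in PurePosixPath(file_path).parts:
        if part.lower() in COUNTRY_CODES:
            return part.lower()
    return None


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    print(detect_country("etl/br/accounts/dataset.src"))
    print(detect_country("etl/shared/helpers.src"))
```

Matching whole path components rather than substrings avoids false positives from file names that merely contain one of the two-letter codes.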

There is also a compounding advantage from Devin’s learning. In the first weeks, it was common to see outstanding errors to fix, or small things Devin wasn’t sure how to solve. But as Devin saw more examples and gained familiarity with the task, it avoided rabbit holes more often and found faster solutions to previously seen errors and edge cases. Much as with a human engineer, we observed clear speed and reliability improvements with every day Devin worked on the migration.

    Results:
    Delivering an 8-12x faster migration, lifting a burden from every engineer, and slashing migration costs by 20x.

“Devin provided an easy way to reduce the number of engineering hours for the migration, in a way that was more stable and less prone to human error. Rather than engineers having to work across several files and complete an entire migration task 100%, they could just review Devin’s changes, make minor adjustments, then merge their PR.”

Jose Carlos Castro, Senior Product Manager

    8-12x efficiency gains
    This is calculated by comparing the typical engineering hours required to complete a data class migration task against the total engineering hours spent prompting and reviewing Devin’s work on the same task.

    Over 20x cost savings on scope of the migration delegated to Devin
    This is calculated by comparing the cost of running Devin versus the hourly cost of an engineer completing that task. The significant savings are heavily driven by speed of task execution and cost effectiveness of Devin relative to human engineering time – it does not even consider the value captured by completing the entire project months ahead of schedule!
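The two ratios described above can be written out as a short sketch. The function names are hypothetical and the numbers in the usage example are placeholders chosen for easy arithmetic, not Nubank's data. The sketch also assumes that the cost comparison is made per task and counts only the engineer's hourly cost on the manual side, which the text does not spell out.

```python
def efficiency_gain(manual_hours: float, devin_review_hours: float) -> float:
    """Engineer hours for a manual migration task divided by the engineer
    hours spent prompting and reviewing Devin on the same task."""
    if devin_review_hours <= 0:
        raise ValueError("devin_review_hours must be positive")
    return manual_hours / devin_review_hours


def cost_savings(
    engineer_hourly_cost: float, manual_hours: float, devin_task_cost: float
) -> float:
    """Cost of an engineer completing the task divided by the cost of
    running Devin on it (assumed per-task comparison)."""
    if devin_task_cost <= 0:
        raise ValueError("devin_task_cost must be positive")
    return (engineer_hourly_cost * manual_hours) / devin_task_cost


if __name__ == "__main__":
    # Placeholder inputs, for illustration only.
    print(efficiency_gain(manual_hours=5.0, devin_review_hours=0.5))
    print(cost_savings(engineer_hourly_cost=80.0, manual_hours=5.0, devin_task_cost=16.0))
```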

Fewer dreaded migration tasks for Nubank engineers

Devin contributes 25% of total code volume at Hamming

Making tasks "Devinable"

For Hamming, speed is everything. As one of the fastest-moving engineering teams in the world, they measure their success by how much they can ship per week. With Devin contributing 25% of their total code volume and ranking as a top contributor alongside their best engineers, Hamming has redefined what engineering velocity looks like in the age of AI agents.

“Speed is our primary metric… We measure how much our organization can ship per week, per month. And Devin is one of our top contributors, so it’s pretty obvious what the impact is to us.”

—Sumanyu Sharma, Founder & CEO, Hamming

To ship fast in a complex codebase without adding bugs, Hamming developed a systematic methodology for identifying and creating “Devinable” tasks—well-bounded, clearly specified work with objective deliverables that Devin can execute with high accuracy.

With this framework, the Hamming team was able to delegate increasingly complex tasks to Devin that would otherwise take up engineering time (such as rebuilding large parts of their codebase).

    The "Devinable" framework
    Bounded tasks + Clear specs + Objective evals
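One way to picture the three ingredients is as a checklist over a task description. The sketch below is an illustration under stated assumptions, not Hamming's internal tooling: the `TaskSpec` type, its field names, and the `missing_for_devinable` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TaskSpec:
    """A hypothetical task description; the field names are illustrative."""

    title: str
    files_in_scope: List[str] = field(default_factory=list)       # bounded task
    acceptance_criteria: List[str] = field(default_factory=list)  # clear spec
    eval_commands: List[str] = field(default_factory=list)        # objective eval


def missing_for_devinable(task: TaskSpec) -> List[str]:
    """List which of the three framework ingredients the task still lacks."""
    gaps = []
    if not task.files_in_scope:
        gaps.append("bounded scope")
    if not task.acceptance_criteria:
        gaps.append("clear spec")
    if not task.eval_commands:
        gaps.append("objective eval")
    return gaps
```

A task with an empty result from `missing_for_devinable` has a stated scope, concrete acceptance criteria, and at least one command that objectively checks the outcome. Deciding whether the boundaries are the right ones still takes the human judgment described in the quote that follows.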

“We have a good feel for what is “Devinable”, what is not “Devinable” internally. That requires skill to know what the right boundaries are, how do you craft specs in a way that are concrete, that someone like Devin can execute.”

—Sumanyu

Identifying “Devinable” tasks was just the beginning. Hamming realized that to truly leverage Devin’s capabilities, they needed to rethink their approach to software engineering. This led to a transformation in how they structure their codebase:

1. Writing for AI agents, not just humans. Hamming inverted the traditional approach to software development. Instead of writing code primarily for human consumption, they optimized their codebase for AI agents to understand, manipulate, and test.

2. Functional over object-oriented. They shifted toward functional programming patterns, recognizing that mutable state is harder for both LLMs and humans to visualize and reason about.

3. Test-driven and eval-driven development. Testing became central to their philosophy, with comprehensive unit tests that help both human and AI developers understand expected behaviors.
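The contrast behind the second and third points can be shown with a small sketch. It is a generic illustration, not code from Hamming's codebase, and all names in it are hypothetical.

```python
from typing import List


class RunningTotal:
    """Stateful style: the result of add() depends on every earlier call."""

    def __init__(self) -> None:
        self.total = 0.0

    def add(self, amount: float) -> float:
        self.total += amount
        return self.total


def total_of(amounts: List[float]) -> float:
    """Functional style: the output depends only on the arguments."""
    return sum(amounts)


def test_total_of() -> None:
    # The test doubles as documentation of the expected behavior.
    assert total_of([]) == 0
    assert total_of([1.5, 2.5]) == 4.0
```

To predict what `RunningTotal.add` returns, a reader (human or agent) has to know the object's call history. `total_of` can be understood and tested from its signature and a couple of assertions alone.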

“We made our code base more functional, more self-documenting, added more unit tests, ironically using Devin to add more unit tests… These are things that have nothing specifically to do with Devin, but are just good engineering practices.”

—Sumanyu

Hamming’s approach to real-time customer issue resolution

One of the most impactful uses of Devin at Hamming is for immediate customer issue resolution:

“If customers report issues on the call, I often just create Devin sessions immediately right then and there so that they’re shipped to prod and fixed the same day. That turnaround time of having an idea and then proving it out or disproving it out is extremely valuable.”

—Sumanyu

Devin as a scout

Beyond traditional development tasks, Hamming uses Devin for exploration and experimentation. When the team has new product ideas but lacks time to fully scope them out, they send Devin as a “scout” to explore the solution space. Devin will quickly understand the codebase, search online for documentation, and try out different solutions to find the best approach.

“Sometimes we have big ideas of things we want to do but don’t have time to fully scope all of them out or prove out a thesis. We’ve often sent Devins as scouts to just try it out and see what is the solution space.”

—Sumanyu

This exploration capability has led to surprising discoveries, with Devin sometimes finding reusable functions and patterns in the codebase that the team had forgotten about:

“Sometimes I feel like Devin has a better understanding of our code base than we do. Devin can sometimes find even functions that exist that none of us remember that we wrote like a year ago.”

—Sumanyu

Impact on Hamming's engineering velocity

The impact of Devin on Hamming’s engineering velocity has been transformative:

25% of total code volume is contributed by Devin
Top contributor status, tied with only 1-2 human engineers in terms of output
Same-day turnaround on customer-reported issues
Autonomous execution allowing engineers to start Devin sessions before meetings and return to completed PRs

“If you use Devin well, it can be your top contributor. If it’s not, it’s probably a skill issue or a code base issue. If your code base is unfriendly for Devin, it’s also unfriendly for an engineer.”

—Sumanyu

Eval-driven development and clear thinking

Hamming’s experience with Devin has shaped their vision for the future of software engineering. Rather than seeing AI as a threat to engineering jobs, they view it as an opportunity for unprecedented leverage and creativity.

“I think the future of software engineering is actually quite bright. This is the best time to learn software engineering because you get enormous amounts of leverage. The future belongs to teams that are deeply thinking around what needs to be done because that’s the alpha of a human.”

—Sumanyu

Their philosophy extends to their own product development: just as they build testing infrastructure for voice agents, they apply the same rigorous testing principles to their own codebase, creating a virtuous cycle where better testing enables faster, more reliable development with Devin.

“The future of software engineering is eval-driven development. If you can measure, therefore you can improve. Having clear thinking is kind of everything.”

—Sumanyu
