We’ve hit double-digit Waves!

Just like Wave 8, we are releasing Wave 10 over three days.

Today’s launch is Planning Mode, a native way to collaborate with the AI on long-term planning and thinking. Let’s jump right into it…

Planning Mode

Planning Mode introduces the interface for collaborating with AI on long-term thinking.

Let’s see it in action. You can enable Planning Mode by simply clicking the icon under the prompt box on the right:

Then, when you start a conversation with Cascade, not only will Cascade give you responses and tool-based actions, but it will also help plan your work by generating and editing a local markdown file with goals and tasks.
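The exact layout of the plan file is not spelled out here, so purely as an illustration, assume a hypothetical layout with a goal heading and a checkbox task list. A minimal Python sketch of reading such a file could look like the following. Both the `PLAN_TEXT` layout and the `parse_plan` helper are assumptions made for this example, not Windsurf's actual format or API.

```python
import re
from dataclasses import dataclass, field

# Hypothetical plan layout; the real file Cascade writes may differ.
PLAN_TEXT = """\
# Goal
Add rate limiting to the public API

## Tasks
- [x] Audit existing middleware
- [ ] Implement token bucket limiter
- [ ] Add integration tests
"""

@dataclass
class Plan:
    goal: str = ""
    tasks: list[tuple[str, bool]] = field(default_factory=list)  # (description, done)

TASK_RE = re.compile(r"^- \[( |x)\] (.+)$")

def parse_plan(text: str) -> Plan:
    """Parse the assumed layout: a '# Goal' heading followed by checkbox tasks."""
    plan = Plan()
    in_goal = False
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "# Goal":
            in_goal = True
            continue
        if stripped.startswith("#"):
            in_goal = False  # any other heading ends the goal section
            continue
        match = TASK_RE.match(stripped)
        if match:
            plan.tasks.append((match.group(2), match.group(1) == "x"))
        elif in_goal and stripped:
            plan.goal = stripped
    return plan

plan = parse_plan(PLAN_TEXT)
remaining = [desc for desc, done in plan.tasks if not done]
```

Because the plan is plain text, the same file stays readable and editable by hand, which is the property the rest of this post relies on.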

Cascade will constantly refer to the plan as it completes the task, so you can manually go in and edit the plan and Cascade will respond accordingly, or just ask Cascade to change the plan.

As Cascade learns new information (e.g., Memories) that might require changing the plan, it will modify the plan, and you will be notified when this happens so that you can review and adjust as necessary.
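To picture the review step, here is a minimal sketch that compares two versions of a plan using Python's standard `difflib` module and keeps only the lines that changed. How Windsurf actually surfaces plan changes is not described in this post; `summarize_plan_change` and the sample plan text are hypothetical.

```python
import difflib

def summarize_plan_change(old: str, new: str) -> list[str]:
    """Return only the removed ('- ') and added ('+ ') lines between two plan versions."""
    diff = difflib.ndiff(old.splitlines(), new.splitlines())
    # ndiff also emits unchanged lines ('  ') and hint lines ('? '); drop those.
    return [line for line in diff if line.startswith(("+ ", "- "))]

# Assumed example: one task is revised after new information comes in.
before = "- [ ] Use REST endpoint\n- [ ] Write tests\n"
after = "- [ ] Use GraphQL endpoint\n- [ ] Write tests\n"
changes = summarize_plan_change(before, after)
```

Here `changes` holds one removed line and one added line, which is roughly the amount of information a reviewer needs to accept or adjust the update.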

In the background, a larger model that is more capable of long-term reasoning iterates on the long-term plan, while your selected model takes short-term actions conditioned on the plan. Many people have already found that generating an “implementation plan checklist” helps guide the AI through longer, more complex tasks, so native support within Windsurf has been a big ask from the community. Now, Planning Mode supports and improves on this workflow.

Planning Mode is currently available on all paid plans at no extra cost.

The Tactical

Planning Mode is currently available on all paid plans at no extra cost, partly thanks to larger models like o3 getting much cheaper.

In fact, o3 is now just 1x credits (both medium and high reasoning), down from 7.5x and 10x credits before Wave 10. We have also spent time making o3 work significantly better and faster in Cascade than before. We are investing heavily in making these larger, more powerful models work better in Windsurf as we start automating increasingly long-range tasks in our quest to accelerate all of software development by 99%.

The Theory

Let’s pick up from Wave 9, where we launched our first family of proprietary frontier models for software engineering, dubbed SWE-1. A core thesis behind SWE-1 is that software engineering is not the same as coding, as the former requires operating over incomplete states and long-horizon objectives. We also introduced the representation of a shared timeline of actions taken, and the concept of flow awareness, which is the ability for both the human and AI to be aware of (and to take action on) the shared timeline:

The standard shared timeline diagram of short-term actions, with the human’s actions in grey and the AI’s actions in teal.

However, this is an incomplete representation of software engineering. Software engineering is not an arbitrary series of short-term actions. It’s a thoughtful order of short-term actions that all align with a long-term plan, a plan that is refined as the work is being done, approaches are tested, and information is learned.

If we ignore AI for a second, humans actually complete work by having a timeline of actions (short-term thinking) and a timeline of a plan (long-term thinking) that update each other:

Short-term and long-term thinking timelines

The top timeline is for the short-term actions and the bottom timeline is for the long-term reasoning. Since this is happening simultaneously within the human’s brain, every action is subconsciously pulled from the plan (the vertical upwards arrows). Meanwhile, the plan is also constantly updating, sometimes independently from the actions or sometimes because the results of certain actions trigger new information that requires the plan to change (the curved downwards arrows).

This observation explains why pretty much every agentic software engineering tool (Windsurf or otherwise) starts to break down for long-running complex tasks; they lack a native primitive that encapsulates the plan timeline. Given an end goal, it is important to have a plan on how to achieve the end goal, to be able to learn more about the state of the world while executing the plan, and then iterate on the plan with the new information when necessary.

So, if we are to incorporate planning with AI under the concept of a shared timeline, we actually need to extend flow awareness from the action timeline to the plan timeline:

Unlike in the previous diagram, instead of everything happening within the human’s brain, the plan is an actual persistent, modifiable object on disk. Planning Mode covers the orange arrows - pulling from the plan when the AI needs to take actions, and then automatically updating the plan as frequently as possible when new information comes in. This does not mean that all updates on the plan have to be made by the AI. Also, the human’s actions ideally reflect the same plan - human and AI are operating on not just the same shared timeline of short-term actions, but on the same joint shared timeline of short-term and long-term reasoning.

Software engineering really boils down to three things: knowing what to build¹, figuring out how to build it, and doing the building. “Doing the building” is captured by the action timeline (short-term thinking), while “figuring out how to build” is captured by the plan timeline (long-term thinking). AI has historically been only good enough to help with “doing the building,” taking over more and more of the steps on the action timeline. The optimal flow-aware interface for collaborating with AI on the action timeline likely looks very similar to Cascade, a chat-like interface. But now, AI is getting good enough to also help with “figuring out how to build.” And with that, it is unclear whether this chat-like interface is the optimal flow-aware interface for collaborating with AI on the plan timeline.

We believe that the markdown plan file is the v1 of this optimal flow-aware interface for collaborating with AI on the plan timeline. To build intuition for this, we will deconstruct what long-term planning with AI really looks like.

There are two main stages to long-term planning with AI: initialization and iteration.

The initialization stage is a one-time operation of creating a plan given a high-level goal. This is what some of the large frontier models with reasoning capabilities have been getting very good at, and so this is not as interesting a phase. Many developers already know to select a model like o3 within Windsurf to ask it to generate a plan before switching to a smaller, cheaper model for the step-by-step reasoning.

The iteration stage is the much more interesting part, and itself can be decomposed into three pieces:

1. The ability to learn new information
2. The ability to update the plan given the new information
3. The ability for the short-term actions to always reflect the most current version of the long-term plan

This is continuous, so it isn’t as simple as calling a particular model once.
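As a rough sketch of how these three pieces could fit together in a continuous loop, the example below re-reads the plan from disk before every action, so any edit shows up in the very next step. All names here (`next_open_task`, `update_plan`, `run_iteration`) and the checkbox file layout are hypothetical; this is not how Cascade is implemented.

```python
import tempfile
from pathlib import Path

def next_open_task(plan_path: Path) -> "str | None":
    """Piece 3: always read the latest plan from disk before acting."""
    for line in plan_path.read_text().splitlines():
        if line.startswith("- [ ] "):
            return line[len("- [ ] "):]
    return None

def update_plan(plan_path: Path, finished: str, new_tasks: list[str]) -> None:
    """Piece 2: check off the finished task and append tasks implied by new information."""
    lines = plan_path.read_text().splitlines()
    lines = [f"- [x] {finished}" if l == f"- [ ] {finished}" else l for l in lines]
    lines += [f"- [ ] {task}" for task in new_tasks]
    plan_path.write_text("\n".join(lines) + "\n")

def run_iteration(plan_path: Path, discoveries: dict[str, list[str]]) -> list[str]:
    """Act until no open task remains; `discoveries` stands in for piece 1 (learned information)."""
    actions = []
    while (task := next_open_task(plan_path)) is not None:
        actions.append(task)  # stand-in for a real short-term action
        update_plan(plan_path, task, discoveries.get(task, []))
    return actions

with tempfile.TemporaryDirectory() as tmp:
    plan_file = Path(tmp) / "plan.md"
    plan_file.write_text("- [ ] Write migration\n- [ ] Deploy\n")
    # Assumed example: writing the migration reveals a follow-up task.
    log = run_iteration(plan_file, {"Write migration": ["Backfill old rows"]})
    final_plan = plan_file.read_text()
```

The design choice worth noticing is that the loop never caches the plan in memory. The file on disk is the single source of truth, so a person editing it between steps and a planner updating it are handled the same way.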

We have actually been building the first and third pieces for a while. The first piece is essentially Customizations; new information is helpful to both short-term actions and long-term planning. These Customizations are either human-specified information in the form of Workflows or Rules, or information learned from AI outputs, such as Memories. The third piece follows automatically from how we have instrumented our whole system to work on a centralized, ever-evolving “shared timeline.” The only difference is that we now look at the joint timelines of short-term actions and long-term plans.

The second piece was primarily what was missing: being able to update the plan given the new information. For this, the plan needs to be saved in a persistent format, not something ephemeral like a suggestion or a chat response. And once that exists, both the AI and the human need ways to modify this plan.

The combination of modifiable and persistent is why we think the markdown file is the v1 interface - the simplest interface that satisfies both of these conditions. Human modifications are then trivial, since it is just a markdown file, and AI modifications are what we have instrumented with new tools in Planning Mode.

In the end, the solution is quite elegant - a Cascade conversation for short-term actions paired with a plan file for long-term reasoning. In fact, it might seem too simple or obvious, which we generally think is a good thing. The trickiness is in the details of how to instrument these two together, but if the end user finds the experience seamless, then we’ve done our job.

What’s Next

While we think this is a very important first step, we think a lot more can, and should, be explored:

The actual representation could become richer than a markdown file: images, diagrams, code blocks, comments, and more.
Just as Workflows and Rules are “checked in to a repository” so that people on a team can share them, maintain version control, etc., what if plans were natively multiplayer like a Google Doc?
Updates to the planning timeline could be categorized more richly than just “initialize” and “iterate,” turning the planning timeline into the ideal primitive for flow-aware project management and architecture. These are parts of the software development life cycle that we don’t yet natively support because we have not cracked a system that would be good enough in production.
What else is possible when not just the plans, but also the evolution of said plans, are cleanly documented for an organization? This is a data source that has never existed before.

Long-term reasoning is a very important problem for us to solve in order to achieve our mission of accelerating software development by 99%. Planning Mode is just the start.

See you tomorrow for day 2 of Wave 10!

¹“What to build” is more about goal-setting as opposed to short-term vs long-term thinking. This is the piece that is likely going to be human-owned for the foreseeable future, which is why our goal at Windsurf is to accelerate developers by 99%, not automate them by 100%.