Why is OpenAI buying Windsurf?

theahura.substack.com

241 points by theahura 3 months ago

Why couldn’t OpenAI vibe code their own Windsurf/Cursor competitor? (Serious question).

OpenAI is a technology company constantly in search of productization (ChatGPT, Sora, Dall-e), and they’ve been really good at creating product interest that converts to acquisition. An IDE is much more complex than a chat app, but given their literal billions of dollars and familiarity with developer tooling, this is a down-stack build that they could dogfood off their own tech. And especially given that some of these tools were built by tiny teams (Cursor is what, 10 people?), is this like Google and Facebook’s implicit admission that they can’t “build and grow” anymore, and need to turn to acquisitions to fuel growth?

conartist6 3 months ago

I don't think there's anyone better positioned to answer your question than I am, given that I've spent the last 3 years building the IDE tech that OpenAI really wants and needs right now (though they don't yet know it).
The problem is that what you're discussing is a political undertaking. I don't mean that it's "left versus right" political, I mean that the primary task that makes it hard is getting a lot of human beings to agree on some low-level details about how the gory internals of an IDE work. LLMs can produce text, but they have no will to political organization. They aren't going to accomplish a task by going out and trying align the needs of many individuals in a compromise that requires determined work to find out what those people really want and need, which is the only way to get this particular task done. Somehow the natural-language-as-API idea has made many people think that in the future APIs and formal technical standards will be unnecessary, something for which I do not see evidence.
There's a second problem too, one of alignment: LLMs encourage you to give away the work of coding. I would not for anything give away the pain of using mediocre tools like VSCode to write a lot of code myself, because what I learn as the intelligence doing the work is what I need to know to be able to make tools for doing that work more efficiently. I use my learnings to design protocols, data structures, libraries, frameworks, and programming languages which don't just *mask* the underlying pain of software development but which can reduce it greatly.
- surgical_fire 3 months ago
  
  > Why couldn’t OpenAI vibe code their own Windsurf/Cursor competitor?
  The obvious answer is that vibecoding does not work.
  If it did, OpenAI wouldn't need to buy Windsurf
  - conartist6 3 months ago
    
    I'm as much of a skeptic about vibe coding as you are. I consider engineering a bit of an art, so it's not for me.
    That said, the kinds of things I've been hearing as going wrong with vibe coding are all or mostly fixable. I would even go as far as to say they're manifestations of the same problems that make IDEs such clumsy tools for humans to use at the moment, which is why we're trying to foist all this annoyance of using them off onto AIs.
    The biggest problem, THE problem in this space is that tools we use don't have the language to describe the logic -- the intent -- behind refactorings. If you've never captured the user's intent, you can't learn from it. The ability to capture that intent is absent all the way from the highest level UX right down the lowest level of serialization: the patch format
    That's why I believe it's my tech they really need and not Windsurf's -- I'm working on capturing the intent top to bottom.
    
    skydhash 3 months ago
    
    Isn’t the intent described in the commit message, which is one of many ways the developer can use to share intent? Code is only the what and the how. The why should be in Wiki, Readmes, Commit, Issue tracker…
    
    conartist6 3 months ago
    
    Let's say the change is to rename a variable. The patch format can never capture that the intent was to rename a variable. You can say so in a commit message, sure, but the action is never recorded, only it's effects, causing merges involving the rename to have conflicts
    
    skydhash 3 months ago
    
    That’s because the code is data for an execution machine. All the other stuff are for the hunan mind. There’s multiple way to transition from one state of code to another, so mostly people record the existence of a transition (version control) and why it has happened (commits and other texts).
    Recording how is not fruitful, unless the technique is good. In this case, the essence will be extracted, and it will become a checklist or a script.
    If you have two itents producing conflicting patches, the merge intent that emerges is good because it brings empathy (understanding the message) and harmony (working code). And that’s why almost everyone says that the code itself doesn’t matter. Only the feature that it brings into play and the the mind that birth it do. It is a medium.
    And a merge conflict is nice because it detects overlapping work and the fact that the concerned people should propably meet.
    
    mycall 3 months ago
    
    Typically intent needs to be well defined, such as PR discussions, use case / need assessments, or user stories on a backlog.
    
    hotstickyballs 2 months ago
    
    Intent being unclear is exactly why different languages exist
    
    re-thc 3 months ago
    
    > the kinds of things I've been hearing as going wrong
    That’s the problem with this stuff. It’s all hearsay.
    
    cdelsolar 3 months ago
    
    Iunno, Cline is vibe coding some stuff for me right now while I’m surfing through hacker News.
    
    conartist6 3 months ago
    
    Yeah that's weird to me but I've made my peace with it. I do this to make the software I want to use, and I think that's a chance everyone should have.
  - pomatic 2 months ago
    
    This is patently wrong - as an experiment, I've vibe coded* four apps in the last 10 days. I did not write one line of code myself. Two are crud style, one is DNS related and one is for an embedded device. They were non-trivial use cases, built using claude desktop with various MCP tools. I was blown away by just how good it was.
    *the original definition of vibe coding involved using voice to dictate the prompts to AI, I prefer to type.
  - gdilla 3 months ago
    
    I mean, why did Facebook buy Instagram? It's a feed of photos and social network, hmmmm. In one sense, sure you can copy any software out there when you have the resources, but OpenAI got to peak under the hood of Windsurf and 1) Found a large, growing community (emphasis on growing) they can buy, which is worth something and 2) A team that probably will save them 1 year of unforced errors (also worth something) and 3) they have a lot of capital to deploy anyway, and this is not a risky bet.
    
    sazoo 3 months ago
    
    Exactly. Some of the ultra-myopic takes regarding vibe coding reek of "the lady doth protest too much" lol. Cope on!
  - sazoo 3 months ago
    
    Your comment hurts my brain. Help me understand your reasoning here:
    "Technology no worky. Therefore let's spend $3billion to buy one of the leading tools that already has a million users."
    Did you even stop for 5 seconds to think about how completely nonsensical your comment is? The dev-cope is strong here... XD
    
    surgical_fire 2 months ago
    
    I am not the one coping. Many things may threaten my earnings as a Dev, mostly the very clear possibility of a difficult economic scenario in the years to come.
    Vibe coding is just people deeply invested in a deadend technology trying to hype it up.
    And I say this as someone that does use AI to a limited capacity as a code assistant.
- conartist6 3 months ago
  
  And the next link I click captures it perfectly: https://blog.ollien.com/posts/llm-friction/
  To quote Ollien:
  > Even as an “LLM-skeptic”, it would be silly for me to say that the tools are useless for software development. There are clearly times where they can be useful, whether that’s to perform a refactoring too complicated for IDE tooling, or to get a proof-of-concept put together. With that said, in my time both using and watching others use LLMs, I have noticed a troubling trend: they help reduce friction when developing software – friction that can help us to better understand and improve the systems we work on.
  It's quite funny to me that in the case I'm talking about, "a refactor that's too complicated for IDE tooling" is exactly the kind of friction that needs to be felt
  - dickersnoodle 3 months ago
    
    If the refactoring is beyond the IDE tooling it's even more incumbent on the engineers/developers to know what they're doing instead of outsourcing it to the "brains" behind an LLM.
  - never_inline 3 months ago
    
    I wouldn't use LLM for serious refactors. It's actually the common "recipes" using popular libraries where LLM is most useful for me now.
  - __loam 3 months ago
    
    That friction is the smell you need to chase down for a refactor.
- jv22222 3 months ago
  
  I’ve been building text editor as notion from the ground up. Recently i’ve been uploading the full codebase to Gemini to discuss different tasks. I can fully agree that the LLM can simply not decide on how to do things! If you slightly tweak your prompt it will suggest an entirely different architecture. You have to be very careful about pulling out the right stuff from the cargo cult best practice stuff baked in to the model. Another thing, a really obvious optimization might be just staring it in the face and it won’t suggest it unless you say “have you considered XYZ?” and then all of a sudden it won’t stop talking about XYZ! But most of all you just have to be strong in your will power to guide it in the direction you want.
- runfaster2000 3 months ago
  
  This excellent presentation (Veritasium) is making much the same point (learning comes from “reps”).
  https://youtu.be/0xS68sl2D70?si=rXg5_wjg4bEdMalw
lozenge 3 months ago

Why would OpenAI vibe code anything? They have hundreds of experienced software engineers. Vibe coding is meant to replace an intern, not your actual team.
They don't want to get into specific industries (AI for software development, AI for business process management, AI for knowledge workers). They just do the AI component. Maybe that's changing now, but they still let other people take the risks - you know, people who are passionate about coding. Then when the product is proven, they might acquire it.
- echoangle 3 months ago
  
  Why would you replace an intern with AI? The goal is to give someone experience and make them come back for a job later. If you replace interns with AI, don’t complain about not getting developers later.
  - MonkeyClub 3 months ago
    
    Exactly: without interns and juniors now you won't have mediors and seniors later.
- lumost 3 months ago
  
  My honest take? Vibe coding makes a senior engineer 2-4x more productive depending on the project. Very large projects see diminishing returns down to 0% productivity gain or negative. I can probably supervise 10 concurrent vibe coding sessions with a little thought on how to structure the tasks and code. This is like giving each of your top engineers their own staff.
  The AI coders are different from human coders in what they can and can’t do, they are both profoundly dumb - and extremely technically proficient *at the same time*.
  - __loam 3 months ago
    
    Insane take. 2-4x productive until you have to refactor or fix anything and realize you just created a 10,000 line steaming pile.
    
    mitemte 3 months ago
    
    I recently spent 2 weeks fixing a project that a senior engineer seemingly vibe coded while I was on holiday. Prior to that, their work output was excellent in terms of quality and pace.
    Those 2 weeks were absolute hell for me. I estimate I had to rewrite about 90% of the code. Everything was cobbled together and ultimately disposable. Unfortunately, this work was meant to be the first of several milestones and was completely unsalvageable as a foundation for the project.
    I'm not opposed to using AI tools, I use them myself. But being on the receiving end and having to deal with someone else's vibe coded rubbish is truly dreadful.
    
    __loam 3 months ago
    
    I am opposed to them. I'm tired of being pushed by people who don't understand the profession to use crappy tools to be unrealistically productive. I hate what their presence has done to the industry and to the expectations placed on us.
    
    monkeycantype 3 months ago
    
    I disagree with this negative take. I can use Claude to quickly explore libraries, I’m not familiar with, and have developed a development process where I describe the purpose of each class and method in a markdown file , and have Claude, Gemini, deep seek and Chat all pitch descriptions of how to implement it in shared markdown files. I correct their misconceptions and inefficiencies before any code is written, I can write this code myself, but I’m finding I can work faster like this.
    
    __loam 3 months ago
    
    So when Claude lies to you what do you do? Your workflow is crazy. Just write the fucking code.
    
    monkeycantype 2 months ago
    
    Importantly this is how I work on my own personal fun projects, so it really doesn't matter if is productive, I find it enjoyable so I'm going to keep exploring it, there will be pros and cons, don't have the final ficture on that yet
    claude can't manage the big picture of what I'm trying to achieve, and claude and the others hallucinate all the time.
    I have them all write simple tests for the code, so if they introduce me to a new library, I have tests to prove their assumptions.
    And I review everything and tweak everything.
    In my day job I work as a lead dev / arch, this isn't must different, working with Claude is like working with a large team of very inconsistent devs, with deep knowledge, but a tenous grasp on reality that struggle with attention. So not that much different from real people?
    My dad wrote code generators, back in the 70's and 80's that I did some work on early in my career, those code generators which took a high level description of a program and output mainframe code, made most of the money that paid for raising my siblings and I. From that perspective I've been roboting myself my entire professional life.
    
    conartist6 3 months ago
    
    So you've managed to ensure that the one things you never have to do is come up with any ideas. You're turning yourself into the robot
    
    misfits 2 months ago
    
    That's cool, I will try it!
  - whatevaa 3 months ago
    
    Found the manager who doesn't code.
ramesh31 3 months ago

They couldn’t care less about the tech. This was entirely about corporate accounts and user data. Windsurf caught the first wave of enterprise adoption to agentic coding IDEs, and they have tons of big customers now.
- goopthink 3 months ago
  
  No disagreement, I think developer augmentation is an amazing productization of LLMs, and will likely be better at converting enterprises so paying customers.
  But OpenAI has the best brand recognition and the largest user base, and they have the core tech powering all of this. Whats the number on “tons” of customers, given that these VSCode-spinoff/plugin GPT wrappers are sprouting around like mushrooms after a rain?
  If this is a build-vs-buy decision, $3 billion? Is that worth 1/3rd of the money in the bank when they’re burning cash at insane rates just running servers, and the rest of the $30b fundraise is tenuous and there may not be a followup? I’m skeptical of the financial decision here.
  - ramesh31 3 months ago
    
    OpenAI has massive B2C brand value, but their enterprise offerings are scant. The era of consumer chatbot AI is coming to a close, and the winner will be the one who captures the B2B mindshare for real applications.
    It’s about buying time as well. Could they put out a competitive product in less than six months? Maybe, but even that would leave them light years behind the market by the time it was ready.
- Terretta 3 months ago
  
  Interesting since OpenAI and Anthropic both still mostly refuse to talk to anyone with "corporate" (information security, AAA, audit log) needs for under 150 seats even if they're willing to pay.
  - mycall 3 months ago
    
    Microsoft, Oracle, Goole and the like are happy to handle corporate seats.
siva7 3 months ago

Vibe coding with what? o4-mini? 4.1? Their coding models are a joke and their agent coder product as well. They would have to use claude
- jononor 3 months ago
  
  There seems to be no stable consensus on which LLM one should have used, to get good results. Which is somewhat natural, things are moving quickly - and evaluation methods are immature (and the little we have, actively gamed).
  But a lot of the arguments seem on the surface to be of "No true Scotchman AI" form. Or "you are just holding it wrong" (ref Apple).
- cdelsolar 3 months ago
  
  I find 4.1 superior to Claude in my few adventures into vibe coding.
  - siva7 3 months ago
    
    Which tool do you use?
    
    cdelsolar 2 months ago
    
    Cline
LZ_Khan 3 months ago

Well I assume that's what Anthropic and Google are doing with Claude Code and Firebase Studio.
The main thing is marketing and userbase, which shouldn't be underestimated.
ZeroTalent 3 months ago

Cash is cheap. $3B isn't that much in the grand scheme of things for OpenAI and, allegedly, 1M users.
rvz 3 months ago

> Why couldn’t OpenAI vibe code their own Windsurf/Cursor competitor? (Serious question).
Because an IDE has to be tested to work and function correctly. Not "Vibe coded".
Vibe-coding is not software engineering.
It is better to build than to buy.
ivanbalepin 3 months ago

it still takes time to spec out, build and sell to a comparable size user base, even with 10 people. And you're not guaranteed the same results if you just try to clone all of that. If the price is right, why not take the shortcut.
- goopthink 3 months ago
  
  Is $3 billion the right price? That might be what Windsurf is being valued at (cue the “selling to willing buyers at currently fair market prices” meme), but that’s like saying “it would cost OpenAI more than $3b to staff from zero, build a competitor, and acquire a comparable volume of paying users” … and that feels like an insane statement given the implications therein?
  Especially given that Windsurf (and I think Cursor too) is a VSCode fork and OpenAI is cozy enough with Microsoft? It’s not even a zero to one build.
  - ivanbalepin 3 months ago
    
    if they are optimizing for cost, which they are obviously not, then of course it would take them less to build. If they are optimizing for time, and the actual aforementioned "vibe code" step may not even be the most time-consuming part, then yes, it may be the right price.
aaron695 3 months ago

[dead]
echan00 3 months ago

They could. They prob need talent though, considering how much fish they're frying
1zael 3 months ago

They are just too focused on more important problems to solve, primarily around model improvement and getting to AGI.
fullstackwife 3 months ago

Anyone can vibe code their own Windsurf/Cursor alternative, and adjust it accordingly to their needs. You don't need majority of the features, most of them are developed because of the usual "large enterprise customer wants a feature" mantra. The real value comes from the underlying model anyway.
This looks more like a short term tactical move, and the goal is to improve their "API usage" KPIs, because compared to Gemini/Claude it looks not good. (majority of traffic are chatgpt.com free users)

_jab 3 months ago

A few thoughts:

1) I agree that the moat for these companies is thin. AFAICT, auto-complete, as opposed to agentic flows, is Cursor's primary feature that attracts users. This is probably harder than the author gives it credit for; figuring out what context to provide the model is a non-obvious problem - how do you tradeoff latency and model quality? Nonetheless, it's been implemented enough times that it's mostly just down to how good is the underlying model.

2) Speaking of models, I'm not sure it's been independently benchmarked yet, but GPT 4.1 on the surface looks like a reasonable contestant to back auto-complete functionality. Varun from Windsurf was even on the GPT 4.1 announcement livestream a few days ago, so it's clear Windsurf does intend to use them.

3) This is probably a stock deal, not a cash deal. Not sure why the author is so convinced this has to be $3B in cash paid for Windsurf. AFAIK that hasn't been reported anywhere.

4) If agentic flows do take off, data becomes a more meaningful moat. Having a platform like Cursor or Windsurf enables these companies to collect telemetry about _how_ users are coding that isn't possible just from looking at the repo, the finished product. It opens up interesting opportunities for RLHF and other methods to refine agentic flows. That could be part of the appeal here.

Falimonda 3 months ago

Agentic flows will soon overtake auto-complete. Models like claude sonnet 3.5 were already good enough, albeit requiring the user to actively limit context length.
Most recently, gemini 2.5 pro makes the agentic workflow usable, and how!
- canadiantim 3 months ago
  
  What agentic workflow are you using with Gemini 2.5? Is there a Claude Code that uses Gemini 2.5?
  - boleary-gl 3 months ago
    
    There are a LOT of IDE extensions that allow for agentic workflows with Gemini 2.5 pro and flash
  - ChadNauseam 3 months ago
    
    It can be used with Aider. Although I absolutely hate the way that Gemini 2.5 writes Rust. It writes it like it's a C++ developer who skimmed the Rust reference yesterday. Perhaps it's great for other languages though
  - ramesh31 3 months ago
    
    Cline. It’s literally black magic at this point.
    
    Falimonda 3 months ago
    
    Yeah, it is! SWE in a professional setting is changed for ever whether people like it or not.
    No sane organization will accept anything less than forcing SWEs to use these tools moving forward. The details might differ and some will delay because "IP", but they'll all converge very quickly.
    Not to say there's no joy in doing it the old-fashioned way on your own time :)
  - Falimonda 3 months ago
    
    I've only used it with Cursor.
    I've seen the light!
  - greymalik 3 months ago
    
    Aider or GitHub Copilot should work.
sebastiennight 3 months ago

> Having a platform like Cursor or Windsurf enables these companies to collect telemetry about _how_ users are coding that isn't possible just from looking at the repo, the finished product.
If you are powering the API for the underlying model, then don't you know exactly how users are coding?
Since all of it is included in the context fed to your model.
- NitpickLawyer 3 months ago
  
  > If you are powering the API for the underlying model, then don't you know exactly how users are coding?
  You have the code, but not all the other signals. An easy example is "acceptance signal" where someone gets an autocompletion and accepts it / rejects it. It can get more complicated with the /architect mode and so on, but there's probably lots of signals that you can get from an IDE that you can't just by serving API responses.
theahura 3 months ago

I mention in the footnotes that this is likely a stock deal!
I didn't think about telemetry for RL, that's very interesting
calmoo 3 months ago

What source do you have for 1)? From what I see and from my own and usage of people I know, the main feature people use in Cursor is the Agentic stuff, writing code for you and just clicking 'accept'. Cursor tab is a nice to have but IMO not the primary reason to use it.
peed 3 months ago

1) is out-of-date. Cursor started with auto-complete, but is all about agentic flows now.
zombiwoof 3 months ago

OpenAI realizes foundation models are a very expensive commodity.
They need to buy revenue streams fast
XAI has Twitter (ads) Meta has well everything
It’s why OpenAI will build a social network or search engine and get into Ads ASAP
dzonga 3 months ago

data has always been the greatest moat.
some of the oldest companies are data companies think your credit rating provider, your business verifier. your american express.
imtringued 3 months ago

If it's a stock deal that's even worse, since OpenAI is saying that their stock is definitely worth less than $3 billion.

kace91 3 months ago

>Some are better at the auto complete (Copilot), others at the agent flow (Claude Code). Some aim to be the best for non-technical people (Bolt or Replit), others for large enterprises (again, Copilot). Still, all of this "differentiation" ends up making a 1-2% difference in product. In fact, I can't stress enough how much the UX and core functionality of these tools is essentially identical.

Is this exclusively referring to the ux or full functionality?

Because I can tell you straight away that cursor (Claude) vs copilot is not a 1% difference. Most people in my company pay their own cursor license even though we have copilot for available for free.

Jcampuzano2 3 months ago

Agreed, although we're strictly prohibited from using cursor at work in enterprise, though they have been in discussion for an enterprise license.
I use cursor for personal work though and it's night and day, even with the recent copilot agent mode additions. I told my CTO who asked about it if we should look into cursor and I told him straight up that in comparison copilot is basically useless.
- nativeit 3 months ago
  
  What are the most dramatic differences?
  - aerhardt 3 months ago
    
    I don't use Cursor (I'm on Jetbrains) but from what I read it must be the autocomplete. Github Copilot is literally unusable. I've tried it many times in the last month and everything it suggests is stupid and utterly wrong. The suggestions are not subtly wrong - they often have nothing to do with what I'm working on. It wasn't this bad at the beginning. Mind you, I code mainly in Python, pretty common stuff most of the time.
    I keep reading here that Cursor has great autocomplete so we could be talking about a 1000% improvement compared to Copilot rather than 1% as on of the other commenters is positing.
    
    wavemode 3 months ago
    
    Personally I've seen Cursor make boneheaded autocomplete suggestions at roughly the same rate that Copilot does. And Cursor was also buggy in other annoying ways, and its "jump" suggestions (where it tries to predict where to move your cursor next) were usually stupid. So I ultimately switched back to Copilot.
    
    aerhardt 3 months ago
    
    I wouldn't qualify the suggestions I'm getting as boneheaded, but rather disastrous. I have completions turned off by default because they're so stupid it's unbearable. But then maybe a few times a month I'm low on energy and faced with the typical repetitive task: fill out this large dictionary with predictable values, type these classes which are variations of another, write some very grindy tests, what-have-you. I turn it on in the hopes that it will save me some work, see what it's suggesting, and turn it off back again in seconds. It literally cannot pick up on blatantly predictable local patterns anymore. I don't know what's happened to it because it used to be able to at least do that.
    
    selcuka 3 months ago
    
    Interesting. I'm on Jetbrains+Copilot and I find that the autocomplete is useful 90% of the time.
    I'm using the Claude backend instead of the default (OpenAI). I wonder if that makes the difference.
    
    aerhardt 2 months ago
    
    I only have the 4o option for completions. Like, I see the option in the settings but the dropdown only produces 4o for me. I seem to remember that in github.com I have all the models exposed, but in PyCharm I only get that for autocomplete. I do get a wider model choice in the chat pane. I'll look into it.
    
    selcuka 2 months ago
    
    I stand corrected. Claude is enabled on my GitHub.com settings but it's not yet available to PyCharm, so I'm using GPT-4o as well. It's a known issue:
    https://github.com/orgs/community/discussions/143139
theahura 3 months ago

Can you say more?
I was referring to UX, as that is the main product. Cursor isn't providing their own models, or at least most people that I'm aware of are bringing their own keys.
I haven't used copilot extensively but my understanding is that they now have feature parity at the IDE level, but the underlying models aren't as good.
- kace91 3 months ago
  
  >Can you say more?
  My experience is that copilot is basically a better autocomplete, but anything beyond a three liner will deviate from current context making the answer useless - not following the codebase's convention, using packages that aren't present, not seeing the big picture, and so on.
  In contrast, cursor is eerily aware of its surroundings, being able to point out that your choice of naming conflicts with somewhere else, that your test is failing because of a weird config in a completely different place leaking to your suite, and so on.
  I use cursor without bringing my own keys, so it defaults to claude-3.5-sonnet. I always use it in composer mode. Though I can't tell you with full certainty the reasons for its better performance, I strongly suspect it's related to how it searches the codebase for context to provide the model with.
  It gets to the point that I'm frequently starting tasks by dropping a Jira description with some extra info to it directly and watching it work. It won't do the job by itself in one shot, but it will surface entry points, issues and small details in such a way that it's more useful to start there than from a blank slate, which is already a big plus.
  It can also be used as a rubber duck colleague asking it whether a design is good, potential for refactorings, bottlenecks, boy scouting and so on.
- rfoo 3 months ago
  
  > Cursor isn't providing their own models
  For use cases demanding the most intelligent model, yes they aren't.
  However, there are cases that you just can't use best models due to latency. For example next edit prediction, and applying diffs [0] generated by the super intelligent model you decided to use. AFAIK, Cursor does use their own model for these, which is why you can't use Cursor without paying them $20/mo even if you bring your own Anthropic API key. Applying what Claude generated in Copilot is just so painfully slow to the point that I just don't want to use it.
  If you tried Cursor early on, I recommend you update your prior now. Cursor had been redesigned about a year ago, and it is a completely different product compared to what they first released 2 years ago.
  [0] We may not need a model to apply diff soon, as Aider leaderboard shows, recent models started to be able to generate perfect diff that actually applies.
  - theahura 3 months ago
    
    (I most recently used cursor in October before switching to Avante, so I suspect I've experienced the version of the tool you're talking about. I mostly didn't use the autocomplete, I mostly used the chat-q&a sidebar.)
    
    rfoo 3 months ago
    
    And I pay Cursor only for autocomplete - this explains the difference I guess.
    I do sometimes use Composer (or Agent in recent versions), but it's being increasingly less useful in my case. Not sure why :(
    
    nicoritschel 3 months ago
    
    The redesign was ~5 months ago. If you switched in October, you 100% have not used the current Cursor experience.
tyleo 3 months ago

Agreed, I’ve used both at work and personal projects. Copilot auto complete is great but it isn’t ground breaking. Cursor has built near entire features for me.
I think copilot could get there TBH. I love most Microsoft dev tools and IDEs. But it really isn’t there yet in my opinion.
mellosouls 3 months ago

cursor (Claude) vs copilot is not a 1% difference
This is true, but as a user of both and champion of Cursor - VS Code Copilot is quickly catching up.

OxfordOutlander 3 months ago

It is a talent and a distribution play. Talent: obvious.

Distribution: OpenAI believes the marginal token they sell will be accretive to their bottom line, so the goal then is to deliver as many tokens as possible. Windsurf already has 1k+ enterprise logos and allegedly millions of downloads. 2 m tokens × $0.00001 gross / token = $20/seat/mo; if windsrf runs 500k seats, oai books $120 m/yr gross @ 90% margin.

I saw a similar dynamic play out in the UK with Pub (bar) companies. By mid-2000s, the major players were failing. Margins were nearly zero, thanks to rising costs, and securlar decline in demand, plus they had too much expensive term debt.

But they represented profitable sources of distribution for the beer makers. So Heineken went on a buying spree. They didn't care about making money from the pubs themselves and were happy to run them break-even. This is because they then had a controlled channel of distribution for their beer (and they made a profit on every pint they shipped).

The switching costs are very different here, and the market is still so nascent. It is a thin product and vscode‑copilot can catchup. But 1% of enterprise value ($3bn of $300bn) is not a lot to gamble on owning the #2 horse in the most promising AI end market today.

theahura 3 months ago

I mention distribution in the post, and reasons to be skeptical of that as the primary driver (though I agree that may be the case)
Re talent, I'm not sure how big windsurf is, but aren't these teams generally quite small? $3b for a small team still seems quite high, especially since (afaik) their core area of expertise is more in UX and product than in ml research. That's not to say that UX and product aren't worth acquiring, just that the price tag is surprising if that's the primary justification.
- OxfordOutlander 3 months ago
  
  Given Sam recently said he thinks consumer is going to be the valuable path, perhaps it is not too much to pay for a great ux/product team
  "Ben Thompson: What’s going to be more valuable in five years? A 1-billion daily active user destination site that doesn’t have to do customer acquisition, or the state-of-the-art model?
  Sam Altman: The 1-billion user site I think."
benjaminwootton 3 months ago

What’s the net margin? Is anyone making money on inference yet?

WA 3 months ago

Ironically, 3 billion is proof that these tools do not work as expected and won’t replace coders in the near future.

Otherwise, why spend 3 billion if you could have it cooked up by an AI coding agent for (almost) free?

JambalayaJimbo 3 months ago

Existing customer base?
- arczyx 3 months ago
  
  OpenAI has way more users and brand recognition than Windsurf. If they decide to make their own code editor and marketed it, I'm pretty sure its customer base will surpass Windsurf's relatively quickly.

mrcwinn 3 months ago

This is poor analysis. It claims OpenAI is spending “3 of its 40” billion in raised capital on Windsurf. Who said this was an all cash deal?

And so if you’re purchasing with equity in whole or in part, the critical question is, do you believe this product could be worth more than $3b in the future? That’s not at all a stretch.

Cursor is awfully cozy with Anthropic, as well, and so if I’m OpenAI, I don’t mind having a competitive product inserted into this space. This space, by the way, that is at the forefront of demonstrating real value creation atop your platform.

theahura 3 months ago

(I mention in the footnotes that this is likely a stock deal!)

g8oz 3 months ago

Reminds me of Snowflake purchasing Streamlit. A sign of a big wallet and slowing internal execution on the part of the purchaser rather than an indication of the compelling nature of the acquisition.

ramraj07 3 months ago

800 million for streamlit is still the most mind-blowing acquisition story I've heard. Codeium being a few bill sounds reasonable for that.
mosdl 3 months ago

The snowflake marketplace was/is such a mess. I always wondered what caused them to choose streamlit.

firesteelrain 3 months ago

Windsurf/Codeium has an enterprise version that can be used by corporations to provide AI assisted coding environments using their own HW stack (non cloud). This is beneficial for privacy and proprietary reasons especially if your data cannot be exfiltrated off premises. The hardware recommended to run Codeium is a lot cheaper than if you were to have 700 developers generate tokens. This model has the chance to generate many paying customers. Whether that has a $40b market cap is unclear

mbreese 3 months ago

I don’t think the utility of Windsurf was the question. There is clearly a benefit for a tool/service like this.
The questions raised by the article (as I saw it) were price and timing. $3B is a lot. Is that overpaying for something with a known value but limited reach? Not to mention competitors with deep pockets. And the other question is - why now? What was to be gained by OpenAI by buying Windsuf now.
- firesteelrain 3 months ago
  
  It’s a Copilot competitor and it’s used by Zillow, Dell, and Anduril (newish Defense company). Cursor can’t work in airgapped environments right now. I don’t know what Codeium charges to run an on prem licensed version but they boast over 1000 enterprise customers. Codeium is on a rapid growth trajectory from $1.25b to $2.85b in such a short period.
  Codeium can be fine tuned. Though it’s trained on similar open source it does provide assurances that they do not inadvertently train on wrongly licensed software code.
  https://windsurf.com/blog/copilot-trains-on-gpl-codeium-does...
theahura 3 months ago

Thanks, I had a feeling it may be something like this since it seemed like they were investing more in enterprise. That said, do they do better than copilot on this? Surely msft has more experience and ability to execute in that market?
- rfoo 3 months ago
  
  Codeium's completion model is better than whatever GitHub Copilot has. For me it's Cursor > Codeium >>> Copilot. Yes, Copilot is that bad.
  And yes Codeium/Windsurf focuses on enterprise customers more. As GP said they have an on-prem [0], a hybrid SaaS offering and enterprise features that just make sense (e.g. pooled credits). Their support team is more responsive (compared to Anysphere). Windsurf also "feels" more finished than Cursor.
  [0] but ultimately if you want to "vibe-coding" you have to call Claude API
  - theahura 3 months ago
    
    Ok thanks, that was my follow-up -- I assumed that airgap implementations are significantly worse because they can't back into Claude or Gemini
- firesteelrain 3 months ago
  
  It’s a Copilot competitor and it’s used by Zillow, Dell, and Anduril (newish Defense company). Cursor can’t work in airgapped environments right now. I don’t know what Codeium charges to run an on prem licensed version but they boast over 1000 enterprise customers. Codeium is on a rapid growth trajectory from $1.25b to $2.85b in such a short period.
  Codeium can be fine tuned. Though it’s trained on similar open source it does provide assurances that they do not inadvertently train on wrongly licensed software code.
  https://windsurf.com/blog/copilot-trains-on-gpl-codeium-does...

dstroot 3 months ago

Currently using Claude code and Cursor, but VSCode is copying Cursor rapidly. Not sure if the VSCode forks will survive. Ideally we’d have VSCode with a robust agent capability and a fully open “bring your own LLM” feature.

danny_codes 3 months ago

Never used cursor but this seems like easy pickings. Code already has extensions that do exactly this.
- nsonha 3 months ago
  
  Cursor was way, way better than Copilot before agent mode, which has just been released this month. Not to mention it has have MCP for quite awhile before Copilot.

tagalog 3 months ago

OpenAI probably has 3 main reasons

1. Opportunity cost of shifting their team to work on a cursor/ firebase studio/ windsurf clone while they are locked into a model arms race might be worth much more than $3 billion if risk losing supremacy

2. They get a million users day one of the deal getting done and instantly own the team + the number 2 best IDE which gives them a massive boost over all their other model competitors like Google and Anthropic. Personally, I still think we are early enough that building makes more sense, so reason 1 still feels the strongest.

3. You get the dedicated windsurf team that can focus 100% on this product while your main model teams can keep cranking

machiaweliczny 3 months ago

IMO people underestimate effort of hiring people with domain expertise for this. Maybe they can try to buy one person from these teams and then have it help build the core IDE team but this will take much longer than just buying a very strong team.

parsabg 3 months ago

> Windsurf is providing OpenAI access to data? There's certainly some possibility that this is the case — though it makes me wonder just how bad OpenAI's relationship with Microsoft has gotten if they no longer have access to GitHub

This is missing an important nuance. The valuable data generated through coding copilots isn't the code, but the step by step human-AI interaction to produce the code.

Windsurf and Cursor are effectively crowdsourced data annotation farms which give their owners an edge in providing better coding models. I think this is way more important than talent or software in this deal.

dang 3 months ago

Recent and related:

OpenAI looked at buying Cursor creator before turning to Windsurf - https://news.ycombinator.com/item?id=43716856 - April 2025 (115 comments)

OpenAI in Talks to Buy Windsurf for About $3B - https://news.ycombinator.com/item?id=43708725 - April 2025 (44 comments)

ramoz 3 months ago

It’s a vehicle that can hit the enterprise, broad user base, training data, and gain coverage of a competitor market (I’m sure the primary LLM in windsurf is Claude just like it is in cursor).

Beyond that, these IDEs have a potential path to “vibe coding for everyone” and could possibly represent the next generation of general office tooling. Might as well start with a dedicated product team vs spinning up a new one.

phillipcarter 3 months ago

To me it's fairly straightforward.

OpenAI is predominantly a consumer AI company. Anthropic has also won over developer hearts and minds since Claude 3.5. Developers are also, proportionally, the largest uses of AI in an enterprise setting. OpenAI does not want to be pigeonholed into being the "ChatGPT company". And money spent now is a lot cheaper than money spent later.

But this is all just speculation anyways.

rvz 3 months ago

Because Cursor got too greedy.

Before approaching Windsurf, OpenAI wanted to buy Cursor (which is what I predicted thought too [0]) first, then the talks failed twice! [1]

The fact they approached Cursor more than once tells you they REALLY wanted to buyout Cursor. But Cursor wanted more and were raising over $10B.

Instead OpenAI went to Windsurf. The team at Windsurf should think carefully and they should sell because of the extreme competition, overvaluation and the current AI hype cycle.

Both Windsurf and Cursor’s revenue can evaporate very quickly. Don’t get greedy like Cursor.

[0] https://news.ycombinator.com/item?id=43708867

[1] https://techcrunch.com/2025/04/17/openai-pursued-cursor-make...

consumer451 3 months ago

For anyone interested, here is an interview with Windsurf CEO and co-founder, Varun Mohan. It was released today. Not sure if it covers the potential acquisition, though I imagine not.

https://www.youtube.com/watch?v=5Z0RCxDZdrE

theahura 3 months ago

Thanks for sharing, super interesting!

yumraj 3 months ago

Prediction:

OpenAI will buy Windsurf and then make it free with one of the cheaper OpenAI plans, effectively trying to kill the other IDEs and getting access to data which helps it compete/better against Claude and Gemini.

Google needs to launch its equivalent, and Anthropic needs to figure out their plan.

gman83 3 months ago

Google already did: https://firebase.studio/
- yumraj 3 months ago
  
  Not apples to apples, isn’t Firebase studio tied to Firebace, and not general purpose development?
  TBH, I didn’t look deeper since I made the above assumption based on the name.
  - gman83 3 months ago
    
    No, it's just a rebranded version of Project IDX, their cloud-based VS Code fork. No need to use Firebase, that's just the branding.
    
    yumraj 3 months ago
    
    will take a look, thx.

croes 3 months ago

Can’t be for the software because they claim AI can do it too, so $3B should be more than enough to write it from scratch.

jchonphoenix 3 months ago

My guess is the telemetry data.

OAI spends gobs of money on Mercor and Windsurf telemetry gets them similar data. My guess is they saw their Mercor spend hitting close to 1B a year in the next 5 years if they did nothing to curb it

apples_oranges 3 months ago

Isn’t the usual reason the people that work there?

captn3m0 3 months ago

> The worst case scenario for Apple is they decide to use user data late.

Given how heavily Apple has leaned into E2E over the years, I don't see this happening at all, beyond local on-device stuff.

cheriot 3 months ago

Coding assistants are likely to be a major profit source in the near term. It's useful today and the customers (ie employers) are willing to pay for productivity.

Only the beginning of the enterprise spending more on AI than consumers. When it's time to stop giving away dollars for pennies, everyone will want to be in the enterprise.

The article rightly points out that OAI is not cost effective for code right now. I imagine they will optimize for the use case now.

seaourfreed 3 months ago

I think the defensible business models in AI are up-the-stack. Windsurf category is one example. There are more.

AI will lead to far bigger work accomplished than one prompt or chat at a time. Bigger work flows on humans upgrading and interacting with AI will be a big critical category for that.

xnx 3 months ago

The $3B number is largely and a marketing move to show what a big/real/important company OpenAI is. I hope Windsurf got some real money out of the deal too. If ChatGPT disappeared tomorrow, people would just move to the next model.

stevenjgarner 3 months ago

The article opines "though it makes me wonder just how bad OpenAI's relationship with Microsoft has gotten if they no longer have access to GitHub". Is their relationship known to be souring?

stevenjgarner 3 months ago

Just learned that 'Sam Altman says OpenAI is no longer "compute-constrained" — after Microsoft lost its exclusive cloud provider status':
https://news.ycombinator.com/item?id=43711688

memset 3 months ago

Not stated in the article - and doesn’t necessarily e explain the price tag - but windsurf if simply better than cursor or Claude code. Otherwise people wouldn’t be switching to it.

AndrewKemendo 3 months ago

I was really liking windsurf but need to look for another option now unfortunately.

It’s a shame we can’t have anything nice not get consumed but - such is the world.

uxcolumbo 3 months ago

Same here. I doubt they'll keep the option of letting me choose what model to use. Can't trust OpenAI.
sexy_seedbox 3 months ago

RooCode
- AndrewKemendo 3 months ago
  
  Thanks that looks promising

dangus 3 months ago

> I've always been a staunch defender of capitalism and free markets, even though that's historically been an unpopular opinion in my particular social circle. Watching the LLM market, I can't help but feel extremely vindicated. Over the last 5 years, the cost per token has been driven down relentlessly even as model quality has skyrocketed. The brutal and bruising competition between the tech giants has left nothing but riches for the average consumer.

There's a rich irony to be saying this right after explaining how Google is dominating the market and how they're involved in an antitrust lawsuit for alleged illegal monopolistic practices.

And of course this willfully ignores the phase of capitalism we are in with the AI market right now. We all know how the story will end. Over time, AI companies will inevitably merge and the products will eventually enshittify. As companies like OpenAI look to exit they will go public or be acquired and need to greatly trim the fat in order to become profitable long-term.

We'll start seeing AI products incorporate things like advertising, raise their prices, and every other negative end state we've seen with every other new technology landscape. E.g., When I get a ride from Uber they literally display ads to me while I'm waiting for my vehicle. They didn't do that when they were okay with losing moeny.

And of course, "free market" capitalism isn't really free market at all in an enviornment where there are random tariffs being applied and removed on a whim to random countries.

I really don't understand why people feel like they need to defend capitalism like this. Capitalism doesn’t need a defender, if anything it constantly needs people restraining it.

probably_wrong 3 months ago

I had a similar thought when I reached the part about Apple. A system that punishes the player respects their user's privacy while rewarding those that take everything that isn't nailed down is not a good system.
The author frames Apple's choice as an own goal, but I'd rather see it as putting the failings of capitalism on display.

JumpCrisscross 3 months ago

“it's not yet clear OpenAI actually has $40bn to spend”

Do we have confirmation it was a cash transaction?

farhantawfeeq56 2 months ago

Because even vibe coding is hard

whippymopp 3 months ago

if you look closely at the communications coming out of windsurf, I think it’s pretty obvious that the deal is not happening.

dagorenouf 3 months ago

where did you see this?
- whippymopp 3 months ago
  
  check the windsurf subreddit. the official reps have repeatedly said it’s pure speculation
  - TiredOfLife 3 months ago
    
    Google Stadia team were saying they are not shuting down minutes before they were shut down.
  - disgruntledphd2 3 months ago
    
    They have to say that, even if the deal is real. They might not even have been told.
  - rvz 3 months ago
    
    >pure speculation
    Or called plausible deniability. They will always deny these reports.
    At the end of the day, Windsurf has a private price tag which they know they will sell at.
    If they were smart, they should consider selling the hype.

bionhoward 3 months ago

To take your codebase and make you dependent

elicksaur 3 months ago

>OpenAI has also announced a social media project

I haven’t heard about this before this post, but if they’re starting a “Social Media but with AI” site in 2025, can’t help but feel like they’re cooked.

soared 3 months ago

I’ve switched off of chatGPT for general use from a kind of moral/ethical standpoint. All the competitors are effectively the same for easy research questions, so I might as well use a vendor who’s not potentially a scumbag.

AndrewKemendo 3 months ago

Which one did you move to?
I haven’t found as good of an turnkey chat/search/gen interface as CGPT yet unfortunately.
Even self hosted deepseek on an Ada machine doesn’t get there cause the open source interfaces are still bad
- soared 3 months ago
  
  Gemini primarily - but I’m using it for help with house projects, landscaping, shopping, etc and not for coding. Not not a scumbag owner but feels better than OpenAI.
  - itwillnotbeasy 3 months ago
    
    Weird choice, given Google’s track record on privacy. Gemini uses your chats for training and human review them by default. You can opt out by turning off App Activity, but that disables chat history and integrations, which kinda kinda makes it useless. Claude respects your privacy by default and ChatGPT and Grok have an opt-out that doesn’t totally cripple the app.
    So what makes Google less scummy? A cheaper price? I’m sure that won’t last long – they’ve played this game before with YouTube, Google Reader, and Search: hook users, dominate, then enshittify. Same old Google playbook: good → monopoly → crap.
dangus 3 months ago

Which vendor isn't a run scumbag or owned by a scumbag?
- trollbridge 3 months ago
  
  I’ve been using Grok (for free), so in theory I’m getting a vendor to spend money on me.
  - dangus 3 months ago
    
    But they can count you as a user and that positively impacts their valuation.
    
    nativeit 3 months ago
    
    One of the many inverted incentives in this space, considering every user Grok counts is actively burning through their cash.
  - nsonha 3 months ago
    
    Grok one of the two big brands of model associated with a social network? They're not qualified to even be considered.
    
    trollbridge 2 months ago
    
    Grok works fine for generating images, summarising documentation etc., and surprisingly most stuff is now easily usable on the free tier.
  - asadotzler 3 months ago
    
    >Which vendor isn't a run scumbag or owned by a scumbag? >>I’ve been using Grok
    The biggest scumbag of them all, but hey "I use it for free."

revskill 3 months ago

They should build ide using their ai. Lol they are jst bullshit generators.

candiddevmike 3 months ago

I predict we will hit peak vibe coding by this summer. The tooling can't be sold at a loss forever/costs will go up for all sorts of reasons, and I think the tech debt generated by the tooling will eventually be recognized by management as velocity/error quotas start to inverse. I don't think self-driving developers will happen in time, and another AI winter will settle in with the upcoming recession.

MostlyStable 3 months ago

Ok, I'm going to make a slightly controversial statement: Vibe coding is both A)potentially hugely important and transofrmative and B)massively good.
Most of the criticisms of vibe coding are coming from SWEs who work on large, complicated codebases whose products are used by lots of people and for whom security, edge cases, maintainability, etc. are extremely important considerations. In this context, vibe coding is obviously a horrible idea, and we are pretty far away from AI being able to do more than slightly assist around the edges.
But small, bespoke scripts that will be used by exactly 1 person and are whose outputs are easily verified are actually _hugely_ important. Millions of things are probably done every single day where, if the person doing it had the skill to write up a small script, it would be massively sped up. But most people don't have that skill, and it's too expensive/there is too much friction to hire an actual programmer to solve it. AI can do these things.
Each specific instance isn't a big deal, and won't make much productivity difference, but in aggregate, the potential gains are massive, and AI is already far more than good enough to be completely creating these kinds of scripts. It is just going to take people a while to shift their perspective and start asking about what small tasks they do every day that could be scripted.
This is the true potential of "vibe coding". Someone who can't program, but knows what they need (and how to verify that it works), making something for their personal use.
- aerhardt 3 months ago
  
  I work in Enterprise IT, I'm the CTO of a small business. People executing small scripts all over the place sounds to me like the stuff of nightmares. Especially locally, but even if it's run online in a safer environment, as long as it causes side effects in other systems, the risk is massive. And if it doesn't cause side effects, well, chances are the program is worthless.
  Also, I've already seen this story play out with low-code. The pitch was the same, nearly line by line: "citizen developers will be able to solve so many problems by themselves at the edges of enterprise". It didn't take. Most users could not learn the technologies, but much more importantly: they did not even know how to verbalize their requirements. Take my word for it, most people do not want to design solutions, and of those that are willing, only a small subset is able.
  - rfoo 3 months ago
    
    > I've already seen this story play out with low-code. The pitch was the same
    That's the point. Unlike low-code or no-code bullshit, this somehow worked. People "magically" want to design solutions, and are able to, now.
    > People executing small scripts all over the place sounds to me like the stuff of nightmares.
    Me too. However, watching how eager people want to (and they indeed can!) make progress with these LLM generated horror, I believe it's time to give up and start designing secure systems despite the existence of massive slightly broken scripts.
    
    aerhardt 3 months ago
    
    > People "magically" want to design solutions, and are able to, now.
    I agree that LLMs lower the barrier significantly compared to low-code/no-code, but most people are not able to design solutions because they lack the business analysis skills and are not detailed-oriented enough to follow through with the specification of requirements. Let's not even talk about the discipline to carry out maintenance over a working project in the face of changing requirements.
    Even if we agree that LLMs move a lot of the work up the stack towards business analysis / product ownership / solution design, my experience in Enterprise IT in companies ranging from small to gigantic is that users do not magically become BAs / POs / PMs. There's a reason those are professionalized and specialized roles.
    I wouldn't mind being proven wrong, it's not like I feel personally threatened or anything. I feel it's the integrity of the systems I oversee that would be threatened.
    > I believe it's time to give up and start designing secure systems
    OK, well I'm not going to bear that responsibility for that, I have enough on my plate as it is. I'm not allowing an arbitrary sales rep to interact with our production Salesforce instance by automated means, period. Even if they have the proper permission levels configured to a tee in Salesforce, I can think of a thousand ways they could badly mess up their own slice of data. Interacting with the local machine: also potentially a supermassive black hole of vulnerabilities. Some of them possibly more serious than data loss, such as the syphoning of data to malicious actors.
    If someone can think of secure ways for citizen devs to interact with critical enterprise systems via scripting, then fine. I'll sit here waiting!
  - danny_codes 3 months ago
    
    Hmm I feel like this could work but we (SWE professionals) would need to provide a secondary private API surface that assumes the user is hostile. IE if you want to automate something that interacts with any private software it’d have to be treated like an adversary.
    I guess in the long run it could become worthwhile, but it’ll be a long road to get there
- kjellsbells 3 months ago
  
  > This is the true potential of "vibe coding". Someone who can't program, but knows what they need
  I would argue that the real money, and the gap right now, is in vibe tasking, not vibe coding.
  There are millions of knowledge workers for whom the ability to synthesize and manipulate office artifacts (excel sheets, salesforce objects, emails, tableau reports, etc) is critical. There are also lots of employees who recognise that a lot of these tasks are "bullshit jobs", and a lot of employers that would like nothing more than to automate them away. Companies like Appian try to convince CEOs that digital process automation can solve this problem, but the difficult reality is that these tasks also require a bit of flexible thinking ("what do I put in my report if the TPS data from Gary doesnt show up in time?"). This is a far bigger and more lucrative market than the one made of people who need quick and dirty apps or scripts.
  It's also one that has had several attempts over the years to solve it. Somewhere between "keyboard automation" (macro recording, AutoHotKey type stuff) and "citizen programming" (VB type tools, power automate) and "application oriented LLM" (copilot for excel, etc) there is a killer product and a vast market waiting to escape.
  Amusingly, in my own experience, the major corps in the IT domain (msft, salesforce, etc etc) all seem to be determined to silo the experience, so that the conversational LLM interface only works inside their universe. Which perhaps is the reason why vibe tasking hasnt succeeded yet. Perhaps MCP or an MCP marketplace will force a degree of openness, but it's too early to say.
- luckylion 3 months ago
  
  > But small, bespoke scripts that will be used by exactly 1 person and are whose outputs are easily verified are actually _hugely_ important.
  Are they easily verified though?
  I have a bunch of people who are "vibe coding" in non-dev departments. It's amazing that it allows them to do things they otherwise couldn't, but I don't think it's accurate to say it's easily verified, unless we're talking about the most trivial tasks ("count the words in this text").
  As soon as it gets a bit more complex (but far from "complex"), it's no longer verifiable for them except "the output looks kinda like what I expected". Might still be useful for things, but how much weight do you want to put on your sales-analysis if you've verified its accuracy by "looks intuitively correct"?
- theahura 3 months ago
  
  Strong plus one. This is more or less what my company is working on -- more and more ostensibly nontechnical people are able to contribute to codebases with seasoned engineers.
  - pandemic_region 3 months ago
    
    > ostensibly nontechnical people are able to contribute to codebases with seasoned engineers.
    Who is the contributor then? The AI or the prompt writer?
    I mean I'd be more at ease if they would just contribute their prompt instead. And then, what value does that actually have? So many mixed feelings here.
    At work I had a React dev merging Java code into a rather complex project. It was clearly heavily prompt assisted, and looked like the code the junior Java developer would have written. The difference is that the junior Java developer probably would have sweated a couple of days over that code, so she would know it inside out and could maintain it. The React dev would just write more prompts or ask the AI to do it.
    If we're confident that prompting creates good code and solid projects, well then we don't need expensive developers anymore do we?
- abxyz 3 months ago
  
  I am very much in favor of anything that makes software engineering more accessible. I have no principled objection to vibe coding. The problem I have with vibe coding is practical: it's producing more low quality code rather than allowing people to achieve more than they previously could.
  Almost everything I've seen achieved with vibe coding so far has been long since achievable with low / no code platforms. There is a great deal of value in the freedom that vibe coding gives (and for that reason, I am in favor of it) but the missing piece of this criticism of the criticism is that vibe coding is not the only way to write these simple scripts and it is the least reliable way.
  Vibe coding as the future is an uninspired vision of the future. The future is less code, not more.
  - gerad 3 months ago
    
    After watching a sales person and a PM vibe code I’ll say that existing developers are not the initial for vibe coding. Vibe coding absolutely allows non-devs to achieve more than they previously could.
    
    abxyz 3 months ago
    
    The issue is one of education, not possibility. There is so much hype around vibe coding that it has penetrated non-technical circles and given non-technical people the confidence to try and make things. The same people could use Zapier or Airtable or Tally or Retool or Bubble or n8n to achieve their goals but they didn't have the confidence to do so, or knowledge of the tooling.
  - sebastiennight 3 months ago
    
    > Almost everything I've seen achieved with vibe coding so far has been long since achievable with low / no code platforms.
    I tried showing someone Bubble as a solution to a fairly simple workflow (one form input, a few LLM calls, an output) that they should have been able to build in minutes.
    It was a horrific experience, as Bubble can't find the right balance between being "no-code" (simple) and "powerful enough" with customizations. The end result is that in my experience a non-technical person just can't get workflows to work in Bubble without investing a massive number of hours.
    In comparison, getting this done with either vanilla Claude.ai or Lovable is a *single* prompt. Just one!
    I think you are deeply underestimating the difference for true non-technical users. Or even bored/overwhelmed technical users who need something done that's non-mission-critical and would just never get done if it took hours of messing with a no-code GUI, vs. dictating it through whisper and letting the AI agent go at it.
- croes 3 months ago
  
  Here is my controversial statement:
  Vibe coding is coding like a customer hiring a programmer is coding.
  If all the code is written by AI it isn’t coding at all, it’s ordering.
- candiddevmike 3 months ago
  
  I don't think that's a $3B market.
  - consumer451 3 months ago
    
    I dislike the term, but eventually "vibe coding" should replace many existing no-code/low-code platforms, right? I see that as nearly guaranteed, for many use cases.
    > Low-code and no-code development platforms allow professional as well as citizen developers to quickly and efficiently create applications in a visual software development environment. The fact that little to no coding experience is required to build applications that may be used to resolve business issues underlines the value of the technology for organizations worldwide. Unsurprisingly, the global low-code platform market is forecast to amount to approximately 65 billion U.S. dollars by 2027. [0]
    We could argue about the exact no-code TAM, but if you have a decent chance to create the market leader for the no-code replacement, $3B seems fair, doesn't it?
    [0] https://www.statista.com/topics/8461/low-code-and-no-code-pl...
    
    abxyz 3 months ago
    
    I disagree, it's the opposite. Low code / no code is valuable because you're deferring responsibility to a system that is developed and maintained by experts. A task running once a day on Zapier is orders of magnitude better for a business than the same task being built by someone on the marketing team with vibe coding. Low code / no code platforms have a very bright future, because they can leverage LLMs to help people create tasks with ease that are also reliable.
    LLM-enabled Zapier or Make or n8n is the future, not everyone churning out Claude-written NextJS app after NextJS app.
    
    consumer451 3 months ago
    
    Yeah, I don't disagree with you at all. I almost wrote a longer and more nuanced comment to begin with. For one, low-code and no-code are actually two different very things.
    There are many use cases for low-code. The two major ones I've dealt with are MVPs where tools like Bubble are used, and the other is creating corporate internal tools, where MS Power Platform is common.
    Corporate IT departments are allergic to custom web apps, and have a much easier time getting a Power Platform project approved due to its easily understood security implications. That low-code use case is certainly going to be the last thing a tool like Windsurf conquers.
    However, even without that use case, in an AI-heavy investment environment, $3B doesn't seem all that bad to me. However, I have zero experience with M&A.
  - MostlyStable 3 months ago
    
    I actually literally think that these small scripts, if widely applied, are far, far more bigger than that. Even solely in the US, let alone globally. I'm also relatively certain that, if they weren't sinking money into research etc, subscriptions on inference are probably already profitable. These companies are burning money because they are investing in research, not because $10/month doesn't cover the average inference costs. Although I'd love to find a better source than the speculative ones I've seen about it.
- orbital-decay 3 months ago
  
  Malleable software. This all reminds me of personal computing in 80's with BASIC in every machine, and environments like Emacs that are built for that.
  I think LLMs have a much better chance at this kind of software than Emacs or BASIC, but I also doubt it has any future: once AI is capable enough, you can just hide the programmatic layer entirely and tell the computer what to do.
ToValueFunfetti 3 months ago

I've seen a lot of AI hype, but "AI will make management recognize that tech debt is important" takes the cake. Maybe in 2040
- Magma7404 3 months ago
  
  Management realizing and saying publicly that they made a mistake? Maybe in 3025.
- tyre 3 months ago
  
  I hope you’re able to find good managers. I prioritize paying down tech debt over feature development regularly, because it makes business sense.
  Like even in a cold capitalist analysis, the benefits to developer velocity, ease of new feature development, incident response, stability, customer trust, etc.
  It doesn’t always; there are certainly areas of tech debt that bother me personally but I know aren’t worth the ROI to clean up. These become weekend projects if I want a fun win in my life, but nothing terrible happens if there’s a little friction.
  - huntertwo 3 months ago
    
    How? I find it hard for my team to reduce tech debt as an OKR since other feature work is 1) sexier for engineers to work on 2) easier to put concrete value on. Everybody agrees in principle that tech debt is bad
    
    tyre 3 months ago
    
    Great question. It depends on why you want to kill it.
    Sometimes it’s because there are regular bugs and on-call becomes a drag on velocity.
    Sometimes making code changes is difficult and there’s only one person who knows what going on, so you either have a bus factor risk or it limits flexibility on assigning projects / code review.
    Sometimes the system’s performance is, or will be in the short–medium term, going to start causing incidents.
    Sometimes incident recovery takes a long time. We had a pipeline that would take six–ten hours to run and couldn’t be restarted midway if it failed. Recovering from downtime was crazy!
    Sometimes there’s a host of features whose development timelines would be sped up by more than it would take to burn down the tech debt to unlock them.
    Sometimes a refactor would improve system performance enough to meaningfully affect the customer or reduce infra costs.
    And then…
    Sometimes you have career-driven managers and engineers who don’t want to or can’t make difficult long-term trade-offs, which is sometimes the way it is and you should consider switching teams or companies.
    So I guess my question to you is: why should you burn this down?
- croes 3 months ago
  
  They still have to figure that out for cloud software
mountainriver 3 months ago

There hasn’t been an AI winter since 2008 and there sure isn’t going to be one now. In spite of everyone saying it every couple of months since then.
Also what tech debt? If you have good engineers doing the vibe coding they are just way faster. And also faster at squashing bugs.
I was one-shotting whole features into our Rust code base with 2.5 last week. Absolutely perfect code, better than I could have written it in places.
Then later that week o3 solved a hard bug 2 different MLEs failed to solve as well as myself.
I have no idea why people think this stuff is bad, it’s utterly baffling to me
- gopher_space 3 months ago
  
  I won’t recognize bugs the machine fixes if they appear outside the original context later on.
  If I’m not learning something every day this profession holds very little for me.
apples_oranges 3 months ago

IMHO: we will vibe:code with free local/cheaply hosted open source models and IDEs.. the hardware to facilitate is coming to consumers fast. But if Microsoft can sell Office to companies for decades then open ai can surely do the same for coding tools
- sebzim4500 3 months ago
  
  Unless there is a massive change in archiecture, it will always be much more cost effective to have a single cluster of GPUs running inference for many users than have each user have hardware capable of running SOTA models but only using it for the 1% of the time where they have asked the model to do something.
- exitb 3 months ago
  
  There are multiple orders of magnitude between the sizes of models people use for „vibe coding” and models most people can comfortably run. It will take many years to bridge that gap.
- nativeit 3 months ago
  
  > But if Microsoft can sell Office to companies for decades then open ai can surely do the same for coding tools
  This seems like a bold statement.
tyre 3 months ago

I don’t know, I’ve been using Gemini 2.5 for a bit. The daily quota caps at effectively $55/day. It’s not a ton of development but it’s definitely worth it compared to a human for projects that Claude 3.7 can’t yet wrap its mind around.
We’ll see if Gemini 2.5 Flash is good enough, but it definitely doesn’t feel like Google is selling for a huge loss post-training.
Yes the training is a huge investment but are they really not going to do it? Doesn’t seem optional
yojo 3 months ago

The chatter around vibe coding to me feels a lot like the late 90s early 2000s FUD around outsourcing. Who would pay a high-cost American engineer when you could get 10 in South Asia for the same price? Media was forecasting an irreversible IT offshoring mega trend. Obviously, some software development did move to cheaper regions. But the US tech sector also exploded.
For some projects (e.g. your internal-facing CRUD app), cheap code is acceptable. For a high scale consumer product, the cost of premium engineering resources is a rounding error on your profits, and even small marginal improvements can generate high value in absolute dollar terms.
I’m sure vibe coding will eat the lowest end of software development. It will also allow the creation of software that wouldn’t have been economically viable before. But I don’t see it notably denting the high end without something close to AGI.
- esafak 3 months ago
  
  Asian programmers are not improving at the rate LLMs are.
senko 3 months ago

I agree on the underlying premise - current crop of LLMs isn't good enough at coding to completely autonomously achieve a minimum quality level for actually reliable products.
I don't see how peak vibe coding in a few months follows that. Check revenue and growth figures for products like Lovable ($10m+ ARR) or Bolt.new ($30m+ ARR). This doesn't show costs (they might in fact be deep in red) but with story like that I don't see it crashing in 3-4 months.
On the user experience/expectation side, I can see how the overhyped claims of "build complete apps" hit a peak, but that will still leave the tools positioned strong for "quick prototyping and experimentation". IMHO, that alone is enough to prevent a cliff drop.
Even allowing for the peak in tool usage for coding specifically, I don't see how that causes "AI winter", since LLMs are now used in a wide variety of cases and that use is strongly growing (and uncorrelated to the whole "AI coding" market).
Finally, "costs will go up for all sorts of reasons" claim is dubious, since the costs per token are dropping even while the models are getting better (for a quick example, cost of GPT-4.1 is roughly 50% of GPT-4o while being an improvement).
For these reasons, if I could bet against your prediction, I'd immediately take that bet.
NitpickLawyer 3 months ago

> and another AI winter will settle in with the upcoming
Oh, please. Even if every cent of VC funding dries up tomorrow we'd still have years of discovering how to use LLMs and "generative models" in general to do cool, useful stuff. And by "we" I mean everyone, at every level. The proverbial bearded dude in his mom's basement, the young college grad, phd researcher, big tech researcher, and everyone in the middle. The cat is out of the bag, and this tech is here to stay.
The various AI winters came because of many reasons, none that are present today. Todays tech is cool! It's also immediately useful (oAI, anthropic, goog are already selling billions of $ worth of tokens!). And it's highly transformative. The amount of innovation in the past 2 years is bonkers. And, for the first time, it's also accessible to "home users". Alpaca was to llama what the home computer was to computers. It showed that anyone can take any of the open models and train them on their downstream tasks for cheap. And guess what, everyone is doing it. From horny teens to business analysts, they're all using this, today.
Also, as opposed to the last time (which also coincided with the .com bubble), this time the tech is supported and mainly financed by the top tech firms. VCs are not alone in this one. Between MS, goog, AMZ, Meta and even AAPL, they're all pouring billions into this. They'll want to earn their money back, so like it or not, this thing is here to stay. (hell, even IBM! is doing gen ai =)) )
So no, AI winter is not coming.
behnamoh 3 months ago

> ... velocity/error quotas start to inverse ...
Could you please elaborate? Is this how management (at least in your company) looks at code—as a ratio of how fast it's done over how many tests it passes?
bobxmax 3 months ago

AirBnb just did a 1.5 year engineering migration in 6 weeks thanks to AI.
"Vibe" coding is here to stay and it's only devs who don't know how to adapt that are wishfully hoping for otherwise.
- firefoxd 3 months ago
  
  Hold on, we aren't even good at estimating but now we know how much time we saved by vibe coding? I can't wait to read the source of this info when you share it.
  - bobxmax 3 months ago
    
    https://analyticsindiamag.com/global-tech/airbnb-uses-llms-t...
    You might not be good at estimating, but professional software teams at the most successful tech companies in the world generally are.
    
    sarchertech 3 months ago
    
    No they aren’t. I’ve been doing this for a few decades working everywhere from small startups to the most successful tech companies, and none of them are good at estimating.
    If you read the blog post, they were able to manually migrate 3% of the tests in 1 week. Extrapolating that gives you an estimate of less than half of the 1.5 year estimate.
    I’d also say there’s a pretty good chance that the 3% of tests the automated process couldn’t handle were more complicated than average and the devs would have gotten faster at doing these migrations after the first week.
    It’s very unclear what was actually being compared. A team did a POC in 2023 and then some work in 2024 and then spent 6 weeks tweaking the pipeline and a final week manually migrating the tests that the automated process couldn’t handle.
    But they don’t specify how many people worked on this or how many people were going to work on the original project. It could have been 1 guy on the team.
    As for looking at what the actual work was, it was migrating from one test framework to another one. I’d be surprised if a team couldn’t have written a compiler to do something similar in a similar amount of time.
- gammarator 3 months ago
  
  Whatever the term “vibe coding” is taken to mean, it assuredly doesn’t apply to a large scale migration undertaken by a professional software organization.
  - bobxmax 3 months ago
    
    Vibe coding means AI-assisted programming. That is it.
    
    fragmede 3 months ago
    
    it means having it generate code and then not looking at the code and going purely off of vibes. If you have it generate some code and then dig deep into the code and edit it to the point that you're able to explain every variable name and the reasoning for each if clause like your boss is gonna call you out for cheating and using an LLM and is gonna fire you if you can't explain "your" code, that's gonna slow you down and you're no longer purely going off of ~/vibes/~
- mech422 3 months ago
  
  "Airbnb recently completed our first large-scale, LLM-driven code migration, updating nearly 3.5K React component test files from Enzyme to use React Testing Library (RTL) instead. We’d originally estimated this would take 1.5 years of engineering time to do by hand, but — using a combination of frontier models and robust automation — we finished the entire migration in just 6 weeks."
  Color me unimpressed - it converted some test files. It didn't design any architecture, create any databases, handle any security concerns or any of the other things programmers have to do/worry about on a daily basis. It basically did source to source translation, which has been around for 30+ years.
  - shermantanktop 3 months ago
    
    If you told me five years ago that such a conversion had been done in six weeks, I would not have believed it. Even though some level of source-to-source existed. And I would definitely expect that such a conversion would have resulted in hideous, non-idiomatic code in the target language.
    
    riku_iki 3 months ago
    
    > And I would definitely expect that such a conversion would have resulted in hideous, non-idiomatic code in the target language.
    and we don't know what is the quality of end code, it is possible that tech debt created by migration is well higher than 1.5 eng/y.
    
    shermantanktop 3 months ago
    
    Have you tried it?
    I’ve done something similar to this type of work with an llm. It produces code that is often too idiomatic, in that it introduces conventional approaches that are overkill for the task at hand. But this is almost an ideal scenario, because if the previous tests run clean, the newly converted tests can only fail by being wrong.
    They can also silently reduce the tested scenarios, but that’s what code reviews are for.
    
    riku_iki 3 months ago
    
    I use LLM for coding every day. LLMs can totally make something super dumb: oh this test is broken, gatcha, lets replace it with something which is not broken (but very different).
    > They can also silently reduce the tested scenarios, but that’s what code reviews are for.
    code review of complicated scenarios is about as resource demanding as actually writing code, so shrinking time spent from 1.5y to 6 weeks totally can produce lower quality.
    
    bobxmax 3 months ago
    
    Do you really think Airbnb software engineers would accept the final code if it was low quality?
    Seriously, the wishful thinking around this stuff is embarassing.
    
    riku_iki 3 months ago
    
    There are plenty of tech bros in corps who push some hyped stuff, add to resume and move to next gig. Not sure why do you think airbnb is different.
  - bobxmax 3 months ago
    
    It's easy to tell yourself you're not losing by constantly pushing the goal posts.
    
    mech422 3 months ago
    
    great comeback - "Dude - Trust me" :-P got anything more compelling then your opinion ? Have you ever done source to source translation work (I have ...) Or is this just so exciting to you cuz you've never seen it before ?
    
    bobxmax 3 months ago
    
    I didn't say anything remotely close to "Dude Trust Me" - perhaps you need to try a bit of "vibe learning how to read english"
    
    mech422 3 months ago
    
    exactly, you didn't provide any evidence or even decent arguements to support your position. I provided 3 distinct areas (of many) that it didn't touch that programmers have to deal with everyday. I also pointed out that the capabilities to do the sorts of transformations it did have existed for decades.
    But given your comment history, backing up your arguments doesn't seem to be your strong suit...
- croes 3 months ago
  
  Lets wait for the first wave security bugs because of vibe coded software.
  I doubt that error free code is outnumbering code with errors in the training data.
  - bobxmax 3 months ago
    
    Nobody cares. Ancient devs are insisting that vibe coding is going to lead to a bunch of tech debt.
    1. Nobody cares. It's still worth the insane productivity improvements.
    2. There is no proof that it's going to lead to long-term issues, because why would it?
- exitb 3 months ago
  
  It wasn’t vibe coding, they translated their tests from one framework to another.