For a long time I held a view: with just an IDE or a CLI-based application, all human-AI collaboration could be accomplished, because the essence of AI productivity is code.
I also always believed that, logically, there was nothing wrong with this.
Then, for the same task, I failed to get a good solution in Google's Antigravity, whereas in Claude's Cowork (strictly speaking, Claude is the model and Anthropic is the company, but the model is far better known, so let's leave it at that) the implementation came much closer to my expectations. That made me notice something different.
The difference isn't a gap in raw capability between the two models (Gemini-3 vs. Claude-4.6); there are some directional differences, which I may briefly summarize later.
Rather, my task was a batch analysis job that required searching and filtering. Antigravity executed 66 batches, while Cowork executed 38. More importantly, after Antigravity had collected all the data, it surprised me by writing a Python program to analyze it, whereas Cowork handed the data directly to the Claude-4.6 Opus model for processing.
Naturally, then, Cowork's result better met my requirements.
And that's exactly the point. Antigravity is positioned as a coding IDE, so solving problems with code carries an obviously higher weight there. Cowork, by contrast, is about using the model to assist with work tasks: although implementing things in code remains a basic and common option, in Cowork's defaults "solving the problem with the model" clearly takes priority.
Because Claude-4.6 Opus is so expensive, I was unwilling to keep repeating the long cycle of "a five-hour unlock window for ten minutes of actual work." After refining my prompts, I was able to complete the task quickly in Antigravity as well.
Still, the process gave me a different insight: perhaps different scenarios really do call for different IDEs, or at least different app form factors. My earlier view was a bit too idealized.
Perhaps the developers at Anthropic went through a similar realization and then quickly pushed out Cowork. As for its impact, one need only look at how rattled software-stock investors are to get a glimpse.
Ever since its models caught up, Anthropic has stayed one step ahead on its own path, proposing new concepts, small products, and small updates. Each one seems "light" and not especially deep technically, yet the resulting shifts in standards have been staggering.
- The first to launch Artifacts, making the model a great tool for rapid frontend demos and visualization;
- The first to propose MCP, which, despite being an evolution of APIs, has quickly driven the integration of a series of third-party tools with models;
- The first to propose Skills, which both substantially supplements MCP and marks a significant step toward standardizing Agent output (and allow me a small boast: the OpenResearch I built last year targeted almost exactly the vision and pain points Skills aims to solve);
- The first to launch Cowork.
In fact, across these evolutionary stages over the past two years, while model capabilities have improved, the more visible gains have come from the optimization and integration of "reasoning models" and the Agent layer. And code itself is a relatively small set compared to "world knowledge."
Anthropic's true core competitiveness is its ability to concentrate its energy in one direction and iterate rapidly. Using its models and products, you can feel a kind of "spiritual resonance" with the developers: facing the same problem, you discover it can be solved this way, or that it still can't be solved, or you stumble on a small easter egg that reflects their "aesthetic values." Haha.
OpenAI used to have this core. It was palpable through GPT-4o and before, and even more so before ChatGPT's release. Nowadays it is hard to find; only a faint trace remains in its latest product, Prism. Built around LaTeX optimization, Prism does have an audience, but the demand is too narrow.
Google has maintained this core too, but perhaps it can never escape "big-company disease": its outward execution is always unstable. Compared with the rapid-fire updates to Labs applications and AI Studio after the releases of Gemini-2 and 2.5, new application releases seem to have gone quiet since Gemini-3, apart from the nano-banana-pro line. Perhaps the scope they want is too broad and compute can't keep up, or perhaps there are other reasons we don't know.
In terms of "genes," I still feel more aligned with Gemini: it clearly possesses much more "world knowledge," it is more versatile, its multi-modal capabilities are stronger, and its ecosystem is richer...
However, my biggest takeaway from the past few years is that in the scenarios that matter most, over 90% of real "landing" scenarios, what's needed is focus, the ability to "punch through," and more standardization. In these respects, Anthropic (Claude) keeps doing better and better.
I have only one "dissatisfaction": it's too expensive.