For over a year, I haven't been able to provide a solid answer to one question: What super app will AI produce?
Every time, I brushed it off by saying, "It's already here, it's ChatGPT."
Then, in most cases, I would be pressed: "If that's all, then AI is far inferior to the mobile internet."
My response was usually that AI is a productivity tool, and that the question should be considered from a B2B perspective.
To be honest, I've always been dissatisfied with these answers myself, but there was no other way; it was the "most likely" answer I could think of.
Indeed, in the mobile internet era, we had world-class "super apps" in e-commerce, social networking, short video, and livestreaming. Since entering the "AI era" in 2023, however, while we may be the country with the largest number of models and applications, their combined unique user base is likely less than a tenth of OpenAI's.
A "Super App" is perhaps the most practical and urgent goal for our internet companies right now. Against this backdrop and expectation, Tencent released "Tencent Yuanbao." On a researcher's recommendation, I downloaded it and tried it for a while. I'll skip the screenshots and share my impressions directly:
I quite like the default voice feature; the latency is very low, and there are many voice choices. It has a game-dubbing feel, which is very Tencent.
You can create a custom voice just by reading a passage aloud. Although the similarity isn't very high and the speaking speed is slow, I can imagine that if I were still young, I would definitely find a way to get my "goddess" to read a passage, so I could talk to her 24 hours a day.
Document processing and information search capabilities are passable; there may be a slight gap compared to Kimi, but it is certainly not a generational one.
It is primarily for entertainment. So, when I ask Kimi to generate a mind map, Kimi earnestly generates a mermaid.js script and renders it in real time. When I ask Yuanbao to generate a mind map, "she" actually does text-to-image (imagine you're reading a paper and want a mind map, and "Yuanbao" gives you a cartoon image).
From here on, it's all criticism. "Yuanbao" offers many voice options, but none of them gives me a sense of "warmth." In contrast, the two female voices of ChatGPT feel like real people, especially in English conversation, where they have a fully cinematic feel. (Sky was taken offline over the controversy about its similarity to Scarlett Johansson's voice, so now only "Juniper" remains.)
I have always believed that if AI is to break through on the C-end (consumer side), it needs "warmth" and "emotion." ChatGPT, at least, is very close, while "Yuanbao" still seems to have a long way to go.
I thought for a long time but still couldn't find a use case for "Yuanbao." You can't just build an app on top of a model, open up an agent feature, and expect it to be the "right" application. General model companies can't do it, and it seems Tencent can't yet either.
I might prefer having a contact in WeChat named "Yuanbao" (or any other name) as an assistant, a sounding board, a second brain, or even J.A.R.V.I.S.
If the previous point holds true, then perhaps it shouldn't be called a "Super App" at all, but should gradually become a "Super Terminal."
To some extent, WeChat is already our "Super Terminal"; the only difference is that its current carrier is the mobile phone. However, devices like the Rabbit R1 and the AI Pin at least suggest that if interaction problems are solved well, a "Super Terminal" does not need "Super Apps."
That said, this may place higher demands on product design. Perhaps this is why everyone has such high expectations for Apple's AI iPhone: if the company recognized as having the best C-end product capability cannot launch a satisfying "Super Terminal," then who can?
Perhaps Tencent should challenge itself. While thinking about the question of "Super App" versus "Super Terminal," I am also considering another question: "In the AI era, is the hardware threshold still that important?"
To be clear, I don't mean technical parameters like computing power. I mean the form of software-hardware integration that AI models will reshape: the "Super Terminal." Is hardware capability a barrier?
I have a very preliminary conclusion: in production or professional work, I prefer open-source solutions, because if I only use functions designed by others, the innovation ceiling of my work is unlikely to be high.
In personal daily life, the simpler the better; obviously, the more polished the product the manufacturer provides, the better.
This is probably the difference between choosing closed source and choosing open source. True, some might point out that most people use phones running the open-source Android.
In fact, whether Android is open-source or not has nothing to do with us users; it only matters to phone manufacturers.
For C-end AI, I firmly stand by the "Super Terminal." A highly polished "Super Terminal" has not yet appeared. Everyone has a chance, and it seems that whoever has the larger user base is a bit closer to the finish line.