
What People Are Actually Using AI for Right Now

By The AI Daily Brief: Artificial Intelligence News

Summary

Key Takeaways

  • Reasoning Tokens Hit 50%: The balance between reasoning and non-reasoning tokens completely shifted over the course of the year. Since o1 became broadly available in December 2024, reasoning-model token usage went from negligible to over 50% of tokens consumed. [03:00], [03:19]
  • Programming Dominates at 50%: The dominant use case by far has become programming. Early in 2025, programming was around 11% of usage; now it is over 50%. [03:36], [03:58]
  • Roleplay Rules Open Source: The other dominant use case is roleplay, basically everything in and around chatting with AI in a fantasy context. For open-source models, roleplay and creative dialogue accounted for more than 50% of usage. [04:29], [04:55]
  • Chinese Models Surge to 30%: Chinese open-source models grew from around 1% of usage to as much as 30% in some weeks. Release velocity and quality make the market lively. [05:45], [05:54]
  • Closed for Value, Open for Volume: Closed models are for high-value workloads and open models are for high-volume workloads. Teams are using both. [06:11], [06:32]
  • Tool Calls Rise to 15%: The share of requests that invoke tools rose steadily throughout the year, from around 0% at the beginning of the year to 15% now. [03:27], [03:45]

Topics Covered

  • Reasoning Tokens Shifted to 50%
  • AI Coding Dominates at 50%
  • Roleplay Powers Open Source
  • Closed High-Value, Open High-Volume
  • First-Mover Models Lock In

Full Transcript

Welcome back to the AI Daily Brief.

Today we are looking at what people are actually using AI for right now. In other words, beyond our suppositions and our guesses, is there a way to see the specific types of applications that are driving AI adoption? Last week, we got a study that was trying to do exactly that.

The study comes from a team-up of OpenRouter and A16Z. A16Z, of course, is a prominent venture fund, and OpenRouter is a startup that provides a unified API giving developers and users access to hundreds of different LLMs through a standard gateway. To provide a little more background on who OpenRouter is: the service offers a near-complete range of proprietary and open-source models, served on a range of different infrastructure. They serve 25 trillion tokens monthly across 300 models to 5 million end users.

One of the big use cases for OpenRouter is consumer-facing AI apps. Developers can use OpenRouter to automatically route requests to the most efficient or appropriate model, and it also provides failover in case service of a favored model goes down. It's not hard to imagine how you would use this if you are a startup: most startups providing some sort of consumer or business interface for using AI are trying to abstract away the details of which model you're using. OpenRouter gives them an alternative to plugging into just a single model; instead, they can get access to the full suite. It's more redundant, and it has potential cost efficiencies. Individual users can also make use of OpenRouter, but that definitely tends to be for extreme power users. By way of example, users can plug their OpenRouter API keys into Cursor and get full access to models without needing to handle multiple sets of keys.
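Because OpenRouter exposes a standard OpenAI-style chat-completions endpoint, "abstracting away the model" mostly comes down to changing one identifier string in the request. Here is a minimal sketch of what a routed request with fallback could look like; the model IDs and the `models` fallback field are illustrative assumptions, so check OpenRouter's current docs before relying on them:

```python
import json
import os
import urllib.request

API_URL = "https://openrouter.ai/api/v1/chat/completions"

def build_request(prompt, model="openai/gpt-4o-mini", fallbacks=None):
    """Build an OpenAI-style chat payload for OpenRouter's unified gateway.

    `fallbacks` sketches the gateway's failover idea: if the preferred
    model's provider is down, the next model in the list can be tried.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    if fallbacks:
        # Illustrative: an ordered list of models to fall back to.
        payload["models"] = [model] + fallbacks
    return payload

def send(payload, api_key):
    # One key, one endpoint, many models -- the point of a unified gateway.
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

if __name__ == "__main__":
    payload = build_request(
        "Here's a pile of code, docs, and logs; extract the signal.",
        model="anthropic/claude-sonnet-4.5",   # hypothetical slug
        fallbacks=["deepseek/deepseek-chat"],  # hypothetical slug
    )
    key = os.environ.get("OPENROUTER_API_KEY")
    if key:  # only hit the network when a key is actually configured
        print(send(payload, key))
    else:
        print(json.dumps(payload, indent=2))
```

The design point is that application code never hardcodes a single provider's SDK: the same payload shape works across hundreds of models behind the one gateway, which is what makes the switching and failover behavior described above cheap.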

The study they released last week is called "State of AI: An Empirical 100 Trillion Token Study with OpenRouter."

In the abstract, they write, "We analyzed over 100 trillion tokens of real-world LLM interactions across tasks, geographies, and time. The findings underscore that the way developers and end users engage with LLMs in the wild is complex and multifaceted."

Now, one more note on the methodology before we dive in. While 100 trillion tokens is absolutely nothing to sneeze at, and is a very meaningful and reasonable sample size from which to start inferring some patterns, the caveats are, first, that it's somewhere between a tenth and a fifteenth of the number of tokens Google Gemini was serving per month before the release of Gemini 3. So while 100 trillion is a lot, it is still a fairly limited sample size overall. The second thing to note is that this pattern of usage is concentrated around people who are building things. If you did a study like this across all the end users of ChatGPT, Claude, Gemini, and the like, it would probably look a little bit different. So

with that out of the way, let's look at what they actually found. There were a few different things that stood out to me.

The first, which just absolutely defined the year, is that the balance between reasoning and non-reasoning tokens completely shifted over the course of the year. Remember, it was only at the beginning of December 2024 that OpenAI's o1 became broadly available. Since then, over the course of 2025, reasoning-model token usage has gone from basically negligible to over 50% of tokens consumed. OpenRouter calls this a full paradigm shift, and I think it is absolutely a key part of the story of AI in 2025.

Now, of course, part of what reasoning models open up is more autonomy and agentic capability. And while not as dramatic as the growth in reasoning, some indications of that are also starting to show up in the data: they write that the share of requests that invoke tools rose steadily throughout the year, from around 0% at the beginning of the year to 15% now.

Overall, and this will be surprising to no one who is listening to this show, the dominant use case by far has become programming. Early in 2025, programming was around 11% of usage; now it is over 50%. We are coming up on the end-of-year episodes, and I think any accounting of 2025 has to start with the fact that the dominant and most important phenomenon of this year in AI was the rise of AI coding. Unsurprisingly, then, that is showing up in token consumption in the study.

There are some other ways that we see coding as the major use case in the study. The average number of prompt tokens per request, in other words the average prompt length, grew about 4x over the course of the year, from around 1,500 tokens to 6,000 tokens. OpenRouter translated it for us, saying the median request is less "write me an essay" and more "here's a pile of code, docs, and logs; now extract the signal."

Now, the next thing that is notable, and in some ways a lot of this study is a tale of two use cases, is that the other use case that dominates is roleplay: basically everything in and around chatting with AI in a fantasy context, from innocent to not-so-safe-for-work. That is particularly true for open-source models, where roleplay "and/or creative dialogue," as they put it, accounted for more than 50% of OSS usage.

Actually, before we look more at that, let's look at the patterns of open source versus closed source overall. Another big story this year, at least among developers building AI applications, has been the rise of open-source models, and specifically Chinese open-source models. OpenRouter notes that by Q4 of this year, open-weight models had reached about a third of overall usage, but they also note that usage has plateaued this quarter. This makes sense intuitively, given that this quarter we've seen some major advances in closed-weight models like Gemini 3, GPT-5.1, and both Sonnet and Opus 4.5.

Still, the landscape looks really different than it did at this time last year in terms of the composition of these two types of models, which makes sense when you remember that the first big story in AI this year was the DeepSeek moment. Indeed, the rise of Chinese open-source models is one of the big phenomena that OpenRouter noted.

They grew from around 1% of usage to as much as 30% in some weeks. In understated fashion, OpenRouter notes, "release velocity and quality make the market lively." And really, what they're saying, and what these numbers are showing, is that for developers in 2025, open-source models in general, but particularly Chinese open-source models, became a major contender when it came to choosing which models to use for your applications.

Indeed, it turns out that it's not really an either/or; it's a both. OpenRouter writes, "If you want a single picture of the modern stack, closed models are for high-value workloads and open models are for high-volume workloads." And as they point out, teams are using both.

Now, going back to the breakdown of what people are using open-source models for: over 50% of it is roleplay and creative dialogue. I think a lot of people are interpreting this as developers using the open models for use cases that clearly have a lot of demand but which fall outside the bounds of what closed-source providers want their models used for. It is notable, though, that over the course of the summer, programming also became a big part of open-source consumption, and it now sits at between 15 and 20% of usage.

Indeed, when it comes to the Chinese open-source models, programming and technology in aggregate are now ahead of roleplay, which is down to 33%.

Basically, the current crop of Chinese open-source models is being seen as viable for pretty much every type of use case.

One last note from their highlight summary that I think is interesting: they observed what they call a "Cinderella glass slipper effect" for new models. Basically, when a new model gets released, tons of people come in and try it, and the people who persist create what OpenRouter calls a "foundational cohort" who resist substitution even as newer models emerge. Basically, they create a foundation and a base group for that model moving forward.

So what are other people's observations of the study? Tengan, who runs the Chain of Thought AI newsletter, noted a couple of things. One that he called out specifically was the division of different models by different usage. He writes, "Anthropic's Claude is used for over 80% of programming and almost zero roleplay. It is the serious work model," while DeepSeek is the entertainment king with two-thirds roleplay traffic. He also noted that although people are willing to try new models, as he puts it, quote, "A model that's the first to nail a painful workload creates near-permanent lock-in. Early 2025 cohorts of Claude 4 Sonnet and Gemini 2.5 Pro still retain 40 to 50% of users six months later, while every later cohort churns."

Relatedly, he points out that demand is wildly price-inelastic: users happily pay 10 to 50x more per token for Claude or GPT-5 if it saves them 10 minutes of debugging. Being cheap is nowhere near enough. Going back to this idea of different models for different uses, he noted that there is a new medium-size model sweet spot in the 20 to 70 billion parameter range.

Token Bender points out that while this study is super useful for understanding the breakdown of different open-source model usage, we probably shouldn't extrapolate its patterns overall, because OpenRouter is a less preferred option for the closed-model providers.

Most people were focused on the use cases. Anan Chadri writes, "OpenRouter reported what everyone building tools already knows. AI usage is mostly long-running coding jobs with tool calls." Jay Little writes, "Heard DeepSeek was good at roleplay, but didn't think 80% of the use would be that lol." Shan Chahan writes, "Roleplaying and creative writing is 52% of open-source usage. While VCs fund productivity, humans are using AI to write fanfiction and debug code. The market gap versus reality gap is hilarious."

I don't know if that's totally fair. If, for example, you look at the internet, the fact that there are massive amounts of adult content doesn't mean it's not also super useful for productivity. Although it certainly does suggest that there are probably capital opportunities that aren't being taken advantage of because of particular norms and morals.

One sub-part of the conversation was about how Grok dominated total consumption charts, but this is potentially a little bit dismissible, and it's where the limits of this study show up most to me. Grok made tokens available for free for some time on OpenRouter as part of a promotion strategy, which was obviously successful as a way to get people to try it, but which warps the model results at least a little bit. One really

interesting reflection came from Brian Katano, who actually got meta on the success of OpenRouter in general. Brian writes, "I really thought Cursor and OpenRouter would not become big. Cursor is just a fork of VS Code. OpenRouter is just a wrapper on top of model APIs. I was very wrong. I'm realizing that my baseline visceral skepticism of scaffolds and wrappers needs to be unlearned."

The AI market, he continues, is special in its sensitive differentiation. It's easy to switch between providers, but evaluating any model or provider is sensitive: small changes in input cause large changes in output. This is true at the prompt level and at the model level; GPT-5 versus Claude 4.5 as inputs to write my code will yield vastly different results. So buyers in a sensitively differentiated market have the following problem: it's easy to switch between providers, and the models are always getting better. In addition, because this market is so new, none of the models are sticky yet (this might change with memory, etc.). So you end up needing wrappers and scaffolds to do your work over time; otherwise, you lose out on optionality in a rapidly changing provider market. I keep expecting one model to win, but this hasn't ever really happened.

Tengan again made this point as well: there is no single best model. The top 10 models by volume are from eight different labs.

So overall, this is a super interesting study that, while focused on a particular audience of app developers and power users and a still relatively limited sample of 100 trillion tokens, shows some of the big changes that we've been feeling throughout the year. If you want to check out the study for yourself, you can find it at openrouter.ai; it's in a banner right on top of the website. Thanks to the team there and at A16Z for putting this all together.

For now, that's going to do it for today's AI Daily Brief. Appreciate you guys listening or watching, as always.
