Hello All,
Now that I have a few weeks of historical pricing data, I decided to do some analysis and find out how stable (or unstable) LLM API prices are. I scrape prices from multiple sources, but I started with OpenRouter and will base this analysis on that data.
In case you don’t know, OpenRouter is a large router for LLM inference APIs. They allow users to access about 300 models hosted by 54 providers with a single API key. According to OpenRouter themselves, they never mark up the prices set by the inference providers, so the following can be read as an analysis of the slice of the inference market they cover.
Here are two charts to give you an idea of who the major model authors are (companies that created the models) and who the major providers are (host the models).
I started collecting data 22 days ago, on July 28th, and during that period 17% of all models covered by OpenRouter have had their cheapest input or output token price change, meaning a new price floor was set.
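If you are curious how a change like this can be detected from raw data, here is a minimal sketch in Python. It assumes a CSV of daily per-provider prices with hypothetical columns date, provider, input_price, and output_price; it is my own illustration, not the actual pipeline behind these numbers.

```python
import pandas as pd

# Hypothetical input: one row per (date, provider) with USD prices per 1M tokens.
prices = pd.read_csv("model_prices.csv", parse_dates=["date"])

# The price floor on each day is the cheapest price across all providers.
floors = prices.groupby("date")[["input_price", "output_price"]].min().sort_index()

# A new price floor is any day where either floor differs from the previous day.
changed = floors.ne(floors.shift()).any(axis=1)
change_days = floors[changed].iloc[1:]  # skip the first day, which has no prior day
print(change_days)
```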
Here is a breakdown of the top authors and the percent of their models that experienced a price change. You can see that some authors, like OpenAI and Anthropic, who mainly host their own models, are relatively stable, while open source authors like Qwen and Deepseek are more susceptible to changes (Llama being an exception):
Changes were more drastic than I would have thought. Below I show the distribution of the percent change in input or output token prices for models that experienced a new price floor. I calculate this as the percent difference between the price on the day before the change and the price on the day of the change. The distribution is wide and shows that price increases are more frequent than decreases (60% of all changes raised the price).
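For concreteness, here is that calculation as a tiny helper function; this is just my illustration of the formula described above, not code from the analysis itself.

```python
def percent_change(price_before: float, price_after: float) -> float:
    """Percent difference between the day-before price and the day-of price."""
    return (price_after - price_before) / price_before * 100.0

# Example: a floor dropping from $0.20 to $0.15 per 1M input tokens is a -25% change.
print(percent_change(0.20, 0.15))  # -25.0
```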
The lesson I would take away here is to keep an eye on your spend per token if you are heavily using models from Qwen, Deepseek, and other open source authors. You could also consider setting fallback options in your apps to switch to cheaper models if you are price sensitive.
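As one way to implement the fallback idea, here is a minimal client-side sketch against OpenRouter's OpenAI-compatible chat completions endpoint. The model IDs and their price ordering are placeholders, and the error handling is deliberately simple; check OpenRouter's docs for their built-in model-routing options before relying on something like this.

```python
import os
import requests

OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions"
# Placeholder ordering: preferred model first, cheaper fallback second.
MODELS_BY_PREFERENCE = ["qwen/qwen-2.5-72b-instruct", "deepseek/deepseek-chat"]

def complete(prompt: str) -> str:
    headers = {"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"}
    last_error = None
    for model in MODELS_BY_PREFERENCE:
        try:
            resp = requests.post(
                OPENROUTER_URL,
                headers=headers,
                json={"model": model, "messages": [{"role": "user", "content": prompt}]},
                timeout=60,
            )
            resp.raise_for_status()
            return resp.json()["choices"][0]["message"]["content"]
        except requests.RequestException as err:
            last_error = err  # fall through and try the next (cheaper) model
    raise RuntimeError("All models failed") from last_error
```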
I recently built out the price trends page on Price Per Token where you can explore models that show pricing changes. Below I show how that looks for input token prices on the Qwen models offered by Chutes (an inference provider):
You can see that many of the Qwen models had a price reduction on August 5 and that the new prices persisted. Clicking through, I see that new prices usually do stick, which makes it even more important to track changes: if a model gets more expensive, it will likely stay that way.
As always please feel free to respond to this email with any feedback on the Newsletter and the Price Per Token site!
Best,
Alex