Pay only for what you use. This arrangement sounds fair, but with top-tier AI it creates a structure where those who can pay receive smarter support. Using the Fable 5 generation as a case study, we look at the gap in opportunity, the gap in amplification, and the forces working toward a level field — from a pharmaceutical industry perspective.
01 Two Pricing Philosophies: Is "Pay for What You Use" Really Fair?
Anthropic's top model, Fable 5, is billed mainly on a metered basis (= you pay only for what you use) when accessed through the API (= the gateway that lets programs call the AI). The reason is simple: top-class inference (= the computation an AI performs to produce an answer) consumes several times to dozens of times more computing resources than inference on lower-tier models. Looking at the lineage from the Opus series to Fable 5, performance has grown, and so has the amount of computation spent on a single act of thinking. A design that gets smarter the longer it thinks translates directly into electricity bills and time occupying GPUs (= chips widely used for AI computation).
Consumer chat services, meanwhile, have mostly stayed on flat monthly subscriptions. What a flat rate sells is peace of mind: it removes the worry of not knowing what this month will cost. Behind the scenes, though, providers absorb the cost through usage caps and model routing (= adjustments such as sending heavy requests to lighter models). You could say there is an invisible rationing system beneath the flat-rate sign.
Metered pricing is the opposite: it passes the actual cost of computation almost directly to the user. You pay for what you use, so it looks fair. That appearance is this article's starting point. The "fairness" of metered pricing is a fairness that ties ability to pay directly to how much you can use. With electricity or water, how much a person needs does not vary much between rich and poor. Intelligent support from AI is different: the more you can pay, the deeper and longer the AI thinks for you. The thickness of your wallet begins to determine the quality of thinking you receive.
| Aspect | Flat rate | Metered |
|---|---|---|
| What it sells | Worry-free unlimited use (caps and routing hidden behind the scenes) | Actual cost of computation (transparent but open-ended) |
| Best suited for | Individuals and small users who cannot predict usage | Companies and developers who can calculate cost versus benefit |
| Relation to inequality | Quality differences are hard to see (they arise inside the rationing) | Differences in payment become differences in thinking, one for one |
Each pricing philosophy is rational on its own. The problem arises in a world where the computing cost of top models stays high: metered pricing becomes the entry point where price divides the quantity and quality of intelligent support. Companies with deep pockets can have a Fable 5-class model think for hours. Those who cannot pay make do with short answers from lighter models. This line appears nowhere but on the invoice, so it is rarely recognized as inequality. That is why we call it an invisible line.
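The arithmetic behind "differences in payment become differences in thinking" can be sketched in a few lines. This is a minimal illustration, not a real price list: the class `MeteredPrice`, the per-million-token prices, the flat fee, and the request sizes are all hypothetical values chosen for this example, and they do not describe Fable 5 or any actual plan.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeteredPrice:
    """Hypothetical per-million-token prices in US dollars (not real list prices)."""
    input_per_mtok: float
    output_per_mtok: float


def request_cost(price: MeteredPrice, input_tokens: int, output_tokens: int) -> float:
    """Cost of one request: the bill is proportional to the tokens used."""
    return (input_tokens * price.input_per_mtok
            + output_tokens * price.output_per_mtok) / 1_000_000


def monthly_cost(price: MeteredPrice, requests_per_day: int,
                 input_tokens: int, output_tokens: int, days: int = 30) -> float:
    """Metered spend for a month of identical requests."""
    return days * requests_per_day * request_cost(price, input_tokens, output_tokens)


# Illustrative assumption: a "top" tier priced 10x above a "light" tier.
TOP = MeteredPrice(input_per_mtok=20.0, output_per_mtok=100.0)
LIGHT = MeteredPrice(input_per_mtok=2.0, output_per_mtok=10.0)
FLAT_FEE = 25.0  # hypothetical flat monthly subscription

if __name__ == "__main__":
    for name, price in (("top", TOP), ("light", LIGHT)):
        # Longer "thinking" shows up as more output tokens per request.
        spend = monthly_cost(price, requests_per_day=50,
                             input_tokens=8_000, output_tokens=4_000)
        print(f"{name}: ${spend:,.2f} per month (flat plan: ${FLAT_FEE:.2f})")
```

Under these made-up numbers the same workload costs ten times more on the top tier, and the only place that difference is recorded is the monthly total.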
02 Precedents in Electricity, Computing, and Cloud: Metered Pricing Has Produced Both Inequality and Leveling
AI is not the first case of metered pricing producing inequality. Electricity came first. In the late 19th and early 20th centuries, as metered billing for electricity took hold, the first to benefit from electrification were factories with large contracts. Households came later, and there was a long lag between cities and rural areas. But economies of scale (= unit costs fall as you produce more) in generation and transmission kept pushing prices down, and electricity eventually became infrastructure everyone was assumed to have. Differences in how much people used remained, but differences in quality all but disappeared. A factory's electricity and a household's electricity are the same electricity.
Computing followed the same path. Mainframes of the 1960s (= large computers that ran companies' core operations) were rented by the hour, and only big corporations and universities could buy computing time. Then personal computers brought computing to individuals, and cloud services of the 2000s (= renting computing resources over the internet) reintroduced metered pricing. In the early cloud era, the gap between companies that could use the cloud and those that could not was wide. Yet falling prices and better tooling mean an individual developer today can run a service for the whole world for a few dollars a month. Metered pricing was an entry point for inequality, but it was also a machine for leveling (= movement toward smaller gaps) through falling unit prices. History shows both faces.
So will AI level out on its own if we simply wait? Probably about half of it will. In fact, the pattern has held: the capability of last year's top model becomes available the following year in a much cheaper mid-tier model. The performance gains of the Sonnet series and smaller models are the evidence. "Last year's best intelligence" reliably gets cheaper and spreads to everyone.
But there is one decisive difference from electricity and cloud. There is no such thing as "premium electricity," but with AI there is always a model that is the smartest available at that moment. Electricity is uniform in quality, so once the price fell, the gap disappeared. AI is different. After Fable 5 comes a next generation that consumes even more computation. The single frontier model remains at the top, at every point in time, as the most expensive form of intelligent support. What levels out is past intelligence — not the best intelligence of the present.
What this structure implies is a scenario where inequality becomes fixed not as absolute deprivation but as a relative lag. Everyone can use last year's best intelligence cheaply. But competition always happens at today's frontier. In research, in investment decisions, and (as we discuss below) in day-to-day pharmaceutical work, if the other side is thinking with today's Fable 5 and you respond with last year's model, the disadvantage does not go away. The history of electricity supports the optimistic view that "it will get cheap eventually," but in a market where the summit keeps moving, that optimism only goes halfway. That said, if the cost of inference falls faster than demand for computation grows, the gap narrows. That is one condition under which this reading fails.
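The narrowing condition at the end of the paragraph above can be written as simple arithmetic. The sketch below is one possible reading, under assumptions of its own: the bill for frontier-level use is treated as unit cost times the amount of computation wanted, both changing at constant yearly rates. The function names and the example rates are hypothetical, not measurements.

```python
def yearly_spend_factor(cost_decline: float, demand_growth: float) -> float:
    """Factor by which the bill for frontier-level use changes in one year.

    cost_decline: fraction by which the unit cost of inference falls per year
                  (0.5 means the cost halves).
    demand_growth: fraction by which the computation wanted grows per year
                   (1.0 means it doubles).
    A result below 1.0 means cost is falling faster than demand grows.
    """
    if not 0.0 <= cost_decline < 1.0:
        raise ValueError("cost_decline must be in [0, 1)")
    if demand_growth <= -1.0:
        raise ValueError("demand_growth must be greater than -1")
    return (1.0 - cost_decline) * (1.0 + demand_growth)


def spend_path(initial_spend: float, cost_decline: float,
               demand_growth: float, years: int) -> list[float]:
    """Spend in year 0..years under constant yearly rates (a toy assumption)."""
    factor = yearly_spend_factor(cost_decline, demand_growth)
    return [initial_spend * factor ** year for year in range(years + 1)]


if __name__ == "__main__":
    # Hypothetical rates: cost halves each year in both cases.
    print(spend_path(100.0, cost_decline=0.5, demand_growth=0.5, years=3))
    print(spend_path(100.0, cost_decline=0.5, demand_growth=1.5, years=3))
```

In the first call the bill shrinks every year, and in the second it grows even though the unit cost halves. Which case applies depends on rates this article does not measure.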
03 The Opportunity Gap and the Amplification Gap: Those Already Strong Get Stronger with AI
When a top model like Fable 5 is offered on metered pricing, the resulting inequality is not one thing. It helps to think in two layers. The first layer is the opportunity gap: whether you can touch the top model at all. There are standard models available within a flat-rate plan, and top models you cannot call without committing to extra payment. This dividing line is printed on the price list, so it is visible. It is also a line you can cross by adding budget.
The second layer is the amplification gap, and it is harder to see. Even using the same model at the same price, the size of the outcome depends on the capital, data, and people the user brings. A company with an organized data platform (= internal information arranged so an AI can read it) has more material to feed the model. If it has specialists who can evaluate AI output, it catches mistakes early and adopts only the good outputs into its work. If it has a sales network that turns results into revenue, one unit of value the AI creates becomes ten. The opportunity gap is set by pricing and billing models, and budget can close it. The amplification gap is set by data, people, and distribution, and changing the billing model does not erase it. The more equal access to models becomes, the more the main driver of inequality shifts to this amplification machinery.
The asymmetry (= imbalance of conditions) between companies and individuals widens in both layers. A company can book usage fees as expenses and absorb the cost of failed attempts. An individual pays from their own pocket. A few hundred dollars a month in metered charges is rounding error for a company but a household decision for a person. And the individual has no amplification machinery: even a brilliant answer has no organizational mechanism to convert it into results. Freelance professionals raising their productivity with top models will multiply, but their gains will tend to be smaller than what a company extracts from the same model.
Between developed and emerging economies, exchange rates and income levels draw the line directly. Metered charges denominated in US dollars weigh several times more heavily, in relative terms, on users in lower-income countries. There is a force pushing the other way here too: prices for previous-generation models keep falling, and developers in emerging economies have growing room to build practical systems on inexpensive models. But if the question is limited to "who gets the newest intelligent support first," the answer skews toward those with the ability to pay. This is not speculation; it follows directly from how metered pricing works. The problem is that when those first movers also hold the amplification machinery, the inequality compounds rather than happening once.
04 Large Firms versus Small Ones: The Day the API Budget Becomes a Competitive Condition
Now narrow the focus to organization size. In large companies, AI usage fees are easily absorbed into R&D or IT budgets. The annual budget frame comes first, and experiments happen inside it. In small and mid-sized companies, spending usually needs internal approval (= the company's sign-off procedure for expenditures) project by project. You must explain, before using the tool, "up to how much may this trial cost?" That is the awkward part of metered pricing: you cannot know the results until you use it, yet you must justify the amount before you do.
Before it is a difference in money, this is a difference in the number of attempts. Skill with AI grows in proportion to how many times you try, miss, and adjust. How to write prompts (= instructions to the AI), how to fit it into workflows, how to recognize failure patterns — all of these are learned from trials, not textbooks. An employee at a large firm who can try a hundred times a day and a counterpart at a small firm carefully trying ten times within an approved budget will not have the same proficiency three months later. Different proficiency means different results from the same budget; different results make the next budget easier or harder to obtain. The gap in attempts snowballs.
| Aspect | Large company | Small or mid-sized company |
|---|---|---|
| How costs are handled | Absorbed into R&D or IT budgets | Approved project by project |
| Freedom to experiment | Can run many trials, failures included | Careful trials within the approved range |
| Speed of learning | Fast, in proportion to attempts | Fewer attempts, slower to catch up |
| Room to close the gap | None in particular | Can exploit falling prices of previous-generation models |
Yet this does not widen in one direction only. Two forces push back. One, mentioned earlier, is the falling price of older models. The trend continues: a model that was the frontier six months to a year ago becomes much cheaper to use. Most day-to-day work at small firms does not require the newest model. Anchor operations on a capable, now-cheap model, and buy the top model on metered terms only for the situations where it truly makes a difference. A small firm that manages this split can keep the practical gap with large firms small while controlling spend.
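The split described above (a cheap model by default, the top model only where it matters, within a spending cap) could be sketched roughly as follows. Everything here is a hypothetical illustration: the model names, the per-request costs in cents, the list of task types assumed to need the top model, and the `BudgetedRouter` class are assumptions for the example, not a real API or real prices.

```python
from dataclasses import dataclass, field

# Hypothetical model names and per-request costs in cents, for illustration only.
CHEAP_MODEL = "previous-gen-model"
TOP_MODEL = "top-model"
COST_CENTS = {CHEAP_MODEL: 2, TOP_MODEL: 60}

# Task types assumed (not measured) to benefit from the top model.
TOP_MODEL_TASKS = frozenset({"long_context_cross_check", "evidence_synthesis"})


@dataclass
class BudgetedRouter:
    """Route routine work to the cheap model and cap metered top-model spend."""
    top_model_budget_cents: int
    top_model_spent_cents: int = 0
    log: list[tuple[str, str]] = field(default_factory=list)

    def choose(self, task_type: str) -> str:
        """Return the model to use for one request and record the decision."""
        cost = COST_CENTS[TOP_MODEL]
        within_cap = self.top_model_spent_cents + cost <= self.top_model_budget_cents
        if task_type in TOP_MODEL_TASKS and within_cap:
            self.top_model_spent_cents += cost
            model = TOP_MODEL
        else:
            # Routine task, or the cap is exhausted: use the cheap model.
            model = CHEAP_MODEL
        self.log.append((task_type, model))
        return model


if __name__ == "__main__":
    router = BudgetedRouter(top_model_budget_cents=100)
    for task in ("meeting_minutes", "evidence_synthesis", "long_context_cross_check"):
        print(task, "->", router.choose(task))
```

Working in integer cents avoids floating-point drift in the running total. The hard part in practice is the task list itself, which has to come from the firm's own experience of where the top model changes the result.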
The other force is the speed that comes with being small. In large companies, AI adoption passes through layers of legal, information-security, and compliance review, and approvals taking months are not rare. Small firms have a shorter chain of decision-makers. Their budgets are small, but the time from decision to first use is short. So the asymmetry can also take the form of "budget for the large, speed for the small." Even so, if per-use prices for top models keep rising, the snowballing gap in attempts starts to bite. The day when the size of the API budget (= the spending frame for calling AI from programs) sits alongside payroll and equipment as an assumed competitive condition in the profit-and-loss statement is, in my view, not far off.
05 What Could Happen on the Pharma Floor: Materials Creation, Review, and Medical Affairs
Let us leave abstraction and bring this to our own workplace. Picture a company that can pay freely for a metered top model (= the highest-performing AI, billed by usage) and a company getting by on a flat plan or a previous-generation model. Where would that difference show up in pharmaceutical work? I want to sketch three scenes as scenarios. To be clear: what follows is inference from current trends, not a settled future.
The first scene is materials creation. Drafting promotional materials (= the explanatory documents that medical representatives (MRs) present to healthcare professionals) means reading long primary sources, such as package inserts, review reports, and guidelines, and producing text consistent with them. It is exactly where long-context capability (= the ability to read a long document all at once) pays off. At a company that can run the top model without limits, even the first draft comes back quickly with citations matched to the original text. In an environment with usage caps, the workflow becomes splitting the documents and going back and forth many times, and context drops at the seams of each split. The difference in first-draft quality can become, directly, a difference in how many rounds of internal review rework are needed.
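To make "context drops at the seams" concrete, here is a minimal sketch of the split workflow. It assumes the limit is counted in characters (real systems usually count tokens), and the function name `split_with_overlap` is hypothetical. The overlap parameter is a common mitigation added for illustration, not something the workflow above prescribes.

```python
def split_with_overlap(text: str, max_chars: int, overlap: int = 0) -> list[str]:
    """Split text into windows of at most max_chars characters.

    Each window after the first repeats the last `overlap` characters of the
    previous one, so a passage cut at a seam appears whole in at least one
    window as long as it is no longer than `overlap` characters.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    if not 0 <= overlap < max_chars:
        raise ValueError("overlap must be in [0, max_chars)")
    step = max_chars - overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


if __name__ == "__main__":
    source = "Section 1 states the dose. Section 2 states the exception to it."
    for chunk in split_with_overlap(source, max_chars=30, overlap=0):
        print(repr(chunk))
```

With no overlap, a sentence that straddles a boundary is never seen whole by the model. Overlap reduces that risk at the price of sending some text twice, which on metered terms means paying for it twice.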
The second scene is review. Materials review centers on cross-checking the content against fair advertising standards and the sales information provision guidelines. Consider having an AI do the first screening of that cross-check. Some models can read the full text of the regulatory documents and the materials at once and surface contradictions. A model that can only handle them via summaries may miss more, because summarizing is a process that discards information. What I want to stress here is that in either environment, responsibility for quality stays with the human reviewer. The AI's flags are one layer of the net, and no regulation gives grounds for delegating the final judgment. Where the difference appears is in "how fine the mesh is before a human picks up what slipped through."
The third scene is medical affairs (= the function responsible for medical and scientific information activities, kept separate from sales promotion). Evidence synthesis, meaning reading across dozens of papers to sort out "what can be said, and how far," is an area where the depth of a model's reasoning matters directly. A top model can produce a synthesis with qualifications (= drawing the line between "this much can be said" and "beyond this we do not know") that accounts for differences in trial design: comparators, follow-up periods, endpoints. In a low-cost environment, the output drifts toward a list of summaries, and the qualifications become shallow. Here too, the difference surfaces in the quality of the answers given to healthcare professionals.
- Materials creation: differences in citation consistency and speed of first drafts → ripple into review rework counts
- Review: differences in the mesh of first-pass screening → but final responsibility always stays with people
- Medical affairs: differences in the depth of evidence synthesis (quality of qualifications) → ripple into the quality of responses to healthcare professionals
What the three scenes share is that the difference shows up not as "can versus cannot" but in "the quality of the first move and the number of round trips." Because the difference is hard to see, it has already piled up by the time anyone notices. That, as I see it, is what makes this kind of inequality troublesome.
06 Forces Working Toward a Level Field: Three Reasons Not to End on Pessimism
So far this article has described the direction in which gaps widen, but forces working the other way exist in reality too. Ending on pessimism would not be fair, so here are three grounds for leveling.
The historical trend of miniaturization and distillation
Through distillation (= a technique that transfers a large model's abilities into a smaller one) and more efficient inference, "last generation's top performance" has become dramatically cheaper almost every year. Public reporting has repeatedly noted that GPT-4-class performance came down to small models within a few years. Today's frontier becomes tomorrow's mass tier — and so far, that slope has not broken.
The existence of open models
Models with published weights (= the data that makes up a model's internals), such as the Llama and DeepSeek families, let you stand outside metered billing by running them on your own servers. They may not reach the frontier, but they catch up fast. In practice they act as a ceiling on the prices of closed top models.
Most work does not need the frontier
Cleaning up meeting minutes, drafting routine documents, translation, internal Q&A. Much of daily work is handled well enough by a previous-generation model. The places where a top model genuinely makes a difference are limited to certain steps, such as long-context cross-checking and deep evidence synthesis. Measured against work as a whole, the area where the gap bites is narrower than one might think.
Putting the three together, my reading is this: the gap will remain, but it will not be fixed in place. Work that only today's frontier can do will be doable at mass-market prices in one to two years. The difference stays relative — whether you can always use the step just ahead of the frontier — and is unlikely to become an absolute break. For small pharmaceutical companies and medical affairs teams, the practical conclusion is not to chase the frontier constantly, but to identify which steps truly require the top model and concentrate the metered budget there alone.
Honesty requires stating the conditions under which this reading fails. That is the case where the top model alone breaks into a qualitatively different dimension. Suppose the lineage from the Opus series to Fable 5 arrives not at incremental gains but at a discontinuous capability — say, being able to hand over an entire investigation or design effort that takes weeks. And suppose that capability does not transfer to lower models through distillation. Then the gap becomes a break in kind, not a difference in quantity, and the three leveling forces stop working. There is no confirmed sign of this at present, but the possibility is not zero. That is exactly why, each time a new generation of model appears, one should ask: is the difference from lower models one of quantity, or of kind? I think that is the most practical way to keep watch on this invisible line.
- The "fairness" of metered pricing ties ability to pay directly to the amount of thinking you receive. With top-tier AI, the thickness of your wallet begins to determine the quality of intelligent support — and the line appears nowhere but on the invoice.
- The inequality has two layers. The "opportunity gap" — whether you can touch the top model — can be closed with budget. The "amplification gap," set by data, people, and distribution, does not disappear by changing the billing model. The gap in attempts snowballs.
- Leveling forces are also at work: cheaper models through distillation, open models, and the fact that most work does not need the frontier. The practical focus is not chasing the top model, but identifying the steps that truly require it — long-context cross-checking, evidence synthesis — and concentrating the budget there.
- Anthropic, "Pricing" — models and API pricing (structure of metered billing). https://www.anthropic.com/pricing
- Anthropic, "News" — announcements of model releases. https://www.anthropic.com/news
- AWS, "Pricing" — explanation of the pay-as-you-go model in cloud computing. https://aws.amazon.com/pricing/
- Stanford HAI, "AI Index Report" — annual survey on inference costs and disparities in usage. https://aiindex.stanford.edu/
- OECD — reports on AI adoption and the digital divide between countries. https://www.oecd.org/digital/artificial-intelligence/
- Hugging Face, "Models" — availability of open models (basis for leveling). https://huggingface.co/models
- Epoch AI — public data on trends in machine learning computation costs. https://epoch.ai/