How Token Usage is Calculated in LLMs
Tokens are the small chunks of text that large language models read and write.
A token is usually a short word or a piece of a word. In English, one token averages roughly four characters, or about three-quarters of a word.
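The four-characters-per-token rule of thumb can be turned into a quick estimator. The following is a minimal sketch, not a real tokenizer: the function name `estimate_tokens` and the `chars_per_token` parameter are illustrative assumptions, and actual counts depend on the specific model's tokenizer and on the language of the text.

```python
import math


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Roughly estimate the token count of English text.

    Assumes about `chars_per_token` characters per token (a rule of
    thumb for English). This is a heuristic, not a real tokenizer.
    """
    if not text:
        return 0
    # Round up so that any non-empty text counts as at least one token.
    return math.ceil(len(text) / chars_per_token)


if __name__ == "__main__":
    prompt = "Tokens are the small chunks of text that models read and write."
    print(estimate_tokens(prompt))
```

For billing-grade numbers, use the tokenizer or token-counting facility that the model provider publishes; the heuristic is only meant for quick sizing.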
When you send a prompt to a model like GPT-4 or Claude, the system breaks your text into tokens, processes them, and then generates a response made of more tokens.
Most providers bill you for both the input tokens and the output tokens, often at different rates, with output tokens typically priced higher than input tokens.
Knowing the approximate token count of a prompt and an expected reply lets you estimate the cost of a single call before you ever run it.
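As a rough illustration of that estimate, the sketch below multiplies each token count by its own rate and adds the two. The function name `estimate_cost` and the prices in the usage example are hypothetical placeholders, not any provider's actual rates, and the sketch assumes prices are quoted per million tokens.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate the cost of one call, in the same currency as the prices.

    Assumes the provider quotes separate per-million-token prices for
    input and output tokens.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens * input_price_per_million
    output_cost = output_tokens * output_price_per_million
    return (input_cost + output_cost) / 1_000_000


if __name__ == "__main__":
    # Hypothetical prices: 3.00 per million input tokens,
    # 15.00 per million output tokens.
    cost = estimate_cost(1200, 400, 3.00, 15.00)
    print(f"{cost:.4f}")
```

With these placeholder prices, a call with 1,200 input tokens and 400 output tokens comes to about 0.0096 currency units: 0.0036 for the input and 0.0060 for the output.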