Date: 2024-01-01

Guessing the next word

GPT-4 was trained on a huge dataset drawn from vast amounts of publicly available text, such as scientific articles, Wikipedia, general web pages and discussion forums. All of that text was broken down into words or parts of words, known as “tokens”, before going into the model. The system looks at the previous few thousand tokens and tries to guess the next one, and it does this one token at a time. For example, if it has “The cat sat on the…” and is trying to guess the next token, it produces a probability distribution over what that next token is likely to be. The more context it has, for example an earlier statement saying “In my house there is a cat near the front door”, the more likely the guess is to fit. Even without that context, it will still infer the next word from probabilities alone and would likely guess “mat”.
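
To make the idea of guessing from probabilities concrete, here is a minimal Python sketch. It is not how GPT-4 is implemented: the candidate words and their scores in `TOY_LOGITS` are invented for illustration, and the helper names `softmax` and `most_likely_token` are hypothetical. It only shows how a set of raw scores can be turned into probabilities and how the most likely next token can then be picked.

```python
import math

# Hypothetical scores for a few candidate next tokens after
# "The cat sat on the". These numbers are invented for illustration
# and do not come from any real model.
TOY_LOGITS = {"mat": 3.0, "sofa": 2.0, "floor": 1.5, "moon": -1.0}


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - peak) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: value / total for tok, value in exps.items()}


def most_likely_token(logits):
    """Greedy choice: return the token with the highest probability."""
    probs = softmax(logits)
    return max(probs, key=probs.get)


probs = softmax(TOY_LOGITS)
next_token = most_likely_token(TOY_LOGITS)

# Show each candidate with its probability, most likely first.
for token, p in sorted(probs.items(), key=lambda item: -item[1]):
    print(f"{token}: {p:.2f}")
print("Guess:", next_token)
```

With these made-up scores the guess is “mat”. In practice, language models often sample from the distribution rather than always taking the top choice, which is one reason the same prompt can give different answers.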