Retrieval Augmented Generation (RAG)
RAG integrates large language models (LLMs) such as GPT-4 with external databases or APIs, so the model can retrieve information at query time and give more current, more accurate responses.
An easy way to picture RAG is as "the combination of a librarian and a skilled writer". The database or API is the librarian, with extensive knowledge to pull from; the LLM (ChatGPT, for example) is the skilled writer, able to explain the many topics it was trained on. Combined, the writer can now explain whatever additional information the librarian hands over.
Analogies aside, RAG greatly expands the knowledge an LLM can draw on and helps ground its answers in verifiable facts. The retrieval step searches websites and documents for “context” that relates to the user prompt. Only a limited amount of context is selected, and how relevant it is depends on the sophistication of the retrieval method or system. That context is then passed to the LLM (ChatGPT, for example), which adds it to its “context window” (something like the model's short-term memory) and responds to the user prompt using both its built-in knowledge and the retrieved context. The retrieval function can be given access to web browsing, databases, individual documents, or any other external source of information.
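The retrieve-then-generate flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a production design: the DOCUMENTS list, the word-overlap scoring, and every function name here are hypothetical, and generate() is a stand-in for a real LLM call. Real systems typically use embedding-based or keyword search indexes rather than simple word overlap.

```python
import re
from collections import Counter

# Hypothetical in-memory "external source". A real system might query
# a database, an API, or a web search instead.
DOCUMENTS = [
    "The library opens at 9 am on weekdays and closes at 6 pm.",
    "RAG passes retrieved passages to the model inside its context window.",
    "Bananas are rich in potassium.",
]

# Tiny illustrative stop-word list so common words do not count as matches.
STOPWORDS = {"the", "a", "at", "on", "and", "in", "to", "is", "does", "when"}


def tokenize(text):
    """Lowercase the text and split it into words, dropping stop words."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return [w for w in words if w not in STOPWORDS]


def score(query, document):
    """Count the words shared by the query and the document."""
    shared = Counter(tokenize(query)) & Counter(tokenize(document))
    return sum(shared.values())


def retrieve(query, documents, top_k=2):
    """Return up to top_k documents that share at least one word with the query."""
    ranked = sorted(documents, key=lambda doc: score(query, doc), reverse=True)
    return [doc for doc in ranked[:top_k] if score(query, doc) > 0]


def build_prompt(query, passages):
    """Place the retrieved passages in the prompt ahead of the user question."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}"
    )


def generate(prompt):
    """Placeholder for an LLM call (assumption: swap in a real client here)."""
    return f"[model response to a prompt of {len(prompt)} characters]"


def answer(query):
    """Full pipeline: retrieve context, build the prompt, ask the model."""
    passages = retrieve(query, DOCUMENTS)
    return generate(build_prompt(query, passages))
```

Calling answer("When does the library open?") retrieves only the library passage, places it in the prompt, and hands that prompt to the placeholder model. The key design point is that the model never searches anything itself: it only sees whatever the retrieval step chose to put in its context window.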