How does Perplexity select which sources to use in its answers?
Perplexity uses Retrieval-Augmented Generation (RAG) on every query: it searches the live web before generating a response, unlike ChatGPT, which may answer some queries from training data alone.
The retrieval process works roughly as follows:
- Query expansion: Perplexity rewrites the user query into multiple search-optimized forms
- Web retrieval: it fetches results from its search index, with a preference for recently updated, crawlable content
- Chunking and ranking: retrieved pages are split into passages; passages are ranked by embedding similarity to the query
- Generation: the top-ranked passages are passed to the language model, which synthesizes an answer and cites sources
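The chunking and ranking step above can be illustrated with a small sketch. This is not Perplexity's implementation, which is not public. It assumes passages are split on blank lines and stands in for real embeddings with simple word-count vectors compared by cosine similarity. The function names and example URLs are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_into_passages(page_text):
    """Assumption: one passage per blank-line-separated paragraph."""
    return [p.strip() for p in re.split(r"\n\s*\n", page_text) if p.strip()]


def cosine_similarity(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_passages(query, pages, top_k=2):
    """Score every passage of every page against the query; return the best."""
    # Word counts are a stand-in for a learned embedding model.
    query_vec = Counter(tokenize(query))
    scored = []
    for url, text in pages.items():
        for passage in split_into_passages(text):
            score = cosine_similarity(query_vec, Counter(tokenize(passage)))
            scored.append((score, url, passage))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical pages: a generic overview and a narrowly focused article.
pages = {
    "https://example.com/generic": (
        "Search engines are changing fast.\n\n"
        "Many brands publish broad overviews of marketing."
    ),
    "https://example.com/specific": (
        "How does passage ranking work? Passage ranking scores each "
        "passage against the query.\n\n"
        "Contact us for pricing."
    ),
}

top = rank_passages("how does passage ranking work", pages, top_k=2)
for score, url, passage in top:
    print(f"{score:.2f}  {url}  {passage[:50]}")
```

Ranking happens per passage, not per page, so a single focused paragraph can win even when the rest of its page is irrelevant.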
For GEO, this means three things: content must be crawlable, specific passages must be retrievable as standalone units, and recency matters, since Perplexity favors recently updated content for factual queries. A highly specific, authoritative article on a narrow topic from a site with lower domain authority can outrank a generic overview from an established brand.
The single highest-leverage action for Perplexity citation is ensuring each major section of your content answers a complete question without requiring the surrounding context.
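One rough way to audit content for this is to flag sections whose opening words lean on earlier text. The sketch below is a hypothetical heuristic, not a Perplexity rule: the list of context-dependent openers is an assumption, and it only catches the most obvious cases.

```python
# Assumed list of openers that usually point back at earlier text.
CONTEXT_DEPENDENT_OPENERS = (
    "this ", "that ", "these ", "those ", "it ", "they ",
    "as mentioned", "as noted", "as discussed", "see above",
)


def flag_context_dependent_sections(sections):
    """Return headings of sections whose body opens with a back-reference."""
    flagged = []
    for heading, body in sections.items():
        opening = body.strip().lower()
        # str.startswith accepts a tuple and matches any of its prefixes.
        if opening.startswith(CONTEXT_DEPENDENT_OPENERS):
            flagged.append(heading)
    return flagged


# Hypothetical sections keyed by heading.
sections = {
    "How does Perplexity select sources?": (
        "Perplexity retrieves pages, ranks passages, and cites the top ones."
    ),
    "Why does recency matter?": (
        "As mentioned above, it prefers fresh content."
    ),
}

flagged = flag_context_dependent_sections(sections)
print(flagged)
```

A flagged section is only a candidate for rewriting. The usual fix is to restate the subject by name in the first sentence so the passage still makes sense when retrieved on its own.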