Perplexity, a San Francisco-based startup, has unveiled two new online large language models (LLMs) called pplx-7b-online and pplx-70b-online.
These online LLMs aim to overcome key limitations of many existing LLMs: the inability to provide up-to-date information and a tendency to hallucinate inaccurate facts.
What Is An Online LLM?
Perplexity’s online LLMs can tap into the latest information from the internet to generate responses, making them uniquely capable of answering queries that depend on recent events or data.
For example, the models can report the latest sports scores, stock prices, or breaking Google news developments.
This real-time grounding contrasts with offline LLMs like GPT-3.5, which rely solely on their training data – data that gradually becomes outdated.
In addition, Perplexity employs various techniques to maximize factual accuracy and minimize the generation of false information.
The PPLX models build on top of the open-sourced mistral-7B and llama2-70B models and have been specifically fine-tuned by Perplexity on diverse, high-quality datasets to optimize for helpfulness and factuality.
Perplexity described its search augmentation approach in the announcement:

"In-house search technology: our in-house search, indexing, and crawling infrastructure allows us to augment LLMs with the most relevant, up-to-date, and valuable information. Our search index is large, updated on a regular cadence, and uses sophisticated ranking algorithms to ensure high quality, non-SEOed sites are prioritized. Website excerpts, which we call 'snippets', are provided to our pplx-online models to enable responses with the most up-to-date information."
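Perplexity has not published its exact prompt format, but the snippet-grounding idea can be sketched as prepending retrieved web excerpts to the user's question. The function and field names below (`url`, `date`, `text`) are hypothetical, for illustration only:

```python
# Illustrative sketch of snippet-grounded prompting: web excerpts
# ("snippets") are prepended to the question so the model answers
# from fresh evidence rather than stale training data.

def build_grounded_prompt(question: str, snippets: list[dict]) -> str:
    """Assemble a prompt that grounds the model in search snippets.

    Each snippet is a dict with 'url', 'date', and 'text' keys
    (hypothetical field names).
    """
    evidence = "\n\n".join(
        f"[{i + 1}] {s['url']} ({s['date']})\n{s['text']}"
        for i, s in enumerate(snippets)
    )
    return (
        "Answer the question using only the web excerpts below. "
        "Cite excerpts by number.\n\n"
        f"{evidence}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_grounded_prompt(
    "Who won yesterday's match?",
    [{"url": "https://example.com/report", "date": "2023-11-28",
      "text": "Team A beat Team B 2-1 on Tuesday night."}],
)
print(prompt)
```

The key design point is that the model is instructed to rely on the supplied excerpts, which is what allows an online LLM to answer questions past any training cutoff.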
How Do PPLX Models Compare To GPT-3.5?
Early testing provided by the company indicates Perplexity's online LLMs match or exceed leading proprietary LLMs like GPT-3.5 on benchmarks measuring robustness, helpfulness, and knowledge of academic subjects.
PPLX models’ ability to tap into the latest online information allows them to provide timely facts and data in response to queries.
How Online LLMs Overcome Accuracy Challenges
A paper titled “FRESHLLMS: Refreshing Large Language Models with Search Engine Augmentation” (under review) highlighted critical limitations in traditional LLMs.
It focused particularly on the struggle for LLMs to stay current with the ever-evolving global knowledge and their tendency to produce factually inaccurate responses, known as hallucinations.
The paper proposed FRESHQA, a dynamic question-answering benchmark designed to evaluate the factuality of LLMs, and introduced FRESHPROMPT, a method that improves LLM performance by incorporating up-to-date information from search engines into the prompt.
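A core element of FRESHPROMPT is ordering retrieved evidence chronologically so the freshest facts sit closest to the question. This is a simplified reading of the paper's method, not its full prompt template, and the field names are hypothetical:

```python
from datetime import date

# Simplified sketch of the FRESHPROMPT ordering idea: sort retrieved
# evidence by date so the most recent item appears last, adjacent to
# the question in the final prompt.

def order_evidence_by_recency(evidences: list[dict]) -> list[dict]:
    """Return evidence oldest-first, so the newest ends up nearest the question."""
    return sorted(evidences, key=lambda e: e["date"])

evidences = [
    {"date": date(2023, 11, 20), "text": "Stock closed at $102."},
    {"date": date(2023, 6, 1), "text": "Stock traded near $80."},
]
ordered = order_evidence_by_recency(evidences)
print([e["text"] for e in ordered])
```

Placing the newest evidence last matters because models tend to weight the context nearest the question most heavily when answering.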
Despite these advances, the paper underscored a persistent challenge for LLMs: maintaining freshness and accuracy in their responses, especially in rapidly changing knowledge domains.
Perplexity's new online LLMs address exactly this gap: by retrieving real-time information from the internet, they sidestep the knowledge cutoff that limits most models. According to the company's testing, the PPLX models show a marked improvement in freshness and factuality, directly addressing the concerns raised in the FRESHLLMS paper.
How To Access Perplexity’s New Online LLMs
The release of these highly capable yet affordable online LLMs marks a pivotal moment for the democratization of AI, according to Perplexity CEO Aravind Srinivas.
Excited to announce that pplx-api is coming out of beta and moving to usage based pricing, along with the first-ever live LLM APIs that are grounded with web search data and have no knowledge cutoff! https://t.co/VYXIjqdLy9
— Aravind Srinivas (@AravSrinivas) November 29, 2023
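The models are available through pplx-api. The sketch below assumes the OpenAI-compatible chat-completions interface Perplexity documented at launch (endpoint, model names, and response shape may change, so check the current docs before use):

```python
import json
import os
import urllib.request

# Minimal sketch of querying pplx-api, assuming its OpenAI-compatible
# chat-completions interface. Requires a PPLX_API_KEY environment
# variable when actually sending a request.

API_URL = "https://api.perplexity.ai/chat/completions"

def build_request(question: str, model: str = "pplx-70b-online") -> dict:
    """Build the JSON payload for a single-turn online query."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
    }

def ask(question: str) -> str:
    """Send the question to pplx-api and return the model's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_request(question)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['PPLX_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__" and "PPLX_API_KEY" in os.environ:
    print(ask("What was yesterday's S&P 500 close?"))
```

Because the online models ground their answers in live search snippets, the same request shape that works for offline chat models here returns answers with no knowledge cutoff.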
By providing access to the latest information and insights from the web, Perplexity’s models help level the playing field between large tech firms and smaller organizations looking to benefit from AI.
With further performance gains on the horizon, Perplexity envisions a new paradigm for search and information discovery centered around conversational interfaces. Its online LLMs hint at a future when we can query an AI assistant much like a human expert – and receive timely, factual, and nuanced responses.
Featured image: Jamie Jin/Shutterstock