For those venturing into large language models, the acronym RAG may have piqued your interest – it stands for retrieval augmented generation. But why this sudden surge in RAG’s popularity? The answer lies in its ability to address some of the inherent shortcomings of large language models. This machine learning approach seamlessly marries the intrinsic capabilities of large language models (LLMs) with external knowledge and information retrieval systems. Instead of solely depending on the pre-existing knowledge of an LLM, the RAG pattern ventures beyond, fetching pertinent external information to enrich responses with a newfound level of context and accuracy.

Unmasking the Limitations of LLMs

Many of us have delved into the world of Large Language Models (LLMs) through platforms like ChatGPT. With extended usage, you might have discerned certain limitations beyond the occasional peculiar response. The foremost limitation is the confinement to training data, as the screenshot below illustrates.

Figure 1 – ChatGPT is unable to provide real-time information

As the screenshot shows, when I asked who won the 2022/2023 Serie A championship, ChatGPT could not answer: its knowledge was last updated in September 2021, while the championship was decided in 2023. But what if we could keep it updated in real time?

In our conversation, I tried to introduce current information by pasting the latest sports news about the 2022/2023 season into the prompt and then repeating the question. Here we hit a second standard limitation of language models: contextual constraints. ChatGPT informed me that the message exceeded the maximum context length, a well-known challenge when working with these models.

Figure 2 – Contextual constraint error
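
To make this limitation concrete, here is a minimal sketch of how you might check whether a prompt fits before sending it, assuming OpenAI’s tiktoken tokenizer and a 4,096-token context window (the actual limit varies by model):

```python
import tiktoken  # OpenAI's open-source tokenizer

MODEL = "gpt-3.5-turbo"   # assumed model name
CONTEXT_WINDOW = 4096     # assumed limit; varies by model

def fits_in_context(prompt: str, reserved_for_answer: int = 500) -> bool:
    """Return True if the prompt leaves enough room for the model's answer."""
    enc = tiktoken.encoding_for_model(MODEL)
    n_tokens = len(enc.encode(prompt))
    return n_tokens + reserved_for_answer <= CONTEXT_WINDOW

# Pasting months of sports news into one message blows past the limit,
# which is exactly the error ChatGPT reported above.
news_dump = "Latest sports news: ... " * 2000
print(fits_in_context(news_dump))  # False
```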

In short, we face two issues: the model cannot retrieve information beyond September 2021, and the token limit prevents us from feeding it up-to-date information in the prompt.

How can we overcome those specific issues?

Let’s use the previous example and ask a new question: who won the 200m at the Zurich Weltklasse Diamond League in 2023? To overcome both limitations, we can build an external knowledge base: a database of sports results from September 2021 onwards, stored in a vector database (e.g., Pinecone). We can then develop a system that retrieves the relevant records and sends them as context to the LLM, ensuring it always receives accurate, current information.
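
Here is a minimal sketch of that idea, assuming the sentence-transformers library for embeddings and a small in-memory store standing in for a real vector database such as Pinecone (the documents are illustrative):

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed embedding model

# Toy knowledge base: results published after the LLM's training cutoff.
documents = [
    "2022/2023 Serie A: Napoli won the Italian championship.",
    "Weltklasse Zurich 2023 Diamond League: men's 200m final results.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
doc_vectors = model.encode(documents, normalize_embeddings=True)

def retrieve(question: str, top_k: int = 1) -> list[str]:
    """Return the knowledge-base entries most similar to the question."""
    q = model.encode([question], normalize_embeddings=True)[0]
    scores = doc_vectors @ q               # cosine similarity (vectors are unit-norm)
    best = np.argsort(scores)[::-1][:top_k]
    return [documents[i] for i in best]

# The retrieved entries are prepended to the prompt, so the LLM answers
# from fresh information instead of its frozen training data.
print(retrieve("Who won the 2022/2023 Serie A championship?"))
```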

This system is called Retrieval Augmented Generation (RAG). It works as illustrated in the chart below:

Figure 3 – Retrieval Augmented Generation flow (image by the author)

Deciphering Retrieval Augmented Generation (RAG)

RAG is a strategy for enhancing the capabilities of LLMs by enabling them to search and retrieve information from a knowledge base. This knowledge base could include corporate documents, news archives, or any other data that was not part of the LLM’s training set.

A RAG pipeline can be broken down into three components (see Figure 3):

  1. The knowledge base is transformed into a vectorized format, simplifying search and retrieval.
  2. A tool comprising a retriever and a question-answering model. The retriever identifies relevant information in the knowledge base through similarity search, addressing the context-window issue by providing concise, pertinent context. The question-answering model then uses this condensed context to answer the query (a minimal sketch of this tool follows the list).
  3. An LLM powers an agent with access to the tool. While not mandatory, an agent proves valuable for tackling more complex questions, as it can employ reasoning frameworks like ReAct. The agent serves as an intermediary, assessing whether the tool has successfully retrieved relevant information and answered the original question.
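
Putting the pieces together, a minimal sketch of the retriever-plus-question-answering tool (component 2) might look like the following. It assumes the retrieve function sketched earlier and the OpenAI Python client; any chat-capable LLM would do, and the agent layer (component 3) would call this function as a tool:

```python
from openai import OpenAI  # assumed LLM client; any chat-capable LLM works

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def answer_with_rag(question: str) -> str:
    """Retrieve relevant context, then ask the LLM to answer from it."""
    # Similarity search over the vectorized knowledge base (component 1).
    context = "\n".join(retrieve(question, top_k=3))
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system",
             "content": "Answer only from the provided context. "
                        "If the context is insufficient, say so."},
            {"role": "user",
             "content": f"Context:\n{context}\n\nQuestion: {question}"},
        ],
    )
    return response.choices[0].message.content

# An agent (e.g. following the ReAct pattern) would invoke answer_with_rag()
# as a tool and judge whether the retrieved context actually answered the question.
```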

Why is RAG important for developing business applications connected to LLMs?

In essence, RAG bridges the gap between general-purpose language models and the specific needs of businesses, enabling the creation of tailored applications that leverage the vast knowledge available on the internet and within organizations. This makes RAG a valuable tool for companies looking to harness the power of language models to solve real-world problems and gain a competitive edge.

Retrieval Augmented Generation (RAG) is highly relevant for building business applications connected to Large Language Models (LLMs) due to several key advantages:

  1. Access to Proprietary Knowledge: Many businesses have proprietary data, documents, and information crucial to their operations. RAG allows LLMs to tap into this proprietary knowledge, creating applications that can provide insights and answers based on an organization’s internal documents and data.
  2. Real-time Updates: Business environments are dynamic, and staying current with the latest information is essential. RAG enables LLMs to retrieve real-time data from external sources, such as news feeds or databases, ensuring that applications remain relevant and accurate.
  3. Customization for Specific Domains: Businesses often operate in specialized domains with their own terminology and nuances. RAG can be customized to understand and respond to domain-specific queries and requirements, making it a valuable tool for creating domain-specific business applications.
  4. Improved Decision Support: Business applications powered by RAG can assist in decision-making processes by providing contextually rich information and insights. This can be particularly valuable in market analysis, risk assessment, and strategy development.
  5. Efficient Customer Support: RAG can enhance customer support by enabling chatbots and virtual assistants to access a wider knowledge base, answer complex queries, and provide more personalized assistance to customers, improving the overall customer experience.
  6. Compliance and Risk Management: In regulated industries, businesses must adhere to specific guidelines and compliance requirements. RAG can assist in accessing and interpreting relevant regulatory documents, helping companies to remain compliant and manage risks effectively.
  7. Transparency and Accountability: RAG-based applications can provide citations or references to the sources of information they use, promoting transparency and accountability. This is essential in situations where accurate information and citations are critical, such as legal or financial contexts.
  8. Cost-Effective Solutions: Instead of investing in developing entirely new language models for specific business applications, RAG allows organizations to leverage existing LLMs and enhance them with external knowledge sources. This approach can be more cost-effective and efficient.

Exploring RAG’s Boundaries

From the discussion so far, you might conclude that RAG is a panacea. However, responsible practitioners must acknowledge its limitations. Assessing the performance of RAG-enabled LLMs can be challenging due to their composite nature. A pragmatic approach evaluates each component (retrieval and question answering) separately, though their interplay complicates matters: a weak retrieval component invariably drags down the question-answering component’s performance.
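
For example, the retrieval component can be scored on its own with a metric such as recall@k; here is a minimal sketch, assuming a small hand-labeled evaluation set and the retrieve function from earlier:

```python
def recall_at_k(eval_set, retrieve_fn, k: int = 3) -> float:
    """Fraction of questions whose gold document appears in the top-k results."""
    hits = sum(
        1 for question, gold_doc in eval_set
        if gold_doc in retrieve_fn(question, top_k=k)
    )
    return hits / len(eval_set)

# Hypothetical hand-labeled pairs: (question, document that should be retrieved).
eval_set = [
    ("Who won the 2022/2023 Serie A?",
     "2022/2023 Serie A: Napoli won the Italian championship."),
]
print(recall_at_k(eval_set, retrieve, k=3))
```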

Constructing evaluation datasets can also prove intricate. RAG applications built on proprietary business documents may require bespoke datasets, since pre-existing ones are unlikely to cover the domain.

It is also crucial to recognize that no system is flawless, even when the evaluation is accurate. So how can we address this in RAG-based applications? The consensus among engineers I’ve consulted is to offer a transparent user interface (UI): RAG-based applications increasingly accompany their responses with citations or references to the knowledge-base sections they draw upon. This additional context empowers users to judge the reliability of the responses themselves.
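
One lightweight way to build such a UI is to return the retrieved passages alongside the answer so they can be rendered as citations; here is a minimal sketch on top of the earlier functions (the CitedAnswer structure is an illustrative assumption, not a standard):

```python
from dataclasses import dataclass

@dataclass
class CitedAnswer:
    answer: str
    sources: list[str]  # knowledge-base passages the answer drew on

def answer_with_citations(question: str) -> CitedAnswer:
    """Return the LLM's answer together with the passages it was shown."""
    passages = retrieve(question, top_k=3)   # same retriever as before
    answer = answer_with_rag(question)       # LLM answer grounded in that context
    return CitedAnswer(answer=answer, sources=passages)

# The UI renders .answer as the response and .sources as expandable citations,
# letting users judge the reliability of the answer for themselves.
```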

In Conclusion

RAG emerges as an affordable and straightforward method to enhance LLMs and mitigate their inherent limitations. Nevertheless, the practical implementation of RAG-enabled models demands careful evaluation. Thoughtful attention to user interface and user experience is essential to ensure that users receive sufficient context to evaluate the reliability of their RAG-enabled models’ responses.

Author:
Federico Cesconi
