Sandeep Vellanki

Manager, Data Science Team
Deloitte Services LLP
Bio

Sandeep Vellanki is a Data Science manager leading a team focused on transforming complex data into actionable insights. With over 13 years of experience, Sandeep specializes in machine learning, predictive modeling, and data visualization, delivering solutions across industries. He bridges the gap between technical and non-technical stakeholders by translating intricate data concepts into clear, practical terms.

Sandeep has been at the forefront of emerging technologies, particularly in Natural Language Processing (NLP) and generative AI. Notable projects include developing several generative AI-powered tools at Deloitte that customize content and improve the efficiency of client-facing teams. His work on the QRAG architecture has also improved the performance of retrieval-augmented generation (RAG) models by improving context retrieval and specificity.

Sandeep holds an MBA in Marketing from IMT-Ghaziabad and an undergraduate degree in Engineering from VNR VJIET, JNTU, Hyderabad.


Question-Based Retrieval-Augmented Generation (QRAG): Evaluating a New Approach to Retrieval-Augmented Generation (RAG)

Featuring: Paula Payton

Large Language Models (LLMs), a class of natural language processing (NLP) models, enable sophisticated text generation, comprehension, and reasoning. However, LLMs often suffer from issues like hallucinations, outdated knowledge, and high computational costs. Retrieval-Augmented Generation (RAG) has emerged as a powerful way to mitigate these limitations since it was first introduced in 2020. RAG systems dynamically retrieve information from an external knowledge base and then generate contextual, coherent responses based on the retrieved information. However, RAG models are not without their challenges, which include latency, retrieval noise, and optimization issues. In this presentation, we explore a new approach to RAG, question-based RAG (QRAG), that can enhance efficiency and accuracy while maintaining or improving response quality. We then compare traditional LLMs and standard RAG systems with our new QRAG approach, analyzing key benchmarks such as latency, precision, and relevance scoring. We close with a discussion of possible use cases and applications for QRAG, as well as future research directions.