BUILD: The Dev Conference for AI & Apps (Nov. 12-14)

Hear the latest product announcements and push the limits of what can be done in the AI Data Cloud.

Blog

Cortex Analyst: Paving the Way to Self-Service Analytics with AI

Snowflake Cortex logo

Today, we are excited to announce the public preview of Snowflake Cortex Analyst. Cortex Analyst, built using Meta’s Llama and Mistral models, is a fully managed service that provides a conversational interface to interact with structured data in Snowflake. It streamlines the development of intuitive, self-serve analytics applications for business users, while providing industry-leading accuracy.

Historically, business users have primarily relied on BI dashboards and reports to answer their data questions. However, these resources often lack the flexibility needed, leaving users dependent on overburdened data analysts for updates or answers, which can take days. Cortex Analyst disrupts this cycle by providing a natural language interface with high text-to-SQL accuracy. With Cortex Analyst organizations can streamline the development of intuitive, conversational applications that can enable business users to ask questions using natural language and receive more accurate answers in near real time.

To deliver high text-to-SQL accuracy, Cortex Analyst uses an agentic AI setup powered by state-of-the-art LLMs. Available as a convenient REST API, Cortex Analyst can seamlessly integrate into any application. This empowers developers to customize how and where business users interact with results, while still benefiting from Snowflake's integrated security and governance features, including role-based access controls (RBAC), to protect valuable data.

Self-service analytics app using Cortex Analyst
Figure 1. Self-service analytics app using Cortex Analyst

Cortex Analyst boosts the value of data investments at Bayer

Bayer, a leader in pharmaceutical and biomedical innovation, is using Cortex Analyst to power self-serve analytics on its enterprise data. Previously, teams had access to the enterprise data platform using dashboards, which often lacked the flexibility to address the increasing number of user questions.

Now, using Streamlit in Snowflake as the chat interface and Cortex Analyst as the query-generation service, Bayer is able to augment existing business intelligence with self-serve analytics in natural language. In the first phase, its application focused on answering high-value, executive questions from sales VPs, like: "What was the market share of product X in the last month?" It has since expanded to support business unit analysts, offering detailed, row-level data for deeper analysis. You can learn more about Bayer’s journey here.

What if internal functional users could ask specific questions directly on their enterprise data and get responses back with basic visualizations? The core of this capability is high-quality responses to a natural language query on structured data, used in an operationally sustainable way. This is exactly what Snowflake Cortex Analyst enables for us. What I’m most excited about is we’re just getting started, and we’re looking forward to unlocking more value with Snowflake Cortex AI.”

Mukesh Dubey
Product Owner Data Platform, CH NA, Bayer

Current “AI for BI” solutions aim to democratize access to analytics but struggle with accuracy

Large language models (LLMs), thanks to their ability to both understand natural language and generate code, have set off a wave of interest around democratizing analytics by making it accessible beyond those with SQL knowledge. However, despite recent advancements, LLMs still struggle to understand real-world databases and schemas, which is necessary to generate accurate SQL and reliable responses. Industry solutions that rely solely on leveraging LLMs for text-to-SQL on raw schema suffer from low accuracy and often fail to progress from demo to production.

As outlined in a recent Forrester report, anecdotal evidence shows a best-case 70% rate in successfully generating accurate, executable code for simple single-table queries and around a 20% success rate, at worst, with multiple tables and/or complex joins. If a service is meant to answer business users' data questions, high accuracy is paramount. To ensure adoption of natural language-based solutions, business teams need to trust that the results delivered consistently reflect actionable facts. Without accurate results that bring trust, no other feature matters.

Cortex Analyst relies on four key principles to achieve reliable, high accuracy

Achieving high accuracy in SQL-code generation across diverse use cases requires more than just the use of advanced LLMs. At Snowflake, we've been leading with the following key principles to build a trustworthy product that can be embedded into any application: 

  1. Capture semantics: Raw schemas often lack semantic information, making it difficult for an LLM to answer data questions accurately based on the business user’s intent. Similar to a human analyst, the system needs to understand the intent of the question, including the user’s vocabulary and specific jargon. Semantic data models should be used to capture this information to provide high precision.

  2. Contain the problem space: Creating use case-specific semantic data models, such as those for marketing analytics or sales analytics, is important. SQL-generation accuracy can be driven significantly higher within a contained scope, as opposed to targeting the entire database schema. Too many similar-sounding tables and columns can confuse the LLMs and thus reduce accuracy.

  3. Reject unanswerable questions and suggest alternatives: Like a smart analyst, our system proactively identifies and rejects ambiguous or unanswerable questions, given the available data. Instead of producing incorrect results, it suggests alternative queries that can be confidently answered, maintaining user trust.

  4. Evolve with the technology: Even state-of-the-art LLMs currently struggle to generate correct SQL on complex schemas. While enabling joins isn't hard, delivering accurate results without over/undercounting post-joins and handling tricky schema shapes, like chasm traps and fan traps, are challenging for current AI models. By keeping schemas simple, we can significantly improve the reliability and accuracy of generated SQL.

Given Snowflake’s primary focus on accuracy, we have intentionally focused the current product scope with the above strategies, observing ~90% or higher accuracy reliably across customer evaluations as well as in our benchmark tests. Over time, we plan to gradually introduce more features to tackle advanced use cases while maintaining the same high bar for accuracy that translates to trust.

Text-to-SQL accuracy
Figure 2. SQL generation accuracy for Cortex Analyst vs. alternatives

The benchmark above is based on one of our internal evaluation sets, which is representative of real-world use cases. The results show that Cortex Analyst is consistently close to 2X more accurate than single-shot SQL generation from state-of-the-art (SoTA) LLMs and delivers approximately 14% higher accuracy than another text-to-SQL solution in the market. Stay tuned for our upcoming engineering blog post, where we will delve into the benchmark details and results.

Cortex Analyst accelerates deployment of conversational self-service analytics, while lowering TCO

Building a production-grade solution requires a service that generates accurate text-to-SQL responses. For most teams, developing such a service is a daunting task, often resulting in months of work that never progresses beyond the demo stage. Balancing accuracy, latency and costs is challenging, often yielding suboptimal solutions. Cortex Analyst simplifies this process by providing a fully managed, sophisticated agentic AI system that handles all of these complexities, generating highly accurate text-to-SQL responses.

Cortex Analyst offers a convenient REST API that can be integrated into any application, giving developers the flexibility to tailor how and where business users view and interact with the results. By using the API, developers are alleviated from the time-consuming burdens of:

  • Model evaluation and fine-tuning: Different models have different strengths, and the LLM landscape continues to rapidly evolve. Selecting the right model(s) is a long, time-consuming and ongoing effort.

  • Building and maintaining complex solution architectures: To deliver consistent and reliable accuracy, the service may require multi-agent interactions, integrations with multiple components and even maintenance of a complex retrieval-augmented generation (RAG) architecture. 

  • GPU capacity planning: Bringing insights to life in a production environment requires having applications ready for unplanned user demand. Managing and scaling GPU resources becomes one of the growing set of pieces developers need to support. 

Using Cortex Analyst can significantly lower the total cost of ownership (TCO) and time to deliver of reliable self-serve analytics. Since the generated SQL queries run on Snowflake's highly scalable engine, teams experience leading performance during query execution, in addition to integrated cost-governance controls across the stack.

Cortex Analyst prioritizes data security and governance

Data privacy and governance are of utmost importance to enterprises. As organizations explore using LLMs for data analytics, concerns arise around data privacy, access controls and LLMs accessing internal data. Snowflake’s privacy-first foundation and enterprise-grade security features enable you to explore and execute important use cases with the latest AI advancements while leveraging high standards of data privacy and governance.

  • Cortex Analyst does not train on Customer Data.  We do not use Customer Data to train or fine-tune any model to be made available for use across our customer base. Additionally, for inference Cortex Analyst utilizes the metadata provided in the semantic model YAML file (e.g., table names, column names, value type, descriptions, etc.) for SQL-query generation. This SQL query is then executed in your Snowflake virtual warehouse to generate the final output. 
  • Customer Data never leaves Snowflake’s governance boundary. By default, Cortex Analyst functionality is powered by state-of-the-art Snowflake-hosted LLMs from Mistral and Meta. This ensures that no data, including metadata or prompts, leaves Snowflake’s governance boundary. If you opt-in to allow the use of Azure OpenAI models, only metadata and the user’s questions travels outside of Snowflake’s governance boundary. 
  • Fully integrated with native Snowflake privacy and governance features. Cortex Analyst respects all RBAC policies set by administrators, ensuring that generated SQL queries, when executed, comply with the appropriate access controls. This integration enables robust security and governance for your data.

How Cortex Analyst works

Answering users' data questions involves a comprehensive workflow that includes interaction between multiple LLM agents, with guardrails at every step to prevent hallucinations and deliver accurate and trustworthy answers.

At a high level, Cortex Analyst follows these steps to go from a natural language question to a response:

  1. Request. The end user-facing client application sends a request that includes the user question and related Semantic Model YAML to Cortex Analyst REST API.

  2. Question understanding and enrichment. The intent of the user’s question is analyzed to determine if it can be answered. 

    1. If the question is ambiguous and can’t be answered confidently, a list of similar, confidently answerable questions are returned. This prevents the user from being stuck and enables them to continue receiving reliable answers to their data questions. 

    2. If the question is categorized as answerable, it is enhanced with semantics captured in the provided YAML file. The file includes information such as names for measures and dimension columns, default aggregations, synonyms, descriptions, sample values and more.

  3. SQL generation and error correction. To provide the best response, the enriched context is passed to multiple SQL-generation agents, each using a different LLM. Different LLMs excel at various tasks — some are better with time-related concepts, while others handle multilevel aggregations more effectively. Using multiple LLMs enhances query-generation accuracy and robustness. Next, an Error Correction Agent checks the generated SQL for syntactic and semantic errors, using core Snowflake services like the SQL Compiler. If errors are found, the agent runs a correction loop to fix them. This module also addresses hallucinations, correcting instances where the model might invent entities outside the Semantic Data Model or use nonexistent SQL functions.

  4. Response. As a final step, all of the generated SQL queries are forwarded to a Synthesizer Agent. Leveraging the work done by the previous agents, the Synthesizer Agent generates the final SQL query that most accurately answers the question at hand. The SQL query, along with the interpretation of the user question, is included in the API response. The returned SQL query can be executed in the background of the client application, and the final results are presented to the end user.
Figure 3.
Figure 3. Cortex Analyst: How it Works

To learn more about the various agents, LLMs and tools used in this agentic AI system, check out our detailed Behind the Scenes blog.

What’s next for Cortex Analyst

Future iterations will expand product scope to handle more advanced use cases without compromising on accuracy or trust. By starting with a clear focus and expanding methodically, we aim to deliver a product that truly empowers business users to "talk to their data" with confidence. In the upcoming quarter, stay tuned for some exciting feature updates, including: 

  • Integration with Cortex Search for automatic literal/sample value retrieval
  • Support for joins, with high accuracy
  • Support for multi-turn conversations to provide a more interactive experience
  • Snowsight UI for easier semantic model creation, iteration, management and feedback loop

Get started with Cortex Analyst

Build your first Cortex Analyst-powered chat app using this quickstart guide.

For further details on the feature, as well as best practices for obtaining more accurate results, be sure to check out the Cortex Analyst documentation.

AI Data Cloud Academy

Generative AI & ML School

Welcome to the AI Data Cloud Academy, providing an introduction to cutting-edge AI research as well as generative AI and ML functionality in Snowflake.

Learn about Snowflake's Shared Responsibility Model

Subscribe to our blog newsletter

Get the best, coolest and latest delivered to your inbox each week

Start your 30-DayFree Trial

Try Snowflake free for 30 days and experience the AI Data Cloud that helps eliminate the complexity, cost and constraints inherent with other solutions.