Snowflake AI Research

We are a team with extensive experience building systems and technology that have significantly reduced the cost of LLM training and inference. A lot of our work has been open-sourced to provide the AI community with more accessible and cost-effective LLMs. The team includes many specialists in natural language processing and search. With the help of thousands of engineers worldwide at Snowflake, our cutting-edge technology powers enterprise AI products in Cortex AI and more. Check out what we're working on: https://www.snowflake.com/en/product/ai/ai-research/

Gen AI

Fastest Speculative Decoding in vLLM with Arctic Inference and Arctic Training

How we enhanced speculative decoding to get 4x faster end-to-end task completion for LLM agents and up to 2.8x faster decoding for conversational, interactive and coding workloads.

Snowflake AI Research

MAY 01, 2025|18 min read

MORE POSTS FROM Snowflake AI Research

Gen AI

Evaluating Multimodal vs. Text-Based Retrieval for RAG with Snowflake Cortex

Discover how multimodal retrieval on Snowflake Cortex transforms enterprise PDF search, enhancing accuracy and speed across complex document formats.

Snowflake AI Research

APR 21, 2025|8 min read

Gen AI

Low-Latency and High-Throughput Inference for Long Context with Sequence Parallelism (aka Arctic Ulysses)

Ulysses, a novel sequence parallelism technique, boosts long-context LLM inference performance with 3.4x lower latency and better GPU efficiency.

Samyam Rajbhandari (Tech Lead) | Snowflake AI Research

APR 03, 2025|14 min read

Gen AI

Think. Execute. Excel: Arctic Text2SQL with Execution-Guided CoT

Learn how Snowflake’s ExCoT optimizes Text2SQL with execution-guided CoT and DPO, setting a new benchmark in natural language to SQL accuracy.

Snowflake AI Research

APR 02, 2025|10 min read

Gen AI

Snowflake Arctic Embed Joins ArcticTraining: Simple And Scalable Embedding Model Training

Arctic Embed now merges with ArcticTraining, giving developers open access to core training code for building efficient frontier embedding models.

Luke Merrick | Snowflake AI Research

MAR 25, 2025|10 min read

Gen AI

Arctic Agentic RAG Episode 1: Agentic Query Clarification for Grounded and Speedy Responses

Discover how Arctic Agentic RAG improves AI accuracy with agentic query clarification, delivering grounded, speedy responses for enterprise AI applications.

Snowflake AI Research

FEB 18, 2025|9 min read

Gen AI

Eval-Guided Optimization of LLM Judges for the RAG Triad

Learn how eval-guided optimization enhances LLM Judges for RAG systems, improving context relevance and answer relevance with advanced benchmarking.

Snowflake AI Research

FEB 04, 2025|8 min read

Gen AI

Benchmarking LLM-as-a-Judge for the RAG Triad Metrics

Evaluating RAG systems with LLM-as-a-Judge: explore our benchmarking results and learn how the RAG Triad metrics build trust in enterprise RAG systems.

Snowflake AI Research

JAN 31, 2025|22 min read

Machine Learning

Machine Learning Models Require the Right Explanation Framework — and It’s Easy to Get Wrong

Learn the importance of using a robust explanation framework for machine learning models, ensuring accuracy, trust and understanding for better decision-making.

Snowflake AI Research

JAN 23, 2025|7 min read

Gen AI

ArcticTraining: Simplifying and Accelerating Post-Training for LLMs

ArcticTraining is a streamlined framework for LLM post-training that offers flexible trainers, simplified structures, and a native data generation pipeline.

Snowflake AI Research

JAN 16, 2025|10 min read



