
Simplifying Multimodal Data Analysis with Snowflake Cortex AI

Snowflake Cortex AI now features native multimodal AI capabilities, eliminating data silos and the need for separate, expensive tools. Introducing Cortex AI COMPLETE Multimodal, now in public preview. This major enhancement brings the power to analyze images and other unstructured data directly into Snowflake’s query engine, using familiar SQL at scale. Unify your structured and unstructured data more efficiently and with less complexity. Importantly, this works seamlessly across Snowflake data, Iceberg tables, and object storage like Amazon S3 — all without moving your data. Leverage Snowflake’s built-in security and governance to generate deeper, trusted insights across all types of enterprise data. 

Bridging the data gap

In today’s data-driven landscape, organizations that effortlessly combine insights from unstructured sources like text, image, audio, and video with structured data are gaining a significant competitive advantage. With Cortex AI COMPLETE Multimodal, these complex tasks take just a few lines of SQL, which reduces the cost of analyzing data. For example, teams can enhance predictive models by including text and images, correlate medical imaging to treatment outcomes, or identify manufacturing defects from production line photos.

Process all your data where it already lives

Fragmented data environments and complex cloud architectures impede efficiency and innovation. To address this, Cortex AI COMPLETE Multimodal, now in public preview, lets enterprises process image files directly within a single, secure, unified platform that is easier to manage and scale. Cortex AI’s managed platform handles batching automatically and provides high throughput for unstructured data stored in Snowflake or in an external cloud object store such as an Amazon S3 bucket, so teams no longer need to spend cycles building solutions that orchestrate jobs between different cloud services. This accelerates visual data insights and their integration with structured data, improving the speed and agility of technical data teams. The resulting streamlined architecture reduces complexity, accelerates time to insight, and lowers total cost of ownership.
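As a rough illustration of processing images where they already live, the sketch below points an external stage at an S3 location and runs a prompt over every file listed in the stage's directory table. The stage name, bucket URL, and storage integration (ad_images_stage, s3://example-bucket/ads/, my_s3_integration) are hypothetical placeholders, and the stage is assumed to meet Snowflake's requirements for multimodal input.

```sql
-- Hypothetical names: ad_images_stage, my_s3_integration, and the bucket URL
-- are placeholders for illustration, not objects from the article.
CREATE OR REPLACE STAGE ad_images_stage
  URL = 's3://example-bucket/ads/'
  STORAGE_INTEGRATION = my_s3_integration
  DIRECTORY = (ENABLE = TRUE);

-- Run one prompt against each image file listed in the stage's directory table.
SELECT
    relative_path,
    SNOWFLAKE.CORTEX.COMPLETE(
        'claude-3-5-sonnet',
        'Describe this image in one sentence.',
        TO_FILE('@ad_images_stage', relative_path)
    ) AS image_description
FROM DIRECTORY(@ad_images_stage);
```

The images stay in the bucket; only the query and its results pass through Snowflake.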

Fig. 1: Multimodal analysis architecture comparison

Multimodal Analysis: Seeing the full picture

Data scientists often use predictive models built on structured data. However, relying only on structured data for these models can overlook valuable signals present in unstructured sources like images, which influence user engagement. Instead of maintaining separate systems for structured data and image processing, data analysts and scientists can now work within the familiar Snowflake environment, using simple SQL to explore correlations between traditional metrics and visual intelligence. 

Here’s how analysts can use SQL to analyze visual elements in ad creatives and reveal hidden patterns in campaign performance. By extracting visual features, your technical teams can uncover relationships with social media engagement and user conversion rates. 

SELECT
    c.ad_id,
    c.conversion_rate,
    -- adimages is assumed to be the image column in image_table
    snowflake.cortex.complete('claude-3-5-sonnet', 'Classify the prominent color visible in this image. Respond with the name of the color and nothing else', i.adimages) AS prominent_color,
    snowflake.cortex.complete('claude-3-5-sonnet', 'Are there human faces identified in the image? Respond only with TRUE or FALSE and nothing else', i.adimages) AS human_face_flag
FROM campaign_table c
JOIN image_table i ON c.ad_id = i.ad_id;

This is where multimodal analysis unlocks its true potential: it combines traditional structured data with these rich visual insights to create a more comprehensive business understanding. Other examples include retailers who integrate product photo metadata with transaction histories to learn how visuals influence purchase decisions. Whether you're in retail, manufacturing, healthcare, or finance, you can use these capabilities to create more meaningful customer experiences that drive business growth.
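One way to turn the extracted visual features into a business signal is to aggregate conversion metrics by the model's answer. The sketch below reuses the table and column names from the ad creative query earlier in this section; the aggregation step and the color_label alias are illustrative assumptions, not part of the original example.

```sql
-- Sketch only: the grouping logic and the color_label alias are illustrative.
WITH ad_features AS (
    SELECT
        c.ad_id,
        c.conversion_rate,
        SNOWFLAKE.CORTEX.COMPLETE(
            'claude-3-5-sonnet',
            'Classify the prominent color visible in this image. Respond with the name of the color and nothing else',
            i.adimages
        ) AS prominent_color
    FROM campaign_table c
    JOIN image_table i ON c.ad_id = i.ad_id
)
SELECT
    LOWER(TRIM(prominent_color)) AS color_label,  -- normalize model output before grouping
    COUNT(*)                     AS ad_count,
    AVG(conversion_rate)         AS avg_conversion_rate
FROM ad_features
GROUP BY color_label
ORDER BY avg_conversion_rate DESC;
```

Normalizing the response with LOWER and TRIM helps keep answers such as "Blue" and "blue " in the same group, since model output is free text.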

Transforming industries with AI-powered multimodal analysis

The vast amount of unstructured data within your organization holds untapped business value. Cortex AI Functions unlock this value through simple SQL that combines structured and unstructured data analysis.

  • To optimize marketing campaigns: Marketing teams use Cortex AI to transform their campaign performance by connecting visual elements in promotional assets directly to conversion metrics. For example, a retail company using Snowflake can analyze thousands of ad images to discover that product images with certain color schemes tend to generate higher engagement among specific demographic segments. 

  • To streamline manual processes: Online retailers and food delivery platforms use Cortex AI to automate image descriptions for meals and groceries, reducing manual effort. In manufacturing, facilities can prevent costly defects by linking visual inspection data with production specifications. Healthcare organizations can improve patient outcomes by correlating imaging metadata with treatment protocols and demographics.

  • To enhance customer service: Customer service departments use speech-to-text models to transcribe calls and derive a deeper level of insight. For example, Cortex capabilities can be used to extract not just customer details and agent interactions, but also summaries, next steps, intent, and sentiment, delivering a fuller picture of the customer experience.

  • To analyze complex documents: Cortex AI enables financial companies to analyze quarterly reports, prospectuses and financial statements by extracting structured data from text, tables, and chart descriptions. For example, a global bank can accelerate its loan application process by extracting and validating key information from tax returns, bank statements, and employment verification documents, all within a secure and trusted environment.  


Delivering high-quality results

Quality is paramount when business decisions depend on analytics. Cortex AI delivers exceptional quality across a wide range of unstructured data processing tasks through models and specialized functions tailored for different tasks. 

Industry-leading Vision Models: Cortex AI provides instant, secure access to industry-leading vision models, allowing you to select the model that best matches your specific business requirements. Using the Cortex AI COMPLETE function, you can choose from options like Anthropic’s Claude 3.5 Sonnet and Mistral AI’s Pixtral Large, as well as the upcoming Claude 3.7 Sonnet from Anthropic, Meta’s Llama 4 Scout, and OpenAI’s GPT-4.1, for comprehensive visual analysis.

Fig. 2: Multimodal queries using SQL

Claude 3.5 Sonnet excels at document understanding, scoring 90.3% on the DocVQA benchmark, which makes it a strong choice for extracting information from financial statements, legal contracts, and compliance documentation. Pixtral Large stands out for chart analysis (88.1% on ChartQA) and mathematical reasoning (69.4% on MathVista), which suits financial reports and manufacturing specification measurements. GPT-4.1 (coming soon) posts industry-leading results on image understanding benchmarks such as MMMU (74.8%), covering diverse business imagery needs like visual question answering over diagrams and maps. All these models operate directly within the Snowflake environment, with no complex external integrations required.
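Switching models is a matter of changing the first argument to COMPLETE. The following sketch asks Pixtral Large about a chart image; the stage and file name (report_charts_stage, q1_revenue_chart.png) are hypothetical, and the model identifier is assumed to be available in your region.

```sql
-- Hypothetical stage and file name; swap the model identifier to compare models.
SELECT SNOWFLAKE.CORTEX.COMPLETE(
    'pixtral-large',
    'Summarize the trend shown in this chart in two sentences.',
    TO_FILE('@report_charts_stage', 'q1_revenue_chart.png')
) AS chart_summary;
```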

Flexible audio transcription: Separate from Cortex AI's native multimodal capabilities, customers have the ability to bring any modality, including audio processing, to Snowflake using Snowpark Container Services. Snowpark Container Services offers managed infrastructure for containerized applications that empowers developers to deploy audio transcription models at scale. 

Using Snowpark Container Services, you have the flexibility to deploy and optimize models like OpenAI Whisper, NVIDIA Canary, or NVIDIA Parakeet on Snowflake based on your specific needs. Customers often select their preferred model based on Word Error Rate (WER), but also on individual model features like multilingual support, performance in challenging environments like call centers, or resource efficiency. Snowflake’s secure, efficient environment allows you to run the model of your choice, offering a strong combination of flexibility, power, and trust.

State-of-the-art entity sentiment analysis: Beyond audio processing, Cortex AI delivers sophisticated text analytics capabilities for deriving insights from diverse textual sources. Whether analyzing transcribed customer conversations, social media posts, product reviews, or other text data, our state-of-the-art entity sentiment analysis offers a nuanced understanding of expressed opinions. 

Snowflake's aspect-based sentiment analysis sets a new quality standard in the industry, providing superior sentiment classification compared to leading large language models based on the benchmark listed below. Specifically, Cortex AI Entity_Sentiment enables the extraction of nuanced insights from text by analyzing sentiment toward specific entities, rather than relying only on an overall Positive or Negative classification. Entity_Sentiment is up to 45% more cost efficient than prompting a large model like GPT-4o, while delivering substantially higher sentiment accuracy. It also handles complex sentiment expressions, including mixed and unknown sentiments, which helps when analyzing relative emotion in product reviews or call transcripts.
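A minimal sketch of entity-level sentiment in SQL might look like the following. The table, columns, and entity list (product_reviews, review_id, review_text, and the three entities) are hypothetical; the function is expected to return a structured result with one sentiment per entity.

```sql
-- Hypothetical table and entity list, shown for illustration only.
SELECT
    review_id,
    SNOWFLAKE.CORTEX.ENTITY_SENTIMENT(
        review_text,
        ['food quality', 'service', 'price']
    ) AS entity_sentiment
FROM product_reviews;
```

A review that praises the food but criticizes the service can then surface as two separate signals rather than a single averaged label.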

Fig. 3: Comparison of model accuracy on the task of aspect-based sentiment analysis (ABSA) across a combined evaluation set. The benchmark includes datasets from SemEval-2014 Task 4 (laptops and restaurants), MAMS, SENTFIN, and FABSA. The task involves identifying sentiment polarity toward specific aspects or entities mentioned in a sentence. We compare the performance of Mistral-large, Claude 3.5 Sonnet, GPT-4, and Snowflake’s model Cortex AI Entity Sentiment.

Advanced document processing with OCR: Beyond analyzing text from digital sources, organizations also need to extract valuable information locked in documents of various formats. Cortex AI's document processing capabilities convert unstructured documents into searchable, analyzable data. Reliable text extraction is foundational to the accuracy of your search systems and multimodal analytics.

Cortex AI’s PARSE_DOCUMENT OCR capabilities surpass popular commercial and open source solutions for enterprise documents, without introducing unnecessary complexity. Customers can easily integrate OCR results into their applications. In downstream tasks such as LLM-based question answering on financial documents, Snowflake's solution delivers better results, achieving an ANLS score of 0.974 compared to 0.969 for competing solutions.
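As a hedged sketch of how OCR output can be pulled into a query, the example below parses a single staged file and extracts the text. The stage and file name (docs_stage, annual_report.pdf) are hypothetical, and the result is assumed to be an object whose content field holds the extracted text.

```sql
-- Hypothetical stage and file name; 'OCR' mode extracts text without layout.
SELECT
    SNOWFLAKE.CORTEX.PARSE_DOCUMENT(
        '@docs_stage',
        'annual_report.pdf',
        {'mode': 'OCR'}
    ):content::STRING AS extracted_text;
```

The extracted text can then feed downstream functions, for example as the prompt context for a question-answering call.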

Fig. 4: The results of our real-world documents OCR benchmark tests, which were performed on diverse public documents for different file formats (e.g., PDF, DOCX, PPTX, TIFF) with manually annotated ground truth. The tests measure how accurately an OCR system extracts text.

Best-in-class machine translation: For all digital text and extracted text from documents, organizations often need to make information accessible across languages. Cortex AI Translate delivers consistent, high-quality translations across 14 languages. Unlike general-purpose LLMs, which may introduce commentary or decline translation requests, Cortex AI Translate is specifically optimized for translation tasks through a rigorous data preparation process and customized model training. The performance aligns with popular commercial systems and state-of-the-art LLMs like GPT-4o on industry benchmarks. At the same time, Cortex AI Translate is up to 51% more cost-effective than prompting a large model like GPT-4o, or up to 70% more cost-effective than using a popular commercial system. In addition, Cortex AI Translate effectively handles noisy text, code-mixing, and extended context with coherence.
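For illustration, translating a text column into English could be sketched as follows. The table and columns (product_reviews, review_id, review_text) are hypothetical, and the empty source-language argument is intended to let the function detect the language itself.

```sql
-- Hypothetical table; '' as the source language requests automatic detection.
SELECT
    review_id,
    SNOWFLAKE.CORTEX.TRANSLATE(review_text, '', 'en') AS review_text_en
FROM product_reviews;
```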

Fig. 5: chrF (character-level F-score) is a metric used to evaluate machine translation quality that operates at the character level rather than the word level. BLEU compares how closely machine translations match human references by counting matching word sequences.

The future of unified data analytics

Snowflake Cortex AI enhances how enterprises extract value from all their data. It natively integrates structured and unstructured analysis via Cortex AI COMPLETE Multimodal, all within Snowflake using SQL. This reduces complexity and accelerates time to insight. With native AI and trusted governance, enterprises gain a richer, broader understanding of all their data, and this unified approach leads to quicker, more impactful decisions. The Snowflake AI Research team simplifies AI value for enterprises with innovations like Cortex AI COMPLETE Multimodal and powerful task-specific functions optimized for industry-leading quality.

Explore Snowflake Cortex AI COMPLETE Multimodal today.

Forward Looking Statements
This article contains forward-looking statements, including statements about our future product offerings, which are not commitments to deliver any product offerings. Actual results and offerings may differ and are subject to known and unknown risks and uncertainties. See our latest 10-Q for more information.
