Enterprises today face a steep challenge: they want to leverage advanced AI models to stay competitive, but need to keep the high costs of cloud-based large language models (LLMs) under control and stay compliant with data privacy regulations.
So how can businesses explore cutting-edge AI without overextending budgets or exposing sensitive private data? At Cloudera, we’ve developed a solution that turns this challenge into an opportunity—using synthetic data generated from private data and knowledge distillation to build cost-efficient, accurate, and compliant AI systems.
In this article, we discuss how Cloudera’s Synthetic Data Generation Studio, part of Cloudera AI Studios, allows organizations to capitalize on AI innovation even when real-world data is scarce or sensitive.
Use case: Drawing from an internal use case, we’ll show how we significantly improved the performance and overall throughput for Cloudera’s customer support ticket pipeline through knowledge distillation using synthetic data generated from private data, while maintaining data privacy and regulatory compliance.
Key takeaways:
Data privacy as a competitive advantage: Synthetic data enables innovation without regulatory risk.
Cost-effective performance: Smaller, fine-tuned models outperform larger, resource-heavy alternatives.
Applicable to multiple use cases: The same approach can power use cases from fraud detection to personalized customer service.
Cloudera’s customer support team leverages AI models to analyze and summarize customer support tickets in real time. The system takes customer and Cloudera support agent comments as input, analyzes each comment, and extracts a set of analytics, such as sentiment and a summary. These analytics are essential for improving the customer experience at Cloudera.
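To make the pipeline's output concrete, here is a minimal sketch of what the per-comment analytics record might look like and how its schema could be validated. The field names and the sentiment labels are illustrative assumptions; the article only mentions sentiment and summarization as examples of the extracted analytics.

```python
from dataclasses import dataclass

# Hypothetical shape of the analytics extracted per support-ticket comment.
# Field names are assumptions; the article names sentiment and summarization.
@dataclass
class TicketAnalytics:
    ticket_id: str
    sentiment: str   # assumed label set: "positive" | "neutral" | "negative"
    summary: str

def parse_llm_output(ticket_id: str, raw: dict) -> TicketAnalytics:
    """Validate the model's JSON-style output against the expected schema."""
    sentiment = raw.get("sentiment", "neutral")
    if sentiment not in {"positive", "neutral", "negative"}:
        raise ValueError(f"unexpected sentiment: {sentiment}")
    return TicketAnalytics(ticket_id, sentiment, raw.get("summary", ""))
```

Validating model output against a fixed schema like this is also what the "adherence to the expected output" metric discussed later measures.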
Due to the sensitive nature of the customer data being processed in this pipeline, only models running in local environments can be used and no customer data can be shared with any external sources.
Initially, to analyze the comments, the team relied on a large local LLM (Goliath 120B), which met basic performance requirements but lagged in speed and generation quality: on average, each request took 12-15 seconds to process, while new requests arrived every 30 seconds. Adherence to the expected output format was 77.5%, and generation accuracy was lower than that of proprietary models, a bottleneck for scalability and LLM performance.
The challenges of using a large local LLM (Goliath-120B) were clear: slower response times, increased costs, lower generation accuracy than state-of-the-art, cloud-based models, and compliance risks.
Large organizations face similar trade-offs—balancing AI accuracy and speed against the risks of data exposure.
Cloudera’s breakthrough lies in a privacy-first approach to knowledge distillation.
Instead of training models on raw customer data, which had regulatory and exposure risks, we generated synthetic datasets using Cloudera Synthetic Data Studio. This new low-code tool in Cloudera AI mimicked real-world interactions—technical questions, troubleshooting scenarios, and more—without ever exposing private information.
Generating synthetic customer support interactions not only avoided regulatory and exposure risks, it also enabled the team to send the synthetic data to state-of-the-art, cloud-based LLMs, which extract insights such as customer sentiment far more accurately than large local LLMs. That made these frontier models an ideal teacher from which to distill knowledge.
Cloudera’s synthetic data solution eliminated the compliance and privacy risks and produced higher-quality training data than the existing large local LLMs could generate. This unlocked the option to distill knowledge from state-of-the-art models into small LLMs that solve the same problem as Goliath-120B at lower cost and higher accuracy.
Data generation: Using the Synthetic Data Studio data generation workflow, we crafted a prompt instructing Claude Sonnet to generate customer questions and answers. The prompt instructs the LLM to create customer support questions and answers, impose the tone, and detail the structure. In addition, we provide a list of topics that appear in real-world data (such as customer support for Cloudera AI or Cloudera Data Warehouse) and use seed topics to ensure both diverse and real-world customer support ticket generation.
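The seed-topic technique described above can be sketched as follows. This is not the Synthetic Data Studio API; the topic list, template wording, and function names are illustrative assumptions about how a topic-conditioned generation prompt is assembled before being sent to the teacher model (Claude Sonnet in the article).

```python
import random

# Illustrative seed topics, modeled on the examples the article mentions.
SEED_TOPICS = [
    "customer support for Cloudera AI",
    "customer support for Cloudera Data Warehouse",
    "troubleshooting a failed data pipeline",
]

# Hypothetical prompt template: imposes the tone and details the structure,
# as the article describes, and conditions generation on a seed topic.
PROMPT_TEMPLATE = """You are generating synthetic customer support data.
Topic: {topic}
Write one realistic customer question and a helpful agent answer.
Use a professional, courteous tone.
Return JSON with keys "question" and "answer"."""

def build_generation_prompt(topic=None, rng=None):
    """Pick a seed topic (for diversity across calls) and fill the template."""
    rng = rng or random.Random()
    topic = topic or rng.choice(SEED_TOPICS)
    return PROMPT_TEMPLATE.format(topic=topic)
```

Rotating through seed topics is what keeps the generated tickets both diverse and grounded in themes that actually occur in real-world support data.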
Fine-tuning: Using only the filtered data, the team split the data into train and development sets and distilled knowledge from the Claude Sonnet model into a Meta Llama 3.1-8B-Instruct model. The team ran multiple experiments to select the fine-tuning parameters that maximized the performance of the distilled LLM.
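The split step might look like the sketch below. The article does not state the split ratio or tooling, so the 90/10 fraction and the fixed seed are assumptions; a deterministic seed simply keeps the development set stable across the multiple fine-tuning experiments.

```python
import random

def train_dev_split(samples, dev_fraction=0.1, seed=42):
    """Deterministically shuffle filtered samples and carve off a dev set.

    dev_fraction and seed are illustrative defaults, not values from the
    article. Returns (train, dev) with no overlap between the two.
    """
    items = list(samples)
    random.Random(seed).shuffle(items)
    n_dev = max(1, int(len(items) * dev_fraction))
    return items[n_dev:], items[:n_dev]
```

Holding the development set fixed is what makes the hyperparameter experiments comparable: every fine-tuning run is scored against the same held-out samples.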
Evaluation: Using the Synthetic Data Studio evaluation workflow, the team crafted a prompt to instruct an LLM-as-a-judge on how to evaluate the quality of the generated data and filtered out low-quality samples.
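A minimal sketch of the filtering logic follows. The judge itself is an LLM call that the article does not detail, so `judge_fn` is a stand-in, and the 1-5 scale and threshold are assumptions; only samples rated at or above the threshold survive into fine-tuning.

```python
def filter_by_judge(samples, judge_fn, min_score=4, scale=5):
    """Keep only samples the LLM-as-a-judge rates at or above min_score.

    judge_fn(sample) -> int score on a 1..scale scale; it stands in for
    a real LLM-as-a-judge call. min_score and scale are assumed values.
    """
    kept = []
    for s in samples:
        score = judge_fn(s)
        if not 1 <= score <= scale:
            raise ValueError(f"judge score out of range: {score}")
        if score >= min_score:
            kept.append(s)
    return kept
```

Filtering before fine-tuning matters because the student model can only be as good as its training data: low-quality synthetic samples would otherwise be distilled straight into the smaller LLM.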
Using both human and automated LLM-as-a-judge evaluations, the team scored real-world customer support ticketing questions and answers. Cloudera’s team focused on answers on which the deployed and distilled LLMs differed and reported the win rate of each LLM. In addition, they measured speed improvements in terms of average running time, adherence to the expected output, and the cost to deploy the model.
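The win-rate metric reported below reduces to simple counting over pairwise judgments. This sketch assumes each judgment names a winner outright, consistent with the article's focus on answers where the two models' outputs differed.

```python
def win_rates(judgments):
    """Compute each model's win rate from pairwise judgments.

    judgments: iterable of "a" or "b", naming the winner of each
    comparison (a = distilled model, b = deployed model, say).
    Returns (win_rate_a, win_rate_b).
    """
    js = list(judgments)
    if not js:
        raise ValueError("no judgments provided")
    a_wins = js.count("a")
    b_wins = len(js) - a_wins
    return a_wins / len(js), b_wins / len(js)
```

Running the same tally over both the Phi-4 judgments and the human judgments is what yields the pair of win-rate figures (70%/30% and 63%/37%) reported in the results.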
Improved speed: Processing time dropped 95%.
Better output structure: Output adherence rose from 77.5% to 99.5%.
Higher LLM accuracy: When comparing the smaller distilled LLM (Llama 3.1 8B) against the deployed Goliath LLM (Goliath 120B), win rate was 70% vs. 30% when using Phi-4 as a judge and 63% vs. 37% when using human evaluators to compare the two models.
Improved cost and efficiency: The smaller distilled LLM reduced compute and memory needs while increasing real-time scalability and maintaining data privacy, and throughput improved 11x.
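As a rough sanity check on how the latency and throughput figures relate, the arithmetic below assumes a 13.5-second midpoint of the original 12-15 second range (my assumption, not a figure from the article). Note that a 95% latency reduction is a 20x per-request speedup, while throughput improved 11x; the two need not match, since throughput also depends on serving concurrency and overhead.

```python
# Numbers from the article, plus one assumption (the 13.5 s midpoint).
old_latency = 13.5                   # assumed midpoint of 12-15 s per request
reduction = 0.95                     # reported 95% drop in processing time

new_latency = old_latency * (1 - reduction)   # roughly 0.68 s per request
latency_speedup = old_latency / new_latency   # 20x faster per request
throughput_gain = 11                          # reported separately
```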
The results are clear: enterprises can achieve AI excellence without compromising data privacy. By synthesizing training data and distilling knowledge, businesses avoid trade-offs between innovation and compliance.
By developing a knowledge distillation approach, Cloudera achieved a 95% reduction in processing time, increased output structure adherence to 99.5%, and deployed a distilled Llama 3.1 8B model that beat the prior Goliath 120B model with a 70% win rate as judged by Phi-4 and a 63% win rate in human evaluations.
This method eliminated compliance risks by avoiding direct use of sensitive data and also unlocked 11x greater throughput, showing that smaller, fine-tuned models can surpass larger, resource-intensive alternatives in both speed and precision.
Try our AMP to explore how to use private synthetic data to distill knowledge from a large model to a smaller model for a customer support use case.