The company decreased its site’s bounce rate by 20 percent and increased its conversion rate when it implemented an enterprise data hub that enables clickstream analytics, semantic search, and machine learning.


Because art is subjective, it can be hard to determine what product recommendations to offer customers as they browse the company’s sites. Maintaining customer engagement, and, ultimately, increasing sales, depends on the company’s ability to understand customer interests and buying habits in order to serve up the right content at the right time.

To support this goal, the company is moving from siloed data management environments to an enterprise data hub (EDH) powered by Cloudera Enterprise and Apache Hadoop. The EDH will provide a flexible, scalable, and centralized environment that enables the organization to ingest, process, and analyze data from any source, and to build operational analytic and machine-learning applications that can be used across any of the company’s brands.

The company’s first use cases have demonstrated significant benefits. Improved clickstream analysis is helping staff increase conversion rates and reduce cart abandonment rates. A sophisticated retargeting email engine that leverages real-time customer activity is driving increased customer engagement in initial tests. And a more visual and relevant discovery experience has reduced the bounce rate on its websites by 20 percent.



Business Drivers

“We had a traditional data environment that was limited by the amount and types of data that we could process,” explained Parag Panchal, software architect. “We wanted a centralized system that could process both structured and unstructured data, including website logs, clickstream events, and images, and deliver analytic capabilities that could be consumed by all of our brands.”

Gaining insight into customer behavior had become challenging in the siloed infrastructure. While extensive data regarding customer preferences was being generated at every customer touchpoint, the company had no way to tap into this data. Each brand maintained its own data warehouse, making it difficult to share insights about customers. Additionally, the organization had scaled its existing data warehouses to capacity, and staff found that continuing on the same path would be cost-prohibitive and would make it impossible to implement the operational analytics the company wanted.

“We wanted to create a closed-loop system where we could collect, process, and analyze data from different sources and then serve the insights back into our marketing systems so they could become smarter,” explained Panchal. “To do so, we needed a lambda-architecture-based environment that could work with both batch and real-time data, and could process data with very low latency.”
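To make the lambda architecture mentioned above more concrete, here is a minimal Python sketch of its three layers: a batch view recomputed from the full history, a speed view updated per event, and a serving step that merges the two at query time. The event shape, the function names, and the product IDs are all hypothetical and are not taken from the company's implementation.

```python
from collections import Counter


def build_batch_view(historical_events):
    """Batch layer: recompute product view counts from the full event history."""
    return Counter(event["product_id"] for event in historical_events)


def update_speed_view(speed_view, event):
    """Speed layer: incrementally count an event that arrived after the last batch run."""
    speed_view[event["product_id"]] += 1
    return speed_view


def query_views(batch_view, speed_view, product_id):
    """Serving layer: merge the batch and real-time views at query time."""
    return batch_view[product_id] + speed_view[product_id]


# Hypothetical sample data: events already covered by the last batch run...
historical = [
    {"product_id": "poster-1"},
    {"product_id": "poster-1"},
    {"product_id": "print-7"},
]
# ...and events that arrived since then.
recent = [{"product_id": "poster-1"}, {"product_id": "canvas-3"}]

batch_view = build_batch_view(historical)
speed_view = Counter()
for event in recent:
    update_speed_view(speed_view, event)

total_poster_views = query_views(batch_view, speed_view, "poster-1")
```

The point of the split is that the batch view can be slow and exhaustive while the speed view stays small and fresh; queries see both without waiting for the next batch run.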


“Instead of having silos of multi-structured data, we looked at Hadoop as a way to create a centralized data platform to implement analytic capabilities,” said Panchal. “We selected Cloudera over other vendors, like Hortonworks, because we see Cloudera as the leader in this space. They have a proven track record, excellent support, and expertise with leading, open standards tools, including Cloudera Search, which integrates Apache Solr with CDH, and Apache Impala. Cloudera Manager also makes it easy to deploy and manage our enterprise data hub.”

Another critical requirement was to maintain security compliance for consumer data. Cloudera offers a comprehensive, compliance-ready security solution that does not compromise on performance and is easy to integrate with existing systems.

With Cloudera, the company gained an intelligent data platform that supports different types of workloads, including batch, analytic SQL, search, and stream processing. The environment also enables more advanced functionality, such as machine learning, in which the algorithms that drive customer experience will improve with each search customers perform and every item added to a user’s shopping cart. Over time, the company will use its EDH not only to gain a complete and interactive 360-degree view of its data to better serve its customers, but also to proactively create a personalized experience for every user.

Cloudera also enables interoperability between different teams, replacing silos of multi-structured data with a unified, multi-tenant platform that stores data in its original fidelity for as long as it is needed.

For example, the EDH supports a new clickstream analytics platform that collects, ingests, and aggregates information regarding user visits from Google Analytics. Business users can obtain a consolidated view of the clickstream data using SAP Business Objects in order to increase website conversion rates.

Following its initial deployment of CDH, Cloudera’s 100 percent open-source distribution including Hadoop, the company upgraded to Cloudera Enterprise to take advantage of Cloudera’s dedicated support team, management tools such as Cloudera Manager, and automatic data governance capabilities built in with Cloudera Navigator.

“Having access to the knowledgebase in Cloudera, both in terms of the portal and the support engineering team, enables us to solve cluster issues much faster,” said Panchal. “Issues that would have taken us days to solve on our own can be fixed in less than 30 minutes with Cloudera Support.”

Additionally, the ability to perform rolling restarts using Cloudera Manager has reduced maintenance windows. “With rolling restarts, maintenance and upgrades to the cluster are seamless and cause minimum disruption to our business processes,” said Panchal.

The new EDH supports nearly 50 terabytes (TB) of data, compared to only 2 TB in its legacy data management environment, and processes tens of millions of events every hour for aggregating user activity.

Impact: Conversion Rates Increase with Clickstream Analytics

Analytics that were once impossible to perform due to platform limitations have been enabled by the Cloudera enterprise data hub. Take clickstream analytics, for example.

Previously, business staff were limited to sample views of user interactions from Google Analytics, and reports were often not available on time each morning due to lengthy extract, transform, and load (ETL) processes. Now, with the ability to process, discover, model, and serve information on clickstream activities from Google Analytics at scale, staff have a comprehensive view of user activities and can more accurately understand customer behavior. And because the time to process data was cut by 30 percent, business users can access data every morning without delay.

“We now know the exact drop-off point of a user session, including which page and product they last viewed and which search term they last used,” said Panchal. “We can correlate that information to identify the reasons they may have dropped off. This is helping us improve conversion rates and reduce cart abandonment.”
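As an illustration of what finding a session's drop-off point can involve, the following Python sketch orders a session's events by timestamp and records the last page, product, and search term seen. The field names (`ts`, `page`, `product_id`, `search_term`) and the sample session are assumptions made for this example; the article does not describe the actual event schema.

```python
def find_drop_off(events):
    """Return the last page, product, and search term seen in one session.

    `events` is a list of dicts with a numeric 'ts' and optional
    'page', 'product_id', and 'search_term' keys (hypothetical schema).
    """
    ordered = sorted(events, key=lambda event: event["ts"])
    drop_off = {"last_page": None, "last_product": None, "last_search_term": None}
    for event in ordered:
        # Later events overwrite earlier ones, so the final values are the last seen.
        if event.get("page"):
            drop_off["last_page"] = event["page"]
        if event.get("product_id"):
            drop_off["last_product"] = event["product_id"]
        if event.get("search_term"):
            drop_off["last_search_term"] = event["search_term"]
    return drop_off


# Hypothetical session, deliberately out of order to show the sort matters.
session = [
    {"ts": 3, "page": "/product/print-7", "product_id": "print-7"},
    {"ts": 1, "page": "/search", "search_term": "flowers"},
    {"ts": 2, "page": "/search", "search_term": "poppies"},
    {"ts": 4, "page": "/cart"},
]

drop_off = find_drop_off(session)
```

For this sample session the result points at the cart page as the exit, with `print-7` as the last product viewed and "poppies" as the last search term, which is the kind of record that can then be correlated across sessions.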

The company also plans on operationalizing its novel findings to optimize its recommendation engine, and, ultimately, re-engage customers with smarter, more personalized offers.

“We want to use clickstream data to further strengthen our internal marketing platforms and provide a foundation for near-real-time, closed-loop predictive analytics,” added Panchal. “These are things that we couldn’t do with our legacy platforms.”

Impact: Sophisticated Retargeting Email Engine Leveraging Clickstream Data

Which artwork are customers browsing right now? What galleries are customers exploring? Marketers understand that knowing real-time trends like these and retargeting customers with the same or similar products can help nurture greater engagement, leading to new sales.

For Panchal, the Cloudera EDH makes this work possible, processing tens of millions of events hourly and aggregating product view and buying data for each customer. A smart retargeting email engine uses these aggregations to deliver emails to users who have abandoned their carts or their shopping sessions.
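The per-customer aggregation described above can be sketched in a few lines of Python. This example only covers the cart-abandonment case: it collects what each customer added to a cart and subtracts what they went on to buy. The event types (`add_to_cart`, `purchase`), field names, and IDs are hypothetical, and the real engine presumably works on far larger, continuously arriving data.

```python
from collections import defaultdict


def abandoned_carts(events):
    """Map each customer to the products they added to a cart but did not buy."""
    carted = defaultdict(set)
    purchased = defaultdict(set)
    for event in events:
        if event["type"] == "add_to_cart":
            carted[event["customer_id"]].add(event["product_id"])
        elif event["type"] == "purchase":
            purchased[event["customer_id"]].add(event["product_id"])

    candidates = {}
    for customer_id, products in carted.items():
        left_behind = products - purchased.get(customer_id, set())
        if left_behind:
            # Sorted so the output is stable from run to run.
            candidates[customer_id] = sorted(left_behind)
    return candidates


# Hypothetical events for two customers.
events = [
    {"customer_id": "c1", "type": "view", "product_id": "print-7"},
    {"customer_id": "c1", "type": "add_to_cart", "product_id": "print-7"},
    {"customer_id": "c1", "type": "add_to_cart", "product_id": "canvas-3"},
    {"customer_id": "c1", "type": "purchase", "product_id": "canvas-3"},
    {"customer_id": "c2", "type": "add_to_cart", "product_id": "poster-1"},
    {"customer_id": "c2", "type": "purchase", "product_id": "poster-1"},
]

retarget = abandoned_carts(events)
```

Here only customer `c1` qualifies for a retargeting email, for `print-7`; customer `c2` bought everything in the cart and is skipped.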

Currently, the engine is enabled only for the ART US brand, and the results are impressive: email conversion for the brand improved by 20.4 percent. The email open rate and click-through rate are three times higher for smart retargeting emails than for promotional emails with discount coupons, increasing the average order value (AOV) and the margin.

Revenue chart from triggered emails for ART US vs APC brands – ART US has smart retargeting emails enabled.

“The goal of this effort is to increase customer engagement by bringing them back, and, so far, with this almost real-time email notification we see a big re-engagement for users on the ART brand,” said Panchal. “We will leverage this learning for other brands and for real-time use cases that immediately share current trends to users before they abandon shopping. For now, we are bringing them back, but in future we will win them with best personalized buying experience.” 

Impact: Bounce Rate Decreases 20 Percent Using Semantic Search

Likewise, the EDH has enabled the company to build a more visual and relevant discovery experience for its customers. Conventional discovery experiences on ecommerce sites are typically taxonomy based, which makes them very hierarchical in nature. To create a better shopping experience for its customers, the company sought to offer visually engaging and semantically relevant search results.

For example, if customers search for “flowers,” they’re now presented with semantically related clusters by theme (e.g., poppies, wildflowers, roses), by artist (e.g., Georgia O’Keeffe, Claude Monet, Vincent van Gogh), and by art style (e.g., Fine Art, Decorative Art, 20th Century Art). These clusters, driven by Cloudera Search, help customers more quickly narrow their searches to find exactly what they’re looking for.
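The idea of presenting one result set as clusters by theme, artist, and style can be illustrated with a small Python sketch that groups matching items by each facet. This is not how Cloudera Search is configured in the article; a production system would more likely rely on the search engine's own faceting. The item IDs, the field names, and the pairing of artists with themes and styles below are invented purely for illustration.

```python
from collections import defaultdict


def cluster_results(results, facets=("theme", "artist", "style")):
    """Group result IDs under each value of each facet field."""
    clusters = {facet: defaultdict(list) for facet in facets}
    for item in results:
        for facet in facets:
            value = item.get(facet)
            if value:
                clusters[facet][value].append(item["id"])
    # Convert to plain dicts so the result is easy to print or serialize.
    return {facet: dict(groups) for facet, groups in clusters.items()}


# Hypothetical results for the query "flowers".
results = [
    {"id": "item-1", "theme": "poppies", "artist": "Claude Monet", "style": "Fine Art"},
    {"id": "item-2", "theme": "poppies", "artist": "Georgia O'Keeffe", "style": "20th Century Art"},
    {"id": "item-3", "theme": "roses", "artist": "Claude Monet", "style": "Fine Art"},
    {"id": "item-4", "theme": "wildflowers", "style": "Decorative Art"},
]

clusters = cluster_results(results)
```

Each cluster key becomes a refinement a shopper can click: choosing "poppies" under theme narrows the results to `item-1` and `item-2`, while an item with no artist on record, like `item-4`, simply does not appear in any artist cluster.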

“We had been experiencing a high bounce rate on our sites,” said Panchal. “By delivering more relevant search results through the semantic search clusters, we’ve seen an increase in customer engagement and reduction in the bounce rate by 20 percent for test users.”

Additionally, staff can introduce new products to customers and update information about existing products much more quickly.

“Before, our search indexes could only be updated once a day, so new products and product changes typically wouldn’t appear until the next day,” said Panchal. “Now we can perform updates every 15 minutes, so products can be updated throughout the day.”