Headquarters: Round Rock, TX, USA
- Modern Data Platform: Cloudera Enterprise
- Workloads: Analytic Database, Operational Database
- Components: Apache Flume, Apache HBase, Apache Hive, Apache Impala, Apache Kafka, Apache Pig, Apache Sentry, Apache Spark, Cloudera Manager, Cloudera Navigator, Cloudera Search
- BI & Analytics Tool: Tableau, Datameer
- ETL Tool: Informatica BDM, StreamSets
- Servers: Dell PowerEdge R730 and R730 XD
- Marketing, customer, supply chain, HR, and product quality insights
- Real-time IoT data processing
- Long-term queryable archive and storage
- Temperature sensors and machine data from operational systems
- Log files from factories, Microsoft Internet Information Services (IIS), Internet Authentication Service (IAS), systems integrations, and compatibility tests
- Dell.com weblogs
- Social media data, e.g., Facebook, Twitter
- Structured data from relational marts
- Application data
- PDFs and text files
- Data from parts suppliers, e.g., serial numbers, ship dates, yield data, parametric data
- Data from assembly operations, e.g., serial number scans, test logs, failures
- Customer dispatches, call logs, and “call home” logs
- Data from repair centers, e.g., parts failure and repair data
- Near-real-time response to product development, manufacturing, and supply chain issues
- Better product quality and a customer-360 view improve customer satisfaction
Big data scale
- ~2.5 PB raw
- Thousands of concurrent users
Dell builds a 360-degree customer view and improves supply chain operations, quality controls, customer satisfaction, and human resource (HR) insights using Cloudera's modern data platform.
By operationalizing advanced analytics, Dell is shifting its focus from asking “what happened?” to “what do we predict could happen?” Cloudera-powered interactive analytics drive improvements to the manufacturing and supply chain process, benefiting product quality and customer satisfaction:
- Maintenance scheduling maximizes ROI by incorporating risk models and predictive failure analytics that leverage sensor and machine data
- Parsing through multi-structured data sets allows Dell to better understand challenges and patterns that occur throughout product development, testing, and quality control cycles
- Their modern analytic database, built with Cloudera’s platform, facilitates interactive analytics on these data sets and allows Dell to respond to situations while they’re happening
The analytics environment is also making an impact on HR: it runs so much faster than the legacy software that the astonished staff initially thought the system was malfunctioning.
And the customer-360 view Dell has built serves a variety of business functions, including:
- Sales: facilitating customized proposals and recommendations
- Customer Service: enabling more productive conversations during inbound calls, for example
Dell’s big data platform accommodates and makes better use of voluminous, high-velocity Internet of Things (IoT) data sets, such as machine data and temperature sensor readings, while expediting data processing. The company now has an archiving strategy that makes historical data accessible to business users while avoiding expensive upgrades.
“With [Apache] Impala at the core, it helps us to quickly get information from all logs and pass it along to the different teams immediately, as it happens, at any given time,” said Deepak Gattala, enterprise business intelligence at Dell. Cloudera’s unified platform lets Dell use all types of data, including data that past systems could not access, and run interactive SQL queries much faster than was previously possible, yielding a multitude of new insights.
Anticipating eight- to tenfold growth in data volumes over the next five years, and seeing the opportunity to use that data to benefit manufacturing and supply chain processes, Dell embarked on a big data journey.
Business users were demanding real-time access to all their data, but IT couldn’t meet SLAs for data availability. Real-time analytics could, for example, help Dell reduce operational costs, but the company needed a solution to store and analyze “at rest” machine data. Faced with the prospect of expensive upgrades, Dell didn’t have the capacity to store or process large volumes of multi-structured data. Data transformations were taking hours to complete, and much of the data Dell did store was kept, unused, in silos. There was no strategy for EDW archival, yet regulatory compliance mandates access to older data. The company had relied on relational databases, which worked well for structured data but struggled to accommodate today’s multi-structured data types.
The Dell team decided its big data platform must:
- Store high-velocity, high-volume, multi-structured data sets over long periods of time
- Process streaming data in near real time (versus in batch), enabling on-the-spot issue detection and resolution
- Converge data silos into one unified environment
Cloudera Enterprise provides Dell with a unified platform to support a variety of workloads, including an analytic database, operational database, and queryable archive. The team streamlined its Hadoop implementation with the Dell | Cloudera Apache Hadoop Solution, and relies on Cloudera Support to ensure challenges are resolved quickly.
Cloudera’s analytic database drives insights and supports interactive analytics across a number of areas, including:
- Marketing: delivering tailored communications through better customer understanding from Dell.com weblogs and social media data
- Customer Relationship Management: improving the user experience by combining data from legacy data marts, call logs, and applications into a 360-degree view
- Supply chain operations: improving efficiencies by giving Dell teams and suppliers access to, and insight into, data from parts suppliers, assembly operations, customers, and repair centers
- Hardware prototype validations and product quality: minimizing failures and errors with a holistic view into Microsoft Internet Information Services (IIS), Internet Authentication Service (IAS), systems integration logs, and compatibility test data
- Human resources: expediting performance by orders of magnitude over the legacy system
Dell also leverages Cloudera’s operational database to enable real-time processing for these analytic applications. The operational database ingests real-time machine and log data using Apache Flume and Apache Kafka, and stitches it together with customer information using Apache HBase.
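The ingest-and-stitch pattern described above can be sketched in a few lines. This is a minimal, in-memory illustration only, not Dell’s implementation: a plain list stands in for the event stream that Flume or Kafka would deliver, a dictionary stands in for an HBase table looked up by row key, and the names `MachineEvent`, `EnrichedEvent`, `enrich`, and `device_id` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional


@dataclass(frozen=True)
class MachineEvent:
    """One machine or log reading, as it might arrive from a stream."""
    device_id: str        # hypothetical join key
    temperature_c: float


@dataclass(frozen=True)
class EnrichedEvent:
    """A machine reading with customer information attached."""
    device_id: str
    temperature_c: float
    customer: Optional[str]  # None when no customer record matches


def enrich(events: Iterable[MachineEvent],
           customers_by_device: Dict[str, str]) -> Iterator[EnrichedEvent]:
    """Attach customer information to each event via a key lookup.

    The dict is a stand-in for a key-value store queried by row key.
    """
    for event in events:
        yield EnrichedEvent(
            device_id=event.device_id,
            temperature_c=event.temperature_c,
            customer=customers_by_device.get(event.device_id),
        )


# Made-up sample data standing in for the stream and the customer table.
customers = {"DEV-001": "Example Corp"}
stream = [MachineEvent("DEV-001", 71.5), MachineEvent("DEV-999", 64.0)]
enriched = list(enrich(stream, customers))
```

Events are enriched one at a time as they arrive rather than in a batch, which mirrors the near-real-time goal stated earlier; an event with no matching customer record is kept with `customer` set to `None` rather than dropped.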
Data exploration is facilitated by Cloudera Search, making customer comments from call logs searchable as text, for example. The platform accommodates approximately 1,000 concurrent searches from Dell employees at any given time.
In addition to valuing the mixed workloads supported by Cloudera’s platform, Dell selected Cloudera for its unified role-based access controls across the organization, including marketing, manufacturing and supply chain, and human resources.
Cloudera Navigator provides the data lineage capabilities to help Dell track and understand how data is used in Apache Hadoop, while Apache Sentry enables role-based access for Dell’s thousands of users. Dell also plans to leverage Cloudera’s encryption offerings in the near term to bring PCI-compliant data sets into the modern data platform and maintain them there.
Moreover, Cloudera Manager provides simple operations and management with a single view for monitoring and troubleshooting all workloads, services, and partner technologies. In addition, Dell can trust that even its most mission-critical workloads will stay up and running, with zero-downtime maintenance and upgrades.