Cloudera DataFlow (CDF) - NiFi, MiNiFi
Over 50% data volume reduction
Onboarding engagement timelines reduced from 6-9 months down to one month through automation
Time to find data shortened from 6-7 weeks to one week
Ingesting 75,000 events per second into the data lake
This major airline is one of the largest in the world and is based in North America. The company uses mission-critical data to drive decision-making and optimize its business. This includes sensor data from aircraft, clickstream data, and operational information, such as scanned boarding passes, to influence actions throughout the organization. Having the right data available in a timely manner offers unique opportunities for the airline.
By using sensor data pulled from aircraft upon arrival, maintenance workers in the hangar benefit from predictive maintenance. This prevents unplanned part failures, which helps prevent flight delays, and also enables an optimized supply chain for parts and labor.
With clickstream data, the airline can analyze how customers interact with the website while they are looking for flights. Based on the customer's decision-making process, promotions can be offered at the right time, helping to fill flights to capacity and maximize profitability.
The major airline needed to solve its data movement challenge by connecting all data sources to its platforms so it could run analytics at scale. Once this foundation was built, the organization would be able to understand customer behavior while providing better reporting and aggregation for agent technology teams. It would also remove silos for cybersecurity data, which supported the push for PCI compliance and customer data privacy.
“The bottom line was how do we aggregate all the data necessary? You need an easy to consume mechanism for linking disparate systems together and getting data into the platform. We had to focus on our ‘plumbing,’ how we were going to get all this data flowing in an easy way for application teams as well as how do we get a platform that is scalable to handle that flow and volume. A lot of people want to jump straight into advanced analytics without taking care of the foundation first,” said a manager of cybersecurity engineering.
Cloudera DataFlow with Apache NiFi and MiNiFi became the foundation for ingesting all data across the organization, including networking and server logs from all devices, which were then routed to cybersecurity platforms and other applications. The company has multiple data lakes that serve different purposes and they all feed off the same data ‘plumbing.’ By solving the challenge of data movement, the resources and focus could be turned to analytics.
“We had deployed NiFi clusters in all of our data centers, to be used as our log transport utility service at an enterprise level for both on-premises and in the cloud. We touted it as an enterprise pattern for moving data in near real-time at scale. We reached a critical mass where we would’ve needed to hire more talent to maintain the system internally. Partnering with Cloudera made much more sense for us - for expertise, guidance, and support. CDF was the launching pad that allowed us to focus on the next step,” added the cybersecurity engineering manager.
The operations team wanted to automate the data pipeline onboarding process so that new teams could serve themselves and get value from the wealth of available data much more quickly. The idea was to let other teams set up an end-to-end data movement pipeline as quickly as possible. Using NiFi, the operations team has been able to maintain control while making the onboarding process much simpler, automated, and consistent. Previously, an onboarding engagement took 6-9 months to set up an end-to-end flow; now it can be completed in less than one month.
“Our previous solution was difficult to use and it wasn’t reliable, so if a job stopped running we wouldn’t know about it until 24 hours later. With NiFi we actually saw the data moving and we could validate within a few clicks - it eliminated all of the mess,” said the cybersecurity engineering manager.
By embarking on this digital transformation, the cybersecurity team gained a better understanding of the logs and of how customers and internal users are using applications.
The operations team is focused on preventing equipment downtime and revenue loss, ensuring on-time departures and arrivals, avoiding delays for customers, and helping to keep airport employees productive. Implementing CDF and NiFi also enabled this major airline to centralize data movement, and it now ingests 75,000 events per second into the data lake.
“The simplicity to the application teams has been a huge benefit. Now they can have a clear understanding of how to configure logs and where to send them. Having a unified answer makes it a much better message and means we have better visibility and coverage. Leveraging NiFi to parse data and send what we need in a highly compressed manner results in reduced network utilization, and reduced public cloud costs,” said an executive from this major airline.
By using NiFi to parse data and send only what is needed in a highly compressed manner, the major airline has reduced data volume by more than 50%, because it no longer pays to transmit the same data streams twice. The organization has also seen a 30-40% reduction in network data charges for public cloud.
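The general idea of parsing events, keeping only the needed fields, and compressing before transmission can be sketched in plain Python. This is an illustration only, not NiFi and not the airline's actual flow: the event fields, the `KEEP_FIELDS` set, and the helper names `trim_event`, `pack_events`, and `unpack_events` are all hypothetical, and the sample data is made up.

```python
import gzip
import json
from typing import Dict, List

# Hypothetical set of fields a downstream platform actually needs.
KEEP_FIELDS = {"timestamp", "host", "severity", "message"}


def trim_event(event: Dict) -> Dict:
    """Keep only the fields listed in KEEP_FIELDS."""
    return {k: v for k, v in event.items() if k in KEEP_FIELDS}


def pack_events(events: List[Dict]) -> bytes:
    """Trim each event, serialize as newline-delimited JSON, then gzip."""
    lines = "\n".join(
        json.dumps(trim_event(e), separators=(",", ":")) for e in events
    )
    return gzip.compress(lines.encode("utf-8"))


def unpack_events(payload: bytes) -> List[Dict]:
    """Reverse pack_events: gunzip and parse one JSON object per line."""
    text = gzip.decompress(payload).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]


# Made-up sample events with extra fields the receiver does not need.
raw_events = [
    {
        "timestamp": f"2020-01-01T00:00:{i:02d}Z",
        "host": "app-01",
        "severity": "INFO",
        "message": "request handled",
        "debug_blob": "x" * 200,
        "thread": f"worker-{i}",
    }
    for i in range(60)
]

raw_size = len("\n".join(json.dumps(e) for e in raw_events).encode("utf-8"))
packed = pack_events(raw_events)
print(f"raw bytes: {raw_size}, packed bytes: {len(packed)}")
```

The sizes printed depend entirely on the sample data and say nothing about the airline's figures; the point is only that dropping unneeded fields and compressing the remainder both shrink what crosses the network.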
These savings are enabling the major airline to expand coverage at a lower cost, which allows it to go beyond its original objectives into areas that improve overall situational awareness of application and infrastructure availability and performance.