The Taneja Group recently administered an open survey to 7,000 technology professionals to understand their knowledge and use of Apache Spark. Spark has quickly grown into one of the major projects in the big data ecosystem and shows no signs of slowing down. It has become the de facto processing engine for Hadoop and a general-purpose engine for modern analytic use cases.
Cloudera focuses on driving enterprise use of Spark, from data processing to data science and machine learning, through expert training, professional services, and proactive support. We partner with our customers to understand the areas of Spark development that matter most and the pain points that Spark users encounter. This survey validates the use cases, architecture choices, and future work that Spark users care about most.