Hadoop Professional Services
Cloudera can help you install, configure, optimize, tune, and run Hadoop for large-scale data processing and analysis.
We support Hadoop whether you run our distribution on servers in your own data center, or on hosted infrastructure services such as Amazon EC2, Rackspace, SoftLayer, or VMware’s vCloud.
Professional Services
Cloudera’s services include a team of Solutions Architects with significant experience helping customers put Hadoop into production across a variety of industries, solving a wide range of business and technical problems. We provide guidance and hands-on help on a number of topics, including:
Best practices for setting up and configuring a cluster suitable to run Cloudera’s Distribution for Hadoop:
- Choice of hardware, operating system, and related systems software
- Configuration of storage in the cluster, including ways to integrate with existing storage repositories
- Balancing compute power with storage capacity on nodes in the cluster
A comprehensive design review of your current system and your plans for Hadoop:
- Discovery and analysis sessions aimed at identifying the various data types and sources streaming into your cluster
- Design recommendations for a data-processing pipeline that addresses your business needs
Operational guidance for a cluster running Hadoop, including:
- Best practices for loading data into the cluster and for ensuring locality of data to compute nodes
- Identifying, diagnosing, and fixing errors in Hadoop and in the site-specific analyses you run
- Tools and techniques for monitoring an active Hadoop cluster
- Advice on integrating MapReduce job submission into an existing data-processing pipeline, so that Hadoop can read data from, and write data to, the analytic tools and databases you already use
- Guidance on the use of additional analytic or development tools, such as Hive and Pig, that offer high-level interfaces for data evaluation and visualization
Hands-on help in developing Hadoop applications that deliver the data processing and analysis you need.
Guidance on connecting Hadoop to your existing IT infrastructure. We can help with moving data between Hadoop and data warehouses; collecting data from file systems, document repositories, logging infrastructure, and other sources; and setting up existing visualization and analytic tools to work with Hadoop.
Performance audits of your Hadoop cluster, with tuning recommendations for speed, throughput, and response times.
