Blog Posts

Title, Author(s) | Abstract / Description | File Format

September 8

Excerpt: Flume community update: September 2010 by jon September 08, 2010... more

html View Page

September 7

Excerpt: Purdue University’s Saptarshi Guha Interviewed Regarding Hadoop, R and Hadoop World... more

html View Page

September 6

Excerpt: A Look Back at August Posts by Jon Zuanich September 06, 2010... more

html View Page

September 6

Excerpt: A Summer Internship with Cloudera by Jon Zuanich September 06, 2010... more

html View Page

Tracing with Avro

Jon Zuanich

September 3

Excerpt: Tracing with Avro by Jon Zuanich September 03, 2010 no comment... more

html View Page

September 2

Excerpt: Infochimp’s President, Philip Kromer, Interviewed Regarding Hadoop and Hadoop World... more

html View Page

September 1

Excerpt: Register for Hadoop Training in New York and Get into Hadoop World for Free! by Jo... more

html View Page

August 30

Excerpt: Hadoop World 2010: Speaker Highlights by Jon Zuanich August 30, 2010... more

html View Page

August 26

Excerpt: What’s New in Apache Hadoop 0.21 by Tom White August 26, 2010... more

html View Page

August 24

Excerpt: Learn about fraud and how to prevent it with Hadoop... more

html View Page

August 24

Excerpt: Hadoop Administrator Training Comes to London by Jon Zuanich August 24,... more

html View Page

August 23

Excerpt: Improving Hotel Search: Hadoop @ Orbitz Worldwide by John Kreisa August... more

html View Page

August 17

Excerpt: Hadoop/HBase Capacity Planning by Alex Kozlov August 17, 2010... more

html View Page

August 19

Excerpt: Hadoop Training surrounding Hadoop World: NYC.... more

html View Page

August 12

Excerpt: Avoiding Common Hadoop Administration Issues by Jeff Bean August 12, 201... more

html View Page

CDH3b2 Release Recap

Jeff Hammerbacher

August 11

Excerpt: CDH3b2 Release Recap by Jeff Hammerbacher August 11, 2010 no comments... more

html View Page

August 10

Excerpt: Cloudera’s Henry Robinson to speak at Hadoop Day in Seattle by Huw Edwards... more

html View Page

August 9

Excerpt: Hadoop World: early-bird rate ends on August 11 by Huw Edwards August 09, 2010... more

html View Page

August 3

Excerpt: Flume community update – the first 30 days! by phunt August 03, 2010 no c... more

html View Page

Migrating to CDH

Eric Sammer

August 2

Excerpt: Migrating to CDH by Eric Sammer August 02, 2010 no comments... more

html View Page

July 28

Excerpt: How to Get a Job at Cloudera by Mike Olson July 28, 2010 no comments... more

html View Page

July 28

Excerpt: Notes From the Hackathon at Cloudera by Jeff Bean July 28, 2010 no comments... more

html View Page

July 28

Excerpt: Upcoming webinar: 10 Common Hadoop-able Problems by Huw Edwards July 28, 2010 n... more

html View Page

July 28

Excerpt: Announcing Two New Training Classes from Cloudera: Introduction to HBase and Analyzing Data with Hive and Pig... more

html View Page

June 29

Excerpt: CDH3 and Cloudera Enterprise by Mike Olson June 29, 2010 1 comment... more

html View Page

July 20

Excerpt: Developing Applications for HUE by Aaron Newton July 20, 2010 1 comment... more

html View Page

July 22

Excerpt: What’s New in CDH3b2: Hive by Carl Steinbach July 22, 2010 no comments... more

html View Page

July 19

Excerpt: What’s New in CDH3b2: HUE by bc July 19, 2010 no comments... more

html View Page

July 19

Excerpt: Rackspace’s OpenStack shows the way for public cloud vendors by Ed Albanese July 19,... more

html View Page

July 16

Excerpt: What’s New in CDH3b2: Sqoop by Aaron Kimball July 16, 2010 no comments... more

html View Page

Hacking with Cloudera on CDH

Alex Loddengaard

July 15

Excerpt: Hacking with Cloudera on CDH by Alex Loddengaard July 15, 2010 no comments... more

html View Page

What's New in CDH3b2: Oozie

Arvind Prabhakar

July 15

Excerpt: What’s New in CDH3b2: Oozie by Arvind Prabhakar July 15, 2010 no comments... more

html View Page

What's New in CDH3b2: Pig

Carl Steinbach

July 14

Excerpt: What’s New in CDH3b2: Pig by Carl Steinbach July 14, 2010 no comments... more

html View Page

July 12

Excerpt: CDH3 beta 2 is the first to incorporate Apache ZooKeeper. ZooKeeper is a highly reliable and available coordin... more

html View Page

July 8

Excerpt: What’s New in CDH3b2: Core Hadoop by Eli Collins July 08, 2010 no comment... more

html View Page

July 9

Excerpt: What’s New in CDH3b2: HBase by Todd Lipcon July 09, 2010 no comments... more

html View Page

July 13

Excerpt: What’s New in CDH3b2: Flume by Henry Robinson July 13, 2010 no comments... more

html View Page

October 19

Excerpt: Cloudera Desktop and MooTools by Aaron Newton October 19, 2009 7 comments... more

html View Page

May 21

Excerpt: CDH2 Update 1 Now Available by Eli Collins May 21, 2010 no comments... more

html View Page

July 7

Excerpt: More on Cloudera’s Distribution for Hadoop 3 by Charles Zedlewski July 07, 2010... more

html View Page

April 5

Excerpt: Scaling Social Science with Hadoop by Ed Albanese April 05, 2010 12 comments... more

html View Page

June 23

Excerpt: Are your systems struggling to absorb ever-increasing amounts of data being generated daily? Are you mired in... more

html View Page

June 22

Excerpt: Cloudera is once again hosting Hadoop World which will take place in New York City on October 12th. L... more

html View Page

June 18

Excerpt: Will Cloudera be at OSCON this year? Of course, it’s only the premier event for OS technologies on the marke... more

html View Page

June 11

Excerpt: Integrating Hive and HBase by carl June 11, 2010 no comments... more

html View Page

One word more...

Mike Olson

June 10

Excerpt: One word more… by Mike Olson June 10, 2010 no comments... more

html View Page

A transition

Christophe Bisciglia

June 10

Excerpt: A transition by Christophe Bisciglia June 10, 2010 no comments... more

html View Page

June 4

Excerpt: A report from the recent UK HUG from Klaas Bosteels.... more

html View Page

June 3

Excerpt: Considerations for Hadoop and BI (part 2 of 2) by Jeff Bean June 03, 2010 no co... more

html View Page

June 1

Excerpt: The second Apache Hadoop HDFS and MapReduce contributors meeting was held last Friday, May 28 at ClouderaR... more

html View Page

May 25

Excerpt: Here at Cloudera we have deep knowledge and experience working with Hadoop and related technologies to solve... more

html View Page

May 21

Excerpt: Considerations for Hadoop and BI (part 1 of 2) by Jeff Bean May 21, 2010 no com... more

html View Page

CDH2 is released

Chad Metcalf

March 24

Excerpt: CDH2 is released by Chad Metcalf March 24, 2010 3 comments... more

html View Page

Job Scheduling in Hadoop

Amr Awadallah

November 23

Excerpt: Job Scheduling in Hadoop by Amr Awadallah November 23, 2008 3 comments... more

html View Page

April 3

Excerpt: Upcoming Functionality in “Fair Scheduler 2.0” by Amr Awadallah April 03, 2... more

html View Page

June 17

Excerpt: Analyzing Apache logs with Pig by Amr Awadallah June 17, 2009 5 comments... more

html View Page

September 28

Excerpt: Grouping Related Trends with Hadoop and Hive by Amr Awadallah September 28, 2009... more

html View Page

July 31

Excerpt: Tracking Trends with Hadoop and Hive on EC2 by Amr Awadallah July 31, 2009 8 co... more

html View Page

May 10

Excerpt: What to Do with Extra Space? by bc May 10, 2010 no comments... more

html View Page

May 7

Excerpt: Highlights from the First Hadoop Contributors Meeting by Eli Collins May 07, 2010... more

html View Page

September 15

Excerpt: Apache Hadoop Log Files: Where to find them in CDH, and what info they contain by Alex Loddenga... more

html View Page

April 30

Excerpt: Around the globe, more and more companies are turning to Hadoop to tackle data processing problems that don... more

html View Page

April 26

Excerpt: CAP Confusion: Problems with ‘partition tolerance’ by Henry Robinson April... more

html View Page

April 21

Excerpt: Get Hadoop Training from Cloudera at the Hadoop Summit by John Kreisa April 21, 2010... more

html View Page

April 13

Excerpt: Cloudera Hadoop Training Spreads Worldwide by John Kreisa April 13, 2010 no com... more

html View Page

Cloudera Has Moved!

John Kreisa

April 12

Excerpt: Cloudera Has Moved! by John Kreisa April 12, 2010 1 comment... more

html View Page

April 1

Excerpt: Pushing the Limits of Distributed Processing by omer April 01, 2010 no comments... more

html View Page

March 30

Excerpt: Cloudera’s Support Team Shares Some Basic Hardware Recommendations by Alex Loddengaard... more

html View Page

March 22

Excerpt: How Raytheon BBN Technologies Researchers are Using Hadoop to Build a Scalable, Distributed Triple Store... more

html View Page

March 24

Excerpt: CDH3 Beta 1 Now Available by Eli Collins March 24, 2010 no comments... more

html View Page

HDFS Reliability

Tom White

January 14

Excerpt: HDFS Reliability by Tom White January 14, 2009 4 comments... more

html View Page

March 18

Excerpt: HBase User Group #9: HBase and HDFS by Todd Lipcon March 18, 2010 no comments... more

html View Page

March 16

Excerpt: Natural Language Processing with Hadoop and Python by Ed Albanese March 16, 2010... more

html View Page

March 10

Excerpt: Richard Hutton, CTO of nugg.ad, authored the following post about how and why his company uses Hadoop. n... more

html View Page

March 3

Excerpt: Trip Report: Utah Java User’s Group by Philip Zeyliger March 03, 2010 no... more

html View Page

November 2

Excerpt: Avro is a recent addition to Apache's Hadoop family of projects. Avro defines a data format designed to supp... more

html View Page

Avro 1.3.0

Matt Massie

March 1

Excerpt: Avro 1.3.0 by Matt Massie March 01, 2010 no comments Avro... more

html View Page

February 22

Excerpt: Cloudera’s Hadoop Training Programs Expand Internationally by Christophe Bisciglia... more

html View Page

February 18

Excerpt: CDH2: “Testing” Heading Towards “Stable” by Chad Metcalf Februa... more

html View Page

October 15

Excerpt: Analyzing Human Genomes with Hadoop by Christophe Bisciglia October 15, 2009 4... more

html View Page

October 29

Excerpt: Hadoop World: NYC – Let the Videos Roll by Christophe Bisciglia October 29, 2009... more

html View Page

October 21

Excerpt: Around the world, individuals contribute to Hadoop and build community around the technology. This kind of col... more

html View Page

November 9

Excerpt: Today’s Hadoop World video comes from Ed Capriolo, and goes into details about how to effectively monito... more

html View Page

November 19

Excerpt: Hadoop World: Protein Alignment from Paul Brown by Alex Loddengaard November 19, 2009... more

html View Page

November 11

Excerpt: Hadoop World: Rethinking the Data Warehouse with Hadoop and Hive from Ashish Thusoo by Christop... more

html View Page

November 17

Excerpt: Hadoop at Twitter (part 1): Splittable LZO Compression by Matt Massie November 17, 2009... more

html View Page

November 20

Excerpt: Hadoop World: Hadoop + Clojure from Stuart Sierra and Tim Dysinger by Alex Loddengaard... more

html View Page

November 25

Excerpt: Hadoop World: Practical HBase from Jonathan Gray and Ryan Rawson by Alex Loddengaard No... more

html View Page

November 23

Excerpt: Hadoop World: Hadoop + Vertica from Omer Trajman by Alex Loddengaard November 23, 2009... more

html View Page

December 2

Excerpt: Hadoop World: Hadoop for Bioinformatics by Christophe Bisciglia December 02, 2009... more

html View Page

December 8

Excerpt: Hadoop World: Security and API Compatibility by Christophe Bisciglia December 08, 2009... more

html View Page

December 10

Excerpt: Hadoop World: Sqoop – Database Import for Hadoop by Christophe Bisciglia December... more

html View Page

December 15

Excerpt: Observers: Making ZooKeeper Scale Even Further by Henry Robinson December 15, 2009... more

html View Page

December 17

Excerpt: 7 Tips for Improving MapReduce Performance by Todd Lipcon December 17, 2009 no... more

html View Page

December 22

Excerpt: Hadoop World: Hadoop Applications at Yahoo! by Christophe Bisciglia December 22, 2009... more

html View Page

December 23

Excerpt: Hadoop World: Making Hadoop Easy on Amazon Web Services by Christophe Bisciglia Decembe... more

html View Page

January 11

Excerpt: Hadoop World: Building Data Intensive Apps with Hadoop and EC2 by ed January 11, 2010... more

html View Page

January 19

Excerpt: Cloudera speaks VMware vCloud API, too. by Mike Olson January 19, 2010 no comme... more

html View Page

5 Common Questions About Hadoop

Christophe Bisciglia

May 14

Excerpt: 5 Common Questions About Hadoop by Christophe Bisciglia May 14, 2009 11 comment... more

html View Page

November 18

Excerpt: Introducing Hadoop Development Status by Alex Loddengaard November 18, 2008 no... more

html View Page

August 10

Excerpt: Back in October, I promised to keep marketing and sales out of this blog. We wanted to concentrate on techni... more

html View Page

September 30

Excerpt: At the beginning of September, we announced the first release of CDH2, our current testing repository. Pac... more

html View Page

Introducing Cloudera Desktop

Jeff Hammerbacher

October 1

Excerpt: Today at Hadoop World NYC, we’re announcing the availability of Cloudera Desktop, a unified and ex... more

html View Page

September 29

Excerpt: One of the more common requests we receive from the community is to package HBase with Cloudera’s Distri... more

html View Page

September 10

Excerpt: In March of this year, we released our distribution for Hadoop. Our initial focus was on stability and makin... more

html View Page

September 9

Excerpt: It’s been a crazy few weeks here at Cloudera, and while there is no sign of things letting up before Ha... more

html View Page

August 14

Excerpt: Is it 50030 or 50300 for that JobTracker UI? I can never remember! Hadoop’s daemons expose a handful o... more

html View Page

Hadoop World: NYC 2009

Christophe Bisciglia

August 19

Excerpt: To say we were surprised by the quality and quantity of submissions we received for Hadoop World: NYC 2009... more

html View Page

July 29

Excerpt: As Hadoop adoption increases among organizations, companies, and individuals, and as it makes its way into pro... more

html View Page

July 27

Excerpt: Cloudera’s Training VM is one of the most popular resources on our website. It was created with VMware W... more

html View Page

Hadoop HA Configuration

Christophe Bisciglia

July 22

Excerpt: One of the things we get a lot of questions about is how to make Hadoop highly available. There is still a lot... more

html View Page

July 17

Excerpt: There is some confusion about the state of the file append operation in HDFS. It was in, now it’s out. W... more

html View Page

The Project Split

Aaron Kimball

July 17

Excerpt: Last Wednesday, we hosted a Hadoop meetup, and I gave a short talk about the new project split. How does the s... more

html View Page

Hadoop Graphing with Cacti

Christophe Bisciglia

July 7

Excerpt: An important part of making sure Hadoop works well for all users is developing and maintaining strong relation... more

html View Page

July 3

Excerpt: The distributed nature of MapReduce programs makes debugging a challenge. Attaching a debugger to a remote pro... more

html View Page

June 30

Excerpt: Hadoop moves fast. Users often find that they need to upgrade after just a few months. Upgrading can be a daun... more

html View Page

June 24

Excerpt: Yesterday, Chris Goffinet from Digg made a great blog post about LZO and Hadoop. Many users have been frustr... more

html View Page

June 22

Excerpt: On June 10th, more than 750 people from around the world descended on the Santa Clara Marriott to share their... more

html View Page

June 2

Excerpt: For the last few months, we’ve been working with the TVA to help them manage hundreds of TB of data from... more

html View Page

Introducing Sqoop

Aaron Kimball

June 1

Excerpt: In addition to providing you with a dependable release of Hadoop that is easy to configure, at Cloudera we... more

html View Page

May 29

Excerpt: A few months ago we announced the Cloudera Distribution for Hadoop. We’re happy to report that lots... more

html View Page

May 28

Excerpt: In my first few weeks here at Cloudera, I’ve been tasked with helping out with the Apache ZooKeeper... more

html View Page

May 28

Excerpt: As Hadoop continues to turn heads at startups and big enterprises alike, Cloudera has received several request... more

html View Page

May 27

Excerpt: Lately, we’ve been spending a lot of time on the East Coast, and one thing is clear: Hadoop is everywher... more

html View Page

May 22

Excerpt: Administrators of HDFS clusters understand that the HDFS metadata is some of the most precious bits they have.... more

html View Page

10 MapReduce Tips

Tom White

May 18

Excerpt: This piece is based on the talk “Practical MapReduce” that I gave at Hadoop User Group UK on April 14.... more

html View Page

May 7

Excerpt: Hadoop Core version 0.20.0 was released on April 22. In this post I will run through some of the larger or mor... more

html View Page

May 11

Excerpt: A while back, we noticed a blog post from Arun Jacob over at Evri (if you haven’t seen Evri before,... more

html View Page

High Energy Hadoop

Matt Massie

May 1

Excerpt: We asked Brian Bockelman, a Post Doc Research Associate in the Computer Science & Engineering Depar... more

html View Page

April 27

Excerpt: When we announced Cloudera’s Distribution for Hadoop last month, we asked the community to give us fe... more

html View Page

Pig Training Now Available Online

Christophe Bisciglia

April 23

Excerpt: Today I did a web search for “pig training” using my favorite search engine. I was wildly entertai... more

html View Page

April 22

Excerpt: Welcome to the first guest post on the Cloudera blog. The other day, we saw Toby from Swingly tweeting... more

html View Page

April 21

Excerpt: Last Tuesday – on my second day of work at Cloudera – I went to London to check out the second UK... more

html View Page

April 15

Excerpt: In the process of working on a few things here I wanted to add some links to launch Hive and the Hadoop Jobt... more

html View Page

April 20

Excerpt: One of the perks of using Java is the availability of functional, cross-platform IDEs. I use vim for my da... more

html View Page

April 9

Excerpt: A few weeks ago we announced Cloudera’s Distribution for Hadoop, and I want to spend some time showing... more

html View Page

March 30

Excerpt: Configuring a Hadoop cluster is something akin to voodoo. There are a large number of variables in hadoop-def... more

html View Page

March 15

Excerpt: One of the repeating themes we have heard while working with our customers and the community is that Hadoop co... more

html View Page

Hadoop Metrics

Philip Zeyliger

March 12

Excerpt: Hadoop’s NameNode, SecondaryNameNode, DataNode, JobTracker, and TaskTracker daemons all expose runtime m... more

html View Page

March 13

Excerpt: Exciting news: We’re providing our basic Hadoop training for free online. We’ll still host bas... more

html View Page

March 6

Excerpt: Hadoop’s strength is that it enables ad-hoc analysis of unstructured or semi-structured data. Relational... more

html View Page

February 10

Excerpt: You might think that the SecondaryNameNode is a hot backup daemon for the NameNode. You’d be wrong. The... more

html View Page

February 2

Excerpt: Small files are a big problem in Hadoop — or, at least, they are if the number of questions on the user... more

html View Page

January 5

Excerpt: It’s a new year, the time when we take a moment to look back at the previous one, and forward to what mi... more

html View Page

December 31

Excerpt: The first release (0.19.0) from the 0.19 branch of Hadoop Core was made on November 24. Many changes go into... more

html View Page

Testing Hadoop

Tom White

December 16

Excerpt: As a developer coming to Hadoop it is important to understand how testing is organized in the project. For the... more

html View Page

December 3

Excerpt: A few weeks ago we ran a Hadoop hackathon. ApacheCon participants were invited to use our 10-node Hadoop clust... more

html View Page

November 14

Excerpt: It is common for a MapReduce program to require one or more files to be read by each map or reduce task before... more

html View Page

November 2

Excerpt: As promised in my post about installing Scribe for log collection , I’m going to cover how to configure... more

html View Page

October 28

Excerpt: Scribe is a newly released log collection tool that dumps log files from various nodes in a cluster to Scri... more

html View Page

October 23

Excerpt: We’ve created this blog as a place to post tips, tricks and insights on using Hadoop and related project... more

html View Page

October 24

Excerpt: Apache Hadoop exists within a rich ecosystem of tools for processing and analyzing large data sets. At Facebo... more

html View Page