HDFS Reliability
- by Tom White
- January 14, 2009
- 4 comments
We’ve been talking to enterprise users of Hadoop about existing and new projects, and lots of them are asking questions about reliability and data integrity. So we wrote up a short paper entitled HDFS Reliability to summarize the state of the art and provide advice. We’d like to get your feedback, too, so please leave a comment.
-
gabriel /
January 14, 2009 / 9:28 PM
Thanks for the paper. I don’t know if I’m missing the obvious, but is there a link to raw reliability numbers?
-
hammer /
January 14, 2009 / 11:59 PM
Good commentary from Steve Loughran: http://1060.org/blogxter/entry?publicid=56985B46DE7063B5124E09772CE40CEA
-
yikai /
January 17, 2009 / 8:50 AM
Great article.
Two suggestions on “Protect the name node”:
1. Run a 64-bit JDK on the name node (with more memory), even if the data/compute nodes run 32-bit. (Yahoo)
2. Use dual power supplies on the name node box.
-
Redwood Job Scheduling /
January 30, 2009 / 9:05 PM
I’m more interested in the system’s reliability under hardware failures. Thanks for the information.