Managing Non-CDH Resources
From an operations perspective, CDH hosts may also run other processes, such as antivirus software or operating system backups. This topic presents information to help you plan for those processes.
Antivirus Software
If you run antivirus software on CDH hosts, consider excluding the following types of resources from scans:
- Scratch directories used by services such as Impala
- Log directories used by various Hadoop services
- Data directories, which can grow to petabytes in size
Operating System Backups
Many of the considerations outlined in Antivirus Software apply to operating system backups as well. Backing up scratch directories, log directories, and large amounts of data using standard operating system utilities may not make sense. In addition, many Hadoop resources cannot be backed up by conventional means due to their size and mutability. Consider excluding these types of resources from operating system backups, and using the techniques outlined in Backup and Disaster Recovery instead.
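As a rough illustration of the exclusion idea above, the following Python sketch filters a list of candidate paths against a set of excluded directory roots before they are handed to a backup tool. The directory names, `EXCLUDED_ROOTS`, `is_excluded`, and `filter_backup_candidates` are all hypothetical and chosen only for this example; substitute the scratch, log, and data directories actually configured on your hosts.

```python
from pathlib import PurePosixPath

# Hypothetical example locations; replace with the directories
# configured on your own hosts.
EXCLUDED_ROOTS = [
    PurePosixPath("/data/impala/scratch"),  # scratch directories
    PurePosixPath("/var/log/hadoop-hdfs"),  # service log directories
    PurePosixPath("/data/dfs"),             # data directories
]


def is_excluded(path, excluded_roots=EXCLUDED_ROOTS):
    """Return True if path is an excluded root or lies beneath one."""
    candidate = PurePosixPath(path)
    return any(
        candidate == root or root in candidate.parents
        for root in excluded_roots
    )


def filter_backup_candidates(paths, excluded_roots=EXCLUDED_ROOTS):
    """Keep only the paths that should still go to the OS backup."""
    return [p for p in paths if not is_excluded(p, excluded_roots)]
```

Comparing whole path components rather than string prefixes means that a sibling directory such as `/data/dfs2` is not excluded by accident when `/data/dfs` is on the list.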
Cloudera Manager stores its configuration in a database. If you use Cloudera Manager, back up this database regularly, using the mechanisms provided by the database vendor. See Backing Up Databases.
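One possible way to script such a backup is sketched below, assuming the Cloudera Manager database runs on MySQL and that `mysqldump` is the vendor mechanism in use. The database name `scm`, the host, the user, and the output directory are placeholders for illustration, not values taken from this topic, and `build_mysqldump_command` is a hypothetical helper. The sketch only assembles the command; credentials are assumed to come from a MySQL option file rather than the command line.

```python
import shlex
from datetime import date


def build_mysqldump_command(database, host, user, output_dir, day=None):
    """Build a mysqldump argument list that writes a dated dump file."""
    day = day or date.today()
    outfile = f"{output_dir}/{database}-{day.isoformat()}.sql"
    return [
        "mysqldump",
        "--single-transaction",  # consistent snapshot for InnoDB tables
        f"--host={host}",
        f"--user={user}",
        f"--result-file={outfile}",
        database,
    ]


if __name__ == "__main__":
    # All values below are placeholders (assumptions), not documented defaults.
    cmd = build_mysqldump_command(
        database="scm",
        host="cm-db.example.com",
        user="backup_user",
        output_dir="/backups/cloudera-manager",
    )
    # Print the command for review; pass cmd to subprocess.run(cmd, check=True)
    # to execute it.
    print(shlex.join(cmd))
```

Returning the command as a list rather than a single string keeps each argument separate, so it can be passed to `subprocess.run` without shell quoting issues. For other database vendors, use the equivalent vendor tool instead.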