This is the documentation for Cloudera Manager 4.8.4.

Using the LZO Parcel

Installing and Activating the Parcel

  1. Add the appropriate repository to Cloudera Manager’s list of parcel repositories as described in Parcel Configuration Settings. The HADOOP_LZO parcel repositories can be found at http://archive.cloudera.com/gplextras/parcels/. Install the version of the HADOOP_LZO parcel that matches your Impala version. Starting with Cloudera Manager 4.8, Impala versions 1.2.1 and later are supported; use the LZO parcel version listed for your Impala version in the following table:
    Impala Version    LZO Parcel Version
    1.4.0             HADOOP_LZO-0.4.15-1.gplextras.p0.85
    1.3.1             HADOOP_LZO-0.4.15-1.gplextras.p0.64
    1.2.4             HADOOP_LZO-0.4.15-1.gplextras.p0.57
    1.2.3             HADOOP_LZO-0.4.15-1.gplextras.p0.39
    1.2.2             HADOOP_LZO-0.4.15-1.gplextras.p0.37
    1.2.1             HADOOP_LZO-0.4.15-1.gplextras.p0.33
    If required, the repository can be mirrored in the same way as a CDH repo.
  2. Download, distribute, and activate the parcel as described in Managing Parcels.
  3. Ensure that the lzop.x86_64 and lzo.x86_64 packages are installed.
  4. Reconfigure and restart services as described in the following sections. Services that do not require the use of LZO need not be reconfigured.
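
The version table in step 1 can be expressed as a small lookup, which may help when scripting parcel selection. This is only a sketch: the function name lzo_parcel_for_impala is hypothetical and is not part of Cloudera Manager; the version pairs are copied from the table.

```shell
# Hypothetical helper (not a Cloudera tool): print the HADOOP_LZO parcel
# version that the table in step 1 lists for a given Impala version.
lzo_parcel_for_impala() {
  case "$1" in
    1.4.0) echo "HADOOP_LZO-0.4.15-1.gplextras.p0.85" ;;
    1.3.1) echo "HADOOP_LZO-0.4.15-1.gplextras.p0.64" ;;
    1.2.4) echo "HADOOP_LZO-0.4.15-1.gplextras.p0.57" ;;
    1.2.3) echo "HADOOP_LZO-0.4.15-1.gplextras.p0.39" ;;
    1.2.2) echo "HADOOP_LZO-0.4.15-1.gplextras.p0.37" ;;
    1.2.1) echo "HADOOP_LZO-0.4.15-1.gplextras.p0.33" ;;
    *)
      # Versions not in the table are reported as an error rather than guessed.
      echo "no LZO parcel listed for Impala $1" >&2
      return 1
      ;;
  esac
}
```

For example, lzo_parcel_for_impala 1.2.4 prints HADOOP_LZO-0.4.15-1.gplextras.p0.57, and an unlisted version returns a non-zero status.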

MapReduce

  1. Go to the MapReduce service.
  2. Select Configuration > View and Edit.
  3. Search for MapReduce Client Safety.
  4. In the MapReduce Client Environment Safety Valve, enter the following two lines:
    • HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*
    • JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native
  5. Search for MapReduce Service Safety.
  6. In the MapReduce Service Environment Safety Valve, enter the following two lines:
    • HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*
    • JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native
  7. Search for io.compression.
  8. In the Compression Codecs property, click in the field, then click the + sign to open a new value field.
  9. Add the following two codecs:
    • com.hadoop.compression.lzo.LzoCodec
    • com.hadoop.compression.lzo.LzopCodec
  10. Click Save Changes.
  11. Restart MapReduce.
  12. Redeploy the MapReduce client configuration.
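
After redeploying the client configuration, it can be useful to confirm that both codecs reached the deployed configuration file. The following is a minimal sketch, assuming the codec list ends up in a file such as /etc/hadoop/conf/core-site.xml (that location is an assumption, not stated above); has_lzo_codecs is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if BOTH LZO codec class names from
# step 9 appear somewhere in the given configuration file.
has_lzo_codecs() {
  conf_file="$1"
  # Dots are escaped so they match literally rather than any character.
  grep -q 'com\.hadoop\.compression\.lzo\.LzoCodec' "$conf_file" &&
    grep -q 'com\.hadoop\.compression\.lzo\.LzopCodec' "$conf_file"
}

# Example call (the path is an assumption; adjust to your deployment):
#   has_lzo_codecs /etc/hadoop/conf/core-site.xml && echo "LZO codecs present"
```

The check is a plain text search, so it does not verify that the names sit inside the correct property; it only catches a codec that is missing entirely.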

YARN

  1. Go to the YARN service.
  2. Select Configuration > View and Edit.
  3. Search for io.compression.
  4. In the Compression Codecs property, click in the field, then click the + sign to open a new value field.
  5. Add the following two codecs:
    • com.hadoop.compression.lzo.LzoCodec
    • com.hadoop.compression.lzo.LzopCodec
  6. Expand the Service-Wide > Advanced category.
  7. In the YARN Service MapReduce Configuration Safety Valve property, specify:
    <property>
      <name>mapreduce.application.classpath</name>
      <value>$HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*</value>
    </property>
    <property>
      <name>mapreduce.admin.user.env</name>
      <value>LD_LIBRARY_PATH=/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native</value>
    </property>
  8. Expand the Gateway (Default) > Advanced category.
  9. In the Gateway Client Environment Safety Valve for hadoop-env.sh, enter the following two lines:
    • HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*
    • JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native
  10. Click Save Changes.
  11. Restart YARN.
  12. Redeploy the YARN client configuration.
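
The classpath and library path settings above only take effect if the activated parcel actually contains the referenced directories. As a rough sketch, a host can be checked as follows; check_lzo_layout is a hypothetical helper, and the JAR name hadoop-lzo.jar is taken from the Oozie section of this page.

```shell
# Hypothetical helper: verify that the parcel root contains the LZO JAR and
# the native library directory that the configuration values point to.
check_lzo_layout() {
  # Default to the parcel path used throughout this page.
  parcel_root="${1:-/opt/cloudera/parcels/HADOOP_LZO}"
  lib_dir="$parcel_root/lib/hadoop/lib"

  [ -e "$lib_dir/hadoop-lzo.jar" ] || {
    echo "missing $lib_dir/hadoop-lzo.jar" >&2
    return 1
  }
  [ -d "$lib_dir/native" ] || {
    echo "missing $lib_dir/native" >&2
    return 1
  }
  echo "LZO parcel layout looks complete under $parcel_root"
}
```

Running check_lzo_layout with no argument on a cluster host checks the default parcel location; a non-zero exit status suggests the parcel has not been distributed or activated on that host.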

Oozie

  1. On each host running the Oozie server, go to /var/lib/oozie and create a symlink to the Hadoop LZO JAR /opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/hadoop-lzo.jar. Do this even if an LZO JAR is already present.
  2. Restart Oozie.
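
The symlink in step 1 can be created with ln. The sketch below wraps it in a hypothetical function, link_lzo_jar, whose defaults are the two paths given in step 1; run it as a user with write access to /var/lib/oozie.

```shell
# Hypothetical helper: link the Hadoop LZO JAR into the Oozie directory.
# Arguments are optional and default to the paths named in step 1.
link_lzo_jar() {
  jar="${1:-/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/hadoop-lzo.jar}"
  oozie_dir="${2:-/var/lib/oozie}"
  # -s creates a symbolic link; -f replaces an existing file or link with
  # the same name, so the command can be re-run safely.
  ln -sf "$jar" "$oozie_dir/hadoop-lzo.jar"
}
```

Note that -f overwrites any existing hadoop-lzo.jar in the target directory. That matches the instruction to create the symlink even when a JAR is present, but drop the flag if you prefer the command to fail instead of replacing a file.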

HBase

Restart HBase.

Impala

  1. On all hosts, install the following packages:
    • RHEL 5 / 6 - lzo
    • SLES 11, Ubuntu, and Debian - liblzo2-2
  2. Restart Impala.
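
The package names in step 1 map onto each platform's package manager roughly as sketched below. The function lzo_install_cmd and its platform keywords are hypothetical; it only prints the command so that it can be reviewed before being run as root on each host.

```shell
# Hypothetical helper: print the install command for the LZO library package
# listed in step 1, keyed by a platform keyword chosen for this sketch.
lzo_install_cmd() {
  case "$1" in
    rhel)          echo "yum install -y lzo" ;;
    sles)          echo "zypper --non-interactive install liblzo2-2" ;;
    ubuntu|debian) echo "apt-get install -y liblzo2-2" ;;
    *)
      echo "unsupported platform: $1" >&2
      return 1
      ;;
  esac
}
```

For example, lzo_install_cmd sles prints zypper --non-interactive install liblzo2-2.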

Hive

Restart Hive.

Sqoop

  1. Go to the Sqoop service.
  2. Select Configuration > View and Edit.
  3. Expand the Service-Wide > Advanced category.
  4. Add the following entries to the Sqoop Service Environment Safety Valve property:
    • HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*
    • JAVA_LIBRARY_PATH=$JAVA_LIBRARY_PATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native
  5. Click Save Changes.
  6. Restart the Sqoop service.