Apache Oozie Known Issues

Apache Oozie Server Vulnerability

A vulnerability in the Oozie Server allows a cluster user to read private files owned by the user running the Oozie Server process.

Products affected: Oozie

Releases affected: All releases prior to CDH 5.12.0, CDH 5.12.0, CDH 5.12.1, CDH 5.12.2, CDH 5.13.0, CDH 5.13.1, CDH 5.14.0

Users affected: Users running the Oozie Server

Date/time of detection: November 13, 2017

Detected by: Daryn Sharp and Jason Lowe of Oath (formerly Yahoo! Inc)

Severity: High

Impact: The vulnerability allows a cluster user to read private files owned by the user running the Oozie Server process. The malicious user can construct a workflow XML file containing XML directives and configuration that reference sensitive files on the Oozie server host.

CVE: CVE-2017-15712

Immediate action required: Upgrade to a release in which the issue is fixed.

Addressed in release/refresh/patch: CDH 5.13.2 and higher, CDH 5.14.2 and higher, CDH 5.15.0 and higher

Oozie does not consider the timezone parameter when scheduling using cron-like syntax

Oozie workflows can be scheduled using coordinators. Coordinators can have start time, end time, and frequency parameters. All of these affect when and how often the coordinator actions (occurrences) of a coordinator job (definition) are scheduled to run.

There are three ways to choose the frequency parameter:
  1. Use a constant value, interpreted as a number of minutes, such as frequency="60". Because the number of minutes in a day varies in a time zone that observes daylight saving time, only values less than 1440 (the number of minutes in a day without a daylight saving change) are likely to be useful.
  2. Use an EL expression such as frequency="${coord:days(1)}". This repeats the workflow once every day, based on the start time.
  3. Use a cron expression such as frequency="10 9 * * *". This schedules the coordinator application every day at 9:10 a.m.
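
For illustration, the three forms might appear in a coordinator definition as sketched below. This is a minimal example rather than a tested configuration: the application name, the start and end times, the Europe/Budapest timezone, and the ${workflowAppUri} property are placeholders, and the schema version is an assumption that may need adjusting for your release.

```xml
<!-- Illustrative coordinator; the name, dates, timezone, and app path are placeholders. -->
<coordinator-app name="example-coord"
                 frequency="${coord:days(1)}"
                 start="2018-01-01T09:10Z" end="2018-12-31T09:10Z"
                 timezone="Europe/Budapest"
                 xmlns="uri:oozie:coordinator:0.4">
  <!-- Alternative values for the frequency attribute:
         frequency="60"           constant value: every 60 minutes
         frequency="10 9 * * *"   cron expression: every day at 9:10
  -->
  <action>
    <workflow>
      <app-path>${workflowAppUri}</app-path>
    </workflow>
  </action>
</coordinator-app>
```

Only the frequency attribute changes between the three forms; the timezone attribute is written the same way in each case.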

If you create a scheduled job using either a constant value or an Expression Language expression while using the timezone parameter to supply a timezone with daylight saving time, Oozie schedules the coordinator actions considering daylight saving changes.

This is not the case when you use a cron expression together with a timezone parameter that supplies a DST-aware timezone. In this case, Oozie schedules the coordinator actions without considering the timezone parameter. Only the oozie.processing.timezone value configured in oozie-site.xml is considered, and only for calculating the offset to GMT. Because it is set in oozie-site.xml, this value affects the behavior of every coordinator job.

Affected Versions: CDH 5.14.0 and lower.

Fixed in Versions: None.

Bug: OOZIE-2494

Cloudera Bug: CDH-40279

Workaround: In Cloudera Manager, configure Oozie Server Advanced Configuration Snippet (Safety Valve) for oozie-site.xml as follows:
<!-- The GMT offset value for Europe/Budapest summer time -->
<property>
  <name>oozie.processing.timezone</name> 
  <value>GMT+02:00</value> 
</property> 
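
Because the value above is a fixed offset chosen for summer time, it presumably no longer matches local time once daylight saving time ends. Assuming the same approach is carried over to standard time, when Europe/Budapest is one hour ahead of GMT, the counterpart snippet might look like this (an untested sketch, not a documented procedure):

```xml
<!-- Assumed counterpart: the GMT offset value for Europe/Budapest standard (winter) time -->
<property>
  <name>oozie.processing.timezone</name>
  <value>GMT+01:00</value>
</property>
```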

After enabling HA, Oozie may fail to start due to "NoSuchFieldError: EXTERNAL_PROPERTY"

This issue occurs in rare cases. Because of an incompatibility between the versions of Jackson used by Oozie and Hive, Oozie may fail to start, depending on the order in which JARs are loaded into Oozie's classpath.

Affected Versions: CDH 5.9.3 and lower.

Fixed in Versions: CDH 5.10.0 and higher.

Bug: HIVE-1640

Cloudera Bug: CDH-42408

Workaround: If using parcels:
  1. Delete or move /opt/cloudera/parcels/CDH/lib/oozie/libserver/hive-exec.jar and /opt/cloudera/parcels/CDH/lib/oozie/libtools/hive-exec.jar.
  2. Download hive-exec-<cdh version>-core.jar from the Cloudera repo and put it in /opt/cloudera/parcels/CDH/lib/oozie/libserver/ and /opt/cloudera/parcels/CDH/lib/oozie/libtools/.
  3. Download kryo-2.22.jar from the Maven repository and put it in /opt/cloudera/parcels/CDH/lib/oozie/libserver/ and /opt/cloudera/parcels/CDH/lib/oozie/libtools/.

Oozie Web Console returns 500 error when Oozie server runs on JDK 8u75 or higher

The Oozie Web Console returns a 500 error when the Oozie server is running on JDK 8u75 or higher. The Oozie server still functions, and you can use the Oozie command line, REST API, Java API, or the Hue Oozie Dashboard to review the status of jobs.

Affected Versions: CDH 5.x and higher, except for the releases listed below.

Fixed in Versions: CDH 5.5.5, 5.7.2, 5.8.2, 5.9.0 and higher.

Bug: OOZIE-2533

Cloudera Bug: CDH-40362

Workaround: Use an earlier version of Java 8 or use the Hue Oozie Dashboard.

Oozie jobs fail (gracefully) on secure YARN clusters when JobHistory server is down

If the JobHistory server is down on a YARN (MRv2) cluster, Oozie attempts to submit a job three times by default. If the job fails, Oozie automatically puts the workflow in a SUSPEND state.

Affected Versions: CDH 5 Beta 1 and higher.

Cloudera Bug: CDH-14623

Workaround: When the JobHistory server is running again, use the resume command (for example, oozie job -resume <job-id>) to tell Oozie to continue the workflow from the point at which it left off.

Oozie does not start when oozie.email.smtp.auth is disabled

If you enable SLA integration, and oozie.email.smtp.auth is disabled, Oozie throws a NullPointerException and fails to start.

Affected Versions: CDH 5.5.1 and lower.

Bug: OOZIE-2365

Cloudera Bug: CDH-35331

Workaround: In Cloudera Manager, configure Oozie Server Advanced Configuration Snippet (Safety Valve) for oozie-site.xml as follows:
<property>
  <name>oozie.email.smtp.password</name> 
  <value>none</value> 
</property> 
<property> 
  <name>oozie.email.smtp.username</name> 
  <value>none</value> 
</property>

Oozie works with MapReduce or YARN, but not both

The Oozie server works with a MapReduce (MRv1) cluster or a YARN (MRv2) cluster, but not both at the same time.

Workaround: Use two different Oozie servers.