Fast Forward Labs Research now available without a subscription
All of our applied machine learning research is now publicly available and free to download from the Cloudera Fast Forward Labs landing page.
Navigating the machine learning landscape
Despite its promise, machine learning can be downright daunting. Best efforts can quickly be undermined by uncertainty about a rapidly changing technical landscape, bewilderment about how best to build and organize teams, and difficulty separating hype from reality.
Cloudera Fast Forward Labs Research focuses on emerging trends that are still evolving due to algorithmic and hardware breakthroughs, technological commoditization, and data availability. We encapsulate these trends in our reports, blog posts, and applied machine learning prototypes (AMPs), which demonstrate the capabilities of ML algorithms while adhering to best practices.
Who are we?
Cloudera Fast Forward Labs is an applied machine learning research group. Our mission is to empower enterprise data science practitioners to apply emergent academic research to production machine learning use cases in practical and socially responsible ways, while also driving innovation through the Cloudera ecosystem. Our team brings thoughtful, creative, and diverse perspectives to deeply researched work. In this way, we strive to help organizations make the most of their ML investment as well as educate and inspire the broader machine learning and data science community.
Inferring Concept Drift Without Labeled Data
Concept drift occurs when the statistical properties of a target domain change over time, causing model performance to degrade. Drift detection is generally achieved by monitoring a performance metric of interest and triggering a retraining pipeline when that metric falls below a designated threshold. However, this approach assumes ample labeled data is available at prediction time, an unrealistic constraint for many production systems. In this report, we explore various approaches for dealing with concept drift when labeled data is not readily accessible.
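To make the label-free setting concrete, one common unsupervised proxy (an illustrative technique, not necessarily the approach taken in the report) is to compare the model's output distribution on a labeled reference window against a recent, unlabeled window, using a two-sample Kolmogorov-Smirnov statistic as the divergence measure:

```python
def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic: the maximum gap
    between the empirical CDFs of the two samples."""
    a, b = sorted(sample_a), sorted(sample_b)
    i = j = 0
    gap = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance past ties in both samples before comparing CDFs.
        while i < len(a) and a[i] == x:
            i += 1
        while j < len(b) and b[j] == x:
            j += 1
        gap = max(gap, abs(i / len(a) - j / len(b)))
    return gap


def drift_detected(reference_scores, current_scores, threshold=0.2):
    """Flag drift when the model's score distribution on recent,
    unlabeled traffic diverges from the reference window.
    The 0.2 threshold is an illustrative choice, not a recommendation."""
    return ks_statistic(reference_scores, current_scores) > threshold
```

In practice, `reference_scores` might be predicted probabilities on a held-out validation set, and `current_scores` the predictions on the latest batch of production traffic; no ground-truth labels are needed for either window.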
Exploring Multi-Objective Hyperparameter Optimization
We develop machine learning models against the “usual suspect” metrics like predictive accuracy, recall, and precision. These metrics, however, are rarely the only things we care about. Production models must also satisfy operational requirements, such as latency or memory footprint, as well as fairness constraints. Hyperparameter optimization becomes even more challenging when there are multiple metrics to optimize at once. Our latest research examines this “multi-objective” hyperparameter optimization scenario in detail.
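One way to make the multi-objective trade-off concrete: score each hyperparameter configuration on every objective, then keep only the Pareto front of non-dominated configurations. The sketch below is a minimal illustration under assumed example values (the two-objective framing and the numbers are not taken from the report); each candidate is a tuple of objectives to minimize, e.g. (validation error, latency in ms):

```python
def dominates(q, p):
    """q dominates p if q is no worse on every objective and strictly
    better on at least one (all objectives are minimized here)."""
    return all(qi <= pi for qi, pi in zip(q, p)) and any(
        qi < pi for qi, pi in zip(q, p)
    )


def pareto_front(candidates):
    """Return the non-dominated candidates: the trade-off surface a
    multi-objective hyperparameter search presents to the user."""
    return [p for p in candidates if not any(dominates(q, p) for q in candidates)]


# Hypothetical (error, latency-ms) results for four configurations.
results = [(0.10, 50.0), (0.08, 120.0), (0.12, 40.0), (0.15, 200.0)]
front = pareto_front(results)  # (0.15, 200.0) is dominated and drops out
```

Rather than collapsing the objectives into a single weighted score up front, returning the whole front lets practitioners choose the accuracy-versus-latency trade-off that fits their deployment constraints.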