
What’s taking driverless cars so long? Part 2: The big challenge of big data

Grace Dobush Freelance journalist, Editor of the ADP ReThink Quarterly

JULY 26, 2023


Handling the massive amounts of data created by autonomous vehicles in real-world scenarios takes a village.

In the past five years, great strides have been made in the driver assistance technology, deep learning, and communication protocols required for fully autonomous vehicles.

For all that progress, a few speed bumps still lie ahead. The first installment addressed the wildcard human factor. This report focuses on the issue of big data and the technological challenges of autonomous driving. 

“Autonomous driving is in the trough of disillusionment now,” says Brian Carlson, director of global product and solutions marketing at NXP Semiconductors. “The problem is that it’s much more difficult than people realized.”

“It takes tons of data,” says Michael Ger, a long-time managing director of manufacturing and automotive at Cloudera, who recently retired. “It’s incredibly complicated in terms of being able to ingest, store this data, and train the perception layers.”

The sensors remain very expensive, and the flood of data they produce requires substantial computing power to process and interpret. Cloudera joined NXP, Teraki, Airbiquity, and WNDRVR in creating the Fusion Project to solve this challenge.

“We formed an industry consortium to demonstrate a full data loop for training autonomous vehicles,” says Geert-Jan van Nunen, co-founder of Teraki. “It’s a huge ecosystem, from chip sets to sensors to data training and storage and connecting the cloud with the car in a secure, safe way. The Fusion Project will show the world this whole ecosystem and lifecycle works in reality, in real cars with real people.” 




Real-time driving data

Data collection from sensor-equipped vehicles on the roads is necessary to inform the next generations of self-driving vehicles. 

Humans have five senses. For cars, it’s radar, lidar, and video. 

“More sensors are being used per car and they’re becoming more high resolution,” van Nunen says. The increased resolution helps improve the reliability of machine learning — high-definition video is more likely to identify a person in the street as a person. 

“All the million miles of data can be played back like a tape recorder,” Ger says. “It’s one of the reasons machine learning is critical and takes so long: The data management side of it is huge.”

“But that causes a tsunami of data. All the data needs to be processed in real time in the car, and that’s a huge challenge. You want safe and fast detection of what’s around you in the car as you’re moving,” van Nunen explains. “We solve this problem in the car without loading a data center into the car. 

“Our software detects the most important frames and objects and uploads those in HD, but keeps less important objects in low res. It de-noises the signal by processing it with less computing power.”
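The idea van Nunen describes, keeping important frames in high resolution and shrinking the rest, can be illustrated with a minimal sketch. This is not Teraki's software: the `importance` score, the 0.5 threshold, and the every-other-pixel downsampling are all hypothetical stand-ins chosen only to show how selective resolution cuts data volume.

```python
from dataclasses import dataclass
from typing import List

Frame = List[List[int]]  # grayscale image as rows of pixel values


@dataclass
class ScoredFrame:
    pixels: Frame
    importance: float  # hypothetical score in [0, 1], e.g. from an object detector


def downsample(pixels: Frame, factor: int) -> Frame:
    """Keep every `factor`-th pixel in both directions (a crude low-res copy)."""
    return [row[::factor] for row in pixels[::factor]]


def select_for_upload(frames: List[ScoredFrame],
                      threshold: float = 0.5,
                      factor: int = 2) -> List[Frame]:
    """Keep important frames at full resolution and shrink the rest."""
    return [
        f.pixels if f.importance >= threshold else downsample(f.pixels, factor)
        for f in frames
    ]


def pixel_count(pixels: Frame) -> int:
    """Total number of pixels in a frame, used here as a proxy for data volume."""
    return sum(len(row) for row in pixels)


frames = [
    ScoredFrame([[0] * 8 for _ in range(8)], importance=0.9),  # e.g. pedestrian ahead
    ScoredFrame([[0] * 8 for _ in range(8)], importance=0.1),  # e.g. empty road
]
uploaded = select_for_upload(frames)
print([pixel_count(p) for p in uploaded])  # [64, 16]
```

In this toy example the low-importance frame shrinks to a quarter of its original size while the high-importance frame is untouched.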

The decreased data volume saves power and saves space. “A lot of people do this today with huge racks of hard drives filling the trunk,” Carlson says. “We’re working with production hardware in vehicles. Not just with test vehicles but using real data from vehicles driving down the road.”

Real-world driving is complex and unpredictable. Improving vehicles beyond the capacity of humans, who can get tired or distracted, is necessary for the vehicles’ safe coexistence with other cars, pedestrians, and cyclists.

“Even if you’re 99.9% safe, that .1% over millions of miles is too dangerous,” Carlson says. “It’s not ready for prime time, even if Elon Musk says it is. It’s a continuously moving target.” 
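A quick back-of-the-envelope calculation shows why a 0.1% gap matters at scale. The sketch below assumes, purely for illustration, that "99.9% safe" means 99.9% of miles are driven without a problem; that reading and the mileage figures are assumptions, not Carlson's definition.

```python
def unsafe_miles(total_miles: float, safe_fraction: float) -> float:
    """Miles not covered by the safe fraction, under the per-mile assumption."""
    return total_miles * (1.0 - safe_fraction)


for miles in (1_000_000, 10_000_000):
    # Round to whole miles to hide floating-point noise.
    print(miles, round(unsafe_miles(miles, 0.999)))
```

Under that assumption, one million miles of driving leaves about 1,000 miles outside the safe fraction, and ten million miles leaves about 10,000.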

 

An imminent application

One important backup safety system for autonomous test vehicles is driver monitoring — AI-powered video monitoring ensuring that the human on board is alert and aware of any potential trouble.

“People driving their personal cars might be uncomfortable with a driver monitor constantly observing them,” says Kelly Funkhouser, manager of vehicle technology at Consumer Reports. But long-haul truckers already operate this way, so it seems like an easier transition.

Since the pandemic began, autonomous vehicles have been emerging as a solution for the long-haul trucking industry’s labor issues. As online shopping boomed during the pandemic, logistics became an increasingly important sector of the world economy. 

And closed highways are a much safer environment in which to test autonomous trucks. “I see more operators using autonomous vehicles for the long haul in the next few years, where a driver will come in for city driving,” Carlson predicts.

Embark Trucks plans to commercially launch its self-driving fleet in California and Texas in 2024. Waymo Via (a spinoff of Alphabet) is working with operator Ryder System to test autonomous trucks across the U.S., to be built by Daimler. TuSimple has been testing its trucks in the Southwest, partnering with UPS and Navistar.

Partnerships are crucial

Our potentially autonomous future comes down to economics. 

“For this to really take off, the sensing and computing capability has to be really driven to a scale for mass production efficiencies,” Ger says. Much innovation in the autonomous automotive space is happening among suppliers and startups. That’s evident in auto manufacturers’ recent acquisitions of tech startups. 

“A lot of the ubiquitous smart features that have come out in the past five years have come from moonshot research, but OEMs will adopt it when it becomes more affordable,” Carlson says. “The cloud is critical for all future vehicles. We’re experiencing major shifts: the shift in architecture and the shift to electric. Can the industry afford to do all of it at once?”

We’re finding out now. For current EV batteries, high-powered processing is too much of a drain. “That impacts the range. Treating a car like a data center on four wheels doesn’t work if you want the battery to last all day,” Carlson says. 

“OEMs are putting more money into shifting to 100% electric over the next five to 10 years. Most people are talking about electric cars accounting for half of all vehicles in 2030,” Carlson says. “We don’t see that much of a slice for autonomous vehicles.” He predicts that we’re likely to see Level 3 vehicles by the end of the decade.

Accenture reported that industry experts predict Level 2 vehicles will make up 60% of the market by 2030 — up from 15% in 2021. Level 3 and 4 vehicles are expected to make up only 5% of the market by 2030.

We do know that handling the massive amounts of data created by autonomous vehicles takes a village. And sharing and harmonizing that data has major potential benefits for the whole industry — a rising tide lifts all boats.

Article by


Grace Dobush

Grace Dobush is a freelance journalist based in Berlin. She has contributed to Fortune, Wired, and Quartz and is the editor of the ADP ReThink Quarterly.
