Dwango Co., Ltd. is an entertainment, telecommunications, and media company based in Japan that offers a variety of digital content and services online.
Guided by its company vision of “Born in the net, Connected by the net,” Dwango has been driving Japan’s Internet industry and developing systems for online games.
The company provides communication platforms such as Niconico Video and Niconico Live Streaming. It creates value by using the power of the Internet to blend opposing concepts: virtual and real, digital and analog, innovation and tradition.
Dwango was already using Apache Hadoop, but growing demand for data analytics and a desire to reduce the burden of infrastructure management led the organization to embark on a big data journey.
Managing the infrastructure of Dwango’s open-source analytics platform had become difficult and required more staff with technical skills. The platform also needed to be accessible across the organization because demand was rising throughout the business. Finally, dedicating more resources to improving data analytics, including the use and management of a wide range of tools, would be imperative to drive business growth.
Dwango analyzes data created on a UGC (user-generated content) platform that includes user videos as well as text. Keiichiro Tsukamoto, General Manager of the Data Management Service Department in Dwango’s ICT Service Division, says, “The importance of data utilization is growing across the company, with the company forming new departments over the past several years during organizational restructuring as it seeks to create a cross-organizational data analytics platform.”
To keep up with growth and the rising volume of new unstructured data, the company needed to move away from an environment that it operated and managed itself. “Instead of spending time and resources on infrastructure operations, we needed to improve data analytics, and allocate more resources to defining and managing data,” recalls Tsukamoto.
After a validation process that evaluated Dwango’s support and specification needs, the company selected Cloudera as the Big Data analytics platform to drive its journey forward. Because the move required migrating huge amounts of data and consolidating storage nodes, the company created a scalable analytics platform with data aggregation and analytics capabilities. The updated platform can be leveraged both within and outside the company; for example, it makes it possible to offer marketing data to owners of the Niconico video sharing site.
During its research into a new environment, the company became interested in the Cloudera enterprise platform. “To reduce the burden of infrastructure management, we needed a solution with a support system. Cloudera’s technical support team operates under a clear system, providing quick responses to our questions,” says Tsukamoto. Dwango had previously operated UGC-related services from its own data center, so Cloudera’s extensive experience in delivering both on-premises and cloud environments made it a good match.
The platform’s primary application is measuring the effectiveness of services such as Niconico Video and Niconico Live Streaming. Separate from the web analytics platforms that the web director uses to analyze access, the data analytics platform provides detailed analysis of data, including application logs collected from multiple systems, as well as deeper analysis to measure impact. Apache HBase, the distributed data store integrated with the Cloudera platform, is used as part of the system that displays comments on Niconico Video.
Dwango has increased the cluster size to over 50 nodes through server consolidation. Data volume has grown almost tenfold, to over one petabyte, since Cloudera was initially deployed.
These solutions have made cross-functional collaboration within the organization much easier. With query engines such as Hive for batch processing and Impala for high-speed analysis, teams can now obtain much-needed analytics results on the spot during meetings. Combining Impala with the business intelligence tool Tableau enables on-site visualization, one example of how the platform has sped up data analytics operations.
As for support, Tsukamoto says that generous support from Cloudera engineers has made stable operation possible. “We had a lot of newbies on our team when the system was first deployed, so the leaders took on a lot of responsibility. Because we were able to rely on their support as soon as an issue arose, we were able to focus our human resources on building an environment for the future. I really appreciate their kind support in various situations ranging from migration planning to support for selecting needed equipment for platform expansion. Thanks to them, we were able to complete numerous migrations on schedule.”
Moving forward, Tsukamoto says the company would like to adopt Cloudera Data Platform (CDP) in the cloud while continuing to leverage the environment it has built on its on-premises system. “Over many years of operation, we have accumulated a large amount of cold data, so I would like to implement an environment that can manage this data appropriately according to how frequently the data is used. An environment in which we could procure and migrate necessary resources from the cloud while leaving the on-premises system intact would be ideal. So, for that reason, I have great hopes for Cloudera Data Platform (CDP) as a new solution.”
The company is looking at ways to further leverage the data analytics platform across other departments. “The fact that we can abstract the required environment for data analytics while centrally managing it with Cloudera Manager means we can count on this contributing to governance, and we’ll be able to use all types of environments including on-premises, private clouds, and public clouds. Our data analytics needs will likely further increase as data continues to grow, so I think CDP will be seriously considered,” said Tsukamoto in closing.