Rapleaf uses Hadoop to process the many data feeds it collects and synthesize them into a single, accurate view. Log messages are sent through Scribe and loaded into the Hadoop Distributed File System (HDFS); log data is loaded every ten minutes and amounts to 1-2 TB per day. Other data sources are loaded hourly or daily. Rapleaf also runs periodic jobs over the logs to compute statistics and verify that the pipeline is operating correctly.
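
To make the ingestion step concrete, here is a minimal sketch of the kind of ten-minute loader described above, written against Hadoop's standard FileSystem API. The spool directory, the date-partitioned HDFS layout, and the move-then-delete behavior are illustrative assumptions, not Rapleaf's actual pipeline.

```java
import java.text.SimpleDateFormat;
import java.util.Date;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

/**
 * Sketch of a periodic log loader: copies completed Scribe output
 * files from a local spool directory into a date-partitioned HDFS
 * directory. Paths and layout are assumptions for illustration.
 */
public class ScribeLogLoader {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // fs.defaultFS in the loaded configuration points at the namenode
        FileSystem local = FileSystem.getLocal(conf);
        FileSystem hdfs = FileSystem.get(conf);

        // Assumed local directory where Scribe writes finished log files
        Path spoolDir = new Path("/var/scribe/spool");
        // Assumed HDFS layout: one directory per ten-minute batch
        Path target = new Path("/logs/" +
                new SimpleDateFormat("yyyy/MM/dd/HHmm").format(new Date()));

        hdfs.mkdirs(target);
        for (FileStatus file : local.listStatus(spoolDir)) {
            // Copy each log file into HDFS and delete the local copy
            // (first argument true = remove the source after copying)
            hdfs.copyFromLocalFile(true, file.getPath(), target);
        }
    }
}
```

A loader like this would typically be driven by cron or a workflow scheduler every ten minutes; the downstream stat-computing jobs can then consume each batch by reading the corresponding dated directory.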
