This post is guest-authored by our friends at Arimo, describing why and how they bet on Apache Spark. In early 2012, a group of engineers with backgrounds in distributed systems and machine learning came together to form Arimo. We saw a major unsolved problem in the nascent Hadoop ecosystem: it was largely a storage play. Data was sitting passively on HDFS, with very little value being extracted. To be sure, there was MapReduce, Hive, Pig, etc., but value is