Google Capital, the new investment arm of the search giant, has spent $80 million of its $300 million store to help finance what Brandon Butler of NetworkWorld has called "perhaps the best Hadoop distribution company that many people haven't heard of."
MapR Technologies -- the aforementioned Hadoop company -- has landed $110 million in financing from investors, with Google Capital providing the lion's share of the money. But why MapR, and why Google Capital?
MapR's been a dark horse in the Hadoop world compared to other distributions. Cloudera's made a name for itself by mixing a pragmatic business approach with a savvy sense of how to leverage open source. Hortonworks has partnered closely with Microsoft to become more accessible to end-users without dropping its pure open source play. Pivotal's concentrated on adding commercial-only components to make Hadoop into a ready-to-use business analytics tool.
MapR doesn't have a similar narrative -- at least, not one widely known to the enterprises that have become Hadoop's biggest adopters. Consequently, its user base is comparatively small. The press release for the financing notes that MapR is in use by "more than 500 customers" across multiple business sectors.
However, MapR's feature set is catching enterprise interest. Forrester Research rated MapR as a Hadoop offering nearly on par with Cloudera's or Hortonworks's solutions, citing its support for Network File System (NFS) storage and high-availability and disaster-recovery features as major pluses. But Forrester noted it "has lagged behind the other pure-play vendors ... in terms of market awareness."
Google Capital's investment could, in part, fix MapR's visibility problems. According to Google Capital's CrunchBase profile, the company doesn't invest in startups, but rather in companies "across a range of industries ... with new technologies and proven track records in their fields," that are "ready to expand their business in big ways." MapR's news release about the funding states, "The new funding will increase worldwide go-to-market programs to accelerate the deployment of MapR in mission-critical, real-time, and operational use cases."
Those kinds of use cases are seen as must-have features for Hadoop, not merely as add-ons or extras from the commercial provider. Open source components like Apache Spark and its commercially supported derivatives are designed to accelerate Hadoop processing for real-time or near-real-time operations. But there's more than one way to implement them -- such as via commercial projects like DataTorrent RTS -- so it makes sense to have competition, like MapR's, enriching the pot.
Another area where MapR's work may have greater future significance -- and might explain Google's investment -- is machine learning. The chief application architect at MapR, Ted Dunning, is also a project management committee member for the Apache Mahout project, a machine-learning library designed to be relatively easy to work with, including for Hadoop data sets.
Granted, any open source innovations produced by MapR can just as easily be picked up by Google's competition, but if Google ends up expanding on MapR's research in ways it can keep to itself, it'll be a net gain for Google.
[After this article ran, MapR contacted us to add that "MapR has over 500 paying customers [as opposed to 'users'] which we believe is more than the any other distro." They also stated that "the Forrester report ranked MapR as top product offering," with the highest score among all the reviewed Hadoop vendors.]