What can Hadoop Do For You?

Five years ago this past April, I met Dave Mariani.  Dave was at Yahoo!, where he ran the company’s audience and advertising data analytics platforms.  I was doing marketing for the Microsoft Business Intelligence suite of products - from SQL Server and its fantastic OLAP technology, Analysis Services - to applications like Business Scorecard Manager, Performance Point Server and PowerBI.

Dave and his team had put together the Yahoo! “TAO cube” - the world’s largest Microsoft OLAP cube.  “TAO” stood for Targeting, Analytics and Optimization, and it was a powerful analytical platform that created millions of dollars in advertising value for Yahoo!

TAO ran on top of Apache Hadoop and processed 30+ billion user and advertising events per day, with hourly refresh.  At the time, Yahoo! accounted for almost half of the world’s online traffic.  Talk about Big Data!

Prior to meeting Dave, I had had a fair amount of exposure to the business value analytics could return.  I had worked at BusinessObjects where, after buying Acta and Crystal Reports, we became the #1 Business Intelligence tool in the market (BusinessObjects was bought by SAP in 2007 for almost $7B).

But this time, something was different.  Hadoop had been introduced in the mid-2000s.  And it took less than 5 years for visionaries like Dave to monetize Hadoop through analytics.  Dave went on to Klout after Yahoo!, where I had the honor of working with him again.  At Klout, he managed a 200-node Hadoop cluster with over 1.4 petabytes of data storage.  His team scored almost half of Facebook’s data and ingested over 12B signals per day, all powered by a Hive Data Warehouse with over 1 trillion rows of data.  More on that story here.

So, this past year, when Dave and I talked about the opportunity to work together again, I methodically thought through the team, the market and the software (just as one of my heroes and mentors, Dave Kellogg, had taught me).

Big Data’s X factor:  Accelerated market and path to value

Oracle, the world’s #1 database company, was founded in 1977.  It went public almost 12 years later.  BusinessObjects, the world’s #1 BI company, started 13 years after Oracle but went public only 4 years after inception.  In less than a generation, our industry had shrunk the path to IPO to a third of what it had been!  An IPO is not everything, of course, but when it comes to racing to a company’s first $100M (typically what’s required for an IPO), we learned that the commercialization of business analytics and applications offered a faster path to value.

Hadoop has now been around for about 10 years, and we can safely say it will be the #1 database of the Big Data Era.  The first company to monetize this data platform and go public on it took less than 4 years (versus 12 years for Oracle).

How long will it take for the #1 BI on Hadoop company to return $100M in value?  And what’s required?

BI on Hadoop.  The time is now.

Of course, you’d expect me to say AtScale, wouldn’t you? :)  Well, instead I’d like to point to three new industry trends that make the case for a new generation of “BI on Hadoop” players.

  1. Hadoop is here to stay.  This past April, Barclays released a 52-page report on the maturity of Hadoop in the enterprise.  If your company is not actively looking to monetize Hadoop, you might be behind the curve.  Want to assess your progress against your peers and see what’s needed to catch up?  Take the Hadoop Maturity assessment now!
  2. Business users want more data: the key to the business value of any data platform is user adoption.  Business users want more data and Tableau (NASDAQ: DATA) is cashing in on this trend: they went from 5,000 customers in 2010 to 26,000 in 2015.  What if there were a way to bridge the gap between Hadoop and BI tools like Tableau without hassle?  Well, there is now.  See how it works here.
  3. Community momentum for BI on Hadoop is ON: a few years ago, connecting Hadoop and BI was particularly challenging.  Little had been done in the areas of in-memory processing or query engines for Hadoop.  First-generation vendors had to create proprietary technology to fill this gap.  But, starting in 2013, the ecosystem received massive upgrades with Cloudera Impala, SparkSQL and HiveTez.  The community has been carrying these innovations forward, and any company built on top of the community’s SQL on Hadoop engines is bound to get a substantial lift.  Are you confused about Impala, SparkSQL, HiveTez and how you can use them for Business Intelligence (BI)?  Sign up for this webinar.

I’m sure there are a few more “post-2013” innovations I’m missing.  Can you think of any one in particular that will revolutionize this space?

=====

Bruno Aziza is a big data entrepreneur and author. He has led marketing at multiple startups and has worked at Microsoft, Apple, and BusinessObjects/SAP. One of his startups sold to Symantec in 2008, and two of them have raised tens of millions and experienced triple-digit growth. Bruno is currently Chief Marketing Officer at AtScale. You can contact him at bruno@atscale.com