Hadoop has been one of the most hyped technologies. Yet enterprise adoption is slow, a drop in the ocean compared to DBMS, for various reasons. But there are signs that the ecosystem is working to address its weaknesses.
Arun Chandrasekaran, Research Vice President, Storage, Cloud & Big Data, Gartner, explains the reasons for this slow adoption and talks about the challenges that organizations face with big data. Still, he sees some vendors doing significant work that might ramp up adoption. Are we likely to see a Made in India Hadoop product? Read on to find out.
Q. Where are we right now in the enterprise adoption of Hadoop? And what are the reasons for the slow adoption?
Arun: Gartner recently did a study on the adoption of Hadoop and some surprising findings came out. As you know, Hadoop is one of the most hyped technologies. It’s got a lot of potential and promise, and an innovative ecosystem, but it has weak adoption.
We ran that survey with close to 300 C-suite members, who are part of our research circle. Fifty-four percent of the organizations that we spoke to either had not deployed Hadoop, or had no immediate plans to deploy it. And 26 percent of organizations were running Hadoop either in a production or pilot environment. So the adoption is still relatively low at one-fourth of the total audience.
But the more important finding was that, among the 26 percent of organizations that were using Hadoop, 70 percent had 20 or fewer people actually using it.
So Hadoop is not one of those strategic, transformative, company-wide initiatives. It is focussed on a very specific set of people and this limits its broader appeal within the organization.
Q. What are the key reasons for this relatively low adoption? What needs to happen before Hadoop gains traction in the enterprise?
Arun: Obviously, the big one is Skills. The other issue is organizational processes around data quality, data governance, and around data integration. If you are unable to integrate the data marts or data islands in the organization, the value you can derive from Hadoop or big data investments is going to be fairly low.
Hadoop is a truly distributed system and is capable of running on top of commodity infrastructure. So the cost of acquisition could be low, because it is Open Source. But Hadoop is still complex. It’s not like database software that you buy and readily deploy; it is a collection of many different Open Source projects.
The other challenge is Security and Data protection. Security in Hadoop is lacking at this point in time. It lacks robust authentication and permissions management.
Then there are the Data Protection and Disaster Recovery aspects. Backup & Recovery in Hadoop is in a very primitive state. There is no DR. In an enterprise system you think about security, availability and resiliency. If you try to apply these aspects to Hadoop today, it is lacking.
I am sure the Hadoop community is working on all this today. But Hadoop is just not mature (enough) for enterprise deployments today.
Q. Do you think Hortonworks, MapR and Cloudera might close these gaps?
Arun: There are three types of vendors in the Hadoop ecosystem: the pure play vendors, which are the three that you mentioned; the broader integrated application system vendors (IBM and Pivotal); and the cloud model vendors (Amazon Web Services).
A lot of the contribution is coming from the pure play vendors. Cloudera acquired a company called Gazzang, which is a security company. Cloudera understands the Achilles heel of Hadoop.
The community is also starting to focus on Hadoop’s weaknesses, but it is going to take some time. You have to understand the way the community works. They have an operating committee and everyone needs to understand the key processes. There are many entities involved and there are other issues to fix.
It will take 3 – 5 years before Hadoop becomes a more pervasive workload in the enterprise.
Q. How are organizations coping with Big Data? What are some of the challenges?
Arun: We had a roundtable yesterday where I had 25 customers. The topic was “Big Data – Analytics failures”. The first challenge was skills. The second was finding the right use cases. There is a lot of innovation happening in the Big Data world today. But customers are wondering how to apply all that to use cases in their environment. It’s also about tying the right technology to the right use case.
The third was the process challenges around data quality and data integration, or rather the lack of such processes in the organization.
The fourth challenge that came up, and this one was specific to India, was the absence of these vendors in this market. Customers here in India don’t have direct engagement with these vendors.
Q. So is this a big opportunity for vendors here in India?
Arun: I think there is a huge opportunity for professional services companies in the Big Data services space. I also spent a lot of time with SIs (System Integrators) and VARs (Value Added Resellers) here and I see some early maturity coming. They are building platforms, blueprints and use cases for very specific vertical industries. The ecosystem is beginning to ramp up in terms of its skills and capabilities. They are also trying to understand and define problems.
You aren’t going to see Hadoop or NoSQL vendors from India. But you will see a lot of services companies. The Indian services companies are mostly focussed on global customers.
Q. What are the top three sectors that are showing interest in deploying big data solutions?
Arun: We get the most enquiries for Hadoop from Telecom and service providers. The number two sector is Financial services. And number three is Government and Public sector.
In Government we see use cases such as better urbanization, public safety, delivery of public services in a better targeted manner to citizens. Security is another use case in Government.
I see three roles for government when it comes to Big Data. First, government as an adopter of big data technologies. Second, government as a provider of better services to citizens and other entities in the country. Third, open data: the government should foster initiatives around open data. It should come up with standards and make these assets open to the general public.
There could be an ecosystem of players who can take that data and create analytical models, and deliver that back to consumers or other businesses, who can then create very specialized services.
For instance, take weather simulation data. Different weather scenarios can be created using these data points and shared with farmers before the monsoon, so that they can prepare accordingly.
Q. The startup ecosystem in India is rapidly building up. Do you think we are likely to see a Hadoop product coming from a startup in India?
Arun: Most of the startups I’ve met here are services startups rather than product startups. I’ve met some phenomenally talented people in the sessions I’ve conducted (at the Gartner Summit in Mumbai). Every time I come here, I see more product startups. So it is possible that we could see a Hadoop product startup from India in future.