Why big data projects fail and how to make 2017 different
Billions of users and trillions of connected “things” now generate exponentially more data outside the enterprise. At the center, enterprises deploy cloud, mobile, and analytics technologies in the hope of turning that data into insight. Unfortunately, Gartner predicts that 60 percent of big data projects in 2017 will fail: they won’t go beyond the piloting and experimentation phases, and they will eventually be abandoned.
Where is the disconnect? Why do so many companies fail to link their data assets to strategic value?
In my experience, the two main obstacles are a lack of skills and expertise, and a mismatch between the technology strategy and the company’s overall needs.
The expertise gap
When big data was in its infancy, the available technology was immature, and deploying it was a trial by fire. Companies with very deep pockets, such as Google, Yahoo, and Facebook, had to build infrastructure from the ground up to handle data at that scale. Excited by their success, many enterprises have tried to emulate them with their own Hadoop-based big data projects.
From there, IT and data professionals developed undue expectations about what Hadoop as a technology toolkit could do and underestimated how much nurturing it would require to produce results. A survey from Gartner found that 49 percent of respondents cited “figuring out how to get value from Hadoop” as a key inhibitor to adoption. Most companies lack the skills to deploy such technology, and, ironically, most don’t need that scale in the first place.
Big data has become too closely tied to the technology. Many big data projects fail because they demand massive upfront resource outlays and deploy rigid architectures that leave no room for flexibility once a project is in flight.
A successful big data project starts from a deep understanding of the business problems you want to solve and the value you want to gain. Without that, no matter what you achieve, the project will fail to meet expectations or deliver an adequate return on investment. It will then wither away or get canceled.
The next critical element is a team that brings together IT, data science, and line-of-business perspectives. A business expert can identify a major business challenge that needs solving through a data initiative. An IT expert can offer the skills to access the data and pinpoint the appropriate infrastructure needed to execute the project. And lastly, a data expert can bring the mathematics and quantitative skills needed to analyze and extract insights. It’s critical to the success of the project to build a team around these skills.
The third element is a short time to value (TtV). The sooner a team can assemble and produce concrete, measurable value, the easier it is for the organization and senior management to get behind continued investment in this space and avoid the “trough of disillusionment.”
Most Hadoop-based projects fail on all three fronts: they focus on making the technology work rather than on the business problem; adequate skill sets are hard to find, and building the infrastructure consumes too much time and effort; and the initial investment is too high and the turnaround time too long, making it very difficult to quickly experiment and iterate toward success.
A better way
As enterprises work through big data projects, one trend that I’m seeing is the adoption of cloud-based data warehousing and data lake solutions as alternatives to Hadoop projects. It’s far easier and quicker to start such endeavors in the cloud and get value than to invest heavily in infrastructure buildouts. The right cloud solution avoids significant upfront capital expenditures, provides effortless and cost-effective scaling, and transfers the burden of expertise to the technology vendor in the form of a highly managed solution.
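To make that contrast concrete, here is a minimal sketch of the workflow these services enable, using Python’s built-in sqlite3 module as a local stand-in for a cloud data warehouse (the events table and its fields are hypothetical, chosen only for illustration). The point is the pattern, not the engine: load semi-structured JSON and query it with plain SQL right away, with no cluster to provision or tune. A real cloud warehouse exposes the same shape of workflow through its own connector.

```python
import json
import sqlite3

# Local stand-in for a cloud warehouse connection; a real service would
# hand you a similar DB-API connection through its own connector.
conn = sqlite3.connect(":memory:")

# "Load" semi-structured event data: each row keeps its raw JSON payload,
# so there is no rigid schema to design up front.
conn.execute("CREATE TABLE events (payload TEXT)")
events = [
    {"user": "alice", "action": "login",    "ms": 120},
    {"user": "bob",   "action": "purchase", "ms": 340},
    {"user": "alice", "action": "purchase", "ms": 275},
]
conn.executemany(
    "INSERT INTO events VALUES (?)",
    [(json.dumps(e),) for e in events],
)

# Query the JSON directly with SQL; json_extract needs SQLite's JSON1
# extension, which ships with modern Python builds. No ETL pipeline,
# no Hadoop cluster to stand up first.
rows = conn.execute(
    """
    SELECT json_extract(payload, '$.action')    AS action,
           COUNT(*)                             AS n,
           AVG(json_extract(payload, '$.ms'))   AS avg_ms
    FROM events
    GROUP BY action
    """
).fetchall()

for action, n, avg_ms in rows:
    print(action, n, avg_ms)
```

With a managed warehouse, the steps above shrink to a connection string and a query; the provisioning, scaling, and tuning that would otherwise consume a Hadoop team are the vendor’s problem.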
I highly recommend building on the cloud and steering clear of extensive, cost-prohibitive infrastructure if you don’t have the in-house experience and skill set.
I think 2017 will be the year people start walking away from Hadoop. We will see a shift from the glamour and idealized notion of big data to more practical and effective use cases. I expect that semi-structured data and machine learning will continue to drive the need for big data, and expertise in these areas will be critical. For companies to ultimately be successful, they need a clear business challenge to solve, they must start small and fail early, and they should explore the cloud before over-investing in unnecessary architecture.