The Apache Drill project launched back in August 2012, and with six months between the launch and the present day, some are already taking a look back at the project to see if it's proving its worth in the field. With advances being made on several fronts, use cases for Apache Drill are beginning to come to light, and the total value of the system is starting to make itself known.
Apache Drill is an open-source system, similar in nature to Google's Dremel, that provides the basis for analysis of large-scale data, including nested data items like protocol buffers and JSON. Generally used alongside MapReduce, Dremel is more of a supplement than a competitor, but Apache Drill steps things up a notch by improving the speed and responsiveness of the overall analysis.
Apache Drill has taken on a new prominence of late, with rapid growth in the amount of code being written for it and a community getting much more involved. As such, new uses for the project are coming into view and bringing with them the potential for big changes. Apache Drill, for starters, is out to make a splash when it comes to large-scale data, bringing the ability to make faster queries on an ad hoc basis even in an environment with both multiple sources and multiple formats. Apache Drill allows for a simpler solution, removing the need to write Java programs to run queries. Since Java programs couldn't deliver results that were both ad hoc and fast, Apache Drill has great potential to boost that capability.
Apache Drill also serves as a way to bridge batch processing and stream processing, allowing both structured and unstructured data to come into play. Since Apache Drill is both flexible and speedy, it can handle both kinds of workload and provide added value to the user base. Apache Drill is an open-source project, pulling in a variety of concepts from the wider community, and the project is looking for others interested in joining the Apache Drill mailing lists to help build that community.
A recent look at big data from the Gartner Hype Cycle shows that big data in general is coming into its own as a mature technology, but with that maturation comes an appreciation of the limitations of the technology. One of the complaints showing up during Svetlana Sicular's recent analysis of big data's maturation concerned the limitations of MapReduce in particular. Something like Apache Drill may help address those limitations and make an already maturing technology more powerful.
All technologies go through a pattern of growth and change, and Hadoop is ultimately no different. As developments both like and within Apache Drill become available, their applications will ultimately alter the landscape around them and create new opportunities for further growth. The ultimate course of Hadoop remains to be seen, as does that of Apache Drill, but it's a safe bet that further changes are afoot and improvements not far behind.
Edited by Rachel Ramsey