If you're a business team leader, CIO, business analyst, or developer interested in how Apache Hadoop and Apache HBase-related technologies can address problems involving large-scale data in cost-effective ways, this book is for you. Using real-world stories and situations, authors Ted Dunning and Ellen Friedman show Hadoop newcomers and seasoned users alike how NoSQL databases and Hadoop can solve a variety of business and research issues.
You'll learn about early decisions and pre-planning that can make the process easier and more productive. If you're already using these technologies, you'll discover ways to gain the full range of benefits possible with Hadoop. While you don't need a deep technical background to get started, this book does provide expert guidance to help managers, architects, and practitioners succeed with their Hadoop projects.
- Examine a day in the life of big data: India's ambitious Aadhaar project
- Review tools in the Hadoop ecosystem such as Apache Spark, Storm, and Drill to learn how they can help you
- Pick up a collection of technical and strategic tips that have helped others succeed with Hadoop
- Learn from several prototypical Hadoop use cases, based on how organizations have actually applied the technology
- Explore real-world stories that reveal how MapR customers combine use cases when putting Hadoop and NoSQL to work, including in production
About the Authors: Ted Dunning is Chief Applications Architect at MapR Technologies and is active in the open source community as a committer and PMC member of the Apache Mahout, Apache ZooKeeper, and Apache Drill projects and as a mentor for the Storm, Flink, Optiq, and Datafu Apache incubator projects. He has contributed to Mahout clustering, classification, matrix decomposition algorithms, and the new Mahout Math library, and recently designed the t-digest algorithm used in several open source projects.
Ted was the chief architect behind the MusicMatch (now Yahoo Music) and Veoh recommendation systems, built fraud-detection systems for ID Analytics (LifeLock), and has 24 issued patents to date. Ted has a PhD in computing science from the University of Sheffield. When he's not doing data science, he plays guitar and mandolin. Ted is on Twitter at @ted_dunning.
Ellen Friedman is a solutions consultant and well-known speaker and author, currently writing mainly about big data topics. She is a committer for the Apache Mahout project and a contributor to the Apache Drill project. With a PhD in Biochemistry from Rice University, she has years of experience as a research scientist and has written about a variety of technical topics including molecular biology, nontraditional inheritance, oceanography, and large-scale computing. Ellen is also co-author of a book of magic-themed cartoons, A Rabbit Under the Hat. Ellen is on Twitter at @Ellen_Friedman.