June 13, 2015
Hadoop is a free, Java-based programming framework that enables the processing of large data sets in a distributed computing environment. It is an open source Apache project sponsored by the Apache Software Foundation.
We are sure that’s not all you are looking for, and that you would like to know much more about Hadoop and its functions. So we bring you the top 10 eBooks on Hadoop to help you get your concepts clear.
This book is a concise guide to getting started with Hadoop and getting the most out of your Hadoop clusters. It explains how to use the framework effectively to scale applications with Hadoop tools, so you can get up to speed quickly. Step-by-step instructions and examples take you from your first use of Hadoop to running complex applications on large clusters of machines.
Download: Pro Hadoop
This book is an ideal learning reference for Apache Pig, the open source engine for executing parallel data flows on Hadoop. It introduces new users to Pig and gives advanced users comprehensive coverage of key features such as the Pig Latin scripting language, the Grunt shell, and User Defined Functions for extending Pig. With this book as a reference, you can easily analyze terabytes of data.
Download: Programming Pig
Professional Hadoop Solutions covers storing data with HDFS and HBase, processing data with MapReduce, and automating data processing with Oozie. It also covers Hadoop security, running Hadoop with Amazon Web Services, best practices, and automating Hadoop processes in real time. With in-depth code examples in Java and XML and the latest on recent additions to the Hadoop ecosystem, this complete resource also covers the use of APIs, exposing their inner workings and allowing architects and developers to better leverage and customize them.
Download: Professional Hadoop Solutions
This book is a user guide to Apache Sqoop. It focuses on applying the parameters provided by the command line interface to common use cases, to help you use Sqoop effectively. The authors provide MySQL, Oracle, and PostgreSQL database examples on GitHub that you can easily adapt for SQL Server, Netezza, Teradata, or other relational systems.
Download: Apache Sqoop Cookbook
According to the preface of this book, “Hadoop MapReduce Cookbook helps readers learn to process large and complex datasets. The book starts in a simple manner, but still provides in-depth knowledge of Hadoop. It is a simple one-stop guide on how to get things done. It has 90 recipes, presented in a simple and straightforward manner, with step-by-step instructions and real world examples.”
Download: Hadoop MapReduce Cookbook
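To give a feel for the kind of processing these MapReduce books deal with, here is a minimal sketch of the map, shuffle, and reduce steps as a word count in plain Python. It does not use Hadoop or any code from the book; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are our own and purely illustrative, and everything runs in one process rather than on a cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data big clusters", "data everywhere"]))
```

On a real cluster, the framework would run many map and reduce tasks in parallel on different machines and handle the grouping between them; this sketch only shows the shape of the computation.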
This comprehensive guide shows you how to build and maintain reliable, scalable, distributed systems with the Hadoop framework. Programmers will find details for analyzing datasets of any size, and administrators will learn how to set up and run Hadoop clusters. This edition covers new features such as Hive, Sqoop, and Avro. It also provides case studies that can help you solve specific problems.
Download: Hadoop: The Definitive Guide, 2nd Edition
According to the preface of this book, “This book will be unique in some ways and familiar in others. First and foremost, this book is obviously about design patterns, which are templates or general guides to solving problems. This book is a bit more open-ended than a book in the “cookbook” series of texts as we don’t call out specific problems. However, similarly to the cookbooks, the lessons in this book are short and categorized. You’ll have to go a bit further than just copying and pasting our code to solve your problems, but we hope that you will find a pattern to get you at least 90% of the way for just about all of your challenges.”
Download: MapReduce Design Patterns
If you have been asked to maintain large and complex Hadoop clusters, this book is a must. It covers topics such as HDFS, MapReduce, Hadoop cluster planning, installation and configuration of Hadoop, identity, authentication and authorization, resource management, and cluster maintenance.
Download: Hadoop Operations
Programming Hive introduces Hive, an essential tool in the Hadoop ecosystem that provides an SQL (Structured Query Language) dialect for querying data stored in the Hadoop Distributed Filesystem (HDFS), in other filesystems that integrate with Hadoop, such as MapR-FS and Amazon’s S3, and in databases like HBase (the Hadoop database) and Cassandra.
Download: Programming Hive
According to the preface of this book, “Hadoop Real-World Solutions Cookbook helps developers become more comfortable with, and proficient at solving problems in, the Hadoop space. Readers will become more familiar with a wide variety of Hadoop-related tools and best practices for implementation. This book will teach readers how to build solutions using tools such as Apache Hive, Pig, MapReduce, Mahout, Giraph, HDFS, Accumulo, Redis, and Ganglia.
This book will give readers the examples they need to apply the Hadoop technology to their own problems.”
Download: Hadoop Real-World Solutions Cookbook
Do let us know which one was most helpful to you.