A Hadoop cluster is a special type of computational cluster designed for storing and analysing huge amounts of unstructured data in a distributed computing environment.
Such clusters run Hadoop's open source distributed processing software on low-cost commodity computers.
Hadoop Cluster Architecture:
A Hadoop cluster has 3 components: the client, the master, and the slaves.
The client is neither a master nor a slave. Its work is to submit MapReduce jobs that describe how the data should be processed, and then to retrieve the results once the job has completed.
The master consists of 3 components, namely, NameNode, JobTracker, and Secondary NameNode.
a. NameNode: The NameNode does not store the actual files; it stores the meta information of the files. It also oversees the health of the DataNodes and coordinates access to the data.
b. JobTracker: JobTracker coordinates the parallel processing of data using MapReduce. To know more about JobTracker, please read the article All You Want to Know about MapReduce (The Heart of Hadoop).
c. Secondary NameNode: The job of the Secondary NameNode is to contact the NameNode periodically, pull a copy of the filesystem metadata from it, save that copy as a clean checkpoint, and send it back to the NameNode. Essentially, the Secondary NameNode does the housekeeping. In case of NameNode failure, the metadata, which the NameNode holds in RAM, can be rebuilt from the checkpoint saved by the Secondary NameNode.
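The checkpoint idea described for the Secondary NameNode can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not Hadoop's actual implementation: the class names `ToyNameNode` and `ToySecondaryNameNode`, the file paths, and the block ids are all hypothetical and chosen for illustration.

```python
import copy


class ToyNameNode:
    """Hypothetical stand-in: holds file-to-block metadata in memory only."""

    def __init__(self):
        self.metadata = {}  # file path -> list of block ids (no file contents)

    def add_file(self, path, block_ids):
        self.metadata[path] = list(block_ids)


class ToySecondaryNameNode:
    """Hypothetical stand-in: keeps a saved copy of the NameNode's metadata."""

    def __init__(self):
        self.checkpoint = {}

    def take_checkpoint(self, name_node):
        # Deep copy so later changes on the NameNode do not alter the checkpoint.
        self.checkpoint = copy.deepcopy(name_node.metadata)

    def rebuild(self):
        # Build a fresh NameNode from the last saved checkpoint.
        restored = ToyNameNode()
        restored.metadata = copy.deepcopy(self.checkpoint)
        return restored


name_node = ToyNameNode()
secondary = ToySecondaryNameNode()

name_node.add_file("/logs/day1.txt", ["blk_1", "blk_2"])
secondary.take_checkpoint(name_node)

# This file is added after the checkpoint, so it is not in the saved copy.
name_node.add_file("/logs/day2.txt", ["blk_3"])

# Simulate a NameNode failure: rebuild its metadata from the checkpoint.
restored = secondary.rebuild()
```

In this toy model, `restored.metadata` contains only `/logs/day1.txt`, which shows why the checkpoint has to be refreshed periodically: anything recorded after the last checkpoint is missing from the rebuilt copy.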
Slave nodes make up the majority of the machines in a Hadoop cluster and are responsible for storing the data and running the computations.
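To make the division of work more concrete, here is a minimal word-count sketch in plain Python that imitates the map, shuffle, and reduce steps of a MapReduce job. It does not use Hadoop's API. The function names, the node names `slave-1` and `slave-2`, and the sample text are assumptions made for illustration, and the map step runs sequentially here rather than in parallel.

```python
from collections import defaultdict


def map_phase(text):
    """Emit a (word, 1) pair for every word in one node's chunk of data."""
    return [(word.lower(), 1) for word in text.split()]


def shuffle(mapped_pairs):
    """Group the emitted counts by word."""
    grouped = defaultdict(list)
    for word, count in mapped_pairs:
        grouped[word].append(count)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


# Hypothetical data chunks, one per slave node.
chunks_by_node = {
    "slave-1": "hadoop stores data",
    "slave-2": "hadoop processes data",
}

mapped = []
for node, chunk in chunks_by_node.items():
    # In a real cluster each node would run its map task in parallel.
    mapped.extend(map_phase(chunk))

word_counts = reduce_phase(shuffle(mapped))
```

Each slave only looks at its own chunk during the map step, and the results are combined afterwards, which is what lets the work spread across many machines.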
Why use Hadoop Clusters:
Hadoop clusters are particularly known for boosting the speed and scalability of data analysis applications. If a cluster's processing power comes under stress from growing volumes of data, additional cluster nodes can be added to increase throughput. Hadoop clusters are also highly resistant to failure, because each block of data is copied onto other nodes, which ensures that the data is not lost if a single node fails.
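The effect of block replication can be illustrated with a small simulation. This is a sketch under stated assumptions, not how HDFS actually places blocks: the round-robin placement, the helper names `place_replicas` and `readable_blocks`, and the node and block names are hypothetical. The replication factor of 3 matches the HDFS default.

```python
def place_replicas(block_ids, nodes, replication=3):
    """Assign each block to `replication` distinct nodes, round-robin.

    Simplified placement for illustration only.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds number of nodes")
    placement = {}
    for i, block in enumerate(block_ids):
        placement[block] = [
            nodes[(i + r) % len(nodes)] for r in range(replication)
        ]
    return placement


def readable_blocks(placement, failed_nodes):
    """Return the blocks that still have at least one replica on a live node."""
    return {
        block
        for block, holders in placement.items()
        if any(node not in failed_nodes for node in holders)
    }


nodes = ["node-1", "node-2", "node-3", "node-4"]
blocks = ["blk_1", "blk_2", "blk_3", "blk_4"]
placement = place_replicas(blocks, nodes, replication=3)

# Losing a single node leaves every block readable from another replica.
survivors = readable_blocks(placement, failed_nodes={"node-2"})
```

With three copies of every block, the failure of `node-2` leaves all four blocks readable. Data only becomes unavailable when every node holding a given block fails at the same time.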
2015 © Edupristine. ALL Rights Reserved.