Learn How To Coordinate Hadoop Clusters Using ZooKeeper

Hadoop was designed to be a distributed system that scales up to thousands of nodes. Even with a cluster of a few hundred nodes, managing all...

Learn How To Create Topologies In Storm To Process Data

Part 1 of this tutorial discussed the key concepts used in Storm. It explained that Storm topologies are...

Learn How To Secure A Hadoop Cluster Using Kerberos Part 2

Part 1 of this tutorial discussed the key terminology used in Kerberos authentication. We demonstrated how to set up and configure a KDC...

Learn How To Secure A Hadoop Cluster Using Kerberos Part 1

Kerberos is an authentication protocol that was developed at MIT and has grown to become one of the most widely used authentication approaches. Hadoop...

Learn How To Process Stream Data In Real Time Using Apache Storm Part 1

Apache Storm is a top-level Apache project that was developed to process very large data streams that arrive very fast...

Learn How To Analyze Data Interactively In Spark Using Scala

Scala is a programming language that combines object-oriented and functional programming styles. It is one of the programming languages, along with Java and Python...

Learn How to Develop Effective Data Models in HBase

To develop a scalable data model in HBase, you need a good understanding of the strengths and weaknesses of the database. The...

Learn How to Develop Effective Data Models in Hive

Within the Hadoop ecosystem, Hive is considered a data warehouse. This could be true or false depending on how you look at it....

Learn How To Process Data Using Spark On Amazon Elastic MapReduce

Apache Spark is a data processing framework developed to process very large amounts of data very fast. The speed gains are...

Learn How to Manage Files Within Hadoop File System

Data in HDFS is stored in blocks that have a default size of 64 MB (128 MB in Hadoop 2 and later). Files that you store in HDFS are broken up and...