Launch an Apache HBase cluster on Amazon EMR using the AWS SDK for Java and connect it to Amazon Kinesis.



Apache HBase, MapReduce, Amazon Kinesis, and the Hadoop framework: these technologies, among others, are indispensable in our data-driven world.

Connectikpeople.co has therefore picked out for you a great post from Wangechi Doble, an AWS Solutions Architect, which shows how to launch an Apache HBase cluster on Amazon EMR using the AWS SDK for Java, and how to extend the Amazon Kinesis Connector Library to stream data in real time to HBase running on an Amazon EMR cluster.

If you are unfamiliar with it, Connectikpeople.co reminds you that Apache HBase is an open-source, column-oriented, distributed NoSQL database that runs on the Apache Hadoop framework.
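
To make "column-oriented" a little more concrete, here is a minimal in-memory sketch of how an HBase table can be pictured: rows sorted by row key, each row holding values addressed by column family and qualifier. The class `ToyHBaseTable` and its methods are hypothetical names invented for this illustration; this is not the HBase client API, and it leaves out the timestamped cell versions and the distribution across servers that real HBase provides.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Toy model only: a hypothetical class for illustration, NOT the HBase client API.
public class ToyHBaseTable {
    // row key -> ("family:qualifier" -> value). Both levels are kept sorted,
    // mirroring the way HBase stores rows ordered by row key.
    // Note: Java String ordering matches byte ordering only for ASCII keys.
    private final TreeMap<String, TreeMap<String, String>> rows = new TreeMap<>();

    // Store one cell value under a row key and a family:qualifier column.
    public void put(String rowKey, String family, String qualifier, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .put(family + ":" + qualifier, value);
    }

    // Read one cell value, or null if the row or column does not exist.
    public String get(String rowKey, String family, String qualifier) {
        TreeMap<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(family + ":" + qualifier);
    }

    // Range scan over row keys: start row inclusive, stop row exclusive.
    public SortedMap<String, TreeMap<String, String>> scan(String startRow, String stopRow) {
        return rows.subMap(startRow, stopRow);
    }

    public static void main(String[] args) {
        ToyHBaseTable table = new ToyHBaseTable();
        table.put("user#002", "info", "name", "Bob");
        table.put("user#001", "info", "name", "Alice");
        table.put("user#001", "stats", "clicks", "42");

        System.out.println(table.get("user#001", "info", "name"));
        // Rows come back in row-key order regardless of insertion order.
        System.out.println(table.scan("user#001", "user#003").keySet());
    }
}
```

Because rows are sorted by key, choosing a row key that groups related records together is what makes range scans cheap in this kind of store.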

In the AWS Cloud, you can either deploy Apache HBase on Amazon Elastic Compute Cloud (Amazon EC2) and manage it yourself, or use Apache HBase as a managed service on Amazon Elastic MapReduce (Amazon EMR). Amazon EMR is a managed, hosted Hadoop framework on top of Amazon EC2.
Amazon Kinesis, for its part, is a fully managed service for real-time processing of streaming big data.

To learn more about launching Apache HBase on Amazon EMR, see the section on installing HBase on an Amazon EMR cluster in the Amazon EMR documentation.
