Data Modeling Overview
HBase differs from an RDBMS in that data is organized into cells grouped under column families.
Where an RDBMS addresses a value by row and column, HBase addresses a cell by row key, column family,
column qualifier and timestamp.
One dimension you don't see in the picture below is the timestamp associated
with the value in each cell.
In the customer example, the region server holds multiple column families, and the data is stored in HFiles.
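The four dimensions above (row key, column family, column qualifier, timestamp) can be pictured as nested sorted maps. The following is only a minimal in-memory sketch of that logical model, not HBase client code; the class name `CellModelSketch`, its methods and the sample values are hypothetical and chosen for illustration.

```java
import java.util.Comparator;
import java.util.NavigableMap;
import java.util.TreeMap;

// Illustrative in-memory model of the logical layout only; this is not the HBase API.
class CellModelSketch {
    // row key -> column family -> column qualifier -> timestamp (newest first) -> value
    private final NavigableMap<String,
            NavigableMap<String,
                    NavigableMap<String,
                            NavigableMap<Long, String>>>> rows = new TreeMap<>();

    // Stores one version of a cell under the given timestamp.
    void put(String row, String family, String qualifier, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<>())
            .computeIfAbsent(family, f -> new TreeMap<>())
            .computeIfAbsent(qualifier,
                    q -> new TreeMap<Long, String>(Comparator.reverseOrder()))
            .put(timestamp, value);
    }

    // Returns all versions of a cell (newest first), or null if the cell is absent.
    private NavigableMap<Long, String> versions(String row, String family, String qualifier) {
        NavigableMap<String, NavigableMap<String, NavigableMap<Long, String>>> families =
                rows.get(row);
        if (families == null) return null;
        NavigableMap<String, NavigableMap<Long, String>> columns = families.get(family);
        if (columns == null) return null;
        return columns.get(qualifier);
    }

    // Returns the newest version of a cell, or null if the cell is absent.
    String getLatest(String row, String family, String qualifier) {
        NavigableMap<Long, String> v = versions(row, family, qualifier);
        return (v == null || v.isEmpty()) ? null : v.firstEntry().getValue();
    }

    // Returns the newest version written at or before asOf, or null if there is none.
    String getAsOf(String row, String family, String qualifier, long asOf) {
        NavigableMap<Long, String> v = versions(row, family, qualifier);
        if (v == null) return null;
        // With the reversed ordering, ceilingEntry finds the largest timestamp <= asOf.
        java.util.Map.Entry<Long, String> e = v.ceilingEntry(asOf);
        return e == null ? null : e.getValue();
    }

    public static void main(String[] args) {
        CellModelSketch table = new CellModelSketch();
        table.put("cust1", "addr", "city", 100L, "Austin");
        table.put("cust1", "addr", "city", 200L, "Boston");
        System.out.println(table.getLatest("cust1", "addr", "city"));
        System.out.println(table.getAsOf("cust1", "addr", "city", 150L));
    }
}
```

Writing a second value to the same row, family and qualifier does not overwrite the first; it adds a new version under a newer timestamp, which is the dimension the picture leaves out.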
Best Practices
When hotspotting is avoided, data is distributed evenly across the cluster.
Note that the row key is repeated with every column and cell, so it occupies a significant amount of
space.
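One common way to avoid hotspotting with sequential keys (for example, keys that start with a date) is to "salt" the row key with a short, deterministic bucket prefix so that neighbouring keys land in different regions. The sketch below is one possible approach, not something prescribed by these notes; the class name `SaltedRowKeySketch`, the bucket count of 16 and the key format are all assumptions made for illustration. The prefix is kept to two digits because, as noted above, the row key is stored with every cell.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;

// Illustrative only: builds salted row keys as plain strings, without the HBase client.
class SaltedRowKeySketch {
    // Hypothetical bucket count; in practice it would be sized to the cluster.
    static final int BUCKETS = 16;

    // Maps a natural key to a stable bucket in the range [0, BUCKETS).
    static int bucketFor(String naturalKey) {
        int hash = Arrays.hashCode(naturalKey.getBytes(StandardCharsets.UTF_8));
        return Math.floorMod(hash, BUCKETS);
    }

    // Prepends a fixed-width, two-digit bucket prefix to the natural key.
    static String saltedKey(String naturalKey) {
        return String.format(Locale.ROOT, "%02d-%s", bucketFor(naturalKey), naturalKey);
    }

    public static void main(String[] args) {
        String[] keys = {"20240101-0001", "20240101-0002", "20240101-0003"};
        for (String key : keys) {
            System.out.println(saltedKey(key));
        }
    }
}
```

The trade-off is on the read side: a range scan over the natural key order now has to be issued once per bucket and the results merged, so salting suits write-heavy tables better than scan-heavy ones.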
Securing HBase
Server side configuration:
Client side configuration:
MapReduce Integration with HBase
The bin/hbase mapredcp command returns the classpath for the MapReduce dependencies.
HBase is atomic and strongly consistent at the row level (not eventually consistent).
Ideal for local testing.
Installing HBase in Local Mode
Set hbase.rootdir and hbase.zookeeper.property.dataDir in conf/hbase-site.xml to write
data somewhere other than /tmp.
The bin/start-hbase.sh command starts HBase.
The bin/stop-hbase.sh command stops HBase.
An HBase cluster can have up to 9 backup masters.
HBase Web-Based Management Console
Using the HBase shell
Make sure HBase is running before starting the shell.
The bin/hbase shell command starts the shell.
Using HBase as a Data Sink for MapReduce Jobs
TableMapReduceUtil is an HBase-specific utility class that sets up the configuration needed for
HBase.
Using HBase as a Data Source for MapReduce Jobs
TableMapReduceUtil.initTableMapperJob takes the name of the HBase table read by the mapper, a Scan (which may contain
filters), the mapper class, the mapper's output key class (ImmutableBytesWritable.class), its output value class (IntWritable.class) and the Job itself.
Bulk Loading Data
Splitting Map Tasks when Sourcing an HBase Table
Accessing Other HBase Tables within a MapReduce Job
Taking a Snapshot