HBase Cluster Installation and Deployment

  1. Download the installation package

Download URL: http://archive.apache.org/dist/hbase/1.3.1/

This guide uses hbase-1.3.1-bin.tar.gz.

  2. Upload the package and extract it into the planned directory

[root@Linux121 ~]# tar -zxvf /opt/lagou/software/hbase-1.3.1-bin.tar.gz -C /opt/lagou/servers
  3. Make Hadoop's core-site.xml and hdfs-site.xml available in the conf directory of the HBase installation (symbolic links are used here rather than copies, so the files always match the Hadoop configuration)

[root@Linux121 ~]# ln -s /opt/lagou/servers/hadoop-2.9.2/etc/hadoop/core-site.xml /opt/lagou/servers/hbase-1.3.1/conf/core-site.xml
[root@Linux121 ~]# ln -s /opt/lagou/servers/hadoop-2.9.2/etc/hadoop/hdfs-site.xml /opt/lagou/servers/hbase-1.3.1/conf/hdfs-site.xml
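Because the two files are symbolic links, moving or renaming the Hadoop directory would leave them dangling. The following is a minimal sketch of a check for that; the function name check_hadoop_conf is made up for illustration and is not part of HBase or Hadoop.

```shell
#!/usr/bin/env bash
# check_hadoop_conf is a hypothetical helper, not part of HBase or Hadoop.
# It reports whether core-site.xml and hdfs-site.xml in the given conf
# directory exist and are readable (a dangling symlink fails the -r test).
check_hadoop_conf() {
  local conf_dir="$1" status=0 name
  for name in core-site.xml hdfs-site.xml; do
    if [ -r "$conf_dir/$name" ]; then
      echo "ok: $name"
    else
      echo "missing or dangling: $name"
      status=1
    fi
  done
  return "$status"
}

# Example call (path taken from the step above):
# check_hadoop_conf /opt/lagou/servers/hbase-1.3.1/conf
```

The function returns a non-zero status when either file cannot be read, so it can be used as a guard before starting HBase.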
  4. Edit the configuration files in the conf directory, starting with hbase-env.sh

# Edit hbase-env.sh
[root@Linux121 ~]# cd /opt/lagou/servers/hbase-1.3.1/conf/
[root@Linux121 conf]# vim hbase-env.sh

export JAVA_HOME=/opt/lagou/servers/jdk1.8.0_161
# Use the external ZooKeeper cluster instead of letting HBase manage its own
export HBASE_MANAGES_ZK=false
export HBASE_PID_DIR=/var/hadoop/pids
  5. Edit hbase-site.xml

[root@Linux121 ~]# cd /opt/lagou/servers/hbase-1.3.1/conf/
[root@Linux121 conf]# vim hbase-site.xml

<configuration>
  <!-- Path on HDFS where HBase stores its data -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://Linux121:9000/hbase</value>
  </property>
  <!-- Run HBase in distributed mode -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- ZooKeeper addresses; separate multiple entries with commas -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>Linux121:2181,Linux122:2181,Linux123:2181</value>
  </property>
</configuration>
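A common source of startup errors is an hbase.rootdir that does not point at the same NameNode address as Hadoop. The sketch below compares the two values. It is only an illustration: xml_prop and rootdir_matches_fs are hypothetical names, the parser assumes each <name> and <value> element sits on a single line, and it assumes the NameNode address is stored under fs.defaultFS in core-site.xml.

```shell
#!/usr/bin/env bash
# xml_prop is a hypothetical helper: it prints the <value> of the named
# property from a Hadoop-style XML file. It assumes each <name> and <value>
# element sits on a single line.
xml_prop() {
  local file="$1" key="$2"
  awk -v key="$key" '
    /<name>/ { name = $0; gsub(/.*<name>|<\/name>.*/, "", name) }
    /<value>/ && name == key {
      value = $0; gsub(/.*<value>|<\/value>.*/, "", value)
      print value; exit
    }
  ' "$file"
}

# rootdir_matches_fs succeeds when hbase.rootdir (second file) starts with
# the fs.defaultFS value (first file). fs.defaultFS is an assumed key name.
rootdir_matches_fs() {
  local fs rootdir
  fs="$(xml_prop "$1" fs.defaultFS)"
  rootdir="$(xml_prop "$2" hbase.rootdir)"
  [ -n "$fs" ] && [ "${rootdir#"$fs"}" != "$rootdir" ]
}

# Example call (paths taken from the steps above):
# conf=/opt/lagou/servers/hbase-1.3.1/conf
# rootdir_matches_fs "$conf/core-site.xml" "$conf/hbase-site.xml" && echo "rootdir ok"
```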
  6. Edit the regionservers file

[root@Linux121 ~]# cd /opt/lagou/servers/hbase-1.3.1/conf/
[root@Linux121 conf]# vim regionservers

Linux121
Linux122
Linux123
  7. Create the file backup-masters in the HBase conf directory (it names the standby master)

[root@Linux121 ~]# cd /opt/lagou/servers/hbase-1.3.1/conf/
[root@Linux121 conf]# vim backup-masters

Linux122
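Both regionservers and backup-masters are plain lists with one host name per line, so they can also be written without an editor. A minimal sketch, assuming the same hosts and paths as above; write_hosts is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# write_hosts is a hypothetical helper: it writes one host name per line,
# the format used by the regionservers and backup-masters files.
# It overwrites the target file.
write_hosts() {
  local file="$1"
  shift
  printf '%s\n' "$@" > "$file"
}

# Example calls (paths and hosts from the steps above):
# conf=/opt/lagou/servers/hbase-1.3.1/conf
# write_hosts "$conf/regionservers" Linux121 Linux122 Linux123
# write_hosts "$conf/backup-masters" Linux122
```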
  8. Configure the HBase environment variables

[root@Linux121 conf]# vim /etc/profile

export HBASE_HOME=/opt/lagou/servers/hbase-1.3.1
export PATH=$PATH:$HBASE_HOME/bin
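If this step is scripted instead of edited by hand, it helps to avoid adding the same export lines twice when the script is re-run. The following sketch shows one way to do that; append_once is a hypothetical helper name, not an existing tool.

```shell
#!/usr/bin/env bash
# append_once is a hypothetical helper: it appends a line to a file only if
# an identical line is not already present, so the step can be re-run safely.
append_once() {
  local file="$1" line="$2"
  grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Example calls (single quotes keep $PATH and $HBASE_HOME unexpanded):
# append_once /etc/profile 'export HBASE_HOME=/opt/lagou/servers/hbase-1.3.1'
# append_once /etc/profile 'export PATH=$PATH:$HBASE_HOME/bin'
```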
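  9. Distribute the HBase directory and the environment variables to the other nodes (rsync-script is a custom distribution script on this cluster, not a standard Linux command)

[root@Linux121 conf]# rsync-script /opt/lagou/servers/hbase-1.3.1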

[root@Linux121 conf]# rsync-script /etc/profile
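The contents of rsync-script are not shown in this guide. As a rough idea of what such a script might do, the sketch below prints one rsync command per target host. Everything here is an assumption for illustration: the function name dist_commands, the use of the root user, and the choice to copy each path into the same parent directory on the remote side.

```shell
#!/usr/bin/env bash
# dist_commands is a hypothetical stand-in for the rsync-script used above,
# whose real contents are not shown. It only prints the rsync commands it
# would run, one per target host, so it is safe to try anywhere.
dist_commands() {
  local path="$1" host
  shift
  for host in "$@"; do
    # -a preserves permissions and times and copies symlinks as symlinks,
    # so linked files must exist at the same path on the target host.
    # -v lists the transferred files.
    echo "rsync -av $path root@$host:$(dirname "$path")/"
  done
}

# Show the commands for the two other nodes
dist_commands /opt/lagou/servers/hbase-1.3.1 Linux122 Linux123
# prints:
# rsync -av /opt/lagou/servers/hbase-1.3.1 root@Linux122:/opt/lagou/servers/
# rsync -av /opt/lagou/servers/hbase-1.3.1 root@Linux123:/opt/lagou/servers/
```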
  10. Apply the HBase environment variables on every node

[root@Linux121 ~]# source /etc/profile
[root@Linux122 ~]# source /etc/profile
[root@Linux123 ~]# source /etc/profile
  11. Start and stop the HBase cluster

Prerequisite: the Hadoop and ZooKeeper clusters must already be running.

# Start HBase:
[root@Linux121 ~]# start-hbase.sh

# Stop HBase:
[root@Linux121 ~]# stop-hbase.sh
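After starting the cluster, the JDK's jps tool can confirm that the HBase daemons are up: the master appears as HMaster and each region server as HRegionServer. The snippet below is a small sketch of such a check; has_process is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# has_process is a hypothetical helper: it reads `jps` output on stdin
# (lines of the form "<pid> <main class>") and succeeds if the named
# Java process is listed.
has_process() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit found ? 0 : 1 }'
}

# On a cluster node, the check could be used like this when jps is available:
if command -v jps >/dev/null 2>&1; then
  for proc in HMaster HRegionServer; do
    if jps | has_process "$proc"; then
      echo "$proc is running"
    else
      echo "$proc is not running"
    fi
  done
fi
```

With the layout used in this guide, HMaster would be expected on Linux121 (and on Linux122 as the standby), and HRegionServer on all three nodes.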
  12. HBase cluster web management UI

Once the HBase cluster is running, the web UI is available at http://<HMaster hostname>:16010

Basic HBase Shell Operations

  1. Enter the HBase shell (the command-line client)

[root@Linux121 ~]# hbase shell
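The shell can also be driven from a script by passing commands on standard input with the -n (non-interactive) option. The wrapper below is a sketch under that assumption; run_hbase_commands is a hypothetical name, and when the hbase command is not on the PATH it only echoes the commands instead of running them.

```shell
#!/usr/bin/env bash
# run_hbase_commands is a hypothetical wrapper: it feeds the commands given
# on stdin to `hbase shell -n` (non-interactive mode). If the hbase command
# is not on PATH, it only echoes the commands so the sketch stays harmless.
run_hbase_commands() {
  if command -v hbase >/dev/null 2>&1; then
    hbase shell -n
  else
    cat
  fi
}

# Send two read-only commands to the shell
printf '%s\n' "status" "list" | run_hbase_commands
```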
  2. Show the help command

hbase(main):001:0> help

COMMAND GROUPS:
Group name: general
Commands: status, table_help, version, whoami

Group name: ddl
Commands: alter, alter_async, alter_status, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, locate_region, show_filters

Group name: namespace
Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables

Group name: dml
Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve

Group name: tools
Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, close_region, compact, compact_rs, flush, major_compact, merge_region, move, normalize, normalizer_enabled, normalizer_switch, split, splitormerge_enabled, splitormerge_switch, trace, unassign, wal_roll, zk_dump

Group name: replication
Commands: add_peer, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, get_peer_config, list_peer_configs, list_peers, list_replicated_tables, remove_peer, remove_peer_tableCFs, set_peer_tableCFs, show_peer_tableCFs

Group name: snapshots
Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, delete_table_snapshots, list_snapshots, list_table_snapshots, restore_snapshot, snapshot

Group name: configuration
Commands: update_all_config, update_config

Group name: quotas
Commands: list_quotas, set_quota

Group name: security
Commands: grant, list_security_capabilities, revoke, user_permission

Group name: procedures
Commands: abort_procedure, list_procedures

Group name: visibility labels
Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility
  3. List the tables in the current database (the output below was captured after the lagou table had been created)

hbase(main):002:0> list
TABLE
lagou
1 row(s) in 0.0040 seconds

=> ["lagou"]
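  4. Create a table named lagou with two column families, base_info and extra_info

hbase(main):001:0> create 'lagou', 'base_info', 'extra_info'

# Alternatively (HBase requires the column families to be specified when a table is created)
hbase(main):001:0> create 'lagou', {NAME => 'base_info', VERSIONS => '3'},{NAME => 'extra_info',VERSIONS => '3'}

VERSIONS => '3' means that each cell in the column family keeps at most its 3 most recent versions.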
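To make the effect of VERSIONS concrete, the sketch below models it outside HBase: given several timestamped writes to one cell, only the newest N remain readable. This is a simplified model, not HBase code; latest_versions is a hypothetical name and the timestamps and values are made up.

```shell
#!/usr/bin/env bash
# latest_versions is a hypothetical helper that models VERSIONS => N:
# given "timestamp value" lines for one cell, it keeps the N newest.
latest_versions() {
  sort -k1,1nr | head -n "$1"
}

# Four writes to the same cell; with VERSIONS => 3 the oldest (v1) drops out.
printf '%s\n' "100 v1" "200 v2" "300 v3" "400 v4" | latest_versions 3
# prints:
# 400 v4
# 300 v3
# 200 v2
```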

  5. Insert data
  • Insert into the lagou table: row key rk1, column family base_info, column qualifier name, value wang

    hbase(main):001:0> put 'lagou', 'rk1', 'base_info:name', 'wang'
  • Insert into the lagou table: row key rk1, column family base_info, column qualifier age, value 30

    hbase(main):001:0> put 'lagou', 'rk1', 'base_info:age', 30
  • Insert into the lagou table: row key rk1, column family extra_info, column qualifier address, value shanghai

    hbase(main):001:0> put 'lagou', 'rk1', 'extra_info:address', 'shanghai'
  6. Query data
  • Query by row key: get all data in the row whose row key is rk1

    hbase(main):001:0> get 'lagou', 'rk1'

    COLUMN CELL
    base_info:age timestamp=1656894957812,value=30
    base_info:name timestamp=1656894923472,value=wang
    extra_info:address timestamp=1656894993649,value=shanghai
    1 row(s) in 0.0340 seconds
  • Query one column family of a row: get all data in column family base_info for row key rk1 in the lagou table

    hbase(main):001:0> get 'lagou', 'rk1', 'base_info'

    COLUMN CELL
    base_info:age timestamp=1656894957812,value=30
    base_info:name timestamp=1656894923472,value=wang
    1 row(s) in 0.0340 seconds
  • Query specific columns of a row: get the name and age column qualifiers of column family base_info for row key rk1

    hbase(main):008:0> get 'lagou', 'rk1', 'base_info:name', 'base_info:age'

    COLUMN CELL
    base_info:age timestamp=1656894957812,value=30
    base_info:name timestamp=1656894923472,value=wang
    1 row(s) in 0.0340 seconds
  • Query several column families of a row: get the base_info and extra_info column families for row key rk1 in the lagou table

    hbase(main):010:0> get 'lagou', 'rk1', 'base_info', 'extra_info'
    # or
    hbase(main):011:0> get 'lagou', 'rk1', {COLUMN => ['base_info', 'extra_info']}
    # or, to fetch only specific columns from each family
    hbase(main):012:0> get 'lagou', 'rk1', {COLUMN => ['base_info:name', 'extra_info:address']}
  • Query by row key and cell value: get the cells in row rk1 whose value is wang

    hbase(main):001:0> get 'lagou', 'rk1', {FILTER => "ValueFilter(=, 'binary:wang')"}

    COLUMN CELL
    base_info:name timestamp=1656894923472,value=wang
    1 row(s) in 0.0340 seconds
  • Fuzzy query by row key and column qualifier: get the cells in row rk1 whose column qualifier contains the letter a

    hbase(main):001:0> get 'lagou', 'rk1', {FILTER => " (QualifierFilter(=,'substring:a'))"}

    COLUMN CELL
    base_info:age timestamp=1656894957812,value=30
    base_info:name timestamp=1656894923472,value=wang
    extra_info:address timestamp=1656894993649,value=shanghai
    1 row(s) in 0.0340 seconds
  • Query all data: scan the whole lagou table

    hbase(main):000:0> scan 'lagou'

    ROW COLUMN+CELL
    rk1 column=base_info:age,timestamp=1656894957812,value=30
    rk1 column=base_info:name,timestamp=1656894923472,value=wang
    rk1 column=extra_info:address,timestamp=1656894993649,value=shanghai
    1 row(s) in 0.0370 seconds
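  • Query by column family: scan the data in column family base_info

    hbase(main):001:0> scan 'lagou', {COLUMNS => 'base_info'}
    hbase(main):002:0> scan 'lagou', {COLUMNS => 'base_info', RAW => true, VERSIONS => 3}
    ## With RAW => true, a scan also returns cells that carry a delete marker but have not yet been physically removed
    ## VERSIONS sets the maximum number of versions returned per cell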
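  • Query several column families with a fuzzy match on the column qualifier: scan the base_info and extra_info column families of the lagou table for column qualifiers that contain the letter a

    hbase(main):001:0> scan 'lagou', {COLUMNS => ['base_info', 'extra_info'], FILTER => "(QualifierFilter(=,'substring:a'))"}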

    ROW COLUMN+CELL
    rk1 column=base_info:age,timestamp=1656894957812,value=30
    rk1 column=base_info:name,timestamp=1656894923472,value=wang
    rk1 column=extra_info:address,timestamp=1656894993649,value=shanghai
    1 row(s) in 0.0370 seconds
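  • Row key range query (very important): scan column family base_info of the lagou table for row keys in the range [rk1, rk3). Rows are stored sorted by row key in lexicographic order, so the start row is included and the end row is excluded.

    hbase(main):001:0> scan 'lagou', {COLUMNS => 'base_info', STARTROW => 'rk1', ENDROW => 'rk3'}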

    ROW COLUMN+CELL
    rk1 column=base_info:age,timestamp=1656894957812,value=30
    rk1 column=base_info:name,timestamp=1656894923472,value=wang
    1 row(s) in 0.0370 seconds
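    Lexicographic ordering has consequences that are easy to miss: a key such as rk10 sorts before rk2 and therefore falls inside [rk1, rk3). The sketch below models the range check outside HBase with byte-wise sorting; scan_range is a hypothetical name, and the row keys other than rk1 are made up for illustration.

```shell
#!/usr/bin/env bash
# scan_range is a hypothetical model of a STARTROW/ENDROW scan: it sorts row
# keys byte-wise and keeps the keys with start <= key < stop.
scan_range() {
  LC_ALL=C sort | LC_ALL=C awk -v start="$1" -v stop="$2" \
    '($0 "") >= (start "") && ($0 "") < (stop "")'
}

# rk10 and rk25 are inside [rk1, rk3) because comparison is by character
printf '%s\n' rk3 rk10 rk2 rk1 rk25 | scan_range rk1 rk3
# prints:
# rk1
# rk10
# rk2
# rk25
```

    A common way to keep numeric order and lexicographic order aligned is to zero-pad the numeric part of the key, for example rk01, rk02, rk10.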
  • Fuzzy query by row key: scan the lagou table for row keys that start with rk

    hbase(main):001:0> scan 'lagou',{FILTER=>"PrefixFilter('rk')"}

    ROW COLUMN+CELL
    rk1 column=base_info:age,timestamp=1656894957812,value=30
    rk1 column=base_info:name,timestamp=1656894923472,value=wang
    rk1 column=extra_info:address,timestamp=1656894993649,value=shanghai
    1 row(s) in 0.0370 seconds
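  • Update a value: change the name column in column family base_info of row rk1 in the lagou table to liang. An update is a put that writes a new version of the cell.

    hbase(main):030:0> put 'lagou', 'rk1', 'base_info:name', 'liang'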
  • Delete by row key and column: delete the data in column base_info:name of row rk1 in the lagou table

    hbase(main):002:0> delete 'lagou', 'rk1', 'base_info:name'
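  • Delete by row key, column and timestamp: delete the data in column base_info:name of row rk1 in the lagou table at timestamp 1656894923472. The raw scans before and after the delete show the DeleteColumn marker that it adds.

    hbase(main):030:0> scan 'lagou', {COLUMNS => 'base_info', RAW => true, VERSIONS => 3}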
    ROW COLUMN+CELL
    rk1 column=base_info:age,timestamp=1656894957812,value=30
    rk1 column=base_info:name,timestamp=1656896918999,type=DeleteColumn
    rk1 column=base_info:name,timestamp=1656896763505,value=liang
    rk1 column=base_info:name, timestamp=1656894923472, value=wang

    hbase(main):033:0> delete 'lagou', 'rk1', 'base_info:name',1656894923472

    hbase(main):023:0> scan 'lagou', {COLUMNS => 'base_info', RAW => true, VERSIONS => 3}
    ROW COLUMN+CELL
    rk1 column=base_info:age,timestamp=1656894957812,value=30
    rk1 column=base_info:name,timestamp=1656896918999,type=DeleteColumn
    rk1 column=base_info:name,timestamp=1656896763505,value=liang
    rk1 column=base_info:name, timestamp=1656894923472, value=wang
    rk1 column=base_info:name, timestamp=1656894923472, type=DeleteColumn
  • Delete a column family: remove the base_info column family

    hbase(main):035:0> alter 'lagou', 'delete' => 'base_info'
  • Clear table data: remove all data from the lagou table

    hbase(main):001:0> truncate 'lagou'
  • Drop a table: drop the lagou table

    # Disable the table first, then drop it
    hbase(main):036:0> disable 'lagou'
    hbase(main):037:0> drop 'lagou'

    # Dropping a table that has not been disabled fails with an error
    ERROR: Table lagou is enabled. Disable it first.
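Since the order of the two commands matters, a script that drops tables can generate them together. A minimal sketch, with drop_table_commands as a hypothetical helper name; it only prints the commands and does not contact HBase.

```shell
#!/usr/bin/env bash
# drop_table_commands is a hypothetical helper: it prints the two HBase shell
# commands, in the required order, for dropping the named table.
drop_table_commands() {
  printf "disable '%s'\ndrop '%s'\n" "$1" "$1"
}

drop_table_commands lagou
# prints:
# disable 'lagou'
# drop 'lagou'
```

The output could then be piped into `hbase shell -n` to run it non-interactively.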