Summary of Common HBase Shell Commands
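Author: Yin Zhengjie (尹正杰)
Copyright notice: This is an original work. Reproduction is not permitted; violations will be pursued under the law.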
I. Viewing the help for the hbase script
[root@hadoop101.yinzhengjie.org.cn ~]# hbase    # Running "hbase" with no arguments prints the script's help.
Usage: hbase [<options>] <command> [<args>]
Options:
  --config DIR         Configuration direction to use. Default: ./conf
  --hosts HOSTS        Override the list in 'regionservers' file
  --auth-as-server     Authenticate to ZooKeeper using servers configuration
  --internal-classpath Skip attempting to use client facing jars (WARNING: unstable results between versions)

Commands:
Some commands take arguments. Pass no args or -h for usage.
  shell            Run the HBase shell
  hbck             Run the HBase 'fsck' tool. Defaults read-only hbck1.
                   Pass '-j /path/to/HBCK2.jar' to run hbase-2.x HBCK2.
  snapshot         Tool for managing snapshots
  wal              Write-ahead-log analyzer
  hfile            Store file analyzer
  zkcli            Run the ZooKeeper shell
  master           Run an HBase HMaster node
  regionserver     Run an HBase HRegionServer node
  zookeeper        Run a ZooKeeper server
  rest             Run an HBase REST server
  thrift           Run the HBase Thrift server
  thrift2          Run the HBase Thrift2 server
  clean            Run the HBase clean up script
  classpath        Dump hbase CLASSPATH
  mapredcp         Dump CLASSPATH entries required by mapreduce
  pe               Run PerformanceEvaluation
  ltt              Run LoadTestTool
  canary           Run the Canary tool
  version          Print the version
  completebulkload Run BulkLoadHFiles tool
  regionsplitter   Run RegionSplitter tool
  rowcounter       Run RowCounter tool
  cellcounter      Run CellCounter tool
  pre-upgrade      Run Pre-Upgrade validator tool
  hbtop            Run HBTop tool
  CLASSNAME        Run the class named CLASSNAME
[root@hadoop101.yinzhengjie.org.cn ~]#
II. Common hbase shell commands
1>. Entering the HBase interactive shell
[root@hadoop101.yinzhengjie.org.cn ~]# hbase shell
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/yinzhengjie/softwares/ha/share/hadoop/common/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/yinzhengjie/softwares/hbase-2.2.4/lib/client-facing-thirdparty/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
HBase Shell
Use "help" to get list of supported commands.
Use "exit" to quit this interactive shell.
For Reference, please visit: http://hbase.apache.org/2.0/book.html#shell
Version 2.2.4, r67779d1a325a4f78a468af3339e73bf075888bac, 2020年 03月 11日 星期三 12:57:39 CST
Took 0.0020 seconds
hbase(main):001:0>
2>. Viewing the help
hbase(main):001:0> help    # Shows the help for all commands
HBase Shell, version 2.2.4, r67779d1a325a4f78a468af3339e73bf075888bac, 2020年 03月 11日 星期三 12:57:39 CST
Type 'help "COMMAND"', (e.g. 'help "get"' -- the quotes are necessary) for help on a specific command.
Commands are grouped. Type 'help "COMMAND_GROUP"', (e.g. 'help "general"') for help on a command group.

COMMAND GROUPS:
  Group name: general
  Commands: processlist, status, table_help, version, whoami

  Group name: ddl
  Commands: alter, alter_async, alter_status, clone_table_schema, create, describe, disable, disable_all, drop, drop_all, enable, enable_all, exists, get_table, is_disabled, is_enabled, list, list_regions, locate_region, show_filters

  Group name: namespace
  Commands: alter_namespace, create_namespace, describe_namespace, drop_namespace, list_namespace, list_namespace_tables

  Group name: dml
  Commands: append, count, delete, deleteall, get, get_counter, get_splits, incr, put, scan, truncate, truncate_preserve

  Group name: tools
  Commands: assign, balance_switch, balancer, balancer_enabled, catalogjanitor_enabled, catalogjanitor_run, catalogjanitor_switch, cleaner_chore_enabled, cleaner_chore_run, cleaner_chore_switch, clear_block_cache, clear_compaction_queues, clear_deadservers, close_region, compact, compact_rs, compaction_state, compaction_switch, decommission_regionservers, flush, hbck_chore_run, is_in_maintenance_mode, list_deadservers, list_decommissioned_regionservers, major_compact, merge_region, move, normalize, normalizer_enabled, normalizer_switch, recommission_regionserver, regioninfo, rit, split, splitormerge_enabled, splitormerge_switch, stop_master, stop_regionserver, trace, unassign, wal_roll, zk_dump

  Group name: replication
  Commands: add_peer, append_peer_exclude_namespaces, append_peer_exclude_tableCFs, append_peer_namespaces, append_peer_tableCFs, disable_peer, disable_table_replication, enable_peer, enable_table_replication, get_peer_config, list_peer_configs, list_peers, list_replicated_tables, remove_peer, remove_peer_exclude_namespaces, remove_peer_exclude_tableCFs, remove_peer_namespaces, remove_peer_tableCFs, set_peer_bandwidth, set_peer_exclude_namespaces, set_peer_exclude_tableCFs, set_peer_namespaces, set_peer_replicate_all, set_peer_serial, set_peer_tableCFs, show_peer_tableCFs, update_peer_config

  Group name: snapshots
  Commands: clone_snapshot, delete_all_snapshot, delete_snapshot, delete_table_snapshots, list_snapshots, list_table_snapshots, restore_snapshot, snapshot

  Group name: configuration
  Commands: update_all_config, update_config

  Group name: quotas
  Commands: disable_exceed_throttle_quota, disable_rpc_throttle, enable_exceed_throttle_quota, enable_rpc_throttle, list_quota_snapshots, list_quota_table_sizes, list_quotas, list_snapshot_sizes, set_quota

  Group name: security
  Commands: grant, list_security_capabilities, revoke, user_permission

  Group name: procedures
  Commands: list_locks, list_procedures

  Group name: visibility labels
  Commands: add_labels, clear_auths, get_auths, list_labels, set_auths, set_visibility

  Group name: rsgroup
  Commands: add_rsgroup, balance_rsgroup, get_rsgroup, get_server_rsgroup, get_table_rsgroup, list_rsgroups, move_namespaces_rsgroup, move_servers_namespaces_rsgroup, move_servers_rsgroup, move_servers_tables_rsgroup, move_tables_rsgroup, remove_rsgroup, remove_servers_rsgroup

SHELL USAGE:
Quote all names in HBase Shell such as table and column names. Commas delimit command parameters. Type <RETURN> after entering a command to run it.
Dictionaries of configuration used in the creation and alteration of tables are Ruby Hashes. They look like this:

  {'key1' => 'value1', 'key2' => 'value2', ...}

and are opened and closed with curley-braces. Key/values are delimited by the '=>' character combination. Usually keys are predefined constants such as NAME, VERSIONS, COMPRESSION, etc. Constants do not need to be quoted. Type 'Object.constants' to see a (messy) list of all constants in the environment.

If you are using binary keys or values and need to enter them in the shell, use double-quote'd hexadecimal representation. For example:

  hbase> get 't1', "key\x03\x3f\xcd"
  hbase> get 't1', "key\003\023\011"
  hbase> put 't1', "test\xef\xff", 'f1:', "\x01\x33\x40"

The HBase shell is the (J)Ruby IRB with the above HBase-specific commands added.
For more on the HBase Shell, see http://hbase.apache.org/book.html
hbase(main):002:0>
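The SHELL USAGE note asks for double quotes around binary keys. Because the shell is (J)Ruby, this is ordinary Ruby string-literal behavior, so it can be tried in plain Ruby without a cluster. The sketch below is only an illustration; key_bytes is a helper name made up for this example.

```ruby
# Plain Ruby, runnable without HBase.
# key_bytes is a hypothetical helper used only for this illustration.
def key_bytes(key)
  key.bytes
end

# Double quotes: \x03, \x3f and \xcd are each decoded into one raw byte.
DOUBLE_QUOTED = "key\x03\x3f\xcd"
# Single quotes: the backslash sequences stay as literal characters.
SINGLE_QUOTED = 'key\x03\x3f\xcd'

p key_bytes(DOUBLE_QUOTED)         # "key" followed by three raw bytes
p key_bytes(SINGLE_QUOTED).length  # "key" followed by 12 literal characters
```

With single quotes the shell would look up a 15-character key, which is almost certainly not the row that was intended.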
3>. Listing the existing tables
hbase(main):003:0> list    # On a freshly built HBase cluster no user tables exist yet.
TABLE
0 row(s)
Took 0.5429 seconds
=> []
hbase(main):004:0>
hbase(main):004:0> help "list"    # Shows the help for the "list" command only
List all user tables in hbase. Optional regular expression parameter could be used to filter the output. Examples:

  hbase> list
  hbase> list 'abc.*'
  hbase> list 'ns:abc.*'
  hbase> list 'ns:.*'
hbase(main):005:0>
4>. Creating the "teacher" table
hbase(main):002:0> help "create"    # Shows the help for the "create" command only
Creates a table. Pass a table name, and a set of column family specifications (at least one), and, optionally, table configuration. Column specification can be a simple string (name), or a dictionary (dictionaries are described below in main help output), necessarily including NAME attribute.
Examples:

Create a table with namespace=ns1 and table qualifier=t1
  hbase> create 'ns1:t1', {NAME => 'f1', VERSIONS => 5}

Create a table with namespace=default and table qualifier=t1
  hbase> create 't1', {NAME => 'f1'}, {NAME => 'f2'}, {NAME => 'f3'}
  hbase> # The above in shorthand would be the following:
  hbase> create 't1', 'f1', 'f2', 'f3'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 1, TTL => 2592000, BLOCKCACHE => true}
  hbase> create 't1', {NAME => 'f1', CONFIGURATION => {'hbase.hstore.blockingStoreFiles' => '10'}}
  hbase> create 't1', {NAME => 'f1', IS_MOB => true, MOB_THRESHOLD => 1000000, MOB_COMPACT_PARTITION_POLICY => 'weekly'}

Table configuration options can be put at the end.
Examples:

  hbase> create 'ns1:t1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS => ['10', '20', '30', '40']
  hbase> create 't1', 'f1', SPLITS_FILE => 'splits.txt', OWNER => 'johndoe'
  hbase> create 't1', {NAME => 'f1', VERSIONS => 5}, METADATA => { 'mykey' => 'myvalue' }
  hbase> # Optionally pre-split the table into NUMREGIONS, using
  hbase> # SPLITALGO ("HexStringSplit", "UniformSplit" or classname)
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit'}
  hbase> create 't1', 'f1', {NUMREGIONS => 15, SPLITALGO => 'HexStringSplit', REGION_REPLICATION => 2, CONFIGURATION => {'hbase.hregion.scan.loadColumnFamiliesOnDemand' => 'true'}}
  hbase> create 't1', 'f1', {SPLIT_ENABLED => false, MERGE_ENABLED => false}
  hbase> create 't1', {NAME => 'f1', DFS_REPLICATION => 1}

You can also keep around a reference to the created table:

  hbase> t1 = create 't1', 'f1'

Which gives you a reference to the table named 't1', on which you can then call methods.
hbase(main):003:0>
hbase(main):006:0> list    # On a freshly built HBase cluster no user tables exist yet.
TABLE
0 row(s)
Took 0.5429 seconds
=> []
hbase(main):007:0> create 'teacher','synopsis','professional_skill','project_experience'    # Create the "teacher" table with three column families: 'synopsis', 'professional_skill' and 'project_experience'.
Created table teacher
Took 2.7281 seconds
=> Hbase::Table - teacher
hbase(main):008:0> list
TABLE
teacher
1 row(s)
Took 0.0222 seconds
=> ["teacher"]
hbase(main):009:0>
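The create call above uses the shorthand from the help text, where a bare family name stands for the dictionary {NAME => 'family'}. Here is a small plain-Ruby sketch of that equivalence. It is a model only, not the shell's real code: normalize_families is a made-up name, and NAME = 'NAME' is an assumption standing in for the shell's predefined constant.

```ruby
# Plain-Ruby model of the shorthand rule; not the HBase shell's implementation.
NAME = 'NAME' # assumption: stands in for the shell's predefined NAME constant

# Turn every column family argument into the dictionary form.
def normalize_families(*args)
  args.map { |arg| arg.is_a?(Hash) ? arg : { NAME => arg } }
end

FAMILIES = normalize_families('synopsis', 'professional_skill', 'project_experience')
p FAMILIES

# Shorthand and dictionary arguments can be mixed in one call.
p normalize_families('f1', { NAME => 'f2', 'VERSIONS' => 5 })
```

The dictionary form is only needed when a family should get non-default attributes such as VERSIONS or TTL.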
5>. Viewing the table structure
hbase(main):010:0> help "describe"    # Shows the help for the "describe" command only
Describe the named table. For example:
  hbase> describe 't1'
  hbase> describe 'ns1:t1'

Alternatively, you can use the abbreviated 'desc' for the same thing.
  hbase> desc 't1'
  hbase> desc 'ns1:t1'
hbase(main):011:0>
hbase(main):011:0> list
TABLE
teacher
1 row(s)
Took 0.0148 seconds
=> ["teacher"]
hbase(main):012:0> describe 'teacher'    # Show the structure of the "teacher" table
Table teacher is ENABLED
teacher
COLUMN FAMILIES DESCRIPTION
{NAME => 'professional_skill', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'project_experience', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'synopsis', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
3 row(s)

QUOTAS
0 row(s)
Took 0.0801 seconds
hbase(main):013:0>
7>. Inserting data into the "teacher" table
hbase(main):013:0> help "put"    # Shows the help for the "put" command only
Put a cell 'value' at specified table/row/column and optionally timestamp coordinates. To put a cell value into table 'ns1:t1' or 't1' at row 'r1' under column 'c1' marked with the time 'ts1', do:

  hbase> put 'ns1:t1', 'r1', 'c1', 'value'
  hbase> put 't1', 'r1', 'c1', 'value'
  hbase> put 't1', 'r1', 'c1', 'value', ts1
  hbase> put 't1', 'r1', 'c1', 'value', {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> put 't1', 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
  hbase> put 't1', 'r1', 'c1', 'value', ts1, {VISIBILITY=>'PRIVATE|SECRET'}

The same commands also can be run on a table reference. Suppose you had a reference t to table 't1', the corresponding command would be:

  hbase> t.put 'r1', 'c1', 'value', ts1, {ATTRIBUTES=>{'mykey'=>'myvalue'}}
hbase(main):014:0>
hbase(main):014:0> put 'teacher','10001','synopsis:name','yinzhengjie'    # Insert one cell into "teacher": row key "10001", column family "synopsis", qualifier "name", value "yinzhengjie"
Took 0.1176 seconds
hbase(main):015:0> put 'teacher','10001','synopsis:age','18'
Took 0.0105 seconds
hbase(main):016:0> put 'teacher','10001','synopsis:address','beijing'
Took 0.0118 seconds
hbase(main):017:0> put 'teacher','10002','synopsis:name','jason yin'    # Note: the row key here is "10002", so this is a different row from "10001" above.
Took 0.0099 seconds
hbase(main):018:0> put 'teacher','10002','synopsis:age','27'
Took 0.0098 seconds
hbase(main):019:0> put 'teacher','10002','synopsis:address','shijiazhuang'
Took 0.0101 seconds
hbase(main):020:0>
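Each put above writes a single cell, addressed by row key and "family:qualifier"; a row is simply every cell that shares a row key. A rough plain-Ruby model of that logical layout follows. It is a sketch for intuition only (real storage is considerably more involved), and put_cell is a hypothetical name.

```ruby
# Plain-Ruby model of the logical layout: row key => { "family:qualifier" => value }.
# Not the HBase API; put_cell is a made-up helper for illustration.
def put_cell(table, row, column, value)
  table[row] ||= {}
  table[row][column] = value
  table
end

TEACHER = {}
put_cell(TEACHER, '10001', 'synopsis:name', 'yinzhengjie')
put_cell(TEACHER, '10001', 'synopsis:age', '18')
put_cell(TEACHER, '10002', 'synopsis:name', 'jason yin')

p TEACHER.keys           # the row keys
p TEACHER['10001'].keys  # the columns stored in row 10001
```

Rows do not have to share the same set of columns: a qualifier comes into existence the first time a put names it.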
8>. Viewing table data
hbase(main):019:0> help "scan"    # Shows the help for the "scan" command only
Scan a table; pass table name and optionally a dictionary of scanner specifications. Scanner specifications may include one or more of: TIMERANGE, FILTER, LIMIT, STARTROW, STOPROW, ROWPREFIXFILTER, TIMESTAMP, MAXLENGTH, COLUMNS, CACHE, RAW, VERSIONS, ALL_METRICS, METRICS, REGION_REPLICA_ID, ISOLATION_LEVEL, READ_TYPE, ALLOW_PARTIAL_RESULTS, BATCH or MAX_RESULT_SIZE

If no columns are specified, all columns will be scanned. To scan all members of a column family, leave the qualifier empty as in 'col_family'.

The filter can be specified in two ways:
1. Using a filterString - more information on this is available in the Filter Language document attached to the HBASE-4176 JIRA
2. Using the entire package name of the filter.

If you wish to see metrics regarding the execution of the scan, the ALL_METRICS boolean should be set to true. Alternatively, if you would prefer to see only a subset of the metrics, the METRICS array can be defined to include the names of only the metrics you care about.

Some examples:

  hbase> scan 'hbase:meta'
  hbase> scan 'hbase:meta', {COLUMNS => 'info:regioninfo'}
  hbase> scan 'ns1:t1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
  hbase> scan 't1', {COLUMNS => ['c1', 'c2'], LIMIT => 10, STARTROW => 'xyz'}
  hbase> scan 't1', {COLUMNS => 'c1', TIMERANGE => [1303668804000, 1303668904000]}
  hbase> scan 't1', {REVERSED => true}
  hbase> scan 't1', {ALL_METRICS => true}
  hbase> scan 't1', {METRICS => ['RPC_RETRIES', 'ROWS_FILTERED']}
  hbase> scan 't1', {ROWPREFIXFILTER => 'row2', FILTER => " (QualifierFilter (>=, 'binary:xyz')) AND (TimestampsFilter ( 123, 456))"}
  hbase> scan 't1', {FILTER => org.apache.hadoop.hbase.filter.ColumnPaginationFilter.new(1, 0)}
  hbase> scan 't1', {CONSISTENCY => 'TIMELINE'}
  hbase> scan 't1', {ISOLATION_LEVEL => 'READ_UNCOMMITTED'}
  hbase> scan 't1', {MAX_RESULT_SIZE => 123456}

For setting the Operation Attributes
  hbase> scan 't1', { COLUMNS => ['c1', 'c2'], ATTRIBUTES => {'mykey' => 'myvalue'}}
  hbase> scan 't1', { COLUMNS => ['c1', 'c2'], AUTHORIZATIONS => ['PRIVATE','SECRET']}

For experts, there is an additional option -- CACHE_BLOCKS -- which switches block caching for the scanner on (true) or off (false). By default it is enabled. Examples:

  hbase> scan 't1', {COLUMNS => ['c1', 'c2'], CACHE_BLOCKS => false}

Also for experts, there is an advanced option -- RAW -- which instructs the scanner to return all cells (including delete markers and uncollected deleted cells). This option cannot be combined with requesting specific COLUMNS. Disabled by default. Example:

  hbase> scan 't1', {RAW => true, VERSIONS => 10}

There is yet another option -- READ_TYPE -- which instructs the scanner to use a specific read type. Example:

  hbase> scan 't1', {READ_TYPE => 'PREAD'}

Besides the default 'toStringBinary' format, 'scan' supports custom formatting by column. A user can define a FORMATTER by adding it to the column name in the scan specification. The FORMATTER can be stipulated:

1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.

Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
  hbase> scan 't1', {COLUMNS => ['cf:qualifier1:toInt', 'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }

Note that you can specify a FORMATTER by column only (cf:qualifier). You can set a formatter for all columns (including, all key parts) using the "FORMATTER" and "FORMATTER_CLASS" options. The default "FORMATTER_CLASS" is "org.apache.hadoop.hbase.util.Bytes".

  hbase> scan 't1', {FORMATTER => 'toString'}
  hbase> scan 't1', {FORMATTER_CLASS => 'org.apache.hadoop.hbase.util.Bytes', FORMATTER => 'toString'}

Scan can also be used directly from a table, by first getting a reference to a table, like such:

  hbase> t = get_table 't'
  hbase> t.scan

Note in the above situation, you can still provide all the filtering, columns, options, etc as described above.
hbase(main):020:0>
hbase(main):020:0> scan 'teacher'    # Full-table scan. Not recommended in production, where tables can be very large!
ROW                COLUMN+CELL
 10001             column=synopsis:address, timestamp=1590402865107, value=beijing
 10001             column=synopsis:age, timestamp=1590402849894, value=18
 10001             column=synopsis:name, timestamp=1590402838027, value=yinzhengjie
 10002             column=synopsis:address, timestamp=1590402961446, value=shijiazhuang
 10002             column=synopsis:age, timestamp=1590402919298, value=27
 10002             column=synopsis:name, timestamp=1590402908251, value=jason yin
2 row(s)
Took 0.1843 seconds
hbase(main):021:0>
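The timestamp values in the scan output are easier to read as wall-clock times. The sketch below assumes they are milliseconds since the Unix epoch, which is what a put records by default when no explicit timestamp is passed; cell_time is a made-up helper name.

```ruby
# Assumption: cell timestamps are milliseconds since the Unix epoch.
# cell_time is a hypothetical helper, not an HBase shell command.
def cell_time(timestamp_ms)
  Time.at(timestamp_ms / 1000, timestamp_ms % 1000, :millisecond).utc
end

# Timestamp of the first put in this walkthrough (synopsis:name of row 10001).
puts cell_time(1590402838027).strftime('%Y-%m-%d %H:%M:%S.%L UTC')
```

The shell banner reports its build time in CST, which is UTC+8, so local times on this cluster are eight hours ahead of the UTC value printed here.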
hbase(main):021:0> scan 'teacher',{STARTROW => '10001',STOPROW => '10002'}    # Show only the row whose row key is "10001". Note that STARTROW and STOPROW form a half-open interval: start included, stop excluded.
ROW                COLUMN+CELL
 10001             column=synopsis:address, timestamp=1590402865107, value=beijing
 10001             column=synopsis:age, timestamp=1590402849894, value=18
 10001             column=synopsis:name, timestamp=1590402838027, value=yinzhengjie
1 row(s)
Took 0.0227 seconds
hbase(main):022:0>
hbase(main):022:0> scan 'teacher',{STARTROW => '10001'}    # Show all rows from row "10001" (inclusive) onward. Row-key filtering compares the keys as strings (lexicographically), not as numbers.
ROW                COLUMN+CELL
 10001             column=synopsis:address, timestamp=1590402865107, value=beijing
 10001             column=synopsis:age, timestamp=1590402849894, value=18
 10001             column=synopsis:name, timestamp=1590402838027, value=yinzhengjie
 10002             column=synopsis:address, timestamp=1590402961446, value=shijiazhuang
 10002             column=synopsis:age, timestamp=1590402919298, value=27
 10002             column=synopsis:name, timestamp=1590402908251, value=jason yin
2 row(s)
Took 0.0128 seconds
hbase(main):023:0>
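The two range scans above rely on two rules: row keys are ordered as strings, and the STARTROW/STOPROW range includes the start but excludes the stop. A minimal plain-Ruby model of both rules is sketched below. It is not the HBase client API; scan_keys and the extra keys "9" and "100" are made up to show the ordering.

```ruby
# Plain-Ruby model of a row-key range scan; not the HBase client API.
# Keys are returned in lexicographic order and the range is
# [start_row, stop_row): start included, stop excluded.
def scan_keys(keys, start_row: nil, stop_row: nil)
  keys.sort.select do |key|
    (start_row.nil? || key >= start_row) && (stop_row.nil? || key < stop_row)
  end
end

KEYS = %w[10002 10001 9 100].freeze

p scan_keys(KEYS)                                         # string order, not numeric order
p scan_keys(KEYS, start_row: '10001', stop_row: '10002')  # only "10001"
p scan_keys(KEYS, start_row: '10001')                     # "10001" and every key after it
```

Note that "9" sorts after "10002" here. This is why numeric row keys are commonly zero-padded to a fixed width.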
hbase(main):023:0> help "get"    # Shows the help for the "get" command only
Get row or cell contents; pass table name, row, and optionally a dictionary of column(s), timestamp, timerange and versions. Examples:

  hbase> get 'ns1:t1', 'r1'
  hbase> get 't1', 'r1'
  hbase> get 't1', 'r1', {TIMERANGE => [ts1, ts2]}
  hbase> get 't1', 'r1', {COLUMN => 'c1'}
  hbase> get 't1', 'r1', {COLUMN => ['c1', 'c2', 'c3']}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
  hbase> get 't1', 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
  hbase> get 't1', 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
  hbase> get 't1', 'r1', 'c1'
  hbase> get 't1', 'r1', 'c1', 'c2'
  hbase> get 't1', 'r1', ['c1', 'c2']
  hbase> get 't1', 'r1', {COLUMN => 'c1', ATTRIBUTES => {'mykey'=>'myvalue'}}
  hbase> get 't1', 'r1', {COLUMN => 'c1', AUTHORIZATIONS => ['PRIVATE','SECRET']}
  hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE'}
  hbase> get 't1', 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}

Besides the default 'toStringBinary' format, 'get' also supports custom formatting by column. A user can define a FORMATTER by adding it to the column name in the get specification. The FORMATTER can be stipulated:

1. either as a org.apache.hadoop.hbase.util.Bytes method name (e.g, toInt, toString)
2. or as a custom class followed by method name: e.g. 'c(MyFormatterClass).format'.

Example formatting cf:qualifier1 and cf:qualifier2 both as Integers:
  hbase> get 't1', 'r1' {COLUMN => ['cf:qualifier1:toInt', 'cf:qualifier2:c(org.apache.hadoop.hbase.util.Bytes).toInt'] }

Note that you can specify a FORMATTER by column only (cf:qualifier). You can set a formatter for all columns (including, all key parts) using the "FORMATTER" and "FORMATTER_CLASS" options. The default "FORMATTER_CLASS" is "org.apache.hadoop.hbase.util.Bytes".

  hbase> get 't1', 'r1', {FORMATTER => 'toString'}
  hbase> get 't1', 'r1', {FORMATTER_CLASS => 'org.apache.hadoop.hbase.util.Bytes', FORMATTER => 'toString'}

The same commands also can be run on a reference to a table (obtained via get_table or create_table). Suppose you had a reference t to table 't1', the corresponding commands would be:

  hbase> t.get 'r1'
  hbase> t.get 'r1', {TIMERANGE => [ts1, ts2]}
  hbase> t.get 'r1', {COLUMN => 'c1'}
  hbase> t.get 'r1', {COLUMN => ['c1', 'c2', 'c3']}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMERANGE => [ts1, ts2], VERSIONS => 4}
  hbase> t.get 'r1', {COLUMN => 'c1', TIMESTAMP => ts1, VERSIONS => 4}
  hbase> t.get 'r1', {FILTER => "ValueFilter(=, 'binary:abc')"}
  hbase> t.get 'r1', 'c1'
  hbase> t.get 'r1', 'c1', 'c2'
  hbase> t.get 'r1', ['c1', 'c2']
  hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE'}
  hbase> t.get 'r1', {CONSISTENCY => 'TIMELINE', REGION_REPLICA_ID => 1}
hbase(main):024:0>
hbase(main):024:0> get 'teacher','10001'    # Show all the data of the row with the given row key
COLUMN                  CELL
 synopsis:address       timestamp=1590404964908, value=AnKang
 synopsis:age           timestamp=1590402849894, value=18
 synopsis:name          timestamp=1590402838027, value=yinzhengjie
1 row(s)
Took 0.4822 seconds
hbase(main):025:0>
hbase(main):025:0> get 'teacher','10001','synopsis:name'    # Show a single column (family:qualifier) of the row with the given row key
COLUMN                  CELL
 synopsis:name          timestamp=1590402838027, value=yinzhengjie
1 row(s)
Took 0.0179 seconds
hbase(main):026:0>
9>. Updating the value of a column
hbase(main):026:0> scan 'teacher',{STARTROW => '10001',STOPROW => '10002'}
ROW                COLUMN+CELL
 10001             column=synopsis:address, timestamp=1590402865107, value=beijing
 10001             column=synopsis:age, timestamp=1590402849894, value=18
 10001             column=synopsis:name, timestamp=1590402838027, value=yinzhengjie
1 row(s)
Took 0.0107 seconds
hbase(main):027:0> put 'teacher','10001','synopsis:address','AnKang'    # If the column does not exist yet, put creates it; if it already exists, the new value supersedes the old one.
Took 0.0138 seconds
hbase(main):028:0> scan 'teacher',{STARTROW => '10001',STOPROW => '10002'}
ROW                COLUMN+CELL
 10001             column=synopsis:address, timestamp=1590404964908, value=AnKang
 10001             column=synopsis:age, timestamp=1590402849894, value=18
 10001             column=synopsis:name, timestamp=1590402838027, value=yinzhengjie
1 row(s)
Took 0.0164 seconds
hbase(main):029:0>
10>. Counting the rows in a table
hbase(main):030:0> count 'teacher'
2 row(s)
Took 0.0777 seconds
=> 2
hbase(main):031:0>
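count reports 2 even though six cells were written, because it counts distinct row keys rather than cells. A tiny plain-Ruby model of the difference is sketched below; count_rows is a made-up helper, not the shell command.

```ruby
# Plain-Ruby model: every cell is [row key, column]; rows are distinct row keys.
CELLS = [
  ['10001', 'synopsis:address'], ['10001', 'synopsis:age'], ['10001', 'synopsis:name'],
  ['10002', 'synopsis:address'], ['10002', 'synopsis:age'], ['10002', 'synopsis:name']
].freeze

# count_rows is a hypothetical helper for illustration.
def count_rows(cells)
  cells.map(&:first).uniq.length
end

p CELLS.length       # number of cells
p count_rows(CELLS)  # number of rows, which is what count reports
```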
11>. Altering the table
hbase(main):016:0> describe 'teacher'
Table teacher is ENABLED
teacher
COLUMN FAMILIES DESCRIPTION
{NAME => 'professional_skill', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'project_experience', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'synopsis', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
3 row(s)

QUOTAS
0 row(s)
Took 0.0835 seconds
hbase(main):017:0> alter 'teacher',{NAME => 'synopsis',VERSIONS => 5}    # Alter the "teacher" table so that the "synopsis" column family keeps up to 5 versions, i.e. the 5 most recent writes to each column in that family.
Updating all regions with the new schema...
1/1 regions updated.
Done.
Took 2.8844 seconds
hbase(main):018:0> describe 'teacher'
Table teacher is ENABLED
teacher
COLUMN FAMILIES DESCRIPTION
{NAME => 'professional_skill', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'project_experience', VERSIONS => '1', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
{NAME => 'synopsis', VERSIONS => '5', EVICT_BLOCKS_ON_CLOSE => 'false', NEW_VERSION_BEHAVIOR => 'false', KEEP_DELETED_CELLS => 'FALSE', CACHE_DATA_ON_WRITE => 'false', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', MIN_VERSIONS => '0', REPLICATION_SCOPE => '0', BLOOMFILTER => 'ROW', CACHE_INDEX_ON_WRITE => 'false', IN_MEMORY => 'false', CACHE_BLOOMS_ON_WRITE => 'false', PREFETCH_BLOCKS_ON_OPEN => 'false', COMPRESSION => 'NONE', BLOCKCACHE => 'true', BLOCKSIZE => '65536'}
3 row(s)

QUOTAS
0 row(s)
Took 0.0775 seconds
hbase(main):019:0>
hbase(main):027:0> scan 'teacher',{STARTROW => '10001',STOPROW => '10002'}    # View the data before the changes
ROW                COLUMN+CELL
 10001             column=synopsis:address, timestamp=1590413642872, value=beijing
 10001             column=synopsis:name, timestamp=1590417615037, value=yinzhengjie2020
1 row(s)
Took 0.0096 seconds
hbase(main):028:0> get 'teacher','10001','synopsis:name'
COLUMN                  CELL
 synopsis:name          timestamp=1590417615037, value=yinzhengjie2020
1 row(s)
Took 0.0063 seconds
hbase(main):029:0>
hbase(main):031:0> put 'teacher','10001','synopsis:name','yinzhengjie2019'    # Modify the value 5 times in a row
Took 0.0064 seconds
hbase(main):032:0> put 'teacher','10001','synopsis:name','yinzhengjie2018'
Took 0.0063 seconds
hbase(main):033:0> put 'teacher','10001','synopsis:name','yinzhengjie2017'
Took 0.0094 seconds
hbase(main):034:0> put 'teacher','10001','synopsis:name','yinzhengjie2016'
Took 0.0080 seconds
hbase(main):035:0> put 'teacher','10001','synopsis:name','yinzhengjie2015'
Took 0.0141 seconds
hbase(main):036:0> scan 'teacher',{STARTROW => '10001',STOPROW => '10002'}
ROW                COLUMN+CELL
 10001             column=synopsis:address, timestamp=1590413642872, value=beijing
 10001             column=synopsis:name, timestamp=1590417809578, value=yinzhengjie2015    # As expected, scan returns only the latest version, i.e. the last write.
1 row(s)
Took 0.0088 seconds
hbase(main):037:0>
hbase(main):044:0> get 'teacher','10001',{COLUMN=>'synopsis:name',VERSIONS=>3}    # From the "teacher" table, row key "10001", column family "synopsis", qualifier "name": show the 3 most recent versions.
COLUMN                  CELL
 synopsis:name          timestamp=1590417809578, value=yinzhengjie2015
 synopsis:name          timestamp=1590417806829, value=yinzhengjie2016
 synopsis:name          timestamp=1590417803412, value=yinzhengjie2017
1 row(s)
Took 0.0154 seconds
hbase(main):045:0>
hbase(main):045:0> get 'teacher','10001',{COLUMN=>'synopsis:name',VERSIONS=>5}    # Same as above, but show the 5 most recent versions
COLUMN                  CELL
 synopsis:name          timestamp=1590417809578, value=yinzhengjie2015
 synopsis:name          timestamp=1590417806829, value=yinzhengjie2016
 synopsis:name          timestamp=1590417803412, value=yinzhengjie2017
 synopsis:name          timestamp=1590417762124, value=yinzhengjie2018
 synopsis:name          timestamp=1590417698039, value=yinzhengjie2019
1 row(s)
Took 0.0111 seconds
hbase(main):046:0>
hbase(main):046:0> get 'teacher','10001',{COLUMN=>'synopsis:name',VERSIONS=>10}    # Same as above, but ask for the 10 most recent versions. Because the "synopsis" column family was altered to keep only 5 versions, asking for 10 still returns just the 5 most recent ones.
COLUMN                  CELL
 synopsis:name          timestamp=1590417809578, value=yinzhengjie2015
 synopsis:name          timestamp=1590417806829, value=yinzhengjie2016
 synopsis:name          timestamp=1590417803412, value=yinzhengjie2017
 synopsis:name          timestamp=1590417762124, value=yinzhengjie2018
 synopsis:name          timestamp=1590417698039, value=yinzhengjie2019
1 row(s)
Took 0.0096 seconds
hbase(main):047:0>
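The three get calls above can be summarized as: a column keeps at most VERSIONS values, reads return the newest first, and asking for more versions than are kept just returns what is there. Below is a plain-Ruby sketch of that behavior. It models what a reader observes, not how HBase stores or cleans up versions; VersionedCell and the integer timestamps are made up for the example.

```ruby
# Plain-Ruby model of per-column version retention as seen by a reader.
# VersionedCell is a hypothetical class, not part of HBase.
class VersionedCell
  def initialize(max_versions)
    @max_versions = max_versions
    @versions = [] # entries are [timestamp, value], newest first
  end

  def put(timestamp, value)
    @versions << [timestamp, value]
    # Keep only the newest max_versions entries, newest first.
    @versions = @versions.sort_by { |ts, _| -ts }.first(@max_versions)
  end

  # Return up to `versions` of the newest values, never more than are kept.
  def get(versions: 1)
    @versions.first(versions).map(&:last)
  end
end

NAME_CELL = VersionedCell.new(5)
%w[2020 2019 2018 2017 2016 2015].each_with_index do |year, i|
  NAME_CELL.put(i, "yinzhengjie#{year}") # i stands in for the write time
end

p NAME_CELL.get                # newest value only, like a plain get or scan
p NAME_CELL.get(versions: 3)   # three newest values
p NAME_CELL.get(versions: 10)  # capped at the 5 versions that are kept
```

As in the session above, the oldest value (yinzhengjie2020) is no longer visible once five newer writes exist.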
12>. Deleting data
hbase(main):031:0> scan 'teacher'
ROW                COLUMN+CELL
 10001             column=synopsis:address, timestamp=1590404964908, value=AnKang
 10001             column=synopsis:age, timestamp=1590402849894, value=18
 10001             column=synopsis:name, timestamp=1590402838027, value=yinzhengjie
 10002             column=synopsis:address, timestamp=1590402961446, value=shijiazhuang
 10002             column=synopsis:age, timestamp=1590402919298, value=27
 10002             column=synopsis:name, timestamp=1590402908251, value=jason yin
2 row(s)
Took 0.0130 seconds
hbase(main):032:0> deleteall 'teacher','10002'    # Delete all the data of the row with row key "10002"
Took 0.0293 seconds
hbase(main):033:0> scan 'teacher'
ROW                COLUMN+CELL
 10001             column=synopsis:address, timestamp=1590404964908, value=AnKang
 10001             column=synopsis:age, timestamp=1590402849894, value=18
 10001             column=synopsis:name, timestamp=1590402838027, value=yinzhengjie
1 row(s)
Took 0.0276 seconds
hbase(main):034:0>
hbase(main):033:0> scan 'teacher'
ROW                   COLUMN+CELL
 10001                column=synopsis:address, timestamp=1590404964908, value=AnKang
 10001                column=synopsis:age, timestamp=1590402849894, value=18
 10001                column=synopsis:name, timestamp=1590402838027, value=yinzhengjie
1 row(s)
Took 0.0276 seconds
hbase(main):034:0> delete 'teacher','10001','synopsis:age'    # Delete the "age" column in the "synopsis" column family for ROWKEY "10001"
Took 0.0113 seconds
hbase(main):035:0> scan 'teacher'
ROW                   COLUMN+CELL
 10001                column=synopsis:address, timestamp=1590404964908, value=AnKang
 10001                column=synopsis:name, timestamp=1590402838027, value=yinzhengjie
1 row(s)
Took 0.0083 seconds
hbase(main):036:0>
hbase(main):037:0> scan 'teacher'
ROW                   COLUMN+CELL
 10001                column=synopsis:address, timestamp=1590404964908, value=AnKang
 10001                column=synopsis:name, timestamp=1590402838027, value=yinzhengjie
1 row(s)
Took 0.0089 seconds
hbase(main):038:0> truncate 'teacher'    # Empty the table. The output below shows the order of operations: the table is disabled first, then truncated.
Truncating 'teacher' table (it may take a while):
Disabling table...
Truncating table...
Took 4.7630 seconds
hbase(main):039:0> scan 'teacher'
ROW                   COLUMN+CELL
0 row(s)
Took 0.8933 seconds
hbase(main):040:0>
13>.Dropping a table
hbase(main):040:0> list
TABLE
teacher
1 row(s)
Took 0.0140 seconds
=> ["teacher"]
hbase(main):041:0> disable 'teacher'    # Disable the "teacher" table
Took 0.7649 seconds
hbase(main):042:0> drop 'teacher'    # A table must be in the disabled state before it can be dropped
Took 0.4524 seconds
hbase(main):043:0> list
TABLE
0 row(s)
Took 0.0098 seconds
=> []
hbase(main):044:0>
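The disable-before-drop rule can be pictured as a tiny state machine. The sketch below is plain Ruby, not the HBase shell: ToyTable is a hypothetical class, the error text is illustrative, and the assumption that a new table starts out enabled is part of the model.

```ruby
# Toy model of the "disable before drop" rule.
# ToyTable is a hypothetical class, not part of HBase.
class ToyTable
  attr_reader :name

  def initialize(name)
    @name = name
    @enabled = true   # assumption of this sketch: new tables start enabled
    @dropped = false
  end

  def enabled?
    @enabled
  end

  def dropped?
    @dropped
  end

  def disable
    @enabled = false
    self
  end

  # Refuse to drop while the table is still enabled.
  def drop
    raise "Table #{@name} is enabled. Disable it first." if @enabled
    @dropped = true
    self
  end
end

table = ToyTable.new('teacher')
begin
  table.drop                       # rejected: still enabled
rescue RuntimeError => e
  puts "drop refused: #{e.message}"
end
table.disable.drop                 # same order as the transcript above
puts table.dropped?                # true
```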
14>.Viewing namespaces
hbase(main):063:0> help "list_namespace"    # Show help for the "list_namespace" command only
List all namespaces in hbase. Optional regular expression parameter could
be used to filter the output. Examples:

  hbase> list_namespace
  hbase> list_namespace 'abc.*'
hbase(main):064:0>
hbase(main):064:0> list_namespace    # Two namespaces exist by default, even before any namespace has been created
NAMESPACE
default
hbase
2 row(s)
Took 0.0152 seconds
hbase(main):065:0>
15>.Creating a namespace
hbase(main):067:0> help "create_namespace"    # Show help for the "create_namespace" command only
Create namespace; pass namespace name,
and optionally a dictionary of namespace configuration.
Examples:

  hbase> create_namespace 'ns1'
  hbase> create_namespace 'ns1', {'PROPERTY_NAME'=>'PROPERTY_VALUE'}
hbase(main):068:0>
hbase(main):068:0> list_namespace
NAMESPACE
default
hbase
2 row(s)
Took 0.0304 seconds
hbase(main):069:0> create_namespace 'yinzhengjie2020'    # Create the "yinzhengjie2020" namespace
Took 0.2975 seconds
hbase(main):070:0> list_namespace
NAMESPACE
default
hbase
yinzhengjie2020
3 row(s)
Took 0.0170 seconds
hbase(main):071:0>
16>.Creating a table in a specific namespace
hbase(main):070:0> list_namespace
NAMESPACE
default
hbase
yinzhengjie2020
3 row(s)
Took 0.0170 seconds
hbase(main):071:0> create 'yinzhengjie2020:student','courses','info','detail'    # Prefixing the table name with a namespace creates the table in that namespace
Created table yinzhengjie2020:student
Took 2.3499 seconds
=> Hbase::Table - yinzhengjie2020:student
hbase(main):072:0> list
TABLE
teacher
yinzhengjie2020:student
2 row(s)
Took 0.0048 seconds
=> ["teacher", "yinzhengjie2020:student"]
hbase(main):073:0>
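The list output shows two naming forms: "teacher" has no prefix and lives in the "default" namespace, while "yinzhengjie2020:student" carries its namespace before the colon. A minimal sketch of that naming rule in plain Ruby follows; split_table_name and DEFAULT_NAMESPACE are hypothetical names introduced for illustration, not HBase shell commands.

```ruby
# Split a table name into [namespace, table].
# split_table_name is a hypothetical helper, not an HBase shell command.
DEFAULT_NAMESPACE = 'default'

def split_table_name(full_name)
  namespace, separator, qualifier = full_name.partition(':')
  # No colon: the whole string is a table name in the default namespace.
  return [DEFAULT_NAMESPACE, namespace] if separator.empty?

  [namespace, qualifier]
end

p split_table_name('yinzhengjie2020:student') # ["yinzhengjie2020", "student"]
p split_table_name('teacher')                 # ["default", "teacher"]
```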
17>.Dropping a namespace
hbase(main):080:0> list
TABLE
teacher
yinzhengjie2020:student
2 row(s)
Took 0.0106 seconds
=> ["teacher", "yinzhengjie2020:student"]
hbase(main):081:0> disable "yinzhengjie2020:student"
Took 0.9778 seconds
hbase(main):082:0> drop "yinzhengjie2020:student"
Took 0.4751 seconds
hbase(main):083:0> list_namespace
NAMESPACE
default
hbase
yinzhengjie2020
3 row(s)
Took 0.0194 seconds
hbase(main):084:0> drop_namespace "yinzhengjie2020"    # Before dropping a namespace, make sure it contains no tables (in other words, drop all of its tables manually first)
Took 0.2688 seconds
hbase(main):085:0> list_namespace
NAMESPACE
default
hbase
2 row(s)
Took 0.0205 seconds
hbase(main):086:0>
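The requirement that a namespace be empty before it is dropped can be sketched with a toy catalog. This is plain Ruby rather than the HBase shell: ToyCatalog and its error messages are hypothetical, and the two pre-existing namespaces simply mirror the list_namespace output shown earlier.

```ruby
# Toy catalog mapping namespace name => list of table names.
# ToyCatalog is a hypothetical class, not an HBase API.
class ToyCatalog
  def initialize
    @namespaces = { 'default' => [], 'hbase' => [] }
  end

  def create_namespace(name)
    raise "Namespace #{name} already exists" if @namespaces.key?(name)

    @namespaces[name] = []
  end

  def create_table(namespace, table)
    @namespaces.fetch(namespace) << table
  end

  def drop_table(namespace, table)
    @namespaces.fetch(namespace).delete(table)
  end

  # Refuse to drop a namespace that still contains tables.
  def drop_namespace(name)
    tables = @namespaces.fetch(name)
    raise "Namespace #{name} still has #{tables.size} table(s)" unless tables.empty?

    @namespaces.delete(name)
  end

  def list_namespace
    @namespaces.keys.sort
  end
end

catalog = ToyCatalog.new
catalog.create_namespace('yinzhengjie2020')
catalog.create_table('yinzhengjie2020', 'student')
begin
  catalog.drop_namespace('yinzhengjie2020')   # rejected: not empty
rescue RuntimeError => e
  puts "drop_namespace refused: #{e.message}"
end
catalog.drop_table('yinzhengjie2020', 'student')
catalog.drop_namespace('yinzhengjie2020')
p catalog.list_namespace                      # ["default", "hbase"]
```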
三.Overview of how HBase works
Recommended reading from the author: https://www.cnblogs.com/yinzhengjie2020/p/12237264.html