Quickly counting rows in HBase


Problem description

Right now I count rows by iterating over a ResultScanner, like this:
for (Result rs = scanner.next(); rs != null; rs = scanner.next()) {
    number++;
}
When the table holds millions of rows, this takes a long time. I want to get the count in real time, and I don't want to use MapReduce.
How can I count the rows quickly?
Solution
Use RowCounter in HBase. RowCounter is a MapReduce job that counts all the rows of a table. It is also a useful sanity check to ensure that HBase can read all the blocks of a table if there are any concerns about metadata inconsistency. It runs the MapReduce job in a single process, but it will run faster if you have a MapReduce cluster in place for it to exploit.
$ hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename>

Usage: RowCounter [options]
    <tablename> [
        --starttime=[start]
        --endtime=[end]
        [--range=[startKey],[endKey]]
        [<column1> <column2>...]
    ]
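
If you want to start the job from Java code, the usage text above can be turned into an argument list. The following is only a minimal sketch: RowCounterCommand and build are hypothetical names (not part of HBase), the table name and row keys are placeholders, and it assumes the hbase launcher script is on the PATH. The options are taken from the usage text as-is.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of HBase): assembles the argument list for
// the RowCounter command shown in the usage text above.
class RowCounterCommand {

    // Builds: hbase org.apache.hadoop.hbase.mapreduce.RowCounter <tablename> [options] [columns...]
    // A negative time omits that time bound; a null start or end key omits --range.
    static List<String> build(String table, long startTime, long endTime,
                              String startKey, String endKey, String... columns) {
        if (table == null || table.isEmpty()) {
            throw new IllegalArgumentException("table name is required");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("hbase"); // assumption: the hbase launcher script is on the PATH
        cmd.add("org.apache.hadoop.hbase.mapreduce.RowCounter");
        cmd.add(table);
        if (startTime >= 0) {
            cmd.add("--starttime=" + startTime);
        }
        if (endTime >= 0) {
            cmd.add("--endtime=" + endTime);
        }
        if (startKey != null && endKey != null) {
            cmd.add("--range=" + startKey + "," + endKey);
        }
        for (String column : columns) {
            cmd.add(column);
        }
        return cmd;
    }

    public static void main(String[] args) {
        // "mytable", "row0100" and "row0200" are placeholder values.
        List<String> cmd = build("mytable", -1, -1, "row0100", "row0200");
        System.out.println(String.join(" ", cmd));
        // To launch the job on a machine with HBase installed, the list could be
        // passed to new ProcessBuilder(cmd).inheritIO().start().
    }
}
```

Restricting the count with --range or a time window means fewer rows are scanned, so the job should finish sooner than a full-table count, although it is still a MapReduce job.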
                        