Scan HTable rows for specific column value using HBase shell

Nishu, here is a solution I use periodically. It is actually much more powerful than you need right now, but I think you will use its power some day. Yes, it is for the HBase shell.

import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.util.Bytes

scan 'yourTable', {LIMIT => 10, FILTER => SingleColumnValueFilter.new(Bytes.toBytes('family'), Bytes.toBytes('field'), CompareFilter::CompareOp.valueOf('EQUAL'), Bytes.toBytes('AAA')), COLUMNS => 'family:field' }

Only the family:field column is returned, with the filter applied. The filter can be extended to perform more complicated comparisons.
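As one example of a more complicated comparison, the fixed byte array can be swapped for a comparator object such as SubstringComparator, which is already imported above. This is only a sketch: it assumes the same placeholder table and column names, and an HBase version that still provides CompareFilter::CompareOp (it is deprecated in HBase 2.x in favor of CompareOperator).

```ruby
import org.apache.hadoop.hbase.filter.CompareFilter
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter
import org.apache.hadoop.hbase.filter.SubstringComparator
import org.apache.hadoop.hbase.util.Bytes

# 'yourTable', 'family', 'field' and 'AAA' are placeholders, as in the scan above.
# With SubstringComparator, EQUAL means "the value contains this substring"
# (the comparison is case-insensitive).
scan 'yourTable', {
  LIMIT   => 10,
  COLUMNS => 'family:field',
  FILTER  => SingleColumnValueFilter.new(
    Bytes.toBytes('family'),
    Bytes.toBytes('field'),
    CompareFilter::CompareOp.valueOf('EQUAL'),
    SubstringComparator.new('AAA'))
}
```

Using NOT_EQUAL instead of EQUAL should invert the match, returning rows whose value does not contain the substring.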

Here are also the hints that I consider most useful:

  • http://hadoop-hbase.blogspot.com/2012/01/hbase-intra-row-scanning.html - Intra-row scanning explanation (Java API).
  • https://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/filter/FilterBase.html - JavaDoc for the FilterBase class, with links to its descendants, which can be used in the same style. The shell syntax will be slightly different, but with the example above you can work it out.

It is possible without Hive:

scan 'filemetadata', 
     { COLUMNS => 'colFam:colQualifier', 
       LIMIT => 10, 
       FILTER => "ValueFilter( =, 'binaryprefix:<someValue, e.g. test1 as defined in the question>' )"
     }

Note: to find all rows that have test1 as the value, as specified in the question, use binaryprefix:test1 in the filter. Because binaryprefix is a prefix comparison, it also matches longer values that begin with test1 (see this answer for more examples).
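The filter string accepts other comparator prefixes as well. A short sketch, assuming the same table and column names as in the scan above and test1 as the value being searched for:

```ruby
# Exact match: only cells whose value is exactly 'test1'.
scan 'filemetadata',
     { COLUMNS => 'colFam:colQualifier',
       LIMIT => 10,
       FILTER => "ValueFilter( =, 'binary:test1' )"
     }

# Substring match: cells whose value contains 'test1' anywhere.
scan 'filemetadata',
     { COLUMNS => 'colFam:colQualifier',
       LIMIT => 10,
       FILTER => "ValueFilter( =, 'substring:test1' )"
     }
```

In both cases COLUMNS restricts the scan to one column, so ValueFilter is only ever compared against values from colFam:colQualifier.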
