Filters class and types (comparison, dedicated and decorating)

When reading data from HBase using Get or Scan operations, you can use custom filters to return a subset of results to the client. While this does not reduce server-side IO, it does reduce network bandwidth and the amount of data the client needs to process. Filters are generally used through the Java API, but can also be used from the HBase Shell for testing and debugging.

HBase filters take zero or more arguments, in parentheses. Where the argument is a string, it is surrounded by single quotes ('string').

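Filters can be combined with logical operators. Some filters take a comparison operator and a comparator as arguments. Each group is listed below.

Logical Operators

AND, OR, SKIP and WHILE, for example:

(Filter1 AND Filter2) OR (Filter3 AND Filter4)

Comparison Operators

LESS (<), LESS_OR_EQUAL (<=), EQUAL (=), NOT_EQUAL (!=), GREATER_OR_EQUAL (>=), GREATER (>) and NO_OP (no operation)

Comparators

BinaryComparator (binary), BinaryPrefixComparator (binaryprefix), RegexStringComparator (regexstring) and SubstringComparator (substring). A comparator is written as 'type:value'.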

Examples

Example1: >, 'binary:abc' will match everything that is lexicographically greater than “abc”

Example2: =, 'binaryprefix:abc' will match everything whose first 3 characters are lexicographically equal to “abc”

Example3: !=, 'regexstring:ab*yz' will match everything that contains no match for the regular expression ab*yz (an “a”, zero or more “b” characters, then “yz”)

Example4: =, 'substring:abc123' will match everything that contains the substring “abc123”
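The four comparator types can be modeled with plain Java to make the matching rules concrete. This is only a sketch using the standard library, not HBase's own comparator classes. The class name ComparatorSketch and its matches method are hypothetical. It assumes regexstring behaves like an unanchored regular-expression search and substring like a case-insensitive containment test.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;
import java.util.regex.Pattern;

/** Stdlib-only model of the comparator types; a hypothetical helper, not an HBase class. */
class ComparatorSketch {

    /** Returns true if value satisfies op against a comparator written as "type:argument". */
    static boolean matches(String op, String comparator, String value) {
        int colon = comparator.indexOf(':');
        if (colon < 0) {
            throw new IllegalArgumentException("comparator must look like type:argument");
        }
        String type = comparator.substring(0, colon);
        String arg = comparator.substring(colon + 1);
        byte[] v = value.getBytes(StandardCharsets.UTF_8);
        byte[] a = arg.getBytes(StandardCharsets.UTF_8);

        int cmp; // sign of "value compared with argument"; 0 means equal or matching
        switch (type) {
            case "binary":
                // Lexicographic comparison of the raw bytes.
                cmp = Arrays.compareUnsigned(v, a);
                break;
            case "binaryprefix":
                // Only the leading bytes of the value, up to the argument's length, take part.
                cmp = Arrays.compareUnsigned(Arrays.copyOf(v, Math.min(v.length, a.length)), a);
                break;
            case "regexstring":
                // Assumption: unanchored search; only = and != are meaningful here.
                cmp = Pattern.compile(arg).matcher(value).find() ? 0 : 1;
                break;
            case "substring":
                // Assumption: case-insensitive containment; only = and != are meaningful here.
                cmp = value.toLowerCase(Locale.ROOT).contains(arg.toLowerCase(Locale.ROOT)) ? 0 : 1;
                break;
            default:
                throw new IllegalArgumentException("unknown comparator type: " + type);
        }

        switch (op) {
            case "<":  return cmp < 0;
            case "<=": return cmp <= 0;
            case "=":  return cmp == 0;
            case "!=": return cmp != 0;
            case ">=": return cmp >= 0;
            case ">":  return cmp > 0;
            default:
                throw new IllegalArgumentException("unknown operator: " + op);
        }
    }
}
```

For instance, ComparatorSketch.matches(">", "binary:abc", "abd") is true, while ComparatorSketch.matches("!=", "regexstring:ab*yz", "abbyz") is false because the pattern is found in the value.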

Compound Operators

Within an expression, parentheses can be used to group clauses together, and parentheses have the highest order of precedence.

SKIP and WHILE operators are next, and have the same precedence.

The AND operator is next.

The OR operator is last, with the lowest precedence.

Examples

A filter string of the form: “Filter1 AND Filter2 OR Filter3” will be evaluated as: “(Filter1 AND Filter2) OR Filter3”

A filter string of the form: “Filter1 AND SKIP Filter2 OR Filter3” will be evaluated as: “(Filter1 AND (SKIP Filter2)) OR Filter3”
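The precedence rules above can be checked with a toy recursive-descent parser that adds explicit parentheses. This is a sketch for illustration only, not the parser HBase uses. The class name PrecedenceSketch is hypothetical, and it assumes filters are written as bare names without argument lists.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy parser that makes the precedence rules visible by parenthesizing every operator. */
class PrecedenceSketch {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    private PrecedenceSketch(String expr) {
        // Pad parentheses with spaces so they become tokens of their own.
        for (String t : expr.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+")) {
            tokens.add(t);
        }
    }

    /** Returns the expression with one pair of parentheses per operator application. */
    static String parenthesize(String expr) {
        PrecedenceSketch p = new PrecedenceSketch(expr);
        String result = p.parseOr();
        if (p.pos != p.tokens.size()) {
            throw new IllegalArgumentException("unexpected token: " + p.tokens.get(p.pos));
        }
        return result;
    }

    private String peek() {
        return pos < tokens.size() ? tokens.get(pos) : null;
    }

    // OR binds loosest, so it is parsed at the outermost level.
    private String parseOr() {
        String left = parseAnd();
        while ("OR".equals(peek())) {
            pos++;
            left = "(" + left + " OR " + parseAnd() + ")";
        }
        return left;
    }

    // AND binds tighter than OR.
    private String parseAnd() {
        String left = parseUnary();
        while ("AND".equals(peek())) {
            pos++;
            left = "(" + left + " AND " + parseUnary() + ")";
        }
        return left;
    }

    // SKIP and WHILE are prefix operators with equal precedence, above AND.
    private String parseUnary() {
        String t = peek();
        if ("SKIP".equals(t) || "WHILE".equals(t)) {
            pos++;
            return "(" + t + " " + parseUnary() + ")";
        }
        return parseAtom();
    }

    // Parenthesized groups and bare filter names bind tightest.
    private String parseAtom() {
        String t = peek();
        if (t == null) {
            throw new IllegalArgumentException("unexpected end of expression");
        }
        pos++;
        if ("(".equals(t)) {
            String inner = parseOr();
            if (!")".equals(peek())) {
                throw new IllegalArgumentException("missing closing parenthesis");
            }
            pos++;
            return inner;
        }
        return t;
    }
}
```

Calling PrecedenceSketch.parenthesize("Filter1 AND SKIP Filter2 OR Filter3") returns "((Filter1 AND (SKIP Filter2)) OR Filter3)", which agrees with the second example apart from the extra outer parentheses.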

Filter Types

HBase includes several filter types, as well as the ability to group filters together and create your own custom filters.

Syntax: KeyOnlyFilter ()

Syntax: FirstKeyOnlyFilter ()

Syntax: PrefixFilter ('<row_prefix>')

Example: PrefixFilter ('Row')

Syntax: ColumnPrefixFilter ('<column_prefix>')

Example: ColumnPrefixFilter ('Col')

Syntax: MultipleColumnPrefixFilter ('<column_prefix>', '<column_prefix>', …, '<column_prefix>')

Example: MultipleColumnPrefixFilter ('Col1', 'Col2')

Syntax: ColumnCountGetFilter ('<limit>')

Example: ColumnCountGetFilter (4)

Syntax: PageFilter ('<page_size>')

Example: PageFilter (2)

Syntax: ColumnPaginationFilter ('<limit>', '<offset>')

Example: ColumnPaginationFilter (3, 5)
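Read as a limit of 3 and an offset of 5, the ColumnPaginationFilter example skips the first five columns of each row and returns the next three. The sketch below models that on an in-memory list of column names. PaginationSketch and page are hypothetical names, and the list stands in for the sorted columns of a single row.

```java
import java.util.List;

/** Models limit/offset paging over the columns of one row; not an HBase class. */
class PaginationSketch {

    /** Returns at most limit columns, starting after the first offset columns. */
    static <T> List<T> page(List<T> sortedColumns, int limit, int offset) {
        // Clamp both ends so short rows yield a shorter (possibly empty) page.
        int from = Math.min(offset, sortedColumns.size());
        int to = Math.min(from + limit, sortedColumns.size());
        return sortedColumns.subList(from, to);
    }
}
```

With ten columns named c0 to c9, page(columns, 3, 5) returns c5, c6 and c7. A row with five or fewer columns yields an empty page.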

Syntax: InclusiveStopFilter ('<stop_row_key>')

Example: InclusiveStopFilter ('Row2')

Syntax: TimestampsFilter (<timestamp>, <timestamp>, …, <timestamp>)

Example: TimestampsFilter (5985489, 48895495, 58489845945)

Syntax: RowFilter (<compareOp>, '<row_comparator>')

Example: RowFilter (<=, 'binary:xyz')

Syntax: FamilyFilter (<compareOp>, '<family_comparator>')

Example: FamilyFilter (>=, 'binaryprefix:FamilyB')

Syntax: QualifierFilter (<compareOp>, '<qualifier_comparator>')

Example: QualifierFilter (=, 'substring:Column1')

Syntax: ValueFilter (<compareOp>, '<value_comparator>')

Example: ValueFilter (!=, 'binary:Value')

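DependentColumnFilter takes two required arguments, a family and a qualifier. It tries to locate this column in each row and returns all key-values in that row that have the same timestamp. An optional boolean argument, dropDependentColumn, controls whether the dependent column itself is dropped from the result.

The filter can also take two additional optional arguments, a compare operator and a value comparator, which are further checks in addition to the family and qualifier. If the dependent column is found, its value must also pass the value check; only then is its timestamp taken into consideration.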

Syntax: DependentColumnFilter ('<family>', '<qualifier>', <boolean>, <compare operator>, '<value comparator>')

Syntax: DependentColumnFilter ('<family>', '<qualifier>', <boolean>)

Syntax: DependentColumnFilter ('<family>', '<qualifier>')

Example: DependentColumnFilter ('conf', 'blacklist', false, >=, 'binary:zebra')

Example: DependentColumnFilter ('conf', 'blacklist', true)

Example: DependentColumnFilter ('conf', 'blacklist')

SingleColumnValueFilter takes a column family, a qualifier, a compare operator and a comparator. It also takes two additional optional boolean arguments, filterIfColumnMissing and setLatestVersionOnly.

If the filterIfColumnMissing flag is set to true, the columns of the row will not be emitted if the specified column to check is not found in the row. The default value is false.

If the setLatestVersionOnly flag is set to false, the filter tests previous versions (timestamps) in addition to the most recent. The default value is true.

These flags are optional and dependent on each other: set either both of them or neither.

Syntax: SingleColumnValueFilter ('<family>', '<qualifier>', <compare operator>, '<comparator>', <filterIfColumnMissing_boolean>, <latest_version_boolean>)

Syntax: SingleColumnValueFilter ('<family>', '<qualifier>', <compare operator>, '<comparator>')

Example: SingleColumnValueFilter ('FamilyA', 'Column1', <=, 'binary:abc', true, false)

Example: SingleColumnValueFilter ('FamilyA', 'Column1', <=, 'binary:abc')
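How the two boolean flags interact is easier to see on a small in-memory model. The sketch below is an illustration under assumptions, not HBase code. SingleColumnFlagsSketch and emitRow are hypothetical names, a row is represented as a map from column name to its versions (newest first), and the value check is passed in as a predicate. It assumes that, when older versions are tested, one passing version is enough for the row to be emitted.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** In-memory model of the filterIfColumnMissing and setLatestVersionOnly flags. */
class SingleColumnFlagsSketch {

    /**
     * Decides whether a row is emitted.
     * row maps a column name to the versions of its value, newest first (hypothetical layout).
     */
    static boolean emitRow(Map<String, List<String>> row,
                           String column,
                           Predicate<String> valueCheck,
                           boolean filterIfColumnMissing,
                           boolean latestVersionOnly) {
        List<String> versions = row.get(column);
        if (versions == null || versions.isEmpty()) {
            // Column absent: the row is dropped only when filterIfColumnMissing is true.
            return !filterIfColumnMissing;
        }
        if (latestVersionOnly) {
            // Only the most recent version is tested.
            return valueCheck.test(versions.get(0));
        }
        // Assumption: with older versions included, any passing version emits the row.
        return versions.stream().anyMatch(valueCheck);
    }
}
```

With a column whose versions are "xyz" (newest) and "abc", a check for values at or below "abc" fails when latestVersionOnly is true and passes when it is false.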

SingleColumnValueExcludeFilter takes the same arguments and behaves the same as SingleColumnValueFilter. However, if the column is found and the condition passes, all the columns of the row will be emitted except for the tested column value.

Syntax: SingleColumnValueExcludeFilter ('<family>', '<qualifier>', <compare operator>, '<comparator>', <filterIfColumnMissing_boolean>, <latest_version_boolean>)

Syntax: SingleColumnValueExcludeFilter ('<family>', '<qualifier>', <compare operator>, '<comparator>')

Example: SingleColumnValueExcludeFilter ('FamilyA', 'Column1', <=, 'binary:abc', false, true)

Example: SingleColumnValueExcludeFilter ('FamilyA', 'Column1', <=, 'binary:abc')

Syntax: ColumnRangeFilter ('<minColumn>', <minColumnInclusive_bool>, '<maxColumn>', <maxColumnInclusive_bool>)

Example: ColumnRangeFilter ('abc', true, 'xyz', false)
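The inclusive and exclusive bounds of ColumnRangeFilter map naturally onto a sorted-set range. The sketch below models the example with the standard library only. ColumnRangeSketch and range are hypothetical names, and it assumes ASCII column names so that Java string ordering agrees with byte ordering.

```java
import java.util.NavigableSet;

/** Models a column range with inclusive/exclusive bounds; not an HBase class. */
class ColumnRangeSketch {

    /** Returns the columns between min and max, honoring the two inclusive flags. */
    static NavigableSet<String> range(NavigableSet<String> columns,
                                      String min, boolean minInclusive,
                                      String max, boolean maxInclusive) {
        // subSet already implements the four combinations of open and closed bounds.
        return columns.subSet(min, minInclusive, max, maxInclusive);
    }
}
```

For a TreeSet holding abb, abc, mno, xyz and xzz, range(columns, "abc", true, "xyz", false) keeps abc and mno: the lower bound is included and the upper bound is not.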

HBase Shell Example

This example scans the 'users' table for rows where the contents of the cf:name column equals the string 'abc'.

hbase> scan 'users', { FILTER => SingleColumnValueFilter.new(Bytes.toBytes('cf'),
  Bytes.toBytes('name'), CompareFilter::CompareOp.valueOf('EQUAL'),
  BinaryComparator.new(Bytes.toBytes('abc'))) }
