zeromax007/gpdb-roaringbitmap

gpdb-roaringbitmap

RoaringBitmap extension for greenplum-db

Introduction

Roaring bitmaps are compressed bitmaps which tend to outperform conventional compressed bitmaps such as WAH, EWAH or Concise. In some instances, roaring bitmaps can be hundreds of times faster, and they often offer significantly better compression. They can even be faster than uncompressed bitmaps. For more information, see https://github.com/RoaringBitmap/CRoaring.

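Roaringbitmap is an efficient bitmap compression algorithm that is now widely used across programming languages and big-data platforms.
This extension integrates Roaringbitmap into the Greenplum database and exposes it as a data type with native database functions, operators, and aggregates.
Bitmap operations are well suited to cardinality computation on large data sets and are commonly used for deduplication, tag filtering, and time-series calculations.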

Build

su - gpadmin
make
make install

Make sure the extension files are copied to the corresponding directory on every node.

psql -c "create extension roaringbitmap;"

Usage

Create table

CREATE TABLE t1 (id integer, bitmap roaringbitmap);

Build bitmap

INSERT INTO t1 SELECT 1,RB_BUILD(ARRAY[1,2,3,4,5,6,7,8,9,200]);

INSERT INTO t1 SELECT 2,RB_BUILD_AGG(e) FROM GENERATE_SERIES(1,100) e;

Bitmap Calculation (OR, AND, XOR, ANDNOT)

SELECT RB_OR(a.bitmap,b.bitmap) FROM (SELECT bitmap FROM t1 WHERE id = 1) AS a,(SELECT bitmap FROM t1 WHERE id = 2) AS b;
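To make the four calculations concrete, here is a minimal sketch in plain Python that models each bitmap as a set of integers. It is not the extension's implementation. The two sets mirror the rows inserted into t1 above, and ANDNOT is assumed to mean "in the first bitmap but not in the second", as in CRoaring.

```python
# Plain-Python model: a set of integers stands in for a roaringbitmap value.
a = {1, 2, 3, 4, 5, 6, 7, 8, 9, 200}   # row id = 1, from RB_BUILD(ARRAY[...])
b = set(range(1, 101))                  # row id = 2, from RB_BUILD_AGG over 1..100

rb_or = a | b        # OR: values present in either bitmap
rb_and = a & b       # AND: values present in both bitmaps
rb_xor = a ^ b       # XOR: values present in exactly one bitmap
rb_andnot = a - b    # ANDNOT (assumed order): values in a that are not in b

print(len(rb_or), len(rb_and), len(rb_xor), sorted(rb_andnot))
# 101 9 92 [200]
```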

Bitmap Aggregate (OR, AND, XOR, BUILD)

SELECT RB_OR_AGG(bitmap) FROM t1;
SELECT RB_AND_AGG(bitmap) FROM t1;
SELECT RB_XOR_AGG(bitmap) FROM t1;
SELECT RB_BUILD_AGG(e) FROM GENERATE_SERIES(1,100) e;
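The aggregates can be thought of as folding the corresponding two-bitmap operation over every row. The sketch below models that with Python sets and `functools.reduce`. The third row is hypothetical and is added only so that the three aggregates give visibly different results. This is an illustration of the semantics, not the extension's code.

```python
from functools import reduce

# Each set stands in for one row's roaringbitmap value.
rows = [
    {1, 2, 3, 4, 5, 6, 7, 8, 9, 200},  # row id = 1 from the Usage section
    set(range(1, 101)),                # row id = 2 from the Usage section
    {5, 300},                          # hypothetical extra row, for illustration only
]

or_agg = reduce(set.union, rows)                  # values present in any row
and_agg = reduce(set.intersection, rows)          # values present in every row
xor_agg = reduce(set.symmetric_difference, rows)  # values present in an odd number of rows
build_agg = set(range(1, 101))                    # RB_BUILD_AGG over GENERATE_SERIES(1,100)

print(len(or_agg), sorted(and_agg), len(xor_agg), len(build_agg))
# 102 [5] 94 100
```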

Cardinality

SELECT RB_CARDINALITY(bitmap) FROM t1;

Bitmap to SETOF integer (offset list)

SELECT RB_ITERATE(bitmap) FROM t1 WHERE id = 1;
SELECT RB_ITERATE_DECREMENT(bitmap) FROM t1 WHERE id = 1;

Cast between roaringbitmap and bytea

SELECT RB_BUILD('{1,2,3}')::BYTEA;
SELECT '\x3a3000000100000000000000100000000100'::ROARINGBITMAP;
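The hex literal above looks consistent with the portable Roaring serialization format used by CRoaring, so it can be decoded by hand. The following Python sketch assumes that format and handles only the simplest case (no run containers, array containers only). The function name `decode_array_only` is made up for this illustration. Whether the extension always emits this layout is an assumption inferred from the example literal.

```python
import struct

# Cookie for the portable format without run containers (0x303A, little-endian).
SERIAL_COOKIE_NO_RUNCONTAINER = 12346

def decode_array_only(data):
    """Decode a serialized bitmap that contains only array containers.

    Hypothetical helper: bitset and run containers are rejected, not decoded.
    """
    cookie, n_containers = struct.unpack_from("<II", data, 0)
    if cookie != SERIAL_COOKIE_NO_RUNCONTAINER:
        raise ValueError("run containers are not handled in this sketch")
    pos = 8
    headers = []
    for _ in range(n_containers):
        # Each header holds the high 16 bits (key) and cardinality minus one.
        key, card_minus_1 = struct.unpack_from("<HH", data, pos)
        headers.append((key, card_minus_1 + 1))
        pos += 4
    # One 32-bit byte offset per container follows the headers.
    offsets = struct.unpack_from("<%dI" % n_containers, data, pos)
    values = []
    for (key, card), off in zip(headers, offsets):
        if card > 4096:
            raise ValueError("bitset containers are not handled in this sketch")
        lows = struct.unpack_from("<%dH" % card, data, off)
        values.extend((key << 16) | low for low in lows)
    return values

payload = bytes.fromhex("3a3000000100000000000000100000000100")
print(decode_array_only(payload))  # [1]
```

Read this way, the literal is an 18-byte bitmap with a single container holding the single value 1.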

Function List

| Function | Input | Output | Description | Example |
| --- | --- | --- | --- | --- |
| rb_build | integer[] | roaringbitmap | Build a roaringbitmap from an integer array. | `rb_build('{1,2,3,4,5}')` |
| rb_build | integer, integer, integer | roaringbitmap | Build a roaringbitmap from an integer range (with step). | `rb_build('{1,2,3,4,5}')` |
| rb_to_array | roaringbitmap | integer[] | Convert a roaringbitmap to an integer array. | `rb_to_array(rb_build('{1,2,3,4,5}'))` |
| rb_and | roaringbitmap, roaringbitmap | roaringbitmap | Compute the AND of two roaringbitmaps. | `rb_and(rb_build('{1,2,3}'),rb_build('{3,4,5}'))` |
| rb_or | roaringbitmap, roaringbitmap | roaringbitmap | Compute the OR of two roaringbitmaps. | `rb_or(rb_build('{1,2,3}'),rb_build('{3,4,5}'))` |
| rb_xor | roaringbitmap, roaringbitmap | roaringbitmap | Compute the XOR of two roaringbitmaps. | `rb_xor(rb_build('{1,2,3}'),rb_build('{3,4,5}'))` |
| rb_andnot | roaringbitmap, roaringbitmap | roaringbitmap | Compute the ANDNOT of two roaringbitmaps. | `rb_andnot(rb_build('{1,2,3}'),rb_build('{3,4,5}'))` |
| rb_cardinality | roaringbitmap | bigint | Return the cardinality of a roaringbitmap. | `rb_cardinality(rb_build('{1,2,3,4,5}'))` |
| rb_cardinality | roaringbitmap, integer, integer | bigint | Return the cardinality within an integer range. | `rb_cardinality(rb_build('{1,2,3,4,5}'),1,4)` |
| rb_cardinality | roaringbitmap, integer, integer, integer | bigint | Return the cardinality within an integer range with step. | `rb_cardinality(rb_build('{1,2,3,4,5}'),1,4,2)` |
| rb_cardinality | roaringbitmap, integer, integer, integer, integer, integer | bigint | Return the cardinality within an integer range with step, in an offset range. | `rb_cardinality(rb_build('{1,2,3,4,5}'),1,4,2,1,3)` |
| rb_cardinality | roaringbitmap, integer, integer, integer, integer[] | bigint | Return the cardinality within an integer range with step, in an offset array. | `rb_cardinality(rb_build('{1,2,3,4,5}'),1,4,2,'{1,3}')` |
| rb_and_cardinality | roaringbitmap, roaringbitmap | bigint | Compute the AND of two roaringbitmaps and return its cardinality. | `rb_and_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}'))` |
| rb_or_cardinality | roaringbitmap, roaringbitmap | bigint | Compute the OR of two roaringbitmaps and return its cardinality. | `rb_or_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}'))` |
| rb_xor_cardinality | roaringbitmap, roaringbitmap | bigint | Compute the XOR of two roaringbitmaps and return its cardinality. | `rb_xor_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}'))` |
| rb_andnot_cardinality | roaringbitmap, roaringbitmap | bigint | Compute the ANDNOT of two roaringbitmaps and return its cardinality. | `rb_andnot_cardinality(rb_build('{1,2,3}'),rb_build('{3,4,5}'))` |
| rb_is_empty | roaringbitmap | boolean | Check whether a roaringbitmap is empty. | `rb_is_empty(rb_build('{1,2,3,4,5}'))` |
| rb_equals | roaringbitmap, roaringbitmap | boolean | Check whether two roaringbitmaps are equal. | `rb_equals(rb_build('{1,2,3}'),rb_build('{3,4,5}'))` |
| rb_not_equals | roaringbitmap, roaringbitmap | boolean | Check whether two roaringbitmaps are not equal. | `rb_not_equals(rb_build('{1,2,3}'),rb_build('{3,4,5}'))` |
| rb_intersect | roaringbitmap, roaringbitmap | boolean | Check whether two roaringbitmaps intersect. | `rb_intersect(rb_build('{1,2,3}'),rb_build('{3,4,5}'))` |
| rb_contains | roaringbitmap, roaringbitmap | boolean | Check whether a roaringbitmap contains another one. | `rb_contains(rb_build('{1,2,3}'),rb_build('{3,4,5}'))` |
| rb_contains | roaringbitmap, integer | boolean | Check whether a roaringbitmap contains a specific offset. | `rb_contains(rb_build('{1,2,3}'),1)` |
| rb_contains | roaringbitmap, integer, integer | boolean | Check whether a roaringbitmap contains a specific range of offsets. | `rb_contains(rb_build('{1,2,3}'),1,3)` |
| rb_becontained | roaringbitmap, roaringbitmap | boolean | Check whether a roaringbitmap is contained by another one. | `rb_becontained(rb_build('{1,2,3}'),rb_build('{3,4,5}'))` |
| rb_becontained | integer, roaringbitmap | boolean | Check whether a specific offset is contained by a roaringbitmap. | `rb_becontained(1,rb_build('{3,4,5}'))` |
| rb_add | roaringbitmap, integer | roaringbitmap | Add a specific offset to a roaringbitmap. | `rb_add(rb_build('{1,2,3}'),3)` |
| rb_add | roaringbitmap, integer, integer | roaringbitmap | Add a specific range of offsets to a roaringbitmap. | `rb_add(rb_build('{1,2,3}'),3,4)` |
| rb_remove | roaringbitmap, integer | roaringbitmap | Remove a specific offset from a roaringbitmap. | `rb_remove(rb_build('{1,2,3}'),3)` |
| rb_remove | roaringbitmap, integer, integer | roaringbitmap | Remove a specific range of offsets from a roaringbitmap. | `rb_remove(rb_build('{1,2,3}'),2,3)` |
| rb_flip | roaringbitmap, integer | roaringbitmap | Flip a specific offset in a roaringbitmap. | `rb_flip(rb_build('{1,2,3}'),3)` |
| rb_flip | roaringbitmap, integer, integer | roaringbitmap | Flip a specific range of offsets in a roaringbitmap. | `rb_flip(rb_build('{1,2,3}'),2,3)` |
| rb_minimum | roaringbitmap | integer | Return the smallest offset in a roaringbitmap. Return -1 if the bitmap is empty. | `rb_minimum(rb_build('{1,2,3}'))` |
| rb_maximum | roaringbitmap | integer | Return the greatest offset in a roaringbitmap. Return 0 if the bitmap is empty. | `rb_maximum(rb_build('{1,2,3}'))` |
| rb_rank | roaringbitmap, integer | integer | Return the number of offsets that are smaller than or equal to a specific offset. | `rb_rank(rb_build('{1,2,3}'),3)` |
| rb_jaccard_index | roaringbitmap, roaringbitmap | float8 | Compute the Jaccard index between two bitmaps. | `rb_jaccard_index(rb_build('{1,2,3}'),rb_build('{1,2,3,4}'))` |
| rb_iterate | roaringbitmap | setof integer | Return the list of offsets in increasing order. | `rb_iterate(rb_build('{1,2,3}'))` |
| rb_iterate_decrement | roaringbitmap | setof integer | Return the list of offsets in decreasing order. | `rb_iterate_decrement(rb_build('{1,2,3}'))` |
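A few of the functions above are easier to picture with a small model. The sketch below uses Python sets to illustrate rb_minimum, rb_maximum, rb_rank and rb_jaccard_index as the list describes them, including the documented return values for an empty bitmap. The Python function names are illustrative only. The Jaccard index is taken in its usual sense, the size of the intersection divided by the size of the union, and the case of two empty bitmaps is left undefined here.

```python
from bisect import bisect_right

def minimum(bitmap):
    # Smallest offset; -1 for an empty bitmap, as the function list states.
    return min(bitmap) if bitmap else -1

def maximum(bitmap):
    # Greatest offset; 0 for an empty bitmap, as the function list states.
    return max(bitmap) if bitmap else 0

def rank(bitmap, offset):
    # Number of offsets that are smaller than or equal to `offset`.
    return bisect_right(sorted(bitmap), offset)

def jaccard_index(a, b):
    # |a AND b| / |a OR b|; raises ZeroDivisionError if both sets are empty.
    return len(a & b) / len(a | b)

print(minimum({1, 2, 3}), maximum({1, 2, 3}))   # 1 3
print(rank({1, 2, 3}, 3))                       # 3
print(jaccard_index({1, 2, 3}, {1, 2, 3, 4}))   # 0.75
```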

Operator List

| Operator | Left | Right | Output | Description | Example |
| --- | --- | --- | --- | --- | --- |
| `&` | roaringbitmap | roaringbitmap | roaringbitmap | Compute the AND of two roaringbitmaps. | `rb_build('{1,2,3}') & rb_build('{1,2,3}')` |
| `\|` | roaringbitmap | roaringbitmap | roaringbitmap | Compute the OR of two roaringbitmaps. | `rb_build('{1,2,3}') \| rb_build('{1,2,3}')` |
| `#` | roaringbitmap | roaringbitmap | roaringbitmap | Compute the XOR of two roaringbitmaps. | `rb_build('{1,2,3}') # rb_build('{1,2,3}')` |
| `~` | roaringbitmap | roaringbitmap | roaringbitmap | Compute the ANDNOT of two roaringbitmaps. | `rb_build('{1,2,3}') ~ rb_build('{1,2,3}')` |
| `+` | roaringbitmap | integer | roaringbitmap | Add a specific offset to a roaringbitmap. | `rb_build('{1,2,3}') + 4` |
| `+` | integer | roaringbitmap | roaringbitmap | Add a specific offset to a roaringbitmap. | `4 + rb_build('{1,2,3}')` |
| `-` | roaringbitmap | integer | roaringbitmap | Remove a specific offset from a roaringbitmap. | `rb_build('{1,2,3}') - 1` |
| `=` | roaringbitmap | roaringbitmap | boolean | Check whether two roaringbitmaps are equal. | `rb_build('{1,2,3}') = rb_build('{3,2,1}')` |
| `<>` | roaringbitmap | roaringbitmap | boolean | Check whether two roaringbitmaps are not equal. | `rb_build('{1,2,3}') <> rb_build('{3,2,1}')` |
| `&&` | roaringbitmap | roaringbitmap | boolean | Check whether two roaringbitmaps intersect. | `rb_build('{1,2,3}') && rb_build('{3,2,1}')` |
| `@>` | roaringbitmap | roaringbitmap | boolean | Check whether a roaringbitmap contains another one. | `rb_build('{1,2,3}') @> rb_build('{3,1}')` |
| `@>` | roaringbitmap | integer | boolean | Check whether a roaringbitmap contains a specific offset. | `rb_build('{1,2,3}') @> 1` |
| `<@` | roaringbitmap | roaringbitmap | boolean | Check whether a roaringbitmap is contained by another one. | `rb_build('{1,3}') <@ rb_build('{3,2,1}')` |
| `<@` | integer | roaringbitmap | boolean | Check whether a specific offset is contained by a roaringbitmap. | `1 <@ rb_build('{1,2,3}')` |
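As a rough guide to what the comparison and containment operators mean, the sketch below evaluates the example expressions from the operator list with Python sets standing in for roaringbitmap values. It models the documented semantics only and says nothing about how the extension evaluates them.

```python
# A set of integers stands in for a roaringbitmap value.
a = {1, 2, 3}

results = {
    "=":  a == {3, 2, 1},               # equal: element order does not matter
    "<>": a != {3, 2, 1},               # not equal
    "&&": not a.isdisjoint({3, 2, 1}),  # intersect: at least one shared offset
    "@> bitmap": a >= {3, 1},           # a contains every offset of the right side
    "@> integer": 1 in a,               # a contains the offset 1
    "<@ bitmap": {1, 3} <= {3, 2, 1},   # left side is contained by the right side
    "+": a | {4},                       # add offset 4
    "-": a - {1},                       # remove offset 1
}

for op, value in results.items():
    print(op, value)
```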

Aggregation List

| Aggregation | Input | Output | Description | Example |
| --- | --- | --- | --- | --- |
| rb_build_agg | integer | roaringbitmap | Build a roaringbitmap from a set of integers. | `rb_build_agg(1)` |
| rb_or_agg | roaringbitmap | roaringbitmap | OR aggregate over a set of roaringbitmaps. | `rb_or_agg(rb_build('{1,2,3}'))` |
| rb_and_agg | roaringbitmap | roaringbitmap | AND aggregate over a set of roaringbitmaps. | `rb_and_agg(rb_build('{1,2,3}'))` |
| rb_xor_agg | roaringbitmap | roaringbitmap | XOR aggregate over a set of roaringbitmaps. | `rb_xor_agg(rb_build('{1,2,3}'))` |
| rb_or_cardinality_agg | roaringbitmap | bigint | OR aggregate over a set of roaringbitmaps, returning the cardinality. | `rb_or_cardinality_agg(rb_build('{1,2,3}'))` |
| rb_and_cardinality_agg | roaringbitmap | bigint | AND aggregate over a set of roaringbitmaps, returning the cardinality. | `rb_and_cardinality_agg(rb_build('{1,2,3}'))` |
| rb_xor_cardinality_agg | roaringbitmap | bigint | XOR aggregate over a set of roaringbitmaps, returning the cardinality. | `rb_xor_cardinality_agg(rb_build('{1,2,3}'))` |