fallenpegasus: amazon (Default)
[personal profile] fallenpegasus
A couple of days ago, Jacob DeHart and Harper Reed pinged me with an interesting idea for a MySQL storage engine. A "write-only" engine into a Bloom Filter.

What a neat idea!


The bloom filter, and it's varients and improvements, is almost magic. You can insert items. You can tell if you have inserted an item in the past, You can tell if you havn't inserted an item in the past (probably). But you can't get a list of all the items inserted.

And the performance is interesting. The space used is fixed. Insert and lookup time is constant.


CREATE TABLE t (v VARCHAR(255)) MAX_ROWS=100 ENGINE=BLOOM;

Have to specify about how many elements it will hold.

It usually doesn't make sense to specify more than one column. The entire row will be concatinated together as the input into the hash functions.

It probably doesn't make sense to use any index description syntax, unless doing so will give clues and hints to the query planner (need to talk to BrianA about that). Maybe the thing to do is if the row is described to be a unique or primary key, build a plain one-bit bloom, and otherwise make it a counting varient.

CREATE TABLE t (v VARCHAR(255)) CONNECTION='1000,10,8' ENGINE=BLOOM;

Alternatively, could use the connection string. In this case, we are saying "use m=1000 bits, k=10 hash functions, 8 bit count bucket".

INSERT INTO t (v) VALUES (('one'), ('two'), ('three'), ('four'));

Now we can insert stuff into it.

INSERT INTO t (v) VALUES (('five'), ('five'), ('five'));

In a plain bloom, this is pointless, the value "five" gets stored once. In a counting one, on the other hand, this makes sense.

SELECT v FROM t;

This can't work. Will just return an empty set.

SELECT v FROM t WHERE v = 'one';

This will return the string "one".

SELECT v FROM t WHERE v = 'nine';

This will return the empty set. Probably. There is a small chance it will return the string "nine".

DELETE FROM t WHERE v = 'two';

This can't be done in a classic bloom, but would work with a counting one.

SELECT COUNT(*) from t;

This might be possible, in a counting bloom. Maybe.

TRUNCATE TABLE t;

Would instantly empty the bloom.

DROP TABLE t;

And, of course, can always drop it.

Profile

fallenpegasus: amazon (Default)
Mark Atwood

December 2022

S M T W T F S
    123
45678910
11121314151617
18192021222324
25262728293031

Most Popular Tags

Style Credit

Expand Cut Tags

No cut tags
Page generated Jun. 21st, 2025 02:44 am
Powered by Dreamwidth Studios