Learning Bloom Filters in Redis
枫灵小宇
1.Concepts
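A Bloom filter is a space-efficient probabilistic data structure. Its main purpose is to test whether an element belongs to a set while using very little memory, at the cost of occasional false positives. You can think of it as a not-quite-exact set: when you call its contains method to check whether an object exists, it may give a wrong answer. It is not wildly inaccurate, though. With sensible parameters the accuracy can be made good enough, leaving only a small false-positive probability. The two control parameters are error_rate (the false-positive rate) and initial_size (the initial capacity).
The smaller error_rate is, the more accurate the filter and the more space it needs.
The larger initial_size is, the more accurate the filter (again at the cost of more space); once the actual number of elements exceeds this value, the false-positive rate rises.
A Bloom filter can tell you that an item is definitely not present, but it cannot tell you that an item is definitely present.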
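To make the "definitely absent, only possibly present" behaviour concrete, here is a minimal in-memory sketch in Java. It is not how Redis or any particular library implements a Bloom filter: the class name SimpleBloomFilter, the seeded FNV-1a style hash, and the sizes 1024 and 3 are all assumptions chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Hypothetical in-memory Bloom filter, for illustration only. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int bitSize;
    private final int numHashFunctions;

    SimpleBloomFilter(int bitSize, int numHashFunctions) {
        if (bitSize <= 0 || numHashFunctions <= 0) {
            throw new IllegalArgumentException("bitSize and numHashFunctions must be positive");
        }
        this.bits = new BitSet(bitSize);
        this.bitSize = bitSize;
        this.numHashFunctions = numHashFunctions;
    }

    /** Sets every bit position derived from the value. */
    void add(String value) {
        for (int offset : offsets(value)) {
            bits.set(offset);
        }
    }

    /** Returns false if the value was never added; true means "possibly added". */
    boolean mightContain(String value) {
        for (int offset : offsets(value)) {
            if (!bits.get(offset)) {
                return false; // one unset bit proves the value was never added
            }
        }
        return true; // all bits set: either added, or a false positive
    }

    /** Derives numHashFunctions positions from two base hashes (double hashing). */
    private int[] offsets(String value) {
        byte[] data = value.getBytes(StandardCharsets.UTF_8);
        int hash1 = seededHash(data, 0x811C9DC5);
        int hash2 = seededHash(data, 0x9E3779B9); // arbitrary second seed (assumption)
        int[] result = new int[numHashFunctions];
        for (int i = 1; i <= numHashFunctions; i++) {
            result[i - 1] = Math.floorMod(hash1 + i * hash2, bitSize);
        }
        return result;
    }

    /** FNV-1a style hash with a caller-supplied seed; a stand-in for a real hash such as murmur3. */
    private static int seededHash(byte[] data, int seed) {
        int hash = seed;
        for (byte b : data) {
            hash ^= (b & 0xff);
            hash *= 0x01000193;
        }
        return hash;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1024, 3);
        filter.add("zhangsan");
        filter.add("lisi");
        System.out.println(filter.mightContain("lisi"));   // true: added values are never missed
        System.out.println(filter.mightContain("wangwu")); // false, unless a false positive occurs
    }
}
```

An added value always passes the check because add sets exactly the bits that mightContain later reads. A value that was never added passes only if other values happen to have set all of its bits, which is the false-positive case.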
2.Guava implementation
2.1.Dependency
<!-- Guava, used to implement the Bloom filter -->
<dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>19.0</version>
</dependency>
2.2.Initializing the Bloom filter
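// Initialize the Bloom filter and register it in the Spring container
@Bean
public MyBloomFilter<String> initBloomFilterHelper() {
    // The Funnel feeds the string's UTF-8 bytes into the hash function.
    // Sized for 1,000,000 expected insertions and a 1% false-positive rate.
    return new MyBloomFilter<>(
            (Funnel<String>) (from, into) -> into.putString(from, Charsets.UTF_8),
            1000000, 0.01);
}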
2.3.The Bloom filter
package com.qin.redis.bloomfilter;

import com.google.common.base.Preconditions;
import com.google.common.hash.Funnel;
import com.google.common.hash.Hashing;

/**
 * @version: V1.0.0
 * @className: MyBloomFilter
 */
public class MyBloomFilter<T> {

    private final int numHashFunctions;
    private final int bitSize;
    private final Funnel<T> funnel;

    public MyBloomFilter(Funnel<T> funnel, int expectedInsertions, double fpp) {
        Preconditions.checkArgument(funnel != null, "funnel must not be null");
        this.funnel = funnel;
        // Compute the length of the bit array
        bitSize = optimalNumOfBits(expectedInsertions, fpp);
        // Compute how many hash positions to derive per value
        numHashFunctions = optimalNumOfHashFunctions(expectedInsertions, bitSize);
    }

    /**
     * Returns the bit offsets for a value, derived from the two 32-bit halves
     * of one 64-bit murmur3 hash (double hashing).
     */
    public int[] murmurHashOffset(T value) {
        int[] offset = new int[numHashFunctions];

        long hash64 = Hashing.murmur3_128().hashObject(value, funnel).asLong();
        int hash1 = (int) hash64;
        int hash2 = (int) (hash64 >>> 32);
        for (int i = 1; i <= numHashFunctions; i++) {
            int nextHash = hash1 + i * hash2;
            if (nextHash < 0) {
                // Flip the bits so the offset is never negative
                nextHash = ~nextHash;
            }
            offset[i - 1] = nextHash % bitSize;
        }
        return offset;
    }

    /**
     * Computes the length of the bit array.
     */
    private int optimalNumOfBits(long n, double p) {
        if (p == 0) {
            // Avoid log(0) by falling back to the smallest positive double
            p = Double.MIN_VALUE;
        }
        return (int) (-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    /**
     * Computes the number of hash positions per value.
     */
    private static int optimalNumOfHashFunctions(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    public static void main(String[] args) {
        System.out.println(optimalNumOfHashFunctions(1000000000L, 123450000L));
    }
}
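The two sizing methods above follow the usual Bloom filter formulas m = -n ln(p) / (ln 2)^2 for the number of bits and k = (m / n) ln 2 for the number of hash positions. As a worked example, the sketch below plugs in the values used in section 2.2 (n = 1,000,000 and p = 0.01). The class name BloomFilterSizing and the method expectedFalsePositiveRate are hypothetical additions, not part of the original code; the latter uses the standard approximation (1 - e^(-kn/m))^k to show how the false-positive rate grows once more elements are inserted than the filter was sized for.

```java
/** Hypothetical helper that reproduces the sizing arithmetic, for illustration only. */
final class BloomFilterSizing {

    /** Bits needed for n expected insertions at false-positive rate p. */
    static long optimalNumOfBits(long n, double p) {
        return (long) (-n * Math.log(p) / (Math.log(2) * Math.log(2)));
    }

    /** Hash positions per value for n expected insertions and m bits. */
    static int optimalNumOfHashFunctions(long n, long m) {
        return Math.max(1, (int) Math.round((double) m / n * Math.log(2)));
    }

    /** Approximate false-positive rate after `inserted` values (standard estimate, an assumption here). */
    static double expectedFalsePositiveRate(long inserted, long m, int k) {
        return Math.pow(1 - Math.exp(-(double) k * inserted / m), k);
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        double p = 0.01;

        long m = optimalNumOfBits(n, p);
        int k = optimalNumOfHashFunctions(n, m);
        System.out.println("bits   = " + m); // 9585058, a little under 1.2 MB
        System.out.println("hashes = " + k); // 7

        // At the planned capacity the estimate stays close to the requested 0.01 ...
        System.out.println("rate at n  = " + expectedFalsePositiveRate(n, m, k));
        // ... and at twice the capacity it is well above it.
        System.out.println("rate at 2n = " + expectedFalsePositiveRate(2 * n, m, k));
    }
}
```

The bitmap therefore costs roughly 1.2 MB in Redis for one million expected entries at a 1% error rate, and each add or lookup touches 7 bits.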
2.4.Adding elements and checking membership
package com.qin.redis.bloomfilter.service;

import com.google.common.base.Preconditions;
import com.qin.redis.bloomfilter.MyBloomFilter;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.redis.core.RedisTemplate;
import org.springframework.stereotype.Service;

/**
 * @version: V1.0.0
 * @className: RedisBloomFilterService
 */
@Service
public class RedisBloomFilterService {

    @Autowired
    private RedisTemplate redisTemplate;

    /**
     * Adds a value using the given Bloom filter: sets one bit in the Redis
     * bitmap stored at key for each offset computed from the value.
     */
    public <T> void addByBloomFilter(MyBloomFilter<T> bloomFilterHelper, String key, T value) {
        Preconditions.checkArgument(bloomFilterHelper != null, "myBloomFilter must not be null");
        int[] offset = bloomFilterHelper.murmurHashOffset(value);
        for (int i : offset) {
            System.out.println("key : " + key + " " + "offset : " + i);
            redisTemplate.opsForValue().setBit(key, i, true);
        }
    }

    /**
     * Checks a value using the given Bloom filter: returns false as soon as
     * one of its bits is unset, true only if every bit is set.
     */
    public <T> boolean includeByBloomFilter(MyBloomFilter<T> bloomFilterHelper, String key, T value) {
        Preconditions.checkArgument(bloomFilterHelper != null, "myBloomFilter must not be null");
        int[] offset = bloomFilterHelper.murmurHashOffset(value);
        for (int i : offset) {
            System.out.println("key : " + key + " " + "offset : " + i);
            // Boolean.TRUE.equals guards against a null reply being unboxed
            if (!Boolean.TRUE.equals(redisTemplate.opsForValue().getBit(key, i))) {
                return false;
            }
        }
        return true;
    }
}
3.Redisson implementation
3.1.Dependency
<dependency>
    <groupId>org.redisson</groupId>
    <artifactId>redisson</artifactId>
    <version>2.7.0</version>
</dependency>
3.2.Injection and testing
// Single-server mode; cluster or sentinel mode can be configured instead
@Bean
public Redisson redisson() {
    Config config = new Config();
    config.useSingleServer().setAddress("redis://127.0.0.1:6379");
    RedissonClient redissonClient = Redisson.create(config);

    // Initialize the filter: 1,000,000 expected insertions, 5% false-positive rate
    RBloomFilter<Object> bloomFilter = redissonClient.getBloomFilter("testBloomFilter");
    bloomFilter.tryInit(1000000L, 0.05);

    // Insert elements
    bloomFilter.add("zhangsan");
    bloomFilter.add("lisi");

    // Check whether an element exists; an element that was added is always reported as present
    boolean flag = bloomFilter.contains("lisi");

    return (Redisson) redissonClient;
}