File name: simple-and-efficient-weighted-minwise-hashing
Description -- all downloadable content comes from the internet; please study and use it at your own discretion
Weighted minwise hashing (WMH) is one of the fundamental subroutines
required by many celebrated approximation algorithms, commonly
adopted in industrial practice for large-scale search and learning. The
resource bottleneck in WMH is the computation of multiple (typically a
few hundred to a few thousand) independent hashes of the data. We propose
a simple rejection-type sampling scheme based on a carefully designed
red-green map, for which we show that the number of rejected samples has
exactly the same distribution as weighted minwise sampling.
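
The abstract does not spell out the red-green map, so the following is only a minimal sketch of one way a rejection scheme of this kind could look, not the authors' implementation. It assumes each component i has a known upper bound, lays the components out as consecutive intervals on a line, treats the first weights[i] units of interval i as "green" and the rest as "red", and returns the number of uniform draws rejected before the first green hit. The names red_green_hash, upper_bounds, max_draws, weighted_jaccard and estimate_similarity are hypothetical and chosen for illustration.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def red_green_hash(weights, upper_bounds, seed, max_draws=100_000):
    """Count the draws rejected before the first draw lands in a green region.

    Assumes 0 <= weights[i] <= upper_bounds[i] and at least one positive weight.
    """
    # Component i owns the interval [starts[i], ends[i]) on a line of length `total`.
    ends = list(accumulate(upper_bounds))
    starts = [end - bound for end, bound in zip(ends, upper_bounds)]
    total = ends[-1]
    # The seed fixes the draw sequence, so every vector sees the same draws.
    rng = random.Random(seed)
    for rejected in range(max_draws):
        r = rng.uniform(0.0, total)
        # Locate the interval containing r; clamp in case r equals `total`.
        i = min(bisect_right(ends, r), len(ends) - 1)
        if r < starts[i] + weights[i]:  # green part of interval i
            return rejected
    raise RuntimeError("no green draw found; are all weights zero?")


def weighted_jaccard(x, y):
    """Exact weighted Jaccard similarity: sum of minima over sum of maxima."""
    return sum(map(min, x, y)) / sum(map(max, x, y))


def estimate_similarity(x, y, upper_bounds, num_hashes=500):
    """Fraction of seeds on which the two hash values collide."""
    matches = sum(
        red_green_hash(x, upper_bounds, seed) == red_green_hash(y, upper_bounds, seed)
        for seed in range(num_hashes)
    )
    return matches / num_hashes
```

Under these assumptions, two vectors collide on a given seed exactly when the first draw that falls in either vector's green region falls in both, so the collision rate from estimate_similarity should approach weighted_jaccard as num_hashes grows. Tighter upper bounds leave less red area and therefore fewer rejected draws per hash.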
(Generated automatically by the system; you can preview the contents before downloading)
Download file list
File name | Size | Last updated |
---|---|---|
AC1.simple-and-efficient-weighted-minwise-hashing.pdf | 508938 | 2018-02-07 |