Location
2311 Wireless Seminar Room
Event Description

Talk Title: “An Excursion in Probabilistic Hashing Techniques for Big Data”

Abstract: Large-scale machine learning and data mining applications
constantly deal with datasets at TB scale, and these datasets are
expected to reach PB levels soon. At this scale, conventional
algorithms fail, and simple data mining operations such as search,
learning, and clustering become challenging.

In this talk, I will introduce probabilistic hashing techniques for
large-scale search and learning. I will show how the old hashing
framework, originally meant for sub-linear search, can be converted
into fast learning algorithms. I will talk about our recent success in
constructing hash functions for dot products by making use of
asymmetry. Such a construction is not possible in the conventional
setting and was a known hard problem. I will further show the direct
consequence of hashing inner products in speeding up popular learning
algorithms. Later, I will discuss recent improvements to some
decade-old textbook hashing algorithms, including the fastest way of
performing minwise hashing in practice.

I will demonstrate the utility of the above techniques on various real
applications including search, learning, collaborative filtering and
our ongoing collaboration with HRDAG (Human Rights Data Analysis
Group) and NCRN (NSF-Census Research Network) in estimating death
counts in Syria since March 2011.

Bio: Anshumali Shrivastava has been a Ph.D. student in the computer
science department at Cornell University since 2010. His broad
research interests include large-scale machine learning, randomized
algorithms for big data systems, and graph mining. His research on
hashing inner products won the Best Paper Award at NIPS 2014, while
his work on representing graphs received the Best Paper Award at
IEEE/ACM ASONAM 2014. Before coming to Cornell, he worked as a
scientist at FICO (Fair Isaac Corp.) Research, Bangalore. Anshumali
received his bachelor's and master's degrees in mathematics and
computing from the Indian Institute of Technology (IIT) Kharagpur in
2008, where he was awarded the Institute Silver Medal for graduating
at the top of his class.

Hosted by Minh Hoai Nguyen

Event Title
Faculty Candidate: Anshumali Shrivastava from Cornell University