Jure Leskovec, Anand Rajaraman, Jeff Ullman
Stanford University

What the Book Is About

At the highest level of description, this book is about data mining.

Locality-sensitive hashing (LSH) can be combined with MinHash to achieve sub-linear query cost, a large improvement over comparing a query against every item in the collection.

Week 1: MapReduce; Link Analysis -- PageRank
Week 2: Locality-Sensitive Hashing -- Basics + Applications; Distance Measures; Nearest Neighbors; Frequent Itemsets
Week 3: Data Stream Mining; Analysis of Large Graphs
Week 4: Recommender Systems; Dimensionality Reduction
Week 5: Clustering; Computational Advertising
Week 6: Support-Vector Machines; Decision Trees; MapReduce Algorithms
Week 7: More About Link Analysis -- Topic-specific PageRank, Link Spam

Mining of Massive Datasets using Locality Sensitive Hashing (LSH). The popularity of the Web and Internet commerce provides many extremely large datasets from which information can be gleaned by data mining. The details of the algorithm can be found in Chapter 3 of Mining of Massive Datasets.

The course will discuss data mining and machine learning algorithms for analyzing very large amounts of data, including algorithms for clustering very large, high-dimensional datasets. The emphasis will be on MapReduce and Spark as tools for creating parallel algorithms that can process very large amounts of data.
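
The combination of MinHash and LSH mentioned above can be illustrated with a minimal Python sketch. It is not the book's code. The shingle length, the hash family `(a*x + b) mod PRIME`, the use of `zlib.crc32` as a stable string hash, and all function names (`shingles`, `make_hash_params`, `minhash_signature`, `build_index`, `query`) are assumptions chosen for illustration. The sketch assumes each signature has at least `bands * rows` entries.

```python
import random
import zlib
from collections import defaultdict

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1, used as the hash modulus (an assumption)


def shingles(text, k=5):
    """Return the set of character k-shingles of text (the whole text if shorter than k)."""
    if len(text) < k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def make_hash_params(num_hashes, seed=0):
    """Draw (a, b) pairs for hash functions h(x) = (a*x + b) mod PRIME."""
    rng = random.Random(seed)
    return [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
            for _ in range(num_hashes)]


def minhash_signature(items, params):
    """For each hash function, keep the minimum hash value over the (non-empty) set."""
    # Built-in hash() for str varies between processes, so use a stable CRC32 instead.
    ids = [zlib.crc32(s.encode("utf-8")) for s in items]
    return [min((a * x + b) % PRIME for x in ids) for a, b in params]


def estimate_similarity(sig_a, sig_b):
    """Fraction of agreeing signature positions; this estimates Jaccard similarity."""
    matches = sum(1 for x, y in zip(sig_a, sig_b) if x == y)
    return matches / len(sig_a)


def build_index(signatures, bands, rows):
    """Banding: one hash table per band, keyed by that band's slice of the signature."""
    index = [defaultdict(set) for _ in range(bands)]
    for doc_id, sig in signatures.items():
        for b in range(bands):
            key = tuple(sig[b * rows:(b + 1) * rows])
            index[b][key].add(doc_id)
    return index


def query(index, sig, rows):
    """Return ids sharing at least one band bucket with sig, without scanning all items."""
    candidates = set()
    for b, table in enumerate(index):
        key = tuple(sig[b * rows:(b + 1) * rows])
        candidates |= table.get(key, set())
    return candidates


if __name__ == "__main__":
    params = make_hash_params(num_hashes=20, seed=42)  # 20 = 5 bands * 4 rows
    docs = {
        "d1": "mining of massive datasets",
        "d2": "mining of massive data sets",
        "d3": "support-vector machines and decision trees",
    }
    sigs = {name: minhash_signature(shingles(text), params)
            for name, text in docs.items()}
    index = build_index(sigs, bands=5, rows=4)
    print(query(index, sigs["d1"], rows=4))
    print(estimate_similarity(sigs["d1"], sigs["d2"]))
```

A query only touches the buckets its own bands hash to, which is where the sub-linear cost comes from. Candidates returned by `query` would normally be checked against the true or estimated similarity afterwards, since banding can return false positives and miss some similar pairs.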