Locality Sensitive Hashing for Big Data

Funding: 2017: $115,000
2018: $112,000
2019: $120,000

Project Member(s): Zhang, Y.

Funding or Partner Organisation: Australian Research Council (ARC Discovery Projects)

Start year: 2017

Summary: Locality sensitive hashing (LSH) is one of the most widely adopted methods for answering similarity queries and has had a significant impact on many fields of computer science and on diverse applications. This project will address key challenges in applying LSH to Big Data, namely the need to handle new similarity functions, large data volumes, and the demand for better efficiency. New theories, methodologies and prototypes will be created to address these challenges. The project will provide frontier technology to existing applications that combat crime in the cybersecurity space, and open up a host of novel possibilities for more intelligent, real-time analysis of Big Data.

Publications:

Wang, X, Qin, L, Lin, X, Zhang, Y & Chang, L 2019, 'Leveraging set relations in exact and dynamic set similarity join', VLDB J., vol. 28, no. 2, pp. 267-292.

Wang, X, Qin, L, Lin, X, Zhang, Y & Chang, L 2017, 'Leveraging Set Relations in Exact Set Similarity Join', Proc. VLDB Endow., vol. 10, no. 9, pp. 925-936.

Wang, X, Zhang, Y, Zhang, W & Lin, X 2017, 'Efficient Distance-Aware Influence Maximization in Geo-Social Networks', IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 3, pp. 599-612.

Wang, X, Zhang, Y, Zhang, W, Lin, X & Chen, C 2017, 'Bring Order into the Samples: A Novel Scalable Method for Influence Maximization', IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 2, pp. 243-256.

Yang, J, Zhang, W, Yang, S, Zhang, Y & Lin, X 2017, 'TT-Join: Efficient Set Containment Join', 2017 IEEE 33rd International Conference on Data Engineering (ICDE), IEEE, San Diego, CA, USA, pp. 509-520.

Keywords: Locality sensitive hashing, Big Data, Similarity Queries

FOR Codes: Database Management, Information Processing Services (incl. Data Entry and Capture), Application Software Packages (excl. Computer Games), Database systems, Application software packages, Information systems, technologies and services not elsewhere classified