Locality Sensitive Hashing for Big Data
Funding: 2017: $115,000; 2018: $112,000; 2019: $120,000
Project Member(s): Zhang, Y.
Funding or Partner Organisation: Australian Research Council (ARC Discovery Projects)
Start year: 2017
Summary: Locality sensitive hashing (LSH) is one of the most widely adopted methods for answering similarity queries and has had a significant impact in many fields of computer science and in diverse applications. This project will address key challenges that arise when LSH is applied to Big Data, namely the need to support new similarity functions, handle large data volumes, and achieve better efficiency. New theories, methodologies and prototypes will be created to address these challenges. The project will provide frontier technology to existing applications for combating crime in the cybersecurity space, and open up a host of novel possibilities for more intelligent, real-time analysis of Big Data.
Publications:
Wang, X, Qin, L, Lin, X, Zhang, Y & Chang, L 2019, 'Leveraging set relations in exact and dynamic set similarity join', VLDB J., vol. 28, no. 2, pp. 267-292.
Wang, X, Qin, L, Lin, X, Zhang, Y & Chang, L 2017, 'Leveraging Set Relations in Exact Set Similarity Join', Proc. VLDB Endow., vol. 10, no. 9, pp. 925-936.
Wang, X, Zhang, Y, Zhang, W & Lin, X 2017, 'Efficient Distance-Aware Influence Maximization in Geo-Social Networks', IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 3, pp. 599-612.
Wang, X, Zhang, Y, Zhang, W, Lin, X & Chen, C 2017, 'Bring Order into the Samples: A Novel Scalable Method for Influence Maximization', IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 2, pp. 243-256.
Yang, J, Zhang, W, Yang, S, Zhang, Y & Lin, X 2017, 'TT-Join: Efficient Set Containment Join', 2017 IEEE 33rd International Conference on Data Engineering (ICDE), IEEE, San Diego, CA, USA, pp. 509-520.
Keywords: Locality sensitive hashing, Big Data, Similarity Queries
FOR Codes: Database Management, Information Processing Services (incl. Data Entry and Capture), Application Software Packages (excl. Computer Games), Database systems, Application software packages, Information systems, technologies and services not elsewhere classified