Big Data Machines: Internet-Scale Machine Learning Techniques to Combat the Curse of Big Data

Funding: 2013: $82,929
2014: $158,904
2015: $153,404
2016: $153,404
2017: $75,975

Project Member(s): Tsang, W.

Funding or Partner Organisation: Australian Research Council (ARC Future Fellowships)

Start year: 2014

Summary: The advent of 'big data' in business, government, science, social networks, the Internet, etc. creates opportunities in business and commercial domains. It also raises the challenges of increasing volume, variety, dimensionality and number of categories in open-domain big data applications. This project will address these challenges by developing novel machine learning techniques, including theoretical foundations, a big data machine learning framework and an open source website. The outcomes will provide frontier technologies for big data analysis that will have social and economic impact in such areas as social media computing, bioinformatics and business intelligence, and enhance Australia's global position in the pattern recognition and data mining communities.

Publications:

Shang, F, Zhou, K, Liu, H, Cheng, J, Tsang, IW, Zhang, L, Tao, D & Jiao, L 2020, 'VR-SGD: A Simple Stochastic Variance Reduction Method for Machine Learning', IEEE Transactions on Knowledge and Data Engineering, vol. 32, no. 1, pp. 188-202.

Yan, Y, Tan, M, Tsang, IW, Yang, Y, Shi, Q & Zhang, C 2020, 'Fast and Low Memory Cost Matrix Factorization: Algorithm, Analysis, and Case Study', IEEE Transactions on Knowledge and Data Engineering, vol. 32, no. 2, pp. 288-301.

Abidi, S, Piccardi, M, Tsang, IW & Williams, M-A 2019, 'Well-M$^3$N: A Maximum-Margin Approach to Unsupervised Structured Prediction', IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 3, no. 6, pp. 427-439.

Deng, W-Y, Lendasse, A, Ong, Y-S, Tsang, IW-H, Chen, L & Zheng, Q-H 2019, 'Domain Adaption via Feature Selection on Explicit Feature Map', IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 4, pp. 1180-1190.

Dong, X, Yan, Y, Tan, M, Yang, Y & Tsang, IW 2019, 'Late Fusion via Subspace Search With Consistency Preservation', IEEE Transactions on Image Processing, vol. 28, no. 1, pp. 518-528.

Han, B, Yao, Q, Pan, Y, Tsang, IW, Xiao, X, Yang, Q & Sugiyama, M 2019, 'Millionaire: a hint-guided approach for crowdsourcing', Machine Learning, vol. 108, no. 5, pp. 831-858.

Huang, S, Kang, Z, Tsang, IW & Xu, Z 2019, 'Auto-weighted multi-view clustering via kernelized graph learning', Pattern Recognition, vol. 88, pp. 174-184.

Liu, W, Shen, X, Du, B, Tsang, IW, Zhang, W & Lin, X 2019, 'Hyperspectral Imagery Classification via Stochastic HHSVMs', IEEE Transactions on Image Processing, vol. 28, no. 2, pp. 577-588.

Yao, J, Wang, J, Tsang, IW, Zhang, Y, Sun, J, Zhang, C & Zhang, R 2019, 'Deep Learning From Noisy Image Labels With Quality Embedding', IEEE Transactions on Image Processing, vol. 28, no. 4, pp. 1909-1922.

Zhou, JT, Pan, SJ & Tsang, IW 2019, 'A deep learning framework for Hybrid Heterogeneous Transfer Learning', Artificial Intelligence, vol. 275, pp. 310-328.

Zhou, JT, Tsang, IW, Ho, S-S & Müller, K-R 2019, 'N-ary decomposition for multi-class classification', Machine Learning, vol. 108, no. 5, pp. 809-830.

Zhou, JT, Tsang, IW, Pan, SJ & Tan, M 2019, 'Multi-class Heterogeneous Domain Adaptation', Journal of Machine Learning Research, vol. 20, no. 57, pp. 1-31.

Shi, Y, Xu, D, Pan, Y, Tsang, IW & Pan, S 2019, 'Label Embedding with Partial Heterogeneous Contexts', Proceedings of the AAAI Conference on Artificial Intelligence, 33rd AAAI Conference on Artificial Intelligence / 31st Innovative Applications of Artificial Intelligence Conference / 9th AAAI Symposium on Educational Advances in Artificial Intelligence, Association for the Advancement of Artificial Intelligence (AAAI), Honolulu, HI, pp. 4926-4933.

Yu, X, Han, B, Yao, J, Niu, G, Tsang, IW & Sugiyama, M 2019, 'How does disagreement help generalization against label corruption?', 36th International Conference on Machine Learning, ICML 2019, International Conference on Machine Learning, Curran, Long Beach Convention Center, Long Beach, pp. 12407-12417.

Guntuku, SC, Zhou, JT, Roy, S, Lin, W & Tsang, IW 2018, '‘Who Likes What and, Why?’ Insights into Modeling Users’ Personality Based on Image ‘Likes’', IEEE Transactions on Affective Computing, vol. 9, no. 1, pp. 130-143.

Han, B, Pan, Y & Tsang, IW 2018, 'Robust Plackett–Luce model for k-ary crowdsourced preferences', Machine Learning, vol. 107, no. 4, pp. 675-702.

Han, B, Tsang, IW, Chen, L, Yu, CP & Fung, S-F 2018, 'Progressive Stochastic Learning for Noisy Labels', IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 10, pp. 5136-5148.

Pan, Y, Han, B & Tsang, IW 2018, 'Stagewise learning for noisy k-ary preferences', Machine Learning, vol. 107, no. 8-10, pp. 1333-1361.

Xu, D, Tsang, I & Zhang, Y 2018, 'Online Product Quantization', IEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 11, pp. 1-1.

Han, B, Yao, J, Niu, G, Zhou, M, Tsang, IW, Zhang, Y & Sugiyama, M 2018, 'Masking: A new perspective of noisy supervision', Advances in Neural Information Processing Systems, Annual Conference on Neural Information Processing Systems, Montréal, Canada, pp. 5836-5846.

Han, B, Yao, Q, Yu, X, Niu, G, Xu, M, Hu, W, Tsang, IW & Sugiyama, M 2018, 'Co-teaching: Robust training of deep neural networks with extremely noisy labels', Advances in Neural Information Processing Systems, International Workshop on Symbolic-Neural Learning, Toyota Technological Institute at Chicago, Nagoya, Japan, pp. 8527-8537.

Li, M, Zhang, Y, Sun, Y, Wang, W, Tsang, IW & Lin, X 2018, 'An Efficient Exact Nearest Neighbor Search by Compounded Embedding', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), International Conference on Database Systems for Advanced Applications, Springer International Publishing, Gold Coast, QLD, Australia, pp. 37-54.

Lian, D, Zheng, K, Zheng, VW, Ge, Y, Cao, L, Tsang, IW & Xie, X 2018, 'High-order Proximity Preserving Information Network Hashing', Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD '18: The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, London, United Kingdom, pp. 1744-1753.

Liu, C, Chen, L, Tsang, I & Yin, H 2018, 'Towards the Learning of Weighted Multi-label Associative Classifiers', 2018 International Joint Conference on Neural Networks (IJCNN), 2018 International Joint Conference on Neural Networks (IJCNN), IEEE, Rio de Janeiro, Brazil, pp. 1-7.

Liu, W, Liu, Z, Tsang, I, Zhang, W & Lin, X 2018, 'Doubly Approximate Nearest Neighbor Classification', Proceedings of the AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence (AAAI), New Orleans, USA, pp. 3683-3690.

Shen, X, Liu, W, Luo, Y, Ong, Y-S & Tsang, IW 2018, 'Deep Discrete Prototype Multilabel Learning', Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence, Twenty-Seventh International Joint Conference on Artificial Intelligence IJCAI-18, International Joint Conferences on Artificial Intelligence Organization, Stockholm, Sweden, pp. 2675-2681.

Shen, X, Liu, W, Tsang, I, Sun, Q-S & Ong, Y-S 2018, 'Compact Multi-Label Learning', Proceedings of the AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence (AAAI), New Orleans, USA, pp. 4066-4073.

Zhang, Y, Wang, H, Lian, D, Tsang, IW, Yin, H & Yang, G 2018, 'Discrete Ranking-based Matrix Factorization with Self-Paced Learning', Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, KDD '18: The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, London, United Kingdom, pp. 2758-2767.

Zhou, JT, Di, K, Du, J, Peng, X, Yang, H, Pan, SJ, Tsang, I, Liu, Y, Qin, Z & Goh, RSM 2018, 'SC2Net: Sparse LSTMs for Sparse Coding', Proceedings of the AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence (AAAI), New Orleans, USA, pp. 4588-4595.

Liu, W, Tsang, IW & Müller, KR 2017, 'An easy-to-hard learning paradigm for multiple classes and multiple labels', Journal of Machine Learning Research, vol. 18, no. 94, pp. 1-38.

Ma, C, Tsang, IW, Peng, F & Liu, C 2017, 'Partial Hash Update via Hamming Subspace Learning', IEEE Transactions on Image Processing, vol. 26, no. 4, pp. 1939-1951.

Mao, Q, Wang, L & Tsang, IW 2017, 'A unified probabilistic framework for robust manifold learning and embedding', Machine Learning, vol. 106, no. 5, pp. 627-650.

Mao, Q, Wang, L, Tsang, IW & Sun, Y 2017, 'Principal Graph and Structure Learning Based on Reversed Graph Embedding', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 11, pp. 2227-2241.

Wang, JJ-Y, Tsang, IW-H, Cui, X, Lu, Z & Gao, X 2017, 'Multi-instance dictionary learning via multivariate performance measure optimization', Pattern Recognition, vol. 66, pp. 448-459.

Chai, J, Liu, W, Tsang, IW & Shen, X 2017, 'Compact Multiple-Instance Learning', Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, CIKM '17: ACM Conference on Information and Knowledge Management, ACM, Singapore, Singapore, pp. 2007-2010.

Liu, W, Shen, X & Tsang, IW 2017, 'Sparse embedded k-means clustering', Advances in Neural Information Processing Systems, Thirty-first Annual Conference on Neural Information Processing Systems, Long Beach, California, USA, pp. 3320-3328.

Guntuku, SC, Zhou, JT, Roy, S, Lin, W & Tsang, IW 2016, 'Understanding Deep Representations Learned in Modeling Users Likes', IEEE Transactions on Image Processing, vol. 25, no. 8, pp. 3762-3774.

Xu, X, Li, W, Xu, D & Tsang, IW 2016, 'Co-Labeling for Multi-View Weakly Labeled Learning', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 6, pp. 1113-1125.

Zhai, Y, Ong, Y-S & Tsang, IW 2016, 'Making Trillion Correlations Feasible in Feature Grouping and Selection', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 12, pp. 2472-2486.

Han, B, Tsang, IW & Chen, L 2016, 'On the Convergence of a Family of Robust Losses for Stochastic Gradient Descent', Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML-PKDD), Springer International Publishing, Riva del Garda, Italy, pp. 665-680.

Liu, W & Tsang, I 2016, 'Sparse Perceptron Decision Tree for Millions of Dimensions', Proceedings of the AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence (AAAI), Phoenix, Arizona USA, pp. 1881-1887.

Tan, M, Yan, Y, Wang, L, Hengel, AVD, Tsang, IW & Shi, QJ 2016, 'Learning Sparse Confidence-Weighted Classifier on Very High Dimensional Data', Proceedings of the AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence (AAAI), Phoenix, USA, pp. 2080-2086.

Wang, JJ-Y, Tsang, I & Gao, X 2016, 'Optimizing Multivariate Performance Measures from Multi-View Data', Proceedings of the AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence (AAAI), Phoenix, Arizona USA, pp. 2152-2158.

Yan, Y, Xu, Z, Tsang, I, Long, G & Yang, Y 2016, 'Robust Semi-Supervised Learning through Label Aggregation', Proceedings of the AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence (AAAI), Phoenix, USA, pp. 2244-2250.

Zhou, J, Pan, S, Tsang, I & Ho, S-S 2016, 'Transfer Learning for Cross-Language Text Categorization through Active Correspondences Construction', Proceedings of the AAAI Conference on Artificial Intelligence, AAAI Conference on Artificial Intelligence, Association for the Advancement of Artificial Intelligence (AAAI), Phoenix, Arizona USA, pp. 2400-2406.

Zhou, JT, Xu, X, Pan, SJ, Tsang, IW, Qin, Z & Goh, RSM 2016, 'Transfer hashing with privileged information', IJCAI International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, AAAI, New York, USA, pp. 2414-2420.

Chen, M, Tsang, IW, Tan, M & Cham, TJ 2015, 'A Unified Feature Selection Framework for Graph Embedding on High Dimensional Data', IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 6, pp. 1465-1477.

Mao, Q, Tsang, IW, Gao, S & Wang, L 2015, 'Generalized Multiple Kernel Learning With Data-Dependent Priors', IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 6, pp. 1134-1148.

Tan, M, Tsang, IW & Wang, L 2015, 'Matching Pursuit LASSO Part I: Sparse Recovery Over Big Dictionary', IEEE Transactions on Signal Processing, vol. 63, no. 3, pp. 727-741.

Tan, M, Tsang, IW & Wang, L 2015, 'Matching Pursuit LASSO Part II: Applications and Sparse Recovery Over Batch Signals', IEEE Transactions on Signal Processing, vol. 63, no. 3, pp. 742-753.

Tan, M, Tsang, IW & Wang, L 2014, 'Towards ultrahigh dimensional feature selection for big data', Journal of Machine Learning Research, vol. 15, pp. 1371-1429.

Zhai, Y, Ong, Y-S & Tsang, IW 2014, 'The Emerging "Big Dimensionality"', IEEE Computational Intelligence Magazine, vol. 9, no. 3, pp. 14-26.

Keywords: Machine Learning, Data Mining, Pattern Recognition

FOR Codes: Pattern Recognition and Data Mining, Expanding Knowledge in the Information and Computing Sciences, Application Software Packages (excl. Computer Games), Information Processing Services (incl. Data Entry and Capture), Machine learning, Information systems, technologies and services not elsewhere classified, Application software packages