Big Data Machines: Internet-Scale Machine Learning Techniques to Combat the Curse of Big Data

Funding: 2013: $82,929
2014: $158,904
2015: $153,404
2016: $153,404
2017: $75,975

Project Member(s): Tsang, W.

Funding or Partner Organisation: Australian Research Council (ARC Future Fellowships)

Start year: 2014

Summary: The advent of 'big data' in business, government, science, social networks, the Internet, etc. creates opportunity in business and commercial domains. Big data also raises issues of increasing volume, variety, dimensionality and categories in open domain big data applications, which this project will solve by developing novel machine learning techniques, including theoretical foundations, a big data machine learning framework and open source website. The outcomes will provide frontier technologies for big data analysis that will have social and economic impact in such areas as social media computing, bioinformatics and business intelligence, and enhance Australia's global position in the pattern recognition and data mining communities.

Publications:

Yan, Y, Tan, M, Tsang, I, Yang, Y, Shi, Q & Zhang, C 2020, 'Fast and Low Memory Cost Matrix Factorization: Algorithm, Analysis and Case Study', IEEE Transactions on Knowledge and Data Engineering.

Abidi, S, Piccardi, M, Tsang, WH & Williams, M-A 2019, 'Well-M³N: A Maximum-Margin Approach to Unsupervised Structured Prediction', IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 3, no. 6, pp. 427-439.

Deng, W-Y, Lendasse, A, Ong, Y-S, Tsang, W, Chen, L & Zheng, Q-H 2019, 'Domain Adaption via Feature Selection on Explicit Feature Map', IEEE Transactions on Neural Networks and Learning Systems, vol. 30, no. 4, pp. 1180-1190.

Dong, X, Yan, Y, Tan, M, Yang, Y & Tsang, WHI 2019, 'Late Fusion via Subspace Search With Consistency Preservation', IEEE Transactions on Image Processing, vol. 28, no. 1, pp. 518-528.

Han, B, Yao, Q, Pan, Y, Tsang, IW, Xiao, X, Yang, Q & Sugiyama, M 2019, 'Millionaire: a hint-guided approach for crowdsourcing', Machine Learning, vol. 108, pp. 831-858.

Huang, S, Kang, Z, Tsang, IW & Xu, Z 2019, 'Auto-weighted multi-view clustering via kernelized graph learning', Pattern Recognition, vol. 88, pp. 174-184.

Liu, W, Shen, X, Du, B, Tsang, WHI, Zhang, W & Lin, X 2019, 'Hyperspectral Imagery Classification via Stochastic HHSVMs', IEEE Transactions on Image Processing, vol. 28, no. 2, pp. 577-588.

Yao, J, Wang, J, Tsang, W, Zhang, Y, Sun, J, Zhang, C & Zhang, R 2019, 'Deep Learning from Noisy Image Labels with Quality Embedding', IEEE Transactions on Image Processing, vol. 28, no. 4, pp. 1909-1922.

Zhou, JT, Pan, SJ & Tsang, IW 2019, 'A deep learning framework for Hybrid Heterogeneous Transfer Learning', Artificial Intelligence, vol. 275, pp. 310-328.

Zhou, JT, Tsang, IW, Ho, SS & Müller, KR 2019, 'N-ary decomposition for multi-class classification', Machine Learning, vol. 108, no. 5, pp. 809-830.

Zhou, JT, Tsang, IW, Pan, SJ & Tan, M 2019, 'Multi-class Heterogeneous Domain Adaptation', Journal of Machine Learning Research, vol. 20, no. 57, pp. 1-31.

Shi, Y, Xu, D, Pan, Y, Tsang, IW & Pan, S 2019, 'Label Embedding with Partial Heterogeneous Contexts', Thirty-Third AAAI Conference on Artificial Intelligence / Thirty-First Innovative Applications of Artificial Intelligence Conference / Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, 33rd AAAI Conference on Artificial Intelligence / 31st Innovative Applications of Artificial Intelligence Conference / 9th AAAI Symposium on Educational Advances in Artificial Intelligence, Association for the Advancement of Artificial Intelligence, Honolulu, HI, pp. 4926-4933.

Yu, X, Han, B, Yao, J, Niu, G, Tsang, I & Sugiyama, M 2019, 'How does Disagreement Help Generalization Against Label Corruption', International Conference on Machine Learning, Curran, Long Beach Convention Center, Long Beach, pp. 12407-12417.

Han, B, Tsang, IW, Chen, L, Yu, CP & Fung, S-F 2018, 'Progressive Stochastic Learning for Noisy Labels', IEEE Transactions on Neural Networks and Learning Systems, vol. 29, no. 10, pp. 5136-5148.

Guntuku, S, Zhou, J, Roy, S, Lin, W & Tsang, W 2018, 'Who likes What, and Why? Insights into Modeling Users' Personality based on Image 'Likes'', IEEE Transactions on Affective Computing, vol. 9, no. 1, pp. 130-130.

Han, B, Pan, Y & Tsang, IW 2018, 'Robust Plackett–Luce model for k-ary crowdsourced preferences', Machine Learning, vol. 107, no. 4, pp. 675-702.

Pan, Y, Han, B & Tsang, IW 2018, 'Stagewise learning for noisy k-ary preferences', Machine Learning, vol. 107, no. 8-10, pp. 1333-1361.

Shang, F, Zhou, K, Liu, H, Cheng, J, Tsang, I, Zhang, L, Tao, D & Jiao, L 2018, 'VR-SGD: A Simple Stochastic Variance Reduction Method for Machine Learning', IEEE Transactions on Knowledge and Data Engineering, vol. 2018.

Xu, D, Tsang, IW & Zhang, Y 2018, 'Online Product Quantization', IEEE Transactions on Knowledge and Data Engineering, vol. 30, no. 11, pp. 2185-2198.

Han, B, Yao, J, Niu, G, Zhou, M, Tsang, IW, Zhang, Y & Sugiyama, M 2018, 'Masking: A new perspective of noisy supervision', Advances in Neural Information Processing Systems, Annual Conference on Neural Information Processing Systems, Montréal, Canada, pp. 5836-5846.

Han, B, Yao, Q, Yu, X, Niu, G, Xu, M, Hu, W, Tsang, IW & Sugiyama, M 2018, 'Co-teaching: Robust training of deep neural networks with extremely noisy labels', Advances in Neural Information Processing Systems, International Workshop on Symbolic-Neural Learning, Toyota Technological Institute at Chicago, Nagoya, Japan, pp. 8527-8537.

Li, M, Zhang, Y, Sun, Y, Wang, W, Tsang, IW & Lin, X 2018, 'An efficient exact nearest neighbor search by compounded embedding', Database Systems for Advanced Applications (LNCS), International Conference on Database Systems for Advanced Applications, Springer, Gold Coast, QLD, Australia, pp. 37-54.

Lian, D, Zheng, K, Cao, L, Zheng, VW, Tsang, IW, Ge, Y & Xie, X 2018, 'High-order proximity preserving information network hashing', Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, International Conference on Knowledge Discovery & Data Mining, ACM, London, United Kingdom, pp. 1744-1753.

Liu, C, Chen, L, Tsang, I & Yin, H 2018, 'Towards the Learning of Weighted Multi-label Associative Classifiers', Proceedings of the International Joint Conference on Neural Networks, International Joint Conference on Neural Networks, IEEE, Rio de Janeiro, Brazil, pp. 1-7.

Liu, W, Liu, Z, Tsang, IW, Zhang, W & Lin, X 2018, 'Doubly approximate nearest neighbor classification', 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, AAAI Conference on Artificial Intelligence, AAAI, New Orleans, USA, pp. 3683-3690.

Shen, X, Liu, W, Luo, Y, Ong, YS & Tsang, IW 2018, 'Deep binary prototype multi-label learning', IJCAI International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp. 2675-2681.

Shen, X, Liu, W, Tsang, IW, Sun, QS & Ong, YS 2018, 'Compact multi-label learning', 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, AAAI Conference on Artificial Intelligence, AAAI, New Orleans, USA, pp. 4066-4073.

Zhang, Y, Tsang, IW, Wang, H, Yin, H, Lian, D & Yang, G 2018, 'Discrete ranking-based matrix factorization with self-paced learning', Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ACM, London, United Kingdom, pp. 2758-2767.

Zhou, JT, Di, K, Du, J, Peng, X, Yang, H, Pan, SJ, Tsang, IW, Liu, Y, Qin, Z & Goh, RSM 2018, 'Sc2Net: Sparse LSTMs for sparse coding', 32nd AAAI Conference on Artificial Intelligence, AAAI 2018, AAAI Conference on Artificial Intelligence, AAAI, New Orleans, USA, pp. 4588-4595.

Liu, W, Tsang, W & Müller, K 2017, 'An Easy-to-hard Learning Paradigm for Multiple Classes and Multiple Labels', Journal of Machine Learning Research, vol. 18, no. 94, pp. 1-38.

Ma, C, Tsang, IW, Peng, F & Liu, C 2017, 'Partial Hash Update via Hamming Subspace Learning', IEEE Transactions on Image Processing, vol. 26, no. 4, pp. 1939-1951.

Mao, Q, Wang, L & Tsang, IW 2017, 'A unified probabilistic framework for robust manifold learning and embedding', Machine Learning, vol. 106, no. 5, pp. 627-650.

Mao, Q, Wang, L, Tsang, I & Sun, Y 2017, 'Principal Graph and Structure Learning Based on Reversed Graph Embedding', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 11, pp. 2227-2241.

Wang, JJY, Tsang, IWH, Cui, X, Lu, Z & Gao, X 2017, 'Multi-instance dictionary learning via multivariate performance measure optimization', Pattern Recognition, vol. 66, pp. 448-459.

Chai, J, Liu, W, Tsang, IW & Shen, X 2017, 'Compact multiple-instance learning', International Conference on Information and Knowledge Management, Proceedings, ACM International Conference on Information and Knowledge Management, ACM, Singapore, Singapore, pp. 2007-2010.

Liu, W, Shen, X & Tsang, IW 2017, 'Sparse Embedded k-means Clustering', Advances in Neural Information Processing Systems, Thirty-first Annual Conference on Neural Information Processing Systems, Long Beach, California, USA, pp. 3320-3328.

Guntuku, SC, Zhou, JT, Roy, S, Lin, W & Tsang, IW 2016, 'Understanding Deep Representations Learned in Modeling Users Likes', IEEE Transactions on Image Processing, vol. 25, no. 8, pp. 3762-3774.

Xu, X, Li, W, Xu, D & Tsang, IW 2016, 'Co-Labeling for Multi-View Weakly Labeled Learning', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 6, pp. 1113-1125.

Zhai, Y, Ong, YS & Tsang, I 2016, 'Making Trillion Correlations Feasible in Feature Grouping and Selection', IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 38, no. 12, pp. 2472-2486.

Han, B, Tsang, IW & Chen, L 2016, 'On the Convergence of A Family of Robust Losses for Stochastic Gradient Descent', Machine Learning and Knowledge Discovery in Databases - LNCS, The European Conference on Machine Learning and Principles and Practice of Knowledge Discovery (ECML-PKDD), Springer, Riva del Garda, Italy.

Liu, W & Tsang, W 2016, 'Sparse Perceptron Decision Tree for Millions of Dimensions', Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), AAAI Conference on Artificial Intelligence, AAAI, Phoenix, Arizona USA, pp. 1881-1887.

Tan, M, Yan, Y, Wang, L, Hengel, A, Tsang, W & Shi, Q 2016, 'Learning Sparse Confidence-Weighted classifier on Very High Dimensional Data', Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), AAAI Conference on Artificial Intelligence, AAAI, Phoenix, USA, pp. 2080-2086.

Wang, J, Tsang, W & Gao, X 2016, 'Optimizing Multivariate Performance Measures from Multi-View Data', Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), AAAI Conference on Artificial Intelligence, AAAI, Phoenix, Arizona USA, pp. 2152-2158.

Yan, Y, Xu, Z, Tsang, W, Long, G & Yang, Y 2016, 'Robust Semi-supervised Learning through Label Aggregation', Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), AAAI Conference on Artificial Intelligence, AAAI, Phoenix, USA, pp. 2244-2250.

Zhou, J, Pan, S, Tsang, W & Ho, S 2016, 'Transfer Learning for Cross-Language Text Categorization through Active Correspondences Construction', Proceedings of the Thirtieth AAAI Conference on Artificial Intelligence (AAAI-16), AAAI Conference on Artificial Intelligence, AAAI, Phoenix, Arizona USA, pp. 2400-2406.

Zhou, J, Xu, X, Pan, S, Tsang, W, Qin, Z & Goh, R 2016, 'Transfer Hashing with Privileged Information', Proceedings of 25th International Joint Conference on Artificial Intelligence, International Joint Conference on Artificial Intelligence, AAAI, New York, USA, pp. 2414-2420.

Chen, M, Tsang, IW, Tan, M & Cham, TJ 2015, 'A Unified Feature Selection Framework for Graph Embedding on High Dimensional Data', IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 6, pp. 1465-1477.

Mao, Q, Tsang, IW, Gao, S & Wang, L 2015, 'Generalized Multiple Kernel Learning With Data-Dependent Priors', IEEE Transactions on Neural Networks and Learning Systems, vol. 26, no. 6, pp. 1134-1148.

Tan, M, Tsang, IW & Wang, L 2015, 'Matching Pursuit LASSO Part I: Sparse Recovery Over Big Dictionary', IEEE Transactions on Signal Processing, vol. 63, no. 3, pp. 727-741.

Tan, M, Tsang, IW & Wang, L 2015, 'Matching Pursuit LASSO Part II: Applications and Sparse Recovery Over Batch Signals', IEEE Transactions on Signal Processing, vol. 63, no. 3, pp. 742-753.

Tan, M, Tsang, I & Wang, L 2014, 'Towards Ultrahigh Dimensional Feature Selection for Big Data', Journal of Machine Learning Research, vol. 15, pp. 1371-1429.

Zhai, Y, Ong, Y-S & Tsang, IW 2014, 'The Emerging "Big Dimensionality"', IEEE Computational Intelligence Magazine, vol. 9, no. 3, pp. 14-26.

Keywords: Machine Learning, Data Mining, Pattern Recognition

FOR Codes: Pattern Recognition and Data Mining, Expanding Knowledge in the Information and Computing Sciences, Application Software Packages (excl. Computer Games), Information Processing Services (incl. Data Entry and Capture)