Learning Software Security Analysers with Imperfect Data

Project Member(s): Sui, Y.

Funding or Partner Organisation: Australian Research Council (ARC Future Fellowships)

Start year: 2022

Summary: This project aims to systematically investigate a new approach that learns to generate software security analysers with limited, weakly-labelled, and concept-drifting data to detect vulnerabilities in real-world large-scale software. Specifically, we aim (1) to harvest limited training data by precisely selecting a subset of important code samples from unlabelled real-world fragments through active contrastive learning, (2) to find sweet spots between precision and efficiency by generating on-demand analysis abstractions, thus paying attention to important vulnerable code fragments, and (3) to reinforce a generated analyser with iterative validations using counterexamples from analysis oracles.