Learning a strategy for adapting a program analysis via Bayesian optimisation

@inproceedings{Oh2015LearningAS,
  title={Learning a strategy for adapting a program analysis via {B}ayesian optimisation},
  author={Hakjoo Oh and Hongseok Yang and Kwangkeun Yi},
  booktitle={Proceedings of the 2015 ACM SIGPLAN International Conference on Object-Oriented Programming, Systems, Languages, and Applications},
  year={2015},
  url={https://api.semanticscholar.org/CorpusID:13940725}
}
This paper presents a new approach to building an adaptive static analyser. The analyser includes a sophisticated parameterised strategy that decides, for each part of a given program, whether to apply a precision-improving technique to that part. The paper also presents a method for learning a good parameter for such a strategy from an existing codebase via Bayesian optimisation.
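
The idea of a parameterised strategy can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the linear scoring of feature vectors, the top-fraction selection rule, the names `score`, `select_parts`, and `random_search`, and the toy objective. The search loop is plain random search, used here as a simple stand-in for Bayesian optimisation.

```python
import random
from typing import Callable, Dict, List, Set, Tuple


def score(features: List[float], w: List[float]) -> float:
    """Linear score of one program part (assumed form of the strategy)."""
    return sum(f * wi for f, wi in zip(features, w))


def select_parts(part_features: Dict[str, List[float]],
                 w: List[float],
                 fraction: float) -> Set[str]:
    """Return the top `fraction` of parts by score.

    The selected parts are the ones that would receive the
    precision-improving technique; the rest are analysed cheaply.
    """
    k = int(len(part_features) * fraction)
    ranked = sorted(part_features,
                    key=lambda p: score(part_features[p], w),
                    reverse=True)
    return set(ranked[:k])


def random_search(objective: Callable[[List[float]], float],
                  dim: int,
                  trials: int,
                  seed: int = 0) -> Tuple[List[float], float]:
    """Stand-in optimiser: sample parameters uniformly and keep the best.

    This is NOT Bayesian optimisation; it only shows where a learned
    parameter would come from.
    """
    rng = random.Random(seed)
    best_w: List[float] = [0.0] * dim
    best_val = float("-inf")
    for _ in range(trials):
        w = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        val = objective(w)
        if val > best_val:
            best_w, best_val = w, val
    return best_w, best_val


# Hypothetical data: four program parts with two boolean-like features each.
parts = {"x": [1.0, 0.0], "y": [0.0, 1.0], "z": [1.0, 1.0], "t": [0.0, 0.0]}
# Hypothetical ground truth: parts where extra precision actually pays off.
useful = {"x", "t"}


def toy_objective(w: List[float]) -> float:
    """Toy stand-in for running the analyser on a codebase and counting proved queries."""
    return float(len(select_parts(parts, w, 0.5) & useful))


learned_w, learned_val = random_search(toy_objective, dim=2, trials=200)
```

In a real setting the objective would be expensive, since each evaluation means running the analyser over a codebase, which is the kind of situation where a sample-efficient optimiser is preferable to random search.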

Adaptive Static Analysis via Learning with Bayesian Optimization

This article presents a new learning-based approach for adaptive static analysis. The approach includes a sophisticated parameterized strategy that decides, for each part of a given program, whether to apply a precision-improving technique to that part. The article also develops partially flow- and context-sensitive variants of a realistic C static analyzer.

Learning a Strategy for Choosing Widening Thresholds from a Large Codebase

A method is presented that automatically learns a good strategy for choosing widening thresholds from a given codebase, and that reaches its reported performance 26 times faster than the previous Bayesian optimization approach.

Machine Learning-Guided Adaptive Program Analysis

This talk aims to automate the adaptation of a program analysis to a given verification task by using machine learning techniques to learn a good strategy from an existing codebase.

Learning Abstraction Selection for Bayesian Program Analysis

A data-driven framework for selecting abstractions for Bayesian program analysis, which learns from labeled programs by considering graph properties of analysis derivations.

Learning to Boost Disjunctive Static Bug-Finders

A novel data-driven technique that efficiently collects alarm-triggering traces, learns multiple candidate models, and adaptively chooses the best model tailored for each target program is presented.

A Machine-Learning Algorithm with Disjunctive Model for Data-Driven Program Analysis

A new machine-learning algorithm for data-driven program analysis whose disjunctive model is able to express nonlinear combinations of program properties, together with a learning algorithm to find the model parameters.

Data-driven context-sensitivity for points-to analysis

This work proposes an automated and data-driven approach that learns to effectively apply context-sensitivity from codebases and presents a greedy algorithm that efficiently learns the parameter of the heuristic rules.

Automatically generating features for learning program analysis heuristics

The technique goes through selected program-query pairs in codebases, and it reduces and abstracts the program in each pair to a few lines of code, while ensuring that the analysis behaves similarly for the original and the new programs with respect to the query.

Probabilistic, modular and scalable inference of typestate specifications

The results for the large benchmark show that ANEK can quickly infer specifications that are both accurate and qualitatively similar to those written by hand, and at 5% of the time taken to manually discover and hand-code the specifications.

Learning minimal abstractions

This paper introduces two machine learning algorithms for efficiently finding a minimal abstraction and shows empirically that minimal abstractions are actually quite coarse: it suffices to provide context/object sensitivity to a very small fraction of the sites to yield equally precise results as providing context/object sensitivity uniformly to all sites.

Sequential Model-Based Optimization for General Algorithm Configuration

This paper extends the explicit regression models paradigm for the first time to general algorithm configuration problems, allowing many categorical parameters and optimization for sets of instances, and yields state-of-the-art performance.

On abstraction refinement for program analyses in Datalog

This work presents a new approach for finding such abstractions for program analyses written in Datalog based on counterexample-guided abstraction refinement, which uses a boolean satisfiability formulation that is general, complete, and optimal.

Selective context-sensitivity guided by impact pre-analysis

This method applies context-sensitivity only when and where doing so is likely to improve the precision that matters for resolving given queries, and demonstrates generality by following the same principle and developing a selective relational analysis.

Dynamic inference of likely data preconditions over predicates by tree learning

This work presents a technique to infer likely data preconditions for procedures written in an imperative programming language that successfully learns a precondition that captures a safe and permissive calling environment.

Introspective analysis: context-sensitivity, across the board

This work proposes introspective analysis: a technique for uniformly scaling context-sensitive analysis by eliminating its performance-detrimental behavior, at a small precision expense, and shows that a simple but principled approach can be remarkably effective, achieving scalability for benchmarks previously completely out-of-reach for deep context-sensitive analyses.

Abstractions from tests

The main insight is to directly and efficiently compute from a concrete trace, a necessary condition on the parameter configurations to prove a given query, and thereby prune the space of parameter configurations that the static analysis must consider.

Typestate-based semantic code search over partial programs

A novel code search approach for answering queries focused on API-usage with code showing how the API should be used, and the results indicate that the combination of a relatively precise analysis and consolidation allowed PRIME to answer challenging queries effectively.

Termination proofs from tests

An algorithm TpT for proving termination of a program based on information derived from testing it, which is able to prove termination on 15% more benchmarks than any previously known technique, and its evaluation on Windows device drivers demonstrates its ability to analyze and scale to real world applications.