boschresearch/active-learning-framework
Active Learning Framework (ALEF)

This repo contains classes and scripts for actively learning stochastic models such as Gaussian Processes. It provides different kinds of experimental design methods, such as pool-based, oracle-based and safe active learning, as well as Bayesian optimization. Furthermore, different stochastic models are implemented, such as standard GPs, GPs with marginalized hyperparameters and Deep GPs, along with many different kernels such as Deep Kernels, the Neural Kernel Network and more.

For questions about the code, contact Matthias Bitzer or Cen-You Li.

Setup

conda env create --file environment.yml
conda activate alef
pip install -e .

To check that everything is set up correctly, you can run the tests from the root directory of the repo:

pytest .

Repo Structure

The repo is divided into the following folders/submodules (not a complete list, only the most important parts):


  • active_learner: This folder contains the different active learner classes:

    • ActiveLearner: This class implements pool-based active learning. It needs an instance of a child of the base_model class as model [set via self.set_model(model)]. It contains an active_learner.Pool object (self.pool) that needs to be populated with data (numpy arrays) x_data and y_data through the self.set_pool(x_data,y_data) method. The initial dataset can be sampled directly from the pool via self.sample_initial_data(n_data). Besides the initial data, it also needs a test set for validation, which must be set separately via self.set_test_set(x_test,y_test). The main method self.learn(n_steps) triggers the AL cycle. Further settings such as the acquisition function type (Entropy, Variance, ...) and the kind of validation metric (RMSE, NLL, ...) are specified in the config files (see Configs section below). Some of the mentioned methods are implemented in the abstract parent class BasePoolActiveLearner. Config classes: alef/configs/active_learner/active_learner_configs.py
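The pool-based AL cycle can be sketched roughly as follows. This is not the repo's actual API: select_by_variance is a hypothetical stand-in in which the predictive "variance" is mocked by the squared distance to the nearest labelled point, mimicking the behaviour of a stationary GP posterior.

```python
import numpy as np

def select_by_variance(x_train, x_pool, n_steps):
    """Greedy pool-based selection: each step queries the pool point with
    the largest (surrogate) predictive variance. The variance is mocked
    here by squared distance to the nearest labelled point."""
    labelled = [np.asarray(x) for x in x_train]
    remaining = list(range(len(x_pool)))
    chosen = []
    for _ in range(n_steps):
        # score every remaining pool point by its distance to the data
        scores = [min(float(np.sum((t - x_pool[i]) ** 2)) for t in labelled)
                  for i in remaining]
        best = remaining[int(np.argmax(scores))]
        chosen.append(best)
        remaining.remove(best)
        labelled.append(np.asarray(x_pool[best]))  # pretend we queried the label
    return chosen
```

In the real class, the scoring step is replaced by the model's predictive variance or entropy at each pool point.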


    • ActiveLearnerOracle: This class implements active learning with an oracle. It needs an instance of a child of the base_model class as model and an instance of a child of base_oracle as oracle. An object of this class gets instantiated with an active_learner_oracle.AcquisitionFunctionType object (Enum), which defines the type of acquisition function that is used, and an active_learner_oracle.ValidationType object, which defines the type of validation that is used as a metric (TODO: create a Config file for this class as well). The initial training set and test set need to be initialized and can be generated directly by sampling from the oracle via the methods self.sample_train_set(n_data) and self.sample_test_set(n_test) (both datasets can also be set manually). The main method self.learn(n_steps) triggers the AL cycle and iteratively queries points from its oracle. It outputs the validation metrics collected over the iterations and the chosen query locations.


    • SafeActiveLearner: This class implements safe, pool-based active learning. An explanation of safe active learning can be found in [Schreiter et al. (2015)](https://ipvs.informatik.uni-stuttgart.de/mlr/papers/15-schreiter-ECML.pdf). It needs an instance of a child of the base_model class as model, another instance of it as the safety_model, and an instance of safe_active_learner.Pool as the pool. The pool gets populated with data x_data and y_data as well as associated safety_data, which is the label for the safety_model. Besides the known objects for acquisition and validation type, an object of this class gets a safety_threshold value that provides a lower bound on the safety value that the active learner should not fall below while exploring/querying.
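The core safety filter can be sketched as below. This is a hypothetical illustration, not the repo's implementation: given the safety model's predictive mean and standard deviation at each candidate, a candidate is kept only if the probability of staying above the threshold exceeds a required level.

```python
import math

def safe_candidate_indices(mu_safety, sigma_safety, safety_threshold, min_prob=0.95):
    """Keep only candidates whose predicted safety value stays above
    safety_threshold with probability >= min_prob, using the Gaussian
    predictive distribution: P(g >= t) = 1 - Phi((t - mu) / sigma)."""
    def Phi(z):  # standard normal CDF via the error function
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    safe = []
    for i, (m, s) in enumerate(zip(mu_safety, sigma_safety)):
        p_safe = 1.0 - Phi((safety_threshold - m) / s)
        if p_safe >= min_prob:
            safe.append(i)
    return safe
```

The acquisition function is then maximized over the surviving candidates only.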


  • bayesian_optimization: This folder contains the different Bayesian optimization classes.

    • bayesian_optimization_oracle: This class implements Bayesian optimization. It needs an instance of a child of the base_model class as model, which serves as the surrogate model for the function that should be optimized. It also needs an instance of the base_oracle class as its oracle object, which it will call and which should be optimized. To choose the acquisition function, the class needs a bayesian_optimization.enums.AcquisitionFunctionType enum value (important acquisition functions such as GP-UCB and EI are implemented). The initial training set can be sampled directly from the oracle via sample_train_set() or one can set the training set directly using set_train_set(x_train,y_train). The optimization procedure is started by calling maximize(n_steps), which executes n_steps Bayesian optimization queries.
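As a reference for the EI acquisition function mentioned above, here is the standard closed form for maximization (a generic sketch, not the repo's code): EI(x) = (mu - y_best) Phi(z) + sigma phi(z) with z = (mu - y_best) / sigma.

```python
import math

def expected_improvement(mu, sigma, best_y):
    """Expected improvement over best_y for a Gaussian predictive
    distribution N(mu, sigma^2), maximization convention."""
    def Phi(z):  # standard normal CDF
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    def pdf(z):  # standard normal density
        return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    if sigma == 0.0:
        return max(mu - best_y, 0.0)
    z = (mu - best_y) / sigma
    return (mu - best_y) * Phi(z) + sigma * pdf(z)
```

Each BO step evaluates this quantity from the surrogate's predictive distribution and queries the oracle at its argmax.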

  • models: This folder contains the classes that implement the abstract class BaseModel. Each child class must implement the infer(x_data,y_data) method, which infers the parameters (using maximum likelihood, MCMC or variational inference) from the given data. Furthermore, it must implement the predictive_dist(x_test) method, which returns mu and sigma for the given inputs (further methods that need to be implemented to fulfill the interface can be found in the BaseModel class). Examples of such models are:
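The interface contract can be illustrated with a hypothetical mirror of it (BaseModelSketch and NearestNeighborModel are invented names, not the repo's classes): any model exposes infer for fitting and predictive_dist returning mean and standard deviation arrays.

```python
from abc import ABC, abstractmethod
import numpy as np

class BaseModelSketch(ABC):
    """Hypothetical mirror of the BaseModel interface described above."""
    @abstractmethod
    def infer(self, x_data, y_data): ...
    @abstractmethod
    def predictive_dist(self, x_test): ...

class NearestNeighborModel(BaseModelSketch):
    """Toy fulfilment of the interface: the predictive mean is the value
    of the nearest training point, the predictive sigma is a constant."""
    def infer(self, x_data, y_data):
        self.x = np.asarray(x_data, dtype=float)
        self.y = np.asarray(y_data, dtype=float)
    def predictive_dist(self, x_test):
        x_test = np.asarray(x_test, dtype=float)
        # pairwise squared distances, shape (n_test, n_train)
        d = np.sum((x_test[:, None, :] - self.x[None, :, :]) ** 2, axis=-1)
        return self.y[np.argmin(d, axis=1)], np.ones(len(x_test))
```

Because the active learners and BO classes only talk to this interface, any such model can be plugged in.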

    • GPModel: This model forms a wrapper around a gpflow.models.GPR object. It needs a kernel object (a child class of gpflow.kernels.Kernel). The optimize_hps boolean determines whether the trainable kernel hyperparameters are learned (if True, they are learned with maximum likelihood, or maximum a-posteriori in case the kernel HPs are equipped with a prior). Furthermore, the perform_multi_start_optimization flag decides whether the hyperparameter optimization should be done with multiple initial values. The model is therefore standard GP regression with MAP/ML estimation of the HPs. Config classes: alef/configs/models/gp_model_config.py


    • GPModelMarginalized: This model also forms a wrapper around a gpflow.models.GPR object and also needs a kernel object. In this class, all model parameters (kernel hyperparameters + observation noise) are learned with MCMC (Hamiltonian Monte Carlo). The training of the likelihood variance can be switched off with train_likelihood_variance=False. The variable num_samples specifies the number of final samples used for prediction, num_burnin_steps determines the number of burn-in samples for MCMC, and thin_steps determines the number of samples that are thrown away to ensure independence (thin_steps-1 out of every thin_steps samples). The method predictive_dist() in this case gives a sample-based estimate of the marginal predictive distribution (marginalized over the hyperparameter posterior). Config classes: alef/configs/models/gp_model_marginalized_config.py
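The effect of num_burnin_steps and thin_steps on an MCMC chain is just slicing, which a one-liner makes concrete (a generic sketch of the standard burn-in/thinning scheme, not this class's internals):

```python
def burnin_and_thin(samples, num_burnin_steps, thin_steps):
    """Discard the first num_burnin_steps samples, then keep every
    thin_steps-th sample, i.e. drop thin_steps-1 out of every thin_steps
    to reduce autocorrelation between the retained samples."""
    return samples[num_burnin_steps::thin_steps]
```

The retained samples are then what the model averages over in predictive_dist().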


    • GPModelPytorch: This model forms a wrapper around a gpytorch.models.ExactGP model. It needs a kernel object (a child class of gpytorch.kernels.Kernel) and performs Type-2 ML or MAP estimation of the GP parameters. Training is based on the RAdam optimizer and multiple restarts are possible. Notable parameters are training_iter, which determines the number of gradient steps per run; lr, the learning rate; do_multi_start_optimization, which determines whether multiple restarts with different initial GP parameters should be done (necessitates that the torch parameters are equipped with priors); n_restarts_multistart, the number of restarts; do_map_estimation, which determines whether MAP estimation should be done rather than Type-2 ML; and do_early_stopping, which flags whether an RAdam run should be interrupted once the loss has converged. The model inherits from BaseModel and thus has all associated methods. A possible configuration can be seen below. Config classes: alef/configs/models/gp_model_pytorch_config.py Kernel configs: alef/configs/kernels/pytorch_kernel/*

    • GPModelLaplace: This model provides a further method to infer the hyperparameters of a GP model. It also forms a wrapper around a gpflow.models.GPR object and also needs a kernel object. This class implements Laplace inference for the kernel hyperparameters and approximates the predictive distribution with a normal distribution according to the method presented in Garnett et al. (2014). Config classes: alef/configs/models/gp_model_laplace_config.py


    • SparseGpModel: This class inherits from GPModel and replaces the gpflow.models.GPR object with a gpflow.models.SGPR object. It therefore implements a sparse GP model based on inducing points that can be used for GP modelling on larger datasets. Accordingly, one needs to set the n_inducing_points property, which specifies the maximum number of inducing points to use (if n_inducing_points exceeds n_data, all n_data points are used). The inducing point locations are generated via k-means clustering on the input data and are not trainable. As it inherits from GPModel, it also contains all functionalities of GPModel such as multi-start optimization. Config classes: alef/configs/models/sparse_gp_model_config.py
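Selecting inducing point locations via k-means can be sketched in plain numpy (a hypothetical stand-alone illustration of the idea, not the repo's code, which may rely on a library k-means):

```python
import numpy as np

def inducing_points_kmeans(x_data, n_inducing_points, n_iter=20, seed=0):
    """Choose inducing point locations as k-means cluster centres of the
    inputs (Lloyd's algorithm). If the dataset has fewer points than
    requested, the effective number of inducing points is n_data."""
    x = np.asarray(x_data, dtype=float)
    k = min(n_inducing_points, len(x))
    rng = np.random.default_rng(seed)
    centres = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # assign every point to its nearest centre
        d = np.sum((x[:, None, :] - centres[None, :, :]) ** 2, axis=-1)
        labels = np.argmin(d, axis=1)
        for j in range(k):
            if np.any(labels == j):  # skip empty clusters
                centres[j] = x[labels == j].mean(axis=0)
    return centres
```

The resulting centres summarize the input distribution and are held fixed (not trainable) in the SGPR model.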


    • MOGPModel: This class implements a multi-output Gaussian process. It forms a wrapper around the models.MOGPR model and needs an instance of a gpflow.kernels.MultioutputKernel as its kernel. For this model, the dimension of y_data is allowed to be greater than 1 (y_data.shape can be [n,m]). The self.predictive_dist(x_test) method outputs a tuple of np.arrays of shape [n,m] containing means and variances over datapoints and outputs. The kernel hyperparameters are learned via ML/MAP estimation (on calling the self.infer(x_data,y_data) method). Config classes: alef/configs/models/mogp_model_config.py


    • DeepGP: This class forms a wrapper around the gpflux.DeepGP model and provides a solid default configuration of the DeepGP, based on Salimbeni and Deisenroth (2017). Parameters that remain configurable in this class include n_layer, specifying the number of GP layers in the DeepGP, and max_n_inducing_points, the number of inducing points used for inference. Config classes: alef/configs/models/deep_gp_config.py


    • GPModelKernelSearch: This class forms a wrapper around a search procedure over kernel space. It performs BO over kernel space by specifying a KernelGrammarCandidateGenerator, which defines the kernel grammar to search over, and a kernel-kernel in self.kernel. The selection criterion is defined via the oracle_type enum (log-evidence, BIC and CVLL are possible). Calling infer starts the search and finally stores the found kernel in a GPModel in self.model. All other BaseModel methods are then forwarded to that GPModel.
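Of the selection criteria listed above, the BIC has a simple closed form worth spelling out (a generic sketch of the standard definition, not this class's code): BIC = k ln(n) - 2 ln L, where k is the number of kernel hyperparameters, n the number of datapoints, and ln L the maximized log marginal likelihood; lower is better.

```python
import math

def bic(log_likelihood, n_params, n_data):
    """Bayesian Information Criterion: penalizes model fit by a
    complexity term that grows with the number of parameters."""
    return n_params * math.log(n_data) - 2.0 * log_likelihood
```

With equal fit, a kernel with more hyperparameters gets a worse (larger) BIC, which keeps the grammar search from always preferring the most complex candidate.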


  • oracles: This folder contains classes that implement the abstract class BaseOracle. Each child class must implement the method query(x), which evaluates the oracle/function at x and returns the value. The method get_box_bounds() returns the definition bounds of the oracle. The get_dimension() method returns the input dimension of the oracle. Finally, the get_random_data(n) method queries n datapoints distributed uniformly in the definition region and returns x and y (np.arrays). Examples are:

    • GPOracle1D: An object of this class gets a kernel_config and samples a one-dimensional function f (discretized) from the associated GP when calling initialize(). The sampled function doesn't change over the lifetime of the object and can be queried at any x in the definition region. Points that lie outside the discretization get linearly interpolated.
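Querying a function that only exists on a discretization grid amounts to linear interpolation, which numpy provides directly (a sketch of the idea; the function name is hypothetical, not the oracle's actual method):

```python
import numpy as np

def query_discretized(x_query, x_grid, f_grid):
    """Evaluate a function sampled on a fixed 1-D grid; off-grid queries
    are linearly interpolated between the neighbouring grid values."""
    return np.interp(x_query, x_grid, f_grid)
```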

  • data_sets: This folder contains wrappers/loaders for different datasets that all conform to the interface BaseDataset. All classes need a base_path argument in the constructor that specifies where their files lie. The load_data() method loads the data from the associated file(s) and get_random_data(n) retrieves n datapoints from the dataset. For most of the classes, e.g. Energy or Powerplant, the necessary files that should lie in base_path can be found in the corresponding UCI repository.

Configs/Model Usage

Objects of the models, active learner, Bayesian optimization and kernel classes can be initialized with instances of pydantic.BaseSettings objects, which are specified in the configs folder. The actual objects get built by the corresponding factory classes:

  • models/model_factory.py: ModelFactory class, which takes a BaseModelConfig object or a child of it (defined in configs/models/) and returns an initialized BaseModel object or a child of it

  • kernels/kernel_factory.py: KernelFactory class, which takes a BaseKernelConfig object or a child of it (defined in configs/kernels/) and returns an initialized gpflow.kernels.Kernel object or a child of it

  • kernels/pytorch_kernels/pytorch_kernel_factory.py: PytorchKernelFactory class, which takes a BaseKernelPytorchConfig object or a child of it (defined in configs/kernels/pytorch_kernels) and returns an initialized gpytorch.kernels.Kernel object or a child of it

  • active_learner/active_learner_factory.py: ActiveLearnerFactory class, which takes either a BasicActiveLearnerConfig, a BasicActiveLearnerOracleConfig or a BasicBatchActiveLearnerConfig object (defined in configs/active_learner) and returns the respective active learner object.
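The config-to-factory pattern can be sketched in isolation (a hypothetical miniature using dataclasses instead of pydantic.BaseSettings; the class names and the dict the factory returns are invented for illustration):

```python
from dataclasses import dataclass

@dataclass
class BaseKernelConfigSketch:
    """Minimal stand-in for a base kernel config."""
    input_dimension: int

@dataclass
class RBFConfigSketch(BaseKernelConfigSketch):
    """Child config adding kernel-specific defaults."""
    lengthscale: float = 1.0

class KernelFactorySketch:
    """Dispatches on the config type and returns a built object --
    the same config -> factory -> object pattern used in the repo."""
    def build(self, config):
        if isinstance(config, RBFConfigSketch):
            return {"type": "rbf", "dim": config.input_dimension,
                    "lengthscale": config.lengthscale}
        raise ValueError(f"Unknown config type: {type(config).__name__}")
```

Because every constructor argument lives in the config object, serializing the config is enough to reproduce an experiment exactly.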

These configs contain all necessary constructor arguments and guarantee reproducible experiments with the same settings. For example, a standard GP model with an RBF kernel can be built with

kernel_config = BasicRBFConfig(input_dimension=2)
model_config = BasicGPModelConfig(kernel_config=kernel_config)
model_factory = ModelFactory()
model = model_factory.build(model_config)

Given numpy arrays X_train, y_train and X_test, model training and prediction can be done for each BaseModel via

model.infer(X_train,y_train)
pred_mu,pred_sigma = model.predictive_dist(X_test)

We give a list of model config combinations that might be useful. The config BasicGPModelConfig builds a GPModel, which is based on GPflow. One can also build a GPyTorch-based model, GPModelPytorch, via BasicGPModelPytorchConfig. Here the kernel_config must be of type BaseKernelPytorchConfig. For example, one might build

kernel_config = BasicRBFPytorchConfig(input_dimension=2)
model_config = BasicGPModelPytorchConfig(kernel_config=kernel_config)
model_factory = ModelFactory()
model = model_factory.build(model_config)

In case one wants to use a different kernel, for example the Hierarchical-Hyperplane-Kernel, one can configure a GPflow-based version with

kernel_config = HHKEightLocalDefaultConfig(input_dimension=2)
model_config = BasicGPModelConfig(kernel_config=kernel_config)
model_factory = ModelFactory()
model = model_factory.build(model_config)

The HHK always comes with a prior on its parameters, and the GPModel in this case automatically switches to MAP estimation. For the GPyTorch-based GP model we recommend the following configuration:

kernel_config = HHKEightLocalDefaultPytorchConfig(input_dimension=2)
model_config = GPModelPytorchMAPIntenseOptimizationConfig(kernel_config=kernel_config)
model_factory = ModelFactory()
model = model_factory.build(model_config)

In case one wants to perform HMC inference (GPflow-based) rather than Type-2 ML or MAP inference, one can do:

kernel_config = RBFWithPriorConfig(input_dimension=2)
model_config = BasicGPModelMarginalizedConfig(kernel_config=kernel_config)
model_factory = ModelFactory()
model = model_factory.build(model_config)

Building more complex models like GPModelKernelSearch, as well as building active learning objects, is explained in the notebooks directory.

Usage

The main usage of the building blocks of the repo is illustrated with different Jupyter notebooks in the notebooks folder.

License

This software is open-sourced under the AGPL-3.0 license. See the LICENSE file for details.

Citation

If you use our software in your scientific work, please cite one of our papers:

@article{bitzer2022structural,
  title={Structural kernel search via bayesian optimization and symbolical optimal transport},
  author={Bitzer, Matthias and Meister, Mona and Zimmer, Christoph},
  journal={Advances in Neural Information Processing Systems},
  volume={35},
  pages={39047--39058},
  year={2022}
}
@inproceedings{li2022safe,
  title={Safe active learning for multi-output gaussian processes},
  author={Li, Cen-You and Rakitsch, Barbara and Zimmer, Christoph},
  booktitle={International Conference on Artificial Intelligence and Statistics},
  pages={4512--4551},
  year={2022},
  organization={PMLR}
}
@inproceedings{bitzer2023hierarchical,
  title={Hierarchical-hyperplane kernels for actively learning gaussian process models of nonstationary systems},
  author={Bitzer, Matthias and Meister, Mona and Zimmer, Christoph},
  booktitle={International Conference on Artificial Intelligence and Statistics},
  pages={7897--7912},
  year={2023},
  organization={PMLR}
}
@inproceedings{bitzer2023amortized,
  title={Amortized inference for gaussian process hyperparameters of structured kernels},
  author={Bitzer, Matthias and Meister, Mona and Zimmer, Christoph},
  booktitle={Uncertainty in Artificial Intelligence},
  pages={184--194},
  year={2023},
  organization={PMLR}
}
@article{li2024global,
  title={Global Safe Sequential Learning via Efficient Knowledge Transfer},
  author={Li, Cen-You and Duennbier, Olaf and Toussaint, Marc and Rakitsch, Barbara and Zimmer, Christoph},
  journal={arXiv preprint arXiv:2402.14402},
  year={2024}
}
@article{li2024amortized,
  title={Amortized Active Learning for Nonparametric Functions},
  author={Li, Cen-You and Toussaint, Marc and Rakitsch, Barbara and Zimmer, Christoph},
  journal={arXiv preprint arXiv:2407.17992},
  year={2024}
}
