This repo contains classes and scripts for actively learning stochastic models such as Gaussian processes. It provides different kinds of experimental design methods, such as pool-based, oracle-based and safe active learning, as well as Bayesian optimization. Furthermore, different stochastic models are implemented, such as standard GPs, GPs with marginalized hyperparameters and deep GPs, along with many different kernels such as deep kernels, Neural Kernel Networks and more.
For questions about the code, contact Matthias Bitzer or Cen-You Li.
```shell
conda env create --file environment.yml
conda activate alef
pip install -e .
```
To check if everything is set up correctly, you can run the tests from the root dir of the repo:
```shell
pytest .
```
The repo is divided into the following folders/submodules (not complete, only the most important parts):
- `active_learner`: This folder contains the different active learner classes:
  - `ActiveLearner`: This class implements pool-based active learning. It needs an instance of a child of the `base_model` class as `model` (set via `self.set_model(model)`). It contains an `active_learner.Pool` object (`self.pool`) that needs to be populated with data (numpy arrays) `x_data` and `y_data` through the `self.set_pool(x_data, y_data)` method. The initial dataset can be sampled directly from the pool via `self.sample_initial_data(n_data)`. Besides the initial data, it also needs a test set for validation, which needs to be set separately via `self.set_test_set(x_test, y_test)`. The main method `self.learn(n_steps)` triggers the AL cycle. Further settings, such as the acquisition function type (entropy, variance, ...) and the kind of validation metric (RMSE, NLL, ...), are specified in the config files (see the Configs section below). Some of the mentioned methods are implemented in its abstract parent class `BasePoolActiveLearner`. Config classes: `alef/configs/active_learner/active_learner_configs.py`
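As a rough sketch of this loop in plain numpy (not the repo's classes; the distance-based "variance" below is only a stand-in for a real model's predictive uncertainty, and all names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(x):  # unknown target function the learner queries
    return np.sin(3 * x)

# pool of unlabeled candidate inputs and a held-out test set
x_pool = rng.uniform(0, 2 * np.pi, size=200)
x_test = np.linspace(0, 2 * np.pi, 50)
y_test = f(x_test)

# initial training data sampled from the pool
idx = rng.choice(len(x_pool), size=3, replace=False)
x_train, y_train = x_pool[idx], f(x_pool[idx])

def variance_proxy(x, x_train):
    # stand-in for a GP's predictive variance: distance to nearest training point
    return np.min(np.abs(x[:, None] - x_train[None, :]), axis=1)

for _ in range(10):  # AL cycle, cf. self.learn(n_steps)
    acq = variance_proxy(x_pool, x_train)    # acquisition: "variance"
    q = np.argmax(acq)                       # chosen query location
    x_train = np.append(x_train, x_pool[q])  # query the label, augment the data
    y_train = np.append(y_train, f(x_pool[q]))

# validation metric: RMSE of a 1-nearest-neighbour prediction on the test set
pred = y_train[np.argmin(np.abs(x_test[:, None] - x_train[None, :]), axis=1)]
rmse = np.sqrt(np.mean((pred - y_test) ** 2))
```

The queries spread out over the input space because the acquisition always picks the point farthest from the current training data.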
  - `ActiveLearnerOracle`: This class implements active learning with an oracle. It needs an instance of a child of the `base_model` class as `model` and an instance of a child of `base_oracle` as `oracle`. An object of this class gets instantiated with an `active_learner_oracle.AcquisitionFunctionType` object (enum), which defines the type of acquisition function that is used, and an `active_learner_oracle.ValidationType` object, which defines the type of validation that is used as a metric (TODO: create a config file for this class as well). The initial training set and test set need to be initialized and can be generated directly by sampling from the oracle via the methods `self.sample_train_set(n_data)` and `self.sample_test_set(n_test)` (both datasets can also be set manually). The main method `self.learn(n_steps)` triggers the AL cycle and iteratively queries the points of its `oracle`. It outputs the validation metrics collected over the iterations and the chosen query locations.
  - `SafeActiveLearner`: This class implements safe, pool-based active learning. An explanation of safe active learning can be found in [Schreiter et al. (2015)](https://ipvs.informatik.uni-stuttgart.de/mlr/papers/15-schreiter-ECML.pdf). It needs an instance of a child of the `base_model` class as `model`, another instance of it as the `safety_model`, and an instance of `safe_active_learner.Pool` as the `pool`. The `pool` gets populated with data `x_data` and `y_data` as well as associated `safety_data`, which is the label for the `safety_model`. Besides the known objects for acquisition and validation type, an object of this class gets a `safety_threshold` value that provides a lower bound on the safety value that the active learner must not fall below while exploring/querying.
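The safe selection step can be sketched as follows (plain numpy with illustrative names; the pessimistic lower-confidence rule below is one common choice, not necessarily the exact criterion used in the class):

```python
import numpy as np

def safe_query(acq_values, safety_mu, safety_sigma, safety_threshold, beta=2.0):
    """Pick the candidate with maximal acquisition among those deemed safe.

    A candidate counts as safe if a pessimistic (lower-confidence) estimate of
    its safety value, mu - beta * sigma, stays above the threshold.
    """
    safe = safety_mu - beta * safety_sigma > safety_threshold
    if not np.any(safe):
        return None  # no safe candidate left in the pool
    masked = np.where(safe, acq_values, -np.inf)
    return int(np.argmax(masked))

# toy example with 5 pool candidates
acq = np.array([0.1, 0.9, 0.5, 0.8, 0.2])     # acquisition values
mu = np.array([1.0, -0.5, 0.8, 0.9, 1.2])     # safety-model predictive means
sigma = np.array([0.1, 0.1, 0.1, 0.5, 0.1])   # safety-model uncertainties
i = safe_query(acq, mu, sigma, safety_threshold=0.0)
```

Candidate 1 has the highest acquisition value but is predicted unsafe, and candidate 3 is too uncertain, so the pessimistic rule falls back to candidate 2.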
- `bayesian_optimization`: This folder contains the different Bayesian optimization classes.
  - `bayesian_optimization_oracle`: This class implements Bayesian optimization. It needs an instance of a child of the `base_model` class as `model`, which serves as the surrogate model for the function that should be optimized. It also needs an instance of the `base_oracle` class as its `oracle` object, which it will call and which should be optimized. To choose the acquisition function, the class needs a `bayesian_optimization.enums.AcquisitionFunctionType` enum value (important AFs such as GP-UCB and EI are implemented). The initial training set can be sampled directly from the `oracle` via `sample_train_set()`, or one can set the train set directly using `set_train_set(self, x_train, y_train)`. The optimization procedure can be started by calling `maximize(n_steps)`, which executes `n_steps` Bayesian optimization queries.
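As an illustration of such a loop, here is a minimal GP-UCB sketch with a plain-numpy GP surrogate on a candidate grid (fixed hyperparameters, no repo classes; all names are illustrative):

```python
import numpy as np

def rbf(a, b, ls=0.5):
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

def gp_posterior(x_train, y_train, x_grid, noise=1e-4):
    # exact GP regression with an RBF kernel and fixed hyperparameters
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_grid, x_train)
    mu = Ks @ np.linalg.solve(K, y_train)
    v = np.linalg.solve(K, Ks.T)
    var = 1.0 - np.sum(Ks * v.T, axis=1)
    return mu, np.sqrt(np.maximum(var, 1e-12))

def objective(x):  # black-box "oracle" to maximize, optimum at x = 0.7
    return -(x - 0.7) ** 2

x_grid = np.linspace(0, 1, 101)
x_train = np.array([0.0, 1.0])  # initial training set
y_train = objective(x_train)

for _ in range(8):  # cf. maximize(n_steps)
    mu, sd = gp_posterior(x_train, y_train, x_grid)
    ucb = mu + 2.0 * sd                    # GP-UCB acquisition
    x_next = x_grid[np.argmax(ucb)]        # next query location
    x_train = np.append(x_train, x_next)
    y_train = np.append(y_train, objective(x_next))

best_x = x_train[np.argmax(y_train)]
```

The UCB term trades off exploiting the current posterior mean against exploring regions of high posterior uncertainty, so the queries home in on the optimum.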
- `models`: This folder contains the classes that implement the abstract class `BaseModel`. Each child class must implement the `infer(x_data, y_data)` method, which infers the parameters (using maximum likelihood, MCMC or variational inference) from the given data. Furthermore, it needs to implement the `predictive_dist(x_test)` method, which returns mu and sigma for given inputs (further methods that need to be implemented to fulfill the interface can be found in the `BaseModel` class). Examples of such models are:
  - `GPModel`: This model forms a wrapper around a `gpflow.models.GPR` object. It needs a `kernel` object (child class of `gpflow.kernels.Kernel`). The `optimize_hps` boolean determines whether the trainable kernel hyperparameters are learned (if `True`, they are learned with maximum likelihood, or maximum a-posteriori in case the kernel HPs are equipped with a prior). Furthermore, one can decide whether the hyperparameter optimization should be done with multiple initial values using the `perform_multi_start_optimization` flag. The model is therefore standard GP regression with MAP/ML estimation of the HPs. Config classes: `alef/configs/models/gp_model_config.py`
  - `GPModelMarginalized`: This model also forms a wrapper around a `gpflow.models.GPR` object and also needs a `kernel` object. In this class all model parameters (kernel hyperparameters + observation noise) are learned with MCMC (Hamiltonian Monte Carlo). The training of the likelihood variance can be switched off with `train_likelihood_variance=False`. The variable `num_samples` specifies the number of final samples used for prediction, `num_burnin_steps` determines the number of burn-in samples for MCMC, and `thin_steps` determines the number of samples that are thrown away to ensure independence (`thin_steps - 1` out of `thin_steps` samples). The method `predictive_dist()` in this case gives a sample-based estimate of the marginal predictive distribution (marginalized over the hyperparameter posterior). Config classes: `alef/configs/models/gp_model_marginalized_config.py`
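The burn-in and thinning bookkeeping amounts to simple slicing; an illustrative sketch (the array below stands in for an actual HMC chain of hyperparameter draws):

```python
import numpy as np

num_burnin_steps, thin_steps, num_samples = 100, 5, 20

# stand-in for an HMC chain: in practice each entry is a hyperparameter draw
chain = np.arange(num_burnin_steps + thin_steps * num_samples)

# drop the burn-in phase, then keep every thin_steps-th sample
# (i.e. thin_steps - 1 out of thin_steps samples are discarded)
kept = chain[num_burnin_steps::thin_steps][:num_samples]
```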
  - `GPModelPytorch`: This model forms a wrapper around a `gpytorch.models.ExactGP` model. It needs a `kernel` object (child class of `gpytorch.kernels.Kernel`) and performs Type-2 ML or MAP estimation of the GP parameters. Training is based on the `RAdam` optimizer and multiple restarts are possible. Notable parameters are `training_iter`, which determines the number of gradient steps per run; `lr`, the learning rate; `do_multi_start_optimization`, which determines whether multiple restarts with different initial GP parameters should be done (necessitates that the torch parameters are equipped with priors); `n_restarts_multistart`, which determines the number of restarts; `do_map_estimation`, which determines whether MAP estimation should be done rather than Type-2 ML; and `do_early_stopping`, which flags whether an RAdam run should be interrupted once the loss has converged. The model inherits from `BaseModel` and thus has all associated methods. A possible configuration can be seen below. Config classes: `alef/configs/models/gp_model_pytorch_config.py` Kernel configs: `alef/configs/kernels/pytorch_kernel/*`
  - `GPModelLaplace`: This model provides a further method to infer the hyperparameters of a GP model. It also forms a wrapper around a `gpflow.models.GPR` object and also needs a `kernel` object. This class implements Laplace inference for the kernel hyperparameters and approximates the predictive distribution with a normal distribution, according to the method presented in Garnett et al. (2014). Config classes: `alef/configs/models/gp_model_laplace_config.py`
  - `SparseGpModel`: This class inherits from `GPModel` and replaces the `gpflow.models.GPR` object with a `gpflow.models.SGPR` object. It therefore implements a sparse GP model based on inducing points that can be used for GP modelling on larger datasets. Accordingly, one needs to set the `n_inducing_points` property, which specifies the maximum number of inducing points to use (if `n_data` is smaller, `n_inducing_points = n_data` is used). The inducing point locations are generated via k-means clustering on the input data and are not trainable. As it inherits from `GPModel`, it also contains all functionalities of `GPModel` such as multi-start optimization. Config classes: `alef/configs/models/sparse_gp_model_config.py`
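The inducing-point selection can be sketched with a few plain-numpy Lloyd iterations (illustrative only; the repo's k-means implementation may differ in its details):

```python
import numpy as np

def kmeans_inducing_points(x_data, n_inducing_points, n_iter=20, seed=0):
    """Cluster the inputs and return the centroids as inducing locations."""
    rng = np.random.default_rng(seed)
    n = min(n_inducing_points, len(x_data))  # never more centers than data
    centers = x_data[rng.choice(len(x_data), size=n, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest center
        d = np.linalg.norm(x_data[:, None, :] - centers[None, :, :], axis=-1)
        assign = np.argmin(d, axis=1)
        # move each center to the mean of its assigned points
        for k in range(n):
            if np.any(assign == k):
                centers[k] = x_data[assign == k].mean(axis=0)
    return centers

rng = np.random.default_rng(2)
x_data = rng.normal(size=(500, 2))
Z = kmeans_inducing_points(x_data, n_inducing_points=16)
```

The centroids summarize where the data is dense, which is exactly what the sparse approximation needs from its fixed inducing locations.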
  - `MOGPModel`: This class implements a multi-output Gaussian process. It forms a wrapper around the `models.MOGPR` model and needs an instance of a `gpflow.kernels.MultioutputKernel` as its kernel. For this model the dimension of `y_data` is allowed to be greater than 1 (`y_data.shape` can be `[n, m]`). The `self.predictive_dist(x_test)` method outputs a tuple of np.arrays of shape `[n, m]` containing means and variances over datapoints and outputs. The kernel hyperparameters are learned via ML/MAP estimation (on calling the `self.infer(x_data, y_data)` method). Config classes: `alef/configs/models/mogp_model_config.py`
  - `DeepGP`: This class forms a wrapper around the `gpflux.DeepGP` model and provides a solid configuration of the deep GP. This configuration is based on Salimbeni and Deisenroth (2017). Parameters that are still configurable in this class are, for example, `n_layer`, specifying the number of GP layers in the deep GP, and `max_n_inducing_points`, the number of inducing points used for inference. Config classes: `alef/configs/models/deep_gp_config.py`
  - `GPModelKernelSearch`: This class forms a wrapper around a search procedure over kernel space. It performs BO over kernel space by specifying a `KernelGrammarCandidateGenerator`, which defines the kernel grammar over which it should search, and a kernel-kernel in `self.kernel`. The selection criterion is defined via the `oracle_type` enum (log-evidence, BIC and CVLL are possible). When calling `infer`, the search starts and in the end stores the found kernel in a `GPModel` in `self.model`. All other `BaseModel` methods are then forwarded to the `GPModel`.
- `oracles`: This folder contains classes that implement the abstract class `BaseOracle`. Each child class must implement the method `query(x)`, which evaluates the oracle/function at `x` and returns the value. The method `get_box_bounds()` returns the definition bounds of the oracle. The `get_dimension()` method returns the dimension of the input of the oracle. Finally, the `get_random_data(n)` method queries `n` datapoints uniformly distributed in the definition region and returns `x` and `y` (np.arrays). Examples are:
  - `GPOracle1D`: An object of this class gets a `kernel_config` and samples a one-dimensional function f (discretized) from the associated GP when calling `initialize()`. The sampled function doesn't change over the lifetime of the object and can be queried at any given x in the definition region. Points that are outside the discretization get linearly interpolated.
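The oracle's behavior can be sketched as follows: draw a discretized function from a GP prior once, then answer queries by linear interpolation (a toy class with illustrative names, not the repo's implementation):

```python
import numpy as np

class ToyGPOracle1D:
    """Sketch of a 1-D GP oracle: sample once, then interpolate queries."""

    def __init__(self, lengthscale=0.3, bounds=(0.0, 1.0), n_grid=200, seed=0):
        self.bounds = bounds
        self.x_grid = np.linspace(*bounds, n_grid)
        # RBF Gram matrix over the discretization grid (plus jitter)
        K = np.exp(-0.5 * (self.x_grid[:, None] - self.x_grid[None, :]) ** 2
                   / lengthscale ** 2)
        rng = np.random.default_rng(seed)
        # one fixed draw from the GP prior; does not change afterwards
        self.f_grid = rng.multivariate_normal(np.zeros(n_grid),
                                              K + 1e-6 * np.eye(n_grid))

    def query(self, x):
        # points between grid nodes are linearly interpolated
        return np.interp(x, self.x_grid, self.f_grid)

    def get_box_bounds(self):
        return self.bounds

oracle = ToyGPOracle1D()
y1 = oracle.query(0.37)
y2 = oracle.query(0.37)  # repeated queries return the same value
```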
- `data_sets`: This folder contains wrappers/loaders for different datasets that all adhere to the interface `BaseDataset`. All classes need a `base_path` argument in the constructor that specifies where their file(s) lie. The `load_data()` method loads the data from the associated file, and `get_random_data(n)` retrieves `n` datapoints from the dataset. For most of the classes, e.g. `Energy` or `Powerplant`, the necessary files that should lie in the `base_path` can be found in the corresponding UCI repo.
Objects of the models, active learner, Bayesian optimization and kernel classes can be initialized with instances of `pydantic.BaseSettings` objects, which are specified in the `configs` folder. The actual objects get built by the corresponding factory classes:
- `models/model_factory.py`: `ModelFactory` class, which takes a `BaseModelConfig` object or a child of it (defined in `configs/models/`) and returns an initialized `BaseModel` object or a child of it
- `kernels/kernel_factory.py`: `KernelFactory` class, which takes a `BaseKernelConfig` object or a child of it (defined in `configs/kernels/`) and returns an initialized `gpflow.kernels.Kernel` object or a child of it
- `kernels/pytorch_kernels/pytorch_kernel_factory.py`: `PytorchKernelFactory` class, which takes a `BaseKernelPytorchConfig` object or a child of it (defined in `configs/kernels/pytorch_kernels`) and returns an initialized `gpytorch.kernels.Kernel` object or a child of it
- `active_learner/active_learner_factory.py`: `ActiveLearnerFactory` class, which takes either a `BasicActiveLearnerConfig`, a `BasicActiveLearnerOracleConfig` or a `BasicBatchActiveLearnerConfig` object (defined in `configs/active_learner`) and returns the respective active learner object.
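The config-plus-factory pattern itself can be sketched with dataclasses standing in for the pydantic configs (all class names below are illustrative, not the repo's actual registry):

```python
from dataclasses import dataclass

@dataclass
class BaseKernelConfig:
    input_dimension: int

@dataclass
class RBFConfig(BaseKernelConfig):
    lengthscale: float = 1.0

@dataclass
class Matern52Config(BaseKernelConfig):
    lengthscale: float = 1.0

class RBFKernel:
    def __init__(self, input_dimension, lengthscale):
        self.input_dimension, self.lengthscale = input_dimension, lengthscale

class Matern52Kernel:
    def __init__(self, input_dimension, lengthscale):
        self.input_dimension, self.lengthscale = input_dimension, lengthscale

class KernelFactory:
    # the factory maps config types to the classes they parameterize
    _registry = {RBFConfig: RBFKernel, Matern52Config: Matern52Kernel}

    def build(self, config):
        cls = self._registry[type(config)]
        return cls(config.input_dimension, config.lengthscale)

kernel = KernelFactory().build(RBFConfig(input_dimension=2, lengthscale=0.5))
```

Because the config carries all constructor arguments, the same config object always rebuilds an identical kernel, which is what makes experiments reproducible.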
These configs contain all necessary constructor arguments and guarantee reproducible experiments with the same settings. For example, a standard GP model with an RBF kernel can be built with

```python
kernel_config = BasicRBFConfig(input_dimension=2)
model_config = BasicGPModelConfig(kernel_config=kernel_config)
model_factory = ModelFactory()
model = model_factory.build(model_config)
```

Given numpy arrays `X_train`, `y_train` and `X_test`, model training and prediction can be done for each `BaseModel` via

```python
model.infer(X_train, y_train)
pred_mu, pred_sigma = model.predictive_dist(X_test)
```
We give a list of model config combinations that might be useful. The config `BasicGPModelConfig` builds a `GPModel`, which is based on GPflow. One can also build a GPyTorch-based model `GPModelPytorch` via `BasicGPModelPytorchConfig`. Here the `kernel_config` must be of type `BaseKernelPytorchConfig`. For example, one might build
```python
kernel_config = BasicRBFPytorchConfig(input_dimension=2)
model_config = BasicGPModelPytorchConfig(kernel_config=kernel_config)
model_factory = ModelFactory()
model = model_factory.build(model_config)
```
In case one wants to use a different kernel, for example the Hierarchical-Hyperplane kernel (HHK), one can configure a GPflow-based version with
```python
kernel_config = HHKEightLocalDefaultConfig(input_dimension=2)
model_config = BasicGPModelConfig(kernel_config=kernel_config)
model_factory = ModelFactory()
model = model_factory.build(model_config)
```
The HHK always comes with a prior on its parameters, and the `GPModel` in this case automatically switches to MAP estimation. For the GPyTorch-based GP model we recommend the following configuration:
```python
kernel_config = HHKEightLocalDefaultPytorchConfig(input_dimension=2)
model_config = GPModelPytorchMAPIntenseOptimizationConfig(kernel_config=kernel_config)
model_factory = ModelFactory()
model = model_factory.build(model_config)
```
In case one wants to perform HMC inference (GPflow-based) rather than Type-2 ML or MAP inference, one can do:
```python
kernel_config = RBFWithPriorConfig(input_dimension=2)
model_config = BasicGPModelMarginalizedConfig(kernel_config=kernel_config)
model_factory = ModelFactory()
model = model_factory.build(model_config)
```
The main usage of the building blocks of the repo, including building more complex models like `GPModelKernelSearch` and building the active learning objects, is illustrated with different Jupyter notebooks in the `notebooks` folder.
This software is open-sourced under the AGPL-3.0 license. See the LICENSE file for details.
If you use our software in your scientific work, please cite one of our papers:
@article{bitzer2022structural,
title={Structural kernel search via bayesian optimization and symbolical optimal transport},
author={Bitzer, Matthias and Meister, Mona and Zimmer, Christoph},
journal={Advances in Neural Information Processing Systems},
volume={35},
pages={39047--39058},
year={2022}
}
@inproceedings{li2022safe,
title={Safe active learning for multi-output gaussian processes},
author={Li, Cen-You and Rakitsch, Barbara and Zimmer, Christoph},
booktitle={International Conference on Artificial Intelligence and Statistics},
pages={4512--4551},
year={2022},
organization={PMLR}
}
@inproceedings{bitzer2023hierarchical,
title={Hierarchical-hyperplane kernels for actively learning gaussian process models of nonstationary systems},
author={Bitzer, Matthias and Meister, Mona and Zimmer, Christoph},
booktitle={International Conference on Artificial Intelligence and Statistics},
pages={7897--7912},
year={2023},
organization={PMLR}
}
@inproceedings{bitzer2023amortized,
title={Amortized inference for gaussian process hyperparameters of structured kernels},
author={Bitzer, Matthias and Meister, Mona and Zimmer, Christoph},
booktitle={Uncertainty in Artificial Intelligence},
pages={184--194},
year={2023},
organization={PMLR}
}
@article{li2024global,
title={Global Safe Sequential Learning via Efficient Knowledge Transfer},
author={Li, Cen-You and Duennbier, Olaf and Toussaint, Marc and Rakitsch, Barbara and Zimmer, Christoph},
journal={arXiv preprint arXiv:2402.14402},
year={2024}
}
@article{li2024amortized,
title={Amortized Active Learning for Nonparametric Functions},
author={Li, Cen-You and Toussaint, Marc and Rakitsch, Barbara and Zimmer, Christoph},
journal={arXiv preprint arXiv:2407.17992},
year={2024}
}