| 1 | +.. _AltBLPriors: |
| 2 | + |
| 3 | +Alternatives: Prior specification for BL hyperparameters |
| 4 | +======================================================== |
| 5 | + |
| 6 | +Overview |
| 7 | +-------- |
| 8 | + |
| 9 | +In the fully :ref:`Bayes linear<DefBayesLinear>` approach to |
| 10 | +emulating a complex :ref:`simulator<DefSimulator>`, the |
| 11 | +:ref:`emulator<DefEmulator>` is formulated to represent prior |
| 12 | +knowledge of the simulator in terms of a :ref:`second-order belief |
| 13 | +specification<DefSecondOrderSpec>`. This prior specification |
| 14 | +requires statements of belief about some |
| 15 | +:ref:`hyperparameters<DefHyperparameter>`, as discussed in the |
| 16 | +alternatives page on emulator prior mean function |
| 17 | +(:ref:`AltMeanFunction<AltMeanFunction>`), the discussion page on the |
| 18 | +GP covariance function |
| 19 | +(:ref:`DiscCovarianceFunction<DiscCovarianceFunction>`) and the |
| 20 | +alternatives page on emulator prior correlation function |
| 21 | +(:ref:`AltCorrelationFunction<AltCorrelationFunction>`). |
| 22 | +Specifically, in the :ref:`core problem<DiscCore>` that is the |
| 23 | +subject of the core threads (:ref:`ThreadCoreBL<ThreadCoreBL>`, |
| 24 | +:ref:`ThreadCoreGP<ThreadCoreGP>`) a vector :math:`\beta` defines the |
| 25 | +detailed form of the mean function, a scalar :math:`\sigma^2` quantifies |
| 26 | +the uncertainty or variability of the simulator around the prior mean |
| 27 | +function, while :math:`\delta` is a vector of hyperparameters defining |
| 28 | +details of the correlation function. Threads that deal with variations |
| 29 | +on the basic core problem may introduce further hyperparameters. |
| 30 | + |
| 31 | +A Bayes linear analysis requires hyperparameters to be given prior |
| 32 | +expectations, variances and covariances. We consider here ways to |
| 33 | +specify these prior beliefs for the hyperparameters of the core problem. |
| 34 | +Prior specifications for other hyperparameters are addressed in the |
| 35 | +relevant variant thread. Hyperparameters may be handled differently in |
| 36 | +the fully :ref:`Bayesian<DefBayesian>` approach - see |
| 37 | +:ref:`ThreadCoreGP<ThreadCoreGP>`. |
| 38 | + |
| 39 | +Choosing the Alternatives |
| 40 | +------------------------- |
| 41 | + |
| 42 | +The prior beliefs should be chosen to represent whatever prior knowledge |
| 43 | +the analyst has about the hyperparameters. However, these prior |
| 44 | +beliefs will be adjusted using the information from a set of |
| 45 | +training runs, and if the training data carry substantial information |
| 46 | +about one or more of the hyperparameters then the prior specification |
| 47 | +for those hyperparameters may have little influence on the result. |
| 48 | + |
| 49 | +In general, a Bayes linear specification requires statements of |
| 50 | +second-order beliefs for all uncertain quantities. In the current |
| 51 | +version of this Toolkit, the Bayes linear emulation approach does not |
| 52 | +consider the situation where :math:`\sigma^2` and :math:`\delta` are |
| 53 | +uncertain, and so we require the following: |
| 54 | + |
| 55 | +- :math:`\text{E}[\beta_i]`, :math:`\text{Var}[\beta_i]`, |
| 56 | + :math:`\text{Cov}[\beta_i,\beta_j]` - an expectation and a variance |
| 57 | + for each coefficient :math:`\beta_i`, and a covariance |
| 58 | + between every pair of coefficients :math:`(\beta_i,\beta_j), i\neq j` |
| 59 | +- :math:`\sigma^2=\text{Var}[w(x)]` - the variance of the residual |
| 60 | + stochastic process |
| 61 | +- :math:`\delta` - a value for the hyperparameters of the correlation |
| 62 | + function |
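
The requirements listed above can be collected into a small data structure. The following is a minimal sketch, not part of the Toolkit: the names ``BLPriorSpec`` and ``prior_moments`` are hypothetical, and the prior variance of :math:`f(x)` is computed under the assumption that :math:`f(x)=h(x)^T\beta+w(x)` with :math:`\beta` and :math:`w(x)` uncorrelated. All numerical values are made up for illustration.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class BLPriorSpec:
    """Hypothetical container for the second-order specification (not a Toolkit API)."""

    beta_mean: np.ndarray  # E[beta_i], shape (q,)
    beta_var: np.ndarray   # Var[beta_i] on the diagonal, Cov[beta_i, beta_j] elsewhere
    sigma2: float          # Var[w(x)], treated as a known point value
    delta: np.ndarray      # correlation hyperparameters, treated as known point values

    def validate(self) -> None:
        """Check that the specification is a coherent set of second-order beliefs."""
        q = self.beta_mean.shape[0]
        if self.beta_var.shape != (q, q):
            raise ValueError("beta_var must be a q x q matrix")
        if not np.allclose(self.beta_var, self.beta_var.T):
            raise ValueError("beta_var must be symmetric")
        if np.linalg.eigvalsh(self.beta_var).min() < -1e-10:
            raise ValueError("beta_var must be positive semi-definite")
        if self.sigma2 < 0:
            raise ValueError("sigma2 must be non-negative")


def prior_moments(spec, h_x):
    """Prior E[f(x)] and Var[f(x)], assuming beta and w(x) are uncorrelated."""
    h_x = np.asarray(h_x, dtype=float)
    mean = h_x @ spec.beta_mean
    var = h_x @ spec.beta_var @ h_x + spec.sigma2
    return float(mean), float(var)


# Illustrative values only: linear mean function with h(x) = (1, x).
spec = BLPriorSpec(
    beta_mean=np.array([1.0, 0.5]),
    beta_var=np.array([[4.0, 0.5], [0.5, 1.0]]),
    sigma2=0.25,
    delta=np.array([0.7]),
)
spec.validate()
mean, var = prior_moments(spec, [1.0, 2.0])  # h(x) evaluated at x = 2
```

Storing :math:`\text{Var}[\beta]` as a full matrix keeps the pairwise covariances alongside the variances, so a single symmetry and positive semi-definiteness check covers the whole specification for :math:`\beta`.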
| 63 | + |
| 64 | +The Nature of the Alternatives |
| 65 | +------------------------------ |
| 66 | + |
| 67 | +Priors for :math:`\beta` |
| 68 | +~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 69 | + |
| 70 | +Given a specified form for the basis functions :math:`h(x)` of :math:`m(x)` as |
| 71 | +described in the alternatives page on basis functions for the emulator |
| 72 | +mean (:ref:`AltBasisFunctions<AltBasisFunctions>`), we must specify |
| 73 | +an expectation and a variance for each coefficient :math:`\beta_i` and a |
| 74 | +covariance between every pair :math:`(\beta_i,\beta_j)`. |
| 75 | + |
| 76 | +As with the basis functions :math:`h(x)`, there are two primary means of |
| 77 | +obtaining a belief specification for :math:`\beta`. |
| 78 | + |
| 79 | +#. **Expert-led specification** - the specification can be made directly |
| 80 | + by an expert using methods such as |
| 81 | + |
| 82 | + a. Intuitive understanding of the magnitude and impact of the |
| 83 | + physical effects represented by :math:`h(x)` leading to a direct |
| 84 | + quantification of expectations, variances and covariances. |
| 85 | + b. Assessing the difference between the model under study and another |
| 86 | + well-understood model such as a fast approximate version or an |
| 87 | + earlier version of the same simulator. In this approach, we can |
| 88 | + combine the known information about the mean behaviour of the |
| 89 | + second simulator with the belief statements about the differences |
| 90 | + between the two simulators to construct an appropriate belief |
| 91 | + specification for the hyperparameters -- see :ref:`multilevel |
| 92 | + emulation<DefMultilevelEmulation>`. |
| 93 | + |
| 94 | +#. **Data-driven specification** - when prior beliefs are weak and we |
| 95 | + have ample model evaluations, then prior values for :math:`\beta` are |
| 96 | + typically not required and we can replace adjusted values for |
| 97 | + :math:`\beta` with empirical estimates, :math:`\hat{\beta}`, obtained by |
| 98 | + fitting the linear regression :math:`f(x)=h(x)^T\beta`. Our uncertainty |
| 99 | + statements about :math:`\beta` can then be deduced from the "estimation |
| 100 | + error" associated with :math:`\hat{\beta}`. |
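
The data-driven route can be sketched with ordinary least squares. This is a minimal illustration rather than the Toolkit's procedure: the function name ``fit_beta`` is hypothetical, and the "estimation error" is taken to be the usual OLS covariance :math:`\hat{\sigma}^2 (H^T H)^{-1}`, which assumes uncorrelated residuals. Here :math:`H` is the matrix whose rows are :math:`h(x)^T` at the training inputs.

```python
import numpy as np


def fit_beta(H, f):
    """OLS estimates for the regression f(x) = h(x)^T beta.

    H : (n, q) array whose rows are h(x_k)^T at the n training inputs.
    f : (n,) array of simulator outputs.
    Returns (beta_hat, beta_var, sigma2_hat).
    """
    H = np.asarray(H, dtype=float)
    f = np.asarray(f, dtype=float)
    n, q = H.shape
    if n <= q:
        raise ValueError("need more runs than basis functions")
    beta_hat, *_ = np.linalg.lstsq(H, f, rcond=None)
    resid = f - H @ beta_hat
    # Unbiased residual variance estimate under the uncorrelated-residual assumption.
    sigma2_hat = float(resid @ resid) / (n - q)
    # Covariance of the estimator, used here as the uncertainty statement for beta.
    beta_var = sigma2_hat * np.linalg.inv(H.T @ H)
    return beta_hat, beta_var, sigma2_hat


# Illustrative data only: four runs and a linear mean function h(x) = (1, x).
x = np.array([0.0, 1.0, 2.0, 3.0])
f = np.array([1.1, 2.9, 5.2, 6.8])
H = np.column_stack([np.ones_like(x), x])
beta_hat, beta_var, sigma2_hat = fit_beta(H, f)
```

The same fit also returns :math:`\hat{\sigma}^2`, so one regression supplies candidate values for both :math:`\beta` and the residual variance.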
| 101 | + |
| 102 | +Priors for :math:`\sigma^2` |
| 103 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 104 | + |
| 105 | +The current version of the Toolkit requires a point value for the |
| 106 | +variance about the emulator mean, :math:`\sigma^2`. This corresponds |
| 107 | +directly to making a specification about :math:`\text{Var}[w(x)]`. As with |
| 108 | +the model coefficients above, there are two possible approaches to |
| 109 | +making such a quantification. An expert could make the specification by |
| 110 | +directly quantifying the magnitude of :math:`\sigma^2`. Alternatively, an |
| 111 | +expert assessment of the expected prior adequacy of the mean function at |
| 112 | +representing the variation in the simulator outputs can be combined with |
| 113 | +information on the variation of the simulator output, which allows for |
| 114 | +the deduction of a value of :math:`\sigma^2`. In the case of a data-driven |
| 115 | +assessment, the estimate for the residual variance :math:`\hat{\sigma}^2` |
| 116 | +can be used. |
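
One way to make the second expert-led route concrete, offered as an illustrative reading rather than a prescribed method: suppose the expert judges that the mean function accounts for a proportion :math:`R^2` of the prior variance of the simulator output. The symbol :math:`R^2` and the numbers below are hypothetical. Then

.. math::

   \sigma^2 = \text{Var}[w(x)] = (1 - R^2)\,\text{Var}[f(x)]

so that, for example, :math:`\text{Var}[f(x)] = 25` and :math:`R^2 = 0.9` give :math:`\sigma^2 = 2.5`.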
| 117 | + |
| 118 | +In subsequent versions of the Toolkit, Bayes linear methods will be |
| 119 | +developed for :ref:`learning<DefBLVarianceLearning>` about |
| 120 | +:math:`\sigma^2` in the emulation process. This will require making prior |
| 121 | +specifications about the squared emulator residuals. |
| 122 | + |
| 123 | +Priors for :math:`\delta` |
| 124 | +~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 125 | + |
| 126 | +Specification of correlation function hyperparameters is a more |
| 127 | +challenging task. Direct elicitation can be difficult as the |
| 128 | +hyperparameter :math:`\delta` is hard to conceptualise - the alternatives |
| 129 | +page on prior distributions for GP hyperparameters |
| 130 | +(:ref:`AltGPPriors<AltGPPriors>`) provides some discussion on this |
| 131 | +topic, with particular application to the Gaussian correlation function. |
| 132 | +Alternatively, given a large collection of simulator runs, |
| 133 | +:math:`\delta` can be crudely estimated using methods such as |
| 134 | +:ref:`variogram<ProcVariogram>` fitting on the empirical residuals. |
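
A crude variogram fit of this kind can be sketched as follows. This is an illustration under stated assumptions, not the procedure in the variogram page: it handles a single input only, assumes a Gaussian correlation function of the form :math:`\exp\{-(d/\delta)^2\}` for inputs a distance :math:`d` apart, takes :math:`\sigma^2` as already fixed, and uses a simple grid search in place of a proper optimiser. The function names are hypothetical.

```python
import numpy as np


def empirical_variogram(x, resid, n_bins=10):
    """Binned empirical semivariogram of regression residuals over a 1-D input."""
    x = np.asarray(x, dtype=float)
    resid = np.asarray(resid, dtype=float)
    i, j = np.triu_indices(len(x), k=1)          # all distinct pairs of runs
    dist = np.abs(x[i] - x[j])
    semivar = 0.5 * (resid[i] - resid[j]) ** 2
    edges = np.linspace(0.0, dist.max(), n_bins + 1)
    # Bin index per pair; the largest distance is folded into the last bin.
    which = np.clip(np.digitize(dist, edges) - 1, 0, n_bins - 1)
    centres, gamma = [], []
    for b in range(n_bins):
        mask = which == b
        if mask.any():                            # skip empty bins
            centres.append(dist[mask].mean())
            gamma.append(semivar[mask].mean())
    return np.array(centres), np.array(gamma)


def fit_delta(centres, gamma, sigma2, delta_grid):
    """Grid search for the delta whose model variogram best matches the data."""
    centres = np.asarray(centres, dtype=float)
    gamma = np.asarray(gamma, dtype=float)
    best_delta, best_loss = None, np.inf
    for delta in delta_grid:
        # Semivariogram implied by the assumed Gaussian correlation function.
        model = sigma2 * (1.0 - np.exp(-(centres / delta) ** 2))
        loss = float(np.sum((gamma - model) ** 2))
        if loss < best_loss:
            best_delta, best_loss = float(delta), loss
    return best_delta
```

In use, ``resid`` would be the empirical residuals from the fitted mean function, and the returned value would serve only as a rough plug-in for :math:`\delta`.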
| 135 | + |
| 136 | +Assessing and updating uncertainties about :math:`\delta` raises both |
| 137 | +conceptual and technical problems. Methods that would be optimal for |
| 138 | +assessing such parameters given realisations drawn from a corresponding |
| 139 | +stochastic process may prove to be highly non-robust when applied to |
| 140 | +functional computer output which is only represented very approximately |
| 141 | +by such a process. Methods for approaching this problem will appear in a |
| 142 | +subsequent version of the Toolkit. |