RBF Gaussian Process Continuous Approximation
Our basic setup is a game with a set of roles, a set of strategies per role, and a number of players per role. For the rest of this analysis we only consider partial profiles, that is, profiles that omit a deviating player (alternatively, profiles that simply omit one player). The Gaussian process payoff estimator produces a payoff estimate from a normalized partial profile. A partial profile is simply an assignment of players to strategies less the deviating player, but to simplify notation, we will assume that the number of players per role is already less the deviating player:
The Gaussian process regressor is defined as
The regressor is parameterized by RBF length scales for each role and strategy. These can all be identical if the length scale is not specific to a particular strategy, but in general, they should vary by role. It also uses the Gaussian process weights, one per training sample, and the training partial profiles, indexed by training sample.
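To make the shape of such a regressor concrete, here is a minimal NumPy sketch of an RBF weighted sum over training profiles. It assumes the standard RBF form, a weight times `exp(-0.5 * sum(((x - x_i) / l) ** 2))` for each training sample, with profiles flattened over all role and strategy pairs. The function name `rbf_gp_payoff` and its argument names are hypothetical and not taken from any library.

```python
import numpy as np


def rbf_gp_payoff(profile, train_profiles, alphas, length_scales):
    """Evaluate sum_i alpha_i * exp(-0.5 * sum_s ((x_s - x_is) / l_s)^2).

    All names here are illustrative assumptions:
    profile        -- flattened partial profile, shape (num_strats,)
    train_profiles -- training partial profiles, shape (num_samples, num_strats)
    alphas         -- Gaussian process weights, shape (num_samples,)
    length_scales  -- RBF length scales, shape (num_strats,)
    """
    profile = np.asarray(profile, dtype=float)
    train_profiles = np.asarray(train_profiles, dtype=float)
    length_scales = np.asarray(length_scales, dtype=float)
    # Differences between the query and every training profile, in units
    # of the per-strategy length scale.
    diffs = (train_profiles - profile) / length_scales
    # One RBF kernel value per training sample.
    kernel = np.exp(-0.5 * np.sum(diffs ** 2, axis=1))
    return float(np.dot(alphas, kernel))
```

Passing the same value for every entry of `length_scales` corresponds to the case where the length scale is not specific to a strategy.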
Our goal is to estimate the expected deviation payoff from a mixture :
The approximation replaces the sum over the multinomial distribution with an integral over a Gaussian approximation of that distribution. The next step is to derive simplifications for the determinant and the inverse of . First, we need to define a few helpful variables:
Then
Plugging these into the equation for yields
The derivative with respect to one element of the mixture is
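As a numerical sanity check on the Gaussian approximation and on the determinant and inverse simplifications discussed above, the following sketch computes the expected value of a single RBF term under a Gaussian approximation of the multinomial. It rests on several assumptions that are not confirmed by the text: a single role, unnormalized player counts, mean `n * p`, and the multinomial covariance `n * (diag(p) - p p^T)`. Under those assumptions the matrix to invert is a diagonal matrix minus a rank-one term, so the matrix determinant lemma and the Sherman-Morrison formula apply. The function names are hypothetical.

```python
import numpy as np


def gaussian_expected_kernel(mix, num_players, center, length_scales):
    """Expected RBF term under N(n p, n (diag(p) - p p^T)), in closed form.

    Uses E[exp(-0.5 (x - c)^T L^-1 (x - c))]
        = sqrt(|L| / |S + L|) * exp(-0.5 (m - c)^T (S + L)^-1 (m - c)),
    where L = diag(length_scales ** 2), m is the mean and S the covariance.
    """
    mix = np.asarray(mix, dtype=float)
    center = np.asarray(center, dtype=float)
    ls2 = np.asarray(length_scales, dtype=float) ** 2
    mean = num_players * mix
    # S + L = D - n p p^T with D diagonal.
    diag = num_players * mix + ls2
    scaled = mix / diag  # D^-1 p
    denom = 1.0 - num_players * mix @ scaled
    # Matrix determinant lemma: |D - n p p^T| = |D| * (1 - n p^T D^-1 p).
    det_ratio = np.prod(diag / ls2) * denom
    diff = mean - center
    # Sherman-Morrison: (D - n p p^T)^-1 = D^-1 + n D^-1 p p^T D^-1 / denom.
    quad = diff @ (diff / diag) + num_players * (scaled @ diff) ** 2 / denom
    return float(det_ratio ** -0.5 * np.exp(-0.5 * quad))


def dense_expected_kernel(mix, num_players, center, length_scales):
    """Same quantity using dense linear algebra, for comparison only."""
    mix = np.asarray(mix, dtype=float)
    center = np.asarray(center, dtype=float)
    scale = np.diag(np.asarray(length_scales, dtype=float) ** 2)
    cov = num_players * (np.diag(mix) - np.outer(mix, mix))
    diff = num_players * mix - center
    quad = diff @ np.linalg.solve(cov + scale, diff)
    det_ratio = np.linalg.det(cov + scale) / np.linalg.det(scale)
    return float(det_ratio ** -0.5 * np.exp(-0.5 * quad))


if __name__ == "__main__":
    mix = [0.2, 0.3, 0.5]
    center = [3.0, 3.0, 4.0]
    scales = [1.0, 2.0, 1.5]
    print(gaussian_expected_kernel(mix, 10, center, scales))
    print(dense_expected_kernel(mix, 10, center, scales))
```

The two functions should agree to floating point precision. The structured version avoids forming or factoring a dense matrix, which is the practical benefit of simplifying the determinant and inverse analytically. Weighting the result by the Gaussian process weight of each training sample and summing would give the approximate expected payoff under these assumptions.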