RBF Gaussian Process Continuous Approximation¶

Our basic setup is a game with a set of roles $r \in \mathcal R = [R]$ , a set of strategies $s_r \in \mathcal S_r = [S_r] \ne \varnothing$ strategies per role, and a number of players per role $n_r > 0$ . For the rest of this analysis we only consider partial profiles, these are profiles omitting a deviating player (alternately, just omitting one player). The Gaussian process payoff estimator produces a payoff estimate from a normalized partial profile. A partial profile is simply an assignment of players to strategies less the deviating player, but for the benefit of notation, we will assume that $n_r$ is already less the deviating player:

$\mathbf x &= \{x_{rs_r}\}_{r \in \mathcal R, s_r \in \mathcal S_r},\ x_{rs_r} \in \mathbb Z_+,\ \sum_{s_r} x_{rs_r} = n_r$

The Gaussian process regressor is defined as

$\operatorname{payoff}(\mathbf x) &= \sum_j \alpha_j \exp\left\{ -\frac{1}{2} \sum_{r} \sum_{s_r} \frac{(X_{jrs_r} - x_{rs_r})^2}{\ell_{rs_r}^2} \right\}\\ &= \left( \prod_{r} (2 \pi)^\frac{S_r}{2} \prod_{s_r} \ell_{rs_r} \right) \sum_j \alpha_j \prod_{r} \mathcal N(\mathbf x_r; \mathbf X_{jr}, \mathbf L_r) \\ &= C \sum_j \alpha_j \prod_{r} \mathcal N(\mathbf x_r; \mathbf X_{jr}, \mathbf L_r),\ C = \prod_{r} (2 \pi)^\frac{S_r}{2} \prod_{s_r} \ell_{rs_r}$

where $\mathbf L_r = \operatorname{diag}(\ell_{rs_r}^2)$ are the RBF length scales for each role and strategy. These can all be identical if the length scale is not specific to a particular strategy, but in general, they should vary by role. $\boldsymbol \alpha = \mathbf K^{-1} \mathbf y$ are the Gaussian process weights for each sample. $\mathbf X$ are the training partial profiles, where $j$ indexes over training sample.

Our goal is to estimate the expected deviation payoff from a mixture $\mathbf p$ :

$\operatorname{devpayoff}(\mathbf p) &= \mathbb E_{\mathbf x_r \sim \mathcal M (n_r, \mathbf p_r)} \left[ \operatorname{payoff}(\mathbf x) \right] \\ &= \sum_{\mathbf x} C \sum_j \alpha_j \prod_{r} \mathcal N(\mathbf x_r; \mathbf X_{jr}, \mathbf L_r) \mathcal M (\mathbf x_r; n_r, \mathbf p_r) \\ &\approx \int_{\mathbf x} C \sum_j \alpha_j \prod_{r} \mathcal N(\mathbf x_r; \mathbf X_{jr}, \mathbf L_r) \mathcal N \left(\mathbf x_r; n_r \mathbf p_r, n_r \mathbf M_r\right) \\ &= \int_{\mathbf x} C \sum_j \alpha_j \prod_r \mathcal N \left(\mathbf X_{jr}; n_r \mathbf p_r, \mathbf L_r + n_r \mathbf M_r \right) \mathcal N (\mathbf x_r; \boldsymbol \mu_r, \boldsymbol \Sigma_r) \\ &= C \sum_j \alpha_j \prod_r \mathcal N \left(\mathbf X_{jr}; n_r \mathbf p_r, \mathbf L_r + n_r \mathbf M_r \right) \\ &= C \sum_j \alpha_j \prod_r \left| 2 \pi \left(\mathbf L_r + n_r \mathbf M_r\right) \right| ^{-\frac{1}{2}} \exp \left\{ -\frac{1}{2} (\mathbf X_{jr} - n_r \mathbf p_r)^\top \left(\mathbf L_r + n_r \mathbf M_r\right)^{-1} (\mathbf X_{jr} - n_r \mathbf p_r) \right\} \\ &= \left( \prod_r \left( \prod_{s_r} \ell_{rs_r} \right) \left| \left(\mathbf L_r + \frac{1}{n_r} \mathbf M_r\right) \right| ^{-\frac{1}{2}} \right) \sum_j \alpha_j \exp \left\{ -\frac{1}{2} \sum_r (\mathbf X_{jr} - n_r \mathbf p_r)^\top \left(\mathbf L_r + n_r \mathbf M_r\right)^{-1} (\mathbf X_{jr} - n_r \mathbf p_r) \right\}$

The approximation comes from approximating a sum over the multinomial distribution with an integral over a Gaussian approximation. The next step is to derive simplifications for the determinant and the inverse of $\mathbf L_r + n_r \mathbf M_r$ . First, we need to define a few helpful variables:

$d_{rs_r} &= \ell_{rs_r}^2 + n_r p_{rs_r} \\ \gamma_r &= 1 - n_r \sum_{s_r} \frac{p_{rs_r}^2}{d_{rs_r}}$

Then

$\left| \mathbf L_r + n_r \mathbf M_r \right| &= \left| \operatorname{diag}_{s_r}(d_{rs_r}) - n_r \mathbf p_r \mathbf p_r^\top \right| \\ &= \left( 1 - n_r \sum_{s_r} \frac{p_{rs_r}^2}{d_{rs_r}} \right) \prod_{s_r} d_{rs_r} \\ &= \gamma_r \prod_{s_r} d_{rs_r} \\ \left( \mathbf L_r + n_r \mathbf M_r \right)^{-1} &= \left( \operatorname{diag}_{s_r}(d_{rs_r}) - n_r \mathbf p_r \mathbf p_r^\top \right)^{-1} \\ &= \operatorname{diag}_{s_r} \left(\frac{1}{d_{rs_r}}\right) + \frac{n_r}{ 1 - n_r \sum_{s_r} \frac{p_{rs_r}^2}{d_{rs_r}} } \mathbf q_r \mathbf q_r^\top,\ q_{rs_r} = \frac{p_{rs_r}}{d_{rs_r}} \\ &= \operatorname{diag}_{s_r} \left(\frac{1}{d_{rs_r}}\right) + \frac{n_r}{\gamma_r} \mathbf q_r \mathbf q_r^\top$

Plugging these into the equation for $\operatorname{devpayoff}$ yields

$\operatorname{devpayoff}(\mathbf p) &= \left( \prod_r \gamma_r^{-\frac{1}{2}} \prod_{s_r} \frac{ \ell_{rs_r} }{ \sqrt{d_{rs_r}} } \right) \sum_j \alpha_j \exp \left\{ -\frac{1}{2} \sum_r \left( (\mathbf X_{jr} - n_r \mathbf p_r)^\top \left(\operatorname{diag}_{s_r} \left(\frac{1}{d_{rs_r}}\right) + \frac{n_r}{ \gamma_r } \mathbf q_r \mathbf q_r^\top \right) (\mathbf X_{jr} - n_r \mathbf p_r) \right) \right\} \\ &= \left( \prod_r \gamma_r^{-\frac{1}{2}} \prod_{s_r} \frac{ \ell_{rs_r} }{ \sqrt{d_{rs_r}} } \right) \sum_j \alpha_j \exp \left\{ -\frac{1}{2} \sum_r \left( \left( \sum_{s_r} \frac{(X_{jrs_r} - n_r p_{rs_r})^2}{d_{rs_r}} \right) + \frac{n_r}{\gamma_r} \left( \sum_{s_r} \frac{p_{rs_r}}{d_{rs_r}} (X_{jrs_r} - n_r p_{rs_r}) \right)^2 \right) \right\}$

The derivative with respect to one element of the mixture $p_{\rho i}$ is

$\frac{\partial}{\partial p_{\rho i}} \operatorname{devpayoff}(\mathbf p) &= -\frac{1}{2} \left( \prod_r \gamma_r^{-\frac{1}{2}} \prod_{s_r} \frac{ \ell_{rs_r} }{ \sqrt{d_{rs_r}} } \right) \left [ \left( \gamma_{\rho}^{-1} \left( \beta_{\rho i}^2 - 1 \right) + \frac{n_{\rho}}{d_{\rho i}} \right) \sum_j \alpha_j \exp \left\{ \cdot \right\} + n_{\rho} \sum_j \alpha_j \exp \left\{ \cdot \right\} \left( \left( \delta_{j \rho} - 1 \right)^2 - \left( \delta_{j \rho} \beta_{\rho i} - \xi_{j \rho i} \right)^2 \right) \right] \\ \delta_{j \rho} &= \gamma_{\rho}^{-1} \sum_{s_{\rho}} \frac{p_{\rho s_{\rho}}}{d_{\rho s_{\rho}}} \left( X_{j \rho s_{\rho}} - n_{\rho} p_{\rho s_{\rho}} \right) \\ \beta_{\rho i} &= 1 - \frac{n_{\rho} p_{\rho i}}{d_{\rho i}} \\ \xi_{j \rho i} &= 1 + \frac{X_{j \rho i} - n_{\rho} p_{\rho i}}{d_{\rho i}}$

Definitions, Notations and Identities¶

$[N] &= \{i\}_{i=1}^N \\ \operatorname{diag}_i(a_i) &= \begin{bmatrix} a_1 & 0 & \dots & 0 \\ 0 & a_2 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & a_n \end{bmatrix} \\ \mathcal N (\mathbf x; \boldsymbol \mu, \boldsymbol \Sigma) &= | 2 \pi \boldsymbol \Sigma|^{-\frac{1}{2}} \exp\left\{-\frac{1}{2} (\mathbf x - \boldsymbol \mu)^\top \boldsymbol \Sigma^{-1} (\mathbf x - \boldsymbol \mu) \right\} \\ \mathcal M (\mathbf x; n, \mathbf p) &= \frac{n!}{\prod_i x_i!}\prod_i p_i^{x_i}\text{ if }\sum_i x_i = n \\ \mathcal M (\mathbf x; n, \mathbf p) &\approx \mathcal N (\mathbf x; n \mathbf p, n \mathbf M),\ \mathbf M = \operatorname{diag}_i(p_i) - \mathbf p \mathbf p^\top \\ | \operatorname{diag}_i(a_i) | &= \prod_i a_i \\ \left| \mathbf A + \mathbf u \mathbf v^\top \right| &= (1 + \mathbf v^\top \mathbf A^{-1} \mathbf u) | \mathbf A | \\ \left| \operatorname{diag}_i(a_i) + \mathbf u \mathbf v^\top \right| &= \left( 1 + \sum_i \frac{u_i v_i}{a_i} \right) \prod_i a_i \\ (\mathbf A + \mathbf B \mathbf C \mathbf D)^{-1} &= \mathbf A^{-1} - \mathbf A^{-1} \mathbf B ( \mathbf C^{-1} + \mathbf D \mathbf A^{-1} \mathbf B )^{-1} \mathbf D \mathbf A^{-1} \\ \left(\operatorname{diag}_i(a_i) + c \mathbf b \mathbf b^\top\right)^{-1} &= \operatorname{diag}_i\left(\frac{1}{a_i}\right) - \left(\frac{1}{c} + \sum_i \frac{b_i^2}{a_i} \right)^{-1} \mathbf b^\prime {\mathbf b^\prime}^\top,\ b_i^\prime = \frac{b_i}{a_i} \\ \mathcal N ( \mathbf x; \boldsymbol \mu_1, \boldsymbol \Sigma_1 ) \mathcal N (\mathbf x; \boldsymbol \mu_2, \boldsymbol \Sigma_2 ) &= \mathcal N (\boldsymbol \mu_1; \boldsymbol \mu_2, \boldsymbol \Sigma_1 + \boldsymbol \Sigma_2 ) \mathcal N (\mathbf x; \boldsymbol \mu_3, \boldsymbol \Sigma_3 )$

RBF Gaussian Process Continuous Approximation¶

Definitions, Notations and Identities¶

Table of Contents

Previous topic

Next topic

This Page