# Bounding the Estimation Error of Sampling-based Shapley Value Approximation

The Shapley value is arguably the most central normative solution concept in cooperative game theory. It specifies a unique way in which the reward from cooperation can be "fairly" divided among players. While it has a wide range of real world applications, its use is in many cases hampered by the hardness of its computation. A number of researchers have tackled this problem by (i) focusing on classes of games where the Shapley value can be computed efficiently, or (ii) proposing representation formalisms that facilitate such efficient computation, or (iii) approximating the Shapley value in certain classes of games. For the classical \textit{characteristic function} representation, the only attempt to approximate the Shapley value for the general class of games is due to Castro \textit{et al.} \cite{castro}. While this algorithm provides a bound on the approximation error, this bound is \textit{asymptotic}, meaning that it only holds when the number of samples increases to infinity. On the other hand, when a finite number of samples is drawn, an unquantifiable error is introduced, meaning that the bound no longer holds. With this in mind, we provide non-asymptotic bounds on the estimation error for two cases: where (i) the \textit{variance}, and (ii) the \textit{range}, of the players' marginal contributions is known. Furthermore, for the second case, we show that when the range is significantly large relative to the Shapley value, the bound can be improved (from $O(\frac{r}{m})$ to $O(\sqrt{\frac{r}{m}})$). Finally, we propose, and demonstrate the effectiveness of using stratified sampling for improving the bounds further.

PDF Abstract