On The Emergence Of Cooperation In The Repeated Prisoner's Dilemma

Abstract

Using simulations between pairs of \(\epsilon\)-greedy Q-learners with one-period memory, this article demonstrates that the potential function of the stochastic replicator dynamics (Foster and Young, 1990) makes it possible to predict the emergence of error-proof cooperative strategies from the underlying parameters of the repeated prisoner's dilemma. The observed cooperation rates between Q-learners are related to the ratio between the kinetic energy exerted by the polar attractors of the replicator dynamics under the grim trigger strategy. The frontier separating the parameter space conducive to cooperation from the parameter space dominated by defection can be found by setting the kinetic energy ratio equal to a critical value, which is a function of the discount factor, \(f(\delta) = \delta/(1-\delta)\), multiplied by a correction term to account for the effect of the algorithms' exploration probability. The gradient at the frontier increases with the distance between the game parameters.
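As a rough illustration of the simulation setup the abstract describes, the following is a minimal sketch of two \(\epsilon\)-greedy Q-learners with one-period memory playing the repeated prisoner's dilemma. It is not the article's implementation: the payoff values, the learning rate `alpha`, the tie-breaking rule, the initial state, and all function names are assumptions chosen for illustration. Only the one-period memory, the \(\epsilon\)-greedy exploration, and the term \(f(\delta) = \delta/(1-\delta)\) come from the abstract. The correction term for the exploration probability is not specified there and is omitted.

```python
import random

# Hypothetical payoffs with T > R > P > S; the abstract gives no values.
T, R, P, S = 5.0, 3.0, 1.0, 0.0
# (own action, opponent action) -> own payoff; 0 = cooperate, 1 = defect
PAYOFF = {(0, 0): R, (0, 1): S, (1, 0): T, (1, 1): P}


def critical_value(delta):
    """Discount-factor term from the abstract: f(delta) = delta / (1 - delta)."""
    return delta / (1.0 - delta)


def choose(q, state, epsilon, rng):
    """Epsilon-greedy choice; ties go to cooperation (an assumption)."""
    if rng.random() < epsilon:
        return rng.randrange(2)  # explore: uniform random action
    row = q[state]
    return 0 if row[0] >= row[1] else 1  # exploit: greedy action


def simulate(rounds=10000, alpha=0.1, delta=0.95, epsilon=0.1, seed=0):
    """Run two Q-learners against each other and return the cooperation rate."""
    rng = random.Random(seed)
    # One-period memory: the state is (own last action, opponent's last action).
    states = [(a, b) for a in (0, 1) for b in (0, 1)]
    q1 = {s: [0.0, 0.0] for s in states}
    q2 = {s: [0.0, 0.0] for s in states}
    s1 = s2 = (0, 0)  # assumed initial state: both cooperated last period
    cooperative_moves = 0
    for _ in range(rounds):
        a1 = choose(q1, s1, epsilon, rng)
        a2 = choose(q2, s2, epsilon, rng)
        n1, n2 = (a1, a2), (a2, a1)  # next state as seen by each player
        updates = (
            (q1, s1, a1, PAYOFF[(a1, a2)], n1),
            (q2, s2, a2, PAYOFF[(a2, a1)], n2),
        )
        for q, s, a, reward, nxt in updates:
            # Standard Q-learning update with discount factor delta.
            q[s][a] += alpha * (reward + delta * max(q[nxt]) - q[s][a])
        cooperative_moves += (a1 == 0) + (a2 == 0)
        s1, s2 = n1, n2
    return cooperative_moves / (2 * rounds), q1, q2


if __name__ == "__main__":
    rate, _, _ = simulate()
    print("cooperation rate:", rate)
    print("f(delta) at delta = 0.95:", critical_value(0.95))
```

Sweeping the payoff parameters and `delta` in such a sketch would produce a grid of cooperation rates that could then be compared against a candidate frontier.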
