dc.description.abstract |
Fairness within groups is important to a very broad range of problems, from policies for
battery-operated soccer robots to distributed traffic control. While no single action may be
fair to everyone, it is possible to achieve long-term optimal fairness for everyone through
choice of repeated actions. I explore the issue of achieving such long-term fairness among
multiple agents, and provide a unified view of the problem and solutions to it. The issue of
constructing long-term fair policies among multiple agents has not been well studied in the
literature.
I concentrate on the “average reward” utility model, where one’s utility is defined as
the average of the rewards one has received from past interactions. Also, I focus on a
particular definition of fairness, called “leximin” fairness, but most of the results apply to
other measures as well.
After examination of fairness through an infinite series of repeated actions, I extend
analysis in several directions. First, I consider how to achieve as fair a result as possible given
a finite series of actions, where the length of the series is not precisely known beforehand
but rather is chosen from an unknown or stochastic distribution of time horizons. My
solution guarantees the beneficiaries the fairest possible long-term results, minus a bounded
worst-case loss due to the game ending unexpectedly. I show that finding sequences of
actions with optimal worst-case loss is NP-hard, and I propose a family of approximation
algorithms.
Second, I examine stateful domains, where one’s choices have side-effects that influence
the effects of actions in the future. I introduce a multi-objective genetic algorithm for finding
good tradeoff points between beneficiaries utilities and their worst-case losses.
Third, I focus on decision-making processes which have been decentralized in the form
of hierarchies. I propose an algorithm based on my stochastic time-horizon solution, and
show empirically that an agent hierarchy running that algorithm is able to achieve optimal
long-term utilities. |
|