Trading Agents Competing: Performance, Progress, and Market Effectiveness
Since 2000, the annual Trading Agent Competition (TAC) has provided a forum for designers to evaluate programmed trading techniques in a challenging market scenario, in competition with other design groups. After three years of apparent progress, we attempt to evaluate the trading competence of competition participants, both in the 2002 tournament and over time. Although absolute measures of individual performance are difficult to obtain, relative measures and measures of overall market performance are more amenable to direct analysis. We quantify the effectiveness of the TAC travel market in terms of allocative efficiency, finding improvement within and between tournaments. By comparison with alternative allocation benchmarks, we can calibrate this efficiency and identify opportunities for further gains from trade.
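As a rough illustration of the allocative-efficiency idea in the abstract above, the following sketch computes the ratio of the total value achieved by an observed allocation to the best achievable total value. It is a toy model, not the TAC travel scenario: the value matrix, the one-good-per-agent restriction, and the names `total_value`, `optimal_value`, and `allocative_efficiency` are all assumptions made for illustration.

```python
from itertools import permutations

def total_value(values, assignment):
    """Sum values[agent][good] for an assignment listing each agent's good."""
    return sum(values[agent][good] for agent, good in enumerate(assignment))

def optimal_value(values):
    """Brute-force the best one-good-per-agent assignment (tiny instances only)."""
    n_agents = len(values)
    n_goods = len(values[0])
    return max(
        total_value(values, candidate)
        for candidate in permutations(range(n_goods), n_agents)
    )

def allocative_efficiency(values, assignment):
    """Fraction of the maximum attainable total value that the assignment achieves."""
    return total_value(values, assignment) / optimal_value(values)

# Hypothetical data: values[agent][good] is the agent's value for that good.
values = [
    [10, 4, 1],
    [8, 7, 2],
    [3, 6, 5],
]
# Hypothetical market outcome: agent 0 holds good 1, agent 1 holds good 0, agent 2 holds good 2.
market_outcome = [1, 0, 2]

efficiency = allocative_efficiency(values, market_outcome)
print(f"allocative efficiency: {efficiency:.3f}")
```

An efficiency below 1 indicates unrealized gains from trade; in this toy instance, agents 0 and 1 would both benefit from swapping goods.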
Nash Q-learning for general-sum stochastic games
We extend Q-learning to a noncooperative multiagent context, using the framework of general-sum stochastic games. A learning agent maintains Q-functions over joint actions, and performs updates based on assuming Nash equilibrium behavior over the current Q-values. This learning protocol provably converges given certain restrictions on the stage games (defined by Q-values) that arise during learning. Experiments with a pair of two-player grid games suggest that such restrictions on the game structure are not necessarily required. Stage games encountered during learning in both grid environments violate the conditions. However, learning consistently converges in the first grid game, which has a unique equilibrium Q-function, but sometimes fails to converge in the second, which has three different equilibrium Q-functions. In a comparison of offline learning performance in both games, we find agents are more likely to reach a joint optimal path with Nash Q-learning than with a single-agent Q-learning method. When at least one agent adopts Nash Q-learning, the performance of both agents is better than using single-agent Q-learning. We have also implemented an online version of Nash Q-learning that balances exploration with exploitation, yielding improved performance.
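The update described in the abstract above can be sketched for two agents as follows. Each agent's Q-table is indexed by state and joint action, and the target uses that agent's payoff at a Nash equilibrium of the stage game formed by the current Q-values in the next state. This is a simplified sketch under stated assumptions: it enumerates pure-strategy equilibria only and picks the first one found, whereas the general method must also handle mixed equilibria and equilibrium selection. The function names and table layout are hypothetical.

```python
def pure_nash_equilibria(q1, q2):
    """Return joint actions (a1, a2) from which neither agent gains by deviating alone.

    q1[a1][a2] and q2[a1][a2] are the two agents' stage-game payoffs.
    """
    n1, n2 = len(q1), len(q1[0])
    equilibria = []
    for a1 in range(n1):
        for a2 in range(n2):
            best_for_1 = q1[a1][a2] >= max(q1[b][a2] for b in range(n1))
            best_for_2 = q2[a1][a2] >= max(q2[a1][b] for b in range(n2))
            if best_for_1 and best_for_2:
                equilibria.append((a1, a2))
    return equilibria

def nash_q_update(Q1, Q2, s, a1, a2, r1, r2, s_next, alpha=0.1, gamma=0.9):
    """Apply one Nash-Q-style update to both agents' tables in place.

    Q1[s][a1][a2] and Q2[s][a1][a2] hold Q-values over joint actions.
    """
    equilibria = pure_nash_equilibria(Q1[s_next], Q2[s_next])
    if not equilibria:
        # Assumption of this sketch: a mixed-equilibrium solver would be needed here.
        raise ValueError("no pure-strategy equilibrium in the next-state stage game")
    e1, e2 = equilibria[0]  # assumption: select the first equilibrium found
    # Read both equilibrium payoffs before writing, in case s == s_next.
    nash1 = Q1[s_next][e1][e2]
    nash2 = Q2[s_next][e1][e2]
    Q1[s][a1][a2] = (1 - alpha) * Q1[s][a1][a2] + alpha * (r1 + gamma * nash1)
    Q2[s][a1][a2] = (1 - alpha) * Q2[s][a1][a2] + alpha * (r2 + gamma * nash2)

# Hypothetical two-state example; state 1 is a coordination game for both agents.
Q1 = {0: [[0.0, 0.0], [0.0, 0.0]], 1: [[3.0, 0.0], [0.0, 1.0]]}
Q2 = {0: [[0.0, 0.0], [0.0, 0.0]], 1: [[3.0, 0.0], [0.0, 1.0]]}
nash_q_update(Q1, Q2, s=0, a1=0, a2=1, r1=1.0, r2=2.0, s_next=1, alpha=0.5)
```

The coordination game in state 1 has two pure equilibria, so the choice of `equilibria[0]` matters; this is the kind of equilibrium multiplicity the abstract associates with convergence difficulties.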
On market-inspired approaches to propositional satisfiability
We describe three market-inspired approaches to propositional satisfiability. The first is based on a formulation of satisfiability as production on a supply chain, where producers of particular variable assignments must acquire licenses to fail to satisfy particular clauses. Experiments show that although this general supply-chain protocol can converge to market allocations corresponding to satisfying truth assignments, it is impractically slow. We find that a simplified market structure and a variation on the pricing method can improve performance significantly. We compare the performance of the three market-based protocols with the distributed breakout algorithm and GSAT on benchmark 3-SAT problems. We identify a tradeoff between performance and economic realism in the market protocols, and a tradeoff between performance and the degree of decentralization between the market protocols and distributed breakout. We also conduct informal and experimental analyses to gain insight into the operation of price-guided search.
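For orientation, GSAT, one of the baselines named in the abstract above, is a greedy local search: start from a random truth assignment and repeatedly flip the variable whose flip leaves the most clauses satisfied, restarting after a fixed number of flips. The sketch below is a minimal version under assumed conventions (clauses as lists of signed integers in the DIMACS style, hypothetical parameter names and defaults); it does not reproduce the market protocols or the experimental setup of the paper.

```python
import random

def num_satisfied(clauses, assignment):
    """Count clauses with at least one true literal; literal -3 means 'variable 3 is False'."""
    return sum(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def gsat(clauses, n_vars, max_tries=20, max_flips=100, seed=0):
    """Return a satisfying assignment as a dict, or None if none is found."""
    rng = random.Random(seed)
    variables = list(range(1, n_vars + 1))
    for _ in range(max_tries):
        # Restart from a fresh random assignment.
        assignment = {v: rng.random() < 0.5 for v in variables}
        for _ in range(max_flips):
            if num_satisfied(clauses, assignment) == len(clauses):
                return assignment
            # Score every single-variable flip by the clauses it would satisfy.
            scores = {}
            for v in variables:
                assignment[v] = not assignment[v]
                scores[v] = num_satisfied(clauses, assignment)
                assignment[v] = not assignment[v]
            best = max(scores.values())
            # Break ties at random; sideways and downhill moves are allowed.
            chosen = rng.choice([v for v in variables if scores[v] == best])
            assignment[chosen] = not assignment[chosen]
        if num_satisfied(clauses, assignment) == len(clauses):
            return assignment
    return None

# Hypothetical small instance: (x1 or x2) and (not x1 or x3) and (not x2 or not x3).
clauses = [[1, 2], [-1, 3], [-2, -3]]
model = gsat(clauses, n_vars=3)
```

GSAT is incomplete: a `None` result means only that no satisfying assignment was found within the given budget, not that the formula is unsatisfiable.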
Decentralized supply chain formation: A market protocol and competitive equilibrium analysis
Supply chain formation is the process of determining the structure and terms of exchange relationships to enable a multilevel, multiagent production activity. We present a simple model of supply chains, highlighting two characteristic features: hierarchical subtask decomposition, and resource contention. To decentralize the formation process, we introduce a market price system over the resources produced along the chain. In a competitive equilibrium for this system, agents choose locally optimal allocations with respect to prices, and outcomes are optimal overall. To determine prices, we define a market protocol based on distributed, progressive auctions, and myopic, non-strategic agent bidding policies. In the presence of resource contention, this protocol produces better solutions than the greedy protocols common in the artificial intelligence and multiagent systems literature. The protocol often converges to high-value supply chains, and when competitive equilibria exist, typically to approximate competitive equilibria. However, complementarities in agent production technologies can cause the protocol to wastefully allocate inputs to agents that do not produce their outputs. A subsequent decommitment phase recovers a significant fraction of the lost surplus.
The 2001 Trading Agent Competition
The 2001 Trading Agent Competition was the second in a series of events aiming to shed light on research issues in automating trading strategies. Based on a challenging market scenario in the domain of travel shopping, the competition presents agents with difficult issues in bidding strategy, market prediction, and resource allocation. Entrants in 2001 demonstrated substantial progress over the prior year, and the overall level of competence they exhibited suggests that trading in online markets is a viable domain for highly autonomous agents.