Multiagent Systems: Lecture 6 - Utility, Preferences, and Multiagent Encounters, Study notes of Electrical and Electronics Engineering

A lecture note from the University of Liverpool's Introduction to Multiagent Systems course. It covers utilities and preferences, utility functions, and multiagent encounters. The lecture explains the concept of utility, which is not the same as money, and how the two relate. It also discusses the state transformer function, which models the environment in which agents act, and how agents choose actions based on their utility functions and the state transformer function.

Typology: Study notes

2009/2010

Uploaded on 02/24/2010

koofers-user-lwy
Partial preview of the text

LECTURE 6: MULTIAGENT INTERACTIONS

An Introduction to Multiagent Systems
http://www.csc.liv.ac.uk/~mjw/pubs/imas/

1 What are Multiagent Systems?

[Figure: agents in an environment. Key: agent, interaction, organisational relationship, sphere of influence.]

What is Utility?

Utility is not money (but it is a useful analogy).

Typical relationship between utility & money:

[Figure: plot of utility against money.]

3 Multiagent Encounters

We need a model of the environment in which these agents will act...

– agents simultaneously choose an action to perform, and as a result of the actions they select, an outcome in $\Omega$ will result;
– the actual outcome depends on the combination of actions;
– assume each agent has just two possible actions that it can perform: C ("cooperate") and D ("defect").

Environment behaviour is given by the state transformer function:

$\tau : Ac \times Ac \to \Omega$

where the first argument is agent i's action and the second is agent j's action.

Here is a state transformer function:

$\tau(D, D) = \omega_1 \quad \tau(D, C) = \omega_2 \quad \tau(C, D) = \omega_3 \quad \tau(C, C) = \omega_4$

(This environment is sensitive to actions of both agents.)

Here is another:

$\tau(D, D) = \omega_1 \quad \tau(D, C) = \omega_1 \quad \tau(C, D) = \omega_1 \quad \tau(C, C) = \omega_1$

(Neither agent has any influence in this environment.)

And here is another:

$\tau(D, D) = \omega_1 \quad \tau(D, C) = \omega_2 \quad \tau(C, D) = \omega_1 \quad \tau(C, C) = \omega_2$

(This environment is controlled by j.)
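The three kinds of environment described above (sensitive to both agents, influenced by neither, controlled by j) can be sketched as lookup tables. This is a minimal illustration, not part of the lecture: the outcome labels "w1" to "w4" and the helper `influences` are hypothetical names, and each table is keyed by (agent i's action, agent j's action).

```python
ACTIONS = ("C", "D")

# Each table maps (action of i, action of j) to an outcome label.
# The labels "w1".."w4" are placeholder names chosen for this sketch.
TAU_BOTH = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w3", ("C", "C"): "w4"}
TAU_NONE = {pair: "w1" for pair in TAU_BOTH}
TAU_J = {("D", "D"): "w1", ("D", "C"): "w2", ("C", "D"): "w1", ("C", "C"): "w2"}


def influences(tau, agent):
    """Return True if `agent` ("i" or "j") can change the outcome
    by changing only its own action, for some fixed action of the other."""
    for other in ACTIONS:
        if agent == "i":
            outcomes = {tau[(own, other)] for own in ACTIONS}
        else:
            outcomes = {tau[(other, own)] for own in ACTIONS}
        if len(outcomes) > 1:
            return True
    return False


for name, tau in [("both", TAU_BOTH), ("none", TAU_NONE), ("j only", TAU_J)]:
    print(name, influences(tau, "i"), influences(tau, "j"))
```

With these tables, only the first environment responds to agent i's choice, while agent j affects the outcome in the first and third.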
Dominant Strategies

Given any particular strategy s (either C or D) for agent i, there will be a number of possible outcomes.

We say $s_1$ dominates $s_2$ if every outcome possible by i playing $s_1$ is preferred over every outcome possible by i playing $s_2$.

A rational agent will never play a dominated strategy. So in deciding what to do, we can delete dominated strategies. Unfortunately, there isn't always a unique undominated strategy.

Nash Equilibrium

In general, we will say that two strategies $s_1$ and $s_2$ are in Nash equilibrium if:

1. under the assumption that agent i plays $s_1$, agent j can do no better than play $s_2$; and
2. under the assumption that agent j plays $s_2$, agent i can do no better than play $s_1$.

Neither agent has any incentive to deviate from a Nash equilibrium.

Unfortunately:

1. Not every interaction scenario has a Nash equilibrium.
2. Some interaction scenarios have more than one Nash equilibrium.

Competitive and Zero-Sum Interactions

Where preferences of agents are diametrically opposed we have strictly competitive scenarios.

Zero-sum encounters are those where utilities sum to zero:

$u_i(\omega) + u_j(\omega) = 0$ for all $\omega \in \Omega$

Zero sum implies strictly competitive.

Zero sum encounters in real life are very rare ... but people tend to act in many scenarios as if they were zero sum.

The individual rational action is defect. This guarantees a payoff of no worse than 2, whereas cooperating guarantees a payoff of at most 1.
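The definitions above can be checked mechanically on a small payoff table. The sketch below is illustrative only: the payoffs 2 (mutual defection), 3 (mutual cooperation) and 1 (cooperating against a defector) come from the text, while the payoff of 4 for defecting against a cooperator is an assumption, since the payoff matrix itself is not part of this preview. The function names are hypothetical.

```python
from itertools import product

ACTIONS = ("C", "D")

# (u_i, u_j) indexed by (action of i, action of j).
# The value 4 is an assumed temptation payoff; 1, 2 and 3 appear in the text.
PAYOFF = {
    ("D", "D"): (2, 2),
    ("D", "C"): (4, 1),
    ("C", "D"): (1, 4),
    ("C", "C"): (3, 3),
}


def guaranteed_payoff(action):
    """Worst payoff agent i can receive when it plays `action`."""
    return min(PAYOFF[(action, other)][0] for other in ACTIONS)


def dominates(s1, s2):
    """Dominance as worded above: every outcome of s1 is preferred by
    agent i over every outcome of s2."""
    worst_s1 = min(PAYOFF[(s1, other)][0] for other in ACTIONS)
    best_s2 = max(PAYOFF[(s2, other)][0] for other in ACTIONS)
    return worst_s1 > best_s2


def is_nash(a_i, a_j):
    """Neither agent can gain by unilaterally switching its action."""
    u_i, u_j = PAYOFF[(a_i, a_j)]
    i_content = all(PAYOFF[(alt, a_j)][0] <= u_i for alt in ACTIONS)
    j_content = all(PAYOFF[(a_i, alt)][1] <= u_j for alt in ACTIONS)
    return i_content and j_content


def nash_equilibria():
    return [pair for pair in product(ACTIONS, ACTIONS) if is_nash(*pair)]


print(guaranteed_payoff("D"), guaranteed_payoff("C"))  # 2 1
print(nash_equilibria())                               # [('D', 'D')]
print(dominates("D", "C"))                             # False
```

Under these assumed payoffs, mutual defection is the only Nash equilibrium. Note that `dominates("D", "C")` is False under the strict all-outcomes wording, because mutual cooperation (3) is still better for i than mutual defection (2).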
So defection is the best response to all possible strategies: both agents defect, and get payoff = 2.

But intuition says this is not the best outcome: surely they should both cooperate and each get payoff of 3!

This apparent paradox is the fundamental problem of multi-agent interactions. It appears to imply that cooperation will not occur in societies of self-interested agents.

Real world examples:

– nuclear arms reduction ("why don't I keep mine...")
– free rider systems: public transport;
– in the UK: television licenses.

The prisoner's dilemma is ubiquitous. Can we recover cooperation?

Arguments for Recovering Cooperation

Conclusions that some have drawn from this analysis:

– the game theory notion of rational action is wrong!
– somehow the dilemma is being formulated wrongly

Arguments to recover cooperation:

– We are not all Machiavelli!
– The other prisoner is my twin!
– The shadow of the future...

4.3 Axelrod's Tournament

Suppose you play iterated prisoner's dilemma against a range of opponents ... What strategy should you choose, so as to maximise your overall payoff?

Axelrod (1984) investigated this problem, with a computer tournament for programs playing the prisoner's dilemma.

Strategies in Axelrod's Tournament

ALLD: "Always defect", the hawk strategy;

TIT-FOR-TAT:
1. On round $u = 0$, cooperate.
2. On round $u > 0$, do what your opponent did on round $u - 1$.

TESTER: On 1st round, defect.
If the opponent retaliated, then play TIT-FOR-TAT. Otherwise intersperse cooperation & defection.

JOSS: As TIT-FOR-TAT, except periodically defect.

Recipes for Success in Axelrod's Tournament

Axelrod suggests the following rules for succeeding in his tournament:

Don't be envious: Don't play as if it were zero sum!

Be nice: Start by cooperating, and reciprocate cooperation.

Retaliate appropriately: Always punish defection immediately, but use "measured" force; don't overdo it.

Don't hold grudges: Always reciprocate cooperation immediately.
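To make the iterated game concrete, here is a minimal sketch of two of the strategies named in these notes, ALLD and TIT-FOR-TAT, playing a fixed number of rounds. It is not Axelrod's tournament code. The payoffs 1, 2 and 3 follow the text, the payoff of 4 for defecting against a cooperator is an assumption, and the function names and the choice of 10 rounds are hypothetical. TESTER and JOSS are left out because the notes do not say how they intersperse or time their defections.

```python
# Payoff to "me" indexed by (my move, opponent's move).
# The value 4 is an assumed temptation payoff; 1, 2 and 3 appear in the text.
PAYOFF = {("D", "D"): 2, ("D", "C"): 4, ("C", "D"): 1, ("C", "C"): 3}


def all_d(own_history, opp_history):
    """ALLD: always defect."""
    return "D"


def tit_for_tat(own_history, opp_history):
    """Cooperate on the first round, then copy the opponent's last move."""
    return "C" if not opp_history else opp_history[-1]


def play(strategy_a, strategy_b, rounds=10):
    """Play an iterated game and return the total payoff of each player."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Both players choose simultaneously, seeing only past rounds.
        move_a = strategy_a(hist_a, hist_b)
        move_b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(move_a, move_b)]
        score_b += PAYOFF[(move_b, move_a)]
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b


print(play(tit_for_tat, tit_for_tat))  # (30, 30)
print(play(all_d, all_d))              # (20, 20)
print(play(tit_for_tat, all_d))        # (19, 22)
```

In this sketch TIT-FOR-TAT scores less than ALLD head to head (19 against 22), yet a pair of TIT-FOR-TAT players earns 30 each while a pair of ALLD players earns only 20 each, which is one way to read the "don't be envious" rule.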