Week 6 — Social Cognition & Behavioral Game Theory第6週 — 社会的認知と行動ゲーム理論

Social Cognition & Behavioral Game Theory社会的認知と行動ゲーム理論

Agendaアジェンダ

What is the other player thinking?相手は何を考えている？ 0:00
People aren’t Nash人はナッシュではない 0:07
The beauty contest美人投票ゲーム 0:15
Theory of mind心の理論 0:30
Autism, theory of mind & the beauty contest自閉症・心の理論・美人投票ゲーム 0:53
Blame & intent非難と意図 1:11
One more gameもう一つのゲーム 1:33

1 · What is the other player thinking?1 · 相手は何を考えている？

Week 3: beliefs → action第3週：信念 → 行動

In Week 3 we solved the forward problem:

Given your beliefs about the zones and your utility over fish,
expected utility tells you what to do; ε-greedy is one solution.

Beliefs + values → action.

第3週では順方向の問題を解きました：

ゾーンについての信念と魚に対する効用が与えられたとき、
期待効用が取るべき行動を教え、ε-greedyは一つの解でした。

信念 + 価値 → 行動。

Today: action → beliefs今日：行動 → 信念

Today we run the arrow backwards:

You watch someone act. What can you infer about what they believe, what they want, and how hard they’re thinking about you?

That inverse problem is theory of mind — the cognitive engine under every multi-agent system you’ve built.

今日は矢印を逆向きにします：

誰かが行動するのを見る。その人が何を信じ、何を望み、そしてどれだけ深くあなたについて考えているかを、どこまで推論できるか？

この逆問題が心の理論です — これまで作ってきたすべてのマルチエージェント系の認知エンジンです。

You’ve already done this皆さんはもうやっている

MP3: your agents modeled each other — reciprocators waited to see if anyone else would cooperate; the trust system tracked whether claims matched catches.
MP4: you’re designing rules for agents who respond strategically to them — comply or defy based on trust and payoff.

Every one of those is a mind reading another mind. Today: how that actually works in people — and where it breaks.

MP3： あなたのエージェントは互いをモデル化していた — 互恵者は他の誰かが協力するか様子を見て、信頼システムは主張と漁獲が一致するか追跡した。
MP4： あなたは、ルールに戦略的に反応するエージェントのためのルールを設計している — 信頼と利得に応じて従うか背くか。

これらはすべて、ある心が別の心を読むことです。今日は、それが人間で実際にどう働くか — そしてどこで壊れるか。

Where we are現在地

✓ What is the other player thinking?相手は何を考えている？ 0:00
People aren’t Nash人はナッシュではない 0:07
The beauty contest美人投票ゲーム 0:15
Theory of mind心の理論 0:30
Autism, theory of mind & the beauty contest自閉症・心の理論・美人投票ゲーム 0:53
Blame & intent非難と意図 1:11
One more gameもう一つのゲーム 1:33

2 · People aren’t Nash2 · 人はナッシュではない

Classical game theory predicts… and people deviate古典的ゲーム理論の予測 … そして人はずれる

Classical game theory: what perfectly rational, self-interested players should do.

People deviate — but the deviations are systematic and modelable, not noise.

Same move as Week 3: a normative ideal, a descriptive reality, and a gap where the science lives. (Camerer, Behavioral Game Theory, 2003)

古典的ゲーム理論：完全に合理的で利己的なプレイヤーがすべきこと。

人はそこからずれる — しかしそのずれは体系的でモデル化できるもので、ノイズではない。

第3週と同じ動き：規範的理想、記述的現実、そしてその間のギャップに科学がある。（Camerer, Behavioral Game Theory, 2003）

Let’s play: the ultimatum gameやってみよう：最後通牒ゲーム

There’s ¥1,000 to split. Everyone plays both roles — write down two numbers:

As proposer: your offer — how much of the ¥1,000 you’d give the responder.
As responder: the smallest offer you’d accept.

If an offer ≥ your threshold → split as proposed. If not → both get nothing.
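
The payoff rule above can be written out directly. This is a minimal sketch in Python; the function name `ultimatum_payoffs` and the default pie of 1,000 are illustrative choices, not part of the course code.

```python
def ultimatum_payoffs(offer, threshold, pie=1000):
    """Return (proposer, responder) payoffs for one ultimatum round.

    offer:     amount the proposer gives the responder.
    threshold: smallest offer the responder accepts.
    """
    if not 0 <= offer <= pie:
        raise ValueError("offer must be between 0 and the pie")
    if offer >= threshold:
        return pie - offer, offer  # accepted: split as proposed
    return 0, 0                    # rejected: both get nothing


print(ultimatum_payoffs(offer=400, threshold=300))  # (600, 400)
print(ultimatum_payoffs(offer=100, threshold=300))  # (0, 0)
```

Pairing each proposer's offer with another person's threshold in the room gives the class results in one pass.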

¥1,000 を分けます。全員が両方の役をやります — 2つの数字を書いて：

提案者として： あなたの提案額 — ¥1,000 のうち受け手にいくら渡すか。
受け手として： 受け入れる最低額。

提案額 ≥ あなたの閾値 → 提案通りに分配。そうでなければ → 両者とも何も得られない。

What people actually offer — ultimatum人が実際に提案する額 — 最後通牒

In Western lab samples (Güth et al. 1982; Camerer 2003 synthesis of 30+ studies):

Modal & median offer: 40–50% of the pie. Mean ≈ 40–45%.
Offers below ~20% are rejected about half the time (and more often as they shrink).
Almost no one offers more than 50%.

The standard prediction for rational, purely self-interested players (offer the smallest positive amount, accept anything) fails. Responders pay to punish unfairness; proposers anticipate it.

How does this compare to our room?

欧米の実験室サンプルでは（Güth et al. 1982；Camerer 2003 が30以上の研究を総括）：

最頻・中央の提案：パイの40〜50%。平均 ≈ 40〜45%。
約20%未満の提案は約半分の確率で拒否される（小さくなるほど拒否率は上がる）。
50%を超える提案はほとんどない。

合理的予測 — 最小の正の額を提案し、何でも受け入れる — は間違い。受け手はコストを払って不公平を罰し、提案者はそれを見越す。

我々の部屋とどう違う？

Now remove the veto: the dictator game拒否権を外す：独裁者ゲーム

Same ¥1,000. One change: the responder has no veto — they just receive whatever you give.

You can keep everything with zero risk.

Write your new offer. Did it change from your ultimatum offer?

同じ¥1,000。変更点は一つ： 受け手に拒否権がない — あなたが与えたものをただ受け取る。

あなたはリスクゼロで全額を取れる。

新しい提案額を書いて。最後通牒のときから変わった？

What people actually give — dictator人が実際に与える額 — 独裁者

Engel (2011) meta-study — 600+ treatments, 100+ papers:

Mean given ≈ 28% (vs. ~40% in the ultimatum game).
~36% give nothing. ~17% give exactly half. ~64% give something.

Removing the veto cuts mean giving by roughly a third (from about 40% to about 28%), not to zero. So part of “ultimatum fairness” was fear of rejection, but a real chunk of pure other-regard remains (most people still give something).

Engel (2011) メタ研究 — 600以上の処理、100以上の論文：

平均で約28%を分配（最後通牒の約40%に対して）。
約36%は何も与えない。約17%はちょうど半分。 約64%は何かを与える。

拒否権を外すと、平均の分配はおよそ3分の1減る（約40% → 約28%）が、ゼロにはならない。つまり「最後通牒の公平さ」の一部は拒否への恐れだった。しかし純粋な他者配慮もかなり残る（大半はそれでも何かを与える）。

The same games, different cultures同じゲーム、異なる文化

Almost all of that comes from Western university students. What happens elsewhere?

Henrich et al. (2001): ran the ultimatum game in 15 small-scale societies — Amazonian horticulturalists, African foragers, Indonesian whale hunters, Mongolian herders…

The “fair” 40–50% offer turns out not to be universal at all.

これらのほとんどは欧米の大学生から得られたもの。他の場所ではどうか？

Henrich et al. (2001)： 15の小規模社会で最後通牒ゲームを実施 — アマゾンの園耕民、アフリカの狩猟採集民、インドネシアの捕鯨民、モンゴルの牧畜民…

「公平な」40〜50%の提案は、実はまったく普遍的ではないと判明した。

Offers range from 26% to 58%提案額は26%から58%まで

Machiguenga (Peru, family-level farming): mean ~26% — and almost no rejections, even of low offers.
Lamalera (Indonesia, cooperative whale hunters): mean ~58% — hyper-fair.
Au & Gnau (Papua New Guinea): often reject hyper-fair offers — accepting a big gift creates a debt.

The pattern: the more a society depends on cooperation in production and market exchange, the fairer the offers — these two factors explain ~68% of the variance across societies (Henrich et al. 2001).

マチゲンガ（ペルー、家族単位の農耕）：平均約26% — そして低い提案でもほとんど拒否しない。
ラマレラ（インドネシア、協同の捕鯨民）：平均約58% — 超公平。
アウ族とグナウ族（パプアニューギニア）：しばしば「超公平」な提案を拒否する — 大きな贈り物を受け取ると負債が生じるため。

パターン： 社会が生産における協力と市場交換に依存するほど、提案は公平になる — この二つの要因が社会間のばらつきの約68%を説明する（Henrich et al. 2001）。

~4 min. Heart of the cross-cultural segment. Three distinct deviations from the Western ~44%, each with a story:
- Machiguenga LOW + no rejection: a family-level economy with little anonymous exchange; “splitting with a stranger” isn’t a familiar script, and there’s no norm to punish stinginess. Henrich’s field quote: rejecting free money “just seemed ridiculous” to them.
- Lamalera HYPER-fair: whale hunting needs large-group cooperation; daily life shares big lumpy payoffs, so generosity is the default.
- Au/Gnau rejecting hyper-fair offers: gift-giving creates obligation, so a too-large offer is a burden; they reject for the OPPOSITE reason Westerners reject low offers.
Henrich’s headline: market integration + cooperation in production explain ~68% of cross-society variance in offers. Exact percentages vary slightly by source (week6-PLAN.md TODO: the chart uses approximate published means).

Discussion hook: “fairness isn’t one human universal — it’s calibrated to how your society actually produces and trades. What’s the ‘fair split’ where you grew up?” Great moment to draw out the international students.

One pattern under all of itすべての根底にある一つのパターン

You can model every one of these results:

Inequity aversion (Fehr & Schmidt 1999): add a penalty for unequal payoffs to the utility function → predicts ultimatum offers, rejections, and dictator giving, with culture setting the weights.
Altruistic punishment (Fehr & Gächter 2002, Nature): people pay out of pocket to punish unfairness — and that sustains cooperation.

Callback to Grisha: Tit-for-Tat’s retaliation is altruistic punishment with a name. Callback to Week 3: same move — write a utility function that predicts the deviation.
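
The inequity-aversion idea can be sketched in a few lines. This is a minimal two-player version of the Fehr & Schmidt (1999) utility; the function names and the example weights (alpha = 1.0, beta = 0.25) are illustrative assumptions, not values from the lecture or fitted to data.

```python
def fehr_schmidt_utility(own, other, alpha, beta):
    """Two-player inequity-averse utility.

    alpha: weight on disadvantageous inequity (the other has more).
    beta:  weight on advantageous inequity (you have more).
    """
    behind = max(other - own, 0)
    ahead = max(own - other, 0)
    return own - alpha * behind - beta * ahead


def responder_accepts(offer, pie, alpha, beta):
    """Accept when the offer is worth at least as much as rejecting.
    Rejecting leaves both players with 0, which has utility 0."""
    return fehr_schmidt_utility(offer, pie - offer, alpha, beta) >= 0


def min_acceptable_offer(pie, alpha):
    """Smallest acceptable offer below half the pie.
    Solves offer - alpha * (pie - 2 * offer) >= 0 for offer."""
    return pie * alpha / (1 + 2 * alpha)


print(responder_accepts(200, 1000, alpha=1.0, beta=0.25))  # False
print(responder_accepts(400, 1000, alpha=1.0, beta=0.25))  # True
print(round(min_acceptable_offer(1000, alpha=1.0)))        # 333
```

With alpha = 0 the responder accepts anything, which recovers the self-interested prediction. In this linear form a dictator gives nothing when beta is below 0.5 and splits evenly when beta is above 0.5, so matching the spread of observed dictator gifts needs a mix of beta values across people.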

これらの結果はすべてモデル化できる：

不平等回避（Fehr & Schmidt 1999）：効用関数に不平等な配分へのペナルティを加える → 最後通牒の提案・拒否、そして独裁者の分配を予測。文化が重みを決める。
利他的処罰（Fehr & Gächter 2002, Nature）：人は自腹を切って不公平を罰し、それが協力を維持する。

グリーシャへの参照： しっぺ返し戦略の報復性は、名前のついた利他的処罰。第3週への参照： 同じ動き — ずれを予測する効用関数を書く。

Where we are現在地

✓ What is the other player thinking?相手は何を考えている？ 0:00
✓ People aren’t Nash人はナッシュではない 0:07
The beauty contest美人投票ゲーム 0:15
Theory of mind心の理論 0:30
Autism, theory of mind & the beauty contest自閉症・心の理論・美人投票ゲーム 0:53
Blame & intent非難と意図 1:11
One more gameもう一つのゲーム 1:33

3 · The beauty contest3 · 美人投票ゲーム

Let’s play. Pick a number 0–100.やってみよう。0〜100の数を選んで。

The winner is whoever’s closest to ⅔ of the average of everyone’s guess.

Write it down. No talking.

勝者は、全員の予想の平均の⅔に最も近い人。

書き留めて。相談はなし。

The reasoning ladder推論のはしご

Level 0: guess randomly → average ≈ 50.
Level 1: “if others are random, I guess ⅔ × 50 ≈ 33.”
Level 2: “if others think that, I guess ⅔ × 33 ≈ 22.”
Iterate forever → the only Nash equilibrium is everyone guesses 0.

But did anyone here guess 0? Almost nobody ever does.
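
The ladder above is a geometric sequence, which a short sketch makes explicit. The function name is illustrative, and the sketch assumes a level-k player ignores their own small effect on the average.

```python
def level_k_guess(k, p=2/3, level0_mean=50.0):
    """Guess of a level-k player who best-responds to level k-1.

    Level 0 averages 50; each further level multiplies by p.
    """
    return level0_mean * p ** k


for k in range(6):
    print(k, round(level_k_guess(k), 1))
# 0 50.0
# 1 33.3
# 2 22.2
# 3 14.8
# 4 9.9
# 5 6.6
```

Each step shrinks the guess by the same factor, so the sequence approaches 0 but only reaches it in the limit of infinitely many steps.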

レベル0： ランダムに予想 → 平均 ≈ 50。
レベル1： 「他人がランダムなら、⅔ × 50 ≈ 33と予想」。
レベル2： 「他人がそう考えるなら、⅔ × 33 ≈ 22と予想」。
永遠に反復 → 唯一のナッシュ均衡は全員が0と予想。

でも、ここで0と予想した人は？ほとんど誰もいない。

Level-k reasoning ladder descending 50 to 33 to 22 to 15 to 0, each level best-responds to the level below

What people actually do人が実際にすること

Across thousands of players (Nagel 1995; Bosch-Domènech et al. 2002 newspaper experiments), guesses cluster at 33 and 22 — people do 1–2 steps, not infinite.

The cognitive-hierarchy model (Camerer, Ho & Chong 2004) puts the mean number of thinking steps at ≈ 1.5.

Depth of reasoning about other minds is not all-or-nothing — it’s a number, and you can measure it.
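
A rough sketch of how a cognitive-hierarchy prediction can be computed, assuming the Poisson form of the model with a mean of 1.5 thinking steps. The function names, the level-0 mean of 50, and the cutoff at level 10 are illustrative assumptions, not details taken from the lecture.

```python
import math


def poisson_pmf(k, tau):
    """Probability that a player does exactly k thinking steps."""
    return math.exp(-tau) * tau ** k / math.factorial(k)


def cognitive_hierarchy_guesses(tau=1.5, p=2/3, level0_mean=50.0, max_level=10):
    """Return (frequencies, guesses) for levels 0..max_level.

    A level-k player believes everyone else is at levels 0..k-1, in
    proportion to the Poisson(tau) frequencies of those levels, and
    guesses p times the average guess expected from that mix.
    """
    freqs = [poisson_pmf(k, tau) for k in range(max_level + 1)]
    guesses = [level0_mean]
    for k in range(1, max_level + 1):
        lower = freqs[:k]
        expected_avg = sum(f * g for f, g in zip(lower, guesses)) / sum(lower)
        guesses.append(p * expected_avg)
    return freqs, guesses


freqs, guesses = cognitive_hierarchy_guesses()
predicted_mean = sum(f * g for f, g in zip(freqs, guesses)) / sum(freqs)
print([round(g, 1) for g in guesses[:3]])
print(round(predicted_mean, 1))
```

Unlike the pure level-k ladder, higher levels here respond to a mix of all lower levels, so their guesses fall more slowly and never reach 0.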

何千人ものプレイヤーで（Nagel 1995；Bosch-Domènech et al. 2002 の新聞実験）、予想は33と22に集中 — 人は1〜2ステップで、無限ではない。

認知階層モデル（Camerer, Ho & Chong 2004）は、平均の思考ステップ数を ≈ 1.5とする。

他者の心についての推論の深さは、全か無かではない — それは数値であり、測定できる。

Histogram of beauty-contest guesses with spikes at 33 and 22 and a smaller spike at 0

Where we are現在地

✓ What is the other player thinking?相手は何を考えている？ 0:00
✓ People aren’t Nash人はナッシュではない 0:07
✓ The beauty contest美人投票ゲーム 0:15
Theory of mind心の理論 0:30
Autism, theory of mind & the beauty contest自閉症・心の理論・美人投票ゲーム 0:53
Blame & intent非難と意図 1:11
One more gameもう一つのゲーム 1:33

4 · Theory of mind4 · 心の理論

From “how deep” to “what’s in there”「どれだけ深く」から「中に何があるか」へ

The beauty contest measured how many steps you reason about others.

But strategic depth presupposes something more basic: that you can represent what someone else believes — even when it’s false, even when it differs from what you know.

That capacity is theory of mind. Where does it come from, and how does it work?

美人投票ゲームは、他者について何ステップ推論するかを測りました。

しかし戦略的深さは、より基本的な何かを前提とします：他者が何を信じているか — それが誤りであっても、自分の知っていることと違っても — を表現できること。

その能力が心の理論です。それはどこから来て、どう働くのか？

The classic test: false belief古典的なテスト：誤信念

Theory of mind = attributing mental states to others (Premack & Woodruff 1978).

The false-belief task — Sally-Anne (Baron-Cohen, Leslie & Frith 1985):

Sally hides her marble in a basket and leaves. Anne moves it to a box. Sally comes back. Where will Sally look for her marble?

Passing requires representing a belief that differs from reality. Most children pass around age 4.

心の理論 = 他者に心的状態を帰属させること（Premack & Woodruff 1978）。

誤信念課題 — サリーとアン（Baron-Cohen, Leslie & Frith 1985）：

サリーがビー玉をかごに隠して出て行く。アンがそれを箱に移す。サリーが戻ってくる。サリーはどこを探す？

正解するには、現実と異なる信念を表現する必要がある。多くの子どもは4歳頃に通過する。

Watch: the Sally-Anne task映像：サリーとアン課題

Watch the child’s answer, and the age at which it flips from “the box” (where it really is) to “the basket” (where Sally thinks it is).

子どもの答えに注目。「箱」（実際にある場所）から「かご」（サリーがあると思っている場所）へと、何歳で切り替わるか。

The computational turn計算論的転回

How does mature theory of mind actually work? Baker, Saxe & Tenenbaum (2009): “Action understanding as inverse planning.”

Week 3: beliefs + utilities → action (planning)
ToM: action → beliefs + utilities (inverse planning)

Theory of mind is your own decision theory, run backwards on someone else.
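
The forward and inverse directions can be sketched with Bayes' rule. This is a one-step toy version, far simpler than the model in Baker, Saxe & Tenenbaum (2009); the softmax choice rule, the three hypotheses, the utility numbers, and all names are illustrative assumptions.

```python
import math


def softmax_policy(utilities, rationality=1.0):
    """Forward model: P(action | utilities) for a noisily rational agent."""
    weights = [math.exp(rationality * u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def infer_utilities(observed_action, hypotheses, prior, rationality=1.0):
    """Inverse model: P(hypothesis | observed action) via Bayes' rule."""
    unnormalized = {
        name: prior[name] * softmax_policy(utils, rationality)[observed_action]
        for name, utils in hypotheses.items()
    }
    total = sum(unnormalized.values())
    return {name: value / total for name, value in unnormalized.items()}


# Hypothetical agent choosing between zone 0 and zone 1.
# Each hypothesis lists the expected utility of fishing in each zone.
hypotheses = {
    "prefers_zone_0": [2.0, 0.0],
    "prefers_zone_1": [0.0, 2.0],
    "indifferent": [1.0, 1.0],
}
prior = {name: 1 / 3 for name in hypotheses}

posterior = infer_utilities(observed_action=0, hypotheses=hypotheses, prior=prior)
for name, prob in posterior.items():
    print(name, round(prob, 2))
```

Watching the agent fish in zone 0 shifts belief toward "prefers_zone_0" without ruling out the other hypotheses, because the forward model allows some noise in choice.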

成熟した心の理論は実際どう働くのか？ Baker, Saxe & Tenenbaum (2009)：「逆プランニングとしての行動理解」。

第3週：信念 + 効用 → 行動（プランニング）
心の理論：行動 → 信念 + 効用（逆プランニング）

心の理論とは、自分の意思決定理論を他者に対して逆向きに動かすことです。

Forward planning (beliefs+utilities to action) vs inverse planning (action to beliefs+utilities)

The machinery is old; the evidence is what’s new仕組みは古い、新しいのは証拠

A fair caution: “infer preferences from choices, assuming utility-maximization” is just revealed preference — neoclassical economics, run in reverse. The theory isn’t new.

What is striking is the empirical reach:

Even infants and toddlers infer others’ costs and rewards from a single choice, and expect agents to act efficiently (Jara-Ettinger et al. 2016; Liu et al. 2017, Science).
Inverse-planning models quantitatively predict adults’ moment-to-moment belief and desire attributions (Baker et al. 2017, Nat. Hum. Behav.).

So: not a new mechanism — the same utility-maximization you met in behavioral game theory — but deployed as mind-reading, from infancy.

正直な注意：「効用最大化を仮定して、選択から選好を推論する」は顕示選好 — 新古典派経済学を逆に回しただけ。理論は新しくない。

新しいのは、実証的な射程です：

幼児でさえ、たった一度の選択から他者のコストと報酬を推論し、エージェントが効率的に行動すると期待する（Jara-Ettinger et al. 2016；Liu et al. 2017, Science）。
逆プランニングモデルは、成人の刻々の信念・欲求の帰属を定量的に予測する（Baker et al. 2017, Nat. Hum. Behav.）。

つまり：新しい仕組みではない — 行動ゲーム理論で出会った効用最大化と同じ — しかし乳児期から、心を読むために使われている。

~3 min. Per Joe’s note: the naïve-utility-calculus THEORY is essentially revealed preference / utility-maximization (already in neoclassical econ + behavioral game theory) run as inference — say so plainly; it builds credibility. The real contribution is EMPIRICAL: (a) the developmental finding that infants/toddlers already do this (Jara-Ettinger 2016 review; Liu, Ullman, Tenenbaum & Spelke 2017 Science — infants infer the value of a goal from the cost an agent pays), and (b) quantitative fits to adult judgments (Baker et al. 2017). The Bayes equation from the prior slide is optional scaffolding; this slide is the honest framing of what’s genuinely new. Neural aside if time: TPJ recruited specifically for others’ beliefs (Saxe & Kanwisher 2003) — “a brain region for inverse planning.” (JA confirmed: “revealed preference” = 顕示選好; “naïve utility calculus” = 素朴効用計算.)

Break休憩

Where we are現在地

✓ What is the other player thinking?相手は何を考えている？ 0:00
✓ People aren’t Nash人はナッシュではない 0:07
✓ The beauty contest美人投票ゲーム 0:15
✓ Theory of mind心の理論 0:30
Autism, theory of mind & the beauty contest自閉症・心の理論・美人投票ゲーム 0:53
Blame & intent非難と意図 1:11
One more gameもう一つのゲーム 1:33

5 · Autism, theory of mind & the beauty contest5 · 自閉症・心の理論・美人投票ゲーム

Theory of mind and autism: the classic finding心の理論と自閉症：古典的な知見

The Sally-Anne task wasn’t designed for children in general — it was designed to study autism (Baron-Cohen, Leslie & Frith 1985, “Does the autistic child have a theory of mind?”).

The classic result: many autistic children passed control questions but failed the false-belief question at higher rates than matched peers.
This launched the “mindblindness” hypothesis — autism as a specific difficulty representing others’ mental states.

For a long time this was the textbook story. The next results complicate it.

サリーとアン課題は、子ども一般のために作られたのではなく、自閉症を研究するために作られました（Baron-Cohen, Leslie & Frith 1985「自閉症児は心の理論を持つか？」）。

古典的な結果：多くの自閉症児は統制質問には答えたが、誤信念質問には対応する仲間より高い割合で不正解だった。
これが「マインドブラインドネス」仮説を生んだ — 自閉症を、他者の心的状態を表現する特異的な困難とみなす見方。

長らくこれが教科書的な物語でした。次の結果がそれを複雑にします。

Reframing it: the double-empathy problem再考：二重共感問題

The “mindblindness” story is one-sided — it measures autistic people reading neurotypical minds, and calls the gap a deficit.

Milton (2012), the double-empathy problem: the mismatch is bidirectional — each group struggles to read the other. Neurotypical people are no better at reading autistic minds than vice versa.

Recent work: autistic-to-autistic communication can be as effective as neurotypical-to-neurotypical.

Reframe, using our spine: not a broken module, but two differently-tuned inverse planners reading each other.

「マインドブラインドネス」の物語は一方的です — 自閉症の人が定型発達の心を読む能力を測り、その差を欠損と呼んでいる。

Milton (2012)、二重共感問題： ミスマッチは双方向 — 各集団がもう一方を読むのに苦労する。定型発達の人が自閉症の心を読むのは、その逆より上手いわけではない。

最近の研究：自閉症者同士のコミュニケーションは、定型発達者同士と同じくらい効果的でありうる。

再構成（スパインを使って）：壊れたモジュールではなく、互いを読み合う異なる調整の逆プランナー二つ。

So: a clean predictionでは：きれいな予測

Put the two threads together. The beauty contest requires recursive theory of mind — “I think that you think…”. Autism is classically associated with theory-of-mind differences.

So: should autistic players reason to a shallower level in the beauty contest? What’s your prediction?

二つの糸を合わせます。美人投票ゲームは再帰的な心の理論を必要とする — 「私はあなたが考えていると考える…」。自閉症は古典的に心の理論の違いと関連づけられてきた。

では：自閉症のプレイヤーは美人投票ゲームでより浅いレベルで推論するはず？あなたの予測は？

Result 1 — the surprise結果1 — 意外な結果

Pantelis & Kennedy (2017), Cognition — “Autism does not limit strategic thinking in the beauty contest game.”

ASD vs. neurotypical: statistically indistinguishable in strategic depth.
ASD mean ≈ 30.2, controls ≈ 31.8; same share of “higher-order” players.
Bayes Factor: moderate evidence for the null.

Look how nearly identical the two distributions are →

Pantelis & Kennedy (2017), Cognition — 「自閉症は美人投票ゲームでの戦略的思考を制限しない」。

ASD 対定型発達：戦略的深さで統計的に区別できない。
ASDの平均 ≈ 30.2、対照群 ≈ 31.8；「高次」プレイヤーの割合も同じ。
ベイズ因子：帰無仮説を中程度に支持。

二つの分布がほぼ同一なことに注目 →

Pantelis & Kennedy 2017 mirror-plot: ASD guess distribution (top) vs Control (bottom) are nearly identical, with overlapping mean lines near 30-31.

Guess distributions, ASD (top) vs. control (bottom) — the dashed mean lines almost coincide. (Pantelis & Kennedy 2017, Exp 2)

Result 2 — the reversal結果2 — 逆転

Król & Król (2019), Thinking & Reasoning — “Autism limits strategic thinking after all.”

They replicated the outcome null — but added a payoff calculator to trace the process.
Neurotypicals played best-response to the hypothetical others they entered.
Autistic participants were less strategic in process — answers larger relative to what they attributed to others — even though the final numbers matched.
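
One way to picture a process measure of this kind: compare a participant's final answer with the best response to the guesses they attributed to others. This is an illustrative sketch only; the function names, the ⅔-of-the-mean approximation to the best response, and the example numbers are assumptions, not the measure or data from Król & Król (2019).

```python
def best_response(entered_guesses, p=2/3):
    """Approximate best response: p times the mean of the guesses
    a participant attributed to the other players."""
    return p * sum(entered_guesses) / len(entered_guesses)


def process_gap(final_guess, entered_guesses, p=2/3):
    """Positive gap: the final answer is larger than the best response
    to the participant's own stated expectations."""
    return final_guess - best_response(entered_guesses, p)


# Two hypothetical participants with the same final guess of 30.
gap_a = process_gap(30, [40, 45, 50])  # expects others around 45
gap_b = process_gap(30, [20, 25, 30])  # expects others around 25
print(round(gap_a, 2), round(gap_b, 2))
```

Both participants give the same outcome, but only the first answer is close to a best response to the player's own expectations. That is the sense in which outcomes can match while process differs.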

Outcome (top): same. Process (bottom): different. →

Król & Król (2019), Thinking & Reasoning — 「やはり自閉症は戦略的思考を制限する」。

彼らは結果の帰無を再現 — しかし過程を追跡するペイオフ計算機を加えた。
定型発達者は、入力した仮想の他者に対して最適応答を行った。
自閉症の参加者は過程において戦略性が低かった — 自分が他者に帰属させた値に比べて答えが大きかった — 最終的な数値は一致していたにもかかわらず。

結果（上）：同じ。過程（下）：異なる。 →

Top: ASD vs neurotypical guess distributions overlap (outcome null). Bottom: best-response process measure separates the groups.

~4 min. The figure is the punchline, now stacked vertically in the right column: TOP panel = outcomes overlap (Pantelis & Kennedy null), BOTTOM panel = process diverges (Król & Król). The bottom panel’s numbers are schematic; the qualitative finding is real. This pairs with the real P&K distribution figure on Result 1: same outcomes there, process gap here — “what you measure decides what you conclude.”

The study (Król & Król 2019, “Autism limits strategic thinking after all: A process-tracing study of the beauty-contest game,” Thinking & Reasoning 25(3):339–367; Michał Król, Univ. of Manchester + Magdalena Król, SWPS Univ., Poland):

Why they ran it. Pantelis & Kennedy (2017) had found autistic and neurotypical players gave statistically indistinguishable beauty-contest answers. Król & Król’s worry: a final number is a black box — two people can land on the same guess for completely different reasons. So they built a way to watch the reasoning, not just the output.
The method — a “payoff calculator.” Instead of just asking for one number, the interface let each participant type in what they thought the other players would guess, and it showed the resulting payoff for any answer they considered. So you can see (a) what others each player imagined, and (b) whether their final guess was a genuine best response to those imagined others (i.e. ≈ ⅔ of the average they entered). This turns an invisible thought process into a recorded trace.
What they found.
- Outcomes replicated the null — final guesses were again similar across groups. By the Pantelis & Kennedy measure, “no difference.”
- Process diverged. Neurotypical participants best-responded to the others they entered — their final answer tracked ⅔ of their own stated expectation. Autistic participants were less strategic in process: their answers were larger relative to what they attributed to others, and less tightly a best-response to their own entered hypotheticals. In effect they reached similar numbers without going through the same recursive best-response computation.
- Crucially, the process gap was NOT explained by a measured theory-of-mind difference between the groups. So the authors frame it as a difference in strategic/process engagement, not a clean “ToM deficit.”
The lesson for the lecture (and the course). Identical outcomes concealed a real difference in process — only a method that traced the reasoning could surface it. That’s the methods beat: what you choose to measure determines the conclusion you reach. Tie it to MP2 (same agent behavior, different internal algorithm — you debugged exactly this) and MP4 (a fair outcome can sit on top of an unfair process).
Caveats to keep honest. This is one study; the Pantelis & Kennedy outcome result stands and is not overturned — Król & Król add a process layer rather than refute the null. Treat the bottom-panel “process score” in our figure as illustrative, not a quoted statistic (verify exact figures against the paper before citing numbers). (JA: “payoff calculator” = ペイオフ計算機, “best response” = 最適応答, “process tracing” = 過程追跡 — confirmed; all appear in the JA slide body.)

The lesson教訓

What you measure determines what you conclude.

Look only at outcomes → “no difference.”
Trace the process → a difference appears.

An agent — or a person, or a fairness rule — can produce the right output for the wrong reasons.

MP2: same agent behavior, different internal algorithm — you debugged exactly this. MP4: a fair outcome can hide an unfair process.

何を測るかが、何を結論するかを決める。

結果だけを見る → 「違いはない」。
過程を追う → 違いが現れる。

エージェント — あるいは人、あるいは公平性ルール — は、間違った理由で正しい出力を生み出しうる。

MP2： 同じエージェントの振る舞い、異なる内部アルゴリズム — まさにこれをデバッグした。MP4： 公平な結果が、不公平な過程を隠しうる。

Where we are現在地

✓ What is the other player thinking?相手は何を考えている？ 0:00
✓ People aren’t Nash人はナッシュではない 0:07
✓ The beauty contest美人投票ゲーム 0:15
✓ Theory of mind心の理論 0:30
✓ Autism, theory of mind & the beauty contest自閉症・心の理論・美人投票ゲーム 0:53
Blame & intent非難と意図 1:11
One more gameもう一つのゲーム 1:33

6 · Blame & intent6 · 非難と意図

A vote: did he do it on purpose?投票：彼はわざとやったか？

A chairman is told a new program will increase profits and — as a side effect — harm the environment. He says: “I don’t care about the environment. I just want profit.” The program runs; the environment is harmed.

Did the chairman harm the environment intentionally? Hands up.

ある会長が、新しい事業は利益を増やし — 副作用として — 環境を害すると告げられる。彼は言う：「環境はどうでもいい。利益が欲しいだけだ」。事業は実行され、環境は害された。

会長は意図的に環境を害したか？ 挙手で。

~4 min. Single condition, whole room (per Joe — with n=5 a between-subjects harm/help split gives ~2–3 per cell, too few for the contrast to show). Framing to say aloud (moved here off the slide): “Whole room, this one version. He said he didn’t care — so by a reasons-first account he had no intention to harm. Watch how many of you say ‘intentional’ anyway.” Ask ONLY the harm version: “Did he harm the environment intentionally?” Almost everyone says yes, intentional. But notice the setup: he was indifferent — harm was a foreseen side effect, not a goal. Under a clean blame-late account, “intentional” requires intending the outcome, and he didn’t intend the harm — so it should NOT read as intentional. The fact that it does is the puzzle: the badness of the outcome is making it feel intentional. That’s the live demonstration; the published help-condition contrast (23% — the version you didn’t need to run with this group) is revealed on the next slide. Do NOT reveal the 82/23 numbers yet.

The side-effect effect副作用効果

Knobe (2003) — across studies:

82% say he harmed the environment intentionally (≈ your show of hands).
Flip one word to help — same indifferent chairman — and only 23% say he helped intentionally.

He had the same mental state both times: he didn’t care. A reasons-first (blame-late) account would call both unintentional.

Yet “intentional” tracks bad vs. good, not his actual intent — the badness comes first and pulls the judgment with it.

Knobe (2003) — 複数の研究で：

82% が、彼は環境を害したのは意図的だと言う （≈ 皆さんの挙手）。
一語を助けるに変えると — 同じ無関心な会長 — 23% しか助けたのは意図的だと言わない。

彼の心的状態は両方で同じ — どうでもよかった。理由先行（遅い非難）ならどちらも非意図的と呼ぶはず。

それでも「意図的」は彼の実際の意図ではなく悪いか善いかを追う — 悪さが先に来て判断を引きずる。

Bar chart: 82% intentional for harm condition, 23% for help condition

Blame: early or late?非難：早いか遅いか？

The Knobe asymmetry raises a deeper question: when you blame someone, when does the judgment happen?

Blame late: weigh cause, intent, consequences → then judge. Reasoning → blame.
Blame early: feel blame first, then assemble reasons to justify it. Blame → reasoning.

Two pictures of moral judgment. The evidence cuts both ways — let’s look at each.

クノービの非対称性は、より深い問いを投げかけます：誰かを非難するとき、その判断はいつ起こるのか？

遅い非難： 原因・意図・結果を比べ → それから判断。推論 → 非難。
早い非難： まず非難を感じ、それを正当化する理由を後から組み立てる。非難 → 推論。

道徳的判断の二つの見方。証拠は両方を支持する — それぞれ見ていきましょう。

Blame early: the gut goes first早い非難：直感が先

Evidence for blame early — quick gut reactions, with reasons built afterward (Haidt 2001, “the emotional dog and its rational tail”).

Moral dumbfounding: people stay certain even when they run out of reasons.

Julie and Mark, adult siblings on holiday, decide to sleep together once — two forms of contraception, no harm, kept secret. Was it wrong?

Most say “yes — but I can’t explain why.” Judgment outruns justification.

Reasons can’t be the whole story if there are no reasons to give.

早い非難の証拠 — 速い直感的反応、理由は後から作る（Haidt 2001「感情という犬と理性という尻尾」）。

道徳的当惑（moral dumbfounding）： 理由が尽きても人は確信を保つ。

成人したきょうだいのジュリーとマークが、旅行中に一度だけ関係を持つと決める — 二重の避妊、害はなく、秘密にする。それは間違いだった？

ほとんどが「間違い — でもなぜかは説明できない」と言う。判断が正当化を追い越す。

与える理由がないなら、理由がすべてではありえない。

~4 min. From Feb21SocialCog slides 7+9. Haidt’s intuitionist case for blame-early. Julie & Mark is the canonical moral-dumbfounding probe (Haidt, Bjorklund & Murphy 2000; Haidt 2001) — the vignette strips out every harm-based reason, yet the “wrong!” survives. Joe’s deck phrases the prompt as “Why exactly is incest immoral? If blame were due to reasons, people should be able to explain this.” Read carefully and respectfully — it’s deliberately taboo; the point is the dumbfounding, not the content. Caveat (one sentence if asked): critics note many participants reject the “harmless” stipulation (Royzman et al. 2015) — but judgment-outrunning-justification is robust. This is the cleanest blame-early case (NO reasons available); the next slide gives the second one (Knobe re-read). (JA confirmed: “moral dumbfounding” = 道徳的当惑.)

Blame early: re-reading the Knobe effect早い非難：クノービ効果の再解釈

Look back at the chairman. The story, his indifference, the structure — all identical. Only the outcome’s valence flipped: harm vs. help.

Harm → 82% call it intentional. Help → 23%.

If intentionality were read off behavior first and fed into blame, valence couldn’t move it. Instead it looks like we judge the actor bad first (he didn’t care, and harm resulted) — and that verdict pulls “intentional” along with it.

Affect about the agent shaping a judgment that’s supposed to be an input to blame: that’s blame early.

会長の話に戻りましょう。話も、彼の無関心も、構造も — すべて同一。変わったのは結果の価だけ：害対益。

害 → 82%が意図的と判断。益 → 23%。

もし意図性がまず行動から読み取られ、それが非難に入力されるなら、価がそれを動かせるはずがない。むしろ、我々はまず行為者を「悪い」と判断し（彼は気にせず、害が生じた）、その判断が「意図的」を引きずってくるように見える。

非難への入力であるはずの判断を、行為者への情動が形づくる — それが早い非難。

~3 min. This replaces the earlier Alicke “cocaine driver” slide. Per Joe’s correction: the cocaine/gift example is NOT clean blame-early evidence — motive IS a reason, so “bad reason → more blame” is exactly the blame-LATE path-model prediction (Malle’s intentional branch scales blame with reasons). It doesn’t isolate affect-first processing. The Knobe re-read is the better second example (alongside dumbfounding): the actor’s valence flips the intentionality attribution, and intentionality is supposed to be an INPUT to the blame computation, not an output of how-bad-the-actor-seems (Feb21SocialCog slide 8: “Pre-judge CEO as bad and so we give responsibility for bad things, but not good things”). The genuinely blame-early residue of Alicke is the finding that bad motive inflates causal-responsibility judgments for an identical physical accident (causation shouldn’t depend on motive) — that belongs on the “Both” synthesis slide as the affect-bleeds-into-the-late-computation bridge, not here. (JA: side-effect / valence wording = 副作用効果・価 — confirmed, matches the JA slide body.)

Evidence for blame late (Malle, Guglielmo & Monroe 2014): much blame is structured reasoning, not reflex —

Detect a negative event.
Was there a causal agent? (No → no blame.)
Was it intentional?
- Intentional → blame scales with the agent’s reasons (selfish → more; good goal → less).
- Unintentional → blame scales with obligation + capacity (should they have prevented it, and could they?).

Intent raises blame for the same outcome (manslaughter vs. murder); lack of knowledge lowers it. The inverse-planning machinery again: cause → intent → reasons.
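
The steps of the path model can be sketched as a small decision function. This is a qualitative sketch of the branches listed above; the function name and return strings are illustrative, and the model itself assigns no numeric blame scores here.

```python
def blame_path(negative_event, causal_agent, intentional):
    """Walk the path model (Malle, Guglielmo & Monroe 2014) and
    report what blame depends on at the end of the path."""
    if not negative_event:
        return "no blame: no negative event detected"
    if not causal_agent:
        return "no blame: no causal agent"
    if intentional:
        # Selfish reasons raise blame; a good goal lowers it.
        return "blame scales with the agent's reasons"
    # Should they have prevented it, and could they?
    return "blame scales with obligation and capacity"


print(blame_path(True, True, True))
print(blame_path(True, True, False))
print(blame_path(True, False, False))
```

The order of the checks matters: intentionality is only assessed once an agent has been identified, and reasons are only weighed on the intentional branch.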

遅い非難の証拠（Malle, Guglielmo & Monroe 2014）：非難の多くは反射ではなく構造化された推論 —

否定的な出来事を検出。
原因となるエージェントがいたか？（いない → 非難なし。）
それは意図的だったか？
- 意図的 → 非難はエージェントの理由に応じて増減（利己的 → 増、善い目的 → 減）。
- 非意図的 → 非難は義務 + 能力に応じて増減（防ぐべきだったか、そして防げたか）。

同じ結果でも意図が非難を高める（過失致死対殺人）；知識の欠如は下げる。再び逆プランニングの仕組み — 原因 → 意図 → 理由。

Decision tree: negative event, causal agent, intentional, then reasons or obligation+capacity

Both — and the verdict is social両方 — そして判断は社会的

So which is it? Both — fast affect and structured reasoning run, and they interact.

And the verdict isn’t even purely individual — the group bends it.

Asch (1951): pick which line matches — easy, unambiguous. But confederates all give the same wrong answer.

~37% conformed on critical trials; ~75% at least once (alone: <1% errors).
Mostly normative (fit in), not informational — people saw the right answer and went along.

If even judgments about line length bend to the group, moral judgment plausibly does too.

ではどちら？両方 — 速い情動と構造化された推論が働き、相互作用する。

そして判断は純粋に個人的でさえない — 集団がそれを曲げる。

Asch (1951)： どの線が一致するか選ぶ — 簡単で曖昧さがない。しかしサクラ全員が同じ誤答をする。

約37%が重要試行で同調；約75%が少なくとも一度（単独なら誤答1%未満）。
大半は規範的（馴染むため）で情報的ではない — 正答が見えていながら合わせた。

線の長さの判断でさえ集団に曲げられるなら、道徳的判断もおそらくそうなる。

Asch conformity: alone, error rate under 1%; when the group gives a wrong answer, ~37% conform; ~75% conform at least once.

~4 min. Synthesis (Feb21SocialCog slide 8 logic + slide 25 Asch). First reconcile early/late: BOTH run and interact. The legitimate blame-EARLY residue of Alicke (1992) belongs HERE, not as a standalone early example: bad motive inflates judgments of causal responsibility for an identical physical accident — causation shouldn’t depend on motive, so that’s affect bleeding into the “objective” late computation (“blame-validation,” Alicke 2000). (We dropped Alicke as a standalone blame-early slide because motive = a reason, which the blame-late path model already explains.) Then Asch as its own social-influence beat (per Joe): even unambiguous perception conforms, so moral judgment certainly can. Verified Asch (1951/1956): ~37% conformity on critical trials, ~75% conform ≥once, ~25% never, <1% control error; normative vs informational (Deutsch & Gerard 1955), debriefs leaned normative. Course tie-in: norms + conformity are the social-enforcement layer behind Grisha’s cooperation and Mizuki’s trust.

Mind perception: who can be blamed?心の知覚：誰を非難できるか？

Gray, Gray & Wegner (2007): mind perception has two dimensions —

Agency — planning, self-control (the capacity to do)
Experience — feeling pain, emotion (the capacity to feel)

Dyadic morality (Gray, Young & Waytz 2012): a moral situation is read as an intentional agent acting on a feeling patient.

Blame needs an agent with agency; harm needs a patient with experience. That’s why we argue over whether a company, an AI, or an animal can be blamed.

Gray, Gray & Wegner (2007)： 心の知覚には2つの次元がある —

行為主体性（agency） — 計画、自制（行う能力）
経験（experience） — 痛みや感情を感じること（感じる能力）

二者間道徳（Gray, Young & Waytz 2012）：道徳的状況は、意図的な主体が、感じる受け手に作用するものとして読まれる。

非難には行為主体性を持つ主体が必要；危害には経験を持つ受け手が必要。だからこそ、企業やAIや動物を非難できるかを我々は議論する。

~5 min. Agency vs experience (Gray/Gray/Wegner 2007, Science — verified). Dyadic morality / moral typecasting (Gray/Young/Waytz 2012). The AI/company/animal hook is the live, course-relevant payoff — mind perception is why blameability is contested for non-human agents.

Latest stance on blaming AI (brief research, 2023–2025), the new yellow line:
- Mind perception drives AI blame. Joo (2024, PLOS One, “It’s the AI’s fault, not mine: Mind perception increases blame attribution to AI”): across 3 studies, the more human-like / mind-endowed people rate an AI, the more blame it gets for a moral violation; notably this shifts blame onto the AI without much reducing the company’s share when each is rated separately. (Press echo: ScienceDaily, Dec 2024, “Human-like AI may face greater blame for moral violations.”) This is the exact agency-dimension prediction of this slide, made live for AI.
- The “responsibility gap.” Because AI lacks full agency/consciousness, no single party cleanly fits the agent slot, so blame distributes across developer → deployer/organization → user (shared-liability picture; e.g. healthcare AI = developer + hospital + clinician). Some philosophers argue the gap is overstated (whoever knowingly deploys is responsible), but empirically people still hesitate to blame the autonomous system the way they’d blame a human driver.
- Self-interest twist. Dong & Bocian, “Responsibility gaps and self-interest bias” (2024, JESP): people transfer moral responsibility to AI more for their own hybrid transgressions than for others’, a motivated rather than purely rational attribution. Good provocation for a design-ethics aside.
- (Verify author names/years against the papers before citing precise stats; the qualitative directions are solid. For citation counts use Google Scholar.)

Callbacks (~3 min): MIZUKI — “your trust system dropped trust when a claim didn’t match a catch. That’s intent attribution: the agent meant to deceive. You built a tiny blame model.” CATHARINA/MP4 — “when an allocation rule produces disparate impact with no intent, is anyone to blame? The path model shifts unintentional harm to obligation + capacity — did the designer have a duty and the ability to prevent it? That’s the live question in your fairness project.” KNOBE→AI — “when an AI causes harm as a foreseen side effect, the side-effect effect predicts people call it intentional.”

Where we are現在地

✓ What is the other player thinking?相手は何を考えている？ 0:00
✓ People aren’t Nash人はナッシュではない 0:07
✓ The beauty contest美人投票ゲーム 0:15
✓ Theory of mind心の理論 0:30
✓ Autism, theory of mind & the beauty contest自閉症・心の理論・美人投票ゲーム 0:53
✓ Blame & intent非難と意図 1:11
One more gameもう一つのゲーム 1:33

7 · One more game7 · もう一つのゲーム

Let’s auction ¥1,0001,000円を競売します

I’m selling this ¥1,000 note. Open bidding, ¥50 increments.

Highest bidder wins the ¥1,000 — and pays their bid.
The runner-up also pays their last bid — and gets nothing.

Who’ll start at ¥50?

(We won’t really collect — but bid as if it’s real. Let’s see where it stops.)

この1,000円札を売ります。公開入札、50円刻み。

最高額の入札者が1,000円を落札 — 入札額を支払う。
2位の入札者も最後の入札額を支払う — そして何も得られない。

50円から始める人は？

（実際には集めません — でも本物のつもりで入札を。どこで止まるか見てみましょう。）

~6 min. RUN IT LIVE (the dollar auction, here in yen; Shubik 1971). Rules: auction a real ¥1,000 note in ¥50 increments; the top two bidders both pay, and only the top one gets the ¥1,000. Start the bidding and let it run, writing bids on the board. With even 5 students it usually escalates past ¥1,000, because once you’re in 2nd place, bidding one more increment to maybe win the note looks better than paying your current bid for nothing. That local rationality is the trap. DON’T explain the trap yet: let them feel the pull first; the reveal is the next slide. (No money is actually collected; engagement comes from bidding as if real.) If nobody bids past ~¥500, prompt the current 2nd-place bidder: “you’re about to pay ¥X for nothing. Want to bid one more?” Note: ¥1,000 keeps the round-bill framing and matches the ¥1,000 stake used in the ultimatum/dictator games earlier, a nice callback.

Why ¥1,000 sold for more than ¥1,000なぜ1,000円が1,000円超で売れたか

The dollar auction (Shubik 1971). Each next bid is locally rational — “pay ¥50 more and I might win ¥1,000” beats “pay my current bid for nothing.” But follow that logic and the bids sail past ¥1,000.

It’s escalation of commitment / sunk-cost entrapment — a war of attrition where quitting realizes your loss.
The trap isn’t in the bidders. It’s in the rules — the #2-pays mechanism is engineered to turn rational steps into a collective loss.

A theory-of-mind failure, too: you don’t reason far enough about where the other bidder’s identical logic leads.

ドル・オークション（Shubik 1971）。次の一手はどれも局所的には合理的 — 「あと50円払えば1,000円を取れるかも」は「今の入札額を払って何も得ない」より良い。しかしその論理を辿ると入札は1,000円を超えていく。

これはコミットメントのエスカレーション／サンクコストの罠 — 降りると損失が確定する消耗戦。
罠は入札者の中ではない。ルールの中にある — 2位も支払うという仕組みが、合理的な一歩を集団の損失に変えるよう設計されている。

これは心の理論の失敗でもある：相手の同じ論理がどこへ向かうかを十分に推論していない。
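
The escalation logic on this slide can be sketched in a few lines. This is a minimal illustration and not Shubik's analysis: it assumes exactly two bidders who look only one step ahead, and it gives each a hypothetical budget cap (the `budgets` values are invented for the example) so the loop terminates.

```python
def dollar_auction(prize=1000, step=50, budgets=(3000, 2500)):
    """Simulate two myopic bidders in a both-pay (dollar) auction.

    Assumptions (not from the lecture): exactly two bidders, each with a
    hypothetical budget cap, each comparing only "quit now" vs "take the lead".
    """
    bids = [0, 0]      # latest standing bid of each bidder
    history = []       # (bidder index, bid) in the order bids were placed
    mover = 0          # bidder 0 opens
    while True:
        new_bid = max(bids) + step        # smallest bid that takes the lead
        payoff_if_quit = -bids[mover]     # pay own last bid, get nothing
        payoff_if_win = prize - new_bid   # best case: the rival then stops
        if new_bid > budgets[mover] or payoff_if_win <= payoff_if_quit:
            break                         # mover drops out
        bids[mover] = new_bid
        history.append((mover, new_bid))
        mover = 1 - mover                 # the other bidder responds

    winner = history[-1][0] if history else None
    payoffs = [-b for b in bids]          # both bidders pay their last bid
    if winner is not None:
        payoffs[winner] += prize          # only the top bidder gets the note
    return {
        "bids": bids,
        "winner": winner,
        "payoffs": payoffs,
        "seller_revenue": sum(bids),
        "n_bids": len(history),
    }


if __name__ == "__main__":
    print(dollar_auction())
```

Once both bidders are in, taking the lead always beats quitting by `prize - 2 * step` (¥900 here), so the one-step comparison never says stop. With the default hypothetical budgets the run ends only when the smaller budget binds, at standing bids of ¥2,550 and ¥2,500: both bidders lose money and the seller collects ¥5,050 for a ¥1,000 note.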

The winner’s curse勝者の呪い

A second auction trap. A jar of coins worth ¥1,000; highest bid wins and pays it.

Bazerman & Samuelson (1983) — in their classic version, MBA students bid on jars worth $8:

Average estimate came in below the true value — yet the average winning bid was well above it, a reliable loss, with losses in more than half of auctions.

The winner is whoever most overestimated. Winning is bad news about your own estimate — and you didn’t reason about what winning reveals about everyone else.

もう一つのオークションの罠。価値1,000円のコイン瓶；最高額が落札し支払う。

Bazerman & Samuelson (1983) — 古典的な版では、MBA学生が価値8ドルの瓶に入札：

平均の見積もりは真の価値を下回った — なのに平均の落札額はそれを大きく上回り、確実に損失、半分以上のオークションで損が出た。

落札者は最も過大評価した人。落札は自分の見積もりについての悪い知らせ — 落札が他者について何を明かすかを推論しなかった。

~3 min (OPTIONAL: first thing to cut if the live auction ran long). A different cognitive failure from the dollar auction: here you lose by failing to reason about what winning itself tells you about everyone else’s information (you only win when you’ve over-estimated relative to the group). Verified figures (Bazerman & Samuelson 1983): $8 jars, avg estimate $5.13, avg winning bid $10.01, ~$2.01 avg loss, losses in >half of auctions. The original $ figures are kept here since they’re the actual published numbers (the jar framing is a thought experiment, not a thing you run live, so the currency mismatch with the live ¥ auction is fine). Inverse-planning fix: condition on “I won,” then ask what that implies about others’ estimates. That is the lecture’s spine, once more. (JA confirmed: “winner’s curse” = 勝者の呪い.)
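
The selection effect behind the winner's curse can be illustrated with a small Monte Carlo sketch. It is a simplified model, not a reconstruction of Bazerman & Samuelson's data: it assumes every bidder gets an unbiased, normally distributed estimate of a common value and naively bids that estimate. The number of bidders, the noise level, and the seed are all hypothetical choices.

```python
import random
import statistics


def simulate_winners_curse(true_value=8.0, n_bidders=10, noise_sd=2.0,
                           n_auctions=5000, seed=0):
    """Common-value auction where each bidder naively bids their own estimate.

    Assumptions (illustrative only): estimates are unbiased with Gaussian
    noise, and the highest estimate wins and pays its bid.
    """
    rng = random.Random(seed)             # local RNG so runs are repeatable
    all_estimates = []
    winning_bids = []
    for _ in range(n_auctions):
        estimates = [rng.gauss(true_value, noise_sd) for _ in range(n_bidders)]
        all_estimates.extend(estimates)
        winning_bids.append(max(estimates))   # the biggest over-estimate wins
    losses = sum(bid > true_value for bid in winning_bids)
    return {
        "mean_estimate": statistics.fmean(all_estimates),
        "mean_winning_bid": statistics.fmean(winning_bids),
        "share_of_auctions_with_loss": losses / n_auctions,
    }


if __name__ == "__main__":
    for name, value in simulate_winners_curse().items():
        print(f"{name}: {value:.2f}")
```

Even though the estimates are right on average in this sketch, the winning bid is systematically above the true value, because winning selects the most optimistic estimate. The inverse-planning fix from the note corresponds to shading your bid after conditioning on "I won."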

Mechanism design can be a weaponメカニズムデザインは武器にもなる

Mechanisms can be designed to exploit predictable reasoning failures:

Reserve prices that extract surplus (Myerson 1981)
Sniping on hard-close auctions (Roth & Ockenfels 2002)
Shill bidding, drip pricing, dark patterns

Not just an efficiency tool — a tool for extraction.

Colin Rowat makes this formal on Tuesday — first- vs. second-price, the revelation principle, incentive compatibility, optimal auctions. I give you the psychology; he gives you the mechanism.

メカニズムは、予測可能な推論の失敗を突くように設計できる：

余剰を抽出する最低落札価格（Myerson 1981）
ハードクローズ・オークションでのスナイピング（Roth & Ockenfels 2002）
見せかけ入札（shill bidding）、ドリップ・プライシング、ダークパターン

効率化の道具であるだけでなく、抽出の道具。

コリン・ロワットが火曜日にこれを形式化します — 第一価格対第二価格、顕示原理、誘因両立性、最適オークション。私は心理を、彼はメカニズムを渡します。

The thread: theory of mind = decision theory backwards一本の糸：心の理論 = 逆向きの意思決定理論

Beauty contest → inferring the depth of others’ reasoning.
Inverse planning → inferring belief + desire from action.
Blame → inferring intent, with a moral overlay.
Auctions → failing to infer what winning reveals about others.

All four are one move: run your decision theory backwards on another mind.

美人投票ゲーム → 他者の推論の深さを推論する。
逆プランニング → 行動から信念 + 欲求を推論する。
非難 → 道徳的な層とともに意図を推論する。
オークション → 落札が他者について明かすものを推論し損ねる。

四つはすべて一つの動き：自分の意思決定理論を、別の心に対して逆向きに動かす。
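
"Running decision theory backwards" can be made concrete with the beauty contest. The sketch below is illustrative only and rests on assumptions not stated on this slide: a level-k model in which a level-0 player guesses 50 and each deeper level multiplies the guess by 2/3, Gaussian noise around each level's guess, and a uniform prior over depths 0 to 4. The function names and the noise level are hypothetical.

```python
import math


def level_k_guess(k, level0=50.0, p=2 / 3):
    """Forward model: the guess a level-k reasoner would make (assumed form)."""
    return level0 * p ** k


def infer_depth(observed_guess, max_k=4, noise_sd=5.0):
    """Inverse model: posterior over reasoning depth k given one observed guess.

    Uses Bayes' rule with a uniform prior over k = 0..max_k and a Gaussian
    likelihood centred on each level's predicted guess.
    """
    likelihoods = []
    for k in range(max_k + 1):
        z = (observed_guess - level_k_guess(k)) / noise_sd
        likelihoods.append(math.exp(-0.5 * z * z))
    total = sum(likelihoods)
    return [value / total for value in likelihoods]


if __name__ == "__main__":
    for guess in (50, 33, 22):
        posterior = infer_depth(guess)
        best_k = max(range(len(posterior)), key=posterior.__getitem__)
        print(guess, best_k, [round(p, 3) for p in posterior])
```

The forward function maps a depth of reasoning to an action; the inverse function takes the observed action and asks which depth best explains it. The same two-step shape (forward model, then Bayes' rule) is what inverse planning applies to beliefs and desires.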

Where this sits in the courseこのコースの中での位置づけ

Grisha gave you the games. Mizuki the ecosystems. Catharina the fairness rules.
Today: the minds underneath — why people play, trust, and judge the way they do.

As you finish MP4: a fair outcome can hide an unfair process — and people will read intent into your system whether or not you put it there.

Tuesday: Colin Rowat, Agents for Economics. Thank you.

グリーシャはゲームを、ミズキは生態系を、カタリーナは公平性のルールを与えた。
今日： その下にある心 — なぜ人はそのように遊び、信頼し、判断するのか。

MP4を仕上げるにあたって：公平な結果が、不公平な過程を隠しうる — そして人々は、あなたが意図を込めたかどうかに関わらず、システムに意図を読み込む。

火曜日：コリン・ロワット『Agents for Economics』。ありがとうございました。

That’s the map全体の地図

✓ What is the other player thinking?相手は何を考えている？ 0:00
✓ People aren’t Nash人はナッシュではない 0:07
✓ The beauty contest美人投票ゲーム 0:15
✓ Theory of mind心の理論 0:30
✓ Autism, theory of mind & the beauty contest自閉症・心の理論・美人投票ゲーム 0:53
✓ Blame & intent非難と意図 1:11
✓ One more gameもう一つのゲーム 1:33