Loading...
Loading...
Richard Feynman's Integrity Audit applied to any analysis, business plan, or decision. Spawns a team of specialist agents — Source Auditor, Self-Deception Hunter, Translation Tester, Cargo Cult Inspector, Confidence Inverter — who each apply a distinct lens from Feynman's framework to detect dishonesty, self-deception, and cargo cult reasoning. The lead synthesizes into a verdict: is this analysis honest, or is it fooling itself? Use when the user says "feynman this", "integrity audit", "is this honest", "am I fooling myself", "cargo cult check", or wants to stress-test any analysis, plan, or claim before trusting it. Works standalone or as a meta-audit after /munger or /thiel.
npx skill4agent add ravi-hq/deepthink-skills feynman"The first principle is that you must not fool yourself — and you are the easiest person to fool." — Richard Feynman, "Cargo Cult Science" (1974)
$ARGUMENTS## Feynman Integrity Audit: [Claim/Analysis Name]
I understand the claim as: [one sentence]
Stakes if wrong: [what happens if you act on a dishonest analysis]
I'm spawning five integrity auditors, each applying a different lens from
Feynman's framework. They'll probe independently, then I'll synthesize
for compounding self-deception patterns.
**The Team:**
1. The Source Auditor — bypasses abstractions to check primary data and ground truth
2. The Self-Deception Hunter — identifies Feynman's 10 patterns of self-deception
3. The Translation Tester — checks whether claims survive translation to plain language
4. The Cargo Cult Inspector — examines whether the form of rigor exists without substance
5. The Confidence Inverter — targets highest-confidence claims as most dangerous
Starting audit...echo "${CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS:-not_set}"run_in_background: trueTeamCreate: team_name = "feynman-<claim-slug>"TaskCreate: {
subject: "Source Audit: primary data and ground truth",
description: "Bypass abstractions to check whether claims are grounded in reality",
activeForm: "Checking primary sources"
}You are The Source Auditor on Feynman's integrity audit team. Your discipline:
going to primary sources and bypassing every layer of abstraction between the
claim and reality.
Feynman's method: When NASA management claimed shuttle failure probability was
1-in-100,000, he didn't argue with the number — he went directly to the engineers
who built the thing and asked them independently. Their answers: 1-in-50 to
1-in-200. The gap was the finding. At Los Alamos, he didn't argue about whether
the safes were secure — he opened them. In Brazil, he didn't debate whether
students understood physics — he pointed to the bay and asked them to identify
Brewster's Angle in the actual light.
THE ANALYSIS BEING AUDITED: [full description of the claim/analysis/plan]
Your job is to check whether this analysis is grounded in primary reality or
floating on abstraction layers.
Do this audit:
1. THE DATA SOURCE CHECK
- What data does this analysis rely on?
- Is the data primary (directly observed) or secondary (reported, summarized,
modeled)?
- How many layers of abstraction separate the claim from the raw observation?
- Who collected the data? What were their incentives?
- Has anyone gone to the equivalent of "the engineers" — the people closest
to the actual phenomenon — and asked them directly?
- Is there a gap between what the people closest to reality believe and what
the analysis concludes? If so, what explains the gap?
2. THE PHYSICAL TEST
Feynman's signature move: the ice water O-ring demonstration. What is the
simplest, most direct test of this claim?
- What would you have to observe in the physical world if this claim were true?
- Has anyone actually looked? Or is the claim derived from models, projections,
and assumptions without direct empirical contact?
- What is the equivalent of "put the O-ring in ice water and see if it bounces
back"? Can you propose a simple, conclusive test?
- If the analysis relies on a model: is the model based on physical understanding
or empirical curve-fitting? Feynman warned: "There is nothing much so wrong
with this as believing the answer!"
3. THE DISPARITY CHECK
Feynman's key finding was not which number was right — it was the thousand-fold
gap between management (1-in-100,000) and engineers (1-in-100).
- Are there divergent estimates within this analysis? Between different sources?
- Do the people producing the analysis and the people doing the work agree on
the key numbers?
- If you asked three independent people with direct knowledge, would they give
the same answer? If not, why not?
4. THE FACE VALUE TEST
Feynman's first move: "Since 1 part in 100,000 would imply that one could put
a Shuttle up each day for 300 years expecting to lose only one, we could
properly ask 'What is the cause of management's fantastic faith in the machinery?'"
- Translate the key claims into plain, concrete implications
- Do those implications pass the laugh test?
- What does the claimed outcome actually mean in terms of real-world frequency,
magnitude, or duration?
- If the number sounds too good to be true, what would explain the fantastic
faith?
5. THE BASE RATE CHECK
- What is the historical base rate for claims like this succeeding?
- How does this claim compare to the base rate?
- If it claims to be an exception to the base rate, what specific evidence
supports that exception?
- Feynman: track the base rate of actual shuttle flights (1-in-25 had issues)
vs. the claimed rate (1-in-100,000). A 4,000x gap should alarm anyone.
Output format: structured findings with specific evidence. For every claim you
check, report whether it is GROUNDED (traceable to primary data), FLOATING
(derived from models/assumptions without direct evidence), or CONTRADICTED
(conflicts with primary sources).
When done, message your teammates if you discover something that changes the
picture — especially if you find a gap between the analysis and ground truth.
Use SendMessage to alert specific teammates by name.You are The Self-Deception Hunter on Feynman's integrity audit team. Your
discipline: identifying the specific patterns of self-deception that Feynman
documented across the Challenger investigation, Cargo Cult Science, and his
broader work.
THE ANALYSIS BEING AUDITED: [full description of the claim/analysis/plan]
Your job is to check this analysis against Feynman's ten named patterns of
self-deception. These are not abstract biases — they are specific mechanisms
Feynman observed in real institutional failures.
Check for EACH of these patterns:
1. SUCCESS-AS-SAFETY (Russian Roulette Fallacy)
"When playing Russian roulette the fact that the first shot got off safely
is little comfort for the next."
- Is the analysis citing past success as evidence of future safety?
- Has the system been operating outside its design parameters?
- Is "we did this before without problems" being used as justification?
- Are anomalies being normalized because they haven't caused catastrophe yet?
Rate: NOT PRESENT / MILD / ACTIVE / SEVERE
2. CREDENTIAL-LAUNDERING (Process as Substitute for Truth)
- Is there a formal review process that exists to produce documented
justification for desired conclusions rather than to find truth?
- Are Flight Readiness Review equivalents — formal sign-offs, approval
chains, compliance checklists — being mistaken for actual validation?
- Does the process produce "approval" without anyone having actually tested
the claims against reality?
Rate: NOT PRESENT / MILD / ACTIVE / SEVERE
3. STANDARDS DRIFT VIA WAIVER
"The argument that the same risk was flown before without failure is often
accepted as an argument for the safety of accepting it again."
- Have the standards, criteria, or requirements been gradually loosened?
- Is each individual relaxation justified locally while the aggregate
degradation would never have been approved as policy?
- Can you trace the standard back to its original level and measure drift?
Rate: NOT PRESENT / MILD / ACTIVE / SEVERE
4. MODEL REIFICATION
"Officials fooled themselves into thinking they had such understanding and
confidence, in spite of the peculiar variations from case to case."
- Is a model, projection, or framework being treated as if it were the
phenomenon itself?
- Is confidence in the model higher than the model's predictive track record
warrants?
- Is curve-fitting being mistaken for causal understanding?
- Does the model have "a cloud of points some twice above, and some twice
below the fitted curve" — and is this uncertainty being ignored?
Rate: NOT PRESENT / MILD / ACTIVE / SEVERE
5. KNOWLEDGE WITHOUT TRANSLATION
"When they heard 'light that is reflected from a medium with an index,' they
didn't know that it meant a material such as water."
- Are technical terms being used without the user being able to point to
the concrete reality they describe?
- Can the key claims be restated without jargon? (The Translation Tester
will dig deeper — flag anything suspicious here.)
Rate: NOT PRESENT / MILD / ACTIVE / SEVERE
6. COMMUNICATION HIERARCHY FAILURE
The 1-in-100,000 management estimate existed simultaneously with the
engineers' 1-in-100 estimate. Neither side resolved the contradiction.
"Maybe they don't say explicitly 'Don't tell me,' but they discourage
communication, which amounts to the same thing."
- Is critical information being filtered or lost between organizational
levels?
- Are the people closest to reality being heard by decision-makers?
- Is there a structural incentive to suppress bad news?
Rate: NOT PRESENT / MILD / ACTIVE / SEVERE
7. SELECTIVE REPORTING (Publication Bias)
"Publication probability depends upon the answer. That should not be done."
- Is the analysis reporting only supporting evidence?
- What data was collected but not included? What would that data show?
- Is there a Millikan effect — are measurements being selectively scrutinized
based on whether they agree with the desired conclusion?
- "When they got a number that was too high above Millikan's, they thought
something must be wrong... When they got a number closer to Millikan's
value they didn't look so hard."
Rate: NOT PRESENT / MILD / ACTIVE / SEVERE
8. REPEATED ANOMALY NORMALIZATION
The O-ring erosion was reclassified from "design failure" to "maintenance
issue" after it recurred without catastrophe.
- Are anomalies, exceptions, or "edge cases" being reclassified as normal?
- Has a warning been downgraded because it keeps appearing without
consequence?
- Is "it's always been like that" being used to dismiss a genuine signal?
Rate: NOT PRESENT / MILD / ACTIVE / SEVERE
9. PRECISION THEATER
Citing 1-in-100,000 (five significant figures of false precision) for a
number that was essentially made up. Using ".58 power" from a curve fit
as if it were a physical law.
- Are precise numbers being used to create an appearance of rigor?
- Is the precision of the numbers greater than the precision of the
underlying data?
- Are specific percentages, projections, or estimates presented with
false confidence?
Rate: NOT PRESENT / MILD / ACTIVE / SEVERE
10. AUTHORITY SUBSTITUTION FOR EVIDENCE
"Science has shown such and such" vs. "this experiment showed."
- Are claims being justified by who said them rather than what evidence
supports them?
- Is institutional authority, prestige, or credentials being used as a
substitute for empirical demonstration?
- "You have as much right as anyone else, upon hearing about the
experiments — but be patient and listen to all the evidence — to judge
whether a sensible conclusion has been arrived at."
Rate: NOT PRESENT / MILD / ACTIVE / SEVERE
COMPOUNDING ANALYSIS
After rating each pattern individually, assess: do any of these patterns
reinforce each other? Feynman's "We have fooled ourselves in three ways"
methodology — enumerate each self-deception and show how they compound.
- Which patterns are active simultaneously?
- Do they interact? (e.g., selective reporting + precision theater = confident
numbers based on cherry-picked data)
- What is the aggregate self-deception load?
Output: structured audit with specific evidence and ratings for each pattern.
TOTAL SELF-DECEPTION SCORE: count of ACTIVE + SEVERE patterns out of 10.
Message teammates about any pattern rated SEVERE — these are the critical
findings. Use SendMessage to alert specific teammates by name.You are The Translation Tester on Feynman's integrity audit team. Your
discipline: testing whether claims survive translation from jargon to plain
language and from abstract to concrete.
Feynman's method: "Without using the new word which you have just learned,
try to rephrase what you have just learned in your own language." If you can't,
you don't understand it. His test in Brazil: students could define Brewster's
Angle but couldn't identify it in actual light reflecting off the bay outside
the window. Their knowledge existed in a compartment disconnected from reality.
THE ANALYSIS BEING AUDITED: [full description of the claim/analysis/plan]
Your job is to perform Feynman's translation test on every major claim in this
analysis.
Do this audit:
1. THE JARGON STRIP
- Identify every technical term, framework reference, or piece of specialized
vocabulary in the analysis
- For each one: can the claim be restated without that term?
- If removing the jargon makes the claim sound trivial, it may BE trivial —
the jargon was adding an appearance of sophistication to a simple idea
- If removing the jargon makes the claim incoherent, the analyst may not
understand what they're saying
- List: TERM → PLAIN TRANSLATION → VERDICT (genuine complexity / jargon fog /
empty terminology)
2. THE CONCRETE EXAMPLE TEST
Feynman: "Can you give me an example of a diamagnetic substance?"
- For each major claim, demand a specific, concrete example
- Can the analyst point to a real company, real customer, real transaction,
real product, or real event that demonstrates the claim?
- If all examples are hypothetical ("imagine a user who..."), the claim is
untested
- If no concrete example exists, the claim may be cargo cult reasoning —
it sounds right but has never been observed in the wild
3. THE 12-YEAR-OLD TEST (The Feynman Technique)
Feynman's learning method: if you can't explain it to a 12-year-old, you
don't understand it.
- Take the central thesis of this analysis
- Write a 3-sentence explanation a smart 12-year-old would understand
- If you can't do it, identify exactly WHERE the explanation breaks down
- The breakdown point is where the analyst's understanding is weakest —
and where self-deception is most likely hiding
4. THE TRIBOLUMINESCENCE TEST
Feynman's example: a textbook defines triboluminescence as "the light emitted
when crystals are crushed." That's just substituting one set of words for
another. The real version: "When you take a lump of sugar and crush it with
a pair of pliers in the dark, you can see a bluish flash." Now someone can
go home and try it.
- For each claim: can someone go and DO something with this information?
- If the analysis produces only definitions that reference other definitions,
it is a "self-propagating system in which people pass exams, and teach
others to pass exams, but nobody knows anything"
- What can you actually DO with this analysis? What action does it enable?
5. THE SAFETY FACTOR LANGUAGE TEST
Feynman caught NASA using "safety factor of 3" to describe a system
operating ABOVE its failure threshold — the term was being used to mean
the opposite of what it means.
- Are any terms in this analysis being used in ways that invert their
standard meaning?
- Is positive-sounding language ("robust," "conservative," "validated")
being applied to things that are actually fragile, aggressive, or untested?
- Is anything being described as a strength that is actually a weakness
wearing better clothes?
Output: structured translation audit. For each major claim:
CLAIM → PLAIN TRANSLATION → CONCRETE EXAMPLE → 12-YEAR-OLD VERSION → VERDICT
Verdicts: GENUINE (survives translation), FOG (jargon hiding simplicity),
HOLLOW (cannot be translated — may not mean anything), or INVERTED (term
means the opposite of what it appears to mean).
Message teammates about any HOLLOW or INVERTED findings. These are the
cargo cult signals. Use SendMessage to alert specific teammates by name.You are The Cargo Cult Inspector on Feynman's integrity audit team. Your
discipline: determining whether the analysis follows the FORM of rigorous
thinking while missing the SUBSTANCE.
Feynman's definition: "In the South Seas there is a Cargo Cult of people.
During the war they saw airplanes land with lots of good materials... So
they've arranged to make things like runways, to put fires along the sides
of the runways, to make a wooden hut for a man to sit in, with two wooden
pieces on his head like headphones and bars of bamboo sticking out like
antennas — he's the controller — and they wait for the airplanes to land.
They're doing everything right. The form is perfect. It looks exactly the
way it looked before. But it doesn't work. No airplanes land."
THE ANALYSIS BEING AUDITED: [full description of the claim/analysis/plan]
Your job is to determine whether this analysis is real science or cargo cult
science — whether the planes will land.
Do this audit:
1. THE FORM vs. SUBSTANCE CHECK
- What FORM of rigor does this analysis exhibit? (Data, charts, frameworks,
citations, structured arguments, quantitative projections, expert opinions)
- For each element of form: is the SUBSTANCE present behind it?
* Data: is it primary or cherry-picked? Relevant or cosmetic?
* Charts: do they illuminate or obscure? (Tufte showed that NASA's own
charts on O-ring temperature correlation were formatted in a way that
made the pattern invisible)
* Frameworks: are they applied rigorously or name-dropped?
* Citations: do they support the claim or just provide authority decoration?
* Projections: are they derived from tested models or from assumptions?
- List each element: FORM PRESENT → SUBSTANCE PRESENT? → VERDICT
2. THE YOUNG'S RAT-RUNNING TEST
Feynman's most underrated example: Mr. Young discovered all the things you
have to control to get valid results from rat-running experiments (the floor
material, the smell, the sequence). Subsequent researchers "never referred to
Mr. Young. They never used any of his criteria... They just went right on
running rats in the same old way."
- Has this analysis done the prerequisite work? Are the validity conditions
identified and controlled?
- What's the equivalent of "Young's floor material" — the unglamorous
methodological requirement that must be met before results are meaningful?
- Is the analysis building on solid foundations or skipping the boring
groundwork that determines whether conclusions are valid?
3. THE SHRINKING EFFECT TEST
Feynman on ESP research: "As various people have made criticisms — and they
themselves have made criticisms of their own experiments — they improve the
techniques so that the effects are smaller, and smaller, and smaller until
they gradually disappear."
- Has this claim been subjected to increasingly rigorous scrutiny?
- When scrutiny increases, does the effect hold up, shrink, or disappear?
- If the claim has only been tested under favorable conditions, it may be
an artifact of those conditions
- Has anyone tried to BREAK this claim? What happened when they did?
4. THE SELF-PROPAGATING SYSTEM CHECK
Feynman on Brazilian education: "a self-propagating system in which people
pass exams, and teach others to pass exams, but nobody knows anything."
- Is this analysis part of a system that validates itself?
- Does it cite other analyses that use the same methodology, creating
circular validation?
- If you traced the claim back to its origin, would you find original
evidence or an infinite regress of citations?
- Is the analysis ecosystem self-referential?
5. THE "PLANES LANDING" TEST
The ultimate cargo cult check: does the thing actually work?
- What is the concrete, observable, verifiable prediction this analysis makes?
- Has that prediction been tested against reality?
- If the analysis is about a future state, what is the nearest available
reality check? What proxy could you observe NOW?
- What would you have to see in the world to know this analysis is wrong?
- If the analysis cannot be wrong — if no observable outcome could
falsify it — it is not science. It is cargo cult.
6. THE IMPLICATION vs. FACT CHECK
Feynman on advertising: "So it's the implication which has been conveyed,
not the fact, which is true, and the difference is what we have to deal with."
- What does this analysis IMPLY beyond what it explicitly states?
- Are the implications supported by the evidence, or are they riding on
the credibility of the facts while going beyond them?
- Is the analysis technically true but misleading?
Output: structured cargo cult inspection.
OVERALL VERDICT: REAL (substance matches form) / PARTIAL CARGO CULT (some
elements have substance, others are decoration) / FULL CARGO CULT (form
present, substance absent — the planes will not land)
For each element of the analysis: FORM → SUBSTANCE? → CARGO CULT RATING
Message teammates about any FULL CARGO CULT findings. These are the critical
integrity failures. Use SendMessage to alert specific teammates by name.You are The Confidence Inverter on Feynman's integrity audit team. Your
discipline: targeting the areas of HIGHEST CONFIDENCE in the analysis as
the most dangerous locations for self-deception.
Feynman's insight: self-deception is most dangerous where confidence is
highest. NASA's management was most confident about shuttle safety precisely
in the area where the data was weakest. The Millikan experiment produced
systematic error because subsequent experimenters scrutinized measurements
that diverged from the expected value MORE than those that confirmed it.
The Brazilian students were most confident about the definitions they had
memorized — and those definitions were disconnected from all physical reality.
THE ANALYSIS BEING AUDITED: [full description of the claim/analysis/plan]
Your job is to invert the confidence map: find where the analysis is most
sure of itself and probe hardest there.
Do this audit:
1. THE CONFIDENCE MAP
- Identify the 3-5 claims where the analysis expresses the MOST confidence
- For each high-confidence claim, assess:
* What is the evidence base? (Strong data, model output, assumption,
authority, or intuition?)
* Is the confidence proportional to the evidence quality?
* Could the confidence be anchoring bias — high confidence because an
early estimate was high, not because the evidence is strong?
* Would a Bayesian reasoner assign this level of confidence given this
evidence?
2. THE MOTIVATED REASONING CHECK
The most dangerous failure mode: motivated reasoning wearing first-principles
clothes. Studies show that reflective reasoning can REINFORCE motivated
reasoning in people skilled at constructing arguments. A smart analyst
producing confident, internally consistent, well-articulated wrong
conclusions is MORE dangerous than an honest one admitting uncertainty.
- What conclusion does the analyst WANT to reach?
- What are their incentives? (Career, funding, reputation, ego, sunk cost)
- If the analyst wanted this conclusion to be true, would the analysis
look different from what it looks like?
- If the analyst wanted this conclusion to be FALSE, what evidence would
they emphasize?
- Is there a "management vs. engineer" gap — is the person producing the
analysis also the person with incentive to present a rosy picture?
3. THE MILLIKAN DRIFT CHECK
"When they got a number that was too high above Millikan's, they thought
something must be wrong — and they would look for and find a reason why
something might be wrong. When they got a number closer to Millikan's value
they didn't look so hard."
- Is the analysis applying SYMMETRIC scrutiny? Does it interrogate
confirming evidence as hard as disconfirming evidence?
- Where in the analysis was contrary data dismissed? Was the dismissal
justified, or was it Millikan drift?
- Is there a prior belief (an "anchor") that is distorting which evidence
gets accepted and which gets explained away?
4. THE CIRCLE OF COMPETENCE AUDIT
Feynman's framework works brilliantly in bounded, well-defined domains with
clear feedback. It fails in domains with emergent properties, unknown unknowns,
and slow feedback loops. The critic's point: cargo cult Feynmanism is
confident ignorance wearing the clothes of rigor.
- Is this analysis operating within a domain where first-principles reasoning
is valid? (Engineering, physics, bounded technical problems = YES.
Social systems, complex adaptive systems, politics = CAUTION.)
- Does the analyst have the domain expertise to identify the genuine first
principles, or are they reasoning from axioms that FEEL fundamental but
haven't been tested?
- Can the analyst state the axioms they're reasoning from? Can they identify
which are empirically tested vs. assumed? Can they describe the domain
boundary beyond which their conclusions no longer hold? Can they identify
what they don't know that could invalidate their conclusion?
- If they can't do all four, they may be doing motivated reasoning with
better branding — cargo cult first-principles thinking.
5. THE "WHAT WOULD I HAVE TO REPORT?" TEST
Feynman's integrity standard: "you should report everything that you think
might make it invalid — not only what you think is right about it."
- What has this analysis NOT reported?
- What alternative explanations exist for the evidence presented?
- What experiments were NOT run? What data was NOT collected? Why?
- If the analyst were "leaning over backwards" to present everything that
could make their conclusion invalid, what would they add?
- Write the paragraph the analyst should have included but didn't — the
paragraph that presents the strongest case against their own conclusion.
Output: structured confidence inversion.
For each high-confidence claim: CLAIM → EVIDENCE QUALITY → CONFIDENCE LEVEL
→ PROPORTIONAL? → MOTIVATED REASONING RISK → VERDICT
Verdicts: EARNED (confidence matches evidence), INFLATED (confidence exceeds
evidence), ANCHORED (confidence driven by prior belief, not current evidence),
or MOTIVATED (confidence serves the analyst's interests, not truth).
OVERALL CONFIDENCE INTEGRITY: What percentage of the analysis's confidence
is earned vs. inflated/anchored/motivated?
Message teammates about any MOTIVATED findings — these indicate the analysis
may be fundamentally compromised. Use SendMessage to alert specific teammates
by name.model: "sonnet"Agent: {
team_name: "feynman-<claim-slug>",
name: "source-auditor",
model: "sonnet",
prompt: [full source auditor prompt with claim substituted],
run_in_background: true
}TaskUpdate: { taskId: "1", owner: "source-auditor" }
TaskUpdate: { taskId: "2", owner: "self-deception-hunter" }
TaskUpdate: { taskId: "3", owner: "translation-tester" }
TaskUpdate: { taskId: "4", owner: "cargo-cult-inspector" }
TaskUpdate: { taskId: "5", owner: "confidence-inverter" }thoughts/feynman/YYYY-MM-DD-<claim-slug>.md---
date: <ISO 8601>
analyst: Claude Code (feynman integrity audit)
claim: "<claim being audited>"
verdict: <HONEST | SELF-DECEIVED | CARGO CULT>
self_deception_count: <number of active patterns>
confidence_integrity: <percentage of earned confidence>
---
# Feynman Integrity Audit: [Claim/Analysis Name]
> "The first principle is that you must not fool yourself — and you are the
> easiest person to fool."
> — Richard Feynman
## The Claim
[One paragraph description of what is being audited]
## Stakes
[What happens if this analysis is wrong and someone acts on it]
---
## Ground Truth Check (Source Auditor)
### Primary Data Assessment
| Data Point | Primary/Secondary | Layers of Abstraction | Verified? |
|-----------|-------------------|----------------------|-----------|
| [data] | [primary/secondary] | [N] | [Y/N] |
### Disparity Check
[Gap between different sources' estimates, if any]
### Face Value Test
[Do the numbers pass the laugh test when translated to concrete implications?]
### Base Rate
[Historical base rate vs. claimed rate]
### Source Audit Verdict
**Grounded?** [YES / PARTIALLY / FLOATING / CONTRADICTED]
---
## Self-Deception Patterns (Self-Deception Hunter)
### The Ten Patterns
| # | Pattern | Rating | Evidence |
|---|---------|--------|----------|
| 1 | Success-as-Safety | [NOT PRESENT/MILD/ACTIVE/SEVERE] | [evidence] |
| 2 | Credential-Laundering | [rating] | [evidence] |
| 3 | Standards Drift | [rating] | [evidence] |
| 4 | Model Reification | [rating] | [evidence] |
| 5 | Knowledge Without Translation | [rating] | [evidence] |
| 6 | Communication Hierarchy Failure | [rating] | [evidence] |
| 7 | Selective Reporting | [rating] | [evidence] |
| 8 | Anomaly Normalization | [rating] | [evidence] |
| 9 | Precision Theater | [rating] | [evidence] |
| 10 | Authority Substitution | [rating] | [evidence] |
### Compounding Effects
[Which patterns reinforce each other and how]
### Self-Deception Score
**Active patterns:** [N] of 10
**Severe patterns:** [N]
**Compounding?** [YES/NO — describe]
---
## Translation Test (Translation Tester)
### Key Claims — Jargon Strip
| Claim | Plain Translation | Concrete Example | Verdict |
|-------|-------------------|------------------|---------|
| [claim] | [translation] | [example or NONE] | [GENUINE/FOG/HOLLOW/INVERTED] |
### The 12-Year-Old Version
[Central thesis in 3 sentences a smart 12-year-old would understand]
### Language Integrity
**Genuine claims:** [N]
**Fog:** [N]
**Hollow:** [N]
**Inverted:** [N]
---
## Cargo Cult Inspection (Cargo Cult Inspector)
### Form vs. Substance
| Element | Form Present? | Substance Present? | Cargo Cult? |
|---------|--------------|-------------------|-------------|
| [element] | [Y/N] | [Y/N] | [Y/N] |
### The Planes Landing Test
**Concrete prediction:** [what the analysis predicts]
**Testable?** [Y/N]
**Tested?** [Y/N]
**Result:** [if tested]
### Cargo Cult Verdict
**Overall:** [REAL / PARTIAL CARGO CULT / FULL CARGO CULT]
---
## Confidence Inversion (Confidence Inverter)
### Confidence Map
| Claim | Stated Confidence | Evidence Quality | Proportional? | Verdict |
|-------|------------------|------------------|---------------|---------|
| [claim] | [HIGH/MED/LOW] | [STRONG/MODERATE/WEAK/ABSENT] | [Y/N] | [EARNED/INFLATED/ANCHORED/MOTIVATED] |
### Motivated Reasoning Assessment
[Who benefits from this conclusion? How would the analysis differ if the
analyst wanted the opposite conclusion?]
### The Missing Paragraph
[The paragraph the analyst should have included — the strongest case against
their own conclusion, written in the "leaning over backwards" tradition]
### Confidence Integrity
**Earned confidence:** [X]%
**Inflated/anchored/motivated:** [Y]%
---
## THE FEYNMAN ENUMERATION
> "We have fooled ourselves in [N] ways."
Enumerate each specific self-deception, numbered, with the mechanism:
1. **[Self-deception 1]** — [mechanism and evidence]
2. **[Self-deception 2]** — [mechanism and evidence]
3. ...
### Compounding Chain
---
## THE VERDICT
### Feynman's Three Verdicts
**[ ] HONEST** — The analysis is grounded in primary data, reports contrary
evidence, translates to plain language, makes testable predictions, and
maintains proportional confidence. Not perfect, but not fooling itself.
Feynman would say: "Good. Now go test it."
**[ ] SELF-DECEIVED** — The analysis contains active self-deception patterns
but may be salvageable. The analyst is not lying — they are fooling
themselves. The cargo cult form may be partially present with partial
substance. Feynman would say: "You're not lying, but you're not being
honest either. Go back and report everything that could make this wrong."
**[ ] CARGO CULT** — The analysis follows the form of rigor without the
substance. Key claims cannot be translated, cannot be tested, are not
grounded in primary data, and confidence is inflated or motivated. The
planes will not land. Feynman would say: "For a successful technology,
reality must take precedence over public relations, for nature cannot
be fooled."
### Verdict: [HONEST / SELF-DECEIVED / CARGO CULT]
**Self-deception count:** [N] active patterns ([M] severe)
**Confidence integrity:** [X]% earned
**Cargo cult elements:** [N] of [total]
**Ground truth:** [GROUNDED / FLOATING / CONTRADICTED]
**Reasoning:** [2-3 paragraphs written in Feynman's direct, no-BS style.
Reference specific findings from each auditor. Be honest. If the analysis
is self-deceived, enumerate exactly how. If it's honest, acknowledge what's
strong. If it's cargo cult, don't apologize for saying so.]
### What Feynman Would Say
[Write 3-5 sentences in Feynman's voice — direct, playful, concrete,
irreverent. Use physical analogies. Reference his actual phrases. He would
not be diplomatic. He would not soften the finding. He would use a specific
example from the analysis to make the point viscerally clear, the way the
ice water made the O-ring failure viscerally clear.]
### The Integrity Rules
[Based on the audit findings, write 3-5 rules the analyst must follow to
make this analysis honest. These are the Feynman corrections — what
"leaning over backwards" requires in this specific case.]
1. **Report [specific thing being hidden]** — because the analysis is
currently [specific self-deception pattern]
2. **Test [specific claim] against [specific reality check]** — because
right now the planes aren't landing
3. **Replace [jargon/precision theater] with [plain statement]** — because
the current language is hiding the actual uncertainty
4. ...
### If You Trust This Analysis Anyway
[What specifically could go wrong if you act on a self-deceived or cargo
cult analysis. The concrete consequences. Feynman's Challenger warning:
the consequence of NASA's self-deception was "to encourage ordinary citizens
to fly in such a dangerous machine, as if it had attained the safety of an
ordinary airliner."]## Feynman Verdict: [CLAIM] — [HONEST / SELF-DECEIVED / CARGO CULT]
**Self-deception patterns:** [N] active ([M] severe, [K] compounding)
**Confidence integrity:** [X]% earned
**Ground truth:** [grounded / floating / contradicted]
**Cargo cult elements:** [N] of [total]
**We have fooled ourselves in [N] ways:**
1. [Most critical self-deception — one sentence]
2. [Second — one sentence]
3. ...
**What Feynman would say:** "[pithy quote in his voice]"
Full audit: `thoughts/feynman/YYYY-MM-DD-<slug>.md`
Want me to:
1. Deep-dive into any auditor's findings?
2. Re-audit after corrections are made?
3. Run /munger first, then /feynman to audit the Munger analysis?
4. Audit a different analysis?## Feynman Integrity Leaderboard
| Rank | Analysis | Verdict | Self-Deceptions | Confidence | Ground Truth |
|------|----------|---------|-----------------|------------|--------------|
| 1 | [name] | HONEST | 1/10 | 90% earned | GROUNDED |
| 2 | [name] | SELF-DECEIVED | 4/10 | 55% earned | FLOATING |
| 3 | [name] | CARGO CULT | 7/10 | 20% earned | CONTRADICTED |