1. Purpose
This document defines a reproducible protocol for AI-assisted adversarial stress testing of the Entropic Governance
Framework (EGF). The objective is to surface candidate conceptual weaknesses, boundary ambiguities, misinterpretation
risks, governance failure modes, and potential falsifiers through structured, adversarial analysis using large language
models (LLMs).
2. Scope and Non-Claims
AI systems are treated strictly as heuristic adversarial generators. Outputs generated under this protocol do not
constitute validation, review, endorsement, evidence of correctness, or expert judgement. All interpretation,
classification, and response decisions remain the responsibility of the framework author and subsequent human reviewers.
This protocol explicitly does not claim that:
- LLMs can validate or falsify governance frameworks;
- LLM outputs represent peer review, expert assessment, or disciplinary authority;
- LLM-generated critiques should be accepted without human judgement;
- stress testing substitutes for empirical research, institutional scrutiny, or public accountability.
AI systems are not treated as authors, contributors, or sources of original claims within the EGF corpus. The role of AI
in this protocol is analogous to red-teaming or adversarial brainstorming in engineering and security disciplines.
3. Canonical Inputs and Version Discipline
Stress testing should be conducted against stable, versioned reference documents. At minimum, the protocol assumes:
- EGF-W1 (White Paper; canonical reference);
- EGF-R1 (Research Agenda; testing posture and falsification criteria);
- EGF-N1 (Clarifications & scope; interpretive boundaries and common objections);
- EGF-D1 (Definitions & glossary; vocabulary control).
Each stress-test run must record the exact versions used (date-stamped file, version tag, commit hash, or DOI where
applicable). Version discipline is part of the method: it ensures critiques can be traced to specific claims and text.
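Where the canonical documents are tracked in a repository, the version record can be captured once per test cycle and reused across runs. A minimal sketch in Python; the field names and placeholder values are illustrative, not prescribed by the protocol:

```python
# Illustrative version record for one stress-test cycle.
# Values are placeholders; use the actual tag, commit hash, DOI, or date-stamped filename.
CANONICAL_VERSIONS = {
    "EGF-W1": {"label": "<version tag>", "source": "<commit hash / DOI / dated file>"},
    "EGF-R1": {"label": "<version tag>", "source": "<commit hash / DOI / dated file>"},
    "EGF-N1": {"label": "<version tag>", "source": "<commit hash / DOI / dated file>"},
    "EGF-D1": {"label": "<version tag>", "source": "<commit hash / DOI / dated file>"},
}
```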
4. Test Packets
To reduce noise and improve comparability, tests should use defined “packets” (bounded excerpts) rather than arbitrary
multi-page pastes. Packets are not new documents; they are standardised, frozen excerpts derived from canonical EGF texts.
Recommended packet set:
- Packet A (Core claims): Abstract + 5–10 explicit core claims (about 1–2 pages equivalent).
- Packet B (Architecture): Governance architecture and constraint logic (about 2–4 pages).
- Packet C (Definitions): Entropy, irreversibility, optionality, system boundary (about 1–2 pages).
- Packet D (Worked vignette): One short scenario (e.g., infrastructure, disaster recovery, AI deployment).
- Packet E (Known critiques): Curated critique list to test against (about 1 page).
Packet boundaries are part of the method. If packet content or boundaries are changed during a test cycle, the change must
be recorded as a protocol deviation (with rationale).
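To keep packet identity stable across runs, the packet set can be frozen in a small manifest that pairs each packet with its version label (as later required for logging in Section 7). A minimal sketch; titles follow the list above, and the version and source fields are placeholders:

```python
# Illustrative packet manifest; version labels and source references are placeholders.
PACKETS = {
    "A": {"title": "Core claims",     "version": "<label>", "source": "<canonical document + excerpt range>"},
    "B": {"title": "Architecture",    "version": "<label>", "source": "<canonical document + excerpt range>"},
    "C": {"title": "Definitions",     "version": "<label>", "source": "<canonical document + excerpt range>"},
    "D": {"title": "Worked vignette", "version": "<label>", "source": "<canonical document + excerpt range>"},
    "E": {"title": "Known critiques", "version": "<label>", "source": "<canonical document + excerpt range>"},
}
```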
5. Model Diversity and Run Conditions
Each module should be run across multiple LLMs where feasible to detect model-specific bias and interpretive drift, and to
assess the stability of critique themes across models. Recommended minimum: three distinct model families or providers.
Where the interface allows it, run each module 2–3 times per model to sample variability. Keep temperature
and other sampling settings consistent (or record defaults if settings are not visible).
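The recommended minimum (three model families, seven modules, two to three repetitions) can be enumerated up front as a run plan, so that no cell of the matrix is silently skipped. A sketch under those assumptions; the model identifiers are placeholders:

```python
from itertools import product

# Illustrative run plan: three model families x seven modules x three repetitions (Section 5).
MODELS = ["<model family 1>", "<model family 2>", "<model family 3>"]
MODULES = ["M1", "M2", "M3", "M4", "M5", "M6", "M7"]
REPETITIONS = 3

run_plan = [
    {"model": model, "module": module, "repetition": rep}
    for model, module, rep in product(MODELS, MODULES, range(1, REPETITIONS + 1))
]
# len(run_plan) == 63; record sampling settings (e.g., temperature) per run where visible.
```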
5.1 Persona prompts
Persona-based prompts may be used only for interpretive drift and misuse analysis (e.g., “how could this be misread?”),
and must be explicitly labelled as heuristic. Persona outputs must not be treated as evidence of correctness, review,
endorsement, or disciplinary authority.
6. Adversarial Modules and Prompt Templates
The protocol is organised into modules. Each module should use role-neutral adversarial prompts by default. The templates
below may be adapted, but the intent and constraints should be preserved.
6.1 Module M1: Falsification and breakpoints
Prompt template:
“Assume EGF is fundamentally flawed. Identify the strongest falsifiers: conditions, cases, or observations that would make
EGF incorrect, non-operational, or redundant. Provide at least five, and for each: (i) why it breaks the framework,
(ii) what would need to change.”
6.2 Module M2: Redundancy and collapse into existing frameworks
Prompt template:
“Rewrite EGF’s core claims entirely using established frameworks (ecological economics, resilience, institutional theory,
systems thinking). Map EGF concepts → existing concepts. Conclude what (if anything) remains irreducible.”
6.3 Module M3: Governance paralysis risk
Prompt template:
“Construct the strongest argument that irreversibility-focused governance leads to paralysis (‘do nothing’).
Propose decision rules that avoid paralysis while preserving accountability for irreversible losses.”
6.4 Module M4: Power, incentives, capture
Prompt template:
“Critique EGF from a power/political economy lens. Where can actors manipulate ‘constraint’ narratives?
Propose safeguards (transparency, contestability, auditability) to reduce capture.”
6.5 Module M5: Misuse and harm scenarios (red-team)
Prompt template:
“Red-team EGF: design a scenario where EGF language is used to justify a harmful decision while claiming constraint-awareness.
Identify rhetorical failure points and counter-measures.”
6.6 Module M6: Boundary sensitivity
Prompt template:
“Show how different reasonable system boundaries change conclusions in the same case. If outcomes flip, propose a boundary
discipline or selection principle.”
6.7 Module M7: Minimal operational workflow
Prompt template:
“Propose a minimum viable operational workflow for applying EGF in a government or enterprise decision context within two weeks.
Specify steps, artefacts, roles, and outputs. Identify ‘required’ versus ‘optional’ elements.”
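One way to keep prompts comparable across models is to assemble them mechanically from the frozen packet text and the module template, rather than re-pasting excerpts by hand. A minimal sketch; the function name and delimiter format are illustrative, not part of the protocol:

```python
# Illustrative prompt assembly: pairs a module template (Section 6) with frozen packet excerpts (Section 4).
def build_prompt(module_id: str, template: str, packet_ids: list[str], packet_texts: dict[str, str]) -> str:
    excerpts = "\n\n".join(f"[Packet {pid}]\n{packet_texts[pid]}" for pid in packet_ids)
    return f"[{module_id}]\n\n{excerpts}\n\n{template}"
```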
7. Logging and Traceability
Each run should be logged. The purpose is methodological traceability, not publication of raw transcripts.
For each run, record at minimum:
- Model name/version (as displayed by the platform);
- Date/time of run;
- Module ID (M1–M7);
- Packet IDs used (A–E) and packet version label;
- Prompt text (verbatim);
- Output (verbatim, retained privately);
- Evaluator notes (human interpretation and classification).
A simple spreadsheet is sufficient. The goal is reproducibility of method, not exhaustive coverage of model behaviour.
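A minimal sketch of an equivalent CSV log, assuming the fields above map one-to-one onto columns; the column names and file path are illustrative:

```python
import csv
import os

# Illustrative log schema mirroring the minimum fields in Section 7.
LOG_COLUMNS = [
    "model_version", "run_datetime", "module_id", "packet_ids",
    "packet_version", "prompt_text", "output_text", "evaluator_notes",
]

def append_run(path: str, record: dict) -> None:
    """Append one run record (a dict keyed by LOG_COLUMNS), writing a header if the file is new."""
    write_header = not os.path.exists(path) or os.path.getsize(path) == 0
    with open(path, "a", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=LOG_COLUMNS)
        if write_header:
            writer.writeheader()
        writer.writerow(record)
```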
8. Evaluation Rubric
Each critique candidate should be scored (0–3) on the following dimensions:
- Specificity: concrete, testable objections vs vague commentary;
- Adversarial strength: steelman critique vs strawman;
- Actionability: suggests a concrete response pathway (clarify, constrain, extend, narrow scope, defer) vs offers none;
- Faithfulness: accurately represents EGF claims vs critiques a misreading;
- Novelty: genuinely new insight vs restatement of known critiques.
Recommended retention rule: keep critiques with Adversarial strength ≥ 2 and Faithfulness ≥ 2.
Treat low-faithfulness items as interpretive-risk signals rather than substantive falsifiers.
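The retention rule and the interpretive-risk carve-out can be expressed directly. A sketch assuming scores are recorded per dimension as integers 0–3; the key names are illustrative:

```python
# Illustrative encoding of the Section 8 retention rule (scores are integers 0-3 per dimension).
def triage(scores: dict) -> str:
    if scores["faithfulness"] < 2:
        return "interpretive-risk signal"      # low faithfulness: not a substantive falsifier
    if scores["adversarial_strength"] >= 2:
        return "retain as critique candidate"  # meets the recommended retention rule
    return "discard"
```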
9. Synthesis Method and Change Management
AI outputs must be converted into human-curated objections before any publication or incorporation into the EGF corpus.
For each retained critique:
- Restate as a human-authored objection O# in one sentence (no AI attribution).
- Classify: misinterpretation / scope limitation / conceptual weakness / potential falsifier.
- Choose response type: clarify (N-series), constrain (rule/gate), extend (new A/G document), narrow scope, or defer.
- Record disposition: addressed / deferred / rejected (misread) with rationale.
Major breakpoints are considered valuable outcomes. A framework that cannot be stress-tested is not yet a serious framework.
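A minimal sketch of an objection record consistent with this workflow; the field names and value lists simply restate the steps above and are not canonical vocabulary:

```python
# Illustrative objection record produced during synthesis (Section 9).
OBJECTION_RECORD_TEMPLATE = {
    "id": "O#",
    "statement": "<one-sentence, human-authored restatement; no AI attribution>",
    "classification": "<misinterpretation | scope limitation | conceptual weakness | potential falsifier>",
    "response_type": "<clarify (N-series) | constrain (rule/gate) | extend (new A/G document) | narrow scope | defer>",
    "disposition": "<addressed | deferred | rejected (misread)>",
    "rationale": "<reason for the disposition>",
}
```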
10. Publication Rules (A5 and Beyond)
Findings from protocol execution are synthesised and documented in a separate document (EGF-A5) if and when
substantive results exist. Raw AI transcripts are not published by default. A5 should publish:
- Curated objection list (O#), grouped by theme;
- Classification and disposition for each objection;
- Resulting clarifications (links to N-series updates) or extensions (new A/G documents);
- A change log describing what changed and why.
If a potential falsifier is identified, A5 should record it explicitly and state whether EGF narrows scope, revises
constructs, or defers pending research. Such findings are not treated as reputational threats, but as evidence of
disciplined inquiry.
11. Position Within the EGF Corpus
EGF-A4 is a methodological companion to EGF-R1. A4 defines how stress testing is performed; R1 defines what would count as
falsification. Neither document modifies normative claims in EGF-W1.