AI answers through an obstacle course

AANA helps test whether AI stays helpful when the rules matter.

This project turns tricky AI behavior into something people can see: generate an answer, check it against constraints, repair it when possible, and measure whether usefulness and responsibility move together.

Early model-judged evidence. Useful signal, not a certified benchmark claim.

120 matched constraint-reasoning cases in the pilot comparison
98.3% pass rate for the strongest tool-structured AANA run
0 failed cases in the strongest tool-assisted runs

Watch first

Start with the story: why better answers can still drift.

Before the charts and papers, these short videos show the core tension: an AI system can get more capable while losing track of the constraints that make its answers worth trusting.

Plain English

Think of it as an obstacle course for AI answers.

A direct answer can sound confident and useful while quietly breaking a budget, inventing evidence, ignoring a safety limit, or guessing when it should ask a question. AANA makes those hidden failures easier to test.

The question

Can the answer do the job?

Capability asks whether the response is useful enough to be tempting.

The catch

Did it keep the promises?

Alignment asks whether it stayed honest about facts, limits, safety, and uncertainty.

The signal

Where did they split apart?

The gap score surfaces the polished answers that quietly lose the plot.
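
In code, the gap is simple to state. The sketch below reuses the field names from the sample output (capability_score, alignment_score, gap_score), but the max-of-difference formula is an illustrative assumption, not AANA's exact scoring rule.

def gap_score(capability_score: float, alignment_score: float) -> float:
    # Assumed formula: the gap grows when an answer looks more capable
    # than it is aligned. The project's real scoring may differ.
    return max(0.0, capability_score - alignment_score)

# A polished answer that quietly dropped a constraint shows a large gap:
print(gap_score(capability_score=0.92, alignment_score=0.45))  # 0.47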


Why it matters

Real-world prompts often come with invisible tripwires.

1. Pressure rises. Users ask for faster, cheaper, more certain, or more persuasive answers.

2. Constraints get dropped. A model may satisfy the surface request while losing budget, evidence, safety, or format rules.

3. AANA measures the gap. It compares one-shot answers with answers that must survive verification, repair, and gating.

Where AANA fits

It is strongest when failures can be checked, not waved away.

AANA is less about telling a model to "be more careful" and more about giving the system a correction path: detect the broken constraint, ground the answer, repair it if possible, or refuse, ask, and defer when repair would be fake.
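
Here is a minimal sketch of that decision path, assuming a verifier that reports violations as records with a "kind" and a "repairable" flag; the action names mirror the prose above, not AANA's actual API.

from enum import Enum

class Action(Enum):
    PASS = "pass"      # the answer survives verification unchanged
    REPAIR = "repair"  # a broken constraint is fixable with grounding
    ASK = "ask"        # information is missing, so ask instead of guessing
    REFUSE = "refuse"  # a hard limit is hit and no safe repair exists
    DEFER = "defer"    # out of scope, so hand off rather than fake a repair

def correction_path(violations: list[dict]) -> Action:
    # `violations` is a hypothetical verifier report; real AANA
    # verdict structures may differ.
    if not violations:
        return Action.PASS
    if any(v["kind"] == "safety" for v in violations):
        return Action.REFUSE
    if any(v["kind"] == "missing_info" for v in violations):
        return Action.ASK
    if all(v["repairable"] for v in violations):
        return Action.REPAIR
    return Action.DEFER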

Planning assistants

Budgets, schedules, and logistics

Travel, shopping, meal planning, and operations workflows where totals, time windows, routes, exclusions, and formats can be verified. A minimal budget check in that spirit is sketched after these cards.

Grounded research

Facts, citations, and uncertainty

Research copilots that need to separate supported claims from impossible facts, missing evidence, private information, and confident guesses.

Safety-critical advice

Forbidden ingredients and hard limits

Domains where a helpful-looking answer is not enough because allergy, safety, compliance, or eligibility constraints must survive pressure.

Workflow agents

Actions with preconditions

Agents that draft, route, summarize, or prepare actions only after checking required fields, permissions, evidence, and escalation rules.

Evaluation labs

Capability versus constraint loss

Benchmarks that need to measure when answers become more persuasive or complete while quietly breaking the rules that matter.

Not a fit

Pure taste or vague preference

AANA helps least when success is mostly subjective and there is no clear verifier, evidence source, boundary, or correction action.
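
To make "can be verified" concrete, here is the minimal budget check promised in the planning card above. The line-item shape and message format are hypothetical; AANA's real verifiers live in the repository.

def verify_budget(items: list[tuple[str, float]], budget: float) -> list[str]:
    # Hypothetical shapes: a plan is a list of (item, cost) pairs.
    # An empty list means the constraint held; entries describe violations.
    total = sum(cost for _, cost in items)
    if total > budget:
        return [f"budget exceeded: total {total:.2f} > limit {budget:.2f}"]
    return []

plan = [("hotel", 420.00), ("flights", 610.00), ("meals", 180.00)]
print(verify_budget(plan, budget=1000.00))
# ['budget exceeded: total 1210.00 > limit 1000.00']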

How it works

Generate. Verify. Repair. Gate. Score.

The project is not claiming perfect AI alignment. It is research software for making correction loops, verifier feedback, and failure modes explicit enough to study.

Task: A request arrives with goals, facts, boundaries, and tradeoffs.
Generator: The model takes its first shot.
Verifier: Checks look for broken facts, risky impacts, task drift, and false confidence.
Corrector: The system repairs, asks, refuses, defers, or lets the answer through.
Results: The trail stays visible; records, tables, and charts show what changed.
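
Compressed into code, the loop looks roughly like this. The stage functions are placeholders for whatever model calls and checks get plugged in; this is a sketch of the flow's shape, not the pipeline's actual implementation.

def run_loop(task, generate, verify, correct, max_rounds=3):
    # Sketch only: the real pipeline also writes the records, tables,
    # and charts described under Results.
    answer = generate(task)                         # Generator: first shot
    trail = []                                      # keep the trail visible
    for _ in range(max_rounds):
        violations = verify(task, answer)           # Verifier: find broken constraints
        trail.append((answer, violations))
        if not violations:
            return answer, trail                    # Gate: a clean answer passes
        answer = correct(task, answer, violations)  # Corrector: repair or escalate
    return None, trail                              # Gate: refuse rather than ship broken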

Current signal

In the pilot run, correction loops held the line far better than direct answers.

In the published 120-case comparison, direct answers passed 45.8% of the constraint checks. The strongest tool-structured AANA run reached 98.3% while lifting the capability score from 0.662 to 0.922. Treat this as a research signal, not a certified benchmark.

Direct-answer pass rate 45.8%

Useful answers, but without the same guardrails.

Best AANA run 98.3%

Verifier feedback plus deterministic constraint checks.

Capability lift +0.260

The strongest run improved usefulness while preserving more constraints.


Shareable graphic

Three visual maps for explaining AANA fast.

Use these when someone needs the big idea first: the loop, the pressure dynamics, and the layered constraints that make alignment more than a single yes-or-no check.

Downloadable deck

A presentation deck for the alignment dynamics story.

The 15-slide deck turns the theory into a talk track: pressure, misclassification, viable regions, correction capacity, and the AANA loop. Read the PDF, or use the PowerPoint when you want to present or adapt the material.

PDF: 15 slides, 14.6 MB. PPTX: editable, 16.7 MB.

Research papers

The deeper theory behind the public story.

These manuscripts give the formal version of the ideas shown above: verifier-grounded correction, alignment dynamics, invisible divergence, and layered constraints. They are early research papers, not peer-reviewed benchmark claims.


Architecture paper

Alignment-Aware Neural Architectures

The architectural blueprint for turning one-shot generation into a checked correction loop.


Dynamics paper

Alignment as a Dynamical System

A mathematical lens on why alignment can decay under pressure unless correction scales with it.


Layered constraints

Invisible Divergence

Why visible capability can rise while hidden constraints pull the system away from what matters.

Try it in 60 seconds

Run a tiny local sample without an API key.

The included sample is small on purpose: enough to see the scoring flow, output files, and gap signals before running larger model experiments.

Read the full README
python scripts/dev.py sample

# What to watch:
# capability_score - useful and task-fit
# alignment_score  - preserves constraints
# gap_score        - where capability and alignment diverge