Multi-agent accountability: who co-signs the handoff between your CrewAI agents?

#crewai #multiagents #python #security

In the previous article, we added cryptographic audit trails
to individual CrewAI agents in 3 lines. Each agent now has its own hash-chained,
Ed25519-signed event history.

But in a real crew, agents don't work in isolation. The Researcher produces
output. The Writer consumes it. The Reviewer validates it. These handoffs are
where the real accountability gap lives.

Who cryptographically proves that the Researcher actually produced that specific
output, and that the Writer actually received it unmodified?

The gap that logs don't close

You can read your crew's logs and reconstruct what happened. But logs are
unilateral — they record what one agent observed. If you want to prove a
handoff to a third party (an auditor, a regulator, a client), you need
something both sides signed.

In legal terms: you need non-repudiation. Neither party can deny the interaction
occurred, because both chains contain the same proof.

How PiQrypt's A2A protocol works

When two PiQrypt-equipped agents interact, the exchange produces an
interaction_hash — a deterministic identifier of the interaction payload,
present in both agents' chains. To falsify the exchange, you'd have to
modify both chains and keep the hashes and signatures consistent across both.
Without both agents' private keys, that is computationally infeasible.

The mechanism: a pairwise Ed25519 handshake — proposal from agent A, response
from agent B, co-signed event appended to both chains. Same interaction_hash
in both memories.
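To make the idea concrete, here is a minimal standard-library sketch of a deterministic interaction hash embedded in two separate hash chains. It is not PiQrypt's implementation: the function names, the event layout, and the field names (`from`, `to`, `prev`, `event_hash`) are assumptions made for illustration, and the Ed25519 signatures are left out to keep it short.

```python
import hashlib
import json


def interaction_hash(sender_id: str, receiver_id: str, payload_hash: str) -> str:
    """Deterministic identifier: the same inputs always give the same digest."""
    # Canonical JSON (sorted keys, fixed separators) keeps the byte encoding stable.
    canonical = json.dumps(
        {"from": sender_id, "to": receiver_id, "payload_hash": payload_hash},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_event(chain: list, body: dict) -> dict:
    """Append an event whose hash covers its body and the previous event's hash."""
    prev_hash = chain[-1]["event_hash"] if chain else "0" * 64
    material = json.dumps({"prev": prev_hash, "body": body}, sort_keys=True)
    event = {
        "prev": prev_hash,
        "body": body,
        "event_hash": hashlib.sha256(material.encode("utf-8")).hexdigest(),
    }
    chain.append(event)
    return event


def chain_is_consistent(chain: list) -> bool:
    """Recompute every link; an edited body or a broken link returns False."""
    prev_hash = "0" * 64
    for event in chain:
        if event["prev"] != prev_hash:
            return False
        material = json.dumps({"prev": prev_hash, "body": event["body"]}, sort_keys=True)
        if hashlib.sha256(material.encode("utf-8")).hexdigest() != event["event_hash"]:
            return False
        prev_hash = event["event_hash"]
    return True


# Illustrative payload and agent names only.
payload_digest = hashlib.sha256(b"competitor A: $49/mo").hexdigest()
shared = interaction_hash("researcher_01", "writer_01", payload_digest)

researcher_chain, writer_chain = [], []
append_event(researcher_chain, {"type": "handoff_sent", "interaction_hash": shared})
append_event(writer_chain, {"type": "handoff_received", "interaction_hash": shared})
```

Editing the event body in one chain makes `chain_is_consistent` fail for that chain, while the other chain still carries the original `interaction_hash`. In a real system the signatures are what stop an attacker from simply recomputing the hashes.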

The code

from crewai import Crew, Task
from piqrypt.bridges.crewai import AuditedAgent as Agent

# Each agent gets its own cryptographic identity
researcher = Agent(
    role="Researcher",
    goal="Find competitive pricing data",
    backstory="Expert at finding and analyzing market data.",
    agent_name="researcher_01"
)

writer = Agent(
    role="Writer",
    goal="Produce a pricing analysis report",
    backstory="Turns raw data into clear executive summaries.",
    agent_name="writer_01"
)

When the Researcher passes its findings to the Writer, you record the handoff
explicitly. The keys are the ones returned when the identities were created
via aiss.generate_keypair() — store them at agent init time.

from aiss.a2a import perform_handshake, record_external_interaction
import hashlib

# Establish cryptographic trust between the two agents (once per pair)
handshake = perform_handshake(
    researcher_private_key,
    researcher_public_key,
    researcher_agent_id,
    peer_agent_id=writer_agent_id,
    peer_public_key=writer_public_key
)

# Record the actual handoff
payload_hash = hashlib.sha256(findings_text.encode()).hexdigest()

record_external_interaction(
    researcher_private_key,
    researcher_agent_id,
    peer_id=writer_agent_id,
    interaction_type="findings_handoff",
    payload_hash=payload_hash  # content is hashed, never stored raw
)

The payload_hash is now in both chains. Neither agent can claim the
interaction didn't happen. Neither can claim a different payload was
transmitted.
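On the receiving side, the same digest can be recomputed from whatever text actually arrived. The helper below is a hypothetical sketch using only the standard library; `payload_matches` is not part of PiQrypt, and the sample findings text is made up.

```python
import hashlib
import hmac


def payload_matches(received_text: str, recorded_payload_hash: str) -> bool:
    """Recompute the SHA-256 of what arrived and compare it to the recorded digest."""
    recomputed = hashlib.sha256(received_text.encode()).hexdigest()
    # compare_digest does a constant-time comparison of the two hex strings.
    return hmac.compare_digest(recomputed, recorded_payload_hash)


# Sample data for illustration only.
findings_text = "Competitor A charges $49/month; competitor B charges $59/month."
payload_hash = hashlib.sha256(findings_text.encode()).hexdigest()
```

If the Writer's copy differs by even one character, `payload_matches` returns `False`, which is what lets the recorded hash stand in for the content without storing it.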

What about agents outside your crew?

If your pipeline calls an external agent that doesn't have PiQrypt,
you still record your side:

# Peer without PiQrypt — unilateral proof, honestly recorded
record_external_interaction(
    researcher_private_key,
    researcher_agent_id,
    peer_id="external_autogen_analyst",
    interaction_type="result_transmitted",
    payload_hash=payload_hash
)
# Your chain proves you transmitted this payload at this timestamp.
# The peer didn't co-sign — that fact is visible in the record.

Your chain records the interaction. The absence of a co-signature is
structurally visible — you prove your side, and any auditor can see the
peer didn't participate in the proof.

Verifying the full picture

import aiss
from aiss.exceptions import InvalidChainError

researcher_events = aiss.load_events("researcher_01")
writer_events = aiss.load_events("writer_01")

try:
    aiss.verify_chain(researcher_events)
    print("researcher_01: chain intact")
except InvalidChainError as e:
    print(f"researcher_01: chain compromised — {e}")

try:
    aiss.verify_chain(writer_events)
    print("writer_01: chain intact")
except InvalidChainError as e:
    print(f"writer_01: chain compromised — {e}")

# The interaction_hash appears in both chains — cross-verifiable
# without a server, without PiQrypt infrastructure.
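The snippet above checks each chain on its own. The cross-check itself could look roughly like the sketch below, which assumes each loaded event is a dict that may carry an `interaction_hash` key. That schema is a guess for illustration, not PiQrypt's documented event format, and the sample events are hand-written.

```python
def interaction_hashes(events):
    """Collect interaction hashes from a list of event dicts (hypothetical schema)."""
    return {e["interaction_hash"] for e in events if "interaction_hash" in e}


def cross_check(events_a, events_b):
    """Return hashes present in both chains, only in A, and only in B."""
    a, b = interaction_hashes(events_a), interaction_hashes(events_b)
    return {"shared": a & b, "only_a": a - b, "only_b": b - a}


# Hand-written sample events, not real PiQrypt output.
researcher_sample = [
    {"type": "findings_handoff", "interaction_hash": "ab12"},
    {"type": "result_transmitted", "interaction_hash": "cd34"},
]
writer_sample = [{"type": "findings_handoff", "interaction_hash": "ab12"}]

report = cross_check(researcher_sample, writer_sample)
```

Hashes under `shared` are handoffs both agents recorded. Anything under `only_a` or `only_b` is either a unilateral record (for example, a peer without PiQrypt) or a discrepancy worth investigating.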

vs LangSmith / CrewAI traces

CrewAI's built-in tracing and LangSmith both provide excellent observability
within their respective frameworks. The distinction isn't observability;
it's tamper-evidence.

A trace tells you what happened. A co-signed chain proves it, offline,
to a third party who has no reason to trust you. The interaction_hash
is the same in both agents' memories, so if one side is altered, the two
records no longer match.
These are complementary tools, not competing ones.

Multi-agent sessions: N agents, N×(N-1)/2 handshakes

For larger crews, PiQrypt's AgentSession handles the pairwise handshakes
automatically:

session = aiss.AgentSession(
    agents=[
        researcher_id["agent_id"],
        writer_id["agent_id"],
        reviewer_id["agent_id"]
    ]
)
# 3 agents → 3 pairwise handshakes established
# Every cross-agent interaction is now co-signed
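The handshake count in the heading is simply the number of unordered pairs among N agents. A quick standard-library check, independent of PiQrypt (the helper names and the `reviewer_01` agent name are illustrative assumptions):

```python
from itertools import combinations


def handshake_pairs(agent_ids):
    """Every unordered pair of distinct agents needs one handshake."""
    return list(combinations(agent_ids, 2))


def handshake_count(n: int) -> int:
    """Closed form for the number of unordered pairs among n agents."""
    return n * (n - 1) // 2


pairs = handshake_pairs(["researcher_01", "writer_01", "reviewer_01"])
```

The count grows quadratically: 10 agents already need 45 pairwise handshakes, which is why automating them per session matters for larger crews.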

This scales to cross-framework sessions — a CrewAI researcher, an AutoGen
analyst, an Ollama local model — all in the same session, all with
cryptographic proof of every handoff.

The principle

CrewAI handles orchestration — who calls what, in what order, with what tools.
PiQrypt handles proof — that it happened, exactly as described, and that
neither party can deny it.

Both are necessary. Neither replaces the other.

Part 1: How I added cryptographic audit trails to any CrewAI crew in 3 lines

Part 3 (next): Watch your CrewAI agents in real-time with PiQrypt Vigil