Replay-Based Forking #
This file adds an additive replay-based fork path beside the existing seed-based
forking infrastructure in VCVio.CryptoFoundations.SeededFork.
The key idea is to record a first-run QueryLog, then replay that transcript
exactly up to a selected fork point while changing one distinguished oracle
answer. This is meant to support settings such as Fiat-Shamir where ambient
randomness should be replayed by transcript rather than by pre-generated seed
coverage.
The query input at position n in the full interleaved log, if it exists.
Instances For
The n-th answer in the log for queries to oracle t, if it exists.
Instances For
If getQueryValue? log t n = some u, then the n-th t-filtered entry of
log is ⟨t, u⟩.
Converse: if the n-th t-filtered entry is ⟨t, u⟩, then
getQueryValue? log t n = some u.
Every entry of log.getQ (· = t) has its first component equal to t.
If the t-filtered log has at least n + 1 entries, then getQueryValue? log t n
is some _.
Prepending an entry whose oracle index does not match t leaves the t-indexed
view of the log unchanged.
The first entry of getQueryValue? (⟨t, u⟩ :: log) t 0 is the prepended value.
Prepending a matching ⟨t, _⟩ entry shifts later t-indexed lookups by one.
The prefix of log up to (but not including) the s-th i-query.
If log has fewer than s + 1 queries to i, this returns the entire log. This is
the replay analogue of QuerySeed.takeAtIndex and is the key slicing operator used in
the Cauchy-Schwarz step of the replay forking bound.
Instances For
The prefix takeBeforeForkAt log i s has at most s queries to i.
If log contains at least s + 1 queries to i, then takeBeforeForkAt log i s has
exactly s queries to i.
takeBeforeForkAt log i s is a prefix of log.
getQueryValue? on the truncation at index i position s is none: the prefix
has fewer than s + 1 entries at oracle i.
If log has at most s entries of type i, then truncating log at position s
leaves it unchanged: there is no s-th i-entry to truncate before.
Replay state for the second run of a replay-based fork.
trace is the first-run transcript, cursor tracks how much of that transcript
has been matched so far, distinguishedCount counts how many queries to the
forked oracle family have already been replayed, and observed stores the
actual second-run transcript.
- trace : spec.QueryLog
- cursor : ℕ
- distinguishedCount : ℕ
- forkQuery : ℕ
- replacement : spec.Range i
- forkConsumed : Bool
- mismatch : Bool
- observed : spec.QueryLog
Instances For
Initial replay state before the second run starts.
Instances For
The next entry in the logged first-run transcript, if any.
Instances For
Record an observed query-answer pair in the second-run log.
Instances For
Mark the replay as having deviated from the first-run prefix.
Instances For
Replay oracle for the second run of a replay-based fork.
Before the fork point, the oracle must match the logged first-run transcript exactly. At the selected distinguished query it returns the replacement answer. Once the fork point has been consumed, or once replay has mismatched the logged prefix, the oracle falls through to the live ambient oracle.
Instances For
Run main with query logging. This is the first-run object for replay forks.
Instances For
Replay the second run against a fixed first-run log, fork point, and replacement answer, returning both the final output and replay state.
Instances For
Deterministic replay-fork core with the first-run output and transcript fixed.
This mirrors seededForkWithSeedValue: the first-run result and replacement answer are
inputs, while the second run may still make live oracle calls after the fork
point.
Instances For
Additive replay-based fork operation.
The first run is obtained via withQueryLog. The second run then replays that
transcript exactly up to the selected distinguished query, replacing exactly one
answer at that query.
Instances For
Live-mode α-marginal #
Once the replay oracle has transitioned into "live mode" (either forkConsumed = true
after the fork fired, or mismatch = true after a trace mismatch or exhaustion), every
subsequent query simply falls through to the ambient query t and records the answer in
observed. In particular, the α-component of the simulated computation coincides with
main itself: the state only records observations and does not influence the output.
These lemmas are used in the B1 faithfulness proofs (evalDist_uniform_bind_fst_replay RunWithTraceValue_takeBeforeForkAt and tsum_probOutput_replayFirstRun_weight_take BeforeForkAt): after the fork point on either side (full log vs. truncated log), both
computations enter live mode, and the α-marginal collapses to evalDist main.
Live mode is preserved by noteObserved: neither forkConsumed nor mismatch is
touched by recording an observation.
Live-mode α-marginal. Starting from a replay state in live mode
(forkConsumed = true or mismatch = true), the α-component of running the replay
oracle on main equals main itself. The state only accumulates observations; it has
no effect on the α-distribution.
α-marginal form of fst_map_simulateQ_replayOracle_of_live, specialized to the
evalDist level. Once in live mode, the α-output distribution of the replay run is
evalDist main.
Prefix-style replay invariant: the consumed prefix of observed matches the consumed prefix of
trace at the level of query inputs, and if the fork has not fired yet then observed has no
extra suffix beyond that prefix.
The values clause additionally pins down values on the non-fork positions: for every position
n strictly before the fork (or every position < cursor when the fork has not yet fired),
observed[n]? = trace[n]?, i.e., both the input oracle and the stored response agree. At the
fork position itself the value in observed is the replacement, so only the input agrees.
The distinguishedCount_eq and fork_position clauses pin down the position of the fork entry
in the filtered i-log: while pre-fork with no mismatch, distinguishedCount counts the number
of i-type entries in observed; once the fork has fired, the entry at position cursor - 1
in observed is exactly the forkQuery-th (0-indexed) i-type entry, i.e., the prefix
observed.take (cursor - 1) contains exactly forkQuery entries of type i.
Instances For
Per-query preservation of the replay prefix invariant: a single
replayOracle i t step starting from any state satisfying
ReplayPrefixInvariant produces a state still satisfying it.
Made protected (formerly private) so the Std.Do.Triple bridge in
VCVio/CryptoFoundations/ReplayForkStdDo.lean can lift this to a per-query
spec consumable by mvcgen.
Every reachable replay state preserves the logged query-input prefix up to cursor.
Support-level replay-prefix lemma: before cursor, the second run queries the same oracle
inputs as the logged first-run transcript.
Support-level value-agreement lemma: before the effective fork position (i.e., before
cursor - 1 once the fork has fired, or before cursor while it has not), the second-run
observed log agrees with the first-run trace log on both the query input and the stored
response. This strengthens replayRunWithTraceValue_prefix_input_eq and is the key lemma
for arguing that the adversary's query transcript prior to the fork is identical across
the two runs.
Extract the "fork-position count" invariant: once the fork has fired, the prefix of observed
up to the fork position contains exactly st.forkQuery entries of type i. This pins down where
the replacement entry sits in the filtered i-log.
If replay has consumed the fork point, the last consumed log entry is the distinguished query
input i. This is the pathwise "same target" fact used downstream.
The replay oracle never mutates the immutable parameters forkQuery, replacement,
or trace.
Made protected (formerly private) so the Std.Do.Triple bridge in
VCVio/CryptoFoundations/ReplayForkStdDo.lean can lift this to a per-query
spec consumable by mvcgen.
Second replay invariant: once the fork has fired, the forkQuery-th entry among
distinguished-oracle queries in the observed log is exactly the replacement value.
Before the fork fires and before any mismatch, distinguishedCount exactly tracks the number
of i-entries in observed. Once the fork fires, position forkQuery in the i-filtered
observed log is permanently pinned to replacement.
Instances For
Per-query preservation of the replay replacement invariant.
Made protected (formerly private) so the Std.Do.Triple bridge in
VCVio/CryptoFoundations/ReplayForkStdDo.lean can lift this to a per-query
spec consumable by mvcgen.
Every reachable replay state preserves the replacement invariant.
If the replay has consumed the fork and the fork point is forkQuery, then the
forkQuery-th distinguished-oracle entry in the observed log is exactly the replacement.
Every replay run can be realized as a logged run with the same observed transcript.
Replay does not increase the total number of oracle queries. If main makes at most
n queries, then replayRunWithTraceValue main i trace forkQuery replacement also makes
at most n queries.
Reduces to IsTotalQueryBound.simulateQ_run_of_step with
replayOracle_step_isTotalQueryBound supplying the per-step bound of 1.
If forkReplay succeeds, both runs agree on the selected fork index.
On forkReplay support, first-projection success equals the pair-style success event.
This mirrors probEvent_seededFork_fst_eq_probEvent_pair for the replay fork.
Helper lemmas for le_probOutput_forkReplay #
The pointwise replay bound is proved by the same three-step decomposition used for
le_probOutput_seededFork:
probOutput_collisionReplay_le_main_div(replay analogue ofprobOutput_collision_le_main_div): bounds the probability that the uniformly sampled replacementucollides with the logged answer at thes-thi-query of the first run. The bound isPr[cf <$> main = some s] / hwhereh = |spec.Range i|. Proof is purely about thereplayFirstRunmarginal: for any fixedv,Pr[u = v | u ← uniform] = 1/h.noGuardReplayComp_le_forkReplay_add_collisionReplay(replay analogue ofhNoGuardLeAdd): a structural inequality saying that the unrestricted "no-guard" composition (which always runs the second pass and inspects both projections ofcf) is dominated by the genuineforkReplaysuccess event plus the collision event. Proof is pointwise on(x₁, log)-pairs fromreplayFirstRun.sq_probOutput_main_le_noGuardReplayComp(replay analogue ofsq_probOutput_main_le_noGuardComp): the Jensen / Cauchy-Schwarz step. Squares the probability that the first run satisfiescf x₁ = some sand bounds it by the no-guard joint success probability. This is the deepest piece: it requires a replay-side analogue ofseededOracle.tsum_probOutput_generateSeed_weight_takeAtIndex, stating that averaging over the full first-run log is equal to averaging over the "log truncated at thes-thi-query, then completed with a fresh uniform answer plus live tail samples".
Reachability hypothesis on the fork-index selector cf: whenever the first run
of main outputs x and the recorded log is log, every selected fork index
s = cf x actually corresponds to an i-query in log (i.e. the s-th i-query
exists in the log). This is needed for the replay forking lemma because, unlike
the seeded variant, forkReplay's second run cannot fork at a position the first
run never reached. In FiatShamir-style applications cf extracts the index of a
recorded query, so this property holds by construction.
Instances For
Replay state-correctness invariant #
The next group of lemmas establishes that when main is replayed against a log it
itself produced and the fork index is reachable in that log, the second run reaches
the fork query without mismatching the prefix. The proof has three parts:
replayOracle_preservesConsumed(per-step) andreplayRun_preservesConsumed(full simulation): onceforkConsumed = true ∧ mismatch = falseholds at some point, both flags stay set for the remainder of the run.replayRun_state_correct_aux: a coupled inductive invariant overmainshowing that, starting from a state coupled to a partial first run with enough remainingi-queries to hit the fork, the simulation reachesforkConsumed = truewithmismatch = false.replayRunWithTraceValue_state_correct: the user-facing corollary obtained by instantiating the auxiliary invariant at the initial replay state.
Replay correctness invariant: starting from a logged first run of main whose
log already records an i-query at position ↑s, replaying main against that log
with substitution at the fork query always reaches the fork (so forkConsumed = true
and mismatch = false on every output state).
This is the user-facing corollary of replayRun_state_correct_aux, instantiated at
the initial replay state produced by ReplayForkState.init. The invariant is used in
the replay forking lemma to argue that the no-guard composition cannot succeed via a
state-failure path that forkReplay would reject.
The "no-guard" replay composition: run main with logging, sample u, then run
main a second time with the replay oracle (replaying log up to the s-th i-query
and substituting u there). Always returns the pair of cf-projections, with no
guards. This is the replay analogue of the noGuardComp used in
sq_probOutput_main_le_noGuardComp for the seeded fork.
Instances For
The "collision" replay composition: run main with logging, sample u, and
return cf x₁ only when u coincides with the logged answer at the s-th i-query.
This is the replay analogue of collisionComp used in the seeded forking proof.
Instances For
Key pointwise replay lower bound. This is the replay analogue of
le_probOutput_seededFork.
The summed (aggregated) version is the replay analogue of Firsov-Janku's pr_fork_success
in fsec/proof/Forking.ec:1175. The quantitative argument
decomposes into:
pr_split[Forking.ec:410]: factor out theacc · h⁻¹collision term.pr_succ_resp_eq[Forking.ec:480]: exchange symmetry of the two replay answers.pr_fork_specific[Forking.ec:1115]: pointwisePr[success at s]² ≤ Pr[fork at s].square_sum[Forking.ec:1148]: Jensen / Cauchy-SchwarzΣ aⱼ² ≥ (Σ aⱼ)² / Q.
In Lean the analogous pointwise bound corresponds to step (3) combined with (1) and is
structurally similar to the seed-based le_probOutput_seededFork proof in
VCVio/CryptoFoundations/SeededFork.lean, with replayFirstRun/replayRunWithTraceValue playing
the role of generateSeed/seededOracle and QueryLog.takeBeforeFork-style slicing replacing
QuerySeed.takeAtIndex.
The proof reduces to three helper lemmas:
probOutput_collisionReplay_le_main_div,
sq_probOutput_main_le_noGuardReplayComp, and
noGuardReplayComp_le_forkReplay_add_collisionReplay.
The hreach hypothesis (CfReachable) is needed because, unlike the seed-based version
(where cf x = some s always implies the s-th query value is well-defined in the seed),
in the replay setting, cf is computed from x independently from the actual queries
made by the run that produced it. The hypothesis captures the natural condition that the
fork point s chosen by cf always corresponds to a query that was actually issued.
Currently conditional on the two prefix-faithfulness sorrys feeding
sq_probOutput_main_le_noGuardReplayComp:
evalDist_uniform_bind_fst_replayRunWithTraceValue_takeBeforeForkAt (induction on main)
and tsum_probOutput_replayFirstRun_weight_takeBeforeForkAt (weighted-averaging induction).
Downstream consumers (probOutput_none_forkReplay_le, le_probEvent_isSome_forkReplay,
Fork.replayForkingBound, euf_nma_bound, euf_cma_bound) inherit this conditionality
until both faithfulness lemmas are discharged.
The replay-fork precondition is itself bounded by 1. This mirrors
seededFork_precondition_le_one; the statement is independent of the fork mechanism.
Replay fork failure probability bound. This mirrors probOutput_none_seededFork_le;
the proof structure is identical, substituting the pointwise replay lower bound
le_probOutput_forkReplay for its seed-based analogue. The hreach hypothesis is
threaded through from le_probOutput_forkReplay.
Currently conditional on the two B1 prefix-faithfulness sorrys (transitively via
le_probOutput_forkReplay → sq_probOutput_main_le_noGuardReplayComp).
Packaged replay forking theorem. This is the replay analogue of
le_probEvent_isSome_seededFork, derived from probOutput_none_forkReplay_le and
forkReplay_precondition_le_one by the same 1 - · conversion used in
le_probEvent_isSome_seededFork. The hreach hypothesis is threaded through.
Currently conditional on the two B1 prefix-faithfulness sorrys (transitively via
probOutput_none_forkReplay_le → le_probOutput_forkReplay
→ sq_probOutput_main_le_noGuardReplayComp).
Structural success facts for forkReplay: both outputs come from logged runs of main,
share the same selected fork index, differ at the selected distinguished-oracle answer, and the
second run is witnessed by a replay state whose observed log agrees with the first-run log on the
replayed prefix.
This mirrors Firsov-Janku's success_log_props at
fsec/proof/Forking.ec:1373. The Lean proof is library-native:
it unfolds forkReplay with mem_support_bind_iff, then consumes the already-proved
support-level lemmas replayRunWithTraceValue_mem_support_replayFirstRun,
replayRunWithTraceValue_prefix_input_eq, and
replayRunWithTraceValue_getQueryValue?_observed_eq_replacement. No procedural while-loop
invariant is needed; the replacement invariant ReplayReplacementInvariant is established
pointwise by induction on the replay oracle.
Replay property transfer: any postcondition that holds for every logged run of main
holds for both outputs of a successful replay fork, together with the common selected fork index
and the fact that the distinguished answers differ at that index.
This mirrors Firsov-Janku's property_transfer at
fsec/proof/Forking.ec:1351, combining fst_run_prop with
the shared-prefix facts established by success_log_props.