File Details
v0.32.8
- R
- May 22, 2026
- 379.48 KB
- 35
- 12.0.5
- Retail
File Name
GuildBankLedger-v0.32.8.zip
Supported Versions
- 12.0.5
GuildBankLedger
v0.32.8 (2026-05-22)
- Merge pull request #30 from RussellFeinstein/layout-sort
Sort hardening (v0.32.8): confirm-on-deposit, cross-tab in-window re-query, timeout taxonomy + diagnostics
- Sync docs to the shipped v0.32.8 sort hardening
Audit pass after the confirm-on-deposit and cross-tab in-window re-query work
landed in PR #30, so the docs match what shipped:
  - ROADMAP: the "Active" sort entry no longer lists "planner correctness across
    cold caches". This session disproved the cold-cache / scan-staleness theory
    (the scan is accurate; the cold tab was an executor observation issue, now
    fixed), so the entry lists the real remaining work: confirmation-speed tuning
    and the overflow-pack full-stack-merge edge case.
  - PLAN-phase-b-sort-hardening: marked SHIPPED in v0.32.8, noting the emergent
    confirm-on-deposit / cross-tab fixes beyond the original B1-B4a scope and the
    final timeout[s,p,c,m,dp,o] audit format (the n/r bucket names in the
    verification criteria were pre-implementation guesses).
  - 2026-05-21-deposit-latency-coldtab sort log: a resolution banner on the
    "Cold-tab: still open" section, since the cause is now confirmed and the
    tab-change query it proposed was ineffective (query is not select); the
    working fix is the in-window re-query.
Doc-only; no version bump or changelog entry.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
- Add view-gated-reads mock mode + cross-tab confirm regression tests
Round 2 of the in-window dst re-query fix (PR #30): lock in the now-confirmed
fix with the regression test that would have caught the original ineffective
one.
The mock fired GUILDBANKBAGSLOTS_CHANGED synchronously on every mutation
regardless of the viewed tab, so it could not reproduce the non-viewed-tab
staleness the fix targets, and every cold-tab attempt passed green. Add an
opt-in MockWoW.guildBank.viewGatedReads: reads of a tab other than currentTab
come from a per-tab visible snapshot refreshed only by QueryGuildBankTab, and a
mutation to a non-selected tab lands in the store but fires no event and stays
invisible until pulled. QueryGuildBankTab snapshots without changing the
selected tab (query is not select in-game). Default off, so the existing tests
are untouched.
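
The view-gated behaviour described above can be sketched roughly as follows. Only `viewGatedReads`, `currentTab`, and `QueryGuildBankTab` are names from the commit text; the `newMockBank` constructor, the `store` / `visible` tables, and the `deposit` / `read` helpers are hypothetical and do not reflect the real MockWoW layout.

```lua
-- Hypothetical stand-in for the mock's guild bank; not the real MockWoW code.
local function newMockBank()
  local bank = {
    viewGatedReads = true, -- opt-in flag named in the commit
    currentTab = 1,        -- the tab the player is viewing
    store = {},            -- authoritative "server" contents: store[tab][slot]
    visible = {},          -- per-tab snapshot that non-viewed reads come from
    events = 0,            -- count of simulated slots-changed events
  }

  -- Mutation: always lands in the store; only a viewed-tab change fires an event.
  function bank:deposit(tab, slot, item)
    self.store[tab] = self.store[tab] or {}
    self.store[tab][slot] = item
    if not self.viewGatedReads or tab == self.currentTab then
      self.events = self.events + 1
    end
  end

  -- Query: refresh the visible snapshot without changing the selected tab.
  function bank:QueryGuildBankTab(tab)
    local snap = {}
    for slot, item in pairs(self.store[tab] or {}) do snap[slot] = item end
    self.visible[tab] = snap
    self.events = self.events + 1
  end

  -- Read: the viewed tab is live, any other tab is served from its snapshot.
  function bank:read(tab, slot)
    if not self.viewGatedReads or tab == self.currentTab then
      return (self.store[tab] or {})[slot]
    end
    return (self.visible[tab] or {})[slot]
  end

  return bank
end

local bank = newMockBank()
bank:deposit(2, 1, "Flask")       -- tab 2 is not the viewed tab
local before = bank:read(2, 1)    -- stale: snapshot not refreshed yet
bank:QueryGuildBankTab(2)
local after = bank:read(2, 1)     -- visible after the pull
```

In this sketch the deposit is invisible until queried, and the query leaves `currentTab` alone, which mirrors the mock self-check listed under Tests.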
Tests (spec/sortexecutor_spec.lua): a mock self-check (deposit invisible until
queried, query does not select), a cross-tab op that confirms via the in-window
re-query with zero server-rejected timeouts (fails if the re-query regresses),
and a no-replan-echo check.
Test infrastructure only; no production code or user-visible change, so no
version bump or changelog entry. Full suite 1279 green, luacheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
- Re-query destination tab in-window so cross-tab deposits confirm fast
The v0.32.8 "view destination tab" fix (ddb2f2a) was ineffective in-game.
QueryGuildBankTab pulls a tab's data but does not change the selected tab, so
GetCurrentGuildBankTab stayed on the viewed tab. A 184-op capture (user on T6
the whole run, never toggling) showed every cross-tab op timing out
server-rejected at 4s then late-confirming at +5s (avg 5.64s/op, 17 min total).
The sort still converged, so this was a speed and false-failure-logging
problem, not a correctness one.
Mechanism: WoW pushes slot updates only for the currently-viewed tab, so a
deposit into any other tab lands server-side but is invisible to our reads
until something re-pulls that tab. Nothing in the 4s confirmation window
re-queried the destination; the only re-pull was the next op's step(), so each
deposit surfaced one op cycle (~5s) late.
Fix: while an op is in flight, re-query the destination tab (QueryGuildBankTab
at 0.5/1.5/3s, gated on the dst differing from GetCurrentGuildBankTab) so its
slots-changed event drives the async-confirm path in-window. It runs in the
background and does not change the user's viewed tab. The mis-timed pre-op
query and its pendingViewQuery suppression are removed.
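
A rough sketch of the in-window re-query, assuming the game calls are injected so it can run outside WoW. The delays and the `QueryGuildBankTab` / `GetCurrentGuildBankTab` gating come from the commit text; `scheduleDstRequery`, the `env` table, and treating `state.waiting` as a reference to the in-flight op are assumptions made for illustration.

```lua
-- Delays from the commit text; everything else here is a hypothetical sketch.
local REQUERY_DELAYS = { 0.5, 1.5, 3.0 }

-- env supplies the game calls so the sketch runs outside WoW:
--   env.after(seconds, fn)       stands in for C_Timer.After
--   env.QueryGuildBankTab(tab)   and env.GetCurrentGuildBankTab()
local function scheduleDstRequery(env, state, op)
  for _, delay in ipairs(REQUERY_DELAYS) do
    env.after(delay, function()
      -- Skip if the op already confirmed or a different op is now in flight.
      if state.waiting ~= op then return end
      -- The viewed tab already receives pushed updates; only re-pull others.
      if op.dstTab ~= env.GetCurrentGuildBankTab() then
        env.QueryGuildBankTab(op.dstTab)
      end
    end)
  end
end

-- Fake scheduler for demonstration: collects timers, fired by hand below.
local timers, queried = {}, {}
local env = {
  after = function(delay, fn) timers[#timers + 1] = { delay = delay, fn = fn } end,
  GetCurrentGuildBankTab = function() return 6 end,
  QueryGuildBankTab = function(tab) queried[#queried + 1] = tab end,
}

local op = { dstTab = 5 }
local state = { waiting = op }
scheduleDstRequery(env, state, op)
timers[1].fn()          -- 0.5s: op in flight, dst not viewed, so it queries
state.waiting = nil     -- op confirmed
timers[2].fn()          -- 1.5s: nothing in flight, so no query
```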
Harden the replan gate so the re-query's own event is safe: a slots-changed
event with no in-flight op now replans only on a genuine divergence (the last
completed op's slots no longer match their projected post-state), not on any
unexplained event. A refresh echo that matches is the executor's own re-query,
not foreign activity. This replaces the flag-based suppression and also stops
unrelated foreign activity from forcing needless rescans.
The two destination-tab-view tests are removed: they exercised the removed
pre-op query, and the current mock fires the confirm event synchronously on
every mutation regardless of viewed tab, so it cannot reproduce the
non-viewed-tab staleness. Round 2 (after the confirming capture) adds an opt-in
view-gated-reads mock mode and a regression test.
No version bump (PR #30 / 0.32.8 stacked work). Full suite green (1276),
luacheck clean.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
- View destination tab during execution (cold-tab fix)
The viewed=TN instrument plus the user clicking tabs mid-sort pinned the
"cold tab" cause decisively: WoW refreshes GetGuildBankItemInfo only for the
currently-viewed guild bank tab, so a deposit into a tab we are not viewing
still lands server-side but reads stale-empty, surfacing as a false
server-rejected timeout and a 4s stall. Proof it is observation, not the
deposit: ops marked server-rejected with viewed=T6 still drain their source
stack x13->x12->x11->x10, and the bank reaches T5:92 / 0 ops regardless.
viewed=T5 -> the same ops succeed in ~0.6s.
Fix: step() calls QueryGuildBankTab(op.dstTab) when GetCurrentGuildBankTab()
differs from the destination tab (only on an actual tab change), so the
destination reads are fresh and confirm-on-deposit fires. The query's own
GUILDBANKBAGSLOTS_CHANGED is swallowed by a one-shot state.pendingViewQuery
flag consumed at the top of the slots-changed handler. A one-shot flag, not
a time window: a window also swallowed genuine foreign events and broke the
reversion/replan tests; the flag is consumed by the query's own event and
leaves later genuine events intact. Operating on a non-viewed source is fine;
only observation of the destination needs the view.
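
The one-shot flag idea can be illustrated with a small sketch. Only `pendingViewQuery` is named in the commit; the handler shape, return strings, and `replans` counter are hypothetical.

```lua
-- Hypothetical handler shape; only pendingViewQuery is named in the commit.
local state = { pendingViewQuery = false, replans = 0 }

local function onSlotsChanged()
  -- One-shot: the first event after our own query is swallowed, then the
  -- flag is cleared so later events are handled normally.
  if state.pendingViewQuery then
    state.pendingViewQuery = false
    return "swallowed"
  end
  state.replans = state.replans + 1
  return "handled"
end

state.pendingViewQuery = true        -- set just before issuing the query
local first = onSlotsChanged()       -- the query's own event
local second = onSlotsChanged()      -- a later, genuine event
```

A time window would instead have to swallow every event inside it, including the genuine second one.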
Full write-up in docs/sort-logs/2026-05-21-viewed-tab-confirmed.md.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
- Confirm splits/moves into empty slots on deposit; add viewed-tab probe
Phase 2, driven by the deposit-latency capture: in-game the deposit lands
in ~0.6s while the source-stack decrement lags past the 4s window, so
waiting for source-drain wasted ~3.4s on every op.
opSucceeded(w): a split/move into an empty (or different-item) slot is
confirmed the moment the destination holds the item, dropping the
source-drain wait there; a merge into an occupied same-item slot still
requires source-drain (the optimistic-bounce case). Wired into all four
advance paths (sync, interim-poll, async, late-poll). The common case now
confirms in ~1s instead of 4s. Safe: deposits are server-driven and
persist (the bank converges correctly), and the change is gated on the
already-captured dstPreOp so it never relaxes the rule for merges.
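
A minimal sketch of the confirm-on-deposit rule, assuming slots are plain tables. The real `opSucceeded` takes a single `w` argument; the four-argument signature, the slot fields, and `srcCountBefore` below are assumptions made to keep the example self-contained.

```lua
-- Slots are modelled as { itemID = number, count = number } or nil when empty.
-- Field names and this signature are assumptions, not the addon's real ones.
local function opSucceeded(op, dstPreOp, dstNow, srcNow)
  local dstHoldsItem = dstNow ~= nil and dstNow.itemID == op.itemID
  if not dstHoldsItem then return false end

  local mergeIntoSameItem = dstPreOp ~= nil and dstPreOp.itemID == op.itemID
  if not mergeIntoSameItem then
    -- Empty or different-item destination pre-op: the deposit itself is proof.
    return true
  end

  -- Same-item merge: still require the source to have drained as expected.
  local srcCount = srcNow and srcNow.count or 0
  return srcCount == op.srcCountBefore - op.count
end

local op = { itemID = 100, count = 1, srcCountBefore = 13 }
local srcFull    = { itemID = 100, count = 13 }  -- decrement not visible yet
local srcDrained = { itemID = 100, count = 12 }

-- Split into an empty slot: confirmed even though the source still looks full.
local splitOk = opSucceeded(op, nil, { itemID = 100, count = 1 }, srcFull)
-- Merge into an occupied same-item slot: waits for the source to drain.
local mergePending = opSucceeded(op, { itemID = 100, count = 5 },
                                 { itemID = 100, count = 6 }, srcFull)
local mergeOk = opSucceeded(op, { itemID = 100, count = 5 },
                            { itemID = 100, count = 6 }, srcDrained)
```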
Cold-tab investigation (passive, no behavior change): stamp the
currently-viewed guild bank tab (GetCurrentGuildBankTab) as viewed=TN on the done,
timeout, and deposit-observer audit lines. The executor never queries a
tab (only the scanner does) and WoW tracks one viewed tab, so the next
capture can correlate the viewed tab with the >20s cold-tab first-deposit
tail without perturbing the executor. Mock gains GetCurrentGuildBankTab.
Full write-up in docs/sort-logs/2026-05-21-deposit-latency-coldtab.md.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
- Instrument deposit-confirmation latency; narrow abort to genuine refusals
Instrument-first step before changing the confirmation rule: measure when
guild-bank deposits actually land so Phase 2's wait can be tuned from real
data instead of an assumption.
  - Each no-op-suspected line now stamps elapsed-since-issue, so the first
    one on an op marks the fast-deposit latency.
  - An op whose deposit had not landed by the 4s timeout (server-rejected
    with an empty/other pre-op dst) is watched for up to 20s, logging when
    the deposit actually lands or that it never did. This captures the
    slow-deposit latency the 4s window misses.
  - The 3-strike abort is narrowed to genuine refusals: only a move/merge
    into a slot that already held the same item (a max-stack bounce) counts.
    Any op into an empty or different-item slot is an in-flight deposit
    (fast = drain-pending, slow = server-rejected) and no longer aborts, so
    a real sort runs to completion and a full latency distribution can be
    captured. A planner mistake like merging two full max-stack stacks still
    aborts. Exposed via isAbortableRefusal with a test hook.
No confirmation/advance behavior is trusted to the deposit yet; the op
still advances at 4s as before. That trust is Phase 2, tuned from this
capture.
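
The narrowed abort rule reduces to a small predicate. The sketch below assumes a slot shape of `{ itemID = ... }` and a two-argument signature; only the name `isAbortableRefusal` and the rule itself come from the commit.

```lua
-- dstPreOp is the destination slot captured before the op (nil when empty).
-- The slot shape { itemID = ... } is an assumption for this sketch.
local function isAbortableRefusal(op, dstPreOp)
  -- Only a refusal into a slot that already held the same item counts.
  return dstPreOp ~= nil and dstPreOp.itemID == op.itemID
end

local op = { itemID = 7 }
local intoEmpty     = isAbortableRefusal(op, nil)             -- in-flight deposit
local intoOtherItem = isAbortableRefusal(op, { itemID = 8 })  -- in-flight deposit
local intoSameItem  = isAbortableRefusal(op, { itemID = 7 })  -- max-stack bounce
```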
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
- Classify lagging-deposit splits as drain-pending; exclude from abort
The post-fix capture proved the cold-snapshot theory wrong: the sort's
splits actually succeed (destination tab fills monotonically in lockstep
with ops issued, source stacks drain across runs), but a guild-bank split
deposits into the destination before the client reflects the source-stack
decrement. Within the 4s confirmation window the source still looks full,
so srcDrainedAsExpected calls a successful split a no-op. Once the
classifier was fixed these correctly bucketed as merge-noop, and the
now-functional 3-strike abort started killing sorts that were placing
items correctly.
Split the merge-noop bucket using the already-captured dstPreOp: a timeout
where the destination was empty (or a different item) pre-op and now holds
the expected item is a fresh deposit whose source-drain merely lags, not a
refusal. Classify it drain-pending (reported as dp=N in the timeout
summary), exclude it from the 3-strike abort, and audit the
deposit-landed-source-pending state. A real sort now runs to completion
instead of bailing on its own progress, which both confirms the splits
succeed across a full run and sets up the proper fix (confirm a
split-into-empty on the deposit, not the lagging source-drain).
Replace the now-inaccurate cold-tab sort-log doc with one that records the
corrected diagnosis.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
- Classify sort timeouts from live slot state; add scan freshness diagnostics
The v0.32.8 timeout classifier matched "it:<id> x" at the start of the slot
description, but describeSlot prepends the resolved item name in-game
("Flask (it:NNN) xN"), so the match never fired: every real timeout fell
into the "other" bucket and the merge-noop / server-rejected buckets plus
the 3-strike abort were dead outside the no-item-name test env. Classify
from structured live-slot snapshots (snapshotLiveSlot) instead. A new
regression test warms the ItemCache so describeSlot carries the in-game name
and asserts the bucket and abort still fire.
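
The failure mode is easy to reproduce with Lua's own pattern matching. The patterns and item id below are illustrative guesses at the shape of the original match, not the addon's actual code.

```lua
-- Patterns and sample strings are illustrative; the real classifier may differ.
local ANCHORED = "^it:(%d+) x"      -- only matches when the id leads the string
local ANYWHERE = "it:(%d+)%)? x"    -- tolerates a leading name and a ")"

local testEnv = "it:12345 x5"           -- no item name resolved
local inGame  = "Flask (it:12345) x5"   -- name prepended, id parenthesised

local a = testEnv:match(ANCHORED)   -- matches in the test environment
local b = inGame:match(ANCHORED)    -- nil in-game, so the bucket never fires
local c = inGame:match(ANYWHERE)    -- matches, but is still string parsing
```

Loosening the pattern would have worked for this one format, but classifying from structured slot snapshots avoids depending on the rendered description at all.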
Add scan freshness diagnostics motivated by a 2026-05-21 capture where a cold
snapshot (5 occupied display slots read as empty) produced 5 phantom splits
that ran 21.3s before a warm rescan showed 0 ops needed: per-tab completedVia
(event vs query-timeout) and lockedSkips in a Scan: summary line, and a
per-tab occupied-slot breakdown on the sort plan line. These identify a cold
tab and its mechanism in /gbl sortlog. Full diagnosis in
docs/sort-logs/2026-05-21-phantom-split-cold-tab.md.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
- Stamp v0.32.8
VERSION, GuildBankLedger.toc, src/Core.lua, CLAUDE.md "Current:" line,
and docs/PLAN-phase-b-sort-hardening.md target-version line all bump
from 0.32.7 to 0.32.8. DEV_BUILD stays nil.
CHANGELOG.md and UI/ChangelogView.lua CHANGELOG_DATA get matching
v0.32.8 blocks documenting the Phase B sort hardening work
shipped under this stamp: B1 timeout taxonomy and 3-strike abort,
B2 interim polling cascade and counter reset on replan, B3 live-
primary audit lines and projectionDrifts counter, B4a Phase 2
planner instrumentation.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
- Add Phase 2 instrumentation for cycle / pivot / refused emit
Phase B sort hardening, milestone B4a. The Gateway Control Shard chain
in the 2026-05-14 and 2026-05-20 sort logs revealed a planner pattern
that should have been resolved by Phase 2's pivot-break loop but instead
emitted directly as a 6-cycle of refused ops. Without instrumentation,
the trace stopped at "the planner emitted these six ops"; B4b cannot
design a fix without knowing what Phase 2 saw at emit time.
canExecute now returns (ok, reason) where reason is one of
"src-shortfall", "dst-mismatch", or "max-stack-overflow". The existing
callers wrap the function in a boolean condition and pick up only the
first return, so control flow is unchanged. greedyDrain reads the
reason on a refusal and emits a debug audit line naming the failing
predicate.
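
The reason existing callers are unaffected is Lua's handling of multiple return values: a boolean context keeps only the first. In the sketch below the three reason strings are from the commit, while the argument list and the condition behind each reason are guesses based on the names.

```lua
-- Reason strings are from the commit; the op/slot fields and the exact
-- condition behind each reason are assumptions.
local function canExecute(op, src, dst, maxStack)
  if src.count < op.count then return false, "src-shortfall" end
  if dst.itemID ~= nil and dst.itemID ~= op.itemID then
    return false, "dst-mismatch"
  end
  if (dst.count or 0) + op.count > maxStack then
    return false, "max-stack-overflow"
  end
  return true
end

local op  = { itemID = 100, count = 5 }
local src = { itemID = 100, count = 3 }
local dst = { itemID = nil, count = 0 }

-- Existing-style caller: a boolean context keeps only the first return value.
local executed = false
if canExecute(op, src, dst, 20) then executed = true end

-- New-style caller: capture the reason for the debug audit line.
local ok, reason = canExecute(op, src, dst, 20)
```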
pivotBreakLoop emits three new debug lines: cycle-blocked (which slot
is wanted, which item blocks it), pivot-chosen (where the pivot lands),
and no-pivot abort (when findPivot returns nil). The pre-existing
"no-stuck" abort branch also gets a debug line for the rare "src
drifted in pure-planner mode" path.
All emissions go through self:SortDebug, which Logger.lua drops from
both the chat output and the log buffer unless
db.profile.sort.debugChat is true. Normal users see clean sort logs;
debugging a singleton-chain requires flipping the flag. A 20-emission
cap per plan prevents flooding the debug channel on degenerate inputs.
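
A sketch of the gating and cap, assuming a simple logger object. `SortDebug`, `debugChat`, and the cap of 20 come from the commit; `newSortLogger`, `beginPlan`, and the buffer layout are hypothetical.

```lua
-- Hypothetical logger; only debugChat and the cap of 20 come from the commit.
local DEBUG_CAP_PER_PLAN = 20

local function newSortLogger(profile)
  local logger = { buffer = {}, emitted = 0 }

  -- Reset the per-plan emission counter.
  function logger:beginPlan() self.emitted = 0 end

  function logger:SortDebug(msg)
    if not profile.sort.debugChat then return end          -- dropped entirely
    if self.emitted >= DEBUG_CAP_PER_PLAN then return end  -- cap reached
    self.emitted = self.emitted + 1
    self.buffer[#self.buffer + 1] = msg
  end

  return logger
end

local quiet = newSortLogger({ sort = { debugChat = false } })
quiet:SortDebug("cycle-blocked")

local verbose = newSortLogger({ sort = { debugChat = true } })
verbose:beginPlan()
for i = 1, 50 do verbose:SortDebug("pivot-chosen " .. i) end
```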
Tests cover three branches: cycle + pivot lines fire on a 2-cycle that
Phase 2 resolves; no-pivot-abort fires when both same-tab and overflow
are claimed; no Phase 2 lines leak into the buffer when debugChat is
off.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
- Clarify timeout-path audit: live observed primary, planner secondary
Phase B sort hardening, milestone B3. The timeout-path audit block in
SortExecutor.lua unconditionally emitted four lines per timeout: timed-
out marker, op intent, live observed values, planner-expected values.
In-game captures showed the third and fourth lines were nearly always
duplicates because the planner's emit-time projection of src/dst tends
to match live state on healthy ops, and when they diverged the
divergence itself was the signal of interest, not the projection
content.
Audit block now emits the planner-projected line ONLY when it diverges
from observed. Divergence compares rendered descriptions
(describePlannerSlot vs describeSlot) so nil and non-nil planner
projections fold into the same string-equality check. On divergence the
projected line is rewritten as "(planner projected: src ..., dst ...)"
and state.projectionDrifts increments.
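
The divergence check can be sketched as a string comparison of rendered slots. Here a single `describe` function stands in for both `describeSlot` and `describePlannerSlot`, and its output format is invented; `auditTimeout` is a hypothetical name.

```lua
-- describe() stands in for describeSlot / describePlannerSlot; the output
-- format is invented for this sketch.
local function describe(slot)
  if slot == nil then return "empty" end
  return string.format("it:%d x%d", slot.itemID, slot.count)
end

local state = { projectionDrifts = 0 }

-- Returns the audit lines for one timeout: observed always, projected only
-- when its rendered description differs from what was observed.
local function auditTimeout(observed, projected)
  local lines = {}
  lines[#lines + 1] = string.format("observed: src %s, dst %s",
    describe(observed.src), describe(observed.dst))
  local same = describe(observed.src) == describe(projected.src)
    and describe(observed.dst) == describe(projected.dst)
  if not same then
    state.projectionDrifts = state.projectionDrifts + 1
    lines[#lines + 1] = string.format("(planner projected: src %s, dst %s)",
      describe(projected.src), describe(projected.dst))
  end
  return lines
end

local live = { src = { itemID = 100, count = 5 }, dst = nil }
local matching = auditTimeout(live, { src = { itemID = 100, count = 5 }, dst = nil })
local drifted = auditTimeout(live, {
  src = { itemID = 100, count = 4 },
  dst = { itemID = 100, count = 1 },
})
```

Because nil renders as a plain string, a missing projection and a present one go through the same equality check.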
projectionDrifts surfaces in the finish() result table and in the
single-line summary as drifts=%d, between the timeout-class bucket and
the avg-per-op number. Non-zero indicates the planner's in-memory state
has drifted from server reality during the run. Expected when any ops
took the timeout path since the planner doesn't roll back its working
state on a no-op.
Tests cover both branches: planner projection matches observed (no
secondary line, drifts=0) and planner projection mismatches observed
(secondary line emitted, drifts=1). The existing diagnostic-counter
test also asserts projectionDrifts=0 on the happy-path baseline.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
- Add interim-poll cascade and reset refusal counter on replan
Phase B sort hardening, milestone B2 plus three B1 follow-ups identified
by post-ship audit.
Interim-poll cascade. After the sync post-Pickup check arms state.waiting,
schedule four C_Timer.After polls at 0.25/0.5/1.0/2.0 s. Each poll runs
the same advance predicates the sync and async paths use. First poll to
pass advances the op via the [interim-poll] audit tag and gives the
server a slightly longer cushion (0.5 s vs the default 0.3 s) before
issuing the next op. The 4.0 s MOVE_CONFIRM_TIMEOUT late-poll stays as
the backstop.
Without this, every op confirmation that doesn't land within
PickupGuildBankItem's synchronous window waits the full 4 s for the
late-poll. In-game captures show this dominates wall-clock sort time
(every op via [late-poll], 4.5 s/op average) even when the server
processed in well under a second.
cancelPollTimers helper. Centralized cleanup so any advance / abort /
replan / cancel path cancels pending poll timers in one call. Wired at
sync advance, cursor-stuck path, interim-poll advance, async-event
advance, late-poll cleanup, doReplan, and finish().
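
One way to sketch the cascade together with a single cleanup call is shown below. The delays, the `[interim-poll]` tag, and the `cancelPollTimers` name come from the commit; the commit does not say how cancellation is implemented, so the generation counter here is an assumption, as are `newPoller` and `arm`.

```lua
local POLL_DELAYS = { 0.25, 0.5, 1.0, 2.0 }  -- from the commit text

-- after(seconds, fn) stands in for C_Timer.After so this runs outside WoW.
local function newPoller(after)
  local poller = { generation = 0 }

  -- Invalidate every poll scheduled so far (advance, abort, replan, cancel).
  function poller:cancelPollTimers()
    self.generation = self.generation + 1
  end

  -- Schedule the cascade; the first poll whose predicate passes advances the
  -- op and invalidates the remaining polls.
  function poller:arm(confirmed, advance)
    self:cancelPollTimers()
    local mine = self.generation
    for _, delay in ipairs(POLL_DELAYS) do
      after(delay, function()
        if self.generation ~= mine then return end  -- stale timer
        if confirmed() then
          self:cancelPollTimers()
          advance("interim-poll")
        end
      end)
    end
  end

  return poller
end

-- Demonstration with a fake scheduler that just records callbacks.
local pending = {}
local poller = newPoller(function(_, fn) pending[#pending + 1] = fn end)

local landed, advances = false, {}
poller:arm(function() return landed end,
           function(tag) advances[#advances + 1] = tag end)

pending[1]()      -- 0.25s: not landed yet
landed = true
pending[2]()      -- 0.5s: confirms and advances
pending[3]()      -- 1.0s: stale, ignored
pending[4]()      -- 2.0s: stale, ignored
```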
B1.F1: doReplan was leaving consecutiveRefusedByItem populated across
plan rebuilds. Two refusals + replan + one refusal would falsely trip
the 3-strike abort even though plan structure changed between strikes 2
and 3. doReplan now clears the counter alongside state.waiting.
B1.F2: Test for per-item counter isolation (two refusals on item A plus
one on item B does not trigger the abort).
B1.F3: Test hooks _sortExecutorGetRefusalCount /
_sortExecutorSetRefusalCount for deterministic counter manipulation
without driving through real refusals first.
Mock affordance. MockWoW.deferredBankEvents (default off) gates
fireBankEvent so tests can drive the interim-poll cascade without the
sync mock firing GUILDBANKBAGSLOTS_CHANGED inside PickupGuildBankItem.
MockWoW.fireQueuedBankEvent pops queued events one at a time.
QueryGuildBankTab is intentionally NOT gated so scan setup keeps firing
synchronously.
State-alive guard. When the synchronous mock advances via the async-
event handler during PickupGuildBankItem, the recursive step() can
reach the "all done" branch and finish() before the outer step()
returns. The outer step() must short-circuit before scheduling polls
in that case to avoid indexing a nil state.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com
- Split timeout-classifier bucket and abort on repeated refusal
Phase B sort hardening, milestone B1. classifyTimeoutState now
distinguishes two cases that previously collapsed into "other":
  - "server-rejected" (was "none"): src holds expected item, dst empty,
    cursor empty. Server dropped the pickup before the drop landed.
  - "merge-noop" (was "other"): src and dst both hold the expected item,
    cursor empty, planner intended dst to be empty. Op was a no-op
    because the slot was already populated.
A merge-success-with-lost-ACK (planner intended a same-item merge and
got one) stays in "complete" via the new plannerDstAt branch so a
real success is not double-counted as a refusal.
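
The branches described above can be sketched as a pure function over simplified boolean inputs. The real `classifyTimeoutState` reads live slot state and has five branches; only the three described here are modelled, and every field name on `s` is an assumption.

```lua
-- Inputs are simplified booleans; all field names are hypothetical.
-- Anything not described in the commit text falls through to "other".
local function classifyTimeoutState(s)
  if s.plannerIntendedMerge and s.mergeLanded then
    return "complete"          -- a real success whose ACK was lost
  end
  if not s.cursorEmpty then return "other" end
  if s.srcHasItem and s.dstEmpty then
    return "server-rejected"   -- pickup dropped before the drop landed
  end
  if s.srcHasItem and s.dstHasItem and s.plannerDstEmpty then
    return "merge-noop"        -- slot was already populated
  end
  return "other"
end

local rejected = classifyTimeoutState({
  cursorEmpty = true, srcHasItem = true, dstEmpty = true,
})
local noop = classifyTimeoutState({
  cursorEmpty = true, srcHasItem = true, dstHasItem = true, plannerDstEmpty = true,
})
local lostAck = classifyTimeoutState({
  cursorEmpty = true, srcHasItem = true, dstHasItem = true,
  plannerIntendedMerge = true, mergeLanded = true,
})
```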
Each timeout in those two classes increments a per-item counter on
state.consecutiveRefusedByItem. Any success or non-refused timeout
resets it. On the third consecutive refusal for the same itemID the
executor finish()es with "repeated server refusal on item N (3
consecutive merge-noop|server-rejected)" instead of falling through
to step(). This stops the singleton-chain pattern from compounding
into a long chain of refused ops before the replan cap aborts.
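
A minimal sketch of the per-item counter, assuming outcomes arrive as strings. `consecutiveRefusedByItem` and the abort message wording come from the commit; `recordOutcome`, the `"success"` outcome string, and resetting only the affected item's entry are assumptions.

```lua
local REFUSAL_LIMIT = 3
local REFUSED = { ["merge-noop"] = true, ["server-rejected"] = true }

local state = { consecutiveRefusedByItem = {} }

-- Record one op outcome. Returns an abort message on the third consecutive
-- refusal for the same item, otherwise nil.
local function recordOutcome(itemID, outcome)
  local counts = state.consecutiveRefusedByItem
  if not REFUSED[outcome] then
    counts[itemID] = nil       -- success or non-refused timeout resets
    return nil
  end
  counts[itemID] = (counts[itemID] or 0) + 1
  if counts[itemID] >= REFUSAL_LIMIT then
    return string.format("repeated server refusal on item %d (%d consecutive %s)",
      itemID, counts[itemID], outcome)
  end
  return nil
end

-- Two refusals, then a success clears the streak.
recordOutcome(100, "merge-noop")
recordOutcome(100, "server-rejected")
local afterSuccess = recordOutcome(100, "success")

-- A refusal on another item does not count toward item 100.
recordOutcome(100, "merge-noop")
recordOutcome(100, "merge-noop")
local otherItem = recordOutcome(200, "merge-noop")
local abortMsg = recordOutcome(100, "merge-noop")
```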
The audit summary format string changes from
timeout[n=%d,p=%d,c=%d,o=%d]
to
timeout[s=%d,p=%d,c=%d,m=%d,o=%d]
with the obvious key mapping (s=server-rejected, m=merge-noop).
Tests cover all five classifier branches via the new
_sortExecutorClassifyTimeoutState test hook, drive a real 3-refusal
abort through the mock's bounce-on-max-stack-overflow behavior, and
verify a success between refusals clears the counter.
Co-Authored-By: Claude Opus 4.7 (1M context) noreply@anthropic.com