

Shadow Mode is a testing feature that allows businesses to evaluate new anti-fraud rules, risk models, or agent checks without disrupting live transactions. It works by running simulations in parallel with live systems, processing the same transaction data but without affecting actual outcomes. The results provide a clear comparison between live decisions and simulated ones, helping refine policies and improve decision-making.
Key Features:
Parallel Testing: Simulates transaction evaluations alongside live systems without interference.
Detailed Insights: Generates verdicts like PASS, FLAG, or BLOCK with explanations in Risk Dossiers.
Safe Environment: Keeps sandbox, staging, and production areas separate so that testing never puts live operations at risk.
Policy Refinement: Helps fine-tune rules using real transaction data before deployment.
Benefits:
Test unproven rules safely.
Identify gaps in risk models or policies.
Prevent disruptions in payment flows.
Strengthen security by analyzing live transaction patterns.
Shadow Mode enables controlled and informed updates to risk management systems, ensuring policies are effective before going live.

Figure: Shadow Mode Implementation Workflow: From Setup to Live Deployment
Setting Up Shadow Mode in Stablerail

Prerequisites for Enabling Shadow Mode
Before activating Shadow Mode in Stablerail, make sure your environment is properly configured. The Policy Console must enforce programmatic guardrails like transaction limits, role-based access, whitelists, and specific operational timeframes during simulations. Additionally, stablecoins need to be stored in MPC-secured vaults with split keys and configurable signing thresholds. This ensures that Stablerail never has unilateral authority to sign transactions.
The Treasury Hub acts as the intelligence layer between your assets and the blockchain, ensuring no blind signing occurs by providing detailed context for every transaction. Identity and access management should include measures such as SSO, SCIM, MFA, and hardware keys for all authorized signers. It's also critical that your environment is divided into sandbox, staging, and production areas to enable safe testing without risking live operations.
"Agents verify the context. Humans sign the transaction. The system protects the treasury - it never touches the money." – Stablerail
To confirm readiness, initiate a Shadow Audit through Stablerail’s interface. This audit highlights control gaps in your treasury processes and helps design a secure setup for decision-making without moving actual funds. The Pre-Flight Risk Dossiers generated during the audit provide verdicts like PASS, FLAG, or BLOCK based on simulated scenarios, showing how your policies would handle real-world transaction patterns.
Once these components are verified, you can move on to configuring test intents that mimic real transaction scenarios.
Configuring Shadow Mode for Test Intents
After completing the Shadow Audit and ensuring all prerequisites are met, you can begin setting up Shadow Mode to simulate transaction test intents. Start by uploading payout CSVs (up to 500 transfers) or invoice PDFs into the sandbox environment. Stablerail’s system agents will automatically extract context from these documents and align them with your existing policies.
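As a rough illustration of the batch limit mentioned above, the following sketch checks a payout CSV before it is uploaded. The column names (destination, amount_usd) and the function name are hypothetical; only the 500-transfer cap comes from the text, and Stablerail's actual file format may differ.

```python
import csv
import io

MAX_TRANSFERS = 500  # batch cap stated in the text
REQUIRED_COLUMNS = ("destination", "amount_usd")  # hypothetical column names


def load_payout_csv(text: str) -> list[dict]:
    """Parse a payout CSV and reject oversized or incomplete batches."""
    rows = list(csv.DictReader(io.StringIO(text)))
    if len(rows) > MAX_TRANSFERS:
        raise ValueError(
            f"batch has {len(rows)} transfers; limit is {MAX_TRANSFERS}"
        )
    for row_no, row in enumerate(rows, start=1):
        # A column counts as missing when it is absent or empty.
        missing = [col for col in REQUIRED_COLUMNS if not row.get(col)]
        if missing:
            raise ValueError(f"row {row_no} is missing {missing}")
    return rows
```

Running a check like this locally catches oversized batches before any agent processing starts.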
Activate Shadow Mode at the merchant or organization level through the administration portal. This allows you to process test intents side-by-side with live transactions, enabling a direct comparison of how your policies perform under both conditions.
"Every payment is simulated before execution. First-time destinations, address changes, and duplicates are caught before you sign." – Stablerail
You can also configure cool-off periods in your test policies, such as a 4-hour delay for transactions exceeding $100,000 or for payments to new beneficiaries. Maintain a Golden Source of approved vendor addresses - if an address is altered, the system will lock the payment to prevent potential fraud. Shadow Mode results are displayed in dedicated API objects and dashboards, which provide a clear comparison of what would happen under shadow evaluation versus current live rules.
These configurations ensure that every simulated transaction undergoes rigorous, real-time analysis, reinforcing the security of your treasury operations.
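The cool-off and Golden Source rules described above can be sketched as a small policy function. This is a minimal illustration, not Stablerail's policy-as-code format: the Intent type, the evaluate function, and the mapping of each rule to a PASS, FLAG, or BLOCK verdict are assumptions. The 4-hour delay and $100,000 threshold come from the text.

```python
from dataclasses import dataclass
from datetime import timedelta

COOL_OFF = timedelta(hours=4)   # delay stated in the text
LARGE_TRANSFER_USD = 100_000    # threshold stated in the text


@dataclass(frozen=True)
class Intent:  # hypothetical shape of a test intent
    vendor: str
    destination: str
    amount_usd: float


def evaluate(intent: Intent, golden_source: dict[str, str]) -> tuple[str, timedelta]:
    """Return (verdict, hold) for one intent.

    golden_source maps each approved vendor to its approved address.
    """
    approved = golden_source.get(intent.vendor)
    if approved is not None and approved != intent.destination:
        # Known vendor, altered address: lock the payment.
        return "BLOCK", timedelta(0)
    if approved is None or intent.amount_usd > LARGE_TRANSFER_USD:
        # New beneficiary or large transfer: hold for the cool-off period.
        return "FLAG", COOL_OFF
    return "PASS", timedelta(0)
```

For example, evaluate(Intent("acme", "0xB", 500), {"acme": "0xA"}) returns a BLOCK verdict because the address no longer matches the approved one.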
How Shadow Mode Validates Context Analysis
Running Agent Checks on Test Intents
Shadow Mode runs the same verification agents on test intents that it uses for live transactions, working in parallel with live evaluations. For example, when you upload a payout CSV or an invoice PDF into the sandbox environment, Stablerail's agents immediately begin processing the data. These agents perform tasks like sanctions screening, policy enforcement, behavioral anomaly detection, and counterparty risk scoring - all using the same transaction details, including device fingerprints and timestamps.
Sanctions screening checks blockchain addresses against blocklists covering terrorist financing, frozen wallets, and other illicit activity. Policy enforcement ensures that each intent complies with your specific rules, such as single-payment limits, role-based permissions, jurisdictional allowlists, and time-based restrictions. Behavioral agents watch for unusual patterns, like sudden activity spikes, repeated disputes, or signs of social engineering. Meanwhile, counterparty risk scoring evaluates the reputation of the recipient and cross-references your "Golden Source" whitelist to flag any address changes that might indicate fraud.
"Shadow mode allows a synchronous fraud evaluation without stopping the transaction from being processed by the payment processor." – DEUNA
If a shadow agent fails or returns an error, the primary evaluation is not disrupted. Shadow results are captured in a fraud_shadowmode object, which mirrors the format of a live response. This makes side-by-side comparison straightforward and ensures that simulated transactions receive the same scrutiny as live ones.
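The isolation between live and shadow evaluation can be sketched as follows. Only the fraud_shadowmode name comes from the text; the fraud key, the function name, and the error payload shape are assumptions made for illustration.

```python
from typing import Callable

Evaluator = Callable[[dict], dict]


def evaluate_with_shadow(txn: dict, live: Evaluator, shadow: Evaluator) -> dict:
    """Run the live evaluator, then the shadow one, without letting
    a shadow failure affect the live decision."""
    response = {"fraud": live(txn)}  # "fraud" key is hypothetical
    try:
        response["fraud_shadowmode"] = shadow(txn)
    except Exception as exc:
        # Record the failure instead of propagating it.
        response["fraud_shadowmode"] = {"status": "error", "error": str(exc)}
    return response
```

Because the live result is computed and stored first, nothing the shadow evaluator does can change it.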
Generating and Reviewing Risk Dossiers
After the parallel evaluations, detailed Risk Dossiers are created to summarize the analysis of each intent. Instead of just providing raw risk scores, these dossiers deliver clear verdicts - PASS, FLAG, or BLOCK - along with plain-English explanations. Each dossier includes the risk level (low, medium, or high), a numerical risk score, the agent or processor responsible for the analysis, and a breakdown of the reasoning behind the verdict.
The Policy Trace section of the dossier explains the decision-making process in detail. For instance, if a $120,000 transfer to a new vendor address is flagged, the trace would identify the specific policy clause (e.g., "New address payments over $5,000 require CFO approval"), the timestamp of the intent, and the counterparty risk classification. This level of detail helps finance teams fine-tune thresholds and adjust agent configurations before deploying new rules.
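One way to picture the dossier contents is as a small data structure. The field names below are hypothetical and only mirror the elements listed above (verdict, risk level, score, responsible agent, and policy trace); the real schema may differ.

```python
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    PASS = "PASS"
    FLAG = "FLAG"
    BLOCK = "BLOCK"


@dataclass
class TraceEntry:
    rule_id: str   # which policy clause fired
    outcome: str
    message: str   # plain-English explanation


@dataclass
class RiskDossier:
    verdict: Verdict
    risk_level: str  # "low", "medium", or "high"
    score: int
    agent: str       # agent or processor that produced the analysis
    policy_trace: list[TraceEntry] = field(default_factory=list)

    def summary(self) -> str:
        """One-line, human-readable digest of the dossier."""
        reasons = "; ".join(e.message for e in self.policy_trace) or "no rules fired"
        return f"{self.verdict.value} ({self.risk_level} risk, score {self.score}): {reasons}"
```

For the $120,000 example, a dossier with a single trace entry carrying the CFO-approval clause would summarize as a FLAG followed by that clause.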
Additionally, the Transaction Visibility dashboard allows you to review raw request and response data from shadow agents. This feature provides insights into how different providers behave, enabling you to improve detection accuracy without affecting customer experience or transaction approval rates.
Improving Policies with Shadow Mode Results
Analyzing Performance Metrics
Shadow Mode logs are a treasure trove of data for evaluating how well your policies perform. Start by comparing the results from fraud_shadowmode with your live system's decisions. This comparison can uncover gaps in how risks are assessed. Focus on false positive rates, detection rates, and total alert volumes - these metrics reveal whether your rules are overly restrictive, overly lenient, or balanced.
The analysis.score field is particularly useful for fine-tuning thresholds. For instance, if the shadow provider assigns scores of 65–75 to legitimate payments but your policy blocks anything over 60, you are likely stopping legitimate payments and adding unnecessary manual reviews. The policyTrace array shows which rules fire most often, including the Rule ID, outcome, and accompanying message. Use this information to identify patterns and adjust accordingly.
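The threshold example above can be checked with a short calculation. This sketch assumes you have a labeled sample of shadow scores, with labels (fraud or legitimate) coming from later review; the function name and data layout are illustrative.

```python
def rates_at_threshold(samples: list[tuple[float, bool]], threshold: float) -> tuple[float, float]:
    """Return (false_positive_rate, detection_rate).

    samples holds (score, is_fraud) pairs. A payment is blocked
    when its score is strictly above the threshold.
    """
    legit = [score for score, is_fraud in samples if not is_fraud]
    fraud = [score for score, is_fraud in samples if is_fraud]
    fpr = sum(s > threshold for s in legit) / len(legit) if legit else 0.0
    detection = sum(s > threshold for s in fraud) / len(fraud) if fraud else 0.0
    return fpr, detection
```

With legitimate payments scored 40, 65, 70, and 75 and fraudulent ones scored 85 and 90, a threshold of 60 blocks three of the four legitimate payments, while a threshold of 80 blocks none of them and still catches both fraudulent ones.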
Another critical step is estimating the operational impact of potential rule changes. Look at the volume of "manual_review" or "rejected" statuses that would result under the new policy. If shadow testing shows that a proposed rule could triple your manual review workload, it's a clear signal to tweak thresholds before going live. Additionally, align the shadow provider's risk levels (low, medium, high) with your desired transaction outcomes - whether that means accepting, flagging for manual review, or rejecting transactions outright. This ensures your policies align with your organization’s tolerance for risk.
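To estimate the workload impact described above, you can map shadow risk levels to outcomes and count them against current live statuses. The mapping below is one possible choice, not a recommendation, and the "accepted" status name is an assumption ("manual_review" and "rejected" come from the text).

```python
from collections import Counter

# Hypothetical mapping from shadow risk level to transaction outcome.
OUTCOME_BY_RISK = {
    "low": "accepted",
    "medium": "manual_review",
    "high": "rejected",
}


def projected_outcomes(risk_levels: list[str]) -> Counter:
    """Count the outcomes the proposed policy would produce."""
    return Counter(OUTCOME_BY_RISK[level] for level in risk_levels)


def manual_review_counts(live_statuses: list[str], shadow_risk_levels: list[str]) -> tuple[int, int]:
    """Return (current, projected) manual-review volumes."""
    current = Counter(live_statuses)["manual_review"]
    projected = projected_outcomes(shadow_risk_levels)["manual_review"]
    return current, projected
```

If the projected count comes out at three times the current one, that is the tripled workload the text warns about.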
These findings provide actionable insights for refining your policy framework.
Adjusting and Testing Policy Rules
Once you've gathered metrics and identified performance gaps, it's time to refine your policies. Use Stablerail's Policy Console to directly edit your policy-as-code rules. Adjust transaction limits, update your "Golden Source" whitelist of verified vendors, and refine role-based permissions or time-based restrictions based on shadow data insights. For example, if legitimate new addresses frequently trigger alerts, consider adding them to your approved whitelist or raising the threshold that requires CFO approval.
For high-alert scenarios, consider adding multi-step approvals. Instead of automatically blocking transactions, require human approval with an explicit override reason. This process not only improves decision-making but also creates a qualitative dataset to analyze why certain flags occur. If shadow testing reveals high-value transfers bypassing standard checks, introduce automatic delay periods - for example, a 4-hour hold on transfers exceeding $100,000 to unfamiliar addresses.
After adjusting your rules, run them through Shadow Mode again using production data. This ensures your changes address previous issues without introducing new risks. Review the updated Risk Dossiers to confirm that the plain-English explanations match your refined logic. This iterative testing and adjustment process strengthens your system’s reliability. Once validated, the updated policy can be deployed to live systems, where it will enforce rules programmatically - ensuring even high-level executives cannot override the code once it’s active.
Deploying Validated Policies to Live Systems
Approving and Deploying Updated Rules
Once Shadow Mode confirms the effectiveness of your rules through validated evaluations, it's time to move them into live enforcement. Using Stablerail's Policy Console, you can finalize these rules to ensure they are programmatically enforceable. This step locks in critical parameters - such as limits, roles, whitelists, and time windows - ensuring that even executives can't override them once they're active.
The rollout process uses a staged approach to minimize risks. Start by applying the updated policy to just 5% of traffic. Monitor its performance in real-time, watching for any unexpected blocks or performance bottlenecks. Once the metrics are stable and no significant issues arise, scale the policy up to 100% of traffic. This method builds directly on the risk analysis conducted during Shadow Mode, ensuring a smooth and controlled transition.
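A common way to implement a percentage rollout like the one above is deterministic hash bucketing, sketched below. This is a generic technique and not necessarily how Stablerail selects traffic; the function name and the use of a transaction ID as the key are assumptions.

```python
import hashlib


def in_rollout(transaction_id: str, percent: int) -> bool:
    """Deterministically assign a transaction to the rollout cohort.

    The same ID always lands in the same bucket, so raising percent
    from 5 to 100 only ever adds transactions to the cohort.
    """
    digest = hashlib.sha256(transaction_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % 100  # bucket in 0..99
    return bucket < percent
```

Hashing rather than random sampling keeps the assignment stable across retries, which makes any unexpected blocks easier to trace back to the new policy.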
During live enforcement, the established three-step transaction workflow remains intact. High-risk scenarios still include safeguards like dual control or cool-off periods, maintaining the system's overall security and reliability while the rollout progresses.
Maintaining Compliance Through Audit Trails
Every transaction processed under live enforcement generates a tamper-proof Proof-of-Control receipt. These receipts include detailed information such as the payment amount, approval rationale, signatory, and risk verdict. Each receipt is linked back to both the payment intent and the corresponding policy evaluation. This creates a robust, audit-ready record that meets the standards required by auditors, boards, and banking partners.
"Every payout generates a defensible receipt: what was paid, why, who approved, and the risk verdict." – Stablerail
For flagged transactions, the system requires an override reason to be recorded. This adds a layer of transparency and creates a qualitative dataset that explains exceptions. For instance, if a vendor's address changes and triggers a lock, the approval workflow will document whether the payment was allowed to proceed - and why - or if it was denied. This transforms your audit trail into a detailed governance record, showcasing operational integrity and adherence to regulatory requirements. This process aligns with a broader stablecoin compliance checklist designed to help finance teams manage risk and liquidity.
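The text does not describe how Proof-of-Control receipts are made tamper-proof. One common approach to tamper evidence is a hash chain, sketched here purely as an illustration; the field names, the make_receipt and verify_chain functions, and the chaining scheme are all assumptions. The override-reason requirement for flagged transactions follows the text.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash preceding the first receipt


def _digest(body: dict) -> str:
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def make_receipt(prev_hash, amount_usd, rationale, signer, verdict, override_reason=None):
    """Build a receipt that commits to the previous receipt's hash."""
    if verdict == "FLAG" and not override_reason:
        raise ValueError("flagged transactions require an override reason")
    body = {
        "prev_hash": prev_hash,
        "amount_usd": amount_usd,
        "rationale": rationale,
        "signer": signer,
        "verdict": verdict,
        "override_reason": override_reason,
    }
    return {**body, "hash": _digest(body)}


def verify_chain(receipts, genesis=GENESIS) -> bool:
    """Check that every receipt is intact and linked to its predecessor."""
    prev = genesis
    for receipt in receipts:
        body = {k: v for k, v in receipt.items() if k != "hash"}
        if body["prev_hash"] != prev or _digest(body) != receipt["hash"]:
            return False
        prev = receipt["hash"]
    return True
```

Editing any field of an earlier receipt changes its hash, so verification of the chain fails from that point on.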

Conclusion
Shadow Mode gives finance teams the ability to simulate real-time policy checks and agent evaluations without impacting live funds. Because simulations run in parallel against live transaction data, risk models can be fine-tuned and policies validated under current conditions.
Each shadow evaluation generates a concise Risk Dossier, which helps refine rules based on actual behavior patterns. If discrepancies emerge between your existing system and the shadow configuration, you can investigate and make adjustments before any funds are at risk.
Since Stablerail operates above custody and prior to signing, shadow testing ensures that your policy logic is evaluated against real transaction patterns. This protects your financial operations from risks like issuer freezes, address changes, duplicate payments, or tainted counterparties (which you can quantify using a stablecoin risk calculator), all before a human signs off. The validated evidence gathered through this process supports a controlled and phased deployment. Instead of rolling out untested rules, you can deploy them with Proof-of-Control receipts and detailed audit trails that show exactly how each policy performed under real conditions. By embedding real-time validation, Shadow Mode shifts your approach from reactive problem-solving to proactive, systematic risk management.
FAQs
How is Shadow Mode different from a normal sandbox test?
Shadow Mode operates differently from sandbox testing by conducting real-time, side-by-side evaluations of transactions without interfering with live operations. Essentially, transactions are processed through both the primary and shadow systems at the same time, but only the primary system impacts the actual outcomes. This approach enables precise testing, fine-tuning, and performance tracking under real-world conditions. In contrast, sandbox tests rely on isolated environments or simulated data, which are separate from actual transaction flows.
What data does Shadow Mode need to generate a Risk Dossier?
Shadow Mode creates a Risk Dossier by analyzing data from pre-sign checks. These checks cover several critical areas, such as:
Sanctions and taint/exposure screening: Ensures compliance by flagging potential risks tied to restricted entities or activities.
Policy and limit enforcement: Verifies that transactions align with predefined rules and thresholds.
Behavioral anomaly detection: Identifies unusual patterns, like unexpected transaction times, deviations in amounts compared to typical baselines, or irregular payout trends.
Counterparty risk scoring: Assesses the risk level of involved parties to ensure safe interactions.
To make this data actionable, Shadow Mode provides plain-English narrative explanations, complete with evidence. This approach ensures clarity and helps users understand the reasoning behind the risk assessments.
How do you decide when a shadow-tested rule is safe to deploy to production?
A shadow-tested rule is ready for production when it consistently proves accurate and reliable during testing. Shadow Mode provides a controlled environment where systems can evaluate these rules without affecting live operations. By comparing shadow results to real-world outcomes, teams can fine-tune the rule. If it achieves low false-positive and false-negative rates, aligns with policy objectives, and avoids causing unintended issues, it’s considered safe for active deployment.


