Bot Architecture Patterns
Production market-making bots need more than order placement logic. This page covers the infrastructure patterns used across Drift’s keeper-bots-v2, which apply to market making, filling, and triggering bots.
Subscription loop + resubscribe
WebSocket connections drop. RPC nodes restart. Your bot must handle reconnections automatically. The OrderSubscriber supports automatic resubscription and periodic resync to catch any missed updates.
import { OrderSubscriber } from "@drift-labs/sdk";
const orderSubscriber = new OrderSubscriber({
driftClient,
subscriptionConfig: {
type: "websocket",
resubTimeoutMs: 30_000, // resubscribe if no update in 30s
resyncIntervalMs: 300_000, // full resync every 5 minutes
},
});
await orderSubscriber.subscribe();
Why resyncIntervalMs? WebSocket subscriptions can silently miss updates (dropped messages, RPC hiccups). Periodic resync ensures your local state doesn’t drift from onchain reality.
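Resync repairs missed messages, but a feed that has gone completely quiet is worth detecting directly. The following is a minimal sketch, assuming you record a timestamp whenever the subscriber delivers an update. `FeedStalenessMonitor`, `markUpdate`, `isStale`, and the 60-second default are illustrative choices, not part of the SDK.

```typescript
// Hypothetical helper (not part of @drift-labs/sdk): remembers when a feed
// last delivered data so the bot can decide to force a reconnect.
class FeedStalenessMonitor {
  private lastUpdateTs: number;
  private readonly maxStaleMs: number;

  constructor(maxStaleMs: number = 60_000, now: number = Date.now()) {
    this.maxStaleMs = maxStaleMs;
    this.lastUpdateTs = now;
  }

  // Call from your update callback each time the feed delivers data.
  markUpdate(now: number = Date.now()): void {
    this.lastUpdateTs = now;
  }

  // True once no update has arrived for longer than maxStaleMs.
  isStale(now: number = Date.now()): boolean {
    return now - this.lastUpdateTs > this.maxStaleMs;
  }
}
```

One way to use it: check `isStale()` on a timer and, when it returns true, unsubscribe and subscribe again before calling `markUpdate()`. The timestamps are injectable so the logic can be tested without waiting on a real clock.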
Periodic tasks + mutex guard
Many bot operations (refreshing quotes, running risk checks, rebalancing) should run on a timer but must not overlap. Use a mutex to prevent concurrent execution:
import { Mutex } from "async-mutex";
const periodicTaskMutex = new Mutex();
let lastWatchdogTs = Date.now();
setInterval(async () => {
if (periodicTaskMutex.isLocked()) {
console.log("Previous cycle still running, skipping");
return;
}
const release = await periodicTaskMutex.acquire();
try {
await refreshQuotes();
await runRiskChecks();
lastWatchdogTs = Date.now();
} catch (err) {
console.error("Periodic task error:", err);
} finally {
release();
}
}, 10_000);
Throttling / backoff
When filling auctions or reacting to events, avoid hammering the same order repeatedly. Track recent attempts and enforce a cooldown:
const throttledNodes = new Map<string, number>();
function shouldAttemptFill(nodeKey: string, cooldownMs = 1000): boolean {
const lastAttempt = throttledNodes.get(nodeKey) ?? 0;
if (lastAttempt + cooldownMs > Date.now()) return false;
throttledNodes.set(nodeKey, Date.now());
return true;
}
Priority fees and compute budget
Solana transactions compete for block space via priority fees. During congestion, transactions without adequate fees are likely to be dropped. The SDK provides a PriorityFeeSubscriber that tracks recent fee levels.
import { ComputeBudgetProgram } from "@solana/web3.js";
import { PriorityFeeSubscriber } from "@drift-labs/sdk";
// Subscribe to priority fee data
const priorityFeeSubscriber = new PriorityFeeSubscriber({
connection,
frequencyMs: 5000,
});
await priorityFeeSubscriber.subscribe();
// When building transactions, set compute budget
const priorityFee = priorityFeeSubscriber.getCustomStrategyResult();
const computeBudgetIx = ComputeBudgetProgram.setComputeUnitPrice({
microLamports: Math.floor(
priorityFee * driftClient.txSender.getSuggestedPriorityFeeMultiplier()
),
});
// Also set compute unit limit to avoid overpaying
const computeLimitIx = ComputeBudgetProgram.setComputeUnitLimit({
units: 400_000, // typical for place-and-make
});
Tip: Place-and-make transactions typically use 200-400k compute units. Set the limit explicitly rather than defaulting to 200k (which may fail) or 1.4M (which overpays).
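The priority fee is the compute unit price multiplied by the requested compute unit limit, so both settings affect what you pay. The sketch below shows the arithmetic and a simple hard cap. `capComputeUnitPrice`, `priorityFeeLamports`, and the `maxMicroLamports` parameter are hypothetical names introduced for illustration.

```typescript
const MICRO_LAMPORTS_PER_LAMPORT = 1_000_000;

// Clamp the suggested compute unit price (micro-lamports per CU) to a hard cap.
// maxMicroLamports is a hypothetical config value that you choose.
function capComputeUnitPrice(
  observedMicroLamports: number,
  multiplier: number,
  maxMicroLamports: number
): number {
  const suggested = Math.floor(observedMicroLamports * multiplier);
  return Math.max(0, Math.min(suggested, maxMicroLamports));
}

// Priority fee for one transaction, in lamports. The fee is charged on the
// requested CU limit, not on the units the transaction actually consumes.
function priorityFeeLamports(
  computeUnitLimit: number,
  microLamportsPerCu: number
): number {
  return Math.ceil(
    (computeUnitLimit * microLamportsPerCu) / MICRO_LAMPORTS_PER_LAMPORT
  );
}
```

At a price of 10,000 micro-lamports per CU, a 400,000 CU limit costs 4,000 lamports in priority fees, while a 1.4M limit costs 14,000 lamports for the same transaction. That difference is why the explicit limit matters.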
Address Lookup Tables (ALTs) and blockhash
Drift transactions reference many accounts. Use Address Lookup Tables to compress transaction size and stay under Solana’s 1232-byte limit:
// Fetch ALTs once at startup (or periodically refresh)
const lookupTableAccounts = await driftClient.fetchAllLookupTableAccounts();
// Get fresh blockhash for each transaction
const { blockhash } = await driftClient.connection.getLatestBlockhash({
commitment: "confirmed",
});
Health monitoring
Your bot should continuously monitor its own account health and cancel orders before liquidation:
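import { BN, QUOTE_PRECISION, convertToNumber } from "@drift-labs/sdk";
// Example thresholds only; tune these for your own strategy
const MIN_FREE_COLLATERAL = 100; // USD
const MAX_LEVERAGE = 3; // 3x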
async function checkHealth() {
const user = driftClient.getUser();
// Check free collateral
const freeCollateral = convertToNumber(
user.getFreeCollateral(),
QUOTE_PRECISION
);
// Check leverage
const leverage = convertToNumber(user.getLeverage(), new BN(10_000)); // 4 decimals
// Check margin ratio
const marginRatio = user.getMarginRatio();
console.log(`Free collateral: $${freeCollateral.toFixed(2)}, Leverage: ${leverage.toFixed(2)}x`);
// Emergency cancel if health is deteriorating
if (freeCollateral < MIN_FREE_COLLATERAL || leverage > MAX_LEVERAGE) {
console.warn("⚠️ Health threshold breached, cancelling all orders");
await driftClient.cancelOrders(); // cancel ALL orders across all markets
return false;
}
return true;
}
// Run health check every cycle
setInterval(checkHealth, 5_000);
Graceful shutdown
When your bot stops (deploy, crash, SIGINT), it should cancel all resting orders to avoid being picked off while offline:
let isShuttingDown = false;
async function gracefulShutdown(signal: string) {
if (isShuttingDown) return;
isShuttingDown = true;
console.log(`Received ${signal}, shutting down gracefully...`);
try {
// Cancel all orders across all markets
console.log("Cancelling all orders...");
await driftClient.cancelOrders();
console.log("All orders cancelled");
// Unsubscribe from feeds
await orderSubscriber.unsubscribe();
await driftClient.unsubscribe();
console.log("Shutdown complete");
} catch (err) {
console.error("Error during shutdown:", err);
} finally {
process.exit(signal === "uncaughtException" ? 1 : 0); // non-zero after a crash so a supervisor can restart the bot
}
}
process.on("SIGINT", () => gracefulShutdown("SIGINT"));
process.on("SIGTERM", () => gracefulShutdown("SIGTERM"));
// Also handle uncaught errors
process.on("uncaughtException", async (err) => {
console.error("Uncaught exception:", err);
await gracefulShutdown("uncaughtException");
});
Why this matters: If your bot crashes with orders still on the book, those orders may fill at stale prices while you’re not running, causing classic adverse selection.
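A hung RPC call during shutdown can leave the process waiting indefinitely while orders are still resting. One option is to bound each shutdown step with a timeout. This is a generic sketch: `withTimeout` is a hypothetical helper, not an SDK function, and the timeout values are yours to pick.

```typescript
// Resolves with the task's result, or rejects if it takes longer than timeoutMs.
function withTimeout<T>(
  task: Promise<T>,
  timeoutMs: number,
  label: string
): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(
      () => reject(new Error(`${label} timed out after ${timeoutMs}ms`)),
      timeoutMs
    );
  });
  // Whichever settles first wins; always clear the timer so it cannot keep
  // the process alive after the task has finished.
  return Promise.race([task, timeout]).finally(() => {
    if (timer !== undefined) clearTimeout(timer);
  });
}
```

In the shutdown handler above, the cancel call could then be wrapped as `await withTimeout(driftClient.cancelOrders(), 10_000, "cancelOrders")`, so a stalled request falls through to the `catch` and `finally` blocks instead of blocking exit.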
Error handling patterns
Solana transactions fail for many reasons. Your bot should handle these gracefully:
async function sendWithRetry(
fn: () => Promise<string>,
maxRetries = 3,
baseDelayMs = 500
): Promise<string | null> {
for (let attempt = 0; attempt < maxRetries; attempt++) {
try {
return await fn();
} catch (err: any) {
const errMsg = err?.message || String(err);
// Blockhash expired: retry immediately (fn must rebuild the tx with a fresh blockhash)
if (errMsg.includes("BlockhashNotFound") || errMsg.includes("block height exceeded")) {
console.log(`Blockhash expired (attempt ${attempt + 1}), retrying...`);
continue;
}
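// Compute budget exceeded: resending with the same CU limit fails the same way
if (errMsg.includes("exceeded CUs meter")) {
console.error("Compute budget exceeded, raise the CU limit before resending");
return null; // don't retry unless fn rebuilds the tx with a higher limit
}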
// Insufficient funds for priority fee
if (errMsg.includes("insufficient lamports")) {
console.error("Insufficient SOL for transaction fees");
return null; // don't retry, need to top up
}
// Order would cross (post-only rejection), price moved, skip
if (errMsg.includes("PostOnlyOrderWouldBeFilled")) {
console.log("Post-only order would cross, skipping");
return null;
}
// Unknown error, log and retry with backoff
console.error(`Tx error (attempt ${attempt + 1}):`, errMsg);
await new Promise(r => setTimeout(r, baseDelayMs * Math.pow(2, attempt)));
}
}
console.error("Max retries exceeded");
return null;
}
Risk and filtering
For JIT and SWIFT flows, apply these filters before filling to avoid adverse selection and stay within risk limits.
Oracle validation: Reject if oracle is stale or confidence is too wide:
const oracle = driftClient.getMMOracleDataForPerpMarket(marketIndex);
if (!oracle.isValid) return;
if (oracle.confidence.gt(maxConfidence)) return; // confidence is a BN, so compare with .gt()
Position limits: Don’t fill if it would exceed your max position or leverage:
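// perpPositions is not indexed by market, so look the position up by market index
const currentPosition = driftClient.getUser().getPerpPosition(marketIndex);
const currentBase = currentPosition?.baseAssetAmount ?? ZERO; // ZERO is exported by @drift-labs/sdk
const newPosition = currentBase.add(fillSize); // BN math; fillSize is signed (negative when selling)
if (newPosition.abs().gt(maxPositionSize)) return;
Toxic flow: Skip orders that increase risk (e.g. direction that worsens your inventory) and skip when oracle/index divergence is high: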
if (isRiskIncreasing(order, currentPosition)) return;
if (Math.abs(oraclePrice - indexPrice) > maxDivergence) return;
When subscribed to both SWIFT and onchain feeds, use isSignedMsgOrder(order) to avoid double-handling the same order. See SWIFT API for details.
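The toxic-flow filter above calls isRiskIncreasing without defining it. One possible reading, sketched here under simplifying assumptions, is "filling this order would leave the maker with a larger absolute position". The `TakerOrder` shape and the use of plain numbers are illustrative; the SDK represents sizes as BN and its order type carries many more fields.

```typescript
type Side = "long" | "short";

// Hypothetical, simplified order shape for illustration only.
interface TakerOrder {
  direction: Side; // direction of the taker order you would fill against
  baseAmount: number; // unsigned size in base units
}

// Signed change to the maker's position if it fills the taker order:
// filling a taker long makes the maker shorter, and vice versa.
function makerDelta(order: TakerOrder): number {
  return order.direction === "long" ? -order.baseAmount : order.baseAmount;
}

// True when the fill would leave the maker with a larger absolute position.
function isRiskIncreasing(order: TakerOrder, currentBase: number): boolean {
  const positionAfter = currentBase + makerDelta(order);
  return Math.abs(positionAfter) > Math.abs(currentBase);
}
```

Under this definition a maker that is long 5 units treats a 2-unit taker buy as risk-reducing (the position falls to 3) and a 2-unit taker sell as risk-increasing (the position grows to 7). A fill large enough to flip the position to a bigger size on the other side also counts as risk-increasing.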
Putting it all together
A production bot typically follows this structure:
1. Initialize DriftClient + subscribe: start your client and subscribe to required accounts/feeds.
2. Start OrderSubscriber (websocket + resync): bring up real-time order subscriptions with automatic resync.
3. Start PriorityFeeSubscriber: track current priority fee levels for transaction construction.
4. Register SIGINT/SIGTERM handlers: ensure graceful shutdown and emergency cleanup paths are active.
5. Main loop: check health (and emergency cancel if needed), refresh oracle prices, update quotes, process JIT auctions (if participating), and log metrics (position, PnL, fill rate).
6. Sleep until next cycle: pause until the next scheduled run tick.
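The main loop and sleep steps above can be sketched as a small driver with the individual steps injected. This is a skeleton under stated assumptions, not the keeper-bots-v2 implementation: `BotSteps`, `runMainLoop`, and the reduced step set (health, quotes, metrics) are hypothetical and stand in for your own functions.

```typescript
// Hypothetical dependency bundle: each function wraps one step of the loop.
interface BotSteps {
  checkHealth: () => Promise<boolean>; // false means orders were emergency-cancelled
  refreshQuotes: () => Promise<void>;
  logMetrics: () => void;
  sleep: (ms: number) => Promise<void>;
}

// Runs cycles until shouldStop() returns true; resolves with the cycle count.
async function runMainLoop(
  steps: BotSteps,
  shouldStop: () => boolean,
  cycleMs: number = 1_000
): Promise<number> {
  let cycles = 0;
  while (!shouldStop()) {
    try {
      const healthy = await steps.checkHealth();
      // Skip quoting for this cycle if the health check cancelled everything.
      if (healthy) await steps.refreshQuotes();
      steps.logMetrics();
    } catch (err) {
      // One failed cycle should not kill the bot; log it and try again next tick.
      console.error("Cycle failed:", err);
    }
    cycles++;
    await steps.sleep(cycleMs);
  }
  return cycles;
}
```

Passing `shouldStop` as a function lets the shutdown handler flip a flag (such as `isShuttingDown`) and have the loop exit at the next cycle boundary rather than mid-update.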
Subaccount isolation
For multi-market strategies, use separate subaccounts per market to avoid order conflicts and simplify position tracking:
// Initialize with multiple subaccounts
const driftClient = new DriftClient({
connection,
wallet,
env: "mainnet-beta",
activeSubAccountId: 0,
subAccountIds: [0, 1, 2], // one per market
});
// Switch subaccount context for market-specific operations
await driftClient.switchActiveUser(subAccountId); // async: wait for the switch before sending orders
The JitMaker enforces a 1:1 subaccount-to-market ratio and throws if the mapping isn’t configured correctly.
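A 1:1 check of this kind is easy to reproduce in your own bot at startup. The sketch below is not the JitMaker's code; `buildMarketToSubAccount` is a hypothetical helper that pairs markets and subaccounts by position in two config arrays and rejects anything that is not one-to-one.

```typescript
// Builds a market -> subaccount lookup and rejects any configuration that is
// not strictly one subaccount per market.
function buildMarketToSubAccount(
  marketIndexes: number[],
  subAccountIds: number[]
): Map<number, number> {
  if (marketIndexes.length !== subAccountIds.length) {
    throw new Error(
      `Expected one subaccount per market, got ${marketIndexes.length} markets and ${subAccountIds.length} subaccounts`
    );
  }
  if (new Set(marketIndexes).size !== marketIndexes.length) {
    throw new Error("Duplicate market index in config");
  }
  if (new Set(subAccountIds).size !== subAccountIds.length) {
    throw new Error("Duplicate subaccount id in config");
  }
  const mapping = new Map<number, number>();
  marketIndexes.forEach((marketIndex, i) => {
    mapping.set(marketIndex, subAccountIds[i]);
  });
  return mapping;
}
```

Failing at startup is the point: a misconfigured mapping surfaces immediately instead of showing up later as two markets sharing one subaccount's orders and positions.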
Gotchas
- WebSocket silent failures: Solana WebSocket connections can stop delivering updates without disconnecting. The resyncIntervalMs on OrderSubscriber catches this, but also monitor your last-update timestamp and force-reconnect if stale for >60s.
- Priority fee spikes: during congestion, priority fees can spike 100x. Set a hard cap (maxPriorityFee) to avoid draining your SOL balance on fees. The PriorityFeeSubscriber gives you the data, but you must enforce the cap.
- Transaction size limits: Drift transactions reference many accounts. Without Address Lookup Tables (ALTs), you’ll hit the 1232-byte transaction limit. Always fetch and use ALTs (see the ALT section above).
- Blockhash expiry: Solana blockhashes expire after 150 blocks (roughly 60-90 seconds). If your transaction is stuck in a queue past that point, it will silently fail. Use getLatestBlockhash with commitment: "confirmed" and set lastValidBlockHeight for reliable expiry detection.
- Rate limits: even paid RPC providers have limits. Batch reads where possible (e.g., getMultipleAccounts) and avoid redundant subscriptions.
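For the blockhash expiry point, the expiry test itself is a single comparison against the lastValidBlockHeight that getLatestBlockhash returns alongside the blockhash. The sketch below assumes you poll the current block height yourself (for example with connection.getBlockHeight). `isBlockhashExpired`, `shouldRefreshBlockhash`, and the 20-block safety margin are hypothetical names and values.

```typescript
// Subset of what connection.getLatestBlockhash() returns in @solana/web3.js.
interface BlockhashInfo {
  blockhash: string;
  lastValidBlockHeight: number;
}

// A transaction can no longer land once the chain's block height has passed
// the lastValidBlockHeight that came with its blockhash.
function isBlockhashExpired(
  info: BlockhashInfo,
  currentBlockHeight: number
): boolean {
  return currentBlockHeight > info.lastValidBlockHeight;
}

// Hypothetical policy: rebuild and re-sign a little before the hard expiry so
// a queued transaction is not submitted with only a few blocks left.
function shouldRefreshBlockhash(
  info: BlockhashInfo,
  currentBlockHeight: number,
  safetyMarginBlocks: number = 20
): boolean {
  return currentBlockHeight + safetyMarginBlocks > info.lastValidBlockHeight;
}
```

Comparing block heights rather than wall-clock time avoids guessing at slot duration, which varies with network conditions.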
See keeper-bots-v2 for the reference implementation of these patterns. Key files:
- floatingMaker.ts: mutex guards, slot-based cooldowns, watchdog timer
- jitMaker.ts: subaccount isolation, DLOB integration, volatility checks
- utils.ts: throttling, market type conversion, volatility detection helpers