mts1b-GPUbacktester — public API surface
CUDA-accelerated backtest engine. Pure compute. No orchestration.
CLI
# Single run
mts1b-backtest run \
--factor f_crypto_realized_vol \
--params '{"h": 21}' \
--universe crypto-top-10 \
--start 2022-01-01 --end 2026-01-01 \
--rebal weekly \
--cost-bps 60 \
--sizing equal_weight_ls \
--n-long 2 --n-short 2 \
--output data/backtests/run-1.parquet
# Walk-forward
mts1b-backtest walk-forward \
--factor f_crypto_realized_vol \
--params-grid '{"h": [10, 21, 42, 63]}' \
--universe crypto-top-10 \
--start 2020-01-01 --end 2026-01-01 \
--train-window 252 --test-window 63 --step 63
# Batch from YAML
mts1b-backtest batch --config configs/sweep.yaml
# Backend selection
mts1b-backtest run --backend cuda ... # default if available
mts1b-backtest run --backend cpu ... # fallback
Programmatic
from mts1b_GPUbacktester import run_single, run_walk_forward
from mts1b_quantkit.factors import get
result = run_single(
factor=get("f_crypto_realized_vol"),
params={"h": 21},
universe="crypto-top-10",
start="2022-01-01", end="2026-01-01",
rebal="weekly",
sizing={"method": "equal_weight_ls", "n_long": 2, "n_short": 2, "gross": 1.0},
cost_bps=60,
invert=True, # buy low-z, sell high-z (mean reversion)
seed=42, # determinism
backend="cuda", # or "cpu"
)
print(result.sharpe, result.calmar, result.max_drawdown)
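The `sizing` and `invert` arguments suggest a rank-based long/short construction. The following is a minimal sketch of one plausible reading, not the engine's actual code: assets are ranked by score, the top `n_long` and bottom `n_short` receive equal weights, and `gross` is assumed to be split evenly between the two legs. The helper name `equal_weight_ls` and the sample scores are hypothetical.

```python
def equal_weight_ls(scores, n_long=2, n_short=2, gross=1.0, invert=False):
    """Toy long/short sizing: a hypothetical helper, not the library's code.

    scores: dict mapping asset name -> factor score (e.g. a z-score).
    Returns a dict of target weights. When both legs are non-empty, the
    absolute weights sum to `gross`.
    """
    if n_long + n_short > len(scores):
        raise ValueError("n_long + n_short exceeds the number of assets")
    # Rank assets from highest to lowest score. With invert=True the order
    # flips, so low scores are bought and high scores are sold.
    ranked = sorted(scores, key=scores.get, reverse=not invert)
    longs = ranked[:n_long]
    shorts = ranked[len(ranked) - n_short:]
    weights = {asset: 0.0 for asset in scores}
    # Assumption: gross exposure is split evenly between the two legs.
    for asset in longs:
        weights[asset] = 0.5 * gross / n_long
    for asset in shorts:
        weights[asset] = -0.5 * gross / n_short
    return weights


# Made-up scores for illustration only.
scores = {"BTC": 1.2, "ETH": 0.4, "SOL": -0.3, "XRP": -1.1, "ADA": 0.1}
w = equal_weight_ls(scores, n_long=2, n_short=2, gross=1.0, invert=True)
```

With `invert=True` the two lowest-scoring assets end up long and the two highest-scoring assets end up short, which matches the "buy low-z, sell high-z" comment in the call above.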
# Walk-forward
cv = run_walk_forward(
factor=get("f_crypto_realized_vol"),
params_grid={"h": [10, 21, 42, 63]},
universe="crypto-top-10",
start="2020-01-01", end="2026-01-01",
train_window=252, test_window=63, step=63,
)
print(cv["agg_sharpe"], cv["best_params"], cv["stability_score"])
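To make `train_window`, `test_window`, and `step` concrete, here is a small sketch of how rolling walk-forward folds could be laid out over a series of `n_obs` bars. It assumes a rolling (not expanding) training window and a test window that starts immediately after training ends; the source does not state which layout `run_walk_forward` uses, and `walk_forward_folds` is a hypothetical name.

```python
def walk_forward_folds(n_obs, train_window=252, test_window=63, step=63):
    """Return a list of (train_start, train_end, test_start, test_end).

    Indices are positions in the bar series; end indices are exclusive.
    Assumption: rolling training window, test window directly after it.
    """
    folds = []
    start = 0
    # Keep adding folds while a full train + test span still fits.
    while start + train_window + test_window <= n_obs:
        train_end = start + train_window
        folds.append((start, train_end, train_end, train_end + test_window))
        start += step
    return folds


folds = walk_forward_folds(n_obs=1000)
for train_start, train_end, test_start, test_end in folds[:3]:
    print(f"train [{train_start}, {train_end})  test [{test_start}, {test_end})")
```

In a layout like this, each parameter combination from the grid would be scored on the training span and the chosen one evaluated on the following test span, so test data never informs the parameter choice for its own fold.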
BacktestResult schema
class BacktestResult(BaseModel):
    config: BacktestConfig
    # Returns
    returns: np.ndarray # (T,) daily strategy returns
    cum_returns: np.ndarray # (T,) cumulative
    equity_curve: np.ndarray # (T,) NAV path
    # Positions
    weights: np.ndarray # (T, A) target weights
    holdings: np.ndarray # (T, A) actual holdings (post-rebalance)
    turnover: np.ndarray # (T,) one-way turnover
    # Costs
    fees: np.ndarray
    slippage: np.ndarray
    # Summary metrics (annualized where applicable)
    sharpe: float
    calmar: float
    max_drawdown: float
    cagr: float
    ic: float
    t_stat: float
    turnover_annualized: float
    # Walk-forward
    fold_sharpes: list[float] | None
    ci95_sharpe: tuple[float, float] | None
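As a rough illustration of how several of these fields relate to one another, the sketch below derives `turnover`, `returns`, `equity_curve`, `sharpe`, `cagr`, `calmar`, and `max_drawdown` from a weights matrix and per-asset returns. It is a stdlib-only toy under stated assumptions, not the engine's implementation: one-way turnover is taken as half the sum of absolute weight changes, costs are charged on that turnover, row `t` of the weights earns row `t` of the returns, and 252 periods per year is used for annualization. `summarize` is a hypothetical name.

```python
import math


def summarize(weights, asset_returns, cost_bps=60.0, periods_per_year=252):
    """Toy pipeline from weights to summary metrics (hypothetical helper).

    weights: T rows of A target weights.
    asset_returns: T rows of A simple returns earned on that row's weights.
    """
    prev = [0.0] * len(weights[0])
    returns, turnover = [], []
    for w, r in zip(weights, asset_returns):
        # Assumed convention: one-way turnover is half the absolute change.
        traded = 0.5 * sum(abs(a - b) for a, b in zip(w, prev))
        gross_ret = sum(a * b for a, b in zip(w, r))
        # Assumed convention: cost in bps is charged on one-way turnover.
        returns.append(gross_ret - traded * cost_bps / 1e4)
        turnover.append(traded)
        prev = w

    # NAV path starting at 1.0, tracking the worst peak-to-trough loss.
    equity, nav, peak, max_dd = [], 1.0, 1.0, 0.0
    for r in returns:
        nav *= 1.0 + r
        peak = max(peak, nav)
        max_dd = max(max_dd, 1.0 - nav / peak)
        equity.append(nav)

    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    if var > 0:
        sharpe = mean / math.sqrt(var) * math.sqrt(periods_per_year)
    else:
        sharpe = float("nan")
    cagr = equity[-1] ** (periods_per_year / n) - 1.0
    calmar = cagr / max_dd if max_dd > 0 else float("nan")
    return {
        "returns": returns, "turnover": turnover, "equity_curve": equity,
        "sharpe": sharpe, "cagr": cagr, "calmar": calmar,
        "max_drawdown": max_dd,
    }


# Two assets, three periods, made-up numbers.
weights = [[0.5, -0.5], [0.5, -0.5], [-0.5, 0.5]]
asset_returns = [[0.02, -0.01], [-0.01, 0.01], [0.00, 0.03]]
out = summarize(weights, asset_returns, cost_bps=10.0)
```

Three periods are far too few for the annualized numbers to mean anything; the example only shows how the arrays and scalars fit together.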
Serializable to parquet:
result.to_parquet("run.parquet")
result_loaded = BacktestResult.from_parquet("run.parquet")
Determinism
r1 = run_single(..., seed=42)
r2 = run_single(..., seed=42)
assert np.array_equal(r1.returns, r2.returns)
CI runs this check on every PR.
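The equality check only holds if every source of randomness is driven by the seed. A minimal sketch of that idea, with a hypothetical `toy_backtest` standing in for `run_single` (the source does not say what the engine uses its seed for):

```python
import random


def toy_backtest(seed, n_days=250):
    """Hypothetical stand-in for run_single: returns a list of daily returns."""
    # A private generator seeded per call, so results do not depend on any
    # shared global random state or on what ran before.
    rng = random.Random(seed)
    return [rng.gauss(0.0005, 0.01) for _ in range(n_days)]


r1 = toy_backtest(seed=42)
r2 = toy_backtest(seed=42)
r3 = toy_backtest(seed=43)
print(r1 == r2, r1 == r3)
```

Using a generator owned by the call, instead of the module-level `random` functions, is what keeps two runs with the same seed identical regardless of call order.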
Boundary verification (CI)
# No ladder code can leak in
python -m mts.tools.ast_scan --forbid "ladder" src/
# No HRP/BL clones (those are in quantkit)
python -m mts.tools.ast_scan --forbid "hrp_weights|black_litterman" src/
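For readers unfamiliar with AST-based scanning, here is a small sketch of what such a check could look like using Python's standard `ast` module. It is not the `mts.tools.ast_scan` implementation, whose behavior is not documented here; `find_forbidden` and the sample source are hypothetical.

```python
import ast
import re


def find_forbidden(source, pattern, filename="<string>"):
    """Return sorted (line, name) pairs for identifiers matching `pattern`."""
    regex = re.compile(pattern)
    hits = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        names = []
        if isinstance(node, ast.Name):
            names.append(node.id)
        elif isinstance(node, ast.Attribute):
            names.append(node.attr)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.append(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            names.extend(alias.name for alias in node.names)
            if isinstance(node, ast.ImportFrom) and node.module:
                names.append(node.module)
        for name in names:
            if regex.search(name):
                hits.append((node.lineno, name))
    return sorted(hits)


# Made-up source to scan.
sample = (
    "from quantkit import hrp_weights\n"
    "def run_ladder_sweep():\n"
    "    note = 'ladder in a string is ignored'\n"
    "    return hrp_weights\n"
)
for line, name in find_forbidden(sample, "hrp_weights|black_litterman"):
    print(f"line {line}: forbidden identifier {name}")
```

Walking the syntax tree, as opposed to grepping the text, means comments and string literals do not trigger false positives in this sketch; only imports, definitions, and name references are checked.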
Memory management
# Stream large universes in chunks
mts1b-backtest run --chunk-size-mb 512 ...
Tradeoff: smaller chunks mean more host↔device transfers, which slows the run. The default chunk size of 2 GB works on most consumer GPUs.
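The effect of chunk size on the number of transfers can be estimated with a little arithmetic. The sketch below splits the asset axis so that one array of values per chunk fits the byte budget. It assumes a single `float32` array of shape (bars, assets) and ignores the additional buffers a real engine would hold; `plan_chunks` and the bar count are hypothetical.

```python
def plan_chunks(n_assets, n_bars, chunk_size_mb=2048, bytes_per_value=4):
    """Split the asset axis into (lo, hi) ranges that fit a byte budget.

    Assumption: one array of n_bars values per asset, float32 by default.
    """
    bytes_per_asset = n_bars * bytes_per_value
    budget = chunk_size_mb * 1024 * 1024
    # Always make progress, even if a single asset exceeds the budget.
    assets_per_chunk = max(1, budget // bytes_per_asset)
    return [
        (lo, min(lo + assets_per_chunk, n_assets))
        for lo in range(0, n_assets, assets_per_chunk)
    ]


# Illustrative workload: 1000 symbols, 10 years of 1-minute bars,
# assuming 252 sessions per year and 390 minutes per session.
n_bars = 10 * 252 * 390
small = plan_chunks(1000, n_bars, chunk_size_mb=512)
default = plan_chunks(1000, n_bars, chunk_size_mb=2048)
print(len(small), len(default))
```

Under these assumptions the 512 MB setting needs four times as many chunks as the 2 GB default for the same data, and each chunk is a separate round trip between host and device.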
Performance reference
| Setup | Time | Speedup |
|---|---|---|
| CPU, 100 syms, 10yr daily | 8 sec | 1x |
| CPU, 1000 syms, 10yr daily | 95 sec | 1x |
| GPU (4090), 1000 syms, 10yr daily | 4.8 sec | 20x |
| GPU (H100), 1000 syms, 10yr 1m bars | 14 sec | 77x |
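The speedup column can be checked against the listed timings. The table has no CPU row for 1-minute bars, so the snippet below also back-computes the CPU time that the H100 row would imply, assuming its speedup is measured against a CPU run of the same workload (an assumption; the source does not state the baseline).

```python
def speedup(baseline_sec, accelerated_sec):
    """Ratio of baseline wall time to accelerated wall time."""
    return baseline_sec / accelerated_sec


# CPU vs 4090 on the same workload: 1000 syms, 10yr daily.
gpu_4090 = speedup(95.0, 4.8)

# CPU time implied by the H100 row (14 sec at 77x), if the baseline
# is a CPU run on 1m bars.
implied_cpu_1m_sec = 14.0 * 77

print(round(gpu_4090, 1), implied_cpu_1m_sec)
```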
Ladder sweeps (run via mts1b-research)
mts1b-GPUbacktester doesn't orchestrate ladder sweeps. For those, use mts1b-research:
import asyncio
from mts1b_research.ladder import run_ladder
async def main():
    # run_ladder is awaited, so it needs a running event loop
    await run_ladder(
        factor_class="momentum",
        param_grid={"h_long": list(range(60, 365, 5)),
                    "h_skip": [0, 5, 21, 42]},
        universe="us-large-cap",
        start="2010-01-01", end="2024-01-01",
        cost_bps=5,
    )
asyncio.run(main())
# Calls run_single under the hood for each combo at L1
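To get a sense of how many `run_single` calls a sweep like this involves, the grid can be expanded with `itertools.product`. This sketch assumes the ladder evaluates the full Cartesian product of the grid, which is what "each combo" suggests; `expand_grid` is a hypothetical helper, not part of mts1b-research.

```python
from itertools import product


def expand_grid(param_grid):
    """Expand {name: [values]} into a list of {name: value} dicts."""
    keys = list(param_grid)
    value_lists = [param_grid[k] for k in keys]
    return [dict(zip(keys, values)) for values in product(*value_lists)]


param_grid = {
    "h_long": list(range(60, 365, 5)),
    "h_skip": [0, 5, 21, 42],
}
combos = expand_grid(param_grid)
print(len(combos), combos[0], combos[-1])
```

Here `range(60, 365, 5)` yields 61 values of `h_long`, so together with the four `h_skip` values the grid holds 244 combinations.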