mts1b-GPUbacktester — public API surface

CUDA-accelerated backtest engine. Pure compute. No orchestration.

CLI

# Single run
mts1b-backtest run \
  --factor f_crypto_realized_vol \
  --params '{"h": 21}' \
  --universe crypto-top-10 \
  --start 2022-01-01 --end 2026-01-01 \
  --rebal weekly \
  --cost-bps 60 \
  --sizing equal_weight_ls \
  --n-long 2 --n-short 2 \
  --output data/backtests/run-1.parquet

# Walk-forward
mts1b-backtest walk-forward \
  --factor f_crypto_realized_vol \
  --params-grid '{"h": [10, 21, 42, 63]}' \
  --universe crypto-top-10 \
  --start 2020-01-01 --end 2026-01-01 \
  --train-window 252 --test-window 63 --step 63

# Batch from YAML
mts1b-backtest batch --config configs/sweep.yaml

# Backend selection
mts1b-backtest run --backend cuda ... # default if available
mts1b-backtest run --backend cpu ... # fallback
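
To make the `--sizing equal_weight_ls`, `--n-long` and `--n-short` flags concrete, here is a minimal sketch of one common way to build equal-weight long-short target weights. The function below is hypothetical and is not taken from the engine: it assumes gross exposure is split evenly between the long and short sides, and that `invert` simply flips the ranking, as the comment in the programmatic example suggests.

```python
def equal_weight_ls(scores, n_long, n_short, gross=1.0, invert=False):
    """Hypothetical sketch: long the n_long best-ranked assets and short the
    n_short worst-ranked ones, with equal weights inside each side."""
    if n_long + n_short > len(scores):
        raise ValueError("n_long + n_short exceeds the universe size")
    # Rank tickers from highest to lowest score; invert=True flips the order
    # so that low scores are bought and high scores are sold.
    ranked = sorted(scores, key=scores.get, reverse=not invert)
    weights = {ticker: 0.0 for ticker in scores}
    for ticker in ranked[:n_long]:
        weights[ticker] = 0.5 * gross / n_long      # assumed 50/50 gross split
    for ticker in ranked[len(ranked) - n_short:]:
        weights[ticker] = -0.5 * gross / n_short
    return weights


# Illustrative scores only, not real factor values.
scores = {"BTC": 0.9, "ETH": 0.4, "SOL": -0.2, "XRP": -0.7, "ADA": 0.1}
weights = equal_weight_ls(scores, n_long=2, n_short=2, gross=1.0)
inverted = equal_weight_ls(scores, n_long=2, n_short=2, gross=1.0, invert=True)
```

With these inputs the book is dollar-neutral: the absolute weights sum to `gross` and the signed weights sum to zero.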

Programmatic

from mts1b_GPUbacktester import run_single, run_walk_forward
from mts1b_quantkit.factors import get

result = run_single(
    factor=get("f_crypto_realized_vol"),
    params={"h": 21},
    universe="crypto-top-10",
    start="2022-01-01", end="2026-01-01",
    rebal="weekly",
    sizing={"method": "equal_weight_ls", "n_long": 2, "n_short": 2, "gross": 1.0},
    cost_bps=60,
    invert=True,     # buy low-z, sell high-z (mean reversion)
    seed=42,         # determinism
    backend="cuda",  # or "cpu"
)

print(result.sharpe, result.calmar, result.max_drawdown)


# Walk-forward
cv = run_walk_forward(
    factor=get("f_crypto_realized_vol"),
    params_grid={"h": [10, 21, 42, 63]},
    universe="crypto-top-10",
    start="2020-01-01", end="2026-01-01",
    train_window=252, test_window=63, step=63,
)
print(cv["agg_sharpe"], cv["best_params"], cv["stability_score"])
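
The `train_window`, `test_window` and `step` arguments describe a rolling split of the sample. The sketch below shows how such folds could be laid out on a day index; it assumes a rolling (not expanding) training window and drops any incomplete final fold. The helper name and the 1512-day sample length are illustrative, and the engine's actual fold logic may differ.

```python
def walk_forward_folds(n_days, train_window, test_window, step):
    """Return (train_start, train_end, test_start, test_end) index tuples.

    End indices are exclusive. Rolling-window scheme, assumed for illustration.
    """
    folds = []
    start = 0
    # Keep only folds whose test window fits entirely inside the sample.
    while start + train_window + test_window <= n_days:
        train_end = start + train_window
        folds.append((start, train_end, train_end, train_end + test_window))
        start += step
    return folds


# 1512 days is an arbitrary sample length chosen for the example.
folds = walk_forward_folds(n_days=1512, train_window=252, test_window=63, step=63)
for train_start, train_end, test_start, test_end in folds[:3]:
    print(f"train [{train_start}, {train_end})  test [{test_start}, {test_end})")
```

Because `step` equals `test_window` here, the test windows tile the out-of-sample period without overlap.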

BacktestResult schema

class BacktestResult(BaseModel):
    config: BacktestConfig

    # Returns
    returns: np.ndarray        # (T,) daily strategy returns
    cum_returns: np.ndarray    # (T,) cumulative
    equity_curve: np.ndarray   # (T,) NAV path

    # Positions
    weights: np.ndarray        # (T, A) target weights
    holdings: np.ndarray       # (T, A) actual holdings (post-rebalance)
    turnover: np.ndarray       # (T,) one-way turnover

    # Costs
    fees: np.ndarray
    slippage: np.ndarray

    # Annualized metrics
    sharpe: float
    calmar: float
    max_drawdown: float
    cagr: float
    ic: float
    t_stat: float
    turnover_annualized: float

    # Walk-forward
    fold_sharpes: list[float] | None
    ci95_sharpe: tuple[float, float] | None

Serializable to parquet:

result.to_parquet("run.parquet")
result_loaded = BacktestResult.from_parquet("run.parquet")
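
For readers unfamiliar with the summary fields in the schema, the sketch below derives `sharpe`, `cagr`, `max_drawdown` and `calmar` from a return series using common textbook definitions. These conventions are assumptions, not confirmed by this page: zero risk-free rate, sample standard deviation, and drawdown reported as a positive fraction. The engine may annualize or sign these values differently.

```python
import math
import statistics


def summarize(returns, periods_per_year=252):
    """Summary metrics from per-period simple returns (textbook definitions)."""
    # Compound returns into a NAV path that starts at 1.0.
    equity, nav = [], 1.0
    for r in returns:
        nav *= 1.0 + r
        equity.append(nav)

    # Max drawdown: largest peak-to-trough decline, as a positive fraction.
    peak, max_dd = 1.0, 0.0
    for nav in equity:
        peak = max(peak, nav)
        max_dd = max(max_dd, 1.0 - nav / peak)

    years = len(returns) / periods_per_year
    cagr = equity[-1] ** (1.0 / years) - 1.0
    vol = statistics.stdev(returns)
    sharpe = (statistics.mean(returns) / vol * math.sqrt(periods_per_year)
              if vol > 0 else float("nan"))
    calmar = cagr / max_dd if max_dd > 0 else float("inf")
    return {"sharpe": sharpe, "cagr": cagr,
            "max_drawdown": max_dd, "calmar": calmar}


# Four made-up periods treated as one year, to keep the arithmetic checkable.
stats = summarize([0.10, -0.50, 0.20, 0.10], periods_per_year=4)
```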

Determinism

r1 = run_single(..., seed=42)
r2 = run_single(..., seed=42)
assert np.array_equal(r1.returns, r2.returns)

CI runs this check on every PR.
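
A bitwise comparison like the one above can also be expressed as a fingerprint, which is convenient when two runs happen in separate CI jobs. The sketch below is illustrative only: `toy_backtest` is a hypothetical stand-in for `run_single`, and hashing the returns with SHA-256 is an assumption, not the project's documented CI mechanism.

```python
import hashlib
import random
import struct


def toy_backtest(seed, n_days=250):
    """Stand-in for run_single: seeded pseudo-random daily returns."""
    rng = random.Random(seed)  # local generator, no shared global state
    return [rng.gauss(0.0005, 0.02) for _ in range(n_days)]


def fingerprint(returns):
    """SHA-256 over the raw float64 bytes, so equality is bitwise."""
    payload = struct.pack(f"<{len(returns)}d", *returns)
    return hashlib.sha256(payload).hexdigest()


r1 = toy_backtest(seed=42)
r2 = toy_backtest(seed=42)
r3 = toy_backtest(seed=43)

assert fingerprint(r1) == fingerprint(r2)  # same seed, identical bytes
assert fingerprint(r1) != fingerprint(r3)  # different seed, different run
```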

Boundary verification (CI)

# No ladder code can leak in
python -m mts.tools.ast_scan --forbid "ladder" src/

# No HRP/BL clones (those are in quantkit)
python -m mts.tools.ast_scan --forbid "hrp_weights|black_litterman" src/
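
The internals of `mts.tools.ast_scan` are not shown here. As a rough idea of what an AST-based forbid check can look like, the sketch below uses Python's standard `ast` module to flag identifiers and imports that match a regular expression. The function name and the choice to ignore string literals and comments are assumptions made for the example.

```python
import ast
import re


def find_forbidden(source, pattern):
    """Return sorted (line, name) pairs for identifiers matching the pattern."""
    regex = re.compile(pattern)
    hits = []
    for node in ast.walk(ast.parse(source)):
        names = []
        if isinstance(node, ast.Name):
            names.append(node.id)
        elif isinstance(node, ast.Attribute):
            names.append(node.attr)
        elif isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            names.append(node.name)
        elif isinstance(node, (ast.Import, ast.ImportFrom)):
            names.extend(alias.name for alias in node.names)
            if isinstance(node, ast.ImportFrom) and node.module:
                names.append(node.module)
        for name in names:
            if regex.search(name):
                hits.append((node.lineno, name))
    return sorted(hits)


sample = (
    "from mts1b_research.ladder import run_ladder\n"
    "def hrp_weights(cov):\n"
    "    return cov\n"
    "note = 'the word ladder inside a string is ignored'\n"
)
hits = find_forbidden(sample, r"ladder|hrp_weights|black_litterman")
for line, name in hits:
    print(f"line {line}: forbidden name {name!r}")
```

Scanning the syntax tree instead of grepping raw text avoids false positives from comments and docstrings, at the cost of missing names that are built dynamically.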

Memory management

# Stream large universes in chunks
mts1b-backtest run --chunk-size-mb 512 ...

Tradeoff: smaller chunks mean more host↔device transfers, which slows the run. The default chunk size of 2 GB works on most consumer GPUs.
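
The tradeoff can be estimated with simple arithmetic. The sketch below counts how many chunks a dense price array would need at two chunk sizes. Every input is an assumption chosen for illustration: one float32 field per bar, a 24/7 market for the 1-minute bar count, and MB interpreted as MiB. The engine's real memory layout is not documented here.

```python
def chunk_plan(n_symbols, n_bars, chunk_size_mb, bytes_per_value=4):
    """Return (total_bytes, n_chunks) for one dense (n_bars, n_symbols) array."""
    total_bytes = n_symbols * n_bars * bytes_per_value
    chunk_bytes = chunk_size_mb * 1024 * 1024          # assumes MB means MiB
    n_chunks = max(1, -(-total_bytes // chunk_bytes))  # ceiling division
    return total_bytes, n_chunks


# 10 years of 1-minute bars on a market that trades around the clock.
n_bars = 10 * 365 * 24 * 60

total, chunks_default = chunk_plan(1000, n_bars, chunk_size_mb=2048)
_, chunks_small = chunk_plan(1000, n_bars, chunk_size_mb=512)

print(f"{total / 1024**3:.1f} GiB total")
print(f"2048 MB chunks: {chunks_default} transfers")
print(f" 512 MB chunks: {chunks_small} transfers")
```

Under these assumptions, quartering the chunk size quadruples the number of transfers.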

Performance reference

Setup                                  Time      Speedup
CPU, 100 syms, 10yr daily              8 sec     1x
CPU, 1000 syms, 10yr daily             95 sec    1x
GPU (4090), 1000 syms, 10yr daily      4.8 sec   20x
GPU (H100), 1000 syms, 10yr 1m bars    14 sec    77x

Ladder sweeps (run via mts1b-research)

mts1b-GPUbacktester doesn't orchestrate ladder sweeps. For that, use mts1b-research:

from mts1b_research.ladder import run_ladder

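# run_ladder is awaited, so call it from inside an async function
# (or from a notebook / REPL that supports top-level await).
await run_ladder(
    factor_class="momentum",
    param_grid={
        "h_long": list(range(60, 365, 5)),
        "h_skip": [0, 5, 21, 42],
    },
    universe="us-large-cap",
    start="2010-01-01", end="2024-01-01",
    cost_bps=5,
)
# Calls run_single under the hood for each combo at L1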
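
To get a feel for the size of that sweep, the sketch below expands the same `param_grid` into individual parameter combinations. It assumes a full Cartesian product with no pruning, which may not match what `run_ladder` actually does.

```python
from itertools import product

param_grid = {
    "h_long": list(range(60, 365, 5)),
    "h_skip": [0, 5, 21, 42],
}

# Expand the grid into one dict per combination (full Cartesian product).
names = list(param_grid)
combos = [
    dict(zip(names, values))
    for values in product(*(param_grid[name] for name in names))
]

print(len(param_grid["h_long"]), "h_long values")
print(len(combos), "combinations in total")
```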

See also