Factor system

mts1b-research and mts1b-GPUbacktester share a factor registration system. A factor is a function that takes a panel of market data and returns a cross-sectional ranking. This page covers the contract.

Factor signature

mts1b_foundation.factors
from typing import Protocol
import numpy as np
from mts1b_foundation.market_data import UniversePanel


class FactorFn(Protocol):
    """A factor takes a UniversePanel + params, returns a (T, A) ndarray."""

    # The whole return annotation is quoted: `np.ndarray | "cp.ndarray"` would
    # raise TypeError when evaluated, and cupy may not be installed.
    def __call__(self, panel: UniversePanel, /, **params) -> "np.ndarray | cp.ndarray":
        """
        Args:
            panel: UniversePanel with .close (T,A), .high (T,A), .low (T,A),
                .volume (T,A), .dates (T,), .symbols (A,)
            **params: factor-specific params

        Returns:
            (T, A) cross-sectionally z-scored ranking.
            Higher values = stronger signal.
            NaN where data is missing.
        """

The (T, A) shape — time × asset — is the universal panel shape across the ecosystem.
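To make the axis convention concrete, here is a minimal sketch of a cross-sectional z-score on a (T, A) array: axis 0 is time, axis 1 is assets, and each date is standardized across its assets. The function reuses the name zscore_cross_sectional from this page, but its body is an assumption for illustration, not the ecosystem's actual helper.

```python
import numpy as np


def zscore_cross_sectional(x: np.ndarray) -> np.ndarray:
    """Z-score each row (one date) across assets, ignoring NaNs.

    Hypothetical stand-in for the helper of the same name used on this page.
    """
    mean = np.nanmean(x, axis=1, keepdims=True)  # (T, 1): per-date mean
    std = np.nanstd(x, axis=1, keepdims=True)    # (T, 1): per-date std
    std = np.where(std == 0, np.nan, std)        # constant rows become NaN
    return (x - mean) / std


# 2 dates x 3 assets; one missing value on the second date.
raw = np.array([[1.0, 2.0, 3.0],
                [10.0, np.nan, 30.0]])
z = zscore_cross_sectional(raw)
print(z.shape)  # (2, 3)
```

The missing value stays NaN in the output, which matches the "NaN where data is missing" clause of the contract.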

Registering a factor

my_factors/realized_vol.py
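import numpy as np
from mts1b_quantkit.factors import register
from mts1b_foundation.market_data import UniversePanel

# ewma_std and zscore_cross_sectional are helper functions assumed to be in
# scope; their import path is not shown on this page.


@register("f_crypto_realized_vol")
def f_crypto_realized_vol(panel: UniversePanel, /, h: int = 21, ema_alpha: float = 0.94) -> np.ndarray:
    """Realized vol over an h-day window (default 21), z-scored cross-sectionally.

    Args:
        panel: market data
        h: lookback window in days
        ema_alpha: EWMA decay for recency weighting

    Returns:
        (T, A) z-scored realized vol. High = high recent vol.
    """
    close = panel.close  # (T, A)
    log_ret = np.log(close[1:] / close[:-1])  # (T-1, A): first date has no prior close
    vol = ewma_std(log_ret, alpha=ema_alpha, window=h)
    z = zscore_cross_sectional(vol)
    # Pad the first row with NaN so the output matches the (T, A) contract.
    return np.vstack([np.full((1, z.shape[1]), np.nan), z])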

@register adds the factor to a global FACTOR_REGISTRY keyed by name. The name must start with f_: register raises ValueError otherwise, and it also rejects a name that is already registered.

The FACTOR_REGISTRY

mts1b_quantkit.factors
FACTOR_REGISTRY: dict[str, FactorFn] = {}


def register(name: str):
    def deco(fn: FactorFn) -> FactorFn:
        if not name.startswith("f_"):
            raise ValueError(f"factor name must start with 'f_': {name}")
        if name in FACTOR_REGISTRY:
            raise ValueError(f"factor {name} already registered")
        FACTOR_REGISTRY[name] = fn
        return fn
    return deco


def get(name: str) -> FactorFn:
    if name not in FACTOR_REGISTRY:
        raise KeyError(f"unknown factor {name}; known: {sorted(FACTOR_REGISTRY)}")
    return FACTOR_REGISTRY[name]

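The registry is populated at import time, so the module that defines a factor must be imported before get(name) can find it. mts1b-research and mts1b-GPUbacktester both look up factors via get(name), so callers never need a direct reference to the factor function.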
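The lookup and both rejection paths can be exercised in isolation. The sketch below uses a local copy of the registry code from this page so that it runs on its own; f_demo_momentum is a hypothetical factor with a placeholder body.

```python
from typing import Callable

# Local copy of the registry shown above, so the sketch runs on its own.
FACTOR_REGISTRY: dict[str, Callable] = {}


def register(name: str):
    def deco(fn: Callable) -> Callable:
        if not name.startswith("f_"):
            raise ValueError(f"factor name must start with 'f_': {name}")
        if name in FACTOR_REGISTRY:
            raise ValueError(f"factor {name} already registered")
        FACTOR_REGISTRY[name] = fn
        return fn
    return deco


def get(name: str) -> Callable:
    if name not in FACTOR_REGISTRY:
        raise KeyError(f"unknown factor {name}; known: {sorted(FACTOR_REGISTRY)}")
    return FACTOR_REGISTRY[name]


@register("f_demo_momentum")  # hypothetical factor; registered when this module runs
def f_demo_momentum(panel, /, h: int = 5):
    return f"momentum(h={h})"  # placeholder body; a real factor returns a (T, A) array


# Lookup by name returns the very same function object.
looked_up = get("f_demo_momentum")

# A missing prefix and a duplicate name are both rejected.
errors = []
for bad_name in ("demo_momentum", "f_demo_momentum"):
    try:
        register(bad_name)(f_demo_momentum)
    except ValueError as exc:
        errors.append(str(exc))
print(errors)
```

Because the decorator returns the original function unchanged, a factor can still be called directly in unit tests without going through the registry.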

CPU vs GPU variants

Most factors support both CPU (numpy) and GPU (cupy) through a single implementation that dispatches on the array type of the panel:

my_factors/realized_vol.py
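# Same imports as the previous listing, plus an optional cupy import.
try:
    import cupy as cp  # only needed on the GPU backend
except ImportError:
    cp = None


@register("f_crypto_realized_vol")
def f_crypto_realized_vol(panel: UniversePanel, /, h: int = 21) -> "np.ndarray | cp.ndarray":
    close = panel.close  # numpy or cupy depending on backend
    xp = np if isinstance(close, np.ndarray) else cp  # which array library?

    log_ret = xp.log(close[1:] / close[:-1])  # (T-1, A)
    vol = ewma_std(log_ret, window=h, xp=xp)
    z = zscore_cross_sectional(vol, xp=xp)
    # Pad the first row with NaN so the output matches the (T, A) contract.
    return xp.vstack([xp.full((1, z.shape[1]), xp.nan), z])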

The factor doesn't care whether it's running on CPU or GPU — xp dispatches. mts1b-GPUbacktester populates the panel with cupy.ndarrays; mts1b-research uses numpy.
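The dispatch idea can be factored into a small helper that also works on machines without cupy installed. This is a sketch under assumptions: array_module and log_returns are hypothetical names, not part of mts1b_quantkit.

```python
import numpy as np

try:
    import cupy as cp  # optional GPU backend
except ImportError:
    cp = None


def array_module(x):
    """Return cupy if x is a cupy array, else numpy. Hypothetical helper."""
    if cp is not None and isinstance(x, cp.ndarray):
        return cp
    return np


def log_returns(close):
    """(T, A) log returns; the first row is NaN so the input shape is preserved."""
    xp = array_module(close)
    out = xp.full(close.shape, xp.nan, dtype=float)
    out[1:] = xp.log(close[1:] / close[:-1])
    return out


close = np.array([[100.0, 50.0],
                  [110.0, 50.0]])
r = log_returns(close)
print(r.shape)  # (2, 2)
```

The same function body runs unchanged when close is a cupy array, because every array operation goes through xp.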

Universe panel

mts1b_foundation.market_data
from dataclasses import dataclass
import numpy as np


@dataclass
class UniversePanel:
    """A (T, A) panel of market data for a factor."""

    close: np.ndarray            # (T, A) close prices
    high: np.ndarray | None      # (T, A) high prices (None if unavailable)
    low: np.ndarray | None       # (T, A) low prices
    open: np.ndarray | None      # (T, A) open prices
    volume: np.ndarray | None    # (T, A) traded volume

    dates: np.ndarray            # (T,) datetime64[D]
    symbols: list[str]           # (A,) symbol strings
    asset_class: str             # "equities", "crypto", "fx", ...

    # Optional auxiliary data
    market_cap: np.ndarray | None = None  # (T, A)
    sector: np.ndarray | None = None      # (A,)
    country: np.ndarray | None = None     # (A,)

Factors should NOT assume high, low, volume are always populated. Crypto perpetuals on some venues only have close. Check before use:

if panel.volume is None:
    raise NotImplementedError("f_vwap_factor requires panel.volume")
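When a factor needs several optional fields, the check can be wrapped in one helper. The sketch below is illustrative only: require_fields is a hypothetical name, and a SimpleNamespace stands in for a real UniversePanel.

```python
from types import SimpleNamespace


def require_fields(panel, factor_name: str, *fields: str) -> None:
    """Raise NotImplementedError naming every required field that is None."""
    missing = [name for name in fields if getattr(panel, name, None) is None]
    if missing:
        needed = ", ".join(f"panel.{name}" for name in missing)
        raise NotImplementedError(f"{factor_name} requires {needed}")


# Stand-in for a close-only panel (e.g. some crypto perpetual venues).
panel = SimpleNamespace(close=[[1.0, 2.0]], high=None, low=None, volume=None)

require_fields(panel, "f_demo", "close")  # passes silently
try:
    require_fields(panel, "f_vwap_factor", "close", "volume")
    message = None
except NotImplementedError as exc:
    message = str(exc)
print(message)  # f_vwap_factor requires panel.volume
```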

Signal contract

A factor returns a raw ranking. The signal that goes downstream to mts1b-portfolio for sizing is:

mts1b_foundation.signals
from datetime import datetime
from pydantic import BaseModel


class Signal(BaseModel):
    signal_id: str
    fund_id: str
    asof: datetime
    factor_name: str            # "f_crypto_realized_vol"
    params: dict                # {"h": 21, "ema_alpha": 0.94}
    universe: list[str]
    weights: dict[str, float]   # symbol → target weight, must sum to gross_exposure
    metadata: dict              # IC, t-stat, walk-forward Sharpe, ...

The factor → signal transformation is done by mts1b-research/strategies.py. It takes the raw ranking and applies:

  1. Top-N / bottom-N selection (for L/S strategies)
  2. Equal-weight, vol-weight, or HRP weighting
  3. Universe filters (ADV, price, market cap)
  4. Vol-targeting (if mts1b-portfolio config requires)
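As an illustration of steps 1 and 2, the sketch below turns one date's ranking into equal-weight long/short target weights. It is not the code in mts1b-research/strategies.py: the function name, the handling of NaN scores, and the reading of gross_exposure as the sum of absolute weights are all assumptions.

```python
import numpy as np


def long_short_equal_weight(scores, symbols, n, gross_exposure=1.0):
    """Long the top-n and short the bottom-n assets, equally weighted.

    Hypothetical helper. NaN scores are excluded from selection, and
    the absolute weights sum to gross_exposure.
    """
    scores = np.asarray(scores, dtype=float)
    valid = np.flatnonzero(~np.isnan(scores))
    if len(valid) < 2 * n:
        raise ValueError("not enough scored assets for the requested n")
    # Indices of valid assets, ordered from weakest to strongest signal.
    order = valid[np.argsort(scores[valid], kind="stable")]
    shorts, longs = order[:n], order[-n:]
    w = gross_exposure / (2 * n)
    weights = {symbols[i]: w for i in longs}
    weights.update({symbols[i]: -w for i in shorts})
    return weights


symbols = ["BTC", "ETH", "SOL", "XRP", "ADA"]
scores = [1.2, -0.7, np.nan, 0.1, -1.5]  # one row of a (T, A) ranking
weights = long_short_equal_weight(scores, symbols, n=1)
print(weights)  # {'BTC': 0.5, 'ADA': -0.5}
```

SOL has no score on this date, so it is neither bought nor sold.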

Backtesting a factor

from mts1b_GPUbacktester.cli import run_single
from mts1b_quantkit.factors import get

results = run_single(
    factor=get("f_crypto_realized_vol"),
    params={"h": 21, "ema_alpha": 0.94},
    universe="crypto-top-10",
    start="2022-01-01",
    end="2026-01-01",
    rebal="weekly",
    cost_bps=60,
)

print(f"Sharpe: {results.sharpe:.2f}")
print(f"Max DD: {results.max_dd:.2%}")
print(f"Calmar: {results.calmar:.2f}")
print(f"IC: {results.ic:.3f}")
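For readers unfamiliar with the reported statistics, the sketch below computes Sharpe, maximum drawdown, and Calmar from a series of per-period returns using common textbook conventions. Whether mts1b-GPUbacktester uses these exact conventions is not stated on this page; summary_stats and the periods_per_year default are assumptions.

```python
import numpy as np


def summary_stats(returns, periods_per_year=365):
    """Textbook Sharpe / max drawdown / Calmar from simple per-period returns.

    Hypothetical helper; the annualization factor is an assumption.
    """
    r = np.asarray(returns, dtype=float)
    # Equity curve starting at 1.0 so a loss in the first period counts.
    equity = np.concatenate(([1.0], np.cumprod(1.0 + r)))
    peak = np.maximum.accumulate(equity)
    max_dd = float(np.min(equity / peak - 1.0))  # always <= 0
    ann_return = float(equity[-1] ** (periods_per_year / len(r)) - 1.0)
    sharpe = float(np.mean(r) / np.std(r, ddof=1) * np.sqrt(periods_per_year))
    calmar = ann_return / abs(max_dd) if max_dd < 0 else float("nan")
    return {"sharpe": sharpe, "max_dd": max_dd, "calmar": calmar}


stats = summary_stats([0.10, -0.50, 0.20])
print(f"Max DD: {stats['max_dd']:.2%}")  # Max DD: -50.00%
```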

Walk-forward validation

Always validate out-of-sample:

from mts1b_quantkit.cv import walk_forward
from mts1b_quantkit.factors import get

cv = walk_forward(
    factor=get("f_crypto_realized_vol"),
    params_grid={"h": [10, 21, 42, 63]},
    universe="crypto-top-10",
    start="2020-01-01", end="2026-01-01",
    train_window=252, test_window=63,
    cost_bps=60,
)

# cv is a dict with per-fold + aggregate stats
print(cv["fold_sharpes"])  # [0.92, 1.31, 1.18, ...]
print(cv["agg_sharpe"])    # 1.43
print(cv["agg_ic"])        # 0.062
print(cv["agg_t_stat"])    # 6.41

Source: the canonical walk-forward CV lives at mts1b_quantkit.cv.walk_forward (consolidated from 3 prior locations).
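To show what train_window and test_window mean, here is a minimal sketch of how walk-forward folds can be laid out over T observations. It assumes a rolling (not expanding) training window that advances by test_window each fold; the actual splitting logic inside mts1b_quantkit.cv.walk_forward is not shown on this page and may differ.

```python
def walk_forward_folds(n_obs: int, train_window: int, test_window: int):
    """Return a list of (train_range, test_range) index pairs.

    Hypothetical helper: rolling train window, step equal to test_window,
    and no partial final fold.
    """
    folds = []
    start = 0
    while start + train_window + test_window <= n_obs:
        train = range(start, start + train_window)
        test = range(start + train_window, start + train_window + test_window)
        folds.append((train, test))
        start += test_window
    return folds


folds = walk_forward_folds(n_obs=504, train_window=252, test_window=63)
print(len(folds))   # 4
print(folds[0][1])  # range(252, 315)
```

Each test range starts exactly where its training range ends, so parameters chosen on the training window are only ever scored on later data.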

Decay monitoring

mts1b-research/ops/drift_monitor.py continuously evaluates the live factor against its backtested IC:

| Drift metric | Threshold | Action |
| --- | --- | --- |
| live_ic - backtest_ic | < -0.5σ | Telegram warning |
| live_ic - backtest_ic | < -1.0σ | Halve sleeve allocation |
| live_ic - backtest_ic | < -2.0σ | Shadow the factor (halt new orders) |

These thresholds are configurable per factor in the strategy registry.
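The escalation in the table can be sketched as a single function. This is not the code in drift_monitor.py: drift_action, the action labels, and the sigma argument are hypothetical, and the page does not say how σ is estimated (the sketch simply takes it as an input).

```python
# Default thresholds from the table, most severe first: (sigma multiple, action).
DEFAULT_THRESHOLDS = (
    (-2.0, "shadow"),            # halt new orders
    (-1.0, "halve_allocation"),  # halve sleeve allocation
    (-0.5, "warn"),              # Telegram warning
)


def drift_action(live_ic, backtest_ic, sigma, thresholds=DEFAULT_THRESHOLDS):
    """Map the IC shortfall, measured in units of sigma, to an action label."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (live_ic - backtest_ic) / sigma
    for level, action in thresholds:
        if z < level:
            return action
    return "ok"


print(drift_action(live_ic=0.030, backtest_ic=0.062, sigma=0.02))  # halve_allocation
```

Passing a different thresholds tuple per factor mirrors the per-factor configurability described above.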

See also