Troubleshooting

Common problems and their fixes.

Installation

`pip install mts1b-foundation` returns "package not found"

The packages are not on PyPI yet; they live in private GitHub repos. Use a local editable install:

git clone https://github.com/MTS1B/mts1b-foundation
pip install -e ./mts1b-foundation

For multiple packages, install foundation first, then add the others with --no-deps so pip does not try to resolve their dependencies:

pip install -e ./mts1b-foundation
pip install -e ./mts1b-quantkit --no-deps
pip install numpy scipy pandas       # add the other deps manually

`pydantic.errors.PydanticSchemaGenerationError: Unable to generate schema for Symbol`

You're using an older pydantic version. Upgrade:

pip install -U "pydantic>=2.0"

Symbol requires pydantic v2's GetCoreSchemaHandler interface.
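To confirm which pydantic major version the active environment actually resolves, a check along these lines can help. It is a minimal sketch using only the standard library; the helper name `installed_major` is made up for illustration and is not part of any MTS1B package.

```python
from importlib import metadata
from typing import Optional


def installed_major(dist_name: str) -> Optional[int]:
    """Return the installed major version of a distribution, or None if absent."""
    try:
        version = metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None
    # "2.7.1" -> 2; only the leading numeric component is inspected.
    return int(version.split(".")[0])


if __name__ == "__main__":
    major = installed_major("pydantic")
    if major is None:
        print("pydantic is not installed in this environment")
    elif major < 2:
        print(f"pydantic v{major} found; upgrade to v2 or later")
    else:
        print(f"pydantic v{major} found; the version is probably not the problem")
```

Run it with the same interpreter that raised the error, since a different virtual environment may report a different version.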

`ImportError: cannot import name 'X' from 'mts1b_platform'`

This usually means the package is still in the bridge stage: some submodules still reference /apps/MTS1B/ internals. Two options:

Use foundation directly for the type you need (most types are there): from mts1b_foundation import X
Check the central docs: https://docs.mts1b.investmentparadisellc.com/docs/repos/platform — the README lists what's currently working

Dependency resolution loop / clash

ERROR: ResolutionImpossible: ...

Install foundation first, use --no-deps for downstream packages, then add the remaining dependencies manually:

pip install -e mts1b-foundation
pip install -e mts1b-quantkit --no-deps
pip install pandas numpy scipy

Imports

`ModuleNotFoundError: No module named 'mts1b_foundation'`

First check which directories Python searches:

python -c "import sys; print('\n'.join(sys.path))"

If your install directory isn't listed, you are probably outside the virtual environment. Activate it:

source .venv/bin/activate
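If the path listing is hard to read, it can be quicker to ask Python directly where it would load the module from. The sketch below uses the standard library's `importlib.util.find_spec`; the helper name `locate_module` is hypothetical.

```python
import importlib.util
import sys
from typing import Optional


def locate_module(name: str) -> Optional[str]:
    """Return where a top-level module would load from, or None if not found."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    # Namespace packages have no single origin file.
    return spec.origin or "<namespace package>"


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("mts1b_foundation:", locate_module("mts1b_foundation") or "NOT FOUND")
```

If the interpreter path printed is not inside your venv, the package was most likely installed into a different environment.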

`from mts1b_quantkit.portfolio.hrp import hrp_weights` fails

In the bridge stage, deep submodule imports often fail because the copied code still contains internal `from mts.X.Y` imports that haven't been remapped yet.

Workaround: import from foundation if available, OR use the live monorepo path:

# Bridge workaround until extraction finishes
import sys
sys.path.insert(0, "/apps/MTS1B/services/research/src")
from gpuBT1060.portfolio.hrp import hrp_weights

Track progress: https://docs.mts1b.investmentparadisellc.com/docs/repos/quantkit#status
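To avoid editing call sites again once extraction finishes, the fallback can be wrapped in a small helper that tries the extracted package first and the monorepo module second. This is only a sketch: `import_first` is a hypothetical name, and the monorepo module still needs the `sys.path` entry from the workaround above.

```python
import importlib
from typing import Any, Iterable


def import_first(candidates: Iterable[str], attr: str) -> Any:
    """Return `attr` from the first module in `candidates` that provides it."""
    errors = []
    for module_name in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{module_name}: {exc}")
    raise ImportError(f"{attr!r} not found; tried: " + "; ".join(errors))


if __name__ == "__main__":
    # Demonstration with the standard library: the first candidate is missing,
    # so the helper falls through to `math`.
    sqrt = import_first(["no_such_module_xyz_123", "math"], "sqrt")
    print(sqrt(9))

    # Intended use (module names taken from the workaround above):
    # hrp_weights = import_first(
    #     ["mts1b_quantkit.portfolio.hrp", "gpuBT1060.portfolio.hrp"],
    #     "hrp_weights",
    # )
```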

NATS

`nats: connection refused`

NATS server isn't running. Start it:

# Local dev (no auth)
docker run -d -p 4222:4222 -p 8222:8222 nats:2-alpine -js -m 8222

# Or via mts1b-deploy
mts1b-deploy install --profile minimal --include nats

Slow consumer / mismatch consumer warnings

WARN: slow consumer durable=my-consumer lag=42s

Your consumer is falling behind the producer. Either:

Increase max_ack_pending:

await js.subscribe(..., max_ack_pending=10000)

Add a second consumer with the same durable name (JetStream load-balances)
Process messages in batches instead of one-at-a-time
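The batching option can be sketched independently of the NATS client. The example below assumes messages are first handed to an `asyncio.Queue` (a stand-in for the subscription; no NATS calls are shown) and drains them in groups. `drain_batch` and `demo` are illustrative names.

```python
import asyncio
from typing import List


async def drain_batch(queue: asyncio.Queue, max_items: int, timeout: float) -> List[object]:
    """Wait for one item, then take whatever else is already queued, up to max_items."""
    try:
        first = await asyncio.wait_for(queue.get(), timeout)
    except asyncio.TimeoutError:
        return []
    batch = [first]
    while len(batch) < max_items:
        try:
            batch.append(queue.get_nowait())
        except asyncio.QueueEmpty:
            break
    return batch


async def demo() -> List[int]:
    """Queue 25 fake messages and report the size of each batch drained."""
    queue: asyncio.Queue = asyncio.Queue()
    for i in range(25):
        queue.put_nowait(i)
    sizes = []
    while True:
        batch = await drain_batch(queue, max_items=10, timeout=0.05)
        if not batch:
            break
        sizes.append(len(batch))  # process the whole batch here, then ack each message
    return sizes


if __name__ == "__main__":
    print(asyncio.run(demo()))  # prints [10, 10, 5]
```

Batching pays off when the per-message cost is dominated by something that amortizes well, such as a database round trip.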

`IncompatibleConsumersError` at producer startup

mts1b_foundation.nats.IncompatibleConsumersError: no common version across 3 consumers

A consumer is pinned to an older schema version than the producer's range. Either:

Upgrade the consumer to support a newer schema
Or downgrade the producer's max_v to match

Inspect manifests:

mts1b-platform manifest list
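The error means the version ranges have an empty intersection. How `mts1b_foundation` negotiates versions is not shown here, so the following is only a sketch under the assumption that each side declares an inclusive integer range; `min_v`, `common_version`, and `blockers` are hypothetical names.

```python
from typing import Dict, List, Optional, Tuple

VersionRange = Tuple[int, int]  # inclusive (min_v, max_v); an assumed representation


def common_version(producer: VersionRange, consumers: Dict[str, VersionRange]) -> Optional[int]:
    """Return the highest version supported by the producer and every consumer."""
    low, high = producer
    for min_v, max_v in consumers.values():
        low, high = max(low, min_v), min(high, max_v)
    return high if low <= high else None


def blockers(producer: VersionRange, consumers: Dict[str, VersionRange]) -> List[str]:
    """Name consumers whose range does not overlap the producer's at all."""
    p_min, p_max = producer
    return sorted(
        name
        for name, (c_min, c_max) in consumers.items()
        if c_max < p_min or c_min > p_max
    )


if __name__ == "__main__":
    producer = (2, 3)
    consumers = {"a": (1, 3), "b": (2, 2), "legacy": (1, 1)}
    print(common_version(producer, consumers))  # None: no shared version
    print(blockers(producer, consumers))        # ['legacy']
```

In this made-up example, upgrading `legacy` to support version 2 would let all parties settle on version 2.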

Subject not in registry

WARN: unknown subject mts.v1.my_app.foo.bar — no schema validation

Either register it in mts1b-foundation/nats/_registry.py or use a dict payload (untyped).

Risk gates

`ORDER_REJECTED: gate=position_risk code=MAX_POSITION_EXCEEDED`

Your order would put the position above RiskEnvelope.max_position_pct. Either:

Reduce order quantity

Loosen the envelope (NOT during a halt):

mts mts1b-riskengine envelope set --fund-id X --max-position-pct 0.10
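Before resubmitting, it can help to work out how much room is left under the limit. The arithmetic below is a sketch that assumes the gate compares position market value to fund NAV; how the risk engine actually measures the position is not stated here, and both function names are hypothetical.

```python
def position_pct_after(current_value: float, order_value: float, nav: float) -> float:
    """Fraction of NAV the position would represent once the order fills."""
    if nav <= 0:
        raise ValueError("nav must be positive")
    return abs(current_value + order_value) / nav


def max_order_value(current_value: float, nav: float, max_position_pct: float) -> float:
    """Largest additional long exposure that keeps the position within the limit."""
    return max(0.0, max_position_pct * nav - current_value)


if __name__ == "__main__":
    nav, current, order, limit = 1_000_000.0, 40_000.0, 20_000.0, 0.05
    pct = position_pct_after(current, order, nav)
    print(f"position after fill: {pct:.1%} (limit {limit:.1%})")
    print(f"room left under the limit: {max_order_value(current, nav, limit):,.0f}")
```

With these example numbers the order would take the position to 6% of NAV against a 5% limit, so roughly half the order size would fit.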

`ORDER_REJECTED: gate=drawdown_halt`

The fund is in a halt state. The envelope cannot be loosened while halted; an operator must resume the fund:

mts cmd resume <fund_id>
# Prompts for confirmation: type RESUME

This requires operator authority and 2FA for live funds.

`ORDER_REJECTED: gate=static code=BROKER_NOT_ALLOWED`

The order's broker field isn't in RiskEnvelope.allowed_brokers. Check the current envelope:

mts mts1b-riskengine envelope show --fund-id <fund>

Either change the order's broker, or update the envelope.

OMS state

Order stuck in `PENDING_RISK`

Check riskengine health:

mts mts1b-deploy status mts1b-riskengine

If unhealthy, restart:

sudo systemctl restart mts1b-riskengine

If healthy but stuck, check the order's audit trail:

mts mts1b-operations audit show --subject-id <order_id>

Order shows `ACCEPTED` but no fill ever arrives

The broker may have rejected the order silently. Check:

# Broker connection
mts mts1b-brokers test --broker <broker_name>

# Broker-side open orders (some brokers don't push reject events)
mts mts1b-oms orders open --broker <broker_name>

Run reconciliation:

mts mts1b-riskengine reconcile --fund-id <fund> --force

Backtests

`RuntimeError: cupy not installed`

mts1b-GPUbacktester defaults to GPU. Install GPU extras:

pip install "mts1b-GPUbacktester[gpu]"

Or force CPU:

mts1b-backtest run --backend cpu --factor ...

Backtest runs but Sharpe is 0.0 / nan

Likely causes:

Universe too small — single asset = no cross-sectional ranking → all weights are 0
Lookback too long — first N bars have NaN; ensure start_date is past the warmup
Cost too high — strategy edge is killed by costs; lower cost_bps to verify the signal is present then add back realistic costs

Debug:

result = run_single(...)
print(result.returns)            # daily returns array
print(result.weights[-1])        # latest weights
print(result.config)             # check params
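The three causes leave different fingerprints in the returns series, which a few lines of standard-library Python can tell apart. This sketch assumes `result.returns` can be treated as a plain sequence of floats; `sharpe` and `diagnose` are illustrative helpers, not the backtester's own Sharpe calculation.

```python
import math
import statistics
from typing import Sequence


def sharpe(returns: Sequence[float], periods_per_year: int = 252) -> float:
    """Annualized Sharpe ratio (zero risk-free rate), skipping NaN warmup bars."""
    clean = [r for r in returns if not math.isnan(r)]
    if len(clean) < 2:
        return math.nan
    stdev = statistics.stdev(clean)
    if stdev == 0:
        return math.nan  # flat returns: the ratio is undefined
    return statistics.mean(clean) / stdev * math.sqrt(periods_per_year)


def diagnose(returns: Sequence[float]) -> str:
    """Classify the usual reasons for a 0.0 / nan Sharpe."""
    clean = [r for r in returns if not math.isnan(r)]
    if not clean:
        return "all NaN: start_date is probably inside the warmup window"
    if all(r == 0 for r in clean):
        return "all zero: weights are probably all 0 (universe too small?)"
    if len(clean) < len(returns):
        return f"{len(returns) - len(clean)} NaN bars skipped"
    return "returns look populated"


if __name__ == "__main__":
    print(diagnose([math.nan, math.nan]))
    print(diagnose([0.0, 0.0, 0.0]))
    print(sharpe([0.01, -0.005, 0.02, math.nan]))
```

If the returns look populated but the Sharpe is still near zero, costs are the remaining suspect from the list above.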

Walk-forward IC is much lower than in-sample

This is normal if your factor is overfit. Stop and rebuild — see Tutorial 3 on stability tests.

Deployment

Proxmox API: `401 Unauthorized`

Your token doesn't have the right permissions. In Proxmox UI:

Datacenter → Permissions → API Tokens
Check Privilege Separation is unchecked (or assign explicit perms)
Roles: PVEVMAdmin on /vms/* at least

LXC won't start: `pct create: storage already exists`

Container ID conflict. Use a different ID range:

mts1b-deploy menuconfig
# Proxmox section: Reserve IDs: 200-230

`mts1b-deploy install` hangs at "Pulling images..."

This is usually a DNS or registry issue. Try pulling an image directly:

docker pull ghcr.io/mts1b/mts1b-foundation:0.0.1

If that fails, you may need to authenticate to ghcr.io:

echo <github_pat> | docker login ghcr.io -u <username> --password-stdin

Or set a local mirror in mts1b.config:

registry:
  url: https://my-mirror.local
  username: mts1b
  password: ${MIRROR_PASSWORD}

Docs site

Algolia search returns no results

Two reasons:

Index is stale — Algolia crawler runs weekly. Re-trigger at https://dashboard.algolia.com → apps/O23N9EQJYS → Crawler → Restart crawl
Page was recently added — wait ~30 min for the next crawl, or trigger manually

`docs.mts1b.investmentparadisellc.com` returns `Site not found`

Cloudflare takes 5-15 minutes to provision the SSL certificate after a custom domain is attached. Workaround:

curl https://mts1b-docs.pages.dev/      # use the .pages.dev URL meanwhile

LLM (mts1b-llm)

`BudgetExceededError: persona=CRO daily=$5.00`

Budget exhausted. Either:

Wait until UTC midnight (auto-reset)

Increase budget:

mts mts1b-llm budget set --persona CRO --daily-usd 10
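To decide between waiting and raising the budget, it helps to know how long is left until the reset. The sketch below computes the time to the next UTC midnight with the standard library; the function name is hypothetical and it does not query mts1b-llm.

```python
from datetime import datetime, timedelta, timezone


def time_until_utc_midnight(now: datetime) -> timedelta:
    """Time remaining until the next 00:00 UTC after `now` (must be timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    next_midnight = (now_utc + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return next_midnight - now_utc


if __name__ == "__main__":
    remaining = time_until_utc_midnight(datetime.now(timezone.utc))
    print(f"budget resets in {remaining}")
```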

`RateLimitError: anthropic ... Too Many Requests`

Provider rate-limit hit. The router will fall back to OpenAI / Google automatically if configured. To verify failover:

mts mts1b-llm providers status

LLM responses are inconsistent run-to-run

Set temperature=0.0 (deterministic) in the persona YAML:

name: my_persona
temperature: 0.0

Note: even at temperature 0, providers can still show small run-to-run differences (caching, model updates).

Frontends

`webui` shows "Cannot connect to OMS"

Check OMS health:

mts mts1b-deploy status mts1b-oms
curl -i http://localhost:8001/healthz

Likely fix:

sudo systemctl restart mts1b-oms

TUI shows garbled characters

Your terminal doesn't support truecolor. Set:

export TERM=xterm-256color
mts tui

Or set MTS1B_TUI_MONOCHROME=1 for a monochrome fallback.

Audit chain

`audit verify` returns "chain integrity FAILED at seq 42"

Tampering or storage corruption detected. Stop trading. Investigate.

# Inspect the entry
mts mts1b-operations audit show --sequence 42

# Diff against backup
restic restore <snapshot_id> --target /tmp/audit-backup
diff /apps/MTS1B/data/audit/main.log /tmp/audit-backup/main.log

If the diff shows malicious changes, restore from backup, rotate Vault secrets, and investigate the root cause.
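To see why a single altered entry is detectable, here is a toy hash chain. The real audit log format is not described here, so this sketch assumes each entry stores SHA-256 of the previous hash concatenated with its payload; every name in it is hypothetical.

```python
import hashlib
from typing import List, Optional, Tuple

Entry = Tuple[str, str]  # (payload, stored_hash); an assumed layout
GENESIS = "0" * 64


def link_hash(prev_hash: str, payload: str) -> str:
    """Hash that binds an entry to everything before it."""
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(payloads: List[str]) -> List[Entry]:
    """Create a valid chain from a list of payloads."""
    chain, prev = [], GENESIS
    for payload in payloads:
        prev = link_hash(prev, payload)
        chain.append((payload, prev))
    return chain


def first_broken_seq(chain: List[Entry]) -> Optional[int]:
    """Return the 0-based sequence of the first entry whose hash fails, else None."""
    prev = GENESIS
    for seq, (payload, stored) in enumerate(chain):
        if link_hash(prev, payload) != stored:
            return seq
        prev = stored
    return None


if __name__ == "__main__":
    chain = build_chain(["order A", "order B", "order C", "order D"])
    print(first_broken_seq(chain))            # None: chain is intact
    chain[2] = ("order X", chain[2][1])       # alter a payload, keep its stored hash
    print(first_broken_seq(chain))            # 2
```

The reported sequence number is where verification first fails, so the entry itself and its immediate predecessor are the places to start looking.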

Vault

`permission denied` on `vault kv get`

Your token doesn't have read access to that path. Check policy:

vault token lookup
vault policy list
vault policy read <policy_name>

Common fix: obtain a fresh token via AppRole login:

vault write auth/approle/login \
    role_id=<your_role_id> \
    secret_id=<your_secret_id>

Then use the returned token.

Vault sealed

vault status
# Sealed: true

Unseal with 3 of 5 shares:

vault operator unseal <share_1>
vault operator unseal <share_2>
vault operator unseal <share_3>

If you don't have the shares, you've lost the Vault data. Restore from backup.

Performance

Backtest is slow

Symptom	Likely cause	Fix
5+ minutes per run	CPU backend with large universe	Switch to GPU: `--backend cuda`
Memory error	Universe × duration too big	Chunk: `--chunk-size-mb 512`
Lots of disk I/O	Cold parquet cache	The first run warms the cache; subsequent runs are faster

OMS slow on submit

mts mts1b-platform metric latency --service mts1b-oms --window 5m

Look for p99 > 50ms. Likely culprits:

Riskengine gRPC slow (mts mts1b-platform metric latency --service mts1b-riskengine)
DB pool exhausted (mts mts1b-platform metric db_pool --pool primary)
NATS publish slow (mts mts1b-platform metric nats --subject "mts.v1.oms.>")
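If you only have raw latency samples (for example, scraped from logs), the p99 can be computed directly. This sketch uses `statistics.quantiles` from the standard library; the 50 ms threshold comes from the text above, while the function names and the sample data are made up.

```python
import statistics
from typing import Sequence


def p99(latencies_ms: Sequence[float]) -> float:
    """99th percentile using the inclusive method (needs at least 2 samples)."""
    # quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
    return statistics.quantiles(latencies_ms, n=100, method="inclusive")[98]


def is_slow(latencies_ms: Sequence[float], threshold_ms: float = 50.0) -> bool:
    """True when the p99 latency exceeds the threshold."""
    return p99(latencies_ms) > threshold_ms


if __name__ == "__main__":
    samples = [10.0] * 98 + [200.0] * 2  # mostly fast, with a slow tail
    print(f"mean={statistics.mean(samples):.1f}ms p99={p99(samples):.1f}ms")
    print("slow" if is_slow(samples) else "ok")
```

The example shows why p99 is the number to watch: the mean stays under 15 ms while the tail is far above the threshold.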

Where else to look

GitHub Discussions
Discord #help
Audit chain for "what happened when"
Grafana for metrics dashboards
