Troubleshooting
Common problems and their fixes.
Installation
pip install mts1b-foundation returns "package not found"
Packages are not on PyPI yet (private GitHub repos). Use local editable install:
git clone https://github.com/MTS1B/mts1b-foundation
pip install -e ./mts1b-foundation
For multiple packages, install foundation first, then add the others with --no-deps so pip does not try to resolve their dependencies:
pip install -e ./mts1b-foundation
pip install -e ./mts1b-quantkit --no-deps
pip install numpy scipy pandas # add the other deps manually
pydantic.errors.PydanticSchemaGenerationError: Unable to generate schema for Symbol
You're using an older pydantic version. Upgrade:
pip install -U "pydantic>=2.0"
Symbol requires pydantic v2's GetCoreSchemaHandler interface.
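Before upgrading, it can help to confirm which pydantic version the active environment actually has. A minimal sketch using only the standard library; the helper name pydantic_is_v2 is hypothetical and not part of any MTS1B package:

```python
from importlib import metadata
from typing import Optional


def pydantic_is_v2(installed: Optional[str] = None) -> bool:
    """Return True if the given (or installed) pydantic version is 2.x or newer."""
    if installed is None:
        try:
            installed = metadata.version("pydantic")
        except metadata.PackageNotFoundError:
            # pydantic is not installed in this environment at all
            return False
    major = int(installed.split(".")[0])
    return major >= 2


print("pydantic v2 available:", pydantic_is_v2())
```

If this prints False inside the venv you installed into, the upgrade command above is the next step.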
ImportError: cannot import name 'X' from 'mts1b_platform'
This usually means you've hit a bridge-stage gap: some submodules still reference /apps/MTS1B/ internals. Two paths:
- Use foundation directly for the type you need (most types are there):
from mts1b_foundation import X
- Check the central docs: https://docs.mts1b.investmentparadisellc.com/docs/repos/platform (the README lists what's currently working)
Dependency resolution loop / clash
ERROR: ResolutionImpossible: ...
Pin foundation explicitly and use --no-deps for downstream:
pip install -e mts1b-foundation
pip install -e mts1b-quantkit --no-deps
pip install pandas numpy scipy
Imports
ModuleNotFoundError: No module named 'mts1b_foundation'
Check which directories Python searches for modules:
python -c "import sys; print('\n'.join(sys.path))"
If your install directory isn't on the path, activate the venv:
source .venv/bin/activate
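The same check can be done from inside Python, which also shows whether the interpreter you are running belongs to a virtual environment. This is a rough sketch; diagnose_import is a hypothetical helper, and it only handles top-level package names:

```python
import importlib.util
import sys
from typing import Any, Dict


def diagnose_import(module_name: str) -> Dict[str, Any]:
    """Report whether a top-level module is importable and from where."""
    spec = importlib.util.find_spec(module_name)
    return {
        "module": module_name,
        "found": spec is not None,
        "origin": getattr(spec, "origin", None),
        # In a venv, sys.prefix points at the venv and differs from base_prefix
        "in_venv": sys.prefix != sys.base_prefix,
        "executable": sys.executable,
    }


print(diagnose_import("mts1b_foundation"))
```

If "found" is False and "in_venv" is also False, you are most likely running the system interpreter rather than the one you installed into.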
from mts1b_quantkit.portfolio.hrp import hrp_weights fails
In the bridge stage, deep submodule imports often fail because the copied code still has internal from mts.X.Y references that haven't been mapped.
Workaround: import from foundation if available, OR use the live monorepo path:
# Bridge workaround until extraction finishes
import sys
sys.path.insert(0, "/apps/MTS1B/services/research/src")
from gpuBT1060.portfolio.hrp import hrp_weights
Track progress: https://docs.mts1b.investmentparadisellc.com/docs/repos/quantkit#status
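If you want code that keeps working once extraction finishes, one option is to try the extracted import first and fall back to the monorepo path only when it fails. A sketch under that assumption; import_first is a hypothetical helper, and the module paths in the usage comment are the ones quoted above:

```python
import importlib
from typing import Any, List, Tuple


def import_first(candidates: List[Tuple[str, str]]) -> Any:
    """Return the first attribute that imports cleanly from (module, attr) pairs."""
    errors = []
    for module_name, attr in candidates:
        try:
            module = importlib.import_module(module_name)
            return getattr(module, attr)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{module_name}.{attr}: {exc}")
    raise ImportError("no candidate could be imported:\n" + "\n".join(errors))


# Usage (after the sys.path.insert shown above, if the fallback is needed):
# hrp_weights = import_first([
#     ("mts1b_quantkit.portfolio.hrp", "hrp_weights"),
#     ("gpuBT1060.portfolio.hrp", "hrp_weights"),
# ])
```

Listing the extracted package first means the workaround stops being used automatically once the deep import starts working.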
NATS
nats: connection refused
NATS server isn't running. Start it:
# Local dev (no auth)
docker run -d -p 4222:4222 -p 8222:8222 nats:2-alpine -js -m 8222
# Or via mts1b-deploy
mts1b-deploy install --profile minimal --include nats
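To confirm the server is reachable before retrying your application, a plain TCP probe is usually enough. A minimal sketch with the standard library; port 4222 is taken from the docker command above, and localhost is an assumption about where you started it:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts
        return False


print("NATS reachable:", port_is_open("localhost", 4222))
```

This only proves that something is listening on the port, not that it is a healthy NATS server.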
Slow consumer / mismatch consumer warnings
WARN: slow consumer durable=my-consumer lag=42s
Your consumer is falling behind the producer. Either:
- Increase max_ack_pending: await js.subscribe(..., max_ack_pending=10000)
- Add a second consumer with the same durable name (JetStream load-balances)
- Process messages in batches instead of one-at-a-time
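The batching option can be sketched without any NATS dependency. Here an asyncio.Queue stands in for the subscription (an assumption made so the example is self-contained), and drain_batch is a hypothetical helper: it waits for one message, then takes whatever else is already buffered, up to a limit.

```python
import asyncio
from typing import Any, List


async def drain_batch(queue: "asyncio.Queue[Any]", max_batch: int, timeout: float) -> List[Any]:
    """Wait for one message, then collect up to max_batch - 1 more without waiting."""
    first = await asyncio.wait_for(queue.get(), timeout=timeout)
    batch = [first]
    while len(batch) < max_batch:
        try:
            batch.append(queue.get_nowait())
        except asyncio.QueueEmpty:
            break
    return batch


async def demo() -> List[List[int]]:
    queue: "asyncio.Queue[int]" = asyncio.Queue()
    for i in range(5):
        queue.put_nowait(i)
    # Two batches cover all five messages when max_batch is 3
    return [await drain_batch(queue, 3, 0.1), await drain_batch(queue, 3, 0.1)]


print(asyncio.run(demo()))
```

Handling a whole batch per iteration amortizes per-message overhead such as database round trips, which is typically where a lagging consumer loses time.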
IncompatibleConsumersError at producer startup
mts1b_foundation.nats.IncompatibleConsumersError: no common version across 3 consumers
A consumer is pinned to an older schema version than the producer's range. Either:
- Upgrade the consumer to support a newer schema
- Or downgrade the producer's max_v to match
Inspect manifests:
mts1b-platform manifest list
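The error message suggests the producer looks for a schema version that every consumer accepts. The actual negotiation logic is not shown here, so the following is only a sketch of the idea, assuming each side declares an inclusive (min_v, max_v) range; min_v and common_version are hypothetical names:

```python
from typing import List, Optional, Tuple

VersionRange = Tuple[int, int]  # inclusive (min_v, max_v)


def common_version(producer: VersionRange, consumers: List[VersionRange]) -> Optional[int]:
    """Return the highest version inside every range, or None if the ranges do not overlap."""
    lowest_allowed = max([producer[0]] + [c[0] for c in consumers])
    highest_allowed = min([producer[1]] + [c[1] for c in consumers])
    if lowest_allowed > highest_allowed:
        return None
    return highest_allowed


# Producer speaks v2..v4, consumers speak v1..v3 and v2..v5
print(common_version((2, 4), [(1, 3), (2, 5)]))
# A consumer capped at v2 cannot talk to a producer that starts at v3
print(common_version((3, 4), [(1, 2)]))
```

Under this model, both fixes above work the same way: they move one end of a range until the intersection is non-empty.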
Subject not in registry
WARN: unknown subject mts.v1.my_app.foo.bar — no schema validation
Either register it in mts1b-foundation/nats/_registry.py or use a dict payload (untyped).
Risk gates
ORDER_REJECTED: gate=position_risk code=MAX_POSITION_EXCEEDED
Your order would put the position above RiskEnvelope.max_position_pct. Either:
- Reduce order quantity
- Loosen the envelope (NOT during a halt):
mts mts1b-riskengine envelope set --fund-id X --max-position-pct 0.10
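To size a reduced order, you can estimate how much headroom the envelope leaves. This sketch assumes max_position_pct is a fraction of fund NAV and that position value is quantity times price; the real gate may compute exposure differently, and max_additional_quantity is a hypothetical helper:

```python
def max_additional_quantity(nav: float, price: float, current_qty: int, max_position_pct: float) -> int:
    """Largest whole-unit quantity that keeps position value within nav * max_position_pct."""
    max_value = nav * max_position_pct
    headroom = max_value - current_qty * price
    # Floor to whole units; never suggest a negative quantity
    return max(0, int(headroom // price))


# NAV 1,000,000, limit 25%, price 50, already holding 1,000 units
print(max_additional_quantity(1_000_000, 50.0, 1_000, 0.25))
```

A result of 0 means the position is already at or above the limit, so only a reducing order or a looser envelope will pass the gate.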
ORDER_REJECTED: gate=drawdown_halt
Fund is in halt state. Cannot loosen envelope while halted. Operator must resume:
mts cmd resume <fund_id>
# Prompts for confirmation: type RESUME
This requires operator authority + 2FA for live funds.
ORDER_REJECTED: gate=static code=BROKER_NOT_ALLOWED
Order's broker field isn't in RiskEnvelope.allowed_brokers. Check:
mts mts1b-riskengine envelope show --fund-id <fund>
Either change the order's broker, or update the envelope.
OMS state
Order stuck in PENDING_RISK
Check riskengine health:
mts mts1b-deploy status mts1b-riskengine
If unhealthy, restart:
sudo systemctl restart mts1b-riskengine
If healthy but stuck, check the order's audit trail:
mts mts1b-operations audit show --subject-id <order_id>
Order shows ACCEPTED but no fill ever arrives
Broker may have rejected silently. Check:
# Broker connection
mts mts1b-brokers test --broker <broker_name>
# Broker-side open orders (some brokers don't push reject events)
mts mts1b-oms orders open --broker <broker_name>
Run reconciliation:
mts mts1b-riskengine reconcile --fund-id <fund> --force
Backtests
RuntimeError: cupy not installed
mts1b-GPUbacktester defaults to GPU. Install GPU extras:
pip install "mts1b-GPUbacktester[gpu]"
Or force CPU:
mts1b-backtest run --backend cpu --factor ...
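In scripts that run on both GPU and CPU machines, you can choose the backend based on whether cupy is importable. A minimal sketch; pick_backend is a hypothetical helper, and the "cuda" and "cpu" strings mirror the --backend values used in this document:

```python
import importlib.util


def pick_backend(gpu_module: str = "cupy") -> str:
    """Return "cuda" if the GPU array library is importable, otherwise "cpu"."""
    if importlib.util.find_spec(gpu_module) is not None:
        return "cuda"
    return "cpu"


print("backend:", pick_backend())
```

Note that this only checks that the package is installed; it does not verify that a CUDA device or driver is present.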
Backtest runs but Sharpe is 0.0 / nan
Likely causes:
- Universe too small: single asset = no cross-sectional ranking → all weights are 0
- Lookback too long: first N bars have NaN; ensure start_date is past the warmup
- Cost too high: strategy edge is killed by costs; lower cost_bps to verify the signal is present, then add back realistic costs
Debug:
result = run_single(...)
print(result.returns) # daily returns array
print(result.weights[-1]) # latest weights
print(result.config) # check params
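To see why a returns series yields 0.0 or nan, it helps to recompute the Sharpe ratio by hand with the warmup NaNs removed. This is a generic sketch, not the backtester's own formula; it assumes daily returns, a zero risk-free rate, and 252 periods per year:

```python
import math
import statistics
from typing import Sequence


def annualized_sharpe(daily_returns: Sequence[float], periods_per_year: int = 252) -> float:
    """Annualized mean/stdev of returns, ignoring NaNs; nan if it cannot be computed."""
    clean = [r for r in daily_returns if not math.isnan(r)]
    if len(clean) < 2:
        return float("nan")  # everything was warmup
    stdev = statistics.stdev(clean)
    if stdev == 0:
        return float("nan")  # flat returns, e.g. all weights were 0
    return statistics.mean(clean) / stdev * math.sqrt(periods_per_year)


nan = float("nan")
print(annualized_sharpe([nan, nan, 0.01, -0.01, 0.02]))
print(annualized_sharpe([0.0] * 10))
```

If the cleaned series is all zeros, look at the weights first; if almost nothing survives the NaN filter, the start date is still inside the warmup window.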
Walk-forward IC is much lower than in-sample
This is normal if your factor is overfit. Stop and rebuild — see Tutorial 3 on stability tests.
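For reference, IC here is taken to mean the rank correlation between factor values and the returns that follow, which is a common definition but an assumption about this toolkit. A small sketch for comparing in-sample and out-of-sample windows yourself; it assumes no tied values, and spearman_ic is a hypothetical helper:

```python
from typing import List, Sequence


def ranks(values: Sequence[float]) -> List[int]:
    """Rank positions (0 = smallest). Assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order):
        result[idx] = rank
    return result


def spearman_ic(factor: Sequence[float], forward_returns: Sequence[float]) -> float:
    """Spearman rank correlation via the sum of squared rank differences."""
    n = len(factor)
    if n != len(forward_returns) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(factor), ranks(forward_returns)))
    return 1 - 6 * d2 / (n * (n * n - 1))


print(spearman_ic([1, 2, 3, 4], [0.1, 0.2, 0.3, 0.4]))
print(spearman_ic([1, 2, 3, 4], [0.4, 0.3, 0.2, 0.1]))
```

Computing this per walk-forward window shows whether the drop is uniform or concentrated in a few periods.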
Deployment
Proxmox API: 401 Unauthorized
Your token doesn't have the right permissions. In Proxmox UI:
- Datacenter → Permissions → API Tokens
- Check Privilege Separation is unchecked (or assign explicit perms)
- Roles: PVEVMAdmin on /vms/* at least
LXC won't start: pct create: storage already exists
Container ID conflict. Use a different ID range:
mts1b-deploy menuconfig
# Proxmox section: Reserve IDs: 200-230
mts1b-deploy install hangs at "Pulling images..."
DNS / registry issue. Try:
docker pull ghcr.io/mts1b/mts1b-foundation:0.0.1
If that fails, you may need to authenticate to ghcr.io:
echo <github_pat> | docker login ghcr.io -u <username> --password-stdin
Or set a local mirror in mts1b.config:
registry:
  url: https://my-mirror.local
  username: mts1b
  password: ${MIRROR_PASSWORD}
Docs site
Algolia search returns no results
Two reasons:
- Index is stale — Algolia crawler runs weekly. Re-trigger at https://dashboard.algolia.com → apps/O23N9EQJYS → Crawler → Restart crawl
- Page was recently added — wait ~30 min for the next crawl, or trigger manually
docs.mts1b.investmentparadisellc.com returns Site not found
Cloudflare custom domain SSL cert provisioning takes 5-15 min after attach. Workaround:
curl https://mts1b-docs.pages.dev/ # use the .pages.dev URL meanwhile
LLM (mts1b-llm)
BudgetExceededError: persona=CRO daily=$5.00
Budget exhausted. Either:
- Wait until UTC midnight (auto-reset)
- Increase budget:
mts mts1b-llm budget set --persona CRO --daily-usd 10
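To decide between waiting and raising the budget, check how long remains until the reset. A minimal sketch, assuming the reset happens exactly at 00:00 UTC as stated above; time_until_utc_midnight is a hypothetical helper:

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def time_until_utc_midnight(now: Optional[datetime] = None) -> timedelta:
    """Time remaining until the next 00:00 UTC. `now` must be timezone-aware."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return next_midnight - now


print("budget resets in:", time_until_utc_midnight())
```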
RateLimitError: anthropic ... Too Many Requests
Provider rate-limit hit. The router will fall back to OpenAI / Google automatically if configured. To verify failover:
mts mts1b-llm providers status
LLM responses are inconsistent run-to-run
Set temperature=0.0 (deterministic) in the persona YAML:
name: my_persona
temperature: 0.0
Note: even at temperature 0, providers sometimes have small non-determinism (cache, model updates).
Frontends
webui shows "Cannot connect to OMS"
Check OMS health:
mts mts1b-deploy status mts1b-oms
curl -i http://localhost:8001/healthz
Likely fix:
sudo systemctl restart mts1b-oms
TUI shows garbled characters
Your terminal doesn't support truecolor. Set:
export TERM=xterm-256color
mts tui
Or set MTS1B_TUI_MONOCHROME=1 for a monochrome fallback.
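Many terminals advertise 24-bit color through the COLORTERM environment variable, so you can check it before picking a setting. This is a heuristic sketch based on that convention, not on how the TUI itself detects color support; supports_truecolor is a hypothetical helper:

```python
import os
from typing import Mapping, Optional


def supports_truecolor(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True if COLORTERM advertises 24-bit color ("truecolor" or "24bit")."""
    env = os.environ if env is None else env
    return env.get("COLORTERM", "").lower() in ("truecolor", "24bit")


print("truecolor advertised:", supports_truecolor())
```

Some terminals support truecolor without setting COLORTERM, so a False result is a hint rather than proof.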
Audit chain
audit verify returns "chain integrity FAILED at seq 42"
Tampering or storage corruption detected. Stop trading. Investigate.
# Inspect the entry
mts mts1b-operations audit show --sequence 42
# Diff against backup
restic restore <snapshot_id> --target /tmp/audit-backup
diff /apps/MTS1B/data/audit/main.log /tmp/audit-backup/main.log
If the diff shows malicious changes, restore from backup + rotate Vault secrets + investigate root cause.
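It helps to understand what a failure "at seq N" means in a hash-chained log. The on-disk format of the audit chain is not documented here, so the following is a generic illustration only: it assumes each entry stores the SHA-256 of the previous entry's hash plus its own payload, and every name in it is hypothetical.

```python
import hashlib
from typing import Dict, List, Optional

GENESIS = "0" * 64  # placeholder hash that precedes the first entry


def entry_hash(prev_hash: str, payload: str) -> str:
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def build_chain(payloads: List[str]) -> List[Dict]:
    chain, prev = [], GENESIS
    for seq, payload in enumerate(payloads):
        digest = entry_hash(prev, payload)
        chain.append({"seq": seq, "payload": payload, "prev_hash": prev, "hash": digest})
        prev = digest
    return chain


def first_broken_seq(chain: List[Dict]) -> Optional[int]:
    """Return the seq of the first entry whose link or hash does not verify."""
    prev = GENESIS
    for entry in chain:
        expected = entry_hash(prev, entry["payload"])
        if entry["prev_hash"] != prev or entry["hash"] != expected:
            return entry["seq"]
        prev = entry["hash"]
    return None


chain = build_chain(["order accepted", "order filled", "position updated"])
chain[1]["payload"] = "order cancelled"  # simulate tampering
print(first_broken_seq(chain))
```

In a scheme like this, the reported sequence number is the earliest entry that fails to verify, so entries before it can still be trusted while everything from that point on needs review.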
Vault
permission denied on vault kv get
Your token doesn't have read access to that path. Check policy:
vault token lookup
vault policy list
vault policy read <policy_name>
Common fix: obtain a fresh token via AppRole login:
vault write auth/approle/login \
role_id=<your_role_id> \
secret_id=<your_secret_id>
Then use the returned token.
Vault sealed
vault status
# Sealed: true
Unseal with 3 of 5 shares:
vault operator unseal <share_1>
vault operator unseal <share_2>
vault operator unseal <share_3>
If you don't have the shares, you've lost the Vault data. Restore from backup.
Performance
Backtest is slow
| Symptom | Likely cause | Fix |
|---|---|---|
| 5+ minutes per run | CPU backend with large universe | Switch to GPU: --backend cuda |
| Memory error | Universe × duration too big | Chunk: --chunk-size-mb 512 |
| Lots of disk I/O | Cold parquet cache | The first run warms the cache; subsequent runs are faster |
OMS slow on submit
mts mts1b-platform metric latency --service mts1b-oms --window 5m
Look for p99 > 50ms. Likely culprits:
- Riskengine gRPC slow (mts mts1b-platform metric latency --service mts1b-riskengine)
- DB pool exhausted (mts mts1b-platform metric db_pool --pool primary)
- NATS publish slow (mts mts1b-platform metric nats --subject "mts.v1.oms.>")
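If you have raw latency samples rather than a precomputed metric, p99 can be checked directly. This sketch uses the nearest-rank definition, which may differ slightly from how the metrics command aggregates; the 50 ms threshold is the one quoted above, and percentile is a hypothetical helper:

```python
import math
from typing import Sequence


def percentile(samples: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


latencies_ms = [12, 15, 11, 14, 13, 90, 12, 16, 13, 14]
p99 = percentile(latencies_ms, 99)
print("p99 ms:", p99, "over threshold:", p99 > 50)
```

With few samples, p99 is simply the worst observation, so collect a full window before drawing conclusions.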
Where else to look
- GitHub Discussions
- Discord #help
- Audit chain for "what happened when"
- Grafana for metrics dashboards