Scope

What cents is

A personal research workbench for writing down stock theses, attaching evidence, and tracking whether the reasoning held up.
An experiment in multi-agent LLM orchestration applied to a domain where outcomes are unambiguous (price moves) and the regime is non-stationary.
A labeled-outcomes dataset generator — every closed thesis is tagged with its discovery source, premise tags, regime snapshot, and closure reason, so the project itself can be studied.

It is open-source under the MIT license. Treat the codebase as you would any research code: instructive, occasionally clever, not audit-grade.

What cents is not

Not investment advice. Outputs (conviction deltas, premise tags, thesis suggestions, BUY/SELL/HOLD strings) are model-generated and uncalibrated. There is no held-out evaluation of the LLM scorers; there are no performance disclosures on the cohort or backtest outputs.
Not a fiduciary product. No KYC, no suitability check, no risk tolerance intake, no investor profiling. The system will happily open paper shorts on a retiree’s behalf if asked.
Not a trading system. Position sizing defaults to equal-dollar; vol-targeting is off by default (opt-in via sizing_mode = "vol_scaled"). There is no portfolio-level kill switch, no max-drawdown gate, no factor or correlation cap, no liquidity or borrow filtering, and no intraday stop monitoring. The “paired hedge” is dollar-matched, not beta-matched (beta_match_hedge defaults to false).

Vol-scaled sizing, beta-matched hedging, transaction cost modeling, and portfolio drawdown computation are all available as utilities in cents/finance/, but the engine deliberately does not use them as gating decisions in the default research configuration. They exist so callers writing their own analytics can study these dimensions of the outcomes dataset; the engine itself records what happened, it doesn’t filter what gets to happen.
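
As a rough illustration of what separates the defaults from the opt-in utilities, the sketch below contrasts equal-dollar with inverse-volatility sizing, and a dollar-matched with a beta-matched hedge. The function names and signatures are hypothetical and are not the API of cents/finance/. Inverse-volatility weighting is only one common reading of "vol-scaled" and is an assumption here.

```python
def equal_dollar_sizes(symbols, capital):
    """Split capital evenly across symbols (the equal-dollar default)."""
    per_position = capital / len(symbols)
    return {symbol: per_position for symbol in symbols}


def vol_scaled_sizes(vols, capital):
    """Size each position in proportion to 1 / volatility, normalised to capital.

    `vols` maps symbol -> volatility estimate (any consistent unit, must be > 0).
    """
    inverse = {symbol: 1.0 / vol for symbol, vol in vols.items()}
    total = sum(inverse.values())
    return {symbol: capital * weight / total for symbol, weight in inverse.items()}


def hedge_notional(long_notional, long_beta, hedge_beta=1.0, beta_match=False):
    """Dollar-matched hedge by default; beta-matched when beta_match is True."""
    if not beta_match:
        return long_notional
    return long_notional * long_beta / hedge_beta


sizes = vol_scaled_sizes({"LOWVOL": 0.2, "HIGHVOL": 0.4}, capital=9_000)
# LOWVOL receives twice the dollars of HIGHVOL because it has half the volatility.
```

With a long beta of 1.5 against a hedge instrument of beta 1.0, the dollar-matched hedge leaves half a unit of market exposure per dollar unhedged. That is the kind of difference a caller might want to study in the outcomes dataset.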

How the pipeline stays falsifiable

cents experiment register is how cents answers the question “did this experiment actually work?” without post-hoc cherry-picking. A YAML spec declares the hypothesis, the primary metric, and a minimum_n_per_arm target before the factory opens any theses for the run. Registration snapshots the current factory.toml SHA so the parameters are frozen too. Every thesis the factory opens while the experiment is active is stamped with the experiment_id and an orchestrator_label ("llm" for the real multi-agent stack, "random" for the matched-cadence control arm; see Operating principles). cents experiment finalize <name> then locks the verdict against the original spec, not against whatever cut of the data flatters it.
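
The registration idea can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the cents implementation. The Registration class, the spec keys other than minimum_n_per_arm, the choice of SHA-256, and the status strings are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    """Immutable record of what was promised before any thesis was opened."""
    name: str
    hypothesis: str
    primary_metric: str
    minimum_n_per_arm: int
    config_sha: str


def config_digest(config_bytes: bytes) -> str:
    # Assumption: SHA-256 over the raw bytes of factory.toml.
    return hashlib.sha256(config_bytes).hexdigest()


def register(spec: dict, config_bytes: bytes) -> Registration:
    """Freeze the spec together with a digest of the config in force."""
    return Registration(
        name=spec["name"],
        hypothesis=spec["hypothesis"],
        primary_metric=spec["primary_metric"],
        minimum_n_per_arm=int(spec["minimum_n_per_arm"]),
        config_sha=config_digest(config_bytes),
    )


def check_finalizable(reg: Registration, closed_per_arm: dict, config_bytes: bytes) -> str:
    """Refuse a verdict if the config drifted or an arm is under-sampled."""
    if config_digest(config_bytes) != reg.config_sha:
        return "config changed since registration"
    short = sorted(
        arm for arm in ("llm", "random")
        if closed_per_arm.get(arm, 0) < reg.minimum_n_per_arm
    )
    if short:
        return "below minimum_n_per_arm: " + ", ".join(short)
    return "ok"
```

The frozen dataclass mirrors the intent of the paragraph above: once the hypothesis, metric, and sample target are written down, nothing downstream can quietly edit them.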

This is the move that makes the cohort and regime tables on this site worth anything. Without it, they’re storytelling.

Not a fund accounting system. SQLite at ~/.cents/data/cents.db is a single-user file with no encryption, no WORM-compliant audit trail, no broker reconciliation, no lot accounting, no wash-sale tracking, no 1099 reconciliation.

If any of those gaps would be load-bearing for the way you’d use cents, you don’t want cents — you want a real OMS, a real adviser, or a real broker.

Real-money trading is out of scope

Cents includes a [broker] extras install and an Alpaca integration because the original experiment needed somewhere to read prices and post paper fills. Live, real-money trading is technically possible but explicitly outside the scope of this project.

What that means in practice:

The cents factory autonomous loop is hard-coded to paper=True.
The cents broker buy / broker sell commands are documented as paper-trading commands; they have no review, suitability, or best-execution layer behind them.
There is no roadmap item to make real-money trading safe. The existing P4 roadmap entry on Alpaca real-money execution is deferred indefinitely — not “soon,” not “after v1.0,” not at all under the current scope.
If you point cents at a non-paper Alpaca account by editing the config or the source, you are doing so against the documented intent of the project, with none of the controls (position sizing discipline, kill switches, recordkeeping, reconciliation, tax accounting) that a real trading system would require.

The author of this project does not want anyone to lose money because of code in this repo. If you are tempted to wire it to a live account, please don’t.

Limits the critique surfaced

The system was put through a five-persona critique (investment PM, AI engineer, CFO, risk manager, compliance officer) and the convergent verdict was that it is a research tool, full stop. The findings are tracked as issues in the .beads directory; the headline gaps are:

LLM components are uncalibrated and not deterministic (temperature unset, model alias not pinned to a snapshot).
Untrusted news/event text is f-stringed into LLM prompts without delimiters — a prompt-injection surface.
No volatility-scaled position sizing, no portfolio-level drawdown gate, no liquidity or borrow checks on shorts.
No transaction cost, slippage, or borrow-fee modeling — recorded P&L is gross of costs that would meaningfully change the sign of small edges.
No prompt/model fingerprint linked to evidence rows, so historical decisions cannot be reconstructed for audit.

These are real and known. They are why this is a research tool.
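
To make the cost gap concrete, here is a small worked example of how a positive gross return can turn negative once costs are subtracted. The function is hypothetical and not part of cents. The cost and borrow-fee figures are made up for illustration and are not estimates for any real broker or security.

```python
def net_return(gross_return, cost_bps_per_side, borrow_fee_annual=0.0,
               days_held=0, is_short=False):
    """Subtract round-trip trading costs (and borrow fees for shorts) from a gross return.

    gross_return       fractional return, e.g. 0.003 for +30 bps
    cost_bps_per_side  commission plus slippage per side, in basis points
    borrow_fee_annual  annualised borrow fee as a fraction (shorts only)
    days_held          calendar days the position was open
    """
    costs = 2 * cost_bps_per_side / 10_000  # entry plus exit
    if is_short:
        costs += borrow_fee_annual * days_held / 365
    return gross_return - costs


# A +30 bps gross edge with an assumed 20 bps cost per side is a net loss.
small_edge = net_return(0.003, cost_bps_per_side=20)

# A +1% gross short held 73 days with an assumed 5% annual borrow fee also flips sign.
short_trade = net_return(0.01, cost_bps_per_side=10, borrow_fee_annual=0.05,
                         days_held=73, is_short=True)
```

Both illustrative trades are recorded as wins on a gross basis and come out as losses net of the assumed costs.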

Survivorship bias in universes

Universes resolved at run-time use current FMP screener output plus any persisted delistings. Forward tests are unbiased once cents universe ingest-delistings has been run; retrospective backtests against the same universe stay biased by symbols delisted before the delistings table existed.

In practice:

Live / forward runs: cents factory run resolves the universe as of today, so membership is fixed before outcomes are known and selection carries no survivorship bias.
Past-date backtests: cents universe show --as-of YYYY-MM-DD reconstructs membership by layering tracked delistings on top of the current screener output. Coverage is only as good as how long ingest-delistings has been running.
The fix: keep cents universe ingest-delistings on a daily schedule. The longer it runs, the closer retrospective universes get to being unbiased.
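
The layering described above can be sketched as follows. This is a simplified illustration and not the cents implementation. The function name and the shape of the delistings mapping are assumptions, and the sketch ignores symbols that were first listed after the as-of date.

```python
from datetime import date


def universe_as_of(current_symbols, delistings, as_of):
    """Approximate universe membership on `as_of`.

    current_symbols  symbols in today's screener output
    delistings       mapping of symbol -> date the symbol was delisted
    as_of            the historical date being reconstructed
    """
    members = set(current_symbols)
    for symbol, delisted_on in delistings.items():
        # A symbol delisted after `as_of` was still tradable on that date.
        if delisted_on > as_of:
            members.add(symbol)
    return members


tracked = {"XYZ": date(2024, 6, 1), "OLD": date(2020, 1, 15)}
members = universe_as_of({"AAPL", "MSFT"}, tracked, as_of=date(2023, 1, 1))
# XYZ is restored (delisted after the as-of date); OLD is not (already gone by then).
```

Any symbol that was delisted before tracking began never appears in the mapping, so it cannot be restored. That is the residual bias in retrospective universes.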

Lookahead leakage in sentiment scoring

NewsAPI returns articles published at any time, including same-day pieces that explicitly reference today’s price action (“NVDA up 8% after …”). The sentiment agent’s LLM scorer sees that text — so when the article body already names the move, the agent is implicitly forward-looking on intraday horizons. The outcome label the factory then learns from is contaminated with information the strategy could not have used at decision time.

An audit test documents the methodology:

pytest tests/test_lookahead_audit.py --runlookahead

It scores two headlines that differ only in whether they mention the day’s move and asserts that the live LLM’s score doesn’t move by more than 0.3 between them. A failure is the documented “leakage detected” signal. The mocked smoke version runs unconditionally on every test run and only checks plumbing.
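
The paired-headline check can be illustrated with stub scorers in place of the live LLM. This sketch is not the contents of tests/test_lookahead_audit.py. The helper name, the stub scorers, and the example headlines are hypothetical; only the 0.3 threshold comes from the description above.

```python
LEAKAGE_THRESHOLD = 0.3


def audit_pair(score, neutral_headline, move_headline, threshold=LEAKAGE_THRESHOLD):
    """Return (passed, delta) for two headlines that differ only in naming the move."""
    delta = abs(score(move_headline) - score(neutral_headline))
    return delta <= threshold, delta


def leaky_stub(headline):
    # Stand-in for a scorer that reads the price move straight out of the text.
    return 0.9 if "up 8%" in headline else 0.1


def blind_stub(headline):
    # Stand-in for a scorer that ignores the price move entirely.
    return 0.2


neutral = "NVDA announces new data center chip"
with_move = "NVDA up 8% after announcing new data center chip"

leaky_passed, leaky_delta = audit_pair(leaky_stub, neutral, with_move)
blind_passed, blind_delta = audit_pair(blind_stub, neutral, with_move)
```

The leaky stub fails the check and the blind stub passes it. A real run would substitute the live scorer for the stubs.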

Mitigation (TODO). Add a news_cutoff_time config knob on FactoryConfig that filters NewsAPI by publishedAt < cutoff — only consider articles published before market open on the decision day. Tracked as part of cents-ekd.
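
One way the proposed filter might look, as a sketch only: the mitigation is still a TODO, so none of this exists in cents. The helper names are hypothetical, and the sketch assumes each article is a dict carrying NewsAPI's publishedAt field as a UTC ISO 8601 string, with the cutoff supplied as a timezone-aware datetime.

```python
from datetime import datetime, timezone


def parse_published_at(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as '2024-05-01T12:00:00Z'."""
    # Replace the trailing 'Z' so older Python versions can parse it.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def filter_before_cutoff(articles, cutoff: datetime):
    """Keep only articles published strictly before the cutoff."""
    return [
        article for article in articles
        if parse_published_at(article["publishedAt"]) < cutoff
    ]


# Illustrative cutoff: 13:30 UTC, which is 09:30 in New York during daylight time.
cutoff = datetime(2024, 5, 1, 13, 30, tzinfo=timezone.utc)
articles = [
    {"title": "Pre-market preview", "publishedAt": "2024-05-01T12:00:00Z"},
    {"title": "Stock jumps after open", "publishedAt": "2024-05-01T15:00:00Z"},
]
usable = filter_before_cutoff(articles, cutoff)
```

Only the pre-market article survives, so a headline that already names the day's move never reaches the scorer.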

If you still want to use it

Use cents the way you’d use a Jupyter notebook: as a sandbox for asking questions about your own theses, in paper, on a machine you control, with money you’ve decided to treat as study material. Don’t share its outputs as recommendations to other people. Don’t publish its cohort or backtest tables without disclosing how thin the sample is and how much of the P&L is gross of costs. And don’t point it at a live account.

That’s the scope.

Not financial advice. Cents is an educational and research tool for tracking your own investment theses. Outputs are model-generated and may be inaccurate. You are solely responsible for your own investment decisions.