<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>Keyur Raval</title>
        <link>https://keyval.me</link>
        <description>Notes from Keyur Raval on product engineering, startups, AI systems, and shipping software.</description>
        <lastBuildDate>Mon, 22 Jun 2026 19:24:31 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <image>
            <title>Keyur Raval</title>
            <url>https://keyval.me/favicon.ico</url>
            <link>https://keyval.me</link>
        </image>
        <copyright>All rights reserved 2026</copyright>
        <item>
            <title><![CDATA[Building GreekLab: A Regime-Aware Options Research Platform for Indian Markets]]></title>
            <link>https://keyval.me/articles/building-greeklab-indian-options-research</link>
            <guid isPermaLink="false">https://keyval.me/articles/building-greeklab-indian-options-research</guid>
            <pubDate>Mon, 22 Jun 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>Most options research starts with a strategy and works backward to justify it.
GreekLab starts with the market regime and works forward to the strategy that
fits it.</p>
<p>That distinction drives the entire architecture. Before any trade decision is
made, the system classifies the current market state — ranging, trending up,
trending down, or volatility shock — and routes to the strategy families that
have positive expected value in that regime. No regime classification, no trade.</p>
<p>This is a case study on building that research platform for Indian index options:
Nifty, Sensex, and BankNifty.</p>
<h2>Why Indian Options Are Worth a Dedicated Platform</h2>
<p>Indian index options have characteristics that make generic options frameworks
a poor fit.</p>
<p>Nifty options expire every Thursday. The weekly cycle means there are always
multiple active expiries in play — current week, next week, monthly, quarterly
— and the choice of expiry is a research decision, not just a preference.
Implied volatility behaves differently near expiry, and the premium decay
profile changes meaningfully on the final trading day.</p>
<p>The India VIX is a useful but imperfect signal. Its relationship to Nifty
realized volatility shifts across market regimes. A system that treats VIX as
a single threshold for volatility-selling decisions will be correct on average
and wrong at the moments that matter.</p>
<p>Finally, data quality is a genuine problem. Live broker feeds drop candles.
Historical expired options data is fragmented across providers. A platform that
does not invest in data validation and gap repair will produce backtests that
look better than production results.</p>
<p>GreekLab is built around those three constraints.</p>
<h2>Data Layer: Four Brokers, One Pipeline</h2>
<p>The data layer ingests from four sources, each covering a different slice of
the market data problem:</p>
<ul>
<li><strong>Zerodha Kite:</strong> Live 1-minute candle streaming for index and options via
KiteTicker, plus reference data for all active instruments. The primary live
feed.</li>
<li><strong>Dhan:</strong> Historical expired options data from 2021 forward. Expired options
are not available through most providers after expiry; Dhan's API fills that
gap for backtesting.</li>
<li><strong>Upstox:</strong> Alternative source for expired options and index data, used as
validation and for any gaps in Dhan coverage.</li>
<li><strong>Breeze:</strong> Spot and index historical data as a secondary validation source.</li>
</ul>
<p>All four feeds write into PostgreSQL for the control plane (positions,
decisions, regime state) and optionally into ClickHouse for the high-volume
market data tables (option 1-minute bars, greek surface snapshots). A
reconciliation service runs daily to detect candle gaps and repair them from
Kite historical data.</p>
<p>The goal is a clean, gap-free options dataset from 2021 to present for Nifty,
with VIX aligned to each bar. Without that foundation, every backtest result
carries a data-quality caveat that is impossible to quantify.</p>
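<p>The daily gap check can be sketched in a few lines. This is a minimal illustration rather than GreekLab's reconciliation service: <code>find_candle_gaps</code> is a hypothetical helper, and it assumes minute-aligned timestamps from a single trading session.</p>
<pre><code class="language-python">from datetime import datetime, timedelta

ONE_MINUTE = timedelta(minutes=1)

def find_candle_gaps(timestamps, step=ONE_MINUTE):
    """Return (first_missing, last_missing) pairs between consecutive bars."""
    ordered = sorted(set(timestamps))  # drop duplicate bars before comparing
    gaps = []
    for prev, curr in zip(ordered, ordered[1:]):
        # Minute-aligned bars differ by a whole number of steps, so any
        # difference other than one step means at least one bar is missing.
        if curr - prev != step:
            gaps.append((prev + step, curr - step))
    return gaps

bars = [
    datetime(2026, 6, 22, 9, 15),
    datetime(2026, 6, 22, 9, 16),
    datetime(2026, 6, 22, 9, 19),  # 09:17 and 09:18 were dropped
]
for first_missing, last_missing in find_candle_gaps(bars):
    print("repair", first_missing.time(), "to", last_missing.time())
</code></pre>
<p>Each returned range is the window a repair job would need to backfill from a historical source. A production version would also have to skip session breaks and holidays, which this sketch ignores.</p>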
<h2>Regime Detection</h2>
<p>The regime engine runs continuously during market hours and classifies each
1-minute bar into one of five states:</p>
<ul>
<li><strong>RANGE:</strong> Price contained within the opening range, VIX moderate, no strong
directional momentum. The environment for volatility-selling strategies.</li>
<li><strong>TREND_UP / TREND_DOWN:</strong> Price breaking out of the opening range with
directional confirmation. The environment for momentum and debit spread
strategies.</li>
<li><strong>VOL_SHOCK:</strong> Realized volatility spike, VIX elevated sharply, IV-RV spread
compressed or inverted. The environment for reduced exposure and defined-risk
structures only.</li>
<li><strong>NO_TRADE:</strong> Outside market hours, data quality below threshold, or no
regime with sufficient confidence. No strategies admitted.</li>
</ul>
<p>Each classification combines the opening range position, VWAP relationship,
ATR percentage, realized volatility, implied volatility, and the IV-RV spread.
The regime is stored with the input signals and the classification reason so
every decision can be traced back to the market state that produced it.</p>
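<p>One way to picture that traceability is a record that carries the regime, the reason, and the raw inputs together. The sketch below is an assumption about shape only: the class, the field names, and the sample values are hypothetical and are not taken from GreekLab's schema.</p>
<pre><code class="language-python">from dataclasses import dataclass
from enum import Enum
import json

class Regime(Enum):
    RANGE = "RANGE"
    TREND_UP = "TREND_UP"
    TREND_DOWN = "TREND_DOWN"
    VOL_SHOCK = "VOL_SHOCK"
    NO_TRADE = "NO_TRADE"

@dataclass(frozen=True)
class RegimeRecord:
    underlying: str
    bar_time: str   # ISO-8601 start of the 1-minute bar
    regime: Regime
    reason: str     # human-readable classification reason
    signals: dict   # raw inputs behind the classification

    def to_row(self):
        """Flatten the record into a JSON-serializable dict for storage."""
        return {
            "underlying": self.underlying,
            "bar_time": self.bar_time,
            "regime": self.regime.value,
            "reason": self.reason,
            "signals": dict(self.signals),
        }

# Illustrative values only; none of these numbers come from live data.
record = RegimeRecord(
    underlying="NIFTY",
    bar_time="2026-06-22T10:31:00+05:30",
    regime=Regime.RANGE,
    reason="inside opening range, IV above RV",
    signals={
        "opening_range_position": 0.4,
        "vwap_distance_pct": 0.05,
        "atr_pct": 0.3,
        "realized_vol": 11.2,
        "implied_vol": 13.0,
        "iv_rv_spread": 1.8,
    },
)
print(json.dumps(record.to_row()))
</code></pre>
<p>Storing the inputs next to the label means a later review can re-run the classifier on the same signals and confirm it still reaches the same regime.</p>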
<p>In June 2026, the NIFTY regime distribution across 1-minute bars was
predominantly NO_TRADE — the system classifies most intraday time as outside
its confidence threshold. RANGE represented the largest active-trading regime,
with TREND_UP and TREND_DOWN combining for a small fraction of bars. This
selectivity is intentional. The goal is not to be in the market constantly. It
is to be in the market when the regime favors the strategy.</p>
<h2>Strategy Families</h2>
<p>GreekLab currently paper-trades eleven strategy families. Each is admitted only in
its target regime.</p>
<p><strong>Volatility selling (RANGE regime):</strong></p>
<ul>
<li><em>Iron Condor:</em> Short strangle with defined-risk wings. The primary
volatility-selling vehicle — short premium collected on both sides with max
loss capped by the wing width.</li>
<li><em>Iron Fly:</em> ATM short straddle with wings. Higher premium than a condor, but
a narrower profit zone.</li>
<li><em>Protected Strangle:</em> Short strangle with tail-risk hedges. Lower premium
than an iron condor, but reduced exposure to gap events.</li>
</ul>
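<p>The capped-loss property of the iron condor is easy to verify with the standard expiry payoff. The strikes and credit below are hypothetical round numbers chosen for illustration, expressed in index points per unit, and costs are ignored.</p>
<pre><code class="language-python">def iron_condor_pnl(spot, long_put, short_put, short_call, long_call, net_credit):
    """Profit or loss at expiry, in points per unit, before costs."""
    # Loss on the short put spread, capped once spot falls below the long put.
    put_side = max(short_put - spot, 0) - max(long_put - spot, 0)
    # Loss on the short call spread, capped once spot rises above the long call.
    call_side = max(spot - short_call, 0) - max(spot - long_call, 0)
    return net_credit - put_side - call_side

# Hypothetical structure: 200-point wings on each side, 60 points of credit.
strikes = dict(long_put=24800, short_put=25000, short_call=25500, long_call=25700)

for spot in (24000, 24900, 25200, 26000):
    print(spot, iron_condor_pnl(spot, net_credit=60, **strikes))
</code></pre>
<p>Between the short strikes the position keeps the full credit. Beyond either wing the loss stops at the wing width minus the credit, which is 200 - 60 = 140 points in this example.</p>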
<p><strong>Directional (TREND regime):</strong></p>
<ul>
<li><em>Bull Debit Spread / Bear Debit Spread:</em> Defined-risk debit structures for
directional momentum. Long the near-the-money option, short the further
strike to reduce cost.</li>
<li><em>ATR Momentum Buy:</em> Directional option buy sized by ATR-derived breakout
confirmation. Short hold target, fast exit rules.</li>
<li><em>Trend Option Buy:</em> Straight directional option buy with technical entry
confirmation and a time-based exit if the move does not follow through.</li>
</ul>
<p><strong>Term structure:</strong></p>
<ul>
<li><em>Calendar Spread:</em> Long a farther expiry, short the nearer expiry at the
same strike. Benefits from the near-expiry leg decaying faster than the
far-expiry leg.</li>
<li><em>Double Calendar:</em> Calendar spreads on both sides of the current price.
Lower directional risk than a single calendar, more complex to manage.</li>
<li><em>Directional Diagonal:</em> Combined directional and term-structure play. Long a
farther expiry at a directional strike, short the current-week expiry at a
closer strike.</li>
</ul>
<p><strong>Scalping (RANGE regime, intraday):</strong></p>
<ul>
<li><em>Greek Scalp:</em> Short-duration momentum trade using real-time greeks as entry
signals. Tight stop-loss and fast exit target. High trade frequency, short
average hold.</li>
</ul>
<p><strong>Expiry-day:</strong></p>
<ul>
<li><em>DTE-0 Butterfly:</em> Same-day expiry butterfly for premium decay in a range.
High-conviction range-day only.</li>
</ul>
<p><strong>ML-ranked cross-strategy:</strong></p>
<ul>
<li><em>Tick-data ranker:</em> An ML model trained on five years of per-tick options
data — real bid/ask spreads, OI velocity, and buy/sell flow imbalance — that
selects the best strategy structure and strike configuration at each entry
time from across the vol-selling menu. The ranker is regime-conditioned and
uses microstructure signals as entry features rather than price-only
indicators.</li>
</ul>
<h2>Paper Trading Results</h2>
<p>GreekLab has been running in paper-trading mode since May 13, 2026. Across 114
closed positions, the platform has generated a net P&amp;L of approximately
₹47,200 on paper capital, with an overall win rate of 56.1%.</p>
<p>The results break down meaningfully by strategy family:</p>
<table><thead><tr><th>Strategy family</th><th style="text-align:right">Positions</th><th style="text-align:right">Win rate</th><th style="text-align:right">Total net P&amp;L</th><th style="text-align:right">Avg hold</th></tr></thead><tbody><tr><td>Iron condor</td><td style="text-align:right">41</td><td style="text-align:right">75.6%</td><td style="text-align:right">₹40,184</td><td style="text-align:right">66 hrs</td></tr><tr><td>Tick-data ranker</td><td style="text-align:right">10</td><td style="text-align:right">80.0%</td><td style="text-align:right">₹5,133</td><td style="text-align:right">74 hrs</td></tr><tr><td>ATR momentum buy</td><td style="text-align:right">12</td><td style="text-align:right">75.0%</td><td style="text-align:right">₹5,048</td><td style="text-align:right">0.5 hrs</td></tr><tr><td>Trend option buy</td><td style="text-align:right">7</td><td style="text-align:right">57.1%</td><td style="text-align:right">₹5,216</td><td style="text-align:right">0.8 hrs</td></tr><tr><td>Calendar</td><td style="text-align:right">3</td><td style="text-align:right">0.0%</td><td style="text-align:right">-₹700</td><td style="text-align:right">1.8 hrs</td></tr><tr><td>Greek scalp</td><td style="text-align:right">26</td><td style="text-align:right">42.3%</td><td style="text-align:right">-₹2,530</td><td style="text-align:right">0.2 hrs</td></tr><tr><td>Greek debit spread</td><td style="text-align:right">14</td><td style="text-align:right">7.1%</td><td style="text-align:right">-₹4,783</td><td style="text-align:right">3.2 hrs</td></tr></tbody></table>
<p>Several things are visible in the data.</p>
<p>Iron condors are carrying the positive P&amp;L. The 75.6% win rate over a 66-hour
average hold is consistent with a range-regime volatility-selling strategy that
collects premium slowly and closes early when the position reaches target. The
result is not surprising — the RANGE regime is the most common active-trading
classification, and iron condors are the primary vehicle for it.</p>
<p>The tick-data ranker posts the highest win rate in the table: 80% across 10
positions with a 74-hour average hold. At each entry time it selects a strategy
type and strike configuration from a menu of vol-selling structures (straddles,
strangles, iron condors, iron flies, and credit spreads). The sample is small,
but the per-trade economics are consistent with the iron condor results.</p>
<p>The short-duration directional strategies (ATR momentum buy, trend option buy) are
both profitable, with win rates of 75.0% and 57.1% and sub-one-hour average holds.
These are admitted infrequently and sized conservatively, so the absolute P&amp;L is
smaller, but the per-trade economics look sound.</p>
<p>Greek scalp and greek debit spread are underperforming. The scalp strategy has
a 42.3% win rate with an average hold of 0.2 hours: it trades frequently but
loses more often than it wins. The debit spread has a 7.1% win rate
across 14 positions, which is the most informative failure in the data. The
debit spread entry rules are being triggered in conditions where directional
momentum does not follow through. That is a signal to tighten the entry regime
filter, not a signal that debit spreads cannot work.</p>
<p>Calendar spreads have lost on all three positions. This is a small sample, but
the structure requires specific term-structure conditions to profit. The current
entry logic is likely admitting calendars in conditions where the near-far
volatility spread does not favor the trade.</p>
<p>These are exactly the kinds of findings a paper-trading pipeline is designed to
produce before real capital is committed.</p>
<h2>The Operations Dashboard</h2>
<p>GreekLab runs a FastAPI backend and a React dashboard at
<a href="https://indianops.vibeship.ai/">indianops.vibeship.ai</a>. The dashboard is
read-only: it surfaces system state but does not control it.</p>
<p>The live view shows the current spot price and day change, the active regime
for each underlying, open positions with entry price and current mark-to-market,
recent decisions with the regime state and reason at the time of each decision,
and data quality indicators (candle coverage, quote freshness, ingestion health).</p>
<p>The regime indicator is the most-watched element. If the system is in NO_TRADE
when the market looks active, the quality check tab will show why — stale
quotes, a candle gap, or a regime confidence threshold not met. That
transparency is what makes the system trustworthy enough to iterate on.</p>
<h2>What the Research Is Teaching</h2>
<p>A few things are already clear from the paper-trading period.</p>
<p><strong>Regime selectivity is correct but may be too conservative.</strong> The high NO_TRADE
fraction means the system passes on many potential setups. Whether those passed
setups would have been profitable is worth examining through backtesting, not
just observation.</p>
<p><strong>Premium-selling structures outperform scalping and debit spreads in this
period.</strong> The iron condor and tick-data ranker results versus the scalp and
debit spread results point toward structures where the profit/loss profile is
bounded and the hold period is long enough for premium decay to work.</p>
<p><strong>Entry regime filters matter more than exit rules.</strong> The debit spread and
calendar results suggest the entries are wrong more often than the exits. That
is a common finding in options research and the right place to invest in
tighter conditions.</p>
<p><strong>Paper trading reveals operational failures that backtesting cannot.</strong> Two
positions were voided due to data issues. Several positions were admitted and
closed within minutes because live quote quality degraded. A backtest using
clean historical data would not have shown those failure modes.</p>
<p>The platform is research infrastructure, not a trading product. The value is in
building a rigorous, reproducible process for evaluating whether a strategy
has edge before any real capital is involved.</p>]]></content:encoded>
            <author>keyur@keyval.me (Keyur Raval)</author>
        </item>
        <item>
            <title><![CDATA[Building PatternRank: research-grade equity rankings without trade calls]]></title>
            <link>https://keyval.me/articles/building-patternrank-research-grade-equity-rankings</link>
            <guid isPermaLink="false">https://keyval.me/articles/building-patternrank-research-grade-equity-rankings</guid>
            <pubDate>Mon, 22 Jun 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<figure><img src="/articles/building-patternrank/hero.png" alt="PatternRank product visual showing research-grade US equity rankings"><figcaption><p>PatternRank is positioned as a research workflow for US equities, not a
trading recommendation engine.</p></figcaption></figure>
<p>PatternRank started as an internal research system.</p>
<p>After spending roughly a decade studying Indian markets, I wanted to bring a
stricter research process into public US equities. Most retail screening tools
make it easy to sort tickers. Far fewer force the uncomfortable parts of
research: walk-forward validation, survivorship-bias handling, corporate-action
adjustment, realistic execution assumptions, and honest follow-through after a
ranking is produced.</p>
<p>That gap became the product brief. PatternRank would not be another place to
make confident trade calls. It would rank equity setups with model-derived
evidence, show confidence and outcome context, and keep the product language
firmly in research territory.</p>
<p>The flagship product is the US 7-day momentum workflow. The rankings refresh
after the US market close. Historical rankings remain visible once their 7-day
outcome can be measured. The public app separates the standard 7D All Stocks
universe from the 7D ETB universe, where ETB means Easy-to-Borrow and adds a
liquidity and borrowability filter.</p>
<h2>The Problem With Most Screeners</h2>
<p>A simple screener answers a simple question: what matches these rules right
now?</p>
<p>That is useful, but it leaves too much of the research process outside the
system. A ranking product needs to answer a harder set of questions:</p>
<ul>
<li>Was this setup defined before the move, or after the chart already looked
obvious?</li>
<li>Was the model trained with future information leaking into the features?</li>
<li>Did historical evaluation include names that later disappeared from the
universe?</li>
<li>Were corporate actions and adjusted prices handled consistently?</li>
<li>Were results measured with realistic execution assumptions instead of perfect
fills?</li>
<li>Can a user review completed outcomes without turning the output into advice?</li>
</ul>
<p>PatternRank was built around those questions. The product is less interesting
as a dashboard and more interesting as a research pipeline that happens to have
a dashboard at the end.</p>
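<p>The leakage and walk-forward questions can be made concrete with a small fold generator. This is a sketch under stated assumptions, not PatternRank's validation code: <code>walk_forward_folds</code> is a hypothetical helper, the window sizes are arbitrary, and the gap of 7 bars is chosen to match a 7-day label horizon so that no training label is measured inside the test window.</p>
<pre><code class="language-python">def walk_forward_folds(n_days, train_size, test_size, gap):
    """Rolling train/test index ranges with a gap between them."""
    folds = []
    # Step forward one test window at a time; stop when a full test
    # window no longer fits inside the available history.
    for test_start in range(train_size + gap, n_days - test_size + 1, test_size):
        train_end = test_start - gap  # exclusive end of the training window
        train = range(train_end - train_size, train_end)
        test = range(test_start, test_start + test_size)
        folds.append((train, test))
    return folds

folds = walk_forward_folds(n_days=100, train_size=50, test_size=10, gap=7)
for train, test in folds:
    print("train", train.start, train.stop - 1, "test", test.start, test.stop - 1)
</code></pre>
<p>Every training window ends before its test window begins, and the gap keeps a label that looks several bars ahead from reaching into the evaluation period.</p>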
<h2>Getting the Labels Right</h2>
<p>The earliest version of the system found historical breakout events, labeled
the patterns around them, and trained models to recognize similar behavior.</p>
<p>That was useful for exploration, but it also exposed the main modeling risk.
If a label is too close to the obvious breakout, the model can become good at
recognizing a move after it is already visible. That is not the product I wanted
to build.</p>
<p>The production workflow shifted toward pre-expansion labels — recognizing
conditions before the setup becomes an obvious momentum move. That changed the
product from "find stocks that already exploded" to "rank setups where the
structure is beginning to form." The distinction matters because a model trained
on confirmed breakouts tends to predict what the chart already shows, which is
not useful research.</p>
<p>PatternRank is not trying to publish a perfect prediction. It is trying to make
a research queue more disciplined.</p>
<h2>The Architecture</h2>
<p>The production story has two halves: a Python research pipeline and a web
product that exposes the outputs.</p>
<p>The Python research pipeline drives the daily US scoring workflow:</p>
<pre><code class="language-text">market data update
  -&gt; instrument and universe preparation
  -&gt; technical indicator calculation
  -&gt; feature generation
  -&gt; ignition model scoring
  -&gt; probability and confidence ranking
  -&gt; EV-aware selection
  -&gt; Supabase publishing
  -&gt; completed performance update after the 7-day window
</code></pre>
<p>The model stack is deliberately conventional at the public explanation level:
tree-based ML with XGBoost and LightGBM-style models, optional ensemble and
calibration work, and feature contribution summaries for review. The edge is
not the name of an algorithm. The edge is in the data hygiene, label design,
feature construction, universe filters, validation process, and the habit of
measuring outcomes after publication.</p>
<p>The product side is a Next.js app hosted on Vercel, with Clerk for auth and
Supabase as the product database. The public site exposes the marketing pages,
sample rankings, pricing, and research disclaimers. Authenticated users can
inspect current rankings, historical rankings, performance analytics, model
types, and feature contribution context.</p>
<h2>What Gets Published</h2>
<p>PatternRank intentionally publishes a small, reviewable surface:</p>
<ul>
<li><code>7D All Stocks</code>: short-horizon momentum rankings across NYSE and NASDAQ
equities.</li>
<li><code>7D ETB</code>: an Easy-to-Borrow ranking universe with additional liquidity and
borrowability constraints.</li>
<li><code>1Y2X</code>: a long-horizon breakout model in beta.</li>
<li><code>Crater</code>: a short-side model marked as coming soon, not the core public story.</li>
</ul>
<p>For this article, the main story is only the US 7D momentum product. India,
crypto, and other experiments are outside the scope. They exist as research
history, not as the public PatternRank positioning.</p>
<figure><img src="/articles/building-patternrank/sample.png" alt="PatternRank sample rankings visual with completed 7-day performance context"><figcaption><p>The sample page shows completed rankings with confidence, entry context,
model type, and measured 7-day outcome.</p></figcaption></figure>
<h2>Measurement Without Overclaiming</h2>
<p>PatternRank's public performance data should be read as historical ranking
outcome analysis, not realized trading performance and not a promise about
future returns.</p>
<p>As of this draft, the public performance API reports measured 7-day outcomes
from January 1, 2026 through June 10, 2026, covering 160 days.</p>
<table><thead><tr><th>Model</th><th style="text-align:right">Signals</th><th style="text-align:right">Avg 7D outcome</th><th style="text-align:right">Win rate</th><th style="text-align:right">Sharpe</th><th style="text-align:right">Max drawdown</th><th style="text-align:right">Cumulative return</th></tr></thead><tbody><tr><td>7D All Stocks</td><td style="text-align:right">897</td><td style="text-align:right">3.12%</td><td style="text-align:right">52.4%</td><td style="text-align:right">2.64</td><td style="text-align:right">-7.67%</td><td style="text-align:right">58.9%</td></tr><tr><td>7D ETB</td><td style="text-align:right">627</td><td style="text-align:right">2.98%</td><td style="text-align:right">53.6%</td><td style="text-align:right">2.19</td><td style="text-align:right">-6.61%</td><td style="text-align:right">74.5%</td></tr></tbody></table>
<p>Cumulative return is compounded across all signals in the period.</p>
<p>Those numbers are useful because they force the product to be accountable to
completed rankings. They are also intentionally incomplete if read outside the
research context. They do not include a user's sizing, entries, exits, taxes,
broker constraints, slippage, or personal risk profile. PatternRank does not
make those decisions.</p>
<p>One interesting detail is that both distributions are right-skewed. The median
7-day outcome sits below 1% for both models (0.62% for All Stocks and 0.73%
for ETB), while the mean is pulled higher by a smaller number of large winners.
The ETB universe shows this more strongly: its outcome distribution carries a
pronounced positive skew, with upside outliers significantly larger than the
downside tail. That asymmetry is why the compounded cumulative return for ETB
(74.5%) runs well ahead of All Stocks (58.9%) despite a slightly lower average
per signal. It is also exactly why I care more about drawdown, stability, and
evaluation process than headline return screenshots. A research tool should help
users understand the shape of the distribution, not just celebrate the best
examples.</p>
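<p>The gap between mean and median is easy to reproduce on a toy sample. The outcomes below are invented to show the effect and are not PatternRank data; <code>summarize</code> is a hypothetical helper.</p>
<pre><code class="language-python">import math
import statistics

def summarize(outcomes):
    """Mean, median, and compounded return for fractional outcomes."""
    return {
        "mean": statistics.mean(outcomes),
        "median": statistics.median(outcomes),
        # Compounding multiplies the per-signal growth factors together.
        "compounded": math.prod(1 + r for r in outcomes) - 1,
    }

# Five small outcomes and one large winner (illustrative values only).
sample = [-0.03, -0.01, 0.005, 0.007, 0.01, 0.20]
summary = summarize(sample)
for name, value in summary.items():
    print(name, round(value, 4))
</code></pre>
<p>One large winner is enough to lift the mean well above the median, which is why a single average says little about what a typical ranking did.</p>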
<h2>Product Decisions</h2>
<p>The product decisions came from the research constraints.</p>
<p>First, the UI needed to separate ranking from recommendation. The dashboard
shows model type, confidence, rank, entry context, and completed outcome. It
does not tell a user what to buy, how much to size, or when to exit.</p>
<p>Second, historical review needed to stay open. Completed performance is
available without signup because trust should start with inspectable outcomes,
not gated claims.</p>
<p>Third, model coverage had to be explicit. A 7D All Stocks ranking and a 7D ETB
ranking are not the same product surface. The ETB filter exists for researchers
who want a more conservative universe with additional liquidity and
borrowability constraints.</p>
<p>Fourth, feature contributions had to be visible enough to support review, but
not so detailed that the product becomes a leak of the entire research system.
The product can show the top factors behind a ranking while keeping exact
feature definitions, transformations, thresholds, and label construction
private.</p>
<p>Finally, the legal language is part of the product. PatternRank is presented as
educational analytics and research infrastructure. It is not investment advice,
not personalized guidance, and not a trade recommendation service.</p>
<h2>What I Would Build Again</h2>
<p>The best decision was treating the research system as a product from the start.</p>
<p>That meant the unglamorous pieces mattered early: auth, plans, sample pages,
model configuration, daily refreshes, materialized views, performance APIs,
legal pages, and language that did not overstate what the models do.</p>
<p>It also meant the research code had to serve a repeatable workflow. The daily
pipeline is not just a script that prints tickers. It prepares the universe,
scores candidates, selects rankings, publishes structured data, updates
completed outcomes, and keeps the product-facing dashboard current.</p>
<p>That is the real lesson from PatternRank: the hard part is not getting a model
to output a score. The hard part is building a system around the score so it can
be reviewed, challenged, measured, and improved without becoming a black-box
recommendation engine.</p>
<p>Research systems become products when they make discipline easier to repeat.
That is what PatternRank is trying to do.</p>]]></content:encoded>
            <author>keyur@keyval.me (Keyur Raval)</author>
        </item>
        <item>
            <title><![CDATA[Shipping VibeChess: A Mobile Chess Product, Not Just a Demo]]></title>
            <link>https://keyval.me/articles/shipping-vibechess-mobile-chess-product</link>
            <guid isPermaLink="false">https://keyval.me/articles/shipping-vibechess-mobile-chess-product</guid>
            <pubDate>Mon, 22 Jun 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>A chess puzzle demo is a screen. A chess product is a loop: start fast, solve
fast, get feedback fast, and have a reason to come back tomorrow.</p>
<p>VibeChess started from a simple game premise — mate-in-one puzzles, timed,
head-to-head. The chess logic was never the hard part. The hard parts were
everything around it: adaptive difficulty, real-time match infrastructure, Elo
progression, bot fallback when the player pool is thin, subscriptions, crash
reporting, streak tracking, account deletion, and making all of it hold together
on iOS and Android without embarrassing failures.</p>
<p>This is a case study on what it takes to turn a chess game idea into a
shippable mobile product.</p>
<h2>The Core Loop</h2>
<p>VibeChess is built around two modes: practice and duels.</p>
<p>Practice mode serves mate-in-one puzzles from a database of more than 650,000
positions. Difficulty adapts based on the player's puzzle performance, so the
training session stays challenging without becoming discouraging. Players track
streaks, earn milestone rewards, and work through daily puzzle challenges. A
freeze mechanic lets players protect their streak on missed days — which keeps
the habit-building feel without turning a single missed day into a reason to
quit.</p>
<p>Duel mode puts two players against each other on the same position
simultaneously. The player who spots the winning move first takes the point. The
Elo system translates results into a visible rating, with leaderboards and match
history giving the competitive context to care about the number.</p>
<p>Guest mode gives players a path into the core loop without creating an account.
That is a UX decision and a state-management decision at the same time. Guest
progress has to feel real without leaking into account-specific state when the
player later signs up.</p>
<figure><img src="/articles/shipping-vibechess/chessup-adaptive-puzzle.jpeg" alt="VibeChess adaptive puzzle screen showing a mate-in-one position"><figcaption><p>Practice mode adapts puzzle difficulty to the player's performance, keeping
training sessions challenging without becoming discouraging.</p></figcaption></figure>
<h2>Real-Time Was the Hardest Part</h2>
<p>Real-time multiplayer sounds like a feature. It is a different category of
engineering problem.</p>
<p>The match lifecycle has clear stages: queue entry, matchmaking, match found,
move submission, game update, result calculation. Writing out those stages takes
twenty seconds. Making them work reliably across two mobile clients over flaky
connections takes considerably longer.</p>
<p>The structural problem is that real-time bugs are hard to reproduce. They depend
on timing, network conditions, and two simultaneous client states. A unit test
can verify that a single move submission returns the right result. It cannot
simulate what happens when both players submit at the same millisecond, one
client drops mid-match, or a device rotates and triggers a widget rebuild while
a socket event is in flight.</p>
<p>The approach that worked was treating the server as the authoritative source of
truth at every point. The backend holds game state. The Flutter client holds
display state. Reconciling those two views — and recovering cleanly when they
diverge — is where the real work lived.</p>
<p>Two specific problems took the longest to resolve. The first was disconnect
handling: what the server should do when a player disappears mid-match, and what
the client should show when the connection drops during an active game. The
second was move validation under concurrency: ensuring that near-simultaneous
submissions from both players resulted in a deterministic, fair outcome rather
than a corrupted game state.</p>
<p>Neither problem was exotic. Both were well understood by the time we shipped.
The underestimated part was the time it took to surface them through testing,
because they required real concurrent clients rather than isolated unit tests.
You cannot fully simulate two mobile users on the same match. You have to run
it.</p>
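<p>The concurrency rule itself is small once the server owns the decision. The sketch below uses Python purely for illustration, since the article does not name the backend stack; <code>DuelRound</code> and its method names are hypothetical. A lock makes the check and the write a single step, so only the first correct submission can win.</p>
<pre><code class="language-python">import threading

class DuelRound:
    """Server-side state for one puzzle shared by two players."""

    def __init__(self, winning_moves):
        self._winning_moves = set(winning_moves)
        self._winner = None
        self._lock = threading.Lock()

    def submit(self, player_id, move):
        """Return True only for the first correct submission."""
        with self._lock:
            if self._winner is not None or move not in self._winning_moves:
                return False
            self._winner = player_id
            return True

    @property
    def winner(self):
        return self._winner

duel = DuelRound(winning_moves={"Qh7#"})
results = {}

def play(player_id, move):
    results[player_id] = duel.submit(player_id, move)

threads = [threading.Thread(target=play, args=(p, "Qh7#")) for p in ("alice", "bob")]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(duel.winner, results)
</code></pre>
<p>Whichever thread takes the lock first wins, and the other receives a clean rejection instead of a half-written game state. Which player that is depends on scheduling, which is exactly why this kind of path has to be exercised with real concurrent clients.</p>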
<figure><img src="/articles/shipping-vibechess/chessup-1v1-match.jpeg" alt="VibeChess 1v1 duel screen showing a live match in progress"><figcaption><p>The duel screen keeps both players' state visible in real time. The server
is the authoritative source of truth; the client handles display.</p></figcaption></figure>
<h2>Bot Fallback</h2>
<p>Real-time duel mode works well when the player pool is large enough to find a
fair match quickly. VibeChess launched without a large player pool.</p>
<p>The solution was a bot fallback: after fifteen seconds in the queue without a
human match, the system pairs the player with an AI opponent calibrated to their
current Elo. The duel loop stays intact. The player gets a real-time experience.
The match still feels competitive because the difficulty tracks the player's
rating.</p>
<p>This decision mattered more than it looked on paper. Without it, early users
landing in duel mode would sit in an empty queue and leave. With bot fallback,
the multiplayer promise is always kept, even at low traffic. It is not a
workaround. It is the architecture decision that protects the product until a
real player pool forms.</p>
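<p>The fallback can be sketched as a queue read with a timeout. This is an illustration only: <code>find_opponent</code> and the returned dictionaries are hypothetical, Python stands in for whatever the backend uses, and real matchmaking would also filter human opponents by rating.</p>
<pre><code class="language-python">import queue

BOT_FALLBACK_SECONDS = 15  # wait described in the article

def find_opponent(waiting_players, player_elo, timeout=BOT_FALLBACK_SECONDS):
    """Pair with a waiting human, or fall back to a bot at the player's rating."""
    try:
        opponent_id = waiting_players.get(timeout=timeout)
        return {"kind": "human", "id": opponent_id}
    except queue.Empty:
        # Nobody joined in time: keep the duel loop alive with a bot.
        return {"kind": "bot", "elo": player_elo}

waiting = queue.Queue()
# A short timeout keeps the demonstration fast; production would use the default.
print(find_opponent(waiting, player_elo=1200, timeout=0.05))

waiting.put("player-42")
print(find_opponent(waiting, player_elo=1200, timeout=0.05))
</code></pre>
<p>The caller receives an opponent either way, so the rest of the match lifecycle does not need a separate code path for an empty queue.</p>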
<h2>Progress Systems</h2>
<p>Elo ratings are useful for fair matchmaking. They are also UX. A number that
responds to match results gives the player a reason to care about each game
beyond the game itself. VibeChess surfaces the Elo change after every match,
maintains leaderboards, and stores match history so players can review completed
games.</p>
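<p>For readers who have not seen it, this is a minimal sketch of the standard Elo update, assuming a fixed K-factor of 32. The K-factor and the function names are illustrative and not necessarily what VibeChess ships with.</p>
<pre><code class="language-ts">// Probability-like expected score for a player against an opponent.
function expectedScore(rating: number, opponent: number): number {
  return 1 / (1 + Math.pow(10, (opponent - rating) / 400))
}

// score: 1 for a win, 0.5 for a draw, 0 for a loss.
// k controls how far one result moves the rating (assumed default).
function updateElo(rating: number, opponent: number, score: 0 | 0.5 | 1, k = 32): number {
  return Math.round(rating + k * (score - expectedScore(rating, opponent)))
}

// The delta is the number shown to the player after the match.
const before = 1200
const after = updateElo(before, 1200, 1) // 1216 for evenly rated players
const delta = after - before // 16
</code></pre>
<p>Beating a higher-rated opponent moves the number more than beating an equal one, which is part of why the post-match change feels earned.</p>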
<p>The streak system — with freeze recovery — works on the same principle. Streaks
build a daily habit. The freeze mechanic reduces the punishment for a missed day
from a full reset to something recoverable. Small product decisions at this
level make a measurable difference in whether players return.</p>
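<p>A streak with freeze recovery can be sketched in a few lines. The rules here are assumptions for illustration: days are integer day indexes, one freeze covers one missed day, and a gap with too few freezes resets the streak to 1. The real VibeChess rules may differ.</p>
<pre><code class="language-ts">interface StreakState {
  count: number // current streak length in days
  lastActiveDay: number // integer day index of the last counted activity
  freezes: number // freezes the player still holds
}

function recordActivity(s: StreakState, today: number): StreakState {
  const gap = today - s.lastActiveDay
  if (0 >= gap) return s // already counted today
  if (gap === 1) {
    return { ...s, count: s.count + 1, lastActiveDay: today }
  }
  const missed = gap - 1
  if (s.freezes >= missed) {
    // Spend one freeze per missed day and keep the streak alive.
    return { count: s.count + 1, lastActiveDay: today, freezes: s.freezes - missed }
  }
  // Not enough freezes: full reset, starting again from today.
  return { count: 1, lastActiveDay: today, freezes: s.freezes }
}
</code></pre>
<p>The only difference between a punishing streak and a forgiving one is the middle branch.</p>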
<figure><img src="/articles/shipping-vibechess/chessup-elo-history.jpeg" alt="VibeChess Elo rating history chart showing rating progression over time"><figcaption><p>Elo history gives players a visible record of their progression over
time.</p></figcaption></figure>
<h2>Shipping as a Real Product</h2>
<p>The Flutter app and backend are only part of the product. The rest is the
operational layer that lets a stranger download the app and use it without
hitting a broken state:</p>
<ul>
<li>Subscriptions enforced consistently across app and backend, with server-side
validation so limits cannot be bypassed at the client</li>
<li>Push notifications for daily puzzle reminders and match activity</li>
<li>Crash reporting to surface production failures before they compound</li>
<li>Analytics to understand how players actually move through the app</li>
<li>In-app review prompts placed at high-engagement moments</li>
<li>Privacy policy, terms of service, support contact, and account deletion</li>
</ul>
<p>Account deletion deserves specific mention. It is often the last feature added
and the first thing flagged in an app store review. Building it correctly means
deleting or anonymizing all user-linked data, not just the auth record. Both app
stores require it. Users expect it. It is not optional.</p>
<h2>What Changed After Real Users Arrived</h2>
<p>The plan before launch covers what you think players will do. Analytics after
launch covers what they actually do.</p>
<p>The highest-signal post-launch inputs were session recordings, event funnels,
and direct user feedback. Each surfaced something the pre-launch model missed:
where players dropped out of onboarding, which difficulty range caused the most
quits, how long players waited in the duel queue before abandoning, and which UI
moments caused visible confusion.</p>
<p>The work after launch was iteration: watch the data, form a hypothesis, change
one thing, watch again. That loop — not the initial build — is where the product
became more useful. The core loop was always sound. The surrounding systems
needed real usage to reveal where friction lived.</p>
<h2>Lessons</h2>
<p><strong>Plan for low liquidity before you have users.</strong> A multiplayer product with no
player pool is not a multiplayer product. Bot fallback is the architecture
decision that keeps the promise until traffic arrives.</p>
<p><strong>Guest mode is a state-management problem, not just an acquisition strategy.</strong>
Guest progress needs to feel real. The transition to an authenticated account
needs to be clean. Both require explicit design; neither works by default.</p>
<p><strong>Real-time features should be scoped around the product loop.</strong> The real-time
engineering in VibeChess exists to serve mate-in-one duels, not to demonstrate
technical capability. Socket.IO is in the stack because the product required it,
not the other way around.</p>
<p><strong>A mobile app is not shipped until the operational layer is real.</strong>
Subscriptions, support, privacy, deletion, analytics, and crash reporting are
not finishing touches. They are the product surface that strangers encounter.</p>
<p><strong>Post-launch iteration requires instrumentation before launch.</strong> You cannot
iterate on data you did not collect. Analytics and crash reporting need to be
wired up before the first real user arrives, not after the first complaint.</p>
<hr>
<p>VibeChess is available on the
<a href="https://apps.apple.com/us/app/vibechess-puzzles/id6744912659">App Store</a> and
<a href="https://play.google.com/store/apps/details?id=ai.vibeship.chessup">Google Play</a>.</p>]]></content:encoded>
            <author>keyur@keyval.me (Keyur Raval)</author>
        </item>
        <item>
            <title><![CDATA[Making Multi-Chain a Config Problem, Not a Codebase Problem]]></title>
            <link>https://keyval.me/articles/talis-multichain-config-driven-architecture</link>
            <guid isPermaLink="false">https://keyval.me/articles/talis-multichain-config-driven-architecture</guid>
            <pubDate>Mon, 22 Jun 2026 00:00:00 GMT</pubDate>
            <content:encoded><![CDATA[<p>The first extra chain is usually an integration. The third one is an
architecture problem.</p>
<p>I worked on a decentralized NFT marketplace built on the Cosmos ecosystem. The
product lets creators launch full collections without deploying their own
contracts, and enforces royalties on secondary sales directly on-chain — so
creators receive a share of every resale automatically, without relying on
platform policy. Each chain it runs on has its own community, its own market,
and its own set of creators and collectors.</p>
<p>That last part is what made multi-chain real rather than cosmetic. The
marketplace was not porting a single audience across chains. It was serving
distinct communities on Terra 1, Terra 2, Injective, Xion, and Orai, each
expecting the same quality product on their chain of choice.</p>
<p>The codebase, though, had not been designed for multi-chain. It had grown into
it, one chain at a time, in the way that feels expedient at the start and
unsustainable by the end.</p>
<h2>The Before State</h2>
<p>The original approach to adding a new chain was the obvious one: copy the
existing codebase, update the chain-specific constants, adjust the environment
variables, and maintain a separate GitHub repository per chain.</p>
<p>That works for the second chain. By the fifth, it had created a set of problems
that cost more than designing for multi-chain up front would have.</p>
<p><strong>Diverging codebases.</strong> Each repository started as a copy of the others but
drifted independently. A bug fixed in one chain's repo had to be manually
identified, ported, and verified across every other repo. A new feature built
for Injective was not automatically available on Xion. The repositories had
started as siblings and turned into distant cousins.</p>
<p><strong>Merge overhead.</strong> Any shared improvement — a UI fix, a contract interaction
change, a new wallet adapter — required opening pull requests against multiple
repositories. Each merge carried its own review, its own CI run, its own
chance of a conflict. The larger the shared change, the more painful the
process.</p>
<p><strong>Separate build and deploy environments.</strong> Five chains meant five sets of
environment configuration, five build pipelines, and five separate deployments
to coordinate. A release that touched shared logic required shipping five
times.</p>
<p><strong>Regressions through omission.</strong> The most common failure was not a
wrong code change — it was a missing one. A fix merged to Terra 2 and
forgotten on Orai. A feature enabled on Injective that never reached the other
chains. The bugs were not from bad code. They were from the overhead of
keeping five copies of the same application consistent by hand.</p>
<h2>The Architecture Change</h2>
<p>The goal was to move from one repo per chain to one repo that understood all
chains through configuration. That required solving two distinct problems: what
goes in the config, and what the application code does with it.</p>
<h3>The Chain Config Schema</h3>
<p>Every chain needed a typed configuration entry that captured everything the
application needed to know about it:</p>
<pre><code class="language-ts">interface ChainConfig {
  chainId: string
  chainName: string
  bech32Prefix: string
  rpc: string // RPC endpoint
  lcd: string // LCD (REST) endpoint
  nativeCurrency: {
    denom: string
    decimals: number
    display: string
  }
  contracts: {
    marketplace: string
    nftFactory: string
  }
  // Capability flags read by product code
  features: {
    auctions: boolean
    lazyMinting: boolean
    collections: boolean
  }
  explorerUrl: string
  wallets: WalletAdapter[] // WalletAdapter type is not shown here
}
</code></pre>
<p>Each of the five chains became an entry in a validated config map. Contract
addresses, RPC and LCD endpoints, bech32 prefixes, native currency details,
supported feature flags, and wallet adapters all lived in one place per chain.
When something needed to change — an RPC endpoint, a redeployed contract
address — it changed in one file, in one pull request, reviewed once.</p>
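<p>A minimal sketch of what "validated" can mean for that map. It uses a trimmed copy of the config fields, and the specific checks (map key matches <code>chainId</code>, endpoints are https, contract addresses carry the chain's bech32 prefix) are assumptions chosen for illustration, not the marketplace's actual validation rules.</p>
<pre><code class="language-ts">// Trimmed copy of the ChainConfig fields checked in this sketch.
interface ChainEntry {
  chainId: string
  bech32Prefix: string
  rpc: string
  lcd: string
  contracts: { marketplace: string; nftFactory: string }
}

type ChainMap = { [key: string]: ChainEntry }

// Collect every problem for one entry instead of stopping at the first.
function validateChain(key: string, c: ChainEntry): string[] {
  const errors: string[] = []
  if (c.chainId !== key) {
    errors.push(`${key}: chainId "${c.chainId}" does not match map key`)
  }
  for (const url of [c.rpc, c.lcd]) {
    if (!url.startsWith("https://")) {
      errors.push(`${key}: endpoint "${url}" is not https`)
    }
  }
  for (const [name, addr] of Object.entries(c.contracts)) {
    // bech32 addresses start with the prefix followed by the "1" separator
    if (!addr.startsWith(c.bech32Prefix + "1")) {
      errors.push(`${key}: ${name} address lacks prefix "${c.bech32Prefix}1"`)
    }
  }
  return errors
}

// Call once at startup; a bad entry stops the app before anything renders.
function validateChains(chains: ChainMap): void {
  const errors: string[] = []
  for (const [key, c] of Object.entries(chains)) {
    errors.push(...validateChain(key, c))
  }
  if (errors.length > 0) {
    throw new Error("Invalid chain config:\n" + errors.join("\n"))
  }
}
</code></pre>
<p>The prefix check is the kind of rule that catches a copy-paste from another chain's entry, which was the most likely mistake once all five chains lived in one file.</p>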
<h3>Capability Flags Over Chain Checks</h3>
<p>The second problem was how product code consumed this config. The original
codebase was full of chain-specific conditionals: checks that tested whether
the active chain was Injective, or Terra 2, or Xion, and branched accordingly.</p>
<p>That pattern couples product logic to network identity in a way that does not
scale. Every new chain requires finding every branch and deciding whether it
applies. Every removed chain leaves dead branches behind.</p>
<p>The replacement was capability-driven logic. Product code asks whether a feature
is supported, not which chain is active:</p>
<pre><code class="language-ts">// before
if (chainId === 'injective-1' || chainId === 'xion-1') {
  showAuctionUI()
}

// after
if (chain.features.auctions) {
  showAuctionUI()
}
</code></pre>
<p>The chain config owns the decision. The product code owns the behavior. Adding
a new chain does not require reading all the old chain checks to decide which
ones apply.</p>
<h3>Wallet Adapters</h3>
<p>The Cosmos ecosystem has a common wallet standard, but each chain needs to be
registered separately with wallets like Keplr. That registration includes
chain-specific details: the chain ID, the bech32 prefix, gas settings, currency
info, and RPC endpoint.</p>
<p>Rather than handling this ad hoc in each integration, the wallet adapter layer
read directly from the chain config. Registering a new chain with any wallet
became a function of the config entry, not a manual one-off step.</p>
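<p>As a rough sketch, deriving a Keplr-style chain registration from a config entry could look like the following. The output field names follow Keplr's chain info shape as I understand it and should be checked against the current Keplr documentation. The <code>coinType</code> parameter is not part of the config schema above, and treating the native currency as both fee and staking currency is an assumption. Gas price settings are left out.</p>
<pre><code class="language-ts">// Trimmed copy of the ChainConfig fields used in this sketch.
interface ChainBasics {
  chainId: string
  chainName: string
  bech32Prefix: string
  rpc: string
  lcd: string
  nativeCurrency: { denom: string; decimals: number; display: string }
}

// coinType is the BIP-44 coin type; it is passed in because the schema
// shown earlier has no field for it (hypothetical parameter).
function toKeplrChainInfo(chain: ChainBasics, coinType: number) {
  const p = chain.bech32Prefix
  const currency = {
    coinDenom: chain.nativeCurrency.display,
    coinMinimalDenom: chain.nativeCurrency.denom,
    coinDecimals: chain.nativeCurrency.decimals,
  }
  return {
    chainId: chain.chainId,
    chainName: chain.chainName,
    rpc: chain.rpc,
    rest: chain.lcd,
    bip44: { coinType },
    // Conventional Cosmos SDK prefix suffixes, derived from one base prefix
    bech32Config: {
      bech32PrefixAccAddr: p,
      bech32PrefixAccPub: `${p}pub`,
      bech32PrefixValAddr: `${p}valoper`,
      bech32PrefixValPub: `${p}valoperpub`,
      bech32PrefixConsAddr: `${p}valcons`,
      bech32PrefixConsPub: `${p}valconspub`,
    },
    currencies: [currency],
    feeCurrencies: [currency], // assumption: fees are paid in the native currency
    stakeCurrency: currency, // assumption: native currency is the staking token
  }
}
</code></pre>
<p>The resulting object is the kind of value passed to Keplr's <code>experimentalSuggestChain</code>. Other wallets need their own mapping, but each mapping reads from the same config entry.</p>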
<h3>The Integration Checklist</h3>
<p>Typing the config and moving chain logic into adapters created a side effect
worth naming: the integration surface for a new chain became explicit. Adding
a chain meant filling in a config entry and walking a known checklist — contract
addresses, RPC health check, wallet registration, feature flag decisions, explorer
link, and a smoke test against each capability. The checklist was the same
every time. That made it reviewable, delegatable, and faster to complete
without missing anything.</p>
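<p>Because the capabilities live in the config, the checklist itself can be generated from it. This is a hypothetical sketch: the fixed items mirror the list above, one smoke-test item is added per enabled feature flag, and the <code>buildChecklist</code> name and output format are made up for illustration.</p>
<pre><code class="language-ts">// Trimmed copy of the ChainConfig fields used in this sketch.
interface ChainSummary {
  chainId: string
  features: { auctions: boolean; lazyMinting: boolean; collections: boolean }
}

function buildChecklist(chain: ChainSummary): string[] {
  const items = [
    "contract addresses filled in",
    "RPC health check passes",
    "wallet registration works",
    "feature flags decided",
    "explorer link resolves",
  ]
  // One smoke test per capability the chain claims to support.
  for (const [name, enabled] of Object.entries(chain.features)) {
    if (enabled) items.push(`smoke test: ${name}`)
  }
  return items.map((item) => `[${chain.chainId}] ${item}`)
}
</code></pre>
<p>A chain with auctions disabled gets a shorter list, so nobody spends time testing a capability the config says is off.</p>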
<h2>What the Merge Problem Looked Like in Practice</h2>
<p>The clearest way to describe the cost of the old approach is to describe a
specific failure mode: shipping the same fix to five repos.</p>
<p>A bug in the marketplace interaction code would be identified, diagnosed, and
fixed in whichever chain's repo it was first noticed. The fix would be merged
and deployed. Then the question was: does this fix apply to the other chains? If
yes — and it usually did, because most logic was shared — someone had to open
four more pull requests, against four more repos, with four more CI runs, and
four more deployments.</p>
<p>That process is not hard. It is just friction that accumulates. Over time, it
trains you to delay fixes, to bundle changes, or to skip the cross-chain work
on the assumption that someone else will catch it. Some fixes never made it
across all five repos.</p>
<p>In the unified codebase, that class of problem stops existing. A fix merged
once is deployed everywhere. There is no omission case because there is no
separate codebase to forget.</p>
<p>For a marketplace where royalty enforcement is the core creator promise, that
reliability matters more than it would in a typical product. A bug that reached
only two of five chains was still a broken promise to creators on those chains.</p>
<h2>Product Decisions</h2>
<p><strong>Make config the boundary, not the escape valve.</strong> A chain config is only
useful if the application actually reads from it rather than working around it
with in-code constants. The discipline of routing all chain-specific state
through the config is what gives the architecture leverage.</p>
<p><strong>Type the config and validate it at startup.</strong> A loose config object
surfaces mistakes at runtime, often in the wrong environment. A typed schema
catches shape errors at build time, and startup validation catches bad values
before any user hits them.</p>
<p><strong>Keep escape hatches small and explicit.</strong> Some chains will have behavior that
does not fit the shared model. That is acceptable. The goal is not zero
per-chain code — it is a small, explicit, named integration surface rather than
scattered conditionals throughout the app.</p>
<p><strong>Build the checklist before you need it.</strong> The integration checklist for a new
chain is most useful the first time you try to add one. Writing it after the
fact, under pressure, from memory, is a good way to miss something.</p>
<h2>Lessons</h2>
<p><strong>Multi-chain is a codebase problem before it is a blockchain problem.</strong> The
Cosmos chains in this project are similar enough at the protocol level that
the engineering differences are manageable. The unsustainable part was five
separate repositories, not five separate chains.</p>
<p><strong>Separate repos are a high-overhead way to handle configuration differences.</strong>
Environment variables and chain-specific constants are not a good reason to
fork a codebase. A typed config entry is almost always the right level of
separation.</p>
<p><strong>The test matrix becomes explicit when the integration surface does.</strong> Before
the refactor, it was unclear what needed to be verified when adding a chain.
After it, the checklist was the test matrix. That clarity alone made new
integrations faster and safer.</p>
<p><strong>The goal is a smaller, predictable, reviewable integration surface.</strong> Not zero
code per chain. Not a framework that handles every possible variation. Just a
model where the cost of the next chain is measurable, bounded, and the same as
the one before it.</p>]]></content:encoded>
            <author>keyur@keyval.me (Keyur Raval)</author>
        </item>
    </channel>
</rss>