Phase 3 · Strategy Development · Lesson 19

Vectorised Backtesting Engine

Build a fast, vectorised backtesting engine using Pandas and NumPy: no per-candle loops in the return calculation, built-in protection against look-ahead bias, and a complete equity curve in well under a second.

~50 min
NumPy · Pandas
Phase 3 · Lesson 2 of 8

How Vectorised Backtesting Works

Instead of looping through candles one by one, vectorised backtesting produces a position series — a column of +1 (long), -1 (short), or 0 (flat) — across the entire historical DataFrame at once. Returns are then calculated in a single line of Pandas arithmetic.
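
As a minimal sketch of that idea, the toy example below uses five made-up closing prices and a hand-written position column (both are illustrative assumptions, not lesson data) and computes every candle's strategy return in one expression.

```python
import pandas as pd

# Hypothetical closing prices and an already-aligned position column
# (+1 long, -1 short, 0 flat); values are illustrative only.
close = pd.Series([100.0, 102.0, 101.0, 103.0, 103.0])
position = pd.Series([0, 1, 1, -1, 0])

# Per-candle market return; the first candle has no prior close.
market_ret = close.pct_change().fillna(0)

# The "single line": every candle's strategy return at once.
strategy_ret = position * market_ret

# Compound into an equity curve that starts at 1.0.
equity = (1 + strategy_ret).cumprod()
print(equity.round(4).tolist())
```

The position column here is assumed to be already aligned with the candle whose return it earns; the alignment step itself is what the look-ahead rule in the next section is about.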

Look-ahead bias is the #1 backtest killer

Always shift signals by one period before multiplying by returns. A signal generated at the close of candle N can only be acted on from candle N+1 onward, so it must be paired with candle N+1's return, not candle N's. Forgetting .shift(1) can inflate backtested returns by 30–100% in typical strategies.
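
To see the effect in isolation, the sketch below builds a deliberately "cheating" signal from the same candle's return and compares the total return with and without the shift. The random-walk prices and the signal are assumptions made up for this illustration; on real strategies the gap is usually smaller but has the same cause.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed for reproducibility

# Synthetic random-walk closes; purely illustrative, not market data.
close = pd.Series(100 * (1 + rng.normal(0, 0.001, 1_000)).cumprod())
market_ret = close.pct_change().fillna(0)

# A "cheating" signal: built from the return of the very candle
# it will be applied to.
signal = np.sign(market_ret)

# Wrong: the signal earns the return it was computed from.
biased = (signal * market_ret).sum()

# Right: the signal from candle N earns candle N+1's return.
honest = (signal.shift(1).fillna(0) * market_ret).sum()

print(f"without shift: {biased:.4f}")
print(f"with shift:    {honest:.4f}")
```

Without the shift, the signal is always on the right side of every move, so the summed return equals the sum of absolute returns. With the shift, the same signal has no information about the candle it trades.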

BacktestEngine Class

PYTHON · backtest_engine.py
import pandas as pd
import numpy  as np
from dataclasses import dataclass

@dataclass
class BacktestResult:
    equity      : pd.Series    # cumulative equity curve
    positions   : pd.Series    # +1 / -1 / 0
    returns     : pd.Series    # per-candle strategy returns
    trades      : pd.DataFrame # trade log
    params      : dict         # strategy parameters used

    def summary(self) -> dict:
        from performance import PerformanceReport  # built in Lesson 22
        return PerformanceReport(self.equity, self.trades).summary()


class BacktestEngine:
    def __init__(
        self,
        initial_capital : float = 100_000,
        commission_pct  : float = 0.0003,  # 0.03% per side
        slippage_pts    : float = 1.0,      # 1 point per trade
        lot_size        : int   = 50,        # NIFTY lot
    ):
        self.initial_capital = initial_capital
        self.commission_pct  = commission_pct
        self.slippage_pts    = slippage_pts
        self.lot_size        = lot_size

    def run(
        self,
        df          : pd.DataFrame,
        signal_fn,                     # callable: df → Series of signals
        params      : dict = None,
    ) -> BacktestResult:
        """
        signal_fn(df) must return a pd.Series with values:
          1  = go / stay LONG
         -1  = go / stay SHORT
          0  = flat / exit
        """
        df = df.copy()

        # 1. Generate raw signals (not yet shifted)
        df["signal"] = signal_fn(df)

        # 2. Shift by 1: a signal from candle N's close is first held during
        #    candle N+1, so it earns the close(N) → close(N+1) return
        df["position"] = df["signal"].shift(1).fillna(0)

        # 3. Raw close-to-close returns per candle (first row has no prior close)
        df["ret"] = df["close"].pct_change().fillna(0)

        # 4. Strategy return = position × market return
        df["strat_ret"] = df["position"] * df["ret"]

        # 5. Apply per-side transaction costs on position changes.
        #    turnover is 1 for an entry or exit and 2 for a long↔short flip
        turnover = df["position"].diff().abs().fillna(0)
        cost_per_side = (
            self.commission_pct +
            self.slippage_pts / df["close"]
        )
        df["strat_ret"] -= turnover * cost_per_side

        # 6. Equity curve
        df["equity"] = self.initial_capital * (
            1 + df["strat_ret"]
        ).cumprod()

        # 7. Extract trade log
        trades = self._extract_trades(df)

        return BacktestResult(
            equity    = df["equity"],
            positions = df["position"],
            returns   = df["strat_ret"],
            trades    = trades,
            params    = params or {},
        )

    def _extract_trades(self, df: pd.DataFrame) -> pd.DataFrame:
        """Build a trade log from position changes.

        A position first held on candle N+1 was entered at candle N's close,
        so fills use the previous close to stay consistent with the equity
        curve. Trades still open on the last candle are not logged.
        """
        pos     = df["position"]
        fill    = df["close"].shift(1)   # price at which the change took effect
        changes = df.index[pos.diff().fillna(0) != 0]

        trades = []
        open_trade = None

        for ts in changes:
            new_pos, price = pos.loc[ts], fill.loc[ts]

            # Any change while a trade is open is an exit or a reversal
            if open_trade is not None:
                d  = open_trade["direction"]
                ep = open_trade["entry_price"]
                pnl = (price - ep) if d == "LONG" else (ep - price)
                trades.append({**open_trade,
                    "exit_time" : ts,
                    "exit_price": price,
                    "pnl_pts"   : round(pnl, 2),
                    "pnl_rs"    : round(pnl * self.lot_size, 2),
                })
                open_trade = None

            # Open a new trade on an entry or on the far side of a reversal
            if new_pos != 0:
                open_trade = {
                    "entry_time" : ts,
                    "direction"  : "LONG" if new_pos > 0 else "SHORT",
                    "entry_price": price,
                }

        return pd.DataFrame(trades)
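
Step 5 of run() converts commission and slippage into a fraction of price. The worked example below uses the engine's default values on a flat, made-up price of 20,000 so the arithmetic is easy to check. It counts costs per side, so a direct long-to-short flip is charged as two sides; that convention, and the sample data, are assumptions of this sketch.

```python
import pandas as pd

commission_pct = 0.0003   # engine default: 0.03% per side
slippage_pts = 1.0        # engine default: 1 point per fill

# Hypothetical closes and positions, for illustration only.
close = pd.Series([20_000.0] * 5)
position = pd.Series([0, 1, 1, -1, 0])

# Cost of one side as a fraction of price: 0.0003 + 1 / 20000 = 0.00035
cost_per_side = commission_pct + slippage_pts / close

# Sides traded per candle: 1 for an entry or exit, 2 for a flip.
turnover = position.diff().abs().fillna(0)

cost = turnover * cost_per_side
print(cost.tolist())
print(f"total drag: {cost.sum():.5f}")
```

Four sides are traded in total (entry, a two-sided flip, exit), so the cost drag over this short sequence is 4 × 0.035% = 0.14% of capital.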

Writing signal_fn for the Engine

The engine accepts any callable that takes a DataFrame and returns a pd.Series of +1/0/−1. This keeps strategy logic separate from backtest machinery.

PYTHON · signal functions
import pandas as pd
from indicators import ema                    # only ema is used below
from backtest_engine import BacktestEngine

def ema_cross_signal(df: pd.DataFrame, fast=9, slow=21) -> pd.Series:
    """Returns +1 (long) / -1 (short) / 0 (flat)."""
    c     = df["close"]
    f     = ema(c, fast)
    s     = ema(c, slow)
    trend = ema(c, 200)

    sig = pd.Series(0, index=df.index)
    sig[f > s] =  1   # fast above slow → long
    sig[f < s] = -1   # fast below slow → short

    # Filter: only trade in direction of trend
    sig[(sig ==  1) & (c < trend)] = 0
    sig[(sig == -1) & (c > trend)] = 0

    return sig


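# ── Run backtest ──────────────────────────────────────────
# df_nifty_1min is assumed to be a 1-minute DataFrame with a "close" column,
# loaded beforehand. The engine calls signal_fn(df) with no extra arguments,
# so ema_cross_signal runs with its defaults; params is only recorded.
engine = BacktestEngine(initial_capital=100_000, lot_size=50)
result = engine.run(
    df        = df_nifty_1min,
    signal_fn = ema_cross_signal,
    params    = {"fast": 9, "slow": 21},
)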

print(result.equity.iloc[-1])         # final equity
print(result.trades.head())             # first 5 trades
result.equity.plot(title="Equity Curve")
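
Because run() calls signal_fn(df) with the DataFrame as its only argument, non-default parameters have to be bound to the signal function before it is handed over. One way to do that is functools.partial, sketched below. The local ema helper is a stand-in for the lesson's indicators.ema (assumed here to be a plain exponential mean), ema_cross is a simplified crossover without the trend filter, and the price data is synthetic.

```python
from functools import partial

import numpy as np
import pandas as pd


def ema(series: pd.Series, span: int) -> pd.Series:
    # Stand-in for indicators.ema (assumption: plain exponential mean).
    return series.ewm(span=span, adjust=False).mean()


def ema_cross(df: pd.DataFrame, fast: int = 9, slow: int = 21) -> pd.Series:
    # Simplified crossover: +1 when fast > slow, -1 when below, 0 when equal.
    f, s = ema(df["close"], fast), ema(df["close"], slow)
    return pd.Series(np.sign(f - s), index=df.index).astype(int)


# Synthetic closes, for illustration only.
rng = np.random.default_rng(1)
df = pd.DataFrame({"close": 100 + rng.normal(0, 1, 300).cumsum()})

params = {"fast": 5, "slow": 50}
signal_fn = partial(ema_cross, **params)   # now callable as signal_fn(df)

sig = signal_fn(df)
print(sig.value_counts().to_dict())
```

Passing the same params dict to both partial and run() keeps the recorded parameters in step with the ones the signal actually used.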

Performance

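Running this engine on about a year of 1-minute NIFTY data (~96,000 candles at 375 candles per session) takes under 200 ms. A loop-based approach for the same dataset typically takes 20–60 seconds. The only Python loop left in the engine is in _extract_trades, and it iterates over position changes rather than over every candle.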
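
The gap can be checked on any machine with a rough comparison like the one below. It is a sketch on synthetic prices with a random position column (both assumptions), and it times only the equity-curve arithmetic, not a full strategy loop, so the absolute numbers will differ from the figures quoted above.

```python
import time

import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 96_000  # roughly the candle count quoted above

# Synthetic closes and a random, already-shifted position column.
close = pd.Series(100 * np.exp(rng.normal(0, 0.0005, n).cumsum()))
position = pd.Series(rng.choice([-1, 0, 1], size=n))

# Vectorised: whole-column arithmetic.
t0 = time.perf_counter()
vec_equity = (1 + position * close.pct_change().fillna(0)).cumprod()
vec_secs = time.perf_counter() - t0

# Loop: the same calculation, one candle at a time.
t0 = time.perf_counter()
loop_equity = [1.0]
for i in range(1, n):
    r = close.iloc[i] / close.iloc[i - 1] - 1
    loop_equity.append(loop_equity[-1] * (1 + position.iloc[i] * r))
loop_secs = time.perf_counter() - t0

print(f"vectorised: {vec_secs * 1000:.1f} ms, loop: {loop_secs:.2f} s")
print("curves match:", np.allclose(vec_equity.to_numpy(), loop_equity))
```

Both versions produce the same equity curve; only the time taken differs.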

Backtest Pitfalls Checklist

Pitfall | Fix
Look-ahead bias | Always .shift(1) signals before multiplying by returns
Ignoring costs | Add commission + slippage on every position change
Survivorship bias | Include delisted stocks in the universe; don't backtest on current constituents only
Point-in-time data | Use data as it would have appeared at each timestamp (adjusted prices only for dividends)
Overfitting | Covered in Lesson 23; always walk-forward validate parameters