When you download NIFTY historical data, you get thousands of rows — dates, open, high, low, close, volume. Pandas is the standard Python library for handling exactly this kind of tabular data. It's fast, expressive, and purpose-built for financial time series.
```python
import pandas as pd  # Install: pip install pandas

# Create a DataFrame from NIFTY candle data
data = {
    "date": ["2024-01-15", "2024-01-16", "2024-01-17", "2024-01-18"],
    "open": [21450, 21580, 21320, 21410],
    "high": [21620, 21710, 21490, 21560],
    "low": [21390, 21500, 21280, 21350],
    "close": [21580, 21340, 21460, 21510],
    "volume": [142500, 168700, 129300, 155100],
}

df = pd.DataFrame(data)
print(df)
```
Name your DataFrames to match their contents: df for quick single-instrument work, or nifty_df, reliance_df for multi-instrument work. This keeps your code readable at a glance.

Series — Single Column Data
When you select a single column from a DataFrame, you get a Series — a one-dimensional labeled array. This is what you'll work with most when calculating indicators.
```python
# A Series is just one column
close = df["close"]
print(type(close))  # <class 'pandas.core.series.Series'>
print(close)

# Basic stats on the close Series
print(f"Mean : {close.mean():.2f}")
print(f"Max  : {close.max()}")
print(f"Min  : {close.min()}")
print(f"Std  : {close.std():.2f}")
```
Arithmetic on a Series
```python
# Calculate candle body size (vectorised — no loop needed!)
df["body"] = (df["close"] - df["open"]).abs()

# Calculate upper & lower wicks
df["upper_wick"] = df["high"] - df[["open", "close"]].max(axis=1)
df["lower_wick"] = df[["open", "close"]].min(axis=1) - df["low"]

# Candle direction
df["bullish"] = df["close"] > df["open"]

print(df[["date", "body", "upper_wick", "lower_wick", "bullish"]])
```
Selecting & Slicing Data
Pandas gives you multiple ways to select data. In trading code you'll use these constantly — filtering signals, pulling last N candles, or accessing specific dates.
| Method | Use | Example |
|---|---|---|
| df["col"] | Select column (Series) | df["close"] |
| df[["c1","c2"]] | Select multiple columns | df[["open","close"]] |
| df.iloc[n] | Row by integer position | df.iloc[-1] → last row |
| df.iloc[n:m] | Slice rows by position | df.iloc[-5:] → last 5 |
| df.loc[label] | Row by index label | df.loc["2024-01-15"] |
| df[condition] | Filter rows by condition | df[df["close"] > 21500] |
```python
# Last candle (most recent)
last = df.iloc[-1]
print(f"Last close: {last['close']}")

# Last 3 candles
recent = df.iloc[-3:]
print(recent[["date", "close"]])

# Filter: only bullish candles (a boolean column works as the mask directly)
bullish_candles = df[df["bullish"]]
print(f"Bullish candles: {len(bullish_candles)}")

# Filter: candles with volume above 150,000
high_volume = df[df["volume"] > 150000]
print(high_volume[["date", "volume"]])
```
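A quick standalone check of what each single-row access returns (using just a close column for brevity): df.iloc[-1] gives a Series, df.tail(1) gives a one-row DataFrame.

```python
import pandas as pd

df = pd.DataFrame({"close": [21580, 21340, 21460, 21510]})

last_series = df.iloc[-1]  # Series: column values directly accessible
last_frame = df.tail(1)    # DataFrame with a single row

print(type(last_series).__name__)  # Series
print(type(last_frame).__name__)   # DataFrame
print(last_series["close"])        # 21510
```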
df.iloc[-1] is the fastest way to get the latest candle. Use it instead of df.tail(1) when you only need a single row — it returns a Series with all column values directly accessible.

DateTime Index — The Key to Market Data
Real market data always has timestamps. Setting the date column as the DataFrame index unlocks time-based slicing, resampling, and alignment — essential for multi-timeframe analysis.
```python
# Build a proper datetime-indexed DataFrame
data_ts = {
    "datetime": pd.date_range("2024-01-15 09:15", periods=6, freq="5min"),
    "open": [21450, 21480, 21510, 21490, 21530, 21555],
    "high": [21495, 21520, 21540, 21520, 21570, 21580],
    "low": [21430, 21465, 21490, 21475, 21520, 21540],
    "close": [21480, 21510, 21495, 21530, 21555, 21570],
    "volume": [12400, 9800, 11200, 8700, 13100, 10500],
}

df5 = pd.DataFrame(data_ts)
df5 = df5.set_index("datetime")  # Set datetime as index
print(df5)
print(f"\nIndex type: {type(df5.index)}")

# Access by time range
morning = df5.between_time("09:15", "09:25")
print(morning["close"])
```
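Beyond between_time, a DatetimeIndex also supports partial-string indexing and inclusive label slicing. A minimal standalone sketch (only a close column, for brevity):

```python
import pandas as pd

idx = pd.date_range("2024-01-15 09:15", periods=6, freq="5min")
df5 = pd.DataFrame(
    {"close": [21480, 21510, 21495, 21530, 21555, 21570]}, index=idx
)

# Partial-string indexing: every candle on a given date
day = df5.loc["2024-01-15"]
print(len(day))  # 6

# Label slicing on a DatetimeIndex includes BOTH endpoints
window = df5.loc["2024-01-15 09:20":"2024-01-15 09:30"]
print(window["close"].tolist())  # [21510, 21495, 21530]
```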
Resampling Timeframes
```python
# Convert 5-min candles to 15-min candles (OHLCV aggregation)
df15 = df5.resample("15min").agg({
    "open": "first",
    "high": "max",
    "low": "min",
    "close": "last",
    "volume": "sum"
})
print(df15)
```
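It is worth sanity-checking the aggregation once: the high of each 15-minute bar should equal the max of its three 5-minute highs, and total volume should be preserved. A standalone sketch using the same six candles:

```python
import pandas as pd

idx = pd.date_range("2024-01-15 09:15", periods=6, freq="5min")
df5 = pd.DataFrame({
    "open":   [21450, 21480, 21510, 21490, 21530, 21555],
    "high":   [21495, 21520, 21540, 21520, 21570, 21580],
    "low":    [21430, 21465, 21490, 21475, 21520, 21540],
    "close":  [21480, 21510, 21495, 21530, 21555, 21570],
    "volume": [12400, 9800, 11200, 8700, 13100, 10500],
}, index=idx)

df15 = df5.resample("15min").agg(
    {"open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"}
)

# First 15-min bar covers the 09:15, 09:20 and 09:25 candles
print(df15["high"].iloc[0] == df5["high"].iloc[:3].max())  # True
print(df15["volume"].sum() == df5["volume"].sum())         # True
```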
Adding Columns — Signal Engine Pattern
In a real trading system, you'll keep adding computed columns to your DataFrame — moving averages, signals, P&L. This is the core DataFrame enrichment pattern.
```python
import pandas as pd

# Simulate 10 NIFTY daily candles
closes = [21450, 21580, 21320, 21410, 21550, 21620, 21480, 21390, 21510, 21640]
df = pd.DataFrame({"close": closes})

# 1. Simple Moving Average (3-period)
df["sma3"] = df["close"].rolling(3).mean().round(2)

# 2. Percentage change candle-to-candle
df["pct_change"] = df["close"].pct_change() * 100

# 3. Signal: close above SMA = BUY, else SELL
df["signal"] = df.apply(
    lambda row: "BUY" if row["close"] > row["sma3"] else "SELL",
    axis=1
)

print(df.tail(5).to_string())
```
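df.apply with a lambda calls a Python function once per row, which is slow on large frames. A vectorised alternative with numpy.where computes the same signal column in one pass (a sketch, assuming NumPy is installed; comparisons against NaN evaluate False, so the warm-up rows get "SELL" just like the lambda version):

```python
import numpy as np
import pandas as pd

closes = [21450, 21580, 21320, 21410, 21550, 21620, 21480, 21390, 21510, 21640]
df = pd.DataFrame({"close": closes})
df["sma3"] = df["close"].rolling(3).mean().round(2)

# One vectorised pass over the whole column instead of a per-row lambda
df["signal"] = np.where(df["close"] > df["sma3"], "BUY", "SELL")
print(df.tail(3))
```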
The first n-1 rows of any rolling(n) calculation will be NaN — always drop or handle NaN before using the signal in live trading.

Essential DataFrame Methods
These methods come up in nearly every algo trading script. Know them by heart.
| Method | What it does | Trading use |
|---|---|---|
| df.head(n) | First n rows | Inspect loaded data |
| df.tail(n) | Last n rows | View recent candles |
| df.shape | (rows, cols) tuple | Verify data loaded |
| df.dtypes | Column data types | Check float vs object |
| df.isnull().sum() | Count NaN per column | Find missing data |
| df.dropna() | Remove rows with NaN | Clean data after indicators |
| df.describe() | Min/max/mean/std stats | Quick data sanity check |
| df.sort_values(col) | Sort by column | Sort by date or P&L |
| df.reset_index() | Move index to column | After resampling |
| df.to_csv(path) | Save to CSV file | Export trade journal |
```python
# Common pre-trade data checks
print(f"Rows loaded  : {df.shape[0]}")
print(f"Columns      : {list(df.columns)}")
print(f"NaN in close : {df['close'].isnull().sum()}")
print(f"Date range   : {df.index[0]} → {df.index[-1]}")

# Drop NaN rows (important after adding rolling indicators)
df_clean = df.dropna()
print(f"Clean rows   : {len(df_clean)}")
```
Quiz — Test Your Knowledge
1. You have a DataFrame df with NIFTY candles. Which code returns only the last 5 rows?
2. What does df["close"].rolling(5).mean() calculate?

Exercises
Build a 1-minute OHLCV DataFrame (use pd.date_range("2024-01-15 09:15", periods=30, freq="1min") as the index, with made-up OHLCV values). Resample to 5-minute candles using proper OHLCV aggregation and print the result.
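One possible sketch of this exercise, with made-up values from a seeded random generator (the exact numbers are arbitrary; only the structure matters):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)  # Seeded so the run is reproducible
idx = pd.date_range("2024-01-15 09:15", periods=30, freq="1min")

opens = 21500 + rng.integers(-50, 50, size=30).astype(float)
closes = opens + rng.integers(-30, 30, size=30)
df1 = pd.DataFrame({
    "open": opens,
    "high": np.maximum(opens, closes) + rng.integers(0, 20, size=30),
    "low": np.minimum(opens, closes) - rng.integers(0, 20, size=30),
    "close": closes,
    "volume": rng.integers(5000, 20000, size=30),
}, index=idx)

# 30 one-minute candles collapse into 6 five-minute candles
df5 = df1.resample("5min").agg(
    {"open": "first", "high": "max", "low": "min", "close": "last", "volume": "sum"}
)
print(df5)
```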