Article

What Happens to a Stock When It Joins (or Leaves) a Major Index

Retrieve historical joiner/leaver events, measuring market returns and volumes to study the effects.

May 29, 2026

Nick Zincone

Imagine you hear that a company just got added to the S&P 500. Your first thought might be: "They must be doing something right — maybe I should take a closer look before everyone else does." And if a stock gets removed? "What went wrong? Maybe I should get out while the price is still reasonable." That instinct — the urge to act on the signal — is natural, and you wouldn't be alone in feeling it.

But here's what makes this really interesting: your instinct is only the surface layer. Beneath it, a much larger force is at work. Hundreds of ETFs, index mutual funds, and institutional accounts are built to hold exactly the same stocks as the S&P 500 — and "hold" means they own real shares. When the roster changes, these funds can't just update a label in a spreadsheet. They have to go into the market and actually buy shares of the new stock — and actually sell shares of the one leaving. Every one of them. At roughly the same time. Not because they want to, but because their rules say they must. No deliberation, no discretion, just execution.

To put a face on it: when Safeway was removed from the S&P 500 in January 2015 (following its acquisition), daily trading volume jumped from around 300,000 shares to over 22 million on the effective date — roughly 60 times normal.

Researchers have been studying this pattern — known as the reconstitution effect — since the 1980s. You can look up the full history of S&P 500 joiners and leavers in LSEG Workspace — it keeps records going back to 1994. What Workspace doesn't do is connect those 2,000+ individual events to price and volume data, measure the impact across all of them, and distill the result into a single composite picture. That's computation — and that's what a notebook is built for.

This notebook retrieves years of S&P 500 membership changes, measures the price and volume impact around each one, and aggregates the results into a composite picture of the "average" joiner and the "average" leaver. If the pattern holds up, it then uses current market data to identify the most likely candidates for the next round of changes — turning a historical insight into something you can act on ahead of time.

Setup

This notebook uses the LSEG Data Library for Python to retrieve index constituent change history, daily pricing, and fundamental data. It is designed for LSEG Workspace desktop users.

The event study logic — computing market-adjusted abnormal returns and abnormal volume — lives in a companion file, index_rebalance_utils.py, which should sit in the same directory as this notebook.

Additional packages:

pandas / numpy for data manipulation
plotly for interactive visualization
scipy for statistical testing

    	
            import lseg.data as ld



import pandas as pd

import numpy as np

import plotly.express as px

import plotly.graph_objects as go

from plotly.subplots import make_subplots

from scipy import stats

from IPython.display import display, HTML

import warnings

warnings.filterwarnings("ignore", category=FutureWarning)



from index_rebalance_utils import compute_market_adjusted_car, compute_abnormal_volume, aggregate_events



pd.set_option("display.max_colwidth", 80)

pd.set_option("display.max_columns", 20)

pd.set_option("display.float_format", "{:.4f}".format)



display(HTML("<style>.dataframe td { white-space: nowrap; }</style>"))

Open the data session

The open_session() function establishes a data session to retrieve our content within the desktop.

    	
            # Establish communication with our LSEG data environment.
# Initialize the session - Desktop mode (Workspace)
ld.open_session()

Configuration

All the settings that control this analysis live here. The defaults target the S&P 500 — the most widely followed index in the world — but you can swap in any index by changing a couple of values.

Two settings that shape the analysis are worth understanding:

Event window (-30 to +30 trading days) — the period around each index change where we measure what happened to the stock's price and volume. Think of it as looking one and a half months before and after the change date.
Volume baseline (-150 to -31 trading days) — a longer period before the event window. We use this stretch of "normal" trading to learn what each stock's typical daily volume looks like, so we can measure how much the rebalancing inflated it. This window deliberately doesn't overlap with the event window — we don't want the index change itself to contaminate our baseline.

    	
            # --- Configuration ---



# Index to analyze

INDEX_RIC = ".SPX"                 # S&P 500

INDEX_CHAIN = "0#.SPX"             # Chain RIC for current constituents

INDEX_NAME = "S&P 500"

BENCHMARK_RIC = ".SPX"             # Market benchmark for abnormal return calculation



# Event study windows (in trading days)

EVENT_WINDOW = (-30, 30)           # Days around the effective date

VOLUME_BASELINE = (-150, -31)      # Pre-event window for normal volume comparison



# Historical lookback for constituent changes

CHANGES_START = "2018-01-01"

CHANGES_END = "2025-12-31"



# Prediction: how far around the threshold to search for candidates

MCAP_BUFFER_PCT = 0.20             # Within 20% of the smallest current constituent



# Display

TOP_N = 20                        # Number of candidates to show in watchlist



print(f"Index: {INDEX_NAME} ({INDEX_RIC})")

print(f"Event window: [{EVENT_WINDOW[0]}, +{EVENT_WINDOW[1]}] trading days")

print(f"Volume baseline: [{VOLUME_BASELINE[0]}, {VOLUME_BASELINE[1]}] trading days")

print(f"Constituent changes: {CHANGES_START} to {CHANGES_END}")

Index: S&P 500 (.SPX)
Event window: [-30, +30] trading days
Volume baseline: [-150, -31] trading days
Constituent changes: 2018-01-01 to 2025-12-31

1. Which Stocks Were Added or Removed?

What we're doing: Pulling the historical record of every stock that joined or left the S&P 500 over the past several years.

Why it matters: Each of these changes is a natural experiment. Something happened to the stock — it got added or removed — and we can look at what its price and trading volume did in response. One event by itself doesn't tell us much (maybe the stock was already moving for other reasons), but when we stack dozens of them together, the noise cancels out and any systematic pattern becomes visible. The more events we collect, the more confident we can be in the result.

A couple of things to note about this data: changes don't follow a neat schedule. Some cluster around the quarterly rebalancing dates (March, June, September, December), when the index committee conducts its routine review. Others happen mid-month because a corporate event — an acquisition, a bankruptcy, or a spin-off — forces the committee to swap a constituent outside the regular schedule. You'll also notice that some removed stocks have RICs with a ^ character (like SWY.N^A15), which marks them as retired instruments. These are companies that no longer trade independently — usually because they were acquired. Their historical price data is still available, so we can include them in the analysis.

    	
            # Retrieve historical index constituent changes

changes_df = ld.get_data(

    universe=INDEX_RIC,

    fields=[

         "TR.IndexJLConstituentRIC",

         "TR.IndexJLConstituentName",

         "TR.IndexJLConstituentChangeDate",

         "TR.IndexJLConstituentituentChange",

    ],

    parameters={

         "SDate": CHANGES_START,

         "EDate": CHANGES_END,

         "IC": "B",

    },

)



changes_df.columns = ["Index", "RIC", "Name", "ChangeDate", "Change"]



print(f"Raw constituent changes retrieved: {len(changes_df)} events")

changes_df

Raw constituent changes retrieved: 338 events

    	
            # Clean the changes data

changes_df["ChangeDate"] = pd.to_datetime(changes_df["ChangeDate"], errors="coerce")

changes_df = changes_df.dropna(subset=["RIC", "ChangeDate"]).copy()



# Summary

joiners = changes_df[changes_df["Change"] == "Joiner"]

leavers = changes_df[changes_df["Change"] == "Leaver"]



print(f"Clean events: {len(changes_df)}")

print(f"  Joiners:  {len(joiners)}")

print(f"  Leavers:  {len(leavers)}")

print(f"  Date range: {changes_df['ChangeDate'].min().date()} to {changes_df['ChangeDate'].max().date()}")

print(f"\nEvents by year:")

events_by_year = changes_df.groupby([changes_df["ChangeDate"].dt.year, "Change"]).size().unstack(fill_value=0)

events_by_year.columns = ["Joiners", "Leavers"]

events_by_year

The table confirms we have a healthy sample to work with. Notice that joiners and leavers roughly balance out — when a stock joins the S&P 500, another typically leaves to keep the count at ~500. The distribution of changes across years is fairly consistent, giving us a broad set of data points from different market conditions to learn from.

2. Collecting Price and Volume Data

What we're doing: For every stock that was added or removed, we're pulling its daily price and trading volume over a window that extends well before and after the change date. We also pull the overall S&P 500 index as a reference point.

Why it matters: We don't just want to know whether a stock went up or down — the whole market might have moved that day. By pulling the benchmark index alongside each stock, we can subtract out the market's move on each day and isolate the part of the stock's performance that was specific to the index change. We also need enough volume history before the event to establish what "normal" trading looks like for each stock, so the volume spike — if there is one — stands out clearly.

    	
            # Determine the overall date range needed for price data

earliest_event = changes_df["ChangeDate"].min()

latest_event = changes_df["ChangeDate"].max()



# Add buffer for volume baseline (150 trading days ~ 7 months) and event window (30 days ~ 6 weeks)

price_start = (earliest_event - pd.DateOffset(months=8)).strftime("%Y-%m-%d")

price_end = (latest_event + pd.DateOffset(months=2)).strftime("%Y-%m-%d")



print(f"Price data range (derived from event/baseline windows): {price_start} to {price_end}")



# Get all unique stock RICs involved in events

event_rics = changes_df["RIC"].unique().tolist()

print(f"Unique stocks in events: {len(event_rics)}")



# Retrieve benchmark index price

print(f"\nRetrieving benchmark ({BENCHMARK_RIC})...")

benchmark_prices = ld.get_history(

    universe=BENCHMARK_RIC,

    fields="TRDPRC_1",

    interval="daily",

    start=price_start,

    end=price_end

)



# Flatten benchmark if needed

if isinstance(benchmark_prices.columns, pd.MultiIndex):

    benchmark_prices = benchmark_prices.droplevel(0, axis=1)

benchmark_close = benchmark_prices["TRDPRC_1"].dropna()

benchmark_returns = benchmark_close.pct_change().dropna()



print(f"Benchmark data: {len(benchmark_close)} trading days")



# Retrieve stock price & volume data in batches to stay within API limits

print(f"\nRetrieving price & volume for {len(event_rics)} stocks...")

BATCH_SIZE = 40

all_batches = []



for i in range(0, len(event_rics), BATCH_SIZE):

    batch = event_rics[i:i + BATCH_SIZE]

    batch_num = i // BATCH_SIZE + 1

    total_batches = (len(event_rics) + BATCH_SIZE - 1) // BATCH_SIZE

    print(f"  Batch {batch_num}/{total_batches}: retrieving {len(batch)} instruments...")

    try:

        df = ld.get_history(

            universe=batch,

            fields=["TRDPRC_1", "ACVOL_UNS"],

            interval="daily",

            start=price_start,

            end=price_end

        )

        if df is not None and len(df) > 0:

            all_batches.append(df)

    except Exception as e:

        print(f"  Warning: batch {batch_num} failed — {e}")



stock_data = pd.concat(all_batches, axis=1) if all_batches else pd.DataFrame()



print(f"\nStock data shape: {stock_data.shape}")

We now have two datasets loaded into memory: benchmark_returns, the daily return series for the S&P 500 index itself, and stock_data, a wide table containing daily closing prices and trading volumes for every stock involved in a constituent change. Together these cover the full date range we need — from well before the earliest event through 30 trading days after the last one. The next step is to use this raw data to measure how each stock performed relative to the market around its index change.

3. Separating the Signal from the Noise

What we're doing: For each index change, calculating how the stock performed relative to the broader market — then stacking all those results together to find the pattern.

The idea in plain terms: On any given day, the entire market might go up 1% or down 2%. We don't want that general market movement mixed in with the effect we're trying to measure. So for each day in the event window, we subtract the market's return from the stock's return. What's left — the abnormal return — is the part that the market alone can't explain.

By adding up these daily abnormal returns over time, we get the cumulative abnormal return (CAR): a running total of how much extra the stock gained or lost, day by day, relative to the market. When we average the CAR across all joiners (or all leavers), individual noise washes out and the systematic pattern — if one exists — becomes clear.

If index fund buying pressure genuinely pushes joiner prices up, the average joiner CAR should climb around the effective date. If forced selling hammers leavers, their average CAR should drop. And after the rebalancing is done, the lines should flatten as the stock resumes normal behavior relative to the market.

    	
            # Compute event studies for all events

joiner_cars = []

leaver_cars = []

joiner_volumes = []

leaver_volumes = []



# Keep metadata aligned to successful CAR events for downstream hover labels

joiner_car_rics = []

leaver_car_rics = []

joiner_car_names = []

leaver_car_names = []

joiner_car_dates = []

leaver_car_dates = []



failed_events = []

skipped_car = {"Joiner": 0, "Leaver": 0}

skipped_vol = {"Joiner": 0, "Leaver": 0}



for _, event in changes_df.iterrows():

    ric = event["RIC"]

    event_date = event["ChangeDate"]

    change = event["Change"]

    name = event.get("Name", "")



    try:

        # Extract stock price and volume

        if isinstance(stock_data.columns, pd.MultiIndex):

            if ric not in stock_data.columns.get_level_values(0):

                failed_events.append((ric, "no price data"))

                continue

            close = stock_data[(ric, "TRDPRC_1")].dropna()

            volume = stock_data[(ric, "ACVOL_UNS")].dropna()

        else:

            failed_events.append((ric, "unexpected data format"))

            continue



        stock_returns = close.pct_change().dropna()



        # Market-adjusted abnormal returns (no estimation window needed)

        car_result = compute_market_adjusted_car(

            stock_returns, benchmark_returns, event_date,

            event_window=EVENT_WINDOW

        )



        # Abnormal volume (uses pre-event baseline for comparison)

        vol_result = compute_abnormal_volume(

            volume, event_date,

            event_window=EVENT_WINDOW,

            baseline_window=VOLUME_BASELINE

        )



        if change == "Joiner":

            if car_result is not None:

                joiner_cars.append(car_result)

                joiner_car_rics.append(ric)

                joiner_car_names.append(name)

                joiner_car_dates.append(event_date)

            else:

                skipped_car["Joiner"] += 1

            if vol_result is not None:

                joiner_volumes.append(vol_result)

            else:

                skipped_vol["Joiner"] += 1

        else:

            if car_result is not None:

                leaver_cars.append(car_result)

                leaver_car_rics.append(ric)

                leaver_car_names.append(name)

                leaver_car_dates.append(event_date)

            else:

                skipped_car["Leaver"] += 1

            if vol_result is not None:

                leaver_volumes.append(vol_result)

            else:

                skipped_vol["Leaver"] += 1



    except Exception as e:

        failed_events.append((ric, str(e)))

        continue



total_joiners = len(joiners)

total_leavers = len(leavers)



print(f"Event studies computed (out of {total_joiners} joiners / {total_leavers} leavers):")

print(f"  Joiners  — CAR: {len(joiner_cars)} of {total_joiners},  Volume: {len(joiner_volumes)} of {total_joiners}")

print(f"  Leavers  — CAR: {len(leaver_cars)} of {total_leavers},  Volume: {len(leaver_volumes)} of {total_leavers}")

print(f"\n  Skipped (insufficient data for CAR):  Joiners: {skipped_car['Joiner']},  Leavers: {skipped_car['Leaver']}")

print(f"  Skipped (insufficient data for vol):  Joiners: {skipped_vol['Joiner']},  Leavers: {skipped_vol['Leaver']}")

print(f"  Errors (no price data at all): {len(failed_events)}")



if skipped_car["Leaver"] > skipped_car["Joiner"]:

    print(f"\n  Note: More leavers were skipped because many are retired instruments (acquired or")

    print(f"  delisted companies) whose price history ends near the event date, leaving too few")

    print(f"  post-event trading days for the full {EVENT_WINDOW[1]}-day window.")



if failed_events:

    print(f"\nSample errors (first 5):")

    for ric, reason in failed_events[:5]:

        print(f"    {ric}: {reason}")

4. The Price Impact: How Do Stocks React to Being Added or Removed?

What we're doing: Combining all the individual event studies into a single composite view — the "average joiner" and the "average leaver." This is the chart that answers the central question of the notebook.

What the research suggests we'll see: Studies going back decades have found that joiners tend to outperform the market around the effective date, while leavers tend to underperform. What remains debated is how large the effect is, how long it lasts, and whether it has weakened over time as more investors have learned to anticipate it.

    	
            # Aggregate CAR across all events

agg_joiner_car = aggregate_events(joiner_cars, "CAR")

agg_leaver_car = aggregate_events(leaver_cars, "CAR")



# Convert to percentage for readability

for df in [agg_joiner_car, agg_leaver_car]:

    if len(df) > 0:

        for col in ["mean", "median", "ci_upper", "ci_lower", "q25", "q75"]:

            if col in df.columns:

                df[col] = df[col] * 100



fig = make_subplots(rows=1, cols=1)



# Joiners

if len(agg_joiner_car) > 0:

    fig.add_trace(go.Scatter(

        x=agg_joiner_car["relative_day"], y=agg_joiner_car["ci_upper"],

        mode="lines", line=dict(width=0), showlegend=False, hoverinfo="skip"

    ))

    fig.add_trace(go.Scatter(

        x=agg_joiner_car["relative_day"], y=agg_joiner_car["ci_lower"],

        mode="lines", line=dict(width=0), fill="tonexty",

        fillcolor="rgba(0,150,0,0.1)", showlegend=False, hoverinfo="skip"

    ))

    fig.add_trace(go.Scatter(

        x=agg_joiner_car["relative_day"], y=agg_joiner_car["mean"],

        mode="lines", name=f"Joiners (n={len(joiner_cars)})",

        line=dict(color="green", width=2.5),

        hovertemplate="Day %{x}: %{y:.2f}%<extra>Joiners</extra>"

    ))



# Leavers

if len(agg_leaver_car) > 0:

    fig.add_trace(go.Scatter(

        x=agg_leaver_car["relative_day"], y=agg_leaver_car["ci_upper"],

        mode="lines", line=dict(width=0), showlegend=False, hoverinfo="skip"

    ))

    fig.add_trace(go.Scatter(

        x=agg_leaver_car["relative_day"], y=agg_leaver_car["ci_lower"],

        mode="lines", line=dict(width=0), fill="tonexty",

        fillcolor="rgba(200,0,0,0.1)", showlegend=False, hoverinfo="skip"

    ))

    fig.add_trace(go.Scatter(

        x=agg_leaver_car["relative_day"], y=agg_leaver_car["mean"],

        mode="lines", name=f"Leavers (n={len(leaver_cars)})",

        line=dict(color="red", width=2.5),

        hovertemplate="Day %{x}: %{y:.2f}%<extra>Leavers</extra>"

    ))



# Reference lines

fig.add_vline(x=0, line_dash="dash", line_color="gray", annotation_text="Effective Date")

fig.add_hline(y=0, line_dash="dot", line_color="lightgray")



fig.update_layout(

    title=f"Average Cumulative Abnormal Returns Around {INDEX_NAME} Reconstitution Events",

    xaxis_title="Trading Days Relative to Effective Date",

    yaxis_title="Cumulative Abnormal Return (%)",

    template="plotly_white",

    width=900,

    height=550,

    legend=dict(yanchor="top", y=0.99, xanchor="left", x=0.01)

)



fig.show()



# Statistical summary at key points

if len(agg_joiner_car) > 0:

    day0_join = agg_joiner_car[agg_joiner_car["relative_day"] == 0]

    day_end = agg_joiner_car[agg_joiner_car["relative_day"] == agg_joiner_car["relative_day"].max()]

    if len(day0_join) > 0:

        print(f"\nJoiners — CAR at effective date (day 0): {day0_join['mean'].values[0]:.2f}%")

    if len(day_end) > 0:

        print(f"Joiners — CAR at day +{EVENT_WINDOW[1]}: {day_end['mean'].values[0]:.2f}%")



if len(agg_leaver_car) > 0:

    day0_leave = agg_leaver_car[agg_leaver_car["relative_day"] == 0]

    day_end = agg_leaver_car[agg_leaver_car["relative_day"] == agg_leaver_car["relative_day"].max()]

    if len(day0_leave) > 0:

        print(f"Leavers — CAR at effective date (day 0): {day0_leave['mean'].values[0]:.2f}%")

    if len(day_end) > 0:

        print(f"Leavers — CAR at day +{EVENT_WINDOW[1]}: {day_end['mean'].values[0]:.2f}%")

Joiners — CAR at effective date (day 0): 7.35%
Joiners — CAR at day +30: 6.88%
Leavers — CAR at effective date (day 0): -6.98%
Leavers — CAR at day +30: -7.82%

How to read this chart: The green line traces the average price path for joiners; the red line shows leavers. Day 0 is the effective date — the day the change officially takes place. The shaded bands show how much individual events vary around that average (tighter bands = more consistent pattern).

A few things to look for:

Does the green line start rising before day 0? That would mean the market is anticipating the addition before it takes effect — perhaps from pre-announcements or speculation. The earlier the rise starts, the more efficiently the market is pricing in the expected buying pressure.
Does the green line hold its gains after day 0, or does it fade? If it holds, the index change may have permanently increased demand for the stock (more funds now hold it, more analysts cover it). If it fades, the price boost was likely just temporary pressure from funds buying their required shares — once the rebalancing is done, the extra demand disappears and the price drifts back.
Is the red line a mirror image of the green? Leavers often show a steeper drop, because selling under a hard deadline tends to create more price impact than gradual buying. Funds must exit by the effective date, which compresses the selling into a narrower window.
How wide are the confidence bands? Narrow bands mean the pattern was remarkably consistent across different events and time periods. Wide bands mean the average is real but individual outcomes varied a lot — important context if you're thinking about any single upcoming event.

5. Following the Money: The Volume Spike

What we're doing: Looking at how much each stock was traded around the index change, compared to its normal level of activity.

Why it matters: Price tells you which direction things moved, but volume tells you how much force was behind it. When hundreds of index-tracking funds all need to buy (or sell) the same stock on the same day, trading volume can surge to several times its normal level. This is the most direct evidence of the mechanical rebalancing at work — it's not speculation or sentiment, it's funds executing trades they're contractually obligated to make.

A volume ratio of 3.0 means three times the stock's usual daily trading. For the S&P 500, that's common. For smaller indices like the Russell 2000, where the affected stocks are less liquid to begin with, volume spikes of 5–10x have been documented.

    	
            # Aggregate abnormal volume across events

agg_joiner_vol = aggregate_events(joiner_volumes, "abnormal_volume")

agg_leaver_vol = aggregate_events(leaver_volumes, "abnormal_volume")



fig = make_subplots(rows=1, cols=1)



# Joiners volume

if len(agg_joiner_vol) > 0:

    fig.add_trace(go.Scatter(

        x=agg_joiner_vol["relative_day"], y=agg_joiner_vol["q75"],

        mode="lines", line=dict(width=0), showlegend=False, hoverinfo="skip"

    ))

    fig.add_trace(go.Scatter(

        x=agg_joiner_vol["relative_day"], y=agg_joiner_vol["q25"],

        mode="lines", line=dict(width=0), fill="tonexty",

        fillcolor="rgba(0,150,0,0.1)", showlegend=False, hoverinfo="skip"

    ))

    fig.add_trace(go.Scatter(

        x=agg_joiner_vol["relative_day"], y=agg_joiner_vol["mean"],

        mode="lines", name=f"Joiners (n={len(joiner_volumes)})",

        line=dict(color="green", width=2.5),

        hovertemplate="Day %{x}: %{y:.2f}x<extra>Joiners</extra>"

    ))



# Leavers volume

if len(agg_leaver_vol) > 0:

    fig.add_trace(go.Scatter(

         x=agg_leaver_vol["relative_day"], y=agg_leaver_vol["q75"],

        mode="lines", line=dict(width=0), showlegend=False, hoverinfo="skip"

    ))

    fig.add_trace(go.Scatter(

        x=agg_leaver_vol["relative_day"], y=agg_leaver_vol["q25"],

        mode="lines", line=dict(width=0), fill="tonexty",

        fillcolor="rgba(200,0,0,0.1)", showlegend=False, hoverinfo="skip"

    ))

    fig.add_trace(go.Scatter(

        x=agg_leaver_vol["relative_day"], y=agg_leaver_vol["mean"],

        mode="lines", name=f"Leavers (n={len(leaver_volumes)})",

        line=dict(color="red", width=2.5),

        hovertemplate="Day %{x}: %{y:.2f}x<extra>Leavers</extra>"

    ))



# Reference lines

fig.add_vline(x=0, line_dash="dash", line_color="gray", annotation_text="Effective Date")

fig.add_hline(

    y=1.0,

    line_dash="dot",

    line_color="black",

    line_width=1.5,

    annotation_text="Normal Volume (1.0x)"

)



fig.update_layout(

    title=f"Average Abnormal Volume Around {INDEX_NAME} Reconstitution Events",

    xaxis_title="Trading Days Relative to Effective Date",

    yaxis_title="Volume Multiple vs Baseline (1.0 = Normal)",

    template="plotly_white",

    width=900,

    height=500,

    legend=dict(yanchor="top", y=0.99, xanchor="left", x=0.01)

)



fig.show()



# Peak volume statistics

for label, agg in [("Joiners", agg_joiner_vol), ("Leavers", agg_leaver_vol)]:

    if len(agg) > 0:

        peak_row = agg.loc[agg["mean"].idxmax()]

        print(f"{label} — peak volume at day {peak_row['relative_day']:.0f}: "

              f"{peak_row['mean']:.1f}x normal (median: {peak_row['median']:.1f}x)")



print("\nInterpretation note: 1.0x is normal baseline volume. Values above 1.0x indicate elevated trading volume.")

Joiners — peak volume at day -1: 40.2x normal (median: 36.9x)
Leavers — peak volume at day -1: 27.6x normal (median: 26.0x)

Interpretation note: 1.0x is normal baseline volume. Values above 1.0x indicate elevated trading volume.

How to read this chart: The y-axis shows trading volume as a multiple of each stock's normal daily average. 1.0x = normal baseline volume; values above 1.0x are elevated, and 0.0 would mean no trading at all. The shaded bands show the range across individual events (25th to 75th percentile).

A few things to notice:

The spike at day 0 is the fingerprint of index fund rebalancing. This is the day the change officially takes effect, and funds that track the index need to have completed their trades.
Activity before day 0 suggests that some investors are trading ahead of the rebalancing — either anticipating the price impact or front-running the known deadline.
How quickly does volume return to normal? A sharp spike that drops off immediately means the rebalancing was a one-day event. If elevated volume lingers, the trade was large enough that it took multiple days to absorb.
Which side spikes harder — joiners or leavers? Forced selling (leavers) is often more intense, because funds liquidating a position under a deadline face more urgency than funds gradually accumulating a new one.

6. How Consistent Is the Pattern? Looking at Individual Events

What we're doing: The charts above showed the average effect across all events. But averages can hide a lot of variation. Here we look at each event individually to see how spread out the outcomes really are.

Why it matters: If every joiner gained roughly 5% and every leaver lost roughly 5%, the pattern would be extremely reliable. But if individual outcomes range from -10% to +20%, then even though the average is positive, any single event is far less predictable. This section answers the question: Is this a pattern you can count on, or just a tendency that plays out more often than not?

    	
            # Extract the CAR at the effective date (day 0) for each individual event

def get_car_at_day(car_results, day=0):

    """Extract CAR at a specific relative day for each event."""

    values = []

    for i, df in enumerate(car_results):

        if df is not None and len(df) > 0:

            row = df[df["relative_day"] == day]

            if len(row) > 0:

                values.append({"event_id": i, "CAR_at_day": row["CAR"].values[0] * 100})

    return pd.DataFrame(values)



# Also extract the full-window CAR (day +30 or max available)

def get_car_at_end(car_results):

    """Extract final CAR for each event."""

    values = []

    for i, df in enumerate(car_results):

        if df is not None and len(df) > 0:

            final_car = df["CAR"].iloc[-1] * 100

            values.append({"event_id": i, "CAR_final": final_car})

    return pd.DataFrame(values)



joiner_day0 = get_car_at_day(joiner_cars, day=0)

leaver_day0 = get_car_at_day(leaver_cars, day=0)

joiner_final = get_car_at_end(joiner_cars)

leaver_final = get_car_at_end(leaver_cars)



# Combine for plotting

joiner_scatter = joiner_day0.merge(joiner_final, on="event_id")

joiner_scatter["Change"] = "Joiner"

leaver_scatter = leaver_day0.merge(leaver_final, on="event_id")

leaver_scatter["Change"] = "Leaver"



# Attach instrument metadata if available from Section 3

if "joiner_car_rics" in globals() and len(joiner_car_rics) == len(joiner_scatter):

    joiner_scatter["RIC"] = [joiner_car_rics[i] for i in joiner_scatter["event_id"]]

else:

    joiner_scatter["RIC"] = "N/A"



if "leaver_car_rics" in globals() and len(leaver_car_rics) == len(leaver_scatter):

    leaver_scatter["RIC"] = [leaver_car_rics[i] for i in leaver_scatter["event_id"]]

else:

    leaver_scatter["RIC"] = "N/A"



if "joiner_car_names" in globals() and len(joiner_car_names) == len(joiner_scatter):

    joiner_scatter["Company"] = [joiner_car_names[i] for i in joiner_scatter["event_id"]]

else:

    joiner_scatter["Company"] = ""



if "leaver_car_names" in globals() and len(leaver_car_names) == len(leaver_scatter):

    leaver_scatter["Company"] = [leaver_car_names[i] for i in leaver_scatter["event_id"]]

else:

    leaver_scatter["Company"] = ""



if "joiner_car_dates" in globals() and len(joiner_car_dates) == len(joiner_scatter):

    joiner_scatter["Effective Date"] = [pd.to_datetime(joiner_car_dates[i]).strftime("%Y-%m-%d") for i in joiner_scatter["event_id"]]

else:

    joiner_scatter["Effective Date"] = ""



if "leaver_car_dates" in globals() and len(leaver_car_dates) == len(leaver_scatter):

    leaver_scatter["Effective Date"] = [pd.to_datetime(leaver_car_dates[i]).strftime("%Y-%m-%d") for i in leaver_scatter["event_id"]]

else:

    leaver_scatter["Effective Date"] = ""



scatter_df = pd.concat([joiner_scatter, leaver_scatter], ignore_index=True)



# Scatter: CAR at day 0 vs final CAR

fig = px.scatter(

    scatter_df,

    x="CAR_at_day",

    y="CAR_final",

    color="Change",

    color_discrete_map={"Joiner": "green", "Leaver": "red"},

    title=f"Individual Event Outcomes — CAR at Effective Date vs. End of Window ({INDEX_NAME})",

    labels={

        "CAR_at_day": "CAR at Effective Date (%)",

        "CAR_final": f"CAR at Day +{EVENT_WINDOW[1]} (%)"

    },

    hover_name="RIC",

    hover_data={

        "Company": True,

        "Effective Date": True,

        "CAR_at_day": ":.2f",

        "CAR_final": ":.2f",

        "event_id": False

    },

    opacity=0.7

)



# Reference lines

fig.add_hline(y=0, line_dash="dot", line_color="lightgray")

fig.add_vline(x=0, line_dash="dot", line_color="lightgray")



# Diagonal (no change after effective date)

fig.add_trace(go.Scatter(

    x=[-30, 30], y=[-30, 30],

    mode="lines", line=dict(dash="dash", color="gray", width=1),

    showlegend=False, hoverinfo="skip"

))



fig.update_layout(template="plotly_white", width=800, height=600)

fig.show()



# Prepare summary table for the hit-rate subsection below

summary_rows = []

for label, data in [("Joiners", joiner_final), ("Leavers", leaver_final)]:

    if len(data) > 0:

        mean_car = data["CAR_final"].mean()

        median_car = data["CAR_final"].median()

        std_car = data["CAR_final"].std()

        min_car = data["CAR_final"].min()

        max_car = data["CAR_final"].max()

        positive_pct = (data["CAR_final"] > 0).mean() * 100



        summary_rows.append({

            "Group": label,

            "n": len(data),

            "Mean final CAR": f"{mean_car:.2f}%",

            "Median final CAR": f"{median_car:.2f}%",

            "Std dev": f"{std_car:.2f}%",

            "Range": f"[{min_car:.2f}%, {max_car:.2f}%]",

            "Positive outcomes": f"{positive_pct:.0f}%",

        })



summary_df = pd.DataFrame(summary_rows)

How to read this chart: Each dot represents a single index change. Green dots are joiners, red are leavers. The x-axis shows how the stock had performed by the effective date; the y-axis shows where it ended up 30 trading days later.

The diagonal dashed line is the "nothing changed after the effective date" reference. Dots on the line had all their movement before day 0 and then held steady. Dots above it continued gaining. Dots below gave back some of their move.

Distance from the dashed line matters: the farther a dot is from the line, the larger the post-effective-date reassessment. In practical terms, that often signals a stronger initial overreaction, hype, or market misread that was later corrected (or, in some cases, a delayed realization that the first reaction was too conservative).

Where the dot sits	What it means
Upper-right (green)	Joiner that gained and kept gaining — the textbook outcome
Lower-left (red)	Leaver that declined and stayed down — also the expected pattern
Lower-right (green)	Joiner that initially gained, but then the gain reversed — the buying pressure was temporary
Upper-left (red)	Leaver that initially declined, but then recovered — the selling pressure was overdone

Hit rate

The hit rate is shown in the table below under Positive outcomes. It is the share of events that finished with the expected sign and serves as a quick reliability check for the pattern by group.

As a rule of thumb, values well above 50% suggest the pattern has been fairly consistent across events, while values near 50% suggest a weaker and less dependable edge.

    	
            # Hit-rate summary table

if len(summary_df) > 0:

    display(summary_df)

else:

    print("No summary statistics available.")

7. Looking Ahead: Which Stocks Might Be Next?

The S&P 500 typically adds only 17–28 stocks per year. Casting a wide net and calling every Russell 1000 stock above the minimum market cap a "candidate" would produce hundreds of names — far too many to be useful. So rather than screening on market cap alone, we apply the same eligibility criteria the S&P committee publishes:

Market cap — at least as large as the smallest current constituent
Positive earnings — the most recent EPS must be positive
Public float ≥ 50% — adequate shares available for trading
Adequate liquidity — average daily volume above the 10th percentile of current members

Each filter narrows the pool. What starts as hundreds of stocks that simply "exceed the minimum market cap" gets trimmed to a much shorter list — the subset that actually clears all the quantitative hurdles.

This still won't predict exact committee decisions (the S&P 500 is actively managed, not formula-driven, and the committee weighs qualitative factors like sector balance and corporate structure), but it surfaces stocks that currently meet every published criterion — a much sharper starting point than market cap alone.

    	
            # Current S&P 500 constituents with key metrics

print("Retrieving current S&P 500 constituents...")

current_spx = ld.get_data(

    universe=INDEX_CHAIN,

    fields=[

         "TR.CompanyName",

         "TR.CompanyMarketCap",

         "TR.TRBCEconomicSector",

         "TR.AvgDailyVolume20D",

         "TR.FreeFloatPct",           # Free float percentage

         "TR.EPSActValue"             # Most recent actual EPS (profitability check)

    ]

)



# Ensure Instrument is a column (API may return it as index or regular column)

if "Instrument" not in current_spx.columns:

    current_spx = current_spx.reset_index()



# Rename using a dict so column order doesn't matter

current_spx = current_spx.rename(columns={

     "Company Name": "Name",

     "Company Market Cap": "MarketCap",

     "TRBC Economic Sector Name": "Sector",

     "Free Float (Percent)": "FreeFloat",

     "Earnings Per Share - Actual": "EPS_Actual",

})

# Volume column display name may vary — match any column containing "volume"

vol_cols = [c for c in current_spx.columns if "volume" in c.lower()]

if vol_cols:

    current_spx = current_spx.rename(columns={vol_cols[0]: "AvgVolume"})



# Clean — .astype(float) converts pd.NA to np.nan so dropna/formatting work

current_spx["MarketCap"] = pd.to_numeric(current_spx["MarketCap"], errors="coerce").astype(float)

current_spx["AvgVolume"] = pd.to_numeric(current_spx["AvgVolume"], errors="coerce").astype(float)



print(f"Current S&P 500 constituents: {len(current_spx)}")



# Market cap distribution

mcap = current_spx["MarketCap"].dropna()

print(f"\nMarket cap distribution (USD billions):")

print(f"  Min:    ${mcap.min() / 1e9:.1f}B")

print(f"  10th %: ${mcap.quantile(0.10) / 1e9:.1f}B")

print(f"  Median: ${mcap.median() / 1e9:.1f}B")

print(f"  Max:    ${mcap.max() / 1e9:.1f}B")



# The threshold for leaver risk: stocks in the bottom decile

leaver_threshold = mcap.quantile(0.10)

print(f"\nLeaver risk threshold (10th percentile): ${leaver_threshold / 1e9:.1f}B")

    	
            # Retrieve a broader universe to find joiner candidates

# Using Russell 1000 (0#.RUI) as the candidate pool for S&P 500 joiners

print("Retrieving Russell 1000 constituents for candidate screening...")

russell_1000 = ld.get_data(

    universe="0#.RUI",

    fields=[

         "TR.CompanyName",

         "TR.CompanyMarketCap",

         "TR.TRBCEconomicSector",

         "TR.AvgDailyVolume20D",

         "TR.FreeFloatPct",

         "TR.EPSActValue"

    ]

)



# Ensure Instrument is a column

if "Instrument" not in russell_1000.columns:

    russell_1000 = russell_1000.reset_index()



russell_1000 = russell_1000.rename(columns={

     "Company Name": "Name",

     "Company Market Cap": "MarketCap",

     "TRBC Economic Sector Name": "Sector",

     "Free Float (Percent)": "FreeFloat",

     "Earnings Per Share - Actual": "EPS_Actual",

})

vol_cols = [c for c in russell_1000.columns if "volume" in c.lower()]

if vol_cols:

    russell_1000 = russell_1000.rename(columns={vol_cols[0]: "AvgVolume"})



russell_1000["MarketCap"] = pd.to_numeric(russell_1000["MarketCap"], errors="coerce").astype(float)

russell_1000["AvgVolume"] = pd.to_numeric(russell_1000["AvgVolume"], errors="coerce").astype(float)

russell_1000["EPS_Actual"] = pd.to_numeric(russell_1000["EPS_Actual"], errors="coerce").astype(float)

russell_1000["FreeFloat"] = pd.to_numeric(russell_1000["FreeFloat"], errors="coerce").astype(float)



print(f"Russell 1000 constituents: {len(russell_1000)}")



# Identify non-S&P 500 stocks in Russell 1000

spx_rics = set(current_spx["Instrument"].tolist())

non_spx = russell_1000[~russell_1000["Instrument"].isin(spx_rics)].copy()

print(f"Russell 1000 stocks NOT in S&P 500: {len(non_spx)}")



# --- JOINER CANDIDATES ---

# Apply the S&P 500's published eligibility criteria, not just market cap.

# The committee considers: market cap, positive earnings, public float,

# and adequate liquidity. We apply each filter and show how the pool narrows.



min_mcap = current_spx["MarketCap"].dropna().min()

joiner_threshold = min_mcap * (1 - MCAP_BUFFER_PCT)



# Calibrate thresholds from current S&P 500 members

spx_median_vol = current_spx["AvgVolume"].dropna().median()

spx_vol_floor = current_spx["AvgVolume"].dropna().quantile(0.10)  # 10th %ile volume



# Step-by-step screening funnel

pool = non_spx.copy()

n_start = len(pool)



# 1. Market cap — at least as large as the smallest current member

pool = pool[pool["MarketCap"] >= min_mcap]

n_after_mcap = len(pool)



# 2. Positive earnings — S&P requires positive trailing earnings

pool = pool[pool["EPS_Actual"] > 0]

n_after_eps = len(pool)



# 3. Public float >= 50% — S&P requires adequate float

pool = pool[pool["FreeFloat"] >= 50]

n_after_float = len(pool)



# 4. Adequate liquidity — volume above the 10th percentile of current members

pool = pool[pool["AvgVolume"] >= spx_vol_floor]

n_after_vol = len(pool)



joiner_candidates = pool.copy()

joiner_candidates["MarketCap_B"] = joiner_candidates["MarketCap"] / 1e9

joiner_candidates = joiner_candidates.sort_values("MarketCap", ascending=False)



# Buffer zone: stocks within 20% below min_mcap, meeting the same quality filters

buffer_pool = non_spx[

     (non_spx["MarketCap"] >= joiner_threshold) &

     (non_spx["MarketCap"] < min_mcap) &

     (non_spx["MarketCap"].notna()) &

     (non_spx["EPS_Actual"] > 0) &

     (non_spx["FreeFloat"] >= 50) &

     (non_spx["AvgVolume"] >= spx_vol_floor)

].copy()

buffer_pool["MarketCap_B"] = buffer_pool["MarketCap"] / 1e9

buffer_zone = buffer_pool.sort_values("MarketCap", ascending=False)



# Screening funnel summary

print(f"\n{'─'*60}")

print(f"JOINER SCREENING FUNNEL")

print(f"{'─'*60}")

print(f"  Russell 1000 non-S&P 500 universe:      {n_start:>4}")

print(f"  ├─ Market cap >= ${min_mcap / 1e9:.1f}B:                 {n_after_mcap:>4}")

print(f"  ├─ Positive earnings (EPS > 0):         {n_after_eps:>4}")

print(f"  ├─ Public float >= 50%:                 {n_after_float:>4}")

print(f"  └─ Adequate liquidity (vol >= 10th %):  {n_after_vol:>4}  ← joiner candidates")

print(f"\n  Buffer zone (${joiner_threshold / 1e9:.1f}B – ${min_mcap / 1e9:.1f}B, same criteria): {len(buffer_zone)}")



# --- LEAVER CANDIDATES ---

# Current constituents with small or declining market cap

leaver_candidates = current_spx[

     (current_spx["MarketCap"] <= leaver_threshold * (1 + MCAP_BUFFER_PCT)) &

     (current_spx["MarketCap"].notna())

].copy()



leaver_candidates["MarketCap_B"] = leaver_candidates["MarketCap"] / 1e9

leaver_candidates = leaver_candidates.sort_values("MarketCap", ascending=True)



print(f"\n  Leaver candidates (bottom decile ± {int(MCAP_BUFFER_PCT*100)}%): {len(leaver_candidates)}")

    	
            # Display the watchlist tables

print(f"{'='*90}")

print(f"JOINER WATCHLIST — Top {TOP_N} Candidates by Market Cap")

print(f"{'='*90}")

print(f"Russell 1000 stocks not in the {INDEX_NAME} that pass all S&P 500 eligibility screens:")

print(f"market cap, positive earnings, adequate float, and liquidity.\n")



joiner_display = joiner_candidates.head(TOP_N)[

     ["Instrument", "Name", "MarketCap_B", "Sector", "AvgVolume", "EPS_Actual"]

].reset_index(drop=True)

joiner_display.index = joiner_display.index + 1

joiner_display.index.name = "Rank"

joiner_display.columns = ["RIC", "Company", "MktCap ($B)", "Sector", "Avg Volume", "Last EPS"]

display(joiner_display)



print(f"\n{'='*90}")

print(f"LEAVER WATCHLIST — Smallest {INDEX_NAME} Constituents")

print(f"{'='*90}")

print(f"Current {INDEX_NAME} members with market caps near the bottom of the index.")

print(f"Ranked by market cap (ascending) — smallest first.\n")



leaver_display = leaver_candidates.head(TOP_N)[

     ["Instrument", "Name", "MarketCap_B", "Sector", "AvgVolume", "EPS_Actual"]

].reset_index(drop=True)

leaver_display.index.name = "Rank"

leaver_display.columns = ["RIC", "Company", "MktCap ($B)", "Sector", "Avg Volume", "Last EPS"]

display(leaver_display)

==========================================================================================
JOINER WATCHLIST — Top 20 Candidates by Market Cap
==========================================================================================
Russell 1000 stocks not in the S&P 500 that pass all S&P 500 eligibility screens:
market cap, positive earnings, adequate float, and liquidity.

==========================================================================================
LEAVER WATCHLIST — Smallest S&P 500 Constituents
==========================================================================================
Current S&P 500 members with market caps near the bottom of the index.
Ranked by market cap (ascending) — smallest first.

How to use these watchlists:

The joiner watchlist shows the top candidates after applying all four S&P 500 eligibility screens — not just market cap, but also positive earnings, adequate float, and liquidity. Stocks that pass all four are a much narrower pool than the hundreds that merely exceed the minimum market cap.

The leaver watchlist flags the smallest current members. Being small doesn't automatically trigger removal (there are buffer rules to avoid constant churn), but a stock that has shrunk significantly while larger alternatives sit outside the index is at elevated risk.

Practical next steps:

Connect the dots — if the event study shows that joiners historically gained X% on average, and you spot a strong candidate early, you have a quantitative basis for evaluating the opportunity.
Check the calendar — S&P 500 changes happen at quarterly rebalancing dates (March, June, September, December) or between those dates for corporate events like mergers, spin-offs, or bankruptcies.
Dig deeper — open each candidate in Workspace. Read the latest news. Check for corporate actions (mergers, spin-offs) that might affect eligibility.
Track over time — run this notebook monthly. Stocks that consistently appear on the watchlist are stronger candidates than those that show up once and disappear.

    	
            # Focused visual: top joiner candidates vs. smallest current members

# Side-by-side horizontal bars make the market cap mismatch obvious



CHART_N = 15  # show fewer than the full watchlist for readability



top_joiners = joiner_candidates.head(CHART_N).copy()

top_leavers = leaver_candidates.head(CHART_N).copy()



# Clean labels: company name only (RIC goes in hover)

top_joiners["Label"] = top_joiners["Name"].str[:35].str.strip()

top_leavers["Label"] = top_leavers["Name"].str[:35].str.strip()



fig = make_subplots(

    rows=1, cols=2,

    subplot_titles=[

        f"Top {len(top_joiners)} Joiner Candidates",

        f"Smallest {len(top_leavers)} Current Members",

    ],

    shared_xaxes=True,

    horizontal_spacing=0.18,

)



# --- Left panel: joiner candidates (green) ---

fig.add_trace(go.Bar(

    y=top_joiners["Label"][::-1],

    x=top_joiners["MarketCap_B"][::-1],

    orientation="h",

    marker_color="rgba(0,180,0,0.6)",

    hovertemplate=(

        "<b>%{y}</b> (%{customdata[0]})<br>"

        "Market Cap: $%{x:.1f}B<br>"

        "Sector: %{customdata[1]}<extra></extra>"

    ),

    customdata=np.column_stack([

        top_joiners["Instrument"][::-1],

        top_joiners["Sector"].fillna("—")[::-1],

    ]),

    showlegend=False,

), row=1, col=1)



# --- Right panel: leaver candidates (red) ---

fig.add_trace(go.Bar(

    y=top_leavers["Label"][::-1],

    x=top_leavers["MarketCap_B"][::-1],

    orientation="h",

    marker_color="rgba(200,0,0,0.6)",

    hovertemplate=(

         "<b>%{y}</b> (%{customdata[0]})<br>"

         "Market Cap: $%{x:.1f}B<br>"

         "Sector: %{customdata[1]}<extra></extra>"

    ),

    customdata=np.column_stack([

        top_leavers["Instrument"][::-1],

        top_leavers["Sector"].fillna("—")[::-1],

    ]),

    showlegend=False,

), row=1, col=2)



fig.update_layout(

    title="Joiner Candidates vs. Leaver Candidates: Market Cap Comparison",

    template="plotly_white",

    width=1100,

    height=max(500, CHART_N * 32 + 120),

    xaxis_title="Market Cap (USD Billions)",

    xaxis2_title="Market Cap (USD Billions)",

    yaxis=dict(tickfont=dict(size=11)),

    yaxis2=dict(tickfont=dict(size=11)),

)



fig.show()

How to read this chart: The left panel shows the top joiner candidates ranked by market cap — the strongest contenders based on the screening funnel above. The right panel shows the smallest current S&P 500 members — the stocks most exposed to removal.

A few things to notice:

How much bigger are the top joiner candidates than the smallest members? The larger the gap, the more glaring the mismatch — and the stronger the case that the committee will eventually act.
Are any leavers significantly smaller than the rest? A single outlier on the right panel is more exposed than a cluster of similarly-sized stocks — the committee has a clearer target.
Compare the two panels side by side. If the top joiners are 3–5x the size of the smallest members, the pressure to swap is substantial. If they're only marginally larger, the committee may wait.

8. Where to Go From Here

This notebook covers the core analysis, but there are many ways to extend it depending on what you're most interested in:

The same LSEG data and functions work across thousands of indices globally. Change INDEX_RIC and INDEX_CHAIN in the configuration section, and the entire analysis adapts.

In short: When the S&P 500 swaps its members, the stocks involved tend to move — not because of any change in the company's business, but because trillions of dollars in index-tracking funds are obligated to trade. This notebook measures how large and how consistent that effect has been, and identifies which stocks might be next. The analysis is designed to be re-run before each quarterly rebalancing so the results stay current. It's not a trading strategy — but it's the kind of evidence-based groundwork that a thoughtful strategy can be built on.

Register or Log in to applaud this article
Let the author know how much this article helped you

If you require assistance, please contact us here

API

LSEG Data Library for Python

GitHub - Python Jupyter Notebook

What Happens to a Stock When It Joins (or Leaves) a Major Index

Setup

Open the data session

Configuration

1. Which Stocks Were Added or Removed?

2. Collecting Price and Volume Data

3. Separating the Signal from the Noise

4. The Price Impact: How Do Stocks React to Being Added or Removed?

5. Following the Money: The Volume Spike

6. How Consistent Is the Pattern? Looking at Individual Events

Hit rate

7. Looking Ahead: Which Stocks Might Be Next?

8. Where to Go From Here

Related APIs

Source Code