Authors:
In this article, we will investigate Public Companies' Climate targets and whether they are on track to meet them. The data needed for such an investigation is not onboarded on LSEG's Data Platform yet; but, thankfully, it is on both (i) on LSEG's Sustainable Finance Investing (SFI) Snowflake and (ii) on Workspace. Thanks to the latter, we can call for this data on CodeBook. For more information on the Climate data we are collecting and the methodology, please refer to the article "LSEG's SFI Snowflake Datawarehouse: A use-case evaluating public companies' Climate track record using Python and SQL".
LSEG usually collects data from a number of vendors, but when it comes to climate data, we rely on our analysts to retrieve the information directly from companies disclosures. The most exciting data onboarded has, recently, related to ESG and Climate topics. In this article, we will directly source the data we are after using the Data Item Browser and CodeBook.
Importing libraries and Authentifying ourselves to LSEG's services
import re # We'll use the `re` library to seperate numbers from words in strings.
import pandas as pd # this library is great for dataframe manipulations.
import plotly.graph_objects as go # `plotly` and its modules are useful for us to plot graphs.
import numpy as np # `numpy` in this case will be useful for us to plot graphs.
import lseg.data as ld
ld.open_session()
Calling and tidying our data
For more information on the reason why we are collecting this data, please refer to the article "LSEG's SFI Snowflake Datawarehouse: A use-case evaluating public companies' Climate track record using Python and SQL". With that said, it will be evident, during the article, why we collected this data.
First, let's put in place the primary parameters:
RIC = "MSFT.O" # SHEL.L, AAPL.O
scopeOfInterest = "1" # choice between "1", "2", "3", "1 and 2" or "1, 2 and 3"
setOfInterest = 1 # choice from 1 to 5
termOfInterest = "LT" # choice between "ST" and "LT"
sourceOfInterest = "LSEG" # choice between "LSEG" and "CDP"
We need extra parameters for the logic we'll create below, I define them here:
scopeOfInterestFirstNumber = int(re.search(r'\d+', scopeOfInterest).group())
if termOfInterest == "ST":
termNotOfInterest = "LT"
else:
termNotOfInterest = "ST"
if sourceOfInterest == "LSEG":
_sourceOfInterest = ""
elif sourceOfInterest == "CDP":
_sourceOfInterest = sourceOfInterest
Now, we can create the list of fields we're after from LSEG's services:
lstOfFlds = ["TR.CO2EquivalentEmissionsTotalMarketbased", "TR.CO2EmissionTotal"]
for i in range(1,4):
lstOfFlds.append(f"TR.{_sourceOfInterest}CO2DirectScope{i}")
lstOfFlds.append(f"TR.{_sourceOfInterest}CO2IndirectScope{i}")
lstOfFlds.append(f"TR.{_sourceOfInterest}CO2EmissionTotal")
lstOfFlds.append(f"TR.{_sourceOfInterest}TotalCO2EquivalentEmissionsScope1and2and3")
lstOfFlds.append(f"TR.{_sourceOfInterest}TotalCO2EquivalentEmissionScope1and2and3Marketbased")
lstOfFlds.append(f"TR.{_sourceOfInterest}{termOfInterest}GHGEmissionPercentageReductionTargetedSet{setOfInterest}")
lstOfFlds.append(f"TR.{_sourceOfInterest}{termOfInterest}PercentageofGHGEmissionCoveredbyTargetSet{setOfInterest}")
lstOfFlds.append(f"TR.{_sourceOfInterest}{termOfInterest}GHGEmissionScopeSet{setOfInterest}")
lstOfFlds.append(f"TR.{_sourceOfInterest}{termOfInterest}GHGEmissionBaseYearSet{setOfInterest}")
lstOfFlds.append(f"TR.{_sourceOfInterest}{termOfInterest}GHGEmissionTargetYearSet{setOfInterest}")
print(lstOfFlds)
['TR.CO2EquivalentEmissionsTotalMarketbased', 'TR.CO2EmissionTotal', 'TR.CO2DirectScope1', 'TR.CO2IndirectScope1', 'TR.CO2DirectScope2', 'TR.CO2IndirectScope2', 'TR.CO2DirectScope3', 'TR.CO2IndirectScope3', 'TR.CO2EmissionTotal', 'TR.TotalCO2EquivalentEmissionsScope1and2and3', 'TR.TotalCO2EquivalentEmissionScope1and2and3Marketbased', 'TR.LTGHGEmissionPercentageReductionTargetedSet1', 'TR.LTPercentageofGHGEmissionCoveredbyTargetSet1', 'TR.LTGHGEmissionScopeSet1', 'TR.LTGHGEmissionBaseYearSet1', 'TR.LTGHGEmissionTargetYearSet1']
len(lstOfFlds)
16
Now we're cooking:
df0 = ld.get_history(
universe=[RIC],
fields=lstOfFlds,
start="1990-01-01",
end="2025-01-01",
header_type=ld.HeaderType.NAME_AND_TITLE,
).dropna(how='all')
df0.tail(3)
MSFT.O | TR.CO2EQUIVALENTEMISSIONSTOTALMARKETBASED|CO2 Equivalent Emissions Total Market-based | TR.CO2EMISSIONTOTAL|CO2 Equivalent Emissions Total | TR.CO2DIRECTSCOPE1|CO2 Equivalent Emissions Direct, Scope 1 | TR.CO2INDIRECTSCOPE2|CO2 Equivalent Emissions Indirect, Scope 2 | TR.CO2INDIRECTSCOPE3|CO2 Equivalent Emissions Indirect, Scope 3 | TR.TOTALCO2EQUIVALENTEMISSIONSSCOPE1AND2AND3|Total CO2 Equivalent Emissions Scope 1, 2 and 3 | TR.TOTALCO2EQUIVALENTEMISSIONSCOPE1AND2AND3MARKETBASED|Total CO2 Equivalent Emission Scope 1, 2 and 3 Market-based | TR.LTGHGEMISSIONPERCENTAGEREDUCTIONTARGETEDSET1|LT GHG Emission Percentage Reduction Targeted Set1 | TR.LTPERCENTAGEOFGHGEMISSIONCOVEREDBYTARGETSET1|LT Percentage of GHG Emission Covered by Target Set1 | TR.LTGHGEMISSIONSCOPESET1|LT GHG Emission Scope Set1 | TR.LTGHGEMISSIONBASEYEARSET1|LT GHG Emission Base Year Set1 | TR.LTGHGEMISSIONTARGETYEARSET1|LT GHG Emission Target Year Set1 |
Date | ||||||||||||
30/06/2020 | 574219 | <NA> | <NA> | <NA> | <NA> | <NA> | 12227000 | <NA> | <NA> | <NA> | <NA> | |
30/06/2021 | 553109 | 5134371 | 123704 | 5010667 | 13839000 | 18973371 | 14392000 | 75 | 100 | Scope 1 and 2 | 2013 | 2045 |
30/06/2022 | 427442 | 6520663 | 139413 | 6381250 | 16111000 | 22631663 | 16538000 | 75 | 100 | Scope 1 and 2 | 2013 | 2045 |
As it happens, "CO2 Equivalent Emissions Total" is "CO2 Equivalent Emissions Direct, Scope 1"; I don't want to overwright the column names, so let's just add a new one:
if scopeOfInterest == "1 and 2":
df0[f"TR.{sourceOfInterest}CO2EQUIVALENTEMISSIONTOTAL|CO2 Equivalent Emissions Direct, Scope 1 and 2"] = df0[f"TR.{sourceOfInterest}CO2EQUIVALENTEMISSIONTOTAL|CO2 Equivalent Emissions Total"]
print(list(df0.columns))
['TR.CO2EQUIVALENTEMISSIONSTOTALMARKETBASED|CO2 Equivalent Emissions Total Market-based', 'TR.CO2EMISSIONTOTAL|CO2 Equivalent Emissions Total', 'TR.CO2DIRECTSCOPE1|CO2 Equivalent Emissions Direct, Scope 1', 'TR.CO2INDIRECTSCOPE2|CO2 Equivalent Emissions Indirect, Scope 2', 'TR.CO2INDIRECTSCOPE3|CO2 Equivalent Emissions Indirect, Scope 3', 'TR.TOTALCO2EQUIVALENTEMISSIONSSCOPE1AND2AND3|Total CO2 Equivalent Emissions Scope 1, 2 and 3', 'TR.TOTALCO2EQUIVALENTEMISSIONSCOPE1AND2AND3MARKETBASED|Total CO2 Equivalent Emission Scope 1, 2 and 3 Market-based', 'TR.LTGHGEMISSIONPERCENTAGEREDUCTIONTARGETEDSET1|LT GHG Emission Percentage Reduction Targeted Set1', 'TR.LTPERCENTAGEOFGHGEMISSIONCOVEREDBYTARGETSET1|LT Percentage of GHG Emission Covered by Target Set1', 'TR.LTGHGEMISSIONSCOPESET1|LT GHG Emission Scope Set1', 'TR.LTGHGEMISSIONBASEYEARSET1|LT GHG Emission Base Year Set1', 'TR.LTGHGEMISSIONTARGETYEARSET1|LT GHG Emission Target Year Set1']
As you see, there are several Scopes and Sets we may be interested in. Scope can be "1", "2", "1+2", "3" or "1+2+3". Scope can be anything from 1 to 5 (inclusive).
if sourceOfInterest == "LSEG":
cols = [
col for col in df0.columns
if 'CO2EQUIVALENTEMISSIONSTOTALMARKETBASED' in col
or "CO2EMISSIONTOTAL" in col
or (
'CDP' not in col # not in
and f'Scope {scopeOfInterest}, ' not in col
and f'Scope {scopeOfInterest} and ' not in col
and f'Scope {scopeOfInterestFirstNumber+1}' not in col
and f'Scope {scopeOfInterestFirstNumber+2}' not in col
and f'Scope {scopeOfInterestFirstNumber-1}' not in col
and f'Scope {scopeOfInterestFirstNumber-2}' not in col
and f'{termNotOfInterest}GHGEMISSION' not in col
and f'{termNotOfInterest}PERCENTAGE' not in col
and (
f'Scope {scopeOfInterest}' in col
or f'Set{setOfInterest}' in col))]
if sourceOfInterest == "CDP": # choice of "LSEG or "CDP"
cols = [
col for col in df0.columns
if 'CO2EQUIVALENTEMISSIONSTOTALMARKETBASED' in col
or "CO2EMISSIONTOTAL" in col
or (
'CDP' in col # in
and f'Scope {scopeOfInterest}, ' not in col
and f'Scope {scopeOfInterest} and ' not in col
and f'Scope {scopeOfInterestFirstNumber+1}' not in col
and f'Scope {scopeOfInterestFirstNumber+2}' not in col
and f'Scope {scopeOfInterestFirstNumber-1}' not in col
and f'Scope {scopeOfInterestFirstNumber-2}' not in col
and f'{termNotOfInterest}GHGEMISSION' not in col
and f'{termNotOfInterest}PERCENTAGE' not in col
and (
f'Scope {scopeOfInterest}' in col
or f'Set{setOfInterest}' in col))]
print(len(cols))
cols
8
['TR.CO2EQUIVALENTEMISSIONSTOTALMARKETBASED|CO2 Equivalent Emissions Total Market-based',
'TR.CO2EMISSIONTOTAL|CO2 Equivalent Emissions Total',
'TR.CO2DIRECTSCOPE1|CO2 Equivalent Emissions Direct, Scope 1',
'TR.LTGHGEMISSIONPERCENTAGEREDUCTIONTARGETEDSET1|LT GHG Emission Percentage Reduction Targeted Set1',
'TR.LTPERCENTAGEOFGHGEMISSIONCOVEREDBYTARGETSET1|LT Percentage of GHG Emission Covered by Target Set1',
'TR.LTGHGEMISSIONSCOPESET1|LT GHG Emission Scope Set1',
'TR.LTGHGEMISSIONBASEYEARSET1|LT GHG Emission Base Year Set1',
'TR.LTGHGEMISSIONTARGETYEARSET1|LT GHG Emission Target Year Set1']
df1 = df0[cols]
df1.tail(5)
MSFT.O | TR.CO2EQUIVALENTEMISSIONSTOTALMARKETBASED|CO2 Equivalent Emissions Total Market-based | TR.CO2EMISSIONTOTAL|CO2 Equivalent Emissions Total | TR.CO2DIRECTSCOPE1|CO2 Equivalent Emissions Direct, Scope 1 | TR.LTGHGEMISSIONPERCENTAGEREDUCTIONTARGETEDSET1|LT GHG Emission Percentage Reduction Targeted Set1 | TR.LTPERCENTAGEOFGHGEMISSIONCOVEREDBYTARGETSET1|LT Percentage of GHG Emission Covered by Target Set1 | TR.LTGHGEMISSIONSCOPESET1|LT GHG Emission Scope Set1 | TR.LTGHGEMISSIONBASEYEARSET1|LT GHG Emission Base Year Set1 | TR.LTGHGEMISSIONTARGETYEARSET1|LT GHG Emission Target Year Set1 |
Date | ||||||||
30/06/2019 | <NA> | <NA> | <NA> | 75 | 100 | Scope 1 and 2 | 2013 | 2045 |
30/06/2020 | <NA> | 4447016 | 118100 | 75 | 100 | Scope 1 and 2 | 2013 | 2045 |
30/06/2020 | 574219 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | |
30/06/2021 | 553109 | 5134371 | 123704 | 75 | 100 | Scope 1 and 2 | 2013 | 2045 |
30/06/2022 | 427442 | 6520663 | 139413 | 75 | 100 | Scope 1 and 2 | 2013 | 2045 |
df2 = df1.replace('', pd._libs.missing.NA)
df3 = df2.dropna(how='all')
df3.tail(3)
MSFT.O | TR.CO2EQUIVALENTEMISSIONSTOTALMARKETBASED|CO2 Equivalent Emissions Total Market-based | TR.CO2EMISSIONTOTAL|CO2 Equivalent Emissions Total | TR.CO2DIRECTSCOPE1|CO2 Equivalent Emissions Direct, Scope 1 | TR.LTGHGEMISSIONPERCENTAGEREDUCTIONTARGETEDSET1|LT GHG Emission Percentage Reduction Targeted Set1 | TR.LTPERCENTAGEOFGHGEMISSIONCOVEREDBYTARGETSET1|LT Percentage of GHG Emission Covered by Target Set1 | TR.LTGHGEMISSIONSCOPESET1|LT GHG Emission Scope Set1 | TR.LTGHGEMISSIONBASEYEARSET1|LT GHG Emission Base Year Set1 | TR.LTGHGEMISSIONTARGETYEARSET1|LT GHG Emission Target Year Set1 |
Date | ||||||||
30/06/2020 | 574219 | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> | <NA> |
30/06/2021 | 553109 | 5134371 | 123704 | 75 | 100 | Scope 1 and 2 | 2013 | 2045 |
30/06/2022 | 427442 | 6520663 | 139413 | 75 | 100 | Scope 1 and 2 | 2013 | 2045 |
df4 = df3.filter(like='PERCENTAGEOFGHGEMISSIONCOVEREDBYTARGETSET')
if df4.apply(pd.to_numeric, errors='coerce').dropna().gt(80).all().all():
print("'GHGEmissionCoveredbyTarget' figures are above 80%. The higher this figure, the more reliable the results.")
print("In this case, we consider them to be high enough to be reliable.")
else:
print("Be weary of the fact that 'GHGEmissionCoveredbyTarget' is below 80%. The lower this figure, the less reliable the results.")
'GHGEmissionCoveredbyTarget' figures are above 80%. The higher this figure, the more reliable the results.
In this case, we consider them to be high enough to be reliable.
Plotting the results
df5 = df3.copy()
df5.index = [np.int64(z) for z in pd.to_datetime(df5.index, errors='coerce').year.astype(object)] # keep only the years
baseEmissionYear = int(df3.filter(like='GHGEMISSIONBASEYEARSET').iloc[-1].values[0])
targetEmissionReductionYear = int(df3.filter(like='GHGEMISSIONTARGETYEARSET').iloc[-1].values[0])
baseEmission = df3[df3.index.year == baseEmissionYear].filter(like='CO2EMISSIONTOTAL').values[0][0]
targetReductionEmission = baseEmission * ((df3.filter(like='GHGEMISSIONPERCENTAGEREDUCTIONTARGETED').dropna().iloc[-1] / 100).iloc[0] * (df3.filter(like='PERCENTAGEOFGHGEMISSIONCOVEREDBYTARGET').dropna().iloc[-1] / 100).iloc[0])
targetEmission = baseEmission - targetReductionEmission
print(f"baseEmissionYear = {baseEmissionYear} & targetEmissionReductionYear = {targetEmissionReductionYear}")
print(f"baseEmission = {baseEmission} & targetReductionEmission = {targetReductionEmission} & targetEmission = {targetEmission}")
baseEmissionYear = 2013 & targetEmissionReductionYear = 2045
baseEmission = 1355480 & targetReductionEmission = 1016610.0 & targetEmission = 338870.0
start_year = df3.T.filter(like=str(baseEmissionYear)).columns.values[0]
end_year = np.datetime64(
f"{targetEmissionReductionYear}-{str(start_year)[5:7]}-{str(start_year)[8:10]}")
start_year = np.datetime64(start_year, 'ns')
end_year = np.datetime64(end_year, 'ns')
start_value = baseEmission
end_value = targetEmission
# Calculate the decay constant 'k' for exponential decay
# We use the formula for exponential decay: y(t) = y0 * exp(-k * (t - t0))
# Solve for 'k' using y(t) at the end point
# time_difference_years = (end_year - start_year).astype('timedelta64[Y]').astype(float)
# k = -np.log(end_value / start_value) / time_difference_years
if end_value / start_value == 0:
k = -np.log(0.01) / (int(str(end_year)[:4]) - int(str(start_year)[:4])) # Use a small value instead of zero to avoid log(0) issue
else:
k = -np.log(end_value / start_value) / ((end_year - start_year).astype('timedelta64[Y]').astype(float))
# Generate years and corresponding values using exponential decay
years = np.arange(
start_year.astype('datetime64[Y]').astype(int) + 1970,
end_year.astype('datetime64[Y]').astype(int) + 1971)
values = start_value * np.exp(
-k * (years - (start_year.astype('datetime64[Y]').astype(int) + 1970)))
decayDf = pd.DataFrame(data=values, index=years, columns=[f'Exponential Decay to target of {int(targetEmission)}'])
df6 = pd.concat([df5, decayDf])
# Create the plot
fig = go.Figure()
# Add the exponential decay line trace
fig.add_trace(
go.Scatter(
x=df6.index, y=df6[f'Exponential Decay to target of {int(targetEmission)}'], mode='lines',
name=f'Exponential Decay to target of {int(targetEmission)}'))
# Set plot title and labels
fig.update_layout(
title=f"{RIC} GHG Emissions (Base in {baseEmissionYear} and Target in {targetEmissionReductionYear}) in CO2 Tons",
xaxis_title="Year",
yaxis_title="CO2 Tons",
template="plotly_dark" # Optional: You can use different templates, such as 'plotly', 'ggplot2', etc.
)
# Add the dots representing the actual CO2 data from df3
fig.add_trace(go.Scatter(
x=df6.index,
y=df6[df6.filter(like="CO2EMISSIONTOTAL").columns[0]],
mode='lines+markers',
name='CO2 Equivalents Emission Total',
marker=dict(color='red', size=8) # Customize the dot color and size
))
fig.add_trace(go.Scatter(
x=df6.index,
y=df6[df6.filter(like="TR.CO2EQUIVALENTEMISSIONSTOTALMARKETBASED").columns[0]],
mode='lines+markers',
name='CO2 Equivalent Emissions Total Market-based',
marker=dict(color='grey', size=8) # Customize the dot color and size
))
# Show the plot
fig.show()
ld.close_session()
All together now
You can find the code for this function in the GitHub repo linked a the top right of this article.
ld.open_session()
df = GHGEmissionsBaseVsTargetUsingLSEGWorkspace(
RIC="SHEL.L", # MSFT.O, AAPL.O, SHEL.L, LVMH.PA
scope="1", # choice between "1", "2", "3", "1 and 2" or "1, 2 and 3"
set=1, # choice from 1 to 5
term="ST", # choice between "ST" and "LT"
source="LSEG") # choice between "LSEG" and "CDP"
ld.close_session()
'GHGEmissionCoveredbyTarget' figures are above 80%. The higher this figure, the more reliable the results.
In this case, we consider them to be high enough to be reliable.
baseEmissionYear = 2016 & targetEmissionReductionYear = 2025
baseEmission = 72000000 & targetReductionEmission = 6480000.0 & targetEmission = 65520000.0
Conclusion
As you can see, it is rather simple to make a target vs. actual emissions graph using both Snowflake and, in this case, Workspace/CodeBook!
Please note that this is not production code and is not to be used in production for investment tools, or used in investment decisions. It is only shown/published for educational purposes.