Overview
Last Updated: November 2023
This article is the second part of the series that demonstrate how to export financial data and report from Python/Jupyter application to Excel report file using xlwings CE and xlwings Pro libraries. The demo applications use content from Refinitiv Data Platform (RDP) via the the ease-of-use Data Library for Python as an example dataset. If you are not familiar with xlwings library or xlwings CE, please see more detail on first part article.
This second article is focusing on xlwings Reports and PDF features.
Introduction to xlwings
xlwings is a Python library that makes it easy to call Python from Excel and vice versa on Windows and macOS. The library lets you automate Excel from Python source code to produce reports or to interact with Jupyter notebook applications. It also allows you to replace VBA macros with Python Code or write UDFs (user defined functions - Windows only).
- The xlwings CE is a free and open-source library (BSD-licensed) which provides basic functionalities to lets developers integrate Python with Excel.
- The xlwings PRO provides more advance features such as reports, embedded Python code in Excel, one-click installers for easy deployment, video training, dedicated support and much more.
Note:
- This notebook is based on xlwings versions 0.30.12.
Introduction to the Data Library for Python
The Refinitiv Data Library for Python provides a set of ease-of-use interfaces offering coders uniform access to the breadth and depth of financial data and services available on the Refinitiv Data Platform. The API is designed to provide consistent access through multiple access channels and target both Professional Developers and Financial Coders. Developers can choose to access content from the desktop, through their deployed streaming services, or directly to the cloud. With the Refinitiv Data Library, the same Python code can be used to retrieve data regardless of which access point you choose to connect to the platform.
Depending on the level of service and data requirements, developers can easily choose how to access data services using the appropriate access channel defined within the library.
The RDP Library are available in the following programming languages:
For more deep detail regarding the RD Libraries, please refer to the following articles and tutorials:
Disclaimer
As this notebook is based on RD Library Python versions 1.5.0.
Introduction to xlwings Reports
The xlwings Reports is part of xlwings PRO and a solution for template-based Excel and PDF reporting. The xlwings Reports let business users design and maintain their reports directly within Excel without being dependent on a dedicated reporting team or Python programmer.
The main features of the xlwings Reports are the following:
- Separation of code and design: Users without coding skills can change the template on their own without having to touch the Python code.
- Template variables: Python variables (between curly braces) can be directly used in cells, e.g. {{ title }}. They act as placeholders that will be replaced by the actual values.
- Frames for dynamic tables: Frames are vertical containers that dynamically align and style tables that have a variable number of rows.
You can get a free trial for xlwings PRO here, then follow the instruction on How to activate xlwings PRO page.
Intel vs AMD Report Template Preparation
We will use Intel and AMD stock price comparison as example data for this xlwings Reports file.
Firstly, we create the Excel template as part2_rdp_report_template.xlsx file. The report template contains two sheets, one for daily pricing comparison and one for volume comparison.
The daily pricing sheet template example is following:
Templates Variables
You will noticed the double curly bracket placeholders like {{ intel_price_title}}, {{ intel_price_df}}, {{ amd_graph }}, etc in the Excel template file. They are called templates variable. xlwings Reports will replace those template variables with data (Pandas DataFrame, text, Matplotlib/Plotly Charts, etc) from the Python code automatically.
Excel Table
Let take a closer look in the Daily Price Sheet, the {{ intel_price_df}} and {{ amd_price_df}} template variables are in the Excel Table.
Using Excel tables is the recommended way to format tables as the styling (themes and alternating colors) can be applied dynamically across columns and rows. You can create Excel Table by go to Insert > Table menus and make sure that you activate My table has headers before clicking on OK. Then add the placeholder as usual on the top-left in your template.
Note:
- For Excel table support, you need at least xlwings version 0.21.0.
- This feature supports Pandas DataFrame objects only (As of November 2020)
Multiple Sheets
The xlwings Reports also support multiple Excel Sheets. Business users just create new Sheets in a single Excel template file and place template variables, frame tags in those Sheets based on the business requirements. xlwings PRO automatically replaces associate data in all Sheets. Developers do not need to manually create and manipulate new Excel Sheet(s) anymore.
Let's demonstrate with Intel vs AMD Volume comparison Sheet template.
Shape Text
With newly released xlwings version 0.21.4, xlwings Report supports template text in Shapes objects like Boxes or Rectangles with the templates variable. Please see more detail on the Shape Text feature page.
Note: This article does not demonstrate the Shape Text feature.
Now the template file is ready, we can continue on the data preparation side of the Python Code.
Code Walkthrough
Intel vs AMD Data Preparation: Days Daily Pricing data with RDP Content Layers
The Content layer refers to logical market data objects, representing financial items like level 1 market data prices and quotes, News, Historical Pricing, Bond Analytics and so on.
When comparing to the Function Layer, the Content Layer provides much more flexible manners for developers:
- Richer / fuller response e.g. metadata, sentiment scores - where available
- Using Asynchronous or Event-Driven operating modes - in addition to Synchronous
- Streaming Level 1 Market Price Data - as well as Snapshot requests
The Content layer can easily be used by both professional developers and financial coders. It provides great flexibility for familiar and commonly used financial data models.
Please find more detail regarding the Content Layer on Data Library for Python document page.
Import xlwings PRO and RD Library
The application needs to import xlwings, xlwings.pro.reports, and refinitiv.data packages in order to interact with the xlwings Reports and RD library.
# import xlwings and RD library
import xlwings as xw
from xlwings.pro.reports import create_report
import refinitiv.data as rd
from refinitiv.data.content import historical_pricing
from refinitiv.data.content.historical_pricing import Intervals
# import all required libraries for this notebook
import datetime
import configparser as cp
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib
import matplotlib.ticker as tick
import json
import datetime
import asyncio
Initiate and Getting Data from RD Library
The RD Library let the application consume data from the following platforms:
- DesktopSession (Eikon/Refinitiv Workspace)
- PlatformSession (RDP, Refinitiv Real-Time Optimized)
- DeployedPlatformSession (deployed Refinitiv Real-Time/ADS)
This article only focuses on the DesktopSession. However, for other session types, the main logic is the same when it interacts with xlwings library.
# Open Desktop Session
rd.open_session(config_name='./refinitiv-data.config.json')
# Result: <refinitiv.data.session.Definition object at 0x1b9bf0caaa0 {name='workspace'}>
Next, we define all the necessary variables for requesting data from Desktop Session.
# Define RICs
intel_ric = 'INTC.O'
amd_ric = 'AMD.O'
fields = ['BID','ASK','OPEN_PRC','HIGH_1','LOW_1','TRDPRC_1','BLKVOLUM']
count = 90
This notebook example utilizes Python asyncio library to retrieve data from RD Content Layer's historical_pricing interface get_data_async() function asynchronously.
# Run two requests processes concurrently.
tasks = asyncio.gather(
historical_pricing.summaries.Definition(
universe = intel_ric,
fields = fields,
count = count,
interval = Intervals.DAILY).get_data_async(closure='Intel'),
historical_pricing.summaries.Definition(
universe = amd_ric,
fields = fields,
count = count,
interval = Intervals.DAILY).get_data_async(closure='AMD')
)
Once the task (requests Daily data of Intel and AMD) is completed, get the response data in Pandas DataFrame object format via <response>.data.df statement.
await tasks
def display_reponse(response):
print(response)
print("\nReponse received for", response.closure)
intel_interday, amd_interday = tasks.result()
print("\nHistorical Pricing Summaries - Interday - Intel")
intel_df_pricing = intel_interday.data.df
display(intel_df_pricing)
#check AMD data with the same code above
Now we have the raw Intel and AMD Daily Price data, we can close the RDP Session.
# -- Close Session, just calls close_session() function
rd.close_session()
The next phase is restructuring data to make it easier to read and to plot the report graphs.
Restructure DataFrame
Please note that the restructure steps are identical to the part 1 notebook application. We start by naming the index column to "Date"
Next, we change all non-Date columns data type from String to Float, and change the DataFrame Date index to be a data column. This will let us plot a graph using Date as X-Axis.
for column in intel_df_pricing:
intel_df_pricing[column]=intel_df_pricing[column].astype(float)
for column in amd_df_pricing:
amd_df_pricing[column]=amd_df_pricing[column].astype(float)
intel_df_pricing.reset_index(level=0, inplace=True)
amd_df_pricing.reset_index(level=0, inplace=True)
intel_df_pricing.head(5)
Finally, we sort data in ascending order.
# Sort DataFrame by Date
intel_df_pricing.sort_values('Date',ascending=True,inplace=True)
amd_df_pricing.sort_values('Date',ascending=True,inplace=True)
Plotting Intel and AMD 90 Daily Pricing Charts
We use Matplotlib's Pyplot library to plot Intel and AMD Daily Pricing graphs. Each graph represents interday data for the last 90 days pricing information.
The source code also creates the Pyplot Figure objects which we will pass them to the report file as pictures.
# Plotting a Graph for Intel
columns = ['OPEN_PRC','HIGH_1','LOW_1','TRDPRC_1']
intel_df_pricing.set_index('Date',drop=True,inplace=True)
intel_figure = plt.figure()
plt.xlabel('Date', fontsize='large')
plt.ylabel('Price', fontsize='large')
# Create graph title from Company and RIC names dynamically.
plt.ticklabel_format(style = 'plain')
plt.title('Intel interday data for last 90 days', color='black',fontsize='x-large')
ax = intel_figure.gca()
intel_df_pricing.plot(kind='line', ax = intel_figure.gca(),y=columns,figsize=(14,7) , grid = True)
plt.show()
# Plotting a Graph for AMD
columns = ['OPEN_PRC','HIGH_1','LOW_1','TRDPRC_1']
amd_df_pricing.set_index('Date',drop=True,inplace=True)
amd_figure = plt.figure()
plt.xlabel('Date', fontsize='large')
plt.ylabel('Price', fontsize='large')
# Create graph title from Company and RIC names dynamically.
plt.ticklabel_format(style = 'plain')
plt.title('AMD interday data for last 90 days', color='black',fontsize='x-large')
ax = amd_figure.gca()
amd_df_pricing.plot(kind='line', ax = amd_figure.gca(),y=columns,figsize=(14,7), grid = True )
plt.show()
Now we got the charts and figure objects ready for the Pricing Sheet report. Next, we will create the Volume comparison chart for the Intel vs AMD Volume comparison Sheet.
Plotting Intel vs AMD Volume Comparison Chart
The next chart is block trading volume comparison which is the BLKVOLUM data field. This chart contains Intel and AMD data in the same figure.
columns = ['BLKVOLUM']
# Intel
intel_amd_volume_figure = plt.figure()
plt.xlabel('Date', fontsize='large')
plt.ylabel('Trading Volume', fontsize='large')
# Create graph title from Company and RIC names dynamically.
plt.ticklabel_format(style = 'plain')
plt.title('AMD vs Intel total block trading volume comparison for last 90 days', color='black',fontsize='x-large')
ax = intel_amd_volume_figure.gca()
intel_df_pricing.plot(kind='line', ax = intel_amd_volume_figure.gca(),y=columns,figsize=(14,7) , label=['Intel trading volume'],grid = True)
# AMD
amd_df_pricing.plot(kind='line', ax = ax ,y=columns,figsize=(14,7), label=['AMD trading volume'],grid = True)
plt.show()
Now all data (DataFrame and Charts) is ready. We have demonstrated the Reports-API and with Reports API render_template() function in the part-1 article as the following example.
app = xw.App(visible=True)
report = app.render_template(
'rdp_report_template.xlsx',
'rdp_report_pro.xlsx',
historical_title=historical_title,
df_historical=df_historical.head(10),
graph = fig)
The code above is ok for small data. This part-2 notebook will show more features that developers can work with the render_template() function for supporting various requirements and template variables.
Firstly, let's define static texts and template/report file location.
# Define Static texts and template/report file location.
intel_price_title='Intel Hitorical Data'
amd_price_title = 'AMD Historical Data'
template_file = 'part2_rdp_report_template.xlsx'
report_file = 'part2_rdp_intel_vs_amd.xlsx'
Next, we create the Python Dictionary object to collect all data for template variables. Please note that the Dictionary keys must have the same names as template variables.
# Create a Dictionary to collect all report data
data = dict(
intel_price_title=intel_price_title,
intel_price_df = intel_df_pricing.head(15),
intel_graph = intel_figure,
amd_price_title = amd_price_title,
amd_price_df = amd_df_pricing.head(15),
amd_graph = amd_figure,
intel_amd_volume_graph = intel_amd_volume_figure,
)
Then we call the create_report function.
app = xw.App(visible=True)
report = app.render_template(
template_file,
report_file,
**data)
The render_template() function above generates part2_rdp_intel_vs_amd.xlsx Excel report file with format/style defined in part2_rdp_report_template.xlsx and data that we pass to the function. With the default parameter, the part2_rdp_intel_vs_amd.xlsx file will be open automatically.
Developers can control the Excel instance with the xlwings App instance. For example, to run the report in a separate and hidden instance of Excel. This is a useful feature if Developers aim to create an application that runs as a background service to generate reports daily, monthly, or weekly based on the business requirements.
with xw.App(visible=False) as app:
report = app.render_template(
template_file,
'part_2_daily_report.xlsx,
**data)
Now the part_2_daily_report.xlsx Excel report file is created in the background.
Exporting Excel report to PDF
With newly release xlwings version 0.21.1 onward, the xlwings CE can export the whole Excel workbook or a subset of the sheets to a PDF file with xlwings Book.to_pdf() function. Please see more detail regarding the to_pdf function on xlwings API reference page.
We will demonstrate this feature with a quick Python source code from the first notebook to create the Intel Daily Pricing report in PDF file format.
Firstly, create a new blank Excel report file and set a basic Report style.
wb = xw.Book() # Creating an new excel file. wb = xw.Book(filename) would open an existing file
intel_price_sheet = wb.sheets[0]
intel_price_sheet.name = 'Intel Pricing'
intel_price_sheet.range("A1").value = 'Intel Pricing'
intel_price_sheet.range("A1").api.Font.Size = 14 # Change font size
intel_price_sheet.range("A1").api.Font.ColorIndex = 2 # Change font color
intel_price_sheet.range('A1:H1').color = (0,0,255) # Change cell background color
# Set Pandas DataFrame object to newly created Excel File
intel_price_sheet.range("A2").value = intel_df_pricing.head(15)
# Set data table format
intel_price_sheet.range('2:1').api.Font.Bold = True #Make Column headers bold
intel_price_sheet.range('A2:H2').color = (144,238,144) # Change cell background color
intel_price_sheet.autofit('c') # Set sheet autofit the width of column
Next, find the position of the last row of the report table as a position to plot a graph (intel_figure).
# historical_sheet.cells.last_cell.row = row of the lower right cell
'''
change to your specified column, then go up until you hit a non-empty cell
'''
intel_price_last_row = intel_price_sheet.range((intel_price_sheet.cells.last_cell.row, 1)).end('up').row
rng = intel_price_sheet.range('A{row}'.format(row = intel_price_last_row + 1))
# Resize inte_figure Figure object
intel_figure.set_figheight(6)
intel_figure.set_figwidth(6)
# Add figure to Excel report file as a picture
intel_price_sheet.pictures.add(intel_figure, name='MyPlot', update=True, top=rng.top, left=rng.left)
Then call the Book.to_pdf() function to save this Excel report as PDF file.
wb.to_pdf('./part_2_xlwings_to_pdf.pdf') # defaults to the same name as the workbook, in the same directory
The Excel report with RDP content will be saved as part_2_xlwings_to_pdf.pdf file.
Conclusion and Next Step
The xlwings CE library lets Python developers integrate data with Excel in a simple way. The xlwings PRO allows Python developers and business users to work together to integrate data with Excel or PDF report file in much easier than xlwings CE.
The xlwings Reports help businesses and financial teams design the report to match their business requirement freely. The Python developers/data engineers can focus on how to retrieve and optimize data without no need to worry about report design, look & feel. xlwings Reports also help developers can automate report generator process periodicity (such as a daily, weekly, or monthly report).
If users want dynamic data and charts in the report file, the xlwings Embedded Code feature lets users run Python code in the macro-enabled Excel report directly. Users do not need to run a separate Python code themselves or wait for Developers to generate a report file for them.
The newly introduced to_pdf feature also lets developers export the Excel Workbook/Sheets to the PDF file. This function helps business users who do not have Microsoft Office installed can still be able to open the PDF report file.
At the same time, the Data Library for Python let developers rapidly access the Data Platform content with a few lines of code that easy to understand and maintain. Developers can focus on implement the business logic or analysis data without worry about the connection, authentication detail with the Platforms.
The integration between LSEG APIs and xlwings is not limited to only the Data Library. Any LSEG APIs that support Python programming language such as Data API (Data API-xlwings article), or RKD API, or DataStream Web Service - Python can work with xlwings library using the same concept and code logic as this Data Library notebook examples.
References
You can find more details regarding the Data Library for Python, xlwings and, related technologies for this notebook from the following resources:
- Data Library for Python on the LSEG Developer Community web site.
- Xlwings web site.
- Xlwings CE Document page.
- xlwings Pro page.
- xlwings API Reference page.
- xlwings Reports page.
- xlwings Embedded Code page.
- Financial Reporting with Eikon, xlwings and Excel.
For any questions related to this article or Data Library, please use the Developers Community Q&A Forum.