Introduction
The goal of this article is to demonstrate the Eikon Data API, with a focus on news retrieval, in a Jupyter Notebook environment. For that purpose we are going to look at new-issue news from International Financial Review (IFR), a global capital markets intelligence provider that is part of Refinitiv.
We will capture the PRICED or DEAL notifications, which contain structured text that we can extract.
Before we start, let's make sure that:
- Refinitiv Eikon desktop application is up and running;
- Eikon Data API library is installed;
- You have created an application ID for this script.
If you have not yet done this, have a look at the quick start section for this API.
A general note on Jupyter Notebook usage: to execute the code in a cell, press Shift+Enter. While the notebook is busy running your code, the cell prompt will look like this: In [*]. When it is finished, you will see it change to the sequence number of the task, followed by the output, if any. For example,
In [8]: df['Asset Type'].value_counts()
Out[8]: Investment Grade    47
        High Yield          24
        Islamic             10
        Covered              2
        Name: Asset Type, dtype: int64
For more info on the Jupyter Notebook, check out the Project Jupyter site at http://jupyter.org or the 'How to set up a Python development environment for Refinitiv Eikon' tutorial on the Developer Community portal.
Getting started
Let's start by importing the Eikon API library and pandas:
import eikon as ek
import pandas as pd
Paste your application ID into this line:
ek.set_app_key('your_app_id')
We are going to request emerging market new issue (ISU) Eurobond (EUB) news from International Financial Review Emerging EMEA service (IFREM), focusing on the notifications of the already priced issues. You can replicate this request in the News Monitor app with the following query:
- Product:IFREM AND Topic:ISU AND Topic:EUB AND ("PRICED" OR "DEAL")
from datetime import date
start_date, end_date = date(2017, 1, 1), date.today()
q = "Product:IFREM AND Topic:ISU AND Topic:EUB"
headlines = ek.get_news_headlines(query=q, date_from=start_date,
date_to=end_date, count=100)
headlines.head()
| | versionCreated | text | storyId | sourceCode |
|---|---|---|---|---|
| 2018-01-05 11:12:25.000 | 2018-01-05 11:12:25.000 | Slovenia sweeps to 10-year marker | urn:newsml:reuters.com:20180105:nL8N1P01LX:2 | NS:IFR |
| 2018-01-04 15:15:33.050 | 2018-01-04 15:15:33.050 | DEAL: Slovenia prices EUR1.5bn 1% Mar 2028 10y... | urn:newsml:reuters.com:20180104:nIFR1WtLpT:1 | NS:IFR |
| 2018-01-04 14:34:38.000 | 2018-01-04 14:39:30.000 | Macedonia hires banks for seven-year euro benc... | urn:newsml:reuters.com:20180104:nL8N1OZ3DS:1 | NS:IFR |
| 2018-01-04 14:34:38.000 | 2018-01-04 14:34:38.000 | MACEDONIA HIRES CITIGROUP, DEUTSCHE BANK AND E... | urn:newsml:reuters.com:20180104:nL8N1OZ3DS:1 | NS:IFR |
| 2018-01-04 14:27:25.000 | 2018-01-04 14:27:25.000 | Slovenia 10yr allocs out | urn:newsml:reuters.com:20180104:nL8N1OZ3BY:1 | NS:IFR |
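Since we are ultimately after the PRICED or DEAL notifications only, we could also narrow the result down locally before fetching each story. A minimal sketch of that filter, run here on an invented stand-in for the headlines frame (the real one comes from ek.get_news_headlines):

```python
import pandas as pd

# Invented stand-in mimicking the headlines frame returned above
headlines = pd.DataFrame({
    'text': [
        'DEAL: Slovenia prices EUR1.5bn 1% Mar 2028 10yr',
        'Macedonia hires banks for seven-year euro benchmark',
        'PRICED: Emirates REIT USD 400m 5yr sukuk',
    ],
    'storyId': ['urn:newsml:a', 'urn:newsml:b', 'urn:newsml:c'],
})

# Keep only the structured priced-deal notifications
deals = headlines[headlines['text'].str.contains('PRICED|DEAL')]
print(deals['storyId'].tolist())
```

This saves one get_news_story call per headline that could not contain a termsheet anyway.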
In the context of news, each story has its own unique identifier, created according to the RFC 3085 standard. Here is what a story looks like; notice that we use the standard HTML() function from IPython to display it:
from IPython.core.display import HTML
html = ek.get_news_story('urn:newsml:reuters.com:20180104:nIFR1WtLpT:1')
HTML(html)
[Status]: PRICED [Asset Type]: Investment Grade
[Pricing Date]: 04-Jan-18 [Issuer/Borrower Type]: Sovereign
[Issuer]: Slovenia [Offering Type]: Eurobond
[Issuer Long Name]: SLOVENIA, REPUBLIC [Bookrunners]:
OF (GOVERNMENT) Citi/CMZ/GS/HSBC/Jeff/Nova Ljubjanska
[Size]: EUR 1.5bn [Coupon]: 1.000 Fxd
[Ratings]: Baa1/A+/A- [Price]: 99.6540
[Tenor/Mty]: 10yr 06-Mar-28 [Reoffer Price]: 99.6540
[Issue Type]: bmk, snr unsec [Yield]: 1.036
[CUSIP/ISIN]: SI0002103776 [Spread]: MS+13
[Law]: Slovenian [Price Guidance]: MS+20 area
[Country]: SLOVENIA [Listed]: Ljubljana
[Region]: EEMEA [Denoms]: 1k/1k
[Settledate]: 11-Jan-18 [Fees]: Undisclosed
[Format]: Reg S only
[NOTES]: EUR1.5bn 10yr bmk. Baa1/A+/A-. Citi/CMZ/GS/HSBC(B&D)/Jeff/NLB. IPTs
MS+20 area, guidance +17 area, set +13 on bks closed >3.6bn (395m JLM). Long
first cpn. Vs DBR 0.5% 8/27 +59.3 @100.535 / HR 102%. . FTT 3:30pm
Now we can parse the data using a regular expression, but first we need to convert the HTML into plain text. Let's create a function that returns a dictionary from this type of article. We will use the lxml library to convert the HTML and re to parse its output.
from lxml import html
import re
def termsheet_to_dict(storyId):
    """Return a dict of the [Key]: Value pairs found in a termsheet story."""
    x = ek.get_news_story(storyId)
    # Strip the HTML markup, keeping only the visible text
    story = html.document_fromstring(x).text_content()
    # Capture every "[Key]: Value" pair in the text
    matches = dict(re.findall(pattern=r"\[(.*?)\]:\s?([A-Za-z0-9,\-()+/\n\r.%&> ]+)",
                              string=story))
    # Trim stray whitespace around keys and values
    return {key.strip(): item.strip() for key, item in matches.items()}
Let's test it and see if it works:
termsheet_to_dict('urn:newsml:reuters.com:20170323:nIFR9z7ZFL:1')['NOTES']
'EUR400m (from 300m+) 3yr LPN. RegS. Follows rshow. Exp nr/B+/BB.\r\nAlfa/ING/UBS(B&D). IPTs 2.75% area, guidance 2.625%/2.75% WPIR, set at 2.625% on\r\nbks closed >750m.'
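To see why the pattern works, here it is applied to a hand-made fragment in the same [Key]: Value layout (the sample text is invented for illustration, not a real termsheet):

```python
import re

# Invented fragment in the termsheet layout shown above
sample = "[Status]: PRICED [Asset Type]: Investment Grade\r\n[Size]: EUR 1.5bn"

# The lazy (.*?) captures the key up to the first closing bracket;
# the value's character class stops at the next opening bracket
pattern = r"\[(.*?)\]:\s?([A-Za-z0-9,\-()+/\n\r.%&> ]+)"
pairs = {k.strip(): v.strip() for k, v in re.findall(pattern, sample)}
print(pairs)
```

Each `[Key]: Value` pair becomes one dictionary entry, with the surrounding whitespace and line breaks stripped off.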
Let's extract all data for all headlines:
result = []
for storyId in headlines['storyId']:
    try:
        d = termsheet_to_dict(storyId)
        if d:
            result.append(d)
    except Exception:
        # Skip stories that do not follow the termsheet layout
        pass
df = pd.DataFrame(result)
df.head()
| | Asset Type | Bookrunners | CUSIP/ISIN | Call | Country | Coupon | Denoms | Fees | Format | Guarantor | ... | Reoffer Price | Sector | Settledate | Size | Spread | Stabilis | Status | Tenor/Mty | Total | Yield |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Investment Grade | OF (GOVERNMENT) Citi/CM... | SI0002103776 | NaN | SLOVENIA | 1.000 Fxd | 1k/1k | Undisclosed | Reg S only | NaN | ... | 99.654 | NaN | 11-Jan-18 | EUR 1.5bn | MS+13 | NaN | PRICED | 10yr 06-Mar-28 | NaN | 1.036 |
| 1 | Investment Grade | Citi/Halyk/JPM\r\nKAZAKHSTANA AO | XS1734574137 | NaN | KAZAKHSTAN | 9.500 Fxd | 50m+250k | Undisclosed | Reg S only | NaN | ... | 99.681 | NaN | 14-Dec-17 | KZT 100bn | NaN | NaN | PRICED | 3yr 14-Dec-20 | NaN | 9.625 |
| 2 | Investment Grade | StCh/DIB/ENBD/Warba\r\n(CEIC) LTD | XS1720817540 | NaN | UNITED ARAB EMIRATES | 5.125 Fxd | 200k/1 | Undisclosed | Reg S only | Emirates REIT (CEIC) Ltd | ... | 100 | NaN | 12-Dec-17 | USD 400m | MS+291 | NaN | PRICED | 5yr 12-Dec-22 | NaN | 5.125 |
| 3 | Investment Grade | DB | XS1731920291 , 40 day | NaN | LUXEMBOURG | 2.125 Fxd | 100k/1k | Undisclosed | Reg S only | NaN | ... | 100.323 | Financials-Real Estate | 06-Dec-17 | EUR 225m | MS+165\r\nfungw w/ XS1693959931 | FCA/ICMA | PRICED | 7yr 04-Oct-24 | 825M | 2.072 |
| 4 | High Yield | DAIWA/Miz/SMBC Nikko\r\nOF (GOVERNMENT) | NaN | NaN | TURKEY | 1.810 Fxd | NaN | Undisclosed | Reg S only | NaN | ... | NaN | NaN | 07-Dec-17 | JPY 60bn | PS+170 | NaN | PRICED | 3yr 20-Dec-20 | 60BN | NaN |

5 rows × 36 columns
Now that we have the dataframe in place, we can compute simple statistics on our data; for instance, how many of the reported issues were Investment Grade versus High Yield.
df['Asset Type'].value_counts()
High Yield 9
Investment Grade 6
Name: Asset Type, dtype: int64
What about a specific country?
df[df['Country']=='RUSSIA']['Asset Type'].value_counts()
High Yield 1
Name: Asset Type, dtype: int64
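The same idea extends to any pair of extracted fields. For example, a cross-tabulation of asset type by country can be built with pandas' crosstab; sketched here on a tiny invented frame, since the contents of your df depend on the headlines you pulled:

```python
import pandas as pd

# Invented stand-in for the df of parsed termsheets built above
df = pd.DataFrame({
    'Country': ['SLOVENIA', 'RUSSIA', 'RUSSIA', 'TURKEY'],
    'Asset Type': ['Investment Grade', 'High Yield',
                   'Investment Grade', 'High Yield'],
})

# Rows are countries, columns are asset types, cells are issue counts
table = pd.crosstab(df['Country'], df['Asset Type'])
print(table)
```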
Conclusion
You can experiment further by changing the original headline search query, for example, by including a RIC in your request.
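As a starting point, a query scoped to a single instrument might look like the sketch below; the RIC is only an illustration, substitute any instrument you follow:

```python
# Hypothetical example: 'R:' scopes a news query to a RIC
ric = 'GAZP.MM'  # illustrative RIC, swap in your own
q = 'R:{} AND Product:IFREM AND Topic:ISU AND Topic:EUB'.format(ric)
print(q)
# q can then be passed to ek.get_news_headlines(query=q, ...) as before
```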