Tutorial Source Code |
Examples referenced: |
Last Update | June 2024 |
Interpreter | Python 3.11.x or greater |
Additional Reading | The article, Building Search into your Application Workflow, provides an overview of the capabilities of the Search service, including tips, tricks, and many useful techniques. The article also references a series of .NET examples covering multiple use cases; most of what they teach also applies to Python. Search documentation: Search Reference Guide |
Whether accessing from the desktop or directly to the cloud, the LSEG Data Libraries provide ease-of-use interfaces to retrieve content defined within standard web-based APIs. Built on top of the request-response paradigm, the library includes Search interfaces that allow developers to query LSEG's search engines covering content such as quotes, instruments, organizations, and many other assets that can be programmatically integrated within your business workflow.
The following tutorial demonstrates a few use cases when searching for content across various types of assets ranging from simple queries to more complex filter expressions. The search service will allow developers to fine-tune their queries by utilizing specific properties allowing criteria to be generalized or extremely granular, depending on your requirements.
Search lets you form Google-like query expressions: free-form text that is matched against documents within specific views of the search ecosystem. For example, the following query searches for IBM bonds:
# Search for IBM bonds - basic query
# Assumes an open session and: from lseg.data.content import search
response = search.Definition("IBM Bonds").get_data()
response.data.df
and generates something like:
RIC | PermID | PI | DocumentTitle | BusinessEntity |
---|---|---|---|---|
US137584115= | 46638470066 | 0x00102c2565da06a0 | International Business Machines Corp, Plain Va... | INSTRUMENTxFIXEDINCOMExGOVCORP |
459200HX2= | 44657599150 | 0x0004051b94d20854 | International Business Machines Corp, Fixed Ma... | INSTRUMENTxFIXEDINCOMExGOVCORP |
459200JQ5= | 46642939296 | 0x00102c28b5f20b58 | International Business Machines Corp, Plain Va... | INSTRUMENTxFIXEDINCOMExGOVCORP |
459200JX0= | 192841929423 | 0x00102c2452251a75 | International Business Machines Corp, Plain Va... | INSTRUMENTxFIXEDINCOMExGOVCORP |
459200HG9= | 44640212240 | 0x0004050e34388b88 | International Business Machines Corp, Plain Va... | INSTRUMENTxFIXEDINCOMExGOVCORP |
US127166528= | 46635711098 | 0x00102c05c4dd0222 | International Business Machines Corp, Plain Va... | INSTRUMENTxFIXEDINCOMExGOVCORP |
459200JC6= | 46637038358 | 0x00102c80bf6b046c | International Business Machines Corp, Plain Va... | INSTRUMENTxFIXEDINCOMExGOVCORP |
US194445601= | 192826899363 | 0x00102c53ff1f1897 | International Business Machines Corp, Plain Va... | INSTRUMENTxFIXEDINCOMExGOVCORP |
US114316318= | 44657843041 | 0x0004051bd47954e3 | International Business Machines Corp, Plain Va... | INSTRUMENTxFIXEDINCOMExGOVCORP |
459200HP9= | 44649129642 | 0x00040513a4865c6c | International Business Machines Corp, Plain Va... | INSTRUMENTxFIXEDINCOMExGOVCORP |
Refer to the Search Reference Guide within the Additional Readings section for more details on query expressions.
The above result displays a table of document hits based on the query expression. By default, the search service will return 10 rows of data containing general fields representing the results of the most relevant bonds found for IBM. When search delivers the response, it also carries the number of document hits. The document hits represent all matches that fulfilled the query expression.
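The difference between the rows returned and the total document hits can be read from the raw response. The sketch below assumes the raw payload is a dictionary carrying a `Total` count and a `Hits` list; the `summarize_hits` helper and the sample payload are hypothetical and only stand in for `response.data.raw`.

```python
def summarize_hits(raw):
    """Return (rows_returned, total_matches) from a raw Search payload.

    Assumes the payload has a 'Hits' list and a 'Total' count; missing keys
    fall back to an empty list and None.
    """
    hits = raw.get("Hits", [])
    return len(hits), raw.get("Total")


# Hypothetical payload standing in for response.data.raw
sample_raw = {
    "Total": 2500,  # made-up figure, for illustration only
    "Hits": [{"RIC": "459200HX2="}, {"RIC": "459200JQ5="}],
}

returned, total = summarize_hits(sample_raw)
print(f"Showing {returned} of {total} matching documents")
```

Comparing the two numbers is a quick way to judge whether a query is too broad before asking for more rows.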
While we can only see the default of 10 rows displayed for the total document hits found, we do have the ability to specify more rows using the following request:
# Search for IBM bonds - basic query specifying up to 100 rows returned
response = search.Definition("IBM Bonds", top=100).get_data()
While the basic query appears to return relevant content, digging a little deeper quickly shows that the document hits include many bonds that are of no interest. Because we returned only 10 rows with the default fields, such issues can be difficult to detect. If we instead requested more rows, or specific fields that provide more critical information such as the maturity date, the problem would be more obvious.
To help guide us to make better decisions, Search provides a filter syntax and other criteria that allow developers to return more precise results. For example:
# Search for IBM bonds - filter specific criteria
response = search.Definition(
    view=search.SearchViews.GOV_CORP_INSTRUMENTS,
    select="ISIN,RIC,IssueDate,Currency,FaceIssuedTotal,CouponRate,MaturityDate",
    filter="IssuerTicker eq 'IBM' and IsActive eq true and AssetStatus ne 'MAT'"
).get_data()
response.data.df
will generate something like:
RIC | MaturityDate | ISIN | IssueDate | FaceIssuedTotal | CouponRate | Currency |
---|---|---|---|---|---|---|
45920FTU2= | 2021-06-28T00:00:00.000Z | US45920FTU20 | 2021-06-15T00:00:00.000Z | <NA> | 0.0 | USD |
45920FTV0= | 2021-06-29T00:00:00.000Z | US45920FTV03 | 2021-06-15T00:00:00.000Z | <NA> | 0.0 | USD |
45920FTW8= | 2021-06-30T00:00:00.000Z | US45920FTW85 | 2021-06-15T00:00:00.000Z | <NA> | 0.0 | USD |
US137584115= | 2021-09-07T00:00:00.000Z | XS1375841159 | 2016-03-07T00:00:00.000Z | 1000000000.0 | 0.5 | EUR |
459200HX2= | 2021-11-06T00:00:00.000Z | US459200HX26 | 2014-11-06T00:00:00.000Z | 1100000000.0 | 0.75538 | USD |
459200JQ5= | 2022-01-27T00:00:00.000Z | US459200JQ56 | 2017-01-27T00:00:00.000Z | 1000000000.0 | 2.5 | USD |
459200JX0= | 2022-05-13T00:00:00.000Z | US459200JX08 | 2019-05-15T00:00:00.000Z | 2750000000.0 | 2.85 | USD |
459200HG9= | 2022-08-01T00:00:00.000Z | US459200HG92 | 2012-07-30T00:00:00.000Z | 1000000000.0 | 1.875 | USD |
US127166528= | 2022-08-05T00:00:00.000Z | XS1271665280 | 2015-08-05T00:00:00.000Z | 300000000.0 | 2.625 | GBP |
459200JC6= | 2022-11-09T00:00:00.000Z | US459200JC60 | 2015-11-09T00:00:00.000Z | 900000000.0 | 2.875 | USD |
In this specific request, a number of things have changed. First, we have specified a View to search against. The View represents a logical space for specific content, in our case, Government Corporate Instruments. Second, we have included a Filter expression that combines logical operators and boolean expressions against document properties. The filter above selects bonds for a specific issuer ticker that are flagged as active and have not matured. Finally, we decided to forgo the default properties in favor of specific properties relevant to our requirements. This is done using the Select parameter within the request.
The clear difference in the above result vs the first one is the number of document hits. We can clearly see we have narrowed down the number of bonds returned. In addition, the result set contains details relevant to the type of data we pulled out, such as the ISIN, Coupon Rate, and Maturity Date.
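Filter strings like the one in the request above can also be assembled programmatically, which helps when the criteria come from user input or configuration. The following is a minimal sketch: `build_filter` and `format_value` are hypothetical helpers, not part of the library, and the quoting rules (single-quoted strings, lowercase booleans) are inferred from the example filter rather than from a formal grammar.

```python
def format_value(value):
    """Render a Python value as a filter literal (quoting rules are assumed)."""
    # bool must be tested before int/float because bool is a subclass of int
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Strings are wrapped in single quotes, as in the request above
    return f"'{value}'"


def build_filter(clauses):
    """Join (property, operator, value) clauses with 'and'."""
    return " and ".join(
        f"{prop} {op} {format_value(value)}" for prop, op, value in clauses
    )


filter_expr = build_filter([
    ("IssuerTicker", "eq", "IBM"),
    ("IsActive", "eq", True),
    ("AssetStatus", "ne", "MAT"),
])
print(filter_expr)
# IssuerTicker eq 'IBM' and IsActive eq true and AssetStatus ne 'MAT'
```

The resulting string can be passed as the `filter` argument of `search.Definition`. The sketch does not escape quotes inside string values, so it is only suitable for simple literals.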
In the above examples, we reference the values we want to pull from Search as well as the values we use in filters to narrow down our search results. Referred to as document properties, these values are at the heart of your search expressions. They allow you not only to filter and select fields of interest, but also to group, sort, and categorize results into navigation buckets. There are hundreds of Properties available for each View, providing the critical details needed to fulfill your searching requirements.
In order for you to be successful at search, you will need to understand how to access the available Properties, how to determine what values they contain, and what they mean. Referred to as metadata, the Search interfaces provide a means to query Views to pull down the list of available Properties and some important attributes for each. For example:
# Request the metadata properties for the GovCorpInstruments View
import pandas as pd

response = search.metadata.Definition(
    view=search.SearchViews.GOV_CORP_INSTRUMENTS  # Required parameter
).get_data()
# For pandas display purposes only
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", 10)  # Just show 10 rows
pd.set_option("display.max_colwidth", 1)
response.data.df
generates something like:
| | | Type | Searchable | Sortable | Navigable | Groupable | Exact | Symbol |
---|---|---|---|---|---|---|---|---|
AccrualDate | AccrualDate | Date | True | True | True | False | False | False |
AccruedInterest | AccruedInterest | Double | True | True | True | False | False | False |
ActiveEstimatesExist | ActiveEstimatesExist | Boolean | True | False | False | False | False | False |
AdtLocalCurrencyValue | AdtLocalCurrencyValue | String | True | False | False | False | False | False |
AdtLocalCurrencyValueName | AdtLocalCurrencyValueName | String | True | False | False | False | False | False |
... | ... | ... | ... | ... | ... | ... | ... | ... |
WorstStandardYield | WorstStandardYield | Double | True | True | True | False | False | False |
WorstYearsToRedem | WorstYearsToRedem | Double | True | True | True | False | False | False |
YieldCurveBenchmarkRIC | YieldCurveBenchmarkRIC | String | False | False | False | False | False | False |
YieldTypeDescription | YieldTypeDescription | String | True | False | False | False | False | False |
ZCodeValue | ZCodeValue | String | True | True | False | False | False | True |
Each property determines how it can be used in the different parameters defined within Search. For example, some properties are sortable, navigable, searchable, etc. Refer to the Additional Readings references at the top of this tutorial for an explanation of these values and the details around their attributes. In addition, the additional readings will include some debugging techniques and ways to interrogate values of properties to help guide the meaning and content available.
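To illustrate how these attributes can guide a request, the sketch below picks out the properties that carry a given attribute. It works on plain dictionaries copied from a few rows of the sample metadata output; the `Property` key and the `properties_with` helper are hypothetical names chosen for this example, not library identifiers.

```python
# A few rows copied from the sample metadata output (subset of columns)
metadata_rows = [
    {"Property": "AccrualDate", "Type": "Date",
     "Sortable": True, "Navigable": True},
    {"Property": "ActiveEstimatesExist", "Type": "Boolean",
     "Sortable": False, "Navigable": False},
    {"Property": "WorstStandardYield", "Type": "Double",
     "Sortable": True, "Navigable": True},
    {"Property": "YieldTypeDescription", "Type": "String",
     "Sortable": False, "Navigable": False},
    {"Property": "ZCodeValue", "Type": "String",
     "Sortable": True, "Navigable": False},
]


def properties_with(rows, attribute):
    """Return the names of properties whose given attribute is True."""
    return [row["Property"] for row in rows if row.get(attribute)]


print(properties_with(metadata_rows, "Navigable"))
# ['AccrualDate', 'WorstStandardYield']
print(properties_with(metadata_rows, "Sortable"))
# ['AccrualDate', 'WorstStandardYield', 'ZCodeValue']
```

A check like this, run against the full metadata, shows up front which properties are candidates for a navigator or a sort order.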
Navigators provide the ability to summarize the distribution of your results. They are particularly useful when you are interested in gathering the domain of values for a specific property, or simply want to understand how to build your filters based on the values contained within Properties. Navigators are applied against a specific View and can be used in conjunction with a query, a filter, or both. They can be simple or very powerful, and in either case provide a useful way to capture results in logical buckets.
One way to demonstrate a navigator is to list all the industry sectors available within LSEG's search ecosystem:
# Navigators - basic usage
response = search.Definition(
    top=0,
    navigators="RCSTRBC2012Leaf"
).get_data()
response.data.raw['Navigators']['RCSTRBC2012Leaf']
will generate something like:
{'Buckets': [{'Label': 'Banks (NEC)', 'Count': 3775631},
{'Label': 'Corporate Financial Services (NEC)', 'Count': 2886931},
{'Label': 'Corporate Banks', 'Count': 1521533},
{'Label': 'Retail & Mortgage Banks', 'Count': 1298611},
{'Label': 'Public Finance Activities', 'Count': 788085},
{'Label': 'Investment Banking & Brokerage Services (NEC)', 'Count': 684443},
{'Label': 'Investment Management & Fund Operators (NEC)', 'Count': 379391},
{'Label': 'Consumer Lending (NEC)', 'Count': 366829},
{'Label': 'Electric Utilities (NEC)', 'Count': 296854},
{'Label': 'Diversified Investment Services', 'Count': 292829},
{'Label': 'Investment Holding Companies (NEC)', 'Count': 281621},
{'Label': 'Professional Information Services (NEC)', 'Count': 276387},
{'Label': 'Wealth Management', 'Count': 272080},
{'Label': 'Construction & Engineering (NEC)', 'Count': 251681},
{'Label': 'Real Estate Rental, Development & Operations (NEC)',
'Count': 249816},
{'Label': 'Business Support Services (NEC)', 'Count': 226806},
{'Label': 'Building Contractors', 'Count': 201989},
{'Label': 'Government & Government Finance (NEC)', 'Count': 186280},
{'Label': 'Financial & Commodity Market Operators & Service Providers (NEC)',
'Count': 182456},
....
....
The result of this specific request contains an array of buckets representing a list of industry sectors, along with the number of documents available within each sector. Navigators are useful for a number of different applications: they can gather the values within a category, organize a distribution into ranges, apply calculations, and present values as histograms. Refer to the additional readings section for details around the power of Navigators.
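Once a navigator result is available, the buckets can be processed like any other list of dictionaries. The following is a small sketch that ranks buckets by document count; the sample data is copied from the first few buckets of the output above, and `top_buckets` is a hypothetical helper written for this example.

```python
# First few buckets copied from the sample navigator output
sample_navigator = {
    "Buckets": [
        {"Label": "Banks (NEC)", "Count": 3775631},
        {"Label": "Corporate Financial Services (NEC)", "Count": 2886931},
        {"Label": "Corporate Banks", "Count": 1521533},
        {"Label": "Retail & Mortgage Banks", "Count": 1298611},
    ]
}


def top_buckets(navigator, n=3):
    """Return the n largest (label, count) pairs from a navigator result."""
    buckets = navigator.get("Buckets", [])
    ranked = sorted(buckets, key=lambda b: b["Count"], reverse=True)
    return [(b["Label"], b["Count"]) for b in ranked[:n]]


for label, count in top_buckets(sample_navigator, n=2):
    print(f"{label}: {count:,} documents")
```

In a live session, the same helper could be given `response.data.raw['Navigators']['RCSTRBC2012Leaf']` instead of the sample dictionary.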
All request-reply responses will carry standard response details such as the success of the response, HTTP status details, which include reasons why the request may fail, and a data section containing details specific to the service. The structure and format of the data returned from a Search contain a table representing the document hits. In addition, if the request includes a specification to summarize the results based on a Navigator, these details will also be carried within the data response body.
The following code segment determines if the request was successful, using a simple boolean property available within the response. Upon success, we begin to extract and display details populated within the data. Upon failure, the details of why the underlying HTTP request failed are displayed.
In this specific example, we are checking if the response contains a DataFrame as well as the Raw data.
# Check if request was successful
if response.is_success:
    # Should have raw data when is_success is True
    if response.data.raw:
        display(response.data.raw)
    else:
        print("Response does not contain raw data")
    # Check whether the response contains a DataFrame too
    if not response.data.df.empty:
        display(response.data.df)
    else:
        print("\nResponse does not contain DataFrame")
else:
    # Something went wrong
    print(f"Not Successful : {response.http_status}")
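The same checks can be wrapped in a small function so they are reusable and can be exercised without a live session. This is only a sketch: `describe_response` is a hypothetical helper, and the stub objects below merely mimic the `is_success`, `http_status`, and `data.raw` attributes used in the snippet above; the values they hold are made up.

```python
from types import SimpleNamespace


def describe_response(response):
    """Return a one-line status for a response-like object.

    Assumes the object exposes is_success, http_status and data.raw,
    mirroring the attributes used in the snippet above.
    """
    if not response.is_success:
        return f"Not successful: {response.http_status}"
    raw = response.data.raw
    if not raw:
        return "Successful, but no raw data"
    return f"Successful, raw keys: {sorted(raw)}"


# Stub objects standing in for real responses (illustrative values only)
ok = SimpleNamespace(
    is_success=True,
    http_status="200 OK",
    data=SimpleNamespace(raw={"Total": 1, "Hits": [{}]}),
)
failed = SimpleNamespace(is_success=False, http_status="400 Bad Request", data=None)

print(describe_response(ok))      # Successful, raw keys: ['Hits', 'Total']
print(describe_response(failed))  # Not successful: 400 Bad Request
```

Returning a string instead of printing keeps the check easy to log or assert on in automated tests.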
In the notebook, we've prepared a number of code segments demonstrating different features of Search. In addition, the notebook demonstrates how to retrieve metadata to list the Properties for a specified View. To execute the examples within the source code package, refer to the pre-requisites section at the top of this tutorial.
Refer to the sample output shown above to compare against the output expected from the execution of some of the code segments.
This tutorial was meant as a simple guide to get you familiar with Search and what it can provide. Search is a very rich and powerful service that requires more attention to get the most out of it. We would suggest you refer to the Additional Readings section at the top of this tutorial, as well as trying out some of the additional Search examples available within the tutorial package. In addition, we encourage the use of auto-complete within your Jupyter Notebook or Python editor to discover other useful capabilities.