Tutorial Source Code: Python (Examples referenced)

Last Update: June 2024
Interpreter: Python 3.11.x or greater
Additional Reading

The article, Building Search into your Application Workflow, provides an overview of the capabilities of the Search service including tips and tricks and many useful techniques.

The article also references a series of .NET examples describing multiple use cases; most of what they teach can also be applied to Python.

Search documentation: Search Reference Guide


Overview

Whether you access content from the desktop or directly from the cloud, the LSEG Data Libraries provide easy-to-use interfaces to retrieve content defined within standard web-based APIs. Built on top of the request-response paradigm, the library includes Search interfaces that allow developers to query LSEG's search engines, covering content such as quotes, instruments, organizations, and many other assets that can be programmatically integrated within your business workflow.

The following tutorial demonstrates a few use cases for searching content across various types of assets, ranging from simple queries to more complex filter expressions. The Search service allows developers to fine-tune their queries using specific properties, so that criteria can be as general or as granular as your requirements demand.

Basic queries

Search offers the ability to form Google-like query expressions: free-form text that is matched against documents within specific views of the search ecosystem. For example, the following query searches for IBM bonds:

# Assumes the LSEG Data Library for Python is installed and a session is already open
from lseg.data.content import search

# Search for IBM bonds - basic query
response = search.Definition("IBM Bonds").get_data()
response.data.df

and generates something like:

RIC           PermID        PI                  DocumentTitle                                      BusinessEntity
US137584115=  46638470066   0x00102c2565da06a0  International Business Machines Corp, Plain Va...  INSTRUMENTxFIXEDINCOMExGOVCORP
459200HX2=    44657599150   0x0004051b94d20854  International Business Machines Corp, Fixed Ma...  INSTRUMENTxFIXEDINCOMExGOVCORP
459200JQ5=    46642939296   0x00102c28b5f20b58  International Business Machines Corp, Plain Va...  INSTRUMENTxFIXEDINCOMExGOVCORP
459200JX0=    192841929423  0x00102c2452251a75  International Business Machines Corp, Plain Va...  INSTRUMENTxFIXEDINCOMExGOVCORP
459200HG9=    44640212240   0x0004050e34388b88  International Business Machines Corp, Plain Va...  INSTRUMENTxFIXEDINCOMExGOVCORP
US127166528=  46635711098   0x00102c05c4dd0222  International Business Machines Corp, Plain Va...  INSTRUMENTxFIXEDINCOMExGOVCORP
459200JC6=    46637038358   0x00102c80bf6b046c  International Business Machines Corp, Plain Va...  INSTRUMENTxFIXEDINCOMExGOVCORP
US194445601=  192826899363  0x00102c53ff1f1897  International Business Machines Corp, Plain Va...  INSTRUMENTxFIXEDINCOMExGOVCORP
US114316318=  44657843041   0x0004051bd47954e3  International Business Machines Corp, Plain Va...  INSTRUMENTxFIXEDINCOMExGOVCORP
459200HP9=    44649129642   0x00040513a4865c6c  International Business Machines Corp, Plain Va...  INSTRUMENTxFIXEDINCOMExGOVCORP

Refer to the Search Reference Guide listed in the Additional Reading section for more details on query expressions.

The above result displays a table of document hits based on the query expression.  By default, the search service will return 10 rows of data containing general fields representing the results of the most relevant bonds found for IBM.  When search delivers the response, it also carries the number of document hits. The document hits represent all matches that fulfilled the query expression.

Although only the default 10 rows are displayed out of the total document hits found, we can request more rows as follows:

# Search for IBM bonds - basic query specifying up to 100 rows returned
response = search.Definition("IBM Bonds", top=100).get_data()

Filter Expressions

While a basic query appears to have returned relevant content, digging a little deeper quickly reveals that the document hits include many bonds that are of no interest. Because we requested only 10 rows with the default fields, such issues may be difficult to detect. However, had we requested more rows or specific fields that provide more critical information, such as the maturity date, the problem would be more obvious.

To help guide us to make better decisions, Search provides a filter syntax and other criteria that allow developers to return more precise results. For example:

# Search for IBM bonds - filter specific criteria
response = search.Definition(
    view=search.SearchViews.GOV_CORP_INSTRUMENTS,
    select="ISIN,RIC,IssueDate,Currency,FaceIssuedTotal,CouponRate,MaturityDate",
    filter="IssuerTicker eq 'IBM' and IsActive eq true and AssetStatus ne 'MAT'"
).get_data()

response.data.df

will generate something like:

RIC           MaturityDate              ISIN          IssueDate                 FaceIssuedTotal  CouponRate  Currency
45920FTU2=    2021-06-28T00:00:00.000Z  US45920FTU20  2021-06-15T00:00:00.000Z  <NA>             0.0         USD
45920FTV0=    2021-06-29T00:00:00.000Z  US45920FTV03  2021-06-15T00:00:00.000Z  <NA>             0.0         USD
45920FTW8=    2021-06-30T00:00:00.000Z  US45920FTW85  2021-06-15T00:00:00.000Z  <NA>             0.0         USD
US137584115=  2021-09-07T00:00:00.000Z  XS1375841159  2016-03-07T00:00:00.000Z  1000000000.0     0.5         EUR
459200HX2=    2021-11-06T00:00:00.000Z  US459200HX26  2014-11-06T00:00:00.000Z  1100000000.0     0.75538     USD
459200JQ5=    2022-01-27T00:00:00.000Z  US459200JQ56  2017-01-27T00:00:00.000Z  1000000000.0     2.5         USD
459200JX0=    2022-05-13T00:00:00.000Z  US459200JX08  2019-05-15T00:00:00.000Z  2750000000.0     2.85        USD
459200HG9=    2022-08-01T00:00:00.000Z  US459200HG92  2012-07-30T00:00:00.000Z  1000000000.0     1.875       USD
US127166528=  2022-08-05T00:00:00.000Z  XS1271665280  2015-08-05T00:00:00.000Z  300000000.0      2.625       GBP
459200JC6=    2022-11-09T00:00:00.000Z  US459200JC60  2015-11-09T00:00:00.000Z  900000000.0      2.875       USD

In this specific request, a number of things have changed. First, we have specified a View to search against. The View represents a logical space for specific content, in our case, Government Corporate Instruments. Secondly, we have included a Filter expression that combines logical operators and boolean expressions against document properties. The above filter looks for bonds tied to a specific issuer ticker symbol that are considered active and have not matured. Finally, we decided to forgo the default properties in favor of specific properties relevant to our requirements. This is done using the Select parameter within the request.

The clear difference in the above result vs the first one is the number of document hits.  We can clearly see we have narrowed down the number of bonds returned.  In addition, the result set contains details relevant to the type of data we pulled out, such as the ISIN, Coupon Rate, and Maturity Date.
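When filters are assembled from user input or configuration, it can help to build the expression programmatically rather than by hand. The following is a minimal sketch, not part of the library: `format_value` and `build_filter` are hypothetical helper names, and the sketch only covers the operators and literal styles seen in the filter above (quoted strings, lowercase booleans, conditions joined with `and`). It does not handle strings that themselves contain single quotes.

```python
def format_value(value):
    """Render a Python value as a filter literal, following the example above."""
    # Check bool before numbers: in Python, bool is a subclass of int
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, (int, float)):
        return str(value)
    # Strings are wrapped in single quotes, as in "IssuerTicker eq 'IBM'"
    return f"'{value}'"


def build_filter(*conditions):
    """Join (property, operator, value) triples into one 'and' expression."""
    return " and ".join(
        f"{prop} {op} {format_value(value)}" for prop, op, value in conditions
    )


filter_expression = build_filter(
    ("IssuerTicker", "eq", "IBM"),
    ("IsActive", "eq", True),
    ("AssetStatus", "ne", "MAT"),
)
print(filter_expression)
```

The resulting string can then be passed as the `filter` argument of the request shown earlier.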

Properties

In the above examples, we referenced the values we wanted to pull from Search as well as the values we used to filter and narrow down our search result. Referred to as document properties, these values are at the heart of your search expressions. They not only allow you to filter and select fields of interest, but are also used for other operations such as grouping, sorting, and categorizing results into navigation buckets. There are hundreds of Properties available for each View, providing the critical details needed to fulfill your searching requirements.

To be successful at search, you will need to understand how to access the available Properties, how to determine what values they contain, and what they mean. The Search interfaces provide a means to query a View for its metadata: the list of available Properties and some important attributes for each. For example:

import pandas as pd

# Request the metadata properties for the GovCorpInstruments View
response = search.metadata.Definition(
    view=search.SearchViews.GOV_CORP_INSTRUMENTS  # Required parameter
).get_data()

# For pandas display purposes only
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", 10)  # Just show 10 rows
pd.set_option("display.max_colwidth", 1)

response.data.df

generates something like:

 

                                                      Type     Searchable  Sortable  Navigable  Groupable  Exact  Symbol
AccrualDate                AccrualDate                Date     True        True      True       False      False  False
AccruedInterest            AccruedInterest            Double   True        True      True       False      False  False
ActiveEstimatesExist       ActiveEstimatesExist       Boolean  True        False     False      False      False  False
AdtLocalCurrencyValue      AdtLocalCurrencyValue      String   True        False     False      False      False  False
AdtLocalCurrencyValueName  AdtLocalCurrencyValueName  String   True        False     False      False      False  False
...                        ...                        ...      ...         ...       ...        ...        ...    ...
WorstStandardYield         WorstStandardYield         Double   True        True      True       False      False  False
WorstYearsToRedem          WorstYearsToRedem          Double   True        True      True       False      False  False
YieldCurveBenchmarkRIC     YieldCurveBenchmarkRIC     String   False       False     False      False      False  False
YieldTypeDescription       YieldTypeDescription       String   True        False     False      False      False  False
ZCodeValue                 ZCodeValue                 String   True        True      False      False      False  True

 

The attributes of each property determine how it can be used in the different parameters defined within Search. For example, some properties are sortable, navigable, searchable, etc. Refer to the Additional Reading references at the top of this tutorial for an explanation of these values and the details around their attributes. The additional reading also includes some debugging techniques and ways to interrogate the values of properties to help clarify their meaning and the content available.
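To illustrate how these attributes might be used, the sketch below picks out the properties that carry a given flag. It works on a handful of rows copied from the metadata output above, held in plain Python structures so that it runs without a live session; `properties_with` is a hypothetical helper name, and the assumption that a True flag makes a property a candidate for the matching request parameter should be checked against the Search Reference Guide.

```python
COLUMNS = ("Type", "Searchable", "Sortable", "Navigable", "Groupable", "Exact", "Symbol")

# A small subset of rows copied from the metadata output shown above
ROWS = {
    "AccrualDate": ("Date", True, True, True, False, False, False),
    "ActiveEstimatesExist": ("Boolean", True, False, False, False, False, False),
    "WorstStandardYield": ("Double", True, True, True, False, False, False),
    "YieldCurveBenchmarkRIC": ("String", False, False, False, False, False, False),
    "ZCodeValue": ("String", True, True, False, False, False, True),
}

# Map each property name to a dict of its attributes
metadata = {name: dict(zip(COLUMNS, values)) for name, values in ROWS.items()}


def properties_with(metadata, attribute):
    """Return the sorted names of properties whose given flag is True."""
    return sorted(name for name, attrs in metadata.items() if attrs[attribute])


navigable = properties_with(metadata, "Navigable")
sortable = properties_with(metadata, "Sortable")
print("Navigable:", navigable)
print("Sortable:", sortable)
```

The same kind of selection can be applied to the full metadata DataFrame using a boolean mask on the relevant column.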

Navigators

Navigators provide the ability to summarize the distribution of your results. They are particularly useful when you are interested in gathering the domain of values for a specific property, or simply want to understand how to build your filters based on the specific values contained within Properties. Navigators can be used against a specific View, in conjunction with a query, a filter, or both. Navigators can be simple or very powerful, and they provide a very useful way to capture results in logical buckets.

One way to demonstrate a navigator is to list all the industry sectors available within LSEG's search ecosystem:

# Navigators - basic usage
response = search.Definition(
    top=0,  # Return no document hits, only the navigator summary
    navigators="RCSTRBC2012Leaf"
).get_data()

response.data.raw['Navigators']['RCSTRBC2012Leaf']

will generate something like:

{'Buckets': [{'Label': 'Banks (NEC)', 'Count': 3775631},
{'Label': 'Corporate Financial Services (NEC)', 'Count': 2886931},
{'Label': 'Corporate Banks', 'Count': 1521533},
{'Label': 'Retail & Mortgage Banks', 'Count': 1298611},
{'Label': 'Public Finance Activities', 'Count': 788085},
{'Label': 'Investment Banking & Brokerage Services (NEC)', 'Count': 684443},
{'Label': 'Investment Management & Fund Operators (NEC)', 'Count': 379391},
{'Label': 'Consumer Lending (NEC)', 'Count': 366829},
{'Label': 'Electric Utilities (NEC)', 'Count': 296854},
{'Label': 'Diversified Investment Services', 'Count': 292829},
{'Label': 'Investment Holding Companies (NEC)', 'Count': 281621},
{'Label': 'Professional Information Services (NEC)', 'Count': 276387},
{'Label': 'Wealth Management', 'Count': 272080},
{'Label': 'Construction & Engineering (NEC)', 'Count': 251681},
{'Label': 'Real Estate Rental, Development & Operations (NEC)',
'Count': 249816},
{'Label': 'Business Support Services (NEC)', 'Count': 226806},
{'Label': 'Building Contractors', 'Count': 201989},
{'Label': 'Government & Government Finance (NEC)', 'Count': 186280},
{'Label': 'Financial & Commodity Market Operators & Service Providers (NEC)',
'Count': 182456},
....
....

The result of this specific request contains an array of buckets representing a list of industry sectors and the number of documents available within each sector. Navigators are extremely useful for a number of different applications. They can be used not only to gather the values within a category, but also to organize your distribution based on ranges and calculations, and to present values as histograms. Refer to the Additional Reading section for details around the power of Navigators.
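Once the buckets are in hand, they are ordinary Python dictionaries and can be reshaped freely. The sketch below uses the first five buckets from the output above as sample data and computes each sector's share of that sample; `bucket_counts` and `bucket_shares` are hypothetical helper names, and the shares are relative to the five sample buckets only, not to the full result.

```python
# The first five buckets copied from the navigator output above
navigator = {
    "Buckets": [
        {"Label": "Banks (NEC)", "Count": 3775631},
        {"Label": "Corporate Financial Services (NEC)", "Count": 2886931},
        {"Label": "Corporate Banks", "Count": 1521533},
        {"Label": "Retail & Mortgage Banks", "Count": 1298611},
        {"Label": "Public Finance Activities", "Count": 788085},
    ]
}


def bucket_counts(navigator):
    """Flatten the list of buckets into a {label: count} dictionary."""
    return {b["Label"]: b["Count"] for b in navigator.get("Buckets", [])}


def bucket_shares(navigator):
    """Return each bucket's fraction of the total count across all buckets."""
    counts = bucket_counts(navigator)
    total = sum(counts.values())
    if not total:
        return {}
    return {label: count / total for label, count in counts.items()}


counts = bucket_counts(navigator)
shares = bucket_shares(navigator)

# Print the buckets from largest to smallest with their share of the sample
for label, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{label:<40} {count:>10,} {shares[label]:6.1%}")
```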

Data structure returned

All request-reply responses will carry standard response details such as the success of the response, HTTP status details, which include reasons why the request may fail, and a data section containing details specific to the service.  The structure and format of the data returned from a Search contain a table representing the document hits.   In addition, if the request includes a specification to summarize the results based on a Navigator, these details will also be carried within the data response body.

The following code segment determines if the request was successful, using a simple boolean property available within the response.  Upon success, we begin to extract and display details populated within the data.  Upon failure, the details of why the underlying HTTP request failed are displayed. 

In this specific example, we are checking if the response contains a DataFrame as well as the Raw data.

# Check if request was successful
if response.is_success:
    # Should have raw data when is_success is True
    if response.data.raw:
        display(response.data.raw)
    else:
        print("Response does not contain raw data")

    # Check whether the response contains a DataFrame too
    if not response.data.df.empty:
        display(response.data.df)
    else:
        print("\nResponse does not contain DataFrame")
else:
    # Something went wrong
    print(f"Not Successful : {response.http_status}")
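Response-handling logic like this can be exercised without a live session by passing in a stand-in object. The sketch below is one way to do that, under stated assumptions: `summarize` is a hypothetical helper, the stand-ins are built with `types.SimpleNamespace`, and they mimic only the attributes used in this tutorial (`is_success`, `http_status`, `data.raw`). The status strings are placeholders, not real service output.

```python
from types import SimpleNamespace


def summarize(response):
    """Return a short text summary of a Search-style response object."""
    if not response.is_success:
        return f"Not successful: {response.http_status}"
    raw = response.data.raw or {}
    # 'Navigators' is only expected when the request asked for a navigator
    navigators = sorted(raw.get("Navigators", {}))
    if navigators:
        return "Successful, navigators: " + ", ".join(navigators)
    return "Successful, no navigators"


# Hypothetical stand-ins for real responses; all values are made up
ok = SimpleNamespace(
    is_success=True,
    http_status="placeholder status",
    data=SimpleNamespace(raw={"Navigators": {"RCSTRBC2012Leaf": {"Buckets": []}}}),
)
plain = SimpleNamespace(
    is_success=True,
    http_status="placeholder status",
    data=SimpleNamespace(raw={}),
)
failed = SimpleNamespace(is_success=False, http_status="placeholder error", data=None)

for response in (ok, plain, failed):
    print(summarize(response))
```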

Executing the Notebook

In the notebook, we've prepared a number of code segments demonstrating different features of Search. In addition, the notebook demonstrates how to retrieve metadata to list the Properties for a specified View. To execute the examples within the source code package, refer to the pre-requisites section at the top of this tutorial.

Refer to the above screenshots to compare the output expected from the execution for some of the code segments.

This tutorial was meant as a simple guide to get you familiar with Search and what it can provide. Search is a very rich and powerful service that requires more attention to get the most out of it. We suggest you refer to the Additional Reading section at the top of this tutorial, and try out some of the additional Search examples available within the tutorial package. In addition, we encourage the use of auto-complete within your Jupyter Notebook or Python editor to discover other useful capabilities.