LSEG Data Library for Python: News Pagination

Jirapongse Phuriphanvichai
Developer Advocate

The LSEG Data Library for Python provides a set of easy-to-use interfaces that give developers uniform access to the breadth and depth of financial data and services available on the LSEG Data Platform. The services include real-time pricing, historical data, fundamental data, pricing analytics, news, filings, ESG, and much more. This article introduces the news pagination feature, which allows users to retrieve news headlines page by page, and demonstrates how to use it in the LSEG Data Library for Python.

News Headlines Service

News services on the desktop and data platform sessions can provide news headlines, news stories, and top news. At the time of writing, the news headlines service returns at most 100 headlines per request, and the depth of history is 15 months. The service also supports pagination, which retrieves the next snapshot of headlines matching a news query string.

Pagination

The data platform exposes the news headlines endpoint (/data/news/v1/headlines), which retrieves a snapshot of headlines for a query. The endpoint returns at most 100 headlines per request, and the depth of history is 15 months.

The endpoint supports the following options:

Option | Type | Description
query | String | The user search query
limit | Number | Number of items to get (default: 10; range: [1, 100])
dateFrom | String | The beginning of the date range (e.g. 2025-01-01T10:00:00.000Z)
dateTo | String | The end of the date range (e.g. 2025-01-01T10:00:00.000Z)
sort | String | Headlines sort order (default: “newToOld”; values: “newToOld”, “oldToNew”)
relevance | String | Relevancy (values: “High”, “Medium”, “All”)
cursor | String | The pagination cursor
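
As a rough illustration of how these options map onto a request URL, the sketch below assembles a query string with Python's standard library. The helper name build_headlines_url is hypothetical and not part of any LSEG library; the sketch only builds the URL and does not cover authentication or sending the request.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the article; the helper below is a hypothetical name.
BASE_URL = "https://api.refinitiv.com/data/news/v1/headlines"


def build_headlines_url(query, limit=10, **options):
    """Return a headlines request URL for the given query and options."""
    if not 1 <= limit <= 100:
        raise ValueError("limit must be in the range [1, 100]")
    params = {"query": query, "limit": limit}
    # Skip options that were not supplied (e.g. dateFrom, dateTo, sort).
    params.update({k: v for k, v in options.items() if v is not None})
    # quote_via=quote encodes spaces as %20; ":" is left unescaped for readability.
    return BASE_URL + "?" + urlencode(params, quote_via=quote, safe=":")


url = build_headlines_url(
    "R:LSEG.L",
    limit=50,
    dateFrom="2025-01-01T10:00:00.000Z",
    sort="oldToNew",
)
print(url)
```

Options are appended in the order they are passed, so the same helper can be reused for any combination of the options listed in the table.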
 

For example:

The following request retrieves up to 100 English-language headlines for stories related to Microsoft, identified by its RIC (MSFT.O).

    	
    https://api.refinitiv.com/data/news/v1/headlines?query=LEN%20and%20R:MSFT.O&limit=100

In this example, the first headline of the first batch is timestamped 2025-04-04T06:08:20.000Z and the last headline of the first batch is timestamped 2025-04-03T20:15:00.000Z.

The raw response contains headlines and pagination information.

    	
            

    {
      …
      "meta": {
        "count": 100,
        "pageLimit": 100,
        "next": "H4sIAAAAAAAAAGWQUU/DIBSF…",
        "prev": "H4sIAAAAAAAAAGWQQW8CIR…"
      }
    }

The value in the “meta.next” property can be used to retrieve the next 100 headlines. 

    	
    https://api.refinitiv.com/data/news/v1/headlines?cursor=H4sIAAAAAAAAAGWQUU/DIBSF…

The application can repeat this step, reading the cursor from the “meta.next” property of each response and using it to retrieve the next page of headlines.
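
The cursor-following loop described above can be sketched independently of any HTTP client. In the example below, iter_headline_pages and fake_fetch are hypothetical names, the "data" key and cursor values are made-up placeholders, and the loop assumes the service omits meta.next once no further page exists. The fetch function is injected so the loop can be exercised without network access.

```python
def iter_headline_pages(fetch_page, query, limit=100):
    """Yield response pages, following meta.next until it is absent."""
    params = {"query": query, "limit": limit}
    while True:
        page = fetch_page(params)
        yield page
        cursor = page.get("meta", {}).get("next")
        if not cursor:
            break
        # Later requests carry only the cursor, as in the URL above.
        params = {"cursor": cursor}


# Stand-in for the HTTP call: three fake pages keyed by the cursor requested.
FAKE_PAGES = {
    None: {"data": ["h1", "h2"], "meta": {"count": 2, "next": "c1"}},
    "c1": {"data": ["h3", "h4"], "meta": {"count": 2, "next": "c2"}},
    "c2": {"data": ["h5"], "meta": {"count": 1}},
}


def fake_fetch(params):
    """Return the fake page matching the requested cursor (None for page one)."""
    return FAKE_PAGES[params.get("cursor")]


pages = list(iter_headline_pages(fake_fetch, "LEN and R:MSFT.O"))
headlines = [h for page in pages for h in page["data"]]
print(len(pages), headlines)
```

In a real application, fake_fetch would be replaced by a function that sends the authenticated request and returns the parsed JSON response.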

Use the news pagination in the LSEG Data Library for Python

The LSEG Data Library for Python provides the ld.news.get_headlines method in the access layer to retrieve news headlines for a news query. It supports the following parameters and returns a pandas DataFrame.

Parameter | Type | Description
query | String | The user search query for news headlines.
count | Int | (Optional) The maximum number of headlines to return.
start | String or timedelta | (Optional) The beginning of the date range. String format is '%Y-%m-%dT%H:%M:%S', e.g. '2025-01-20T15:04:05'.
end | String or timedelta | (Optional) The end of the date range. String format is '%Y-%m-%dT%H:%M:%S', e.g. '2025-01-25T15:04:05'.
order_by | String or SortOrder | (Optional) Sort order for headline items.

The ld.news.get_headlines method does not accept a cursor parameter, and the cursor is not exposed in the DataFrame it returns. It therefore cannot be used for cursor pagination.

To use the cursor pagination, users can use the news.headlines.Definition class in the content layer. This class supports the following parameters.

Parameter | Type | Description
query | String | The user search query.
count | Int | (Optional) The maximum number of headlines to return. Min value is 0. Default: 10
date_from | String or timedelta | (Optional) The beginning of the date range. String format is '%Y-%m-%dT%H:%M:%S', e.g. '2025-01-20T15:04:05'.
date_to | String or timedelta | (Optional) The end of the date range. String format is '%Y-%m-%dT%H:%M:%S', e.g. '2025-01-25T15:04:05'.
sort_order | String or SortOrder | Sort order for the response. Default: SortOrder.new_to_old
extended_params | Dictionary | (Optional) Additional request parameters, such as the cursor.
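
The date strings in the table follow the '%Y-%m-%dT%H:%M:%S' format. As a small sketch using only the standard library, the snippet below produces such strings from datetime objects; the variable names are illustrative, and the values match the examples in the table.

```python
from datetime import datetime, timedelta

# Format string quoted in the parameter table.
DATE_FORMAT = "%Y-%m-%dT%H:%M:%S"

# Illustrative range: the five days ending at 2025-01-25 15:04:05.
end = datetime(2025, 1, 25, 15, 4, 5)
start = end - timedelta(days=5)

date_from = start.strftime(DATE_FORMAT)
date_to = end.strftime(DATE_FORMAT)
print(date_from, date_to)
```

The resulting strings can then be passed as the date_from and date_to arguments.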

The news.headlines.Definition class provides the get_data method, which returns a NewsHeadlinesResponse object. The raw data in this response contains the cursor properties.

    	
            

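    import lseg.data as ld
    from lseg.data.content import news

    # A session must be open before requesting data.
    ld.open_session()

    response = news.headlines.Definition(
        query="R:LSEG.L",
        date_from="2024-01-31T06:20:00.000",
        date_to="2024-01-31T07:03:50.012",
        count=100,
    ).get_data()

    # The pagination cursors are in the "meta" section of the raw response.
    print(response.data.raw[0]["meta"])
    next = response.data.raw[0]["meta"]["next"]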

The output is:

    	
            

    {'count': 4,
     'pageLimit': 100,
     'next': 'eyJxdWVyeSI6eyJpbn…',
     'prev': 'eyJxdWVyeSI6eyJ…'}

The next-page cursor is available in the meta.next property. To request the next page, pass it as the cursor entry of the extended_params parameter.

    	
            

    response = news.headlines.Definition(
        query="",
        extended_params={"cursor": next},
    ).get_data()
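
Reading the cursor can be isolated in a small helper. The sketch below uses a hypothetical function name, extract_next_cursor, and placeholder cursor values. It assumes, as the article's sample code does, that the raw response may be either a dictionary or a list of dictionaries.

```python
def extract_next_cursor(raw):
    """Return the meta.next cursor from a raw response, or None if absent."""
    # The raw response may be a single dict or a list whose first item holds "meta".
    if isinstance(raw, list):
        raw = raw[0] if raw else {}
    cursor = raw.get("meta", {}).get("next", "")
    return cursor or None


# Placeholder responses shaped like the "meta" output shown above.
raw_list = [{"meta": {"count": 4, "pageLimit": 100,
                      "next": "CURSOR_NEXT", "prev": "CURSOR_PREV"}}]
raw_dict = {"meta": {"count": 0, "pageLimit": 100}}

print(extract_next_cursor(raw_list))
print(extract_next_cursor(raw_dict))
```

Returning None when no cursor is present gives the caller a single condition to check before requesting another page.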

Sample Code

The following function demonstrates how to use the cursor to retrieve every page of news headlines. It accepts a news.headlines.Definition object and an empty list, to which it appends one DataFrame per page.

    	
            

    from lseg.data.content import news

    def get_all_headlines(news_definition, df_list):
        """Append every page of headlines for news_definition to df_list."""
        # A loop is used instead of recursion so that long result sets
        # cannot exceed Python's recursion limit.
        while news_definition is not None:
            _response = news_definition.get_data()
            if _response.data.df.shape[0] > 0:
                df_list.append(_response.data.df)

            # The raw response is either a dict or a list of dicts.
            _raw = _response.data.raw
            _meta = (_raw[0] if isinstance(_raw, list) else _raw)["meta"]
            _cursor = _meta.get("next", "")

            # Request the next page only if the service returned a cursor.
            news_definition = None
            if _cursor != "":
                news_definition = news.headlines.Definition(
                    query="",
                    extended_params={"cursor": _cursor},
                )

The function appends the DataFrame of each page to the provided list. The sample code that uses this function is:

    	
            

    import pandas as pd

    headlines_def = news.headlines.Definition(
        query="R:LSEG.L",
        date_from="2025-01-01T00:00:00.000",
        date_to="2025-01-31T23:59:59.999",
        count=100,
    )

    # Collect one DataFrame per page, then combine them.
    headlines_df_list = []
    get_all_headlines(headlines_def, headlines_df_list)
    pd.concat(headlines_df_list)

The code retrieves all news headlines related to LSEG.L from 1 January 2025 to 31 January 2025 and combines them into a single DataFrame.

The output looks like this:

Summary

The news headlines service on the desktop and data platform sessions can provide historical news for the past 15 months, with a limit of 100 headlines per request. Fortunately, the service supports cursor-based pagination, so users can easily get more headlines with the same query.

The LSEG Data Library for Python can use cursor-based pagination for news headlines with the news.headlines.Definition class in the content layer. Users can access the cursor data from the raw response returned by this class, and then set the cursor in the news.headlines.Definition class to retrieve more headlines.