Overview
Machine Readable News (MRN) is an advanced service for automating the consumption and systematic analysis of news. It delivers deep historical news archives, ultra-low latency structured news and news analytics directly to your applications. This enables algorithms to exploit the power of news to seize opportunities, capitalize on market inefficiencies, and manage event risk.
MRN Real-Time news and News Analytics data are available to consumers via the LSEG Real-Time Platform and SFTP connections. The Real-Time news connection can be accessed programmatically only with the Real-Time APIs (covered in the Introduction to Machine Readable News (MRN) with Enterprise Message API (EMA) and Introduction to Machine Readable News with WebSocket API articles), whereas the SFTP connection can be accessed with any SFTP application such as FileZilla, WinSCP, or even the native Windows SFTP command.
However, some developers might need to access the SFTP remote server and get the MRN files programmatically. This article is a step-by-step guide to accessing and downloading a file from the MRN remote SFTP site using the Python programming language and the Paramiko library in a JupyterLab notebook.
Prerequisite
Before going further, here are the prerequisites, dependencies, and libraries that the project needs.
Python
The example in this article is written in Python. You need a Python installation (at least version 3.10.x) or the Anaconda/Miniconda Python distribution.
I am using the Miniconda Python distribution as an example.
JupyterLab Application
JupyterLab is an interactive, browser-based REPL application that keeps source code, output, and documentation in the same document file. Developers can read the document, run each cell, and see the result instantly.
The example code uses the JupyterLab application as the Jupyter Notebook server application.
Access to the MRN SFTP
This project uses MRN Machine-ID access credentials to connect to the MRN SFTP remote server.
Please contact your LSEG representative to help you with the MRN access.
That covers the prerequisites of this project.
Code Walkthrough
The SFTP connection library chosen for this demonstration is Paramiko (PyPI). I chose it over pysftp because Paramiko is actively maintained, whereas pysftp is outdated. However, the concept of connecting to the SFTP remote server and getting a file can be applied to any programming language and library.
Preparing a connection
The first step is importing the required library and module.
import paramiko
Next, create variables to store the MRN credentials and the MRN connection point.
username = 'Machine-ID' #Or input your Machine-ID Manually
password = 'Password'
hostname = 'archive.news.refinitiv.com'
localfilepath = r'.\download'  # raw string, so the backslash is not read as an escape character
I am putting the downloaded file in the <project location>\download folder, so the localfilepath variable is set to the '.\download' folder.
That completes the initial variable setup.
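Hard-coded credentials are easy to leak when a notebook is shared. As a minimal sketch, the same variables could be read from environment variables instead. The variable names MRN_USERNAME and MRN_PASSWORD and the load_settings helper are hypothetical, chosen here for illustration only.

```python
import os
from pathlib import Path


def load_settings(env=os.environ):
    """Build the connection settings from environment variables.

    MRN_USERNAME and MRN_PASSWORD are hypothetical variable names.
    """
    required = ("MRN_USERNAME", "MRN_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError(f"Missing environment variables: {', '.join(missing)}")
    return {
        "username": env["MRN_USERNAME"],
        "password": env["MRN_PASSWORD"],
        "hostname": "archive.news.refinitiv.com",
        # pathlib joins paths with the right separator on any OS
        "localfilepath": Path(".") / "download",
    }
```

Calling load_settings() in the notebook returns a dictionary whose values can replace the literal strings above.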
Establish the SSH Client
The high-level client API starts with the creation of an SSHClient object. The SSHClient object encapsulates all supported connection types (such as SSH and SFTP). It is the starting point for an application.
ssh = paramiko.SSHClient()
# Automatically add host keys (not secure for production)
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
Note the second statement above. The set_missing_host_key_policy(paramiko.AutoAddPolicy()) call automatically adds the hostname and new host key to the local HostKeys object. It is for demonstration purposes only. In production, you should load known host keys with Paramiko's SSHClient.load_system_host_keys() method instead.
Connect to the Remote Server and Establish the SFTP client connection
Next, we use the newly created ssh client object to connect to the MRN SFTP remote server and open the SFTP connection.
sftp = None
try:
    ssh.connect(hostname=hostname, username=username, password=password)
    sftp = ssh.open_sftp()
except Exception as e:
    print(f'Fail on connection :{e}')
If your environment connects to the network through a proxy, you may need to set the proxy details in the Python code as follows (the socks module comes from the PySocks package).
import socks
sock = socks.socksocket()
sock.set_proxy(
    proxy_type=socks.SOCKS5,
    addr="Your_Proxy_URL",
    port=Your_proxy_port_number,  # replace with your proxy port as an integer
    username="Your Proxy Username",
    password="Your Proxy Password"
)
sftp = None
try:
    # Paramiko expects an already-connected socket; 22 is the default SSH port
    sock.connect((hostname, 22))
    ssh.connect(hostname=hostname, username=username, password=password, sock=sock)
    sftp = ssh.open_sftp()
except Exception as e:
    print(f'Fail on connection :{e}')
Once the connection succeeds, the library returns an SFTPClient object, which the application can use to interact with the SFTP remote server.
To test if our SFTPClient object is connected to the MRN remote server, you can list the current directory with the SFTPClient interface as follows:
print(sftp.listdir())
The output is a list of the file and folder names in the current remote directory.
That covers the SFTP client connection part.
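The listdir() check above returns names only, so files and folders look the same. As a rough sketch, the listdir_attr() method can be used to tell them apart. The split_entries helper is a hypothetical name, and the sftp argument is assumed to behave like Paramiko's SFTPClient.

```python
import stat


def split_entries(sftp, path="."):
    """Return (directories, files) as sorted name lists for a remote path."""
    directories, files = [], []
    for entry in sftp.listdir_attr(path):
        # st_mode carries the file type bits, as in os.stat()
        if entry.st_mode is not None and stat.S_ISDIR(entry.st_mode):
            directories.append(entry.filename)
        else:
            files.append(entry.filename)
    return sorted(directories), sorted(files)
```

In the notebook, split_entries(sftp) could be called right after the connection is opened to see which entries can be navigated into.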
Change Directory and Get a File
Now our Python application is connected to the MRN SFTP remote server. You can use the SFTPClient object to navigate to the desired directory on the remote server and get the MRN files.
Note: You can use any SFTP client application to navigate through the MRN SFTP remote server to get the file and directory path that you need.
I am demonstrating with the /mpsych/MI4/CMPNY_REF/BASIC folder and getting the first file from it. To change to this directory, use the SFTPClient.chdir() method.
try:
    sftp.chdir('/mpsych/MI4/CMPNY_REF/BASIC')
    # List the files in the /mpsych/MI4/CMPNY_REF/BASIC folder
    print(f'Files in the /mpsych/MI4/CMPNY_REF/BASIC folder: {sftp.listdir()}')
except Exception as e:
    print(f'Fail on changing directory :{e}')
Then download the file with the SFTPClient.get() method.
fileName = sftp.listdir()[0]
try:
    # The doubled backslash is a literal path separator inside the f-string
    sftp.get(remotepath=fileName, localpath=f'{localfilepath}\\{fileName}')
except Exception as e:
    print(f'Fail on get a file :{e}')
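SFTPClient.get() writes to the local path as given, so the download fails if the local folder does not exist. A small sketch that creates the folder first and builds the path with pathlib could look like this. The download_file helper is a hypothetical name, and sftp is assumed to behave like Paramiko's SFTPClient.

```python
from pathlib import Path


def download_file(sftp, remote_name, local_dir):
    """Download one file from the current remote directory into local_dir."""
    target_dir = Path(local_dir)
    # Create the download folder if it does not exist yet
    target_dir.mkdir(parents=True, exist_ok=True)
    local_path = target_dir / remote_name
    sftp.get(remotepath=remote_name, localpath=str(local_path))
    return local_path
```

For example, download_file(sftp, fileName, localfilepath) would return the full local path of the downloaded file.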
Next, read the file's content from the <project location>\download folder to confirm that the file was downloaded.
try:
    with open(f'{localfilepath}\\{fileName}') as finput:
        data = finput.read(200)
        print(data)
except Exception as e:
    print(f'The error is:{e}')
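The read above assumes a plain text file. The format of the downloaded file is not stated here, so the following is only a sketch for the case where a file turns out to be gzip-compressed: it checks the two-byte gzip signature before choosing how to open the file. The preview_file helper is a hypothetical name.

```python
import gzip


def preview_file(path, size=200):
    """Return the first `size` characters of a plain or gzip-compressed text file."""
    with open(path, "rb") as raw:
        is_gzip = raw.read(2) == b"\x1f\x8b"  # gzip magic number
    opener = gzip.open if is_gzip else open
    # errors="replace" keeps the preview readable if some bytes are not UTF-8
    with opener(path, "rt", encoding="utf-8", errors="replace") as handle:
        return handle.read(size)
```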
Closing A Connection
The last step is to close our connection. First, the application needs to close the SFTP session.
sftp.close()
Lastly, close the SSH client session.
ssh.close()
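If a cell raises an exception before these close() calls run, the sessions stay open. One way to guard against that, sketched here with the standard library's contextlib.closing, is to wrap the work in with blocks. The run_with_sftp helper is a hypothetical name, and ssh is assumed to be an already connected client.

```python
from contextlib import closing


def run_with_sftp(ssh, work):
    """Open an SFTP session on a connected client, run work(sftp), then close both."""
    with closing(ssh):
        with closing(ssh.open_sftp()) as sftp:
            return work(sftp)
```

For example, run_with_sftp(ssh, lambda sftp: sftp.listdir()) lists the remote directory, then closes the SFTP session and the SSH client in that order, even if the listing fails.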
That is all I have to say about how to connect to the MRN SFTP remote server and download a file.
Next Step
This source code example shows just the basic steps for getting an MRN file from the MRN SFTP remote server. Developers can use any SFTP library and any programming language to perform the same task with the same concept. You can apply this concept to build an application that runs on a schedule to get MRN files, or create a GUI application for ease of use.
If you need the real-time MRN content, you can instead use the Real-Time APIs to consume the real-time MRN streaming data from the LSEG Real-Time Platform.
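As a starting point for the scheduled-download idea mentioned above, a job could fetch only the files that are not yet present locally. This is a minimal sketch, assuming the current remote directory contains only files (no sub-folders) and that sftp behaves like Paramiko's SFTPClient. The fetch_new_files helper is a hypothetical name.

```python
from pathlib import Path


def fetch_new_files(sftp, local_dir):
    """Download files from the current remote directory that are not yet in local_dir."""
    target_dir = Path(local_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    downloaded = []
    for name in sorted(sftp.listdir()):
        local_path = target_dir / name
        if local_path.exists():
            continue  # already fetched on an earlier run
        sftp.get(remotepath=name, localpath=str(local_path))
        downloaded.append(name)
    return downloaded
```

A scheduler of your choice could then call this function at a fixed interval after changing to the desired remote directory.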
References
For further details, please check out the following resources:
- LSEG Developer Community website.
- Machine Readable News website.
- Paramiko official website.
- Paramiko API document.
- Paramiko SFTP API document.
- Paramiko Client API document.
- Paramiko- How to transfer files with Remote System (SFTP Servers) using Python blog post.
- How-to: Python Paramiko blog post.
- Paramiko SFTP: A Guide with Examples website.