Refinitiv Data Platform APIs
Introduction To Filings - Python
Introducing Filings API Service on Refinitiv Data Platform
A new Filings API service is available on Refinitiv Data Platform (RDP), providing access to Global and EDGAR filing data: over 40 million documents from 135,000 companies worldwide, spanning more than 50 years of history dating back to 1968. Automated document feeds and newswires deliver timely and comprehensive collections for the USA, Canada, Japan, Norway, Italy, Australia, Singapore, India, China and Korea.
The Filings service consists of search and retrieval of public corporate disclosures. In this article, we review how to search for specific filings documents and how to download them from the API.
Filings Search Using GraphQL
Filings documents can be searched through our GraphQL endpoint. GraphQL is a query language for APIs that lets a client request and receive exactly the data it needs. Capabilities used to search for filings documents include:
- Filtering
- Sorting
- Limit
- Pagination
- Keyword Search
To learn more about GraphQL, please visit https://graphql.org/.
Python Environment
For the purpose of this demonstration, we use JupyterLab with Python 3.8. We discuss the code that is available for download from https://github.com
Valid Credentials - Replace in Code or Read From File
Valid RDP credentials are required to interact with an RDP service:
- USERNAME
- PASSWORD
- CLIENTID
USERNAME = "VALIDUSER"
PASSWORD = "VALIDPASSWORD"
CLIENT_ID = "SELFGENERATEDCLIENTID"
def readCredsFromFile(filePathName):
    """Read valid credentials from a file, one value per line:
    RDP machine ID, long password, generated client ID."""
    global USERNAME, PASSWORD, CLIENT_ID
    with open(filePathName, "r") as credFile:
        USERNAME = credFile.readline().rstrip('\n')
        PASSWORD = credFile.readline().rstrip('\n')
        CLIENT_ID = credFile.readline().rstrip('\n')

readCredsFromFile("..\\creds\\credFileHuman.txt")
# Uncomment - to make sure that creds are either set in code or read in correctly
#print("USERNAME="+str(USERNAME))
#print("PASSWORD="+str(PASSWORD))
#print("CLIENT_ID="+str(CLIENT_ID))
We include two ways to supply the valid credentials.
- One is to replace the placeholders in code ("VALIDUSER", ...) with valid personal credential values. To enable this, comment out the call that reads the credentials from a file:
#readCredsFromFile("..\\creds\\credFileHuman.txt")
- The other way is to store a set of valid RDP credentials in file "credFileHuman.txt" under path "../creds" and have the code read the credentials from that file.
The file is expected to be in a simple format, one value per line:
VALIDUSER
VALIDPASSWORD
SELFGENERATEDCLIENTID
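As a sketch of a slightly more defensive reader (the read_creds helper below is hypothetical, not part of the article's code), validating that the file holds exactly the three expected values:

```python
from pathlib import Path

def read_creds(path):
    """Read USERNAME, PASSWORD and CLIENT_ID from a credentials file
    holding one value per line, skipping blank lines and stray whitespace."""
    lines = [ln.strip() for ln in Path(path).read_text().splitlines() if ln.strip()]
    if len(lines) != 3:
        raise ValueError("expected 3 lines (user, password, client id), found %d" % len(lines))
    return tuple(lines)  # (username, password, client_id)
```

Failing fast on a malformed file is preferable to silently authenticating with empty strings.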
Define Token Handling and Obtain a Valid Token
Having a valid token is a prerequisite to requesting any RDP content; the token is passed into the next steps. For additional information on authorization and tokens, refer to the RDP tutorial Authorization - All about tokens.
The implementation steps that come next may look familiar: with some variation, they come up repeatedly in any RDP service interaction.
import requests, json, time

# Endpoint components; these values follow the RDP authorization tutorials
RDP_BASE_URL = "https://api.refinitiv.com"
CATEGORY_URL = "/auth/oauth2"
RDP_AUTH_VERSION = "/v1"
ENDPOINT_URL = "/token"
SCOPE = "trapi"
CLIENT_SECRET = ""        # empty for password-grant public clients
TOKEN_FILE = "token.txt"  # local cache for the token between runs

TOKEN_ENDPOINT = RDP_BASE_URL + CATEGORY_URL + RDP_AUTH_VERSION + ENDPOINT_URL
def _requestNewToken(refreshToken):
    if refreshToken is None:
        tData = {
            "username": USERNAME,
            "password": PASSWORD,
            "grant_type": "password",
            "scope": SCOPE,
            "takeExclusiveSignOnControl": "true"
        }
    else:
        tData = {
            "refresh_token": refreshToken,
            "grant_type": "refresh_token",
        }
    # Make a REST call to get the latest access token
    response = requests.post(
        TOKEN_ENDPOINT,
        headers = {"Accept": "application/json"},
        data = tData,
        auth = (CLIENT_ID, CLIENT_SECRET)
    )
    if response.status_code != 200:
        raise Exception("Failed to get access token {0} - {1}".format(
            response.status_code, response.text))
    # Return the new token
    return json.loads(response.text)
def saveToken(tknObject):
    print("Saving the new token")
    # Append the expiry time to the token, with a 10-second safety margin
    tknObject["expiry_tm"] = time.time() + int(tknObject["expires_in"]) - 10
    # Store it in the file
    with open(TOKEN_FILE, "w+") as tf:
        json.dump(tknObject, tf, indent=4)
def getToken():
    try:
        print("Reading the token from: " + TOKEN_FILE)
        # Read the token from a file
        with open(TOKEN_FILE, "r+") as tf:
            tknObject = json.load(tf)
        # Is the access token still valid?
        if tknObject["expiry_tm"] > time.time():
            # Return the cached access token
            return tknObject["access_token"]
        print("Token expired, refreshing a new one...")
        # Get a new token from the refresh token
        tknObject = _requestNewToken(tknObject["refresh_token"])
    except Exception as exp:
        print("Caught exception: " + str(exp))
        print("Getting a new token using Password Grant...")
        tknObject = _requestNewToken(None)
    # Persist this token for future queries
    saveToken(tknObject)
    # Return the access token
    return tknObject["access_token"]

accessToken = getToken()
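The expiry bookkeeping above (a 10-second margin subtracted when the token is saved, and the cached token reused only while expiry_tm lies in the future) can be sketched in isolation; expiry_time and is_valid are illustrative helpers, not part of the article's code:

```python
import time

EXPIRY_MARGIN_SEC = 10  # refresh slightly early, as saveToken does

def expiry_time(expires_in, now=None):
    """Absolute expiry timestamp for a token, padded by a safety margin."""
    now = time.time() if now is None else now
    return now + int(expires_in) - EXPIRY_MARGIN_SEC

def is_valid(expiry_tm, now=None):
    """True while the cached token may still be used."""
    now = time.time() if now is None else now
    return expiry_tm > now

tm = expiry_time(600, now=1000.0)   # token valid 600 s, issued at t=1000
print(is_valid(tm, now=1500.0))     # True  (well inside the window)
print(is_valid(tm, now=1595.0))     # False (inside the 10 s margin)
```

The margin ensures a token is never presented in the last moments before the server rejects it.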
Define Filings Helper Function requestSearch
RDP_FILINGS_VERSION = '/v1'  # service version prefix used at the time of writing
FILINGS_ENDPOINT = RDP_BASE_URL + '/data-store' + RDP_FILINGS_VERSION + '/graphql'
def requestSearch(token, payloadSearch):
    print("requestSearch...")
    headers = {
        'Content-Type': "application/json",
        'Authorization': "Bearer " + token,
        'cache-control': "no-cache"
    }
    response = requests.post(FILINGS_ENDPOINT, json={'query': payloadSearch}, headers=headers)
    print("Response status code =" + str(response.status_code))
    if response.status_code == 401:  # error when the token has expired
        token = getToken()           # refresh the token and retry once
        headers["Authorization"] = "Bearer " + token
        response = requests.post(FILINGS_ENDPOINT, json={'query': payloadSearch}, headers=headers)
    print('Raw response=')
    print(response)
    if response.status_code == 200:
        return json.loads(response.text)
    return ''
Search Filings by File Type
The following example searches for all 10-Qs on February 12, 2021.
payloadIn = """
{
FinancialFiling(filter: {AND: [
{FilingDocument: {DocumentSummary: {FormType: {EQ: "10-Q"}}}},
{FilingDocument: {DocumentSummary: {FilingDate: {BETWN: {FROM: "2021-02-12T00:00:00Z", TO: "2021-02-12T23:59:59Z"}}}}}]},
sort: {FilingDocument: {DocumentSummary: {FilingDate: DESC}}},
limit: 25 ) {
_metadata {
totalCount
cursor
}
FilingOrganization {
Names {
Name {
OrganizationName (filter: {OrganizationNameTypeCode: {EQ: "LNG"}}){
OrganizationName
}
}
}
}
FilingDocument {
Identifiers {
OrganizationId
Dcn
}
DocId
FinancialFilingId
DocumentSummary {
DocumentTitle
FeedName
FormType
HighLevelCategory
MidLevelCategory
FilingDate
SecAccessionNumber
SizeInBytes
}
FilesMetaData {
FileName
MimeType
}
}
}
}
"""
jsonFullResp = requestSearch(accessToken, payloadIn)
print('Parsed json response=')
print(json.dumps(jsonFullResp, indent=2))
docId = jsonFullResp["data"]["FinancialFiling"][0]["FilingDocument"]["DocId"]
print('DocId is', str(docId))
cursor = jsonFullResp["data"]["FinancialFiling"][0]["_metadata"]["cursor"]
print('cursor is', str(cursor))
Once we have identified the required DocId (or DocIds) and cursor, this information is used by the next steps to request the required filings documents.
Pagination
If a limit is not specified in the query, the maximum number of records returned is 200. The API uses cursor-based pagination: each response carries a cursor, a pointer to a specific record in the dataset, and each cursor is unique to that record. The cursor of the last record returned is used to paginate.
To view the next 25 results in the previous example, set the cursor to the value from the last data point in the response; this retrieves records 26-50.
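The cursor loop generalizes to any page size. A minimal sketch, in which the fetch callable is a stand-in: against the Filings API it would wrap requestSearch, splicing the previous response's _metadata.cursor into the `cursor:` argument of the query:

```python
def paginate(fetch):
    """Collect all records from a cursor-paginated source.
    `fetch(cursor)` returns (records, next_cursor); next_cursor is None
    once the dataset is exhausted."""
    records, cursor = [], None
    while True:
        page, cursor = fetch(cursor)
        records.extend(page)
        if cursor is None or not page:
            break
    return records

# Stub standing in for requestSearch: 60 fake records served 25 at a time.
def fake_fetch(cursor, data=list(range(60)), size=25):
    start = 0 if cursor is None else cursor
    page = data[start:start + size]
    next_cursor = start + size if start + size < len(data) else None
    return page, next_cursor

print(len(paginate(fake_fetch)))  # 60
```

Stopping on an empty page as well as on a missing cursor guards against a server that keeps returning the final cursor.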
payloadIn1 = """
{
FinancialFiling(filter: {AND: [
{FilingDocument: {DocumentSummary: {FormType: {EQ: "10-Q"}}}},
{FilingDocument: {DocumentSummary: {FilingDate: {BETWN: {FROM: "2021-02-12T00:00:00Z", TO: "2021-02-12T23:59:59Z"}}}}}]},
sort: {FilingDocument: {DocumentSummary: {FilingDate: DESC}}},
limit: 25
cursor: """
payloadIn2 = """
) {
_metadata {
totalCount
cursor
}
FilingOrganization {
Names {
Name {
OrganizationName (filter: {OrganizationNameTypeCode: {EQ: "LNG"}}){
OrganizationName
}
}
}
}
FilingDocument {
Identifiers {
OrganizationId
Dcn
}
DocId
FinancialFilingId
DocumentSummary {
DocumentTitle
FeedName
FormType
HighLevelCategory
MidLevelCategory
FilingDate
SecAccessionNumber
SizeInBytes
}
FilesMetaData {
FileName
MimeType
}
}
}
}
"""
print("Request=" + payloadIn1 + "\"" + str(cursor) + "\"" + payloadIn2)
jsonFullResp = requestSearch(accessToken, payloadIn1 + "\"" + str(cursor) + "\"" + payloadIn2)
print('Parsed json response=')
print(json.dumps(jsonFullResp, indent=2))
Search by OrganizationId
The next example searches for all filings documents for Tesla in 2021.
payloadIn = """
{
FinancialFiling(filter: {AND: [{FilingDocument: {Identifiers: {OrganizationId: {EQ: "4297089638"}}}},
{FilingDocument: {DocumentSummary: {FilingDate: {BETWN: {FROM: "2021-01-01T00:00:00Z", TO: "2021-12-31T23:59:59Z"}}}}}]},
sort: {FilingDocument: {DocumentSummary: {FilingDate: DESC}}},
limit: 10) {
_metadata {
totalCount
}
FilingOrganization {
Names {
Name {
OrganizationName (filter: {OrganizationNameTypeCode: {EQ: "LNG"}}){
OrganizationName
}
}
}
}
FilingDocument {
Identifiers {
OrganizationId
Dcn
}
DocId
FinancialFilingId
DocumentSummary {
DocumentTitle
FeedName
FormType
HighLevelCategory
MidLevelCategory
FilingDate
SecAccessionNumber
SizeInBytes
}
FilesMetaData {
FileName
MimeType
}
}
}
}
"""
jsonFullResp = requestSearch(accessToken, payloadIn)
print('Parsed json response=')
print(json.dumps(jsonFullResp, indent=2))
Keyword Search by Document Text
Another available feature is keyword search against document or section text.
payloadIn = """
{
FinancialFiling(
sort: {FilingDocument: {DocumentSummary: {FilingDate: DESC}}},
filter: {FilingDocument: {DocumentSummary: {FilingDate: {BETWN: {FROM: "2020-07-01T00:00:00Z", TO: "2020-08-01T00:00:00Z"}}}}},
keywords: {searchstring: "FinancialFiling.FilingDocument.DocumentText:COVID-19"},
limit: 5) {
_metadata {
totalCount
}
FilingOrganization {
Names {
Name {
OrganizationName(
filter: {AND: [ {
OrganizationNameLanguageId: {EQ: "505062"}}, {
OrganizationNameTypeCode: {EQ: "LNG"}}]})
{
OrganizationName
}
}
}
}
FilingDocument {
DocId
DocumentSummary {
DocumentTitle
FilingDate
FormType
FeedName
}
DocumentText
}
}
}
"""
jsonFullResp = requestSearch(accessToken, payloadIn)
print('Parsed json response=')
print(json.dumps(jsonFullResp, indent=2))
docId = jsonFullResp["data"]["FinancialFiling"][0]["FilingDocument"]["DocId"]
print('DocId is', str(docId))
Download Filings Documents
There are four identifiers, or retrieval methods, you can use to download a document.
- FilingId (FilingId, or Financial Filing Id, is an internal permanent identifier assigned to each filings document. This is our strategic filings identifier.)
- Dcn (Dcn, also known as Document Control Number, is an external identifier and an enclosed film-number specific to Edgar documents.)
- DocId (DocId, or Document Identifier, is an internal identifier assigned to financial filings documents.)
- Filename (Filename provides a faster and direct route to download documents without going through a resolver.)
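Each identifier kind maps to a path suffix, such as 'docId/54932207', that is appended to the retrieval endpoint. A hypothetical builder for those suffixes (the exact casing of the kind names is an assumption to verify against the API reference):

```python
# Kind names mirroring the four retrieval methods listed above (assumed casing)
RETRIEVAL_KINDS = {"docId", "filingId", "dcn", "filename"}

def build_retrieval_path(kind, value):
    """Build the path suffix passed to the retrieval endpoint,
    e.g. build_retrieval_path('docId', 54932207) -> 'docId/54932207'."""
    if kind not in RETRIEVAL_KINDS:
        raise ValueError(f"unknown retrieval kind: {kind}")
    return f"{kind}/{value}"

print(build_retrieval_path("docId", 54932207))   # docId/54932207
```

Centralizing the path construction keeps a typo in the kind name from silently producing a 404.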
Define Helper Function retrieveURL
def retrieveURL(token, retrievalParameters):
    ENDPOINT_DOC_RETRIEVAL = RDP_BASE_URL + '/data/filings' + RDP_FILINGS_VERSION + '/retrieval/search/' + retrievalParameters
    headers = {
        "Authorization": "Bearer " + token,
        "X-API-Key": "155d9dbf-f0ac-46d9-8b77-f7f6dcd238f8",
        "ClientID": "api_playground"
    }
    print("Next we retrieve: " + ENDPOINT_DOC_RETRIEVAL)
    response = requests.get(ENDPOINT_DOC_RETRIEVAL, headers=headers)
    print("Response status code =" + str(response.status_code))
    if response.status_code == 401:  # error when the token has expired
        token = getToken()           # refresh the token and retry once
        headers["Authorization"] = "Bearer " + token
        response = requests.get(ENDPOINT_DOC_RETRIEVAL, headers=headers)
        print("Response status code =" + str(response.status_code))
    if response.status_code == 200:
        return json.loads(response.text)
    return ''
Retrieve URL by DocID
jsonFullResp = retrieveURL(accessToken, 'docId/54932207')
print('full response is =')
print(json.dumps(jsonFullResp, indent=2))
fileName = list(jsonFullResp.keys())[0]
print("fileName is: ")
print(fileName)
signedUrl = jsonFullResp[fileName]["signedUrl"]
print("signedUrl to retrieve is: ")
print(signedUrl)
Retrieve URL by FilingId
jsonFullResp = retrieveURL(accessToken, 'filingId/97661417885')
print('full response is =')
print(json.dumps(jsonFullResp, indent=2))
fileName = list(jsonFullResp.keys())[0]
print("fileName is: ")
print(fileName)
signedUrl = jsonFullResp[fileName]["signedUrl"]
print("signedUrl to retrieve is: ")
print(signedUrl)
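The retrieval responses above are keyed by file name, each entry carrying a signedUrl. When a response holds more than one file, all pairs can be iterated rather than taking only the first key; extract_files is an illustrative helper assuming that response shape:

```python
def extract_files(resp):
    """Yield (fileName, signedUrl) pairs from a retrieval response shaped
    like {fileName: {"signedUrl": ...}, ...}."""
    for file_name, info in resp.items():
        yield file_name, info["signedUrl"]

# Fabricated sample response, for illustration only
sample = {
    "report.pdf": {"signedUrl": "https://example.com/report.pdf?sig=abc"},
    "exhibit.htm": {"signedUrl": "https://example.com/exhibit.htm?sig=def"},
}
for name, url in extract_files(sample):
    print(name, url)
```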
Download the Document
Now we are ready to download the filings document and save it under the downloads folder.
import os

def retrieveSaveDoc(fileName, signedUrl, token):
    headers = {
        'clientId': CLIENT_ID,
        'Authorization': "Bearer " + token
    }
    response = requests.get(signedUrl, headers=headers, allow_redirects=True)
    if response.status_code != 200:
        print("Response code on error is:", str(response.status_code))
        return ''
    filenameWithDir = './downloads/' + str(fileName)
    os.makedirs(os.path.dirname(filenameWithDir), exist_ok=True)
    with open(filenameWithDir, 'wb') as f:
        f.write(response.content)
    print("The document", fileName, "has been downloaded into the downloads subfolder")
    return fileName
retrieveSaveDoc(fileName, signedUrl, accessToken)
At this point, a filings PDF file is stored under the downloads folder.
For more common use-case examples that can be implemented in Python analogously, see the Filings Developer Guide, section Use Cases.