
Using Research API with Serverless application Part 2

Veerapath Rungruengrayubkul
Developer Advocate

This is the second part of the Using Research API with the Serverless application article. The first part is here.

After the previous article, you should have created the Lambda functions, the parameters in SSM Parameter Store, and the AWS S3 bucket used by the application. This article describes how to implement and create a serverless application that coordinates all the functions and other services with Step Functions.

Step Functions Implementation Overview

Step Functions is based on the concepts of tasks and state machines. You define state machines using the JSON-based Amazon States Language. States can perform a variety of functions in your state machine:

  • Do some work in your state machine (a Task state)
  • Make a choice between branches of execution (a Choice state)
  • Stop execution with a failure or success (a Fail or Succeed state)
  • Simply pass its input to its output or inject some fixed data (a Pass state)
  • Provide a delay for a certain amount of time or until a specified time/date (a Wait state)
  • Begin parallel branches of execution (a Parallel state)
  • Dynamically iterate steps (a Map state)
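
To make the state types more concrete, here is a minimal sketch in Python that builds a tiny Amazon States Language definition and serializes it to JSON. The state names ("Inject Data", "Wait A Moment", "Done") and the helper undefined_targets are hypothetical and are not part of the application in this article; the sketch only illustrates how Pass, Wait and Succeed states are chained with "Next".

```python
import json

# Hypothetical three-state machine: inject fixed data, wait, then succeed.
definition = {
    "Comment": "Minimal illustration of Pass, Wait and Succeed states",
    "StartAt": "Inject Data",
    "States": {
        "Inject Data": {
            "Type": "Pass",
            "Result": {"status": "None"},   # fixed data injected by the Pass state
            "ResultPath": "$.queueStatus",  # where the data lands in the state output
            "Next": "Wait A Moment",
        },
        "Wait A Moment": {"Type": "Wait", "Seconds": 10, "Next": "Done"},
        "Done": {"Type": "Succeed"},
    },
}


def undefined_targets(defn):
    """Return StartAt/Next targets that are not defined as top-level states."""
    states = defn["States"]
    targets = {defn["StartAt"]}
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
    return sorted(targets - set(states))


# The JSON text is what you would paste into the state machine definition editor.
asl_text = json.dumps(definition, indent=2)
print(asl_text)
print("Undefined targets:", undefined_targets(definition))
```

The check only covers top-level "Next" transitions; Choice rules and nested branches would need extra handling.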

The application in this article uses several of these state types. The following diagram shows the type of each state.

 

 

  • Most states are Task states, which execute Lambda functions to request an RDP access token (getEDPToken), subscribe to Research Alerts (subscribeResearch), poll the SQS queue (getAlertMessage), download documents (downloadDocuments), etc.
  • The Get Alert Message and Refresh Token functions are placed in a Parallel state so that they execute in parallel.
  • Check Status is a Choice state that invokes the download document function when a new message has been received from SQS.
  • Download Documents is defined as a Map state, which iterates over the list of document IDs and invokes a Lambda function to download/upload a document for each ID in parallel. This should improve the performance of the application when multiple Alert messages are received from the queue.
  • Wait X Seconds is defined as a Wait state, which delays the state machine from continuing for a specified time. It sets the interval between polls of the queue for a new message. For this application, the interval is 10 seconds.
  • Queue Failed is defined as a Fail state, which stops execution when the application fails to receive a message from SQS.
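
The control flow described above can be sketched in plain Python. This is only an analogy for reasoning about the workflow, not how Step Functions is implemented. The status strings and target state names are taken from the state machine definition in this article; next_state, download_all and the download_one callback are hypothetical names introduced for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

# Status values and targets mirror the rules of the "Check Status" Choice state.
ROUTES = {
    "Doc Available": "Download Documents",
    "Queue Failed": "Queue Failed",
    "ExpiredToken": "Get Cloud Credential",
    "None": "Wait X Seconds",
}


def next_state(queue_status):
    """Pick the next state from queueStatus['status'].

    Any unlisted status falls back to "Wait X Seconds", like the
    "Default" field of the Choice state.
    """
    return ROUTES.get(queue_status["status"], "Wait X Seconds")


def download_all(doc_ids, download_one, max_concurrency=10):
    """Call download_one for every document ID with bounded parallelism.

    This plays the role of the Map state; max_concurrency corresponds
    to its MaxConcurrency setting. Results keep the order of doc_ids.
    """
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        return list(pool.map(download_one, doc_ids))


print(next_state({"status": "Doc Available"}))
print(download_all(["id-1", "id-2"], lambda doc_id: doc_id + ": done"))
```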

 

Create Step Functions

Please follow the instructions below to create a Step Functions state machine that coordinates the Lambda functions created in the previous article.

  • Open the AWS Step Functions console and click the “Create state machine” button.
  • In Step 1: Define state machine, select “Author with code snippets” and enter a name for the state machine.
  • Copy the following code into the State machine definition. You need to correct the ARN of each Lambda function in the “Resource” fields of the code.
    	
            

{
  "Comment": "Subscribes to Research alerts, polls the SQS queue for new messages, and downloads new documents.",
  "StartAt": "Get EDP Token",
  "States": {
    "Get EDP Token": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<user>:function:getEDPToken",
      "Next": "Subscribe Research"
    },
    "Subscribe Research": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<user>:function:subscribeResearch",
      "ResultPath": "$.subscriptionInfo",
      "Next": "Get Cloud Credential"
    },
    "Get Cloud Credential": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:<user>:function:getCloudCredential",
      "InputPath": "$.subscriptionInfo",
      "ResultPath": "$.cloudCredentialInfo",
      "Next": "Wait X Seconds"
    },
    "Wait X Seconds": {
      "Type": "Wait",
      "Seconds": 10,
      "Next": "Get Message and Refresh Token"
    },
    "Get Message and Refresh Token": {
      "Type": "Parallel",
      "Next": "Check Status",
      "Branches": [
        {
          "StartAt": "Get Alert Message",
          "States": {
            "Get Alert Message": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-east-1:<user>:function:getAlertMessage",
              "End": true,
              "ResultPath": "$.queueStatus",
              "Parameters": {
                "endpoint.$": "$.subscriptionInfo.endpoint",
                "cryptographyKey.$": "$.subscriptionInfo.cryptographyKey",
                "accessKeyId.$": "$.cloudCredentialInfo.accessKeyId",
                "secretKey.$": "$.cloudCredentialInfo.secretKey",
                "sessionToken.$": "$.cloudCredentialInfo.sessionToken"
              }
            }
          }
        },
        {
          "StartAt": "Refresh Token",
          "States": {
            "Refresh Token": {
              "Type": "Task",
              "Resource": "arn:aws:lambda:us-east-1:<user>:function:refreshToken",
              "End": true
            }
          }
        }
      ]
    },
    "Check Status": {
      "Type": "Choice",
      "InputPath": "$.[0]",
      "Choices": [
        {
          "Variable": "$.queueStatus['status']",
          "StringEquals": "Doc Available",
          "Next": "Download Documents"
        },
        {
          "Variable": "$.queueStatus['status']",
          "StringEquals": "Queue Failed",
          "Next": "Queue Failed"
        },
        {
          "Variable": "$.queueStatus['status']",
          "StringEquals": "ExpiredToken",
          "Next": "Get Cloud Credential"
        },
        {
          "Variable": "$.queueStatus['status']",
          "StringEquals": "None",
          "Next": "Wait X Seconds"
        }
      ],
      "Default": "Wait X Seconds"
    },
    "Queue Failed": {
      "Type": "Fail",
      "Cause": "SQS returned error",
      "Error": "SQS FAILED"
    },
    "Download Documents": {
      "Type": "Map",
      "InputPath": "$.queueStatus",
      "ItemsPath": "$.docIds",
      "MaxConcurrency": 10,
      "ResultPath": "$.status",
      "Iterator": {
        "StartAt": "Download Document",
        "States": {
          "Download Document": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:<user>:function:downloadDocuments",
            "End": true
          }
        }
      },
      "Next": "Wait X Seconds"
    }
  }
}

 

  • Click the “Next” button to go to the next step.
  • In Step 2: Configure settings, select “Create an IAM role for me”, and then enter an IAM role name.
  • Finally, click “Create state machine”.
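
If you prefer to script this instead of using the console, the following is a minimal sketch with the boto3 Step Functions client. It assumes that the <user> placeholder in the "Resource" fields stands for your AWS account ID and that an IAM role for the state machine already exists (the API does not create one for you, unlike the console option above). The function names fill_account_id and create_state_machine_from, and the file name in the usage comment, are hypothetical.

```python
import json


def fill_account_id(template, account_id):
    """Replace the <user> placeholder in every ARN and parse the JSON.

    Assumption: <user> stands for the AWS account ID.
    """
    return json.loads(template.replace("<user>", account_id))


def create_state_machine_from(sfn_client, name, definition, role_arn):
    """Create the state machine and return its ARN.

    sfn_client is expected to be boto3.client("stepfunctions").
    """
    response = sfn_client.create_state_machine(
        name=name,
        definition=json.dumps(definition),
        roleArn=role_arn,  # must already exist and allow invoking the Lambda functions
    )
    return response["stateMachineArn"]


# Example usage (needs boto3 and AWS credentials; names are placeholders):
#   import boto3
#   with open("state_machine.json") as f:          # hypothetical file name
#       definition = fill_account_id(f.read(), "123456789012")
#   arn = create_state_machine_from(
#       boto3.client("stepfunctions", region_name="us-east-1"),
#       "ResearchStateMachine",
#       definition,
#       "arn:aws:iam::123456789012:role/ResearchStateMachineRole",
#   )
```

Passing the client in as an argument keeps the functions easy to try out with a stub before touching a real account.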

 

 

The definition generates a workflow similar to the following.

 

 

Steps to run the serverless application

1. Set the following parameters in SSM Parameter Store. If you do not have the RDP username, password, and UUID, please contact your local Refinitiv representative. To create the Client ID (App Key), please follow the steps in this tutorial. BucketStorage is the name of the AWS S3 bucket created in the setup steps.

  • EDPUsername
  • EDPPassword
  • EDPClientId
  • UUID
  • BucketStorage

2. Open the Step Functions console and select the state machine you created in the setup steps.

3. Click the "Start Execution" button to start the application. A new pop-up window opens for entering comments. Just click the "Start Execution" button again. The Visual Workflow panel then shows the application running through each Lambda function step, together with its status.

 

 

4. Once a new Alert is available, the Research information will be stored in the S3 bucket defined in the BucketStorage parameter.
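
Steps 1 to 3 can also be scripted. The sketch below uses boto3 clients passed in by the caller: it checks that the five parameters exist in SSM Parameter Store, starts an execution, and reads its status. It assumes the parameters are stored under exactly the names listed above (no path prefix) and that the state machine accepts an empty JSON object as input; both are assumptions, and the function names are hypothetical.

```python
import json

REQUIRED_PARAMETERS = ["EDPUsername", "EDPPassword", "EDPClientId", "UUID", "BucketStorage"]


def missing_parameters(ssm_client, names=REQUIRED_PARAMETERS):
    """Return the parameter names that SSM Parameter Store does not know.

    ssm_client is expected to be boto3.client("ssm").
    """
    response = ssm_client.get_parameters(Names=list(names), WithDecryption=True)
    return sorted(response.get("InvalidParameters", []))


def start_application(sfn_client, state_machine_arn, payload=None):
    """Start one execution of the state machine and return the execution ARN.

    sfn_client is expected to be boto3.client("stepfunctions").
    Assumption: an empty JSON object is an acceptable execution input.
    """
    response = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(payload or {}),
    )
    return response["executionArn"]


def execution_status(sfn_client, execution_arn):
    """Return the execution status, e.g. RUNNING, SUCCEEDED or FAILED."""
    return sfn_client.describe_execution(executionArn=execution_arn)["status"]


# Example usage (needs boto3 and AWS credentials):
#   import boto3
#   missing = missing_parameters(boto3.client("ssm"))
#   if not missing:
#       sfn = boto3.client("stepfunctions")
#       arn = start_application(sfn, "<your state machine ARN>")
#       print(execution_status(sfn, arn))
```

Because the workflow loops back to Wait X Seconds, the status is expected to stay RUNNING until the execution is stopped or reaches the Queue Failed state.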

 

 

Finally, new Research documents will be uploaded continually as long as the application is executing. You can manually download or open the files via the Amazon S3 console, or integrate Amazon S3 with other AWS services as needed, for example by triggering a new Lambda function.
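
As an illustration of the Lambda integration, here is a minimal sketch of a handler that could be attached to the bucket's "object created" event notification. It is not part of the downloadable project; it only extracts the bucket name and object key from the standard S3 event structure, and what you do with each new document is up to you.

```python
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Collect the bucket and key of each object reported in an S3 event notification."""
    uploaded = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 events (e.g. spaces as "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        uploaded.append({"bucket": bucket, "key": key})
    print(uploaded)  # printed output goes to the function's CloudWatch log
    return uploaded
```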

Diagnostics and Troubleshooting

In the console, Step Functions provides the Execution Event History, which can help you get more information when an execution fails.

For example:

  • For the TaskStateEntered event type, the event displays the input data passed to the function when the state was entered.

 

 

  • The event can also link to CloudWatch Logs. The log displays all logger messages for a specific Lambda function. You can add debug output to the function or verify the sequence of events there.

 

 

 

 

  • If the Step Functions execution fails with the following message: “The cause could not be determined because Lambda did not return an error type.”, the likely cause is a timeout in the Lambda function. This indicates that the function could not complete within its configured timeout. You may need to increase the timeout configured for the function.
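
The same event history is available programmatically. The following sketch, assuming a boto3 Step Functions client is passed in, lists the input of each TaskStateEntered event of one execution. The function name task_inputs is hypothetical, and only the first page of results is read, which is a simplification for long-running executions.

```python
def task_inputs(sfn_client, execution_arn, max_results=100):
    """Return (state name, input JSON string) for TaskStateEntered events, newest first.

    sfn_client is expected to be boto3.client("stepfunctions").
    Only the first page of the history is read (no nextToken handling).
    """
    response = sfn_client.get_execution_history(
        executionArn=execution_arn,
        maxResults=max_results,
        reverseOrder=True,  # most recent events first
    )
    inputs = []
    for event in response["events"]:
        if event["type"] != "TaskStateEntered":
            continue
        details = event.get("stateEnteredEventDetails", {})
        inputs.append((details.get("name"), details.get("input", "")))
    return inputs


# Example usage (needs boto3 and AWS credentials):
#   import boto3
#   for name, data in task_inputs(boto3.client("stepfunctions"), "<execution ARN>"):
#       print(name, data)
```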

Conclusion

In this article, we demonstrated how to create a serverless application with AWS services that continually retrieves RDP Research information and stores it in cloud storage.

DOWNLOADS

Article.EDPResearch.Python.ServerlessApplication