Last update | Dec 2023 |
Environment | Any |
Language | Any language that supports HTTP |
Compilers | None |
Prerequisites | DSS login, internet access |
Source code | Below |
This tutorial is a prerequisite for the following tutorials.
It explains the workflow for a raw data extraction, using an On Demand extraction request.
It also gives some tips on request tuning for best performance.
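The raw data extraction workflow applies to several types of On Demand historical data requests, which are covered in the following tutorials: tick data, market depth, intraday bars and end of day.
The basic steps are: retrieve the list of available fields (optional), request the extraction, check the request status until the extraction completes, and retrieve the data.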
This step is optional. If you do not know what content field names are available, you can request a list of those available.
The available field set depends on the type of data you want to request. The data type must therefore be specified in the request, which is done by setting a report template type in the URL.
List of possible values:
Data type | Report template type |
---|---|
Tick data | TickHistoryTimeAndSales |
Market depth | TickHistoryMarketDepth |
Intraday bars | TickHistoryIntradaySummaries |
End of Day | ElektronTimeseries |
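The mapping in the table above can be turned into a small URL builder. This is a minimal sketch: the helper name `field_list_url` is hypothetical, and it assumes the other three data types use the same URL pattern as the tick data call documented in this tutorial.

```python
BASE_URL = "https://selectapi.datascope.refinitiv.com/RestApi/v1"

# Report template types, copied from the table above.
REPORT_TEMPLATE_TYPES = {
    "Tick data": "TickHistoryTimeAndSales",
    "Market depth": "TickHistoryMarketDepth",
    "Intraday bars": "TickHistoryIntradaySummaries",
    "End of Day": "ElektronTimeseries",
}

# Enum prefix used in the tick data example URL of this tutorial.
ENUM_PREFIX = "DataScope.Select.Api.Extractions.ReportTemplates.ReportTemplateTypes"


def field_list_url(data_type: str) -> str:
    """Build the GetValidContentFieldTypes URL for one data type (hypothetical helper)."""
    template = REPORT_TEMPLATE_TYPES[data_type]  # KeyError for an unknown data type
    return (
        f"{BASE_URL}/Extractions/GetValidContentFieldTypes"
        f"(ReportTemplateType={ENUM_PREFIX}'{template}')"
    )


print(field_list_url("Tick data"))
```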
As an example, here is the call to retrieve the available fields for tick data:
URL:
https://selectapi.datascope.refinitiv.com/RestApi/v1/Extractions/GetValidContentFieldTypes(ReportTemplateType=DataScope.Select.Api.Extractions.ReportTemplates.ReportTemplateTypes'TickHistoryTimeAndSales')
Method: GET
Headers:
Note: for all requests we need a user token, set in the header. The token was retrieved in Tutorial 1.
Prefer: respond-async
Content-Type: application/json
Authorization: Token F0ABE9A3FFF2E02E10AE2765ED872C59B8CC3B40EBB61B30E295E71DE31C254B8648DB9434C2DF9299FDC668AA123501F322D99D45C8B93438063C912BC936C7B87062B0CF812138863F5D836A7B31A32DCA67EF07B3B50B2FC4978DF6F76784FDF35FCB523A8430DA93613BC5730CDC310D4D241718F9FC3F2E55465A24957CC287BDEC79046B31AD642606275AEAD76318CB221BD843348E1483670DA13968D8A242AAFCF9E13E23240C905AE46DED9EDCA9BB316B4C5C767B18DB2EA7ADD100817ADF059D01394BC6375BECAF6138C25DBA57577F0061
Response headers:
Content-Type: application/json; charset=utf-8
Body:
As an example, here is the beginning of the response, for tick data:
{
"@odata.context": "https://selectapi.datascope.refinitiv.com/RestApi/v1/$metadata#ContentFieldTypes",
"value": [
{
"Code": "THT.Auction - Exchange Time",
"Name": "Auction - Exchange Time",
"Description": "Exchange supplied exchange time (Local or GMT depending on the exchange)",
"FormatType": "Text",
"FieldGroup": "Auction"
},
{
"Code": "THT.Auction - Price",
"Name": "Auction - Price",
"Description": "Auction Price",
"FormatType": "Number",
"FieldGroup": "Auction"
},
This goes on with all the other available fields. Here is the last part:
{
"Code": "THT.Trade - Yield",
"Name": "Trade - Yield",
"Description": "An update to indicate Dividend Yield as adjusted by the last trade or closing price",
"FormatType": "Number",
"FieldGroup": "Trade"
}
]
}
For each field, the resulting records contain its Code, Name, Description, FormatType and FieldGroup.
Use this to choose all the field names you want. In the next step we will make a request for data, using some of these fields.
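To help choose fields, the response can be grouped by FieldGroup. The sketch below is illustrative only: `fields_by_group` is a hypothetical helper, and the sample contains just the three records shown in this tutorial. It collects the Name values, since those match the strings used in the extraction request example of this tutorial.

```python
import json
from collections import defaultdict
from typing import Dict, List


def fields_by_group(response_text: str) -> Dict[str, List[str]]:
    """Group field names by their FieldGroup (hypothetical helper)."""
    payload = json.loads(response_text)
    groups = defaultdict(list)
    for field in payload["value"]:
        groups[field["FieldGroup"]].append(field["Name"])
    return dict(groups)


# Reduced sample built from the records shown in this tutorial.
sample = """
{
  "value": [
    {"Code": "THT.Auction - Exchange Time", "Name": "Auction - Exchange Time",
     "FormatType": "Text", "FieldGroup": "Auction"},
    {"Code": "THT.Auction - Price", "Name": "Auction - Price",
     "FormatType": "Number", "FieldGroup": "Auction"},
    {"Code": "THT.Trade - Yield", "Name": "Trade - Yield",
     "FormatType": "Number", "FieldGroup": "Trade"}
  ]
}
"""

for group, names in fields_by_group(sample).items():
    print(group, names)
```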
This is an On Demand extraction request, which means it will be immediately queued, then executed as soon as possible.
URL:
The raw extraction URL is the same for all data types.
https://selectapi.datascope.refinitiv.com/RestApi/v1/Extractions/ExtractRaw
Method: POST
Headers:
The raw extraction header is the same for all data types.
To avoid keeping the connection open and the caller code waiting, we set the preference to an asynchronous response.
For all requests we need to set the user token in the authorization header; the token was retrieved in Tutorial 1.
Prefer: respond-async
Content-Type: application/json
Authorization: Token F0ABE9A3FFF2E02E10AE2765ED872C59B8CC3B40EBB61B30E295E71DE31C254B8648DB9434C2DF9299FDC668AA123501F322D99D45C8B93438063C912BC936C7B87062B0CF812138863F5D836A7B31A32DCA67EF07B3B50B2FC4978DF6F76784FDF35FCB523A8430DA93613BC5730CDC310D4D241718F9FC3F2E55465A24957CC287BDEC79046B31AD642606275AEAD76318CB221BD843348E1483670DA13968D8A242AAFCF9E13E23240C905AE46DED9EDCA9BB316B4C5C767B18DB2EA7ADD100817ADF059D01394BC6375BECAF6138C25DBA57577F0061
Body:
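The body content varies with the type of data. It declares the extraction request type (set in @odata.type, see the table below), then lists the content field names, the instrument identifiers and the conditions.
Data type | Extraction request type |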
---|---|
Tick data | TickHistoryTimeAndSalesExtractionRequest |
Market depth | TickHistoryMarketDepthExtractionRequest |
Intraday bars | TickHistoryIntradaySummariesExtractionRequest |
End of Day | ElektronTimeseriesExtractionRequest |
As an example, here is the body of a request for tick data:
{
"ExtractionRequest": {
"@odata.type": "#DataScope.Select.Api.Extractions.ExtractionRequests.TickHistoryTimeAndSalesExtractionRequest",
"ContentFieldNames": [
"Trade - Price",
"Trade - Volume",
"Trade - Exchange Time"
],
"IdentifierList": {
"@odata.type": "#DataScope.Select.Api.Extractions.ExtractionRequests.InstrumentIdentifierList",
"InstrumentIdentifiers": [
{ "Identifier": "CARR.PA", "IdentifierType": "Ric" }
]
},
"Condition": {
"MessageTimeStampIn": "GmtUtc",
"ApplyCorrectionsAndCancellations": false,
"ReportDateRangeType": "Range",
"QueryStartDate": "2016-09-29T00:00:00.000Z",
"QueryEndDate": "2016-09-29T12:00:00.000Z",
"DisplaySourceRIC": true
}
}
}
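The request above could be assembled in code as follows. This is a minimal sketch using only the Python standard library. The function name and its parameters are hypothetical, while the URL, headers and body fields are those shown in this tutorial. The request is only built here, not sent.

```python
import json
import urllib.request
from typing import Iterable

EXTRACT_RAW_URL = (
    "https://selectapi.datascope.refinitiv.com/RestApi/v1/Extractions/ExtractRaw"
)
REQUEST_NAMESPACE = "#DataScope.Select.Api.Extractions.ExtractionRequests."


def build_tick_request(
    token: str, rics: Iterable[str], fields: Iterable[str], start: str, end: str
) -> urllib.request.Request:
    """Build (but do not send) a tick data extraction request (hypothetical helper)."""
    body = {
        "ExtractionRequest": {
            "@odata.type": REQUEST_NAMESPACE + "TickHistoryTimeAndSalesExtractionRequest",
            "ContentFieldNames": list(fields),
            "IdentifierList": {
                "@odata.type": REQUEST_NAMESPACE + "InstrumentIdentifierList",
                "InstrumentIdentifiers": [
                    {"Identifier": ric, "IdentifierType": "Ric"} for ric in rics
                ],
            },
            "Condition": {
                "MessageTimeStampIn": "GmtUtc",
                "ApplyCorrectionsAndCancellations": False,
                "ReportDateRangeType": "Range",
                "QueryStartDate": start,
                "QueryEndDate": end,
                "DisplaySourceRIC": True,
            },
        }
    }
    headers = {
        "Prefer": "respond-async",  # ask for an asynchronous response
        "Content-Type": "application/json",
        "Authorization": "Token " + token,  # token retrieved in Tutorial 1
    }
    return urllib.request.Request(
        EXTRACT_RAW_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


request = build_tick_request(
    "YOUR_TOKEN",
    ["CARR.PA"],
    ["Trade - Price", "Trade - Volume", "Trade - Exchange Time"],
    "2016-09-29T00:00:00.000Z",
    "2016-09-29T12:00:00.000Z",
)
```

Passing the resulting object to `urllib.request.urlopen` would submit it. Any other HTTP client can be used in the same way.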
On Demand extraction requests are executed as soon as possible. There is no guarantee on the delivery time; it depends on the amount of requested data and on the server load.
In the request we set a preference for an asynchronous response. We will get a response in 30 seconds (default wait time) or less.
The HTTP status of the response can have one of several values; the most likely ones are 202 Accepted (the request is still being processed) and 200 OK (the request has completed).
It is strongly recommended that you ensure your code handles all possible status codes.
When requests take more than 30 seconds, a 202 Accepted is returned as the first response. Usually Tick History requests will take more than 30 seconds, which means that 202 Accepted will be the normal first response.
You can customize the wait time, but this is not recommended.
Let us now look at the two most common responses in detail.
The request was accepted, but processing has not yet completed. This response is the most likely, especially if the request is for a large amount of data.
Status: 202 Accepted
Relevant headers:
Location: https://selectapi.datascope.refinitiv.com/RestApi/v1/Extractions/ExtractRawResult(ExtractionId='0x0785d7e9572c76b1')
The Location URL must be saved; we will use it in the next step to check the request status. Note: the last part of the URL (0x0785d7e9572c76b1) is the job ID for this request.
Body: Response does not contain any data
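For logging purposes, the job ID can be pulled out of the Location header. The helper below is a hypothetical sketch that assumes the Location value always has the `ExtractionId='...'` form shown above. The polling step itself uses the full Location URL unchanged.

```python
import re


def job_id_from_location(location: str) -> str:
    """Extract the job ID from a Location URL (hypothetical helper)."""
    match = re.search(r"ExtractionId='([^']+)'", location)
    if match is None:
        raise ValueError(f"No ExtractionId found in: {location}")
    return match.group(1)


location = (
    "https://selectapi.datascope.refinitiv.com/RestApi/v1/Extractions/"
    "ExtractRawResult(ExtractionId='0x0785d7e9572c76b1')"
)
print(job_id_from_location(location))
```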
Instead of a 202 Accepted, we could receive a 200 OK. This means the request has completed.
Status: 200 OK
Relevant headers:
Content-Type: application/json; charset=utf-8
Body:
{
"@odata.context": "https://selectapi.datascope.refinitiv.com/RestApi/v1/$metadata#RawExtractionResults/$entity",
"JobId": "0x0785d7e9572c76b1",
"Notes": [
"Extraction Services Version 14.5.42294 (737b0965c07f), Built Apr 8 2021 13:43:46\nUser ID: 9008895\nExtraction ID: 2000000249659457\nSchedule: 0x0785d7e9572c76b1 (ID = 0x0000000000000000)\nInput List (1 items): (ID = 0x0785d7e9572c76b1) Created: 04/19/2021 10:23:30 Last Modified: 04/19/2021 10:23:30\nReport Template (3 fields): _OnD_0x0785d7e9572c76b1 (ID = 0x0785d7e9574c76b1) Created: 04/19/2021 10:21:53 Last Modified: 04/19/2021 10:21:53\nSchedule dispatched via message queue (0x0785d7e9572c76b1), Data source identifier (ADF5F7B662B34B91A118EEF071688A29)\nSchedule Time: 04/19/2021 10:21:55\nProcessing started at 04/19/2021 10:21:55\nProcessing completed successfully at 04/19/2021 10:23:30\nExtraction finished at 04/19/2021 10:23:30 UTC, with servers: tm02n02\nInstrument <RIC,CARR.PA> expanded to 1 RIC: CARR.PA.\nQuota Message: INFO: Tick History Cash Quota Count Before Extraction: 998; Instruments Extracted: 1; Tick History Cash Quota Count After Extraction: 998, 99.8% of Limit; Tick History Cash Quota Limit: 1000\nManifest: #RIC,Domain,Start,End,Status,Count\nManifest: CARR.PA,Market Price,2016-09-29T07:00:11.672415651Z,2016-09-29T11:59:46.552806988Z,Active,3620\n"
]
}
The JobId value must be saved; we will use it to retrieve the data.
The Notes contain important information on the request: IDs, timestamps, possible errors, and extraction quota status. If the request completed successfully, they also contain the message: Processing completed successfully. It is recommended to store the Notes and analyze their text to detect issues, warnings or errors.
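A basic analysis of the Notes might look like the sketch below. The success message and the quota line format are taken from the sample response above. Flagging lines that contain the words "error" or "warning" is an assumption about how problems are reported, and `summarize_notes` is a hypothetical helper.

```python
import re
from typing import Any, Dict, List


def summarize_notes(notes: List[str]) -> Dict[str, Any]:
    """Extract a few indicators from the Notes array (hypothetical helper)."""
    text = "\n".join(notes)
    # Quota line format as seen in the sample response of this tutorial.
    quota = re.search(r"Quota Count After Extraction: (\d+), ([\d.]+)% of Limit", text)
    # Assumption: problem lines mention "error" or "warning" in some letter case.
    flagged = [
        line
        for line in text.splitlines()
        if re.search(r"\b(error|warning)\b", line, re.IGNORECASE)
    ]
    return {
        "completed": "Processing completed successfully" in text,
        "quota_used_percent": float(quota.group(2)) if quota else None,
        "flagged_lines": flagged,
    }


# Shortened copy of the Notes shown above.
sample_notes = [
    "Processing started at 04/19/2021 10:21:55\n"
    "Processing completed successfully at 04/19/2021 10:23:30\n"
    "Quota Message: INFO: Tick History Cash Quota Count Before Extraction: 998; "
    "Instruments Extracted: 1; Tick History Cash Quota Count After Extraction: 998, "
    "99.8% of Limit; Tick History Cash Quota Limit: 1000\n"
]
summary = summarize_notes(sample_notes)
print(summary)
```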
As the request status has been returned directly (because it was a very quick extraction), the next step (check request status) is not required.
We skip it and go directly to retrieving the data, using the returned JobId.
Skip this step if the previous step returned an HTTP status of 200 OK.
If the previous step returned an HTTP status of 202 Accepted, this step must be executed, and repeated in a polling loop until it returns an HTTP status of 200 OK.
URL:
This is the Location URL, taken from the 202 response header received in the previous step.
https://selectapi.datascope.refinitiv.com/RestApi/v1/Extractions/ExtractRawResult(ExtractionId='0x0785d7e9572c76b1')
Method: GET
Headers:
This is the same as for the other steps:
Prefer: respond-async
Content-Type: application/json
Authorization: Token F0ABE9A3FFF2E02E10AE2765ED872C59B8CC3B40EBB61B30E295E71DE31C254B8648DB9434C2DF9299FDC668AA123501F322D99D45C8B93438063C912BC936C7B87062B0CF812138863F5D836A7B31A32DCA67EF07B3B50B2FC4978DF6F76784FDF35FCB523A8430DA93613BC5730CDC310D4D241718F9FC3F2E55465A24957CC287BDEC79046B31AD642606275AEAD76318CB221BD843348E1483670DA13968D8A242AAFCF9E13E23240C905AE46DED9EDCA9BB316B4C5C767B18DB2EA7ADD100817ADF059D01394BC6375BECAF6138C25DBA57577F0061
If you receive an HTTP status 202 Accepted response (the same as in the previous step), it means the request has not yet completed. You must wait a bit and check the request status again.
If you receive an HTTP status 200 OK response, the request has completed:
Status: 200 OK
Relevant headers:
Content-Type: application/json; charset=utf-8
Body:
{
"@odata.context": "https://selectapi.datascope.refinitiv.com/RestApi/v1/$metadata#RawExtractionResults/$entity",
"JobId": "0x0785d7e9572c76b1",
"Notes": [
"Extraction Services Version 14.5.42294 (737b0965c07f), Built Apr 8 2021 13:43:46\nUser ID: 9008895\nExtraction ID: 2000000249659457\nSchedule: 0x0785d7e9572c76b1 (ID = 0x0000000000000000)\nInput List (1 items): (ID = 0x0785d7e9572c76b1) Created: 04/19/2021 10:23:30 Last Modified: 04/19/2021 10:23:30\nReport Template (3 fields): _OnD_0x0785d7e9572c76b1 (ID = 0x0785d7e9574c76b1) Created: 04/19/2021 10:21:53 Last Modified: 04/19/2021 10:21:53\nSchedule dispatched via message queue (0x0785d7e9572c76b1), Data source identifier (ADF5F7B662B34B91A118EEF071688A29)\nSchedule Time: 04/19/2021 10:21:55\nProcessing started at 04/19/2021 10:21:55\nProcessing completed successfully at 04/19/2021 10:23:30\nExtraction finished at 04/19/2021 10:23:30 UTC, with servers: tm02n02\nInstrument <RIC,CARR.PA> expanded to 1 RIC: CARR.PA.\nQuota Message: INFO: Tick History Cash Quota Count Before Extraction: 998; Instruments Extracted: 1; Tick History Cash Quota Count After Extraction: 998, 99.8% of Limit; Tick History Cash Quota Limit: 1000\nManifest: #RIC,Domain,Start,End,Status,Count\nManifest: CARR.PA,Market Price,2016-09-29T07:00:11.672415651Z,2016-09-29T11:59:46.552806988Z,Active,3620\n"
]
}
The JobId value must be saved; we will use it in the next step.
The Notes contain information on the request: IDs, timestamps, possible errors, and extraction quota status. If the request completed successfully, they also contain the message: Processing completed successfully.
We can now retrieve the data, using the returned JobId.
Note: this 200 response is identical in format and content to the 200 OK response that could also have been returned directly when we requested the extraction.
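The polling loop described in this step can be sketched as follows. The `fetch` callable is a hypothetical stand-in for whatever HTTP client performs the GET with the headers above and returns the status code and body text. The 30 second interval and the maximum number of polls are assumptions, not documented values.

```python
import json
import time
from typing import Callable, Tuple


def wait_for_job(
    location_url: str,
    fetch: Callable[[str], Tuple[int, str]],
    interval_seconds: float = 30.0,  # assumed polling interval
    max_polls: int = 120,  # assumed upper bound, to avoid looping forever
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll the Location URL until 200 OK, then return the JobId."""
    for _ in range(max_polls):
        status, body = fetch(location_url)
        if status == 200:
            return json.loads(body)["JobId"]
        if status != 202:
            # Any other status is unexpected here and should be handled by the caller.
            raise RuntimeError(f"Unexpected HTTP status {status}")
        sleep(interval_seconds)
    raise TimeoutError(f"Extraction not completed after {max_polls} polls")


# Demonstration with a fake fetch that answers 202 twice, then 200.
fake_responses = iter(
    [(202, ""), (202, ""), (200, '{"JobId": "0x0785d7e9572c76b1", "Notes": []}')]
)
fetch_calls = []


def fake_fetch(url: str) -> Tuple[int, str]:
    fetch_calls.append(url)
    return next(fake_responses)


job_id = wait_for_job("https://example.invalid/status", fake_fetch, sleep=lambda s: None)
print(job_id)
```

Injecting `fetch` and `sleep` keeps the loop independent of any particular HTTP library and makes it easy to test without network access.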
It is mandatory to have received a 200 OK response with a JobId from a previous step before proceeding with this last step.
URL:
Note the JobId value used as parameter in the URL:
https://selectapi.datascope.refinitiv.com/RestApi/v1/Extractions/RawExtractionResults('0x0785d7e9572c76b1')/$value
Method: GET
Headers:
The headers are the same as for the other steps, except that Content-Type is replaced by Accept-Encoding. The server always sends the data in compressed format (gzip CSV). Including “Accept-Encoding: gzip” for raw stream downloads allows the client (browser or SDK) to unzip the data automatically.
Prefer: respond-async
Accept-Encoding: gzip, deflate
Authorization: Token F0ABE9A3FFF2E02E10AE2765ED872C59B8CC3B40EBB61B30E295E71DE31C254B8648DB9434C2DF9299FDC668AA123501F322D99D45C8B93438063C912BC936C7B87062B0CF812138863F5D836A7B31A32DCA67EF07B3B50B2FC4978DF6F76784FDF35FCB523A8430DA93613BC5730CDC310D4D241718F9FC3F2E55465A24957CC287BDEC79046B31AD642606275AEAD76318CB221BD843348E1483670DA13968D8A242AAFCF9E13E23240C905AE46DED9EDCA9BB316B4C5C767B18DB2EA7ADD100817ADF059D01394BC6375BECAF6138C25DBA57577F0061
Response headers:
Content-Encoding: gzip
Content-Type: text/plain
Body:
The content is compressed plain text in CSV format. Depending on the nature of the data, the time range and number of instruments, the response can be quite long and contain tens of thousands of lines.
As an example, here is the beginning of the response content for a tick data request:
#RIC,Domain,Date-Time,Type,Price,Volume,Exch Time
CARR.PA,Market Price,2016-09-29T07:00:11.672415651Z,Trade,23.25,63,07:00:11.000000000
CARR.PA,Market Price,2016-09-29T07:00:11.672415651Z,Trade,23.25,64,07:00:11.000000000
CARR.PA,Market Price,2016-09-29T07:00:11.672415651Z,Trade,23.25,27,07:00:11.000000000
CARR.PA,Market Price,2016-09-29T07:00:11.672415651Z,Trade,23.25,2115,07:00:11.000000000
CARR.PA,Market Price,2016-09-29T07:00:11.672415651Z,Trade,23.25,21,07:00:11.000000000
CARR.PA,Market Price,2016-09-29T07:00:11.672674987Z,Trade,23.25,21,07:00:11.000000000
CARR.PA,Market Price,2016-09-29T07:00:11.672674987Z,Trade,23.25,11,07:00:11.000000000
CARR.PA,Market Price,2016-09-29T07:00:11.672674987Z,Trade,23.25,61,07:00:11.000000000
CARR.PA,Market Price,2016-09-29T07:00:11.672674987Z,Trade,23.25,235,07:00:11.000000000
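Because the result can be large, it may be preferable to decompress and parse it as a stream rather than loading it all into memory. The sketch below assumes the client hands over the still-compressed bytes as a binary stream. Some HTTP clients decompress automatically, in which case the gzip step must be skipped. The name `iter_rows` is hypothetical.

```python
import csv
import gzip
import io
from typing import BinaryIO, Dict, Iterator


def iter_rows(stream: BinaryIO) -> Iterator[Dict[str, str]]:
    """Yield one dict per CSV row from a gzip-compressed binary stream."""
    with gzip.open(stream, mode="rt", encoding="utf-8", newline="") as text:
        # The first line (starting with "#RIC") supplies the column names.
        yield from csv.DictReader(text)


# Demonstration with the first rows shown above, compressed locally.
sample_csv = (
    "#RIC,Domain,Date-Time,Type,Price,Volume,Exch Time\n"
    "CARR.PA,Market Price,2016-09-29T07:00:11.672415651Z,Trade,23.25,63,07:00:11.000000000\n"
    "CARR.PA,Market Price,2016-09-29T07:00:11.672415651Z,Trade,23.25,64,07:00:11.000000000\n"
)
compressed = io.BytesIO(gzip.compress(sample_csv.encode("utf-8")))

rows = list(iter_rows(compressed))
total_volume = sum(int(row["Volume"]) for row in rows)
print(len(rows), total_volume)
```

With a real download, the response object returned by the HTTP client would take the place of the `io.BytesIO` buffer, provided it exposes a binary `read` method.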
Requests for raw data, tick data and market depth data can generate very large result sets.
To optimize the retrieval times, see the Request tuning and best practices document under the Documentation tab.