This tutorial is the first in a series on LSEG Tick History data inside Google BigQuery.
After an introduction on the underlying technology, it explains how to set-up an environment in the Google Cloud Platform (GCP) console, how the data is structured, and how to examine the data.
The following tutorials cover data retrieval, filtering, and analytics on the data.
For a higher-level explanation of LSEG Tick History, Google Cloud Platform and BigQuery, and the use cases for this offering, please refer to the Tick History in Google BigQuery article.
To run the sample queries delivered with this tutorial you need:
Google allows queries to be submitted in different ways that include a graphical user interface, a command line tool, or through an API using a variety of client libraries such as Java, .Net or Python. This tutorial uses the Google Cloud Platform (GCP) console, an interactive GUI.
Queries use SQL (Structured Query Language), a familiar and quite powerful industry standard query language. Both the standard and legacy SQL dialects are supported by BigQuery.
For more details, extensive documentation and training guides, please refer to the Google BigQuery documentation.
In the Google cloud, we created a project dedicated to Tick History. Inside that, we created columnar data tables for normalized tick history data, on a per venue basis, and partitioned them by day. These tables have been populated with historical data, and are updated daily, a short time after the content sources embargo period expires.
Access to these tables is subject to entitlements, based on your Google ID.
The initial offering has 10 years of history (MiFID II venues start from 2018), for a subset of all venues. The breadth of coverage and depth of history will be extended in the future.
Adjacent content sets, like corporate actions or reference data, will also be made available in BigQuery in a future release; in the meantime, they are accessible via the LSEG Tick History REST API (using the LSEG DataScope Select servers), or the Refinitiv Data Platform.
Let us see how to do this in the in the GCP (Google Cloud Platform) console.
To access the LSEG Tick History data sets from your own projects, you need to access the appropriate project.
Start by logging into GCP using a Google ID that was entitled for LSEG Tick History; if you are unsure about which ID that may be, please contact your LSEG account manager.
Select the GCP.REFINITIV.COM organisation:
Then find and select the project called dbd-sdlc-prod:
Pin the dbd-sdlc-prod project:
Once you have done this, you will be able to examine the tick history data and run queries on it from within your own BigQuery project.
Opening the project we just pinned, we see a list of data sets. Their name is in the format <venue>_<view>. The venue is the data source; most of the time this is an exchange. There are two views:
The market depth view is only available for markets where such data exists. The number of levels of order book depth varies on a per exchange basis.
Let us look at some data, using as example the London Stock Exchange (LSE):
Let us click on the NORMALISED view:
We discover the data schema, with all the available fields:
This allows us to find all the field names and types, which will be useful when we query the data.
Now that we have found the data and understood its schema, we can move on to interrogate it, using SQL queries. Please proceed to the next tutorial that covers data retrieval and tick filtering.