Let’s Create a Cryptocurrency Ticker Database on Google Sheets With Python

Abhik Chakraborty
Geek Culture
Published in
7 min readFeb 6, 2021

--

In a world where data is money, databases are nothing short of banks. However, the great thing about them is their variety. They are of so many types and come with so many choices, that the data engineers like us are spoilt with it. Today let’s talk about one of the better and free ones — the Google Sheet

Google Sheet is known for its spreadsheet-like functionality and more. However, not many people realize that with its cloud-based approach of storing the data, greater row limit (5 Million compared to 1.1 Million in MS Excel), and 200 sheets per file, a single google sheet can hold an amazing 1 billion records! Not bad, right? So, let’s jump into our case.

Photo by André François McKenzie on Unsplash

Our task is divided into a few steps. I will be explaining each step with relevant screenshots, links, and of-course the Github link. So, here we go:

  • Activate Google Drive and Sheets API
  • Generate authentication credentials and linking them to the relevant sheet
  • Fire up our python code and let the data flow!

Prerequisites

Before jumping into the steps, let’s look at what are the things that would be required to implement this:

  • A Google Account
  • Basic Knowledge of Python
  • Love!

Okay, so now let’s get started with the steps…

Activate Google Drive & Sheets API

The first thing to get started is to activate the Google APIs so that we can talk to them while sending the data. Here are the steps:

Activation

Generate authentication credentials and linking them to the relevant sheet

This is a tricky part. Here we want to ensure that we can reach out to our APIs by creating relevant credentials and then linking them to our Google sheet

Generate Credentials

To create a credential, the following are the steps:

Form to create a service account
  • Next, come back to the link provided in the first step and select the service account you just created from under service accounts
Click on the email under a service account
  • Next, when a new page opens, under keys, select Add Keys and then ‘Create New Key’. Finally, in the dialog box, select format as JSON and click Create
  • After clicking create, a new JSON credential file will be created. Download it and keep it safe. It’s important for the next steps

Provide Permission

  • For this step, create a workbook on your google drive
  • Next, share it with the service account email you generated
  • Done!

If the first and second steps were confusing, here are some links from google’s official documentation

Code!

Before starting, let me share the GitHub link first.

https://github.com/abhikchakraborty92/VCP.git

Now let’s look at the steps:

Authentication Steps

  • Remember the JSON file we downloaded? I have kept a sample of it inside the config folder on Github. That would be used to authenticate our requests. Below is a screenshot of the same
{  
"type": "service_account",
"project_id": "some id",
"private_key_id": "some long key",
"private_key": "very very very long key",
"client_email": "email id for which access has to be provided to your google sheet",
"client_id": "client unique id",
"auth_uri": "auth url",
"token_uri": "token url",
"auth_provider_x509_cert_url": "authentication provider certification url",
"client_x509_cert_url": "client certification url"
}
  • Next, we have to download a few libraries. Use the requirements.txt file to pip them into your system!
  • After downloading the libraries, we will authenticate by using the googleauthenticate.py script inside the src folder. The library gspread and oauth2client will help us. The script contains all the steps to accomplish it. The only thing that is needed is the credential file and a few changes inside config.json configuration file, which we will see next. Using the googleworkbook function, we connect with the workbook
googleauthenticate.py

Configuration

  • Next, let’s see the configuration file. I have posted the same under the config folder. The file is divided into a few sections. Let’s go through the google sections first
  1. Here, the googlescope variable stores the scopes required to authenticate. You can see that I read the same inside the scope variable in the googleauthenticate.py script
  2. googleworkbookname and googlesheetname are what they suggest — the name of the workbook and the sheet where data will be stored
  3. googlesheetcolumns are the column headers of the ticker feed. This is required whenever we want to create a new sheet on our google workbook
  4. Finally, googlesheetrowlimit is the max number of rows we can store in one sheet of the google workbook. I have added 1 extra over 4 million for the headers. This value can go a maximum of 5 million per sheet. As soon as this limit is exhausted, the code would create a new worksheet inside the workbook
  • Next, we look at the ticker settings inside the configuration
  1. Here the ticker dictionary variable inside the market consists of the link we are going to hit. I am using the awesome WAZIRX API for this. The other variables are for some other functionality that is out of scope for this article.
  2. Next, tickercodes, are the identifiers for different cryptocurrencies along with their names in a [ticker, name] format. I am hitting for 10 cryptocurrencies. Any number of tickers can be added here

Final Code

In the final code, we do the following:

  • Read the whole ticker feed and filter out the ticker feeds we need

Here getgoogleworkbook function is being used from the googleauthenticate.py script to get our workbook. Other steps have been written with relevant comments.

  • Convert the feed for each ticker into a list and append all the lists to a final list

The above two screenshots show the process of getting the parsed data rows using the tickerparse function, which is present in the helperfunctions.py file. The valuelist variable is the parsed list of values for a given ticker.

  • Check for the row count & limit before every write and create a new sheet if necessary

Here the getworkbooksheet function helps run the logic to get the appropriate sheet to write the data

  • Write the rows

This part is straightforward. For the sheet, the code checks if the first cell is the first header column and inserts if it’s not. The header columns are selected using the configuration file. Finally, using the append_rows function of the gspread library, we append our rows of rows ValuesToInsert variable into our sheet

Output

This is the moment we have been waiting for. Let’s first look at the TickerInfo workbook I created. I have added the columns as a header for now and have set the row limit as 21 instead of 4,000,001.

Before and After

Notice that a new sheet has been created after 21 rows. Also, since we had already put the headers in the TickerData sheet, the code avoided inserting the header row again

In the next sheet, the header row was added and doesn’t have any formatting. However, this also has 21 rows since that is the maximum limit.

Final Thoughts

Google sheets is a very powerful and useful tool for any analyst or data engineer. With its huge data storing capability and a cloud-based concept of storage, this should be used more often by us as business intelligence and data service

--

--

Abhik Chakraborty
Geek Culture

Data Engineering Manager at Expedia | Ex - Amazon