HomeChallengesWorking with APIs and Databases using Python
learnbook poster
Working with APIs and Databases using PythonLast updated at Thu Oct 21 2021Skills
data-ingestion
Tools
python

Hello There!

wave-hello-20.gif

How are you doing today?

In this learnbook, I'll share my learning experience of building an end-to-end data pipeline using plain Python.

As a Data Analyst, I once got an opportunity to work on an interesting problem statement. Here it goes :


What was the problem?

I was working in an Agritech startup. We had good data on crop growth, germination etc. However, one of our customers wished to build a predictive model which could predict the harvest outcome based on several inputs like :

  • Temperature variation
  • Soil Moisture
  • Relative Humidity
  • Rainfall etc.

The problem was. We didn't have any weather data of our own. Also, the challenge was how to fetch data from a given source on a daily basis, save it in some database and make it ready for ad-hoc analysis as quickly as possible and completely open source.


How did we solve it?

Thinking through the problem, we broke down the entire pipeline into following components :

  • Data Ingestion : We had to ingest data from an API after registering to the website and fetching the API KEY.
    • We first set up an account on Openweather.org
    • Next step, fetch the API key
    • Finally, look into the API docs to fetch the API end point.

database-operations-with-python-01.png

database-operations-with-python-02.png

  • Data Wrangling : Convert JSON data to dataframes. Being a Data Analyst, I would love to have data in dataframes. That opens a lot of avenues for data analysis or machine learning
  • Data Storage : Persist the dataframes onto hard disks to be able to refer later for use in analytics or predictive modeling.

Given that we wanted a quick solution using open source, Python seemed to be a superb choice. We already had a SQL server database on Azure cloud with a write access.

Following resources helped us immensely to fill our knowledge gaps :

  • requests package to read from an API end point : Refer this excellent article by DataQuest to understand the usage of requests package to fetch data from an API in Python
  • pyodbc/pymssql packages to perform the below mentioned tasks :
    • Connecting to a relational database

    • Creating a new table in the relational database

    • Writing the dataframe into the newly created table

    • Reading from the SQL table and print output row by row

    • These articles : Using pyodbc , Using pymssql and Using Pandas helped us overcome the storing, writing and reading challenge using an Azure SQL Server Database.

Note : In any of the above approaches if you face issues of Driver not available or not identified, you may need to physically download and install the ODBC drivers in your system. Refer this link to download the required drivers


Solution in progress

database-operations-with-python-04.png

Notice how several libraries like requests, json, csv come together to solve the problem in question


Mini Lab

Hope you have spent some quality time going through the articles by now. How about applying your skills right away?

Well, this mini lab has got you covered.

Head over to this Google Colab notebook and try implementing the above scenario by yourself!

Remember : Always try typing out your code by hand. Resisting the temptation to Copy/Paste will go long way developing the brain muscle much needed by good programmers.


Created with 💙 by
author avatar
Amit Choudhary
Data Practitioner,Mentorskool
mentorskool logo
Mentee Today, Mentor Tomorrow
No 206, A Block, Sonesta Silver Oak,Varthur, Bangalore 560066,Karnataka
Copyright - 2022 © Mentorskool - All rights reserved.