How to call APIs with Python to request data
 Yelp and Twitter examples step-by-step

Lianne & Justin


In this tutorial, you’ll discover how to request data faster by making an API call with Python.

As data analysts/scientists, it’s critical to know how to collect data. Websites like Twitter and Yelp often have data that is valuable for our analysis. While we can manually download or web scrape the data, there’s a better way: sending requests for data to the website through its APIs.

Following this guide, you’ll learn:

  • What an API and HTTP (request and response) are.
  • How to make an API call in Python, step-by-step.
  • Two examples of requesting data from the Yelp and Twitter APIs.

If you want to add more information to your analysis efficiently, try calling the APIs with Python!



If you are not familiar with Python, please take our FREE Python crash course: breaking into Data Science.

Before looking at examples to make API calls with Python, we need to introduce some basic definitions.

What is an API?

Imagine you want to get:

  • the Tweets from Twitter to analyze the sentiments about a product.
  • the Yelp data to find lists of businesses of interest.
  • the Google Maps data to find the travel time between locations.
  • etc.

How would you pull the data?

We might be able to download the data manually. But it’s not efficient if the data needs to be updated frequently, or if we only need part of the dataset.

We may also automate the process by web scraping using Python. But this is time-consuming and requires us to write code based on the structure of the webpages. So even a small change to the website could force us to change the code.

Fortunately, many large websites provide APIs that expose their data to external developers in a standard and convenient way.

An application programming interface (API) is a computing interface that defines interactions between multiple software intermediaries.

It defines the kinds of calls or requests that can be made, how to make them, the data formats that should be used, the conventions to follow, etc. It can also provide extension mechanisms so that users can extend existing functionality in various ways and to varying degrees.

Wikipedia

Besides sharing data, APIs also help developers in other ways. But within this data science tutorial, we’ll focus on getting data through the APIs. Websites that expose a web API typically accept requests over HTTP.

Next, let’s learn the basics of HTTP.

What is HTTP?

HTTP (Hypertext Transfer Protocol) is a protocol which allows the fetching of resources. It is the foundation of any data exchange on the Web and it is a client-server protocol, which means requests are initiated by the recipient, usually the Web browser.

Mozilla.org

There’s a lot to HTTP, but we’ll focus on the basics related to using the APIs.

Assume we are using the web browser to communicate with a website’s server. Each time we want the site to do something (for example, send the data), we must submit an HTTP request (make an API call) to the server. Then the server provides a response message as the answer. The response could contain the data requested (if applicable) and some other information like completion status.

The two common HTTP request methods for API calls are GET and POST:

  • Within a GET request, the query strings/parameters are sent in the URL, which is easy to bookmark or save. But we can only send ASCII characters and a limited amount of data. Requests using GET should only retrieve data.
  • Within a POST request, the query parameters are typically sent in the body of the request message. This method is a little more secure, since the parameters do not appear in the URL, where they could end up in browser history or server logs. We can send both ASCII and binary data, and more data than with a GET request. This method can be used to send data to the server.

Some APIs only allow one of these two request methods.
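To make the difference concrete, here is a minimal sketch that builds one GET and one POST request with the Requests package without sending them. The endpoint https://api.example.com/search and the parameters are hypothetical and only serve as an illustration.

```python
import requests

# Hypothetical endpoint used only for illustration; nothing is sent over the network.
URL = "https://api.example.com/search"
params = {"term": "coffee", "limit": 5}

# GET: the parameters are encoded into the query string of the URL.
get_req = requests.Request("GET", URL, params=params).prepare()
print(get_req.url)    # https://api.example.com/search?term=coffee&limit=5
print(get_req.body)   # None

# POST: the parameters travel in the request body, and the URL stays unchanged.
post_req = requests.Request("POST", URL, data=params).prepare()
print(post_req.url)   # https://api.example.com/search
print(post_req.body)  # term=coffee&limit=5
```

Preparing a request only constructs it, so this is a safe way to inspect what would be sent before making a real call.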

Don’t worry if you don’t understand all the definitions. We’ll see examples of a GET request for Yelp and a POST request for Twitter in the example sections.

After the website responds to us, it’s essential to check the status code. Some common HTTP response status codes are:

  • 200: OK – the request succeeded, and the data requested should be in the response.
  • 400: Bad Request – the request was invalid due to bad syntax.
  • 401: Unauthorized – missing or incorrect authentication credentials.
  • 403: Forbidden – you are not allowed to see the data requested based on your authentication.
  • 404: Not Found – the URL is invalid, or the resource does not exist, or you are unauthorized to see it.
  • 502: Bad Gateway – the servers might have issues.
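As a quick reference, the codes above can be looked up with Python's built-in http module. The helper describe_status below is a hypothetical name introduced for this sketch; it groups codes by their first digit.

```python
from http import HTTPStatus


def describe_status(code: int) -> str:
    """Return a short description such as '404 Not Found (client error)'."""
    phrase = HTTPStatus(code).phrase
    if 200 <= code < 300:
        kind = "success"
    elif 400 <= code < 500:
        kind = "client error"
    elif 500 <= code < 600:
        kind = "server error"
    else:
        kind = "other"
    return f"{code} {phrase} ({kind})"


# Print the codes listed in this section.
for code in (200, 400, 401, 403, 404, 502):
    print(describe_status(code))
```

A rule of thumb follows from the grouping: 4xx codes usually mean the request needs fixing, while 5xx codes usually mean the problem is on the server side.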

Great! That’s enough definitions.

Now we are ready to see some examples in Python.

Python Example #1: Yelp API call

We’ll start from the Yelp API, which is more straightforward. Then we’ll provide a more complicated example with the Twitter API.

We tried to summarize the general procedures of using an API with this Yelp example. Please note that the process of setting up other APIs might still differ.

Step #1: Read the Documentation

A good starting point to learn about a new API is the documentation. Before diving into the details of the API, it’s critical to learn:

  • the data we can request.
  • the parameters we can set when asking for data.
  • sample requests and responses.

Once we are sure the API offers the type of data needed, we can dig into the details of the particular API.

For example, to find the Yelp API document, you can search ‘Yelp API’ in the web browser. Within the Yelp developers page, we can see different tools offered by Yelp; we’ll focus on the Yelp Fusion.

Yelp Fusion has several Business Endpoints. Each endpoint is a path that’s used to retrieve different data from the API. For example, the /businesses/search endpoint returns up to 1000 businesses based on the provided search criteria. It has some basic information about the business. To get detailed information and reviews, we need to access other endpoints such as /businesses/{id} and /businesses/{id}/reviews.
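The endpoint paths can be sketched as plain URL strings. This assumes the base URL https://api.yelp.com/v3, taken from the business search URL used later in this tutorial; the helper names are hypothetical, and the business id is the one from the sample response shown in Step #4.

```python
# Base URL assumed from the business search URL used later in this tutorial.
BASE_URL = "https://api.yelp.com/v3"


def business_search_url() -> str:
    """URL of the endpoint that searches businesses by criteria."""
    return f"{BASE_URL}/businesses/search"


def business_details_url(business_id: str) -> str:
    """URL of the endpoint with detailed information for one business."""
    return f"{BASE_URL}/businesses/{business_id}"


def business_reviews_url(business_id: str) -> str:
    """URL of the endpoint with reviews for one business."""
    return f"{BASE_URL}/businesses/{business_id}/reviews"


# The id comes from the sample search result shown in Step #4.
sample_id = "c6f8wBjPLDzyubEBqgcMnw"
print(business_search_url())
print(business_details_url(sample_id))
print(business_reviews_url(sample_id))
```

In practice, the id values needed by the last two endpoints come from the results of the search endpoint.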

Also, the documentation often contains other information, such as how to authenticate API calls, which is what we’ll do next.

Step #2: Create Authentication Keys

APIs usually require authentication when we call for data. This often takes two steps:

  • register for a (developer) account of the website.
  • create an app and generate the API keys.

You can think of the API keys as credentials that authenticate you as the user of the API. The API keys should be kept secret like other passwords. Whenever you use the API, you will put these keys in the HTTP request according to the docs, which will be shown in the next step.

Different APIs have different authentication schemes. Some, like Yelp, only need the API key, while others, like Twitter, have extra steps.

The Yelp Fusion API documentation provides detailed instructions on how to set up the authentication. You can follow it to obtain the API key before moving onto the next step.

Step #3: Create Request

With the authentication in place, we are ready to request data by making API calls in Python.

The most widely used Python library for handling HTTP is Requests: HTTP for Humans. It is a third-party package (not part of the Python standard library) that provides functions for sending HTTP requests easily.

First, let’s install and import this package.
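A minimal version of this step, assuming pip is the package manager in use:

```python
# Install once from a terminal (not inside Python):
#     pip install requests
import requests

# Printing the version confirms that the import worked.
print(requests.__version__)
```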

Next, we can copy the Yelp API key and assign it to a variable api_key. The API key should look like a string of random characters.

Remember that the key is a private credential, so we are not showing it here.
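One way to keep the key out of the code is to read it from an environment variable. This is only a sketch: the variable name YELP_API_KEY is an assumption made for this example and has to be set in your shell beforehand.

```python
import os

# YELP_API_KEY is a hypothetical variable name chosen for this example.
# Set it in the shell first, e.g.  export YELP_API_KEY="..."
api_key = os.environ.get("YELP_API_KEY", "")

if not api_key:
    print("YELP_API_KEY is not set, so authenticated requests would be rejected.")
```

Pasting the key directly into a variable works too, but then the script or notebook itself must be kept private.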

Then we need to find out the base URL for the API endpoint. For example, the base URL for the Yelp Fusion business search is https://api.yelp.com/v3/businesses/search.

We’ll be sending a GET request to call for data. Before using the GET method, let’s specify the input variables with this basic information:

  • headers with the API key.
    Following the API sample code structure, we set up the request headers as a dictionary with the authentication information.
  • search_api_url with the API endpoint URL (https://api.yelp.com/v3/businesses/search).
  • params with some search parameters/criteria.
    For example, we can search for 50 businesses with the term “coffee” located in ‘Toronto, Ontario’, as shown in the code below.
    For more details about the parameters, you can look at the documentation. There are some mandatory parameters and some optional parameters.
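The three inputs described above might look like the following sketch. The key is a placeholder, and the Authorization header follows the Bearer scheme that Yelp Fusion uses for API keys.

```python
# Placeholder only: substitute the private key generated in Step #2.
api_key = "YOUR_YELP_API_KEY"

# Yelp Fusion expects the key as a Bearer token in the Authorization header.
headers = {"Authorization": f"Bearer {api_key}"}

# Endpoint URL for the business search.
search_api_url = "https://api.yelp.com/v3/businesses/search"

# Search criteria: up to 50 businesses matching "coffee" in Toronto, Ontario.
params = {
    "term": "coffee",
    "location": "Toronto, Ontario",
    "limit": 50,
}
```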

Then we can pass these variables to the get method and store the returned object as response. We also set timeout = 5 so that Requests stops waiting for a response after 5 seconds.
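A sketch of this call, wrapped in a function so that nothing is sent until it is called with a real key. The function name search_businesses and its arguments are hypothetical names introduced for this example.

```python
import requests


def search_businesses(api_key, term, location, limit=50, timeout=5):
    """Send a GET request to the Yelp business search endpoint and return the response."""
    search_api_url = "https://api.yelp.com/v3/businesses/search"
    headers = {"Authorization": f"Bearer {api_key}"}
    params = {"term": term, "location": location, "limit": limit}
    # timeout=5 makes Requests give up if the server has not answered within 5 seconds.
    return requests.get(search_api_url, headers=headers, params=params, timeout=timeout)
```

With a valid key, the call would be response = search_businesses(api_key, "coffee", "Toronto, Ontario"). Without a timeout, a request can wait indefinitely on a server that never answers.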

To take a look at the URL and the response object’s status, we can use the code below.

https://api.yelp.com/v3/businesses/search?term=coffee&location=Toronto%2C+Ontario&limit=50
200

We can see that since this is a GET request, the URL contains the search criteria.
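The same URL can be previewed without contacting Yelp by preparing the request instead of sending it. This is an offline sketch; on a real response object, response.url and response.status_code hold the URL and the status.

```python
import requests

search_api_url = "https://api.yelp.com/v3/businesses/search"
params = {"term": "coffee", "location": "Toronto, Ontario", "limit": 50}

# prepare() builds the request, including the encoded query string, but does not send it.
prepared = requests.Request("GET", search_api_url, params=params).prepare()
print(prepared.url)
# https://api.yelp.com/v3/businesses/search?term=coffee&location=Toronto%2C+Ontario&limit=50
```

Note how the comma and the space in ‘Toronto, Ontario’ are encoded as %2C and + so that they are safe inside a URL.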

The status code of 200 tells us the request was a success, which means the data should be returned. It’s always a good idea to check the status code before proceeding further.
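One possible way to enforce this check is the raise_for_status method of the response, which raises an exception for 4xx and 5xx codes. In the sketch below, the helper require_success is a hypothetical name, and the two response objects are built by hand as stand-ins so that the example runs offline.

```python
import requests


def require_success(response):
    """Return the response if it succeeded; raise requests.HTTPError otherwise."""
    response.raise_for_status()  # raises only for 4xx and 5xx status codes
    return response


# Stand-in responses built by hand; a real one would come from requests.get().
ok_response = requests.Response()
ok_response.status_code = 200

denied_response = requests.Response()
denied_response.status_code = 401
denied_response.reason = "Unauthorized"
denied_response.url = "https://api.yelp.com/v3/businesses/search"

print(require_success(ok_response).status_code)

try:
    require_success(denied_response)
except requests.HTTPError as err:
    print("Request failed:", err)
```

The ok attribute of a response offers the same check as a boolean, for cases where an exception is not wanted.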

Step #4: Check the Response

It’s great to know that we have the data, but how do we access it?

Let’s check some basic information on the response content using the headers attribute.

{'Connection': 'keep-alive', 'server': 'openresty/1.13.6.2', 'content-type': 'application/json', 'x-routing-service': 'routing-main--useast1-ff7994d96-shnxx; site=public_api_v3', 'ratelimit-remaining': '4999', 'x-b3-sampled': '0', 'x-zipkin-id': '686c5779f1002440', 'ratelimit-resettime': '2020-08-12T00:00:00+00:00', 'ratelimit-dailylimit': '5000', 'x-proxied': '10-65-147-6-useast1bprod', 'content-encoding': 'gzip', 'x-extlb': '10-65-147-6-useast1bprod', 'Accept-Ranges': 'bytes', 'Date': 'Tue, 11 Aug 2020 15:39:37 GMT', 'Via': '1.1 varnish', 'X-Served-By': 'cache-yyz4528-YYZ', 'X-Cache': 'MISS', 'X-Cache-Hits': '0', 'Vary': 'Accept-Encoding', 'transfer-encoding': 'chunked'}

The ‘content-type’ shows ‘application/json’, which means the content is in JSON format. The JSON (JavaScript Object Notation) is a lightweight data-interchange format that’s easy for machines to handle. For API calls, the response content is often found in JSON format.
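Requests stores response headers in a case-insensitive dictionary, so a header can be looked up regardless of capitalization. The sketch below rebuilds a few of the headers printed above by hand; on a real response the same lookups would be made on response.headers.

```python
from requests.structures import CaseInsensitiveDict

# A few headers copied from the output above, as a stand-in for response.headers.
headers = CaseInsensitiveDict({
    "content-type": "application/json",
    "ratelimit-remaining": "4999",
    "ratelimit-dailylimit": "5000",
})

# Lookups ignore case, so 'Content-Type' finds 'content-type'.
print(headers["Content-Type"])

# Check that the body is JSON before trying to decode it.
is_json = headers.get("Content-Type", "").startswith("application/json")
print(is_json)

# Header values are strings; convert them before doing arithmetic.
remaining_calls = int(headers["RateLimit-Remaining"])
print(remaining_calls)
```

The rate limit headers are worth watching when running many requests, since they show how many calls are left for the day.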

Yet, JSON is not a good format for us to do analysis in Python. We can decode it to a Python object (dictionary) using the .json() method.

To verify that we have a dictionary, we can print out the type of data_dict. We can see that the dictionary has three keys ‘businesses’, ‘total’, and ‘region’.
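The decoding step can be tried offline with the json module from the standard library, which does the same kind of conversion as the .json() method. The JSON text below is a shortened, made-up stand-in for a real response body: only the three top-level keys and the business name and rating come from this tutorial.

```python
import json

# Shortened stand-in for the body of a real response (values are illustrative).
raw_text = (
    '{"businesses": [{"name": "Fahrenheit Coffee", "rating": 4.5}], '
    '"total": 1, "region": {}}'
)

# For a real response object, data_dict = response.json() performs this decoding.
data_dict = json.loads(raw_text)

print(type(data_dict))         # <class 'dict'>
print(list(data_dict.keys()))  # ['businesses', 'total', 'region']
```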

By checking the Response Body part of the documentation, we can learn more about the data returned. For example, the key ‘businesses’ within the data is a list of dictionaries with information about the 50 businesses returned.

To print out the first business and its basic information, we can run the code below.

{'id': 'c6f8wBjPLDzyubEBqgcMnw',
 'alias': 'fahrenheit-coffee-toronto',
 'name': 'Fahrenheit Coffee',
 'image_url': 'https://s3-media2.fl.yelpcdn.com/bphoto/185TXlWlMtQn-voNzRBqPA/o.jpg',
 'is_closed': False,
 'url': 'https://www.yelp.com/biz/fahrenheit-coffee-toronto?adjust_creative=gSa3Zp7bFDYOEsbqB7lP5g&utm_campaign=yelp_api_v3&utm_medium=api_v3_business_search&utm_source=gSa3Zp7bFDYOEsbqB7lP5g',
 'review_count': 316,
 'categories': [{'alias': 'coffee', 'title': 'Coffee & Tea'}],
 'rating': 4.5,
 'coordinates': {'latitude': 43.65244524304, 'longitude': -79.372932326803},
 'transactions': [],
 'price': '$$',
 'location': {'address1': '120 Lombard Avenue',
  'address2': None,
  'address3': '',
  'city': 'Toronto',
  'zip_code': 'M5A 4J6',
  'country': 'CA',
  'state': 'ON',
  'display_address': ['120 Lombard Avenue', 'Toronto, ON M5A 4J6', 'Canada']},
 'phone': '+16478961774',
 'display_phone': '+1 647-896-1774',
 'distance': 309.454471500597}

Great!

With the data in a familiar Python format, you should be able to manipulate it using Python.
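As a small illustration of such manipulation, the sketch below pulls a few fields out of each business. The dictionary is a hand-made stand-in holding part of the first business shown above; with a real response, data_dict would come from the .json() method.

```python
# Stand-in for the decoded response, holding part of the business shown above.
data_dict = {
    "businesses": [
        {
            "name": "Fahrenheit Coffee",
            "rating": 4.5,
            "review_count": 316,
            "price": "$$",
            "location": {"address1": "120 Lombard Avenue", "city": "Toronto"},
        }
    ]
}

# The first business is the first element of the list under 'businesses'.
first_business = data_dict["businesses"][0]
print(first_business["name"])

# Flatten each business into a simple row with the fields of interest.
rows = [
    {
        "name": business["name"],
        "rating": business["rating"],
        "review_count": business["review_count"],
        "price": business.get("price"),  # not every business has a price
        "city": business["location"]["city"],
    }
    for business in data_dict["businesses"]
]

# Keep the names of businesses rated 4.0 or higher.
well_rated = [row["name"] for row in rows if row["rating"] >= 4.0]
print(well_rated)
```

A list of flat rows like this is also a convenient input for building a table with a data analysis library.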

Python Example #2: Twitter API call

Besides the Yelp API, we’d like to show you a more complicated example: calling the Twitter API using the Requests package. We won’t get into all the details as we did in the Yelp example, but will include general descriptions and the Python code with an example API call. Note that there are easier ways than this example, but this should help you better understand the basics.

First, we import the requests module and the base64 module, which will be used for encoding the API keys.

There are three tiers of Tweet Search APIs on Twitter. We’ll show the Premium 30-day endpoint since it contains more data and is free.

After following the Twitter documentation, you should have set up a developer account, created an app, and generated a set of keys. You can copy the API key and the API secret key below to assign them to client_key and client_secret variables.

Twitter offers a few different authentication methods. We’ll be using the OAuth 2.0 Bearer Token, which allows access to information publicly available on Twitter.

We need a Bearer Token to use this method, which you can generate by passing your API key and secret key through the POST oauth2/token endpoint. So before any API calls for data, we need to complete this extra step of authentication to obtain this Bearer Token.

Note: the Requests package does have an extension package (Requests-OAuthlib) that handles this protocol better, but we won’t use it because we want to show how the standard HTTP requests/responses work.

The basic steps are as follows:

  • encode the client_key and client_secret in base64 format.
  • send a POST request with the encoded keys in the header to get the Bearer Token.

Below is the code that carries out these steps and generates the token, access_token.
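A sketch of these two steps, assuming the application-only authentication flow of the Twitter API v1.1: the key and secret are joined with a colon, base64-encoded, and posted to the oauth2/token endpoint with grant_type=client_credentials. The keys are placeholders, and the function names are hypothetical.

```python
import base64

import requests

# Placeholders only: substitute the keys generated for your Twitter app.
client_key = "YOUR_API_KEY"
client_secret = "YOUR_API_SECRET_KEY"


def encode_credentials(key, secret):
    """Join the key and secret with a colon and encode the pair in base64."""
    raw = f"{key}:{secret}".encode("ascii")
    return base64.b64encode(raw).decode("ascii")


def get_bearer_token(key, secret, timeout=5):
    """Exchange the encoded keys for a Bearer Token via a POST request."""
    auth_url = "https://api.twitter.com/oauth2/token"
    auth_headers = {
        "Authorization": f"Basic {encode_credentials(key, secret)}",
        "Content-Type": "application/x-www-form-urlencoded;charset=UTF-8",
    }
    auth_data = {"grant_type": "client_credentials"}
    response = requests.post(auth_url, headers=auth_headers, data=auth_data, timeout=timeout)
    response.raise_for_status()  # stop here if the keys were rejected
    return response.json()["access_token"]
```

With real keys, access_token = get_bearer_token(client_key, client_secret) would return the token. Nothing is sent until that function is called.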

Finally, with this access_token, we are ready to send requests for data to the Twitter API.

Let’s send a POST request asking for 10 Tweets with the term ‘Toronto Raptors’ within a specific time range below.

200
https://api.twitter.com/1.1/tweets/search/30day/justintodata.json

If everything’s set up correctly, you should see a status code of 200 again. And since this is a POST request, the URL doesn’t contain the parameters, which is different from the GET request.
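The request itself might be built as in the sketch below, which prepares the POST without sending it so that the URL and body can be inspected offline. The token is a placeholder, the dates are hypothetical values in the YYYYMMDDhhmm format, and the parameter names (query, maxResults, fromDate, toDate) are assumed from the Premium search API. The last part of the URL path, justintodata, is the label of the authors' own developer environment, so yours would differ.

```python
import json

import requests

# Placeholder: the Bearer Token obtained in the authentication step.
access_token = "YOUR_BEARER_TOKEN"

# 'justintodata' is the authors' environment label; replace it with your own.
search_url = "https://api.twitter.com/1.1/tweets/search/30day/justintodata.json"
search_headers = {"Authorization": f"Bearer {access_token}"}

# Hypothetical time range, written as YYYYMMDDhhmm.
search_params = {
    "query": "Toronto Raptors",
    "maxResults": 10,
    "fromDate": "202008010000",
    "toDate": "202008100000",
}

# json=... places the parameters in the request body as JSON; prepare() does not send.
prepared = requests.Request(
    "POST", search_url, headers=search_headers, json=search_params
).prepare()

print(prepared.url)               # the URL carries no search parameters
print(json.loads(prepared.body))  # the parameters sit in the body instead
```

To actually send it, the same arguments can be passed to requests.post together with a timeout.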

Again, we can convert the response object to a Python dictionary.

We can print out the keys of the dictionary.

dict_keys(['results', 'next', 'requestParameters'])

Or we can take the first element under the key ‘results’, which is a single Tweet, and look at its keys.

dict_keys(['created_at', 'id', 'id_str', 'text', 'source', 'truncated', 'in_reply_to_status_id', 'in_reply_to_status_id_str', 'in_reply_to_user_id', 'in_reply_to_user_id_str', 'in_reply_to_screen_name', 'user', 'geo', 'coordinates', 'place', 'contributors', 'retweeted_status', 'is_quote_status', 'quote_count', 'reply_count', 'retweet_count', 'favorite_count', 'entities', 'extended_entities', 'favorited', 'retweeted', 'possibly_sensitive', 'filter_level', 'lang', 'matching_rules'])

We can look at its key ‘text’, which is the text of the Tweet.

'RT @Pasion_Basket1: ESPECTACULAR movimiento de balón de los Toronto Raptors 👌\n\n https://t.co/RT93KThoBW'
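The lookups in this section can be sketched on a small stand-in dictionary. Only the Tweet text and the three top-level keys come from the output above; the other values are placeholders, and a real dictionary would come from the .json() method of the response.

```python
# Stand-in for the decoded Twitter response; only the text is taken from the output above.
tweets = {
    "results": [
        {
            "text": "RT @Pasion_Basket1: ESPECTACULAR movimiento de balón de los "
                    "Toronto Raptors 👌\n\n https://t.co/RT93KThoBW",
        }
    ],
    "next": "PLACEHOLDER_TOKEN",  # placeholder, not a real pagination token
    "requestParameters": {},      # placeholder
}

# Top-level keys of the response.
print(tweets.keys())

# The first Tweet and its text.
first_tweet = tweets["results"][0]
print(first_tweet["text"])

# Collect the text of every Tweet returned.
all_texts = [tweet["text"] for tweet in tweets["results"]]
print(len(all_texts))
```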

That’s it!

Note: there are other libraries available for easier access to the Twitter APIs. Check out the FAQ sections. Feel free to explore them yourself after learning the basics.


By now, you should be able to request data from APIs automatically using Python.

Try it in your next data science project!

We’d love to hear from you. Leave a comment for any questions you may have or anything else.
