Beyond the Model: Why Data Scientists Must Embrace APIs and API Documentation | Towards Data Science

1. Introduction

At the intersection of various domains (statistics, programming, AI), the ability to convey complex methodologies and insights becomes crucial. Thus, fluency with API concepts and API documentation is essential for effective communication within a team.

First, it fosters collaboration among team members and stakeholders. Data Science (DS) projects often involve multidisciplinary teams consisting of not only data specialists, but also software developers, business analysts, project managers, etc. Well-documented APIs serve as a bridge between all of them, enabling these diverse groups to understand and utilize DS models and tools correctly.

Second, high-quality API documentation enhances reproducibility and can reduce onboarding time for newcomers. In DS, where models and analyses must be validated and replicated, clear API documentation ensures that others can follow the same processes, use the same data, and achieve consistent results. This is particularly important for data-driven decision-making.

Finally, as Data Science becomes increasingly integrated into business strategies, well-documented APIs can improve the scalability of data solutions and simplify the process of working with data. For example, APIs can play a significant role in data gathering for projects, enabling rapid prototyping and development of applications that rely on up-to-date information. By leveraging APIs for data gathering from sources like REST Countries (see Case 6.1), data scientists can focus on analysis rather than data acquisition.

In this post we will:

  • Briefly explore what an API is and its purpose in software development.
  • Get to know the main components of a REST API.
  • Describe the most common formats and provide practical cases of API calls and responses.
  • Summarize what good API documentation should look like, with information on endpoints, parameters, and responses.

2. What Is an API

An API (Application Programming Interface) comprises a set of methods by which different programs communicate with each other and exchange data. Essentially, it is an intermediary that allows applications, devices, servers, and other systems to exchange information, while hiding the processes within each system from each other.

Imagine a library with a large collection of books, and a librarian who knows where to find the exact book a certain reader needs. Here we can think of the librarian as an API that simplifies access to information: readers (our “frontend”) do not waste time searching through the entire book catalog (our “backend”) and can focus only on their specific request. Furthermore, if readers need other books, they can simply send another request to the API.

Image generated by Author using NightCafé

This analogy highlights the role of the API as an intermediary between the user and the data source, providing convenient and efficient access to information.

A special case of an API is a REST API, which follows the principles of the REST (REpresentational State Transfer) architecture. REST APIs are widely regarded as the industry standard because they are lightweight, flexible, and use common data formats like JSON or XML.

3. Components of a REST API

Each of the REST API components below plays a vital role in organizing client-server interactions.

3.1. Resources

A resource is any entity that can be accessed through the API. Each resource has a unique identifier (URI), for example:

https://api.thecatapi.com/v1/images/search?size=med

Here, images is a collection of cat images from The Cat API [1], search is the endpoint that queries this collection, and size=med is a query parameter that restricts the results to medium-sized images.
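To make the anatomy of this URI concrete, here is a minimal sketch that splits it into its parts with Python's standard library. It only parses the string and sends no request.

```python
from urllib.parse import urlsplit, parse_qs

uri = "https://api.thecatapi.com/v1/images/search?size=med"
parts = urlsplit(uri)

print(parts.scheme)            # protocol used to reach the API
print(parts.netloc)            # host that serves the API
print(parts.path)              # versioned path to the resource and endpoint
print(parse_qs(parts.query))   # query parameters as a dict of lists
```

The path identifies what is being requested, while the query string narrows down which items come back.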

3.2. HTTP Methods

HTTP methods are used to interact with resources:

  • GET: retrieve data about a resource;
  • POST: create a new resource;
  • PUT: update a resource by replacing it completely;
  • PATCH: partially update a resource;
  • DELETE: delete a resource.
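As a rough illustration of how the methods map to actions, the sketch below builds one request object per method using only the standard library. The base URL and the book payloads are hypothetical (they echo the library analogy from Section 2), and nothing is actually sent over the network.

```python
from urllib.request import Request

# Hypothetical resource URL, used only for illustration; no request is sent.
BASE = "https://api.example.com/v1/books"

requests_by_action = {
    "list books": Request(BASE, method="GET"),
    "add a book": Request(BASE, data=b'{"title": "Dune"}', method="POST"),
    "replace book 42": Request(
        f"{BASE}/42", data=b'{"title": "Dune", "year": 1965}', method="PUT"
    ),
    "change only the year": Request(
        f"{BASE}/42", data=b'{"year": 1965}', method="PATCH"
    ),
    "remove book 42": Request(f"{BASE}/42", method="DELETE"),
}

# Show which HTTP method and URL each action would use.
for action, req in requests_by_action.items():
    print(f"{action:22} -> {req.get_method():6} {req.full_url}")
```

Note how PUT carries the full representation of the book, while PATCH carries only the field that changes.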

3.3. Requests and Responses

Data is exchanged between the client and server via HTTP requests and responses. In most cases, the JSON format is used because it is easy to read and supported by the vast majority of programming languages.

3.4. HTTP Headers

Headers are used to convey additional information, such as the content type (Content-Type) or authentication parameters (Authorization).
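The interplay of a JSON body and headers can be sketched as follows. The endpoint, payload, and token are placeholders invented for this example; the request is only constructed, not sent.

```python
import json
from urllib.request import Request

# Hypothetical endpoint, payload, and token, for illustration only.
payload = {"title": "Dune", "year": 1965}
body = json.dumps(payload).encode("utf-8")   # Python dict -> JSON bytes

req = Request(
    "https://api.example.com/v1/books",
    data=body,
    method="POST",
    headers={
        "Content-Type": "application/json",      # format of the request body
        "Authorization": "Bearer <your_token>",  # placeholder credentials
    },
)

# urllib stores header names capitalized as "Content-type".
print(req.get_header("Content-type"))
print(json.loads(req.data))   # JSON bytes -> Python dict again
```

The same round trip (serialize on the way out, parse on the way in) happens with responses, only in the opposite direction.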

3.5. HTTP Response Codes

Each HTTP request receives a response with a specific status code:

  • 200 OK: successful request;
  • 201 Created: resource successfully created;
  • 400 Bad Request: malformed or invalid request from the client;
  • 401 Unauthorized: missing or invalid authentication credentials;
  • 404 Not Found: resource not found;
  • 500 Internal Server Error: server-side error.
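In code, status codes are usually handled by range rather than one by one. Below is a small sketch using the standard http.HTTPStatus enum; the describe helper is a hypothetical name introduced for this example.

```python
from http import HTTPStatus

def describe(status_code: int) -> str:
    """Return a short, human-readable summary of an HTTP status code."""
    status = HTTPStatus(status_code)   # raises ValueError for unknown codes
    if 200 <= status_code < 300:
        outcome = "success"
    elif 400 <= status_code < 500:
        outcome = "client error"
    elif status_code >= 500:
        outcome = "server error"
    else:
        outcome = "informational or redirect"
    return f"{status_code} {status.phrase}: {outcome}"

for code in (200, 201, 400, 401, 404, 500):
    print(describe(code))
```

Grouping by the first digit keeps client code robust when an API returns a code that is not listed in its documentation.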

4. API Clients

API clients like Postman or Bruno [2] simplify API interaction by providing a dedicated workspace for sending requests and managing responses. Instead of requiring command-line tools or hand-written code, as in Case 6.1, they offer visual interfaces and automation features that speed up workflows.

In Case 6.2, we will use Bruno to interact with JokeAPI [3]. An API client like Bruno simplifies the interaction between software systems: without one, developers would have to construct each HTTP request manually and process each raw response from scratch.

5. Tips on Creating Good API Documentation

Creating effective API documentation is crucial for ensuring that users can easily understand and utilize your API. Here are some key tips to keep in mind:

5.1. Prioritize Simplicity, Clarity, and Consistency

Avoid unnecessary technical jargon and inconsistent terminology. Instead, use simple, straightforward language that is easy to follow. If necessary, establish a style guide to maintain uniformity throughout your documentation. It can state the main rules you follow, e.g. how to format code snippets and screenshots, the preferred tone of voice, etc.

5.2. Include Comprehensive Details

Thorough API documentation should cover several essential elements. A typical page describing an API method includes:

  • Brief Description: 1-2 sentences that clearly outline the main purpose of the endpoint.
  • Request Syntax: An overview of the API call.
  • Authentication Methods: Detail the authentication processes needed to access the API securely.
  • Parameters and Data Types: Specify the required parameters and their corresponding data types for requests.
  • Examples of Requests: Provide examples of both a correct request and a request that returns an error to illustrate how to use the API effectively.

6. Practical Cases

Case 6.1: Make a request to a RESTful API using Python

Collecting country-level data is essential for understanding global, regional, or national trends, enabling informed decision-making for governments, businesses, and individual researchers. With a country data source such as REST Countries [4], data scientists can fetch area, population, and demonyms through a RESTful API instead of manually scraping web pages. The code below retrieves and displays data about the countries of Central America:

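import requests
import json

# Endpoint for one subregion; the fields parameter limits the response to what we need.
# The space in "Central America" is percent-encoded by requests when the request is sent.
url = 'https://restcountries.com/v3.1/subregion/Central America/?fields=name,area,population,demonyms'
response = requests.get(url, timeout=10)  # do not wait forever for a reply
response.raise_for_status()  # stop early on 4xx/5xx responses
jdata = response.json()  # parse the JSON body into Python lists and dicts
# Pretty-print the parsed data with a 4-space indent.
formatted_json = json.dumps(jdata, indent=4)
print(formatted_json)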

Geographic regions are defined using UN methodology [5]. You can also restrict the response to certain fields [6]: in our case, these are name, area, population, and demonyms.

The output is printed as human-readable JSON:

[
    {
        "name": {
            "common": "Honduras",
            "official": "Republic of Honduras",
            "nativeName": {
                "spa": {
                    "official": "Rep\u00fablica de Honduras",
                    "common": "Honduras"
                }
            }
        },
        "demonyms": {
            "eng": {
                "f": "Honduran",
                "m": "Honduran"
            },
            "fra": {
                "f": "Hondurienne",
                "m": "Hondurien"
            }
        },
        "area": 112492.0,
        "population": 9892632
    },
    {
        "name": {
            "common": "Costa Rica",
            "official": "Republic of Costa Rica",
            "nativeName": {
                "spa": {
                    "official": "Rep\u00fablica de Costa Rica",
                    "common": "Costa Rica"
                }
            }
        },
        "demonyms": {
            "eng": {
                "f": "Costa Rican",
                "m": "Costa Rican"
            },
            "fra": {
                "f": "Costaricaine",
                "m": "Costaricain"
            }
        },
        "area": 51100.0,
        "population": 5309625
    },
    {
        "name": {
            "common": "Guatemala",
            "official": "Republic of Guatemala",
            "nativeName": {
                "spa": {
                    "official": "Rep\u00fablica de Guatemala",
                    "common": "Guatemala"
                }
            }
        },
        "demonyms": {
            "eng": {
                "f": "Guatemalan",
                "m": "Guatemalan"
            },
            "fra": {
                "f": "Guat\u00e9malt\u00e8que",
                "m": "Guat\u00e9malt\u00e8que"
            }
        },
        "area": 108889.0,
        "population": 18079810
    },
    {
        "name": {
            "common": "Panama",
            "official": "Republic of Panama",
            "nativeName": {
                "spa": {
                    "official": "Rep\u00fablica de Panam\u00e1",
                    "common": "Panam\u00e1"
                }
            }
        },
        "demonyms": {
            "eng": {
                "f": "Panamanian",
                "m": "Panamanian"
            },
            "fra": {
                "f": "Panam\u00e9enne",
                "m": "Panam\u00e9en"
            }
        },
        "area": 75417.0,
        "population": 4064780
    },
    {
        "name": {
            "common": "Nicaragua",
            "official": "Republic of Nicaragua",
            "nativeName": {
                "spa": {
                    "official": "Rep\u00fablica de Nicaragua",
                    "common": "Nicaragua"
                }
            }
        },
        "demonyms": {
            "eng": {
                "f": "Nicaraguan",
                "m": "Nicaraguan"
            },
            "fra": {
                "f": "Nicaraguayenne",
                "m": "Nicaraguayen"
            }
        },
        "area": 130373.0,
        "population": 6803886
    },
    {
        "name": {
            "common": "Belize",
            "official": "Belize",
            "nativeName": {
                "bjz": {
                    "official": "Belize",
                    "common": "Belize"
                },
                "eng": {
                    "official": "Belize",
                    "common": "Belize"
                },
                "spa": {
                    "official": "Belice",
                    "common": "Belice"
                }
            }
        },
        "demonyms": {
            "eng": {
                "f": "Belizean",
                "m": "Belizean"
            },
            "fra": {
                "f": "B\u00e9lizienne",
                "m": "B\u00e9lizien"
            }
        },
        "area": 22966.0,
        "population": 417634
    },
    {
        "name": {
            "common": "El Salvador",
            "official": "Republic of El Salvador",
            "nativeName": {
                "spa": {
                    "official": "Rep\u00fablica de El Salvador",
                    "common": "El Salvador"
                }
            }
        },
        "demonyms": {
            "eng": {
                "f": "Salvadoran",
                "m": "Salvadoran"
            },
            "fra": {
                "f": "Salvadorienne",
                "m": "Salvadorien"
            }
        },
        "area": 21041.0,
        "population": 6029976
    }
]
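Once the response is parsed, it behaves like ordinary Python lists and dictionaries. As a minimal sketch of a follow-up analysis step, the snippet below computes population density for three of the records above. The values are copied from the output; it assumes that the area field is expressed in square kilometres, and population_density is a helper name introduced here.

```python
# A trimmed copy of three records from the output above.
countries = [
    {"name": {"common": "Honduras"}, "area": 112492.0, "population": 9892632},
    {"name": {"common": "Costa Rica"}, "area": 51100.0, "population": 5309625},
    {"name": {"common": "Belize"}, "area": 22966.0, "population": 417634},
]

def population_density(country: dict) -> float:
    """People per unit of area (assumed sq km), rounded to one decimal place."""
    return round(country["population"] / country["area"], 1)

# Sort from the most to the least densely populated country.
rows = sorted(
    ((c["name"]["common"], population_density(c)) for c in countries),
    key=lambda row: row[1],
    reverse=True,
)
for name, density in rows:
    print(f"{name}: {density} people per sq km")
```

With the live response, the same function could be applied to jdata directly, since every record has the same structure.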

Case 6.2: Make a request to JokeAPI using Bruno

JokeAPI is a free, open-source REST API that delivers jokes in various formats, e.g. JSON, XML, YAML, or plain text [3].

  1. Open Bruno and select Collections+ Create collection.
  2. Select a name for your collection, e.g. Sample API.
  3. The created collection is displayed in the left panel. To create a request, click New Request.
  4. Select the request type (HTTP) and specify its name e.g. joke_request.
  5. In the URL cell, select the method (GET) and enter the endpoint https://v2.jokeapi.dev/joke/Any?blacklistFlags=religious,political,racist,sexist&type=single.
    The URL was constructed from the preferences selected on the JokeAPI website. In our example, we allowed any joke category but blacklisted jokes flagged as religious, political, racist, or sexist.
  6. The parameters selected on the website appear in the request after the ? as a query string appended to the endpoint URL, separated from each other by &. They also appear as a table in the Params tab.
  7. Click Send, wait a bit … and you’ll get a not-that-bad joke in response (“The generation of random numbers is too important to be left to chance.”). Pay attention to the status of our request: it’s 200 OK, which means success.
What the Bruno interface looks like. Screenshot by Author.

Note that in this example we didn’t need an API key to access the REST API resource. Otherwise, we would have to pass it with the request, for instance as a header in the separate Headers tab.
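For readers who prefer code to a GUI, the query string described in step 6 can also be assembled in Python. This is only a sketch of how the URL is put together from its parameters; it does not call JokeAPI.

```python
from urllib.parse import urlencode

base_url = "https://v2.jokeapi.dev/joke/Any"
params = {
    "blacklistFlags": "religious,political,racist,sexist",
    "type": "single",
}

# safe="," keeps the commas readable instead of encoding them as %2C.
query = urlencode(params, safe=",")
url = f"{base_url}?{query}"
print(url)
```

The dictionary plays the same role as the table in Bruno's Params tab: one row per parameter, joined with & behind the ?.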

Case 6.3: Make a request to NASA Open APIs with API key

NASA’s APOD (Astronomy Picture of the Day) is a popular service that provides a daily astronomy-related photo or video, along with a description [7].

Let’s sketch a sample documentation page for the NASA APOD API based on the tips from Section 5.

NASA APOD API Documentation

Description: This API allows users to retrieve images or videos for a specific date, a date range, or a random selection from the NASA APOD website.

Request Syntax: GET https://api.nasa.gov/planetary/apod

Authentication Methods: To access the APOD API, include an API key in your request. To get a free API key, sign up at https://api.nasa.gov/. The key is passed as a query parameter in the request.

Parameters and Data Types: see the table below

| Parameter | Type | Description |
|---|---|---|
| api_key* | string | Your personal NASA API key. For a quick look at how requests work, DEMO_KEY can be used instead |
| date | string (datetime) | The date of the APOD image to retrieve. If not specified, defaults to today |
| start_date | string (datetime) | The start of the date range for retrieving images. Cannot be used in the same request as date |
| end_date | string (datetime) | The end of the date range for retrieving images. Used together with start_date in the same request |
| count | integer | Returns the given number of randomly chosen images. Cannot be used with the date parameters |
| thumbs | boolean | If true, returns the URL of the video thumbnail. Ignored if the APOD object is not a video |

* Required parameter
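The constraints listed in the table can also be checked on the client side before a request is sent. The function below is a hypothetical helper written for this post (it is not part of any NASA library); it encodes only the rules stated in the table and builds the URL without calling the API.

```python
from urllib.parse import urlencode

APOD_URL = "https://api.nasa.gov/planetary/apod"

def build_apod_url(api_key="DEMO_KEY", **params):
    """Build an APOD request URL, rejecting combinations the table rules out."""
    date_range = params.keys() & {"start_date", "end_date"}
    if "date" in params and date_range:
        raise ValueError("date cannot be combined with start_date or end_date")
    if "end_date" in params and "start_date" not in params:
        raise ValueError("end_date must be used together with start_date")
    if "count" in params and ("date" in params or date_range):
        raise ValueError("count cannot be combined with date parameters")
    # The API key travels as a query parameter, next to the other parameters.
    query = urlencode({**params, "api_key": api_key})
    return f"{APOD_URL}?{query}"

print(build_apod_url(start_date="2025-03-03", end_date="2025-03-05"))
```

Catching an invalid combination locally gives a clearer error message and saves a call against the rate limit of the key.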

Examples of Requests

Correct Request with 200 OK status

GET https://api.nasa.gov/planetary/apod?api_key=<your_API_key>

{
    "copyright": "Simone Curzi",
    "date": "2026-05-18",
    "explanation": "Spiral galaxy NGC 3169 looks to be unraveling like a ball of cosmic yarn. It lies some 70 million light-years away, south of bright star Regulus toward the faint constellation Sextans. Wound up spiral arms are pulled out into sweeping tidal tails as NGC 3169 (left) and neighboring NGC 3166 interact gravitationally. Eventually the galaxies will merge into one, a common fate even for bright galaxies in the local universe. Drawn out stellar arcs and plumes are clear indications of the ongoing gravitational interactions across the deep and colorful galaxy group photo. The telescopic frame spans about 20 arc minutes or about 400,000 light-years at the group's estimated distance, and includes smaller, bluish NGC 3165 to the right. NGC 3169 is also known to shine across the spectrum from radio to X-rays, harboring an active galactic nucleus that is the site of a supermassive black hole.",
    "hdurl": "https://apod.nasa.gov/apod/image/2605/ngc3169_ngc3166_ngc3165.jpg",
    "media_type": "image",
    "service_version": "v1",
    "title": "Unraveling NGC 3169",
    "url": "https://apod.nasa.gov/apod/image/2605/ngc3169_ngc3166_ngc3165px1024.jpg"
}

Correct Request with 200 OK status for a range of dates

GET https://api.nasa.gov/planetary/apod?start_date=2025-03-03&end_date=2025-03-05&api_key=<your_API_key>

[
    {
        "date": "2025-03-03",
        "explanation": "There's a new lander on the Moon. Yesterday Firefly Aerospace's Blue Ghost executed the first-ever successful commercial lunar landing. During its planned 60-day mission, Blue Ghost will deploy several NASA-commissioned scientific instruments, including PlanetVac which captures lunar dust after creating a small whirlwind of gas. Blue Ghost will also host the telescope LEXI that captures X-ray images of the Earth's magnetosphere. LEXI data should enable a better understanding of how Earth's magnetic field protects the Earth from the Sun's wind and flares.  Pictured, the shadow of the Blue Ghost lander is visible on the cratered lunar surface, while the glowing orb of the planet Earth hovers just over the horizon. Goals for future robotic Blue Ghost landers include supporting lunar astronauts in NASA's Artemis program, with Artemis III currently scheduled to land humans back on the Moon in 2027.",
        "hdurl": "https://apod.nasa.gov/apod/image/2503/BlueGhostShadow_Firefly_4096.jpg",
        "media_type": "image",
        "service_version": "v1",
        "title": "Blue Ghost on the Moon",
        "url": "https://apod.nasa.gov/apod/image/2503/BlueGhostShadow_Firefly_960.jpg"
    },
    {
        "copyright": "Valerio Minato",
        "date": "2025-03-04",
        "explanation": "Why does this Moon look so unusual?  A key reason is its vivid red color. The color is caused by the deflection of blue light by Earth's atmosphere -- the same reason that the daytime sky appears blue.  The Moon also appears unusually distorted.  Its strange structuring is an optical effect arising from layers in the Earth's atmosphere that refract light differently due to sudden differences in temperature or pressure.  A third reason the Moon looks so unusual is that there is, by chance, an airplane flying in front. The featured picturesque gibbous Moon was captured about two weeks ago above Turin, Italy. Our familiar hovering sky orb was part of an unusual quadruple alignment that included two historic ground structures: the Sacra di San Michele on the near hill and Basilica of Superga just beyond.   Your Sky Surprise: What picture did APOD feature on your friend's birthday? (post 1995)",
        "hdurl": "https://apod.nasa.gov/apod/image/2503/QuadMoon_Minato_960.jpg",
        "media_type": "image",
        "service_version": "v1",
        "title": "A Quadruple Alignment over Italy",
        "url": "https://apod.nasa.gov/apod/image/2503/QuadMoon_Minato_960.jpg"
    },
    {
        "copyright": "Todd Anderson",
        "date": "2025-03-05",
        "explanation": "On the right, dressed in blue, is the Pleiades.  Also known as the Seven Sisters and M45, the Pleiades is one of the brightest and most easily visible open clusters on the sky. The Pleiades contains over 3,000 stars, is about 400 light years away, and only 13 light years across. Surrounding the stars is a spectacular blue reflection nebula made of fine dust.  A common legend is that one of the brighter stars faded since the cluster was named. On the left, shining in red, is the California Nebula.  Named for its shape, the California Nebula is much dimmer and hence harder to see than the Pleiades.  Also known as NGC 1499, this mass of red glowing hydrogen gas is about 1,500 light years away. Although about 25 full moons could fit between them, the featured wide angle, deep field image composite has captured them both.  A careful inspection of the deep image will also reveal the star forming region IC 348 and the molecular cloud LBN 777 (the Baby Eagle Nebula).    Jump Around the Universe: Random APOD Generator",
        "hdurl": "https://apod.nasa.gov/apod/image/2503/California2Pleiades_Anderson_9953.jpg",
        "media_type": "image",
        "service_version": "v1",
        "title": "Seven Sisters versus California",
        "url": "https://apod.nasa.gov/apod/image/2503/California2Pleiades_Anderson_960.jpg"
    }
]

Incorrect Request with 400 Bad Request status

GET https://api.nasa.gov/planetary/apod?date=2023-03-01&end_date=2023-03-01&api_key=<your_API_key>

{
    "code": 400,
    "msg": "Bad Request: invalid field combination passed. Allowed request fields for apod method are 'concept_tags', 'date', 'hd', 'count', 'start_date', 'end_date', 'thumbs'",
    "service_version": "v1"
}
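Client code should be prepared for both kinds of body shown above. Here is a minimal sketch of one way to tell them apart, assuming that error bodies always carry the code and msg keys seen in the 400 example. ApodError and parse_apod_body are hypothetical names, and the error message is shortened for readability.

```python
import json

class ApodError(Exception):
    """Hypothetical exception type for failed APOD calls."""

def parse_apod_body(body: str):
    """Return the decoded payload, or raise ApodError for an error body."""
    payload = json.loads(body)
    # Range requests return a list; error bodies are objects with "code" and "msg".
    if isinstance(payload, dict) and "code" in payload and "msg" in payload:
        raise ApodError(f"{payload['code']}: {payload['msg']}")
    return payload

# Shortened copy of the error body from the example above.
error_body = (
    '{"code": 400, "msg": "Bad Request: invalid field combination passed.", '
    '"service_version": "v1"}'
)
try:
    parse_apod_body(error_body)
except ApodError as exc:
    print("Request failed ->", exc)
```

In a real client, the HTTP status code of the response would normally be checked first; inspecting the body is a complementary safeguard.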

7. Conclusion

Knowing how to read (and perhaps write) API documentation is not just a technical task; it is a vital component of successful data analytics practice, improving collaboration, reproducibility, adoption, and scalability. By prioritizing clear and detailed documentation, data scientists can ensure they will be comfortable working with modern tools.

For example, many data scientists now use tools like Claude Code, a coding AI agent. With Claude Code, your files are stored locally on your computer, and the AI assistant reads them from there and sends the text content to the Anthropic API for processing. It’s worth noting that comprehensive documentation for the Claude API describes all the nuances of its operation. Specifically, the Claude API is a RESTful API at https://api.anthropic.com that provides programmatic access to Claude models and managed Claude agents [8]. Hopefully, after reading this post you will understand this (and other) documentation a bit better 🙂

Thanks for reading!

List of References

  1. The Cat API web page: https://thecatapi.com/
  2. Bruno documentation: https://docs.usebruno.com/introduction/getting-started
  3. JokeAPI web page: https://jokeapi.dev/
  4. REST countries v3.1: https://restcountries.com/
  5. UNSD Methodology: Standard country or area codes for statistical use (M49)
  6. List of fields on GitLab page of the project: https://gitlab.com/restcountries/restcountries/-/blob/master/FIELDS.md
  7. NASA Open APIs: https://api.nasa.gov/
  8. Claude API Docs — An Overview: https://platform.claude.com/docs/en/api/overview