How to update a pandas dataframe from multiple API calls?

Code Explanation

  • Create the dataframe, df, with pd.read_csv.
    • All of the values in 'person_id' are expected to be unique.
  • Use .apply on 'person_id' to call prepare_data for each id.
    • prepare_data expects 'person_id' to be a str or an int, as indicated by the type annotation, Union[int, str].
  • Inside prepare_data, call the API, which returns a dict.
  • Convert the value of the 'rents' key into a dataframe, data, with pd.json_normalize.
  • Use .apply on 'carId' to call the API for each car and extract the 'mileage', which is added to data as a column.
  • Add 'person_id' to data, so it can be used to merge df with s.
  • .apply returns s, a pd.Series of dataframes. Combine them into a single dataframe with pd.concat, then merge df and s on 'person_id'.
  • Save to a csv with df.to_csv in the desired form.
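The normalization step can be tried in isolation, without the API. The sketch below assumes a response with a 'rents' list of flat records whose keys match the columns in the Example section (carId, price, rentStatus). The sample values are illustrative, not real API output.

```python
import pandas as pd

# hypothetical API response; the field names are taken from the example output columns
response = {
    'rents': [
        {'carId': 6638, 'price': 1000, 'rentStatus': 'active'},
        {'carId': 5566, 'price': 2000, 'rentStatus': 'active'},
    ]
}

# each dict in the 'rents' list becomes one row of the dataframe
data = pd.json_normalize(response['rents'])

# tag every row with the id it belongs to, so it can be merged back onto df later
data['person_id'] = 1000

print(data)
```

If the real records are nested, pd.json_normalize flattens the nested keys into dotted column names, so the column list may differ from this sketch.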

Potential Issues

  • If there's an issue, it's most likely to occur in the call_api function.
  • As long as call_api returns a dict like the response shown in the question, the remainder of the code will produce the desired output.

import pandas as pd
import requests
from typing import Union

def call_api(url: str) -> dict:
    # the timeout value is an assumption; adjust it for your network
    r = requests.get(url, timeout=10)
    # raise an exception for 4xx / 5xx responses instead of parsing an error body
    r.raise_for_status()
    return r.json()

def prepare_data(uid: Union[int, str]) -> pd.DataFrame:
    
    d_url = f'http://api.myendpoint.intranet/get-data/{uid}'
    m_url = 'http://api.myendpoint.intranet/get-mileage/'
    
    # get the rent data from the api call
    rents = call_api(d_url)['rents']
    # normalize rents into a dataframe
    data = pd.json_normalize(rents)
    
    # get the mileage data from the api call and add it to data as a column
    data['mileage'] = data.carId.apply(lambda cid: call_api(f'{m_url}{cid}')['mileage'])
    # add person_id as a column to data, which will be used to merge data to df
    data['person_id'] = uid
    
    return data
    

# read data from file
df = pd.read_csv('file.csv', sep=';')

# call prepare_data
s = df.person_id.apply(prepare_data)

# s is a Series of DataFrames, which can be combined into one DataFrame with pd.concat
s = pd.concat(s.tolist(), ignore_index=True)

# join df with s, on person_id
df = df.merge(s, on='person_id')

# save to csv
df.to_csv('output.csv', sep=';', index=False)

  • If there are any errors when running this code:
    1. Leave a comment to let me know.
    2. Edit your question and paste the entire traceback, as text, into a code block.
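Since the rest of the code depends on the API returning a dict with a 'rents' list, a quick shape check can make a bad response fail with a clearer message. This is only a sketch: check_response is a hypothetical helper, not part of the code above, and the checks assume the response shape described in this answer.

```python
def check_response(payload: object) -> dict:
    """Return payload unchanged if it looks like the expected response, else raise."""
    # the API is expected to return a JSON object, which r.json() turns into a dict
    if not isinstance(payload, dict):
        raise TypeError(f'expected a dict, got {type(payload).__name__}')
    # 'rents' must be a list of records for pd.json_normalize to build rows from
    rents = payload.get('rents')
    if not isinstance(rents, list):
        raise ValueError("expected a 'rents' key holding a list of records")
    return payload
```

It could be used inside prepare_data as rents = check_response(call_api(d_url))['rents'].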

Example

# given the following starting dataframe
   person_id    name  flag
0       1000  Joseph     1
1        400     Sam     1

# resulting dataframe using the same data for both id 1000 and 400
   person_id    name  flag  carId  price rentStatus  mileage
0       1000  Joseph     1   6638   1000     active   1000.0
1       1000  Joseph     1   5566   2000     active   1000.0
2        400     Sam     1   6638   1000     active   1000.0
3        400     Sam     1   5566   2000     active   1000.0
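The example can be reproduced offline by swapping the real API for a stub. In the sketch below, fake_call_api is a hypothetical stand-in that returns the same canned data for every id, and the starting dataframe is built in code instead of being read from file.csv. A list comprehension is used in place of .apply, which gives the same list of dataframes.

```python
import pandas as pd

BASE = 'http://api.myendpoint.intranet'

def fake_call_api(url: str) -> dict:
    """Hypothetical stand-in for call_api: canned data, regardless of the id."""
    if '/get-mileage/' in url:
        return {'mileage': 1000.0}
    return {'rents': [
        {'carId': 6638, 'price': 1000, 'rentStatus': 'active'},
        {'carId': 5566, 'price': 2000, 'rentStatus': 'active'},
    ]}

def prepare_data(uid) -> pd.DataFrame:
    # normalize the rents for one person into a dataframe
    data = pd.json_normalize(fake_call_api(f'{BASE}/get-data/{uid}')['rents'])
    # look up the mileage for each car and add it as a column
    data['mileage'] = data['carId'].apply(
        lambda cid: fake_call_api(f'{BASE}/get-mileage/{cid}')['mileage']
    )
    # add the key used to merge back onto df
    data['person_id'] = uid
    return data

# the starting dataframe from the example
df = pd.DataFrame({'person_id': [1000, 400], 'name': ['Joseph', 'Sam'], 'flag': [1, 1]})

# one dataframe per person, stacked into a single dataframe
s = pd.concat([prepare_data(uid) for uid in df['person_id']], ignore_index=True)

# join the person columns with the rent columns
result = df.merge(s, on='person_id')
print(result)
```

Replacing fake_call_api with the real call_api should give the same column layout, provided the API responses have the shape assumed here.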