DataRobot

DataRobot for Developers

Embed cutting edge AI into the applications and services you build.


Quickstart Guide

A guide to getting started with DataRobot APIs

DataRobot is an advanced enterprise AI platform that democratizes data science and automates the end-to-end process for building, deploying, and maintaining artificial intelligence and machine learning at scale. This quickstart guides you through a simple example problem: predicting fuel economy in miles per gallon from data about an automobile (e.g., vehicle weight and number of cylinders). You will go through all the steps required to train an AI, deploy a model, make predictions, and monitor the deployment with DataRobot APIs. The code for this quickstart guide is available on GitHub.

What you will need

  1. DataRobot account. Sign up for free to get started.
  2. Example Automotive MPG dataset auto-mpg.csv.
  3. API Authentication.
  4. Optional: Python client.

Getting Started

Now that you have your API credentials, you can use the APIs to:

  1. Upload a dataset.
  2. Train an AI on your dataset and deploy the resulting model.
  3. Predict outcomes with new data.
  4. Monitor the health of your deployment.

To simplify this tutorial, we are going to create some environment
variables that we will reference by name in the steps below:

export DATAROBOT_API_TOKEN=<your api key>
export DATAROBOT_ENDPOINT=https://app2.datarobot.com/api/v2

If you did not create your account with the link above, find your endpoint in the Guide to different DataRobot endpoints.
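The same two variables can also be read from Python before any API call is made. The sketch below is a minimal illustration, not part of any DataRobot library: the helper name `load_credentials` is hypothetical. It fails early when either variable is missing and trims a trailing slash so that later URLs such as `{endpoint}/projects/` contain exactly one slash.

```python
import os


def load_credentials(env=os.environ):
    """Return (token, endpoint) read from the DATAROBOT_* variables.

    Hypothetical helper for illustration only.
    """
    names = ("DATAROBOT_API_TOKEN", "DATAROBOT_ENDPOINT")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError(
            f"Missing environment variables: {', '.join(missing)}"
        )
    token = env["DATAROBOT_API_TOKEN"]
    # Drop a trailing slash so f"{endpoint}/projects/" has a single slash.
    endpoint = env["DATAROBOT_ENDPOINT"].rstrip("/")
    return token, endpoint
```

A script could then start with `datarobot_api_token, datarobot_endpoint = load_credentials()`.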

Upload dataset

First, upload your dataset file and create a project named 'Auto MPG'. All of the modeling in DataRobot happens within a project. Each project has one dataset that is used as the source from which to train models.

Python (DataRobot client):

"""
This example code relies on the `datarobot` library
Install it with:
pip3 install datarobot
"""
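import os
from pathlib import Path

import datarobot as dr

# Read the credentials exported earlier; adjust the dataset path as needed.
datarobot_api_token = os.getenv("DATAROBOT_API_TOKEN")
datarobot_endpoint = os.getenv("DATAROBOT_ENDPOINT")
dataset_file_path = Path("data", "auto-mpg.csv")

dr.Client(endpoint=datarobot_endpoint, token=datarobot_api_token)
project = dr.Project.start(project_name="Auto MPG DR-Client",
                           sourcedata=dataset_file_path.as_posix(),
                           target="mpg")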

curl:

# DATAROBOT_API_TOKEN and DATAROBOT_ENDPOINT are the variables exported above.
# DATASET_FILE_PATH must point at your copy of auto-mpg.csv.
DATASET_FILE_PATH="data/auto-mpg.csv"

# The upload response carries a status URL in its Location header.
# grep -i matches both `location` and `Location`.
location=$(curl -Lsi \
  -X POST \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -F 'projectName="Auto MPG"' \
  -F "file=@${DATASET_FILE_PATH}" \
  "${DATAROBOT_ENDPOINT}"/projects/ | grep -i 'location: .*$' | \
  cut -d " " -f2 | tr -d '\r')
echo "Uploaded dataset. Checking status of project at: ${location}"
while true; do
  project_id=$(curl -Ls \
    -X GET \
    -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" "${location}" \
    | grep -Eo 'id":\s"\w+' | cut -d '"' -f3 | tr -d '\r')
  if [ "${project_id}" = "" ]
  then
    echo "Setting up project..."
    sleep 10
  else
    echo "Project setup complete."
    echo "Project ID: ${project_id}"
    break
  fi
done

Python (requests):

"""
This example code relies on the `requests` and `requests-toolbelt` libraries.
Install them with:
pip3 install requests requests-toolbelt
"""
import os
from pathlib import Path
from time import sleep

import requests
from requests_toolbelt.multipart.encoder import MultipartEncoder

# The later steps of this script reuse these two names.
datarobot_api_token = os.getenv("DATAROBOT_API_TOKEN")
datarobot_endpoint = os.getenv("DATAROBOT_ENDPOINT")
dataset_path = Path("data", "auto-mpg.csv")

form_data = {
    "file": (dataset_path.name, dataset_path.open("rb")),
    "projectName": "Auto MPG",
}
encoder = MultipartEncoder(fields=form_data)
headers = {
    "Authorization": f"Bearer {datarobot_api_token}",
    "Content-Type": encoder.content_type,
}
response = requests.post(
    f"{datarobot_endpoint}/projects/", headers=headers, data=encoder.read()
)
try:
    assert response.status_code == 202
except AssertionError:
    print(
        f"Status code: {response.status_code}, Reason: {response.reason}, "
        f"Details: {response.content}"
    )
else:
    # `requests` matches header names case-insensitively, so "location"
    # also finds a `Location` header on other DataRobot endpoints.
    while True:
        project_id = requests.get(
            response.headers["location"], headers=headers
        ).json().get("id", None)
        if project_id is not None:
            print(f"Project setup complete. Project ID: {project_id}")
            break
        else:
            print("Setting up project...")
            sleep(10)
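
The polling loops above run until the project is ready, with no upper bound on the wait. As a hedged alternative, the sketch below wraps the same pattern in a small helper with a timeout. The name `poll_until` and its parameters are hypothetical, and the `sleep` and `clock` arguments exist only so the helper can be exercised without real waiting.

```python
import time


def poll_until(fetch, interval=10, timeout=600,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until it returns a non-None value or timeout elapses.

    Hypothetical helper: `fetch` is any zero-argument callable that
    returns None while the resource is still being set up.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if result is not None:
            return result
        if clock() >= deadline:
            raise TimeoutError(f"No result after {timeout} seconds")
        sleep(interval)
```

With the variables from the `requests` example, the project ID could be fetched as `poll_until(lambda: requests.get(response.headers["location"], headers=headers).json().get("id"))`.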

Train & Deploy

Now you will tell DataRobot to build an AI and deploy the model that best understands how to predict "mpg."

Python (DataRobot client):

# Train
# Autopilot will take a bit to complete.
# Run the following and then grab a coffee or catch up on email.
project.wait_for_autopilot()
model = dr.ModelRecommendation.get(project.id)
# Deploy
prediction_server = dr.PredictionServer.list()[0]
deployment = dr.Deployment.create_from_learning_model(
    model_id=model.model_id, label="MPG Prediction Server",
    description="Deployed with DataRobot client",
    default_prediction_server_id=prediction_server.id
)

curl:

## Train
response=$(curl -Lsi \
  -X PATCH \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  --data '{"target": "mpg", "mode": "quick"}' \
  "${DATAROBOT_ENDPOINT}/projects/${project_id}/aim" | grep -i 'location: .*$' \
  | cut -d " " -f2 | tr -d '\r')
echo "AI training initiated. Checking status of training at: ${response}"
while true; do
  initial_project_status=$(curl -Ls \
    -X GET \
    -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" "${response}" \
    | grep -Eo 'stage":\s"\w+' | cut -d '"' -f3 | tr -d '\r')
  if [ "${initial_project_status}" = "" ]
  then
    echo "Setting up AI training..."
    sleep 10
  else
    echo "Training AI. This will take a bit."
    echo "Grab a coffee or catch up on email."
    break
  fi
done

while true; do
  project_status=$(curl -Lsi \
    -X GET \
    -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
    "${DATAROBOT_ENDPOINT}/projects/${project_id}/status" \
    | grep -Eo 'autopilotDone":\strue'
  )
  if [ "${project_status}" = "" ]
  then
    echo "Autopilot training in progress..."
    sleep 60
  else
    echo "Autopilot training complete. AI ready to deploy."
    break
  fi
done
## Deploy
recommended_model_id=$(curl -s \
  -X GET \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  "${DATAROBOT_ENDPOINT}/projects/${project_id}/recommendedModels/recommendedModel/" \
  | grep -Eo 'modelId":\s"\w+' | cut -d '"' -f3 | tr -d '\r')
server_data=$(curl -s -X GET \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  "${DATAROBOT_ENDPOINT}/predictionServers/")
# Use the first prediction server in the list (head -n 1).
default_server_id=$(echo "$server_data" \
  | grep -Eo 'id":\s"\w+' | head -n 1 | cut -d '"' -f3 | tr -d '\r')
server_url=$(echo "$server_data" | grep -Eo 'url":\s"[^"]*"' | head -n 1 \
  | cut -d '"' -f3 | tr -d '\r')
server_key=$(echo "$server_data" | grep -Eo 'datarobot-key":\s"[^"]*"' \
  | head -n 1 | cut -d '"' -f3 | tr -d '\r')
read -r -d '' request_data<<EOF
{
        "defaultPredictionServerId":"${default_server_id}",
        "modelId":"${recommended_model_id}",
        "description":"Deployed with curl",
        "label":"MPG Prediction Server"
}
EOF
deployment_response=$(curl -Lsi -X POST \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  --data "${request_data}" \
  "${DATAROBOT_ENDPOINT}/deployments/fromLearningModel/")
# Match a 202 status line regardless of the HTTP version (HTTP/1.1, HTTP/2).
deploy_response_code_202=$(echo $deployment_response | grep -Eo 'HTTP/[0-9.]+ 202')
if [ "${deploy_response_code_202}" = "" ]
then
  deployment_id=$(echo "$deployment_response" | grep -Eo 'id":\s"\w+' \
    | cut -d '"' -f3 | tr -d '\r')
  echo "Prediction server ready."
else
  deployment_status=$(echo "$deployment_response" | grep -Eio 'location: .*$' \
    | cut -d " " -f2 | tr -d '\r')
  while true; do
    deployment_ready=$(curl -Ls \
    -X GET \
    -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" "${deployment_status}" \
    | grep -Eo 'id":\s"\w+' | cut -d '"' -f3 | tr -d '\r')
    if [ "${deployment_ready}" = "" ]
    then
      echo "Waiting for deployment..."
      sleep 10
    else
      deployment_id=$deployment_ready
      echo "Prediction server ready."
      break
    fi
  done
fi

Python (requests):

# Train
headers = {}
headers.update(
    {
        "Authorization": f"Bearer {datarobot_api_token}",
        "Content-Type": "application/json",
    }
)
form_data = {"target": "mpg", "mode": "quick"}
train_response = requests.patch(
    f"{datarobot_endpoint}/projects/{project_id}/aim",
    headers=headers,
    json=form_data,
)
try:
    assert train_response.status_code == 202
except AssertionError:
    print(
        f"Status code: {train_response.status_code}, "
        f"Reason: {train_response.reason}, "
        f"Details: {train_response.content}"
    )
else:
    while True:
        training_status = (
            requests.get(train_response.headers["location"], headers=headers)
            .json()
            .get("stage", None)
        )
        if training_status is not None:
            print(
                "Training AI. This will take a bit. "
                "Grab a coffee or catch up on email."
            )
            break
        else:
            print("Setting up AI training...")
            sleep(10)
while True:
    try:
        project_status = requests.get(
            f"{datarobot_endpoint}/projects/{project_id}/status",
            headers=headers,
        )
        assert project_status.status_code == 200
    except AssertionError:
        print(
            f"Something went wrong. Status code: {project_status.status_code}, "
            f"Reason: {project_status.reason}"
        )
        exit()
    else:
        if project_status.json()["autopilotDone"]:
            print("Autopilot training complete. AI ready to deploy.")
            break
        else:
            print("Autopilot training in progress...")
            sleep(60)
# Deploy
recommended_model = requests.get(
    f"{datarobot_endpoint}/projects/{project_id}/recommendedModels/"
    f"recommendedModel/",
    headers=headers,
)
model_id = recommended_model.json()["modelId"]

server_response = requests.get(
    f"{datarobot_endpoint}/predictionServers/", headers=headers
)
server_data = server_response.json()["data"][0]
default_server_id = server_data["id"]
default_server_url = server_data["url"]
default_server_key = server_data["datarobot-key"]

request_data = {
    "defaultPredictionServerId": default_server_id,
    "modelId": model_id,
    "description": "Deployed with python",
    "label": "MPG Prediction Server",
}
deploy_response = requests.post(
    f"{datarobot_endpoint}/deployments/fromLearningModel",
    headers=headers,
    json=request_data,
)
if deploy_response.status_code == 202:
    # Poll the status URL (with the auth headers) until the deployment has an ID.
    status_url = deploy_response.headers["location"]
    while True:
        deployment_status = requests.get(status_url, headers=headers)
        if deployment_status.json().get("id", None) is not None:
            print("Prediction server ready.")
            deployment_id = deployment_status.json()["id"]
            break
        else:
            print("Waiting for deployment...")
            sleep(10)
elif deploy_response.status_code != 200:
    print(
        f"Something went wrong. Status code: {deploy_response.status_code}, "
        f"Reason: {deploy_response.reason}"
    )
else:
    deployment_id = deploy_response.json()["id"]
    print("Prediction server ready.")
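
The deployment step above branches on the HTTP status code: 202 means the deployment is still being created and its status URL must be polled, while 200 means the ID is already in the response body. The sketch below isolates that decision in one function so it can be read and tested on its own. The name `next_step` and its return values are hypothetical, and only the two status codes checked in the examples above are handled.

```python
def next_step(status_code, headers, body):
    """Classify a create-deployment response.

    Returns ("ready", deployment_id) or ("poll", status_url).
    Hypothetical helper; `headers` and `body` are plain dicts.
    """
    if status_code == 200:
        return "ready", body["id"]
    if status_code == 202:
        # Match the header name case-insensitively (location / Location).
        for name, value in headers.items():
            if name.lower() == "location":
                return "poll", value
        raise ValueError("202 response without a Location header")
    raise RuntimeError(f"Unexpected status code: {status_code}")
```

For example, `next_step(deploy_response.status_code, dict(deploy_response.headers), deploy_response.json())` would apply it to the response from the `requests` example.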

Make predictions

With a prediction server deployed, you can make predictions.

Python (DataRobot client):

import json
from pprint import pprint

import requests

autos = [
    {
        "cylinders": 4,
        "displacement": 119.0,
        "horsepower": 82.00,
        "weight": 2720.0,
        "acceleration": 19.4,
        "model year": 82,
        "origin": 1,
    },
    {
        "cylinders": 8,
        "displacement": 120.0,
        "horsepower": 79.00,
        "weight": 2625.0,
        "acceleration": 18.6,
        "model year": 82,
        "origin": 1,
    },
]
prediction_headers = {
    "Authorization": f"Bearer {datarobot_api_token}",
    "Content-Type": "application/json",
    "datarobot-key": prediction_server.datarobot_key,
}
predictions = requests.post(
    f"{prediction_server.url}/predApi/v1.0/deployments"
    f"/{deployment.id}/predictions",
    headers=prediction_headers,
    data=json.dumps(autos),
)
pprint(predictions.json())

curl:

autos='[{
  "cylinders": 4,
  "displacement": 119.0,
  "horsepower": 82.00,
  "weight": 2720.0,
  "acceleration": 19.4,
  "model year": 82,
  "origin":1
},{
  "cylinders": 8,
  "displacement": 120.0,
  "horsepower": 79.00,
  "weight": 2625.0,
  "acceleration": 18.6,
  "model year": 82,
  "origin":1
}]'
curl -X POST \
  -H 'Content-Type: application/json' \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "datarobot-key: ${server_key}" \
  --data "${autos}" \
  "${server_url}/predApi/v1.0/deployments/${deployment_id}/predictions"

Python (requests):

import json
from pprint import pprint

autos = [
    {
        "cylinders": 4,
        "displacement": 119.0,
        "horsepower": 82.00,
        "weight": 2720.0,
        "acceleration": 19.4,
        "model year": 82,
        "origin": 1,
    },
    {
        "cylinders": 8,
        "displacement": 120.0,
        "horsepower": 79.00,
        "weight": 2625.0,
        "acceleration": 18.6,
        "model year": 82,
        "origin": 1,
    },
]
prediction_headers = {
    "Authorization": f"Bearer {datarobot_api_token}",
    "Content-Type": "application/json",
    "datarobot-key": default_server_key,
}
predictions = requests.post(
    f"{default_server_url}/predApi/v1.0/deployments/{deployment_id}/predictions",
    headers=prediction_headers,
    data=json.dumps(autos),
)
pprint(predictions.json())
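
All three examples above send the same request: a JSON list of rows posted to the deployment's prediction URL with three headers. The sketch below gathers those pieces into one function so the URL and header layout is visible in a single place. The function name `build_prediction_request` is hypothetical; the URL path and header names are copied from the examples above.

```python
import json


def build_prediction_request(server_url, deployment_id, api_token,
                             datarobot_key, rows):
    """Return (url, headers, body) for a prediction request.

    Hypothetical helper; it only assembles the request and sends nothing.
    """
    url = (
        f"{server_url.rstrip('/')}/predApi/v1.0/deployments"
        f"/{deployment_id}/predictions"
    )
    headers = {
        "Authorization": f"Bearer {api_token}",
        "Content-Type": "application/json",
        "datarobot-key": datarobot_key,
    }
    # The request body is a JSON array with one object per row to score.
    return url, headers, json.dumps(rows)
```

The result can be passed to `requests.post(url, headers=headers, data=body)`.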

You can also find more examples of how to make predictions with DataRobot in this GitHub repository.

Monitor

DataRobot keeps track of the health of the prediction server over time through various performance metrics.

Python (DataRobot client):

service_stats = deployment.get_service_stats()
pprint(service_stats.metrics)

curl:

curl -s -X GET \
  -H "Authorization: Bearer ${DATAROBOT_API_TOKEN}" \
  -H "Content-Type: application/json" \
  "${DATAROBOT_ENDPOINT}/deployments/${deployment_id}/serviceStats/"

Python (requests):

service_health_headers = {"Authorization": f"Bearer {datarobot_api_token}",
                          "Content-Type": "application/json"}
service_health = requests.get(
    f"{datarobot_endpoint}/deployments/{deployment_id}/serviceStats/",
    headers=service_health_headers,
)
pprint(service_health.json())
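
Printing the metrics is a start; a monitoring script would usually compare them against a threshold. The sketch below shows one way to do that. It is illustrative only: the helper name `error_rate_alerts`, the 5% threshold, and the metric names `userErrorRate` and `serverErrorRate` are assumptions about the shape of the metrics dictionary, so check them against the output printed above before relying on them.

```python
def error_rate_alerts(metrics, max_rate=0.05):
    """Return the names of error-rate metrics that exceed max_rate.

    Hypothetical helper; the metric names are assumed, not confirmed
    by this guide.
    """
    alerts = []
    for name in ("userErrorRate", "serverErrorRate"):
        value = metrics.get(name)
        # Treat a missing or None metric as "no data" rather than a failure.
        if value is not None and value > max_rate:
            alerts.append(name)
    return alerts
```

With the Python client this could be called as `error_rate_alerts(service_stats.metrics)`.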

DataRobot Community Pages

Learn more about the DataRobot platform and machine learning:
DataRobot Community
Using Python with DataRobot

Updated 17 days ago

