Getting Started

Let’s go through the basics of the Crossing Minds API with a few key examples.

Before you can create your own database, you will need to be provided with an API key.

If your organization has already been created in the API by Crossing Minds but you don’t have an account yet, you should ask your API manager to create one for you.

Overview

Most applications and developers will use an official client library in a language we cover, but it’s important to get familiar with the underlying HTTP methods and headers first.

Hello World

In this document we use cURL, so that every detail of each API request stays visible. For instance, let’s consider the root endpoint /:

$ curl https://api.crossingminds.com/
RESPONSE
"Welcome to the Crossing Minds API"

The API response is actually a JSON string.

Indeed, when the request carries no Accept header, the response encoding defaults to JSON.

Examining the response headers with the -i option will confirm this:

$ curl https://api.crossingminds.com/ -i
RESPONSE
HTTP/2 200
content-type: application/json
vary: Accept, Origin
allow: GET, HEAD, OPTIONS
date: Wed, 1 Jan 2020 20:00:00 GMT
via: 1.1 google
alt-svc: clear

"Welcome to the Crossing Minds API"

Data Encoding

You may get an equivalent response by explicitly setting the Accept header to application/json. Later on, we will pipe the output to the jq command to parse JSON. It is not really helpful when the response is a single string, but will be handy to extract pieces of JSON objects:

$ curl https://api.crossingminds.com/ -s -H "Accept: application/json" | jq -r .
RESPONSE
Welcome to the Crossing Minds API

When posting data, cURL uses the encoding application/x-www-form-urlencoded by default, even if the Accept header is set to JSON. To send the data as JSON instead, you also need to set the Content-Type header to application/json.

The Crossing Minds API also supports Python’s binary pickle encoding. To use it, set both the Accept and Content-Type headers to application/xminds-pkl. A binary encoding can significantly improve performance when sending large amounts of data stored as contiguous arrays.

You may display the binary content as a hexadecimal string using the xxd command:

$ curl https://api.crossingminds.com/ -s -H "Accept: application/xminds-pkl" \
  | xxd -p -c99
RESPONSE
8003581d00000057656c636f6d6520746f2043726f7373696e67204d696e64732041504971002e

To confirm that this binary string is a Python value encoded using pickle, we can do:

$ curl https://api.crossingminds.com/ -s -H "Accept: application/xminds-pkl" \
  | python -c "import pickle, sys; print(pickle.loads(sys.stdin.buffer.read()))"
RESPONSE
Welcome to the Crossing Minds API
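
The same check can be sketched offline in Python, without any network call. The snippet below only decodes the hexadecimal dump printed by xxd above; the constant and function names are illustrative and are not part of the API or of the official client.

```python
import pickle

# Hex dump copied verbatim from the xxd output shown above.
PAYLOAD_HEX = (
    "8003581d00000057656c636f6d6520746f2043726f7373696e67204d696e6473"
    "2041504971002e"
)


def decode_pickle_hex(hex_string: str):
    """Turn an xxd-style hex dump back into bytes and unpickle it."""
    raw = bytes.fromhex(hex_string)
    # Only unpickle bytes that come from a source you trust.
    return pickle.loads(raw)


if __name__ == "__main__":
    print(decode_pickle_hex(PAYLOAD_HEX))
```

The first two bytes (80 03) announce pickle protocol 3, and the final byte (2e) is the STOP opcode that ends every pickle stream.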

Of course, you will (hopefully) never have to do such things manually.

When you use the official Python client, the headers are set and the binary response is parsed automatically.

Authentication

To explore further than the root endpoint /, you will need a valid account.

Generally speaking, the Crossing Minds API uses the JWT standard to authenticate requests. The Authentication documentation explains these mechanisms in more detail.

Before creating any database or receiving any recommendation, you will need to authenticate using the root account. Let’s assume the correct values are found in the following environment variables:

export XMINDS_API_ROOT_EMAIL="your.root@your.email.com"
export XMINDS_API_ROOT_PWD="y0urAP1key"

You obtain your first JWT token from the endpoint POST login/root/ using the root email/password combination.

$ curl https://api.crossingminds.com/login/root/ -s \
  -H "Content-Type: application/json" \
  -d '{"email": "'"$XMINDS_API_ROOT_EMAIL"'", "password": "'"$XMINDS_API_ROOT_PWD"'"}' \
  | jq -r .
RESPONSE
{
  "token": "eyJ0eXAiOiJ..."
}

(Note: the bash sequence '"$VAR"' closes the single-quoted string, expands the variable inside double quotes, then reopens the single-quoted string.)

You can extract the token value using jq and store it in the JWT_TOKEN variable:

$ JWT_TOKEN=$( \
  curl https://api.crossingminds.com/login/root/ -s \
  -H "Content-Type: application/json" \
  -d '{"email": "'"$XMINDS_API_ROOT_EMAIL"'", "password": "'"$XMINDS_API_ROOT_PWD"'"}' \
  | jq -r '.token')
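
For comparison, here is a minimal Python sketch of the same login using only the standard library. It assumes the endpoint and JSON fields shown in the cURL command above; the helper names build_root_login_request and fetch_token are hypothetical and are not part of the official client.

```python
import json
import urllib.request

API_URL = "https://api.crossingminds.com"


def build_root_login_request(email: str, password: str) -> urllib.request.Request:
    """Prepare (but do not send) the POST login/root/ request."""
    # json.dumps escapes quotes and backslashes in the credentials,
    # which the shell quoting above cannot do for you.
    body = json.dumps({"email": email, "password": password}).encode("utf-8")
    return urllib.request.Request(
        API_URL + "/login/root/",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def fetch_token(request: urllib.request.Request) -> str:
    """Send the request and return the "token" field of the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)["token"]
```

Calling fetch_token(build_root_login_request(email, password)) performs the actual network call. Keeping the request construction separate makes it easy to inspect the headers and body before anything is sent.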

You may now use this token to authenticate your requests. This is done by using the Authorization header.

For instance, you can list the available databases by calling the endpoint GET databases/.

$ curl https://api.crossingminds.com/databases/ -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  | jq -r .
RESPONSE
{
  "has_next": false,
  "next_page": 1,
  "databases": [
    {
      "id": "dRWccG8KS4x4IJd634qt",
      "name": "My Test Database",
      "description": "This is a test database",
      "item_id_type": "uint32",
      "user_id_type": "uint32"
    },
    {
      "id": "x-x8-hzmXrSxC32W_X-b",
      "name": "My Production Database",
      "description": "The actual database used in production",
      "item_id_type": "hex10",
      "user_id_type": "uuid"
    }
  ]
}

Once you get a token, you won’t need to use a password to authenticate.

In the Authentication documentation you may also read about refresh tokens, which provide a mechanism to renew the short-lived JWT token automatically without having to enter your password again. This is particularly helpful for implementing frontend clients without exposing your API key publicly.
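
Since a JWT is three base64url segments separated by dots, you can inspect its claims locally. The sketch below decodes the middle segment without verifying the signature, so it is only suitable for debugging. Which claims the Crossing Minds API puts in its tokens is not specified here; a standard "exp" expiry claim is an assumption, and the function name is illustrative.

```python
import base64
import json


def decode_jwt_payload(token: str) -> dict:
    """Return the claims of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("a JWT has three dot-separated segments")
    payload = parts[1]
    # base64url segments drop their '=' padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))
```

For example, decode_jwt_payload(token).get("exp") would give the expiry time as a Unix timestamp, if the token carries that claim.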

Using Our Official Python Client

When possible, it’s preferred to use official clients that take care of this boilerplate for you.

As of today we only provide a client for Python 3; more languages will be available in the future.

You can install the Python client using pip with:

pip install xminds

You can now easily reproduce the steps of the previous endpoints by simply calling the client methods.

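import os
from xminds.api.client import CrossingMindsApiClient

# Read the root credentials from the environment variables exported earlier
client = CrossingMindsApiClient()
client.login_root(os.environ["XMINDS_API_ROOT_EMAIL"], os.environ["XMINDS_API_ROOT_PWD"])
client.get_all_databases()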
RESPONSE
{'databases': [
   {'id': 'kfXmRB0ZvNzRj5mC3ckb',
    'name': 'My Test Database',
    'description': 'This is a test database',
    'item_id_type': 'uint32',
    'user_id_type': 'uint32'},
   {'id': 'x-x8-hzmXrSxC32W_X-b',
    'name': 'My Production Database',
    'description': 'The actual database used in production',
    'item_id_type': 'hex10',
    'user_id_type': 'uuid'}],
 'has_next': False,
 'next_page': 1}

Note how the client handles authentication with JWT and encoding transparently.

Preparing Databases

Before you can request recommendations, you will need to upload the training data, in other words tables containing:

  • item data,

  • user data (optional),

  • user-item interactions.

All this data will be gathered under the “database” entity, so you will need to create a new database using POST databases/.

In this example, we configure the database to use integers for the item and user IDs, but you can replace these with whatever types are more appropriate to your needs, as explained in the next sections.

The endpoint returns the database ID. Let’s save it to the DB_ID variable.

$ DB_ID=$(curl https://api.crossingminds.com/databases/ -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
      "name": "My First Database",
      "description": "Longer description...",
      "item_id_type": "uint32",
      "user_id_type": "uint32"
      }' \
  | jq -r '.id')

To select the newly created database in all the following requests, you will need to update the JWT token so that it specifies the database ID.

You can do this by logging in as an individual on the database, using the endpoint POST login/individual/:

$ JWT_TOKEN=$( \
  curl https://api.crossingminds.com/login/individual/ -s \
  -H "Content-Type: application/json" \
  -d '{
      "email": "'"$XMINDS_API_ROOT_EMAIL"'",
      "password": "'"$XMINDS_API_ROOT_PWD"'",
      "db_id": "'"$DB_ID"'"
      }' \
  | jq -r '.token')

Item Data

Let’s go through the steps to upload the items that are available for recommendation.

The items of your database can be updated at any time, so you will be able to add and remove items later on.

Item ID

An item has to have at least an ID, and optionally additional key/value properties.

Databases can be configured with various types of item IDs such as UUIDs or integers. When creating the database above we selected uint32, which stands for 32-bit (4-byte) unsigned integers.
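
A 32-bit unsigned integer ranges from 0 to 4294967295. If your IDs come from another system, a small client-side check such as the sketch below can catch out-of-range values before upload. The function name is illustrative, and whether the API applies further restrictions (for example on the value 0) is not covered here.

```python
UINT32_MAX = 2**32 - 1  # 4294967295


def to_uint32(value) -> int:
    """Convert an ID given as 1 or "1" to an int and check the uint32 range."""
    # Going through str() rejects floats such as 1.5 and booleans,
    # which int() alone would silently accept.
    number = int(str(value))
    if not 0 <= number <= UINT32_MAX:
        raise ValueError(f"{value!r} does not fit in an unsigned 32-bit integer")
    return number
```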

See the Flexible Identifiers documentation to find the available types of item IDs.

Item Property

Enriching the item data with properties will improve the recommendations, and allow your client to dynamically filter the returned items.

Items are abstract key/value maps that follow a schema you specify by creating properties. Values are strongly typed (e.g. floating points, integers, categorical strings), and can be repeated or not.

See the Property Types documentation to find the available property types and filters.

The endpoint POST items-properties/ is used to create item properties. In this example we consider a property 'price' which is a non-repeated float, and a property 'tags' which is a repeated unicode string of (up to) 20 characters.

$ curl https://api.crossingminds.com/items-properties/ -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"property_name": "price", "value_type": "float", "repeated": false}'

$ curl https://api.crossingminds.com/items-properties/ -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"property_name": "tags", "value_type": "unicode20", "repeated": true}'
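
Before uploading, it can help to check your records against the schema you just declared. The sketch below is a purely local check for the two properties of this example; it is not the validation performed by the API, and counting the 20-character limit with Python's len() (code points) is an assumption.

```python
def validate_item(item: dict) -> list:
    """Return a list of problems found in one item; an empty list means none."""
    problems = []
    if "item_id" not in item:
        problems.append("missing item_id")

    # 'price' is a non-repeated float: a single number, if present.
    price = item.get("price")
    if price is not None and (
        isinstance(price, bool) or not isinstance(price, (int, float))
    ):
        problems.append("price must be a number")

    # 'tags' is a repeated unicode20: a list of strings of up to 20 characters.
    tags = item.get("tags", [])
    if not isinstance(tags, list):
        problems.append("tags must be a list")
    else:
        for tag in tags:
            if not isinstance(tag, str) or len(tag) > 20:
                problems.append(f"invalid tag: {tag!r}")
    return problems
```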

You can now create the items of your catalog, using the endpoint PUT items-bulk/.

$ curl -X PUT https://api.crossingminds.com/items-bulk/ -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "items": [
      {"item_id": "1", "price": 9.99, "tags": ["family", "sci-fi"]},
      ...
      {"item_id": "999", "price": 4.49, "tags": ["family"]}
    ]}'

When sending thousands of items or more, it is strongly recommended to split into multiple requests. The optimal size of the batch depends on the amount of data that is sent and on the encoding. A batch size of 500 should be a good starting point.
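
Splitting a catalog into batches can be sketched as follows. The default of 500 is the starting point suggested above; the function name is illustrative, and each yielded batch would then be sent in its own PUT items-bulk/ request.

```python
from typing import Iterator, List, Sequence


def iter_batches(records: Sequence[dict], batch_size: int = 500) -> Iterator[List[dict]]:
    """Yield consecutive slices of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(records), batch_size):
        yield list(records[start:start + batch_size])
```

The same helper works unchanged for users and ratings, since it only depends on the length of the sequence.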

User Data

Sending user data is optional but can improve the recommendation quality, especially for the cold-start problem, where a user has few ratings or none at all.

User entities have exactly the same structure as items. The respective endpoints are POST users-properties/ and PUT users-bulk/.

In the example below, we consider a property 'age' which is a non-repeated int8, and a property 'subscriptions' which is a repeated unicode string of (up to) 10 characters.

$ curl https://api.crossingminds.com/users-properties/ -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"property_name": "age", "value_type": "int8", "repeated": false}'

$ curl https://api.crossingminds.com/users-properties/ -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"property_name": "subscriptions", "value_type": "unicode10", "repeated": true}'

$ curl -X PUT https://api.crossingminds.com/users-bulk/ -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "users": [
      {"user_id": "1", "age": 25, "subscriptions": ["channel1", "channel2"]},
      ...
      {"user_id": "999", "age": 32, "subscriptions": ["channel1"]}
    ]}'

Interactions

In the current version, the Crossing Minds API only accepts blackbox user/item interactions, modeled by a floating point number from 1 (worst) to 10 (best). This real number may represent an explicit rating from the user to the item, or a virtual rating inferred from implicit feedback.
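
How you turn implicit feedback into a virtual rating is up to you. As one hypothetical example (not a method prescribed by the API), the sketch below maps a signal normalized to the range 0 to 1, such as the fraction of a video that was watched, linearly onto the 1 to 10 scale.

```python
def virtual_rating(signal: float) -> float:
    """Map an implicit signal in [0, 1] linearly onto the 1-10 rating scale."""
    # Clamp first so that out-of-range signals never leave the valid scale.
    clamped = min(max(signal, 0.0), 1.0)
    return 1.0 + 9.0 * clamped
```

A linear mapping is the simplest choice; whether it suits your data depends on how the signal is distributed.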

To simplify, let’s assume that you have access to explicit ratings. You can upload them to the API using the endpoint PUT ratings-bulk/.

$ curl -X PUT https://api.crossingminds.com/ratings-bulk/ -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "ratings": [
      {"user_id": 1, "item_id": "1", "rating": 8.5, "timestamp": 1588812345},
      {"user_id": 1, "item_id": "42", "rating": 2.0, "timestamp": 1588854321},
      ...
      {"user_id": 999, "item_id": "123", "rating": 5.5, "timestamp": 1588811111}
    ]}'

Getting Recommendations

Once your new database is created, the first set of machine learning models has to be trained before you can receive recommendations.

Warning

Minimal Amount of Ratings

You need at least 500 ratings for the automatic training to start.

The Crossing Minds API will automatically start training the core machine learning models once 500 ratings are created.

After this initial threshold, the API will retrain the core machine learning models:

  • at least once every week.

  • when the number of ratings reaches an exponentially growing threshold (e.g. 500, 1k, 2k, 4k… ).

This is to ensure the core machine learning models are always using recent data to generate the recommendations.
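
To estimate when the next count-based retraining will happen, you can compute the next threshold locally. The sketch below assumes the thresholds keep doubling as in the example above (500, 1k, 2k, 4k…); the exact schedule used by the API is not specified here, and the function name is illustrative.

```python
def next_retrain_threshold(n_ratings: int, first: int = 500) -> int:
    """Smallest threshold in 500, 1000, 2000, ... strictly above n_ratings."""
    threshold = first
    while threshold <= n_ratings:
        threshold *= 2
    return threshold
```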

Monitoring Background Tasks

In order to get updates on the model retraining, you can check the status of the recent background tasks, using the endpoint GET tasks/<str:task_name>/recents/.

$ curl https://api.crossingminds.com/tasks/ml_model_retrain/recents/ -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  | jq -r '.tasks'
RESPONSE
[
  {
      "name": "ml_model_retrain",
      "start_time": 123456789,
      "status": "RUNNING",
      "progress": "trained model 38c6744c"
  }
]

Once the task "ml_model_retrain" switches to "COMPLETED", the database will be ready to serve recommendations. You can confirm this by calling the endpoint GET databases/current/status/:

$ curl https://api.crossingminds.com/databases/current/status/ -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  | jq -r '.status'
RESPONSE
ready
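
In a script, you will typically poll this status until the database is ready. Here is a minimal polling sketch: fetch_status is a callable you supply (for instance, one that calls GET databases/current/status/ and returns the "status" field), and the interval and maximum number of polls are arbitrary defaults, not values recommended by the API.

```python
import time
from typing import Callable


def wait_until_ready(
    fetch_status: Callable[[], str],
    interval_s: float = 30.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll fetch_status until it returns "ready"; False if it never does."""
    for attempt in range(max_polls):
        if fetch_status() == "ready":
            return True
        # Do not sleep after the last attempt.
        if attempt < max_polls - 1:
            sleep(interval_s)
    return False
```

Passing sleep as a parameter keeps the function easy to test without actually waiting.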

Item-Item Recommendations

The first kind of recommendations we present is “item-item” recommendations.

Given an item ID, the API will return similar items.

Under the hood, the similarity is computed using a hybrid model that leverages both the content-based part (the properties) and the collaborative-filtering part (the interactions).

The endpoint GET recommendation/items/<str:item_id>/items/ returns the IDs of the similar items.

$ curl "https://api.crossingminds.com/recommendation/items/123/items/?amt=3" -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  | jq -r '.items_id'
RESPONSE
[456, 789, 321]
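
When item IDs are strings rather than integers, they should be percent-encoded before being placed in the URL path. The sketch below builds the request URL with the standard library; the function name is illustrative, and amt is the only query parameter used because it is the only one shown above.

```python
from urllib.parse import quote, urlencode

API_URL = "https://api.crossingminds.com"


def item_recommendation_url(item_id, amt: int = 3) -> str:
    """Build the URL of GET recommendation/items/<item_id>/items/."""
    # safe="" also encodes "/" so an ID can never change the path structure.
    path = f"/recommendation/items/{quote(str(item_id), safe='')}/items/"
    return f"{API_URL}{path}?{urlencode({'amt': amt})}"
```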

User-Item Recommendations

Let’s now consider “user-item” recommendations.

Given a user ID, the API will return items that match the profile of this user.

This is the most common endpoint to get recommendations for users that are already in the database, such as users who are signed in with your application. See Under the Hood to read on the underlying technologies.

The endpoint GET recommendation/users/<str:user_id>/items/ returns the IDs of the recommended items.

$ curl "https://api.crossingminds.com/recommendation/users/111/items/?amt=3" -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  | jq -r '.items_id'
RESPONSE
[111, 222, 333]

Session-Item Recommendations

The third kind of recommendations is “session-item”.

You may use this endpoint when you need to generate recommendations for a user who is not already signed up in your application, and therefore not in the database.

In order to compute personalized recommendations, you will need to provide the set of ratings of the anonymous session (which are typically virtual ratings from implicit feedback) and/or the user properties for this session.

The endpoint POST recommendation/sessions/items/ returns the IDs of the recommended items.

$ curl "https://api.crossingminds.com/recommendation/sessions/items/?amt=3" -s \
  -H "Authorization: Bearer $JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
      "ratings": [
        {"item_id": 1, "rating": 8.5},
        ...
        {"item_id": 42, "rating": 2.0}
      ],
      "user_properties": {"age": 25}
    }' \
  | jq -r '.items_id'
RESPONSE
[111, 222, 333]

This endpoint uses the HTTP verb POST so that the ratings can be sent as JSON in the request body, but it behaves like a GET endpoint: it does not alter any state. In particular, the ratings of the session will not be used to update the machine learning models. To have them taken into account, you will need to create a user ID and then send the ratings of the new user using the endpoint PUT users/<str:user_id>/ratings/.
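
Building the session request body in code avoids hand-written JSON mistakes such as stray trailing commas. The sketch below prepares the request with the standard library, using the fields shown in the cURL example; the function name is hypothetical and nothing is sent until the request is passed to urllib.request.urlopen.

```python
import json
import urllib.request

API_URL = "https://api.crossingminds.com"


def build_session_request(token, ratings, user_properties=None, amt=3):
    """Prepare (but do not send) POST recommendation/sessions/items/."""
    payload = {"ratings": list(ratings)}
    # user_properties is optional; leave it out entirely when not provided.
    if user_properties:
        payload["user_properties"] = user_properties
    return urllib.request.Request(
        f"{API_URL}/recommendation/sessions/items/?amt={int(amt)}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```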