title | excerpt | updated |
---|---|---|
Discover the OVH Prescience APIs | Learn how to manage OVH Prescience APIs | 2018-09-26 |
Prescience is a machine learning tool that can be managed through several APIs, which let you automate a wide range of actions.
This guide introduces those APIs and shows you how to manage your own OVH Prescience platform.
API | URL | Description |
---|---|---|
Prescience API | https://prescience-api.ai.ovh.net | Manages Prescience “sources”, “datasets” and “models”. |
Prescience Serving | https://prescience-serving.ai.ovh.net | Evaluates a model that was generated by Prescience. |
Using Prescience requires an authentication token.
Here is an example of an API call:
curl -X GET "https://prescience-api.ai.ovh.net/project" -H "Authorization: Bearer ${TOKEN}"
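Since every call needs the same header, the request can be wrapped in a small helper. This is a minimal sketch, not an official client: the function names `prescience_url` and `prescience_get` are hypothetical, and the token is assumed to be available in the `TOKEN` environment variable, as in the call above.

```shell
# Base URL of the Prescience API, taken from the table above.
PRESCIENCE_API="https://prescience-api.ai.ovh.net"

# Join the base URL and a path, tolerating a missing leading slash.
prescience_url() {
  local path="${1#/}"
  printf '%s/%s\n' "$PRESCIENCE_API" "$path"
}

# Send an authenticated GET request (hypothetical helper).
# --fail makes curl exit with a non-zero status on HTTP errors,
# for example when the token is rejected.
prescience_get() {
  : "${TOKEN:?TOKEN must be set}"
  curl --silent --show-error --fail \
    -H "Authorization: Bearer ${TOKEN}" \
    "$(prescience_url "$1")"
}
```

With this in place, the call above becomes `prescience_get /project`.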
The “source” object is the result of a parsing (analysis) task. The object returned by the API contains the following fields:
Field | Description | Type | Orderable | Filterable |
---|---|---|---|---|
source_id | Source identifier | String | Yes | No |
input_url | Internal URL of the file before parsing | String | No | No |
source_url | Internal URL of the parsed file | String | No | No |
input_type | Type of source file | String | Yes | No |
headers | Whether the file before parsing contains headers | Boolean | Yes | No |
separator | Separator of the file before parsing, if CSV | String | No | No |
diagram | JSON string representing the diagram | String | No | No |
status | Source status | Status | Yes | No |
last_update | Date of last update | Timestamp | Yes | No |
created_at | Creation date | Timestamp | Yes | No |
total_step | Total number of steps in the parsing process | Integer | No | No |
current_step | Current step in the parsing process | Integer | No | No |
current_step_description | Description of the current step in the parsing process | String | No | No |
List of sources:
GET https://prescience-api.ai.ovh.net/source
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
Page | Integer | Query | No | 1 | Page number | 2 |
Size | Integer | Query | No | 100 | Number of items per page | 50 |
Sort_column | String | Query | No | created_at | Field by which results are ordered | source_id |
Sort_direction | String | Query | No | | Direction in which results are ordered | |
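The query parameters are appended to the URL as an ordinary query string. The sketch below builds such a URL; `list_url` is a hypothetical helper name, the parameter names are copied as written in the table above, and values are not URL-encoded, so it only suits simple values such as numbers and identifiers.

```shell
PRESCIENCE_API="https://prescience-api.ai.ovh.net"

# Build "<base>/<resource>?k1=v1&k2=v2" from a resource name
# followed by any number of key=value pairs.
list_url() {
  local resource="$1" query="" pair
  shift
  for pair in "$@"; do
    # Prefix with "&" only when a previous pair is already present.
    query="${query:+$query&}$pair"
  done
  # Emit the "?" separator only when there is at least one parameter.
  printf '%s/%s%s\n' "$PRESCIENCE_API" "$resource" "${query:+?$query}"
}
```

For example, `curl -H "Authorization: Bearer ${TOKEN}" "$(list_url source Page=2 Size=50)"` would request the second page of 50 sources. The URL is quoted so that the shell does not interpret the `&` character.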
Source retrieval:
GET https://prescience-api.ai.ovh.net/source/{id_source}
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
id_source | String | Path | Yes | | Source identifier | my_source |
Source deletion:
DELETE https://prescience-api.ai.ovh.net/source/{id_source}
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
id_source | String | Path | Yes | | Source identifier | my_source |
The “dataset” object is the result of a “preprocessing” task. The object returned by the API contains the following fields:
Field | Description | Type | Orderable | Filterable |
---|---|---|---|---|
dataset_id | Dataset identifier | String | Yes | Yes |
source | “Source” object that generated the dataset | Source | No | Yes |
dataset_url | Internal URL of the file produced by preprocessing | String | No | No |
transformation_url | Internal URL of the transformation PMML file | String | No | No |
label_id | Identifier of the “label” column | String | Yes | No |
problem_type | Type of machine learning problem (“Classification” / “Regression”) | String | Yes | No |
nb_fold | Number of folds created during preprocessing | Integer | Yes | No |
selected_columns | List of columns selected from the source | String[] | No | No |
diagram | JSON string representing the diagram | String | No | No |
status | Dataset status | Status | Yes | No |
last_update | Date of last update | Timestamp | Yes | No |
created_at | Creation date | Timestamp | Yes | No |
total_step | Total number of steps in the preprocessing | Integer | No | No |
current_step | Current step of the preprocessing | Integer | No | No |
current_step_description | Description of the current step of the preprocessing | String | No | No |
List of datasets:
GET https://prescience-api.ai.ovh.net/dataset/
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
Page | Integer | Query | No | 1 | Page number | 2 |
Size | Integer | Query | No | 100 | Number of items per page | 50 |
Sort_column | String | Query | No | created_at | Field by which results are ordered | source_id |
Sort_direction | String | Query | No | | Direction in which results are ordered | |
Dataset_id | String | Query | No | | Filter on the dataset name (LIKE-style search) | dataset |
Source_id | String | Query | No | | Filter on the name of the dataset's source (LIKE-style search) | source |
Dataset retrieval:
GET https://prescience-api.ai.ovh.net/dataset/{id_dataset}
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
id_dataset | String | Path | Yes | | Dataset identifier | my_dataset |
Deleting a dataset:
DELETE https://prescience-api.ai.ovh.net/dataset/{id_dataset}
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
id_dataset | String | Path | Yes | | Dataset identifier | my_dataset |
The “model” object is the result of a “train” task. The object returned by the API contains the following fields:
Field | Description | Type | Orderable | Filterable |
---|---|---|---|---|
model_id | Model identifier | String | Yes | No |
dataset | “Dataset” object that generated the model | Dataset | No | Yes |
label_id | Identifier of the “label” column | String | Yes | No |
config | “Config” object that generated the model | Config | No | No |
status | Model status | Status | Yes | No |
last_update | Date of last update | Timestamp | Yes | No |
created_at | Creation date | Timestamp | Yes | No |
total_step | Total number of steps in the “train” process | Integer | No | No |
current_step | Current step of the “train” process | Integer | No | No |
current_step_description | Description of the current step of the “train” process | String | No | No |
The “config” object describes the configuration used to generate the machine learning model.
Field | Description | Type |
---|---|---|
name | Name of the algorithm used | String |
class_identifier | Internal identifier | String |
kwargs | Model hyperparameters | Dictionary |
Model list:
GET https://prescience-api.ai.ovh.net/model
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
Page | Integer | Query | No | 1 | Desired page number | 2 |
Size | Integer | Query | No | 100 | Desired number of items per page | 50 |
Sort_column | String | Query | No | created_at | Field by which results are ordered | model_id |
Sort_direction | String | Query | No | | Direction in which results are ordered | |
Dataset_id | String | Query | No | | Filter on the dataset name (LIKE-style search) | dataset |
Model retrieval:
GET https://prescience-api.ai.ovh.net/model/{id_model}
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
id_model | String | Path | Yes | | Model identifier | my_model |
Deleting a model:
DELETE https://prescience-api.ai.ovh.net/model/{id_model}
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
id_model | String | Path | Yes | | Model identifier | my_model |
To create a “source”, you need to launch a parsing task.
POST https://prescience-api.ai.ovh.net/ml/upload/source
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
parse.source_id | String | Multipart part “parse” (JSON) | Yes | | Source name | my-source |
parse.input_type | String | Multipart part “parse” (JSON) | Yes | | File format (CSV or Parquet only) | CSV |
parse.separator | String | Multipart part “parse” (JSON) | No | , | Separator, in the case of a CSV file | ; |
files | Files | Multipart parts named input-file-{index} | No | | Files to upload (one or more) | input-file-0 |
For example, assuming that the CSV files “data-1.csv” and “data-2.csv” are in the same directory as the following parse.json file:
{
"source_id": "my-source",
"input_type": "csv",
"separator": ","
}
curl -H "Authorization: Bearer ${TOKEN}" -v \
-F parse='@parse.json;type=application/json' \
-F input-file-0=@data-1.csv \
-F input-file-1=@data-2.csv \
https://prescience-api.ai.ovh.net/ml/upload/source
Warning
The source returned in the response is incomplete. Because the task is asynchronous, its fields are filled in as the task progresses.
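Because the returned source is filled in over time, a script usually has to poll it. The sketch below reads the `current_step` and `total_step` fields described earlier from the source retrieval endpoint. It rests on several assumptions: that these fields appear as plain integers in the JSON response, that reaching `total_step` means the task has finished (the `status` field may be the more reliable signal), and the helper names are hypothetical. The grep-based extraction is deliberately naive; a JSON parser such as `jq` would be more robust.

```shell
PRESCIENCE_API="https://prescience-api.ai.ovh.net"

# Extract the first integer value of a given field from JSON on stdin.
json_int_field() {
  local field="$1"
  grep -o "\"${field}\"[[:space:]]*:[[:space:]]*[0-9][0-9]*" \
    | head -n 1 | grep -o '[0-9][0-9]*$'
}

# Read a source object on stdin and print "current_step/total_step".
# A missing field is printed as "?".
source_progress() {
  local json current total
  json="$(cat)"
  current="$(printf '%s' "$json" | json_int_field current_step)"
  total="$(printf '%s' "$json" | json_int_field total_step)"
  printf '%s/%s\n' "${current:-?}" "${total:-?}"
}

# Poll a source every 5 seconds until current_step equals total_step
# (assumed completion condition, see the lead-in).
wait_for_source() {
  local source_id="$1" body progress
  while :; do
    body="$(curl --silent --fail -H "Authorization: Bearer ${TOKEN}" \
      "${PRESCIENCE_API}/source/${source_id}")" || return 1
    progress="$(printf '%s' "$body" | source_progress)"
    echo "parsing progress: ${progress}"
    [ "${progress%/*}" != "?" ] && [ "${progress%/*}" = "${progress#*/}" ] && break
    sleep 5
  done
}
```

After the upload, `wait_for_source my-source` would block until the parsing steps are reported as complete.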
To create a "dataset", first generate a "source", then launch a preprocess task on it.
POST https://prescience-api.ai.ovh.net/ml/preprocess/{source_id}
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
source_id | String | Path | Yes | | Name of the source to preprocess | my-source |
dataset_id | String | Body JSON | Yes | | Name of the future dataset | my-big-dataset |
label_id | String | Body JSON | Yes | | Identifier of the dataset column to use as the label | my-label |
nb_fold | String | Body JSON | No | 10 | Number of folds to create during preprocessing | 6 |
problem_type | String | Body JSON | Yes | | Type of machine learning problem (classification / regression) | regression |
selected_columns | String[] | Body JSON | No | [] | Columns to select for the dataset. By default, all columns are selected | ["column_1", "column_2"] |
For example, with the following preprocess.json file:
{
"dataset_id": "my-dataset",
"label_id": "my-label",
"problem_type": "classification"
}
curl -H "Authorization: Bearer ${TOKEN}" \
-H "Content-Type:application/json" \
-X POST https://prescience-api.ai.ovh.net/ml/preprocess/my-source \
--data-binary "@preprocess.json"
Warning
The dataset returned in the response is incomplete. Because the task is asynchronous, its fields are filled in as the task progresses.
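The request body for the preprocess task can also be generated from shell variables instead of being written by hand. The following is a minimal sketch covering only the three required fields; `preprocess_json` is a hypothetical helper name, and the values are inserted without JSON escaping, so they must not contain quotes or backslashes.

```shell
# Print a preprocess request body for the given dataset name,
# label column and problem type.
preprocess_json() {
  local dataset_id="$1" label_id="$2" problem_type="$3"
  # Only the two problem types listed in the parameter table are accepted.
  case "$problem_type" in
    classification|regression) ;;
    *)
      echo "problem_type must be classification or regression" >&2
      return 1
      ;;
  esac
  printf '{"dataset_id": "%s", "label_id": "%s", "problem_type": "%s"}\n' \
    "$dataset_id" "$label_id" "$problem_type"
}
```

For example, `preprocess_json my-dataset my-label classification > preprocess.json` produces a file equivalent to the one used in the request above.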
Once the dataset has been created, you can launch an optimisation task on it.
POST https://prescience-api.ai.ovh.net/ml/optimize/{dataset_id}
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
dataset_id | String | Path | Yes | | Name of the dataset to optimise | my-big-dataset |
scoring_metric | String | Body JSON | Yes | | Optimisation metric (regression: mae, mse, R2; classification: accuracy, f1, roc_auc) | roc_auc |
budget | Integer | Body JSON | | 6 | Budget allocated to the optimisation | 10 |
For example, with the following optimize.json file:
{
"scoring_metric": "roc_auc",
"budget": 6
}
curl -H "Authorization: Bearer ${TOKEN}" \
-H "Content-Type:application/json" \
-X POST https://prescience-api.ai.ovh.net/ml/optimize/my-big-dataset \
--data-binary "@optimize.json"
Warning
The optimisation task returns an object called "Optimization". Once the optimisation is complete, it will be possible to run a query on the "Evaluation-Result" objects to obtain the best possible configuration.
The "Evaluation-Result" object is the result of an optimisation task. The object returned by the API contains the following fields:
Field | Description | Type |
---|---|---|
uuid | UUID of the evaluation | String |
spent_time | Time spent evaluating the configuration | Integer |
costs | Dictionary containing the metrics associated with the configuration | Dict{} |
config | Tested configuration | Config |
status | Evaluation status | Status |
last_update | Date of last update | Timestamp |
created_at | Creation date | Timestamp |
total_step | Total number of steps in the optimisation process | Integer |
current_step | Current step of the optimisation process | Integer |
current_step_description | Description of the current step of the optimisation process | String |
Evaluation list:
GET https://prescience-api.ai.ovh.net/evaluation-result
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
Dataset_id | String | Query | Yes | | Filters evaluations by dataset | my-big-dataset |
Page | Integer | Query | No | 1 | Desired page number | 2 |
Size | Integer | Query | No | 100 | Desired number of items per page | 50 |
Sort_column | String | Query | No | created_at | Field by which results are ordered | source_id |
Sort_direction | String | Query | No | | Direction in which results are ordered | |
Status | String | Query | No | | Filters evaluations by status | BUILT |
After choosing the best configuration from the list of "Evaluation-Result" objects, you can train a model:
POST https://prescience-api.ai.ovh.net/ml/train
Parameter | Type | In | Required | Default | Meaning | Example |
---|---|---|---|---|---|---|
model_id | String | Query | Yes | | Name of the future model | my-model |
evaluation_uuid | String | Query | Yes | | Evaluation-Result identifier | bcaef619-4bf3-4c15-b49f-bc325f98d891 |
dataset_id | String | Query | No | dataset_id linked to the Evaluation-Result | Only needed when training on a dataset other than the Evaluation-Result's dataset | my-alternative-dataset |
For example:
curl -H "Authorization: Bearer ${TOKEN}" \
-H "Content-Type:application/json" \
-X POST "https://prescience-api.ai.ovh.net/ml/train/?model_id=my-model&evaluation_uuid=bcaef619-4bf3-4c15-b49f-bc325f98d891"
Warning
The training task returns an incomplete model object. Because the task is asynchronous, its fields are filled in as the task progresses.
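The training parameters are all passed in the query string, including the optional `dataset_id`. The sketch below assembles the URL from its parts; `train_url` and `start_training` are hypothetical helper names, and the values are assumed to need no URL encoding (plain identifiers and UUIDs).

```shell
PRESCIENCE_API="https://prescience-api.ai.ovh.net"

# Build the training URL. The third argument (dataset_id) is optional
# and only added when training on an alternative dataset.
train_url() {
  local model_id="$1" evaluation_uuid="$2" dataset_id="${3:-}"
  local url="${PRESCIENCE_API}/ml/train/?model_id=${model_id}&evaluation_uuid=${evaluation_uuid}"
  if [ -n "$dataset_id" ]; then
    url="${url}&dataset_id=${dataset_id}"
  fi
  printf '%s\n' "$url"
}

# Launch the training task (hypothetical helper).
# The URL is quoted so that the shell does not treat "&" as the
# background operator.
start_training() {
  : "${TOKEN:?TOKEN must be set}"
  curl -H "Authorization: Bearer ${TOKEN}" -X POST "$(train_url "$@")"
}
```

For example, `start_training my-model bcaef619-4bf3-4c15-b49f-bc325f98d891` would send the same kind of request as the curl call above.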
Once a model is trained, it can be used to make inferences.
Warning
Both APIs have a "model" object, but the two do not have the same structure. Only the model_id identifier is common to both.
Model description:
POST https://prescience-serving.ai.ovh.net/model/{model_id}
The returned object describes the "model" object according to Prescience Serving.
Example of result:
{
"id": "model",
"properties": {
"created.timestamp": 1537170170985,
"accessed.timestamp": null,
"file.size": 3737,
"file.md5sum": "a13e6e482bb2e62d1376b502f8cbc8a2"
},
"schema": {
"argumentsFields": [{
"id": "hours-per-week",
"dataType": "integer",
"opType": "ordinal"
}, {
"id": "capital-gain",
"dataType": "integer",
"opType": "ordinal"
}, {
"id": "education-num",
"dataType": "integer",
"opType": "ordinal"
}, {
"id": "age",
"dataType": "integer",
"opType": "ordinal"
}, {
"id": "fnlwgt",
"dataType": "integer",
"opType": "ordinal"
}, {
"id": "capital-loss",
"dataType": "integer",
"opType": "ordinal"
}],
"transformFields": [{
"id": "imputed_hours-per-week",
"dataType": "integer",
"opType": "ordinal"
}, {
"id": "imputed_capital-gain",
"dataType": "integer",
"opType": "ordinal"
}, {
"id": "imputed_education-num",
"dataType": "integer",
"opType": "ordinal"
}, {
"id": "imputed_age",
"dataType": "integer",
"opType": "ordinal"
}, {
"id": "imputed_fnlwgt",
"dataType": "integer",
"opType": "ordinal"
}, {
"id": "imputed_capital-loss",
"dataType": "integer",
"opType": "ordinal"
}, {
"id": "scaled_imputed_hours-per-week",
"dataType": "double",
"opType": "continuous"
}, {
"id": "scaled_imputed_capital-gain",
"dataType": "double",
"opType": "continuous"
}, {
"id": "scaled_imputed_education-num",
"dataType": "double",
"opType": "continuous"
}, {
"id": "scaled_imputed_age",
"dataType": "double",
"opType": "continuous"
}, {
"id": "scaled_imputed_fnlwgt",
"dataType": "double",
"opType": "continuous"
}, {
"id": "scaled_imputed_capital-loss",
"dataType": "double",
"opType": "continuous"
}]
}
}
Warning
During the preprocessing stage, a data transformation is performed. Since the model is based on the output of this transformation, it is imperative that the data is transformed before using the model. Prescience Serving provides methods of performing both this transformation and the inference.
The Serving platform allows you to perform the following:
- Transformation and evaluation
- Evaluation only
- Transformation only
Parameter | Type | In | Required | Default | Meaning |
---|---|---|---|---|---|
id | String | JSON | No | | Query identifier |
arguments | Dict | JSON | Yes | | Query arguments |
Example of a single inference, with the following example.json file:
{
"arguments": {
"hours-per-week": 1,
"capital-gain": 1,
"education-num": 1,
"age": 1,
"fnlwgt": 1,
"capital-loss": 1
}
}
Query
curl -H "Authorization: Bearer ${TOKEN}" \
-H "Content-Type:application/json" \
-X POST https://prescience-serving.ai.ovh.net/eval/my-model/transform-model \
--data-binary "@example.json"
Example of the evaluation of a JSON batch, with the following example.json file:
[
{
"id": "eval-1",
"arguments": {
"hours-per-week": 1,
"capital-gain": 1,
"education-num": 1,
"age": 1,
"fnlwgt": 1,
"capital-loss": 1
}
},
{
"id": "eval-2",
"arguments": {
"hours-per-week": 1,
"capital-gain": 1,
"education-num": 1,
"age": 1,
"fnlwgt": 1,
"capital-loss": 1
}
}
]
Query
curl -H "Authorization: Bearer ${TOKEN}" \
-H "Content-Type:application/json" \
-X POST https://prescience-serving.ai.ovh.net/eval/my-model/transform-model/batch/json \
--data-binary "@example.json"
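A batch file like the one above can be generated rather than written by hand. This is a minimal sketch with hypothetical helper names (`eval_entry`, `eval_batch`). It assumes that every argument value is a plain JSON number, as in the example, and that identifiers and field names need no JSON escaping.

```shell
# Print one batch entry from an identifier and key=value arguments.
# Values are emitted verbatim, so they must be valid JSON numbers.
eval_entry() {
  local id="$1" args="" pair
  shift
  for pair in "$@"; do
    # Split each pair at the first "=" into field name and value.
    args="${args:+$args,}\"${pair%%=*}\":${pair#*=}"
  done
  printf '{"id":"%s","arguments":{%s}}' "$id" "$args"
}

# Wrap any number of entries into a JSON array.
eval_batch() {
  local out="" entry
  for entry in "$@"; do
    out="${out:+$out,}$entry"
  done
  printf '[%s]\n' "$out"
}
```

For example, `eval_batch "$(eval_entry eval-1 age=1 fnlwgt=1)" "$(eval_entry eval-2 age=1 fnlwgt=1)" > example.json` writes a two-entry batch, to which the remaining model arguments would be added in the same way.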
Join our community of users on https://community.ovh.com/en/.