Storage API

The Storage API is a scalable document store designed to warehouse and index structured and unstructured data points, along with their annotations. All URLs for the Storage API are relative to https://data.pavlov.ai.

Authentication

curl "API_ENDPOINT_HERE" \
  -H "Authorization: Bearer $AUTH_TOKEN"

All API endpoints require a valid auth token. The token may be passed as the username or password of the HTTP Basic scheme, or as the token of the HTTP Bearer scheme, like so:

Authorization: Bearer $AUTH_TOKEN
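As a rough illustration of the two schemes, the sketch below builds the Authorization header value for each. The function names are hypothetical, and leaving the unused half of the Basic credential pair empty is an assumption, not something this reference specifies.

```python
import base64


def bearer_header(token: str) -> dict:
    """Header for the HTTP Bearer scheme."""
    return {"Authorization": f"Bearer {token}"}


def basic_header(token: str, as_password: bool = False) -> dict:
    """Header for the HTTP Basic scheme, with the token as username or password.

    Assumption: the other half of the "username:password" pair is left empty.
    """
    userpass = f":{token}" if as_password else f"{token}:"
    encoded = base64.b64encode(userpass.encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}
```

With curl, the Basic form corresponds to passing the credentials with the -u option instead of an explicit header.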

Databases

Databases are the highest level of organization within the Storage API. Each database contains a heterogeneous collection of documents and their associated annotations. Each database has an owner who can manage the documents and annotations in the database.

GET /

curl "https://data.pavlov.ai/" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Lists all databases.

Returns

a page of databases

POST /

curl -X POST "https://data.pavlov.ai/" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  --form name="My Database" \
  --form desc="This is where I put my data."

Creates a database.

Parameter Default Description
name required a human-readable name of the database
desc required a human-readable description of the database

Returns

the newly created database

GET /:database

curl "https://data.pavlov.ai/$DATABASE" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Finds a database.

Returns

the corresponding database

PUT /:database

curl -X PUT "https://data.pavlov.ai/$DATABASE" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  --form name="My Database" \
  --form desc="This is where I put my data."

Updates a database’s metadata.

Parameter Default Description
name required a human-readable name of the database
desc required a human-readable description of the database

Returns

the updated database

DELETE /:database

curl -X DELETE "https://data.pavlov.ai/$DATABASE" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Deletes a database, including all tables, documents, and annotations.

Returns

true upon success

Documents & Tables

Documents are data points assigned random unique identifiers based on UUID v4. All documents belong to a table, whose name must be alphanumeric and cannot begin with a number.
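The naming and identifier rules above can be checked on the client before a request is sent. This is a minimal sketch: the helper names are hypothetical, and it assumes "alphanumeric" means ASCII letters and digits only.

```python
import re
import uuid

# Assumption: ASCII letters and digits only, and the first character is a letter.
TABLE_NAME = re.compile(r"[A-Za-z][A-Za-z0-9]*")


def is_valid_table_name(name: str) -> bool:
    """True if the name is alphanumeric and does not begin with a number."""
    return TABLE_NAME.fullmatch(name) is not None


def is_uuid4(value: str) -> bool:
    """True if the string parses as a version 4 UUID."""
    try:
        parsed = uuid.UUID(value)
    except ValueError:
        return False
    return parsed.version == 4
```

Under this rule a table name can never collide with the underscore-prefixed path segments such as _types and _items.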

Documents may be indexed by arbitrary fields. You can create indices on a table by specifying the JSONPath within the document that you’d like indexed, along with the type of the index.

For methods accepting request bodies (i.e. POST, PUT, and PATCH), the request body specifies the document’s fields and files.

GET /:database/_types

curl "https://data.pavlov.ai/$DATABASE/_types" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Lists all tables in the database.

Returns

a page of tables

GET /:database/_items

curl "https://data.pavlov.ai/$DATABASE/_items?fetchSize=100" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Lists all documents in the database.

Parameter Default Description
fetchSize 5000 maximum number of documents to return per page

Returns

a page of documents

GET /:database/:table

curl "https://data.pavlov.ai/$DATABASE/$TABLE" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Finds a table.

Returns

the corresponding table

PUT /:database/:table

curl -X PUT "https://data.pavlov.ai/$DATABASE/$TABLE" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  --form schema="$JSON_SCHEMA" \
  --form indices="$INDICES"

Creates or replaces a table.

Tables optionally have an associated JSON Schema to which documents must always conform. The fields, files, and annotations of the document may be validated using this schema. Learn more about JSON Schema.

Tables may have indices associated with them that allow you to search and filter by arbitrary paths in the JSON object. Learn more about index definition syntax.

Parameter Default Description
schema null the JSON Schema for documents to conform to
indices null a JSON object of index names to index configurations

Returns

the new or updated table

DELETE /:database/:table

curl -X DELETE "https://data.pavlov.ai/$DATABASE/$TABLE" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Deletes a table and all associated documents.

Returns

true upon success

POST /:database/:table

curl -X POST "https://data.pavlov.ai/$DATABASE/$TABLE" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  --form my_custom_field=some_value \
  --form my_custom_file=@some_file.txt

Creates a document.

Returns

the newly created document

GET /:database/:table/_items

curl -G "https://data.pavlov.ai/$DATABASE/$TABLE/_items" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  --data-urlencode query="$INDEX_QUERY"

Lists documents in a table.

You may take advantage of indices to retrieve a filtered or ordered subset of the table you’re requesting. Learn more about query syntax.

Parameter Default Description
query optional the JSON-encoded query to execute

Returns

a page of documents

DELETE /:database/:table/_items

curl -X DELETE "https://data.pavlov.ai/$DATABASE/$TABLE/_items" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Deletes documents in a table in the database.

Returns

true upon success

GET /:database/:table/:document

curl "https://data.pavlov.ai/$DATABASE/$TABLE/$DOCUMENT" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Finds a document.

Returns

the corresponding document

PUT /:database/:table/:document

curl -X PUT "https://data.pavlov.ai/$DATABASE/$TABLE/$DOCUMENT" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  --form my_custom_field=some_value \
  --form my_custom_file=@some_file.txt

Updates a document.

The document’s current fields and files will be replaced with the request body’s fields and files.

Returns

the updated document

PATCH /:database/:table/:document

curl -X PATCH "https://data.pavlov.ai/$DATABASE/$TABLE/$DOCUMENT" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  --form my_custom_field=some_value \
  --form my_custom_file=@some_file.txt

Merges a document.

The document’s current fields and files will be merged with the request body’s fields and files. The request body’s fields and files will take precedence if a conflict occurs. The merge is shallow, meaning only top-level keys will be considered for merging.
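To make the difference between PUT and PATCH concrete, here is a small local model of the two behaviors using plain dictionaries. It only illustrates the semantics described above, not the server's implementation, and the function names are hypothetical.

```python
def put_replace(current: dict, body: dict) -> dict:
    """PUT: the request body replaces the current fields entirely."""
    return dict(body)


def patch_merge(current: dict, body: dict) -> dict:
    """PATCH: shallow merge; top-level keys in the body win on conflict."""
    return {**current, **body}


current = {"title": "old", "meta": {"lang": "en", "pages": 3}}
body = {"meta": {"lang": "fr"}}

merged = patch_merge(current, body)
# "title" survives because the body does not mention it, while "meta" is
# replaced wholesale: the nested "pages" key is not carried over.
```

Because the merge is shallow, updating a single nested key means sending the whole top-level value that contains it.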

Returns

the merged document

DELETE /:database/:table/:document

curl -X DELETE "https://data.pavlov.ai/$DATABASE/$TABLE/$DOCUMENT" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Deletes a document and its annotations.

Returns

true upon success

GET /:database/:table/:document/_revisions

curl "https://data.pavlov.ai/$DATABASE/$TABLE/$DOCUMENT/_revisions" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Finds a document’s revisions. Older revisions are retained for 2 weeks.

Returns

a page of document revisions

Annotations

Documents may be annotated with rich tags. Annotations have three parts: a source, a tag, and a score. The source is the user identifier of whoever created the annotation. The tag is arbitrary JSON or a plain string that provides the content of the annotation. The score is a value between 0 and 1 inclusive indicating the annotator’s confidence.

POST /:database/:table/:document

curl -X POST "https://data.pavlov.ai/$DATABASE/$TABLE/$DOCUMENT" \
  -H "Authorization: Bearer $AUTH_TOKEN" \
  --form tag="this tag rules" \
  --form score=0.999

Annotates a document.

The annotation’s identifier will be a UUID v4 without dashes.

Parameter Default Description
tag required arbitrary JSON defining the metadata of the annotation
score required confidence in the annotation, in the range [0, 1]

Returns

the newly created annotation
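A client might validate an annotation before submitting it. The sketch below checks the score range and recognizes identifiers of the documented shape (a UUID v4 without dashes). The helper names are hypothetical, and JSON-encoding non-string tags is an assumption about how the tag form field should be serialized.

```python
import json
import uuid


def annotation_form(tag, score: float) -> dict:
    """Build the form fields for an annotation, validating the score."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1 inclusive")
    # Assumption: plain strings are sent as-is, anything else is JSON-encoded.
    tag_value = tag if isinstance(tag, str) else json.dumps(tag)
    return {"tag": tag_value, "score": str(score)}


def looks_like_annotation_id(value: str) -> bool:
    """True for a version 4 UUID written as 32 hex characters without dashes."""
    if len(value) != 32:
        return False
    try:
        return uuid.UUID(hex=value).version == 4
    except ValueError:
        return False
```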

GET /:database/:table/:document/:annotation

curl "https://data.pavlov.ai/$DATABASE/$TABLE/$DOCUMENT/$ANNOTATION" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Finds an annotation.

Returns

the corresponding annotation

DELETE /:database/:table/:document/:annotation

curl -X DELETE "https://data.pavlov.ai/$DATABASE/$TABLE/$DOCUMENT/$ANNOTATION" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Deletes an annotation on a document.

Returns

true upon success

Indices

Example: Imagine we’re defining an event table and want to search for events made by a particular user in the last 30 days. The definition below declares three indices: primary, username, and event_time. The primary index refers to the id field. The secondary indices on the username and event_time fields allow you to query for exact matches on usernames, and for both exact matches and ranges of event times.

{
  "primary": {
    "type": "string",
    "options": {
      "path": "$.fields.id"
    }
  },
  "username": {
    "type": "string",
    "options": {
      "path": "$.fields.username"
    }
  },
  "event_time": {
    "type": "date",
    "options": {
      "path": "$.fields.event_time"
    }
  }
}

Now assume you’re making a query on a table with the above index definition. You can find all of user kern’s events after a particular date sorted in reverse chronological order using the following index query:

{
  "filter": [
    {
      "index": "username",
      "value": "kern"
    },
    {
      "index": "event_time",
      "from": "2017-03-29T20:24:30.591Z"
    }
  ],
  "sort": {
    "index": "event_time",
    "reverse": true
  }
}

Indices are a powerful way to filter and sort documents in a table. You can define an index on the value located at a specific JSONPath within each document of a table. Each index has a name, a type, and a path.

Index names are up to you to decide. There is a special index primary that is unique within a table. The types may be:

Index queries are objects with two fields: filter and sort. Both filters and sorts reference an index. Filters may be exact matches using value, or range queries using from and to. Sorts can be reversed using reverse.
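To illustrate how filter and sort combine, the sketch below evaluates an index query against an in-memory list of documents. It is a local model only: the run_query function and the getters mapping are hypothetical, inclusive range bounds are an assumption, and timestamps are compared as strings, which works only when they all share the same fixed-width UTC format.

```python
def run_query(docs, query, getters):
    """Apply an index query to a list of documents.

    getters maps an index name to a function extracting that index's value.
    """
    results = list(docs)
    for clause in query.get("filter", []):
        get = getters[clause["index"]]
        if "value" in clause:  # exact match
            results = [d for d in results if get(d) == clause["value"]]
        if "from" in clause:   # lower bound (assumed inclusive)
            results = [d for d in results if get(d) >= clause["from"]]
        if "to" in clause:     # upper bound (assumed inclusive)
            results = [d for d in results if get(d) <= clause["to"]]
    sort = query.get("sort")
    if sort:
        results.sort(key=getters[sort["index"]], reverse=sort.get("reverse", False))
    return results


docs = [
    {"fields": {"id": "a", "username": "kern", "event_time": "2017-03-30T10:00:00.000Z"}},
    {"fields": {"id": "b", "username": "kern", "event_time": "2017-03-01T10:00:00.000Z"}},
    {"fields": {"id": "c", "username": "ada", "event_time": "2017-04-01T10:00:00.000Z"}},
    {"fields": {"id": "d", "username": "kern", "event_time": "2017-04-02T10:00:00.000Z"}},
]
getters = {
    "username": lambda d: d["fields"]["username"],
    "event_time": lambda d: d["fields"]["event_time"],
}
query = {
    "filter": [
        {"index": "username", "value": "kern"},
        {"index": "event_time", "from": "2017-03-29T20:24:30.591Z"},
    ],
    "sort": {"index": "event_time", "reverse": True},
}
ids = [d["fields"]["id"] for d in run_query(docs, query, getters)]
```

Here ids contains the two kern events after the cutoff, newest first.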

Pagination

curl "https://data.pavlov.ai/_page?pageToken=$PAGE_TOKEN" \
  -H "Authorization: Bearer $AUTH_TOKEN"

Long lists of documents are paginated into chunks defined by the fetchSize argument. To resolve the next page in a paginated list, follow the next link. A null link value means that the list has ended.

GET /_page

Resolves a page token, returning the next batch of documents from a long list.

Parameter Default Description
pageToken required opaque string provided by other endpoints defining the current page state
fetchSize 5000 maximum number of documents to return

Returns

the following page
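Following next links until a null value can be wrapped in a generator. This sketch keeps the HTTP call abstract: fetch_page is a hypothetical callable that resolves a link into the next page, and the "items" and "next" field names are assumptions about the page shape rather than documented names.

```python
from typing import Callable, Iterator


def iterate_pages(first_page: dict, fetch_page: Callable[[str], dict]) -> Iterator[dict]:
    """Yield every item from a paginated list, following next links.

    Assumption: a page looks like {"items": [...], "next": <link or None>}.
    """
    page = first_page
    while True:
        yield from page["items"]
        link = page.get("next")
        if link is None:  # a null link means the list has ended
            return
        page = fetch_page(link)
```

Because the generator is lazy, a caller that stops iterating early never requests the remaining pages.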

Protect API

Pavlov Protect is an integrated solution for detecting copyrighted and trademarked content at scale. It offers a simple API that lets you programmatically submit images and receive detection decisions on the content they contain. Additionally, you can train Protect to learn to detect new content.

All images submitted to Protect are run through our well-trained, state-of-the-art object detection neural network. You can provide feedback and ask for further review on detections.

URLs for the Protect API are relative to https://protect.pavlov.ai/. All request parameters (except file uploads) can be passed via application/x-www-form-urlencoded or multipart/form-data unless otherwise noted. Responses are in JSON format. Successful requests will have their relevant information in the response under the top-level "data" field. Unsuccessful requests will have an "errors" field with more information.
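A client can handle the response envelope in one place. The sketch below returns the "data" payload on success and raises when an "errors" field is present. ProtectError and unwrap are hypothetical names, and the contents of "errors" are passed through untouched because their structure is not specified here.

```python
class ProtectError(Exception):
    """Raised when a Protect response carries an "errors" field."""

    def __init__(self, errors):
        super().__init__(f"Protect request failed: {errors}")
        self.errors = errors


def unwrap(body: dict):
    """Return the "data" payload of a parsed JSON response, or raise on errors."""
    if "errors" in body:
        raise ProtectError(body["errors"])
    return body["data"]
```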

Workflow

When using Protect to detect regions of interest in an image:

  1. Submit an image to Protect using the RESTful API
  2. Retrieve the tag in the response body
  3. Provide feedback to Protect so it can improve accuracy for future submissions
  4. Request a review if you want Protect to dive deeper into its decision
  5. Receive a webhook whenever Protect updates its decision on any of your submitted images

List all images

curl "https://protect.pavlov.ai/images" \
  -u "$USERNAME:$PASSWORD"

will return something like:

{
  "data": [
    {
      "type": "image",
      "id": "123",
      "url": "http://kern.io/images/logo.png",
      "createdAt": "2016-07-18T03:31:30.000Z",
      "status": "complete",
      "name": "Kern's Logo",
      "description": "A good gibraltar.",
      "detect": true,
      "contentOwner": false,
      "contentId": null,
      "contentIdLabel": null
    },
    {
      "type": "image",
      "id": "456",
      "url": "http://bit.ly/2o1VFNs",
      "createdAt": "2016-07-18T04:20:00.000Z",
      "status": "complete",
      "name": "Trump Analogy",
      "description": "Ugh this guy",
      "detect": false,
      "contentOwner": false,
      "contentId": null,
      "contentIdLabel": null
    }
  ]
}

List all submitted images. Images are returned in reverse chronological order (latest to oldest).

HTTP Request

GET https://protect.pavlov.ai/images

Query Parameters

Parameter Default Description
offset 0 the number of images to skip
limit 25 the maximum number of images to return
contentOwnerOnly false whether or not to only list content owner assets
contentIdOnly false whether or not to only list content id assets
feedbackOnly false whether or not to only list assets with feedback

Response Fields

Field Type Description
id string the image’s unique identifier
url string the image’s url, persisted via Pavlov Protect
createdAt ISO 8601 date string the time the image was submitted
status "pending", "review" or "complete" "pending" if Protect is still tagging the image, "review" if it is currently under review, "complete" when it has finished
name string the image’s name
description string the image’s description
detect boolean whether or not Pavlov Protect detected infringing content in the image
contentOwner boolean whether or not the image is from a rights-holder
contentId string null
contentIdLabel string null
feedback object null
feedback.label string a description of the tag
feedback.bounds array of 4 numbers, or null for unbounded the bounds of the tag in the image as 2 points: [x1, y1, x2, y2]

Detect in an image

curl -X POST "https://protect.pavlov.ai/images" \
  -u "$USERNAME:$PASSWORD" \
  --form image=@pikachu.png

will return something like:

{
  "data": {
    "type": "image",
    "id": "123",
    "url": "http://kern.io/images/logo.png",
    "createdAt": "2016-07-18T03:31:30.000Z",
    "status": "complete",
    "name": "Kern's Logo",
    "description": "A good gibraltar.",
    "detect": true,
    "contentOwner": false,
    "contentId": null,
    "contentIdLabel": null
  }
}

Submits a new image for tagging. The tags are returned in the response body.

Images are downsampled to fit within a 512x512 pixel box and stored. The maximum image size accepted is 25MiB. PNG, JPG, and TIFF formats are all acceptable.
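The limits above can be checked before uploading. This sketch computes the stored dimensions for a given image and checks the size limit. It assumes the aspect ratio is preserved, that images already inside the box are not upscaled, and that dimensions are rounded to the nearest pixel; none of these details are confirmed by this reference, and the function names are hypothetical.

```python
MAX_BYTES = 25 * 1024 * 1024  # 25 MiB
BOX = 512                     # side length of the bounding box in pixels


def fits_size_limit(num_bytes: int) -> bool:
    """True if a file of this size is within the 25 MiB limit."""
    return num_bytes <= MAX_BYTES


def downsampled_size(width: int, height: int, box: int = BOX) -> tuple:
    """Dimensions after fitting the image within a box x box square.

    Assumptions: aspect ratio is kept and small images are left unchanged.
    """
    scale = min(1.0, box / width, box / height)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, under these assumptions a 1024x768 image would be stored at 512x384.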

HTTP Request

POST https://protect.pavlov.ai/images

Body Parameters

Parameter Default Description
image required unless url is provided the image you want to submit for tagging
url required unless image is provided the URL of the image you want to submit for tagging
name optional a name for the image
description optional a description of the image
label optional a label to tag the image with
bounds optional a comma-separated list of 4 integers defining the location of the tag within the image
contentOwner false indicates the asset is from a rights holder
contentId false marks the asset as a content id asset (i.e. its label is returned if matched)

Response Fields

Field Type Description
id string the image’s unique identifier
url string the image’s url, persisted via Pavlov Protect
createdAt ISO 8601 date string the time the image was submitted
status "pending", "review" or "complete" "pending" if Protect is still tagging the image, "review" if it is currently under review, "complete" when it has finished
name string the image’s name
description string the image’s description
detect boolean whether or not Pavlov Protect detected infringing content in the image
contentOwner boolean whether or not the image is from a rights-holder
contentId string null
contentIdLabel string null
feedback object null
feedback.label string a description of the tag
feedback.bounds array of 4 numbers, or null for unbounded the bounds of the tag in the image as 2 points: [x1, y1, x2, y2]

Retrieve an image

curl "https://protect.pavlov.ai/images/$IMAGE" \
  -u "$USERNAME:$PASSWORD"

will return something like:

{
  "data": {
    "type": "image",
    "id": "123",
    "url": "http://kern.io/images/logo.png",
    "createdAt": "2016-07-18T03:31:30.000Z",
    "status": "complete",
    "name": "Kern's Logo",
    "description": "A good gibraltar.",
    "detect": true,
    "contentOwner": false,
    "contentId": null,
    "contentIdLabel": null
  }
}

Retrieves the URL, tags, name, description, and detection result of a previously submitted image.

HTTP Request

GET https://protect.pavlov.ai/images/:id

Response Fields

Field Type Description
id string the image’s unique identifier
url string the image’s url, persisted via Pavlov Protect
createdAt ISO 8601 date string the time the image was submitted
status "pending", "review" or "complete" "pending" if Protect is still tagging the image, "review" if it is currently under review, "complete" when it has finished
name string the image’s name
description string the image’s description
detect boolean whether or not Pavlov Protect detected infringing content in the image
contentOwner boolean whether or not the image is from a rights-holder
contentId string null
contentIdLabel string null
feedback object null
feedback.label string a description of the tag
feedback.bounds array of 4 numbers, or null for unbounded the bounds of the tag in the image as 2 points: [x1, y1, x2, y2]

Provide feedback

curl -X POST "https://protect.pavlov.ai/images/$IMAGE/feedback" \
  -u "$USERNAME:$PASSWORD" \
  --form label=pikachu \
  --form bounds=10,10,20,20

will return something like:

{
  "data": true
}

Provides Protect with feedback on a previously submitted image.

HTTP Request

POST https://protect.pavlov.ai/images/:id/feedback

Body Parameters

Field Type Description
label string the label to tag the image with; if not provided, the image is marked as having no detection
bounds array of 4 numbers, or null for unbounded the bounds of the tag in the image as 2 points: [x1, y1, x2, y2]
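The bounds value is sent as a comma-separated string such as 10,10,20,20 and describes two corner points. The sketch below parses that form and derives the box size. The helper names are hypothetical, and integer coordinates are assumed.

```python
def parse_bounds(text: str) -> list:
    """Parse "x1,y1,x2,y2" into a list of four integers."""
    parts = [int(part) for part in text.split(",")]
    if len(parts) != 4:
        raise ValueError("bounds must contain exactly 4 numbers")
    return parts


def bounds_size(bounds: list) -> tuple:
    """Width and height of the box spanned by the two corner points."""
    x1, y1, x2, y2 = bounds
    return abs(x2 - x1), abs(y2 - y1)
```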

Remove feedback

curl -X DELETE "https://protect.pavlov.ai/images/$IMAGE/feedback" \
  -u "$USERNAME:$PASSWORD"

will return something like:

{
  "data": true
}

Notifies Protect to remove feedback provided on a previously submitted image.

HTTP Request

DELETE https://protect.pavlov.ai/images/:id/feedback

Request review

curl -X POST "https://protect.pavlov.ai/images/$IMAGE/review" \
  -u "$USERNAME:$PASSWORD"

will return something like:

{
  "data": true
}

Ask Pavlov Protect to review the image’s detection. If your account has a webhook configured, it will be called when the review has completed.

Pavlov Protect will perform an HTTP POST request to a URL of your choice whenever it updates its decision on any of your submitted images, including when you provide feedback and when a review has completed. The body of the request is identical to the response when you retrieve an image.
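Since the webhook body matches the retrieve-image response, a receiver can parse it the same way. The sketch below turns a raw request body into a short summary and leaves the HTTP server out entirely. The summarize_update function is hypothetical, and it assumes the body keeps the top-level "data" envelope.

```python
import json


def summarize_update(raw_body: bytes) -> str:
    """Summarize a webhook payload shaped like the retrieve-image response."""
    image = json.loads(raw_body)["data"]
    if image["status"] != "complete":
        # "pending" or "review": no final decision yet
        return f"image {image['id']}: still {image['status']}"
    verdict = "detection" if image["detect"] else "no detection"
    return f"image {image['id']}: {verdict}"
```

A real handler would typically acknowledge the POST quickly and process the update afterwards.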

HTTP Request

POST https://protect.pavlov.ai/images/:id/review

Submit an asset

curl -X POST "https://protect.pavlov.ai/submit" \
  -u "$USERNAME:$PASSWORD" \
  --form image=@pikachu.png \
  --form label=pikachu

will return something like:

{
  "data": {
    "id": "1234567890"
  }
}

Tags an asset for future detections.

There are flags for special kinds of assets:

contentOwner - indicates the asset is from a rights holder
contentId - marks the asset as a content id asset (i.e. its label is returned if matched)

Images are downsampled to fit within a 512x512 pixel box and stored. The maximum image size accepted is 25MiB. PNG, JPG, and TIFF formats are all acceptable.

HTTP Request

POST https://protect.pavlov.ai/submit

Response Fields

Field Type Description
id string the image’s unique identifier

Errors

The Pavlov API uses the following error codes:

Error Code Meaning
400 Bad Request – A required argument was not supplied or its format was incorrect
401 Unauthorized – An invalid auth token was provided
404 Not Found – The requested resource cannot be found
500 Internal Server Error – Something went wrong
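Client code often maps these status codes to readable messages. The sketch below mirrors the table above. The function names are hypothetical, and treating any other status as unexpected is an assumption, since the API may return codes not listed here.

```python
ERROR_MEANINGS = {
    400: "Bad Request: a required argument was not supplied or its format was incorrect",
    401: "Unauthorized: an invalid auth token was provided",
    404: "Not Found: the requested resource cannot be found",
    500: "Internal Server Error: something went wrong",
}


def describe_status(code: int) -> str:
    """Human-readable meaning of an HTTP status code from the API."""
    if 200 <= code < 300:
        return "OK"
    return ERROR_MEANINGS.get(code, f"Unexpected status {code}")


def is_client_error(code: int) -> bool:
    """True for 4xx codes, where the request itself needs fixing."""
    return 400 <= code < 500
```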