Exxa Batch Inference API (0.1.0)

API for managing batch inference processes

For more information, please contact the Exxa team at support@withexxa.com

requests

List Requests

Lists all your requests, streamed as JSON Lines (one JSON object per line). To get full details, set the full query parameter to true.

Authorizations:
APIKeyHeader
query Parameters
full
boolean (Full)
Default: false

Returns full details if set to true; otherwise returns only the simple status.

Responses

Response samples

Content type
text/plain
{"id": "0123456789ab0123456789ab", "status": "completed", "created_at": 1701234567, "expires_at": 1701320967, "in_progress_at": 1701234700, "ended_at": 1701235000}
{"id": "0123456789ab0123456789ac", "status": "completed", "created_at": 1701234567, "expires_at": 1701320967, "in_progress_at": 1701234700, "ended_at": 1701235000}
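
Because the response is a stream of JSON Lines rather than a single JSON document, a client parses it one line at a time. The following is a minimal sketch of that parsing step only. The helper name iter_jsonl is hypothetical, and how the lines are obtained (HTTP client, base URL, endpoint path, API key header name) is not specified in this reference and is left out.

```python
import json
from typing import Iterable, Iterator, Union


def iter_jsonl(lines: Iterable[Union[str, bytes]]) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSON Lines stream."""
    for raw in lines:
        # HTTP clients often hand back bytes; decode before parsing.
        text = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        text = text.strip()
        if not text:
            continue  # tolerate blank or trailing lines
        yield json.loads(text)


# Two lines shaped like the response sample (fields shortened for brevity).
sample_stream = [
    b'{"id": "0123456789ab0123456789ab", "status": "completed"}\n',
    b'{"id": "0123456789ab0123456789ac", "status": "completed"}\n',
]

for request in iter_jsonl(sample_stream):
    print(request["id"], request["status"])
```

Any iterable of lines works as input, so the same helper can consume a streamed HTTP response or a saved file.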

Create Request

Authorizations:
APIKeyHeader
Request Body schema: application/json
required
metadata
Metadata (object) or Metadata (null) (Metadata)
Default: {}

Optional custom metadata for the request.

request_body
required
object (ChatCompletionRequest)

Body of the request to send to the LLM.

completion_window
string (Completion Window)
Default: "24h"
Value: "24h"

The time frame within which the batch should be processed.

webhook
Webhook (string) or Webhook (null) (Webhook)

Webhook to notify when the request is completed.

Responses

Request samples

Content type
application/json
{
  "metadata": { },
  "request_body": { ... },
  "completion_window": "24h",
  "webhook": "string"
}
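
As an illustration of how the four top-level fields fit together, here is a minimal sketch that assembles a Create Request body. The function name build_request_payload is hypothetical. The messages and model keys inside request_body are taken from the Get Batch Results sample in this reference; the full ChatCompletionRequest schema is not reproduced here, so treat the inner shape as an assumption.

```python
import json
from typing import Optional


def build_request_payload(
    messages: list,
    model: str,
    metadata: Optional[dict] = None,
    webhook: Optional[str] = None,
) -> dict:
    """Assemble a Create Request body from the documented top-level fields."""
    payload = {
        "metadata": metadata or {},   # documented default is {}
        "request_body": {"messages": messages, "model": model},
        "completion_window": "24h",   # the only documented value
    }
    if webhook is not None:
        payload["webhook"] = webhook  # optional; left out when unset
    return payload


payload = build_request_payload(
    messages=[
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "Hello, how are you?"},
    ],
    model="llama-3.1-70b-instruct-fp16",
    metadata={"user_id": "my_custom_id"},
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be serialized with json.dumps and sent as the application/json body.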

Response samples

Content type
application/json
[
  { ... }
]

Get Request Result

Authorizations:
APIKeyHeader
path Parameters
request_id
required
string (Request Id)

Responses

Response samples

Content type
application/json
[
  { ... }
]

Get Request Status

Authorizations:
APIKeyHeader
path Parameters
request_id
required
string (Request Id)

Responses

Response samples

Content type
application/json
{
  "id": "0123456789ab0123456789ab",
  "status": "completed",
  "created_at": 1701234567,
  "expires_at": 1701320967,
  "in_progress_at": 1701234700,
  "ended_at": 1701235000
}
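
The timestamp fields in the status object can be turned into durations. The sketch below assumes the values are Unix epoch seconds, which this reference does not state explicitly (the sample values are consistent with it: expires_at minus created_at is 86400 seconds, matching the "24h" completion window). The helper name summarize_timing is hypothetical.

```python
from datetime import datetime, timezone

# The status object from the response sample.
status = {
    "id": "0123456789ab0123456789ab",
    "status": "completed",
    "created_at": 1701234567,
    "expires_at": 1701320967,
    "in_progress_at": 1701234700,
    "ended_at": 1701235000,
}


def summarize_timing(status: dict) -> dict:
    """Return queue time, run time and lifetime in seconds (None if a timestamp is missing)."""

    def diff(later: str, earlier: str):
        a, b = status.get(later), status.get(earlier)
        return None if a is None or b is None else a - b

    return {
        "queued_s": diff("in_progress_at", "created_at"),
        "running_s": diff("ended_at", "in_progress_at"),
        "lifetime_s": diff("expires_at", "created_at"),
    }


print(summarize_timing(status))
# {'queued_s': 133, 'running_s': 300, 'lifetime_s': 86400}

# Assumption: epoch seconds, interpreted as UTC.
print(datetime.fromtimestamp(status["created_at"], tz=timezone.utc).isoformat())
```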

Cancel Request

Authorizations:
APIKeyHeader
path Parameters
request_id
required
string (Request Id)

Responses

Response samples

Content type
application/json
{
  "id": "0123456789ab0123456789ab",
  "status": "completed",
  "created_at": 1701234567,
  "expires_at": 1701320967,
  "in_progress_at": 1701234700,
  "ended_at": 1701235000,
  "metadata": { },
  "request_body": { ... },
  "completion_window": "24h",
  "webhook": "string",
  "result_body": { ... }
}

batches

List Batches

Lists all your batches, streamed as JSON Lines (one JSON object per line).

Authorizations:
APIKeyHeader

Responses

Response samples

Content type
text/plain
{"id": "0123456789ab0123456789ab", "status": "completed", "requests_counts": {"total": 2, "completed": 0, "failed": 0}, "created_at": 1701234567, "in_progress_at": 1701235000, "completed_at": 1701237500, "cancelled_at": null, "requests_ids": ["0123456789ab0123456789ab", "0123456789ab0123456789ac"], "webhook": "https://example.com/webhook", "metadata": {"user_id": "my_custom_id"}}
{"id": "0123456789ab0123456789ab", "status": "completed", "requests_counts": {"total": 2, "completed": 0, "failed": 0}, "created_at": 1701234567, "in_progress_at": 1701235000, "completed_at": 1701237500, "cancelled_at": null, "requests_ids": ["0123456789ab0123456789ab", "0123456789ab0123456789ac"], "webhook": "https://example.com/webhook", "metadata": {"user_id": "my_custom_id"}}

Create Batch

Authorizations:
APIKeyHeader
Request Body schema: application/json
required
requests_ids
required
Array of strings (Requests Ids)

Ids of the requests to group in the batch

webhook
Webhook (string) or Webhook (null) (Webhook)

Optional webhook to notify when the batch is completed.

metadata
Metadata (object) or Metadata (null) (Metadata)
Default: {}

Optional custom metadata for the batch.

Responses

Request samples

Content type
application/json
{
  "requests_ids": [ ... ],
  "metadata": { ... }
}
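
A batch groups requests that already exist, so the body is essentially a list of request IDs plus optional extras. The sketch below builds such a body; the function name build_batch_payload is hypothetical. Removing duplicate IDs and rejecting an empty list are defensive choices made for this example, not documented requirements of the API.

```python
from typing import Iterable, Optional


def build_batch_payload(
    requests_ids: Iterable[str],
    webhook: Optional[str] = None,
    metadata: Optional[dict] = None,
) -> dict:
    """Assemble a Create Batch body from previously created request IDs."""
    # dict.fromkeys drops duplicates while keeping first-seen order.
    ids = list(dict.fromkeys(requests_ids))
    if not ids:
        raise ValueError("requests_ids must contain at least one ID")
    payload = {"requests_ids": ids, "metadata": metadata or {}}
    if webhook is not None:
        payload["webhook"] = webhook  # optional; left out when unset
    return payload


print(
    build_batch_payload(
        ["0123456789ab0123456789ab", "0123456789ab0123456789ac"],
        webhook="https://example.com/webhook",
        metadata={"user_id": "my_custom_id"},
    )
)
```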

Response samples

Content type
application/json
{
  "id": "0123456789ab0123456789ab",
  "status": "registered",
  "requests_counts": { ... },
  "created_at": 1701234567
}

Get Batch

Authorizations:
APIKeyHeader
path Parameters
batch_id
required
string (Batch Id)

Responses

Response samples

Content type
application/json
{
  "id": "0123456789ab0123456789ab",
  "status": "completed",
  "requests_counts": { ... },
  "created_at": 1701234567,
  "in_progress_at": 1701235000,
  "completed_at": 1701237500,
  "cancelled_at": null,
  "requests_ids": [ ... ],
  "metadata": { ... }
}
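
Since batches are processed asynchronously, a client that does not use a webhook typically polls until the batch finishes. The following is a rough sketch of such a loop, not a documented client. It assumes a batch is finished once completed_at or cancelled_at is set, which is inferred from the field names in the sample rather than stated in this reference. The fetch_batch callable is a hypothetical stand-in for whatever HTTP call retrieves the batch.

```python
import time
from typing import Callable


def wait_for_batch(
    fetch_batch: Callable[[], dict],
    interval_s: float = 30.0,
    timeout_s: float = 86400.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll fetch_batch() until the batch looks finished or the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        batch = fetch_batch()
        # Assumption: a non-null completed_at or cancelled_at marks the end state.
        if batch.get("completed_at") is not None or batch.get("cancelled_at") is not None:
            return batch
        if clock() >= deadline:
            raise TimeoutError("batch did not finish before the timeout")
        sleep(interval_s)


# Usage with a fake fetcher that finishes on the second poll.
fake_responses = iter([
    {"status": "registered", "created_at": 1701234567},
    {"status": "completed", "completed_at": 1701237500, "cancelled_at": None},
])
final = wait_for_batch(lambda: next(fake_responses), interval_s=0, sleep=lambda s: None)
print(final["status"])
```

Injecting sleep and clock keeps the loop testable without real waiting; the default timeout mirrors the 24h completion window.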

Get Batch Status

Authorizations:
APIKeyHeader
path Parameters
batch_id
required
string (Batch Id)

Responses

Response samples

Content type
application/json
{
  "id": "0123456789ab0123456789ab",
  "status": "completed",
  "requests_counts": { ... },
  "created_at": 1701234567,
  "in_progress_at": 1701235000,
  "completed_at": 1701237500,
  "cancelled_at": null
}
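
The requests_counts object can be reduced to a simple progress figure. This sketch uses the total, completed and failed keys that appear in the List Batches sample. It assumes every request not yet counted as completed or failed is still pending, which this reference does not confirm. The helper name batch_progress is hypothetical.

```python
def batch_progress(batch: dict) -> dict:
    """Derive pending count and finished fraction from a batch status object."""
    counts = batch.get("requests_counts") or {}
    total = counts.get("total", 0)
    completed = counts.get("completed", 0)
    failed = counts.get("failed", 0)
    # Assumption: whatever is neither completed nor failed is still pending.
    pending = total - completed - failed
    done_fraction = (completed + failed) / total if total else 0.0
    return {"pending": pending, "done_fraction": done_fraction}


# Counts as they appear in the List Batches sample.
print(batch_progress({"requests_counts": {"total": 2, "completed": 0, "failed": 0}}))
# {'pending': 2, 'done_fraction': 0.0}
```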

Cancel Batch

Authorizations:
APIKeyHeader
path Parameters
batch_id
required
string (Batch Id)

Responses

Response samples

Content type
application/json
{
  "id": "0123456789ab0123456789ab",
  "status": "completed",
  "requests_counts": { ... },
  "created_at": 1701234567,
  "in_progress_at": 1701235000,
  "completed_at": 1701237500,
  "cancelled_at": null
}

Get Batch Requests

Streams the status of every request in the batch as JSON Lines, one JSON object per line.

Authorizations:
APIKeyHeader
path Parameters
batch_id
required
string (Batch Id)

Responses

Response samples

Content type
text/plain
{"id": "0123456789ab0123456789ab", "status": "completed", "created_at": 1701234567, "expires_at": 1701320967, "in_progress_at": 1701234700, "ended_at": 1701235000}
{"id": "0123456789ab0123456789ab", "status": "completed", "created_at": 1701234567, "expires_at": 1701320967, "in_progress_at": 1701234700, "ended_at": 1701235000}
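
A common use of this stream is to see at a glance how many requests are in each state. The sketch below tallies the status field across the lines of a response body. The helper name tally_statuses is hypothetical, and the only status value confirmed for requests by the samples in this reference is "completed".

```python
import json
from collections import Counter


def tally_statuses(jsonl_text: str) -> Counter:
    """Count how many requests in a JSON Lines body are in each status."""
    return Counter(
        json.loads(line)["status"]
        for line in jsonl_text.splitlines()
        if line.strip()  # skip blank lines
    )


# Body shaped like the response sample (fields shortened for brevity).
body = (
    '{"id": "0123456789ab0123456789ab", "status": "completed"}\n'
    '{"id": "0123456789ab0123456789ab", "status": "completed"}\n'
)
print(tally_statuses(body))
# Counter({'completed': 2})
```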

Get Batch Results

Streams every request in the batch, including its result, as JSON Lines, one JSON object per line.

Authorizations:
APIKeyHeader
path Parameters
batch_id
required
string (Batch Id)

Responses

Response samples

Content type
text/plain
{"id": "0123456789ab0123456789ab", "status": "completed", "created_at": 1701234567, "expires_at": 1701320967, "in_progress_at": 1701234700, "ended_at": 1701235000, "metadata": {}, "request_body": {"messages": [{"role": "system", "content": "You are a helpful assistant"}, {"role": "user", "content": "Hello, how are you?"}], "model": "llama-3.1-70b-instruct-fp16", "frequency_penalty": 0.0, "logprobs": false, "max_tokens": 16384, "n": 1, "presence_penalty": 0.0, "response_format": {"type": "text"}, "stream": false, "temperature": 0.7, "top_p": 1.0}, "completion_window": "24h", "result_body": {"object": "chat.completion", "model": "llama-3.1-70b-instruct-fp16", "id": "0123456789ab0123456789ab", "created": 1701234567, "choices": [{"message": {"role": "assistant", "content": "I'm fine, and you?"}, "finish_reason": "stop"}], "usage": {"total_tokens": 10, "prompt_tokens": 5, "completion_tokens": 5}}}
{"id": "0123456789ab0123456789ab", "status": "completed", "created_at": 1701234567, "expires_at": 1701320967, "in_progress_at": 1701234700, "ended_at": 1701235000, "metadata": {}, "request_body": {"messages": [{"role": "system", "content": "You are a helpful assistant"}, {"role": "user", "content": "Hello, how are you?"}], "model": "llama-3.1-70b-instruct-fp16", "frequency_penalty": 0.0, "logprobs": false, "max_tokens": 16384, "n": 1, "presence_penalty": 0.0, "response_format": {"type": "text"}, "stream": false, "temperature": 0.7, "top_p": 1.0}, "completion_window": "24h", "result_body": {"object": "chat.completion", "model": "llama-3.1-70b-instruct-fp16", "id": "0123456789ab0123456789ab", "created": 1701234567, "choices": [{"message": {"role": "assistant", "content": "I'm fine, and you?"}, "finish_reason": "stop"}], "usage": {"total_tokens": 10, "prompt_tokens": 5, "completion_tokens": 5}}}
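
Each result line nests the model output under result_body, so getting to the generated text takes a few lookups. The sketch below extracts the first assistant message per request and sums token usage, following the field layout of the sample lines. The helper name extract_answers is hypothetical, and skipping lines without a result_body is an assumption about how unfinished or failed requests appear.

```python
import json


def extract_answers(jsonl_text: str) -> tuple:
    """Return ({request_id: first answer text}, total tokens) from a results body."""
    answers = {}
    total_tokens = 0
    for line in jsonl_text.splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        result = record.get("result_body")
        if not result:
            continue  # assumption: no result_body means no output yet
        choices = result.get("choices") or []
        if choices:
            answers[record["id"]] = choices[0]["message"]["content"]
        total_tokens += (result.get("usage") or {}).get("total_tokens", 0)
    return answers, total_tokens


# One line shaped like the response sample (request_body omitted for brevity).
line = json.dumps({
    "id": "0123456789ab0123456789ab",
    "status": "completed",
    "result_body": {
        "choices": [{"message": {"role": "assistant", "content": "I'm fine, and you?"},
                     "finish_reason": "stop"}],
        "usage": {"total_tokens": 10, "prompt_tokens": 5, "completion_tokens": 5},
    },
})
answers, tokens = extract_answers(line + "\n")
print(answers, tokens)
```

Note that the two sample lines share the same id, so keying by id would keep only one of them; real responses are expected to carry distinct request IDs.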

Health

Responses

Response samples

Content type
application/json
null