Exxa Batch Inference API (0.1.0)
Download OpenAPI specification
API for managing batch inference processes
For more information, please contact the Exxa team at support@withexxa.com
List Requests
List all your requests as a JSONL stream, one object per line. For full request details, set the full query parameter to true.
Authorizations:
query Parameters
full | boolean (Full) Default: false Return full details if set to true; otherwise only the basic status is returned |
Responses
Response samples
- 200
- 422
{"id": "0123456789ab0123456789ab", "status": "completed", "created_at": 1701234567, "expires_at": 1701320967, "in_progress_at": 1701234700, "ended_at": 1701235000}
{"id": "0123456789ab0123456789ac", "status": "completed", "created_at": 1701234567, "expires_at": 1701320967, "in_progress_at": 1701234700, "ended_at": 1701235000}
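Because this endpoint streams JSONL rather than a single JSON document, each line must be parsed on its own. A minimal Python sketch (the helper name is illustrative; the sample lines are taken from the response above):

```python
import json

def iter_jsonl(lines):
    """Parse a JSONL stream: one JSON object per non-empty line."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)

# Two lines as they would arrive from the List Requests stream
stream = [
    '{"id": "0123456789ab0123456789ab", "status": "completed", "created_at": 1701234567}',
    '{"id": "0123456789ab0123456789ac", "status": "completed", "created_at": 1701234567}',
]

statuses = {req["id"]: req["status"] for req in iter_jsonl(stream)}
```

With an HTTP client such as the requests library, the same helper can consume `response.iter_lines()` so the stream is processed without buffering the whole body.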
Create Request
Authorizations:
Request Body schema: application/json (required)
metadata | Metadata (object) or Metadata (null) (Metadata) Default: {} Optional custom metadata for the request |
request_body required | object (ChatCompletionRequest) Body of the request to send to the LLM |
completion_window | string (Completion Window) Default: "24h" Value: "24h" The time frame within which the batch should be processed |
webhook | Webhook (string) or Webhook (null) (Webhook) Webhook to notify when the request is completed |
Responses
Request samples
- Payload
{
  "metadata": {},
  "request_body": {
    "frequency_penalty": 0,
    "logprobs": false,
    "max_tokens": 16384,
    "messages": [
      {
        "content": "You are a helpful assistant",
        "role": "system"
      },
      {
        "content": "Hello, how are you?",
        "role": "user"
      }
    ],
    "model": "llama-3.1-70b-instruct-fp16",
    "n": 1,
    "presence_penalty": 0,
    "response_format": {
      "type": "text"
    },
    "stream": false,
    "temperature": 0.7,
    "top_p": 1
  },
  "completion_window": "24h",
  "webhook": "string"
}
Response samples
- 201
- 422
[
  {
    "id": "0123456789ab0123456789ab",
    "status": "registered",
    "created_at": 1701234567,
    "expires_at": 1701320967
  }
]
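The request body above can be assembled programmatically. The sketch below builds a Create Request payload using the defaults shown in the sample payload; the helper name is illustrative and only fields from the documented schema are emitted:

```python
import json

def build_request_payload(system_prompt, user_message,
                          model="llama-3.1-70b-instruct-fp16",
                          completion_window="24h",
                          webhook=None, metadata=None):
    """Assemble the JSON body for Create Request (schema fields from the docs)."""
    payload = {
        "request_body": {
            "model": model,
            "messages": [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_message},
            ],
            # Sampling defaults taken from the sample payload above
            "temperature": 0.7,
            "max_tokens": 16384,
            "n": 1,
            "stream": False,
        },
        "completion_window": completion_window,
        "metadata": metadata or {},
    }
    if webhook is not None:
        payload["webhook"] = webhook
    return payload

payload = build_request_payload("You are a helpful assistant", "Hello, how are you?")
body = json.dumps(payload)  # ready to send as the application/json request body
```

POST this body to the Create Request endpoint with your API credentials; the base URL and authorization header are not specified in this excerpt.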
Get Request Result
Authorizations:
path Parameters
request_id required | string (Request Id) |
Responses
Response samples
- 200
- 422
[
  {
    "id": "0123456789ab0123456789ab",
    "status": "completed",
    "created_at": 1701234567,
    "expires_at": 1701320967,
    "in_progress_at": 1701234700,
    "ended_at": 1701235000,
    "metadata": {},
    "request_body": {
      "messages": [
        {
          "role": "system",
          "content": "You are a helpful assistant"
        },
        {
          "role": "user",
          "content": "Hello, how are you?"
        }
      ],
      "model": "llama-3.1-70b-instruct-fp16",
      "frequency_penalty": 0,
      "logprobs": false,
      "max_tokens": 16384,
      "n": 1,
      "presence_penalty": 0,
      "response_format": {
        "type": "text"
      },
      "stream": false,
      "temperature": 0.7,
      "top_p": 1
    },
    "completion_window": "24h",
    "result_body": {
      "object": "chat.completion",
      "model": "llama-3.1-70b-instruct-fp16",
      "id": "0123456789ab0123456789ab",
      "created": 1701234567,
      "choices": [
        {
          "message": {
            "role": "assistant",
            "content": "I'm fine, and you?"
          },
          "finish_reason": "stop"
        }
      ],
      "usage": {
        "total_tokens": 10,
        "prompt_tokens": 5,
        "completion_tokens": 5
      }
    }
  }
]
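Once a request reaches status "completed", the generated text sits under result_body.choices[0].message.content. A small sketch extracting it from a result object like the sample above (the helper name is illustrative):

```python
import json

def assistant_reply(request_result):
    """Return the assistant's text from a completed request result, else None."""
    if request_result.get("status") != "completed":
        return None
    choice = request_result["result_body"]["choices"][0]
    return choice["message"]["content"]

# A trimmed version of the sample result above
result = json.loads('''
{"id": "0123456789ab0123456789ab", "status": "completed",
 "result_body": {"object": "chat.completion",
                 "choices": [{"message": {"role": "assistant", "content": "I'm fine, and you?"},
                              "finish_reason": "stop"}],
                 "usage": {"total_tokens": 10, "prompt_tokens": 5, "completion_tokens": 5}}}
''')
reply = assistant_reply(result)
```

Guarding on the status field matters because a request that is still registered or in progress carries no result_body.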
Get Request Status
Authorizations:
path Parameters
request_id required | string (Request Id) |
Responses
Response samples
- 200
- 422
{
  "id": "0123456789ab0123456789ab",
  "status": "completed",
  "created_at": 1701234567,
  "expires_at": 1701320967,
  "in_progress_at": 1701234700,
  "ended_at": 1701235000
}
Cancel Request
Authorizations:
path Parameters
request_id required | string (Request Id) |
Responses
Response samples
- 200
- 422
{
  "id": "0123456789ab0123456789ab",
  "status": "completed",
  "created_at": 1701234567,
  "expires_at": 1701320967,
  "in_progress_at": 1701234700,
  "ended_at": 1701235000,
  "metadata": {},
  "request_body": {
    "frequency_penalty": 0,
    "logprobs": false,
    "max_tokens": 16384,
    "messages": [
      {
        "content": "You are a helpful assistant",
        "role": "system"
      },
      {
        "content": "Hello, how are you?",
        "role": "user"
      }
    ],
    "model": "llama-3.1-70b-instruct-fp16",
    "n": 1,
    "presence_penalty": 0,
    "response_format": {
      "type": "text"
    },
    "stream": false,
    "temperature": 0.7,
    "top_p": 1
  },
  "completion_window": "24h",
  "webhook": "string",
  "result_body": {
    "choices": [
      {
        "finish_reason": "stop",
        "message": {
          "content": "I'm fine, and you?",
          "role": "assistant"
        }
      }
    ],
    "created": 1701234567,
    "id": "0123456789ab0123456789ab",
    "model": "llama-3.1-70b-instruct-fp16",
    "object": "chat.completion",
    "usage": {
      "completion_tokens": 5,
      "prompt_tokens": 5,
      "total_tokens": 10
    }
  }
}
List Batches
List all your batches as a JSONL stream, one object per line.
Authorizations:
Responses
Response samples
- 200
{"id": "0123456789ab0123456789ab", "status": "completed", "requests_counts": {"total": 2, "completed": 0, "failed": 0}, "created_at": 1701234567, "in_progress_at": 1701235000, "completed_at": 1701237500, "cancelled_at": null, "requests_ids": ["0123456789ab0123456789ab", "0123456789ab0123456789ac"], "webhook": "https://example.com/webhook", "metadata": {"user_id": "my_custom_id"}}
{"id": "0123456789ab0123456789ab", "status": "completed", "requests_counts": {"total": 2, "completed": 0, "failed": 0}, "created_at": 1701234567, "in_progress_at": 1701235000, "completed_at": 1701237500, "cancelled_at": null, "requests_ids": ["0123456789ab0123456789ab", "0123456789ab0123456789ac"], "webhook": "https://example.com/webhook", "metadata": {"user_id": "my_custom_id"}}
Create Batch
Authorizations:
Request Body schema: application/json (required)
requests_ids required | Array of strings (Requests Ids) Ids of the requests to group in the batch |
webhook | Webhook (string) or Webhook (null) (Webhook) Optional webhook to notify when the batch is completed |
metadata | Metadata (object) or Metadata (null) (Metadata) Default: {} Optional custom metadata for the batch |
Responses
Request samples
- Payload
{
  "requests_ids": [
    "0123456789ab0123456789ab",
    "0123456789ab0123456789ac"
  ],
  "metadata": {
    "user_id": "my_custom_id"
  }
}
Response samples
- 201
- 422
{
  "id": "0123456789ab0123456789ab",
  "status": "registered",
  "requests_counts": {
    "total": 2,
    "completed": 0,
    "failed": 0
  },
  "created_at": 1701234567
}
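A batch simply groups the ids returned by earlier Create Request calls. The sketch below assembles the Create Batch body; the helper name is illustrative and only fields from the documented schema are emitted:

```python
def build_batch_payload(request_ids, webhook=None, metadata=None):
    """Assemble the JSON body for Create Batch from previously created request ids."""
    payload = {
        "requests_ids": list(request_ids),
        "metadata": metadata or {},
    }
    if webhook is not None:
        payload["webhook"] = webhook
    return payload

batch = build_batch_payload(
    ["0123456789ab0123456789ab", "0123456789ab0123456789ac"],
    metadata={"user_id": "my_custom_id"},
)
```

The optional webhook field is omitted unless set, matching the schema where it may be null.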
Response samples
- 200
- 422
{
  "id": "0123456789ab0123456789ab",
  "status": "completed",
  "requests_counts": {
    "completed": 0,
    "failed": 0,
    "total": 2
  },
  "created_at": 1701234567,
  "in_progress_at": 1701235000,
  "completed_at": 1701237500,
  "cancelled_at": null,
  "requests_ids": [
    "0123456789ab0123456789ab",
    "0123456789ab0123456789ac"
  ],
  "metadata": {
    "user_id": "my_custom_id"
  }
}
Get Batch Status
Authorizations:
path Parameters
batch_id required | string (Batch Id) |
Responses
Response samples
- 200
- 422
{
  "id": "0123456789ab0123456789ab",
  "status": "completed",
  "requests_counts": {
    "completed": 0,
    "failed": 0,
    "total": 2
  },
  "created_at": 1701234567,
  "in_progress_at": 1701235000,
  "completed_at": 1701237500,
  "cancelled_at": null
}
Get Batch Requests
Streams the status of every request in the batch as JSONL, one object per line.
Authorizations:
path Parameters
batch_id required | string (Batch Id) |
Responses
Response samples
- 200
- 422
{"id": "0123456789ab0123456789ab", "status": "completed", "created_at": 1701234567, "expires_at": 1701320967, "in_progress_at": 1701234700, "ended_at": 1701235000}
{"id": "0123456789ab0123456789ab", "status": "completed", "created_at": 1701234567, "expires_at": 1701320967, "in_progress_at": 1701234700, "ended_at": 1701235000}
Get Batch Results
Streams every request in the batch, with its full request and result bodies, as JSONL, one object per line.
Authorizations:
path Parameters
batch_id required | string (Batch Id) |
Responses
Response samples
- 200
- 422
{"id": "0123456789ab0123456789ab", "status": "completed", "created_at": 1701234567, "expires_at": 1701320967, "in_progress_at": 1701234700, "ended_at": 1701235000, "metadata": {}, "request_body": {"messages": [{"role": "system", "content": "You are a helpful assistant"}, {"role": "user", "content": "Hello, how are you?"}], "model": "llama-3.1-70b-instruct-fp16", "frequency_penalty": 0.0, "logprobs": false, "max_tokens": 16384, "n": 1, "presence_penalty": 0.0, "response_format": {"type": "text"}, "stream": false, "temperature": 0.7, "top_p": 1.0}, "completion_window": "24h", "result_body": {"object": "chat.completion", "model": "llama-3.1-70b-instruct-fp16", "id": "0123456789ab0123456789ab", "created": 1701234567, "choices": [{"message": {"role": "assistant", "content": "I'm fine, and you?"}, "finish_reason": "stop"}], "usage": {"total_tokens": 10, "prompt_tokens": 5, "completion_tokens": 5}}}
{"id": "0123456789ab0123456789ab", "status": "completed", "created_at": 1701234567, "expires_at": 1701320967, "in_progress_at": 1701234700, "ended_at": 1701235000, "metadata": {}, "request_body": {"messages": [{"role": "system", "content": "You are a helpful assistant"}, {"role": "user", "content": "Hello, how are you?"}], "model": "llama-3.1-70b-instruct-fp16", "frequency_penalty": 0.0, "logprobs": false, "max_tokens": 16384, "n": 1, "presence_penalty": 0.0, "response_format": {"type": "text"}, "stream": false, "temperature": 0.7, "top_p": 1.0}, "completion_window": "24h", "result_body": {"object": "chat.completion", "model": "llama-3.1-70b-instruct-fp16", "id": "0123456789ab0123456789ab", "created": 1701234567, "choices": [{"message": {"role": "assistant", "content": "I'm fine, and you?"}, "finish_reason": "stop"}], "usage": {"total_tokens": 10, "prompt_tokens": 5, "completion_tokens": 5}}}
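Each line of this stream is a complete request object, so results can be collected as they arrive. A sketch combining the JSONL parsing with the result extraction (names illustrative; the sample object is trimmed from the response above):

```python
import json

def collect_results(lines):
    """Map request id -> assistant reply for completed requests in a batch-results stream."""
    results = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        req = json.loads(line)
        if req.get("status") == "completed":
            message = req["result_body"]["choices"][0]["message"]
            results[req["id"]] = message["content"]
    return results

# One stream line, built from a trimmed version of the sample above
sample_obj = {
    "id": "0123456789ab0123456789ab",
    "status": "completed",
    "result_body": {
        "choices": [
            {"message": {"role": "assistant", "content": "I'm fine, and you?"},
             "finish_reason": "stop"}
        ]
    },
}
results = collect_results([json.dumps(sample_obj)])
```

Requests that failed or are still in progress are skipped here; in practice you may want to record their ids separately and retry or poll them.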