Getting Started
EXXA Off-Peak Computing is an asynchronous inference API, without hard rate limits, designed to serve generative AI models efficiently and sustainably. By aggregating requests over a set period (starting with 24 hours), it can maximize GPU efficiency and prioritize GPUs in locations and at times where electricity has low emissions.
With EXXA API, you can:
- Send requests one by one (similar to using a streaming API)
- Aggregate requests into a batch (for more convenient processing)
1. Access EXXA Batch API
- Visit EXXA Console and sign in
- Create a new API key in the API key management section
- Ensure your account has sufficient credits to perform operations
For detailed API documentation, refer to our API Docs.
If you have any questions, please reach out to us on Discord or send us an email at founders@withexxa.com.
2. Send Requests
EXXA API provides a seamless way for developers to send requests with just a few lines of code. You need to activate payments on your account to enable your API keys. Use the following code to send a request:
- python
- curl
import requests
import os
api_key = os.environ["EXXA_API_KEY"]
url = "https://api.withexxa.com/v1/requests"
headers = {"X-API-Key": api_key, "Content-Type": "application/json"}
payload = {
    "request_body": {
        "model": "llama-3.1-70b-instruct-fp16",
        "messages": [{"role": "user", "content": "Your query here"}],
        "temperature": 0.7,
        "top_p": 1.0,
        "n": 1,
        "logprobs": False,
        "stream": False,
        "presence_penalty": 0.0,
        "frequency_penalty": 0.0
    }
}
response = requests.post(url, headers=headers, json=payload)
print(response.json())
curl -X POST https://api.withexxa.com/v1/requests \
  -H "X-API-Key: $EXXA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "request_body": {
      "model": "llama-3.1-70b-instruct-fp16",
      "messages": [{"role": "user", "content": "Your query here"}],
      "temperature": 0.7,
      "top_p": 1.0,
      "n": 1,
      "logprobs": false,
      "stream": false,
      "presence_penalty": 0.0,
      "frequency_penalty": 0.0
    }
  }'
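When sending many requests one by one, it can help to wrap the call above in a small helper. The following is a minimal sketch, not part of the EXXA API itself: `build_payload` and `submit_all` are hypothetical helper names, and the code assumes that the response JSON exposes the new request's identifier under an `"id"` key, which you should verify against the API Docs.

```python
from typing import Any, Callable, Dict, List

REQUESTS_URL = "https://api.withexxa.com/v1/requests"
DEFAULT_MODEL = "llama-3.1-70b-instruct-fp16"

# Same sampling parameters as the example request above.
DEFAULT_PARAMS: Dict[str, Any] = {
    "temperature": 0.7,
    "top_p": 1.0,
    "n": 1,
    "logprobs": False,
    "stream": False,
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
}


def build_payload(prompt: str, model: str = DEFAULT_MODEL, **overrides: Any) -> Dict[str, Any]:
    """Wrap a single user prompt in the request envelope shown above."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        **DEFAULT_PARAMS,
        **overrides,  # e.g. temperature=0.2
    }
    return {"request_body": body}


def submit_all(prompts: List[str], api_key: str, post: Callable[..., Any]) -> List[str]:
    """Send one request per prompt and collect the returned request IDs.

    `post` is any callable with the same signature as `requests.post`.
    ASSUMPTION: the response JSON contains the request ID under "id".
    """
    headers = {"X-API-Key": api_key, "Content-Type": "application/json"}
    request_ids = []
    for prompt in prompts:
        response = post(REQUESTS_URL, headers=headers, json=build_payload(prompt))
        response.raise_for_status()  # surface HTTP errors early
        request_ids.append(response.json()["id"])
    return request_ids
```

In a script you would call `submit_all(prompts, os.environ["EXXA_API_KEY"], requests.post)`. Passing `post` as an argument keeps the helper easy to test without network access.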
3. Create a Batch
Once your requests have been sent, you can aggregate them into a batch by listing their request IDs. Assign a name to the batch for easier management:
- python
- curl
import requests
import os
api_key = os.environ["EXXA_API_KEY"]
url = "https://api.withexxa.com/v1/batches"
headers = {"X-API-Key": api_key, "Content-Type": "application/json"}
payload = {
    "requests_ids": ["request_id1", "request_id2"],
    "metadata": {"batch_name": "MyFirstBatch"}
}
response = requests.post(url, headers=headers, json=payload)
print(response.json())
curl -X POST https://api.withexxa.com/v1/batches \
  -H "X-API-Key: $EXXA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "requests_ids": ["request_id1", "request_id2"],
    "metadata": {"batch_name": "MyFirstBatch"}
  }'
The advantages of batching requests are the following:
- Check the status of a group of requests at once
- Cancel a group of requests simultaneously
It is important to note that you are not required to create a batch; you can use EXXA asynchronous API for each request individually and retrieve results for each request.
You can easily check if a request is included in a batch or not by checking the Batch ID parameter of the request. If the Batch ID is null, it is not included in any batch.
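The null check described above could look like the sketch below. It assumes the request object is returned as a JSON dictionary and that the Batch ID is stored under a `"batch_id"` key; that field name is a guess for illustration, so check the actual response schema in the API Docs. The helper names are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple

# ASSUMPTION: the Batch ID field is named "batch_id" in the request JSON.
BATCH_ID_KEY = "batch_id"


def is_in_batch(request: Dict[str, Any]) -> bool:
    """Return True when the request carries a non-null Batch ID."""
    return request.get(BATCH_ID_KEY) is not None


def split_by_batch(
    requests_data: Iterable[Dict[str, Any]],
) -> Tuple[List[Dict[str, Any]], List[Dict[str, Any]]]:
    """Separate requests that belong to a batch from standalone ones."""
    batched: List[Dict[str, Any]] = []
    standalone: List[Dict[str, Any]] = []
    for request in requests_data:
        (batched if is_in_batch(request) else standalone).append(request)
    return batched, standalone
```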
4. Check the Status
1. Batch
Check the status of your batch using the following snippet:
- python
- curl
import requests
import os
api_key = os.environ["EXXA_API_KEY"]
batch_id = 'batch_id_here'
url = f"https://api.withexxa.com/v1/batches/{batch_id}/status"
headers = {"X-API-Key": api_key}
response = requests.get(url, headers=headers)
print(response.json())
curl -X GET https://api.withexxa.com/v1/batches/{batch_id_here}/status \
-H "X-API-Key: $EXXA_API_KEY"
The status of a given batch can be any of the following:
Status | Description |
---|---|
registered | The batch was received and is pending processing |
in progress | The batch was validated; processing underway |
cancelled | The batch was cancelled |
completed | The batch was processed; all results are available |
2. Request
Check the status of your request using the following snippet:
- python
- curl
import requests
import os
api_key = os.environ["EXXA_API_KEY"]
request_id = 'request_id_here'
url = f"https://api.withexxa.com/v1/requests/{request_id}/status"
headers = {"X-API-Key": api_key}
response = requests.get(url, headers=headers)
print(response.json())
curl -X GET https://api.withexxa.com/v1/requests/{request_id_here}/status \
-H "X-API-Key: $EXXA_API_KEY"
The status of a given request can be any of the following:
Status | Description |
---|---|
registered | The request was received and is pending processing |
in progress | The request was validated; processing underway |
cancelled | The request was cancelled |
completed | The request was processed; result is available |
failed | The request was processed but failed and returned an error |
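Because processing is asynchronous, a client typically polls the status endpoint until a final state is reached. The sketch below shows one way to do that. It assumes the status endpoint returns JSON with a `"status"` key holding one of the values from the tables above, and it treats `cancelled`, `completed`, and `failed` as final; `wait_until_done` and its parameters are hypothetical names, not part of the EXXA API.

```python
import time
from typing import Any, Callable, Dict

# Statuses after which no further change is expected (taken from the tables above).
TERMINAL_STATUSES = {"cancelled", "completed", "failed"}


def wait_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 60.0,
    timeout_s: float = 24 * 3600,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll `fetch_status` until a terminal status is returned.

    ASSUMPTION: `fetch_status()` returns a dict with a "status" key,
    for example the parsed JSON body of the status endpoint.
    """
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()["status"]
        if status in TERMINAL_STATUSES:
            return status
        if clock() >= deadline:
            raise TimeoutError(f"still '{status}' after {timeout_s} seconds")
        sleep(interval_s)
```

With the request snippet above, `fetch_status` could be `lambda: requests.get(url, headers=headers).json()`. Since results are delivered over a window of up to 24 hours, a polling interval of a minute or more is usually sufficient.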
5. Retrieve the Results
Once a batch or an individual request has been processed, retrieve the output using:
- python
- curl
import requests
import os
api_key = os.environ["EXXA_API_KEY"]
batch_id = 'batch_id_here'
url = f"https://api.withexxa.com/v1/batches/{batch_id}/results"
headers = {"X-API-Key": api_key}
response = requests.get(url, headers=headers)
print(response.text)
# You could also iterate over the response to get the result of each request
# (this requires `import json` at the top of the script):
# for line in response.iter_lines():
#     if line:
#         result = json.loads(line)
#         print(result)
curl -X GET https://api.withexxa.com/v1/batches/{batch_id_here}/results \
-H "X-API-Key: $EXXA_API_KEY"
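The commented loop in the Python snippet suggests that batch results arrive as one JSON object per line. Under that assumption, the response text can be parsed with a small generator like the one below; `parse_results` is a hypothetical helper name, and the fields inside each result are not specified here.

```python
import json
from typing import Any, Dict, Iterator


def parse_results(text: str) -> Iterator[Dict[str, Any]]:
    """Yield one parsed result per non-empty line.

    ASSUMPTION: the results body is newline-delimited JSON,
    with one JSON object per request.
    """
    for line in text.splitlines():
        line = line.strip()
        if line:  # skip blank lines between records
            yield json.loads(line)
```

For example, `for result in parse_results(response.text): print(result)` processes each request's result in turn.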
6. Cancel a Request or a Batch
If you need to cancel a batch of requests or an individual request, you can do it as follows:
- python
- curl
import requests
import os
api_key = os.environ["EXXA_API_KEY"]
headers = {"X-API-Key": api_key}
# Cancel a batch
batch_id = 'batch_id_here'
url = f"https://api.withexxa.com/v1/batches/{batch_id}/cancel"
response = requests.post(url, headers=headers)
print(response.json())
# Cancel an individual request
request_id = 'request_id_here'
url = f"https://api.withexxa.com/v1/requests/{request_id}/cancel"
response = requests.post(url, headers=headers)
print(response.json())
# Cancel a batch
curl -X POST https://api.withexxa.com/v1/batches/{batch_id_here}/cancel \
-H "X-API-Key: $EXXA_API_KEY"
# Cancel an individual request
curl -X POST https://api.withexxa.com/v1/requests/{request_id_here}/cancel \
-H "X-API-Key: $EXXA_API_KEY"
Note that when you cancel a batch, this action will cancel all requests contained within it. However, canceling a single request from a batch will affect only that request; the other requests in the batch will not be affected and will continue processing.