Deploy

This guide provides detailed steps to deploy your models to CX, whether you prefer the convenience of a command-line interface or the flexibility of cURL requests.

Free Deployments for Open-Source Models

At CX, we are committed to supporting the open-source community. We proudly offer complimentary serverless deployments for any open-source AI model. If you have a model you'd like to deploy, please contact our team, and we'll ensure it's live within 24 hours.

How to Request a Deployment:

  1. Compose an Email: Send your request to [email protected].

  2. Use the Following Format:

Subject: Open Source Deployment Request

Body:
- Model Name: [Name of the model]
- Model Link: [Link to the model repository or source]
- Model Configuration: [E.g., 7B parameters, 12 layers, etc.]

1. Command Line Interface (CLI)

The cx CLI offers a streamlined approach for deploying models to CX.

Step-by-Step Deployment

  1. Log in:

$ cx login --username {username} --password {password}

  2. Push your image to a private registry:

$ cx push <image>:<tag>

  3. Deploy the model:

$ cx deploy --app-name=<app> --image=<image> --model-image=<image with model downloaded> --gpu=<GPU model> --num-cpu-cores=<num cpu cores> --memory=<RAM in GB> --replicas=<num of machines>

Optionally, to deploy a serverless app, use the following command:

$ cx deploy-serverless --app-name=<app> --image=<image> --model-image=<image with model downloaded> --gpu=<GPU model> --num-cpu-cores=<num cpu cores> --memory=<RAM in GB> --concurrency=<num of concurrent requests> --min-scale=<min replicas> --max-scale=<max replicas>
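Put together, a complete deployment might look like the following sketch. The app name, image paths, and resource sizing below are hypothetical placeholders; the script only assembles and prints the command for review, so nothing is executed against your account:

```shell
#!/bin/sh
# Hypothetical values -- substitute your own app name, images, and sizing.
APP_NAME=my-llm-app
IMAGE=registry.computex.ai/acme/inference-server:v1
MODEL_IMAGE=registry.computex.ai/acme/model-downloader:v1

# Assemble the deploy command. It is echoed here so you can review it;
# run it directly (after cx login and cx push) once the values look right.
CMD="cx deploy --app-name=$APP_NAME --image=$IMAGE --model-image=$MODEL_IMAGE --gpu=A40 --num-cpu-cores=4 --memory=16 --replicas=2"
echo "$CMD"
```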

2. cURL Requests

If you prefer to work directly with the API, you can deploy a model or create a serverless deployment using cURL requests.

Standard Deployment

Execute the following command to deploy a model:

curl -X 'POST' \
  'https://api.computex.co/api/v1/deployments/deploy' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer <your-token-generated-from-login>' \
  -H 'Content-Type: application/json' \
  -d '{
  "app_name": "<your-app-name>",
  "container_image": "registry.computex.ai/company-name/image-with-execution-logic",
  "model_image": "registry.computex.ai/company-name/image-with-logic-to-download-model",
  "num_cpu_cores": 1,
  "num_gpu": 0,
  "gpu_sku": "A40",
  "memory": 4,
  "replicas": 1
}'

Remember: Update the -d payload to match your desired deployment configuration and insert the token from the login step in the Authorization header.
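One way to avoid payload typos is to build the JSON in a shell variable and validate it locally before sending. A minimal sketch, assuming `python3` is available for JSON validation; the app name and image paths are hypothetical:

```shell
#!/bin/sh
# Build the deployment payload; the values here are hypothetical examples.
PAYLOAD='{
  "app_name": "my-llm-app",
  "container_image": "registry.computex.ai/acme/inference-server:v1",
  "model_image": "registry.computex.ai/acme/model-downloader:v1",
  "num_cpu_cores": 4,
  "num_gpu": 1,
  "gpu_sku": "A40",
  "memory": 16,
  "replicas": 1
}'

# Sanity-check that the payload is valid JSON before hitting the API.
echo "$PAYLOAD" | python3 -m json.tool > /dev/null && echo "payload OK"

# Then send it with the same curl call as above, e.g.:
# curl -X 'POST' 'https://api.computex.co/api/v1/deployments/deploy' \
#   -H 'accept: application/json' \
#   -H "Authorization: Bearer $TOKEN" \
#   -H 'Content-Type: application/json' \
#   -d "$PAYLOAD"
```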

Serverless Deployment

To deploy your model in a serverless environment:

curl -X 'POST' \
  'https://api.computex.co/api/v1/deployments/deploy_serverless?is_public=false' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer <your-token-generated-from-login>' \
  -H 'Content-Type: application/json' \
  -d '{
  "app_name": "<your-app-name>",
  "container_image": "registry.computex.ai/company-name/image-with-execution-logic",
  "model_image": "registry.computex.ai/company-name/image-with-logic-to-download-model",
  "num_cpu_cores": 1,
  "num_gpu": 0,
  "gpu_sku": "A40",
  "memory": 4,
  "container_concurrency": 1,
  "min_scale": 0,
  "max_scale": 1
}'
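The scaling fields are easy to get backwards, so it can help to sanity-check them locally before submitting. A small sketch using the field names from the payload above, with hypothetical values (a `min_scale` of 0 lets the app scale to zero when idle):

```shell
#!/bin/sh
# Hypothetical serverless scaling values -- adjust to your workload.
MIN_SCALE=0        # 0 allows scale-to-zero when there is no traffic
MAX_SCALE=3
CONCURRENCY=4      # concurrent requests handled per replica

# min_scale must not exceed max_scale, and concurrency must be positive.
if [ "$MIN_SCALE" -le "$MAX_SCALE" ] && [ "$CONCURRENCY" -gt 0 ]; then
  echo "scaling settings OK"
else
  echo "invalid scaling settings" >&2
  exit 1
fi
```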

Available GPUs:

Choose from the following GPU models:

  • H100_NVLINK_80GB

  • A100_NVLINK_80GB

  • A100_NVLINK

  • A100_PCIE_40GB

  • A100_PCIE_80GB

  • A40

  • RTX_A6000

  • RTX_A5000

  • RTX_A4000

  • Tesla_V100_NVLINK