Build


Deploying your model to ComputeX (CX) lets you leverage infrastructure optimized for fast serverless inference. The deployment involves creating two Docker images: model-download-image and model-run-image.

  • The model-download-image contains only the model, which is tensorized and stored on an attached NVMe SSD for exceptionally fast model load times.

  • The model-run-image is a slim image carrying the logic needed to execute inference.

Model Download Image

How it Works

  1. The model_download.py script tells the system which HuggingFace model to download.

  2. The model is tensorized and stored on an attached NVMe SSD, enabling rapid model load times (a sketch of the script follows below).
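
A minimal sketch of what model_download.py might look like, assuming the transformers and tensorizer libraries and the /mnt layout shown under Directory structure below. The MODEL_NAME value and the exact paths are illustrative assumptions, not a confirmed CX contract:

# model_download.py -- hedged sketch, not the canonical CX script.
# Assumes transformers + tensorizer are installed and /mnt is the
# attached NVMe volume described under "Directory structure" below.
import os

import torch
from tensorizer import TensorSerializer
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "EleutherAI/gpt-j-6b"  # illustrative; use your model's HF id
short_name = MODEL_NAME.split("/")[-1]
model_dir = os.path.join("/mnt", short_name)
os.makedirs(model_dir, exist_ok=True)

# Download the tokenizer and save it where the run image expects it.
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
tokenizer.save_pretrained(os.path.join(model_dir, "pretrained_tokenizer"))

# Download the model weights from HuggingFace.
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME, torch_dtype=torch.float16)

# Tensorize the weights onto the NVMe SSD for fast startup.
serializer = TensorSerializer(os.path.join(model_dir, f"{short_name}.tensors"))
serializer.write_module(model)
serializer.close()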

Follow along with the example to see how to create the model download image.

Steps

  1. Modify the model_download.py script to reference the HuggingFace model you want to download.

  2. Build and upload the image to your private CX Docker registry hosted at registry.computex.ai.

docker build -t registry.computex.ai/<org-name>/model-download-image-{MODEL-NAME}:v1 .
docker push registry.computex.ai/<org-name>/model-download-image-{MODEL-NAME}:v1
  3. Deploy your model to CX using the CLI command:

cx deploy --app <app-name> --model-image registry.computex.ai/<org-name>/model-download-image-{MODEL-NAME}:v1 

Directory structure

Inside the container, the directory structure will be as follows:

/mnt
    └── {model_name}
        ├── pretrained_tokenizer
        ├── {model_name}
        └── {model_name}.tensors

Each app gets its own Persistent Volume, ensuring exceptionally fast bootup times.

Model Run Image

How it Works

  1. The load_model.py script loads the tensorized model from the attached volume, and predict.py executes inference against it (see the sketch below).

  2. Depending on the model, you may need to adjust these scripts to use specific HuggingFace libraries.
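
A hedged sketch of a load_model.py, assuming the /mnt layout produced by the download image and the tensorizer and transformers libraries. The load_model function name, the path constants, and loading the config from the inner {model_name} directory are illustrative assumptions:

# load_model.py -- hedged sketch, assuming the /mnt layout produced by
# model_download.py and the tensorizer/transformers libraries.
import os

from tensorizer import TensorDeserializer
from tensorizer.utils import no_init_or_tensor
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

MODEL_NAME = "gpt-j-6b"  # illustrative; matches {model_name} on the volume
MODEL_DIR = os.path.join("/mnt", MODEL_NAME)


def load_model():
    """Rehydrate the tensorized model from the NVMe volume."""
    tokenizer = AutoTokenizer.from_pretrained(
        os.path.join(MODEL_DIR, "pretrained_tokenizer")
    )

    # Build the model skeleton without allocating or initializing weights
    # (assumes the model config lives in the inner {model_name} directory)...
    config = AutoConfig.from_pretrained(os.path.join(MODEL_DIR, MODEL_NAME))
    with no_init_or_tensor():
        model = AutoModelForCausalLM.from_config(config)

    # ...then stream the tensorized weights straight off the SSD.
    deserializer = TensorDeserializer(
        os.path.join(MODEL_DIR, f"{MODEL_NAME}.tensors"), device="cuda"
    )
    deserializer.load_into_module(model)
    deserializer.close()

    return tokenizer, model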

Steps

  1. Update the code in load_model.py and predict.py to suit your model. Refer to your model's card on HuggingFace to determine which HuggingFace libraries you need to call (a predict.py sketch follows these steps).

  2. Once done, package the image using Docker:

docker build -t registry.computex.ai/<org-name>/model-run-image-{MODEL-NAME}:v1 .
docker push registry.computex.ai/<org-name>/model-run-image-{MODEL-NAME}:v1
  3. Deploy the model to CX using the CLI, specifying the run image with the --image flag along with the GPU type, memory capacity, and number of CPU cores:

cx deploy --app <app-name> --image registry.computex.ai/<org-name>/model-run-image-{MODEL-NAME}:v1 --gpu A40 --memory 16 --num-cpu-cores 8
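
For step 1, a minimal sketch of what predict.py might contain, assuming a causal language model loaded by the load_model.py sketch above. The predict function name and its request shape are illustrative assumptions, not a documented CX interface:

# predict.py -- hedged sketch of the inference entrypoint; the function
# name and request shape are illustrative, not a documented CX contract.
import torch

from load_model import load_model  # the sketch above

tokenizer, model = load_model()
model.eval()


def predict(prompt: str, max_new_tokens: int = 128) -> str:
    """Run one generation against the loaded model."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)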

Follow along with the example to see how to deploy a model to CX.
