AI Deploy - Getting started

Learn how to get started with AI Deploy, one part of a set of OVHcloud-managed AI tools, by deploying your first application using either the OVHcloud Control Panel or the ovhai Command Line Interface.

AI Deploy is covered by OVHcloud Public Cloud Special Conditions.

Requirements

A Public Cloud project in your OVHcloud account

OVHcloud Control Panel Access

Direct link: Public Cloud Projects
Navigation path: Public Cloud > Select your project

Instructions

Once you access your Public Cloud project, navigate to the AI Deploy area in the AI & Machine Learning section.

Click on the Deploy an app button and accept the terms and conditions, if any.

Once clicked, you will be redirected to the creation process detailed below.

Deploying your first application

Proceed through the ordering process, customizing your deployment as you go. Each option is described in the sections below.

App name

First, choose a name for your AI Deploy app to make it easier to manage all your apps, or accept the automatically generated name if it meets your needs.

Location

Select the physical location where your AI Deploy app will be hosted.

You can find the capabilities for AI Deploy in the guide AI Deploy capabilities.

Resources

To deploy an AI Deploy app, you must allocate compute resources. The app supports a range of resource configurations:

GPU Resources: 1 to 4 GPUs
CPU Resources: 1 to 12 CPUs

Note that each instance is billed based on its running time.

Each compute resource includes:

CPU or GPU cores: Processing power for your app
RAM: Memory of your app
Local Storage: Storage space for your app

You can adjust the Resource Size to customize the allocation of CPU and GPU cores, RAM, and local storage to meet your app's specific needs.

Application to deploy

AI Deploy allows a user to deploy applications from two sources:

From your own Docker image, giving you the full flexibility to deploy what you want. This image can be stored on many types of registry (OVHcloud Managed Private Registry, Docker Hub, GitHub packages, etc.), and the expected format is <registry-address>/<image-identifier>:<tag-name>.
From an OVHcloud catalog with already built-in AI models and applications.
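To make the expected image format concrete, here is a minimal Python sketch that splits a reference of the form <registry-address>/<image-identifier>:<tag-name> into its parts. The names ImageRef and parse_image_ref are hypothetical, and the rule used to recognize a registry address (the first component contains a dot or a colon, or is localhost) is a heuristic assumed for this illustration, not something AI Deploy documents.

```python
from typing import NamedTuple, Optional


class ImageRef(NamedTuple):
    registry: Optional[str]  # None when the reference has no registry prefix
    identifier: str
    tag: Optional[str]       # None when no tag is given


def parse_image_ref(ref: str) -> ImageRef:
    """Split '<registry-address>/<image-identifier>:<tag-name>' into parts."""
    if not ref or any(ch.isspace() for ch in ref):
        raise ValueError(f"invalid image reference: {ref!r}")

    # A ':' located after the last '/' introduces the tag. A ':' before the
    # last '/' belongs to the registry address (for example a port number).
    last_slash = ref.rfind("/")
    colon = ref.rfind(":")
    if colon > last_slash:
        name, tag = ref[:colon], ref[colon + 1:]
    else:
        name, tag = ref, None

    # Heuristic (assumption): the first component is a registry address only
    # if it looks like a host name.
    first, sep, rest = name.partition("/")
    if sep and ("." in first or ":" in first or first == "localhost"):
        return ImageRef(first, rest, tag)
    return ImageRef(None, name, tag)
```

With this sketch, the demonstration image name used later in this tutorial, ovhcom/ai-deploy-hello-world, parses as an identifier with no explicit registry and no explicit tag.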

In this tutorial, we will select an OVHcloud Docker image to deploy your first AI Deploy app. The objective is to deploy and call a simple Flask API, which replies with Hello followed by the name you sent and a short congratulation message. There is no web interface; you will have an API endpoint that you can reach via HTTP.

If you want to deploy your own image, you need to comply with a few rules, like adding a specific user. Follow our Build and use custom images guide. You may also be interested in our Registries - Use & manage your registries guide.

To use this demonstration OVHcloud Docker image, enter the following name as the Custom Docker image: ovhcom/ai-deploy-hello-world. Then click the + button to confirm.

You can find this image in the OVHcloud DockerHub. For more information about this Docker image, please check the GitHub repository.

Scaling

Then you can modify the Number of replicas on which your AI Deploy app will be deployed, according to a scaling strategy.

The static scaling strategy allows you to choose the number of replicas on which the app will be deployed. For this method, the minimum number of replicas is 1, and the maximum is 10.

Static scaling can be used if you want to have fixed costs.
This scaling strategy is also useful when your consumption or inference load is fixed.

With the autoscaling strategy, you choose both the minimum number of replicas (1 by default) and the maximum number of replicas. The platform measures the average resource usage across the app's replicas and adds instances when this average exceeds the specified usage threshold. Conversely, it removes instances when the average falls below the threshold. You can even downscale to 0 if you have no usage, thereby limiting costs. The monitored metric can be either CPU or RAM, and the threshold is a percentage (an integer between 1 and 100).

You can use autoscaling if you have irregular or sawtooth inference loads.
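The threshold rule described for autoscaling can be sketched in a few lines of Python. This is a simplified illustration only, not the platform's actual algorithm: the function name desired_replicas is hypothetical, and changing the replica count by one at a time is an assumption made to keep the example short.

```python
from typing import Sequence


def desired_replicas(current: int, usages: Sequence[float], threshold: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Return the replica count suggested by a simple threshold rule.

    usages holds one usage percentage (CPU or RAM) per running replica.
    """
    if not 1 <= threshold <= 100:
        raise ValueError("threshold must be an integer between 1 and 100")
    if not 0 <= min_replicas <= max_replicas:
        raise ValueError("expected 0 <= min_replicas <= max_replicas")

    target = current
    if usages:
        average = sum(usages) / len(usages)
        if average > threshold:
            target = current + 1   # assumption: scale up one replica at a time
        elif average < threshold:
            target = current - 1   # assumption: scale down one replica at a time

    # Never leave the [min_replicas, max_replicas] range chosen by the user.
    return max(min_replicas, min(target, max_replicas))
```

For example, with two replicas at 90% and 80% CPU and a threshold of 75, the average (85%) exceeds the threshold, so the sketch suggests three replicas, provided the maximum allows it.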

For more detailed information about scaling strategies, please refer to our AI Deploy - Scaling strategies guide.

HTTP Port

The default exposed port for your app's URL is 8080. However, if you are using a specific framework that requires a different port, you can override the default port and configure your application to use the desired alternative port.

Privacy

You can choose public access (open to the internet) or restricted access.

Public access means that everyone is authorized. Use this option carefully. Usually, public access is used for tests, but not in production, since everyone will be able to use your app.

On the other hand, restricted access will require credentials to access the app. Two options are available in this case:

An AI Platform user. This works like a username and password restriction. It is simple, but offers little granularity.
An AI token (preferred solution). Tokens are very effective since you can link them to labels, which are set in the Advanced configuration section.

We will select Restricted access for this deployment.

Advanced configuration

This step allows you to customize your AI Deploy app with additional features. You can choose to configure one, several, or none of the options below.

Commands

You can override the default Docker command (entrypoint) with a custom command. This is useful if your Docker image has a specific entrypoint that you want to modify.

Volumes

If your application is based on external data, such as scripts or models, you can upload this data to an Object Storage or a GitHub repository, and then mount these storage solutions on your app.

You can also mount an Object Storage as an output folder, for example, to retrieve the data generated by your application.

You can attach as many volumes as you want to your app with various options.

In both cases, you will have to specify:

Storage container or Git repository URL: The name of the container to synchronize or the GitHub repository URL (the one that ends with .git).
Mount directory: The location in the app where the synced data is mounted.

There are also optional parameters:

Authorization: The permission rights on the mounted data. Available rights are Read Only (ro) or Read Write (rw). The default value is rw.
Cache: Whether the synced data should be added to the project cache. Data in the cache can be used by other apps without additional synchronization. To benefit from the cache, the new apps also need to mount the data with the cache option.
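The required and optional volume parameters listed above can be summarized as a small data structure. The sketch below is purely illustrative: the class name VolumeSpec and its field names are hypothetical, and the check that the mount directory is an absolute path is an assumption rather than a documented rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeSpec:
    source: str               # storage container name or Git repository URL
    mount_directory: str      # location in the app where the data is mounted
    permission: str = "rw"    # "ro" (Read Only) or "rw" (Read Write), default rw
    cache: bool = False       # whether the synced data joins the project cache

    def __post_init__(self) -> None:
        if not self.source:
            raise ValueError("a container name or Git repository URL is required")
        if not self.mount_directory.startswith("/"):
            # Assumption for this sketch: mount directories are absolute paths.
            raise ValueError("mount directory must be an absolute path")
        if self.permission not in ("ro", "rw"):
            raise ValueError("permission must be 'ro' or 'rw'")

    @property
    def is_git(self) -> bool:
        """True when the source looks like a Git URL (it ends with .git)."""
        return self.source.endswith(".git")
```

Validating these values before filling in the form or building a command makes typos such as an unknown permission visible early.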

To learn more about data, volumes, and permissions, check out our data guide.

Labels

You can add some Key/Value labels to filter or organize your AI Deploy app access.

As an example, add a label with Key=owner and Value=test.

This will make the application accessible only to users who have the token associated with this label (key/value).
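The idea behind label-based access can be illustrated with a short Python sketch: a token carries a label selector, and it grants access only to apps whose labels match. This is a conceptual model with hypothetical function names, not the platform's implementation, and it handles only a single key=value selector.

```python
from typing import Dict, Tuple


def parse_label(text: str) -> Tuple[str, str]:
    """Parse a 'key=value' string into a (key, value) pair."""
    key, sep, value = text.partition("=")
    if not sep or not key.strip():
        raise ValueError(f"expected 'key=value', got {text!r}")
    return key.strip(), value.strip()


def token_grants_access(app_labels: Dict[str, str], selector: str) -> bool:
    """Return True when the app carries the label named by the selector."""
    key, value = parse_label(selector)
    return app_labels.get(key) == value
```

With the example label above, an app labeled owner=test is reachable with a token whose selector is owner=test, but not with a token whose selector names a different value.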

Learn more about this feature in the AI Deploy - Accessing your app with tokens guide.

Availability probe

Finally, you can enable the Readiness probe feature. To do so, provide:

Probe API endpoint: the /health endpoint of your app.
Probe port: The port associated with the probe endpoint.

This allows you to monitor the health of your app and ensure it is ready to receive traffic.
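If your own image does not expose a health endpoint yet, the following sketch shows one way to add a /health route using only the Python standard library. It rests on assumptions: the handler and function names are hypothetical, the JSON body is arbitrary, and the sketch assumes that an HTTP 200 response is what signals readiness. The port defaults to 8080 to match the default HTTP port mentioned earlier.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Dict, Tuple


def route(path: str) -> Tuple[int, Dict[str, str]]:
    """Return the (status code, JSON payload) for a GET request path."""
    if path == "/health":
        return 200, {"status": "ok"}
    return 404, {"error": "not found"}


class AppHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, payload = route(self.path)
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def make_server(port: int = 8080) -> HTTPServer:
    # Bind to all interfaces so the port is reachable from outside the container.
    return HTTPServer(("0.0.0.0", port), AppHandler)
```

Calling make_server().serve_forever() starts the server. In the Control Panel, the probe endpoint would then be /health and the probe port 8080.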

Review and launch your AI Deploy app

This final step is a summary of your AI Deploy app deployment. You can review the previously selected options and parameters.

You can also generate the equivalent ovhai CLI command, which enables you to deploy the same application using the command line. This CLI can be downloaded here. For more information, consult the CLI - Launch an AI Deploy app documentation.

Launch your AI Deploy app by clicking on Order now. Please note that your app will not be immediately available, as it requires some time to:

Pull the Docker image
Mount any configured data volumes

Once the deployment is complete, your first AI Deploy app will be running in production and ready to be accessed.

Connect to your first AI Deploy app

Step 1: Check your AI Deploy app status

First, open your app details and verify that your AI Deploy app has reached the RUNNING status.
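If you script your deployments, waiting for the RUNNING status can be expressed as a simple polling loop. The sketch below is generic: get_status is a hypothetical callable that you would implement yourself (for example, by reading the status from the Control Panel API or the CLI output), and the timeout and interval values are arbitrary.

```python
import time
from typing import Callable


def wait_until_running(get_status: Callable[[], str],
                       timeout_s: float = 600.0,
                       interval_s: float = 5.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll get_status() until it returns "RUNNING" or the timeout expires.

    Returns True if the app reached RUNNING, False if the timeout expired.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == "RUNNING":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)
```

The sleep parameter exists only so the loop can be exercised in tests without actually waiting.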

Note that you can usually open a deployed application by clicking the blue HTTP access button, which exposes the default HTTP port of your app. However, since we have deployed a Flask API in this tutorial, you won't be able to use it through the HTTP access button, as no web interface was deployed.

Step 2: Generate a security token

During the deployment process, we selected Restricted access. To query your app, you first need a valid security token.

In your OVHcloud Control Panel left menu, go to the AI Dashboard in the AI & Machine Learning section. Select the Tokens tab.

Click + Create a token, then fill in a name, label selector, role, and region.

Here are a few explanations:

Label selector: you can restrict the token to apps that carry specific labels. You can specify an ID, a type, or any previously created label, such as owner=test in our case.
Role: AI Platform Operator can read and manage your AI Deploy app. AI Platform Read only can only read your AI Deploy app.
Region: tokens are regionalized. Select the region related to your AI Deploy app.

Generate your first cURL query

Now that your AI Deploy app is running and the token has been generated, you are ready for your first query.

Since we are on restricted access, you will need to specify the authentication token in the header following this format:

-H "Authorization: Bearer $YOURTOKENHERE"

In our case, the exact cURL code is:

curl --request POST \
  --url https://9b5b651e-8514-43d0-ae68-af801771542f.app.us-east-va.ai.cloud.ovh.us \
  -H "Authorization: Bearer wtaGrsPLRB+vKSCVYypZ1/TMR0ZWYBKlal0FntyHNZFmbosiBMviEi8p8UvPdjeH" \
  --header "Content-Type: application/json" \
  --data '"Elea"'

Which gives us:

 "Hello Elea. Congratulations, you have launched your first AI App!"

If you see this message with the name you provided, you have successfully launched your first app!

Generate your first Python query

If you want to query this API with Python, this code sample using the Python Requests library may suit you. First, export your token as an environment variable in your shell:

export AI_APP_TOKEN=token_value

Then run the following Python script:

import json
import os

import requests

url = "https://9b5b651e-8514-43d0-ae68-af801771542f.app.us-east-va.ai.cloud.ovh.us"

# Read the token from the AI_APP_TOKEN environment variable exported above
token = os.environ["AI_APP_TOKEN"]

headers = {
    "Content-Type": "application/json",
    "Accept-Charset": "UTF-8",
    "Authorization": f"Bearer {token}",
}

# The API expects the name as a JSON-encoded string in the request body
data = "Elea"
j_data = json.dumps(data)

r = requests.post(url, data=j_data, headers=headers)

print(r.status_code)
print(r.text)

Result:

200
 "Hello Elea. Congratulations, you have launched your first AI App!"

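If the Requests library is not available in your environment, the same query can be sketched with the Python standard library only. The function names build_query and send_query are hypothetical, and the 10-second timeout is an arbitrary choice. The header format and the JSON-encoded body follow the cURL example earlier in this guide.

```python
import json
import urllib.request


def build_query(url: str, token: str, name: str) -> urllib.request.Request:
    """Build the POST request: a Bearer token header and a JSON string body."""
    body = json.dumps(name).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


def send_query(request: urllib.request.Request, timeout_s: float = 10.0) -> str:
    """Send the request and return the response body as text."""
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return response.read().decode("utf-8")
```

A typical call would be send_query(build_query(url, token, "Elea")), where url is your app endpoint and token is your AI token.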
Stop and delete your AI Deploy app

You have the flexibility to keep your AI Deploy app running for an indefinite period. At any time, you can easily stop your application, using either the UI (OVHcloud Control Panel) or the ovhai CLI.

Using the OVHcloud Control Panel

Go to the AI Deploy section, click the more options ... button to the right of your app, and select Stop.

Once stopped, your AI Deploy app frees up the previously allocated compute resources. Your endpoint is kept, and if you restart your AI Deploy app, the same endpoint can be reused seamlessly. A stopped app no longer reserves compute resources, so you are no longer billed for compute. Only attached storage may still incur charges.

If you want to completely delete your AI Deploy app, just click on the Delete action. Be sure to also delete your Object Storage data if you don't need it anymore, by going to the Object Storage section (in the Storage category).

Using the ovhai CLI

To follow this part, make sure you have installed the ovhai CLI on your computer or on an instance.

You can easily stop your AI Deploy application using the following command:

ovhai app stop <APP_UUID>

Once stopped, your AI Deploy app frees up the previously allocated compute resources. Your endpoint is kept, and if you restart your AI Deploy app, the same endpoint can be reused seamlessly. To restart the app, run:

ovhai app start <APP_UUID>

A stopped app no longer reserves compute resources, so you are no longer billed for compute. Only attached storage may still incur charges.

If you want to completely delete your AI Deploy app, just run the following command:

ovhai app delete <APP_UUID>

Be sure to also delete your Object Storage data if you don't need it anymore. To do this, you will need to empty it first, then delete it:

ovhai bucket object delete --all <object_storage_name>@<region>

ovhai bucket delete <region> <object_storage_name>
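If you manage app lifecycles from a script, the ovhai app commands shown above (stop, start, delete) can be wrapped in a small Python helper. This is a sketch with hypothetical function names. It only assembles the same command lines as in this guide and assumes the ovhai CLI is installed and already authenticated.

```python
import subprocess
from typing import List

# The three app actions used in this guide.
ALLOWED_ACTIONS = ("start", "stop", "delete")


def app_command(action: str, app_uuid: str) -> List[str]:
    """Return the argument list for 'ovhai app <action> <APP_UUID>'."""
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    if not app_uuid:
        raise ValueError("an app UUID is required")
    return ["ovhai", "app", action, app_uuid]


def run_app_command(action: str, app_uuid: str) -> None:
    """Run the command and raise CalledProcessError if the CLI reports a failure."""
    subprocess.run(app_command(action, app_uuid), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if a value contains unexpected characters.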

Go further

You can imagine deploying an AI model for sketch recognition thanks to AI Deploy. Refer to this tutorial.
Do you want to use Streamlit in order to create an app? Follow this guide.

For more information and tutorials, please see our other AI & Machine Learning support guides or explore the guides for other OVHcloud products and services.

If you need training or technical assistance to implement our solutions, contact your sales representative or click on this link to get a quote and ask our Professional Services experts for a custom analysis of your project.