Learn how to submit apps through the `ovhai` CLI. Some parameters are mandatory to deploy an app, while others are optional, depending on your needs. This guide walks through each parameter with examples.
Requirements
- A working `ovhai` CLI (see this guide to get started)
Instructions
This documentation is divided into the following parts:
- Deploying an app
- Setting environment variables
- Assigning a name to the app
- Attaching data
- Attaching compute resources
- Scaling strategy
- Setting tokens and labels
- Making your app public and sharing it
- Changing the default access port
- Setting up a health check
- Changing output format
Deploying an app
If you need any help while submitting a new app, run `ovhai app run --help`.
The command output lists all available arguments and options. The `IMAGE` argument is mandatory: you need to specify a Docker image that you either built yourself or found freely available in a public repository such as DockerHub.
More information about adding and managing public and private registries can be found here.
To launch a basic app, use the following command:
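In its minimal form, the command only takes the image; everything else below is optional. A sketch, with `<image>` as a placeholder for your own Docker image:

```bash
ovhai app run <image>
```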
As the help output indicates, many options can be passed to this command. Let's look at what each of them does and how to use it.
Setting environment variables
You can tweak the behavior of your Docker image without having to rebuild it every time (for example, to change the input source of your model: camera, image, video, file, etc.) by using the `--env` flag, which sets environment variables directly in your app.
The values of these `--env` flags take precedence over those specified in your Dockerfile and in your Python scripts, provided that the Python variable is only initialized if it does not already exist, as follows:
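A minimal sketch of that pattern in Python, using a hypothetical `MODEL_INPUT` variable:

```python
import os

# Use the value injected at launch time (e.g. via --env MODEL_INPUT=...),
# and only fall back to the default if the variable is not set.
MODEL_INPUT = os.environ.get("MODEL_INPUT", "camera")
```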
As explained, this variable can then be overridden when the app is launched by using the `--env` flag:
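A sketch reusing the hypothetical `MODEL_INPUT` variable, assuming the flag takes `NAME=VALUE` pairs:

```bash
ovhai app run --env MODEL_INPUT=video <image>
```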
Assigning a name to the app
To manage your apps more easily and avoid ending up with random names, we recommend that you give your app a name. To do this, use the `--name` parameter in the following way:
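A sketch with a placeholder name:

```bash
ovhai app run --name my-first-app <image>
```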
Attaching data
This step assumes that you either have data in your OVHcloud Object Storage that you wish to use within your deployed app, or that you need to save the data generated by your app into the Object Storage. To learn more about data, volumes, and permissions, check out the data guide.
You can attach as many volumes as you want to your app with various options. Let us go through those options and outline a few good practices with volume mounts.
The `--volume` flag is used to attach a container as a volume to the app. The volume description sets the options for the volume and the synchronization process: `<container@alias/prefix:mount_path(:permission)(:cache)>`, where:
- `container`: the name of the container, in OVHcloud Object Storage, to synchronize
- `alias`: the datastore alias of your data; a list of all available aliases can be obtained by running `ovhai datastore list`
- `prefix` (optional): objects in the container are filtered based on this prefix; only matching objects are synced
- `mount_path`: the location in the app where the synced data is mounted
- `permission` (optional): the permission rights on the mounted data; available rights are read only (`ro`), read write (`rw`) or read write delete (`rwd`)
- `cache` (optional): whether the synced data should be added to the project cache; available options are either `cache` or `no-cache`; data in the cache can be used by other apps without additional synchronization; to benefit from the cache, the new apps also need to mount the data with the cache option
Example:
Let's assume you have a team of data scientists working on the same input dataset, each running their own experiment. In this case, a good practice is to mount the input dataset with `ro` permission and the cache activated: across all experiments, the input data is synced only once and never synced back. In addition, each experiment will yield specific results that should be stored in a dedicated container. For each app, we would then mount an output container with `rw` permission and no cache. If a container does not yet exist in the Object Storage, it is created during the data synchronization.
Assuming our data is located in the Vint Hill (US-EAST-VA) Object Storage in a container named `dataset`, the command would now be:
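A sketch of such a command, assuming `US-EAST-VA` is the datastore alias listed by `ovhai datastore list`, and using a hypothetical `output` container for the results:

```bash
ovhai app run \
    --volume dataset@US-EAST-VA/:/workspace/dataset:ro:cache \
    --volume output@US-EAST-VA/:/workspace/output:rw \
    <image>
```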
Data in the cache does not persist indefinitely. After a period of inactivity, the data is emptied from the cache. Inactivity is defined as having no running apps using the data in the cache.
Attaching compute resources
First, you need to tweak the resources of your app depending on your model's task and your expected workload. To do this, you can use the `--cpu` or `--gpu` flags.
The `--cpu` and `--gpu` flags are exclusive. If GPU resources are specified, then the CPU flag is ignored and vice versa.
You can also use the `--flavor` flag to specify which kind of resources you want to use. You can check the full list by running `ovhai capabilities flavor list`. If this flag is not specified, the default CPU/GPU model of the cluster on which you submit your app will be used.
For example, here is how to launch an app running on 10 CPUs of the flavor whose ID is `ai1-1-cpu`:
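A sketch of that command:

```bash
ovhai app run --cpu 10 --flavor ai1-1-cpu <image>
```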
- If no resource flag is specified (`--cpu` or `--gpu`), the app will run with one unit of the default GPU model.
- If both CPU and GPU flags are provided, only the GPU one is considered.
Scaling strategy
For your app, you can choose either static or automatic scaling.
NOTE: If you do not specify a scaling strategy, the static method will be used with one replica.
For more information about static and automatic scaling strategies, please refer to this documentation.
When to choose static scaling?
The static scaling strategy allows you to choose the number of replicas on which the app will be deployed. For this method, the minimum number of replicas is 1 and the maximum is 10.
- Static scaling can be used if you want to have fixed costs.
- This scaling strategy is also useful when your consumption or inference load is steady.
Here is an example of launching your app on a static scaling strategy, with two replicas:
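A sketch, assuming static scaling is set through a `--replicas` flag (check `ovhai app run --help` for the exact option names on your CLI version):

```bash
ovhai app run --replicas 2 <image>
```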
When to choose autoscaling?
With the autoscaling strategy, it is possible to choose both the minimum number of replicas (1 by default) and the maximum number of replicas. The autoscaler measures the average resource usage across the app's replicas and adds replicas if this average exceeds the specified usage percentage threshold. Conversely, it removes replicas when the average resource utilization falls below the threshold. The monitored metric can be either `CPU` or `RAM`, and the threshold is a percentage (an integer between 1 and 100).
- You can use autoscaling if you have irregular or sawtooth inference loads.
Here is an example of launching your app with an autoscaling strategy that uses between 2 and 12 replicas, monitoring RAM usage with a 60% threshold:
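A sketch, assuming autoscaling flags of the form `--auto-min-replicas`, `--auto-max-replicas`, `--auto-resource-type` and `--auto-resource-usage-threshold` (verify the exact names with `ovhai app run --help`):

```bash
ovhai app run \
    --auto-min-replicas 2 \
    --auto-max-replicas 12 \
    --auto-resource-type RAM \
    --auto-resource-usage-threshold 60 \
    <image>
```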
Setting tokens and labels
Using tokens can help you share your app securely. More information about creating, managing and using tokens can be found here.
To add a token to your app, you can run:
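A sketch, assuming a token subcommand and options as described in the tokens guide linked above; the token name and label selector below are hypothetical, and flag names may differ on your CLI version:

```bash
# Create a read-only token restricted to apps carrying the label group=my-apps
ovhai token create my-token --role read --label-selector group=my-apps
```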
If your token was created with a label selector, you can assign a label to your app. For that, add the parameter:
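A sketch, assuming labels are attached with a `--label` flag taking `key=value` pairs (hypothetical label shown):

```bash
ovhai app run --label group=my-apps <image>
```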
Making your app public and sharing it
If you want your app to be reachable without any authentication, add the `--unsecure-http` parameter:
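A sketch of that command:

```bash
ovhai app run --unsecure-http <image>
```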
You can then share the access URL of your app with anybody. No more authentication will be required to access it.
Changing the default access port
When an app is running, an `app_url` is associated with it, which allows you to access any service exposed in your app. By default, the port exposed through this URL is `8080`.
However, if you use a framework that listens on another port, you can override the default one. For example, port `8501` is the default port used by a Streamlit app. We will therefore use:
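A sketch, assuming the exposed port is changed with a `--default-http-port` flag (verify the exact name with `ovhai app run --help`):

```bash
ovhai app run --default-http-port 8501 <streamlit-image>
```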
This indicates that the port used to reach the app URL is `8501`.
Setting up a health check
To ensure the health and readiness of the Docker image that runs your Python app, you may want to configure a probe path and a probe port. These settings allow you to perform health checks effectively, which play a crucial role in determining the availability and status of your app. To perform these health checks, we will use probes.
- `--probe-path`: By setting a probe path, you define a specific URL endpoint within your app that can be accessed to determine its health. This endpoint should be designed to respond with an appropriate HTTP status code, indicating whether the container is healthy or not. For example, a successful response with an HTTP status code of 200 signifies a healthy state, while any other status code indicates an issue.
- `--probe-port`: The probe port specifies the network port on which the app listens for incoming requests. It allows us to establish a connection with the container and perform the health check.
To set up your health check, you can use the following command:
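A sketch, using a hypothetical `/health` endpoint exposed on port 8080:

```bash
ovhai app run --probe-path /health --probe-port 8080 <image>
```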
Changing output format
When you use the `ovhai app run` command, a lot of information is returned (app ID, app link, app resources, etc.). You can display all this information in a specific format, such as JSON or YAML, by using the `--output` parameter. Here is an example with the `json` format:
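A sketch of that command, assuming `json` is accepted as a value of the `--output` parameter:

```bash
ovhai app run --output json <image>
```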
Going further
To learn more about the CLI and the available commands to interact with your app, check out the overview of `ovhai`.
For more information and tutorials, please see our other AI & Machine Learning support guides or explore the guides for other OVHcloud products and services.
If you need training or technical assistance to implement our solutions, contact your sales representative or click on this link to get a quote and ask our Professional Services experts for a custom analysis of your project.