Definition
An app in AI Deploy is the workload unit submitted to the cluster. An app runs as a Docker container within the OVHcloud infrastructure.
Each AI Deploy app is linked to a Public Cloud project and specifies the amount of resources needed to run the inference task, along with a Docker image. The image can be publicly available, stored in the AI Deploy shared registry scoped to your project, or stored in a private registry of your choosing. For the latter, see the OVHcloud documentation on how to add, use, and manage registries.
Considerations
NOTE: An app runs indefinitely until it is stopped manually.
- Data can be attached to an app to serve as input (e.g., model weights).
- If you do not customize your resource request, the default request is 1 GPU (NVIDIA L4). Memory is not customizable.
- Scaling for apps depends on the chosen configuration. It is either static or automatic; automatic scaling is driven by a trigger threshold on a metric chosen by the user.
- Billing for apps is minute-based and applies during the SCALING and RUNNING states of the application. Each commenced minute is billed completely.
- You can read further on app limitations here.
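The per-minute rounding described in the billing point above can be illustrated with a short calculation. This is a minimal sketch, not OVHcloud's billing implementation: the function name `billed_minutes` is hypothetical, and it assumes the rounding is applied to the total time spent in the SCALING and RUNNING states rather than to each period separately.

```python
import math


def billed_minutes(billable_seconds: float) -> int:
    """Return the number of whole minutes billed for a billable duration.

    Assumption: `billable_seconds` is the total time the app spent in the
    SCALING and RUNNING states, and every commenced minute counts in full.
    """
    if billable_seconds < 0:
        raise ValueError("duration cannot be negative")
    # Round up: a minute that has started is billed as a full minute.
    return math.ceil(billable_seconds / 60)


print(billed_minutes(61))   # 2: the second minute has started, so it is billed in full
print(billed_minutes(600))  # 10: exactly ten minutes
```

Under these assumptions, an app that ran for 61 seconds is billed for 2 minutes, the same as one that ran for 120 seconds.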
Under the hood
Apps in AI Deploy are Docker containers within the OVHcloud infrastructure.
App lifecycle
During its lifetime, the app transitions between the following states:
NOTE: Only the RUNNING and SCALING time of the app is billed. For more information about app billing, refer to this documentation.
- QUEUED: The app deployment request is about to be processed.
- INITIALIZING: The app is being started, and the remote data (if any) is synchronized.
- RUNNING: The app is running, you can connect to it, compute resources (GPUs/CPUs) are allocated to your specific app, and an HTTP endpoint is available.
- SCALING: The app deployment is scaling up or down, depending on the scaling configuration. While scaling, the app is still available if it was running before.
- STOPPING: The app is stopping, your compute resources are freed, and ephemeral data is deleted.
- STOPPED: The app ended normally, and you can restart it whenever you want or delete it.
- FAILED: The app ended in error, e.g., the Docker image is invalid (unreachable, built with Linux/ARM, etc.).
- ERROR: The app ended due to a backend error (an issue on the OVHcloud side).
- DELETING: The app is being removed.
- DELETED: The app is fully deleted.
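The state list can be captured as a small enumeration, which makes it straightforward to work out how much of an app's history falls in billable states. The sketch below is illustrative only: `AppState`, `BILLABLE_STATES`, and `billable_seconds` are hypothetical names, and the history format (pairs of a state and the seconds spent in it) is an assumption rather than something AI Deploy returns.

```python
from enum import Enum


class AppState(Enum):
    """App states as listed in this guide."""
    QUEUED = "QUEUED"
    INITIALIZING = "INITIALIZING"
    RUNNING = "RUNNING"
    SCALING = "SCALING"
    STOPPING = "STOPPING"
    STOPPED = "STOPPED"
    FAILED = "FAILED"
    ERROR = "ERROR"
    DELETING = "DELETING"
    DELETED = "DELETED"


# Per this guide, only time spent in RUNNING and SCALING is billed.
BILLABLE_STATES = frozenset({AppState.RUNNING, AppState.SCALING})


def billable_seconds(history):
    """Sum the seconds spent in billable states.

    `history` is a hypothetical list of (AppState, seconds) pairs.
    """
    return sum(seconds for state, seconds in history if state in BILLABLE_STATES)


history = [
    (AppState.QUEUED, 5),
    (AppState.INITIALIZING, 40),
    (AppState.RUNNING, 300),
    (AppState.SCALING, 20),
    (AppState.RUNNING, 100),
    (AppState.STOPPING, 10),
]
print(billable_seconds(history))  # 420
```

In this example, the time spent in QUEUED, INITIALIZING, and STOPPING is ignored, leaving 420 billable seconds.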
Go further
- You can check the OVHcloud documentation on how to create a data container.
- You can check the OVHcloud documentation on how to submit an app.
For more information and tutorials, please see our other AI & Machine Learning support guides or explore the guides for other OVHcloud products and services.