MLOps Workflow integration
This doc explains how to integrate MLOps Workflow in your application
When working in teams, it's important to stay informed about what has happened with your data over time and to be able to reproduce an experiment using the same data that was initially used. The workflow represents this history as a graph in which data cards are connected to the application cards that process the data.
The Workflow functionality provides:
- Assurance that experiments are reproducible and models can be consistently retrained with the same results.
- A version control system for projects and models, enabling effective management of changes and tracking of their origins.
- Simplified tracking of the source and evolution of projects and models.
- Enhanced collaboration among team members.
- Quick access to view application sessions, open file locations for downloading, or navigate directly to projects, etc.
To access the Workflow, you can use three entry points: the context menu of the project, task, or workspace.
Almost every process on the platform has input and output data, so the workflow is mostly built around tasks.
Therefore, when you develop an application, it is good practice to mark its data import and export points so that they appear in the workflow.
The following types of input and output data can be displayed in the workflow.
Input data types
- project
- dataset
- file in Files
- folder in Files
- task
- labeling job
Output data types
- project
- dataset
- file in Files
- folder in Files
- task
- labeling job
- app session
Applications you write using the SDK use an instance of the Api class to send requests. When the Api instance is initialized, it automatically initializes the AppApi and Workflow for it. Therefore, in your application's code, you need to use the corresponding methods in the places where you handle data retrieval and/or export.
These methods add project cards with preview and a link that allows you to directly access the project.
- project (Optional[Union[int, ProjectInfo]]): Project ID or ProjectInfo object.
- version_id (Optional[int]): Version ID of the project. Input only.
- version_num (Optional[int]): Version number of the project. This argument can only be used in conjunction with the project argument.
- task_id (Optional[int]): Task ID. If not specified, the task ID will be determined automatically.
- meta (Optional[Union[WorkflowMeta, dict]]): Additional data for node customization.
📝 Customization of the project node is not supported. All customizations will be ignored. You can only customize the main node with this method.
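A minimal sketch of how these calls might be placed in an app. The method names `api.app.workflow.add_input_project` and `api.app.workflow.add_output_project` are assumptions (they are not named in the text above), and the helper `mark_project_io` is hypothetical. The `api` object is passed in, so the function can be used with an already initialized Api instance; verify the names against your SDK version.

```python
from typing import Any, Optional


def mark_project_io(
    api: Any,
    input_project_id: int,
    output_project_id: int,
    input_version_id: Optional[int] = None,
) -> None:
    """Register one input and one output project card for the current task.

    ASSUMPTION: the workflow methods are reachable as `api.app.workflow.*`.
    """
    workflow = api.app.workflow
    # version_id is marked "input only" in the parameter list above.
    workflow.add_input_project(project=input_project_id, version_id=input_version_id)
    # task_id is omitted here, so it is determined automatically.
    workflow.add_output_project(project=output_project_id)
```

Passing the project by ID keeps the call cheap; a ProjectInfo object is accepted as well according to the parameter list.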
These methods add dataset cards with preview and a link that allows you to directly access the project.
- dataset (Union[int, DatasetInfo]): Dataset ID or DatasetInfo object.
- task_id (Optional[int]): Task ID. If not specified, the task ID will be determined automatically.
- meta (Optional[Union[WorkflowMeta, dict]]): Additional data for node customization.
These methods add file cards with corresponding icons and a link that allows you to directly access the catalog where the files are stored.
You can set a parameter that helps identify that a file is a model with weights.
- file (Union[int, FileInfo, str]): File ID, FileInfo object, or file path in Team Files.
- model_weight (bool): Flag to indicate if the file is a model weight.
- task_id (Optional[int]): Task ID. If not specified, the task ID will be determined automatically.
- meta (Optional[Union[WorkflowMeta, dict]]): Additional data for node customization.
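As an illustration of the model_weight flag, here is a short sketch for a training app that saves a checkpoint to Team Files. The method name `api.app.workflow.add_output_file` is an assumption, and `mark_checkpoint_output` is a hypothetical helper.

```python
from typing import Any


def mark_checkpoint_output(api: Any, remote_path: str) -> None:
    """Add an output file card and flag the file as model weights.

    ASSUMPTION: the method is exposed as `api.app.workflow.add_output_file`.
    `remote_path` is the path of the checkpoint in Team Files.
    """
    api.app.workflow.add_output_file(file=remote_path, model_weight=True)
```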
These methods are the same as for files, but already customized for folders.
- path (str): Path to the folder in Team Files.
- task_id (Optional[int]): Task ID. If not specified, the task ID will be determined automatically.
- meta (Optional[Union[WorkflowMeta, dict]]): Additional data for node customization.
These methods add task cards with application icon and a link that allows you to directly access task logs.
- input_task_id / output_task_id (int): Task ID that is used as input or output.
- task_id (Optional[int]): Task ID. If not specified, the task ID will be determined automatically.
- meta (Optional[Union[WorkflowMeta, dict]]): Additional data for node customization.
This method is used to add a card that indicates the application with GUI has an offline session where you can find the result of its work.
- task_id (Optional[int]): Task ID. If not specified, the task ID will be determined automatically.
- meta (Optional[Union[WorkflowMeta, dict]]): Additional data for node customization.
This method is used to add a card that indicates the application either uses a labeling job as a data source or creates a job as a result of its work.
- id (int): Labeling Job ID.
- task_id (Optional[int]): Task ID. If not specified, the task ID will be determined automatically.
- meta (Optional[Union[WorkflowMeta, dict]]): Additional data for node customization.
Each card added to the workflow can be customized in terms of the data it displays and the connection it relates to. You can also customize the main node when the card is created. Customizations can therefore be made for two linked elements in a single operation: the relation, which is the input or output data card, and the main node, which is usually a task.
However, if you configure the main node settings across different inputs or outputs, the state of the main node will be overwritten by the last action.
To prepare the settings for any card, use the WorkflowSettings class.
The class has the following properties, all of which are Optional:
- title (str): Title of the node. It is displayed in the node header and formatted with the <h4> tag.
- description (str): Description of the node. It is displayed under the title line, formatted with the <small> tag, and used to clarify specific information. Not recommended for long texts.
- icon (str): Icon of the node. It is displayed in the node body. The icon name should be from the Material Design Icons set. Do not include the 'zmdi-' prefix.
- icon_color (str): The color of the icon in hexadecimal format.
- icon_bg_color (str): The background color of the icon in hexadecimal format.
- url (str): URL to be opened when the user clicks on the node. Must start with a slash and be relative to the instance.
- url_title (str): A short title for the URL.
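The constraints on these properties can be checked before the settings are sent. The helper below is a hypothetical pre-flight check, not part of the SDK: it works on a plain dict whose keys mirror the property names, and it assumes the '#RRGGBB' form for colors, since the text only says "hexadecimal format".

```python
import re
from typing import Dict, List, Optional

# ASSUMPTION: colors are written as '#RRGGBB'.
_HEX_COLOR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def check_node_settings(settings: Dict[str, Optional[str]]) -> List[str]:
    """Return a list of problems found in a dict of node settings."""
    problems: List[str] = []

    icon = settings.get("icon")
    if icon and icon.startswith("zmdi-"):
        problems.append("icon: drop the 'zmdi-' prefix")

    for key in ("icon_color", "icon_bg_color"):
        value = settings.get(key)
        if value and not _HEX_COLOR.match(value):
            problems.append(f"{key}: expected a hex color such as #FF0000")

    url = settings.get("url")
    if url and not url.startswith("/"):
        problems.append("url: must start with '/' and be relative to the instance")

    return problems
```

An empty list means the values satisfy the documented constraints; the same keyword values can then be passed to WorkflowSettings.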
To assign these settings to a card, compile them into a WorkflowMeta object.
You can configure either one or both cards at the same time. When assigning the meta, there's no need to specify the relation to any particular operation; in input methods, it will always relate to the input.
Let’s imagine we have an application that works with annotations, and the application's session automatically ends after the main process. The input will usually be a project or dataset, and the output should be a new project that is immediately exported in a popular format as an archive.
A best practice is to add a separate module, workflow.py, in the src directory of your application. This module will contain all the logic for integrating the workflow into your application.
workflow.py
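A possible shape for such a module, given the scenario described above (a project or dataset as input, a new project plus an exported archive as output). This is a sketch under assumptions: the `api.app.workflow.add_*` method names are not confirmed by the text, and the function names `workflow_input` and `workflow_output` are illustrative.

```python
# src/workflow.py (sketch)
from typing import Any, Optional


def workflow_input(
    api: Any,
    project_id: Optional[int] = None,
    dataset_id: Optional[int] = None,
) -> None:
    """Register the app input: the dataset if one was selected, else the project."""
    if dataset_id is not None:
        api.app.workflow.add_input_dataset(dataset_id)
    elif project_id is not None:
        api.app.workflow.add_input_project(project_id)
    else:
        raise ValueError("Either project_id or dataset_id must be provided")


def workflow_output(api: Any, project_id: int, archive_path: str) -> None:
    """Register the new project and the exported archive as outputs."""
    api.app.workflow.add_output_project(project_id)
    # archive_path is the path of the exported archive in Team Files.
    api.app.workflow.add_output_file(archive_path)
```

Keeping these calls in one module means the main application code only needs two function calls, one where data is read and one where results are written.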
Next, integrate the workflow into your application code.
main.py
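The sketch below shows where the workflow calls might sit relative to the main process. To stay self-contained it inlines the calls instead of importing a separate module; `run` and `process` are hypothetical names, and the `api.app.workflow.add_*` method names are assumptions.

```python
# src/main.py (sketch)
from typing import Any, Callable, Tuple


def run(
    api: Any,
    project_id: int,
    process: Callable[[Any, int], Tuple[int, str]],
) -> Tuple[int, str]:
    """Mark the input, run the main process, then mark both outputs.

    `process` stands in for the app's real logic. It is expected to return
    the ID of the new project and the Team Files path of the exported archive.
    """
    # Register the input where the data is retrieved.
    api.app.workflow.add_input_project(project_id)

    new_project_id, archive_path = process(api, project_id)

    # Register the outputs once they exist.
    api.app.workflow.add_output_project(new_project_id)
    api.app.workflow.add_output_file(archive_path)
    return new_project_id, archive_path
```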
This way, we can track the history of data changes and visually see these changes at any time by accessing the MLOps Workflow.
Exclusively for Pro and Enterprise subscribers
Currently, only IMAGE projects are supported
A key aspect of the MLOps Workflow is data versioning. This mechanism allows you to preserve the state of a project to ensure the reproducibility of experiments. At any moment, you can recreate the current project as a new one, with its state corresponding to the selected save point. This approach has advantages over simple data copying. It is recommended to implement this functionality for automatic project versioning in applications like "Train NN Model."
You can easily add a data backup to your workflow. All you need to do is decide before which data operation you want to create the backup and add a single method.
- project_info (Union[ProjectInfo, int]): ProjectInfo object or project ID.
- version_title (Optional[str]): Version title.
- version_description (Optional[str]): Version description.
If a project already has a backup and there haven't been any changes since it was created, you'll receive the ID of that backup instead of creating a duplicate.
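A minimal sketch of a backup taken before training, with the resulting version shown on the input card. It assumes the backup method is exposed as `api.project.version.create` and the input card method as `api.app.workflow.add_input_project`; neither name is stated in the text above, and `backup_and_mark_input` is a hypothetical helper.

```python
from typing import Any, Optional


def backup_and_mark_input(
    api: Any,
    project_id: int,
    title: str,
    description: str = "",
) -> Optional[int]:
    """Create a project version, then register that version as the app input.

    ASSUMPTION: `api.project.version.create` returns the version ID.
    """
    version_id = api.project.version.create(
        project_info=project_id,
        version_title=title,
        version_description=description,
    )
    # If nothing changed since the last backup, the ID of the existing
    # backup is returned, so this call is safe to repeat.
    api.app.workflow.add_input_project(project=project_id, version_id=version_id)
    return version_id
```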
To recreate a project from a version, use the method with the following parameters:
- project_info (Union[ProjectInfo, int]): ProjectInfo object or project ID.
- version_id (Optional[int]): Version ID.
- version_num (Optional[int]): Version number.
- skip_missed_entities (Optional[bool]): Skip missed images.
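A hedged sketch of a restore call using these parameters. The method name `api.project.version.restore` is an assumption, and `restore_project` is a hypothetical wrapper. Requiring exactly one of version_id and version_num is this sketch's own choice to avoid ambiguous calls, not documented SDK behavior.

```python
from typing import Any, Optional


def restore_project(
    api: Any,
    project_id: int,
    version_id: Optional[int] = None,
    version_num: Optional[int] = None,
    skip_missed_entities: bool = False,
) -> Any:
    """Recreate a project from a saved version and return the SDK's result."""
    # Guard added by this sketch: identify the version in exactly one way.
    if (version_id is None) == (version_num is None):
        raise ValueError("Pass exactly one of version_id or version_num")
    return api.project.version.restore(
        project_info=project_id,
        version_id=version_id,
        version_num=version_num,
        skip_missed_entities=skip_missed_entities,
    )
```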
You can view the project version numbers and their corresponding IDs by using the method with the following parameters:
- project_id (int): Project ID.
- filters (Optional[List[dict]]): Filters.
Here is how versioning appears in the Workflow: