How to customize model inference
This document outlines some of the features of the Inference class that can be useful in customizing your model's behavior.
Custom Inference Settings
Your neural network (NN) model can use various parameters that you may want to expose to the user. These parameters might include confidence threshold, intersection over union (IoU) threshold, and others. We have provided a way to allow users to set up some of these parameters when they connect to your model from the Inference dashboard or the Labeling Tool.
To support such parameters in your model, you should provide a dictionary or YAML file with default values to your model class, like this:
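A minimal sketch of this setup (MyModel and the default values below are placeholders; any serializable values work):

```python
import supervisely as sly

# hypothetical model class used throughout this page
class MyModel(sly.nn.inference.InstanceSegmentation):
    ...

m = MyModel(
    model_dir="my_model",
    # default values that users will see and can override
    custom_inference_settings={
        "confidence_threshold": 0.5,
        "iou_threshold": 0.65,
    },
)
```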
These parameters will be provided to the predict(image_path, settings) method as the settings parameter. In this method, you can use these parameters as needed, as shown in the following example from the Integrate Instance Segmentation model repository:
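The actual code lives in that repository; a simplified sketch of the idea (the self.model call and the sly.nn.PredictionMask prediction type are assumptions for an instance segmentation model):

```python
def predict(self, image_path: str, settings: dict) -> list:
    confidence_threshold = settings.get("confidence_threshold", 0.5)
    # run the underlying framework model (placeholder call)
    masks, class_names, scores = self.model(image_path)
    predictions = []
    for mask, class_name, score in zip(masks, class_names, scores):
        # drop predictions below the user-defined threshold
        if score < confidence_threshold:
            continue
        predictions.append(sly.nn.PredictionMask(class_name, mask, score))
    return predictions
```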
In this example, all predictions with a confidence score lower than the confidence_threshold parameter value will not be included in the result annotation.
If a user wants to change a parameter value before the inference, they can do so. For example, if they use the Apply NN to Images Project app to label a project using your NN model, they will see these parameters in the Inference settings > Additional settings section.
After the user changes the parameter, the new value will be provided to your serving app in the request, and the settings dictionary will contain the new values in the predict() method.
If you want to add comments to your settings, as shown in the screenshot above, it is recommended to provide a path to a YAML file with comments to the custom_inference_settings parameter instead of a dictionary.
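For example, a file such as custom_settings.yaml (the file name and values are arbitrary) could look like this, and its path would then be passed as custom_inference_settings:

```yaml
# minimum confidence score for a prediction to be kept
confidence_threshold: 0.5
# IoU threshold used for non-maximum suppression
iou_threshold: 0.65
```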
Model Information
When users work with your served model, they connect to your app from another app, which sets up the parameters to apply your neural network to their project data.
To ensure that users have chosen a suitable model for their task, it can be helpful to provide additional information about your served model. You can provide this information using the get_info() method, which returns a dictionary of parameters.
By default, these parameters are related to the chosen computer vision problem, but you can add additional information. We recommend overriding this method in your model class and adding such information as model_name, checkpoint_name, pretrained_on_dataset, and device. Of course, you can add any custom parameters.
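A sketch of such an override (the extra keys and values are only examples, not a required schema):

```python
def get_info(self) -> dict:
    info = super().get_info()
    # extra details that help users pick the right model
    info["model_name"] = "Mask R-CNN"
    info["checkpoint_name"] = "mask_rcnn_r50_fpn_coco"
    info["pretrained_on_dataset"] = "COCO 2017"
    info["device"] = "cuda:0"
    return info
```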
It's important to understand that the get_info() method of the Inference class calls the get_classes() method, which is not implemented by default and must be declared explicitly for each model.
Sliding window mode
One problem with neural network inference is that it can be challenging to apply models to large images with small objects. We provide tools to split the image into smaller parts, infer each part independently, and merge the results afterward.
This problem matters for some computer vision tasks but not for all, so it is worth deciding at the start of your integration whether your model needs it.
We provide three sliding window modes:
none
This means that the sliding window is not used, and users cannot set up sliding window parameters from the Inference interfaces. In this mode, you will get the path to the full image as the image_path parameter in the predict(image_path, settings) method.
basic
In this mode, users can set up sliding window parameters, and you will get the path to a part of the image as the image_path parameter in the predict(image_path, settings) method.
In basic mode, the predictions from all image parts are combined into one result annotation.
advanced
The basic mode has the same problem as object detection neural networks without non-maximum suppression (NMS) post-processing: labels from different image parts can collide, overlap, or be split. The advanced mode improves on the basic mode by adding NMS post-processing to the predicted labels.
In advanced sliding window mode, you should implement the predict_raw() method, which predicts objects just like the predict() method, but the predictions are then post-processed by the NMS algorithm. This approach is better suited to the object detection task.
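A sketch of such a method for a detection model (assuming predict_raw() shares the signature of predict(), and that sly.nn.PredictionBBox with a [top, left, bottom, right] box is the prediction type; the self.model call is a placeholder):

```python
def predict_raw(self, image_path: str, settings: dict) -> list:
    # return every raw detection; NMS across the sliding window parts
    # is applied afterwards by the SDK
    boxes_tlbr, class_names, scores = self.model(image_path)
    return [
        sly.nn.PredictionBBox(class_name, bbox, score)
        for bbox, class_name, score in zip(boxes_tlbr, class_names, scores)
    ]
```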
You can refer to the Serve YOLOv5 app repository as an example of using advanced sliding window mode.
To check which sliding window mode your app uses, look at the task class from which your model class inherits (for example, supervisely.nn.inference.InstanceSegmentation) and check the sliding_window_mode parameter in the class constructor.
If you want to change this parameter in your model class, provide the correct mode value to the sliding_window_mode parameter of your model constructor:
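For example (MyModel is the hypothetical model class from the sketches above):

```python
m = MyModel(
    model_dir="my_model",
    custom_inference_settings={"confidence_threshold": 0.5},
    sliding_window_mode="advanced",  # "none", "basic" or "advanced"
)
```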
Model files storage
To simplify data manipulations, we support the model_dir parameter, which is used as the location for all files needed for model inference. This folder must be provided to the load_on_device(model_dir, device) method to prepare your model.
You can also use the download(src_path, dst_path) method of the model class to download all the required files in the load_on_device(model_dir, device) method. Currently, you can provide external URLs or the path to a file or folder in Team Files as the src_path. By default, the destination path is {model_dir}/{filename}, but you can specify a different destination path to change the location or rename the downloaded file.
It is recommended to provide the model_dir parameter in the constructor of your model to ensure that the download() method works correctly.
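A sketch of how these pieces fit together, assuming a PyTorch checkpoint downloaded from a placeholder URL:

```python
import os
import torch  # assuming a PyTorch-based model
import supervisely as sly

class MyModel(sly.nn.inference.InstanceSegmentation):
    def load_on_device(self, model_dir: str, device: str = "cpu"):
        # placeholder URL; a path in Team Files would also work
        weights_url = "https://example.com/weights.pth"
        weights_path = os.path.join(model_dir, "weights.pth")
        # download the checkpoint into model_dir
        self.download(weights_url, weights_path)
        # framework-specific loading (here: a plain PyTorch checkpoint)
        self.model = torch.load(weights_path, map_location=device)

m = MyModel(model_dir="my_model")
```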
Model meta for multitask models
A served model can provide additional information about its state through the model_meta property (e.g., a description of annotation classes and the type of predicted object). This data helps the inference GUI and other Supervisely applications display correct model properties and visualize predictions.
More information about model meta can be found in this section.
In most cases this property is automatically generated within the SDK, so you don't have to worry about it. But for multitask applications it's important to check that model_meta is built correctly for the chosen task/model.
Let's look closely at how to correctly define model_meta for your custom model. The type of model_meta is ProjectMeta, and it contains information about class names, shapes, and colors (colors can be auto-generated). This property is constructed automatically only once, the first time it is accessed.
Since the model_meta is specific to a chosen model, it can be guaranteed that the property will not be called before the self.load_on_device() function is called. Therefore, it is important to make sure that after calling self.load_on_device(), the self.get_classes() and self._get_obj_class_shape() functions work correctly for your instance.
There's nothing complicated about self.get_classes():
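A minimal sketch (self.class_names is a hypothetical attribute filled in load_on_device()):

```python
def get_classes(self) -> list:
    # self.class_names is assumed to be set in load_on_device()
    return self.class_names  # e.g. ["person", "car", "dog"]
```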
Notice that the model_meta property is "lazy" and will not update automatically if self._model_meta is already defined. So, if your serving app supports several models that can be chosen via the GUI, you should update your model_meta manually by calling self.update_model_meta() at the end of self.load_on_device().
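For example (a sketch; the class names would come from the checkpoint the user selected):

```python
def load_on_device(self, model_dir: str, device: str = "cpu"):
    # ... load the checkpoint selected in the GUI (omitted) ...
    self.class_names = ["person", "car"]  # classes of the newly loaded model
    # rebuild model_meta so it matches the newly loaded model
    self.update_model_meta()
```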
The self._get_obj_class_shape() method is a bit tricky. Most serving apps are designed to solve only one task at a time, and for this reason the method is protected. For example, if you inherit from the sly.nn.inference.ObjectDetection class, self._get_obj_class_shape() will always return the sly.Rectangle shape. But some APIs allow you to create an app that can handle multiple tasks (e.g., YOLOv8, open-mmlab/mmdetection). In this case, the method must be overridden.
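A sketch of such an override for a multitask app (self.task_type is a hypothetical attribute set when the user picks a model in the GUI):

```python
def _get_obj_class_shape(self):
    # choose the label geometry based on the selected task
    if self.task_type == "instance segmentation":
        return sly.Bitmap
    return sly.Rectangle
```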
If for some reason this functionality is not enough for your serving app, you can freely define all needed attributes as well as overwrite self._model_meta right inside the load_on_device() method. For example, it is currently impossible to automatically construct an ObjClass for sly.GraphNodes because a geometry_config must be passed into its constructor.
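A sketch of this approach for a keypoint model, assuming the KeypointsTemplate helper from supervisely.geometry.graph (point labels and coordinates are placeholders):

```python
import supervisely as sly
from supervisely.geometry.graph import KeypointsTemplate

def load_on_device(self, model_dir: str, device: str = "cpu"):
    # ... load the model (omitted) ...
    # build the skeleton template required by sly.GraphNodes
    template = KeypointsTemplate()
    template.add_point(label="nose", row=100, col=100)
    template.add_point(label="left_eye", row=80, col=85)
    template.add_edge(src="nose", dst="left_eye")
    person = sly.ObjClass("person", sly.GraphNodes, geometry_config=template)
    # overwrite the lazy property's backing attribute directly
    self._model_meta = sly.ProjectMeta(
        obj_classes=sly.ObjClassCollection([person])
    )
```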