# Inference API

### Introduction

**There are two ways to run inference with your models:**

* From the Supervisely platform, with apps like [Apply NN to Images](https://ecosystem.supervisely.com/apps/nn-image-labeling/project-dataset) and [Apply Classifier to Images](https://ecosystem.supervisely.com/apps/apply-classification-model-to-project).
* Right from your code.

In this tutorial, you'll learn how to infer deployed models **from your code** with the `sly.nn.inference.Session` class. This class is a convenient wrapper around a low-level API: under the hood, it simply communicates with the serving app via `requests`.

**Before starting you have to deploy your model with a Serving App (e.g.** [**Serve YOLOv5**](https://ecosystem.supervisely.com/apps/yolov5/supervisely/serve)**)**

Try with Colab: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/supervisely-ecosystem/tutorial-inference-session/blob/master/nn_inference_tutorial_colab.ipynb)

**Table of Contents**:

* [Inference API](#inference-api)
  * [Introduction](#introduction)
* [Quick overview](#quick-overview)
  * [List of all inference methods](#list-of-all-inference-methods)
    * [Image inference methods:](#image-inference-methods)
    * [Video inference methods:](#video-inference-methods)
* [A Complete Tutorial](#a-complete-tutorial)
  * [1. Initialize `sly.nn.inference.Session`](#1.-initialize-sly.nn.inference.session)
  * [2. Get the model info](#2.-get-the-model-info)
    * [Session info](#session-info)
    * [Model Meta. Classes and tags](#model-meta.-classes-and-tags)
    * [Inference settings](#inference-settings)
    * [Set the inference settings](#set-the-inference-settings)
  * [3. Image Inference](#3.-image-inference)
    * [Inspecting the model prediction](#inspecting-the-model-prediction)
    * [Visualize model prediction](#visualize-model-prediction)
    * [Upload prediction to the Supervisely platform](#upload-prediction-to-the-supervisely-platform)
  * [4. Video Inference](#4.-video-inference)
    * [Method 1. Inferring video with iterator](#method-1.-inferring-video-with-iterator)
    * [Method 2. Inferring video without iterator](#method-2.-inferring-video-without-iterator)
  * [5. Project Inference](#5.-project-inference)
    * [Method 1. Inferring project with iterator](#method-1.-inferring-project-with-iterator)
    * [Method 2. Inferring project without iterator](#method-2.-inferring-project-without-iterator)
* [Advanced. Working with raw JSON output](#advanced.-working-with-raw-json-output)

Let's start with a quick example of how to connect to a model and run inference!

## Quick overview

*(for a detailed tutorial, go to the* [*next section*](#a-complete-tutorial)*)*

**Example usage: visualize prediction**

```python
import os
from dotenv import load_dotenv
import supervisely as sly


# Get your Serving App's task_id from the Supervisely platform
task_id = 27209

# init sly.Api
load_dotenv(os.path.expanduser("~/supervisely.env"))
api = sly.Api()

# Create Inference Session
session = sly.nn.inference.Session(api, task_id=task_id)

session.get_session_info()
```

```
{'app_name': 'Serve YOLOv5',
 'session_id': 27209,
 'model_files': '/sly-app-data/model/yolov5s.pt',
 'number_of_classes': 80,
 'sliding_window_support': 'advanced',
 'videos_support': True,
 'async_video_inference_support': True,
 'task type': 'object detection',
 'model_name': 'YOLOv5',
 'checkpoint_name': 'yolov5s',
 'pretrained_on_dataset': 'COCO train 2017',
 'device': 'cuda',
 'half': 'True',
 'input_size': 640}
```

```python
# Run inference on an image by ID
image_id = 19386161
prediction = session.inference_image_id(image_id)  # prediction is a `sly.Annotation` object

# Download and load the image that was inferred
save_path = "demo_image.jpg"
api.image.download_path(image_id, path=save_path)
image_np = sly.image.read(save_path)

# Draw the annotation and save it to the disk
save_path_predicted = "demo_image_pred.jpg"
prediction.draw_pretty(bitmap=image_np, output_path=save_path_predicted, fill_rectangles=False, thickness=7)
```

```python
# Show
from matplotlib import pyplot as plt
image_pred = sly.image.read(save_path_predicted)
plt.imshow(image_pred)
plt.axis('off')
```

![Image with predictions of the YOLOv5 model](https://user-images.githubusercontent.com/31512713/218431952-6183b5b0-19cc-4ba0-9ff9-0493b0bb4424.png)

### List of all inference methods

#### Image inference methods:

```python
# Infer single image by local path
pred = session.inference_image_path("image_01.jpg")

# Infer batch of images by local paths
pred = session.inference_image_paths(["image_01.jpg", "image_02.jpg"])

# Infer image by ID
pred = session.inference_image_id(17551748)

# Infer batch of images by IDs
pred = session.inference_image_ids([17551748, 17551750])

# Infer image by url
url = "https://images.unsplash.com/photo-1674552791148-c756b0899dba?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=387&q=80"
pred = session.inference_image_url(url)

# Infer the entire Project by ID
predictions = session.inference_project_id(18635)

# Infer the entire Project by ID with iterator
for ann_info in session.inference_project_id_async(18635):
    print(ann_info)
```

#### Video inference methods:

```python
from tqdm import tqdm

video_id = 18635803

# Infer video getting each frame as soon as it's ready
for frame_pred in tqdm(session.inference_video_id_async(video_id)):
    print(frame_pred)

# Infer video without iterator
pred = session.inference_video_id(video_id)
```

## A Complete Tutorial

### 1. Initialize `sly.nn.inference.Session`

First serve the model you want (e.g. [Serve YOLOv5](https://ecosystem.supervisely.com/apps/yolov5/supervisely/serve)) and copy the `task_id` from the `App sessions` section in the Supervisely platform:

![Copy the Task ID here](https://user-images.githubusercontent.com/31512713/218194505-b161be1e-5a05-488b-8eb7-9bc0f24141e2.png)

**Init your sly.Api:**

*(for more info see* [*Basics of authentication*](/getting-started/basics-of-authentication.md) *tutorial)*

```python
import os
from dotenv import load_dotenv
import supervisely as sly

# init sly.API
load_dotenv(os.path.expanduser("~/supervisely.env"))
api = sly.Api()
```

**Create an Inference Session, a connection to the model:**

```python
# Get your Serving App's task_id from the Supervisely platform
task_id = 27209

# create session
session = sly.nn.inference.Session(api, task_id=task_id)
```

**(Optional) You can pass the inference settings at initialization:**

```python
# pass settings by dict
inference_settings = {
    "conf_thres": 0.45
}
session = sly.nn.inference.Session(api, task_id=task_id, inference_settings=inference_settings)
```

Or with a `YAML` file:

```python
# pass settings by YAML
inference_settings_yaml = "settings.yml"
session = sly.nn.inference.Session(api, task_id=task_id, inference_settings=inference_settings_yaml)
```
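The YAML file is a plain mapping of setting names to values. For example, a `settings.yml` for a detection model might look like this (the values below are illustrative; the supported keys depend on the serving app):

```yaml
# settings.yml — example values; the supported keys depend on the serving app
conf_thres: 0.55
iou_thres: 0.65
augment: false
```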

### 2. Get the model info

#### Session info

Each app with a deployed model has its own unique **task\_id** (which is the same as the **session\_id**), along with **model\_name**, **pretrained\_dataset**, and other useful info that can be obtained with the `get_session_info()` method.

```python
session.get_session_info()
```

```
{'app_name': 'Serve YOLOv5',
 'session_id': 27209,
 'model_files': '/sly-app-data/model/yolov5s.pt',
 'number_of_classes': 80,
 'sliding_window_support': 'advanced',
 'videos_support': True,
 'async_video_inference_support': True,
 'task type': 'object detection',
 'model_name': 'YOLOv5',
 'checkpoint_name': 'yolov5s',
 'pretrained_on_dataset': 'COCO train 2017',
 'device': 'cuda',
 'half': 'True',
 'input_size': 640}
```

#### Model Meta. Classes and tags

The model may be pretrained on various datasets, such as **COCO**, **ImageNet**, or even your **custom data**. Datasets differ in the classes and tags they contain. Therefore, each dataset in Supervisely has its own meta information called `project_meta`. The model carries the same kind of information, called `model_meta`. You can get the `model_meta` with the `get_model_meta()` method:

```python
model_meta = session.get_model_meta()
print("The first 10 classes of the model_meta:")
[cls.name for cls in model_meta.obj_classes][:10]
```

```
The first 10 classes of the model_meta:

['person',
 'bicycle',
 'car',
 'motorcycle',
 'airplane',
 'bus',
 'train',
 'truck',
 'boat',
 'traffic light']
```

The `model_meta` will be used later, when we visualize model predictions.

#### Inference settings

Each model has its own inference settings, such as `conf_thres`, `iou_thres`, and others. You can get the full list of supported settings with `get_default_inference_settings()`:

```python
default_settings = session.get_default_inference_settings()
default_settings
```

```
{'conf_thres': 0.25,
 'iou_thres': 0.45,
 'augment': False,
 'debug_visualization': False}
```

#### Set the inference settings

**There are 3 ways to set the inference settings:**

* `update_inference_settings(**kwargs)`
* `set_inference_settings(dict)`
* `set_inference_settings(YAML)`

You can also pass the settings earlier, when creating the `Session`.

**a) Update only the parameters you need:**

```python
session.update_inference_settings(conf_thres=0.4, iou_thres=0.55)
session.inference_settings
```

Output:

```
{'conf_thres': 0.4, 'iou_thres': 0.55}
```

**b) Set parameters with a dict:**

```python
settings = {
    "conf_thres": 0.25
}
session.set_inference_settings(settings)
session.inference_settings
```

Output:

```
{'conf_thres': 0.25}
```

**c) Set parameters with a `YAML` file:**

```python
session.set_inference_settings("settings.yml")
session.inference_settings
```

Output:

```
{'conf_thres': 0.55, 'augment': False}
```

### 3. Image Inference

**There are several ways to infer an image:**

* by Supervisely ID
* by local path
* by URL from the web

```python
# Infer image by local path
pred = session.inference_image_path("image_01.jpg")

# Infer image by ID
pred = session.inference_image_id(image_id=17551748)

# Infer image by url
url = "https://images.unsplash.com/photo-1674552791148-c756b0899dba?ixlib=rb-4.0.3&ixid=MnwxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8&auto=format&fit=crop&w=387&q=80"
pred = session.inference_image_url(url)
```

**And you can also infer a batch of images:**

```python
# Infer batch of images by local paths
pred = session.inference_image_paths(["image_01.jpg", "image_02.jpg"])

# Infer batch of images by IDs
pred = session.inference_image_ids([17551748, 17551750])
```

#### Inspecting the model prediction

The prediction is a `sly.Annotation` object. It contains all labels and tags for an image and can be uploaded directly to the Supervisely platform.

*(see more in* [*SDK reference*](https://supervisely.readthedocs.io/en/latest/sdk/supervisely.annotation.annotation.Annotation.html#supervisely.annotation.annotation.Annotation)*)*

```python
image_id = 19386163
prediction = session.inference_image_id(image_id)
prediction.to_json()
```

```
{'annotation': {'description': '',
  'size': {'height': 800, 'width': 1200},
  'tags': [],
  'objects': [{'classTitle': 'umbrella',
    'description': '',
    'tags': [{'name': 'confidence', 'value': 0.85693359375}],
    'points': {'exterior': [[540, 363], [694, 468]], 'interior': []},
    'geometryType': 'rectangle',
    'shape': 'rectangle'},
   {'classTitle': 'car',
    'description': '',
    'tags': [{'name': 'confidence', 'value': 0.86376953125}],
    'points': {'exterior': [[724, 380], [1198, 708]], 'interior': []},
    'geometryType': 'rectangle',
    'shape': 'rectangle'},
   {'classTitle': 'person',
    'description': '',
    'tags': [{'name': 'confidence', 'value': 0.8740234375}],
    'points': {'exterior': [[562, 442], [661, 685]], 'interior': []},
    'geometryType': 'rectangle',
    'shape': 'rectangle'},
   {'classTitle': 'car',
    'description': '',
    'tags': [{'name': 'confidence', 'value': 0.89501953125}],
    'points': {'exterior': [[4, 408], [509, 699]], 'interior': []},
    'geometryType': 'rectangle',
    'shape': 'rectangle'}],
  'customBigData': {}},
 'data': {}}
```
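Because the prediction serializes to plain JSON, you can post-process it with ordinary Python. For example, here is a small helper (a hypothetical snippet, not part of the SDK) that keeps only the objects whose `confidence` tag passes a threshold:

```python
def filter_by_confidence(ann_json, min_conf=0.5):
    """Return (class, confidence) pairs for objects whose 'confidence'
    tag is at least min_conf. Expects the dict from prediction.to_json()."""
    kept = []
    for obj in ann_json["annotation"]["objects"]:
        conf = next(
            (tag["value"] for tag in obj["tags"] if tag["name"] == "confidence"),
            None,
        )
        if conf is not None and conf >= min_conf:
            kept.append((obj["classTitle"], conf))
    return kept

# Usage with the prediction above:
# filter_by_confidence(prediction.to_json(), min_conf=0.87)
```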

#### Visualize model prediction

`sly.Annotation` has a `draw_pretty()` method for convenient visualization:

```python
# Download and load the image that was inferred
image_path = "demo_image.jpg"
api.image.download_path(image_id, path=image_path)
image_np = sly.image.read(image_path)

# Draw the annotation and save it to disk
save_path_predicted = "demo_image_pred.jpg"
prediction.draw_pretty(bitmap=image_np, output_path=save_path_predicted, fill_rectangles=False, thickness=7)

# Show
from matplotlib import pyplot as plt
image_pred = sly.image.read(save_path_predicted)
plt.imshow(image_pred)
plt.axis('off');
```

![Image with predictions of the YOLOv5 model](https://user-images.githubusercontent.com/31512713/218431952-6183b5b0-19cc-4ba0-9ff9-0493b0bb4424.png)

#### Upload prediction to the Supervisely platform

**Now you can upload the image with predictions to the Supervisely platform:**

```python
workspace_id = 662

# Create new project and dataset
project_info = api.project.create(workspace_id, "My model predictions", change_name_if_conflict=True)
dataset_info = api.dataset.create(project_info.id, "First dataset")

# Update project meta with model's classes
api.project.update_meta(project_info.id, model_meta)
api.project.pull_meta_ids(project_info.id, model_meta)

# Upload the image (the local copy downloaded earlier)
image_path = "demo_image.jpg"
image_name = os.path.basename(image_path)
img_info = api.image.upload_path(dataset_info.id, name=image_name, path=image_path)

# Upload model predictions to Supervisely
api.annotation.upload_ann(img_info.id, prediction)
```

**Note:** when you update a `project_meta` with `api.project.update_meta()`, the server generates IDs for the classes and tags that are pushed for the first time, and you have to sync these IDs back into the `model_meta` before uploading predictions. This is where the `api.project.pull_meta_ids()` method helps: it assigns the IDs directly to the `model_meta` object. Since every prediction holds a reference to the `model_meta`, without this step the predictions can't be uploaded to the platform, because their `ProjectMeta` would lack the IDs.

**Result on the Supervisely platform:**

![Result in the Supervisely Labeling Tool](https://user-images.githubusercontent.com/31512713/218533323-47ce4ee0-a2e2-480d-bfc0-06caa21ccc8a.png)

### 4. Video Inference

#### Method 1. Inferring video with iterator

**Video inference is simple too.**

The first way is to infer the video with the `inference_video_id_async` method. It returns an iterator, which is useful for processing predictions frame by frame. As soon as the model finishes a frame, the iterator yields it:

```python
from tqdm import tqdm

video_id = 18635803

pred_frames = []
for frame_ann in tqdm(session.inference_video_id_async(video_id)):
    pred_frames.append(frame_ann)
```

Several parameters can be passed to the video inference:

* `start_frame_index`: the first frame to start
* `frames_count`: total frames to infer
* `frames_direction`: video playback direction, either "forward" or "backward"

**Getting more information about the inference process:**

```python
video_id = 18635803
video_info = api.video.get_info_by_id(video_id)

frame_iterator = session.inference_video_id_async(video_id)
total_frames = video_info.frames_count
for i, frame_ann in enumerate(frame_iterator):
    labels = frame_ann.labels
    predicted_classes = [x.obj_class.name for x in labels]
    print(f"Frame {i+1}/{total_frames} done. Predicted classes = {predicted_classes}")
```

```
{"message": "The video is preparing on the server, this may take a while...", "timestamp": "2023-02-13T16:10:03.827Z", "level": "info"}
{"message": "Inference has started:", "progress": {"current": 0, "total": 10}, "is_inferring": true, "cancel_inference": false, "result": null, "pending_results": [], "timestamp": "2023-02-13T16:10:13.014Z", "level": "info"}


Frame 1/10 done. Predicted classes = ['car']
Frame 2/10 done. Predicted classes = ['car', 'car']
Frame 3/10 done. Predicted classes = ['car', 'car']
Frame 4/10 done. Predicted classes = ['car', 'car']
Frame 5/10 done. Predicted classes = ['car', 'car']
Frame 6/10 done. Predicted classes = ['car', 'car']
Frame 7/10 done. Predicted classes = ['car', 'car']
Frame 8/10 done. Predicted classes = ['car', 'car']
Frame 9/10 done. Predicted classes = ['car']
Frame 10/10 done. Predicted classes = ['car']
```
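If you collect the predicted class names per frame as in the loop above, aggregating them afterwards is straightforward. A small sketch (assuming `predicted_classes` lists were gathered for each frame):

```python
from collections import Counter

def count_classes(per_frame_classes):
    """Aggregate predicted class names over all frames.

    per_frame_classes: list of lists of class names, one list per frame,
    e.g. the `predicted_classes` collected in the loop above.
    """
    totals = Counter()
    for classes in per_frame_classes:
        totals.update(classes)
    return dict(totals)

# count_classes([["car"], ["car", "car"]]) -> {"car": 3}
```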

**Stop video inference**

If you need to stop the inference, use `session.stop_async_inference()`:

```python
from tqdm import tqdm

video_id = 18635803

for i, frame_ann in enumerate(tqdm(session.inference_video_id_async(video_id))):
    if i == 2:
        session.stop_async_inference()
```

```
{"message": "The video is preparing on the server, this may take a while...", "timestamp": "2023-02-09T23:15:47.232Z", "level": "info"}
{"message": "Inference has started:", "progress": {"current": 0, "total": 10}, "is_inferring": true, "cancel_inference": false, "result": null, "pending_results": [], "timestamp": "2023-02-09T23:15:55.878Z", "level": "info"}
 20%|██        | 2/10 [00:03<00:13,  1.63s/it]{"message": "Inference will be stopped on the server", "timestamp": "2023-02-09T23:16:01.559Z", "level": "info"}
 30%|███       | 3/10 [00:05<00:13,  1.88s/it]
```

#### Method 2. Inferring video without iterator

If you don't need to iterate every frame, you can use the `inference_video_id` method:

```python
video_id = 18635803

predictions_list = session.inference_video_id(
    video_id, start_frame_index=5, frames_count=15, frames_direction="forward"
)
```

**Note:** this method is recommended only for very short videos, because the code will block until the whole video has been inferred, and there is no way to track progress.

### 5. Project Inference

#### Method 1. Inferring project with iterator

```python
from tqdm import tqdm

project_id = 18635

pred_ann_infos = []
for ann_info in tqdm(session.inference_project_id_async(project_id)):
    pred_ann_infos.append(ann_info)
```

An extra parameter can be passed to the project inference:

* `dest_project_id`: destination project ID. If not passed, the iterator returns annotation infos. If `dest_project_id` equals `project_id`, the iterator uploads the annotations to the images in the source project. If it differs from `project_id`, the iterator copies the images and uploads the annotations to the new project.
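These three behaviors can be sketched as a tiny dispatch function (illustrative only, mirroring the rules above; not part of the SDK):

```python
def project_inference_mode(project_id, dest_project_id=None):
    """Describe what inference_project_id_async will do for a given
    combination of project_id and dest_project_id."""
    if dest_project_id is None:
        return "yield annotation infos"
    if dest_project_id == project_id:
        return "upload annotations to the images of the source project"
    return "copy images and upload annotations to the destination project"

# project_inference_mode(18635) -> "yield annotation infos"
```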

#### Method 2. Inferring project without iterator

If you don't need to iterate every image, you can use the `inference_project_id` method:

```python
project_id = 18635

predictions_list = session.inference_project_id(project_id)
```

## Advanced. Working with raw JSON output

### SessionJSON

There is a `sly.nn.inference.SessionJSON` class, which is useful when you need to work with raw JSON outputs.

The class has all the same methods as `Session`, but returns raw JSON.

The prediction is a `dict` with the following fields:

* `"annotation"`: contains a predicted annotation, that can be easily converted to `sly.Annotation`.
* `"data"`: additional metadata of the prediction. In most cases you won't need this.

```python
session = sly.nn.inference.SessionJSON(api, task_id=task_id)

prediction_json = session.inference_image_path("img/image_01.jpg")
prediction_json
```

```
{'annotation': {'description': '',
  'size': {'height': 1600, 'width': 1280},
  'tags': [],
  'objects': [{'classTitle': 'sheep',
    'description': '',
    'tags': [{'name': 'confidence', 'value': 0.255615234375}],
    'points': {'exterior': [[308, 1049], [501, 1410]], 'interior': []},
    'geometryType': 'rectangle',
    'shape': 'rectangle'},
  {'classTitle': 'person',
    'description': '',
    'tags': [{'name': 'confidence', 'value': 0.869140625}],
    'points': {'exterior': [[764, 272], [1062, 1002]], 'interior': []},
    'geometryType': 'rectangle',
    'shape': 'rectangle'},
  {'classTitle': 'horse',
    'description': '',
    'tags': [{'name': 'confidence', 'value': 0.87109375}],
    'points': {'exterior': [[393, 412], [1274, 1435]], 'interior': []},
    'geometryType': 'rectangle',
    'shape': 'rectangle'}],
  'customBigData': {}},
'data': {}}
```

