Iterate over a project

In this article, we will learn how to iterate through a project with annotated data in Python. It is one of the most frequent operations in Supervisely Apps and Python automation scripts.

Dataset Types

In Supervisely, datasets can be organized in two ways: flat or nested. Understanding these structures can help you with efficient data organization and management.

Flat Dataset Structure

A flat dataset is the simplest form of organization where all images and their annotations are stored in a single level. This structure is good for simple projects with straightforward organization.

You can add this dataset, Lemons (Annotated), to your team via the Supervisely Ecosystem.

Example structure:

📦 Lemons (Annotated)           ← The project
 ┣ 📂 ds1                       ← The dataset
 ┃ ┣ 📂 ann                     ← Folder for annotations
 ┃ ┃ ┣ 📜 IMG_0748.jpeg.json    ← Annotation for image 0748
 ┃ ┃ ┣ 📜 IMG_1836.jpeg.json    ← Annotation for image 1836
 ┃ ┃ ┣ 📜 IMG_2084.jpeg.json
 ┃ ┃ ┣ 📜 IMG_3861.jpeg.json
 ┃ ┃ ┣ 📜 IMG_4451.jpeg.json
 ┃ ┃ ┗ 📜 IMG_8144.jpeg.json
 ┃ ┣ 📂 img                     ← Folder for images
 ┃ ┃ ┣ 🖼️ IMG_0748.jpeg         ← Image 0748
 ┃ ┃ ┣ 🖼️ IMG_1836.jpeg         ← Image 1836
 ┃ ┃ ┣ 🖼️ IMG_2084.jpeg
 ┃ ┃ ┣ 🖼️ IMG_3861.jpeg
 ┃ ┃ ┣ 🖼️ IMG_4451.jpeg
 ┃ ┃ ┗ 🖼️ IMG_8144.jpeg
 ┣ 📜 meta.json                 ← Project metadata
 ┗ 📜 README.md                 ← Optional readme file
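
The layout above implies a simple pairing rule: the annotation for an image lives in the ann folder under the image file name plus a .json suffix. The following is a minimal sketch of that rule using only the standard library. The helper name pair_images_with_annotations is hypothetical (it is not part of the Supervisely SDK), and the temporary folder is only a stand-in for a downloaded project.

```python
import tempfile
from pathlib import Path


def pair_images_with_annotations(dataset_dir):
    """Return sorted (image_name, has_annotation) pairs for one dataset folder."""
    dataset_dir = Path(dataset_dir)
    pairs = []
    for img_path in sorted((dataset_dir / "img").iterdir()):
        # Annotation file name = image file name + ".json"
        ann_path = dataset_dir / "ann" / (img_path.name + ".json")
        pairs.append((img_path.name, ann_path.is_file()))
    return pairs


# Build a tiny throwaway copy of the layout so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    ds1 = Path(tmp) / "Lemons (Annotated)" / "ds1"
    (ds1 / "img").mkdir(parents=True)
    (ds1 / "ann").mkdir()
    for name in ["IMG_0748.jpeg", "IMG_1836.jpeg"]:
        (ds1 / "img" / name).touch()
        (ds1 / "ann" / (name + ".json")).write_text("{}")
    pairs = pair_images_with_annotations(ds1)

print(pairs)
```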

Nested Dataset Structure

A nested dataset structure is a bit more advanced. It lets you create datasets inside other datasets, forming a tree-like hierarchy for your data. Nested datasets are good for complex projects that require hierarchical organization or when you need to group related data together.

Important Note about Nested Datasets:

When working with nested datasets, keep in mind:

  • Parent datasets (like "Temperate" or "Tropical") may or may not hold images directly; either way, they can contain images inside their nested datasets.

  • To get all images of a parent dataset, including those in nested datasets, you need to iterate through each nested dataset.

Example structure:

  • The main datasets ("Temperate" and "Tropical") don't hold images or annotations directly in ann and img folders.

  • Instead, they have a datasets folder containing nested datasets (like "Apple", "Banana", etc.), and those hold the images and annotations.

  • In general, the main datasets can also contain images directly, but we removed them for this example.

📦 Fruits (Annotated)          ← The project
 ┣ 📂 Temperate                ← Main dataset #1
 ┃ ┣ 📂 ann                    ← Empty (no annotations here)
 ┃ ┣ 📂 img                    ← Empty (no images here)
 ┃ ┣ 📂 datasets               ← Where the nested datasets live
 ┃ ┃ ┣ 📂 Apple                ← Nested dataset for apples
 ┃ ┃ ┃ ┣ 📂 ann
 ┃ ┃ ┃ ┃ ┣ 📜 apple_1.jpg.json
 ┃ ┃ ┃ ┃ ┣ 📜 apple_2.jpg.json
 ┃ ┃ ┃ ┃ ┗ 📜 apple_3.jpg.json
 ┃ ┃ ┃ ┗ 📂 img
 ┃ ┃ ┃ ┃ ┣ 🖼️ apple_1.jpg
 ┃ ┃ ┃ ┃ ┣ 🖼️ apple_2.jpg
 ┃ ┃ ┃ ┃ ┗ 🖼️ apple_3.jpg
 ┃ ┃ ┗ 📂 Pear                 ← Nested dataset for pears
 ┃ ┃ ┃ ┣ 📂 ann
 ┃ ┃ ┃ ┃ ┣ 📜 pear_1.jpg.json
 ┃ ┃ ┃ ┃ ┣ 📜 pear_2.jpg.json
 ┃ ┃ ┃ ┃ ┗ 📜 pear_3.jpg.json
 ┃ ┃ ┃ ┗ 📂 img
 ┃ ┃ ┃ ┃ ┣ 🖼️ pear_1.jpg
 ┃ ┃ ┃ ┃ ┣ 🖼️ pear_2.jpg
 ┃ ┃ ┃ ┃ ┗ 🖼️ pear_3.jpg
 ┣ 📂 Tropical                 ← Main dataset #2
 ┃ ┣ 📂 ann                    ← Empty (no annotations here)
 ┃ ┣ 📂 img                    ← Empty (no images here)
 ┃ ┣ 📂 datasets               ← Where the nested datasets live
 ┃ ┃ ┣ 📂 Banana               ← Nested dataset for bananas
 ┃ ┃ ┃ ┣ 📂 ann
 ┃ ┃ ┃ ┃ ┣ 📜 banana_1.jpg.json
 ┃ ┃ ┃ ┃ ┣ 📜 banana_2.jpg.json
 ┃ ┃ ┃ ┃ ┗ 📜 banana_3.jpg.json
 ┃ ┃ ┃ ┗ 📂 img
 ┃ ┃ ┃ ┃ ┣ 🖼️ banana_1.jpg
 ┃ ┃ ┃ ┃ ┣ 🖼️ banana_2.jpg
 ┃ ┃ ┃ ┃ ┗ 🖼️ banana_3.jpg
 ┃ ┃ ┣ 📂 Lemon                ← Nested dataset for lemons
 ┃ ┃ ┃ ┣ 📂 ann
 ┃ ┃ ┃ ┃ ┣ 📜 lemon_1.jpg.json
 ┃ ┃ ┃ ┃ ┣ 📜 lemon_2.jpg.json
 ┃ ┃ ┃ ┃ ┗ 📜 lemon_3.jpg.json
 ┃ ┃ ┃ ┗ 📂 img
 ┃ ┃ ┃ ┃ ┣ 🖼️ lemon_1.jpg
 ┃ ┃ ┃ ┃ ┣ 🖼️ lemon_2.jpg
 ┃ ┃ ┃ ┃ ┗ 🖼️ lemon_3.jpg
 ┃ ┃ ┗ 📂 Mango                ← Nested dataset for mangoes
 ┃ ┃ ┃ ┣ 📂 ann
 ┃ ┃ ┃ ┃ ┣ 📜 mango_1.jpg.json
 ┃ ┃ ┃ ┃ ┣ 📜 mango_2.jpg.json
 ┃ ┃ ┃ ┃ ┗ 📜 mango_3.jpg.json
 ┃ ┃ ┃ ┗ 📂 img
 ┃ ┃ ┃ ┃ ┣ 🖼️ mango_1.jpg
 ┃ ┃ ┃ ┃ ┣ 🖼️ mango_2.jpg
 ┃ ┃ ┃ ┃ ┗ 🖼️ mango_3.jpg
 ┣ 📜 meta.json                ← Project metadata
 ┗ 📜 README.md                ← Optional readme file
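
To make the hierarchy concrete, here is a small sketch that walks such a folder tree recursively and reports every dataset together with its full path. It assumes the on-disk layout shown above (a datasets folder inside each parent dataset). The function iter_datasets is a hypothetical helper written for this illustration, not an SDK call.

```python
import tempfile
from pathlib import Path


def iter_datasets(parent_dir, parents=()):
    """Yield (path_parts, image_count) for every dataset below parent_dir."""
    for ds_dir in sorted(p for p in Path(parent_dir).iterdir() if p.is_dir()):
        parts = parents + (ds_dir.name,)
        img_dir = ds_dir / "img"
        count = len(list(img_dir.iterdir())) if img_dir.is_dir() else 0
        yield parts, count
        # Nested datasets live in the "datasets" folder of their parent.
        nested = ds_dir / "datasets"
        if nested.is_dir():
            yield from iter_datasets(nested, parts)


# Recreate a small part of the example tree in a temporary folder.
with tempfile.TemporaryDirectory() as tmp:
    project = Path(tmp) / "Fruits (Annotated)"
    for fruit in ["Apple", "Pear"]:
        img_dir = project / "Temperate" / "datasets" / fruit / "img"
        img_dir.mkdir(parents=True)
        for i in (1, 2, 3):
            (img_dir / f"{fruit.lower()}_{i}.jpg").touch()
    (project / "Temperate" / "img").mkdir()  # empty img folder of the parent
    (project / "meta.json").write_text("{}")
    found = [("/".join(parts), count) for parts, count in iter_datasets(project)]

print(found)
```

The parent dataset is reported with zero images of its own, which is why counting only top-level datasets would miss everything stored in the nested ones.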

Step-by-Step Guide

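In this guide we will go through the following steps:

Step 1. Get a demo project with labeled lemons and kiwis or the fruits project with nested datasets.

Step 2. Prepare .env files with credentials and ID of a demo project.

Step 3. Run the Python script.

Step 4. Show possible optimizations.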

1. Demo project

If you don't have any projects yet, go to the ecosystem and add the demo project 🍋 Lemons (Annotated) or 🍍 Fruits Nested (Annotated) to your current workspace.

2. .env files

Create a file at ~/supervisely.env with the credentials for your Supervisely account. The content should look like this:

# your API credentials, learn more here: https://developer.supervisely.com/getting-started/basics-of-authentication
SERVER_ADDRESS="https://app.supervisely.com" # ⬅️ change it if you use Enterprise Edition
API_TOKEN="4r47N.....blablabla......xaTatb" # ⬅️ change it

Create the second file, local.env, and place it in the same directory as main.py. This file will contain the values we are going to use in the Python script.

# change the Project ID to your value
PROJECT_ID=12208 # ⬅️ change it
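
Before running the script, it can help to verify that the three variables from the two files are actually set. The check below is a sketch that uses only the standard library. The helper missing_vars and the REQUIRED_VARS tuple are hypothetical names introduced for this illustration; the variable names themselves come from the files above.

```python
import os

# Variable names taken from ~/supervisely.env and local.env above.
REQUIRED_VARS = ("SERVER_ADDRESS", "API_TOKEN", "PROJECT_ID")


def missing_vars(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


# Example with an incomplete environment: empty token, no project ID.
example_env = {"SERVER_ADDRESS": "https://app.supervisely.com", "API_TOKEN": ""}
missing = missing_vars(example_env)
print(missing)
```

Calling missing_vars() without arguments inspects the real process environment, so it only gives a meaningful answer after the .env files have been loaded.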

3. Python script

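To start debugging you need to

  1. Clone the repo

  2. Create venv by running the script create_venv.sh

  3. Change the value in local.env

  4. Check that you have ~/supervisely.env file with correct values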

Source code

import os
import supervisely as sly
from dotenv import load_dotenv

if sly.is_development():
    load_dotenv("local.env")
    load_dotenv(os.path.expanduser("~/supervisely.env"))

api = sly.Api.from_env()

project_id = sly.env.project_id()
project = api.project.get_info_by_id(project_id)
if project is None:
    raise KeyError(f"Project with ID {project_id} not found in your account")
print(f"Project info: {project.name} (id={project.id})")

# get project meta - collection of annotation classes and tags
meta_json = api.project.get_meta(project.id)
project_meta = sly.ProjectMeta.from_json(meta_json)
print(project_meta)

# Set recursive to True if you want to include nested datasets
datasets = api.dataset.get_list(project.id, recursive=True) 
print(f"There are {len(datasets)} datasets in project")

for dataset in datasets:
    print(f"Dataset {dataset.name} has {dataset.items_count} images")
    images = api.image.get_list(dataset.id)
    for image in images:
        ann_json = api.annotation.download_json(image.id)
        ann = sly.Annotation.from_json(ann_json, project_meta)
        print(f"There are {len(ann.labels)} objects on image {image.name}")

This script illustrates only the basics. If your project is huge and has hundreds of thousands of images, downloading annotations one by one is not efficient. It is better to use batch (bulk) methods to reduce the number of API requests and significantly speed up your code. Learn more in the optimizations section below.

If you are working with nested datasets and want to get the full path to each dataset, you can use the api.dataset.tree method instead of api.dataset.get_list. It returns a generator that yields tuples (parents, dataset), where parents is a list of parent dataset names and dataset is a dataset object.

Your for loop will look like this:

for parents, dataset in api.dataset.tree(project_id):
    print(f"Dataset path: {'/'.join(parents + [dataset.name])}")
    print(f"Dataset {dataset.name} has {dataset.items_count} images")
    images = api.image.get_list(dataset.id)
    for image in images:
        ann_json = api.annotation.download_json(image.id)
        ann = sly.Annotation.from_json(ann_json, project_meta)
        print(f"There are {len(ann.labels)} objects on image {image.name}")

# >>> Dataset path: Temperate
# >>> Dataset Temperate has 0 images
# >>> 
# >>> Dataset path: Temperate/Apple
# >>> Dataset Apple has 3 images
# >>> There are 1 objects on image apple_2.jpg
# >>> There are 1 objects on image apple_1.jpg
# >>> There are 1 objects on image apple_3.jpg
# >>> 
# >>> Dataset path: Temperate/Pear
# >>> Dataset Pear has 3 images
# >>> There are 1 objects on image pear_3.jpg
# >>> There are 1 objects on image pear_1.jpg
# >>> There are 1 objects on image pear_2.jpg
# >>> ...
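
To see how (parents, dataset) tuples can arise from a flat list of datasets, here is a standalone sketch of a depth-first traversal. It is not the SDK's implementation. The Dataset record and its parent_id field are assumptions made for this illustration, and walk_tree is a hypothetical helper.

```python
from collections import namedtuple

# Hypothetical stand-in for the dataset records returned by the API.
Dataset = namedtuple("Dataset", ["id", "parent_id", "name"])


def walk_tree(datasets, parent_id=None, parents=None):
    """Yield (parents, dataset) depth-first; parents is a list of names."""
    parents = parents or []
    for ds in datasets:
        if ds.parent_id == parent_id:
            yield parents, ds
            # Descend into the children before moving to the next sibling.
            yield from walk_tree(datasets, ds.id, parents + [ds.name])


flat = [
    Dataset(1, None, "Temperate"),
    Dataset(2, 1, "Apple"),
    Dataset(3, 1, "Pear"),
    Dataset(4, None, "Tropical"),
    Dataset(5, 4, "Banana"),
]
paths = ["/".join(parents + [ds.name]) for parents, ds in walk_tree(flat)]
print(paths)
```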

Output

The script above produces the following output for the Lemons (Annotated) project:


Project info: Lemons (Annotated) (id=12208)
ProjectMeta:
Object Classes
+-------+--------+----------------+--------+
|  Name | Shape  |     Color      | Hotkey |
+-------+--------+----------------+--------+
|  kiwi | Bitmap |  [255, 0, 0]   |        |
| lemon | Bitmap | [81, 198, 170] |        |
+-------+--------+----------------+--------+
Tags
+------+------------+-----------------+--------+---------------+--------------------+
| Name | Value type | Possible values | Hotkey | Applicable to | Applicable classes |
+------+------------+-----------------+--------+---------------+--------------------+
+------+------------+-----------------+--------+---------------+--------------------+

There are 1 datasets in project
Dataset ds1 has 6 images
There are 3 objects on image IMG_1836.jpeg
There are 4 objects on image IMG_8144.jpeg
There are 4 objects on image IMG_3861.jpeg
There are 3 objects on image IMG_0748.jpeg
There are 5 objects on image IMG_4451.jpeg
There are 7 objects on image IMG_2084.jpeg

The script above produces the following output for the Fruits Nested (Annotated) project:


Project info: Fruits Nested (Annotated) (id=1317)
ProjectMeta:
Object Classes
+-----------+-----------+----------------+--------+
|    Name   |   Shape   |     Color      | Hotkey |
+-----------+-----------+----------------+--------+
|   Lemon   | Rectangle | [144, 19, 254] |        |
|   Apple   | Rectangle |  [208, 2, 27]  |        |
| Pineapple | Rectangle | [248, 231, 28] |        |
|    Pear   | Rectangle | [126, 211, 33] |        |
|   Orange  | Rectangle | [80, 227, 194] |        |
|   Banana  | Rectangle | [139, 87, 42]  |        |
|   Mango   | Rectangle | [74, 144, 226] |        |
+-----------+-----------+----------------+--------+
Tags
+-----------+------------+-----------------+--------+---------------+--------------------+-------------+
|    Name   | Value type | Possible values | Hotkey | Applicable to | Applicable classes | Target type |
+-----------+------------+-----------------+--------+---------------+--------------------+-------------+
|   Apple   |    none    |       None      |        |      all      |         []         |     all     |
|   Banana  |    none    |       None      |        |      all      |         []         |     all     |
|   Lemon   |    none    |       None      |        |      all      |         []         |     all     |
|   Mango   |    none    |       None      |        |      all      |         []         |     all     |
|   Orange  |    none    |       None      |        |      all      |         []         |     all     |
|    Pear   |    none    |       None      |        |      all      |         []         |     all     |
| Pineapple |    none    |       None      |        |      all      |         []         |     all     |
+-----------+------------+-----------------+--------+---------------+--------------------+-------------+

There are 9 datasets in project
Dataset Temperate has 0 images
Dataset Apple has 3 images
There are 1 objects on image apple_2.jpg
There are 1 objects on image apple_1.jpg
There are 1 objects on image apple_3.jpg
Dataset Pear has 3 images
There are 1 objects on image pear_3.jpg
There are 1 objects on image pear_1.jpg
There are 1 objects on image pear_2.jpg
Dataset Tropical has 0 images
Dataset Banana has 3 images
There are 1 objects on image banana_1.jpg
There are 1 objects on image banana_2.jpg
There are 3 objects on image banana_3.jpg
Dataset Mango has 4 images
There are 1 objects on image mango_3.jpg
There are 1 objects on image mango_1.jpg
There are 1 objects on image mango_4.jpg
There are 1 objects on image mango_2.jpg
Dataset Pineapple has 3 images
There are 1 objects on image pineapple_3.jpg
There are 1 objects on image pineapple_2.jpg
There are 1 objects on image pineapple_1.jpg
Dataset Lemon has 3 images
There are 1 objects on image lemon_1.jpg
There are 1 objects on image lemon_3.jpg
There are 1 objects on image lemon_2.jpg
Dataset Orange has 3 images
There are 1 objects on image orange_2.jpg
There are 1 objects on image orange_1.jpg
There are 1 objects on image orange_3.jpg

4. Optimizations

The bottleneck of this script is the inner loop, which downloads annotations one image at a time:

for image in images:
    ann_json = api.annotation.download_json(image.id)

If you have 1M images in your project, your code will send 🟡 1M requests to download annotations. This is inefficient: every request pays the Round Trip Time (RTT), and the Supervisely database receives a large number of similar tiny requests.

It can be optimized by using the batch API method:

api.annotation.download_json_batch(dataset.id, image_ids) 

The Supervisely API allows downloading annotations for multiple images in a single request. The code sample below sends ✅ 50x fewer requests, because sly.batched splits the image list into batches of 50 by default, and this leads to a significant speed-up of our original code:

for batch in sly.batched(images):
    image_ids = [image.id for image in batch]
    annotations = api.annotation.download_json_batch(dataset.id, image_ids)
    for image, ann_json in zip(batch, annotations):
        ann = sly.Annotation.from_json(ann_json, project_meta)
        print(f"There are {len(ann.labels)} objects on image {image.name}")
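
The saving can be checked with simple arithmetic. The sketch below uses a local batched helper that mimics what sly.batched is assumed to do (consecutive slices of at most 50 items), so it runs without the SDK or a server. The helper request_counts is hypothetical and only counts requests.

```python
import math


def batched(seq, batch_size=50):
    """Yield consecutive slices of seq with at most batch_size items."""
    for start in range(0, len(seq), batch_size):
        yield seq[start:start + batch_size]


def request_counts(num_images, batch_size=50):
    """Return (one-by-one requests, batched requests) for num_images."""
    return num_images, math.ceil(num_images / batch_size)


image_ids = list(range(120))
batch_sizes = [len(batch) for batch in batched(image_ids)]
print(batch_sizes)                # sizes of the batches for 120 images
print(request_counts(1_000_000))  # requests for a project with 1M images
```

The last batch may be smaller than the batch size, so the number of batched requests is the image count divided by 50, rounded up.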

The optimized version of the original script is in main_optimized.py.

Everything you need to reproduce this tutorial is on GitHub: source code, Visual Studio Code configuration, and a shell script for creating venv.