Finding directories with specific markers

A tutorial on how to find directories that contain specific marker files and check them against custom conditions using the `sly.fs.dirs_with_marker()` function.

Introduction

Let's start with the use cases for this function. Import apps often require input data in a specific structure in order to work correctly. A common approach is to look for a marker file that serves as a reference point for locating the rest of the structure. However, user-provided data can be laid out in different ways. For instance, the target directory might not sit at the root but be nested several levels deep, or the data may contain several suitable directories rather than one. Suppose we're looking for directories that contain a config.json file. The directory structure might look like this:

📂 input_dir
┣ 📂 nested_dir_1
┃ ┣ 📂 nested_dir_2
┃ ┃ ┣ 📂 nested_dir_3
┃ ┃ ┃ ┣ 📄 config.json
┃ ┃ ┃ ┗ 📄 data.csv
┣ 📂 nested_dir_4
┃ ┣ 📄 config.json
┃ ┗ 📄 data.csv

In this case, we need to find two directories: nested_dir_3 and nested_dir_4. Finding them isn't hard, but there's no reason to implement the same logic every time when a single function can do it wherever we need it.

Finding the directories is only part of the problem. We usually also need to check them against some conditions: the marker file may be incorrect (for example, a user forgot to delete it), or the directory may lack other required files. If we work with data from such a directory, we may get an error or, much worse, incorrect results. We could write a function that checks a directory and call it on each directory one by one. With sly.fs.dirs_with_marker() we still need to write that check function, but we can pass it to the search itself, which makes the process more convenient and the code more readable.
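To make the repeated logic concrete, here is a minimal sketch of the manual search that would otherwise be rewritten in each app. It uses only the standard library; the name find_dirs_with_marker is hypothetical, and this is not how the SDK implements sly.fs.dirs_with_marker().

```python
import os
from typing import Iterator, List, Union


def find_dirs_with_marker(root: str, markers: Union[str, List[str]]) -> Iterator[str]:
    """Yield every directory under `root` that contains at least one marker file."""
    if isinstance(markers, str):
        markers = [markers]
    wanted = set(markers)
    for current_dir, _subdirs, filenames in os.walk(root):
        # A directory matches if any of its files is a marker (OR condition).
        if wanted.intersection(filenames):
            yield current_dir
```

Called on the input_dir tree shown above with the marker config.json, this sketch would yield the paths of nested_dir_3 and nested_dir_4.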

Function signature

sly.fs.dirs_with_marker(
    input_path: str,
    markers: Union[str, List[str]],
    check_function: Optional[Callable] = None,
    ignore_case: Optional[bool] = False,
) -> Generator[str, None, None]:

Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| input_path | str | Path to the directory from which the search will be performed. |
| markers | Union[str, List[str]] | Filename or list of filenames, which will be searched. |
| check_function | Optional[Callable] | Function that will be used to check the directory for some conditions. |
| ignore_case | Optional[bool] | If True, the search will be case-insensitive. Default is False. |

Data Example

We prepared a short Python script that unpacks an archive (as an example of input data from a user) and finds directories with config.json files in them. It then checks whether they're valid. The conditions are the following:

  • The directory must contain a config.json file.

  • The config.json file must have a key valid, and its value must be true.

  • The directory must contain two subdirectories: images and anns.

Example archive structure:

📦extracted
 ┗ 📂input_dir
 ┃ ┣ 📂subdir01
 ┃ ┃ ┣ 📂subdir11
 ┃ ┃ ┃ ┣ 📂anns
 ┃ ┃ ┃ ┣ 📂images
 ┃ ┃ ┃ ┗ 📄config.json
 ┃ ┃ ┗ 📂subdir12
 ┃ ┃ ┃ ┣ 📂anns
 ┃ ┃ ┃ ┣ 📂images
 ┃ ┃ ┃ ┗ 📄config.json
 ┃ ┗ 📂subdir02

So, we have two directories with config.json files in them, but only one of them is valid.
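If you don't have the demo archive at hand, a layout like the one above can be generated locally. The sketch below is an illustration only: the helper name build_sample_tree is hypothetical, and it assumes that subdir11 is the valid directory and that subdir12 is invalid because its config.json has valid set to false (the tutorial does not say which directory is valid or why the other one fails).

```python
import json
import os


def build_sample_tree(root: str) -> str:
    """Create the example layout under `root` and return the path to `input_dir`."""
    input_dir = os.path.join(root, "input_dir")
    # Assumption: subdir11 is the valid directory and subdir12 is the invalid one.
    for name, valid in (("subdir11", True), ("subdir12", False)):
        project_dir = os.path.join(input_dir, "subdir01", name)
        for sub in ("anns", "images"):
            os.makedirs(os.path.join(project_dir, sub), exist_ok=True)
        with open(os.path.join(project_dir, "config.json"), "w") as f:
            json.dump({"valid": valid}, f)
    # An empty directory without any marker.
    os.makedirs(os.path.join(input_dir, "subdir02"), exist_ok=True)
    return input_dir
```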

Tutorial content

Step 1. How to extract the archive and remove junk files

Archives from users often contain unnecessary files. For example, an archive may contain a __MACOSX directory or .DS_Store files, which are created by macOS. If we don't handle them, we may get errors while working with the data, so in most cases it's much easier to delete them.

To extract the archive, we'll use the sly.fs.unpack_archive() function:

import os
import json
import supervisely as sly

DATA_DIR = "data"
ARCHIVE_PATH = os.path.join(DATA_DIR, "input_archive.zip")
EXCTRACT_PATH = os.path.join(DATA_DIR, "extracted")

sly.fs.unpack_archive(ARCHIVE_PATH, EXCTRACT_PATH, remove_junk=True)

ℹ️ If the archive was already extracted and you want to remove junk files from it, you can use the sly.fs.remove_junk_from_dir() function:

sly.fs.remove_junk_from_dir(EXCTRACT_PATH)

Now we have a directory without junk files, and we can start searching for directories with markers.
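For readers curious about what junk removal involves, here is a rough standard-library sketch. It is not the SDK's implementation: the function name remove_junk and the two name sets are assumptions that cover only the macOS examples mentioned above, and the SDK's own list of junk entries may differ.

```python
import os
import shutil

# Assumption: only these names count as junk in this sketch.
JUNK_DIRS = {"__MACOSX"}
JUNK_FILES = {".DS_Store"}


def remove_junk(root: str) -> int:
    """Delete junk directories and files under `root`; return how many entries were removed."""
    removed = 0
    for current_dir, subdirs, filenames in os.walk(root, topdown=True):
        for name in list(subdirs):
            if name in JUNK_DIRS:
                shutil.rmtree(os.path.join(current_dir, name))
                subdirs.remove(name)  # do not descend into the deleted directory
                removed += 1
        for name in filenames:
            if name in JUNK_FILES:
                os.remove(os.path.join(current_dir, name))
                removed += 1
    return removed
```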

Step 2. How to find directories with markers

Now that we have a directory without junk files, we need to specify the markers we want to find. In our case, it's the config.json file. We can pass it as a string, or as a list of strings if we need to find multiple markers.

ℹ️ If you pass a list of markers, note that they are combined with OR, not AND: the function returns all directories that contain at least one of the markers. If you need directories that contain several specific markers at once, you can implement that check in the function passed as check_function, which is covered in Step 3.

Now we can use the sly.fs.dirs_with_marker() function to iterate over all directories with markers:

MARKERS = "config.json"

for directory in sly.fs.dirs_with_marker(EXCTRACT_PATH, MARKERS, ignore_case=True):
    print(f"The directory with the marker '{MARKERS}' is found: '{directory}'")
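One thing to keep in mind with ignore_case=True: the marker file on disk may presumably be named differently from the marker string, for example Config.JSON, so building its path from the marker string could fail on a case-sensitive file system. A minimal sketch of a helper that resolves the actual file name is shown below. The name find_marker_file is hypothetical and not part of the SDK.

```python
import os
from typing import Optional


def find_marker_file(directory: str, marker: str) -> Optional[str]:
    """Return the path of the file in `directory` whose name matches `marker`, ignoring case."""
    target = marker.lower()
    for name in sorted(os.listdir(directory)):
        path = os.path.join(directory, name)
        if name.lower() == target and os.path.isfile(path):
            return path
    # No file with a matching name was found.
    return None
```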

Ok, we've found the directories, but how can we check them for some conditions? Let's talk about it in the next section.

Step 3. How to check directories for specific conditions

As we've already mentioned, we can pass a function to the check_function parameter, which will be used to check each directory against specific conditions. This function must return True if the directory is valid and False otherwise. Our conditions were: the config.json file must have a key valid whose value is true, and the directory must contain two subdirectories, images and anns. Let's implement this check:

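def check_function(directory: str) -> bool:
    # Read the marker file; the with-statement closes the file handle afterwards.
    with open(os.path.join(directory, MARKERS)) as f:
        config = json.load(f)
    images_dir = os.path.join(directory, "images")
    anns_dir = os.path.join(directory, "anns")

    # Valid only if the flag is true and both subdirectories exist.
    return config.get("valid") is True and os.path.isdir(images_dir) and os.path.isdir(anns_dir)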

Just as a reminder, the check_function must return a bool value.
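The same mechanism can express the AND condition for several markers mentioned in Step 2. Below is a minimal sketch of a factory that builds such a check function. The name require_all_markers is hypothetical, and the example file names come from the introduction.

```python
import os
from typing import Callable, Iterable


def require_all_markers(markers: Iterable[str]) -> Callable[[str], bool]:
    """Build a check function that accepts a directory only if it contains every marker file."""
    required = list(markers)

    def check(directory: str) -> bool:
        # True only when each required file exists directly inside the directory.
        return all(os.path.isfile(os.path.join(directory, name)) for name in required)

    return check
```

The result could then be passed as the check function, for example check_function=require_all_markers(["config.json", "data.csv"]), while a single marker such as "config.json" is still used for the search itself.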

Example of the final code

So, we've implemented the check function, and now we can use it in the sly.fs.dirs_with_marker() function. Here's the full code:

import os
import json

import supervisely as sly

DATA_DIR = "data"
ARCHIVE_PATH = os.path.join(DATA_DIR, "input_archive.zip")
EXCTRACT_PATH = os.path.join(DATA_DIR, "extracted")

# 1. Extracting the archive and removing junk files.
sly.fs.unpack_archive(ARCHIVE_PATH, EXCTRACT_PATH, remove_junk=True)

# 2. Specifying the marker we want to find.
MARKERS = "config.json"

# 3. Iterating over directories with markers without checking them.
for directory in sly.fs.dirs_with_marker(EXCTRACT_PATH, MARKERS, ignore_case=True):
    print(f"The directory with the marker '{MARKERS}' is found: '{directory}'")

# 4. Defining the check function.
def check_function(directory: str) -> bool:
    # Read the marker file; the with-statement closes the file handle afterwards.
    with open(os.path.join(directory, MARKERS)) as f:
        config = json.load(f)
    images_dir = os.path.join(directory, "images")
    anns_dir = os.path.join(directory, "anns")

    # Valid only if the flag is true and both subdirectories exist.
    return config.get("valid") is True and os.path.isdir(images_dir) and os.path.isdir(anns_dir)

# 5. Iterating over directories that contain markers and passed the check.
for checked_directory in sly.fs.dirs_with_marker(
    EXCTRACT_PATH, MARKERS, check_function=check_function, ignore_case=True
):
    print(f"The directory '{checked_directory}' is valid.")

Let's look at what we've got here:

  1. We've extracted the archive and removed junk files.

  2. We've specified the marker we want to find.

  3. We've iterated over directories with markers without checking them. In our test case, this prints two directories, although only one of them is valid.

  4. We've defined the check function that verifies that a directory meets our requirements.

  5. We've iterated over directories that passed all checks. In our test case, this prints only one directory, which is correct.

And now we can easily work with the data from the directory (or directories) we've found, knowing that it's valid.
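Since the marker file itself may be incorrect, as noted in the introduction, a more defensive check function may be worth considering. The sketch below treats an unreadable or malformed config.json as invalid instead of letting the exception surface from the check. The name safe_check and its marker parameter are hypothetical.

```python
import json
import os


def safe_check(directory: str, marker: str = "config.json") -> bool:
    """Return True only if the config is readable, has "valid": true, and both subdirectories exist."""
    try:
        with open(os.path.join(directory, marker)) as f:
            config = json.load(f)
    except (OSError, ValueError):
        # Missing, unreadable, or malformed config: treat the directory as invalid.
        return False
    if not isinstance(config, dict):
        # Valid JSON, but not an object, so there is no "valid" key to read.
        return False
    return (
        config.get("valid") is True
        and os.path.isdir(os.path.join(directory, "images"))
        and os.path.isdir(os.path.join(directory, "anns"))
    )
```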

Summary

In this tutorial, we've learned how to find directories with specific markers and check them against custom conditions using the sly.fs.dirs_with_marker() function. We've also learned how to extract an archive and clean it of junk files using the sly.fs.unpack_archive() and sly.fs.remove_junk_from_dir() functions. We hope this tutorial was helpful and will save you some time when working with import apps.


You can find the above demo archive in the data directory of the dirs-with-marker repo.

Everything you need to reproduce this tutorial is on GitHub: main.py.