SegmentsDataset

class SegmentsDataset(release_file, labelset='ground-truth', filter_by=None, filter_by_metadata=None, segments_dir='segments', preload=True, s3_client=None, gcp_client=None, load_images=True)

A class that represents a Segments dataset.

# pip install --upgrade segments-ai
from segments import SegmentsClient, SegmentsDataset
from segments.utils import export_dataset

# Initialize a SegmentsDataset from the release file
client = SegmentsClient('YOUR_API_KEY')
release = client.get_release('jane/flowers', 'v1.0') # Alternatively: release = 'flowers-v1.0.json'
dataset = SegmentsDataset(release, labelset='ground-truth', filter_by=['LABELED', 'REVIEWED'])

# Export to COCO panoptic format
export_format = 'coco-panoptic'
export_dataset(dataset, export_format)

Alternatively, you can iterate over the initialized SegmentsDataset to access each sample and its label, and then visualize or process them as needed:

import matplotlib.pyplot as plt
from segments.utils import get_semantic_bitmap

for sample in dataset:

    # Print the sample name and list of labeled objects
    print(sample['name'])
    print(sample['annotations'])

    # Show the image
    plt.imshow(sample['image'])
    plt.show()

    # Show the instance segmentation label
    plt.imshow(sample['segmentation_bitmap'])
    plt.show()

    # Show the semantic segmentation label
    semantic_bitmap = get_semantic_bitmap(sample['segmentation_bitmap'], sample['annotations'])
    plt.imshow(semantic_bitmap)
    plt.show()
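To make the last step more concrete, here is a minimal pure-Python sketch of the idea behind converting an instance bitmap into a semantic bitmap: every pixel's instance id is replaced by the category id of the matching annotation. It is an illustration only, not the library's implementation. The helper name `semantic_from_instance` is hypothetical, and the sketch assumes each annotation is a dict with `id` and `category_id` keys and that instance id 0 means "no object".

```python
def semantic_from_instance(instance_bitmap, annotations):
    """Replace each instance id in a 2D bitmap with its category id.

    Hypothetical helper for illustration; assumes annotations are dicts
    with 'id' and 'category_id' keys. Unknown ids (such as 0) map to 0.
    """
    id_to_category = {a["id"]: a["category_id"] for a in annotations}
    return [
        [id_to_category.get(pixel, 0) for pixel in row]
        for row in instance_bitmap
    ]


# Toy 2x3 bitmap: two separate instances (ids 1 and 2) of the same category.
instance_bitmap = [
    [0, 1, 1],
    [2, 2, 0],
]
annotations = [
    {"id": 1, "category_id": 3},
    {"id": 2, "category_id": 3},
]

semantic_bitmap = semantic_from_instance(instance_bitmap, annotations)
print(semantic_bitmap)
```

In the toy example both instances collapse into one category, which is the difference between the instance view and the semantic view of the same label.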

Parameters:

release_file (Union[str, Release]) – Path to a release file, or a Release object returned by get_release().
labelset (str) – The labelset that should be loaded. Defaults to ground-truth.
filter_by (Union[LabelStatus, List[LabelStatus], None]) – A label status, or a list of label statuses, to filter by. Defaults to None.
filter_by_metadata (Optional[Dict[str, str]]) – A dict of metadata key:value pairs to filter by. Filters are ANDed together. Defaults to None.
segments_dir (str) – The directory where the data will be downloaded to for caching. Set to None to disable caching. Defaults to segments. Alternatively, you can set the SEGMENTS_DIR environment variable to change the default.
preload (bool) – Whether the data should be pre-downloaded when the dataset is initialized. Ignored if segments_dir is None. Defaults to True.
s3_client (Optional[Any]) – A boto3 S3 client, e.g. s3_client = boto3.client("s3"). Needs to be provided if your images are in a private S3 bucket. Defaults to None.
gcp_client (Optional[Any]) – A Google Cloud Storage client, e.g. gcp_client = storage.Client(). Needs to be provided if your images are in a private GCP bucket. Defaults to None.
load_images (Optional[bool]) – Whether this dataset object should load images when iterating over it. Disabling this lets you load the annotations of your dataset without fetching the images. Defaults to True.
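The "ANDed together" behaviour of filter_by_metadata can be illustrated with a small standalone sketch. This is not the library's code: the helper `matches_metadata` and the metadata keys `city` and `weather` are hypothetical, and the sketch assumes a sample's metadata is a plain dict of strings. A sample is kept only if every key:value pair in the filter matches.

```python
def matches_metadata(sample_metadata, filter_by_metadata):
    """Return True if the sample satisfies every key:value pair in the filter.

    Hypothetical helper for illustration. A filter of None (or an empty
    dict) matches every sample.
    """
    if not filter_by_metadata:
        return True
    return all(
        sample_metadata.get(key) == value
        for key, value in filter_by_metadata.items()
    )


# Hypothetical sample metadata.
samples = [
    {"name": "a.png", "metadata": {"city": "London", "weather": "sunny"}},
    {"name": "b.png", "metadata": {"city": "London", "weather": "rainy"}},
    {"name": "c.png", "metadata": {"city": "Paris", "weather": "sunny"}},
]

# Both conditions must hold, so only a.png is kept.
metadata_filter = {"city": "London", "weather": "sunny"}
kept = [s["name"] for s in samples if matches_metadata(s["metadata"], metadata_filter)]
print(kept)
```

Because the conditions are combined with AND, adding more keys to the filter can only narrow the selection, never widen it.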

Raises:

ValueError – If the release task type is not one of: segmentation-bitmap, segmentation-bitmap-highres, image-vector-sequence, bboxes, vector, pointcloud-cuboid, pointcloud-cuboid-sequence, pointcloud-segmentation, pointcloud-segmentation-sequence.
ValueError – If no labelset with the given name exists.
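If you want to fail early in your own pipeline, the task type check can be mirrored with a small guard. The sketch below is an assumption-laden illustration rather than the library's implementation: the constant `SUPPORTED_TASK_TYPES` and the function `check_task_type` are hypothetical names, and the set simply copies the task types listed above.

```python
# Task types listed in the "Raises" section above.
SUPPORTED_TASK_TYPES = frozenset({
    "segmentation-bitmap",
    "segmentation-bitmap-highres",
    "image-vector-sequence",
    "bboxes",
    "vector",
    "pointcloud-cuboid",
    "pointcloud-cuboid-sequence",
    "pointcloud-segmentation",
    "pointcloud-segmentation-sequence",
})


def check_task_type(task_type):
    """Raise ValueError for a task type outside the documented list.

    Hypothetical helper for illustration only.
    """
    if task_type not in SUPPORTED_TASK_TYPES:
        supported = ", ".join(sorted(SUPPORTED_TASK_TYPES))
        raise ValueError(
            f"Unsupported task type {task_type!r}; expected one of: {supported}"
        )


check_task_type("bboxes")  # Passes silently.

try:
    check_task_type("not-a-task-type")
except ValueError as error:
    message = str(error)
    print(message)
```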