geococo

Submodules

Functions

append_dataset(dataset, images_dir, src, ...[, ...])

Move across a given geotiff, converting all intersecting labels to COCO annotations and appending them to a CocoDataset model.

create_dataset(description, contributor[, version, ...])

Instantiates and returns a new CocoDataset model with the given kwargs.

load_dataset(json_path)

Reads the contents of json_path as a string, interprets it as a CocoDataset model and returns it.

save_dataset(dataset, json_path)

JSON-encodes an instance of CocoDataset and saves it to json_path.

Package Contents

geococo.append_dataset(dataset, images_dir, src, window_bounds, labels, id_attribute=None, name_attribute=None, super_attribute=None)[source]

Move across a given geotiff, converting all intersecting labels to COCO annotations and appending them to a CocoDataset model. This is done through rasterio.windows.Window objects, the bounds of which you can set with window_bounds (this also determines the size of the output images associated with the Annotation instances). The degree of overlap between these windows is determined by the dimensions of the given labels, to maximize their representation in the resulting dataset.
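The window traversal described above can be pictured with a minimal sketch in plain Python. It is not geococo's implementation: the helper names `window_offsets` and `tile` are hypothetical, the overlap is passed in as an explicit number of pixels (whereas geococo derives it from the label dimensions), and the (width, height) order of a window size is an assumption. The yielded tuples follow the argument order of rasterio.windows.Window (col_off, row_off, width, height).

```python
from typing import Iterator, Tuple


def window_offsets(length: int, window: int, overlap: int) -> Iterator[int]:
    """Yield start offsets of windows of size `window` along one axis."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if window >= length:
        # The window covers the whole axis; a single offset is enough.
        yield 0
        return
    step = window - overlap
    offset = 0
    while offset + window < length:
        yield offset
        offset += step
    # The last window is shifted back so that it ends exactly at the edge.
    yield length - window


def tile(
    width: int, height: int, window: Tuple[int, int], overlap: int
) -> Iterator[Tuple[int, int, int, int]]:
    """Yield (col_off, row_off, width, height) tuples covering the raster."""
    win_w, win_h = window  # assumed order: (width, height)
    for row_off in window_offsets(height, win_h, overlap):
        for col_off in window_offsets(width, win_w, overlap):
            yield (col_off, row_off, win_w, win_h)
```

For a raster of 8 by 4 pixels, `list(tile(8, 4, (4, 4), 0))` gives two non-overlapping windows; a larger overlap produces more windows, so a label near a window edge is more likely to fall entirely inside at least one of them.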

The “iscrowd” attribute (see https://cocodataset.org/#format-data) is determined by whether the respective label is a Polygon or a MultiPolygon instance. The attribute arguments are column names of the labels GeoDataFrame and supply (super)category names, category ids, or both. If only names are given, ids are autogenerated (and vice versa). Lastly, each call to this function increments the dataset version: patch if the same image_path is used, minor if a new image_path is used, and major if a new output_dir is used.
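The version rule at the end of the paragraph above can be sketched as follows. This is an illustration only: `SemVer` and `bump` are hypothetical names, not part of geococo, and resetting the lower components to zero on a minor or major bump is assumed from the usual SemVer convention rather than stated in this documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemVer:
    """Hypothetical stand-in for the version stored on the dataset."""

    major: int = 0
    minor: int = 0
    patch: int = 0

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.patch}"


def bump(version: SemVer, same_output_dir: bool, same_image_path: bool) -> SemVer:
    """Return the next version according to the rule described above."""
    if not same_output_dir:
        # New output directory: major bump (lower parts reset, by assumption).
        return SemVer(version.major + 1, 0, 0)
    if not same_image_path:
        # New input image, same output directory: minor bump.
        return SemVer(version.major, version.minor + 1, 0)
    # Same input image and same output directory: patch bump.
    return SemVer(version.major, version.minor, version.patch + 1)
```

Under these assumptions, appending twice from the same image to the same directory would take a dataset from 0.0.0 to 0.0.2, while switching to a new image afterwards would give 0.1.0.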

Parameters:
  • dataset (geococo.coco_models.CocoDataset) – CocoDataset model to append images and annotations to

  • images_dir (pathlib.Path) – output directory for all label images

  • src (rasterio.io.DatasetReader) – open rasterio reader for input raster

  • labels (geopandas.GeoDataFrame) – GeoDataFrame containing labels and class_info (‘category_id’)

  • window_bounds (List[Tuple[int, int]]) – a list of window bounds to attempt to use (these also determine the size of the output images)

  • id_attribute (Optional[str]) – Column containing category_id values

  • name_attribute (Optional[str]) – Column containing category_name values

  • super_attribute (Optional[str]) – Column containing supercategory values
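How the three attribute arguments interact ("if only names are given, ids are autogenerated, and vice versa") might be easier to see in a small sketch. The function `resolve_categories` is hypothetical and works on plain lists instead of GeoDataFrame columns; numbering ids from 1 in order of first appearance and using `str(id)` as a generated name are assumptions, since this documentation does not specify how geococo generates either.

```python
from typing import Dict, List, Optional, Tuple


def resolve_categories(
    names: Optional[List[str]] = None,
    ids: Optional[List[int]] = None,
) -> List[Tuple[int, str]]:
    """Return one (category_id, category_name) pair per label."""
    if names is None and ids is None:
        raise ValueError("provide category names, category ids, or both")
    if ids is None:
        # Only names given: generate ids in order of first appearance.
        lookup: Dict[str, int] = {}
        for name in names:
            lookup.setdefault(name, len(lookup) + 1)
        return [(lookup[name], name) for name in names]
    if names is None:
        # Only ids given: generate placeholder names from the ids.
        return [(category_id, str(category_id)) for category_id in ids]
    if len(names) != len(ids):
        raise ValueError("names and ids must have the same length")
    # Both given: pair them up as they are.
    return list(zip(ids, names))
```

For example, `resolve_categories(names=["tree", "car", "tree"])` assigns the same id to both "tree" labels, which is the property that matters for a consistent COCO categories list.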

Returns:

CocoDataset with n appended Image, Category and Annotation instances

Return type:

geococo.coco_models.CocoDataset

geococo.create_dataset(description, contributor, version=str(Version(major=0)), date_created=datetime.now())[source]

Instantiates and returns a new CocoDataset model with the given kwargs.

Parameters:
  • description (str) – Description of your COCO dataset

  • contributor (str) – Main contributors of your COCO dataset, its images and its annotations

  • version (str) – Initial SemVer version (defaults to 0.0.0)

  • date_created (datetime.datetime) – Date when dataset was initially created, defaults to datetime.now()

Returns:

An instance of CocoDataset without Image and Annotation objects

Return type:

geococo.coco_models.CocoDataset

geococo.load_dataset(json_path)[source]

Reads the contents of json_path as a string, interprets it as a CocoDataset model and returns it.

Parameters:

json_path (pathlib.Path) – path to the JSON file containing the JSON-encoded COCO dataset

Returns:

An instance of CocoDataset with Image and Annotation objects loaded from json_path

Return type:

geococo.coco_models.CocoDataset
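The read-then-interpret step of load_dataset, and its counterpart in save_dataset, can be illustrated with a standard-library round trip. `MiniCocoDataset`, `save_mini` and `load_mini` are hypothetical stand-ins: the real CocoDataset model lives in geococo.coco_models and is not reproduced here, and only the top-level COCO keys info, images, annotations and categories are modelled.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path
from typing import Any, Dict, List


@dataclass
class MiniCocoDataset:
    """Hypothetical, heavily simplified stand-in for CocoDataset."""

    info: Dict[str, Any]
    images: List[Dict[str, Any]] = field(default_factory=list)
    annotations: List[Dict[str, Any]] = field(default_factory=list)
    categories: List[Dict[str, Any]] = field(default_factory=list)


def save_mini(dataset: MiniCocoDataset, json_path: Path) -> None:
    """JSON-encode the dataset and write it to json_path."""
    json_path.write_text(json.dumps(asdict(dataset)), encoding="utf-8")


def load_mini(json_path: Path) -> MiniCocoDataset:
    """Read json_path as a string and interpret it as a dataset model."""
    raw = json_path.read_text(encoding="utf-8")
    return MiniCocoDataset(**json.loads(raw))
```

Saving a dataset and loading it again from the same path should give back an equal object, which is a convenient sanity check for any model that is persisted this way.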

geococo.save_dataset(dataset, json_path)[source]

JSON-encodes an instance of CocoDataset and saves it to json_path.

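Parameters:
  • dataset (geococo.coco_models.CocoDataset) – CocoDataset instance to JSON-encode

  • json_path – path of the JSON file to save the encoded dataset to
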
Return type:

None
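Putting the four functions together, a typical workflow might look like the sketch below. It uses only the signatures documented on this page, plus rasterio.open and geopandas.read_file. The wrapper `build_dataset`, its argument names and the description and contributor strings are hypothetical. It is also assumed that the labels share the raster's coordinate reference system and that images_dir already exists, neither of which this page confirms.

```python
from pathlib import Path
from typing import List, Optional, Tuple


def build_dataset(
    raster_path: Path,
    labels_path: Path,
    images_dir: Path,
    json_path: Path,
    window_bounds: List[Tuple[int, int]],
    name_attribute: Optional[str] = None,
):
    """Create a dataset, append one raster's labels, save it and reload it."""
    # Imported lazily so that this module can be imported without the
    # geospatial stack being installed.
    import geopandas as gpd
    import rasterio

    import geococo

    # Vector labels; name_attribute must be a column of this GeoDataFrame.
    labels = gpd.read_file(labels_path)

    dataset = geococo.create_dataset(
        description="Example dataset",  # placeholder text
        contributor="Example contributor",  # placeholder text
    )

    # append_dataset expects an open rasterio reader.
    with rasterio.open(raster_path) as src:
        dataset = geococo.append_dataset(
            dataset=dataset,
            images_dir=images_dir,
            src=src,
            window_bounds=window_bounds,
            labels=labels,
            name_attribute=name_attribute,
        )

    geococo.save_dataset(dataset=dataset, json_path=json_path)
    return geococo.load_dataset(json_path=json_path)
```

Calling the wrapper again with another raster would require loading the saved dataset first and passing it to append_dataset instead of creating a new one, so that the version history described above is preserved.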