geococo.utils
=============

.. py:module:: geococo.utils


Functions
---------

.. autoapisummary::

   geococo.utils.mask_label
   geococo.utils.window_intersect
   geococo.utils.reshape_image
   geococo.utils.generate_window_polygon
   geococo.utils.generate_window_offsets
   geococo.utils.window_factory
   geococo.utils.estimate_average_bounds
   geococo.utils.estimate_schema
   geococo.utils.validate_labels
   geococo.utils.update_labels
   geococo.utils.get_date_created


Module Contents
---------------

.. py:function:: mask_label(input_raster, label)

   Masks out a label from input_raster and flattens it to a 2D binary array. If the label
   does not overlap the raster, the resulting mask consists only of False values.

   :param input_raster: open rasterio DatasetReader for the input raster
   :param label: Polygon object representing the area to be masked (i.e. label)
   :return: A 2D binary array representing the label


.. py:function:: window_intersect(input_raster, input_vector)

   Generates a rasterio Window from the intersecting extents of the input data. It also
   verifies that the input data share the same CRS and that they physically overlap.

   :param input_raster: rasterio dataset (i.e. input image)
   :param input_vector: geopandas GeoDataFrame (i.e. input labels)
   :return: rasterio Window that represents the intersection between the input data extents


.. py:function:: reshape_image(img_array, shape, padding_value = 0)

   Reshapes a 3D numpy array to match the given 3D shape, through slicing or padding.

   :param img_array: the numpy array to be reshaped
   :param shape: the desired shape (bands, rows, cols)
   :param padding_value: what value to pad img_array with (if too small)
   :return: numpy array in desired shape


.. py:function:: generate_window_polygon(datasource, window)

   Turns the spatial bounds of a given window into a shapely.Polygon object in a given
   dataset's CRS.

   :param datasource: a rasterio DatasetReader object that provides the affine transformation
   :param window: bounds to represent as Polygon
   :return: shapely Polygon representing the spatial bounds of a given window in a given CRS


.. py:function:: generate_window_offsets(window, schema)

   Computes an array of window offsets bound by a given window.

   :param window: the bounding window (i.e. offsets will be within its bounds)
   :param schema: the parameters for the window generator
   :return: an array of window offsets within the bounds of window


.. py:function:: window_factory(parent_window, schema, boundless = True)

   Generator that produces rasterio.Window objects in predetermined steps, within the
   given Window.

   :param parent_window: the window that provides the bounds for all child_window objects
   :param schema: the parameters that determine the window steps
   :param boundless: whether the child_window should be clipped by the parent_window or not
   :yield: a rasterio.Window used for windowed reading/writing


.. py:function:: estimate_average_bounds(gdf, quantile = 0.9)

   Estimates the average size of all features in a GeoDataFrame.

   :param gdf: GeoDataFrame that contains all features (i.e. shapely.Geometry objects)
   :param quantile: what quantile will represent the feature population
   :return: a tuple of floats representing average width and height


.. py:function:: estimate_schema(gdf, src, quantile = 0.9, window_bounds = [(256, 256), (512, 512)])

   Attempts to find a schema that is able to represent the average GeoDataFrame feature
   (i.e. sufficient overlap) but within the bounds given by window_bounds.

   :param gdf: GeoDataFrame that contains features that determine the degree of overlap
   :param src: the rasterio dataset associated with the resulting schema (i.e. bounds and
      pixel sizes)
   :param quantile: what quantile will represent the feature population
   :param window_bounds: a list of possible limits for the window generators
   :return: (if found) a viable WindowSchema with sufficient overlap within the window_bounds


.. py:function:: validate_labels(labels, id_attribute = 'category_id', name_attribute = None, super_attribute = None)

   Validates all necessary attributes for a geococo-viable GeoDataFrame. It also checks
   for the presence of either category_id or category_name values and ensures valid
   geometry.

   :param labels: GeoDataFrame containing labels and category attributes
   :param id_attribute: Column name that holds category_id values
   :param name_attribute: Column name that holds category_name values
   :param super_attribute: Column name that holds supercategory values
   :return: Validated GeoDataFrame with coerced dtypes


.. py:function:: update_labels(labels, categories, id_attribute = 'category_id', name_attribute = None)

   Updates labels with validated (super)category names and ids from the given Category
   instances (i.e. the source of truth created from current and previous labels). This
   matches a given key (name or id) with the keys in each Category instance and maps the
   associated (and validated) values to labels.

   :param labels: GeoDataFrame containing labels and category attributes (validated by
      validate_labels)
   :param categories: list of Category instances created from current and previous labels
   :param id_attribute: Column name that holds category_id values
   :param name_attribute: Column name that holds category_name values
   :return: labels with name, id and supercategory attributes from all given Category instances


.. py:function:: get_date_created(raster_source)

   Gets the creation date of the input image, represented as a datetime object. If no
   such information is available in the image's metadata, the date the file itself was
   last modified is returned instead.

   :param raster_source: reader for input image
   :return: datetime object representing date_created
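
The slice-or-pad behaviour described for ``reshape_image`` can be illustrated with a
minimal NumPy sketch. This is not geococo's implementation: the helper name
``reshape_sketch`` is hypothetical, and padding at the trailing edge of each axis is an
assumption, since the description does not state where the padding is applied.

.. code-block:: python

   import numpy as np


   def reshape_sketch(img_array, shape, padding_value=0):
       """Slice or pad a 3D array so it matches `shape` (bands, rows, cols)."""
       if img_array.ndim != 3 or len(shape) != 3:
           raise ValueError("expected a 3D array and a 3D target shape")

       # Slice away anything beyond the target extent on every axis.
       sliced = img_array[tuple(slice(0, size) for size in shape)]

       # Pad the trailing edge of every axis that is still too small
       # (assumption: padding goes at the end, not around the centre).
       pad_width = [(0, size - have) for size, have in zip(shape, sliced.shape)]
       return np.pad(
           sliced, pad_width, mode="constant", constant_values=padding_value
       )


   # A (3, 4, 6) image is cut to 5 columns and padded to 5 rows.
   image = np.ones((3, 4, 6), dtype=np.uint8)
   result = reshape_sketch(image, (3, 5, 5), padding_value=0)

In this sketch, axes that are too large are cut first and axes that are too small are
padded afterwards, so a single call handles an array that is too wide on one axis and
too short on another.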