pixcdust.readers

Readers for the SWOT Pixel Cloud. They support the official NetCDF format as well as converted Zarr and GeoPackage databases.

Classes

NcSimpleReader

Class for reading SWOT Pixel cloud official format files.

GpkgReader

GeoPackage pixcdust database reader.

ZarrReader

Zarr pixcdust database reader.

Package Contents

class pixcdust.readers.NcSimpleReader(path: str | Iterable[str] | pathlib.Path | Iterable[pathlib.Path], variables: list[str] | None = None, area_of_interest: geopandas.GeoDataFrame | None = None, format_cfg: NcFormatCfg | None = None, conditions: dict[str, dict[str, str | float]] | None = None)[source]

Bases: pixcdust.readers.base_reader.BaseReader

Class for reading SWOT Pixel cloud official format files.

It is intended for simple use cases, as it only reads the pixel_cloud group.

Attributes:

path: Path or list of paths to read.
variables: Optionally only read these variables.
area_of_interest: Optionally only read points in area_of_interest.
MULTI_FILE_SUPPORT: True, this class can read multiple NetCDF files.
conditions: Optionally pass conditions to filter variables. Example: {"sig0": {"operator": "ge", "threshold": 20}, "classification": {"operator": "ge", "threshold": 3}}

MULTI_FILE_SUPPORT = True
forbidden_variables
trusted_group
cst
conditions = None
static extract_info_from_nc_attrs(filename: str) Tuple[str, datetime.datetime, int, int, int, str][source]

Extracts orbit information from the global attributes of a SWOT pixel cloud NetCDF file.

Args:

filename: Path of the SWOT PIXC NetCDF file.

Returns:

(time of granule start as string, time of granule start as datetime, cycle number, pass number, tile number, swath side)

filter_variable() None[source]

Filters xarray dataset based on operator and threshold on specific variables.

Raises:

IOError: If a variable provided in conditions is not in the dataset.
ValueError: If the 'operator' or 'threshold' key is missing from a condition.
AttributeError: If operator is not the name of a function in the operator module.
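
The exceptions listed above suggest that each operator name is looked up in Python's standard operator module. The following is a minimal sketch of that kind of lookup on plain lists. It is not pixcdust's actual implementation, and build_mask and columns are hypothetical names used only for illustration.

```python
import operator


def build_mask(columns, conditions):
    """Return one boolean per row, True when the row satisfies every condition."""
    n_rows = len(next(iter(columns.values())))
    mask = [True] * n_rows
    for name, cond in conditions.items():
        if name not in columns:
            raise IOError(f"variable {name!r} is not in the dataset")
        if "operator" not in cond or "threshold" not in cond:
            raise ValueError(f"condition on {name!r} needs 'operator' and 'threshold'")
        # getattr raises AttributeError when the operator module has no such function.
        compare = getattr(operator, cond["operator"])
        mask = [
            keep and compare(value, cond["threshold"])
            for keep, value in zip(mask, columns[name])
        ]
    return mask


# Hypothetical sample values, three points.
columns = {"sig0": [10.0, 25.0, 30.0], "classification": [4, 2, 3]}
conditions = {
    "sig0": {"operator": "ge", "threshold": 20},
    "classification": {"operator": "ge", "threshold": 3},
}
mask = build_mask(columns, conditions)
```

Only the third point passes both conditions here. Names such as "ge", "gt", "le", "lt", "eq" and "ne" all exist in the operator module.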

read(orbit_info: bool = False) None[source]

Load the file(s) in self.path. You can then access the result through the data attribute or with methods such as to_xarray, to_dataframe or to_geodataframe.

See self.open_mfdataset for more details on how multiple files are merged.

Args:

orbit_info: Option to extract the orbit information. Only used if multiple files are read.

open_dataset() None[source]

Load the file in self.path (self.path must contain exactly one file). You can then access the result through the data attribute or with methods such as to_xarray, to_dataframe or to_geodataframe.

open_mfdataset(orbit_info: bool = False) None[source]

Load the file(s) in self.path as a nested array. You can then access the result through the data attribute or with methods such as to_xarray, to_dataframe or to_geodataframe.

Variables that are not one-dimensional along the points dimension are not allowed and will be dropped, for example:

  • pixc_line_qual

  • pixc_line_to_tvp

  • interferogram

Args:

orbit_info: Option to extract the orbit information.

to_h3(variables: str | list[str] | None = None, resolution: int = 8, interp: bool = False, method: str = 'linear') xarray.Dataset[source]

Convert a Dataset with latitude and longitude coordinates into an H3-indexed grid.

Args:

variables: The variables to convert onto the H3 grid, all variables by default.
resolution: The resolution of the H3 grid. Valid values range from 0 (coarse) to 15 (fine).
interp: If True, interpolate the data. This can be more precise but takes much longer. Default is False.
method: The interpolation method ('nearest', 'linear' or 'cubic') used by scipy.interpolate.griddata.

Returns:

A new dataset with data variables interpolated onto the H3 grid
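
The interp and method arguments refer to scipy.interpolate.griddata. As a rough illustration of how the choice of method changes the result, here is a sketch on four hypothetical sample points. It is independent of pixcdust and does not involve any H3 or HEALPix indexing.

```python
import numpy as np
from scipy.interpolate import griddata

# Four hypothetical scattered samples of the plane f(x, y) = x + y.
points = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
values = np.array([0.0, 1.0, 1.0, 2.0])

# Hypothetical target locations, standing in for grid cell centres.
targets = np.array([[0.25, 0.25], [0.25, 0.75]])

# "linear" interpolates inside the triangles formed by the samples.
linear = griddata(points, values, targets, method="linear")

# "nearest" copies the value of the closest sample.
nearest = griddata(points, values, targets, method="nearest")
```

Because the sampled function is a plane, the linear method recovers it exactly at the targets (0.5 and 1.0), while the nearest method returns the values of the closest corners (0.0 and 1.0).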

to_healpix(variables: str | list[str] | None = None, resolution: int = 8, interp: bool = False, method: str = 'linear') xarray.Dataset[source]

Convert a Dataset with latitude and longitude coordinates into a HEALPix-indexed grid.

Args:

variables: The variables to convert onto the HEALPix grid, all variables by default.
resolution: The resolution of the HEALPix grid.
interp: If True, interpolate the data. This can be more precise but takes much longer. Default is False.
method: The interpolation method ('nearest', 'linear' or 'cubic') used by scipy.interpolate.griddata.

Returns:

A new dataset with data variables interpolated onto the HEALPix grid.

__postprocess_points() None[source]

Adds a points coordinate containing shapely Point objects (longitude, latitude). Useful for compatibility with the xvec package and for geographic manipulation.

__preprocess_types(ds: xarray.Dataset) xarray.Dataset[source]

Preprocessing function changing types in a pixc dataset.

It casts lon and lat to float32.

Args:

ds: The pixc dataset read by xarray.open_dataset, to be preprocessed.

Returns:

dataset with cast types
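
The documentation does not state why the coordinates are cast, but the effect of a float32 cast on precision can be checked with the standard library alone. The sketch below is a generic illustration, not pixcdust code. The helper to_float32 and the sample longitude are hypothetical.

```python
import struct


def to_float32(value):
    """Round a Python float (float64) to the nearest float32 and return it as a float."""
    return struct.unpack("f", struct.pack("f", value))[0]


longitude = 179.123456789  # hypothetical coordinate, in degrees
rounded = to_float32(longitude)
error = abs(rounded - longitude)

# Between 128 and 256, neighbouring float32 values are 2**-16 apart,
# so rounding moves a coordinate by at most 2**-17 degrees.
max_error = 2 ** -17
```

For coordinates of this magnitude the rounding error stays below about 7.7e-6 degrees, which is worth keeping in mind when comparing cast coordinates against the original float64 values.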

__preprocess_types_and_add_orbit_info(ds: xarray.Dataset) xarray.Dataset[source]

Preprocessing function adding orbit information to a pixc dataset.

It also casts lon and lat to float32.

Args:

ds: The pixc dataset read by xarray.open_dataset, to be preprocessed.

Returns:

dataset augmented with orbit information for each index and with cast types

class pixcdust.readers.GpkgReader(path: str | pathlib.Path, area_of_interest: geopandas.GeoDataFrame | None = None)[source]

Bases: pixcdust.readers.base_reader.BaseReader

GeoPackage pixcdust database reader.

Read a database from a GeoPackage file. You can then request an xr.Dataset, pd.DataFrame or gpd.GeoDataFrame view of the database.

Attributes:

path: Path to read.
variables: Not supported.
area_of_interest: Optionally only read points in area_of_interest.
MULTI_FILE_SUPPORT: False, only supports one file.

_gdf_data: geopandas.GeoDataFrame | None = None
layers: list[str]
property data: xarray.Dataset

Return an xarray.Dataset view of the loaded database.

Equivalent to to_xarray.

Returns:

Dataset read

to_geodataframe() geopandas.GeoDataFrame[source]

Convert the loaded database to a gpd.GeoDataFrame. Only points in self.area_of_interest are included.

Returns:

GeoDataFrame read.

read_single_layer(layer: str) geopandas.GeoDataFrame[source]

Read and return a single layer of the GeoPackage database.

The data read is not stored in the reader, so it cannot afterwards be converted with the reader's methods. Use read for more advanced usage.

Args:

layer: Name of the layer to read. Must be in self.layers.

Returns:

GeoDataFrame containing the data read from the layer.

read(layers: List[str] | None = None) None[source]

Load all layers, or a subset of layers, from the GeoPackage database. You can then access the result through the data attribute or with methods such as to_xarray, to_dataframe or to_geodataframe.

Args:

layers: Optional list of layers to load. Defaults to all layers.

class pixcdust.readers.ZarrReader(path: str | Iterable[str] | pathlib.Path | Iterable[pathlib.Path], variables: list[str] | None = None, area_of_interest: geopandas.GeoDataFrame | None = None, conditions: dict[str, dict[str, str | float]] | None = None)[source]

Bases: pixcdust.readers.base_reader.BaseReader

Zarr pixcdust database reader.

Read a database stored as a Zarr folder. You can then request an xr.Dataset, pd.DataFrame or gpd.GeoDataFrame view of the database.

Attributes:

path: Path to read.
variables: Optionally only read these variables.
area_of_interest: Optionally only read points in area_of_interest.
MULTI_FILE_SUPPORT: False, only supports one file.

read(date_interval: Tuple[datetime.datetime, datetime.datetime] | None = None) None[source]

Load a Zarr database. You can then access the result through the data attribute or with methods such as to_xarray, to_dataframe or to_geodataframe.

Args:

date_interval: Optional date filter on the database read. Only data dated within the interval is loaded.
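
As an illustration of what a date_interval filter means, here is a minimal sketch on a plain list of datetimes. It is not how ZarrReader is implemented. The helper within_interval is hypothetical, and treating both bounds as inclusive is an assumption, since the documentation above does not say how the bounds are handled.

```python
from datetime import datetime


def within_interval(dates, date_interval=None):
    """Keep the dates inside (start, end); keep everything when no interval is given."""
    if date_interval is None:
        return list(dates)
    start, end = date_interval
    # Assumption: both bounds are inclusive.
    return [d for d in dates if start <= d <= end]


# Hypothetical acquisition dates.
acquisitions = [datetime(2023, 4, 1), datetime(2023, 4, 15), datetime(2023, 5, 2)]
april = within_interval(acquisitions, (datetime(2023, 4, 1), datetime(2023, 4, 30)))
```

Passing None, the default, keeps every date, which mirrors the optional nature of the argument.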