Client

class cellarium.cas.client.CASClient(api_token: str, num_attempts_per_chunk: int = 3)[source]

Bases: object

Client service for communicating with the Cellarium Cloud backend.

Parameters:
  • api_token – API token issued by the Cellarium team

  • num_attempts_per_chunk – Number of attempts the client should make to annotate each chunk.
    Default: 3
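The effect of num_attempts_per_chunk can be pictured with a small retry loop. This is only a sketch of the general idea, not the client's actual implementation: call_with_attempts and flaky_send are hypothetical names, and which errors the real client retries on is not stated here.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_attempts(send_chunk: Callable[[], T], num_attempts: int) -> T:
    """Call send_chunk up to num_attempts times and return the first success.

    Hypothetical helper for illustration; not part of cellarium.cas.
    """
    if num_attempts < 1:
        raise ValueError("num_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for _ in range(num_attempts):
        try:
            return send_chunk()
        except Exception as error:  # assumption: a real client would catch narrower errors
            last_error = error
    assert last_error is not None
    raise last_error


# Simulated chunk request that fails twice before succeeding.
calls = {"count": 0}


def flaky_send() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "annotated"


result = call_with_attempts(flaky_send, num_attempts=3)
```

With fewer attempts than failures, the last error is re-raised to the caller instead of being swallowed.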

annotate_10x_h5_file(filepath: str, chunk_size: int = 1000, cas_model_name: str = 'default', count_matrix_name: str = 'X', feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None, include_dev_metadata: bool = False) List[Dict[str, Any]][source]

Parse the 10x ‘h5’ matrix and apply the annotate_anndata() method to it.

Parameters:
  • filepath – Filepath of the local ‘h5’ matrix

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_name – Where to obtain a feature expression count matrix from.
    Allowed Values: Choice of either "X" or "raw.X" in order to use adata.X or adata.raw.X
    Default: "X"

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: "index"

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

  • include_dev_metadata – Boolean indicating whether to include a breakdown of the number of cells by dataset

Returns:

A list of dictionaries with annotations for each of the cells from input adata

annotate_anndata(adata: AnnData, chunk_size=1000, cas_model_name: str = 'default', count_matrix_name: str = 'X', feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None, include_dev_metadata: bool = False) List[Dict[str, Any]][source]

Send an instance of anndata.AnnData to the Cellarium Cloud backend for annotations. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.

Parameters:
  • adata – anndata.AnnData instance to annotate

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_name – Where to obtain a feature expression count matrix from.
    Allowed Values: Choice of either "X" or "raw.X" in order to use adata.X or adata.raw.X
    Default: "X"

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: "index"

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

  • include_dev_metadata – Boolean indicating whether to include a breakdown of the number of cells by dataset

Returns:

A list of dictionaries with annotations for each of the cells from input adata
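The chunking behaviour described for annotate_anndata() (equal-size chunks, a possibly smaller last chunk, chunks sent concurrently) can be sketched in plain Python. Everything here is an assumption made for illustration: split_into_chunks, fake_annotate_chunk and annotate_all are hypothetical names, the backend call is simulated locally, and the {"cell_id": ...} record shape is invented rather than taken from the real response.

```python
import asyncio
from typing import Any, Dict, List, Sequence


def split_into_chunks(rows: Sequence[Any], chunk_size: int) -> List[Sequence[Any]]:
    """Split rows into consecutive chunks; only the last one may be smaller."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    return [rows[start:start + chunk_size] for start in range(0, len(rows), chunk_size)]


async def fake_annotate_chunk(chunk: Sequence[str]) -> List[Dict[str, Any]]:
    """Stand-in for one backend request (hypothetical response shape)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [{"cell_id": cell_id} for cell_id in chunk]


async def annotate_all(rows: Sequence[str], chunk_size: int) -> List[Dict[str, Any]]:
    """Send all chunks concurrently and flatten the results in input order."""
    chunks = split_into_chunks(rows, chunk_size)
    per_chunk = await asyncio.gather(*(fake_annotate_chunk(c) for c in chunks))
    return [record for chunk_result in per_chunk for record in chunk_result]


cell_ids = [f"cell_{i}" for i in range(2500)]
chunk_lengths = [len(c) for c in split_into_chunks(cell_ids, 1000)]
annotations = asyncio.run(annotate_all(cell_ids, 1000))
```

asyncio.gather returns results in the order the coroutines were passed, so the flattened list keeps one record per input cell in the original order.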

annotate_anndata_file(filepath: str, chunk_size=1000, cas_model_name: str = 'default', count_matrix_name: str = 'X', feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None, include_dev_metadata: bool = False) List[Dict[str, Any]][source]

Read the ‘h5ad’ file into an anndata.AnnData matrix and apply the annotate_anndata() method to it.

Parameters:
  • filepath – Filepath of the local anndata.AnnData matrix

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_name – Where to obtain a feature expression count matrix from.
    Allowed Values: Choice of either "X" or "raw.X" in order to use adata.X or adata.raw.X
    Default: "X"

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: "index"

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

  • include_dev_metadata – Boolean indicating whether to include a breakdown of the number of cells per dataset

Returns:

A list of dictionaries with annotations for each of the cells from input adata
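A call that overrides several defaults might look like the sketch below. The keyword arguments are the ones listed in the signature above; the wrapper annotate_raw_counts_file, the stub RecordingClient, the file name sample.h5ad and the "gene_symbols" column are hypothetical and serve only to keep the example runnable without a token or network access. In real use, client would be a CASClient instance.

```python
from typing import Any, Dict, List


def annotate_raw_counts_file(client: Any, filepath: str) -> List[Dict[str, Any]]:
    """Annotate an h5ad file using adata.raw.X and a gene-symbol column."""
    return client.annotate_anndata_file(
        filepath=filepath,
        chunk_size=500,
        cas_model_name="default",
        count_matrix_name="raw.X",
        feature_ids_column_name="index",
        feature_names_column_name="gene_symbols",  # hypothetical column in adata.var
        include_dev_metadata=False,
    )


class RecordingClient:
    """Stand-in for CASClient that records the keyword arguments it receives."""

    def __init__(self) -> None:
        self.received: Dict[str, Any] = {}

    def annotate_anndata_file(self, **kwargs: Any) -> List[Dict[str, Any]]:
        self.received = kwargs
        return []  # a real client would return one dictionary per cell


stub = RecordingClient()
result = annotate_raw_counts_file(stub, "sample.h5ad")
```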

query_cells_by_ids(cell_ids: List[int], model_name: str, metadata_feature_names: List[str] | None = None) List[Dict[str, Any]][source]

Query cells by their ids from a single anndata file with Cellarium CAS. Input file should be validated and sanitized according to the model schema.

Parameters:
  • cell_ids – List of cell ids to query

  • model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • metadata_feature_names – List of metadata feature names to include in the response.

Returns:

List of cells with metadata

search_10x_h5_file(filepath: str, chunk_size: int = 500, cas_model_name: str = 'default', count_matrix_name: str = 'X', feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None) List[Dict[str, Any]][source]

Parse the 10x ‘h5’ matrix and apply the search_anndata() method to it.

Parameters:
  • filepath – Filepath of the local ‘h5’ matrix

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_name – Where to obtain a feature expression count matrix from.
    Allowed Values: Choice of either "X" or "raw.X" in order to use adata.X or adata.raw.X
    Default: "X"

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: "index"

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

Returns:

A list of dictionaries with annotations for each of the cells from input adata

search_anndata(adata: AnnData, chunk_size=500, cas_model_name: str = 'default', count_matrix_name: str = 'X', feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None) List[Dict[str, Any]][source]

Send an instance of anndata.AnnData to the Cellarium Cloud backend for nearest neighbor search. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.

Parameters:
  • adata – anndata.AnnData instance to annotate

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_name – Where to obtain a feature expression count matrix from.
    Allowed Values: Choice of either "X" or "raw.X" in order to use adata.X or adata.raw.X
    Default: "X"

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: "index"

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

Returns:

A list of dictionaries with annotations for each of the cells from input adata

validate_and_sanitize_input_data(adata: AnnData, cas_model_name: str, count_matrix_name: str, feature_ids_column_name: str, feature_names_column_name: str | None = None) AnnData[source]

Validate and sanitize input anndata.AnnData instance according to a specified feature schema associated with a particular model.

Parameters:
  • adata – anndata.AnnData instance to annotate

  • cas_model_name – The model associated with the schema used for sanitizing.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.

  • count_matrix_name – Where to obtain a feature expression count matrix from.
    Allowed Values: Choice of either "X" or "raw.X" in order to use adata.X or adata.raw.X

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

Returns:

Validated and sanitized instance of anndata.AnnData
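The meaning of the two allowed count_matrix_name values can be illustrated with a small selector. This is a sketch under stated assumptions, not the library's validation code: select_count_matrix is a hypothetical helper, and fake_adata is a SimpleNamespace that mimics only the X and raw.X attributes of an anndata.AnnData object.

```python
from types import SimpleNamespace
from typing import Any


def select_count_matrix(adata: Any, count_matrix_name: str) -> Any:
    """Return adata.X or adata.raw.X depending on count_matrix_name."""
    if count_matrix_name == "X":
        return adata.X
    if count_matrix_name == "raw.X":
        if adata.raw is None:
            raise ValueError('count_matrix_name is "raw.X" but adata.raw is not set')
        return adata.raw.X
    raise ValueError(
        f'count_matrix_name must be "X" or "raw.X", got {count_matrix_name!r}'
    )


# Minimal object with the same attribute layout as AnnData (illustration only).
fake_adata = SimpleNamespace(X=[[0.5, 1.2]], raw=SimpleNamespace(X=[[1, 3]]))
selected = select_count_matrix(fake_adata, "raw.X")
```

Any other value, such as a layer name, raises ValueError in this sketch; how the real client reports an unsupported value is not specified here.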

validate_model_name(model_name: str) None[source]

Check whether the provided model name is valid.

Parameters:

model_name – Model name to check

Raises:

ValueError if model name is not valid
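The check can be pictured as a membership test against the list of allowed models. The sketch below is an assumption about the general shape of such a check, not the method's actual code: check_model_name is a hypothetical function, the model names "model-a" and "model-b" are invented, and treating "default" as always valid is inferred from the "Allowed Values" notes above.

```python
from typing import Iterable


def check_model_name(model_name: str, allowed_models: Iterable[str]) -> None:
    """Raise ValueError unless model_name is "default" or an allowed model."""
    if model_name == "default":
        return  # assumption: the keyword always resolves to the backend default
    allowed = sorted(set(allowed_models))
    if model_name not in allowed:
        raise ValueError(
            f"Unknown model name {model_name!r}; allowed values: {allowed}"
        )


allowed_models_list = ["model-a", "model-b"]  # hypothetical names
check_model_name("default", allowed_models_list)  # passes silently
check_model_name("model-a", allowed_models_list)  # passes silently

try:
    check_model_name("model-z", allowed_models_list)
    error_message = ""
except ValueError as error:
    error_message = str(error)
```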