Client

class cellarium.cas.client.CASClient(api_token: str, api_url: str = 'https://cellarium-cloud-api.cellarium.ai', num_attempts_per_chunk: int = 7)

Bases: object

Service that is designed to communicate with the Cellarium Cloud Backend.

Parameters:
  • api_token – API token issued by the Cellarium team

  • api_url – URL of the Cellarium Cloud Backend API service.
    Default: "https://cellarium-cloud-api.cellarium.ai"

  • num_attempts_per_chunk – Number of attempts the client should make to annotate each chunk.
    Default: 7
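
Example (illustrative sketch): the num_attempts_per_chunk parameter bounds how often a failing chunk is retried. The helper below is not the client's implementation; call_with_retries and its delay_s parameter are hypothetical names used only to show the bounded-retry idea with the standard library.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_retries(operation: Callable[[], T], num_attempts: int = 7, delay_s: float = 0.0) -> T:
    """Call `operation` until it succeeds or `num_attempts` calls have failed.

    ASSUMPTION: a hypothetical helper, not part of cellarium.cas.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, num_attempts + 1):
        try:
            return operation()
        except Exception as error:  # a real client would catch narrower error types
            last_error = error
            if attempt < num_attempts:
                time.sleep(delay_s)  # wait before the next attempt
    raise RuntimeError(f"chunk failed after {num_attempts} attempts") from last_error
```

A chunk that fails on every attempt surfaces as a single error once the attempt budget is spent, rather than retrying indefinitely.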

annotate_10x_h5_file(filepath: str, chunk_size: int = 1000, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None, include_dev_metadata: bool = False) → List[Dict[str, Any]]

Parse the 10x ‘h5’ matrix and apply the annotate_anndata() method to it.

Parameters:
  • filepath – Filepath of the local ‘h5’ matrix

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_input – Where to obtain the feature expression count matrix from.
    Allowed Values: Members of the enum cellarium.cas.constants.CountMatrixInput
    Default: CountMatrixInput.X

  • feature_ids_column_name – Name of the column that holds the Ensembl feature ids.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: "index"

  • feature_names_column_name – Name of the column that holds the feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: None

  • include_dev_metadata – Boolean indicating whether to include a breakdown of the number of cells by dataset

Returns:

A list of dictionaries with annotations for each of the cells from input adata

annotate_anndata(adata: AnnData, chunk_size=1000, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None, include_dev_metadata: bool = False) → List[Dict[str, Any]]

Send an instance of anndata.AnnData to the Cellarium Cloud backend for annotations. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.

Parameters:
  • adata – anndata.AnnData instance to annotate

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_input – Where to obtain the feature expression count matrix from.
    Allowed Values: Members of the enum cellarium.cas.constants.CountMatrixInput
    Default: CountMatrixInput.X

  • feature_ids_column_name – Name of the column that holds the Ensembl feature ids.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: "index"

  • feature_names_column_name – Name of the column that holds the feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: None

  • include_dev_metadata – Boolean indicating whether to include a breakdown of the number of cells by dataset

Returns:

A list of dictionaries with annotations for each of the cells from input adata
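
Example (illustrative sketch): the chunking described for annotate_anndata() can be pictured as follows. This is not the client's code. chunk_bounds, send_chunk and annotate_all are hypothetical names, and send_chunk is a stand-in for the real network request; the sketch only shows equal-size chunks with a smaller final chunk, dispatched concurrently with asyncio.

```python
import asyncio
from typing import Dict, List, Tuple


def chunk_bounds(n_cells: int, chunk_size: int = 1000) -> List[Tuple[int, int]]:
    """Return (start, end) row ranges; only the last range may be shorter."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [(start, min(start + chunk_size, n_cells)) for start in range(0, n_cells, chunk_size)]


async def send_chunk(start: int, end: int) -> List[Dict[str, int]]:
    """ASSUMPTION: placeholder for one backend request; returns one dict per cell."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return [{"cell_index": i} for i in range(start, end)]


async def annotate_all(n_cells: int, chunk_size: int = 1000) -> List[Dict[str, int]]:
    """Send every chunk concurrently and flatten the results in input order."""
    tasks = [send_chunk(start, end) for start, end in chunk_bounds(n_cells, chunk_size)]
    per_chunk = await asyncio.gather(*tasks)  # gather keeps the submission order
    return [row for chunk in per_chunk for row in chunk]


results = asyncio.run(annotate_all(n_cells=2500, chunk_size=1000))
```

With 2500 cells and a chunk size of 1000, the sketch produces two full chunks and a final chunk of 500 cells, and the flattened result has one entry per input cell.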

annotate_anndata_file(filepath: str, chunk_size=1000, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None, include_dev_metadata: bool = False) → List[Dict[str, Any]]

Read the ‘h5ad’ file into an anndata.AnnData matrix and apply the annotate_anndata() method to it.

Parameters:
  • filepath – Filepath of the local anndata.AnnData matrix

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_input – Where to obtain the feature expression count matrix from.
    Allowed Values: Members of the enum cellarium.cas.constants.CountMatrixInput
    Default: CountMatrixInput.X

  • feature_ids_column_name – Name of the column that holds the Ensembl feature ids.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: "index"

  • feature_names_column_name – Name of the column that holds the feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: None

  • include_dev_metadata – Boolean indicating whether to include a breakdown of the number of cells per dataset

Returns:

A list of dictionaries with annotations for each of the cells from input adata
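
Example (illustrative sketch): a minimal wrapper around annotate_anndata_file(), using only the constructor and method arguments documented on this page. The wrapper name annotate_h5ad and the environment variable CELLARIUM_API_TOKEN are assumptions made for the example; they are not defined by the library.

```python
import os
from typing import Any, Dict, List


def annotate_h5ad(filepath: str, chunk_size: int = 1000) -> List[Dict[str, Any]]:
    """Annotate a local 'h5ad' file with the default model.

    ASSUMPTION: the token is read from CELLARIUM_API_TOKEN, a variable name
    chosen for this sketch only.
    """
    api_token = os.environ.get("CELLARIUM_API_TOKEN")
    if not api_token:
        raise RuntimeError("Set CELLARIUM_API_TOKEN before calling annotate_h5ad()")

    # Imported lazily so this module still loads where the client is not installed.
    from cellarium.cas.client import CASClient

    client = CASClient(api_token=api_token)
    return client.annotate_anndata_file(
        filepath=filepath,
        chunk_size=chunk_size,
        cas_model_name="default",
    )
```

Keeping the token outside the source file avoids committing it by accident; any other secret store would work equally well.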

annotate_matrix_cell_type_ontology_aware_strategy(matrix: str | AnnData, chunk_size=1000, count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', cas_model_name: str | None = None, feature_names_column_name: str | None = None, prune_threshold: float = 0.1, weighting_prefactor: float = 1.0) → List[Dict[str, Any]]

Send an instance of anndata.AnnData to the Cellarium Cloud backend for annotations using the ontology-aware strategy. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.

Parameters:
  • matrix – Either path to a file (must be either .h5 or .h5ad) or an anndata.AnnData instance to annotate

  • chunk_size – Size of chunks to split on

  • count_matrix_input – Where to obtain the feature expression count matrix from.
    Allowed Values: Members of the enum cellarium.cas.constants.CountMatrixInput
    Default: CountMatrixInput.X

  • feature_ids_column_name – Name of the column that holds the Ensembl feature ids.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: "index"

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or None, which refers to the default selected model in the Cellarium backend.
    Default: None

  • feature_names_column_name – Name of the column that holds the feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: None

  • prune_threshold – Threshold score for pruning the ontology graph in the output

  • weighting_prefactor – Weighting prefactor for the weight calculation. A larger absolute value of the weighting_prefactor results in a steeper decay (weights drop off more quickly as distance increases), whereas a smaller absolute value results in a slower decay

Returns:

A list of dictionaries with annotations for each of the cells from input adata
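
Example (illustrative sketch): the effect of weighting_prefactor and prune_threshold can be pictured with a toy calculation. The exact weight formula is not given on this page. The sketch below assumes an exponential decay, exp(-|weighting_prefactor| * distance), and assumes pruning keeps scores at or above the threshold; both are assumptions chosen only because they match the described behaviour. The function names and ontology term labels are hypothetical.

```python
import math
from typing import Dict, List


def decay_weights(distances: List[float], weighting_prefactor: float = 1.0) -> List[float]:
    """Normalised weights that fall off with distance.

    ASSUMPTION: exponential decay; the backend's actual formula may differ.
    """
    raw = [math.exp(-abs(weighting_prefactor) * d) for d in distances]
    total = sum(raw)
    return [w / total for w in raw]


def prune_scores(scores: Dict[str, float], prune_threshold: float = 0.1) -> Dict[str, float]:
    """Keep only terms whose score reaches the threshold (assumed inclusive)."""
    return {term: score for term, score in scores.items() if score >= prune_threshold}


distances = [0.0, 1.0, 2.0]
slow = decay_weights(distances, weighting_prefactor=0.5)   # gentle fall-off
steep = decay_weights(distances, weighting_prefactor=5.0)  # sharp fall-off

kept = prune_scores({"term_a": 0.62, "term_b": 0.05, "term_c": 0.10}, prune_threshold=0.1)
```

Under this assumed form, the larger prefactor concentrates almost all of the weight on the nearest neighbour, while the smaller one spreads it across more distant neighbours.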

annotate_matrix_cell_type_summary_statistics_strategy(matrix: str | AnnData, chunk_size=1000, count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', include_extended_statistics: bool = True, cas_model_name: str | None = None, feature_names_column_name: str | None = None) → List[Dict[str, Any]]

Send an instance of anndata.AnnData to the Cellarium Cloud backend for annotations using the cell type summary statistics strategy. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.

Parameters:
  • matrix – Either path to a file (must be either .h5 or .h5ad) or an anndata.AnnData instance to annotate

  • chunk_size – Size of chunks to split on

  • count_matrix_input – Where to obtain the feature expression count matrix from.
    Allowed Values: Members of the enum cellarium.cas.constants.CountMatrixInput
    Default: CountMatrixInput.X

  • feature_ids_column_name – Name of the column that holds the Ensembl feature ids.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: "index"

  • include_extended_statistics – Boolean indicating whether to include a breakdown of the number of cells by dataset

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or None, which refers to the default selected model in the Cellarium backend.
    Default: None

  • feature_names_column_name – Name of the column that holds the feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: None

Returns:

A list of dictionaries with annotations for each of the cells from input adata
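
Example (illustrative sketch): the matrix parameter accepts either a file path (.h5 or .h5ad) or an in-memory anndata.AnnData instance. One way to picture that dispatch is shown below. resolve_matrix_kind and its return labels are hypothetical and do not reflect how the client is written.

```python
from pathlib import Path
from typing import Any


def resolve_matrix_kind(matrix: Any) -> str:
    """Classify the `matrix` argument by how it would need to be loaded.

    ASSUMPTION: hypothetical helper; the returned labels are illustrative.
    """
    if isinstance(matrix, str):
        suffix = Path(matrix).suffix.lower()
        if suffix == ".h5":
            return "10x_h5_file"
        if suffix == ".h5ad":
            return "h5ad_file"
        raise ValueError(f"Unsupported file extension {suffix!r}; expected .h5 or .h5ad")
    # Anything that is not a path is treated as an in-memory AnnData instance.
    return "in_memory"
```

Rejecting unknown extensions early gives a clearer error than attempting to parse an unsupported file.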

query_cells_by_ids(cell_ids: List[int], model_name: str | None, metadata_feature_names: List[str] | None = None) → List[Dict[str, Any]]

Query cells by their ids from a single anndata file with Cellarium CAS. Input file should be validated and sanitized according to the model schema.

Parameters:
  • cell_ids – List of cell ids to query

  • model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or None, which refers to the default selected model in the Cellarium backend.
    Default: None

  • metadata_feature_names – List of metadata feature names to include in the response.

Returns:

List of cells with metadata

search_10x_h5_file(filepath: str, chunk_size: int = 500, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None) → List[Dict[str, Any]]

Parse the 10x ‘h5’ matrix and apply the search_anndata() method to it.

Parameters:
  • filepath – Filepath of the local ‘h5’ matrix

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_input – Where to obtain the feature expression count matrix from.
    Allowed Values: Members of the enum cellarium.cas.constants.CountMatrixInput
    Default: CountMatrixInput.X

  • feature_ids_column_name – Name of the column that holds the Ensembl feature ids.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: "index"

  • feature_names_column_name – Name of the column that holds the feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: None

Returns:

A list of dictionaries with annotations for each of the cells from input adata

search_anndata(adata: AnnData, chunk_size=500, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None) → List[Dict[str, Any]]

Send an instance of anndata.AnnData to the Cellarium Cloud backend for nearest neighbor search. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.

Parameters:
  • adata – anndata.AnnData instance to search

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_input – Where to obtain the feature expression count matrix from.
    Allowed Values: Members of the enum cellarium.cas.constants.CountMatrixInput
    Default: CountMatrixInput.X

  • feature_ids_column_name – Name of the column that holds the Ensembl feature ids.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: "index"

  • feature_names_column_name – Name of the column that holds the feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: None

Returns:

A list of dictionaries with annotations for each of the cells from input adata

search_matrix(matrix: str | AnnData, chunk_size: int = 500, count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', cas_model_name: str | None = None, feature_names_column_name: str | None = None) → List[Dict[str, Any]]

Send an instance of anndata.AnnData to the Cellarium Cloud backend for nearest neighbor search. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.

Parameters:
  • matrix – Either path to a file (must be either .h5 or .h5ad) or an anndata.AnnData instance to annotate

  • chunk_size – Size of chunks to split on

  • count_matrix_input – Where to obtain the feature expression count matrix from.
    Allowed Values: Members of the enum cellarium.cas.constants.CountMatrixInput
    Default: CountMatrixInput.X

  • feature_ids_column_name – Name of the column that holds the Ensembl feature ids.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: "index"

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or None, which refers to the default selected model in the Cellarium backend.
    Default: None

  • feature_names_column_name – Name of the column that holds the feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: None

Returns:

A list of dictionaries with annotations for each of the cells from input adata

validate_and_sanitize_input_data(adata: AnnData, cas_model_name: str, count_matrix_name: CountMatrixInput, feature_ids_column_name: str, feature_names_column_name: str | None = None) → AnnData

Validate and sanitize input anndata.AnnData instance according to a specified feature schema associated with a particular model.

Parameters:
  • adata – anndata.AnnData instance to validate and sanitize

  • cas_model_name – The model associated with the schema used for sanitizing.
    Allowed Values: Model name from the allowed_models_list list.

  • count_matrix_name – Where to obtain the feature expression count matrix from.
    Allowed Values: Choice of either "X" or "raw.X" in order to use adata.X or adata.raw.X

  • feature_ids_column_name – Name of the column that holds the Ensembl feature ids.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.

  • feature_names_column_name – Name of the column that holds the feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns, or the "index" keyword, which refers to the index column.
    Default: None

Returns:

Validated and sanitized instance of anndata.AnnData
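
Example (illustrative sketch): the "index" keyword accepted by feature_ids_column_name and feature_names_column_name can be read as a switch between the index of adata.var and one of its columns. The sketch below mimics that lookup with plain Python containers instead of a real AnnData object. resolve_feature_ids and the column name "ensembl_id" are hypothetical.

```python
from typing import Dict, List, Sequence


def resolve_feature_ids(
    var_index: Sequence[str],
    var_columns: Dict[str, Sequence[str]],
    feature_ids_column_name: str = "index",
) -> List[str]:
    """Pick feature ids from the index or from a named column.

    ASSUMPTION: `var_index` and `var_columns` stand in for adata.var.index
    and adata.var's columns; this is not the library's implementation.
    """
    if feature_ids_column_name == "index":
        return list(var_index)
    if feature_ids_column_name not in var_columns:
        raise KeyError(f"{feature_ids_column_name!r} is not a column of adata.var")
    return list(var_columns[feature_ids_column_name])


# Illustrative data: gene symbols in the index, Ensembl ids in a column.
symbols = ["TP53", "GAPDH"]
columns = {"ensembl_id": ["ENSG00000141510", "ENSG00000111640"]}
ids = resolve_feature_ids(symbols, columns, feature_ids_column_name="ensembl_id")
```

When the Ensembl ids already form the index of adata.var, the default value "index" needs no extra column.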

validate_model_name(model_name: str | None = None) → None

Check whether the provided model name is valid.

Parameters:

model_name – Model name to check

Raises:

ValueError if model name is not valid
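
Example (illustrative sketch): a membership check of the kind validate_model_name() describes. This is a standalone stand-in, not the method itself. check_model_name, the model names, and the choice to accept None as "use the backend default" are assumptions for illustration.

```python
from typing import Iterable, Optional


def check_model_name(model_name: Optional[str], allowed_models: Iterable[str]) -> None:
    """Raise ValueError if `model_name` is not among the allowed models.

    ASSUMPTION: None is accepted and taken to mean the backend's default model.
    """
    if model_name is None:
        return
    allowed = sorted(allowed_models)
    if model_name not in allowed:
        raise ValueError(f"Model name {model_name!r} is not valid. Allowed values: {allowed}")


# Hypothetical model names, used only to exercise the check.
check_model_name("model_a", ["model_a", "model_b"])
check_model_name(None, ["model_a", "model_b"])
```

Listing the allowed values in the error message lets the caller correct the name without a second lookup.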