Client

class cellarium.cas.client.CASClient(api_token: str, api_url: str = 'https://cellarium-cloud-api.cellarium.ai', num_attempts_per_chunk: int = 7)[source]

Client that is designed to communicate with the Cellarium Cloud Backend.

Parameters:
  • api_token – API token issued by the Cellarium team.

  • api_url – URL of the Cellarium Cloud Backend. Should be left at its default in most cases.

  • num_attempts_per_chunk – Number of attempts the client should make to annotate each chunk.
    Default: 7
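
The num_attempts_per_chunk parameter caps how often a single chunk request is tried. The sketch below illustrates that idea with a hypothetical helper, call_with_attempts; it is not the client's actual retry code, and retrying on any Exception is an assumption made for brevity. The default of 7 mirrors the constructor signature.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_attempts(send_chunk: Callable[[], T], num_attempts: int = 7) -> T:
    """Call send_chunk up to num_attempts times and return the first success.

    Hypothetical helper for illustration only; the real client may wait
    between attempts or retry only on specific errors.
    """
    if num_attempts < 1:
        raise ValueError("num_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for _ in range(num_attempts):
        try:
            return send_chunk()
        except Exception as error:  # assumption: every failure is retryable
            last_error = error
    # All attempts failed, so surface the most recent error to the caller.
    assert last_error is not None
    raise last_error
```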

property allowed_models_list

List of models in Cellarium CAS that can be used to annotate.

annotate_10x_h5_file(filepath: str, chunk_size: int = 1000, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None, include_dev_metadata: bool = False) CellTypeSummaryStatisticsResults[source]

Parse the 10x ‘h5’ matrix and apply the annotate_anndata() method to it.

Parameters:
  • filepath – Filepath of the local ‘h5’ matrix

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_input – Where to obtain a feature expression count matrix from.
    Allowed Values: Choices from enum CountMatrixInput
    Default: "CountMatrixInput.X"

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: "index"

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

  • include_dev_metadata – Boolean indicating whether to include a breakdown of the number of cells by dataset

Returns:

A CellTypeSummaryStatisticsResults object with annotations for each of the cells from the input adata

annotate_anndata(adata: AnnData, chunk_size=1000, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None, include_dev_metadata: bool = False) CellTypeSummaryStatisticsResults[source]

Send an instance of anndata.AnnData to the Cellarium Cloud backend for annotations. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.

Parameters:
  • adata – anndata.AnnData instance to annotate

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_input – Where to obtain a feature expression count matrix from.
    Allowed Values: Choices from enum CountMatrixInput
    Default: "CountMatrixInput.X"

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: "index"

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

  • include_dev_metadata – Boolean indicating whether to include a breakdown of the number of cells by dataset

Returns:

A CellTypeSummaryStatisticsResults object with annotations for each of the cells from the adata input
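
The chunking described above (equal-sized chunks, with a possibly smaller last one) can be sketched as row ranges. The helper name chunk_bounds is hypothetical and only illustrates the arithmetic; it is not taken from the client.

```python
from typing import Iterator, Tuple


def chunk_bounds(n_cells: int, chunk_size: int = 1000) -> Iterator[Tuple[int, int]]:
    """Yield (start, stop) row ranges covering n_cells in chunk_size steps."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_cells, chunk_size):
        # The final range is clipped, so the last chunk may be smaller.
        yield start, min(start + chunk_size, n_cells)


bounds = list(chunk_bounds(n_cells=2500, chunk_size=1000))
print(bounds)  # [(0, 1000), (1000, 2000), (2000, 2500)]
```

With an AnnData object, each range could be applied as a row slice such as adata[start:stop].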

annotate_anndata_file(filepath: str, chunk_size=1000, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None, include_dev_metadata: bool = False) CellTypeSummaryStatisticsResults[source]

Read the ‘h5ad’ file into an anndata.AnnData object and apply the annotate_anndata() method to it.

Parameters:
  • filepath – Filepath of the local ‘h5ad’ file

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_input – Where to obtain a feature expression count matrix from.
    Allowed Values: Choices from enum CountMatrixInput
    Default: "CountMatrixInput.X"

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: "index"

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

  • include_dev_metadata – Boolean indicating whether to include a breakdown of the number of cells per dataset

Returns:

A CellTypeSummaryStatisticsResults object with annotations for each of the cells from the input adata

annotate_matrix_cell_type_ontology_aware_strategy(matrix: str | AnnData, chunk_size=1000, count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', cas_model_name: str | None = None, feature_names_column_name: str | None = None, prune_threshold: float = 0.05, weighting_prefactor: float = 1.0) CellTypeOntologyAwareResults[source]

Send an instance of anndata.AnnData to the Cellarium Cloud backend for annotations using the ontology-aware strategy. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.

Parameters:
  • matrix – Either path to a file (must be either .h5 or .h5ad) or an anndata.AnnData instance to annotate

  • chunk_size – Size of chunks to split on

  • count_matrix_input – Where to obtain a feature expression count matrix from.
    Allowed Values: Choices from enum CountMatrixInput
    Default: "CountMatrixInput.X"

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: "index"

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or None, which refers to the default selected model in the Cellarium backend.
    Default: None

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

  • prune_threshold – Threshold score for pruning the ontology graph in the output

  • weighting_prefactor – Weighting prefactor for the weight calculation. A larger absolute value of the weighting_prefactor results in a steeper decay (weights drop off more quickly as distance increases), whereas a smaller absolute value results in a slower decay

Returns:

A CellTypeOntologyAwareResults object with annotations for each of the cells from the input adata
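
To make the effect of weighting_prefactor concrete, the sketch below compares two prefactors. The exact weight formula is not stated on this page; the exponential form exp(-|prefactor| * distance) is an assumption chosen only because it reproduces the described behavior (a larger absolute value gives a steeper decay). The function name neighbor_weight is hypothetical.

```python
import math


def neighbor_weight(distance: float, weighting_prefactor: float = 1.0) -> float:
    """Weight that decays with distance; the exponential form is an assumption."""
    return math.exp(-abs(weighting_prefactor) * distance)


distances = [0.0, 0.5, 1.0, 2.0]
# Smaller |prefactor|: weights fall off slowly with distance.
slow = [neighbor_weight(d, weighting_prefactor=0.5) for d in distances]
# Larger |prefactor|: weights fall off quickly with distance.
steep = [neighbor_weight(d, weighting_prefactor=2.0) for d in distances]

for d, s, t in zip(distances, slow, steep):
    print(f"distance={d:.1f}  prefactor 0.5 -> {s:.3f}  prefactor 2.0 -> {t:.3f}")
```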

annotate_matrix_cell_type_summary_statistics_strategy(matrix: str | AnnData, chunk_size=1000, count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', include_extended_statistics: bool = True, cas_model_name: str | None = None, feature_names_column_name: str | None = None) CellTypeSummaryStatisticsResults[source]

Send an instance of anndata.AnnData to the Cellarium Cloud backend for annotations. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.

Parameters:
  • matrix – Either path to a file (must be either .h5 or .h5ad) or an anndata.AnnData instance to annotate

  • chunk_size – Size of chunks to split on

  • count_matrix_input – Where to obtain a feature expression count matrix from.
    Allowed Values: Choices from enum CountMatrixInput
    Default: "CountMatrixInput.X"

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: "index"

  • include_extended_statistics – Boolean indicating whether to include a breakdown of the number of cells by dataset

  • cas_model_name – Model name to use for annotation.
    Allowed Values: Model name from the allowed_models_list list or None, which refers to the default selected model in the Cellarium backend.
    Default: None

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

Returns:

A CellTypeSummaryStatisticsResults object with annotations for each of the cells from the input adata
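
A minimal usage sketch follows, assuming the package that provides cellarium.cas is installed and a valid API token is available. The wrapper name annotate_file is hypothetical; the class, method, and parameter names are the ones documented on this page.

```python
def annotate_file(api_token: str, filepath: str, chunk_size: int = 1000):
    """Annotate a local .h5 or .h5ad file and return the results object.

    Nothing is sent to the backend until this function is called.
    """
    # Imported lazily so that defining the function does not need the package.
    from cellarium.cas.client import CASClient

    client = CASClient(api_token=api_token)
    return client.annotate_matrix_cell_type_summary_statistics_strategy(
        matrix=filepath,
        chunk_size=chunk_size,
        include_extended_statistics=True,
    )
```

Leaving cas_model_name unset (None) selects the default model in the Cellarium backend, as described in the parameter list.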

print_user_quota() None[source]

Print the user’s quota information

query_cells_by_ids(cell_ids: List[int], metadata_feature_names: List[CellMetadataFeatures] = None) CellQueryResults[source]

Query cells by their ids from a single anndata file with Cellarium CAS. Input file should be validated and sanitized according to the model schema.

Parameters:
  • cell_ids – List of cell ids to query

  • metadata_feature_names – List of metadata features to include in the response.

Returns:

A CellQueryResults object with cell query results
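
As a hedged usage sketch, the function below requests two metadata features for a list of cell ids and returns (cas_cell_index, cell_type) pairs from the response. The wrapper name fetch_cell_types is hypothetical; it assumes the package that provides cellarium.cas is installed and that the token is valid.

```python
from typing import List, Optional, Tuple


def fetch_cell_types(api_token: str, cell_ids: List[int]) -> List[Tuple[int, Optional[str]]]:
    """Query cells by id and return (cas_cell_index, cell_type) pairs."""
    # Imported lazily so that defining the function does not need the package.
    from cellarium.cas.client import CASClient
    from cellarium.cas.constants import CellMetadataFeatures

    client = CASClient(api_token=api_token)
    results = client.query_cells_by_ids(
        cell_ids=cell_ids,
        metadata_feature_names=[
            CellMetadataFeatures.CELL_TYPE,
            CellMetadataFeatures.TISSUE,
        ],
    )
    # results.data is a list of CellariumCellMetadata entries.
    return [(cell.cas_cell_index, cell.cell_type) for cell in results.data]
```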

search_10x_h5_file(filepath: str, chunk_size: int = 500, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None) MatrixQueryResults[source]

Parse the 10x ‘h5’ matrix and apply the search_anndata() method to it.

Parameters:
  • filepath – Filepath of the local ‘h5’ matrix

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for the search.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_input – Where to obtain a feature expression count matrix from.
    Allowed Values: Choices from enum CountMatrixInput
    Default: "CountMatrixInput.X"

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: "index"

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

Returns:

A MatrixQueryResults object with search results for each of the cells from the input adata

search_anndata(adata: AnnData, chunk_size=500, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None) MatrixQueryResults[source]

Send an instance of anndata.AnnData to the Cellarium Cloud backend for nearest neighbor search. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.

Parameters:
  • adata – anndata.AnnData instance to search

  • chunk_size – Size of chunks to split on

  • cas_model_name – Model name to use for the search.
    Allowed Values: Model name from the allowed_models_list list or "default" keyword, which refers to the default selected model in the Cellarium backend.
    Default: "default"

  • count_matrix_input – Where to obtain a feature expression count matrix from.
    Allowed Values: Choices from enum CountMatrixInput
    Default: "CountMatrixInput.X"

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: "index"

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

Returns:

A MatrixQueryResults object with search results for each of the cells from the input adata

search_matrix(matrix: str | AnnData, chunk_size: int = 500, count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', cas_model_name: str | None = None, feature_names_column_name: str | None = None) MatrixQueryResults[source]

Send an instance of anndata.AnnData to the Cellarium Cloud backend for nearest neighbor search. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.

Parameters:
  • matrix – Either path to a file (must be either .h5 or .h5ad) or an anndata.AnnData instance to search

  • chunk_size – Size of chunks to split on

  • count_matrix_input – Where to obtain a feature expression count matrix from.
    Allowed Values: Choices from enum CountMatrixInput
    Default: "CountMatrixInput.X"

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: "index"

  • cas_model_name – Model name to use for the search.
    Allowed Values: Model name from the allowed_models_list list or None, which refers to the default selected model in the Cellarium backend.
    Default: None

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

Returns:

A MatrixQueryResults object with search results for each of the cells from the input adata
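
Search results report cas_cell_index as a float, while query_cells_by_ids() takes a list of ints. The sketch below collects unique neighbor ids from an object shaped like MatrixQueryResults. The helper name neighbor_cell_ids and the toy data are hypothetical; SimpleNamespace objects stand in for the response models.

```python
from types import SimpleNamespace
from typing import List


def neighbor_cell_ids(results) -> List[int]:
    """Collect unique cas_cell_index values, in first-seen order, as ints."""
    seen = set()
    ordered: List[int] = []
    for query_result in results.data:
        for match in query_result.neighbors:
            cell_id = int(match.cas_cell_index)  # documented as float
            if cell_id not in seen:
                seen.add(cell_id)
                ordered.append(cell_id)
    return ordered


# Toy data shaped like the documented fields (values are made up).
toy_results = SimpleNamespace(data=[
    SimpleNamespace(query_cell_id="cell-1", neighbors=[
        SimpleNamespace(cas_cell_index=101.0, distance=0.12),
        SimpleNamespace(cas_cell_index=205.0, distance=0.30),
    ]),
    SimpleNamespace(query_cell_id="cell-2", neighbors=[
        SimpleNamespace(cas_cell_index=205.0, distance=0.08),
        SimpleNamespace(cas_cell_index=309.0, distance=0.44),
    ]),
])
print(neighbor_cell_ids(toy_results))  # [101, 205, 309]
```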

validate_and_sanitize_input_data(adata: AnnData, cas_model_name: str, count_matrix_name: CountMatrixInput, feature_ids_column_name: str, feature_names_column_name: str | None = None) AnnData[source]

Validate and sanitize input anndata.AnnData instance according to a specified feature schema associated with a particular model.

Parameters:
  • adata – anndata.AnnData instance to validate and sanitize

  • cas_model_name – The model associated with the schema used for sanitizing.
    Allowed Values: Model name from the allowed_models_list list.

  • count_matrix_name – Where to obtain a feature expression count matrix from.
    Allowed Values: Choice of either "X" or "raw.X" in order to use adata.X or adata.raw.X

  • feature_ids_column_name – Column name where to obtain Ensembl feature ids.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.

  • feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
    Allowed Values: A value from adata.var.columns or "index" keyword, which refers to index column.
    Default: None

Returns:

Validated and sanitized instance of anndata.AnnData

validate_model_name(model_name: str | None = None) None[source]

Check whether the provided model name is valid.

Parameters:

model_name – Model name to check

Raises:

ValueError if model name is not valid
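
The behavior described above can be sketched as a membership check. This is a hypothetical stand-in named check_model_name, not the client's code; treating None as "use the default model" is an assumption based on the optional argument in the signature, and the model names in the usage lines are placeholders.

```python
from typing import Iterable, Optional


def check_model_name(model_name: Optional[str], allowed_models: Iterable[str]) -> None:
    """Raise ValueError unless model_name is None or one of allowed_models."""
    if model_name is None:
        return  # assumption: None means the backend default is used
    allowed = list(allowed_models)
    if model_name not in allowed:
        raise ValueError(
            f"Model name '{model_name}' is not valid. Allowed values: {allowed}"
        )


check_model_name(None, ["model-a", "model-b"])       # passes silently
check_model_name("model-a", ["model-a", "model-b"])  # passes silently
```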

validate_version()[source]

Validate that this version of the client library is compatible with the selected server.

class cellarium.cas.constants.CountMatrixInput(value)[source]

Constants for the count matrix input type.

X: str = 'X'
RAW_X: str = 'raw.X'
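
The two members map to adata.X and adata.raw.X. The sketch below shows that selection with a local stand-in enum, so it runs without the package; the helper select_count_matrix and the toy object are hypothetical.

```python
from enum import Enum
from types import SimpleNamespace


class CountMatrixInput(Enum):
    """Local stand-in mirroring the two documented members."""

    X = "X"
    RAW_X = "raw.X"


def select_count_matrix(adata, count_matrix_input: CountMatrixInput):
    """Return adata.X or adata.raw.X depending on the enum member."""
    if count_matrix_input is CountMatrixInput.X:
        return adata.X
    if count_matrix_input is CountMatrixInput.RAW_X:
        return adata.raw.X
    raise ValueError(f"Unsupported count matrix input: {count_matrix_input}")


# Toy object with the same attribute layout as anndata.AnnData.
toy = SimpleNamespace(X=[[0.5, 1.2]], raw=SimpleNamespace(X=[[1, 3]]))
print(select_count_matrix(toy, CountMatrixInput.RAW_X))  # [[1, 3]]
```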

class cellarium.cas.constants.CellMetadataFeatures(value)[source]

Represents the cell features that can be queried for in the CAS API.

CAS_CELL_INDEX: str = 'cas_cell_index'
CELL_TYPE: str = 'cell_type'
ASSAY: str = 'assay'
DISEASE: str = 'disease'
DONOR_ID: str = 'donor_id'
IS_PRIMARY_DATA: str = 'is_primary_data'
DEVELOPMENT_STAGE: str = 'development_stage'
ORGANISM: str = 'organism'
SELF_REPORTED_ETHNICITY: str = 'self_reported_ethnicity'
SEX: str = 'sex'
SUSPENSION_TYPE: str = 'suspension_type'
TISSUE: str = 'tissue'
TOTAL_MRNA_UMIS: str = 'total_mrna_umis'
CELL_TYPE_ONTOLOGY_TERM_ID: str = 'cell_type_ontology_term_id'
ASSAY_ONTOLOGY_TERM_ID: str = 'assay_ontology_term_id'
DISEASE_ONTOLOGY_TERM_ID: str = 'disease_ontology_term_id'
DEVELOPMENT_STAGE_ONTOLOGY_TERM_ID: str = 'development_stage_ontology_term_id'
ORGANISM_ONTOLOGY_TERM_ID: str = 'organism_ontology_term_id'
SELF_REPORTED_ETHNICITY_ONTOLOGY_TERM_ID: str = 'self_reported_ethnicity_ontology_term_id'
SEX_ONTOLOGY_TERM_ID: str = 'sex_ontology_term_id'
TISSUE_ONTOLOGY_TERM_ID: str = 'tissue_ontology_term_id'

pydantic model cellarium.cas.models.CellTypeSummaryStatisticsResults[source]

Represents the data object returned by the CAS API for nearest neighbor annotations.

Fields:
  • data (List[cellarium.cas.models.CellTypeSummaryStatisticsResults.NeighborhoodAnnotation])

pydantic model DatasetStatistics[source]
Fields:
  • count_per_dataset (int)

  • dataset_id (str)

  • max_distance (float)

  • mean_distance (float)

  • median_distance (float)

  • min_distance (float)

field dataset_id: str [Required]

The ID of the dataset containing cells

field count_per_dataset: int [Required]

The number of cells found in the dataset

field min_distance: float [Required]

The minimum distance between the query cell and the dataset cells

field max_distance: float [Required]

The maximum distance between the query cell and the dataset cells

field median_distance: float [Required]

The median distance between the query cell and the dataset cells

field mean_distance: float [Required]

The mean distance between the query cell and the dataset cells

pydantic model SummaryStatistics[source]
Fields:
  • cell_count (int)

  • cell_type (str)

  • dataset_ids_with_counts (List[cellarium.cas.models.CellTypeSummaryStatisticsResults.DatasetStatistics] | None)

  • max_distance (float)

  • median_distance (float)

  • min_distance (float)

  • p25_distance (float)

  • p75_distance (float)

field cell_type: str [Required]

The cell type of the cluster of cells

field cell_count: int [Required]

The number of cells in the cluster

field min_distance: float [Required]

The minimum distance between the query cell and the cluster cells

field p25_distance: float [Required]

The 25th percentile distance between the query cell and the cluster cells

field median_distance: float [Required]

The median distance between the query cell and the cluster cells

field p75_distance: float [Required]

The 75th percentile distance between the query cell and the cluster cells

field max_distance: float [Required]

The maximum distance between the query cell and the cluster cells

field dataset_ids_with_counts: List[CellTypeSummaryStatisticsResults.DatasetStatistics] | None = None

pydantic model NeighborhoodAnnotation[source]

Represents the data object returned by the CAS API for a single nearest neighbor annotation.

Fields:
  • matches (List[cellarium.cas.models.CellTypeSummaryStatisticsResults.SummaryStatistics])

  • query_cell_id (str)

field query_cell_id: str [Required]

The ID of the querying cell

field matches: List[CellTypeSummaryStatisticsResults.SummaryStatistics] [Required]

field data: List[CellTypeSummaryStatisticsResults.NeighborhoodAnnotation] [Required]

The annotations found
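
One way to reduce these results to a single label per query cell is sketched below. Picking the match with the largest cell_count is only one possible heuristic, chosen here for illustration; the helper name most_common_cell_type and the toy data are hypothetical, with SimpleNamespace objects standing in for the models.

```python
from types import SimpleNamespace
from typing import Dict


def most_common_cell_type(results) -> Dict[str, str]:
    """Map each query_cell_id to the matched cell_type with the largest cell_count."""
    labels: Dict[str, str] = {}
    for annotation in results.data:
        if not annotation.matches:
            continue  # no matches were reported for this cell
        best = max(annotation.matches, key=lambda match: match.cell_count)
        labels[annotation.query_cell_id] = best.cell_type
    return labels


# Toy data shaped like the documented fields (values are made up).
toy_results = SimpleNamespace(data=[
    SimpleNamespace(query_cell_id="cell-1", matches=[
        SimpleNamespace(cell_type="T cell", cell_count=40, median_distance=0.21),
        SimpleNamespace(cell_type="B cell", cell_count=10, median_distance=0.35),
    ]),
    SimpleNamespace(query_cell_id="cell-2", matches=[]),
])
print(most_common_cell_type(toy_results))  # {'cell-1': 'T cell'}
```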

pydantic model cellarium.cas.models.CellTypeOntologyAwareResults[source]

Represents the data object returned by the CAS API for ontology-aware annotations.

Fields:
  • data (List[cellarium.cas.models.CellTypeOntologyAwareResults.OntologyAwareAnnotation])

pydantic model Match[source]
Fields:
  • cell_type (str)

  • cell_type_ontology_term_id (str)

  • score (float)

field score: float [Required]

The score of the match

field cell_type_ontology_term_id: str [Required]

The ontology term ID of the cell type for the match

field cell_type: str [Required]

The cell type of the match

pydantic model OntologyAwareAnnotation[source]

Represents the data object returned by the CAS API for a single ontology-aware annotation.

Fields:
  • matches (List[cellarium.cas.models.CellTypeOntologyAwareResults.Match])

  • query_cell_id (str)

  • total_neighbors (int)

  • total_neighbors_unrecognized (int)

  • total_weight (float)

field query_cell_id: str [Required]

The ID of the querying cell

field matches: List[CellTypeOntologyAwareResults.Match] [Required]

The matches found for the querying cell

field total_weight: float [Required]

The total weight of the matches

field total_neighbors: int [Required]

The total number of neighbors matched

field total_neighbors_unrecognized: int [Required]

The total number of neighbors that were not recognized

field data: List[CellTypeOntologyAwareResults.OntologyAwareAnnotation] [Required]

The annotations found
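
A short sketch of reading one OntologyAwareAnnotation follows: it keeps matches at or above a minimum score and returns the best few. This is a client-side filter written for illustration, not the backend's pruning logic; the helper name top_matches and the toy scores are hypothetical.

```python
from types import SimpleNamespace
from typing import List, Tuple


def top_matches(annotation, min_score: float = 0.05, limit: int = 3) -> List[Tuple[str, float]]:
    """Return up to `limit` (cell_type, score) pairs with score >= min_score, best first."""
    kept = [match for match in annotation.matches if match.score >= min_score]
    kept.sort(key=lambda match: match.score, reverse=True)
    return [(match.cell_type, match.score) for match in kept[:limit]]


# Toy annotation shaped like the documented fields (scores are made up).
toy_annotation = SimpleNamespace(
    query_cell_id="cell-1",
    matches=[
        SimpleNamespace(cell_type="T cell", cell_type_ontology_term_id="CL:0000084", score=0.62),
        SimpleNamespace(cell_type="lymphocyte", cell_type_ontology_term_id="CL:0000542", score=0.91),
        SimpleNamespace(cell_type="B cell", cell_type_ontology_term_id="CL:0000236", score=0.03),
    ],
)
print(top_matches(toy_annotation))  # [('lymphocyte', 0.91), ('T cell', 0.62)]
```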

pydantic model cellarium.cas.models.MatrixQueryResults[source]

Represents the data object returned by the CAS API when performing a cell matrix query (e.g. a query of the cell database using a matrix).

Fields:
  • data (List[cellarium.cas.models.MatrixQueryResults.MatrixQueryResult])

pydantic model Match[source]
Fields:
  • cas_cell_index (float)

  • distance (float)

field cas_cell_index: float [Required]

CAS-specific ID of a single cell

field distance: float [Required]

The distance between this querying cell and the found cell

pydantic model MatrixQueryResult[source]

Represents the data object returned by the CAS API for a single cell query.

Fields:
  • neighbors (List[cellarium.cas.models.MatrixQueryResults.Match])

  • query_cell_id (str)

field query_cell_id: str [Required]

The ID of the querying cell

field neighbors: List[MatrixQueryResults.Match] [Required]

field data: List[MatrixQueryResults.MatrixQueryResult] [Required]

The results of the query

pydantic model cellarium.cas.models.CellQueryResults[source]

Represents the data object returned by the CAS API for a cell query.

Fields:
  • data (List[cellarium.cas.models.CellQueryResults.CellariumCellMetadata])

pydantic model CellariumCellMetadata[source]
Fields:
  • assay (str | None)

  • assay_ontology_term_id (str | None)

  • cas_cell_index (int)

  • cell_type (str | None)

  • cell_type_ontology_term_id (str | None)

  • development_stage (str | None)

  • development_stage_ontology_term_id (str | None)

  • disease (str | None)

  • disease_ontology_term_id (str | None)

  • donor_id (str | None)

  • is_primary_data (bool | None)

  • organism (str | None)

  • organism_ontology_term_id (str | None)

  • self_reported_ethnicity (str | None)

  • self_reported_ethnicity_ontology_term_id (str | None)

  • sex (str | None)

  • sex_ontology_term_id (str | None)

  • suspension_type (str | None)

  • tissue (str | None)

  • tissue_ontology_term_id (str | None)

  • total_mrna_umis (int | None)

field cas_cell_index: int [Required]

The CAS-specific ID of the cell

field cell_type: str | None [Required]

The cell type of the cell

field assay: str | None [Required]

The assay used to generate the cell

field disease: str | None [Required]

The disease state of the cell

field donor_id: str | None [Required]

The ID of the donor of the cell

field is_primary_data: bool | None [Required]

Whether the cell is primary data

field development_stage: str | None [Required]

The development stage of the cell donor

field organism: str | None [Required]

The organism of the cell

field self_reported_ethnicity: str | None [Required]

The self-reported ethnicity of the cell donor

field sex: str | None [Required]

The sex of the cell donor

field suspension_type: str | None [Required]

The cell suspension types used

field tissue: str | None [Required]

The tissue type that the cell was a part of

field total_mrna_umis: int | None [Required]

The count of mRNA UMIs associated with this cell

field cell_type_ontology_term_id: str | None [Required]

The ID used by the ontology for the type of the cell

field assay_ontology_term_id: str | None [Required]

The ID used by the ontology for the assay used to generate the cell

field disease_ontology_term_id: str | None [Required]

The ID used by the ontology for the disease state of the cell

field development_stage_ontology_term_id: str | None [Required]

The ID used by the ontology for the development stage of the cell donor

field organism_ontology_term_id: str | None [Required]

The ID used by the ontology for the organism of the cell

field self_reported_ethnicity_ontology_term_id: str | None [Required]

The ID used by the ontology for the self-reported ethnicity of the cell donor

field sex_ontology_term_id: str | None [Required]

The ID used by the ontology for the sex of the cell donor

field tissue_ontology_term_id: str | None [Required]

The ID used by the ontology for the tissue type that the cell was a part of

field data: List[CellQueryResults.CellariumCellMetadata] [Required]

The metadata of the found cells
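
Because most metadata fields may be None, summaries over results.data need to skip missing values. The sketch below counts cells by one field; the helper name count_by and the toy records are hypothetical, with SimpleNamespace objects standing in for CellariumCellMetadata.

```python
from collections import Counter
from types import SimpleNamespace


def count_by(results, field_name: str) -> Counter:
    """Count cells by a metadata field, skipping cells where it is None."""
    values = (getattr(cell, field_name) for cell in results.data)
    return Counter(value for value in values if value is not None)


# Toy data shaped like the documented fields (values are made up).
toy_results = SimpleNamespace(data=[
    SimpleNamespace(cas_cell_index=101, cell_type="T cell", tissue="blood"),
    SimpleNamespace(cas_cell_index=205, cell_type="B cell", tissue=None),
    SimpleNamespace(cas_cell_index=309, cell_type="T cell", tissue="blood"),
])
print(count_by(toy_results, "cell_type"))  # Counter({'T cell': 2, 'B cell': 1})
```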