Client
- class cellarium.cas.client.CASClient(api_token: str, api_url: str = 'https://cellarium-cloud-api.cellarium.ai', num_attempts_per_chunk: int = 7)[source]
Client that is designed to communicate with the Cellarium Cloud Backend.
- Parameters:
api_token – API token issued by the Cellarium team.
api_url – URL of the Cellarium Cloud Backend. Should be left blank in most cases.
num_attempts_per_chunk – Number of attempts the client should make to annotate each chunk.
Default: 7
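As a rough illustration of what a per-chunk attempt limit means, the sketch below retries a failing operation up to a fixed number of times. It is not the client's actual retry logic: the helper name `call_with_attempts`, the catch-all exception handling, and the lack of any back-off are assumptions made for this example.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def call_with_attempts(operation: Callable[[], T], num_attempts: int) -> T:
    """Run `operation`, retrying on any exception, at most `num_attempts` times."""
    if num_attempts < 1:
        raise ValueError("num_attempts must be at least 1")
    last_error: Optional[Exception] = None
    for _ in range(num_attempts):
        try:
            return operation()
        except Exception as error:  # assumption: a real client would catch narrower errors
            last_error = error
    # Every attempt failed; surface the last underlying error as the cause.
    raise RuntimeError(f"all {num_attempts} attempts failed") from last_error
```

A call such as `call_with_attempts(send_chunk, num_attempts=7)` (with a hypothetical `send_chunk` callable) would return on the first successful attempt and raise only after the seventh failure.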
- property allowed_models_list
List of models in Cellarium CAS that can be used to annotate.
- annotate_10x_h5_file(filepath: str, chunk_size: int = 1000, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None, include_dev_metadata: bool = False) CellTypeSummaryStatisticsResults [source]
Parse the 10x ‘h5’ matrix and apply the annotate_anndata() method to it.
- Parameters:
filepath – Filepath of the local ‘h5’ matrix
chunk_size – Size of chunks to split on
cas_model_name – Model name to use for annotation.
Allowed Values: Model name from the allowed_models_list list or the "default" keyword, which refers to the default selected model in the Cellarium backend.
Default: "default"
count_matrix_input – Where to obtain a feature expression count matrix from.
Allowed Values: Choices from the enum CountMatrixInput
Default: CountMatrixInput.X
feature_ids_column_name – Column name where to obtain Ensembl feature ids.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: "index"
feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: None
include_dev_metadata – Boolean indicating whether to include a breakdown of the number of cells by dataset
- Returns:
A CellTypeSummaryStatisticsResults object with annotations for each of the cells from the input adata
- annotate_anndata(adata: AnnData, chunk_size=1000, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None, include_dev_metadata: bool = False) CellTypeSummaryStatisticsResults [source]
Send an instance of anndata.AnnData to the Cellarium Cloud backend for annotations. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.
- Parameters:
adata – anndata.AnnData instance to annotate
chunk_size – Size of chunks to split on
cas_model_name – Model name to use for annotation.
Allowed Values: Model name from the allowed_models_list list or the "default" keyword, which refers to the default selected model in the Cellarium backend.
Default: "default"
count_matrix_input – Where to obtain a feature expression count matrix from.
Allowed Values: Choices from the enum CountMatrixInput
Default: CountMatrixInput.X
feature_ids_column_name – Column name where to obtain Ensembl feature ids.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: "index"
feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: None
include_dev_metadata – Boolean indicating whether to include a breakdown of the number of cells by dataset
- Returns:
A CellTypeSummaryStatisticsResults object with annotations for each of the cells from the input adata
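The chunking rule described for annotate_anndata() (equal-sized chunks, with a possibly smaller last one) can be sketched in plain Python. This only illustrates the splitting arithmetic; the helper name `chunk_bounds` is hypothetical and the client's internal implementation may differ.

```python
from typing import Iterator, Tuple


def chunk_bounds(num_cells: int, chunk_size: int = 1000) -> Iterator[Tuple[int, int]]:
    """Yield (start, end) row ranges that cover `num_cells` rows in order."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be positive")
    for start in range(0, num_cells, chunk_size):
        # The final range is clipped, so the last chunk may be smaller.
        yield start, min(start + chunk_size, num_cells)


bounds = list(chunk_bounds(num_cells=2500, chunk_size=1000))
print(bounds)  # [(0, 1000), (1000, 2000), (2000, 2500)]
```

With 2500 cells and the default chunk size of 1000, this gives two full chunks and one final chunk of 500 cells.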
- annotate_anndata_file(filepath: str, chunk_size=1000, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None, include_dev_metadata: bool = False) CellTypeSummaryStatisticsResults [source]
Read the ‘h5ad’ file into an anndata.AnnData matrix and apply the annotate_anndata() method to it.
- Parameters:
filepath – Filepath of the local anndata.AnnData matrix
chunk_size – Size of chunks to split on
cas_model_name – Model name to use for annotation.
Allowed Values: Model name from the allowed_models_list list or the "default" keyword, which refers to the default selected model in the Cellarium backend.
Default: "default"
count_matrix_input – Where to obtain a feature expression count matrix from.
Allowed Values: Choices from the enum CountMatrixInput
Default: CountMatrixInput.X
feature_ids_column_name – Column name where to obtain Ensembl feature ids.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: "index"
feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: None
include_dev_metadata – Boolean indicating whether to include a breakdown of the number of cells per dataset
- Returns:
A CellTypeSummaryStatisticsResults object with annotations for each of the cells from the input adata
- annotate_matrix_cell_type_ontology_aware_strategy(matrix: str | AnnData, chunk_size=1000, count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', cas_model_name: str | None = None, feature_names_column_name: str | None = None, prune_threshold: float = 0.05, weighting_prefactor: float = 1.0) CellTypeOntologyAwareResults [source]
Send an instance of anndata.AnnData to the Cellarium Cloud backend for annotations using the ontology-aware strategy. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.
- Parameters:
matrix – Either a path to a file (must be either .h5 or .h5ad) or an anndata.AnnData instance to annotate
chunk_size – Size of chunks to split on
count_matrix_input – Where to obtain a feature expression count matrix from.
Allowed Values: Choices from the enum CountMatrixInput
Default: CountMatrixInput.X
feature_ids_column_name – Column name where to obtain Ensembl feature ids.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: "index"
cas_model_name – Model name to use for annotation.
Allowed Values: Model name from the allowed_models_list list or None, which refers to the default selected model in the Cellarium backend.
Default: None
feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: None
prune_threshold – Threshold score for pruning the ontology graph in the output
weighting_prefactor – Weighting prefactor for the weight calculation. A larger absolute value of the weighting_prefactor results in a steeper decay (weights drop off more quickly as distance increases), whereas a smaller absolute value results in a slower decay
- Returns:
A CellTypeOntologyAwareResults object with annotations for each of the cells from the input adata
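To make the roles of weighting_prefactor and prune_threshold more concrete, here is a minimal sketch. The exact weight formula and the pruning comparison are not stated in this reference, so both are assumptions: the sketch uses an exponential decay in distance and keeps scores at or above the threshold. The function names are hypothetical.

```python
import math
from typing import Dict


def neighbor_weight(distance: float, weighting_prefactor: float = 1.0) -> float:
    """ASSUMED form: exponential decay in distance, steeper for larger |prefactor|."""
    return math.exp(-abs(weighting_prefactor) * distance)


def prune(scores: Dict[str, float], prune_threshold: float = 0.05) -> Dict[str, float]:
    """ASSUMED semantics: keep only terms whose score reaches the threshold."""
    return {term: score for term, score in scores.items() if score >= prune_threshold}


# At the same distance, a larger prefactor yields a smaller weight (steeper decay).
slow = neighbor_weight(distance=2.0, weighting_prefactor=1.0)
steep = neighbor_weight(distance=2.0, weighting_prefactor=3.0)
print(slow > steep)  # True
```

Whatever the backend's real formula is, the documented behavior is the same as in this sketch: a larger absolute prefactor makes weights fall off faster as distance grows.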
- annotate_matrix_cell_type_summary_statistics_strategy(matrix: str | AnnData, chunk_size=1000, count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', include_extended_statistics: bool = True, cas_model_name: str | None = None, feature_names_column_name: str | None = None) CellTypeSummaryStatisticsResults [source]
Send an instance of anndata.AnnData to the Cellarium Cloud backend for annotations. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.
- Parameters:
matrix – Either a path to a file (must be either .h5 or .h5ad) or an anndata.AnnData instance to annotate
chunk_size – Size of chunks to split on
count_matrix_input – Where to obtain a feature expression count matrix from.
Allowed Values: Choices from the enum CountMatrixInput
Default: CountMatrixInput.X
feature_ids_column_name – Column name where to obtain Ensembl feature ids.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: "index"
include_extended_statistics – Boolean indicating whether to include a breakdown of the number of cells by dataset
cas_model_name – Model name to use for annotation.
Allowed Values: Model name from the allowed_models_list list or None, which refers to the default selected model in the Cellarium backend.
Default: None
feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: None
- Returns:
A CellTypeSummaryStatisticsResults object with annotations for each of the cells from the input adata
- query_cells_by_ids(cell_ids: List[int], metadata_feature_names: List[CellMetadataFeatures] = None) CellQueryResults [source]
Query cells by their ids from a single anndata file with Cellarium CAS. Input file should be validated and sanitized according to the model schema.
- Parameters:
cell_ids – List of cell ids to query
metadata_feature_names – List of metadata features to include in the response.
- Returns:
A CellQueryResults object with cell query results
- search_10x_h5_file(filepath: str, chunk_size: int = 500, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None) MatrixQueryResults [source]
Parse the 10x ‘h5’ matrix and apply the search_anndata() method to it.
- Parameters:
filepath – Filepath of the local ‘h5’ matrix
chunk_size – Size of chunks to split on
cas_model_name – Model name to use for annotation.
Allowed Values: Model name from the allowed_models_list list or the "default" keyword, which refers to the default selected model in the Cellarium backend.
Default: "default"
count_matrix_input – Where to obtain a feature expression count matrix from.
Allowed Values: Choices from the enum CountMatrixInput
Default: CountMatrixInput.X
feature_ids_column_name – Column name where to obtain Ensembl feature ids.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: "index"
feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: None
- Returns:
A MatrixQueryResults object with search results for each of the cells from the input adata
- search_anndata(adata: AnnData, chunk_size=500, cas_model_name: str = 'default', count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', feature_names_column_name: str | None = None) MatrixQueryResults [source]
Send an instance of anndata.AnnData to the Cellarium Cloud backend for nearest neighbor search. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.
- Parameters:
adata – anndata.AnnData instance to annotate
chunk_size – Size of chunks to split on
cas_model_name – Model name to use for annotation.
Allowed Values: Model name from the allowed_models_list list or the "default" keyword, which refers to the default selected model in the Cellarium backend.
Default: "default"
count_matrix_input – Where to obtain a feature expression count matrix from.
Allowed Values: Choices from the enum CountMatrixInput
Default: CountMatrixInput.X
feature_ids_column_name – Column name where to obtain Ensembl feature ids.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: "index"
feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: None
- Returns:
A MatrixQueryResults object with search results for each of the cells from the input adata
- search_matrix(matrix: str | AnnData, chunk_size: int = 500, count_matrix_input: CountMatrixInput = CountMatrixInput.X, feature_ids_column_name: str = 'index', cas_model_name: str | None = None, feature_names_column_name: str | None = None) MatrixQueryResults [source]
Send an instance of anndata.AnnData to the Cellarium Cloud backend for nearest neighbor search. The function splits the adata into smaller chunks and asynchronously sends them to the backend API service. Each chunk is of equal size, except for the last one, which may be smaller. The backend processes these chunks in parallel.
- Parameters:
matrix – Either a path to a file (must be either .h5 or .h5ad) or an anndata.AnnData instance to annotate
chunk_size – Size of chunks to split on
count_matrix_input – Where to obtain a feature expression count matrix from.
Allowed Values: Choices from the enum CountMatrixInput
Default: CountMatrixInput.X
feature_ids_column_name – Column name where to obtain Ensembl feature ids.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: "index"
cas_model_name – Model name to use for annotation.
Allowed Values: Model name from the allowed_models_list list or None, which refers to the default selected model in the Cellarium backend.
Default: None
feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: None
- Returns:
A MatrixQueryResults object with search results for each of the cells from the input adata
- validate_and_sanitize_input_data(adata: AnnData, cas_model_name: str, count_matrix_name: CountMatrixInput, feature_ids_column_name: str, feature_names_column_name: str | None = None) AnnData [source]
Validate and sanitize the input anndata.AnnData instance according to a specified feature schema associated with a particular model.
- Parameters:
adata – anndata.AnnData instance to annotate
cas_model_name – The model associated with the schema used for sanitizing.
Allowed Values: Model name from the allowed_models_list list.
count_matrix_name – Where to obtain a feature expression count matrix from.
Allowed Values: Choice of either "X" or "raw.X" in order to use adata.X or adata.raw.X
feature_ids_column_name – Column name where to obtain Ensembl feature ids.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
feature_names_column_name – Column name where to obtain feature names (symbols). Feature names are not mapped if the value is None.
Allowed Values: A value from adata.var.columns or the "index" keyword, which refers to the index column.
Default: None
- Returns:
Validated and sanitized instance of anndata.AnnData
- class cellarium.cas.constants.CountMatrixInput(value)[source]
Constants for the count matrix input type.
- X: str = 'X'
- RAW_X: str = 'raw.X'
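The two members above behave like an ordinary Python enumeration. The sketch below defines a local stand-in with the same member names and values, so it runs without the package installed. The real class may use a different base (for example a string-mixin enum), which this reference does not state.

```python
from enum import Enum


class CountMatrixInput(Enum):
    """Local stand-in mirroring the two documented members (not the library class)."""

    X = "X"
    RAW_X = "raw.X"


# Look a member up by its value, as the `(value)` in the class signature suggests.
choice = CountMatrixInput("raw.X")
print(choice.name, choice.value)  # RAW_X raw.X
```

Passing the enum member rather than a bare string to count_matrix_input keeps the call consistent with the documented default, CountMatrixInput.X.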
- class cellarium.cas.constants.CellMetadataFeatures(value)[source]
Represents the cell features that can be queried for in the CAS API.
- CAS_CELL_INDEX: str = 'cas_cell_index'
- CELL_TYPE: str = 'cell_type'
- ASSAY: str = 'assay'
- DISEASE: str = 'disease'
- DONOR_ID: str = 'donor_id'
- IS_PRIMARY_DATA: str = 'is_primary_data'
- DEVELOPMENT_STAGE: str = 'development_stage'
- ORGANISM: str = 'organism'
- SELF_REPORTED_ETHNICITY: str = 'self_reported_ethnicity'
- SEX: str = 'sex'
- SUSPENSION_TYPE: str = 'suspension_type'
- TISSUE: str = 'tissue'
- TOTAL_MRNA_UMIS: str = 'total_mrna_umis'
- CELL_TYPE_ONTOLOGY_TERM_ID: str = 'cell_type_ontology_term_id'
- ASSAY_ONTOLOGY_TERM_ID: str = 'assay_ontology_term_id'
- DISEASE_ONTOLOGY_TERM_ID: str = 'disease_ontology_term_id'
- DEVELOPMENT_STAGE_ONTOLOGY_TERM_ID: str = 'development_stage_ontology_term_id'
- ORGANISM_ONTOLOGY_TERM_ID: str = 'organism_ontology_term_id'
- SELF_REPORTED_ETHNICITY_ONTOLOGY_TERM_ID: str = 'self_reported_ethnicity_ontology_term_id'
- SEX_ONTOLOGY_TERM_ID: str = 'sex_ontology_term_id'
- TISSUE_ONTOLOGY_TERM_ID: str = 'tissue_ontology_term_id'
- pydantic model cellarium.cas.models.CellTypeSummaryStatisticsResults[source]
Represents the data object returned by the CAS API for nearest neighbor annotations.
- pydantic model DatasetStatistics[source]
- Fields:
count_per_dataset (int)
dataset_id (str)
max_distance (float)
mean_distance (float)
median_distance (float)
min_distance (float)
- field dataset_id: str [Required]
The ID of the dataset containing cells
- field count_per_dataset: int [Required]
The number of cells found in the dataset
- field min_distance: float [Required]
The minimum distance between the query cell and the dataset cells
- field max_distance: float [Required]
The maximum distance between the query cell and the dataset cells
- field median_distance: float [Required]
The median distance between the query cell and the dataset cells
- field mean_distance: float [Required]
The mean distance between the query cell and the dataset cells
- pydantic model SummaryStatistics[source]
- Fields:
cell_count (int)
cell_type (str)
dataset_ids_with_counts (List[cellarium.cas.models.CellTypeSummaryStatisticsResults.DatasetStatistics] | None)
max_distance (float)
median_distance (float)
min_distance (float)
p25_distance (float)
p75_distance (float)
- field cell_type: str [Required]
The cell type of the cluster of cells
- field cell_count: int [Required]
The number of cells in the cluster
- field min_distance: float [Required]
The minimum distance between the query cell and the cluster cells
- field p25_distance: float [Required]
The 25th percentile distance between the query cell and the cluster cells
- field median_distance: float [Required]
The median distance between the query cell and the cluster cells
- field p75_distance: float [Required]
The 75th percentile distance between the query cell and the cluster cells
- field max_distance: float [Required]
The maximum distance between the query cell and the cluster cells
- field dataset_ids_with_counts: List[CellTypeSummaryStatisticsResults.DatasetStatistics] | None = None
- pydantic model NeighborhoodAnnotation[source]
Represents the data object returned by the CAS API for a single nearest neighbor annotation.
- Fields:
matches (List[cellarium.cas.models.CellTypeSummaryStatisticsResults.SummaryStatistics])
query_cell_id (str)
- field query_cell_id: str [Required]
The ID of the querying cell
- field matches: List[CellTypeSummaryStatisticsResults.SummaryStatistics] [Required]
- field data: List[CellTypeSummaryStatisticsResults.NeighborhoodAnnotation] [Required]
The annotations found
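A short sketch of how the nested structure above might be traversed: for each query cell, pick the match with the largest cell_count. It works on plain dictionaries shaped like the documented fields, so it does not depend on the pydantic classes. The sample values are made up for illustration, and "largest cell_count wins" is only one possible heuristic, not something this reference prescribes.

```python
from typing import Dict, List


def top_cell_types(data: List[dict]) -> Dict[str, str]:
    """Map each query_cell_id to the cell_type of its largest match (by cell_count)."""
    result = {}
    for annotation in data:
        matches = annotation["matches"]
        if not matches:
            continue  # no matches reported for this query cell
        best = max(matches, key=lambda match: match["cell_count"])
        result[annotation["query_cell_id"]] = best["cell_type"]
    return result


# Made-up values for illustration only; not real CAS output.
sample = [
    {
        "query_cell_id": "cell-1",
        "matches": [
            {"cell_type": "T cell", "cell_count": 40, "min_distance": 0.1},
            {"cell_type": "B cell", "cell_count": 10, "min_distance": 0.3},
        ],
    },
    {"query_cell_id": "cell-2", "matches": []},
]
print(top_cell_types(sample))  # {'cell-1': 'T cell'}
```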
- pydantic model cellarium.cas.models.CellTypeOntologyAwareResults[source]
Represents the data object returned by the CAS API for ontology-aware annotations.
- pydantic model Match[source]
- Fields:
cell_type (str)
cell_type_ontology_term_id (str)
score (float)
- field score: float [Required]
The score of the match
- field cell_type_ontology_term_id: str [Required]
The ontology term ID of the cell type for the match
- field cell_type: str [Required]
The cell type of the match
- pydantic model OntologyAwareAnnotation[source]
Represents the data object returned by the CAS API for a single ontology-aware annotation.
- Fields:
matches (List[cellarium.cas.models.CellTypeOntologyAwareResults.Match])
query_cell_id (str)
total_neighbors (int)
total_neighbors_unrecognized (int)
total_weight (float)
- field query_cell_id: str [Required]
The ID of the querying cell
- field matches: List[CellTypeOntologyAwareResults.Match] [Required]
The matches found for the querying cell
- field total_weight: float [Required]
The total weight of the matches
- field total_neighbors: int [Required]
The total number of neighbors matched
- field total_neighbors_unrecognized: int [Required]
The total number of neighbors that were not recognized
- field data: List[CellTypeOntologyAwareResults.OntologyAwareAnnotation] [Required]
The annotations found
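The fields of a single ontology-aware annotation can be combined in simple ways, for example to find the highest-scoring match or the share of unrecognized neighbors. The sketch below uses a plain dictionary shaped like the documented fields. The helper names are hypothetical, and the values and term IDs are placeholders rather than real output.

```python
from typing import Optional


def best_match(annotation: dict) -> Optional[dict]:
    """Return the highest-scoring match for one query cell, or None if it has none."""
    matches = annotation["matches"]
    return max(matches, key=lambda match: match["score"]) if matches else None


def unrecognized_fraction(annotation: dict) -> float:
    """Share of neighbors that were not recognized; 0.0 when there are no neighbors."""
    total = annotation["total_neighbors"]
    return annotation["total_neighbors_unrecognized"] / total if total else 0.0


# Made-up values for illustration only; the term IDs are placeholders.
annotation = {
    "query_cell_id": "cell-1",
    "matches": [
        {"score": 0.9, "cell_type_ontology_term_id": "TERM:A", "cell_type": "type A"},
        {"score": 0.4, "cell_type_ontology_term_id": "TERM:B", "cell_type": "type B"},
    ],
    "total_weight": 12.5,
    "total_neighbors": 100,
    "total_neighbors_unrecognized": 5,
}
print(best_match(annotation)["cell_type"])  # type A
```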
- pydantic model cellarium.cas.models.MatrixQueryResults[source]
Represents the data object returned by the CAS API when performing a cell matrix query (e.g. a query of the cell database using a matrix).
- pydantic model Match[source]
- Fields:
cas_cell_index (float)
distance (float)
- field cas_cell_index: float [Required]
CAS-specific ID of a single cell
- field distance: float [Required]
The distance between this querying cell and the found cell
- pydantic model MatrixQueryResult[source]
Represents the data object returned by the CAS API for a single cell query.
- Fields:
neighbors (List[cellarium.cas.models.MatrixQueryResults.Match])
query_cell_id (str)
- field query_cell_id: str [Required]
The ID of the querying cell
- field neighbors: List[MatrixQueryResults.Match] [Required]
- field data: List[MatrixQueryResults.MatrixQueryResult] [Required]
The results of the query
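Since each query result carries a list of neighbors with distances, a common follow-up is to rank them. The sketch below sorts one result's neighbors by distance and keeps the closest k. It uses a plain dictionary shaped like the documented fields; the helper name and sample values are invented for the example.

```python
from typing import List


def nearest_neighbors(result: dict, k: int = 1) -> List[dict]:
    """Return the k neighbors with the smallest distance for one query cell."""
    return sorted(result["neighbors"], key=lambda match: match["distance"])[:k]


# Made-up values for illustration only; not real CAS output.
result = {
    "query_cell_id": "cell-1",
    "neighbors": [
        {"cas_cell_index": 30.0, "distance": 0.7},
        {"cas_cell_index": 10.0, "distance": 0.2},
        {"cas_cell_index": 20.0, "distance": 0.5},
    ],
}
closest = nearest_neighbors(result, k=2)
print([match["cas_cell_index"] for match in closest])  # [10.0, 20.0]
```

The cas_cell_index values returned here are the identifiers that query_cells_by_ids() accepts when looking up metadata for the matched cells, although that method documents them as integers.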
- pydantic model cellarium.cas.models.CellQueryResults[source]
Represents the data object returned by the CAS API for a cell query.
- pydantic model CellariumCellMetadata[source]
- Fields:
assay (str | None)
assay_ontology_term_id (str | None)
cas_cell_index (int)
cell_type (str | None)
cell_type_ontology_term_id (str | None)
development_stage (str | None)
development_stage_ontology_term_id (str | None)
disease (str | None)
disease_ontology_term_id (str | None)
donor_id (str | None)
is_primary_data (bool | None)
organism (str | None)
organism_ontology_term_id (str | None)
self_reported_ethnicity (str | None)
self_reported_ethnicity_ontology_term_id (str | None)
sex (str | None)
sex_ontology_term_id (str | None)
suspension_type (str | None)
tissue (str | None)
tissue_ontology_term_id (str | None)
total_mrna_umis (int | None)
- field cas_cell_index: int [Required]
The CAS-specific ID of the cell
- field cell_type: str | None [Required]
The cell type of the cell
- field assay: str | None [Required]
The assay used to generate the cell
- field disease: str | None [Required]
The disease state of the cell
- field donor_id: str | None [Required]
The ID of the donor of the cell
- field is_primary_data: bool | None [Required]
Whether the cell is primary data
- field development_stage: str | None [Required]
The development stage of the cell donor
- field organism: str | None [Required]
The organism of the cell
- field self_reported_ethnicity: str | None [Required]
The self-reported ethnicity of the cell donor
- field sex: str | None [Required]
The sex of the cell donor
- field suspension_type: str | None [Required]
The cell suspension types used
- field tissue: str | None [Required]
The tissue type that the cell was a part of
- field total_mrna_umis: int | None [Required]
The count of mRNA UMIs associated with this cell
- field cell_type_ontology_term_id: str | None [Required]
The ID used by the ontology for the type of the cell
- field assay_ontology_term_id: str | None [Required]
The ID used by the ontology for the assay used to generate the cell
- field disease_ontology_term_id: str | None [Required]
The ID used by the ontology for the disease state of the cell
- field development_stage_ontology_term_id: str | None [Required]
The ID used by the ontology for the development stage of the cell donor
- field organism_ontology_term_id: str | None [Required]
The ID used by the ontology for the organism of the cell
- field self_reported_ethnicity_ontology_term_id: str | None [Required]
The ID used by the ontology for the self-reported ethnicity of the cell donor
- field sex_ontology_term_id: str | None [Required]
The ID used by the ontology for the sex of the cell donor
- field tissue_ontology_term_id: str | None [Required]
The ID used by the ontology for the tissue type that the cell was a part of
- field data: List[CellQueryResults.CellariumCellMetadata] [Required]
The metadata of the found cells