Command-Line Interface

cellarium-cas ships a command-line interface for running annotation and benchmarking jobs without writing Python. The API token is read from the CAS_API_TOKEN environment variable by default, or passed explicitly via --cas-api-token.

export CAS_API_TOKEN="your_token_here"

annotate

Annotate a local .h5ad file using the CAS ontology-aware strategy and write outputs to a directory. The output directory will contain ontology_response.json, ontology_resource.json, and metadata.json by default. Pass --infer-labels to also write inferred_labels.csv with per-cell top-k cell type assignments. Pass --cluster-label to additionally compute cluster-level label calls.

cellarium-cas annotate \
    --input-path cells.h5ad \
    --output-dir ./cas_output \
    --cluster-label leiden
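
After a run, it can be handy to confirm that the expected files were written. The following is a minimal Python sketch that uses only the file names listed above; the helper name missing_outputs is hypothetical and is not part of cellarium-cas.

```python
from pathlib import Path

# File names taken from the description above. inferred_labels.csv is only
# expected when label inference was requested.
DEFAULT_OUTPUTS = ("ontology_response.json", "ontology_resource.json", "metadata.json")


def missing_outputs(output_dir, expect_inferred_labels=False):
    """Return the expected output file names that are absent from output_dir."""
    expected = list(DEFAULT_OUTPUTS)
    if expect_inferred_labels:
        expected.append("inferred_labels.csv")
    root = Path(output_dir)
    return [name for name in expected if not (root / name).is_file()]
```

An empty list means every expected file is present, for example missing_outputs("./cas_output", expect_inferred_labels=True).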

cellarium-cas annotate

Annotate a single-cell dataset using the CAS ontology-aware strategy.

Usage

cellarium-cas annotate [OPTIONS]

Options

--input-path <input_path>

Required. Path to input .h5ad or .h5 (Cell Ranger HDF5) file.

--output-dir <output_dir>

Required. Directory to write output files into.

--cas-api-token <cas_api_token>

CAS API token. Can also be set via the CAS_API_TOKEN environment variable.

--cas-api-url <cas_api_url>

CAS API URL. Defaults to the Cellarium production server. Can also be set via CAS_API_URL.

--cas-model-name <cas_model_name>

Model name to use for annotation. Defaults to the server default.

--chunk-size <chunk_size>

Number of cells per annotation chunk.

Default:

1000
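
To give a feel for what --chunk-size means for a dataset, here is a minimal Python sketch that splits a cell count into index ranges. It assumes contiguous chunks with a shorter final chunk; the client's actual chunking logic may differ, and chunk_bounds is a hypothetical helper.

```python
def chunk_bounds(n_cells, chunk_size=1000):
    """Yield (start, stop) index pairs covering n_cells in chunks of chunk_size."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, n_cells, chunk_size):
        yield start, min(start + chunk_size, n_cells)


# 2500 cells with the default chunk size give three chunks.
print(list(chunk_bounds(2500)))  # [(0, 1000), (1000, 2000), (2000, 2500)]
```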

--count-matrix-input <count_matrix_input>

Where to read the count matrix from in the AnnData object.

Default:

'X'

Options:

X | raw.X

--feature-ids-column-name <feature_ids_column_name>

Column in adata.var containing Ensembl feature IDs, or 'index' to use the adata.var index.

Default:

'index'

--feature-names-column-name <feature_names_column_name>

Column in adata.var containing feature names (symbols). Omit to skip mapping.

--prune-threshold <prune_threshold>

Score threshold for pruning the ontology graph output.

Default:

0.05

--weighting-prefactor <weighting_prefactor>

Weighting prefactor controlling neighbor distance weight decay.

Default:

1.0

--infer-labels, --no-infer-labels

Run postprocessing to infer top-k cell type labels and save inferred_labels.csv.

Default:

False

--min-acceptable-score <min_acceptable_score>

Minimum score for a cell type call to be included. Used when --infer-labels is set.

Default:

0.2

--top-k <top_k>

Number of top cell type calls per cell. Used when --infer-labels is set.

Default:

3

--save-metadata, --no-save-metadata

Save metadata.json with run provenance info.

Default:

True

--save-ontology-resource, --no-save-ontology-resource

Save ontology_resource.json. Required for offline ontology-aware benchmarking.

Default:

True

--output-h5ad <output_h5ad>

If provided, insert CAS annotations into the AnnData object and save it as an .h5ad file at this path.

--cluster-label <cluster_label>

adata.obs column containing cluster labels. If provided, runs cluster-level label inference and includes results in inferred_labels.csv.

Environment variables

CAS_API_TOKEN

Provide a default for --cas-api-token

CAS_API_URL

Provide a default for --cas-api-url


benchmark

Evaluate annotation quality against a labelled reference dataset. The benchmark subcommands consume output directories produced by cellarium-cas annotate. All benchmark artifacts (confusion matrices, metric CSVs) are written to a single --output-dir that acts as the benchmarking workspace. Annotate directories are never modified.

The pipeline has four ordered steps plus a convenience command that runs all of them:

# All-in-one
cellarium-cas benchmark all \
    --annotate-dirs ./annotate_outputs \
    --output-dir    ./benchmark_results \
    --gt-label      cell_type_ontology_term_id \
    --inferred-label cas_cell_type_name_1

# Or step by step
cellarium-cas benchmark confusion-matrix ...
cellarium-cas benchmark aggregate        ...
cellarium-cas benchmark f-measure        ...
cellarium-cas benchmark hierarchical     ...
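
The step-by-step form can also be driven from a script. Below is a minimal Python sketch that assembles the four commands using only the options documented in this section. build_benchmark_steps and run_benchmark_steps are hypothetical helpers, and actually running the commands requires cellarium-cas to be installed.

```python
import subprocess


def build_benchmark_steps(annotate_dirs, output_dir, gt_label, inferred_label):
    """Return the four benchmark commands, in pipeline order, as argv lists."""
    base = ["cellarium-cas", "benchmark"]
    return [
        base + [
            "confusion-matrix",
            "--annotate-dirs", annotate_dirs,
            "--output-dir", output_dir,
            "--gt-label", gt_label,
            "--inferred-label", inferred_label,
        ],
        # The remaining steps only need the shared workspace directory.
        base + ["aggregate", "--output-dir", output_dir],
        base + ["f-measure", "--output-dir", output_dir],
        base + ["hierarchical", "--output-dir", output_dir],
    ]


def run_benchmark_steps(steps):
    """Run each step in order, stopping at the first failure."""
    for argv in steps:
        subprocess.run(argv, check=True)
```

Keeping command construction separate from execution makes it easy to print or log the commands before anything is run.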

See Benchmarking for full documentation of the pipeline and output columns.

cellarium-cas benchmark

Compute benchmarking metrics against CAS annotate outputs.

Usage

cellarium-cas benchmark [OPTIONS] COMMAND [ARGS]...

aggregate

Aggregate per-sample confusion matrices by model name into <output-dir>/cm_aggregate/.

Groups all matrices in cm_raw/ by their model_name metadata field and sums each group into one aggregated sparse confusion matrix.
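
The grouping-and-summing step can be sketched in Python as follows. This assumes each per-sample matrix is a sparse mapping from (true label, predicted label) to a count, paired with its model name. The on-disk format in cm_raw/ is not described here, so these data structures are illustrative only.

```python
from collections import Counter, defaultdict


def aggregate_by_model(samples):
    """Sum sparse confusion matrices per model name.

    samples: iterable of (model_name, matrix) pairs, where matrix maps
    (true_label, predicted_label) -> count.
    """
    totals = defaultdict(Counter)
    for model_name, matrix in samples:
        totals[model_name].update(matrix)  # Counter.update adds counts
    return {name: dict(counts) for name, counts in totals.items()}
```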

Usage

cellarium-cas benchmark aggregate [OPTIONS]

Options

--output-dir <output_dir>

Required. Benchmarking workspace directory. All artifacts are written here.

all

Run the full benchmark pipeline in one command.

Equivalent to running confusion-matrix -> aggregate -> f-measure -> hierarchical in sequence. All artifacts are written to <output-dir>.

Usage

cellarium-cas benchmark all [OPTIONS]

Options

--annotate-dirs <annotate_dirs>

Required. Path to a parent directory whose subdirectories are annotate output dirs, or a .txt file listing one annotate output directory path per line.
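
The two accepted forms of --annotate-dirs can be pictured with a short Python sketch. It assumes blank lines in the .txt file are ignored and that every subdirectory of the parent counts as an annotate output directory; the real command may validate more strictly, and resolve_annotate_dirs is a hypothetical helper.

```python
from pathlib import Path


def resolve_annotate_dirs(annotate_dirs):
    """Expand a parent directory or a .txt listing into a list of directories."""
    path = Path(annotate_dirs)
    if path.is_dir():
        # Parent directory: each subdirectory is one annotate output dir.
        return sorted(p for p in path.iterdir() if p.is_dir())
    if path.is_file() and path.suffix == ".txt":
        # Text file: one directory path per line (blank lines skipped here).
        lines = path.read_text().splitlines()
        return [Path(line.strip()) for line in lines if line.strip()]
    raise ValueError(f"expected a directory or a .txt file, got {annotate_dirs!r}")
```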

--output-dir <output_dir>

Required. Benchmarking workspace directory. All artifacts are written here.

--gt-label <gt_label>

Required. Column name in the original .h5ad obs that contains ground-truth cell type labels (e.g. 'cell_type_ontology_term_id').

--inferred-label <inferred_label>

Required. Column name in inferred_labels.csv that contains the predicted cell type labels (e.g. 'cas_cell_type_name_1').

--f-measure-top-k <f_measure_top_k>

Number of ranked inferred-label columns to consider for flat F-measure. Columns are derived from --inferred-label by replacing its trailing rank; hierarchical F-measure remains top-1.

Default:

1
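
The derivation of ranked columns from --inferred-label can be illustrated with a small Python sketch. It assumes the column name ends in an integer rank and that ranks run from 1 to k; ranked_columns is a hypothetical helper, not the library's own function.

```python
import re


def ranked_columns(inferred_label, top_k=1):
    """Return the column names for ranks 1..top_k derived from inferred_label."""
    if top_k == 1:
        return [inferred_label]
    # Split the name into a prefix and its trailing integer rank.
    match = re.fullmatch(r"(.*?)(\d+)", inferred_label)
    if match is None:
        raise ValueError(f"{inferred_label!r} does not end in a rank")
    prefix = match.group(1)
    return [f"{prefix}{rank}" for rank in range(1, top_k + 1)]


print(ranked_columns("cas_cell_type_name_1", top_k=3))
# ['cas_cell_type_name_1', 'cas_cell_type_name_2', 'cas_cell_type_name_3']
```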

confusion-matrix

Build per-sample confusion matrices and save them to <output-dir>/cm_raw/.

Reads ground-truth labels from the original .h5ad (via metadata.json input_path) and predicted labels from inferred_labels.csv. The matrix is aligned to the full Cell Ontology cl_names universe from ontology_resource.json.
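
Aligning a confusion matrix to a fixed label universe can be sketched in Python as below. This is a dense toy version with rows as true labels and columns as predicted labels; it assumes both label lists use the same naming as the universe and skips pairs with labels outside it. The command itself stores sparse matrices, and confusion_matrix here is a hypothetical helper.

```python
def confusion_matrix(true_labels, predicted_labels, universe):
    """Count (true, predicted) pairs in a matrix indexed by universe order."""
    index = {name: i for i, name in enumerate(universe)}
    n = len(universe)
    matrix = [[0] * n for _ in range(n)]
    for true, pred in zip(true_labels, predicted_labels):
        if true in index and pred in index:
            matrix[index[true]][index[pred]] += 1
    return matrix


universe = ["B cell", "T cell", "NK cell"]
cm = confusion_matrix(
    ["T cell", "T cell", "B cell"],
    ["T cell", "B cell", "B cell"],
    universe,
)
print(cm)  # [[1, 0, 0], [1, 1, 0], [0, 0, 0]]
```

Labels that never occur still get a row and a column, which is what keeps matrices from different samples summable.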

Usage

cellarium-cas benchmark confusion-matrix [OPTIONS]

Options

--annotate-dirs <annotate_dirs>

Required. Path to a parent directory whose subdirectories are annotate output dirs, or a .txt file listing one annotate output directory path per line.

--output-dir <output_dir>

Required. Benchmarking workspace directory. All artifacts are written here.

--gt-label <gt_label>

Required. Column name in the original .h5ad obs that contains ground-truth cell type labels (e.g. 'cell_type_ontology_term_id').

--inferred-label <inferred_label>

Required. Column name in inferred_labels.csv that contains the predicted cell type labels (e.g. 'cas_cell_type_name_1').

--f-measure-top-k <f_measure_top_k>

Number of ranked inferred-label columns to consider for flat F-measure. Columns are derived from --inferred-label by replacing its trailing rank; hierarchical F-measure remains top-1.

Default:

1

f-measure

Compute standard F-measure metrics from confusion matrices.

Reads cm_raw/ and cm_aggregate/ and writes:

f_measure_per_sample.csv (columns: model_name, test_sample, tp, fp, fn, precision_micro, recall_micro, f1_micro, f1_macro, precision_macro, recall_macro, precision_weighted, recall_weighted, f1_weighted)

f_measure_per_group.csv (columns: group_name, tp, fp, fn, precision_micro, recall_micro, f1_micro, f1_macro, precision_macro, recall_macro, precision_weighted, recall_weighted, f1_weighted)
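
For readers unfamiliar with the three averaging schemes behind these columns, here is a minimal Python sketch computing micro, macro, and weighted F1 from a dense confusion matrix (rows true, columns predicted). It assumes macro and weighted averages run over classes that occur as ground truth; the command's exact conventions may differ, and f_measures is a hypothetical helper.

```python
def f_measures(matrix):
    """Return micro, macro, and support-weighted F1 for a square matrix."""
    n = len(matrix)
    tp = [matrix[i][i] for i in range(n)]
    fn = [sum(matrix[i]) - tp[i] for i in range(n)]
    fp = [sum(matrix[r][i] for r in range(n)) - tp[i] for i in range(n)]
    support = [sum(row) for row in matrix]  # true instances per class

    def f1(t, p, m):
        denom = 2 * t + p + m
        return 2 * t / denom if denom else 0.0

    per_class = [f1(tp[i], fp[i], fn[i]) for i in range(n)]
    present = [i for i in range(n) if support[i] > 0]
    total = sum(support)
    return {
        # Micro: pool counts over all classes, then compute one F1.
        "f1_micro": f1(sum(tp), sum(fp), sum(fn)),
        # Macro: unweighted mean of per-class F1.
        "f1_macro": sum(per_class[i] for i in present) / len(present) if present else 0.0,
        # Weighted: per-class F1 weighted by class support.
        "f1_weighted": sum(per_class[i] * support[i] for i in present) / total if total else 0.0,
    }
```

Micro averaging is dominated by frequent cell types, while macro averaging gives rare types equal weight, so comparing the two hints at how a model behaves on rare populations.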

Usage

cellarium-cas benchmark f-measure [OPTIONS]

Options

--output-dir <output_dir>

Required. Benchmarking workspace directory. All artifacts are written here.

hierarchical

Compute hierarchical F-measure metrics from confusion matrices.

Implements the Kiritchenko et al. approach: hierarchical TP/FP/FN are derived from the overlap of ontology ancestor sets of the true and predicted labels. Reads cm_raw/ and cm_aggregate/ and writes:

hierarchical_f_measure_per_sample.csv (columns: model_name, test_sample, h_tp, h_fp, h_fn, h_precision_micro, h_recall_micro, h_f1_micro, h_f1_macro, h_precision_macro, h_recall_macro, h_precision_weighted, h_recall_weighted, h_f1_weighted)

hierarchical_f_measure_per_group.csv (columns: group_name, h_tp, h_fp, h_fn, h_precision_micro, h_recall_micro, h_f1_micro, h_f1_macro, h_precision_macro, h_recall_macro, h_precision_weighted, h_recall_weighted, h_f1_weighted)
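
The ancestor-overlap idea can be made concrete with a toy ontology. The Python sketch below assumes the ontology is given as a child-to-parents mapping and that each ancestor set includes the label itself; whether the root term is counted is a convention this sketch does not settle. All function names are hypothetical.

```python
def ancestors_inclusive(label, parents):
    """Return label plus all of its ancestors in the parents mapping."""
    seen = set()
    stack = [label]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(parents.get(node, ()))
    return seen


def hierarchical_counts(true_label, predicted_label, parents):
    """Return (h_tp, h_fp, h_fn) from the overlap of the two ancestor sets."""
    true_set = ancestors_inclusive(true_label, parents)
    pred_set = ancestors_inclusive(predicted_label, parents)
    return len(true_set & pred_set), len(pred_set - true_set), len(true_set - pred_set)


def hierarchical_f1(pairs, parents):
    """Pool counts over (true, predicted) pairs and return (P, R, F1)."""
    h_tp = h_fp = h_fn = 0
    for true_label, predicted_label in pairs:
        tp, fp, fn = hierarchical_counts(true_label, predicted_label, parents)
        h_tp, h_fp, h_fn = h_tp + tp, h_fp + fp, h_fn + fn
    precision = h_tp / (h_tp + h_fp) if h_tp + h_fp else 0.0
    recall = h_tp / (h_tp + h_fn) if h_tp + h_fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Toy ontology: T cell and B cell are both lymphocytes, which are leukocytes.
parents = {
    "T cell": ["lymphocyte"],
    "B cell": ["lymphocyte"],
    "lymphocyte": ["leukocyte"],
    "leukocyte": [],
}
print(hierarchical_counts("T cell", "B cell", parents))      # (2, 1, 1)
print(hierarchical_counts("T cell", "lymphocyte", parents))  # (2, 0, 1)
```

Calling a T cell a B cell still earns partial credit for the shared lymphocyte and leukocyte ancestors, and predicting the coarser term lymphocyte costs recall but not precision.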

Usage

cellarium-cas benchmark hierarchical [OPTIONS]

Options

--output-dir <output_dir>

Required. Benchmarking workspace directory. All artifacts are written here.