scanpy.get.obs_df

scanpy.get.obs_df(adata, keys=(), obsm_keys=(), *, layer=None, gene_symbols=None, use_raw=False)

Return values for observations in adata.

Parameters
adata : AnnData

AnnData object to get values from.

keys : Iterable[str] (default: ())

Keys from either .var_names, .var[gene_symbols], or .obs.columns.

obsm_keys : Iterable[Tuple[str, int]] (default: ())

Tuples of (key from .obsm, column index of .obsm[key]).

layer : Optional[str] (default: None)

Layer of adata to use as expression values.

gene_symbols : Optional[str] (default: None)

Column of adata.var to search for keys in.

use_raw : bool (default: False)

Whether to get expression values from adata.raw.

Return type

DataFrame

Returns

A dataframe with adata.obs_names as index, and values specified by keys and obsm_keys.

Examples

Getting values for plotting:

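>>> import scanpy as sc
>>> pbmc = sc.datasets.pbmc68k_reduced()
>>> plotdf = sc.get.obs_df(
...     pbmc,
...     keys=["CD8B", "n_genes"],
...     obsm_keys=[("X_umap", 0), ("X_umap", 1)],
... )
>>> # obsm columns are named "<key>-<index>"; check plotdf.columns if unsure
>>> plotdf.plot.scatter("X_umap-0", "X_umap-1", c="CD8B")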

Calculating mean expression for marker genes by cluster:

>>> pbmc = sc.datasets.pbmc68k_reduced()
>>> marker_genes = ["CD79A", "MS4A1", "CD8A", "CD8B", "LYZ"]
>>> genedf = sc.get.obs_df(
...     pbmc,
...     keys=["louvain", *marker_genes],
... )
>>> grouped = genedf.groupby("louvain")
>>> mean, var = grouped.mean(), grouped.var()