Query & integrate data

import lamindb as ln
import lnschema_bionty as lb

lb.settings.species = "human"
💡 loaded instance: testuser1/test-facs (lamindb 0.55.2)
ln.track()
💡 notebook imports: lamindb==0.55.2 lnschema_bionty==0.31.2
💡 Transform(id='wukchS8V976Uz8', name='Query & integrate data', short_name='facs3', version='0', type=notebook, updated_at=2023-10-10 15:45:54, created_by_id='DzTjkKse')
💡 Run(id='eYGe8Gi3R4jMTpKIjRde', run_at=2023-10-10 15:45:54, transform_id='wukchS8V976Uz8', created_by_id='DzTjkKse')

Inspect the CellMarker registry

Inspect your aggregated cell marker registry as a DataFrame:

lb.CellMarker.filter().df().head()
id            name  synonyms  gene_symbol  ncbi_gene_id  uniprotkb_id  species_id  bionty_source_id  updated_at           created_by_id
cFJEI6e6wml3  CD20            MS4A1        931           A0A024R507    uHJU        Fbnq              2023-10-10 15:45:22  DzTjkKse
roEbL8zuLC5k  Cd14            CD14         4695          O43678        uHJU        Fbnq              2023-10-10 15:45:22  DzTjkKse
uThe3c0V3d4i  CD27            CD27         939           P26842        uHJU        Fbnq              2023-10-10 15:45:22  DzTjkKse
0qCmUijBeByY  CD94            KLRD1        3824          Q13241        uHJU        Fbnq              2023-10-10 15:45:22  DzTjkKse
CR7DAHxybgyi  CD38            CD38         952           B4E006        uHJU        Fbnq              2023-10-10 15:45:22  DzTjkKse

Search for a marker (synonym-aware):

lb.CellMarker.search("PD-1").head(2)
name     id            synonyms        __ratio__
PD1      2VeZenLi2dj5  PID1|PD-1|PD 1  100.000000
CD14/19  9VptKqpwq9BZ                  54.545455
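
The `__ratio__` column suggests that each record is scored by string similarity against the query, with synonyms taken into account. As a rough illustration of that idea (not lamindb's actual implementation), the sketch below scores a query against a record's name and each of its pipe-separated synonyms using the standard library's `difflib`, and keeps the best score per record. `REGISTRY`, `similarity`, and `search` are hypothetical names introduced for this example, and the scores it produces are not expected to match the `__ratio__` values above.

```python
from difflib import SequenceMatcher

# Hypothetical miniature registry: marker name -> pipe-separated synonyms.
REGISTRY = {
    "PD1": "PID1|PD-1|PD 1",
    "CD14/19": "",
    "CD8": "",
}


def similarity(a, b):
    """Case-insensitive similarity between two strings, on a 0-100 scale."""
    return 100.0 * SequenceMatcher(None, a.lower(), b.lower()).ratio()


def search(query, registry=REGISTRY):
    """Rank records by their best match over the name and all synonyms."""
    results = []
    for name, synonyms in registry.items():
        # Compare against the name and every non-empty synonym.
        candidates = [name] + [s for s in synonyms.split("|") if s]
        best = max(similarity(query, c) for c in candidates)
        results.append((name, best))
    # Highest score first.
    return sorted(results, key=lambda item: item[1], reverse=True)


print(search("PD-1")[:2])
```

Because the query "PD-1" is itself listed as a synonym of PD1, that record receives the maximum score even though its name differs from the query.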

Look up markers with auto-complete:

markers = lb.CellMarker.lookup()

markers.cd8
CellMarker(id='ttBc0Fs01sYk', name='CD8', synonyms='', gene_symbol='CD8A', ncbi_gene_id='925', uniprotkb_id='P01732', updated_at=2023-10-10 15:45:22, species_id='uHJU', bionty_source_id='Fbnq', created_by_id='DzTjkKse')
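
To give a feel for how an auto-complete lookup of this kind can be built, here is a minimal sketch that turns record names into attribute names on a namespace object. It is an assumption about the general technique, not a description of how `lookup()` is implemented in lamindb. The helpers `to_attribute` and `make_lookup` and the `records` list are hypothetical. The gene symbols are taken from the outputs above.

```python
import re
from types import SimpleNamespace


def to_attribute(name):
    """Turn a marker name into a lowercase Python identifier."""
    key = re.sub(r"\W+", "_", name.strip().lower()).strip("_")
    # Identifiers cannot start with a digit, so prefix those with "_".
    return f"_{key}" if key[:1].isdigit() else key


def make_lookup(records):
    """Expose each record as an attribute named after its normalized name."""
    return SimpleNamespace(**{to_attribute(r["name"]): r for r in records})


# Hypothetical records; only the fields needed for the demo are included.
records = [
    {"name": "CD8", "gene_symbol": "CD8A"},
    {"name": "CD20", "gene_symbol": "MS4A1"},
    {"name": "CD14/19"},
]

demo_markers = make_lookup(records)
print(demo_markers.cd8["gene_symbol"])  # prints CD8A
print(demo_markers.cd14_19["name"])     # prints CD14/19
```

Exposing records as attributes is what lets an interactive shell or notebook offer tab completion on the lookup object.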

Query files by markers

Query panels and datasets based on markers, e.g., which datasets have 'CD8' in the flow panel:

panels_with_cd8 = ln.FeatureSet.filter(cell_markers=markers.cd8).all()
ln.File.filter(feature_sets__in=panels_with_cd8).df()
id                    storage_id  key   suffix  accessor  description  version  size      hash                    hash_type  transform_id    run_id                initial_version_id  updated_at           created_by_id
QerZGf4taLaLa0UCUjEO  QK7Rd19J    None  .h5ad   AnnData   Alpert19     None     33369696  VsTnnzHN63ovNESaJtlRUQ  md5        OWuTtS4SAponz8  R7aH6AfJFQF8f46JiC2H  None                2023-10-10 15:45:30  DzTjkKse
9iRLixREI4t3uJwL6urF  QK7Rd19J    None  .h5ad   AnnData   Oetjen18_t1  None     46501304  I8nRS02iBs5z1J01b2qwOg  md5        SmQmhrhigFPLz8  q7FG1FkyFAdnRdyp17Hk  None                2023-10-10 15:45:44  DzTjkKse
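
The query above works in two steps: first find the feature sets (panels) linked to the marker, then find the files linked to any of those feature sets. The following plain-Python sketch mimics that logic on in-memory dictionaries so the two steps are easy to see. The panel ids, the marker sets, and the third file are made up for illustration. Only the file descriptions `Alpert19` and `Oetjen18_t1` come from the output above.

```python
# Illustrative link tables; panel ids and marker sets are hypothetical.
panel_markers = {
    "panel_1": {"CD8", "CD3", "CD27"},
    "panel_2": {"CD8", "CD45RA"},
    "panel_3": {"marker_x", "marker_y"},
}
file_panels = {
    "Alpert19": ["panel_1"],
    "Oetjen18_t1": ["panel_2"],
    "hypothetical_file": ["panel_3"],
}


def files_with_marker(marker):
    """Return the files that have `marker` in at least one linked panel."""
    # Step 1: panels (feature sets) that contain the marker.
    panels = {p for p, markers in panel_markers.items() if marker in markers}
    # Step 2: files linked to at least one of those panels.
    return [f for f, linked in file_panels.items() if panels.intersection(linked)]


print(files_with_marker("CD8"))  # prints ['Alpert19', 'Oetjen18_t1']
```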

Access the Feature registry through a lookup object:

features = ln.Feature.lookup()

Find shared cell markers between two files:

files = ln.File.filter(feature_sets__in=panels_with_cd8).list()
file1, file2 = files[0], files[1]
shared_markers = file1.features["var"] & file2.features["var"]
shared_markers.list("name")
['CD27', 'Cd4', 'CD3', 'CD8', 'CD45RA', 'Ccr7']
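
Conceptually, the `&` step above is a set intersection over the markers measured in each file. A minimal sketch of the same idea on plain lists of names follows. The two panels are hypothetical: they contain the six shared names reported above plus placeholder entries (`marker_x`, `marker_y`), and `shared_in_order` is a helper introduced only for this example.

```python
def shared_in_order(first, second):
    """Names present in both lists, in the order they appear in `first`."""
    second_names = set(second)
    return [name for name in first if name in second_names]


# Hypothetical panels: the shared names plus one placeholder marker each.
panel_1 = ["CD27", "Cd4", "CD3", "CD8", "CD45RA", "Ccr7", "marker_x"]
panel_2 = ["CD8", "Ccr7", "CD3", "CD45RA", "Cd4", "CD27", "marker_y"]

print(shared_in_order(panel_1, panel_2))
# prints ['CD27', 'Cd4', 'CD3', 'CD8', 'CD45RA', 'Ccr7']
```

Note that this comparison is exact and case-sensitive, so `Cd4` and `CD4` would be treated as different names in this sketch.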