Validate and register an h5ad file based on CELLxGENE schema#

This guide shows how to validate and curate an AnnData object using the metadata registries of laminlabs/cellxgene, based on the CELLxGENE schema version 5.1.0.

The validated object can be subsequently registered as an artifact in your LaminDB instance.

Note

The cellxgene-lamin-validator is primarily designed to validate that all metadata adheres to the ontologies. It does not reimplement all rules of the CELLxGENE schema, so we recommend also running cellxgene-schema if full adherence beyond metadata is required.

Set up#

Initialize a LaminDB instance in which to register the validated AnnData:

!lamin init --storage ./test-cellxgene-lamin-validator --schema bionty
💡 connected lamindb: testuser1/test-cellxgene-lamin-validator
import lamindb as ln
from cellxgene_lamin_validator import Validator, datasets, CellxGeneFields

ln.settings.verbosity = "hint"
💡 connected lamindb: testuser1/test-cellxgene-lamin-validator
❗ Full backed capabilities are not available for this version of anndata, please install anndata>=0.9.1.

An h5ad file#

Let’s start with an AnnData object that we’d like to inspect and curate:

adata = datasets.anndata_human_immune_cells(populate_registries=True)
adata
AnnData object with n_obs × n_vars = 1626 × 36503
    obs: 'donor', 'tissue', 'cell_type', 'assay', 'sex_ontology_term_id'
    var: 'feature_is_filtered'
    uns: 'default_embedding'
    obsm: 'X_umap'
adata.write_h5ad("anndata_human_immune_cells.h5ad")
... storing 'donor' as categorical
... storing 'sex_ontology_term_id' as categorical
!cellxgene-schema validate anndata_human_immune_cells.h5ad
Loading dependencies
Loading validator modules
Starting validation...
WARNING: Validation of raw layer was not performed due to current errors, try again after fixing current errors.
ERROR: Add labels error: Column 'cell_type' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: Add labels error: Column 'assay' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: Add labels error: Column 'tissue' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: 'title' in 'uns' is not present.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'raw.var'.
ERROR: Dataframe 'obs' is missing column 'cell_type_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'assay_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'disease_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'organism_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'tissue_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'self_reported_ethnicity_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'development_stage_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'is_primary_data'.
ERROR: Dataframe 'obs' is missing column 'donor_id'.
ERROR: Dataframe 'obs' is missing column 'suspension_type'.
ERROR: Dataframe 'obs' is missing column 'tissue_type'.
Validation complete in 0:00:00.483019 with status is_valid=False
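
The report above is long mainly because each invalid feature ID is listed twice, once for 'var' and once for 'raw.var'. As a minimal sketch, assuming the report has been captured as text, the errors can be grouped by kind to get an overview. The helper `summarize_errors` and its grouping labels are illustrative and not part of cellxgene-schema; the sample lines are copied from the report above.

```python
import re
from collections import Counter

# A few lines copied from the report above; in practice, pass the full captured text.
SAMPLE_REPORT = """\
ERROR: Add labels error: Column 'cell_type' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: 'title' in 'uns' is not present.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: Dataframe 'obs' is missing column 'donor_id'.
ERROR: Dataframe 'obs' is missing column 'tissue_type'.
"""

# Hypothetical grouping: (label, pattern) pairs, tried in order.
PATTERNS = [
    ("reserved obs column", re.compile(r"is a reserved column name of 'obs'")),
    ("invalid feature ID", re.compile(r"is not a valid feature ID in '(?:raw\.)?var'")),
    ("missing obs column", re.compile(r"Dataframe 'obs' is missing column")),
]


def summarize_errors(report: str) -> Counter:
    """Count ERROR lines per kind; errors matching no pattern are counted as 'other'."""
    counts = Counter()
    for line in report.splitlines():
        if not line.startswith("ERROR:"):
            continue
        for label, pattern in PATTERNS:
            if pattern.search(line):
                counts[label] += 1
                break
        else:
            counts["other"] += 1
    return counts


summary = summarize_errors(SAMPLE_REPORT)
for label, n in summary.most_common():
    print(f"{n:>4}  {label}")
```

Grouping this way makes it easier to see that the metadata problems (reserved and missing 'obs' columns) are the ones addressed in the rest of this guide.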

Validate and curate metadata#

Validate the AnnData object:

try:
    validator = Validator(adata)
except Exception as e:
    print(e)
columns ['development_stage', 'disease', 'donor_id', 'self_reported_ethnicity', 'suspension_type', 'tissue_type', 'organism'] are not found in the AnnData object!
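
The message above amounts to a comparison between the columns the validator expects and the columns present in `adata.obs`. A minimal sketch of that comparison in plain Python, assuming the expected columns are the keys of `validator.obs_fields` as printed later in this guide. The helper `missing_columns` is hypothetical and not the validator's actual implementation.

```python
# Columns present in adata.obs, as shown in the AnnData repr above.
obs_columns = ["donor", "tissue", "cell_type", "assay", "sex_ontology_term_id"]

# Columns the validator inspects (keys of `validator.obs_fields` in this guide).
expected_columns = [
    "assay",
    "cell_type",
    "development_stage",
    "disease",
    "donor_id",
    "self_reported_ethnicity",
    "sex_ontology_term_id",
    "suspension_type",
    "tissue",
    "tissue_type",
    "organism",
]


def missing_columns(expected, present):
    """Return the expected column names that are absent, keeping the expected order."""
    present_set = set(present)
    return [name for name in expected if name not in present_set]


missing = missing_columns(expected_columns, obs_columns)
print(missing)
```

Note that 'donor_id' is reported as missing only because the column is named 'donor' in this dataset.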

Let’s fix the “donor_id” column name:

adata.obs.rename(columns={"donor": "donor_id"}, inplace=True)

For the missing columns, we can pass the default values suggested by CELLxGENE:

CellxGeneFields.OBS_FIELD_DEFAULTS
{'disease': 'normal',
 'development_stage': 'unknown',
 'self_reported_ethnicity': 'unknown',
 'suspension_type': 'cell',
 'donor_id': 'na',
 'tissue_type': 'tissue',
 'cell_type': 'native_cell',
 'sex': 'unknown'}
validator = Validator(adata, organism="human", **CellxGeneFields.OBS_FIELD_DEFAULTS)
💡 added defaults to the AnnData object: {'organism': 'human', 'disease': 'normal', 'development_stage': 'unknown', 'self_reported_ethnicity': 'unknown', 'suspension_type': 'cell', 'tissue_type': 'tissue'}
✅ registered 1 records without reference: ['sex_ontology_term_id']
✅ registered 10 records from laminlabs/cellxgene: ['assay', 'cell_type', 'development_stage', 'disease', 'donor_id', 'self_reported_ethnicity', 'tissue', 'organism', 'tissue_type', 'suspension_type']
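
Only some of the defaults were added: 'donor_id', 'cell_type' and 'sex' were skipped because the object already carries that information. A minimal sketch of this behaviour, under the assumption that a field counts as present when either its plain name or its `<name>_ontology_term_id` column exists. The helper `defaults_to_add` is hypothetical; the real logic lives inside `Validator`.

```python
# Defaults as printed by CellxGeneFields.OBS_FIELD_DEFAULTS above.
OBS_FIELD_DEFAULTS = {
    "disease": "normal",
    "development_stage": "unknown",
    "self_reported_ethnicity": "unknown",
    "suspension_type": "cell",
    "donor_id": "na",
    "tissue_type": "tissue",
    "cell_type": "native_cell",
    "sex": "unknown",
}


def defaults_to_add(obs_columns, defaults):
    """Return the defaults for fields that have no corresponding obs column yet."""
    present = set(obs_columns)
    added = {}
    for name, value in defaults.items():
        # Assumption: an ontology-ID column also satisfies the field.
        if name in present or f"{name}_ontology_term_id" in present:
            continue
        added[name] = value
    return added


# obs columns after renaming 'donor' to 'donor_id'
obs_columns = ["donor_id", "tissue", "cell_type", "assay", "sex_ontology_term_id"]
added = defaults_to_add(obs_columns, {"organism": "human", **OBS_FIELD_DEFAULTS})
print(added)
```

Under these assumptions the result contains the same six fields as the "added defaults" message above.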
validator.obs_fields
{'assay': FieldAttr(ExperimentalFactor.name),
 'cell_type': FieldAttr(CellType.name),
 'development_stage': FieldAttr(DevelopmentalStage.name),
 'disease': FieldAttr(Disease.name),
 'donor_id': FieldAttr(ULabel.name),
 'self_reported_ethnicity': FieldAttr(Ethnicity.name),
 'sex_ontology_term_id': FieldAttr(Phenotype.ontology_id),
 'suspension_type': FieldAttr(ULabel.name),
 'tissue': FieldAttr(Tissue.name),
 'tissue_type': FieldAttr(ULabel.name),
 'organism': FieldAttr(Organism.name)}
validated = validator.validate()
💡 inspecting 'variables' by Gene.ensembl_gene_id
❗    123 terms are not validated: 'ENSG00000269933', 'ENSG00000261737', 'ENSG00000259834', 'ENSG00000256374', 'ENSG00000263464', 'ENSG00000203812', 'ENSG00000272196', 'ENSG00000272880', 'ENSG00000270188', 'ENSG00000287116', 'ENSG00000237133', 'ENSG00000224739', 'ENSG00000227902', 'ENSG00000239467', 'ENSG00000272551', 'ENSG00000280374', 'ENSG00000236886', 'ENSG00000229352', 'ENSG00000286601', 'ENSG00000227021', ...
      → register terms via `.register_labels('variables')`
💡 inspecting 'assay' by ExperimentalFactor.name
❗    3 terms are not validated: '10x 3' v3', '10x 5' v2', '10x 5' v1'
      → register terms via `.register_labels('assay')`
💡 inspecting 'cell_type' by CellType.name
✅    all cell_types are validated
💡 inspecting 'development_stage' by DevelopmentalStage.name
❗    1 terms is not validated: 'unknown'
      → register terms via `.register_labels('development_stage')`
💡 inspecting 'disease' by Disease.name
❗    1 terms is not validated: 'normal'
      → register terms via `.register_labels('disease')`
💡 inspecting 'donor_id' by ULabel.name
❗    12 terms are not validated: 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1'
      → register terms via `.register_labels('donor_id')`
💡 inspecting 'self_reported_ethnicity' by Ethnicity.name
❗    1 terms is not validated: 'unknown'
      → register terms via `.register_labels('self_reported_ethnicity')`
💡 inspecting 'sex_ontology_term_id' by Phenotype.ontology_id
❗    1 terms is not validated: 'PATO:0000384'
      → register terms via `.register_labels('sex_ontology_term_id')`
💡 inspecting 'suspension_type' by ULabel.name
❗    1 terms is not validated: 'cell'
      → register terms via `.register_labels('suspension_type')`
💡 inspecting 'tissue' by Tissue.name
❗    17 terms are not validated: 'blood', 'thoracic lymph node', 'spleen', 'lungg', 'mesenteric lymph node', 'lamina propria', 'liver', 'jejunal epithelium', 'omentum', 'bone marrow', 'ileum', 'caecum', 'thymus', 'skeletal muscle tissue', 'duodenum', 'sigmoid colon', 'transverse colon'
      → register terms via `.register_labels('tissue')`
💡 inspecting 'tissue_type' by ULabel.name
❗    1 terms is not validated: 'tissue'
      → register terms via `.register_labels('tissue_type')`
💡 inspecting 'organism' by Organism.name
✅    all organisms are validated
validated
False
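
Conceptually, each column is checked against the terms already registered in the current instance, and the object validates only if no column has unknown terms. A minimal sketch of that idea, using in-memory sets as stand-ins for the registries. The helper `inspect_terms` and the `registries` dictionary are hypothetical; the two columns mirror the 'organism' and 'suspension_type' results above.

```python
def inspect_terms(values, registered_names):
    """Return the unique values not found in the registry, in first-seen order."""
    known = set(registered_names)
    non_validated = []
    for value in values:
        if value not in known and value not in non_validated:
            non_validated.append(value)
    return non_validated


# Stand-ins for the instance registries: 'human' is registered, 'cell' is not yet.
registries = {"organism": {"human"}, "suspension_type": set()}

# Stand-ins for two obs columns.
columns = {
    "organism": ["human", "human", "human"],
    "suspension_type": ["cell", "cell", "cell"],
}

report = {name: inspect_terms(values, registries[name]) for name, values in columns.items()}
validated = all(not terms for terms in report.values())

print(report)
print(validated)
```

A single non-validated term in any column is enough for the overall result to be `False`, which is why the next section registers the missing labels.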

Register new metadata labels#

Follow the suggestions above to register the genes and labels that aren’t yet present in the current instance:

(Note that our instance is rather empty. Once you have filled up the registries, you will rarely need to register new labels.)

validator.register_labels(feature="all")
💡 registering labels for 'variables'
✅ registered 123 records from laminlabs/cellxgene: ['ENSG00000112096', 'ENSG00000182230', 'ENSG00000203812', 'ENSG00000204092', 'ENSG00000215271', 'ENSG00000221995', 'ENSG00000224739', 'ENSG00000224745', 'ENSG00000225932', 'ENSG00000226377', 'ENSG00000226380', 'ENSG00000226403', 'ENSG00000227021', 'ENSG00000227220', 'ENSG00000227902', 'ENSG00000228139', 'ENSG00000228906', 'ENSG00000229352', 'ENSG00000231575', 'ENSG00000232196', 'ENSG00000232295', 'ENSG00000233776', 'ENSG00000236166', 'ENSG00000236673', 'ENSG00000236740', 'ENSG00000236886', 'ENSG00000236996', 'ENSG00000237133', 'ENSG00000237513', 'ENSG00000237548', 'ENSG00000237838', 'ENSG00000239446', 'ENSG00000239467', 'ENSG00000239665', 'ENSG00000244693', 'ENSG00000244952', 'ENSG00000249860', 'ENSG00000251044', 'ENSG00000253878', 'ENSG00000254561', 'ENSG00000254740', 'ENSG00000255823', 'ENSG00000256045', 'ENSG00000256222', 'ENSG00000256374', 'ENSG00000256427', 'ENSG00000256618', 'ENSG00000256892', 'ENSG00000258414', 'ENSG00000258808', 'ENSG00000258861', 'ENSG00000259444', 'ENSG00000259820', 'ENSG00000259834', 'ENSG00000259855', 'ENSG00000260461', 'ENSG00000261068', 'ENSG00000261438', 'ENSG00000261490', 'ENSG00000261534', 'ENSG00000261737', 'ENSG00000261773', 'ENSG00000262668', 'ENSG00000263464', 'ENSG00000267637', 'ENSG00000268955', 'ENSG00000269028', 'ENSG00000269900', 'ENSG00000269933', 'ENSG00000270188', 'ENSG00000270394', 'ENSG00000270672', 'ENSG00000271409', 'ENSG00000271734', 'ENSG00000271870', 'ENSG00000272040', 'ENSG00000272196', 'ENSG00000272267', 'ENSG00000272354', 'ENSG00000272370', 'ENSG00000272551', 'ENSG00000272567', 'ENSG00000272880', 'ENSG00000273301', 'ENSG00000273370', 'ENSG00000273496', 'ENSG00000273554', 'ENSG00000273576', 'ENSG00000273837', 'ENSG00000273888', 'ENSG00000273923', 'ENSG00000274175', 'ENSG00000274792', 'ENSG00000275249', 'ENSG00000275869', 'ENSG00000276017', 'ENSG00000276814', 'ENSG00000277050', 'ENSG00000277196', 'ENSG00000277352', 'ENSG00000277666', 'ENSG00000277761',
'ENSG00000277836', 'ENSG00000278198', 'ENSG00000278633', 'ENSG00000278782', 'ENSG00000278817', 'ENSG00000278927', 'ENSG00000278955', 'ENSG00000280095', 'ENSG00000280374', 'ENSG00000280710', 'ENSG00000282080', 'ENSG00000282965', 'ENSG00000285106', 'ENSG00000285162', 'ENSG00000286228', 'ENSG00000286601', 'ENSG00000286699', 'ENSG00000286949', 'ENSG00000286996', 'ENSG00000287116', 'ENSG00000287388']
💡 registering labels for 'assay'
✅ registered 3 records from laminlabs/cellxgene: ["10x 5' v1", "10x 5' v2", "10x 3' v3"]
💡 registering labels for 'cell_type'
💡 registering labels for 'development_stage'
✅ registered 1 records from laminlabs/cellxgene: ['unknown']
💡 registering labels for 'disease'
✅ registered 1 records from laminlabs/cellxgene: ['normal']
💡 registering labels for 'donor_id'
❗ 12 non-validated labels are not registered: ['D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1']!
      → to lookup categories, use `.lookup().{feature_name}`
      → to register, set `validated_only=False`
💡 registering labels for 'self_reported_ethnicity'
✅ registered 1 records from laminlabs/cellxgene: ['unknown']
💡 registering labels for 'sex_ontology_term_id'
✅ registered 1 records from laminlabs/cellxgene: ['PATO:0000384']
💡 registering labels for 'suspension_type'
✅ registered 1 records from laminlabs/cellxgene: ['cell']
💡 registering labels for 'tissue'
❗ 1 non-validated labels are not registered: ['lungg']!
      → to lookup categories, use `.lookup().{feature_name}`
      → to register, set `validated_only=False`
✅ registered 16 records from laminlabs/cellxgene: ['spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria', 'mesenteric lymph node', 'caecum', 'omentum', 'blood', 'ileum', 'thoracic lymph node']
💡 registering labels for 'tissue_type'
✅ registered 1 records from laminlabs/cellxgene: ['tissue']
💡 registering labels for 'organism'
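
The output shows two outcomes: labels found in the reference instance (laminlabs/cellxgene) are registered, while labels unknown to it are skipped unless `validated_only=False` is set. A minimal sketch of that rule with plain sets. The function `register_labels_sketch`, the `local` and `reference` sets, and the shortened tissue list are illustrative stand-ins, not the library's implementation.

```python
def register_labels_sketch(values, local, reference, validated_only=True):
    """Add new labels to `local`; return (registered, skipped) lists."""
    registered, skipped = [], []
    for value in dict.fromkeys(values):  # unique values, original order
        if value in local:
            continue  # already registered, nothing to do
        if value in reference or not validated_only:
            local.add(value)
            registered.append(value)
        else:
            skipped.append(value)
    return registered, skipped


# Shortened stand-in for the reference tissue names.
reference_tissues = {"blood", "spleen", "lung", "liver"}
local_tissues = set()

tissue_result = register_labels_sketch(
    ["blood", "spleen", "lungg", "liver"], local_tissues, reference_tissues
)
print(tissue_result)

# Donor IDs have no reference, so they need validated_only=False.
local_donors = set()
donor_result = register_labels_sketch(
    ["D496-1", "621B-1"], local_donors, reference=set(), validated_only=False
)
print(donor_result)
```

Keeping `validated_only=True` as the default is what catches the typo 'lungg' instead of silently registering it as a new tissue.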

For donors, we register the new labels:

validator.register_labels(feature="donor_id", validated_only=False)
✅ registered 12 records without reference: ['D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1']

The tissue label “lungg” could not be registered because it is a typo of “lung”. Let’s fix it:

tissues = validator.lookup()["tissue"]
Lookup objects from the laminlabs/cellxgene
# using a lookup object to find the correct term
tissues.lung
Tissue(uid='7Tt4iEKc', name='lung', ontology_id='UBERON:0002048', synonyms='pulmo', description='Respiration Organ That Develops As An Outpocketing Of The Esophagus.', updated_at=2024-01-08 15:22:49 UTC, public_source_id=47, created_by_id=1)
adata.obs["tissue"] = adata.obs["tissue"].cat.rename_categories({"lungg": tissues.lung.name})
validator.register_labels('tissue')
✅ registered 1 records from laminlabs/cellxgene: ['lung']
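
Here the correct term was found by hand with a lookup object. When many labels fail, a fuzzy match can help shortlist candidates before looking them up. A minimal sketch using the standard library's `difflib`, which is a swapped-in technique and not something the validator does; the candidate list is a shortened stand-in for the tissue names in the registry and `suggest_term` is a hypothetical helper.

```python
import difflib

# Shortened stand-in for the tissue names available in the registry.
tissue_names = ["blood", "thoracic lymph node", "spleen", "lung", "liver", "thymus", "ileum"]


def suggest_term(term, candidates, cutoff=0.8):
    """Return the closest candidate above the similarity cutoff, or None."""
    matches = difflib.get_close_matches(term, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else None


suggestion = suggest_term("lungg", tissue_names)
print(suggestion)
```

A suggestion like this should still be confirmed against the registry, for example via `validator.lookup()`, before renaming categories.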

Let’s validate the object again:

validated = validator.validate()
💡 inspecting 'variables' by Gene.ensembl_gene_id
✅    all variabless are validated
💡 inspecting 'assay' by ExperimentalFactor.name
✅    all assays are validated
💡 inspecting 'cell_type' by CellType.name
✅    all cell_types are validated
💡 inspecting 'development_stage' by DevelopmentalStage.name
✅    all development_stages are validated
💡 inspecting 'disease' by Disease.name
✅    all diseases are validated
💡 inspecting 'donor_id' by ULabel.name
✅    all donor_ids are validated
💡 inspecting 'self_reported_ethnicity' by Ethnicity.name
✅    all self_reported_ethnicitys are validated
💡 inspecting 'sex_ontology_term_id' by Phenotype.ontology_id
✅    all sex_ontology_term_ids are validated
💡 inspecting 'suspension_type' by ULabel.name
✅    all suspension_types are validated
💡 inspecting 'tissue' by Tissue.name
✅    all tissues are validated
💡 inspecting 'tissue_type' by ULabel.name
✅    all tissue_types are validated
💡 inspecting 'organism' by Organism.name
✅    all organisms are validated
validated
True
adata.obs.head()
| | donor_id | tissue | cell_type | assay | sex_ontology_term_id | organism | disease | development_stage | self_reported_ethnicity | suspension_type | tissue_type |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| CZINY-0109_CTGGTCTAGTCTGTAC | D496-1 | blood | classical monocyte | 10x 3' v3 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| CZI-IA10244332+CZI-IA10244434_CCTTCGACATACTCTT | 621B-1 | thoracic lymph node | T follicular helper cell | 10x 5' v2 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| Pan_T7935491_CTGGTCTGTACATGTC | A29-1 | spleen | memory B cell | 10x 5' v1 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| Pan_T7980367_GGGCATCCAGGTGGAT | A36-1 | lung | alveolar macrophage | 10x 5' v1 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| Pan_T7935494_ATCATGGTCTACCTGC | A29-1 | mesenteric lymph node | naive thymus-derived CD4-positive, alpha-beta ... | 10x 5' v1 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |

Register file#

Now we are ready to register the artifact to the working instance:

# track the current notebook
ln.transform.stem_uid = "WOK3vP0bNGLx"
ln.transform.version = "0"
ln.track()
💡 Assuming editor is Jupyter Lab.
💡 notebook imports: cellxgene_lamin_validator==0.3.2 lamindb==0.69.1
💡 saved: Transform(uid='WOK3vP0bNGLx6K79', name='Validate and register an h5ad file based on CELLxGENE schema', key='cellxgene-lamin-validator', version='0', type=notebook, updated_at=2024-03-19 09:16:07 UTC, created_by_id=1)
💡 saved: Run(uid='jqGVooyHTGL7IefKmU4l', started_at=2024-03-19 09:16:07 UTC, transform_id=1, created_by_id=1)
💡 tracked pip freeze > /home/runner/.cache/lamindb/run_env_pip_jqGVooyHTGL7IefKmU4l.txt
# this will modify the AnnData object by adding required columns and categories
artifact = validator.register_artifact(description="test h5ad file")
... storing 'organism' as categorical
... storing 'disease' as categorical
... storing 'development_stage' as categorical
... storing 'self_reported_ethnicity' as categorical
... storing 'suspension_type' as categorical
... storing 'tissue_type' as categorical
💡 path content will be copied to default storage upon `save()` with key `None` ('.lamindb/OfQUHid4mEQLh8dXEiaD.h5ad')
✅ storing artifact 'OfQUHid4mEQLh8dXEiaD' at '/home/runner/work/cellxgene-lamin-validator/cellxgene-lamin-validator/docs/test-cellxgene-lamin-validator/.lamindb/OfQUHid4mEQLh8dXEiaD.h5ad'
💡 parsing feature names of X stored in slot 'var'
✅    36503 terms (100.00%) are validated for ensembl_gene_id
✅    linked: FeatureSet(uid='H5eD9WaI6YQUFUliRgRQ', n=36503, type='number', registry='bionty.Gene', hash='xtVNbbhs3ty63qs-rwKZ', created_by_id=1)
💡 parsing feature names of slot 'obs'
✅    11 terms (100.00%) are validated for name
❗    loaded Feature record with same name: 'donor_id' (disable via ln.settings.upon_create_search_names)
❗    loaded Feature record with same name: 'tissue' (disable via ln.settings.upon_create_search_names)
❗    loaded Feature record with same name: 'cell_type' (disable via ln.settings.upon_create_search_names)
❗    loaded Feature record with same name: 'assay' (disable via ln.settings.upon_create_search_names)
❗    loaded Feature record with same name: 'sex_ontology_term_id' (disable via ln.settings.upon_create_search_names)
❗    loaded Feature record with same name: 'organism' (disable via ln.settings.upon_create_search_names)
❗    loaded Feature record with same name: 'disease' (disable via ln.settings.upon_create_search_names)
❗    loaded Feature record with same name: 'development_stage' (disable via ln.settings.upon_create_search_names)
❗    loaded Feature record with same name: 'self_reported_ethnicity' (disable via ln.settings.upon_create_search_names)
❗    loaded Feature record with same name: 'suspension_type' (disable via ln.settings.upon_create_search_names)
❗    loaded Feature record with same name: 'tissue_type' (disable via ln.settings.upon_create_search_names)
βœ…    linked: FeatureSet(uid='bt6wBntDEp4pywGjU5BH', n=11, registry='core.Feature', hash='dpUmFC9xFN7HaXqqVx08', created_by_id=1)
βœ… saved 2 feature sets for slots: 'var','obs'
βœ… linked feature 'sex_ontology_term_id' to registry 'bionty.Phenotype'
πŸŽ‰ successfully registered artifact in LaminDB!
view it in the hub: https://lamin.ai/testuser1/test-cellxgene-lamin-validator/artifact/OfQUHid4mEQLh8dXEiaD

View the registered artifact with metadata:

artifact.describe()
Artifact(uid='OfQUHid4mEQLh8dXEiaD', suffix='.h5ad', accessor='AnnData', description='test h5ad file', size=54727155, hash='5esmrdu-DFv9nKyK4ZFA0G', hash_type='sha1-fl', n_observations=1626, visibility=1, key_is_virtual=True, updated_at=2024-03-19 09:16:13 UTC)

Provenance:
  πŸ—ƒοΈ storage: Storage(uid='KD8ixw5l', root='/home/runner/work/cellxgene-lamin-validator/cellxgene-lamin-validator/docs/test-cellxgene-lamin-validator', type='local', updated_at=2024-03-19 09:14:05 UTC, created_by_id=1)
  πŸ’« transform: Transform(uid='WOK3vP0bNGLx6K79', name='Validate and register an h5ad file based on CELLxGENE schema', key='cellxgene-lamin-validator', version='0', type=notebook, updated_at=2024-03-19 09:16:07 UTC, created_by_id=1)
  πŸ‘£ run: Run(uid='jqGVooyHTGL7IefKmU4l', started_at=2024-03-19 09:16:07 UTC, transform_id=1, created_by_id=1)
  πŸ‘€ created_by: User(uid='DzTjkKse', handle='testuser1', name='Test User1', updated_at=2024-03-19 09:14:05 UTC)
Features:
  var: FeatureSet(uid='H5eD9WaI6YQUFUliRgRQ', n=36503, type='number', registry='bionty.Gene', hash='xtVNbbhs3ty63qs-rwKZ', updated_at=2024-03-19 09:16:11 UTC, created_by_id=1)
    'MIR1302-2HG', 'FAM138A', 'OR4F5', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'OR4F29', 'None', 'OR4F16', 'None', 'LINC01409', 'FAM87B', 'LINC01128', 'LINC00115', 'FAM41C', 'None', ...
  obs: FeatureSet(uid='bt6wBntDEp4pywGjU5BH', n=11, registry='core.Feature', hash='dpUmFC9xFN7HaXqqVx08', updated_at=2024-03-19 09:16:13 UTC, created_by_id=1)
    🔗 assay (3, bionty.ExperimentalFactor): '10x 5' v1', '10x 5' v2', '10x 3' v3'
    🔗 cell_type (31, bionty.CellType): 'classical monocyte', 'T follicular helper cell', 'memory B cell', 'alveolar macrophage', 'naive thymus-derived CD4-positive, alpha-beta T cell', 'effector memory CD8-positive, alpha-beta T cell, terminally differentiated', 'alpha-beta T cell', 'CD4-positive helper T cell', 'naive thymus-derived CD8-positive, alpha-beta T cell', 'macrophage', ...
    🔗 development_stage (1, bionty.DevelopmentalStage): 'unknown'
    🔗 disease (1, bionty.Disease): 'normal'
    🔗 donor_id (12, core.ULabel): 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', ...
    🔗 self_reported_ethnicity (1, bionty.Ethnicity): 'unknown'
    🔗 tissue (17, bionty.Tissue): 'spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria', ...
    🔗 organism (1, bionty.Organism): 'human'
    🔗 tissue_type (1, core.ULabel): 'tissue'
    🔗 suspension_type (1, core.ULabel): 'cell'
    🔗 sex_ontology_term_id (1, bionty.Phenotype): 'male'
Labels:
  🏷️ organism (1, bionty.Organism): 'human'
  🏷️ tissues (17, bionty.Tissue): 'spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria', ...
  🏷️ cell_types (31, bionty.CellType): 'classical monocyte', 'T follicular helper cell', 'memory B cell', 'alveolar macrophage', 'naive thymus-derived CD4-positive, alpha-beta T cell', 'effector memory CD8-positive, alpha-beta T cell, terminally differentiated', 'alpha-beta T cell', 'CD4-positive helper T cell', 'naive thymus-derived CD8-positive, alpha-beta T cell', 'macrophage', ...
  🏷️ diseases (1, bionty.Disease): 'normal'
  🏷️ phenotypes (1, bionty.Phenotype): 'male'
  🏷️ experimental_factors (3, bionty.ExperimentalFactor): '10x 5' v1', '10x 5' v2', '10x 3' v3'
  🏷️ developmental_stages (1, bionty.DevelopmentalStage): 'unknown'
  🏷️ ethnicities (1, bionty.Ethnicity): 'unknown'
  🏷️ ulabels (14, core.ULabel): 'cell', 'tissue', 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', ...

Register collection#

Register a new collection for the registered artifact:

# register a new collection
collection = validator.register_collection(
    artifact,  # registered artifact above, can also pass a list of artifacts
    name="Cross-tissue immune cell analysis reveals tissue-specific features in humans (for test demo only)",  # title of the publication
    description="10.1126/science.abl5197",  # DOI of the publication
    reference="E-MTAB-11536",  # accession number (e.g. GSE#, E-MTAB#, etc.)
    reference_type="ArrayExpress",  # source type (e.g. GEO, ArrayExpress, SRA, etc.)
)
🎉 successfully registered collection in LaminDB!
view it in the hub: https://lamin.ai/testuser1/test-cellxgene-lamin-validator/collection/OfQUHid4mEQLh8dXEiaD
collection.artifact
Artifact(uid='OfQUHid4mEQLh8dXEiaD', suffix='.h5ad', accessor='AnnData', description='test h5ad file', size=54727155, hash='5esmrdu-DFv9nKyK4ZFA0G', hash_type='sha1-fl', n_observations=1626, visibility=1, key_is_virtual=True, updated_at=2024-03-19 09:16:15 UTC, storage_id=1, transform_id=1, run_id=1, created_by_id=1)
artifact.collection
Collection(uid='OfQUHid4mEQLh8dXEiaD', name='Cross-tissue immune cell analysis reveals tissue-specific features in humans (for test demo only)', description='10.1126/science.abl5197', hash='5esmrdu-DFv9nKyK4ZFA0G', reference='E-MTAB-11536', reference_type='ArrayExpress', visibility=1, updated_at=2024-03-19 09:16:15 UTC, transform_id=1, run_id=1, artifact_id=1, created_by_id=1)
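
The `reference` and `reference_type` arguments above pair an accession number with the repository it comes from. As a minimal sketch, assuming you want to derive the repository name from the accession prefix, the helper below maps the prefixes mentioned in the comments to a source type. `PREFIX_TO_SOURCE` and `infer_reference_type` are hypothetical names for illustration and are not part of cellxgene-lamin-validator.

```python
# Hypothetical prefix table (not part of cellxgene-lamin-validator);
# extend it with the repositories you actually use.
PREFIX_TO_SOURCE = {
    "GSE": "GEO",
    "E-MTAB": "ArrayExpress",
    "SRP": "SRA",
}


def infer_reference_type(accession):
    """Return the source type for an accession, or None if the prefix is unknown."""
    normalized = accession.strip().upper()
    for prefix, source in PREFIX_TO_SOURCE.items():
        if normalized.startswith(prefix):
            return source
    return None


# Keyword arguments mirroring the register_collection call shown above.
reference = "E-MTAB-11536"
collection_kwargs = {
    "description": "10.1126/science.abl5197",
    "reference": reference,
    "reference_type": infer_reference_type(reference),
}
print(collection_kwargs)
```

An unknown prefix returns `None`, which leaves the decision to you rather than guessing a repository.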

Return an input h5ad file for cellxgene-schema#

adata_cxg = validator.to_cellxgene(is_primary_data=True)
adata_cxg
AnnData object with n_obs × n_vars = 1626 × 36503
    obs: 'donor_id', 'sex_ontology_term_id', 'suspension_type', 'tissue_type', 'tissue_ontology_term_id', 'cell_type_ontology_term_id', 'assay_ontology_term_id', 'organism_ontology_term_id', 'disease_ontology_term_id', 'development_stage_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'is_primary_data'
    var: 'feature_is_filtered'
    uns: 'default_embedding', 'title', 'schema_reference', 'schema_version'
    obsm: 'X_umap'
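
Comparing the `obs` columns listed by `artifact.describe()` with those of `adata_cxg` suggests a naming pattern: ontology-backed columns gain an `_ontology_term_id` suffix, a few columns keep their names, and `is_primary_data` is added. The sketch below checks that reading against the column list printed above. The grouping into `ONTOLOGY_BACKED` and `UNCHANGED` is inferred from these two listings only and is an assumption, not taken from the validator's source code.

```python
# Column groups inferred from the outputs shown above (an assumption,
# not read from the cellxgene-lamin-validator implementation).
ONTOLOGY_BACKED = [
    "tissue",
    "cell_type",
    "assay",
    "organism",
    "disease",
    "development_stage",
    "self_reported_ethnicity",
]
UNCHANGED = ["donor_id", "sex_ontology_term_id", "suspension_type", "tissue_type"]


def expected_export_columns():
    """Build the obs column names expected after export under the inferred pattern."""
    columns = list(UNCHANGED)
    columns += [f"{name}_ontology_term_id" for name in ONTOLOGY_BACKED]
    columns.append("is_primary_data")
    return columns


# obs columns of adata_cxg, copied from the printed AnnData summary.
exported = [
    "donor_id",
    "sex_ontology_term_id",
    "suspension_type",
    "tissue_type",
    "tissue_ontology_term_id",
    "cell_type_ontology_term_id",
    "assay_ontology_term_id",
    "organism_ontology_term_id",
    "disease_ontology_term_id",
    "development_stage_ontology_term_id",
    "self_reported_ethnicity_ontology_term_id",
    "is_primary_data",
]

missing = sorted(set(expected_export_columns()) - set(exported))
unexpected = sorted(set(exported) - set(expected_export_columns()))
print("missing:", missing)
print("unexpected:", unexpected)
```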
adata_cxg.write_h5ad("anndata_human_immune_cells_cxg.h5ad")
!cellxgene-schema validate anndata_human_immune_cells_cxg.h5ad
Loading dependencies
Loading validator modules
Starting validation...
WARNING: Validation of raw layer was not performed due to current errors, try again after fixing current errors.
ERROR: Column 'schema_version' is a reserved column name of 'uns'. Remove it from h5ad and try again.
ERROR: Column 'schema_reference' is a reserved column name of 'uns'. Remove it from h5ad and try again.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'raw.var'.
Validation complete in 0:00:00.802560 with status is_valid=False
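
The report above contains only two kinds of errors: reserved keys in `uns` and feature IDs that cellxgene-schema rejects in `var` and `raw.var`. A minimal sketch for summarizing such a report, assuming you have captured the command output as text: the patterns below are written against the message wording shown here, and `summarize_errors` is a hypothetical helper, not part of cellxgene-schema or cellxgene-lamin-validator.

```python
import re

# A few lines copied from the cellxgene-schema output shown above.
log = """\
ERROR: Column 'schema_version' is a reserved column name of 'uns'. Remove it from h5ad and try again.
ERROR: Column 'schema_reference' is a reserved column name of 'uns'. Remove it from h5ad and try again.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
"""

# Patterns match the message wording of this report; other versions may differ.
RESERVED = re.compile(r"^ERROR: Column '([^']+)' is a reserved column name of '([^']+)'")
INVALID_ID = re.compile(r"^ERROR: '([^']+)' is not a valid feature ID in '([^']+)'")


def summarize_errors(text):
    """Group reserved names and invalid feature IDs by the slot they were reported in."""
    reserved = {}  # slot -> reserved names, e.g. {"uns": [...]}
    invalid = {}   # slot -> invalid feature IDs, e.g. {"var": [...]}
    for line in text.splitlines():
        match = RESERVED.match(line)
        if match:
            reserved.setdefault(match.group(2), []).append(match.group(1))
            continue
        match = INVALID_ID.match(line)
        if match:
            invalid.setdefault(match.group(2), []).append(match.group(1))
    return reserved, invalid


reserved, invalid = summarize_errors(log)
print(reserved)
print({slot: len(ids) for slot, ids in invalid.items()})
```

Grouping by slot makes it easy to see that the same IDs are reported for both `var` and `raw.var`, so a fix has to cover both slots.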