Validate and register an h5ad file based on CELLxGENE schema#
This guide shows how to validate and curate an AnnData object using the metadata registries of laminlabs/cellxgene, based on the CELLxGENE schema version 5.1.0.
The validated object can subsequently be registered as an artifact in your LaminDB instance.
Note
The cellxgene-lamin-validator is primarily designed to validate that all metadata adheres to the ontologies. It does not reimplement every rule of the CELLxGENE schema, so we recommend also running cellxgene-schema if full adherence beyond metadata is required.
Set up#
Initialize (or load) the LaminDB instance in which to register the validated AnnData:
!lamin init --storage ./test-cellxgene-lamin-validator --schema bionty
💡 connected lamindb: testuser1/test-cellxgene-lamin-validator
import lamindb as ln
from cellxgene_lamin_validator import Validator, datasets, CellxGeneFields
ln.settings.verbosity = "hint"
💡 connected lamindb: testuser1/test-cellxgene-lamin-validator
❗ Full backed capabilities are not available for this version of anndata, please install anndata>=0.9.1.
An h5ad file#
Let’s start with an AnnData object that we’d like to inspect and curate:
adata = datasets.anndata_human_immune_cells(populate_registries=True)
adata
AnnData object with n_obs × n_vars = 1626 × 36503
obs: 'donor', 'tissue', 'cell_type', 'assay', 'sex_ontology_term_id'
var: 'feature_is_filtered'
uns: 'default_embedding'
obsm: 'X_umap'
adata.write_h5ad("anndata_human_immune_cells.h5ad")
... storing 'donor' as categorical
... storing 'sex_ontology_term_id' as categorical
!cellxgene-schema validate anndata_human_immune_cells.h5ad
Loading dependencies
Loading validator modules
Starting validation...
WARNING: Validation of raw layer was not performed due to current errors, try again after fixing current errors.
ERROR: Add labels error: Column 'cell_type' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: Add labels error: Column 'assay' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: Add labels error: Column 'tissue' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: 'title' in 'uns' is not present.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'raw.var'.
ERROR: Dataframe 'obs' is missing column 'cell_type_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'assay_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'disease_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'organism_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'tissue_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'self_reported_ethnicity_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'development_stage_ontology_term_id'.
ERROR: Dataframe 'obs' is missing column 'is_primary_data'.
ERROR: Dataframe 'obs' is missing column 'donor_id'.
ERROR: Dataframe 'obs' is missing column 'suspension_type'.
ERROR: Dataframe 'obs' is missing column 'tissue_type'.
Validation complete in 0:00:00.483019 with status is_valid=False
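With more than two hundred ERROR lines, the output above is easier to triage when grouped by message type. The following is a minimal sketch and not part of cellxgene-schema or the validator: it assumes the log has been captured as text (here, a few lines copied from above) and counts messages after replacing Ensembl gene IDs with a placeholder. The function name `summarize` is hypothetical.

```python
import re
from collections import Counter

# A few lines copied from the cellxgene-schema output above.
log = """\
ERROR: Add labels error: Column 'cell_type' is a reserved column name of 'obs'. Remove it from h5ad and try again.
ERROR: 'title' in 'uns' is not present.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: Dataframe 'obs' is missing column 'donor_id'.
"""


def summarize(log_text):
    """Count ERROR messages, treating all Ensembl gene IDs as one placeholder."""
    counts = Counter()
    prefix = "ERROR: "
    for line in log_text.splitlines():
        if line.startswith(prefix):
            # Collapse gene-specific messages into a single template.
            template = re.sub(r"'ENSG\d+'", "'<gene>'", line[len(prefix):])
            counts[template] += 1
    return counts


summary = summarize(log)
for template, n in summary.most_common():
    print(f"{n:>3}  {template}")
```

Grouping this way makes it clear that the errors fall into a handful of classes: reserved column names, invalid feature IDs, and missing required columns.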
Validate and curate metadata#
Validate the AnnData object:
try:
    validator = Validator(adata)
except Exception as e:
    print(e)
columns ['development_stage', 'disease', 'donor_id', 'self_reported_ethnicity', 'suspension_type', 'tissue_type', 'organism'] are not found in the AnnData object!
Let’s fix the 'donor_id' column name:
adata.obs.rename(columns={"donor": "donor_id"}, inplace=True)
For the missing columns, we can pass the default values suggested by CELLxGENE:
CellxGeneFields.OBS_FIELD_DEFAULTS
{'disease': 'normal',
'development_stage': 'unknown',
'self_reported_ethnicity': 'unknown',
'suspension_type': 'cell',
'donor_id': 'na',
'tissue_type': 'tissue',
'cell_type': 'native_cell',
'sex': 'unknown'}
validator = Validator(adata, organism="human", **CellxGeneFields.OBS_FIELD_DEFAULTS)
💡 added defaults to the AnnData object: {'organism': 'human', 'disease': 'normal', 'development_stage': 'unknown', 'self_reported_ethnicity': 'unknown', 'suspension_type': 'cell', 'tissue_type': 'tissue'}
✅ registered 1 records without reference: ['sex_ontology_term_id']
✅ registered 10 records from laminlabs/cellxgene: ['assay', 'cell_type', 'development_stage', 'disease', 'donor_id', 'self_reported_ethnicity', 'tissue', 'organism', 'tissue_type', 'suspension_type']
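To see why `organism` is passed separately from the defaults, it can help to work through the column bookkeeping by hand. The sketch below uses only plain Python sets with values copied from the outputs above; it does not call the validator, and the variable names are illustrative.

```python
# Columns of adata.obs after renaming 'donor' to 'donor_id'.
obs_columns = {"donor_id", "tissue", "cell_type", "assay", "sex_ontology_term_id"}

# Columns reported as missing by the first Validator(adata) call.
reported_missing = {
    "development_stage", "disease", "donor_id", "self_reported_ethnicity",
    "suspension_type", "tissue_type", "organism",
}

# CellxGeneFields.OBS_FIELD_DEFAULTS as printed above.
defaults = {
    "disease": "normal",
    "development_stage": "unknown",
    "self_reported_ethnicity": "unknown",
    "suspension_type": "cell",
    "donor_id": "na",
    "tissue_type": "tissue",
    "cell_type": "native_cell",
    "sex": "unknown",
}

# Columns that are still absent once the rename has been applied.
still_missing = reported_missing - obs_columns

# Missing columns for which the defaults dictionary offers no value.
not_covered = still_missing - set(defaults)

print(sorted(still_missing))
print(sorted(not_covered))
```

Under these inputs, the only missing column without a default is `organism`, which matches the explicit `organism="human"` argument in the call above. The six columns in `still_missing` are the same six that the log reports as added defaults.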
validator.obs_fields
{'assay': FieldAttr(ExperimentalFactor.name),
'cell_type': FieldAttr(CellType.name),
'development_stage': FieldAttr(DevelopmentalStage.name),
'disease': FieldAttr(Disease.name),
'donor_id': FieldAttr(ULabel.name),
'self_reported_ethnicity': FieldAttr(Ethnicity.name),
'sex_ontology_term_id': FieldAttr(Phenotype.ontology_id),
'suspension_type': FieldAttr(ULabel.name),
'tissue': FieldAttr(Tissue.name),
'tissue_type': FieldAttr(ULabel.name),
'organism': FieldAttr(Organism.name)}
validated = validator.validate()
💡 inspecting 'variables' by Gene.ensembl_gene_id
❗ 123 terms are not validated: 'ENSG00000269933', 'ENSG00000261737', 'ENSG00000259834', 'ENSG00000256374', 'ENSG00000263464', 'ENSG00000203812', 'ENSG00000272196', 'ENSG00000272880', 'ENSG00000270188', 'ENSG00000287116', 'ENSG00000237133', 'ENSG00000224739', 'ENSG00000227902', 'ENSG00000239467', 'ENSG00000272551', 'ENSG00000280374', 'ENSG00000236886', 'ENSG00000229352', 'ENSG00000286601', 'ENSG00000227021', ...
→ register terms via `.register_labels('variables')`
💡 inspecting 'assay' by ExperimentalFactor.name
❗ 3 terms are not validated: '10x 3' v3', '10x 5' v2', '10x 5' v1'
→ register terms via `.register_labels('assay')`
💡 inspecting 'cell_type' by CellType.name
✅ all cell_types are validated
💡 inspecting 'development_stage' by DevelopmentalStage.name
❗ 1 terms is not validated: 'unknown'
→ register terms via `.register_labels('development_stage')`
💡 inspecting 'disease' by Disease.name
❗ 1 terms is not validated: 'normal'
→ register terms via `.register_labels('disease')`
💡 inspecting 'donor_id' by ULabel.name
❗ 12 terms are not validated: 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1'
→ register terms via `.register_labels('donor_id')`
💡 inspecting 'self_reported_ethnicity' by Ethnicity.name
❗ 1 terms is not validated: 'unknown'
→ register terms via `.register_labels('self_reported_ethnicity')`
💡 inspecting 'sex_ontology_term_id' by Phenotype.ontology_id
❗ 1 terms is not validated: 'PATO:0000384'
→ register terms via `.register_labels('sex_ontology_term_id')`
💡 inspecting 'suspension_type' by ULabel.name
❗ 1 terms is not validated: 'cell'
→ register terms via `.register_labels('suspension_type')`
💡 inspecting 'tissue' by Tissue.name
❗ 17 terms are not validated: 'blood', 'thoracic lymph node', 'spleen', 'lungg', 'mesenteric lymph node', 'lamina propria', 'liver', 'jejunal epithelium', 'omentum', 'bone marrow', 'ileum', 'caecum', 'thymus', 'skeletal muscle tissue', 'duodenum', 'sigmoid colon', 'transverse colon'
→ register terms via `.register_labels('tissue')`
💡 inspecting 'tissue_type' by ULabel.name
❗ 1 terms is not validated: 'tissue'
→ register terms via `.register_labels('tissue_type')`
💡 inspecting 'organism' by Organism.name
✅ all organisms are validated
validated
False
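Conceptually, each "terms are not validated" message above reports the values of a column that have no match in the corresponding registry. The following toy sketch illustrates that idea with plain Python; it is not the Validator's implementation, and the registry contents and the helper name `find_unvalidated` are made up for illustration.

```python
def find_unvalidated(values, registry_names):
    """Return values absent from the registry, in first-seen order, without duplicates."""
    seen = set()
    missing = []
    for value in values:
        if value not in registry_names and value not in seen:
            seen.add(value)
            missing.append(value)
    return missing


# Toy registry and column values; a real registry holds many more names.
tissue_registry = {"blood", "spleen", "lung"}
tissue_column = ["blood", "lungg", "spleen", "lungg"]

unvalidated = find_unvalidated(tissue_column, tissue_registry)
is_valid = not unvalidated  # valid only if nothing is left unmatched

print(unvalidated, is_valid)
```

A single unmatched value is enough to make the overall result `False`, which is why the typo 'lungg' has to be corrected before validation can pass.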
Register new metadata labels#
Follow the suggestions above to register genes and labels that aren’t present in the current instance:
(Note that our instance is rather empty. Once you have filled up the registries, registering new labels won’t be needed as often.)
validator.register_labels(feature="all")
💡 registering labels for 'variables'
✅
registered 123 records from laminlabs/cellxgene: ['ENSG00000112096', 'ENSG00000182230', 'ENSG00000203812', 'ENSG00000204092', 'ENSG00000215271', 'ENSG00000221995', 'ENSG00000224739', 'ENSG00000224745', 'ENSG00000225932', 'ENSG00000226377', 'ENSG00000226380', 'ENSG00000226403', 'ENSG00000227021', 'ENSG00000227220', 'ENSG00000227902', 'ENSG00000228139', 'ENSG00000228906', 'ENSG00000229352', 'ENSG00000231575', 'ENSG00000232196', 'ENSG00000232295', 'ENSG00000233776', 'ENSG00000236166', 'ENSG00000236673', 'ENSG00000236740', 'ENSG00000236886', 'ENSG00000236996', 'ENSG00000237133', 'ENSG00000237513', 'ENSG00000237548', 'ENSG00000237838', 'ENSG00000239446', 'ENSG00000239467', 'ENSG00000239665', 'ENSG00000244693', 'ENSG00000244952', 'ENSG00000249860', 'ENSG00000251044', 'ENSG00000253878', 'ENSG00000254561', 'ENSG00000254740', 'ENSG00000255823', 'ENSG00000256045', 'ENSG00000256222', 'ENSG00000256374', 'ENSG00000256427', 'ENSG00000256618', 'ENSG00000256892', 'ENSG00000258414', 'ENSG00000258808', 'ENSG00000258861', 'ENSG00000259444', 'ENSG00000259820', 'ENSG00000259834', 'ENSG00000259855', 'ENSG00000260461', 'ENSG00000261068', 'ENSG00000261438', 'ENSG00000261490', 'ENSG00000261534', 'ENSG00000261737', 'ENSG00000261773', 'ENSG00000262668', 'ENSG00000263464', 'ENSG00000267637', 'ENSG00000268955', 'ENSG00000269028', 'ENSG00000269900', 'ENSG00000269933', 'ENSG00000270188', 'ENSG00000270394', 'ENSG00000270672', 'ENSG00000271409', 'ENSG00000271734', 'ENSG00000271870', 'ENSG00000272040', 'ENSG00000272196', 'ENSG00000272267', 'ENSG00000272354', 'ENSG00000272370', 'ENSG00000272551', 'ENSG00000272567', 'ENSG00000272880', 'ENSG00000273301', 'ENSG00000273370', 'ENSG00000273496', 'ENSG00000273554', 'ENSG00000273576', 'ENSG00000273837', 'ENSG00000273888', 'ENSG00000273923', 'ENSG00000274175', 'ENSG00000274792', 'ENSG00000275249', 'ENSG00000275869', 'ENSG00000276017', 'ENSG00000276814', 'ENSG00000277050', 'ENSG00000277196', 'ENSG00000277352', 'ENSG00000277666', 'ENSG00000277761', 
'ENSG00000277836', 'ENSG00000278198', 'ENSG00000278633', 'ENSG00000278782', 'ENSG00000278817', 'ENSG00000278927', 'ENSG00000278955', 'ENSG00000280095', 'ENSG00000280374', 'ENSG00000280710', 'ENSG00000282080', 'ENSG00000282965', 'ENSG00000285106', 'ENSG00000285162', 'ENSG00000286228', 'ENSG00000286601', 'ENSG00000286699', 'ENSG00000286949', 'ENSG00000286996', 'ENSG00000287116', 'ENSG00000287388']
💡 registering labels for 'assay'
✅ registered 3 records from laminlabs/cellxgene: ["10x 5' v1", "10x 5' v2", "10x 3' v3"]
💡 registering labels for 'cell_type'
💡 registering labels for 'development_stage'
✅ registered 1 records from laminlabs/cellxgene: ['unknown']
💡 registering labels for 'disease'
✅ registered 1 records from laminlabs/cellxgene: ['normal']
💡 registering labels for 'donor_id'
❗ 12 non-validated labels are not registered: ['D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1']!
→ to lookup categories, use `.lookup().{feature_name}`
→ to register, set `validated_only=False`
💡 registering labels for 'self_reported_ethnicity'
✅ registered 1 records from laminlabs/cellxgene: ['unknown']
💡 registering labels for 'sex_ontology_term_id'
✅ registered 1 records from laminlabs/cellxgene: ['PATO:0000384']
💡 registering labels for 'suspension_type'
✅ registered 1 records from laminlabs/cellxgene: ['cell']
💡 registering labels for 'tissue'
❗ 1 non-validated labels are not registered: ['lungg']!
→ to lookup categories, use `.lookup().{feature_name}`
→ to register, set `validated_only=False`
✅ registered 16 records from laminlabs/cellxgene: ['spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria', 'mesenteric lymph node', 'caecum', 'omentum', 'blood', 'ileum', 'thoracic lymph node']
💡 registering labels for 'tissue_type'
✅ registered 1 records from laminlabs/cellxgene: ['tissue']
💡 registering labels for 'organism'
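Donor IDs are specific to this dataset and have no reference in laminlabs/cellxgene, so we register them as new labels by setting `validated_only=False`: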
validator.register_labels(feature="donor_id", validated_only=False)
✅ registered 12 records without reference: ['D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', 'A31-1', '582C-1']
The tissue label 'lungg' could not be registered because it is a typo of 'lung'. Let’s fix it:
tissues = validator.lookup()["tissue"]
Lookup objects from the laminlabs/cellxgene
# using a lookup object to find the correct term
tissues.lung
Tissue(uid='7Tt4iEKc', name='lung', ontology_id='UBERON:0002048', synonyms='pulmo', description='Respiration Organ That Develops As An Outpocketing Of The Esophagus.', updated_at=2024-01-08 15:22:49 UTC, public_source_id=47, created_by_id=1)
adata.obs["tissue"] = adata.obs["tissue"].cat.rename_categories({"lungg": tissues.lung.name})
validator.register_labels('tissue')
✅ registered 1 records from laminlabs/cellxgene: ['lung']
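When a typo is less obvious than this one, fuzzy string matching can suggest candidate corrections before you browse the lookup object. The sketch below uses Python's standard `difflib` module against a hand-copied list of tissue names from the outputs above; it is an illustration only and does not query the laminlabs/cellxgene registries.

```python
import difflib

# Tissue names copied from the registration output above, plus 'lung'.
registered_tissues = [
    "spleen", "sigmoid colon", "jejunal epithelium", "bone marrow",
    "skeletal muscle tissue", "transverse colon", "thymus", "liver",
    "duodenum", "lamina propria", "mesenteric lymph node", "caecum",
    "omentum", "blood", "ileum", "thoracic lymph node", "lung",
]

# Return up to three names whose similarity ratio to 'lungg' is at least 0.6.
suggestions = difflib.get_close_matches("lungg", registered_tissues, n=3, cutoff=0.6)
print(suggestions)
```

With this list, 'lung' is the only name similar enough to be suggested. Treat such suggestions as hints to confirm against the registry, not as automatic replacements.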
Let’s validate the object again:
validated = validator.validate()
💡 inspecting 'variables' by Gene.ensembl_gene_id
✅ all variabless are validated
💡 inspecting 'assay' by ExperimentalFactor.name
✅ all assays are validated
💡 inspecting 'cell_type' by CellType.name
✅ all cell_types are validated
💡 inspecting 'development_stage' by DevelopmentalStage.name
✅ all development_stages are validated
💡 inspecting 'disease' by Disease.name
✅ all diseases are validated
💡 inspecting 'donor_id' by ULabel.name
✅ all donor_ids are validated
💡 inspecting 'self_reported_ethnicity' by Ethnicity.name
✅ all self_reported_ethnicitys are validated
💡 inspecting 'sex_ontology_term_id' by Phenotype.ontology_id
✅ all sex_ontology_term_ids are validated
💡 inspecting 'suspension_type' by ULabel.name
✅ all suspension_types are validated
💡 inspecting 'tissue' by Tissue.name
✅ all tissues are validated
💡 inspecting 'tissue_type' by ULabel.name
✅ all tissue_types are validated
💡 inspecting 'organism' by Organism.name
✅ all organisms are validated
validated
True
adata.obs.head()
| | donor_id | tissue | cell_type | assay | sex_ontology_term_id | organism | disease | development_stage | self_reported_ethnicity | suspension_type | tissue_type |
|---|---|---|---|---|---|---|---|---|---|---|---|
| CZINY-0109_CTGGTCTAGTCTGTAC | D496-1 | blood | classical monocyte | 10x 3' v3 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| CZI-IA10244332+CZI-IA10244434_CCTTCGACATACTCTT | 621B-1 | thoracic lymph node | T follicular helper cell | 10x 5' v2 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| Pan_T7935491_CTGGTCTGTACATGTC | A29-1 | spleen | memory B cell | 10x 5' v1 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| Pan_T7980367_GGGCATCCAGGTGGAT | A36-1 | lung | alveolar macrophage | 10x 5' v1 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
| Pan_T7935494_ATCATGGTCTACCTGC | A29-1 | mesenteric lymph node | naive thymus-derived CD4-positive, alpha-beta ... | 10x 5' v1 | PATO:0000384 | human | normal | unknown | unknown | cell | tissue |
Register file#
Now we are ready to register the artifact to the working instance:
# track the current notebook
ln.transform.stem_uid = "WOK3vP0bNGLx"
ln.transform.version = "0"
ln.track()
💡 Assuming editor is Jupyter Lab.
💡 notebook imports: cellxgene_lamin_validator==0.3.2 lamindb==0.69.1
💡 saved: Transform(uid='WOK3vP0bNGLx6K79', name='Validate and register an h5ad file based on CELLxGENE schema', key='cellxgene-lamin-validator', version='0', type=notebook, updated_at=2024-03-19 09:16:07 UTC, created_by_id=1)
💡 saved: Run(uid='jqGVooyHTGL7IefKmU4l', started_at=2024-03-19 09:16:07 UTC, transform_id=1, created_by_id=1)
💡 tracked pip freeze > /home/runner/.cache/lamindb/run_env_pip_jqGVooyHTGL7IefKmU4l.txt
# this will modify the AnnData object by adding required columns and categories
artifact = validator.register_artifact(description="test h5ad file")
... storing 'organism' as categorical
... storing 'disease' as categorical
... storing 'development_stage' as categorical
... storing 'self_reported_ethnicity' as categorical
... storing 'suspension_type' as categorical
... storing 'tissue_type' as categorical
π‘ path content will be copied to default storage upon `save()` with key `None` ('.lamindb/OfQUHid4mEQLh8dXEiaD.h5ad')
β
storing artifact 'OfQUHid4mEQLh8dXEiaD' at '/home/runner/work/cellxgene-lamin-validator/cellxgene-lamin-validator/docs/test-cellxgene-lamin-validator/.lamindb/OfQUHid4mEQLh8dXEiaD.h5ad'
π‘ parsing feature names of X stored in slot 'var'
β
36503 terms (100.00%) are validated for ensembl_gene_id
β
linked: FeatureSet(uid='H5eD9WaI6YQUFUliRgRQ', n=36503, type='number', registry='bionty.Gene', hash='xtVNbbhs3ty63qs-rwKZ', created_by_id=1)
π‘ parsing feature names of slot 'obs'
β
11 terms (100.00%) are validated for name
β loaded Feature record with same name: 'donor_id' (disable via ln.settings.upon_create_search_names)
β loaded Feature record with same name: 'tissue' (disable via ln.settings.upon_create_search_names)
β loaded Feature record with same name: 'cell_type' (disable via ln.settings.upon_create_search_names)
β loaded Feature record with same name: 'assay' (disable via ln.settings.upon_create_search_names)
β loaded Feature record with same name: 'sex_ontology_term_id' (disable via ln.settings.upon_create_search_names)
β loaded Feature record with same name: 'organism' (disable via ln.settings.upon_create_search_names)
β loaded Feature record with same name: 'disease' (disable via ln.settings.upon_create_search_names)
β loaded Feature record with same name: 'development_stage' (disable via ln.settings.upon_create_search_names)
β loaded Feature record with same name: 'self_reported_ethnicity' (disable via ln.settings.upon_create_search_names)
β loaded Feature record with same name: 'suspension_type' (disable via ln.settings.upon_create_search_names)
β loaded Feature record with same name: 'tissue_type' (disable via ln.settings.upon_create_search_names)
β
linked: FeatureSet(uid='bt6wBntDEp4pywGjU5BH', n=11, registry='core.Feature', hash='dpUmFC9xFN7HaXqqVx08', created_by_id=1)
β
saved 2 feature sets for slots: 'var','obs'
β
linked feature 'sex_ontology_term_id' to registry 'bionty.Phenotype'
successfully registered artifact in LaminDB!
view it in the hub: https://lamin.ai/testuser1/test-cellxgene-lamin-validator/artifact/OfQUHid4mEQLh8dXEiaD
View the registered artifact with metadata:
artifact.describe()
Artifact(uid='OfQUHid4mEQLh8dXEiaD', suffix='.h5ad', accessor='AnnData', description='test h5ad file', size=54727155, hash='5esmrdu-DFv9nKyK4ZFA0G', hash_type='sha1-fl', n_observations=1626, visibility=1, key_is_virtual=True, updated_at=2024-03-19 09:16:13 UTC)
Provenance:
🗃️ storage: Storage(uid='KD8ixw5l', root='/home/runner/work/cellxgene-lamin-validator/cellxgene-lamin-validator/docs/test-cellxgene-lamin-validator', type='local', updated_at=2024-03-19 09:14:05 UTC, created_by_id=1)
💫 transform: Transform(uid='WOK3vP0bNGLx6K79', name='Validate and register an h5ad file based on CELLxGENE schema', key='cellxgene-lamin-validator', version='0', type=notebook, updated_at=2024-03-19 09:16:07 UTC, created_by_id=1)
👣 run: Run(uid='jqGVooyHTGL7IefKmU4l', started_at=2024-03-19 09:16:07 UTC, transform_id=1, created_by_id=1)
👤 created_by: User(uid='DzTjkKse', handle='testuser1', name='Test User1', updated_at=2024-03-19 09:14:05 UTC)
Features:
var: FeatureSet(uid='H5eD9WaI6YQUFUliRgRQ', n=36503, type='number', registry='bionty.Gene', hash='xtVNbbhs3ty63qs-rwKZ', updated_at=2024-03-19 09:16:11 UTC, created_by_id=1)
'MIR1302-2HG', 'FAM138A', 'OR4F5', 'None', 'None', 'None', 'None', 'None', 'None', 'None', 'OR4F29', 'None', 'OR4F16', 'None', 'LINC01409', 'FAM87B', 'LINC01128', 'LINC00115', 'FAM41C', 'None', ...
obs: FeatureSet(uid='bt6wBntDEp4pywGjU5BH', n=11, registry='core.Feature', hash='dpUmFC9xFN7HaXqqVx08', updated_at=2024-03-19 09:16:13 UTC, created_by_id=1)
🔗 assay (3, bionty.ExperimentalFactor): '10x 5' v1', '10x 5' v2', '10x 3' v3'
🔗 cell_type (31, bionty.CellType): 'classical monocyte', 'T follicular helper cell', 'memory B cell', 'alveolar macrophage', 'naive thymus-derived CD4-positive, alpha-beta T cell', 'effector memory CD8-positive, alpha-beta T cell, terminally differentiated', 'alpha-beta T cell', 'CD4-positive helper T cell', 'naive thymus-derived CD8-positive, alpha-beta T cell', 'macrophage', ...
🔗 development_stage (1, bionty.DevelopmentalStage): 'unknown'
🔗 disease (1, bionty.Disease): 'normal'
🔗 donor_id (12, core.ULabel): 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', 'D503-1', '640C-1', ...
🔗 self_reported_ethnicity (1, bionty.Ethnicity): 'unknown'
🔗 tissue (17, bionty.Tissue): 'spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria', ...
🔗 organism (1, bionty.Organism): 'human'
🔗 tissue_type (1, core.ULabel): 'tissue'
🔗 suspension_type (1, core.ULabel): 'cell'
🔗 sex_ontology_term_id (1, bionty.Phenotype): 'male'
Labels:
🏷️ organism (1, bionty.Organism): 'human'
🏷️ tissues (17, bionty.Tissue): 'spleen', 'sigmoid colon', 'jejunal epithelium', 'bone marrow', 'skeletal muscle tissue', 'transverse colon', 'thymus', 'liver', 'duodenum', 'lamina propria', ...
🏷️ cell_types (31, bionty.CellType): 'classical monocyte', 'T follicular helper cell', 'memory B cell', 'alveolar macrophage', 'naive thymus-derived CD4-positive, alpha-beta T cell', 'effector memory CD8-positive, alpha-beta T cell, terminally differentiated', 'alpha-beta T cell', 'CD4-positive helper T cell', 'naive thymus-derived CD8-positive, alpha-beta T cell', 'macrophage', ...
🏷️ diseases (1, bionty.Disease): 'normal'
🏷️ phenotypes (1, bionty.Phenotype): 'male'
🏷️ experimental_factors (3, bionty.ExperimentalFactor): '10x 5' v1', '10x 5' v2', '10x 3' v3'
🏷️ developmental_stages (1, bionty.DevelopmentalStage): 'unknown'
🏷️ ethnicities (1, bionty.Ethnicity): 'unknown'
🏷️ ulabels (14, core.ULabel): 'cell', 'tissue', 'D496-1', '621B-1', 'A29-1', 'A36-1', 'A35-1', '637C-1', 'A52-1', 'A37-1', ...
Register collection#
Register a new collection for the registered artifact:
# register a new collection
collection = validator.register_collection(
    artifact,  # registered artifact above; can also pass a list of artifacts
    name="Cross-tissue immune cell analysis reveals tissue-specific features in humans (for test demo only)",  # title of the publication
    description="10.1126/science.abl5197",  # DOI of the publication
    reference="E-MTAB-11536",  # accession number (e.g. GSE#, E-MTAB#, etc.)
    reference_type="ArrayExpress",  # source type (e.g. GEO, ArrayExpress, SRA, etc.)
)
successfully registered collection in LaminDB!
view it in the hub: https://lamin.ai/testuser1/test-cellxgene-lamin-validator/collection/OfQUHid4mEQLh8dXEiaD
collection.artifact
Artifact(uid='OfQUHid4mEQLh8dXEiaD', suffix='.h5ad', accessor='AnnData', description='test h5ad file', size=54727155, hash='5esmrdu-DFv9nKyK4ZFA0G', hash_type='sha1-fl', n_observations=1626, visibility=1, key_is_virtual=True, updated_at=2024-03-19 09:16:15 UTC, storage_id=1, transform_id=1, run_id=1, created_by_id=1)
artifact.collection
Collection(uid='OfQUHid4mEQLh8dXEiaD', name='Cross-tissue immune cell analysis reveals tissue-specific features in humans (for test demo only)', description='10.1126/science.abl5197', hash='5esmrdu-DFv9nKyK4ZFA0G', reference='E-MTAB-11536', reference_type='ArrayExpress', visibility=1, updated_at=2024-03-19 09:16:15 UTC, transform_id=1, run_id=1, artifact_id=1, created_by_id=1)
Return an input h5ad file for cellxgene-schema#
adata_cxg = validator.to_cellxgene(is_primary_data=True)
adata_cxg
AnnData object with n_obs × n_vars = 1626 × 36503
obs: 'donor_id', 'sex_ontology_term_id', 'suspension_type', 'tissue_type', 'tissue_ontology_term_id', 'cell_type_ontology_term_id', 'assay_ontology_term_id', 'organism_ontology_term_id', 'disease_ontology_term_id', 'development_stage_ontology_term_id', 'self_reported_ethnicity_ontology_term_id', 'is_primary_data'
var: 'feature_is_filtered'
uns: 'default_embedding', 'title', 'schema_reference', 'schema_version'
obsm: 'X_umap'
adata_cxg.write_h5ad("anndata_human_immune_cells_cxg.h5ad")
!cellxgene-schema validate anndata_human_immune_cells_cxg.h5ad
Loading dependencies
Loading validator modules
Starting validation...
WARNING: Validation of raw layer was not performed due to current errors, try again after fixing current errors.
ERROR: Column 'schema_version' is a reserved column name of 'uns'. Remove it from h5ad and try again.
ERROR: Column 'schema_reference' is a reserved column name of 'uns'. Remove it from h5ad and try again.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261737' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259834' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000263464' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000203812' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272880' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270188' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287116' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237133' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224739' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227902' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239467' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272551' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280374' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236886' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000229352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286601' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227021' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259855' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273301' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271870' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237838' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269028' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286699' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261490' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272567' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270394' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272370' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272354' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000251044' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272040' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000182230' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000204092' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261068' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236996' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232295' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271734' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236673' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000227220' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000236166' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000112096' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285162' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286228' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237513' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000285106' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226380' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000270672' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000225932' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244693' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000268955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000272267' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000253878' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259820' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226403' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000233776' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000269900' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261534' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000237548' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239665' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256892' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000249860' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000271409' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000224745' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261438' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000231575' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000260461' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000255823' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254740' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000254561' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282080' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256427' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000287388' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000276814' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280710' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000215271' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258414' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258808' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277050' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273888' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000258861' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000259444' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000244952' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273923' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000262668' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000232196' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256618' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000221995' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000226377' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273576' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000267637' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000282965' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273837' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000286949' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256222' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000280095' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278927' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278955' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277352' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000239446' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000256045' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228906' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000228139' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000261773' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278198' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000273496' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277666' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000278782' is not a valid feature ID in 'raw.var'.
ERROR: 'ENSG00000277761' is not a valid feature ID in 'raw.var'.
Validation complete in 0:00:00.802560 with status is_valid=False
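The validator output above is long, so it can help to group the errors before deciding what to fix. The following is a minimal sketch, not part of cellxgene-schema or cellxgene-lamin-validator: the helper name `summarize_validation_log` and both regular expressions are assumptions derived only from the message wording shown in this guide, and they may need adjusting for other versions of the tool.

```python
import re
from collections import defaultdict

# Patterns inferred from the error messages printed above (an assumption,
# not an official cellxgene-schema format specification).
INVALID_ID = re.compile(
    r"'(?P<id>[^']+)' is not a valid feature ID in '(?P<slot>[^']+)'"
)
RESERVED = re.compile(
    r"Column '(?P<name>[^']+)' is a reserved column name of '(?P<slot>[^']+)'"
)


def summarize_validation_log(log_text):
    """Group error lines of a validation log by the slot they refer to.

    Returns two dicts: invalid feature IDs per slot and reserved
    column/key names per slot. Hypothetical helper for illustration.
    """
    invalid_ids = defaultdict(list)
    reserved = defaultdict(list)
    for line in log_text.splitlines():
        match = INVALID_ID.search(line)
        if match:
            invalid_ids[match["slot"]].append(match["id"])
            continue
        match = RESERVED.search(line)
        if match:
            reserved[match["slot"]].append(match["name"])
    return dict(invalid_ids), dict(reserved)


# A few lines copied from the output above.
sample_log = """\
ERROR: Column 'schema_version' is a reserved column name of 'uns'. Remove it from h5ad and try again.
ERROR: Column 'schema_reference' is a reserved column name of 'uns'. Remove it from h5ad and try again.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'var'.
ERROR: 'ENSG00000269933' is not a valid feature ID in 'raw.var'.
"""

invalid_ids, reserved = summarize_validation_log(sample_log)
print(reserved)
print(invalid_ids)
```

Applied to the full log, the two dictionaries would list the reserved `uns` keys and the Ensembl IDs rejected in `var` and `raw.var`, which are the items to address before re-running `cellxgene-schema validate`.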