Jpuerto/donor yaml update (#1329)
* Docs: Add donor files back to YAML

* Docs: Remove murine-source fields

* Docs: Remove murine-source fields

---------

Co-authored-by: Juan Puerto <=>
jpuerto-psc committed May 7, 2024
1 parent 4b14798 commit 76936fe
Showing 5 changed files with 91 additions and 121 deletions.
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -39,6 +39,7 @@
- Bugfix stripping trailing slash in ingest api url
- Converted upload `_url_checks` to use `_get_method` for SenNet compatibility
- Add CEDAR template for murine-source
- Add donor field descriptions back, remove murine-source descriptions

## v0.0.17

63 changes: 29 additions & 34 deletions docs/field-descriptions.yaml
@@ -24,6 +24,8 @@ acquisition_matrix_size_in_frequency_encoding_direction: Dimensions of the acqui
acquisition_matrix_size_in_phase_encoding_direction: Dimensions of the acquired phase
data before reconstruction, per image.
affiliation: Institutional affiliation
age_unit: Unit for age measurement.
age_value: The time elapsed since birth.
analyte_class: Analytes are the target molecules being measured with the assay.
antibodies_path: Relative path to file with antibody information for this dataset.
antibody_name: Anti-(target name) antibody. Not validated or used down-stream.
@@ -40,7 +42,10 @@ assay_type: The specific type of assay being executed.
bead_barcode_offset: Position(s) in the read at which the bead barcode starts
bead_barcode_read: Which read file contains the bead barcode
bead_barcode_size: Length of the bead barcode in base pairs
bedding: The type of cage bedding in the cage where the source is housed.
blood_type: ABO blood type or "serotype" refers to the presence/absence of the either/both
A & B blood antigens.
body_mass_index_value: An individual's weight in kilograms divided by the square of
the height in meters.
bulk_atac_cell_isolation_protocols_io_doi: 'Link to a protocols document answering
the question: How was tissue stored and processed for cell/nuclei isolation'
bulk_rna_isolation_protocols_io_doi: 'Link to a protocols document answering the question:
@@ -53,7 +58,7 @@ bulk_rna_yield_value: 'RNA (ng) per Weight of Tissue (mg). Answer the question:
RNA? Calculate the yield by dividing total RNA isolated by amount of tissue used
to isolate RNA from (ng/mg).'
bulk_transposition_input_number_nuclei: A number (no comma separators)
cage_enhancements: "Environmental enrichments present in the source\u2019s cage."
cause_of_death: The circumstance or condition that caused death.
ce_background_electrolyte: Chemical composition of the background electrolyte that
fills the separation capillary (e.g. "3% acetic acid").
ce_capillary_coating: Treatment of surface of separation capillary. Capillary coating
@@ -108,16 +113,11 @@ data_collection_mode: Mode of data collection in tandem MS assays. Either DDA (D
data_path: Relative path to file or directory with instrument data. Downstream processing
will depend on filename extension conventions.
data_precision_bytes: Numerical data precision in bytes
date_of_birth_or_fertilization: The date when the mouse/embryo was born/fertilized.
If the hours/minutes are not known, use '00:00'.
date_of_death: The date when the mouse/embryo died. If the hours/minutes are not known,
use '00:00'.
description: Free-text description of this assay.
desi_solvent: Solvent composition for conducting nanospray desorption electrospray
ionization (nanoDESI) or desorption electrospray ionization (DESI).
desi_solvent_flow_rate: The rate of flow of the solvent into a spray.
desi_solvent_flow_rate_unit: Units of the rate of solvent flow.
diet: A free text description of the source's diet.
dilution: Antibody solutions may be diluted according to the experimental protocol.
dms: Was differential mobility spectrometry used in this assay?
dna_assay_input_unit: Units of DNA input into library preparation
@@ -132,7 +132,6 @@ echo_time: 'Time in msec between the middle of the excitation pulse and the peak
to cover the center of k-space (i.e., -kx=0, ky=0). '
echo_train_length: 'Number of lines in k-space acquired per excitation per image. '
end_datetime: Time stamp indicating end of ablation for ROI
euthanization_method: If the source was euthanized, select the method of euthanization.
execution_datetime: Start date and time of assay, typically a date-time stamped folder
generated by the acquisition instrument. YYYY-MM-DD hh:mm, where YYYY is the year,
MM is the month with leading 0s, and DD is the day with leading 0s, hh is the hour
@@ -160,6 +159,9 @@ health_status: "Patient's baseline physical condition prior to immediate event l
\ healthy subject may have experienced trauma leading to brain death. As a result\
\ of organ donation, a sample is collected. In this scenario, the subject is deemed\
\ \u201Crelatively healthy.\u201D"
height_unit: Unit for height measurement.
height_value: The vertical measurement or distance from the base to the top of a subject
or participant.
histological_report: histopathological reporting of key variables that are important
for the tissue (absence of necrosis, comment on composition, significant pathology
description, high level inflammation/fibrosis assessment etc
@@ -178,11 +180,21 @@ ion_mobility: 'Specifies whether or not ion mobility spectrometry was performed
ion_source: Specifies the ion source used
is_cedar: Identifies whether the version is hosted by CEDAR
is_contact: Is this individual a contact for DOI purposes?
is_deceased: Is the source deceased? Use either 'True' or 'False'.
is_embryo: Is the source an embryo? Use either 'True' or 'False'.
is_targeted: Specifies whether or not a specific molecule(s) is/are targeted for detection/measurement
by the assay.
is_technical_replicate: Is this a sequencing replicate?
kidney_donor_profile_index_value: 'The Kidney Donor Profile Index (KDPI) is a numerical
measure that combines ten donor factors, including clinical parameters and demographics,
to summarize into a single number the quality of deceased donor kidneys relative
to other recovered kidneys. The KDPI is derived by first calculating the Kidney Donor
Risk Index (KDRI) for a deceased donor. Kidneys from a donor with a KDPI of 90%,
for example, have a KDRI (which indicates relative risk of graft failure) greater
than 90% of recovered kidneys. The KDPI is simply a mapping of the KDRI from a relative
risk scale to a cumulative percentage scale. The reference population used for this
mapping is all deceased donors in the United States with a kidney recovered for
the purpose of transplantation in the prior calendar year. Lower KDPI values are
associated with increased donor quality and expected longevity. https://optn.transplant.hrsa.gov/media/1512/guide_to_calculating_interpreting_kdpi.pdf
'
lab_id: "An internal field labs can use it to add whatever ID(s) they want or need\
\ for dataset validation and tracking. This could be a single ID (e.g., \"Visium_9OLC_A4_S1\"\
) or a delimited list of IDs (e.g., \u201C9OL; 9OLC.A2; Visium_9OLC_A4_S1\u201D\
@@ -235,21 +247,17 @@ library_layout: Whether the library was generated for single-end or paired end s
library_pcr_cycles: Number of PCR cycles to amplify cDNA
library_pcr_cycles_for_sample_index: Number of PCR cycles performed for library indexing
library_preparation_kit: Reagent kit used for library preparation
light_cycle: The light cycle in the room where the source is housed. "Standard/default"
refers to 12-hour photoperiods (e.g., lights on at 7:00 AM, lights off at 7:00 PM).
"Longer photoperiods" refers to 14-hour photoperiods (e.g., lights on at 7:00 AM,
lights off at 9:00 PM). "Reverse lightcycles" means that the timing of the 12-hour
photoperiod is reversed (e.g., lights on at 7:00 PM, lights off at 7:00 AM).
local_lifespan_data: A free text description of how long mice live within the local
environment. It is recommended to provide the median or maximum values for murine
lifespans.
lot_number: 'The lot# is specific to the vendor. (eg: Abcam lot# GR3238979-1)'
mass_resolving_power: "The MS1 resolving power defined as m/\u2206m where \u2206m\
\ is the FWHM for a given peak with a specified m/z (m). (unitless)"
max_x_width_unit: Units of image width of the ROI acquisition
max_x_width_value: Image width value of the ROI acquisition
max_y_height_unit: Units of image height of the ROI acquisition
max_y_height_value: Image height value of the ROI acquisition
mechanism_of_injury: 'Mechanism of injury may be, for example: fall, impact (eg: auto
accident), weapon (eg: firearm), etc.'
medical_history: A record of a patient's background regarding health and the occurrence
of disease events of the individual.
middle_name_or_initial: Middle name or initial
ms_scan_mode: Indicates whether the data were generated using MS, MS/MS or MS3.
ms_source: The ion source type used for surface sampling.
@@ -352,7 +360,8 @@ quality_criteria: 'For example, RIN: 8.7. For suspensions, measured by visual in
or no cells. This can be captured at a high level. "OK" or "not OK", or with more
specificity such as "debris", "clump", "low clump".'
quality_view: The quality of the acquired ultrasound images.
rack_setup: The rack setup type in which the source is housed.
race: A grouping of humans based on shared physical characteristics or social/ethnic
identity generally viewed as distinct.
range_z_unit: The unit of range_z_value.
range_z_value: The total range of the z axis.
reagent_prep_protocols_io_doi: DOI for protocols.io referring to the protocol for
@@ -374,10 +383,6 @@ roi_description: A description of the region of interest (ROI) captured in the i
roi_id: Multiple images (1-n) are acquired from regions of interest (ROI1, ROI2, ROI3,
etc) on a slide. The roi_id is a number from 1-n representing the ROI captured on
a slide.
room_health_status: A description of the pathogen and opportunist exclusion level
of the room where the source is housed.
room_temperature: The temperature value in Celsius of the room where the source is
housed. An example is "23".
rr_id: The rr_id is a unique antibody identifier that comes from the Antibody Registry
(https://antibodyregistry.org).
sample_id: UUID or HuBMAP ID of parent
@@ -412,15 +417,14 @@ sequencing_read_format: Slash-delimited list of the number of sequencing cycles
sequencing_read_percent_q30: 'Q30 is the weighted average of all the reads (e.g. #
bases UMI * q30 UMI + # bases R2 * q30 R2 + ...)'
sequencing_reagent_kit: Reagent kit used for sequencing
sex: The sex of the mouse.
sex: 'Biological sex at birth: male or female or other.'
signal_type: Type of signal measured per channel (usually dual counts)
single_file_export_format: 'The format in which each single imaging file will be exported.
(Example: DICOM, tiff, avi, etc.)'
sn_quality: 'An integer describing the signal to noise quality of an OCT image (Example:
30)'
sn_quality_unit: 'The unit of the integer describing the signal to noise quality of
an OCT image (Example: dB)'
source_id: SenNet ID of the source (whole organism) of the assayed tissue.
source_project: External source (outside of HuBMAP) of the project, eg. HCA (The Human
Cell Atlas Consortium).
source_storage_time_unit: Time unit
@@ -455,13 +459,6 @@ step_z_value: The number of optical sections in z axis range.
storage_media: What was the sample preserved in.
storage_method: The method by which the sample was stored, after preparation and before
the assay was performed.
strain: Jackson Labs nomenclature. When mutant alleles are part of the strain name,
use "<" and ">" to indicate the superscripted alleles. For example, C57BL/6J-KitW-39J
should be entered as "C57BL/6J-Kit<W-39J>", where "W-39J" would be the portion of
the string displayed as superscripted text. For further information, see the "Quick
Guide to Mouse Nomenclature" (https://resources.jax.org/guides/quick-guide-to-mouse-nomenclature).
strain_rrid: The Research Resource Identifier (RRID) (https://scicrunch.org/resources/data/source/nlx_154697-1/search)
for the strain. An example is 'RRID:MGI:3713213'
suspension_enriched: Was the cell/nuclei population enriched?
suspension_enriched_target: If the suspension was enriched, then this is the target
of the enrichment.
@@ -500,8 +497,6 @@ warm_ischemic_time_value: 'Time interval from interruption of blood supply of ti
to cooling to 4C: For organ donor: cessation of blood flow to perfusion of organ
(cooled to 4C) For surgical specimen/biopsy: cessation of blood flow to specimen
(time biopsy taken or blood supply is interrupted) to cooling of specimen to 4C.'
water_source: A free text description of the source's water supply, including any
treatments to the water.
wavelength_unit: The unit of the wavelength value used to acquire OCT images (nm)
wavelength_value: 'The value of the wavelength used to acquire OCT images (Example:
787)'
60 changes: 25 additions & 35 deletions docs/field-entities.yaml
@@ -22,6 +22,10 @@ acquisition_matrix_size_in_phase_encoding_direction:
- dataset
affiliation:
- contributors
age_unit:
- donor
age_value:
- donor
analyte_class:
- dataset
antibodies_path:
@@ -48,8 +52,10 @@ bead_barcode_read:
- dataset
bead_barcode_size:
- dataset
bedding:
- murine
blood_type:
- donor
body_mass_index_value:
- donor
bulk_atac_cell_isolation_protocols_io_doi:
- dataset
bulk_rna_isolation_protocols_io_doi:
@@ -62,8 +68,8 @@ bulk_rna_yield_value:
- dataset
bulk_transposition_input_number_nuclei:
- dataset
cage_enhancements:
- murine
cause_of_death:
- donor
ce_background_electrolyte:
- dataset
ce_capillary_coating:
@@ -128,10 +134,6 @@ data_path:
- dataset
data_precision_bytes:
- dataset
date_of_birth_or_fertilization:
- murine
date_of_death:
- murine
description:
- dataset
desi_solvent:
@@ -140,8 +142,6 @@ desi_solvent_flow_rate:
- dataset
desi_solvent_flow_rate_unit:
- dataset
diet:
- murine
dilution:
- antibodies
dms:
@@ -162,8 +162,6 @@ echo_train_length:
- dataset
end_datetime:
- dataset
euthanization_method:
- murine
execution_datetime:
- dataset
expected_cell_count:
Expand All @@ -190,6 +188,10 @@ harmonics:
- dataset
health_status:
- sample
height_unit:
- donor
height_value:
- donor
histological_report:
- sample
imaging_threshold_unit_value:
@@ -213,14 +215,12 @@ is_cedar:
- sample
is_contact:
- contributors
is_deceased:
- murine
is_embryo:
- murine
is_targeted:
- dataset
is_technical_replicate:
- dataset
kidney_donor_profile_index_value:
- donor
lab_id:
- organ
label_name:
@@ -293,10 +293,6 @@ library_pcr_cycles_for_sample_index:
- dataset
library_preparation_kit:
- dataset
light_cycle:
- murine
local_lifespan_data:
- murine
lot_number:
- antibodies
mass_resolving_power:
@@ -309,6 +305,10 @@ max_y_height_unit:
- dataset
max_y_height_value:
- dataset
mechanism_of_injury:
- donor
medical_history:
- donor
middle_name_or_initial:
- contributors
ms_scan_mode:
@@ -455,8 +455,8 @@ quality_criteria:
- sample
quality_view:
- dataset
rack_setup:
- murine
race:
- donor
range_z_unit:
- dataset
range_z_value:
@@ -489,10 +489,6 @@ roi_description:
- dataset
roi_id:
- dataset
room_health_status:
- murine
room_temperature:
- murine
rr_id:
- antibodies
sample_id:
@@ -534,7 +530,7 @@ sequencing_read_percent_q30:
sequencing_reagent_kit:
- dataset
sex:
- murine
- donor
signal_type:
- dataset
single_file_export_format:
@@ -543,8 +539,6 @@ sn_quality:
- dataset
sn_quality_unit:
- dataset
source_id:
- murine
source_project:
- dataset
source_storage_time_unit:
@@ -577,10 +571,6 @@ storage_media:
- sample
storage_method:
- sample
strain:
- murine
strain_rrid:
- murine
suspension_enriched:
- sample
suspension_enriched_target:
@@ -642,15 +632,15 @@ warm_ischemic_time_unit:
- organ
warm_ischemic_time_value:
- organ
water_source:
- murine
wavelength_unit:
- dataset
wavelength_value:
- dataset
weight_unit:
- donor
- organ
- sample
weight_value:
- donor
- sample
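
The two YAML files changed above are keyed by the same field names, so a change like this one can be sanity-checked by comparing their keys. The following is a minimal sketch, not part of this commit: the dictionaries are small hand-copied excerpts of the donor entries shown in the diff, and the helper names `missing_descriptions` and `fields_for_entity` are hypothetical.

```python
# Hand-copied excerpts standing in for docs/field-descriptions.yaml and
# docs/field-entities.yaml after this commit (assumption: only a few donor
# fields are reproduced here, not the full files).
field_descriptions = {
    "age_unit": "Unit for age measurement.",
    "age_value": "The time elapsed since birth.",
    "cause_of_death": "The circumstance or condition that caused death.",
    "sex": "Biological sex at birth: male or female or other.",
}

field_entities = {
    "age_unit": ["donor"],
    "age_value": ["donor"],
    "cause_of_death": ["donor"],
    "sex": ["donor"],
}


def missing_descriptions(descriptions, entities):
    """Return field names that have an entity mapping but no description."""
    return sorted(set(entities) - set(descriptions))


def fields_for_entity(entities, entity):
    """Return the sorted field names mapped to the given entity type."""
    return sorted(name for name, kinds in entities.items() if entity in kinds)


if __name__ == "__main__":
    print("missing:", missing_descriptions(field_descriptions, field_entities))
    print("donor:", fields_for_entity(field_entities, "donor"))
    print("murine:", fields_for_entity(field_entities, "murine"))
```

With the real files, the two dictionaries would come from a YAML parser instead of literals; an empty "murine" list and an empty "missing" list would be consistent with what the commit message describes.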
