diff --git a/docs/articles/AddNewCheck.html b/docs/articles/AddNewCheck.html
index bc027dea..0e7ab7f1 100644
--- a/docs/articles/AddNewCheck.html
+++ b/docs/articles/AddNewCheck.html
@@ -180,7 +180,7 @@

Add a New Data Quality Check

Don Torok

-

2024-06-29

+

2024-07-11

Source: vignettes/AddNewCheck.rmd
diff --git a/docs/articles/CheckStatusDefinitions.html b/docs/articles/CheckStatusDefinitions.html
index 919cfebc..fe8df052 100644
--- a/docs/articles/CheckStatusDefinitions.html
+++ b/docs/articles/CheckStatusDefinitions.html
@@ -181,7 +181,7 @@

Check Status Definitions

Dmitry Ilyn, Maxim Moinat

-

2024-06-29

+

2024-07-11

Source: vignettes/CheckStatusDefinitions.rmd
diff --git a/docs/articles/CheckTypeDescriptions.html b/docs/articles/CheckTypeDescriptions.html
index 6e89315a..a58ea494 100644
--- a/docs/articles/CheckTypeDescriptions.html
+++ b/docs/articles/CheckTypeDescriptions.html
@@ -181,7 +181,7 @@

Data Quality Check Type Definitions

Clair Blacketer

-

2024-06-29

+

2024-07-11

Source: vignettes/CheckTypeDescriptions.rmd
diff --git a/docs/articles/DataQualityDashboard.html b/docs/articles/DataQualityDashboard.html
index f1cf33d8..d22f577a 100644
--- a/docs/articles/DataQualityDashboard.html
+++ b/docs/articles/DataQualityDashboard.html
@@ -181,7 +181,7 @@

Getting Started

Clair Blacketer

-

2024-06-29

+

2024-07-11

Source: vignettes/DataQualityDashboard.rmd
diff --git a/docs/articles/DqdForCohorts.html b/docs/articles/DqdForCohorts.html
index 8db9ac2c..d9530035 100644
--- a/docs/articles/DqdForCohorts.html
+++ b/docs/articles/DqdForCohorts.html
@@ -181,7 +181,7 @@

Running the DQD on a Cohort

Clair Blacketer

-

2024-06-29

+

2024-07-11

Source: vignettes/DqdForCohorts.rmd
diff --git a/docs/articles/SqlOnly.html b/docs/articles/SqlOnly.html
index 1c4a7d4c..71653b00 100644
--- a/docs/articles/SqlOnly.html
+++ b/docs/articles/SqlOnly.html
@@ -181,7 +181,7 @@

Running the DQD in SqlOnly mode

Maxim Moinat

-

2024-06-29

+

2024-07-11

Source: vignettes/SqlOnly.rmd
diff --git a/docs/articles/Thresholds.html b/docs/articles/Thresholds.html
index 1d94bcf7..7bcce30c 100644
--- a/docs/articles/Thresholds.html
+++ b/docs/articles/Thresholds.html
@@ -181,7 +181,7 @@

Failure Thresholds and How to Change Them

Clair Blacketer

-

2024-06-29

+

2024-07-11

Source: vignettes/Thresholds.rmd
diff --git a/docs/articles/checkIndex.html b/docs/articles/checkIndex.html
index 289565c4..9289ad6a 100644
--- a/docs/articles/checkIndex.html
+++ b/docs/articles/checkIndex.html
@@ -181,7 +181,7 @@

Index

Katy Sadowski

-

2024-06-29

+

2024-07-11

Source: vignettes/checkIndex.Rmd
diff --git a/docs/articles/checks/cdmDatatype.html b/docs/articles/checks/cdmDatatype.html
index c0b9d6c7..bd765aeb 100644
--- a/docs/articles/checks/cdmDatatype.html
+++ b/docs/articles/checks/cdmDatatype.html
@@ -181,7 +181,7 @@

cdmDatatype

Katy Sadowski

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/cdmDatatype.Rmd
diff --git a/docs/articles/checks/cdmField.html b/docs/articles/checks/cdmField.html
index 99406a82..3464b260 100644
--- a/docs/articles/checks/cdmField.html
+++ b/docs/articles/checks/cdmField.html
@@ -181,7 +181,7 @@

cdmField

Heidi Schmidt, Katy Sadowski

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/cdmField.Rmd
diff --git a/docs/articles/checks/cdmTable.html b/docs/articles/checks/cdmTable.html
index 3150fd31..03d2cd40 100644
--- a/docs/articles/checks/cdmTable.html
+++ b/docs/articles/checks/cdmTable.html
@@ -181,7 +181,7 @@

cdmTable

John Gresh, Katy Sadowski

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/cdmTable.Rmd
diff --git a/docs/articles/checks/fkClass.html b/docs/articles/checks/fkClass.html
index 2904a2e3..bbfbd439 100644
--- a/docs/articles/checks/fkClass.html
+++ b/docs/articles/checks/fkClass.html
@@ -181,7 +181,7 @@

fkClass

Clair Blacketer, Katy Sadowski

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/fkClass.Rmd
diff --git a/docs/articles/checks/fkDomain.html b/docs/articles/checks/fkDomain.html
index d90b3fb0..bc151de1 100644
--- a/docs/articles/checks/fkDomain.html
+++ b/docs/articles/checks/fkDomain.html
@@ -181,7 +181,7 @@

fkDomain

Clair Blacketer, Katy Sadowski

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/fkDomain.Rmd
diff --git a/docs/articles/checks/isForeignKey.html b/docs/articles/checks/isForeignKey.html
index 3983b9fa..49b25eeb 100644
--- a/docs/articles/checks/isForeignKey.html
+++ b/docs/articles/checks/isForeignKey.html
@@ -181,7 +181,7 @@

isForeignKey

Dmytry Dymshyts, Katy Sadowski

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/isForeignKey.Rmd
@@ -246,7 +246,7 @@

User GuidanceViolated rows query
-- @cdmTableName.@cdmFieldName is the x_concept_id or x_source_concept_id field in a CDM table
--- Inspect the contents of the x_source_value field to investigate the source of the error
+
-- @cdmTableName.@cdmFieldName is the _concept_id or _source_concept_id field in a CDM table
+-- Inspect the contents of the _source_value field to investigate the source of the error
 
 SELECT 
   '@cdmTableName.@cdmFieldName' AS violating_field,  
diff --git a/docs/articles/checks/isPrimaryKey.html b/docs/articles/checks/isPrimaryKey.html
index 8fdef03a..cc5d3275 100644
--- a/docs/articles/checks/isPrimaryKey.html
+++ b/docs/articles/checks/isPrimaryKey.html
@@ -181,7 +181,7 @@ 

isPrimaryKey

John Gresh, Katy Sadowski

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/isPrimaryKey.Rmd
diff --git a/docs/articles/checks/isRequired.html b/docs/articles/checks/isRequired.html
index 994448e6..3da6dfaa 100644
--- a/docs/articles/checks/isRequired.html
+++ b/docs/articles/checks/isRequired.html
@@ -181,7 +181,7 @@

isRequired

Katy Sadowski

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/isRequired.Rmd
@@ -253,9 +253,9 @@

ETL DevelopersFill in the missing values:

@@ -240,36 +239,43 @@

ETL DevelopersX_concept_id -column with 0. See the Book of OHDSI for additional guidance on the -concept mapping process: https://ohdsi.github.io/TheBookOfOhdsi/ExtractTransformLoad.html#step-2-create-the-code-mappings

+for a source code, you MUST populate its _concept_id column +with 0. See the Book of OHDSI for additional guidance on the concept +mapping process: https://ohdsi.github.io/TheBookOfOhdsi/ExtractTransformLoad.html#step-2-create-the-code-mappings

You may inspect the failing rows using the following SQL:

SELECT  
   '@cdmTableName.@cdmFieldName' AS violating_field,  
-  cdmTable.*  
-FROM @schema.@cdmTableName cdmTable 
-  JOIN @vocabDatabaseSchema.concept co ON cdmTable.@cdmFieldName = co.concept_id 
-WHERE co.concept_id != 0  
-  AND (co.standard_concept != 'S' OR co.invalid_reason IS NOT NULL) 
-

You may build upon this query by joining the relevant -X_concept_id and X_source_concept_id columns -to the concept table and inspecting their names and vocabularies. If the -X_source_concept_id correctly represents the source code in -X_source_value, the fix will be a matter of ensuring your -ETL is correctly using the concept_relationship table to map the source -concept ID to a standard concept via the ‘Maps to’ relationship. If you -are not populating the X_source_concept_id column and/or -are using an intermediate concept mapping table, you may need to inspect -the mappings in your mapper table to ensure they’ve been generated -correctly using the ‘Maps to’ relationship for your CDM’s vocabulary -version.

+ cdmTable.*, + co.* +FROM @schema.@cdmTableName cdmTable + JOIN @vocabDatabaseSchema.concept co ON cdmTable.@cdmFieldName = co.concept_id +WHERE co.concept_id != 0 + AND (co.standard_concept != 'S' OR co.invalid_reason IS NOT NULL)

+

You may build upon this query by joining the +_source_concept_id column to the concept table and +inspecting the source concepts from which the failing non-standard +concepts were mapped. If the _source_concept_id correctly +represents the source code in _source_value, the fix will +be a matter of ensuring your ETL is correctly using the +concept_relationship table to map the source concept ID to a standard +concept via the ‘Maps to’ relationship. If you are not populating the +_source_concept_id column and/or are using an intermediate +concept mapping table, you may need to inspect the mappings in your +mapper table to ensure they’ve been generated correctly using the ‘Maps +to’ relationship for your CDM’s vocabulary version.

+

Also note that when updating the OMOP vocabularies, previously +standard concepts could have become non-standard and need +remapping. Often this remapping can be done programmatically, by +following the ‘Maps to’ relationship to the new standard concept.
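
The remapping described above can be sketched as follows. This is a minimal illustration only, assuming toy in-memory tables in SQLite with made-up concept IDs and a `condition_occurrence` table standing in for any CDM event table; it is not the DQD's own implementation. It follows the valid ‘Maps to’ relationship from each failing non-standard concept to its current standard concept.

```python
import sqlite3

# Toy in-memory stand-ins for the OMOP vocabulary and one CDM table.
# All concept IDs below are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concept (
  concept_id INTEGER PRIMARY KEY,
  standard_concept TEXT,
  invalid_reason TEXT
);
CREATE TABLE concept_relationship (
  concept_id_1 INTEGER,
  concept_id_2 INTEGER,
  relationship_id TEXT,
  invalid_reason TEXT
);
CREATE TABLE condition_occurrence (
  condition_occurrence_id INTEGER PRIMARY KEY,
  condition_concept_id INTEGER
);
INSERT INTO concept VALUES (100, NULL, 'U');  -- formerly standard, now deprecated
INSERT INTO concept VALUES (200, 'S', NULL);  -- current standard concept
INSERT INTO concept_relationship VALUES (100, 200, 'Maps to', NULL);
INSERT INTO condition_occurrence VALUES (1, 100);  -- needs remapping
INSERT INTO condition_occurrence VALUES (2, 200);  -- already standard
""")

# For each row holding a non-standard or invalid concept, look up the
# standard concept it maps to via a valid 'Maps to' relationship.
# The IS NULL test is added here because a NULL standard_concept would
# otherwise not satisfy the != 'S' comparison.
remap_sql = """
SELECT t.condition_occurrence_id,
       t.condition_concept_id AS old_concept_id,
       cr.concept_id_2        AS new_concept_id
FROM condition_occurrence t
JOIN concept co ON co.concept_id = t.condition_concept_id
JOIN concept_relationship cr
  ON cr.concept_id_1 = co.concept_id
 AND cr.relationship_id = 'Maps to'
 AND cr.invalid_reason IS NULL
JOIN concept c2
  ON c2.concept_id = cr.concept_id_2
 AND c2.standard_concept = 'S'
 AND c2.invalid_reason IS NULL
WHERE co.concept_id != 0
  AND (co.standard_concept IS NULL
       OR co.standard_concept != 'S'
       OR co.invalid_reason IS NOT NULL)
ORDER BY t.condition_occurrence_id
"""
remapped = conn.execute(remap_sql).fetchall()
print(remapped)
```

Rows for which this query returns no match have no valid ‘Maps to’ target and would still need manual review. A source concept may also map to more than one standard concept, in which case the query returns several rows for the same record.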

Data Users

This check failure means that the failing rows will not be picked up -in a standard OHDSI analysis. It is highly recommended to work with your -ETL team or data provider, if possible, to resolve this issue.

+in a standard OHDSI analysis. Especially when participating in network +research, where only standard concepts are used, this might lead to +invalid results. It is highly recommended to work with your ETL team or +data provider, if possible, to resolve this issue.

However, you may work around it at your own risk by determining whether or not the affected rows are relevant for your analysis. Here’s an example query you could run to inspect failing rows in the
diff --git a/docs/articles/checks/measureConditionEraCompleteness.html b/docs/articles/checks/measureConditionEraCompleteness.html
index e8ce2a9f..2689e33c 100644
--- a/docs/articles/checks/measureConditionEraCompleteness.html
+++ b/docs/articles/checks/measureConditionEraCompleteness.html
@@ -180,7 +180,7 @@

measureConditionEraCompleteness

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/measureConditionEraCompleteness.Rmd
diff --git a/docs/articles/checks/measurePersonCompleteness.html b/docs/articles/checks/measurePersonCompleteness.html
index 4e61a71d..3d68ce34 100644
--- a/docs/articles/checks/measurePersonCompleteness.html
+++ b/docs/articles/checks/measurePersonCompleteness.html
@@ -181,7 +181,7 @@

measurePersonCompleteness

Katy Sadowski

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/measurePersonCompleteness.Rmd
@@ -193,7 +193,8 @@

2024-06-29

Summary

-

Level: TABLE
Context: Validation
Category: Completeness
Subcategory:
Severity: CDM convention ⚠ Characterization ✔

+

Level: TABLE
Context: Validation
Category: Completeness
Subcategory:
Severity: CDM convention ⚠ (for observation period), +Characterization ✔  (for all other tables)

diff --git a/docs/articles/checks/measureValueCompleteness.html b/docs/articles/checks/measureValueCompleteness.html
index f0be5e07..114f34ad 100644
--- a/docs/articles/checks/measureValueCompleteness.html
+++ b/docs/articles/checks/measureValueCompleteness.html
@@ -181,7 +181,7 @@

measureValueCompleteness

Katy Sadowski

-

2024-06-29

+

2024-07-11

Source:
vignettes/checks/measureValueCompleteness.Rmd
@@ -216,8 +216,12 @@

Definition
@@ -248,13 +252,13 @@

Violated rows query

ETL Developers

-

Failures of this check on fields required in the CDM specification -are redundant with failures of isRequired. See isRequired documentation for more -information.

+

Failures of this check on required fields are redundant with failures +of isRequired. See isRequired +documentation for more information.

ETL developers have 2 main options for the use of this check on non-required fields:

    -
  • The check threshold may be set to 100% for non-required fields such +
  • The check threshold may be left at 100% for non-required fields such that the check will never fail. The check result can be used simply to understand completeness for these fields
  • The check threshold may be set to an appropriate value corresponding
@@ -264,19 +268,17 @@

    ETL Developers

Unexpectedly missing values should be investigated for a potential -root cause in the ETL. For expected missingness, rows that violate this -check in non-required fields are acceptable but should be clearly -communicated to data users so that they can know when and when not to -expect data to be present in each field. To avoid confusion for users, -however, thresholds should be modified to avoid check failures at -expected levels.

+root cause in the ETL. If a threshold has been adjusted to account for +expected missingness, this should be clearly communicated to data users +so that they can know when and when not to expect data to be present in +each field.

Data Users

This check informs you of the level of missing data in each column of -the CDM. If data is missing in a required column, see the isRequired -documentation for more information.

+the CDM. If data is missing in a required column, see the +isRequired documentation for more information.

The interpretation of a check failure on a non-required column will depend on the context. In some cases, the threshold for this check will have been very deliberately set, and any failure should be cause for
diff --git a/docs/articles/checks/plausibleAfterBirth.html b/docs/articles/checks/plausibleAfterBirth.html
index 610bc1e4..cacfbb0a 100644
--- a/docs/articles/checks/plausibleAfterBirth.html
+++ b/docs/articles/checks/plausibleAfterBirth.html
@@ -181,7 +181,7 @@

plausibleAfterBirth

Maxim Moinat, Katy Sadowski

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/plausibleAfterBirth.Rmd
diff --git a/docs/articles/checks/plausibleBeforeDeath.html b/docs/articles/checks/plausibleBeforeDeath.html
index 2e91b839..a2d11704 100644
--- a/docs/articles/checks/plausibleBeforeDeath.html
+++ b/docs/articles/checks/plausibleBeforeDeath.html
@@ -181,7 +181,7 @@

plausibleBeforeDeath

Maxim Moinat

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/plausibleBeforeDeath.Rmd
diff --git a/docs/articles/checks/plausibleGenderUseDescendants.html b/docs/articles/checks/plausibleGenderUseDescendants.html
index c4e8155a..2e92d60d 100644
--- a/docs/articles/checks/plausibleGenderUseDescendants.html
+++ b/docs/articles/checks/plausibleGenderUseDescendants.html
@@ -180,7 +180,7 @@

plausibleGender

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/plausibleGenderUseDescendants.Rmd
diff --git a/docs/articles/checks/plausibleStartBeforeEnd.html b/docs/articles/checks/plausibleStartBeforeEnd.html
index f80c0a9b..489a615d 100644
--- a/docs/articles/checks/plausibleStartBeforeEnd.html
+++ b/docs/articles/checks/plausibleStartBeforeEnd.html
@@ -181,7 +181,7 @@

plausibleStartBeforeEnd

Maxim Moinat

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/plausibleStartBeforeEnd.Rmd
diff --git a/docs/articles/checks/plausibleTemporalAfter.html b/docs/articles/checks/plausibleTemporalAfter.html
index 3a01db40..2c11f327 100644
--- a/docs/articles/checks/plausibleTemporalAfter.html
+++ b/docs/articles/checks/plausibleTemporalAfter.html
@@ -180,7 +180,7 @@

plausibleTemporalAfter

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/plausibleTemporalAfter.Rmd
diff --git a/docs/articles/checks/plausibleUnitConceptIds.html b/docs/articles/checks/plausibleUnitConceptIds.html
index 6855948c..cf0df5fb 100644
--- a/docs/articles/checks/plausibleUnitConceptIds.html
+++ b/docs/articles/checks/plausibleUnitConceptIds.html
@@ -180,7 +180,7 @@

plausibleUnitConceptIds

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/plausibleUnitConceptIds.Rmd
diff --git a/docs/articles/checks/plausibleValueHigh.html b/docs/articles/checks/plausibleValueHigh.html
index 799422ff..7879501f 100644
--- a/docs/articles/checks/plausibleValueHigh.html
+++ b/docs/articles/checks/plausibleValueHigh.html
@@ -181,7 +181,7 @@

plausibleValueHigh

Dymytry Dymshyts

-

2024-06-29

+

2024-07-11

Source: vignettes/checks/plausibleValueHigh.Rmd
@@ -234,7 +234,7 @@

Definition
diff --git a/docs/articles/checks/plausibleValueLow.html b/docs/articles/checks/plausibleValueLow.html
index 9cdbe4b0..40d10b8a 100644
--- a/docs/articles/checks/plausibleValueLow.html
+++ b/docs/articles/checks/plausibleValueLow.html
@@ -181,7 +181,7 @@

plausibleValueLow

Dymytry Dymshyts

-

2024-06-29

+

2024-07-11

Source:
vignettes/checks/plausibleValueLow.Rmd
@@ -247,7 +247,7 @@

Definition
diff --git a/docs/articles/checks/sourceConceptRecordCompleteness.html b/docs/articles/checks/sourceConceptRecordCompleteness.html
index b5f59d0b..2d4effb4 100644
--- a/docs/articles/checks/sourceConceptRecordCompleteness.html
+++ b/docs/articles/checks/sourceConceptRecordCompleteness.html
@@ -181,7 +181,7 @@

sourceConceptRecordCompleteness

Katy Sadowski

-

2024-06-29

+

2024-07-11

Source:
vignettes/checks/sourceConceptRecordCompleteness.Rmd
@@ -207,29 +207,23 @@

DefinitionSource concept mapping
  • CDM Fields/Tables: All source concept ID -(X_source_concept_id) columns in all event tables.
  • +(_source_concept_id) columns in all event tables.
  • Default Threshold Value:
      -
    • 10 for primary source concept ID columns in condition, drug, -measurement, procedure, device, and observation tables
    • -
    • 100 for all other source concept ID columns
    • +
    • 10% for source concept ID columns in condition, drug, measurement, +procedure, device, and observation tables
    • +
    • 100% for all other source concept ID columns
  • @@ -247,11 +241,11 @@

    User Guidance

    ETL Developers

    -

    Recall that the X_source_concept_id columns should +

    Recall that the _source_concept_id columns should contain the OMOP concept representing the exact code used in the source -data for a given record: “If the is coded in the source -data using an OMOP supported vocabulary put the concept id representing -the source value here.”

    +data for a given record: “If the <_source_value> is coded in the +source data using an OMOP supported vocabulary put the concept id +representing the source value here.”

    A failure of this check usually indicates a failure to map a source value to an OMOP concept. In some cases, such a failure can and should be remediated in the concept-mapping step of the ETL. In other cases, it
    @@ -259,25 +253,24 @@

    ETL DevelopersTo investigate the failure, run the following query:

    SELECT  
       concept.concept_name AS standard_concept_name, 
    -  cdmTable.X_concept_id, -- standard concept ID field for the table 
    +  cdmTable._concept_id, -- standard concept ID field for the table 
       c2.concept_name AS source_value_concept_name, 
    -  cdmTable.X_source_value, -- source value field for the table 
    +  cdmTable._source_value, -- source value field for the table 
       COUNT(*) 
     FROM @cdmDatabaseSchema.@cdmTableName cdmTable 
    -LEFT JOIN @vocabDatabaseSchema.concept ON concept.concept_id = cdmTable.X_concept_id 
    +LEFT JOIN @vocabDatabaseSchema.concept ON concept.concept_id = cdmTable._concept_id 
     -- WARNING this join may cause fanning if a source value exists in multiple vocabularies 
    -LEFT JOIN @vocabDatabaseSchema.concept c2 ON concept.concept_code = cdmTable.X_source_value 
    +LEFT JOIN @vocabDatabaseSchema.concept c2 ON c2.concept_code = cdmTable._source_value 
     AND c2.domain_id = <Domain of cdmTable> 
     WHERE cdmTable.@cdmFieldName = 0  
    --- AND cdmTable.value_as_number IS NOT NULL -- uncomment for unit_concept_id checks 
    -GROUP BY 1,2,3 
    -ORDER BY 4 DESC 
    +GROUP BY 1,2,3,4 +ORDER BY 5 DESC

    The query results will give you a summary of the source codes which failed to map to an OMOP concept. Inspecting this data should give you an initial idea of what might be going on.

    If source values return legitimate matches on concept_code, it’s possible that there is an error in the concept mapping step of your ETL. -Please note that while the X_source_concept_id fields are +Please note that while the _source_concept_id fields are technically not required, it is highly recommended to populate them with OMOP concepts whenever possible. This will greatly aid analysts in understanding the provenance of the data.

    @@ -292,10 +285,10 @@

    Data UsersJared Houghtaling, Clair Blacketer

    -

    2024-06-29

    +

    2024-07-11

    Source:
    vignettes/checks/sourceValueCompleteness.Rmd
    @@ -205,11 +205,11 @@

    DefinitionDefinitionKaty Sadowski

    -

    2024-06-29

    +

    2024-07-11

    Source:
    vignettes/checks/standardConceptRecordCompleteness.Rmd
    @@ -207,10 +207,10 @@

    DefinitionDefinitionETL DevelopersTo investigate the failure, run the following query:

    SELECT  
       concept_name, 
    -  cdmTable.X_source_concept_id, -- source concept ID field for the table 
    -  cdmTable.X_source_value, -- source value field for the table 
    +  cdmTable._source_concept_id, -- source concept ID field for the table 
    +  cdmTable._source_value, -- source value field for the table 
       COUNT(*) 
     FROM @cdmDatabaseSchema.@cdmTableName cdmTable 
    -LEFT JOIN @vocabDatabaseSchema.concept ON concept.concept_id = cdmTable.X_source_concept_id 
    +LEFT JOIN @vocabDatabaseSchema.concept ON concept.concept_id = cdmTable._source_concept_id 
     WHERE cdmTable.@cdmFieldName = 0  
     -- AND cdmTable.value_as_number IS NOT NULL -- uncomment for unit_concept_id checks 
     GROUP BY 1,2,3 
    @@ -327,11 +328,12 @@ 

    ETL Developers
  • Finally, if the investigation query returns no source value, you must trace the relevant record(s) back to their source and confirm if -the missing value is expected. If not, identify and fix the related -issue in your ETL. If the record legitimately has no value/code in the -source data, then the standard concept ID may be left as 0. However, in -some cases these “code-less” records represent junk data which should be -filtered out in the ETL. The proper approach will be context-dependent +the missing source value is expected. If not, identify and fix the +related issue in your ETL. If the record legitimately has no value/code +in the source data, then the standard concept ID may be left as 0. +However, in some cases these “code-less” records represent junk data +which should be filtered out in the ETL. The proper approach will be +context-dependent