Add survey criteria #970

tjennison-work · 2024-08-28T18:09:11Z

Loads all survey data for real time filtering.
Improves question/answer name handling, though indexing changes will
be needed to handle global search correctly.
Multi-select plus instance level data still to be dealt with.
Config changes will come separately since they'll break stored data.
Copies some filter generation code from entity group criteria but this
allows the config and selection data to be distinct from entity group
criteria. There is already some divergence and this allows them to
evolve separately. This wasn't originally planned but handling
backwards compatibility while switching to a new
plugin/config/selection data seemed pretty sketchy.
Fixes a bug where entity group criteria with modifiers but no
selection weren't handled correctly.
Testing is weird because we don't currently have survey data. The
basics are the same as entity group criteria so using the same data
for now but TBD how we want to handle this.

marikomedlock · 2024-08-29T13:57:16Z

ui/src/criteria/survey.tsx

+// "survey" plugins are designed to handle medium sized amount of survey data in
+// an optimized fashion.


nit: Any clarification possible on "medium sized" -- maybe < 1GB or <100k rows (from fetchAll limit below)?

marikomedlock · 2024-08-29T13:58:06Z

ui/src/criteria/survey.tsx

+      // TODO(tjennison): There's no way to get the question information for
+      // answers added via global search. We'll need to get the question
+      // information at index time.


Would creating a new attribute e.g. label that is the concatenation of [question] : [answer] address this for global search? Or maybe that's what you mean by "at index time".

Using the newly added isLeaf filter to not return answers per our discussion.

marikomedlock · 2024-08-29T14:13:25Z

underlay/src/main/proto/criteriaselector/configschema/survey.proto

+  repeated EntityGroupConfig entity_groups = 2;
+
+  // Optional configuration of a categorical or numeric value associated with
+  // the selection (e.g. a measurement value). Applied to the entire selection


nit: "e.g. a survey numeric answer value"?

marikomedlock · 2024-08-29T14:14:41Z

underlay/src/main/proto/criteriaselector/configschema/survey.proto

+  // Entity groups where the related entity is what is selected (e.g. condition
+  // when filtering condition_occurrences).


nit: "e.g. surveyBasics when filtering surveyOccurrence"?

marikomedlock · 2024-08-29T14:15:26Z

underlay/src/main/proto/criteriaselector/dataschema/survey.proto

+    // The key of the selected value, which references a related entity (e.g.
+    // condition for a condition_occurrence).


nit: "e.g. surveyBasics for a surveyOccurrence"?

marikomedlock · 2024-08-29T14:15:46Z

underlay/src/main/proto/criteriaselector/dataschema/survey.proto

+  // Data for an additional categorical or numeric value associated with the
+  // selection (e.g. a measurement value).


nit: "e.g. a survey numeric answer value"?

marikomedlock · 2024-08-29T14:27:27Z

underlay/src/main/java/bio/terra/tanagra/filterbuilder/impl/core/SurveyFilterBuilder.java

+@SuppressFBWarnings(
+    value = "NP_UNWRITTEN_PUBLIC_OR_PROTECTED_FIELD",
+    justification = "The config and data objects are deserialized by Jackson.")
+public class SurveyFilterBuilder extends FilterBuilder {


Chunks of this class are pretty similar to the existing EntityGroupFilterBuilder. I think it would be better to move those chunks into a shared base class or a utility class. Perhaps using a generic type would be the least amount of up-front work (e.g. <T> = <DTEntityGroup.EntityGroup> or <DTSurvey.Survey>).

Cleaned this up into a shared base class. The generic type didn't seem like the way to go, because some of the names don't match and the interface required to handle the builders would be fairly complex.
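For readers skimming this thread, a minimal sketch of the shared-base-class shape, not the actual code in the PR. Every name here (`SharedBaseSketch`, `BuilderBase`, `selectedIds`, `describeFilter`, and the two subclasses) is hypothetical, and the "filter" is just a string. The idea is only that the base class owns the common filter generation while each subclass adapts its own selection data.

```java
import java.util.List;

public class SharedBaseSketch {
  // Hypothetical base class: owns the filter generation that both plugins share.
  public abstract static class BuilderBase {
    // Each subclass adapts its own selection data into a plain list of ids.
    protected abstract List<Long> selectedIds();

    // Stand-in for the shared filter generation; returns a description, not a real filter.
    public String describeFilter() {
      return "ids in " + selectedIds();
    }
  }

  // Hypothetical stand-in for a builder backed by entity group selection data.
  public static class EntityGroupBuilder extends BuilderBase {
    private final List<Long> conceptIds;

    public EntityGroupBuilder(List<Long> conceptIds) {
      this.conceptIds = conceptIds;
    }

    @Override
    protected List<Long> selectedIds() {
      return conceptIds;
    }
  }

  // Hypothetical stand-in for a builder backed by survey selection data.
  public static class SurveyBuilder extends BuilderBase {
    private final List<Long> answerIds;

    public SurveyBuilder(List<Long> answerIds) {
      this.answerIds = answerIds;
    }

    @Override
    protected List<Long> selectedIds() {
      return answerIds;
    }
  }

  public static void main(String[] args) {
    System.out.println(new EntityGroupBuilder(List.of(1L, 2L)).describeFilter());
    System.out.println(new SurveyBuilder(List.of(7L)).describeFilter());
  }
}
```

With this shape the config and selection data types can keep diverging, since only the small adapter methods in each subclass need to know about them.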

marikomedlock · 2024-08-29T14:31:07Z

underlay/src/test/java/bio/terra/tanagra/filterbuilder/SurveyFilterBuilderTest.java

+import org.junit.jupiter.api.BeforeEach;
+import org.junit.jupiter.api.Test;
+
+public class SurveyFilterBuilderTest {


I think these tests should use actual survey entities instead of the same ones the entity group filter builder tests use. The filter builder tests are just testing that the right filter object is created, I don't think they're running any queries (you can confirm by running them in the noCloudAccessRequiredTests mode). So you could use the AoU config files here.

These are removed now that the code is mostly shared. We can start adding survey specific tests when the functionality starts to diverge.

marikomedlock · 2024-09-11T21:25:33Z

...ay/src/main/java/bio/terra/tanagra/filterbuilder/impl/core/EntityGroupFilterBuilderBase.java

+
+    selectedIdsPerEntityGroup.forEach(
+        (entityGroup, selectedIds) -> {
+          Collections.sort(selectedIds);


nit: Could you add a comment here that the ids will always be in sorted order in the generated SQL, not in the order of the defined criteria?
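A small sketch of the behavior the requested comment would document, under simplifying assumptions: string keys stand in for `EntityGroup`, and `SortedIdsSketch` and `sortSelectedIds` are hypothetical names. Because each id list is sorted before anything is generated from it, the ids come out in ascending order regardless of the order in which the criteria were defined.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class SortedIdsSketch {
  // Hypothetical stand-in for the per-entity-group id map; the real code keys by EntityGroup.
  public static Map<String, List<Long>> sortSelectedIds(
      Map<String, List<Long>> selectedIdsPerEntityGroup) {
    Map<String, List<Long>> sorted = new LinkedHashMap<>();
    selectedIdsPerEntityGroup.forEach(
        (entityGroup, selectedIds) -> {
          // Copy first so that immutable input lists are also accepted.
          List<Long> copy = new ArrayList<>(selectedIds);
          // Ids end up in ascending order, not in the order the criteria were defined.
          Collections.sort(copy);
          sorted.put(entityGroup, copy);
        });
    return sorted;
  }

  public static void main(String[] args) {
    Map<String, List<Long>> selected = new LinkedHashMap<>();
    selected.put("surveyBasics", List.of(30L, 10L, 20L));
    System.out.println(sortSelectedIds(selected)); // prints {surveyBasics=[10, 20, 30]}
  }
}
```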

marikomedlock · 2024-09-11T21:39:46Z

...ay/src/main/java/bio/terra/tanagra/filterbuilder/impl/core/EntityGroupFilterBuilderBase.java

+    if (!selectedIdsPerEntityGroup.isEmpty()) {
+      selectedEntityGroups = new ArrayList<>(selectedIdsPerEntityGroup.keySet());
+    } else {
+      selectedEntityGroups =


This means that if we add a criteria with no selected ids (e.g. condition multi-select and check no boxes), we'll get a different filter now than we would have before. Previously, we'd have returned a null filter, so the effect would be to show counts for the entire dataset. Now, we'll return a filter for any condition occurrence data, so the effect may be to show counts for a subset. Is that correct?

That's actually not the case, because buildForCohort and buildForDataFeature exit early for criteria with no selection or modifiers. In the modifiers-only case this is a change, as expected.

That being said, I think changing the behavior is actually correct. Adding an empty condition filter should filter on all people who have a condition rather than doing nothing. I'll probably follow up to remove the early exit but I didn't want to put anything else into this PR if I didn't need to.

Ah right, I forgot about the early exit conditional at the top. Agreed that changing the behavior would make more sense.
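To make the three cases in this thread concrete, here is a rough sketch under loud assumptions: `EarlyExitSketch` and `buildFilter` are hypothetical, the modifier name is made up, and the "filter" is a plain string instead of a real filter object. It mirrors only the behavior described above: no selection and no modifiers exits early with no filter, while modifiers with no selection now produce a filter over any occurrence.

```java
import java.util.List;
import java.util.Optional;

public class EarlyExitSketch {
  // Hypothetical simplification: returns a description instead of a real filter object.
  public static Optional<String> buildFilter(List<Long> selectedIds, List<String> modifiers) {
    // Early exit: nothing selected and no modifiers means no filter at all.
    if (selectedIds.isEmpty() && modifiers.isEmpty()) {
      return Optional.empty();
    }
    // No selection but some modifiers: filter on any occurrence in the configured entity groups.
    String base = selectedIds.isEmpty() ? "any occurrence" : "occurrence of ids " + selectedIds;
    return Optional.of(modifiers.isEmpty() ? base : base + " with modifiers " + modifiers);
  }

  public static void main(String[] args) {
    System.out.println(buildFilter(List.of(), List.of()));
    System.out.println(buildFilter(List.of(), List.of("ageAtOccurrence")));
    System.out.println(buildFilter(List.of(5L), List.of()));
  }
}
```

Removing the early exit, as suggested as a follow-up, would correspond to dropping the first `if` so that an empty selection with no modifiers also yields the "any occurrence" filter.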

marikomedlock · 2024-09-12T13:16:18Z

...ay/src/main/java/bio/terra/tanagra/filterbuilder/impl/core/EntityGroupFilterBuilderBase.java

+    return selectedIdsPerEntityGroup;
+  }
+
+  private List<EntityGroup> selectedEntityGroups(


nit: Could you add a comment here that this returns the list of entity groups that the user selected ids from, or the list of configured entity groups if the user selected no ids? I know that now from our discussions and reviewing this code, but seems like something I'm going to forget.
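One possible wording for that comment, shown on a simplified sketch of the method. This is an assumption-laden illustration: string keys replace `EntityGroup`, the `configuredEntityGroups` parameter is hypothetical, and the fallback branch is reconstructed from the description in this comment, not from the truncated diff above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class SelectedEntityGroupsSketch {
  /**
   * Returns the entity groups that the user selected ids from. If the user selected no ids,
   * returns all configured entity groups instead.
   */
  public static List<String> selectedEntityGroups(
      Map<String, List<Long>> selectedIdsPerEntityGroup, List<String> configuredEntityGroups) {
    if (!selectedIdsPerEntityGroup.isEmpty()) {
      return new ArrayList<>(selectedIdsPerEntityGroup.keySet());
    }
    // Hypothetical fallback: no ids selected, so every configured group applies.
    return new ArrayList<>(configuredEntityGroups);
  }

  public static void main(String[] args) {
    List<String> configured = List.of("groupA", "groupB");
    System.out.println(selectedEntityGroups(Map.of("groupB", List.of(1L)), configured));
    System.out.println(selectedEntityGroups(Map.of(), configured));
  }
}
```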

marikomedlock · 2024-09-12T13:19:20Z

RE survey tests: Now that we have the AoU CT test data indexed in verily-tanagra-dev, we could run tests against that, as long as the VUMC folks are okay with that. I imagine they will be, since we got the okay to run tests against the 2019 test data -- but that was several years ago now, so probably good to double check.


tjennison-work · 2024-09-12T14:40:43Z

RE survey tests: Now that we have the AoU CT test data indexed in verily-tanagra-dev, we could run tests against that, as long as the VUMC folks are okay with that. I imagine they will be, since we got the okay to run tests against the 2019 test data -- but that was several years ago now, so probably good to double check.

Sounds good. I'll check with them on Wednesday.

tjennison-work force-pushed the tj-newsurvey branch 2 times, most recently from 7275ced to 40fa78e Compare August 28, 2024 19:10

tjennison-work requested a review from marikomedlock August 28, 2024 20:14

marikomedlock reviewed Aug 29, 2024


tjennison-work force-pushed the tj-newsurvey branch 10 times, most recently from a576a29 to c560da9 Compare September 10, 2024 18:45

tjennison-work requested a review from marikomedlock September 10, 2024 18:58

tjennison-work force-pushed the tj-newsurvey branch from c560da9 to 6509ca8 Compare September 11, 2024 14:18

marikomedlock approved these changes Sep 12, 2024


tjennison-work force-pushed the tj-newsurvey branch from 6509ca8 to 3a948d7 Compare September 12, 2024 14:38

tjennison-work force-pushed the tj-newsurvey branch from 3a948d7 to 34de726 Compare September 12, 2024 14:39

tjennison-work merged commit 1c58d1e into main Sep 12, 2024
8 checks passed

tjennison-work deleted the tj-newsurvey branch September 12, 2024 14:59
