
[PROPOSAL & DISCUSSION] Organization-wide Testing Requirements #135

Open

stephen-crawford opened this issue Feb 17, 2023 · 11 comments

Labels: campaign (Parent issues of OpenSearch release campaigns), discuss (Issues calling for discussion)

Comments

@stephen-crawford
TL;DR: What do you want the organization to use as testing requirements? What certifies that something is "done"?

What are you proposing?

A major topic discussed at the milestone meetings hosted by @CEHENKLE has been establishing a universal quality bar. An actionable mechanism for this is using GitHub Actions and checks to assert that changes meet expectations.

This issue should be used as a discussion board for what these expectations are. Everyone across the @opensearch-project organization is encouraged to contribute, as this is a change that will affect us all. The final result of this discussion will be a "universal" set of expectations which all repositories will employ. This means that maintainers will have an established quality bar for merging changes, and that contributors can know all requirements before they start working.

What users have asked for this feature?

The milestone group worked with numerous repositories to learn their pain points: the things that were preventing them from releasing on time and making a great product. One of the consistent patterns was a lack of established expectations. One group wanted changes to have "a, b, and c," while another wanted "x, y, and z." This proposal seeks to establish a set of accepted expectations across all repositories.

What is the contributor experience going to be?

Under the new system, a contributor will be able to reference the CONTRIBUTING.md document of any @opensearch-project repository and see the repository's expectations. Most of these expectations will be standardized across the organization, but there may be additional requirements depending on each repository's use case. For example, the Security repository may require that the plugin install workflow passes, but this would not be required for the whole organization.

This discussion establishes a base quality bar across the organization. If any repository wants to add further requirements specific to its needs, that is encouraged. It will need to list the additional requirements alongside the standardized ones in its CONTRIBUTING.md.

How does this impact flaky tests?

Currently, we are trying to reduce the number of flaky tests in the project. Unfortunately, this is easier said than done, and the process is ongoing. Part of this discussion should be establishing expectations for handling existing flaky tests, and the new quality bar will likely expect that no new flaky tests are introduced. One way to do this is to run all new tests in a separate workflow that is isolated from the older tests. This prevents currently flaky tests from impacting newly added changes.
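As a loose sketch of that isolation idea (the workflow name, test-file naming pattern, and Gradle invocation below are assumptions for illustration, not an existing org workflow), a GitHub Actions job could diff the pull request against its base branch to find newly added test files and run only those:

```yaml
# Hypothetical sketch: run only the tests added in this PR, in their own
# workflow, so pre-existing flaky tests cannot fail the new change.
name: new-tests-only
on: pull_request

jobs:
  new-tests:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0  # full history, so we can diff against the base branch
      - name: Find test files added by this PR
        id: diff
        run: |
          # '--diff-filter=A' keeps only added files; the 'Tests.java' suffix
          # is a placeholder for whatever naming convention the repo uses.
          NEW_TESTS=$(git diff --name-only --diff-filter=A \
            "origin/${{ github.base_ref }}...HEAD" | grep 'Tests\.java$' | tr '\n' ' ')
          echo "new_tests=${NEW_TESTS}" >> "$GITHUB_OUTPUT"
      - name: Run only the new tests
        if: steps.diff.outputs.new_tests != ''
        run: |
          # Assumes a Gradle project; '--tests' filters by test class name.
          for f in ${{ steps.diff.outputs.new_tests }}; do
            ./gradlew test --tests "$(basename "$f" .java)"
          done
```

Because this job only ever sees the newly added tests, a failure here points at the change under review rather than at the legacy suite.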

What will it take to implement?

For repositories to adopt the new quality bar, they will need to update their CONTRIBUTING.md document and their testing frameworks. Since the expectations are being standardized, it should be relatively easy to port them between repositories. If tests are written as GitHub Actions workflows, they can likely be copied directly with minimal changes.

Examples of some of the types of tests that may be helpful can be found in the identity project, which has workflows for testing a wide range of functions on commit. There may be many more tests worth mentioning, however, and expectations can also include things like updated developer guides.
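One way the porting could be made nearly copy-free (a sketch only; the shared repository path, workflow file name, and input name below are hypothetical) is GitHub's reusable-workflow feature, where each repository's workflow simply calls one shared definition:

```yaml
# Hypothetical caller, checked into each repo as .github/workflows/quality-bar.yml.
# All standardized checks live in one shared workflow, so updating the quality
# bar means changing a single file instead of sixty.
name: org-quality-bar
on: pull_request

jobs:
  standard-checks:
    # Both the repository ('opensearch-project/.github') and the workflow
    # path are placeholders for wherever the shared definition would live.
    uses: opensearch-project/.github/.github/workflows/standard-checks.yml@main
    with:
      run-bwc-tests: true  # example repo-specific toggle on top of the shared bar
```

Repo-specific requirements would then layer on as additional jobs alongside the shared one, rather than forking the shared definition.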

Any remaining open questions?

We are still trying to determine what these expectations should be, so it is important that all opinions are heard. If you have any comments, concerns, or suggestions, please share them below.

@stephen-crawford
Author

stephen-crawford commented Feb 17, 2023

I am going to go ahead and get this ball rolling.

Some things I would be interested in seeing are:

  1. New tests auto-run: Keep the existing gradle check, but add a workflow that automatically scans for new test files (Gradle + GitHub should be able to do this) and then runs those tests separately. This would let you know whether there was an issue with the test you just wrote or with how the tests interact with the older test suite. @cwperks wrote a file that does something along these lines over in the Identity project, and it has worked well for us. I think taking Craig's idea and going one step further could be helpful for a lot of repos.

  2. Dashboards auto-push to functional repo: I have heard from some dashboards-minded folks that the functional tests repo does not always receive individual repos' tests. I am not much of a dashboards expert myself, so I do not know what is required here, but it sounds like adding a check to make sure that all tests are added to the functional testing repo would be helpful.

  3. Documentation check: Something else I heard is that there are challenges keeping documentation relevant and up to date. It could be helpful to add an auto-check, like the one for the CHANGELOG, that checks for documentation updates or requires you to override it by stating that your change only modified code irrelevant to feature execution (e.g., you updated a dependency or added an extra test).
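The documentation check in item 3 could work much like existing changelog verifiers. A rough sketch follows; the `skip-docs` label name and the `docs/` path are invented for illustration:

```yaml
# Hypothetical documentation verifier: fail the PR unless documentation
# changed, or the author explicitly opted out with a "skip-docs" label.
name: docs-verifier
on:
  pull_request:
    types: [opened, synchronize, reopened, labeled, unlabeled]

jobs:
  verify-docs:
    runs-on: ubuntu-latest
    # Skip the job entirely when the override label is present.
    if: ${{ !contains(github.event.pull_request.labels.*.name, 'skip-docs') }}
    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0
      - name: Check for documentation updates
        run: |
          # 'docs/' is a placeholder for wherever this repo keeps its docs.
          if git diff --name-only "origin/${{ github.base_ref }}...HEAD" | grep -q '^docs/'; then
            echo "Documentation updated."
          else
            echo "No docs changes found. Update docs/ or add the 'skip-docs' label." >&2
            exit 1
          fi
```

Re-running the check on `labeled`/`unlabeled` events lets the override take effect without a new push.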

@stephen-crawford
Author

To help establish guidelines, I am going to start going through each repository and reviewing their testing framework and confirming their current requirements. I will then keep a running document with each of the repos' requirements listed. From there, I will synthesize a set of requirements that should be established.

@stephen-crawford
Author

stephen-crawford commented Feb 20, 2023

| Repository | Additional Tests / Checks | Link |
| --- | --- | --- |
| Alerting | Security Test Workflow, Multi-node Test Workflow, Certificate of Origin | opensearch-project/alerting#803 |
| Alerting Dashboards | Create Documentation Issue, Certificate of Origin | opensearch-project/alerting-dashboards-plugin#491 |
| Anomaly Detection | Link checker, AD benchmark, Certificate of Origin | opensearch-project/anomaly-detection#814 |
| Anomaly Detection Dashboards | Link checker, AD benchmark, PR Labeler, Certificate of Origin | opensearch-project/anomaly-detection-dashboards-plugin#428 |
| Ansible Playbook | | opensearch-project/ansible-playbook#117 |
| Asynchronous Search | Certificate of Origin, Multi-node test | opensearch-project/asynchronous-search#243 |
| Common Utils | Certificate of Origin | opensearch-project/common-utils#368 |
| Cross Cluster Replication | Certificate of Origin, Security Tests | opensearch-project/cross-cluster-replication#721 |
| Dashboards Anywhere | Health Check, Deployment Tests, Functional Tests | opensearch-project/dashboards-anywhere#153 |
| Dashboards Desktop | | opensearch-project/dashboards-desktop#37 |
| Dashboards Maps | | opensearch-project/dashboards-maps#262 |
| Dashboards Notifications | Lint Checker, Documentation Auto-cut | opensearch-project/dashboards-notifications#17 |
| Dashboards Observability | Link Checker, Certificate of Origin | opensearch-project/dashboards-observability#289 |
| Dashboards Query Workbench | Code QL, Lint Checker, Certificate of Origin, SQL Query Workbench | opensearch-project/dashboards-query-workbench#45 |
| Dashboards Reporting | Certificate of Origin, Reports Checker | opensearch-project/dashboards-reporting#65 |
| Dashboards Search Relevance | Certificate of Origin, Changelog Verifier, Lint Checker | opensearch-project/dashboards-search-relevance#152 |
| Dashboards Visualizations | Certificate of Origin, Link Checker | opensearch-project/dashboards-visualizations#167 |
| Data Prepper | Certificate of Origin, Create Document Issue, Performance Tests Compile, App Check, Trace Analytics Tests | opensearch-project/data-prepper#2302 |
| Geospatial | Certificate of Origin, Link Checker | opensearch-project/geospatial#227 |
| Helm Charts | Lint Checker | https://github.com/opensearch-project/helm-charts/issues/3858 |
| Index Management | Certificate of Origin, Link Checker, Lint Checker, Security Test, Docker Security Test, Multi-node Test | opensearch-project/index-management#698 |
| Index Management Dashboards Plugin | Link Checker, Certificate of Origin | opensearch-project/index-management-dashboards-plugin#629 |
| Job Scheduler | Certificate of Origin, Link Checker | opensearch-project/job-scheduler#327 |
| k-NN | Certificate of Origin, Link Checker, Documentation Request, Benchmarking Tool, Custom Benchmarking Tool | opensearch-project/k-NN#775 |
| ML-Commons | Certificate of Origin, Documentation Request | opensearch-project/ml-commons#752 |
| ML-Commons Dashboards | Lint Checker | opensearch-project/ml-commons-dashboards#127 |
| Neural Search | Certificate of Origin, Link Checker, Benchmarking Tool (in progress) | opensearch-project/neural-search#124 |
| Notifications | Link checker | opensearch-project/notifications#622 |
| Observability | Link checker, Certificate of Origin, Enforce PR Labels | opensearch-project/observability#1416 |
| OpenSearch | Code Hygiene, Changelog Verifier, Certificate of Origin, Label Checker, Link Checker, Create Documentation Issue, ShellCheck, Validate Gradle Wrapper | opensearch-project/OpenSearch#6389 |
| OpenSearch Benchmark | | opensearch-project/opensearch-benchmark#222 |
| OpenSearch Build | Certificate of Origin, Link Checker, License Header, Groovy Tests, Yaml Lint, Python Tests | opensearch-project/opensearch-build#3241 |
| OpenSearch CI | | opensearch-project/opensearch-ci#254 |
| OpenSearch CLI | | opensearch-project/opensearch-cli#75 |
| OpenSearch Dashboards | Code Hygiene, Changelog Verifier, Certificate of Origin, Label Checker, Link Checker, Document Link Checker | opensearch-project/OpenSearch-Dashboards#3466 |
| OpenSearch Dashboards Functional Test | Link Checker, Lint, Bundled Tests, tests added from other repositories | opensearch-project/opensearch-dashboards-functional-test#546 |
| OpenSearch DSL-PY | Changelog Verifier, Link Checker | opensearch-project/opensearch-dsl-py#101 |
| OpenSearch Go | License Headers, Changelog Verifier, Link Checker, Lint Checker | opensearch-project/opensearch-go#238 |
| OpenSearch Hadoop | Certificate of Origin, Link Checker, Changelog Verifier | opensearch-project/opensearch-hadoop#123 |
| OpenSearch Java | Certificate of Origin, Link Checker, Lint Checker, Checkstyle | opensearch-project/opensearch-java#378 |
| OpenSearch-JS | Link Checker, License, License Header, Changelog Verifier | opensearch-project/opensearch-js#390 |
| OpenSearch Net | Certificate of Origin, Link Checker, Deploy Documentation, License Headers, Changelog Verifier | opensearch-project/opensearch-net#157 |
| OpenSearch PHP | Link Checker, Changelog Verifier, Update Documentation | opensearch-project/opensearch-php#127 |
| OpenSearch Py | Certificate of Origin, Link Checker, License Headers, Changelog, Deploy Doc | opensearch-project/opensearch-py#304 |
| OpenSearch Py-ML | Certificate of Origin, Deploy Doc | opensearch-project/opensearch-py-ml#88 |
| OpenSearch-rs | Changelog verifier, Link Checker, Clippy Check, Certificate of Origin | opensearch-project/opensearch-rs#127 |
| OpenSearch Ruby | Link Checker, License Headers, Publish Docs | opensearch-project/opensearch-ruby#148 |
| Opensearch-SDK-Java | Validate Gradle Wrapper | opensearch-project/opensearch-sdk-java#470 |
| OUI | Certificate Checker, Lint checker | opensearch-project/oui#331 |
| Performance Analyzer | Task List Checker, Link Checker, Certificate of Origin, Create Doc Issue | opensearch-project/performance-analyzer#391 |
| Performance Analyzer RCA | Link Checker, Gauntlets Test, Certificate of Origin, Create Doc Issue | opensearch-project/performance-analyzer-rca#298 |
| Reporting | Certificate of Origin, Link Checker | opensearch-project/reporting#661 |
| Search Processor | Certificate of Origin, Link Checker, Create Documentation Issue | opensearch-project/search-processor#104 |
| Security | Code Hygiene, Certificate of Origin, Plugin install | opensearch-project/security#2449 |
| Security Analytics | Certificate of Origin, Link Checker, Security Test, Documentation Request, Benchmarking Tool | opensearch-project/security-analytics#365 |
| Security Analytics Dashboards | Certificate of Origin | opensearch-project/security-analytics-dashboards-plugin#462 |
| Security Dashboards | Code Hygiene, Certificate of Origin, Plugin install | opensearch-project/security-dashboards-plugin#1337 |
| Simple Schema | Link checker, Certificate of Origin | opensearch-project/simple-schema#71 |
| SQL | Checkstyle, Jacoco (100% coverage required), Comparison Tests, Link checker, Certificate of Origin | opensearch-project/sql#1360 |

@stephen-crawford
Author

stephen-crawford commented Feb 22, 2023

Follow-up from: SDK, SQL, Search Relevance, Dashboards Anywhere, OpenSearch-js, OpenSearch, OpenSearch Net, k-NN, Neural Search, OpenSearch Dashboards, Anomaly Detection, Benchmark, Build, helm-charts, opensearch-dashboards-functional-tests, OUI, ansible, Search Processor

Repositories Filed: 60

@peternied
Member

What does the outcome of this proposal look like: is it a process, a tool, or a product feature? How would we distinguish passable testing requirements from great ones? If we had those requirements and executed on them, how would the project be better for it?

@stephen-crawford
Author

Hi @peternied, thank you for following up. The purpose of this discussion is to finalize a set of testing requirements that all development repositories will be required to maintain. This will take the form of guidelines in the Developer Guide stating that repositories must require "x, y, and z" before code is merged. Likewise, changes to those repositories will be expected to have those tests pass before they are merged. The hope is that this post will garner feedback (either here or directly) on what requirements people would like to see, and then we can go from there.

As things stand, there is a very wide range of testing requirements across the organization. If Repository A requires code coverage of 85% and runs unit, integration, and backwards compatibility tests, it may expect Repository B to do the same. However, because there is no standardized bar, Repository B instead has 98% code coverage but only unit and integration tests. This non-standard quality bar creates a poor release experience and has resulted in features being released that don't meet user expectations.

You bring up a really good point about what the requirement could look like. The plan is to include it in documentation and require that the development repositories implement the requirements. That being said, a tool or GitHub workflow that checked each repository's state could be helpful in keeping everyone accountable. The idea is not to audit repositories but to help ensure we are all on the same page. I spoke with quite a few people about what they would like to see change in releases, and the quality of the changes was a recurring topic.

@davidlago

Thanks @scrawfor99 for driving this. To summarize, but also to make sure I understand:

- The problem is an uneven quality bar across the repositories in our organization.
- The goal is to come up with a set of global, cross-org recommendations for what that bar looks like.
- The process is to first analyze the current state of the world, use that to inform the discussion of what the bar should be, and then write that up into a global CONTRIBUTING.md doc that projects can reference, potentially adding their own tweaks on top. After that bar is set, we can also think about providing automations/QoL improvements.

Does this capture what you're thinking too?

@stephen-crawford
Author

Hi @davidlago, you are exactly correct. I am a bit busy this week but am hoping to file the same type of documenting issue for all repositories and then produce a small, easy-to-digest analysis. After seeing where things are, I will take further steps along the lines of my response to Peter and what you mentioned in your comment.

@joshuali925
Member

Hi @scrawfor99, I see BWC confirmation being added to dashboards plugin repo issues. Do we have BWC for dashboards plugins? I checked https://github.com/opensearch-project/opensearch-plugins/blob/main/TESTING.md#backwards-compatibility-testing but it seems to be only for OpenSearch plugins.

@stephen-crawford
Author

Hi @joshuali925, thank you for following up on this discussion. You are correct. To my knowledge there are no BWC tests for dashboards plugins. The BWC section of the table can effectively be ignored for dashboards plugins, since it seems unlikely that this would ever be an expectation; it is not clear what backwards compatibility would even entail for the frontend-focused repos. It is in the table for consistency and readability.

@dblock added the discuss (Issues calling for discussion) and campaign (Parent issues of OpenSearch release campaigns) labels and removed the untriaged label on Mar 20, 2023
@stephen-crawford
Author

stephen-crawford commented Mar 22, 2023

Statistics based on findings

This table provides the presence metrics for each of the tests documented in this process.

| Test | Count (max is 59 overall / 32 for backend only) | Percentage | Not Present In |
| --- | --- | --- | --- |
| Unit | 57 | 97% | Hadoop?, Ansible Playbook |
| Integration | 56 | 95% | OUI, Hadoop?, Ansible Playbook |
| BWC | 22 | 69% | Simple Schema, Performance Analyzer RCA, OpenSearch-rs, OpenSearch Py-ML, OpenSearch-CLI, Hadoop, Helm Charts, Geospatial, Data Prepper, Common Utils, Ansible |
| Certificate of Origin | 41 | 68% | Opensearch-SDK-Java, OpenSearch Ruby, OpenSearch PHP, OpenSearch JS, OpenSearch GO, OpenSearch Dashboards Functional Test, OpenSearch CLI, OpenSearch CI, OpenSearch Benchmark, Notifications, ML-Commons Dashboards, Helm Charts, Dashboards Notifications, Dashboards Maps, Dashboards Anywhere, Ansible |
| Checkstyle, Lint Check, or Code Hygiene | 16 | 27% | Too many to list |
| Link Checker | 33 | 56% | Too many to list |
| Create Documentation Issue | 12 | 20% | Too many to list |
| Deploy Documentation | 3 | 5% | Too many to list |
