feat: Rdkit cart substructure search #2055

Open
wants to merge 17 commits into
base: main


headri
Contributor

@headri headri commented Aug 5, 2024

Upgraded the Postgres 14 database to Postgres 16 with the RDKit cartridge. This includes:

  • The Dockerfile for the new database image
  • A seed file to call the migrations on demand
  • Migrations to create the necessary tables

Added a new structure search option that uses the RDKit cartridge.

@PiTrem PiTrem changed the title Rdkit cart substructure search feat: Rdkit cart substructure search Aug 5, 2024
@PiTrem PiTrem changed the base branch from main to v1.9 August 5, 2024 11:20
@PiTrem PiTrem changed the base branch from v1.9 to main August 6, 2024 07:29
@headri headri force-pushed the rdkit_cart_substructure_search branch from f0e03e4 to e461766 Compare August 6, 2024 09:05
@@ -110,45 +112,61 @@ class Sample < ApplicationRecord
pg_search_scope :search_by_cas, against: { xref: 'cas' }

# scopes for suggestions
scope :by_residues_custom_info, ->(info, val) { joins(:residues).where("residues.custom_info -> '#{info}' ILIKE ?", "%#{sanitize_sql_like(val)}%")}
scope :by_residues_custom_info, lambda { |info, val|
joins(:residues).where("residues.custom_info -> '#{info}' ILIKE ?", "%#{sanitize_sql_like(val)}%")

Layout/LineLength: Line is too long. [134/120]

joins(:reactions_product_samples).where(reactions_samples: { reaction_id: ids })
}
scope :by_reaction_material_ids, lambda { |ids|
joins(:reactions_starting_material_samples).where(reactions_samples: { reaction_id: ids })

Layout/LineLength: Line is too long. [127/120]

scope :sample_or_startmat_or_products, -> {
joins("left join reactions_samples rs on rs.sample_id = samples.id").where("rs.id isnull or rs.\"type\" in ('ReactionsProductSample', 'ReactionsStartingMaterialSample')")
scope :sample_or_startmat_or_products, lambda {
joins('left join reactions_samples rs on rs.sample_id = samples.id').where("rs.id isnull or rs.\"type\" in ('ReactionsProductSample', 'ReactionsStartingMaterialSample')")

Layout/LineLength: Line is too long. [174/120]
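
One possible way to bring this line under the 120-character limit is to build the SQL condition from a small constant instead of one long string literal. The sketch below is plain Ruby; `REACTION_SAMPLE_TYPES` and `sample_or_reaction_condition` are hypothetical names that are not part of this PR, and the resulting string would still have to be passed to `where` inside the scope.

```ruby
# Hypothetical constant: the two reactions_samples types named in the scope.
REACTION_SAMPLE_TYPES = %w[ReactionsProductSample ReactionsStartingMaterialSample].freeze

# Builds the same SQL condition as the long literal, one short piece at a time.
def sample_or_reaction_condition(types = REACTION_SAMPLE_TYPES)
  quoted = types.map { |type| "'#{type}'" }.join(', ')
  %(rs.id isnull or rs."type" in (#{quoted}))
end

puts sample_or_reaction_condition
```

The constant holds only fixed class names, so interpolating it into the SQL string does not add user input to the query.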

nil
if sample_svg_file.present?
"/images/samples/#{sample_svg_file}"
elsif molecule&.molecule_svg_file&.present?

Lint/RedundantSafeNavigation: Redundant safe navigation detected, use . instead.

end

def init_elemental_compositions
residue = self.residues[0]
return unless molecule_sum_formular.present?
residue = residues[0]

Metrics/AbcSize: Assignment Branch Condition size for init_elemental_compositions is too high. [<10, 34, 17> 39.31/25]
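
For readers unfamiliar with the numbers in this message: RuboCop documents the ABC size as the magnitude of the vector of assignments, branches, and conditions. A minimal sketch of that arithmetic, where `abc_size` is a hypothetical helper and not a RuboCop API:

```ruby
# ABC size as described for RuboCop's Metrics/AbcSize cop:
# the magnitude of the <assignments, branches, conditions> vector.
def abc_size(assignments, branches, conditions)
  Math.sqrt((assignments**2) + (branches**2) + (conditions**2)).round(2)
end

# The vector reported above for init_elemental_compositions.
puts abc_size(10, 34, 17) # prints 39.31
```

Branches dominate the score here, so extracting groups of method calls into smaller private methods is likely the most direct way to get below the limit of 25.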

end

def init_elemental_compositions
residue = self.residues[0]
return unless molecule_sum_formular.present?
residue = residues[0]

Metrics/PerceivedComplexity: Perceived complexity for init_elemental_compositions is too high. [15/8]

end
end

def check_molfile_polymer_section
return if decoupled
return unless self.molfile.include? 'R#'
return unless molfile.include? 'R#'

Metrics/AbcSize: Assignment Branch Condition size for check_molfile_polymer_section is too high. [<9, 37, 10> 39.37/25]

end
end

def check_molfile_polymer_section
return if decoupled
return unless self.molfile.include? 'R#'
return unless molfile.include? 'R#'

Metrics/CyclomaticComplexity: Cyclomatic complexity for check_molfile_polymer_section is too high. [10/7]

end
end

def check_molfile_polymer_section
return if decoupled
return unless self.molfile.include? 'R#'
return unless molfile.include? 'R#'

Metrics/PerceivedComplexity: Perceived complexity for check_molfile_polymer_section is too high. [11/8]

self.creator.increment_counter 'samples'
return unless /^#{creator.name_abbreviation}-\d+$/.match?(short_label)

creator.increment_counter 'samples'

Rails/SkipsModelValidations: Avoid using increment_counter because it skips validations.

Collaborator

@JanCBrammer JanCBrammer left a comment

The substructure search does not seem to be tested on the current main. This PR would be a good opportunity to add these tests (for the existing substructure search methods as well as the new one that's being added here).

CHANGELOG.md (outdated, resolved)
app/packs/src/components/navigation/search/Search.js (outdated, resolved)
config/application.rb (outdated, resolved)
@@ -97,10 +97,13 @@ def sample_structure_search(c_id = @c_id, not_permitted = @dl_s && @dl_s < 1)

# TODO: implement this: http://pubs.acs.org/doi/abs/10.1021/ci600358f
scope =
if params[:selection][:search_type] == 'similar'
case params[:selection][:search_type]
Collaborator

This switch case seems to be replicating the switch case under usecases/search/structure_search.rb. Why?

Contributor Author

usecases/search/structure_search.rb has been removed.

let(:screen) { create(:screen, name: 'Screen') }
let(:other_screen) { create(:screen, name: 'Other Screen') }
let!(:cell_line) { create(:cellline_sample, name: 'another-cellline-search-example', collections: [collection]) }
let!(:mof3000_1) { Rails.root.join('spec/fixtures/mof_v3000_1.mol').read }

RSpec/IndexedLet: This let statement uses index in its name. Please give it a meaningful name.

let(:screen) { create(:screen, name: 'Screen') }
let(:other_screen) { create(:screen, name: 'Other Screen') }
let!(:cell_line) { create(:cellline_sample, name: 'another-cellline-search-example', collections: [collection]) }
let!(:mof3000_1) { Rails.root.join('spec/fixtures/mof_v3000_1.mol').read }

Naming/VariableNumber: Use normalcase for symbol numbers.

let(:screen) { create(:screen, name: 'Screen') }
let(:other_screen) { create(:screen, name: 'Other Screen') }
let!(:cell_line) { create(:cellline_sample, name: 'another-cellline-search-example', collections: [collection]) }
let!(:mof3000_1) { Rails.root.join('spec/fixtures/mof_v3000_1.mol').read }
let!(:mof3000_2) { Rails.root.join('spec/fixtures/mof_v3000_2.mol').read }

RSpec/IndexedLet: This let statement uses index in its name. Please give it a meaningful name.

let(:screen) { create(:screen, name: 'Screen') }
let(:other_screen) { create(:screen, name: 'Other Screen') }
let!(:cell_line) { create(:cellline_sample, name: 'another-cellline-search-example', collections: [collection]) }
let!(:mof3000_1) { Rails.root.join('spec/fixtures/mof_v3000_1.mol').read }
let!(:mof3000_2) { Rails.root.join('spec/fixtures/mof_v3000_2.mol').read }

Naming/VariableNumber: Use normalcase for symbol numbers.

cllde8 and others added 12 commits August 8, 2024 10:47
Required: the RDKit library must be installed.

The system variable:
A new system environment variable, PG_CART_INSTALLED, is added with a
default value of false. If RDKit is installed and you want to enable
this feature for users, remember to set RDKIT_INSTALLED to true.

The migration files:
1. They are executed only if RDKit is installed in PostgreSQL.
2. A new column is added to the samples table. The script that
populates the new column is included.
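
A minimal sketch of how such an opt-in flag could be read, assuming the variable is called PG_CART_INSTALLED (the commit message also mentions RDKIT_INSTALLED, so the actual name should be checked against the code). `rdkit_feature_enabled?` is a hypothetical helper, not something defined in this PR.

```ruby
# Hypothetical helper: returns true only when the flag is explicitly "true".
# A missing variable falls back to the documented default of false.
def rdkit_feature_enabled?(env = ENV)
  env.fetch('PG_CART_INSTALLED', 'false').to_s.strip.casecmp?('true')
end

puts rdkit_feature_enabled?({})                                # prints false
puts rdkit_feature_enabled?({ 'PG_CART_INSTALLED' => 'true' }) # prints true
```

Passing the environment in as an argument keeps the check easy to exercise in specs without touching the real process environment.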
@headri headri force-pushed the rdkit_cart_substructure_search branch from 5e3c6ec to 027895b Compare August 8, 2024 08:49
Sample.by_collection_id(c_id).search_by_fingerprint_sub(molfile)
when 'subRDKit'
Sample.by_collection_id(c_id).search_by_rdkit_sub(molfile)
end
Collaborator

We discussed adding a fallback / default case for invalid search_type.

Contributor Author

The search type has a whitelist: "optional :search_type, type: String, values: %w[similar sub subRDKit]"
Do we still need a fallback / default case?
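
For reference, a plain-Ruby sketch of what the discussed fallback could look like. A `case` without an `else` branch evaluates to nil for an unmatched value, so the whitelist protects requests that go through the API parameter validation, while an explicit fallback would also cover any other caller. `structure_search_scope` is a hypothetical helper, and the scope names for 'similar' and 'sub' are assumptions; only `search_by_rdkit_sub` for 'subRDKit' is visible in the diff above.

```ruby
# Search types taken from the whitelist quoted above.
SEARCH_TYPES = %w[similar sub subRDKit].freeze

# Hypothetical helper: maps a search type to the name of the scope handling it.
def structure_search_scope(search_type)
  case search_type
  when 'similar' then :search_by_fingerprint_sim # assumed scope name
  when 'sub' then :search_by_fingerprint_sub     # assumed pairing
  when 'subRDKit' then :search_by_rdkit_sub
  else
    # Fallback: fail loudly instead of silently returning nil.
    raise ArgumentError, "unsupported search_type: #{search_type.inspect}"
  end
end

puts structure_search_scope('subRDKit')
```

Raising is only one option; falling back to one of the existing search types would be an alternative if a default behaviour is preferred.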

@@ -1,4 +1,5 @@
require_relative "shared/admin.seed.rb"
require_relative "shared/molecules.seed.rb"
require_relative "shared/text_templates.seed.rb"
# require_relative "shared/text_templates.seed.rb"
Collaborator

@JanCBrammer JanCBrammer Aug 13, 2024

Add "TODO" comment with a reminder to comment this back in or remove it.
Maybe also add context on why this is commented out.

Member

see #2071

Collaborator

@JanCBrammer JanCBrammer left a comment

  1. How will the RDKit cartridge be introduced in test (CI) and production environments?
    This PR sets it up for the development environment. Question also for @PiTrem.

  2. Is there going to be documentation over at https://github.com/ComPlat/chemotion_saurus?

docker-compose.dev.yml (outdated, resolved)
Collaborator

@JanCBrammer JanCBrammer left a comment

I've clicked through the feature in the UI. It seems to be working fine.

db/seeds/development.rb (resolved)
4 participants