-
Notifications
You must be signed in to change notification settings - Fork 1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make validation async #13
Conversation
It should be possible to run tests from the Rakefile, and the default action for `rake` should be to run tests. Running tests (working code) is more important than formatting (standard) - so first test, then lint.
Most mistakes in webhook configuration are usually related to the configuration of the secrets for signatures. While it is good in some situations to reject the incorrectly signed webhooks immediately, in way more situations it will actually be the misconfigured signature that'll be at fault. A better approach is to allow the webhooks to be respooled for processing when the signature configuration has been rectified.
The handler is stateless, so it doesn't have to linger on in memory across calls. But it seems that the serialized ActionDispatch request loses its route params.
Rails loads the Engine module on its own, and the module loading is handled by Bundler/rubygems. This also removes a circular load warning which would happen when running tests.
To configure handlers during Rails init, you need to force-require them - which goes contrary to the Rails autoloading.
There is no need to have an entire "dummy" app structure. The same can be achieved with a single-file app (see https://greg.molnar.io/blog/a-single-file-rails-application/) With that, all the directories for model tests etc. can re-appear once we actually have tests to put inside of them. Same for fixtures - is it really needed to have a fixture with a webhook payload which fits in a string, and use the entire Rails fixture machinery? We are after finding out whether our webhook ends up in the DB correctly. Less is more.
rescue ActiveRecord::RecordNotUnique # Deduplicated | ||
nil | ||
rescue ActiveRecord::RecordNotUnique # Webhook deduplicated | ||
Rails.logger.info { "#{inspect} Webhook #{handler_event_id} is a duplicate delivery and will not be stored." } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: I would suggest to use Rails.error.report("#{inspect} Webhook #{handler_event_id} is a duplicate delivery and will not be stored.", handled: true, severity: :info)
instead here. We already rely on it and it feels confusing that we use .logger for one case and .error for another.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Aight
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rails.logger.info { "#{inspect} Webhook #{handler_event_id} is a duplicate delivery and will not be stored." } | |
Rails.error.report("#{inspect} Webhook #{handler_event_id} is a duplicate delivery and will not be stored.", handled: true, severity: :info) |
webhook.handler.process(webhook) | ||
# TODO: remove process attribute | ||
else | ||
Rails.logger.info { "Webhook #{webhook.inspect} did not pass validation and was skipped" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
praise: I would prefer if you use Rails.error
interface here, since we already rely on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason, you decided not to use Rails.error.report("Webhook #{webhook.inspect} did not pass validation and was skipped", handled: true, severity: :info)
here?
lib/munster/base_handler.rb
Outdated
# we default it to `false`. The default is going to be `true` in future versions of Munster. | ||
# | ||
# @return [Boolean] | ||
def validate_async? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought: I lean towards making all validation asynchronous, as async processing is a core principle of the Munster gem.
Regarding the stated problem you're describing:
This prevents malicious senders from spamming your DB and causing a denial-of-service on it. That's why this is made configurable.
I have two objections to this approach:
-
In my experience, I've never encountered this issue. It seems unlikely that flooding a webhook endpoint that performs only a single insert operation would cause a denial-of-service. Attackers would likely target other endpoints that perform more complex operations.
-
If such an attack were to occur, I would address it differently. Implementing rate limiting for the entire webhooks endpoint seems like a more effective solution.
Additionally, since this gem hasn't reached a stable release yet, we don't need to maintain backward compatibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My idea was to ensure backwards compatibility for webhooks which have been persisted without headers, for those doing the background validation would not be possible. But given the gem is still officially in a "flux" state, I think we are at liberty to make breaking changes as we see fit. Let's make all validations async. Rate limits on the ingress without validation might be the wrong approach, but then again - this is prematurely tweaking for problems we, ourselves, haven't observed yet. So 🚀
end | ||
|
||
class HandlerInactive < StandardError | ||
end | ||
|
||
def create | ||
handler = lookup_handler(params[:service_id]).new | ||
class UnknownHandler < StandardError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: Should we maybe move these error definitions elsewhere? Not sure why they are defined under ReceiveWebhooksController
class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are convenient to place in the controller because they don't need a namespace to raise and to match, and this is the only spot where they get raised. Once we transition to a setup where these errors may get raised from the background job too - we'll move them into the gem namespace.
if self.class.column_names.include?("request_headers") | ||
write_attribute("request_headers", headers) | ||
else | ||
Rails.logger.warn { "You need to run Munster migrations so that request headers can be persisted with the model. Async validation is not going to work without that column being set." } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: I suggest we use Rails.error
interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I will remove this error path altogether - it is reasonable to expect people to migrate before running the app
|
||
module Munster | ||
class ReceivedWebhook < ActiveRecord::Base | ||
MISSING_HEADERS_COLUMN_ERROR = <<~EOS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
suggestion: Can we please move these errors into a separate file? Our errors are a bit all over the place now (some are in controllers, some are here in models)
Can we also subclass StandardError for these, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That I am not going to do - co-locating these errors with the only spot they can get raised is an affordance to the reader, and it is deliberate. However, if we are not so fixed on maintaining backwards compatibility (and given your review I think we should not be so fixed on it, after all) these errors can just be deleted.
raise MISSING_HEADERS_COLUMN_ERROR unless self.class.column_names.include?("request_headers") | ||
raise MISSING_HEADERS_ERROR if request_headers.blank? | ||
headers = try(:request_headers) || {} | ||
ActionDispatch::Request.new(headers.merge!("rack.input" => StringIO.new(body.to_s.b))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thought: I'm kind of wondering:
- How big of a size all this data would take? In most cases, it will probably take up more space than body itself. Do we need to purge records from database in this case?
- Can this create issues with future version of rails with chages in ActionDispatch::Request? Why saving just headers is not enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re. size - as big as is necessary. We do need to strip the raw post body though, which I forgot to do - essentially we store the request body twice.
I am using ActionDispatch::Request
for a number of reasons.
- It was already used in
valid?
in all handlers, and all our handlers which validate expect that API to be available on the passed argument ActionDispatch::Request
provides access toparams
...- ...and more importantly - to the path params, which get set by the Rails router.
While we did implement the original Munster handlers with the expectation that they will be re-parsing the request body themselves, keeping the entire request available allows Rails to take over for parsing, and folks can use params
like they are used to in Rails controllers, which is also a nice UX improvement IMO.
None of these affordances are possible to realise by other means. The API for ActionDispatch::Request.new
is public API in Rails - if it changes, it would be just the same change as other changes in Rails internals, and our tests should catch that it changed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do need to strip the raw post body though, which I forgot to do - essentially we store the request body twice.
In this case, it seems reasonable to keep body in ActionDispatch::Request
and not store the body in a separate column.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your comments, quite a few good ideas there ❤️
lib/munster/base_handler.rb
Outdated
# we default it to `false`. The default is going to be `true` in future versions of Munster. | ||
# | ||
# @return [Boolean] | ||
def validate_async? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My idea was to ensure backwards compatibility for webhooks which have been persisted without headers, for those doing the background validation would not be possible. But given the gem is still officially in a "flux" state, I think we are at liberty to make breaking changes as we see fit. Let's make all validations async. Rate limits on the ingress without validation might be the wrong approach, but then again - this is prematurely tweaking for problems we, ourselves, haven't observed yet. So 🚀
end | ||
|
||
class HandlerInactive < StandardError | ||
end | ||
|
||
def create | ||
handler = lookup_handler(params[:service_id]).new | ||
class UnknownHandler < StandardError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are convenient to place in the controller because they don't need a namespace to raise and to match, and this is the only spot where they get raised. Once we transition to a setup where these errors may get raised from the background job too - we'll move them into the gem namespace.
webhook.handler.process(webhook) | ||
# TODO: remove process attribute | ||
else | ||
Rails.logger.info { "Webhook #{webhook.inspect} did not pass validation and was skipped" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
module Munster | ||
class ReceivedWebhook < ActiveRecord::Base | ||
MISSING_HEADERS_COLUMN_ERROR = <<~EOS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That I am not going to do - co-locating these errors with the only spot they can get raised is an affordance to the reader, and it is deliberate. However, if we are not so fixed on maintaining backwards compatibility (and given your review I think we should not be so fixed on it, after all) these errors can just be deleted.
if self.class.column_names.include?("request_headers") | ||
write_attribute("request_headers", headers) | ||
else | ||
Rails.logger.warn { "You need to run Munster migrations so that request headers can be persisted with the model. Async validation is not going to work without that column being set." } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I will remove this error path altogether - it is reasonable to expect people to migrate before running the app
raise MISSING_HEADERS_COLUMN_ERROR unless self.class.column_names.include?("request_headers") | ||
raise MISSING_HEADERS_ERROR if request_headers.blank? | ||
headers = try(:request_headers) || {} | ||
ActionDispatch::Request.new(headers.merge!("rack.input" => StringIO.new(body.to_s.b))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re. size - as big as is necessary. We do need to strip the raw post body though, which I forgot to do - essentially we store the request body twice.
I am using ActionDispatch::Request
for a number of reasons.
- It was already used in
valid?
in all handlers, and all our handlers which validate expect that API to be available on the passed argument ActionDispatch::Request
provides access toparams
...- ...and more importantly - to the path params, which get set by the Rails router.
While we did implement the original Munster handlers with the expectation that they will be re-parsing the request body themselves, keeping the entire request available allows Rails to take over for parsing, and folks can use params
like they are used to in Rails controllers, which is also a nice UX improvement IMO.
None of these affordances are possible to realise by other means. The API for ActionDispatch::Request.new
is public API in Rails - if it changes, it would be just the same change as other changes in Rails internals, and our tests should catch that it changed.
"jsonb" is only available in Postgres
051329e
to
c3b3624
Compare
c3b3624
to
5e8a17f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nicely done! Great improvement.
I just left couple of comments where you can slightly clean up things ;-)
rescue ActiveRecord::RecordNotUnique # Deduplicated | ||
nil | ||
rescue ActiveRecord::RecordNotUnique # Webhook deduplicated | ||
Rails.logger.info { "#{inspect} Webhook #{handler_event_id} is a duplicate delivery and will not be stored." } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rails.logger.info { "#{inspect} Webhook #{handler_event_id} is a duplicate delivery and will not be stored." } | |
Rails.error.report("#{inspect} Webhook #{handler_event_id} is a duplicate delivery and will not be stored.", handled: true, severity: :info) |
webhook.handler.process(webhook) | ||
# TODO: remove process attribute | ||
else | ||
Rails.logger.info { "Webhook #{webhook.inspect} did not pass validation and was skipped" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason, you decided not to use Rails.error.report("Webhook #{webhook.inspect} did not pass validation and was skipped", handled: true, severity: :info)
here?
test/test_helper.rb
Outdated
# Configure Rails Environment | ||
ENV["RAILS_ENV"] = "test" | ||
# ENV["RAILS_ENV"] = "test" | ||
# |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick: would be nice to clean comment here.
@skatkov I've decided not to use the error reporter because it is for errors and also expects an |
@julik sounds reasonable. thanks for explanation. |
Most mistakes in webhook configuration are usually related to the configuration of the secrets for signatures. While it is good in some situations to reject the incorrectly signed webhooks immediately, in way more situations it will actually be the misconfigured signature that'll be at fault. A better approach is to allow the webhooks to be respooled for processing when the signature configuration has been rectified.
valid?
in the background job instead of the controller. Most common configuration issue is an incorrectly specified signing secret, or an incorrectly implemented input validation. When these happen, it is better to allow the webhook to be reprocessed.new
- with a module using singleton methods it may returnself
fromnew
.Closes #12 #11