https://nodejs.org/dist/ Blocking JFrog Artifactory User Agent: Artifactory #5605

Closed
adam-browning opened this issue Aug 3, 2023 · 29 comments

Comments

@adam-browning

Version

18.15.0

Platform

Darwin adamb-mac 22.5.0 Darwin Kernel Version 22.5.0: Thu Jun 8 22:22:20 PDT 2023; root:xnu-8796.121.3~7/RELEASE_ARM64_T6000 arm64

Subsystem

No response

What steps will reproduce the bug?

curl -H "Host: nodejs.org" -H "User-Agent: Artifactory/" https://nodejs.org/dist --head
HTTP/2 403
date: Sun, 30 Jul 2023 11:17:09 GMT
content-type: text/html; charset=UTF-8
cache-control: max-age=15
expires: Sun, 30 Jul 2023 11:17:24 GMT
x-frame-options: SAMEORIGIN
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
server: cloudflare
cf-ray: 7eed318d7de809c3-HFA

How often does it reproduce? Is there a required condition?

always

What is the expected behavior? Why is that the expected behavior?

Allow access to https://nodejs.org/dist/ from User Agent: Artifactory

What do you see instead?

HTTP/2 403
date: Thu, 03 Aug 2023 15:50:03 GMT
content-type: text/html; charset=UTF-8
cache-control: max-age=15
expires: Thu, 03 Aug 2023 15:50:18 GMT
x-frame-options: SAMEORIGIN
strict-transport-security: max-age=31536000; includeSubDomains; preload
x-content-type-options: nosniff
server: cloudflare
cf-ray: 7f0fb6cf8a11416a-LHR

Additional information

Hello,
My name is Adam and I'm a Product Manager @jfrog.
We recently discovered that our mutual customers are blocked from accessing https://nodejs.org/dist/ when their requests use the user agent "Artifactory".

We would like to work with you to understand why this restriction was enabled and how we can resolve any issues originating from the JFrog Platform.

We are looking forward to working together
Thanks in advance,
Adam

@bnoordhuis
Member

Probably better asked over at nodejs/nodejs.org. Maybe an admin can move the issue.

@MoLow MoLow transferred this issue from nodejs/node Aug 3, 2023
@mhdawson
Member

mhdawson commented Aug 3, 2023

There is a related discussion in nodejs/build#3223.

If the complaint is that requests for non-Node.js downloads are being blocked, then that issue covers why the project sees those requests as a problem. If requests for Node.js downloads are being blocked, that might not be intentional but a side effect of trying to block the spam requests.

@mhdawson
Member

mhdawson commented Aug 3, 2023

@nodejs/build to pull in more of the people who have been in past discussions.

@targos
Member

targos commented Aug 4, 2023

Here's a sample of blocked requests I just took from the Cloudflare logs. You can see that none of these make sense for the nodejs.org origin.
We block literally millions of requests every day. They would otherwise overload our server, which is already struggling.

[Screenshot of blocked requests]

@adam-browning
Author

Hi @targos
Thanks for sharing the information. We would like to work with you to prevent this from happening in the future, specifically for our mutual SaaS customers.

We have identified the abusive customer on our end and reached out to them to stop these redundant requests.
We are currently working on a solution to prevent egress from Artifactory SaaS to npmjs.org in cases like these.
In the meantime, we kindly request that you switch to IP-based blocking, since blocking by User-Agent is much more aggressive and affects a huge number of customers beyond the scope of the abuse.
Would that be possible?

One last thing: I would like to open a direct line of communication between us for issues like this.
Please feel free to reach out to me at @[email protected] so we can find an easy way to stay in touch.

Thanks in advance,
Adam

@mhdawson
Member

@adam-browning, @targos would it make sense for us to find a time for a short discussion? Would 10:30 ET on Friday the 18th work for you two?

@adam-browning
Author

Hi @mhdawson
I'm located in Israel, and I'm not available on the 18th.
I am available between 9 and 11 am ET on Tuesday the 22nd and Tuesday the 29th of August.
Let me know if that could work for you.
Thanks in advance!
Adam

@targos
Member

targos commented Aug 15, 2023

I'm going to be on holiday until the 4th of September. Let's try to clarify things here first.
@adam-browning Our firewall rule tries to accommodate legitimate URLs. Do you have examples of URLs that Artifactory needs to access and that are currently blocked by the rule?

@adam-browning
Author

Hi @targos
Yes, https://nodejs.org/dist/ is currently blocking traffic from the user agent "Artifactory/".

[Screenshot attached]

@adam-browning
Author

Blocking at the user-agent level is excessive, since we have over 7,000 mutual customers accessing npmjs.org through Artifactory SaaS who all share this user agent.
We have contacted the abusive customer, who incorrectly configured a Maven repository against the npm URL, and we are working to prevent these issues from happening again.

In addition, here is JFrog's list of SaaS NAT IPs: https://jfrog.com/help/r/what-are-artifactory-cloud-nated-ips
If an abusive pattern originates from one of these IPs, please contact me directly and we will attempt to block the traffic on our end before you need to.
Thanks in advance,

@targos
Member

targos commented Aug 15, 2023

There is a big misunderstanding here. https://nodejs.org/dist/ is not the same as npmjs.org. It is not the npm registry, nor is it a mirror of it.
While we have seen some requests coming from misconfigured Maven repositories, the vast majority of requests come from misconfigured npm registries.

@ovflowd
Member

ovflowd commented Aug 15, 2023

Can we please move this conversation to e-mail or Slack (OpenJS)? I definitely don't want to see this issue getting polluted with calendar scheduling messages 😅

Also, I'm with @targos here, we have a Firewall rule that allows legitimate JFrog/Artifactory requests for Node.js Binaries.

The screenshot you (@adam-browning) have shown above is not one of those legitimate cases, as it is a directory listing. (If JFrog needs directory listing, let us know.)

Here's the regex we use for our WAF rule.

(http.user_agent matches "^(Artifactory/|Nexus/|AzureArtifacts/|Gradle/)" and not http.request.uri.path matches "\.(gz|xz|pkg|7z|zip|msi|txt|asc|sig|lib|exe|json|tab)$")

You can see that requests going to Node.js binaries will succeed. But requests that are directory listing requests or NPM package requests will be blocked.
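To try out what the rule above lets through, here is a minimal Python sketch that applies the same two patterns with the standard `re` module. It only approximates the rule: the real expression is evaluated by Cloudflare's firewall engine, and the function name `is_blocked` as well as the sample user-agent strings and paths are assumptions made for illustration.

```python
import re

# Patterns copied from the WAF expression quoted above. The Python wrapper
# around them is an illustrative assumption, not Cloudflare's implementation.
BLOCKED_AGENTS = re.compile(r"^(Artifactory/|Nexus/|AzureArtifacts/|Gradle/)")
ALLOWED_SUFFIXES = re.compile(
    r"\.(gz|xz|pkg|7z|zip|msi|txt|asc|sig|lib|exe|json|tab)$"
)


def is_blocked(user_agent: str, path: str) -> bool:
    """Return True when the rule would match, meaning the request is blocked."""
    agent_matches = BLOCKED_AGENTS.search(user_agent) is not None
    path_is_allowed = ALLOWED_SUFFIXES.search(path) is not None
    return agent_matches and not path_is_allowed


if __name__ == "__main__":
    # Hypothetical requests, used only to exercise the rule.
    samples = [
        ("Artifactory/7.0", "/dist/"),
        ("Artifactory/7.0", "/dist/index.json"),
        ("Artifactory/7.0", "/dist/v18.15.0/node-v18.15.0.tar.gz"),
        ("curl/8.0", "/dist/"),
    ]
    for agent, path in samples:
        print(agent, path, "blocked" if is_blocked(agent, path) else "allowed")
```

Under these assumptions, a directory path such as `/dist/` or `/` is blocked for the listed user agents, while a path ending in one of the allowed extensions is not.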

@adam-browning
Author

Understood, moving to OpenJS

@crazed

crazed commented Sep 6, 2023

I am unable to cache the nodejs binaries using Artifactory at this time due to this. We need this functionality as nodejs.org has poor uptime compared to our internal cache. @adam-browning can you make it so there's a way to change the User-Agent header on the self-hosted Artifactory as a workaround here?

@ovflowd
Member

ovflowd commented Sep 6, 2023

> can you make it so there's a way to change the User-Agent header on the self-hosted Artifactory as a workaround here?

Please don't do this.

> I am unable to cache the nodejs binaries using Artifactory at this time due to this.

I assume you haven't read the comments on this issue. This is not an issue on our side but a misconfiguration on Artifactory. nodejs.org should not be used for Artifactory. At least not for NPM packages.

We already have rules that 100% allow Artifactory requests if they go for fetching Node.js binaries. And we even added screenshots here and other comments about how our rules are set in place...

> We need this functionality as nodejs.org has poor uptime compared to our internal cache.

That's a blunt assertion. The Node.js website has only brief moments of poor performance, at specific cache-purge times. That might give the wrong impression that the nodejs.org website performs poorly or is slow. But hey, we serve petabytes of data every month.


Tackling this with a "let's find ways to circumvent this" approach, when the issue is 100% on the user side, is not the way to go.

We're still waiting for @adam-browning to reach out to us on the OpenJS Slack so we can communicate better. But again, the root issue here is that many examples out there of how to "configure" Artifactory for Node.js are wrong: they make Artifactory request NPM packages from https://nodejs.org/dist. That is what we are blocking in our firewall. It causes your Artifactory instances to fail, because one of the "sources" returns HTTP 403. The solution is simple: update your Artifactory configuration to use https://npmjs.com for NPM packages and https://nodejs.org for Node.js binaries.

@jensborrmann

@adam-browning : Are there any updates/ideas on the open questions?
We want to use Artifactory only for Node.js binaries, not for NPM packages. This use case still does not work with the current firewall settings.
We were not able to identify the real root cause of our problems. A speculative idea: are there initial requests to the root address before the dist URLs are accessed? Receiving a 403 from the root URL might prevent the real request from being executed. (Just guessing...)

@Zvikac

Zvikac commented Dec 5, 2023

@ovflowd
As a JFrog customer and the originator of this issue, I get a 403 rejection when accessing https://nodejs.org as well as https://nodejs.org/dist, and we want to download the Node.js binaries.
This 403 error on https://nodejs.org keeps me from accessing https://nodejs.org/download/release/ as well.
The last Node.js version we downloaded successfully was 18.0.0, 1.5 years ago.
For npm packages we use https://registry.npmjs.org successfully.

@ovflowd
Member

ovflowd commented Dec 5, 2023

As we mentioned before, directory listing for JFrog is prohibited at the moment. You can still directly download binaries via JFrog/Artifactory if they request the binary directly.

If JFrog/Artifactory relies on directory listing to download the versions, that's bad software design on their side, as "regexing/eval'ing" a directory listing is flaky. I'd recommend that the Artifactory/JFrog team fix that and simply read the versions from nodejs.org/dist/index.json
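As a rough illustration of reading versions from index.json instead of parsing a directory listing, here is a minimal Python sketch. It assumes the file is a JSON array of objects with `version`, `files` and `lts` fields; check that layout against the live file before relying on it. The embedded sample is hand-written for the example and is not real response data, and the function names are hypothetical.

```python
import json

# Hand-written sample that mimics the assumed shape of index.json,
# trimmed to the fields this sketch reads. Not real response data.
SAMPLE_INDEX = """
[
  {"version": "v18.15.0", "files": ["linux-x64", "osx-arm64-tar"], "lts": "Hydrogen"},
  {"version": "v18.0.0", "files": ["linux-x64"], "lts": false}
]
"""


def list_versions(index_text: str, lts_only: bool = False) -> list[str]:
    """Return version strings in file order, optionally only LTS releases."""
    entries = json.loads(index_text)
    # A non-LTS entry is assumed to carry "lts": false, which is falsy.
    return [e["version"] for e in entries if not lts_only or e.get("lts")]


def versions_with_file(index_text: str, file_id: str) -> list[str]:
    """Return the versions whose "files" list contains the given identifier."""
    entries = json.loads(index_text)
    return [e["version"] for e in entries if file_id in e.get("files", [])]


if __name__ == "__main__":
    print(list_versions(SAMPLE_INDEX))
    print(list_versions(SAMPLE_INDEX, lts_only=True))
    print(versions_with_file(SAMPLE_INDEX, "osx-arm64-tar"))
```

Because the path ends in `.json`, a request for this file is not matched by the firewall rule shared in this thread.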

Again, the issue is not on our side but in the software you rely on. We're not going to change our firewall rules, since that would degrade the experience for everyone, just because some paid product is misusing our infrastructure.

We might eventually lift these firewall rules once we migrate our binaries "CDN" infra to Cloudflare R2/Cloudflare Workers, which will mitigate the issue of our infrastructure overload/being hammered by a bunch of bots and whatnot.

If you're curious about the progress, here's the issue: https://github.com/nodejs/build/issues/346

We welcomed JFrog (or whatever company is behind this software) to join our Slack and talk with us, but apparently they didn't care enough, so I doubt we're the baddies here. We're open to compromise, but the way the software you're using currently behaves against our infra is just unhealthy.

@Zvikac

Zvikac commented Dec 5, 2023

@ovflowd
Yes, but https://nodejs.org/ is not a directory and I still get 403.

@ovflowd
Member

ovflowd commented Dec 5, 2023

> @ovflowd
> Yes, but https://nodejs.org/ is not a directory and I still get 403.

Why would Artifactory want to open the main page? This is meant for humans not bots.

@ovflowd
Member

ovflowd commented Dec 5, 2023

Well, nodejs.org should not be used as a keep-alive or "is the website up?" endpoint. If you intend to check whether our distribution endpoint is alive, I'd recommend making a HEAD or GET request to nodejs.org/dist/index.json (preferably a HEAD).
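A liveness check along those lines could look like the following minimal Python sketch, which uses only the standard library. The function names `build_probe` and `dist_is_up`, the 10-second timeout, and treating any 2xx status as "up" are assumptions made for illustration, not a recommendation from the Node.js project.

```python
import urllib.request

INDEX_URL = "https://nodejs.org/dist/index.json"


def build_probe(url: str = INDEX_URL) -> urllib.request.Request:
    """Build a HEAD request so that no response body is transferred."""
    return urllib.request.Request(url, method="HEAD")


def dist_is_up(url: str = INDEX_URL, timeout: float = 10.0) -> bool:
    """Return True if the HEAD request succeeds with a 2xx status."""
    try:
        with urllib.request.urlopen(build_probe(url), timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError, HTTPError (for example a 403) and socket timeouts
        # are all subclasses of OSError.
        return False


if __name__ == "__main__":
    print("distribution endpoint reachable:", dist_is_up())
```

A 403 from the firewall shows up here as `False`, so the check does not distinguish "blocked" from "down"; a caller that needs that distinction would have to inspect the HTTP error instead of swallowing it.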

@Zvikac

Zvikac commented Dec 5, 2023

@ovflowd
I'm not really trying to open https://nodejs.org/, but it's part of the trial-and-error troubleshooting I'm going through. I want to focus on the root cause.
And if I get a 403 on the main link, then obviously I get it further down the road too.

@Zvikac

Zvikac commented Dec 5, 2023

@ovflowd
O.K. I got your 2nd comment.

@ovflowd
Member

ovflowd commented Dec 5, 2023

> @ovflowd O.K. I got your 2nd comment.

I apologise for the unfortunate experience you're having. If I were in your place, I would also be frustrated. I just want to assure you that we made these changes as a precaution, because the traffic was degrading our infrastructure to the point that it was very problematic.

You can read more here: https://nodejs.org/en/blog/announcements/node-js-march-17-incident

@ovflowd
Member

ovflowd commented Dec 5, 2023

> @ovflowd I'm not really trying to open nodejs.org but it's a part of the trial-error troubleshooting I'm going through, I want to focus on the root cause. and if I get 403 on the main link than obviously I get it down the road.

Right, we shared the regex rule of our firewall in this issue thread, so you can test for yourself what it blocks and what it doesn't. (Actually, I'm not finding it here, so I feel we shared it elsewhere; let me share it again.)

(http.user_agent matches "^(Artifactory/|Nexus/|AzureArtifacts/|Gradle/)" and not http.request.uri.path matches "\.(gz|xz|pkg|7z|zip|msi|txt|asc|sig|lib|exe|json|tab)$")

@adam-browning
Author

Dear @Zvikac & @jensborrmann
I apologize for not responding sooner and I would like to address your issues.
Please reach out to me directly via email to [email protected] and let's connect to see how we can help you.

I was previously in contact with both @ovflowd & @targos and they have been extremely patient and attentive in truly helping with this issue. Unfortunately, I have not been available for some time and I intend to remedy that now.

@ovflowd & @targos I truly appreciate all you have done here to assist us. I hope we will be able to continue our collaboration and address your concerns to eliminate the invalid requests that were impacting your servers.

Thanks in advance,
Adam

@ovflowd
Member

ovflowd commented Dec 5, 2023

Life happens, appreciate you reaching out to us again!

@adam-browning
Author

Dear @Zvikac @jensborrmann @crazed

We have conducted a thorough investigation of this issue together with our friends at nodejs.org and confirmed that there is no disruption of service affecting the main ability to download binaries from https://nodejs.org/

However, we do see two features in JFrog/Artifactory that behave differently from what you are probably expecting, namely "TEST connection" and "Remote Repository Browsing".
Remote repository browsing is not supported by most registries and remote targets.

I invite you to contact your ESL or me directly to help with these issues.

Additionally, we have identified an article, https://sionwilliams.com/posts/2020-12-09-node-n-npm-mirror/, that shows an outdated practice.
In response, we have reached out to the maintainer to update this article, and in parallel JFrog will provide an official Knowledge Base article on how to set this up correctly.

If you have stumbled upon this thread, please refrain from using https://nodejs.org/dist in your Generic Remote Repository URLs, the correct URL for nodejs binaries is https://nodejs.org/ (WITHOUT /dist)

From our perspective, this issue is closed.
We deeply appreciate all the help from our friends at Node.js.
@ovflowd, @targos - Thank you 🙏

@ovflowd
Member

ovflowd commented Dec 6, 2023

Hey @adam-browning, I appreciate your effort here and the communication. 🙇
