Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

mismi-s3: Multi-thread downloadRecursive #378

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

erikd-ambiata
Copy link
Contributor

This complies. Waiting to see if the tests pass.

Hard coded the number of forked threads to 10. Should that be a parameter?

@nhibberd
Copy link
Member

Hard coded the number of forked threads to 10. Should that be a parameter?

Can we do similar to sync, expose a --fork with a default

@olorin
Copy link
Contributor

olorin commented Oct 31, 2017

Why the per-file concurrency on top of the per-chunk concurrency? Not sure if I missed a conversation somewhere, but I suspect this might actually be slower - increasing the concurrency by a factor of ten would make us more likely to hit rate-limiting.

@erikd-ambiata
Copy link
Contributor Author

@olorin This is just a WIP. I think we probably need to collect file sizes as well and if the file sizes are small enough using multiple downloadSingle and for large files, use a single multipart download.

@olorin
Copy link
Contributor

olorin commented Nov 1, 2017

This is just a WIP. I think we probably need to collect file sizes as well and if the file sizes are small enough using multiple downloadSingle and for large files, use a single multipart download.

That sounds good. Would be good to have some empirical basis for whatever the split ends up being too.

@erikd-ambiata
Copy link
Contributor Author

Might make sense to fix recursiveUpload first. More obvious how to do that.

@nhibberd
Copy link
Member

nhibberd commented Nov 2, 2017

As a first pass it would be fine to have a default of 1 with an option to override as well. That way we can start using it even if its not the most efficient at the moment

@erikd-ambiata
Copy link
Contributor Author

When this does the right/smart thing, would we still want an option to specify the parallelism? If not it seems a bit weird to add it for now only to remove it later.

@erikd-ambiata erikd-ambiata force-pushed the topic/better-recursive branch 2 times, most recently from 60e6de8 to 92bcfa0 Compare November 2, 2017 23:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants