refactor: Extracted common Cache class for KeeperParams and made it pickable #1178
Hello,
We are using the keepercommander library as a secrets backend in Airflow: we implemented a custom SecretsBackend that lets us use Keeper for all our Airflow connections. To do so we use the KeeperParams class directly in Python rather than through the CLI, which works great.
The issue we discovered is that when we have many simultaneous tasks, the Keeper API gets called too often and we are throttled, meaning tasks are stuck until Keeper allows us to call the API again, which is logical. Still, we didn't understand why the API was called so much, since the results are cached in the KeeperParams class.
After investigation we discovered that even though the results are cached, it doesn't help in our case: we run in a multi-processing environment on Kubernetes, which means not only that processes don't share state, but also that the process running a task may be on another pod entirely.
So our solution is to pickle (and of course encrypt) the cached records of the KeeperParams and store the result on the shared filesystem. Each time we instantiate a new KeeperParams, we first check whether a pickled file is available and, if so, load it. That way the cached records are instantly available without reconnecting to the Keeper API.
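Roughly, the flow looks like this (a minimal sketch, not code from this PR — the path and the `encrypt`/`decrypt` helpers are placeholders; in our real setup they do actual encryption before anything touches the shared filesystem):

```python
import os
import pickle
import tempfile

# Placeholder helpers: stand-ins for real encryption of the pickled bytes.
def encrypt(data: bytes) -> bytes:
    return data

def decrypt(data: bytes) -> bytes:
    return data

CACHE_PATH = "/shared/keeper_params.pkl"  # hypothetical shared-filesystem path

def save_params(params, path=CACHE_PATH):
    # Write to a temp file and rename, so concurrent tasks on other pods
    # never read a half-written cache file.
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "wb") as f:
        f.write(encrypt(pickle.dumps(params)))
    os.replace(tmp, path)

def load_params(path=CACHE_PATH):
    # Return the cached object, or None if no cache has been written yet,
    # in which case the caller falls back to a normal Keeper API login.
    try:
        with open(path, "rb") as f:
            return pickle.loads(decrypt(f.read()))
    except FileNotFoundError:
        return None
```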
The first thing I did was move all cache-related dictionaries in KeeperParams to a dedicated class named Cache, which is picklable and also comparable, making assertions in the tests easier. I also made BaseFolderNode comparable. Then I implemented the `__reduce__` method for KeeperParams, which pickles only non-sensitive data and the cache.
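To illustrate the shape of the change (a simplified sketch with illustrative field names, not the actual PR code):

```python
import pickle

class Cache:
    """Holds the cached dictionaries; picklable and comparable."""
    def __init__(self):
        self.record_cache = {}
        self.folder_cache = {}

    def __eq__(self, other):
        return (isinstance(other, Cache)
                and self.record_cache == other.record_cache
                and self.folder_cache == other.folder_cache)

class KeeperParams:
    def __init__(self, user="", password="", cache=None):
        self.user = user
        self.password = password  # sensitive: must never be pickled
        self.cache = cache if cache is not None else Cache()

    def __reduce__(self):
        # Tell pickle how to recreate this object from non-sensitive
        # fields plus the cache; the password is deliberately left out.
        return (self.__class__, (self.user, "", self.cache))
```

With `__reduce__` defined on KeeperParams itself, callers can pickle the whole object and the sensitive fields are dropped automatically, rather than each caller having to know which dictionaries are safe to extract.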
At the moment we have a custom solution that pickles the required caches from the KeeperParams class, but I thought it would be better if this were available directly in the keepercommander code base: if refactorings occur in the future, the pickling would still work, and we could simply pickle the KeeperParams without worrying it might break.
Of course I added a unit test which pickles a KeeperParams instance and asserts which fields are pickled and which are not.