System to make API-calls reproducible #136
Closed
ingoboerner
started this conversation in
Ideas
-
@ingoboerner Is this still a valid project or has it been obsoleted by dockerizing a specific DraCor API instance?
-
In (our) analysis of literary texts, API responses can be understood as statements and should therefore be reproducible across version changes (of corpora, API, metrics service). To achieve this, imo we need at least two system components:
To simply look at a handful of API responses, setting up a whole system might be too much effort (it would probably take some time to run a build process to generate the corresponding image...), so we could implement a second system component that would only allow quickly reproducing a certain explicitly saved response:
A relatively simple solution to ensure that other users can reproduce an analysis based on the API calls and responses would be to implement a system that stores the actual response and assigns it a unique identifier (e.g. a hash value of the API call, the versions/commits of data and API, and the current datetime). At some later point this identifier could be used to query a certain API endpoint that could a) return the saved response and b) provide some hint on how to reproduce the whole system state (data + API) at a given point in time.
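A minimal sketch of how such an identifier could be derived, in Python for illustration only. The function name `response_identifier`, the field names, and the choice of SHA-256 are assumptions, not part of the DraCor API; the commit and version values in the usage example are placeholders.

```python
import hashlib
import json
from datetime import datetime, timezone


def response_identifier(api_call, data_commit, api_version, stored_at):
    """Hash the API call, the data/API versions and the storage time.

    All parameter names are hypothetical. `stored_at` must be a
    timezone-aware datetime so the result does not depend on the
    local timezone of the machine computing it.
    """
    if stored_at.tzinfo is None:
        raise ValueError("stored_at must be timezone-aware")
    payload = {
        "api_call": api_call,
        "data_commit": data_commit,
        "api_version": api_version,
        "stored_at": stored_at.astimezone(timezone.utc).isoformat(),
    }
    # Sorted keys and fixed separators give a canonical serialization,
    # so the same inputs always produce the same hash.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Usage with placeholder values (the commit and version are made up):
identifier = response_identifier(
    "/api/corpora/ger/play/ger000321/metrics",
    "abc123",
    "1.0.0",
    datetime(2022, 1, 1, 12, 0, tzinfo=timezone.utc),
)
```

Including the datetime means two stored copies of the same call against the same versions get different identifiers; leaving it out would instead deduplicate them. Which behaviour is wanted is an open design choice.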
We could implement a parameter for all API endpoints, ?store=true, that would explicitly trigger a save-response operation (the default would be false, so responses are not stored by default), e.g. https://dracor.org/api/corpora/ger/play/ger000321/metrics?store=true
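How an endpoint might read this flag can be sketched as follows, again in Python purely for illustration. The helper name `store_requested` is hypothetical; only the `store` parameter and its default of false come from the proposal above.

```python
from urllib.parse import parse_qs, urlsplit


def store_requested(url):
    """Return True only if the query string carries store=true."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("store", [])
    # A missing parameter, or any value other than "true", keeps the
    # default behaviour: the response is not stored.
    return bool(values) and values[-1].lower() == "true"


base = "https://dracor.org/api/corpora/ger/play/ger000321/metrics"
explicit = store_requested(base + "?store=true")
default = store_requested(base)
```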
When storing is requested, the response should be stored in a designated database ("response-store"). If the API only produced JSON, a database like CouchDB could be used, in which we store the response under the generated hash as the key identifying it. A challenge is that we would have to store several response formats, e.g. JSON, XML, CSV, ...
After storing the response, the API would return the response data together with additional data containing the relevant information on the system state/version and the identifier of the stored response.
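One way to sidestep the multiple-format problem would be to store the serialized body as text next to its media type, so JSON, XML and CSV all fit the same record. The sketch below assumes this, and uses an in-memory dict as a stand-in for the real response-store. Every name here (`StoredResponse`, `RESPONSE_STORE`, `store_response`, the field names) is hypothetical, and the commit and version values are placeholders.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredResponse:
    identifier: str
    media_type: str  # e.g. "application/json", "application/xml", "text/csv"
    body: str        # the serialized response exactly as it was sent
    api_call: str
    data_commit: str
    api_version: str
    stored_at: str   # ISO 8601 timestamp


# Stand-in for the "response-store" database.
RESPONSE_STORE = {}


def store_response(api_call, media_type, body, data_commit, api_version, stored_at):
    """Store the response and return it together with the storage metadata."""
    key_material = "\n".join([api_call, media_type, data_commit, api_version, stored_at])
    identifier = hashlib.sha256(key_material.encode("utf-8")).hexdigest()
    RESPONSE_STORE[identifier] = StoredResponse(
        identifier, media_type, body, api_call, data_commit, api_version, stored_at
    )
    # The caller gets the original data plus what is needed to find it again.
    return {
        "response": body,
        "stored": {
            "identifier": identifier,
            "data_commit": data_commit,
            "api_version": api_version,
            "stored_at": stored_at,
        },
    }


result = store_response(
    "/api/corpora/ger/play/ger000321/metrics",
    "text/csv",
    "name,value\nsize,10\n",
    "abc123",
    "1.0.0",
    "2022-01-01T12:00:00+00:00",
)
```

For non-JSON formats the metadata cannot simply be wrapped around the body like this; it would have to travel some other way, which the proposal leaves open.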
To retrieve stored responses, we could either use a designated endpoint .../{storedresponse}/{identifier} that would return the stored response from the database, or add the retrieval function to each API endpoint. Either way, the stored response should tell the user how to run the same call on the current data and API state, and give a hint on how to get a running system at the state of the stored response.
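The retrieval side could look roughly like this. It is a sketch under assumptions: the store is a plain dict, records are dicts with the hypothetical fields shown, and the key names in the returned structure are invented for illustration.

```python
def retrieve_stored_response(store, identifier, current_api_base):
    """Look up a stored response and attach hints for reproducing it.

    Returns None if nothing is stored under the identifier.
    """
    record = store.get(identifier)
    if record is None:
        return None
    return {
        "response": record["body"],
        "media_type": record["media_type"],
        # Hint 1: the same call against the current data and API state.
        "rerun_on_current_state": current_api_base + record["api_call"],
        # Hint 2: the versions needed to rebuild the system as it was.
        "reproduce_system_state": {
            "data_commit": record["data_commit"],
            "api_version": record["api_version"],
            "stored_at": record["stored_at"],
        },
    }


# Usage with a made-up identifier, commit and version:
example_store = {
    "0f3a": {
        "body": '{"size": 10}',
        "media_type": "application/json",
        "api_call": "/corpora/ger/play/ger000321/metrics",
        "data_commit": "abc123",
        "api_version": "1.0.0",
        "stored_at": "2022-01-01T12:00:00+00:00",
    }
}
found = retrieve_stored_response(example_store, "0f3a", "https://dracor.org/api")
missing = retrieve_stored_response(example_store, "ffff", "https://dracor.org/api")
```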