System to make API-calls reproducible #136

ingoboerner · 2021-05-07T09:33:56Z

ingoboerner
May 7, 2021
Maintainer

In (our) analysis of literary texts, API responses can be understood as statements and should therefore be reproducible across version changes (of the corpora, the API, the metrics service). To achieve this, imo we need at least two system components:

A system that provides the API + corpus data at a certain state in time (identified by the commit ids of the corpus data and the version number of the API). It could be possible to use e.g. Docker containers to reproduce this state; maybe this has to be done locally by the user, but we could still implement a system that would provide the image or the docker-compose file.
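As a rough illustration of what "providing the docker-compose" could mean, here is a minimal sketch that renders a Compose file pinned to one API version and one corpus commit. The image name and the `CORPUS_NAME` / `CORPUS_COMMIT` environment variables are hypothetical placeholders, not names taken from an existing DraCor image.

```python
def render_compose(api_image: str, api_version: str,
                   corpus: str, corpus_commit: str) -> str:
    """Render a minimal Compose file that pins the API image tag and corpus commit.

    The environment variable names are assumptions made for this sketch only.
    """
    return (
        "services:\n"
        "  api:\n"
        f"    image: {api_image}:{api_version}\n"
        "    environment:\n"
        f"      CORPUS_NAME: {corpus}\n"
        f"      CORPUS_COMMIT: {corpus_commit}\n"
    )


# Hypothetical image name; version and commit are the ones used as examples in this post.
compose_text = render_compose(
    "example.org/dracor-api",
    "0.20.1",
    "ger",
    "84d68c743d172933e8989662bb6a0066887b9a7d",
)
print(compose_text)
```

The point is only that the pair (API version, corpus commit) is enough to name a system state; how the container actually loads the corpus at that commit is left open here.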

To simply look at a handful of specific API responses, setting up a whole system might be too much effort (it would probably take some time to run a build process to generate the corresponding image...), so we could implement a second system component that would only allow users to quickly reproduce a certain explicitly saved response:

Response store
A relatively simple solution to ensure that other users can reproduce an analysis based on the API calls and responses would be to implement a system that stores the actual response and assigns it a unique identifier (e.g. a hash value of the API call, the versions/commits of data and API, and the current datetime). At some later point this identifier could be used to query a certain API endpoint that could a) return the saved response and b) provide some hint on how to reproduce the whole system state (data + API) at a given point in time.
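The identifier described above could be derived roughly like this. This is a minimal sketch, assuming SHA-256 over a canonical JSON serialization; the function name and the field names inside the payload are made up for illustration.

```python
import hashlib
import json
from datetime import datetime, timezone


def response_identifier(api_call: str, api_version: str,
                        corpus_commit: str, stored_at: datetime) -> str:
    """Return a SHA-256 hex digest over the call, the versions and the timestamp."""
    payload = {
        "apiCall": api_call,
        "apiVersion": api_version,
        "corpusCommit": corpus_commit,
        # Normalize to UTC so the same instant always serializes identically.
        "storedAt": stored_at.astimezone(timezone.utc).isoformat(),
    }
    # Sorted keys and fixed separators make the serialization deterministic.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


identifier = response_identifier(
    "/api/corpora/ger/play/ger000321/metrics",
    "0.20.1",
    "84d68c743d172933e8989662bb6a0066887b9a7d",
    datetime(2021, 5, 7, 9, 33, 56, tzinfo=timezone.utc),
)
print(identifier)
```

Because the datetime is part of the hash, storing the same call twice yields two identifiers; leaving it out would instead give one identifier per (call, versions) combination.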

We could implement a parameter for all API endpoints that would explicitly trigger a save-response operation: ?store=true (the default would be false, so responses are not stored unless requested), e.g. https://dracor.org/api/corpora/ger/play/ger000321/metrics?store=true
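The opt-in behaviour of the parameter can be sketched with the Python standard library. The helper name `store_requested` is hypothetical, and treating the value case-insensitively is an assumption of this sketch.

```python
from urllib.parse import parse_qs, urlsplit


def store_requested(url: str) -> bool:
    """Return True only if the query string carries store=true."""
    query = parse_qs(urlsplit(url).query)
    values = query.get("store", [])
    # A missing parameter means "do not store"; a repeated one uses its last value.
    return bool(values) and values[-1].lower() == "true"


base = "https://dracor.org/api/corpora/ger/play/ger000321/metrics"
print(store_requested(base + "?store=true"))
print(store_requested(base))
```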

When storing is requested, the response should be stored in a designated database ("response-store"). A challenge might be that we would have to store several response formats, e.g. JSON, XML, CSV, ... If the API only produced JSON, a database like CouchDB could be used, in which we would store the response under the generated hash as the key identifying it. With multiple response formats, this becomes harder.
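One possible way around the multi-format problem is to store the raw response body as bytes together with its media type, so JSON, XML and CSV are all handled the same way. The sketch below uses SQLite from the Python standard library instead of CouchDB, purely to keep the example self-contained; the class and table names are hypothetical.

```python
import sqlite3
from typing import Optional, Tuple


class ResponseStore:
    """Key-value store: identifier -> (content type, raw response body)."""

    def __init__(self, path: str = ":memory:") -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS responses ("
            "id TEXT PRIMARY KEY, content_type TEXT NOT NULL, body BLOB NOT NULL)"
        )

    def put(self, identifier: str, content_type: str, body: bytes) -> None:
        # The connection as context manager commits on success, rolls back on error.
        with self._db:
            self._db.execute(
                "INSERT OR REPLACE INTO responses VALUES (?, ?, ?)",
                (identifier, content_type, body),
            )

    def get(self, identifier: str) -> Optional[Tuple[str, bytes]]:
        row = self._db.execute(
            "SELECT content_type, body FROM responses WHERE id = ?",
            (identifier,),
        ).fetchone()
        return None if row is None else (row[0], bytes(row[1]))


store = ResponseStore()
store.put("hashOfStoredResponse", "text/csv", b"id,name\n1,example\n")
print(store.get("hashOfStoredResponse"))
```

The trade-off is that the stored body is opaque to the database, so it cannot be queried by content; that seems acceptable if the store is only ever addressed by identifier.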

After storing the response, the API would return the response data together with additional data containing the relevant information on the system state/version and the identifier of the stored response:

{
    "storedResponseMeta": {
        "_id": "hashOfStoredResponse",
        "systemState": {
            "existdb": "4.1.0",
            "name": "DraCor",
            "status": "beta",
            "version": "0.20.1"
        },
        "dataState": {
            "corpusname": "ger",
            "commit": "84d68c743d172933e8989662bb6a0066887b9a7d",
            ...
        },
        ...
    },
    "responseData": { actual response of API-Request ... }
}
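Building this envelope could look roughly like the following. It is a sketch only: the function name `wrap_response` is hypothetical, and only the fields spelled out in the example above are filled in (the elided ones are left out).

```python
import json
from typing import Any, Dict


def wrap_response(identifier: str, system_state: Dict[str, Any],
                  data_state: Dict[str, Any], response_data: Any) -> Dict[str, Any]:
    """Wrap the original API response together with the stored-response metadata."""
    return {
        "storedResponseMeta": {
            "_id": identifier,
            "systemState": system_state,
            "dataState": data_state,
        },
        "responseData": response_data,
    }


envelope = wrap_response(
    "hashOfStoredResponse",
    {"existdb": "4.1.0", "name": "DraCor", "status": "beta", "version": "0.20.1"},
    {"corpusname": "ger", "commit": "84d68c743d172933e8989662bb6a0066887b9a7d"},
    {"placeholder": "actual response of the API request"},
)
print(json.dumps(envelope, indent=4))
```

Note that this wrapping only works directly for JSON responses; for XML or CSV the metadata would have to travel some other way, e.g. in response headers.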

To retrieve stored responses, we could either use a designated endpoint .../{storedresponse}/{identifier} that would return the stored response from the database, or add the retrieval function to each API endpoint. Either way, the stored response should tell the user how to run the same request against the current data and API state, and give a hint on how to get a running system at the state of the stored response.
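The designated-endpoint variant might be sketched as below. Everything here is an assumption for illustration: the `/storedresponse/` path segment, the record layout, and the names of the hint fields. A plain dict stands in for the response store.

```python
from typing import Any, Dict, Optional, Tuple

PREFIX = "/storedresponse/"  # hypothetical path segment


def retrieve(path: str,
             store: Dict[str, Dict[str, Any]]) -> Tuple[int, Optional[Dict[str, Any]]]:
    """Return (HTTP status, body) for a stored-response lookup."""
    if not path.startswith(PREFIX):
        return 404, None
    identifier = path[len(PREFIX):].strip("/")
    record = store.get(identifier)
    if record is None:
        return 404, None
    return 200, {
        "storedResponse": record["response"],
        # The same call without ?store=true runs against the current state.
        "rerunOnCurrentState": record["apiCall"],
        # Enough information to rebuild the system as it was when storing.
        "reproduceHint": {
            "apiVersion": record["apiVersion"],
            "corpusCommit": record["corpusCommit"],
        },
    }


store = {
    "hashOfStoredResponse": {
        "response": {"placeholder": "stored response"},
        "apiCall": "/api/corpora/ger/play/ger000321/metrics",
        "apiVersion": "0.20.1",
        "corpusCommit": "84d68c743d172933e8989662bb6a0066887b9a7d",
    }
}
status, body = retrieve("/storedresponse/hashOfStoredResponse", store)
print(status, body)
```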

cmil · 2024-02-06T10:58:02Z

cmil
Feb 6, 2024
Maintainer

@ingoboerner Is this still a valid project or has it been obsoleted by dockerizing a specific DraCor API instance?

2 replies

ingoboerner Feb 6, 2024
Maintainer Author

I think the solution of dockerizing the whole system solves this.

cmil Feb 6, 2024
Maintainer

Let's close this discussion then.
