Tutorial part 2 #37

Open
franknoe opened this issue Mar 24, 2017 · 0 comments

This refers to the tutorial part 2:
https://github.com/markovmodel/adaptivemd/blob/master/examples/tutorial/2_example_run.ipynb

  • General:

    • I think it is important to make clear that this notebook (if I understand correctly) defines an execution plan: it only defines how trajectories will be run or restarted, before anything has actually been run. Either way, the text does not make this clear. Both the text and the API need to separate clearly what refers to plans or queues from what reports the status of available data.
    • There are many typos and grammar errors, and many sentences are unclear (e.g. "that's it", "do whatever you need"...). In general, avoid the words 'this' or 'that' whenever their referent is ambiguous (e.g. when they refer to a noun in one of the previous sentences, but there are multiple nouns they could refer to).
  • Cell 3: OK, so creating a Project with a name either creates a new Project if the name doesn't exist yet, or it loads the existing project under that name. That's not clear from the API docs and could also be made more explicit in the tutorial here.
    Other issues / comments on project creation:

    • It looks like anyone can delete any project by knowing its name. If people are using the same MongoDB server, this is a bit dangerous. I'm not calling for a secure solution with passwords etc. (takes too long to implement), but please think of a minimal solution that helps to avoid unintentional deletion of other people's projects. Maybe require the user name of the project owner to be specified in the delete command as well.
    • Is there a way to list the projects in the database? Also, how do you specify the database? (The tutorial just creates a Project, so I guess there is a standard location or name for the database on the local host, but if it's not there, how do I specify which host/database to use?)
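
To illustrate the kind of minimal safeguard meant above, here is a sketch in plain Python. The `ProjectRegistry` class and all of its methods are hypothetical stand-ins, not part of the adaptivemd API, and an in-memory dict replaces the MongoDB backend. It shows the "create or load by name" behaviour, a way to list projects, and a delete that refuses to act unless the owner name matches.

```python
class ProjectRegistry:
    """Hypothetical in-memory stand-in for the project database."""

    def __init__(self):
        self._owners = {}  # project name -> owner user name

    def get_or_create(self, name, owner):
        """Load the project if the name exists, otherwise create it.

        Returns (owner_of_project, created_flag).
        """
        created = name not in self._owners
        if created:
            self._owners[name] = owner
        return self._owners[name], created

    def list_projects(self):
        """Return all known project names in sorted order."""
        return sorted(self._owners)

    def delete(self, name, owner):
        """Delete a project only if the caller names the correct owner."""
        if name not in self._owners:
            raise KeyError("no project named %r" % name)
        if self._owners[name] != owner:
            raise PermissionError(
                "project %r belongs to another user" % name)
        del self._owners[name]
```

This is not a security mechanism; it only guards against deleting someone else's project by accident, which is the scope asked for above.
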
  • Cell 6: Nice. However, Files (probably also other objects) seem to have almost no API docs. Please complete API docs for all objects that are visible to the user.

  • "make the task" should be "create the task" (everywhere)

  • Cell 7: Trajectory object, parameter frame. Is this always the starting frame for the trajectory? Then give it a more explicit name.

  • Cell 10: That is a bit unclear and tricky indeed.

    • I guess by "source trajectory" you mean the trajectory to be extended. Now you write the trajectory can only be run if the source trajectory exists - since the object apparently exists already at that point, I guess you mean that the extend command is only executed when the trajectory object is filled with the results of the first run - if correct, please make that clear in the tutorial.
    • In general, I don't see how the correct order is guaranteed. For example, if I create 10 tasks, consisting of 1 task that runs the initial trajectory, and 9 extension tasks, then these tasks should run in sequence, i.e. it doesn't make sense if all or several of the 9 extension tasks start running after the first piece is available. Is this implemented? In general it's not clear at this point how the user can ensure an execution sequence.
    • I feel that a "complete solution" of managing execution sequences would require implementing dependency graphs, which is beyond what's possible now. I'm perfectly fine with having a very minimal solution that just provides very basic tools of flow control (e.g. be able to give ordered task lists to make sure that they are always run in sequence). Just be explicit in the documentation about that and say what hasn't been done so far.
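
As an illustration of the "ordered task list" idea above, here is a sketch in plain Python. The names `run_in_sequence`, `run_initial` and `extend` are hypothetical, and tasks are modelled as ordinary callables rather than adaptivemd Task objects. Each task starts only after the previous one has finished, and an exception stops the rest of the chain.

```python
def run_in_sequence(tasks):
    """Run (name, callable) pairs strictly one after another.

    An exception raised by any task propagates and stops the chain,
    so later tasks never start before earlier ones have succeeded.
    Returns the names of the tasks that completed.
    """
    completed = []
    for name, task in tasks:
        task()
        completed.append(name)
    return completed


# Toy stand-in for a trajectory: only its current length in frames.
trajectory = {"length": 0}


def run_initial():
    trajectory["length"] = 100


def extend():
    # An extension only makes sense once the source data exists.
    if trajectory["length"] == 0:
        raise RuntimeError("cannot extend a trajectory that has not been run")
    trajectory["length"] += 50


# 1 initial task followed by 9 extension tasks, as in the example above.
plan = [("initial", run_initial)]
plan += [("extend_%d" % i, extend) for i in range(1, 10)]

completed = run_in_sequence(plan)
print(len(completed), trajectory["length"])
```

This deliberately avoids a dependency graph: the only ordering guarantee is the order of the list itself.
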
  • Cell 12: It's unclear if this is showing planned trajectories and lengths, or actually existing data. The text says we would expect a trajectory of length 100 and another of length 150, but the output does not show that. I guess we only see the trajectory of length 100 that was added to the project in cell 8 of this notebook. If so, that's also confusing, because the planned length is really 150 and, I guess, nothing has been run yet at this point. So how do we get information on all planned trajectories and on all available data?

  • Cell 14: You say the task failed, but the output shows u'queued'. How can we tell that it failed?

  • Cell 16: I don't understand the NOTE. Please reformulate / clarify.

  • Cell 17: I don't understand what "If you have the trajectory object and want to just create the trajectory, you can also put this in the queue." means. Also "do whatever you need" doesn't say anything.

  • Cell 17: At first it was unclear to me what this cell is supposed to show. Say that this is a shortcut, where the project generates the task for you.

  • Cell 18: What does trajectory = project.trajectories.one mean? Is the attribute actually called one, and what is it? Do you want to refer to the first trajectory in the list?

  • Cell 18 shows no output, but the text below says "Good, at least 100 frames". What are we supposed to see here? And again, would this show frames that have already been run, or just the planned length?

  • Cell 20: project.wait_until(condition) is extremely important, but just a side note here. Maybe highlight how to do flow control in a separate notebook or section.
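
For readers unfamiliar with the pattern, here is a generic sketch of condition-based waiting in plain Python. This is not adaptivemd's implementation of `project.wait_until`; the free function `wait_until` below, its `poll_interval` and `timeout` parameters, and the `enough_frames` condition are all assumptions made for illustration.

```python
import time


def wait_until(condition, poll_interval=0.01, timeout=5.0):
    """Block until condition() returns True or the timeout expires.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while not condition():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
    return True


# Toy stand-in for data being produced elsewhere: every check of the
# condition pretends that a worker has delivered 25 more frames.
frames = {"count": 0}


def enough_frames():
    frames["count"] += 25
    return frames["count"] >= 100


reached = wait_until(enough_frames)
print(reached, frames["count"])
```

A flow-control section could build on this pattern: submit tasks, wait on a condition about available data, then decide what to submit next.
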

  • Cell 22: should this be model / modeller or analysis? Make sure terminology is consistent.

  • Cell 27: What does project.find_ml_next_frame(4) mean, especially what's ml? Also, you say in the text that the simplest strategy is to use 1/c, but I don't see anything in the commands specifying that.
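
To make the question concrete, here is one possible reading of a "1/c" strategy, sketched in plain Python. It assumes that c is the number of frames observed so far in each discrete state, which the tutorial does not state. The functions `inverse_count_probabilities` and `pick_restart_frames` are hypothetical and say nothing about what `project.find_ml_next_frame` actually does.

```python
import random
from collections import Counter


def inverse_count_probabilities(state_of_frame):
    """Map each state to a selection probability proportional to 1/c,
    where c is the number of frames assigned to that state."""
    counts = Counter(state_of_frame)
    weights = {s: 1.0 / c for s, c in counts.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}


def pick_restart_frames(state_of_frame, n, rng=None):
    """Pick n frame indices: choose a state with probability ~ 1/c,
    then choose a frame uniformly within that state."""
    rng = rng or random.Random()
    probs = inverse_count_probabilities(state_of_frame)
    states = sorted(probs)
    frames_by_state = {
        s: [i for i, x in enumerate(state_of_frame) if x == s]
        for s in states
    }
    chosen = rng.choices(states, weights=[probs[s] for s in states], k=n)
    return [rng.choice(frames_by_state[s]) for s in chosen]


# Frames 0-3 are in state 0, frame 4 is in the rarely visited state 1,
# so state 1 is four times as likely to be chosen as state 0.
assignments = [0, 0, 0, 0, 1]
probabilities = inverse_count_probabilities(assignments)
restart_frames = pick_restart_frames(assignments, 4, random.Random(0))
```

If the tutorial intends something like this, the cell should show where the 1/c weighting enters the call.
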

  • Cell 27: Now the term "Brain" appears, but it has not been used anywhere else so far, so change it to the actual nomenclature. Also, what does "This will be moved to the Brain" mean? What is "this"?

  • Cell 27/28: Relationship is not clear. Is executing Cell 27 required to run Cell 28? If so, why? Or are they independent, and if they're independent, what's the use of Cell 27 and how can we use the results from Cell 27 in Cell 28?

  • What about workers? The control and behavior of workers (including automatic shutdown when they don't receive tasks for a while) should be documented in a general doc page (link to it here).
