
Convert import and cleanup tasks to their own classes; separate behavior from entry and catalog #54

Open
lzkelley opened this issue Jul 12, 2016 · 1 comment

lzkelley commented Jul 12, 2016

We have a Task object, defined in astrocats/catalog/task.py, which represents a to-do task. Each of these tasks stores the name of a function (and the submodule it is located in) that should be executed to complete the task. Instead, the Task object should be expanded to include the function itself. This would likely look something like:

class Task:
    def __init__(self, **kwargs):
        # Set up all of the variables currently in the `Task` objects
        pass

    def load(self, **kwargs):
        # The actual function to complete the task
        raise NotImplementedError

A list of Task objects will be created, and each will just have a few parameters (like they do now). If a Task is active (i.e. Task.active == True) then Task.load() will be called in the import script.
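
A minimal sketch of that flow, assuming hypothetical names (`name`, `priority`, `LoadExample`, `run_active`) that are illustrative only and not taken from astrocats:

```python
class Task:
    """A to-do task that carries its own behavior (illustrative sketch)."""

    def __init__(self, name, active=True, priority=0):
        # `name` and `priority` are assumed parameters, chosen for illustration
        self.name = name
        self.active = active
        self.priority = priority

    def load(self, catalog):
        # Subclasses override this with the actual work for the task
        raise NotImplementedError


class LoadExample(Task):
    def load(self, catalog):
        # Stand-in for real import work: record that the task ran
        catalog.append(self.name)


def run_active(tasks, catalog):
    """Call `load()` on each active task, in priority order."""
    for task in sorted(tasks, key=lambda t: t.priority):
        if task.active:
            task.load(catalog)
    return catalog


tasks = [
    LoadExample("external", priority=1),
    LoadExample("internal", priority=0),
    LoadExample("skipped", active=False),
]
result = run_active(tasks, [])
print(result)
```

Here a plain list stands in for the catalog; the inactive task is never loaded, so the import script only needs the loop in `run_active`.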

Benefits:

  • Make it easy to subclass the general-catalog tasks as needed by individual catalogs.
  • Remove the need for a tasks.json input file. Instead all of the tasks can live in the tasks directory, which will be searched. The default settings will be stored in the class definitions.
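
One way these two points might look in practice, sketched under assumptions: defaults live as class attributes, an individual catalog overrides them by subclassing, and the available tasks are found by walking the `Task` subclasses once the modules in the tasks directory have been imported (the directory scan itself is omitted here). All class and attribute names below are hypothetical.

```python
class Task:
    # Default settings live on the class instead of in tasks.json
    active = True
    priority = 0

    def load(self, catalog):
        raise NotImplementedError


class InternalTask(Task):
    # General-catalog task with its own defaults
    priority = 1

    def load(self, catalog):
        catalog.append("internal")


class SupernovaInternalTask(InternalTask):
    # An individual catalog adjusts a default and extends the behavior
    priority = 5

    def load(self, catalog):
        super().load(catalog)
        catalog.append("supernova-specific")


def all_subclasses(cls):
    """Recursively collect every subclass of `cls` defined so far."""
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found


task_classes = sorted(all_subclasses(Task), key=lambda c: c.priority)
task_names = [c.__name__ for c in task_classes]
print(task_names)
```

Enumerating subclasses is a swapped-in stand-in for searching the tasks directory: it only sees classes whose modules have already been imported.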

lzkelley self-assigned this on Jul 12, 2016
lzkelley changed the title from "Convert do_task functions to classes (combine with existing task objects)" to "Convert import and cleanup tasks to their own classes; separate behavior from entry and catalog" on Jul 31, 2018
@lzkelley
Member Author

Continuing on the previous train of thought, Tasks (and really the whole import/processing process) might be better as their own class, instead of being mixed in with the Catalog base class, to clean things up and keep them better organized. The import class would always be given the catalog, of course, and thus have access to any required attributes/functions. Likely, the same should happen with cleanup and sanitization: instead of just being a task used during import, this might be better as its own class, and one way of triggering it would be via one of the import tasks. Cleaning could also be merged with exporting/saving.

For debugging and data improvements, it would really help if both directions (import and cleaning/export) had a particular function that was run on each event, in addition to one for each task. That way it would be easier to target particular events. For example:

Preserve a Task as simply a record of the task to be done, i.e. a simple wrapper for some json-data (including module, activity, etc). For each task, subclass a new Importer class.

class Importer:
    def __init__(self, catalog, **kwargs):
        # Set up all of the variables currently in the `Task` objects
        self.catalog = catalog

    def import_task(self, **kwargs):
        # Load files, source information, etc.; set up progress bar
        # ...
        entry_list = []  # placeholder: populated by the loading step above
        for entry in entry_list:
            self.import_entry(entry, **kwargs)

    def import_entry(self, entry, **kwargs):
        # Load/parse each entry, add to catalog
        # ...
        raise NotImplementedError

A similar structure could be used for cleaning/exporting, with a clean_task() function and a separate clean_entry() function.
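
As a rough illustration of why a per-entry method helps with debugging, the sketch below gives a hypothetical `Cleaner` class the same task/entry split and lets the caller restrict a run to named events. The `only` parameter, the dict-based catalog, and the whitespace-stripping cleanup are assumptions made for the example, not existing astrocats behavior.

```python
class Cleaner:
    def __init__(self, catalog):
        # `catalog` is assumed here to be a dict mapping event name -> entry dict
        self.catalog = catalog

    def clean_task(self, only=None):
        """Clean every entry, or just the events named in `only`."""
        names = sorted(self.catalog) if only is None else list(only)
        for name in names:
            self.catalog[name] = self.clean_entry(self.catalog[name])
        return names

    def clean_entry(self, entry):
        # Placeholder cleanup: strip whitespace from string values
        return {
            key: val.strip() if isinstance(val, str) else val
            for key, val in entry.items()
        }


# Made-up entries, purely for demonstration
catalog = {
    "event_a": {"alias": " A ", "count": 1},
    "event_b": {"alias": " B", "count": 2},
}
cleaner = Cleaner(catalog)
cleaned = cleaner.clean_task(only=["event_b"])  # target a single event
print(cleaned, catalog["event_b"], catalog["event_a"])
```

Because `clean_entry()` is separate from `clean_task()`, a single problematic event can be reprocessed without rerunning the whole task.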
