Batch framework

Cristina Yenyxe Gonzalez Garcia edited this page Jan 11, 2017 · 2 revisions

Job workflows are implemented using the Spring Batch framework, whose main advantages are:

  • Automated tracking of job status
  • Resuming capabilities: after a failed run, the next run automatically skips every step of the job that already succeeded, and every successfully processed chunk within the step that failed
  • Component-based job definition
    • A job is made of steps
    • A step can be self-contained (tasklet), or made of reader + processor + writer(s)
  • Easy-to-replace components thanks to Spring's dependency injection
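The resuming behaviour can be pictured with a small plain-Java model. This is only a sketch and does not use Spring Batch classes: `SketchStep`, `RestartSketch` and the step names are hypothetical, and the in-memory set stands in for the status that Spring Batch persists in its job repository.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class RestartSketch {
    // Hypothetical stand-in for a step: returns true on success.
    interface SketchStep { boolean execute(); }

    // Runs the steps in order, skipping those already completed, and stops
    // at the first failure. Returns the names of the steps executed in this run.
    static List<String> run(Map<String, SketchStep> steps, Set<String> completed) {
        List<String> executed = new ArrayList<>();
        for (Map.Entry<String, SketchStep> entry : steps.entrySet()) {
            if (completed.contains(entry.getKey())) {
                continue; // succeeded in an earlier run, so skip it
            }
            executed.add(entry.getKey());
            if (!entry.getValue().execute()) {
                break; // the job fails here; later steps are not attempted
            }
            completed.add(entry.getKey());
        }
        return executed;
    }

    static List<List<String>> demo() {
        boolean[] annotateFails = {true};
        Map<String, SketchStep> steps = new LinkedHashMap<>();
        steps.put("load", () -> true);
        steps.put("annotate", () -> !annotateFails[0]);
        steps.put("stats", () -> true);

        Set<String> completed = new HashSet<>(); // stand-in for persisted job status
        List<List<String>> runs = new ArrayList<>();
        runs.add(run(steps, completed)); // first run: "annotate" fails
        annotateFails[0] = false;
        runs.add(run(steps, completed)); // second run: "load" is skipped
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [[load, annotate], [annotate, stats]]
    }
}
```

The model only tracks whole steps; skipping successful chunks inside a step would need a similar record kept per chunk.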

Spring Batch API

Job

Jobs must implement the Job interface.

Flow

Jobs can share sequences of steps known as "flows". These flows must implement the Flow interface.

Step

Jobs and flows are made up of steps. Every step must implement the Step interface, and there are at least two kinds of step: chunk-oriented steps and tasklets.

Chunk-oriented

Chunk-oriented steps represent tasks suitable for batch processing. These usually have the following components:

  • Reader (one)
  • Processor (zero or one)
  • Writer (zero or more)

See the VariantLoaderStep class for an example of a chunk-oriented step.
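As a rough illustration of the reader, processor and writer roles, the loop below mimics the chunk-oriented pattern in plain Java. It is a sketch under assumptions, not the code of VariantLoaderStep or of Spring Batch: the `SketchReader`, `SketchProcessor` and `SketchWriter` interfaces are hypothetical, as is the convention that `null` means "no more input" for the reader and "drop this item" for the processor.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

class ChunkSketch {
    // Hypothetical stand-ins for the three roles of a chunk-oriented step.
    interface SketchReader<T> { T read(); }                  // null when input is exhausted
    interface SketchProcessor<I, O> { O process(I item); }   // null to drop an item
    interface SketchWriter<T> { void write(List<T> chunk); }

    // Reads and processes items one at a time, and writes them one chunk at a time.
    // Returns the number of chunks handed to the writer.
    static <I, O> int run(SketchReader<I> reader, SketchProcessor<I, O> processor,
                          SketchWriter<O> writer, int chunkSize) {
        int chunks = 0;
        boolean exhausted = false;
        while (!exhausted) {
            List<O> chunk = new ArrayList<>();
            for (int i = 0; i < chunkSize; i++) {
                I item = reader.read();
                if (item == null) {
                    exhausted = true;
                    break;
                }
                O processed = processor.process(item);
                if (processed != null) {
                    chunk.add(processed);
                }
            }
            if (!chunk.isEmpty()) {
                writer.write(chunk);
                chunks++;
            }
        }
        return chunks;
    }

    static List<List<String>> demo() {
        Iterator<String> lines = Arrays.asList("a", "", "b", "c", "d").iterator();
        List<List<String>> written = new ArrayList<>();

        SketchReader<String> reader = () -> lines.hasNext() ? lines.next() : null;
        SketchProcessor<String, String> processor = s -> s.isEmpty() ? null : s.toUpperCase();
        SketchWriter<String> writer = written::add;

        run(reader, processor, writer, 2); // chunk size of 2 read items
        return written;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [[A], [B, C], [D]]
    }
}
```

In this model the chunk size counts items read, so a chunk can be written with fewer items when the processor drops some of them.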

Tasklets

Tasklet steps represent atomic tasks that are not suitable for batch processing.

See the GenerateVepAnnotationStep class for an example of an atomic step that just spawns a VEP process.
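A tasklet can be pictured as a single unit of work that the step keeps invoking until it reports completion. The plain-Java sketch below models that idea and shows how an external command might be wrapped. It is not the code of GenerateVepAnnotationStep: `SketchTasklet`, `Status` and `forCommand` are hypothetical names, and any command passed to `forCommand` is the caller's assumption.

```java
import java.io.IOException;

class TaskletSketch {
    // Hypothetical status values, loosely modelled on a "repeat until finished" contract.
    enum Status { CONTINUABLE, FINISHED }

    // Hypothetical stand-in for a tasklet: one self-contained unit of work.
    interface SketchTasklet { Status execute(); }

    // Invokes the tasklet until it reports FINISHED and returns the number of calls.
    static int runStep(SketchTasklet tasklet) {
        int calls = 0;
        Status status;
        do {
            status = tasklet.execute();
            calls++;
        } while (status == Status.CONTINUABLE);
        return calls;
    }

    // Wraps an external command as a tasklet that finishes after a single call.
    static SketchTasklet forCommand(String... command) {
        return () -> {
            try {
                int exitCode = new ProcessBuilder(command).inheritIO().start().waitFor();
                if (exitCode != 0) {
                    throw new IllegalStateException("Command exited with code " + exitCode);
                }
                return Status.FINISHED;
            } catch (IOException e) {
                throw new IllegalStateException("Could not start command", e);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for command", e);
            }
        };
    }

    static int demo() {
        int[] remaining = {3};
        // A tasklet that needs three calls before it reports FINISHED.
        return runStep(() -> --remaining[0] > 0 ? Status.CONTINUABLE : Status.FINISHED);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 3
    }
}
```

A step that only spawns a process would typically finish in one call, as `forCommand` does; the multi-call tasklet in `demo()` is there only to exercise the loop.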
