Task Adapters¶
Task Adapters help deploy, execute, and monitor tasks on various computing environments.
To execute a Task, the BDP system spawns an Adapter process that handles the user-specified arguments, reads the runtime configurations, and parses the Workflow Playbook. Both the runtime configurations and the Workflow Playbook can be defined directly on the web pages provided by BDP, so a Task can be executed on the fly right after it is configured.
This preprocessing is done by the Task Reader of the Adapter process. After the Task Reader runs, the process knows the correct Adapter and computing resources from the runtime configurations, and it gets the task specifications from the Workflow Playbook. The tasks are then ready to deploy.
We provide a base Adapter class. To support various computing resources, different Adapters can be implemented from this base class; that is, depending on the Adapter in use, jobs are deployed to various computing resources, either locally or remotely.
Flowchart of an Adapter process¶
After getting the task specifications parsed by the Task Reader from the Workflow Playbook, the Adapter formulates these specifications into job objects through the taskOverrides function. Adapter developers can write the command argument recipe in YAML format, so that the _parseRecipe function can parse the recipe to formulate job commands, and the taskOverrides function organizes the task spec, including the job commands, into job objects.
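As an illustration only, the sketch below shows roughly what this recipe-to-command expansion accomplishes. The type names and the buildJobObjects helper are hypothetical and do not reflect the actual _parseRecipe or taskOverrides signatures.

```typescript
// Hypothetical sketch: expand a parsed argument recipe and a task spec into
// job objects. In the real Adapter the recipe is written in YAML and parsed
// by _parseRecipe; here the parsed result is hardcoded for illustration.

interface Recipe {
  executable: string;                     // base command, e.g. "wc"
  args: { flag: string; key: string }[];  // maps spec keys to CLI flags
}

interface TaskSpec {
  inputs: Record<string, string>[];       // one entry per job to spawn
}

interface JobObject {
  id: number;
  command: string;
}

// Roughly what _parseRecipe + taskOverrides accomplish together:
// turn each input set in the task spec into one concrete job command.
function buildJobObjects(recipe: Recipe, spec: TaskSpec): JobObject[] {
  return spec.inputs.map((input, id) => {
    const argString = recipe.args
      .map(({ flag, key }) => `${flag} ${input[key]}`)
      .join(" ");
    return { id, command: `${recipe.executable} ${argString}`.trim() };
  });
}

// Example usage with a toy recipe.
const recipe: Recipe = { executable: "wc", args: [{ flag: "-l", key: "file" }] };
const spec: TaskSpec = { inputs: [{ file: "a.txt" }, { file: "b.txt" }] };
console.log(buildJobObjects(recipe, spec));
// -> [{ id: 0, command: "wc -l a.txt" }, { id: 1, command: "wc -l b.txt" }]
```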
The number of jobs equals the number of job objects, and the Adapter has a built-in queue to schedule these jobs. A concurrency option controls the number of concurrent jobs, and a retry option re-executes failed jobs.
The Adapter process stays alive to monitor the jobs until all of them have finished successfully or the retry limit has been reached.
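The following sketch mimics that queue behaviour under stated assumptions: a runQueue helper (not part of BDP) caps the number of concurrent jobs and re-queues failed jobs until the retry limit is hit.

```typescript
// Illustrative sketch of the built-in queue described above: run at most
// `concurrency` jobs at once and retry failed jobs up to `retry` times.
// The runJob callback stands in for the Adapter's own spawn/monitor logic;
// none of these names come from the BDP code base.

interface QueuedJob {
  id: number;
  command: string;
}

async function runQueue(
  jobs: QueuedJob[],
  runJob: (job: QueuedJob) => Promise<boolean>,
  concurrency: number,
  retry: number,
): Promise<void> {
  const pending = jobs.map((job) => ({ job, attempts: 0 }));

  async function worker(): Promise<void> {
    while (pending.length > 0) {
      const item = pending.shift()!;
      const ok = await runJob(item.job);
      if (!ok && item.attempts < retry) {
        // A failed job goes back into the queue until the retry limit is hit.
        pending.push({ job: item.job, attempts: item.attempts + 1 });
      }
    }
  }

  // The process stays alive until every worker has drained the queue.
  await Promise.all(Array.from({ length: concurrency }, () => worker()));
}
```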
Interface functions to implement for various computing resources¶
There are six interface functions to implement: taskOverrides, beforeStart, processSpawn, detectJobStatus, processExitCallback, and beforeExit.
As mentioned above, we provide a base Adapter class to extend for all kinds of computing resources. The following interface functions need to be implemented.
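A minimal sketch of the shape such a base class can take is shown below; the method names match the six interface functions above, but the signatures and the Job type are illustrative assumptions, not the actual BDP definitions.

```typescript
// Hypothetical base-class sketch for an Adapter; the real BDP base class
// may differ in naming, signatures, and return types.

interface Job {
  id: number;
  command: string;
  status: "pending" | "running" | "done" | "failed";
}

abstract class BaseAdapter {
  // Translate task specifications into job objects with platform-specific commands.
  abstract taskOverrides(taskSpec: unknown): Job[];
  // Preparations before any job is scheduled (sync files, log in, ...).
  abstract beforeStart(): Promise<void>;
  // Submit one job to the computing resource.
  abstract processSpawn(job: Job): Promise<void>;
  // Poll or query the resource for the job's current status.
  abstract detectJobStatus(job: Job): Promise<Job["status"]>;
  // Called after each job exits, successfully or not.
  abstract processExitCallback(job: Job): Promise<void>;
  // Called once all jobs have finished; clean up and sync results.
  abstract beforeExit(): Promise<void>;
}
```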
taskOverrides, described above, is the first interface function; it translates the job objects into platform-specific commands using the customized argument recipe parsed by the _parseRecipe function.
Before these job objects are scheduled, the beforeStart interface function can be implemented to perform preparations such as synchronizing files or logging in to the computing resources.
Next, processSpawn and detectJobStatus are the two main interface functions; they can be implemented to submit the command and to monitor job status, respectively.
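For example, a local Adapter might implement these two functions roughly as below, using Node's child_process. The Job type and the running map are hypothetical glue code; an Adapter for a remote cluster would instead submit to and query the cluster's scheduler.

```typescript
// Sketch of processSpawn/detectJobStatus for local execution. The job
// bookkeeping here is illustrative only and not taken from the BDP source.

import { spawn } from "child_process";

type Status = "pending" | "running" | "done" | "failed";
interface LocalJob { id: number; command: string; }

const running = new Map<number, Status>();

// processSpawn: submit the job command to the local machine.
function processSpawn(job: LocalJob): void {
  running.set(job.id, "running");
  const child = spawn(job.command, { shell: true });
  child.on("exit", (code) => {
    running.set(job.id, code === 0 ? "done" : "failed");
  });
}

// detectJobStatus: report the job's current status; a remote Adapter would
// instead query the scheduler (e.g. parse squeue/qstat output).
function detectJobStatus(job: LocalJob): Status {
  return running.get(job.id) ?? "pending";
}
```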
After each job has exited, whether successfully or not, processExitCallback is called. This function may be implemented to download job logs or update the job's information.
Lastly, when all jobs have been executed, either successfully or with failures, the beforeExit interface function is called. Adapter developers may implement this function to clean up resources or synchronize files.
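A hypothetical sketch of these last two hooks for a remote resource follows; downloadFile and removeRemoteWorkdir are placeholder helpers standing in for whatever transfer and cleanup mechanism a concrete Adapter uses.

```typescript
// Illustrative processExitCallback/beforeExit hooks; names and signatures
// are assumptions, not the BDP API.

interface FinishedJob { id: number; status: "done" | "failed"; remoteLog: string; }

async function downloadFile(remotePath: string, localPath: string): Promise<void> {
  // e.g. scp/rsync/SDK call in a real Adapter; omitted here.
}

async function removeRemoteWorkdir(): Promise<void> {
  // e.g. delete scratch space on the remote resource; omitted here.
}

// processExitCallback: pull logs and record the final state of one job.
async function processExitCallback(job: FinishedJob): Promise<void> {
  await downloadFile(job.remoteLog, `logs/job-${job.id}.log`);
  console.log(`job ${job.id} finished with status ${job.status}`);
}

// beforeExit: release resources and sync results once all jobs are done.
async function beforeExit(): Promise<void> {
  await removeRemoteWorkdir();
}
```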
See also
For more information about the Adapter, please see the code documentation on GitHub.