Runners

... or how secator's internals work.

Runners are at the core of secator's live-processing capabilities. A runner handles the parsing, conversion and processing of input options (from the CLI or the library) and of output items.

All runners inherit from secator.runners._base.Runner.

Supported runners

The following built-in runners are available out of the box:

| Runner | Description | Additional features |
| --- | --- | --- |
| Command | Run an external command and stream its output. | Automatic command install. Privileged mode (sudo). |
| Task | Run a task. | Remote mode (Celery). Chunking on big inputs. Direct calling from library. |
| Workflow | Run a DAG of tasks, defined in a YAML config file. | Remote mode (Celery). Distributed (Celery). Task chaining and parallel execution. Re-use previous results as task inputs. |
| Scan | Run a DAG of workflows, defined in a YAML config file. | Distributed (Celery). Workflow chaining. Re-use previous results as workflow inputs. |

Lifecycle hooks

The runner lifecycle contains hooks that a user can plug into:

Base hooks:

before_init: executed before the base runner's init starts.
on_init: executed when the base runner's init is completed.
on_start: executed when the runner has started running.
on_iter: executed when the runner iterates.
on_end: executed when the runner has finished running.
on_cmd: runs when the mapped command is built [ Command runner only ].
on_cmd_done: runs when the command has finished running [ Command runner only ].
on_line: executed when a line is output to stdout or stderr [ Command runner only ].
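To make the ordering of the base hooks concrete, here is a minimal, self-contained sketch. ToyRunner is a hypothetical stand-in and not secator's Runner class, and the firing order is an assumption read off the hook descriptions above rather than taken from secator's source.

```python
class ToyRunner:
    """Hypothetical stand-in for a runner; NOT secator's implementation."""

    def __init__(self, inputs, hooks=None):
        self.hooks = hooks or {}
        self.fired = []  # hook names, in the order they were triggered
        self.run_hook('before_init')
        self.inputs = list(inputs)
        self.run_hook('on_init')

    def run_hook(self, name):
        # Record the hook name, then call every function registered under it.
        self.fired.append(name)
        for fn in self.hooks.get(name, []):
            fn(self)

    def run(self):
        self.run_hook('on_start')
        results = []
        for value in self.inputs:
            self.run_hook('on_iter')  # assumed: fired once per iteration
            results.append(value)
        self.run_hook('on_end')
        return results


seen = []
runner = ToyRunner(
    ['a', 'b'],
    hooks={'on_end': [lambda self: seen.append(len(self.inputs))]},
)
results = runner.run()
print(runner.fired)
```

The on_end hook here reads self.inputs, which shows why hooks receive the runner: they can inspect its state at that point of the lifecycle.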

Item hooks:

on_item_pre_convert: executed before an item is converted to an output type.
on_item: executed when the runner emits an item.
on_duplicate: runs after an item has been marked as a duplicate.
on_line: executed when a line is emitted to stdout or stderr [ Command runner only ].
on_error: executed when an error is emitted by the command [ Command runner only ].
on_{serializer}_loaded: executed when a serializer has finished running. For instance, on_json_loaded after the JSONSerializer has finished running [ Command runner only ].

All hooks are defined with the @staticmethod decorator and take self as the first argument so that you can use the runner data in your hook implementation. Item hooks take item as the second argument and expect you to return the modified item.
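The calling convention described above can be illustrated with a small sketch. ToyTask and its _run_item_hooks helper are hypothetical and do not reflect secator's internals. The only points taken from the text are that hooks are static methods, that they receive the runner as self, and that item hooks return the (possibly modified) item. Running the static hook before the runtime hooks is an arbitrary choice made for this example.

```python
class ToyTask:
    """Hypothetical runner used only to illustrate the hook signature."""

    def __init__(self, prefix):
        self.prefix = prefix

    @staticmethod
    def on_item(self, item):
        # `self` is passed explicitly by the caller, so runner data is usable here.
        item['url'] = self.prefix + item['url']
        return item  # item hooks return the (possibly modified) item

    def _run_item_hooks(self, item, extra_hooks=()):
        # Each hook receives the item returned by the previous one, which is
        # why a hook that forgets to return the item breaks the chain.
        for hook in (type(self).on_item, *extra_hooks):
            item = hook(self, item)
        return item


def tag(self, item):
    # A second item hook with the same (self, item) signature.
    item['tagged'] = True
    return item


task = ToyTask('https://')
out = task._run_item_hooks({'url': 'example.com'}, extra_hooks=[tag])
print(out)
```

Because a static method is a plain function when accessed on the class, the same function shape works whether the hook is declared on the class or passed in at runtime.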

Using hooks

There are three ways of specifying hooks: static hooks (defined in the task definition class), dynamic hooks (passed to a runner at runtime), and drivers (collections of hooks).

Static hooks

Static hooks are specified in the task definition class as static methods:

from secator.runners import Command
from secator.decorators import task
from mylib import send_to_aws_s3


@task()
class mytool(Command):
    # ...

    @staticmethod
    def on_item(self, item):
        if item._type == 'url':
            send_to_aws_s3(item.stored_response_path)
        return item

Dynamic hooks

Dynamic hooks are specified at runtime by passing them to a runner.

Dynamic hooks are a library-only feature; they are not available in the CLI.

Here are examples of specifying dynamic hooks:

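import requests
from secator.tasks import mytool

api_url = 'https://myapi.com'

def post_item(self, item):
    # Send the item to the API, then return it so the runner keeps processing it.
    requests.post(api_url, json=item.toDict())
    return item

hooks = {
    'on_item': post_item
}
mytool('TARGET', hooks=hooks).run()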

from secator.runners import Workflow, Task
from secator.template import TemplateLoader
from secator.hooks.mongodb import update_runner, update_finding

config = TemplateLoader(path='/path/to/my/workflow.yaml')
workflow = Workflow(
    config,
    hooks={
        Workflow: {
            'on_init': [update_runner],
            'on_start': [update_runner],
            'on_iter': [update_runner],
            'on_end': [update_runner]
        },
        Task: {
            'on_init': [update_runner],
            'on_item': [update_finding],
            'on_duplicate': [update_finding],
            'on_iter': [update_runner],
            'on_end': [update_runner]
        }
    }
)
workflow.run()
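The hooks dictionary in the workflow example is keyed by runner class, so that workflows and tasks each get their own hook table. One way such a mapping could be resolved is sketched below. The Runner, Workflow and Task classes here are empty placeholders, and hooks_for, update_runner and update_finding are hypothetical stand-ins; none of this is secator's code.

```python
class Runner: ...
class Workflow(Runner): ...
class Task(Runner): ...


def hooks_for(runner, hooks):
    """Collect the hooks registered for any class the runner is an instance of."""
    merged = {}
    for cls, table in hooks.items():
        if isinstance(runner, cls):
            for name, fns in table.items():
                merged.setdefault(name, []).extend(fns)
    return merged


def update_runner(self):
    # Placeholder: a real hook might persist the runner state somewhere.
    return f'runner:{type(self).__name__}'


def update_finding(self, item):
    # Placeholder item hook: returns the item unchanged.
    return item


hooks = {
    Workflow: {'on_start': [update_runner], 'on_end': [update_runner]},
    Task: {'on_item': [update_finding], 'on_end': [update_runner]},
}

workflow_hooks = hooks_for(Workflow(), hooks)
task_hooks = hooks_for(Task(), hooks)
print(sorted(workflow_hooks), sorted(task_hooks))
```

In this sketch a Workflow instance never sees the Task-level on_item hook, which mirrors the intent of the example: item-level hooks are attached to tasks, while the workflow only tracks its own state.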

Drivers

See Drivers.
