runWorker
DESCRIPTION
runWorker.pl is an eHive component script that does the work of a single Worker. It specialises in one of the Analyses and starts executing Jobs of that Analysis one-by-one or batch-by-batch.
Most of the functionality of eHive is accessible via the beekeeper.pl script, but feel free to run runWorker.pl if you need direct access to the running Jobs.
USAGE EXAMPLES
# Run one local Worker process in ehive_dbname and let the system pick up the Analysis
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname
# Run one local Worker process in ehive_dbname and let the system pick up the Analysis from the given resource_class
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -rc_name low_mem
# Run one local Worker process in ehive_dbname and constrain its initial specialisation within a subset of analyses
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -analyses_pattern '1..15,analysis_X,21'
# Run one local Worker process in ehive_dbname and allow it to respecialise within a subset of Analyses
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -can_respecialize -analyses_pattern 'blast%-4..6'
# Run a specific Job in a local Worker process:
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -job_id 123456
OPTIONS
Connection parameters:
- --reg_conf <path>
path to a Registry configuration file (see the example after this list)
- --reg_alias <string>
species/alias name for the eHive DBAdaptor
- --reg_type <string>
type of the registry entry ("hive", "core", "compara", etc.; defaults to "hive")
- --url <url string>
URL defining where the database is located
- --nosqlvc
"No SQL Version Check" - set if you want to force working with a database created by a potentially schema-incompatible API
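As an alternative to -url, the database can be located through a Registry file; a minimal sketch, assuming a hypothetical Registry file path and species alias:
# Run one local Worker process, locating the eHive database via a Registry file
runWorker.pl -reg_conf /path/to/registry.pm -reg_alias my_hive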
Configuration overriding parameters:
- --config_file <string>
JSON file (with absolute path) overriding the default configurations (can be specified multiple times)
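For example (the override file path below is hypothetical):
# Run one local Worker process with the default configuration overridden by a custom JSON file
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -config_file /absolute/path/to/my_overrides.json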
Task specification parameters:
- --rc_id <id>
resource class id
- --rc_name <string>
resource class name
- --analyses_pattern <string>
restrict the specialisation of the Worker to the specified subset of Analyses
- --analysis_id <id>
run a Worker and have it specialise to an Analysis with this analysis_id
- --job_id <id>
run a specific Job defined by its database id
- --force
set if you want to force running a Worker over a BLOCKED Analysis or to run a specific DONE/SEMAPHORED Job by its job_id (see the example after this list)
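For instance, combining -job_id with -force re-runs a Job regardless of its current state (the job_id below is hypothetical):
# Forcibly re-run a specific Job even if it is already DONE or SEMAPHORED
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -job_id 123456 -force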
Worker control parameters:
- --job_limit <num>
number of Jobs to run before the Worker can die naturally (see the example after this list)
- --life_span <num>
number of minutes this Worker is allowed to run
- --no_cleanup
don't perform temp directory cleanup when the Worker exits
- --no_write
don't run the write_output() step or perform automatic dataflow of the input Job
- --worker_base_temp_dir <path>
base directory that this Worker will use for temporary operations; overrides the default set in the JSON config file and in the code (/tmp)
- --hive_log_dir <path>
directory where stdout/stderr of the whole eHive of Workers is redirected
- --worker_log_dir <path>
directory where stdout/stderr of this particular Worker is redirected
- --retry_throwing_jobs
By default, Jobs are allowed to fail a few times (up to the Analysis' max_retry_count parameter) before the system "gives up" and marks them as FAILED. Set this flag to also retry Jobs that die "knowingly" (e.g. by hitting a die statement in the Runnable).
- --can_respecialize
allow this Worker to respecialise into another Analysis (within its resource class) after it has exhausted all Jobs of the current one
- --worker_delay_startup_seconds <number>
number of seconds each Worker has to wait before first talking to the database (0 by default, useful for debugging)
- --worker_crash_on_startup_prob <float>
probability of each Worker failing at startup (0 by default, useful for debugging)
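A sketch combining several of the control parameters above (the values and log path are hypothetical):
# Run at most 50 Jobs within 120 minutes and redirect this Worker's stdout/stderr
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -job_limit 50 -life_span 120 -worker_log_dir /path/to/worker_logs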
Other options:
- --help
print this help
- --versions
report both eHive code version and eHive database schema version
- --debug <level>
turn on debug messages at <level>
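For example (the debug level shown is an assumption; higher values should increase verbosity):
# Report both the eHive code version and the database schema version
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -versions
# Run with debug messages turned on at level 1
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -debug 1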