runWorker
DESCRIPTION
runWorker.pl is an eHive component script that does the work of a single Worker. It specialises in one of the Analyses and executes Jobs of that Analysis one-by-one or batch-by-batch.
Most of the eHive's functionality is accessible via the beekeeper.pl script, but feel free to run runWorker.pl directly if you need direct access to the running Jobs.
USAGE EXAMPLES
# Run one local Worker process in ehive_dbname and let the system pick up the Analysis
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname
# Run one local Worker process in ehive_dbname and let the system pick up the Analysis from the given resource_class
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -rc_name low_mem
# Run one local Worker process in ehive_dbname and constrain its initial specialisation within a subset of analyses
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -analyses_pattern '1..15,analysis_X,21'
# Run one local Worker process in ehive_dbname and allow it to respecialize within a subset of Analyses
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -can_respecialize -analyses_pattern 'blast%-4..6'
# Run a specific Job in a local Worker process:
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -job_id 123456
OPTIONS
Connection parameters:
--reg_conf <path>
    path to a Registry configuration file
--reg_alias <string>
    species/alias name for the eHive DBAdaptor
--reg_type <string>
    type of the registry entry ("hive", "core", "compara", etc.); defaults to "hive"
--url <url string>
    URL defining where the eHive database is located
--nosqlvc
    "No SQL Version Check" - set if you want to force working with a database created by a potentially schema-incompatible API
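As an alternative to passing an explicit -url, the connection can be taken from a Registry configuration file via -reg_conf and -reg_alias. The file path and alias below are hypothetical placeholders, not values from this document:

# Connect through a Registry file instead of an explicit -url
runWorker.pl -reg_conf /path/to/registry.conf.pl -reg_alias my_hive_db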
Configs overriding:
--config_file <string>
    JSON file (with absolute path) to override the default configurations; may be given multiple times
Task specification parameters:
--rc_id <id>
    resource class id
--rc_name <string>
    resource class name
--analyses_pattern <string>
    restrict the specialisation of the Worker to the specified subset of Analyses
--analysis_id <id>
    run a Worker and have it specialise to the Analysis with this analysis_id
--job_id <id>
    run a specific Job defined by its database id
--force
    set if you want to force running a Worker over a BLOCKED Analysis, or to run a specific DONE/SEMAPHORED Job by its job_id
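For example, a Job that has already reached the DONE state can be run again by combining -job_id with -force (the job_id below is a hypothetical illustration, following the usage examples above):

# Re-run an already-DONE Job, bypassing the state check
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -job_id 123456 -force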
Worker control parameters:
--job_limit <num>
    number of Jobs to run before the Worker can die naturally
--life_span <num>
    number of minutes this Worker is allowed to run
--no_cleanup
    don't perform temp directory cleanup when the Worker exits
--no_write
    don't write_output or auto_dataflow input_job
--hive_log_dir <path>
    directory where stdout/stderr of the whole eHive of Workers is redirected
--worker_log_dir <path>
    directory where stdout/stderr of this particular Worker is redirected
--retry_throwing_jobs
    by default, a Job is allowed to fail a few times (up to the Analysis' max_retry_count parameter) before the system "gives up" and considers it FAILED; set this option to also retry Jobs that die knowingly (e.g. by hitting a die statement in the Runnable)
--can_respecialize
    allow this Worker to re-specialise into another Analysis (within its resource_class) after it has exhausted all Jobs of the current one
--worker_delay_startup_seconds <number>
    number of seconds each Worker has to wait before first talking to the database (0 by default; useful for debugging)
--worker_crash_on_startup_prob <float>
    probability of each Worker failing at startup (0 by default; useful for debugging)
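The Worker control options can be combined. As a sketch, a short-lived debugging Worker that runs at most 10 Jobs, lives at most 30 minutes, and writes its output to a dedicated log directory (the directory path is a hypothetical placeholder):

# A constrained Worker for debugging: limited Jobs, limited lifetime, logged output
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -job_limit 10 -life_span 30 -worker_log_dir /tmp/worker_logs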
Other options:
--help
    print this help
--versions
    report both the eHive code version and the eHive database schema version
--debug <level>
    turn on debug messages at <level>