runWorker
DESCRIPTION
runWorker.pl is an eHive component script that does the work of a single Worker. It specialises in one of the Analyses and executes Jobs of that Analysis one-by-one or batch-by-batch.
Most of the eHive's functionality is accessible via the beekeeper.pl script, but feel free to run runWorker.pl directly if you need direct access to the running Jobs.
USAGE EXAMPLES
# Run one local Worker process in ehive_dbname and let the system pick up the Analysis
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname
# Run one local Worker process in ehive_dbname and let the system pick up the Analysis from the given resource_class
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -rc_name low_mem
# Run one local Worker process in ehive_dbname and constrain its initial specialisation within a subset of analyses
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -analyses_pattern '1..15,analysis_X,21'
# Run one local Worker process in ehive_dbname and allow it to respecialize within a subset of Analyses
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -can_respecialize -analyses_pattern 'blast%-4..6'
# Run a specific Job in a local Worker process:
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -job_id 123456
OPTIONS
Connection parameters:
--reg_conf <path>
    path to a Registry configuration file
--reg_alias <string>
    species/alias name for the eHive DBAdaptor
--reg_type <string>
    type of the registry entry ("hive", "core", "compara", etc.); defaults to "hive"
--url <url string>
    URL defining where the eHive database is located
--nosqlvc
    "No SQL Version Check" - set if you want to force working with a database created by a potentially schema-incompatible API
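As an alternative to passing an explicit -url, the connection can be taken from a Registry configuration file via -reg_conf and -reg_alias. The file path and alias below are hypothetical placeholders, not values from this document:

# Connect through a Registry file instead of an explicit -url
runWorker.pl -reg_conf /path/to/registry.conf.pl -reg_alias my_hive_db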
Configs overriding:
--config_file <string>
    JSON file (with absolute path) to override the default configurations; may be given multiple times
Task specification parameters:
--rc_id <id>
    resource class id
--rc_name <string>
    resource class name
--analyses_pattern <string>
    restrict the specialisation of the Worker to the specified subset of Analyses
--analysis_id <id>
    run a Worker and have it specialise to the Analysis with this analysis_id
--job_id <id>
    run a specific Job defined by its database id
--force
    set if you want to force running a Worker over a BLOCKED Analysis, or to run a specific DONE/SEMAPHORED Job by its job_id
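For example, a Job that has already reached the DONE state can be run again by combining -job_id with -force (the job_id below is a hypothetical illustration, following the usage examples above):

# Re-run an already-DONE Job, bypassing the state check
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -job_id 123456 -force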
Worker control parameters:
--job_limit <num>
    number of Jobs to run before the Worker can die naturally
--life_span <num>
    number of minutes this Worker is allowed to run
--no_cleanup
    don't perform temp directory cleanup when the Worker exits
--no_write
    don't write_output or auto_dataflow input_job
--hive_log_dir <path>
    directory where stdout/stderr of the whole eHive of Workers is redirected
--worker_log_dir <path>
    directory where stdout/stderr of this particular Worker is redirected
--retry_throwing_jobs
    by default, a Job is allowed to fail a few times (up to the Analysis' max_retry_count parameter) before the system "gives up" and considers it FAILED; set this option to also retry Jobs that die knowingly (e.g. by hitting a die statement in the Runnable)
--can_respecialize
    allow this Worker to re-specialise into another Analysis (within its resource_class) after it has exhausted all Jobs of the current one
--worker_delay_startup_seconds <number>
    number of seconds each Worker has to wait before first talking to the database (0 by default; useful for debugging)
--worker_crash_on_startup_prob <float>
    probability of each Worker failing at startup (0 by default; useful for debugging)
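The Worker control options can be combined. As a sketch, a short-lived debugging Worker that runs at most 10 Jobs, lives at most 30 minutes, and writes its output to a dedicated log directory (the directory path is a hypothetical placeholder):

# A constrained Worker for debugging: limited Jobs, limited lifetime, logged output
runWorker.pl -url mysql://username:secret@hostname:port/ehive_dbname -job_limit 10 -life_span 30 -worker_log_dir /tmp/worker_logs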
Other options:
--help
    print this help
--versions
    report both the eHive code version and the eHive database schema version
--debug <level>
    turn on debug messages at <level>