generate_timeline
SYNOPSIS
generate_timeline.pl {-url <url> | [-reg_conf <reg_conf>] -reg_alias <reg_alias> [-reg_type <reg_type>] }
[-start_date <start_date>] [-end_date <end_date>]
[-top <float>]
[-mode [workers | memory | cores | pending_workers | pending_time]]
[-key [analysis | resource_class]]
[-n_core <int>] [-mem <int>]
DESCRIPTION
This script is used for offline examination of the allocation of Workers.
Based on the command-line parameters “start_date” and “end_date”, or on
the start time of the first Worker and end time of the last Worker (as
recorded in the pipeline database), it pulls the relevant data out of the
worker table for accurate timing. By default, the output is in CSV
format, to allow further analysis to be carried out.
You can optionally ask the script to generate an image with Gnuplot.
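To make the idea of a Worker timeline concrete, the following is a minimal Python sketch of one way to turn Worker start and end times into a count of simultaneously running Workers. It is not the script's own implementation: the function name count_running_workers and the in-memory list of (start, end) pairs are assumptions made purely for illustration, whereas the real script reads its data from the worker table.

```python
from datetime import datetime


def count_running_workers(intervals):
    """Return (timestamp, n_running) pairs from (start, end) pairs.

    Hypothetical helper: each Worker contributes +1 at its start time and
    -1 at its end time; a running sum gives the Workers alive after each event.
    """
    events = []
    for start, end in intervals:
        events.append((start, 1))
        events.append((end, -1))
    # Sort by time; at equal timestamps, ends (-1) are applied before starts (+1).
    events.sort(key=lambda event: (event[0], event[1]))

    timeline = []
    running = 0
    for when, delta in events:
        running += delta
        if timeline and timeline[-1][0] == when:
            # Several events at the same instant collapse into one point.
            timeline[-1] = (when, running)
        else:
            timeline.append((when, running))
    return timeline


# Illustrative data only: two overlapping Workers.
workers = [
    (datetime.fromisoformat("2013-06-15T10:34"), datetime.fromisoformat("2013-06-15T11:00")),
    (datetime.fromisoformat("2013-06-15T10:45"), datetime.fromisoformat("2013-06-15T11:30")),
]
timeline = count_running_workers(workers)
```

Binning by Analysis or resource class would amount to keeping one such running sum per key.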
USAGE EXAMPLES
# Just run it the usual way: only the top 20 Analyses will be reported in CSV format
generate_timeline.pl -url mysql://username:secret@hostname:port/database > timeline.csv
# The same, but reporting the Analyses that account for 99.5% of the global activity in a PNG file
generate_timeline.pl -url mysql://username:secret@hostname:port/database -top .995 -output timeline_top995.png
# Assuming you are only interested in a precise interval (in a PNG file)
generate_timeline.pl -url mysql://username:secret@hostname:port/database -start_date 2013-06-15T10:34 -end_date 2013-06-15T16:58 -output timeline_June15.png
# Get the required memory instead of the number of Workers
generate_timeline.pl -url mysql://username:secret@hostname:port/database -mode memory -output timeline_memory.png
# Draw the CPU-usage timeline across several databases
generate_timeline.pl -url mysql://username:secret@hostname:port/database -url mysql://username:secret@hostname:port/another_database -mode cores -output timeline_cpu.png
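When the script is driven from another program, the multi-database invocation above can be assembled as an argument list. The sketch below is a hypothetical Python wrapper: build_timeline_command is not part of eHive, and it only uses the -url, -mode and -output options shown in the examples.

```python
import shlex


def build_timeline_command(urls, mode="workers", output=None):
    """Assemble an argument list for generate_timeline.pl (hypothetical helper)."""
    cmd = ["generate_timeline.pl"]
    for url in urls:
        # -url may be repeated to draw a timeline across several databases.
        cmd += ["-url", url]
    cmd += ["-mode", mode]
    if output is not None:
        # Without -output, the script writes CSV to stdout.
        cmd += ["-output", output]
    return cmd


cmd = build_timeline_command(
    [
        "mysql://username:secret@hostname:port/database",
        "mysql://username:secret@hostname:port/another_database",
    ],
    mode="cores",
    output="timeline_cpu.png",
)
# Show the command as it would be typed in a shell.
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True), which avoids quoting problems with passwords or paths that contain special characters.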
OPTIONS
Connection options
- --help
print this help
- --url <url string>
URL defining where the eHive database is located. It can be repeated to draw a timeline across several databases
- --reg_conf
path to a Registry configuration file
- --reg_type
type of the registry entry (“hive”, “core”, “compara”, etc - defaults to “hive”)
- --reg_alias
species/alias name for the eHive DBAdaptor
- --nosqlvc
“No SQL Version Check” - set if you want to force working with a database created by a potentially schema-incompatible API. Be aware that generate_timeline.pl uses raw SQL queries that may break on different schema versions
- --verbose
Print some info about the data loaded from the database
Timeline configuration
- --start_date <date>
minimal start date of a Worker (the format is ISO8601, e.g. “2012-01-25T13:46”)
- --end_date <date>
maximal end date of a Worker (the format is ISO8601, e.g. “2012-01-25T13:46”)
- --top <float>
maximum number (> 1) or fraction (< 1) of Analyses to report (default: 20)
- --output <string>
output file: its extension must match one of the Gnuplot terminals. Otherwise, the CSV output is produced on stdout
- --mode <string>
what should be displayed on the y-axis. Allowed values are “workers” (default), “memory”, “cores”, “pending_workers”, or “pending_time”
- --key <string>
“analysis” (default) or “resource_class”: how to bin the Workers
- --key_transform_file <string>
the path to a Perl script that defines a function named “get_key_name”. The function is used to provide custom key names for analyses and resource classes instead of their own display names. The function must take the object (Analysis or ResourceClass) as a sole argument and return a (non empty) string. See scripts/dev/generate_timeline_example_key_transform_file.pl for an example.
- --resolution <integer>
Timestamps are rounded up to multiples of this number of minutes (default: 1). Increase this value when displaying timelines of very large pipelines.
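The --top and --resolution rules described above can be illustrated with a short Python sketch. It is an interpretation of the option descriptions, not the script's actual code: the function names select_top_keys and round_up_timestamp, the activity dictionary, and the choice to treat a value of exactly 1 as a count are all assumptions.

```python
from datetime import datetime, timedelta


def select_top_keys(activity, top=20):
    """Pick the keys to report from a {key: total_activity} mapping.

    Assumption: top >= 1 is a maximum number of keys, while top < 1 is the
    fraction of the global activity that the selected keys must cover.
    """
    ranked = sorted(activity, key=activity.get, reverse=True)
    if top >= 1:
        return ranked[:int(top)]
    total = sum(activity.values())
    selected, cumulative = [], 0.0
    for key in ranked:
        if cumulative >= top * total:
            break
        selected.append(key)
        cumulative += activity[key]
    return selected


def round_up_timestamp(timestamp, resolution_minutes=1):
    """Round a datetime up to the next multiple of resolution_minutes."""
    epoch = datetime(1970, 1, 1)
    step = timedelta(minutes=resolution_minutes)
    # Ceiling division on timedeltas: negate, floor-divide, negate again.
    n_steps = -((epoch - timestamp) // step)
    return epoch + n_steps * step


# Illustrative numbers only.
activity = {"blast": 50, "align": 30, "tree": 15, "dump": 5}
by_count = select_top_keys(activity, top=2)
by_fraction = select_top_keys(activity, top=0.9)
rounded = round_up_timestamp(datetime(2013, 6, 15, 10, 34), resolution_minutes=5)
```

A coarser resolution merges nearby events into the same timestamp, which reduces the number of points to plot for very large pipelines.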
Farm configuration
- --n_core <int>
the default number of cores allocated to a Worker (default: 1)
- --mem <int>
the default memory allocated to a Worker (default: 100Mb)
EXTERNAL DEPENDENCIES
Chart::Gnuplot