ensembl-hive  2.7.0
Bio::EnsEMBL::DBSQL::BaseSequenceAdaptor Class Reference

Public Member Functions

public StringRef fetch_by_Slice_start_end_strand ()
 
public Boolean can_access_Slice ()
 
public Bio::EnsEMBL::Slice expand_Slice ()
 
public Bio::EnsEMBL::DBSQL::SequenceAdaptor new ()
 
protected _init_seq_instance ()
 
public void clear_cache ()
 
public chunk_power ()
 
public cache_size ()
 
public seq_cache_max ()
 
protected ScalarRef _fetch_raw_seq ()
 
protected ScalarRef _fetch_seq ()
 

Detailed Description

Description

The BaseSequenceAdaptor is responsible for converting calls to
fetch_by_Slice_start_end_strand() for sequence data into requests to a
backing data store. In Ensembl these are the seqlevel sequence region
records held in the MySQL database.

The base adaptor also provides sequence caching based on a normalisation
technique similar to the UCSC and BAM binning indexes. The code works
by right-shifting the requested start and end by a seq chunk power
(by default 18, i.e. 2^18 = 262,144bp) and then left-shifting by the same
value. This means any position within a given window will always result in
the same offset. Please see the worked examples below:

  # Equation
  p=position
  o=seq chunk power
  offset=( (p-1)>>o ) << o
  
  # Using real values
  p=1340001
  o=18
  right_shifted = (1340001-1) >> 18 == 5
  offset = 5 << 18 == 1310720
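
The same arithmetic can be sketched in Perl. This is only an illustration of the equation above; the function name chunk_offset is hypothetical and is not part of the adaptor's API.

```perl
use strict;
use warnings;

# Normalise a 1-based position to the offset of the chunk containing it.
# chunk_offset() is a hypothetical helper, not a method of the adaptor.
sub chunk_offset {
    my ($position, $chunk_power) = @_;
    return (($position - 1) >> $chunk_power) << $chunk_power;
}

my $power = 18;    # default seq chunk power, window of 2^18 = 262144 bp

# Every position inside the same window yields the same offset.
for my $p (1310721, 1340001, 1572864) {
    printf "%d -> %d\n", $p, chunk_offset($p, $power);    # all give 1310720
}
```

Positions 1310721 to 1572864 all fall in the sixth window, so each of them normalises to 1310720.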

To control the size of the cache and sequences stored you can provide
the seq chunk power and the number of sequences cached.

Definition at line 35 of file BaseSequenceAdaptor.pm.

Member Function Documentation

◆ _fetch_raw_seq()

protected ScalarRef Bio::EnsEMBL::DBSQL::BaseSequenceAdaptor::_fetch_raw_seq ( )
  Arg [1]     : String $id
                The identifier of the sequence to fetch.
  Arg [2]     : Integer $start
                Where to start fetching sequence from
  Arg [3]     : Integer $length
                Total length of sequence to fetch
  Description : Performs the fetch of DNA from the backing storage 
                engine and provides it to the _fetch_seq() method
                for optional caching.
  Returntype  : ScalarRef of DNA fetched. All bases should be uppercased
  Exceptions  : Thrown if the method is not reimplemented
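
A minimal sketch of what a reimplementation might look like, using an in-memory hash in place of a real storage engine. The package name My::InMemorySequenceStore and its constructor are hypothetical, the class is kept standalone so the example runs without Ensembl installed (a real implementation would inherit from this base class), and $start is assumed to be 1-based.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a subclass of the base adaptor.
package My::InMemorySequenceStore;

sub new {
    my ($class, %sequences) = @_;
    return bless { sequences => {%sequences} }, $class;
}

# Returns a ScalarRef of uppercased DNA, as the contract above requires.
# Assumption: $start counts from 1.
sub _fetch_raw_seq {
    my ($self, $id, $start, $length) = @_;
    my $full = $self->{sequences}{$id};
    die "Unknown sequence id '$id'\n" unless defined $full;
    my $seq = uc substr($full, $start - 1, $length);
    return \$seq;
}

package main;

my $store   = My::InMemorySequenceStore->new(chr1 => 'acgtacgtnnacgt');
my $seq_ref = $store->_fetch_raw_seq('chr1', 3, 6);
print ${$seq_ref}, "\n";    # GTACGT
```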
 

◆ _fetch_seq()

protected ScalarRef Bio::EnsEMBL::DBSQL::BaseSequenceAdaptor::_fetch_seq ( )
  Arg [1]     : String $id
                The identifier of the sequence to fetch.
  Arg [2]     : Integer $start
                Where to start fetching sequence from
  Arg [3]     : Integer $length
                Total length of sequence to fetch
  Description : If the requested region is no longer than the maximum
                cacheable sequence length, we check whether the cache already
                contains this chunk. If not, we request the region from
                _fetch_raw_seq() and cache it. If the requested region is
                longer than the maximum cacheable sequence length, we pass the
                request on to _fetch_raw_seq() with no caching layer.
                This method is also responsible for the conversion of
                requested regions into normalised region requests based
                on chunk_power.
  Returntype  : ScalarRef of DNA fetched. All bases should be uppercased
  Exceptions  : Thrown when _fetch_raw_seq() is not re-implemented
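
The behaviour described above can be sketched roughly as follows. This is not the adaptor's actual code: fetch_seq(), fetch_raw(), %store and %cache are hypothetical names, the maximum cacheable length is assumed to be chunk length times cache size, and eviction of old chunks is left out for brevity.

```perl
use strict;
use warnings;

my $chunk_power   = 18;
my $cache_size    = 5;
my $max_cacheable = (1 << $chunk_power) * $cache_size;   # assumption, see lead-in
my %cache;                                               # "id:chunk_index" => chunk
my $raw_fetches   = 0;

# Hypothetical backing store holding 800,000 bp.
my %store = (chr1 => 'ACGT' x 200_000);

# Stand-in for _fetch_raw_seq(); $start is 1-based.
sub fetch_raw {
    my ($id, $start, $length) = @_;
    $raw_fetches++;
    my $seq = substr($store{$id}, $start - 1, $length);
    return \$seq;
}

# Stand-in for _fetch_seq(): normalise to chunks, consult the cache,
# fall back to the raw fetch for anything not yet cached.
sub fetch_seq {
    my ($id, $start, $length) = @_;
    return fetch_raw($id, $start, $length) if $length > $max_cacheable;

    my $first  = ($start - 1) >> $chunk_power;
    my $last   = ($start + $length - 2) >> $chunk_power;
    my $joined = '';
    for my $i ($first .. $last) {
        my $key = "$id:$i";
        $cache{$key} //= ${ fetch_raw($id, ($i << $chunk_power) + 1, 1 << $chunk_power) };
        $joined .= $cache{$key};
    }
    my $seq = substr($joined, ($start - 1) - ($first << $chunk_power), $length);
    return \$seq;
}

my $first_ref  = fetch_seq('chr1', 100, 10);       # loads chunk 0 from the store
my $second_ref = fetch_seq('chr1', 200_000, 50);   # same chunk, served from the cache
print ${$first_ref}, "\n";                         # TACGTACGTA
print "raw fetches: $raw_fetches\n";               # raw fetches: 1
```

Both requests fall inside the first 2^18 bp window, so only one call reaches the backing store.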
 

◆ _init_seq_instance()

protected Bio::EnsEMBL::DBSQL::BaseSequenceAdaptor::_init_seq_instance ( )

Undocumented method


◆ cache_size()

public Bio::EnsEMBL::DBSQL::BaseSequenceAdaptor::cache_size ( )

Undocumented method


◆ can_access_Slice()

public Boolean Bio::EnsEMBL::DBSQL::BaseSequenceAdaptor::can_access_Slice ( )
  Description : Returns a boolean indicating if the adaptor understands
                the given Slice.
  Returntype  : Boolean; if true you can get sequence for the given Slice
  Exceptions  : Thrown if not redefined
 

◆ chunk_power()

public Bio::EnsEMBL::DBSQL::BaseSequenceAdaptor::chunk_power ( )

Undocumented method


◆ clear_cache()

public void Bio::EnsEMBL::DBSQL::BaseSequenceAdaptor::clear_cache ( )
  Example       : $sa->clear_cache();
  Description   : Removes all entries from the associated sequence cache
  Returntype    : None
  Exceptions    : None
 

◆ expand_Slice()

public Bio::EnsEMBL::Slice Bio::EnsEMBL::DBSQL::BaseSequenceAdaptor::expand_Slice ( )
  Arg  [1]    : Bio::EnsEMBL::Slice slice
                The slice from which you want the sequence
  Arg  [2]    : Integer; $start (optional)
                The start base pair relative to the start of the slice. Negative
                values or values greater than the length of the slice are fine.
                default = 1
  Arg  [3]    : Integer; $end (optional)
                The end base pair relative to the start of the slice. Negative
                values or values greater than the length of the slice are fine,
                but the end must be greater than or equal to the start
                count from 1
                default = the length of the slice
  Arg  [4]    : Integer; $strand (optional)
                Strand of DNA to fetch
  Returntype  : Bio::EnsEMBL::Slice
  Description : Creates a new Slice which represents the requested region. Provides
                logic applicable to all SliceAdaptor instances
  Exceptions  : Thrown if the Slice is circular (we currently do not support this as generic logic)
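
As a rough illustration of how slice-relative coordinates translate into an expansion, the sketch below computes how far a requested region extends beyond each end of a slice. The helper expansion_for() is hypothetical and only demonstrates the arithmetic implied by the argument descriptions; it is not taken from the method's source, and it ignores strand.

```perl
use strict;
use warnings;

# Hypothetical helper: given the slice length and a start/end relative to
# the slice start (counting from 1), return how many bases the region
# extends past the left and right ends. Negative values mean the region
# is shorter than the slice on that side.
sub expansion_for {
    my ($slice_length, $start, $end) = @_;
    $start = 1             unless defined $start;   # documented default
    $end   = $slice_length unless defined $end;     # documented default
    die "end must be greater than or equal to start\n" if $end < $start;
    my $left  = 1 - $start;
    my $right = $end - $slice_length;
    return ($left, $right);
}

my ($left, $right) = expansion_for(100, -9, 110);
print "left=$left right=$right\n";    # left=10 right=10
```

A start of -9 lies ten bases before the first base of the slice, and an end of 110 lies ten bases past its last base.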
 

◆ fetch_by_Slice_start_end_strand()

public StringRef Bio::EnsEMBL::DBSQL::BaseSequenceAdaptor::fetch_by_Slice_start_end_strand ( )
  Arg  [1]   : Bio::EnsEMBL::Slice slice
               The slice from which you want the sequence
  Arg  [2]   : Integer; $start (optional)
               The start base pair relative to the start of the slice. Negative
               values or values greater than the length of the slice are fine.
               default = 1
  Arg  [3]   : Integer; $end (optional)
               The end base pair relative to the start of the slice. Negative
               values or values greater than the length of the slice are fine,
               but the end must be greater than or equal to the start
               count from 1
               default = the length of the slice
  Arg  [4]   : Integer; $strand (optional)
               Strand of DNA to fetch
  Returntype : StringRef (DNA requested)
  Description: Performs the fetching of DNA based upon a Slice. All fetches
               should use this method and no other.
               Implementing classes are responsible for converting the
               given Slice and values into something which can be processed by 
               the underlying storage engine. Implementing classes are also
               responsible for the reverse complementing of sequence.
  Exceptions : Thrown if not redefined
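
Since implementing classes must reverse complement the sequence themselves, a minimal sketch of that step is shown here. The function reverse_complement() is a hypothetical helper written for this example; it handles only A, C, G, T and N, and leaves other IUPAC ambiguity codes untouched.

```perl
use strict;
use warnings;

# Hypothetical helper: takes a ScalarRef of DNA and returns a new ScalarRef
# holding its reverse complement. Only A, C, G, T and N are translated.
sub reverse_complement {
    my ($seq_ref) = @_;
    my $rc = reverse ${$seq_ref};       # scalar context reverses the string
    $rc =~ tr/ACGTNacgtn/TGCANtgcan/;   # complement each base
    return \$rc;
}

my $dna = 'AACGTN';
print ${ reverse_complement(\$dna) }, "\n";    # NACGTT
```

An implementation would typically apply a step like this only when the requested strand is -1.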
 

◆ new()

public Bio::EnsEMBL::DBSQL::SequenceAdaptor Bio::EnsEMBL::DBSQL::BaseSequenceAdaptor::new ( )
  Arg [1]    : Int  $chunk_power; sets the size of each element of 
                    the sequence cache. Defaults to 18, which gives 
                    block sizes of 2^18 = 262,144bp (~256Kb)
  Arg [2]    : Int  $cache_size; number of elements held in the cache.
                    Defaults to 5, meaning a cache of ~1.3Mb if you use
                    default values
  Example    : my $sa = $db_adaptor->get_SequenceAdaptor();
  Description: Constructor.  Calls superclass constructor and initialises
               internal cache structure.
  Returntype : Bio::EnsEMBL::DBSQL::SequenceAdaptor
  Exceptions : none
  Caller     : DBAdaptor::get_SequenceAdaptor
  Status     : Stable
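
To get a feel for how the two constructor arguments interact, the sketch below works out the number of bases a full cache would hold. The helper max_cached_bases() is hypothetical, and it assumes the cache holds at most $cache_size chunks of 2^$chunk_power bases each.

```perl
use strict;
use warnings;

# Hypothetical helper: bases held by a full cache, assuming $cache_size
# chunks of 2**$chunk_power bases each.
sub max_cached_bases {
    my ($chunk_power, $cache_size) = @_;
    return (1 << $chunk_power) * $cache_size;
}

printf "%d\n", max_cached_bases(18, 5);     # 1310720 (the defaults)
printf "%d\n", max_cached_bases(20, 10);    # 10485760
```

Raising the chunk power by one doubles the chunk length, so memory use grows quickly with that argument.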
 

◆ seq_cache_max()

public Bio::EnsEMBL::DBSQL::BaseSequenceAdaptor::seq_cache_max ( )

Undocumented method


The documentation for this class was generated from the following file:
BaseSequenceAdaptor.pm