See the NOTICE file distributed with this work for additional information
regarding copyright ownership.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
Please email comments or questions to the public Ensembl
developers list at <dev@ensembl.org>.

Questions may also be sent to the Ensembl help desk at
<helpdesk@ensembl.org>.
implementation of the .fai index lookup as defined by samtools. The code uses
this indexing system to access portions of sequence and translates Slice requests
into sensible locations for our FASTA query layer.
The adaptor must be initialised with access to a Faidx-compatible object, and the FASTA
file backing it must use the same seq_region_name as the querying slices; otherwise
we cannot return the required data.
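
The lookup described above can be illustrated with a minimal sketch of how a samtools-style .fai entry maps a 1-based sequence position to a byte offset in the FASTA file. This is not the FileFaidx implementation; the helper names C<parse_fai_line> and C<fai_byte_offset> are hypothetical, and the sketch assumes the standard five tab-separated .fai columns (name, length, offset, bases per line, bytes per line).

```perl
use strict;
use warnings;

# Hypothetical helper: parse one line of a .fai index into a hash reference.
# Columns: NAME, LENGTH, OFFSET, LINEBASES, LINEWIDTH (tab-separated).
sub parse_fai_line {
  my ($line) = @_;
  chomp $line;
  my ($name, $length, $offset, $linebases, $linewidth) = split /\t/, $line;
  return {
    name      => $name,
    length    => $length,
    offset    => $offset,       # byte offset of the first base of the record
    linebases => $linebases,    # bases per FASTA line
    linewidth => $linewidth,    # bytes per FASTA line, including the newline
  };
}

# Hypothetical helper: byte offset in the FASTA file of a 1-based position.
sub fai_byte_offset {
  my ($entry, $pos) = @_;
  my $zero_based = $pos - 1;
  my $full_lines = int($zero_based / $entry->{linebases});  # complete lines skipped
  my $within     = $zero_based % $entry->{linebases};       # bases into the current line
  return $entry->{offset} + $full_lines * $entry->{linewidth} + $within;
}

my $entry = parse_fai_line("chr1\t100\t6\t60\t61\n");
print fai_byte_offset($entry, 61), "\n";
```

Because the index key is the record name, the seq_region_name of a querying Slice has to match the FASTA header name exactly for a lookup like this to succeed.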
package Bio::EnsEMBL::DBSQL::FastaSequenceAdaptor;
use English qw/-no_match_vars/;
  Arg [1]     : FileFaidx; $faindex. A FileFaidx object or a compatible version
  Arg [2]     : Integer; $chunk_power. Size of the region to cache
  Arg [3]     : Integer; $cache_size. Number of regions to cache
  my ($class, $faindex, $chunk_power, $cache_size) = @_;
  my $self = $class->SUPER::new($chunk_power, $cache_size);
=head2 fetch_by_Slice_start_end_strand
  Arg [2]     : Integer; $start. Start of region to retrieve relative to the Slice (defaults to 1)
  Arg [3]     : Integer; $end. End of region to retrieve relative to the Slice (defaults to length)
  Arg [4]     : Integer; $strand. Strand to fetch (defaults to 1)
  Description : Fetches sequence for the given slice. Unlike the normal SequenceAdaptor we assume
                sequence is held in a FASTA file under the Slice's seq_region_name.
  Exception   : Thrown if we are given a circular slice
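
The argument defaults and strand handling documented above can be sketched without any Ensembl dependencies. The helper C<fetch_relative> below is hypothetical and is not the adaptor's implementation: it assumes a forward-strand slice described by plain start and end coordinates on an in-memory sequence, applies the documented defaults, constrains the request to the sequence, and reverse complements when the strand is -1.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the Ensembl API.
#   $region_seq  : full sequence of the seq_region
#   $slice_start : 1-based start of the slice on the seq_region
#   $slice_end   : 1-based end of the slice on the seq_region
#   $start, $end : coordinates relative to the slice (optional)
#   $strand      : 1 or -1 (optional)
sub fetch_relative {
  my ($region_seq, $slice_start, $slice_end, $start, $end, $strand) = @_;

  # Apply the documented defaults
  $start  = 1 if !defined $start;
  $end    = $slice_end - $slice_start + 1 if !defined $end;
  $strand = 1 if !defined $strand;

  # Translate slice-relative coordinates into seq_region coordinates
  my $abs_start = $slice_start + $start - 1;
  my $abs_end   = $slice_start + $end - 1;

  # Constrain to the seq_region if the request ran off either end
  $abs_start = 1 if $abs_start < 1;
  $abs_end   = length($region_seq) if $abs_end > length($region_seq);

  my $seq = substr($region_seq, $abs_start - 1, $abs_end - $abs_start + 1);

  # Reverse complement for the negative strand
  if ($strand == -1) {
    $seq = reverse $seq;
    $seq =~ tr/ACGTacgt/TGCAtgca/;
  }
  return \$seq;    # a scalar reference, mirroring the adaptor's return type
}

my $seq_ref = fetch_relative('AACCGGTTAC', 3, 8, 2, 4, -1);
print ${$seq_ref}, "\n";
```

Returning a scalar reference avoids copying large sequences between callers, which is presumably why the adaptor methods documented here also return one.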
sub fetch_by_Slice_start_end_strand {
  my ( $self, $slice, $start, $end, $strand ) = @_;
  if(defined $end && $start > $end && $slice->is_circular()) {
    throw "Currently we do not support circular requests";
  }
  # Get a new slice that spans the exact region to retrieve DNA from,
  # then constrain to the seq region if it has gone negative or over the end
  $slice = $self->expand_Slice($slice, $start, $end, $strand);
  # This call is likely to throw if we query using a chr name
  # we do not understand. Use can_access_Slice() to check first
  my $seq_ref = $self->_fetch_seq($slice->seq_region_name(), $slice->start(), $slice->length());
  reverse_comp($seq_ref) if $strand == -1;
  Description : Holds a reference to the Faindex object to use for sequence access
  my ($self, $faindex) = @_;
  if(defined $faindex) {
    assert_ref($faindex, 'Bio::EnsEMBL::Utils::IO::FileFaidx', 'faidx');
    $self->{faindex} = $faindex;
  }
  return $self->{faindex};
=head2 can_access_Slice
  Description : Checks the lookup to see if we have access to the Slice given (using
                seq region name as the ID). We reject any circular Slice
sub can_access_Slice {
  my ($self, $slice) = @_;
  return 0 if $slice->is_circular();
  return $self->faindex()->can_access_id($slice->seq_region_name());
}
  Description : Unsupported operation. Please use a FASTA serialiser
  throw "Unsupported operation. Cannot store sequence in a fasta file";
  Description : Provides access to the underlying faindex object and returns a sequence scalar ref
  my ($self, $id, $start, $length) = @_;
  my $seq_ref = $self->faindex()->fetch_seq($id, $start, $length);