Hepatitis B Virus Regulatory Sequence Database (HBVRegDB)



HBVRegDB tutorials





1. Comparison of your sequence to a well annotated HBV genome

2. Testing for conservation of a sequence across genomes

3. Testing for conservation of an RNA secondary structure across genomes

4. Repeating similarity searches against HBVRegDB Sequences, RefSeq viral genomes and proteins

5. Adding your own custom track from the results of a search

1. Comparison of your sequence to a well annotated HBV genome

This can be done by performing a nucleotide-nucleotide (Blastn) search against a well-annotated HBV genome using BLAST 2 Sequences. A well-annotated sequence (AM282986) can be used as sequence 1. The HBV sequence of interest can then be pasted, or its GI entered, as sequence 2. This search allows the user to map sequence numbering from the well-annotated sequence to the query sequence, and it also presents the CDS features (Figure 1).

Figure 1: Result of a BLAST 2 Sequences search when HBV genotype C (NC_003977) was used as the query sequence against the well-annotated HBV genotype A (AM282986).
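
The numbering transfer described above can also be done programmatically once the alignment is available. The following is a minimal sketch, not part of HBVRegDB: it assumes the user has copied the two gapped rows of one aligned segment from the BLAST output, together with the coordinates of their first residues. The function name and arguments are hypothetical.

```python
def map_position(ref_aln, qry_aln, ref_start, qry_start, ref_pos):
    """Map a 1-based position on the reference row to the query row.

    ref_aln / qry_aln are the two gapped rows of one pairwise alignment;
    ref_start / qry_start are the 1-based coordinates of their first residues.
    Returns None if ref_pos lies outside the segment or is aligned to a gap.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("alignment rows must have equal length")
    r, q = ref_start - 1, qry_start - 1  # last coordinate consumed on each row
    for a, b in zip(ref_aln, qry_aln):
        if a != "-":
            r += 1
        if b != "-":
            q += 1
        if a != "-" and r == ref_pos:
            return q if b != "-" else None
    return None


# Toy alignment (made-up residues, not real HBV sequence).
ref_row = "ACG-TAC"   # reference segment starting at position 101
qry_row = "AC-TTAC"   # query segment starting at position 1
mapped = map_position(ref_row, qry_row, 101, 1, 104)
```

For a full genome comparison the same function would be applied to each aligned segment in turn, since BLAST may report several segments with different start coordinates.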



2. Testing for conservation of a sequence across genomes

The HBVRegDB GBrowse view provides a graphical view of the primary conservation of representative HBV sequences. The regions of conservation are indicated by peaks of a mountain plot diagram generated from a multiple sequence alignment by a CDS-Plotcon programme. CDS-Plotcon is specifically designed to search for conserved functional elements within coding sequences.

Conserved regions are easy to identify with the aid of the grid lines. By also turning on tracks for features or annotations, the user can see whether regulatory regions contain conserved elements (see example in Figure 2).

To see the nucleotide sequence in detail, the user can adjust the zoom. Please note that only the HBVRegDB reference sequences (curated sequences of AM282986, NC_003977, NC_001344 and NC_004107) offer additional information about regulatory signals.

Figure 2: Primary conservation across the genomes of representative human HBVs. Peaks of the mountain plot indicate regions of conserved sequence. This analysis indicates that most of the regulatory regions contain highly conserved elements. For example, the post-transcriptional regulatory element (HBV PRE) contains several conserved elements (marked by a red box).
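
To illustrate how a mountain plot of conservation arises from a multiple sequence alignment, here is a minimal sketch. It is not the CDS-Plotcon algorithm, whose scoring is designed for coding sequences; it simply assumes a per-column identity score averaged over a sliding window. The function names and the window size are hypothetical.

```python
from collections import Counter


def column_conservation(alignment):
    """Fraction of sequences sharing the most common residue in each column.

    Gaps ('-') never count as the shared residue, so gappy columns score low.
    """
    if len({len(s) for s in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]
        if not residues:
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


def smooth(scores, window=3):
    """Mean score over a sliding window; one value per window start."""
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]


# Toy alignment (made-up residues, not real HBV sequence).
toy = ["ACGT", "ACGA", "ACTA", "AC-A"]
profile = smooth(column_conservation(toy), window=2)
```

Plotting the smoothed profile against alignment position gives peaks over conserved stretches, which is the kind of display the conservation track presents.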




Another example is the conserved coding regions and RNA structures within the epsilon element (AM282986:1750..1850) shown statically here.

3. Testing for conservation of an RNA secondary structure across genomes

HBVRegDB uses the Alidot programme from ViennaRNA to detect RNA secondary structures in each multiple sequence alignment. The results of the analysis are viewed using GBrowse. Peaks in the graph identify regions of conserved local RNA secondary structure. The grid lines and the zoom help the user identify the nucleotide positions of the conserved RNA secondary structures. By viewing the feature annotation, the user can find out whether any regulatory regions contain conserved RNA secondary structures (see example in Figure 3). Other tools for RNA conservation, for example those described in BRAliBase, could also be used and their results plotted as a user-defined custom track.

Figure 3: Detection of conserved RNA secondary structures from a multiple sequence alignment of human hepatitis B virus. The analysis shows that regulatory regions such as HBV PRE alpha and the epsilon element contain conserved RNA secondary structures (marked by a circle). Please note that the results from Alidot are only shown for the HBVRegDB reference sequences (e.g. AM282986 and NC_003977).
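
One signal that supports a conserved RNA secondary structure is that two alignment columns can still base-pair in most sequences even when the nucleotides themselves differ. The sketch below illustrates only that idea. It is not how Alidot works internally, and the function name and the choice of allowed pairs (Watson-Crick plus G-U wobble) are assumptions made for illustration.

```python
# Watson-Crick pairs plus G-U wobble pairs, written in the RNA alphabet.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pair_support(alignment, i, j):
    """Summarise how well columns i and j (0-based) support a base pair.

    Returns (fraction of sequences that can pair, number of distinct pair types).
    A high fraction with more than one pair type hints at compensatory changes.
    """
    seen = set()
    pairing = 0
    for seq in alignment:
        a = seq[i].upper().replace("T", "U")
        b = seq[j].upper().replace("T", "U")
        if (a, b) in PAIRS:
            pairing += 1
            seen.add((a, b))
    return pairing / len(alignment), len(seen)


# Toy alignment (made-up residues, not real HBV sequence).
toy = ["GCAAAGC", "GUAAAAC", "GGAAACC", "GAAAAAC"]
outer = pair_support(toy, 0, 6)   # invariant G-C pair
inner = pair_support(toy, 1, 5)   # pair maintained by different nucleotides
```

Scores of this kind, computed per position by any tool, could be uploaded as a user-defined custom track in the same way as described in tutorial 5.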



4. Repeating similarity searches against HBVRegDB Sequences, RefSeq viral genomes, and representative proteins

HBVRegDB provides this nucleotide-nucleotide search against specific databases.

  • To begin the search, users choose whether to search against the RefSeq Virus nucleic acid database or a set of HBVRegDB_32 nucleotide sequences (Figure 4).
  • Users can search using a well-annotated HBV genome (set as default) against nucleotide sequences from the specific database. Alternatively, the user can paste or browse a sequence of interest for searching.
  • The parameters are set differently from NCBI Blast defaults to detect more distant matches (-W 7 -G 2 -E 1 -q -1 -r 1 -e 100).

Figure 4: HBVRegDB BlastN
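
Users who want to repeat such a search locally could assemble a comparable command line. The sketch below only builds the argument list and does not run anything. It assumes the flags quoted above are options of the legacy NCBI blastall program, which their single-letter form suggests but the tutorial does not state; the query file name and the local database name are hypothetical.

```python
import shlex

# Flags quoted in the tutorial; meanings as in legacy NCBI blastall (assumed).
SENSITIVE_BLASTN = {
    "-W": "7",    # word size
    "-G": "2",    # cost to open a gap
    "-E": "1",    # cost to extend a gap
    "-q": "-1",   # penalty for a nucleotide mismatch
    "-r": "1",    # reward for a nucleotide match
    "-e": "100",  # expectation value threshold
}


def blastn_command(query_fasta, database):
    """Return the argument list for a blastn search with the settings above."""
    cmd = ["blastall", "-p", "blastn", "-d", database, "-i", query_fasta]
    for flag, value in SENSITIVE_BLASTN.items():
        cmd += [flag, value]
    return cmd


# Hypothetical file and database names.
command = blastn_command("my_hbv.fasta", "HBVRegDB_32")
command_text = shlex.join(command)  # shell-safe string for display or logging
```

The small word size and the high expectation value threshold are what allow more distant matches to be reported, at the cost of more chance hits to review.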



This service allows reanalysis of the HBVRegDB precomputed searches with the same or different parameters. Users can choose different BLAST programmes to search against different specific databases (Figure 5). Notably, the parameters used in HBVRegDB are adjusted to allow matching of short sequences (-W 2 -G 8 -E 2). For example, the HBVRegDB tblastx search matches the short motif YMDD of the HBV P protein to YMDD in the P protein from Human T-lymphotropic virus, Simian T-lymphotropic virus 1 and Squirrel monkey retrovirus (Figure 6).

Figure 5: Blast analyses provided by HBVRegDB. Users can do different blast searches with different specific databases.





Figure 6: Excerpt from tblastx result showing matching of the short motif YMDD in HBV genotype A (AM282986) to YMDD in Simian T-lymphotropic virus 1.


5. Adding your own custom track from the results of a search

Searches for motifs in HBV sequences can be done using online tools. Transterm provides a function to 'Search your own sequence for regulatory elements', for example:



Figure 7. Pattern search is done through the Transterm database - an interactive database of mRNA regions and motifs from all sequenced species and genomes.


A search with this pattern (AUWAAA) returns a hit by email, as shown below:


Results for pattern: User defined pattern
Pattern definition: AUWAAA
Found 1 match only.


This hit can be reformatted into GFF and uploaded as a custom track. Alternatively, the user can click the ‘NEW’ button and enter the annotation as the content of a GFF-formatted text file, as shown below:


Content of GFF formatted text file [help]


reference = AM282986M

ATWAAA User 2735-2740 "My Element"


Displayed in gbrowse:

Alternatively the user can use the ‘highlight feature’ option to highlight the sequence. This can be done by only specifying the region (2735..2740).

Displayed below:



Figure 8. User-added custom tracks in HBVRegDB GBrowse. The result of uploading a GFF file, or entering its content, for a search with the 'ATWAAA' element is displayed as a new track (marked by a circle). With the highlight-feature option, the position of ATWAAA is also highlighted (grey bar, indicated by an arrow).
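
The reformatting step in this tutorial can be scripted. The following is a minimal sketch under stated assumptions: it searches a sequence for an IUPAC pattern locally rather than through Transterm, and it writes lines that mirror the layout of the custom-track text shown above (a reference line followed by type, source, start-end and a quoted description). The function names, the default source and note, and the reference name in the usage example are hypothetical.

```python
import re

# IUPAC nucleotide codes used here, expressed as regular-expression classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T", "U": "T",
    "W": "[AT]", "S": "[CG]", "R": "[AG]", "Y": "[CT]",
    "M": "[AC]", "K": "[GT]", "N": "[ACGT]",
}


def find_motif(sequence, pattern):
    """Return 1-based inclusive (start, end) spans, overlapping hits included."""
    body = "".join(IUPAC[c] for c in pattern.upper())
    regex = re.compile("(?=(" + body + "))")  # lookahead keeps overlapping hits
    seq = sequence.upper().replace("U", "T")
    return [(m.start() + 1, m.start() + len(pattern)) for m in regex.finditer(seq)]


def track_lines(reference, spans, feature_type, source="User", note="My Element"):
    """Format spans in the same layout as the custom-track example above."""
    lines = ["reference = " + reference, ""]
    lines += [f'{feature_type} {source} {s}-{e} "{note}"' for s, e in spans]
    return "\n".join(lines)


# Toy sequence and hypothetical reference name, not real HBV data.
spans = find_motif("GGATTAAACC", "AUWAAA")
track_text = track_lines("REF1", spans, "ATWAAA")
```

The pattern may be given in either the RNA or the DNA alphabet, because U is treated as T before matching. The resulting text can be pasted into the ‘NEW’ custom track box or saved to a file for upload.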


Last Modified 7/09/2007 by CMB