Ribosome Atlas Tutorial

Roma Nagle, Jamie H.D. Cate, Yekaterina Shulgina

This tutorial walks through the main features of Ribosome Atlas: selecting and browsing phylogenetic trees, specifying alignment positions, and interpreting the visualizations.

1. Navigating the Phylogenetic Tree

Step 1 – Choose a domain

Use the Choose tree dropdown to select either Bacteria or Archaea. This loads the corresponding GTDB phylogeny.

Step 2 – Choose a taxonomic level

Use the Choose a level dropdown to pick the rank you want to explore:

Domain – shows the entire Archaea or Bacteria tree.
Phylum, Class, Order, Family, Genus – each level narrows the set of available groups shown in the next dropdown.

Step 3 – Select a group

The Options dropdown is populated with all groups at the chosen level. Select one to load its subtree in the tree panel on the left.

Step 4 – Choose a view level

The Choose a view level dropdown controls how the tree leaves are colored and grouped. For example, selecting Species colors each tip by species, while selecting Genus gives all tips in the same genus a single color. This setting is independent of the level you used to filter.

The tree is interactive — clicking on a node will drill into that clade and update all panels. The breadcrumb below the controls shows which clade is currently displayed.

2. Specifying Alignment Positions

The two text inputs — 16S positions and 23S positions — let you select specific columns from the ribosomal RNA alignments to display. Positions are numbered relative to the E. coli reference.

Input format

Single position: 530
Range: 530-540
Mixed: 5, 10-20, 25
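
The input format above can be illustrated with a small parser. This is a minimal sketch, not the code Ribosome Atlas itself runs: the function name parse_positions is hypothetical, and it assumes that ranges are inclusive at both ends and that whitespace around commas is ignored.

```python
def parse_positions(text: str) -> list[int]:
    """Parse a string such as '5, 10-20, 25' into positions, in entry order."""
    positions: list[int] = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue  # tolerate a blank field or a trailing comma
        if "-" in token:
            start_str, end_str = token.split("-", 1)
            start, end = int(start_str), int(end_str)
            if start > end:
                raise ValueError(f"range start exceeds end: {token!r}")
            # Assumption: ranges include both endpoints.
            positions.extend(range(start, end + 1))
        else:
            positions.append(int(token))
    return positions


print(parse_positions("5, 10-20, 25"))
```

Under these assumptions, the mixed example expands to position 5, then positions 10 through 20, then position 25.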

Example workflow

Enter 1492-1510 in the 16S field to examine the 3′ end of the small subunit rRNA across the selected clade, then click Generate SVG.

After you click Generate SVG, the tree and alignment panels update to reflect your selection. Leave either field blank to skip that molecule.

Positions that fall outside the alignment or contain only gaps are silently skipped. If no valid positions remain, an error message appears in red below the inputs.

3. Interpreting the Visualizations

Phylogenetic tree (left panel)

The tree shows the evolutionary relationships among organisms in the selected clade. Tips are labeled and colored by the view level you chose. Internal nodes can be clicked to zoom into a subtree.

Base composition plot

For each selected alignment position, the stacked bar chart shows the proportion of each nucleotide (A, U, G, C) present in the clade at that column of the alignment. Gap characters are excluded from all counts, so the chart reflects only organisms that have a nucleotide at that position.

Each nucleotide has a fixed color:

A – green
U – yellow
G – blue
C – red

How to read the chart:

Single color, tall bar – the position is highly conserved across the clade; nearly all organisms share the same nucleotide.
Mixed colors – the position is variable; different organisms have different nucleotides at this column.
Short bar – a large fraction of organisms have a gap at this position, meaning the nucleotide is absent or the region is not present in many sequences.

Each bar corresponds to one alignment position. When you specify multiple positions or a range, the bars are arranged left to right in the order you entered them. Gaps between non-contiguous ranges are shown as visual separators.

The base composition is computed from species-level alignments for all organisms in the selected clade, regardless of the view level chosen in the tree panel.
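
The gap-excluded proportions behind each bar can be sketched as follows. This is an illustration only: the function name base_composition is hypothetical, and it assumes that one alignment column is available as a string with one character per organism and that any character other than A, U, G, or C counts as a gap.

```python
from collections import Counter

NUCLEOTIDES = "AUGC"


def base_composition(column: str) -> dict[str, float]:
    """Return the fraction of each nucleotide among the non-gap characters."""
    # Assumption: anything that is not A, U, G, or C is treated as a gap.
    counts = Counter(base for base in column.upper() if base in NUCLEOTIDES)
    total = sum(counts.values())
    if total == 0:
        # Column holds only gaps; no proportions can be computed.
        return {base: 0.0 for base in NUCLEOTIDES}
    return {base: counts[base] / total for base in NUCLEOTIDES}


# Six organisms, two of which have a gap at this column.
print(base_composition("AAAU--"))
```

In this example the two gapped organisms are dropped, so the proportions are computed over the remaining four sequences: three quarters A and one quarter U.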

Position detail page

Clicking on any bar in the base composition chart pins a summary panel for that position. At the bottom of the panel, click Open full details to open a dedicated page in a new tab. That page contains:

Shannon entropy — the positional entropy in bits (0–2), its normalized value (0–1), and the number of sequences used (gaps excluded).
Base frequency table — exact counts and percentages for A, U, G, C, and gaps at that position.
Pie chart — a visual breakdown of A/U/G/C proportions (gaps excluded).
Species / taxonomy by nucleotide — one table per nucleotide listing every species that has that base at the selected position, along with its full taxonomic classification (Phylum through Genus) and its complete alignment sequence for that rRNA molecule.

This page is useful for identifying exactly which organisms contribute to a conserved or variable position, and for cross-referencing sequence variation with phylogenetic placement.

Hovering over a row in the pinned summary panel highlights the clades that carry that nucleotide, so you can preview its distribution before opening the full details page.

Shannon entropy plot

Shannon entropy is a measure of sequence variability at each alignment position. It is computed from the same per-position nucleotide frequencies as the base composition chart (gaps excluded).

Entropy near 0 – the position is nearly invariant; one nucleotide dominates across the clade.
Higher entropy – the position is more variable; the four nucleotides are more evenly distributed.
Maximum entropy (2 bits) – all four nucleotides appear with equal frequency (25% each).
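
The entropy values described above follow from the standard Shannon formula, H = sum over bases of p * log2(1/p), applied to the gap-excluded frequencies. The sketch below is an illustration under stated assumptions: the function name shannon_entropy is hypothetical, and the normalized value is assumed to be the entropy divided by the 2-bit maximum for four nucleotides.

```python
import math
from collections import Counter


def shannon_entropy(column: str) -> tuple[float, float, int]:
    """Return (entropy in bits, normalized entropy, sequences used)."""
    # Gaps and any other non-nucleotide characters are excluded.
    counts = Counter(base for base in column.upper() if base in "AUGC")
    n = sum(counts.values())
    if n == 0:
        return 0.0, 0.0, 0
    # H = sum p * log2(1/p); bases with zero count contribute nothing.
    entropy = sum((c / n) * math.log2(n / c) for c in counts.values())
    # Assumption: normalization divides by log2(4) = 2 bits.
    return entropy, entropy / 2.0, n


print(shannon_entropy("AAAA--"))  # invariant position
print(shannon_entropy("AAUU"))    # two bases at 50% each
print(shannon_entropy("AUGC"))    # four bases at 25% each
```

An invariant column gives 0 bits, a column split evenly between two bases gives 1 bit, and a column split evenly among all four bases reaches the 2-bit maximum.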

The y-axis is log-scaled to better distinguish low-entropy (highly conserved) positions. Positions with zero entropy, where every organism has the same nucleotide, cannot be shown on a log scale (the logarithm of 0 is undefined) and are marked with * at the base of the chart.

Domain Consensus

The Domain Consensus row shows the consensus sequence computed from all organisms in the selected domain (all Archaea or all Bacteria), at the positions you specified. It uses the same R/Y/N notation as the Clade Consensus (described below) but represents the full-domain background rather than the selected subtree.

Use this row to check whether a pattern you observe in your selected clade holds across the whole domain or is specific to that clade.

Clade Consensus

The Clade Consensus row shows the consensus sequence for the specific clade you have selected (the organisms currently shown in the tree). It summarizes nucleotide identity at each position using the following rules:

A, U, G, C – shown when ≥95% of non-gap bases agree on that nucleotide
R – purine (A or G) when ≥70% of non-gap bases are purines
Y – pyrimidine (C or U) when ≥70% of non-gap bases are pyrimidines
N – no clear majority at this position
– – position is mostly gaps in this clade
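
The rules above can be sketched as a single function that assigns one symbol to one alignment column. This is not the Ribosome Atlas implementation: the name consensus_symbol and the gap_cutoff parameter are hypothetical, the rules are assumed to be checked in the order listed, and "mostly gaps" is assumed to mean that more than half of the sequences have a gap. The sketch uses an ASCII hyphen for the gap symbol.

```python
from collections import Counter


def consensus_symbol(column: str, gap_cutoff: float = 0.5) -> str:
    """Return the consensus symbol (A/U/G/C, R, Y, N, or -) for one column."""
    column = column.upper()
    if not column:
        return "-"
    # Anything that is not A, U, G, or C is treated as a gap.
    counts = Counter(base for base in column if base in "AUGC")
    non_gap = sum(counts.values())
    gap_fraction = (len(column) - non_gap) / len(column)
    # Assumption: "mostly gaps" means gap_fraction exceeds gap_cutoff.
    if non_gap == 0 or gap_fraction > gap_cutoff:
        return "-"
    for base in "AUGC":
        if counts[base] / non_gap >= 0.95:
            return base
    if (counts["A"] + counts["G"]) / non_gap >= 0.70:
        return "R"  # purine
    if (counts["C"] + counts["U"]) / non_gap >= 0.70:
        return "Y"  # pyrimidine
    return "N"


for col in ["A" * 20, "AAAAAAAAAG", "CCCUUUUA", "AUGC", "A---"]:
    print(col, consensus_symbol(col))
```

Checking the single-nucleotide rule first matters: a column that is entirely A also satisfies the purine rule, but should be reported as A rather than R.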

Comparing the Domain Consensus and Clade Consensus side by side lets you quickly identify positions where your selected clade diverges from the broader domain pattern.

Alignment panel (right panels)

The alignment SVG shows the actual nucleotide sequence for each organism at the selected positions, arranged to match the tree on the left. This lets you directly compare sequence variation across the phylogeny at the positions you specified.