DNA Encoded Glycan Libraries

Header-based Fixed-length Block Encoding with Self-describing Protocol

About Glycan.rna.mk

This tool translates complex branched glycan sequences into dense, orthogonal DNA barcodes and decodes them back, with error detection or correction depending on the encoding mode.

Designed for DNA Encoded Glycan Libraries (DEGL), it supports the standard IUPAC-condensed nomenclature and offers multiple optimization modes that respect biological synthesis constraints (GC limits, palindrome avoidance).

🧬 DEGL Encoding Control Panel

Examples: Sialyl Lewis X, Lewis Y, N-Glycan Core, LacNAc, Complex N-Glycan
📋 IUPAC Format Support: Two input formats are accepted. In the custom format, glycan monomers, linkages, modifications and brackets are separated by " : ". Standard IUPAC-condensed notation is also supported.

Result:

Input Glycan:
DNA Code:
Header Tag:
Total Length:
Mode Used:
IUPAC Format:

Decode DNA to Glycan (Auto-detect Header)

The decoder detects the encoding mode automatically from the Header tag. Max Density Mode (Legacy) sequences, which carry no Header, are also supported.
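The header-based, fixed-length block scheme can be illustrated with a minimal Python sketch. The header tags, header length, and codebook entries below are hypothetical placeholders chosen for illustration and are not the tool's actual tables; linkages and brackets are left out, and the headerless legacy fallback is not modeled.

```python
# Hypothetical values for illustration only; the real DEGL headers and
# codebooks are not given in this document.
HEADER_LEN = 4
HEADERS = {"ACGT": 4, "TGCA": 5}  # assumed header tag -> token length (nt)

CODEBOOKS = {
    4: {"AACC": "Gal", "AAGG": "GlcNAc", "CCAA": "Fuc"},
    5: {"AACCA": "Gal", "AAGGA": "GlcNAc", "CCAAT": "Fuc"},
}


def encode(tokens, token_len):
    """Prefix the header for the chosen token length, then append one block per token."""
    header = next(tag for tag, n in HEADERS.items() if n == token_len)
    reverse = {name: code for code, name in CODEBOOKS[token_len].items()}
    return header + "".join(reverse[t] for t in tokens)


def decode(dna):
    """Read the header, infer the token length, and split the body into blocks."""
    header, body = dna[:HEADER_LEN], dna[HEADER_LEN:]
    if header not in HEADERS:
        raise ValueError("unknown header tag")
    n = HEADERS[header]
    if len(body) % n != 0:
        raise ValueError("body length is not a multiple of the token length")
    book = CODEBOOKS[n]
    return n, [book[body[i:i + n]] for i in range(0, len(body), n)]


if __name__ == "__main__":
    dna = encode(["Gal", "GlcNAc", "Fuc"], token_len=5)
    print(dna)
    print(decode(dna))
```

Because every token in a given mode has the same length, the decoder needs only the header to know where each block starts and ends, which is what makes the sequence self-describing.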
Enter Nucleotide:
Examples (Legacy): AAGAATAATAGATAAAGA Lewis Y (Legacy)

Result:

Input DNA:
Detected Mode:
Decoded Glycan:
IUPAC Format:

Current Codebook Reference

View the complete mapping of DNA barcodes to Glycan IUPAC codes for the currently selected encoding mode.
Note: Changing the token length or mode will dynamically regenerate this table.

Methodology: DEGL Encoding Modes

Table 1: Characteristics and constraints of the encoding modes available in the DEGL system.

Encoding Mode | Token Length | Theoretical Capacity | Error Correction Capability | GC Constraints | Palindrome Check
Max Density (Legacy) | Variable | Dependent on dictionary | Heuristic matching (d=N/A) | None | Off
Optimized (Ultra-compact) | 4 nt | ~256 tokens | None (d=1) | 0–100% (No limits) | Off
Optimized (Balanced / Default) | 5 nt | ~996 tokens | None (d=1) | 0–100% (No limits) | Off
Optimized (Thermodynamic) | 6 nt | >500 tokens | Error detection only (d=2) | 40–60% | On
Optimized (High Fidelity) | 7 nt | ~350 tokens | 1-bp active correction (d=3) | 40–60% | On

Table 1 Notes:

  • Token Length: The number of nucleotides (nt) assigned to each glycan building block or macro-compressed core.
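  • Error Correction Capability: The minimum pairwise Hamming distance (d) enforced across the codebook. With d=1, a single substitution can turn one valid token into another, so errors go unnoticed. A distance of d=2 guarantees that any single-base substitution is detected, while d=3 allows a 1-bp substitution to be corrected by assigning the read to its nearest codeword.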
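  • GC Content & Palindrome: Constraints on the GC fraction of each token and on self-complementary (palindromic) tokens. They act as thermal stability controls intended to support efficient PCR amplification and to prevent polymerase slippage during library construction.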
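The constraints in the notes above can be sketched in Python as a simple greedy codebook builder. This is an illustrative approach under stated assumptions, not necessarily the algorithm the tool uses: the function names are hypothetical, "palindrome" is assumed to mean a sequence equal to its own reverse complement, and a greedy search will not in general reproduce the capacities listed in Table 1.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in seq."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def is_palindrome(seq):
    """True if seq equals its own reverse complement (assumed definition)."""
    return seq == seq.translate(COMPLEMENT)[::-1]


def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def build_codebook(length, min_dist, gc_min=0.0, gc_max=1.0, no_palindromes=False):
    """Greedily keep candidates that pass the filters and stay at least
    min_dist away from every codeword accepted so far."""
    book = []
    for bases in product("ACGT", repeat=length):
        seq = "".join(bases)
        if not gc_min <= gc_fraction(seq) <= gc_max:
            continue
        if no_palindromes and is_palindrome(seq):
            continue
        if all(hamming(seq, c) >= min_dist for c in book):
            book.append(seq)
    return book


def correct(read, book, max_errors=1):
    """Return the nearest codeword if it is within max_errors, else None."""
    best = min(book, key=lambda c: hamming(read, c))
    return best if hamming(read, best) <= max_errors else None


if __name__ == "__main__":
    book = build_codebook(6, min_dist=3, gc_min=0.4, gc_max=0.6, no_palindromes=True)
    print(len(book), book[:5])
    damaged = ("A" if book[0][0] != "A" else "C") + book[0][1:]
    print(damaged, "->", correct(damaged, book))
```

With a minimum distance of 3, a read carrying one substitution is at distance 1 from its true codeword and at least 2 from every other, so the nearest-codeword lookup is unambiguous. Hamming distance counts substitutions only; insertions and deletions would need a different metric.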