Biomolecular Informatics: Sequence to Structure to Function (BIOC156/MEDC156)
Part of the Methods in Biophysics Series

Instructor: Iosif Vaisman, School of Pharmacy

Spring 2000: Feb 18 - Mar 28, Lec: T 2:30-3:30, F 2:00-3:00; Lab T 3:30-5:00 (2307 McGavran-Greenberg)

(Lecture notes will be posted on this page after the lectures)

1. Introduction to information theory
  • Measures of information 
  • Information content of biomolecular sequences
  • Information and Shannon's entropy
  • Lecture notes: html

2. Introduction to computer networks and network-based bioinformatics resources
  • Internet organization and architecture
  • Distributed computing and server-client information systems
  • Research and collaboration on the Internet
  • Electronic publishing
  • Molecular visualization
  • Lecture notes: html

3. Principles of data organization
  • Databases and database structure
  • Biomolecular databases: DNA and protein sequences, genomes, protein structures
  • Interfaces and data access
  • Lecture notes: pdf

4. Artificial intelligence for biomolecular applications
  • Stochastic and mechanistic models
  • Neural networks
  • Genetic algorithms
  • Markov models and hidden Markov models
  • Lecture notes: pdf

5. Sequence alignment and database search algorithms
  • String alignment problem
  • Scoring functions and substitution matrices 
  • Dayhoff and BLOSUM matrices
  • BLAST, FASTA, and Smith-Waterman algorithms
  • Lecture notes: pdf

6. Multiple sequence alignment
  • String comparison in n dimensions
  • Phylogenetic analysis
  • Multiple sequence alignment algorithms and programs
  • Lecture notes: pdf

7. Secondary structure prediction
  • Knowledge-based methods for structure prediction
  • Artificial intelligence in structure prediction
  • Lecture notes: pdf

8. Three-dimensional structure
  • Protein folding problem and its complexity
  • Ab initio methods of protein structure prediction
  • Energy-based methods of structure prediction and refinement
  • Homology modeling
  • Lecture notes: pdf

9. Sequence-structure-function correlations
  • Patterns in protein sequences and structures
  • Databases of protein sites and motifs
  • Lecture notes: pdf

10. Genome informatics
  • Recognizable patterns in DNA sequences: promoter regions, exons, introns, start and stop codons
  • Gene identification and prediction
  • Genome mapping
  • Accuracy of gene prediction

Recommended books
Other reading materials

SChiSM exercise: page 1, page 2
Molecular Biology Databases
Algorithms for Sequence Analysis exercise
Protein Modeling exercise
Gene prediction exercise

The development of this course is supported in part by the Education Enhancement Grant from the North Carolina Biotechnology Center.