Software Projects

Microsoft Excel Implementation of the Needleman-Wunsch Sequence Alignment Algorithm

Posted in Uncategorized by rmt on February 16, 2010

The Needleman-Wunsch algorithm aligns nucleotide sequences by taking into account a numerical penalty for gaps in the sequences. Using Excel, this template calculates the similarity matrix and uses Excel’s conditional formatting to display the path for traceback.

You can copy the formulas outward if you want to compare longer sequences.

By default, random sequences are loaded for the top (subject) and left-side (query) sequences. You can press F9 in Excel to generate new random sequences.

Screenshots

Needleman-Wunsch Algorithm Template Screenshot

Screenshot of the Excel Template

Screenshot showing tracing precedents in Excel

Using Excel’s “Trace Precedents” command to show all the dependencies for cell (2,2)

Traceback Key

Red: move diagonally up and left, sequences are aligned
Green: move up, gap in top (query) sequence
Blue: move left, gap in left-side (subject) sequence
White: alternate optimal alignments. Move up, left, or diagonally toward the highest value

Default Scoring Values

Gap Penalty: -1
Match Score: 1
Mismatch Score: 0

Example Formula

The formula for cell C8 is representative of an entry in the table (cells B1 through B3 hold the match, mismatch, and gap scoring penalties):

    =MAX(B7+IF(C$6=$A8,$B$1,$B$2),C7+$B$3,B8+$B$3)

Finding Semi-Global Alignments

The default behavior of this Excel template is to find global alignments. It can be modified to find semi-global alignments by making three adjustments:

  • Initializing the leftmost numerical column and the topmost numerical row with all 0s
  • Allowing free horizontal moves in the last row
  • Allowing free vertical moves in the last column

Download Link

Needleman-Wunsch Algorithm Excel Template

(Works for Microsoft Excel 97-2003, Excel 2007, and Excel 2010)

Alternate Version

This alternate version shows the traceback arrows along with the cell coloring. The arrows are helpful for understanding the algorithm, but the formula gets cluttered (for example, in cell internal2!C8):

=CONCATENATE(
   IF(
      AND(internal1!B7+IF(internal1!C$6=internal1!$A8,Main!$B$1,Main!$B$2)>internal1!C7+Main!$B$3,
         internal1!B7+IF(internal1!C$6=internal1!$A8,Main!$B$1,Main!$B$2)>internal1!B8+Main!$B$3),
         "↖",""),
   IF(
      AND(internal1!C7+Main!$B$3>internal1!B7+IF(internal1!C$6=internal1!$A8,Main!$B$1,Main!$B$2),
         internal1!C7+Main!$B$3>internal1!B8+Main!$B$3),
         "↑",""),
   IF(
      AND(internal1!B8+Main!$B$3>internal1!B7+IF(internal1!C$6=internal1!$A8,Main!$B$1,Main!$B$2),
         internal1!B8+Main!$B$3>internal1!C7+Main!$B$3),
         "←","")
)

Screenshot

Screenshot of Excel Window Showing Needleman-Wunsch Scoring Table With Traceback Arrows

Screenshot of alternate version showing traceback arrows.

Alternate Version Download Link

Needleman-Wunsch Algorithm Excel Template with Traceback Arrows

(Works for Microsoft Excel 97-2003, Excel 2007, and Excel 2010)

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s