Most of this research focuses on Computational Biology, or Bioinformatics: applying computer science techniques to problems encountered in the field of Molecular Biology, in situations that can be helped by computation. This work spans from the more theoretical, such as the complexity of finding common approximate substrings, to the more applied, such as designing and building new tools for the Canadian Potato Genome Project. My interests lie most specifically with applying relevant aspects of computer science theory, such as algorithm design, graph theory, and parameterized complexity, to computational biology. Ther are also potential applications to hypertext analysis.

Partly funded by:

Projects:

One form of motif finding, common for DNA or RNA sequences, is to look for substrings that are found with a few mismatches in all (or most) of a set of sequences. This work was a collaboration with Andrew Smith (recent UNB CS PhD graduate) and Todd Wareham of Memorial University, and also included collaboration with researchers (Ken Kent of UNB and Jackie Rice of U.Lethbridge) on FPGA designs to solve the problem.

Publications:

P.A. Evans, A.D. Smith, and H.T. Wareham. On the complexity of finding common approximate substrings.

P.A. Evans and A.D. Smith. Hardness of approximating closest substring problems.

P.A. Evans and A.D. Smith. Towards optimal motif enumeration.

P.A. Evans, A.D. Smith, and H.T. Wareham. The Parameterized Complexity of p-Center Approximate Substring Problems.

K. Kent, S. Van Schaick, J. Rice, and P.A. Evans. Hardware-based implementation of the common approximate substring algorithm.

RNA molecules fold to form 3-d structures based on largely pairwise bonds between nucleic acids in the chain. These structures can be quite complex, but have characteristics that can be used to help in their analysis. Finding common substructures asssists in RNA classification and the analysis of their functions. The complexity of these algorithms also led to research into parallelizing them and other complex dynamic programming recurrences, in collaboration with Dr. Eric Aubanel at UNB.

Publications:

P.A. Evans. Finding common RNA pseuodknot structures in polynomial time.

P.A. Evans and H.T. Wareham. Exact algorithms for computing pairwise alignments and 3-medians from structure-annotated sequences.

P.A. Evans. Finding common subsequences with arcs and pseudoknots.

P.A. Evans.

Genetic information from organisms can be used to determine their relationships. Alternatively, genotypes from individuals in known family relationships can be used to determine their haplotypes.

Publications:

D. Doan and P.A. Evans. A haplotyping algorithm for non-recombinant pedigree data containing missing members.

T. Asano, P. Evans, R. Uehara, and G. Valiente. Site consistency in phylogenetic networks.

In this general area of research, I am also involved with an AIF-funded commercial project developing a DNA-based tracking system for salmon farming.

We are currently exploring potentially interesting problems in metabolic and regulatory network analysis, including network comparison and network correction.

UNB Bioinformatics and GEELS researchers worked on potato research in partnership with personnel at Genome Atlantic, Bioatlantech, the Government of Canada's Potato Research Centre, Nova Scotia Agricultural College, and Carleton University. The Bioinformatics Research component was co-led by Dr. Patricia Evans and Dr. Virendra Bhavsar. We developed tools to assist the project's potato genome researchers in their task of discovering information about genes.

Funded by:

In addition to the uses of parameterized complexity in some of the above projects, I am also working on applying the techniques to privacy problems, and am involved with a larger project using parameterized complexity to investigate issues with cognitive models such as analogy mapping. This latter project is led by Iris van Rooij at Nijmegen.

Publications:

R. Chaytor, P.A. Evans, and H.T. Wareham. Fixed-parameter tractability of anonymizing data by suppressing entries.

I. van Rooij, P. Evans, M. Muller, T. Gedge, and H.T. Wareham. Identifying sources of intractability in cognitive models: an illustration using analogical structure mapping.

Students:

- Dillon Matchett

Alumni:

- Jian Peng (MCS, 2000)
- Andrew Smith (PhD, 2004)
- Aijazuddin Syed (MCS, 2005)
- Marc Cooper (MCS, 2005)
- Andrew McKim (MCS, 2007)
- Zheng Wang (MCS, 2007)
- Mark Dowe (MCS, 2008)
- Eric Snow (MCS, 2009)
- Emad Bahrami Samani (MCS, 2009)
- En Zhang (MCS, 2010)
- Duong (Peter) Doan (PhD, 2011)
- Rachita Sharma (PhD)
- Steven Stewart (MCS)

Document: http://www.cs.unb.ca/profs/pevans/bioinfor.html

Last Revision : October 21, 2008