Patricia Evans: Research

Most of this research focuses on Computational Biology, or Bioinformatics: applying computer science techniques to problems encountered in the field of Molecular Biology, in situations that can be helped by computation. This work spans from the more theoretical, such as the complexity of finding common approximate substrings, to the more applied, such as designing and building new tools for the Canadian Potato Genome Project. My interests lie most specifically with applying relevant aspects of computer science theory, such as algorithm design, graph theory, and parameterized complexity, to computational biology. Ther are also potential applications to hypertext analysis.

Partly funded by:


Projects:
Common Approximate Substrings
One form of motif finding, common for DNA or RNA sequences, is to look for substrings that are found with a few mismatches in all (or most) of a set of sequences. This work was a collaboration with Andrew Smith (recent UNB CS PhD graduate) and Todd Wareham of Memorial University, and also included collaboration with researchers (Ken Kent of UNB and Jackie Rice of U.Lethbridge) on FPGA designs to solve the problem.
Publications:
P.A. Evans, A.D. Smith, and H.T. Wareham. On the complexity of finding common approximate substrings. Theoretical Computer Science, 306(1-3), 407-430.
P.A. Evans and A.D. Smith. Hardness of approximating closest substring problems. Proceedings of 14th International Symposium on Foundations of Complexity Theory (FCT 2003), Springer-Verlag LNCS 2748 (2003), 210-221.
P.A. Evans and A.D. Smith. Towards optimal motif enumeration. Proceedings of Workshop on Algorithms and Data Structures (WADS 2003), Springer-Verlag LNCS 2751 (2003), 47-58.
P.A. Evans, A.D. Smith, and H.T. Wareham. The Parameterized Complexity of p-Center Approximate Substring Problems. Technical report TR01-149, Faculty of Computer Science, University of New Brunswick.
K. Kent, S. Van Schaick, J. Rice, and P.A. Evans. Hardware-based implementation of the common approximate substring algorithm. Proceedings of 8th Euromicro Conference on Digital Systems Design (2005), 314-321.

RNA Structure Comparison and Analysis
RNA molecules fold to form 3-d structures based on largely pairwise bonds between nucleic acids in the chain. These structures can be quite complex, but have characteristics that can be used to help in their analysis. Finding common substructures asssists in RNA classification and the analysis of their functions. The complexity of these algorithms also led to research into parallelizing them and other complex dynamic programming recurrences, in collaboration with Dr. Eric Aubanel at UNB.
Publications:
P.A. Evans. Finding common RNA pseuodknot structures in polynomial time. Proceedings of Combinatorial Pattern Matching (CPM 2006), Springer-Verlag LNCS 4009 (2006), 223-232.
P.A. Evans and H.T. Wareham. Exact algorithms for computing pairwise alignments and 3-medians from structure-annotated sequences. Pacific Symposium on Biocomputing 6 (2001), 559-570.
P.A. Evans. Finding common subsequences with arcs and pseudoknots. Proceedings of Combinatorial Pattern Matching (CPM 1999), Springer-Verlag LNCS 1645 (1999), 270-280.
P.A. Evans. Algorithms and Complexity for Annotated Sequence Analysis. Ph.D. Thesis, University of Victoria, 1999.

Haplotyping and Phylogeny
Genetic information from organisms can be used to determine their relationships. Alternatively, genotypes from individuals in known family relationships can be used to determine their haplotypes.
Publications:
D. Doan and P.A. Evans. A haplotyping algorithm for non-recombinant pedigree data containing missing members. Proceedings of the IEEE International Conference on Bioinformatics and Biomedicine (BIBM) (2007), 275-281.
T. Asano, P. Evans, R. Uehara, and G. Valiente. Site consistency in phylogenetic networks. Algorithms in Bioinformatics 6, London:College Publications, 2006, 15-26.
In this general area of research, I am also involved with an AIF-funded commercial project developing a DNA-based tracking system for salmon farming.

Metabolic Network Analysis
We are currently exploring potentially interesting problems in metabolic and regulatory network analysis, including network comparison and network correction.

The Canadian Potato Genome Project
UNB Bioinformatics and GEELS researchers worked on potato research in partnership with personnel at Genome Atlantic, Bioatlantech, the Government of Canada's Potato Research Centre, Nova Scotia Agricultural College, and Carleton University. The Bioinformatics Research component was co-led by Dr. Patricia Evans and Dr. Virendra Bhavsar. We developed tools to assist the project's potato genome researchers in their task of discovering information about genes.
Funded by:        

Applications of Parameterized Complexity
In addition to the uses of parameterized complexity in some of the above projects, I am also working on applying the techniques to privacy problems, and am involved with a larger project using parameterized complexity to investigate issues with cognitive models such as analogy mapping. This latter project is led by Iris van Rooij at Nijmegen.
Publications:
R. Chaytor, P.A. Evans, and H.T. Wareham. Fixed-parameter tractability of anonymizing data by suppressing entries. Proceedings of the 2nd Annual International Conference on Combinatorial Optimization and Applications (COCOA 2008),Springer-Verlag LNCS 5165 (2008), 23-31.
I. van Rooij, P. Evans, M. Muller, T. Gedge, and H.T. Wareham. Identifying sources of intractability in cognitive models: an illustration using analogical structure mapping. Proceedings of the 30th Annual Conference of the Cognitive Science Society (CogSci 2008), 6 pgs.



Students:

Alumni:

UNB Homepage CS Homepage

Document: http://www.cs.unb.ca/profs/pevans/bioinfor.html
Last Revision : October 21, 2008