PhD Scholarships in Big Data and Data Science Systems in Canada

Research objectives

Data systems are going through a major transition due to the challenges of Big Data processing. The volume and velocity of data generated from a variety of sources are far outpacing the available storage and processing capacity. Data Science makes it possible to bring structure to large quantities of data and to analyze them. However, existing data systems are unable to meet the computational challenges of Data Science applications. The goal of the research is to devise new approaches to data processing that can support analysis of data at massive scales.


The University of New Brunswick, Fredericton, is one of the top comprehensive universities in Canada. Its Faculty of Computer Science was the first faculty of computer science in Canada and has been a leader in Atlantic Canada since 1968, with the oldest and most successful co-op program in Atlantic Canada.

PhD positions

There are two fully funded PhD positions available. Both require a solid background in Computer Science (or Computer Engineering), including a Master's-level degree from a reputable university with excellent grades. Strong programming and software implementation skills are also desired. The ability to take initiative and work independently, genuine curiosity about the subject matter, debugging skills, and excellent analytical skills are highly valued. Previous research experience or work experience in the software industry is an asset. Further details and requirements for each position follow.

Position 1: High performance query processing system

This PhD research will develop high performance SQL processing approaches using cutting-edge query compilation techniques, taking advantage of modern multicore hardware as well as distributed Big Data frameworks such as Hadoop and Spark. A strong programming background in C/C++ and Java is necessary, and knowledge of Python is expected. A solid understanding of, and experience with, database system internals, compiler design, and Linux systems programming is advantageous.

Position 2: Scalable parallel runtime for Data Science

This PhD research will focus on developing high performance parallel runtime infrastructure for Data Science applications on modern hardware. Strong programming skills in Python and C/C++ are important. Prior experience in parallel programming, Linux systems programming, and language runtimes is an asset. Familiarity with the Python Data Science ecosystem, including NumPy, pandas, and machine learning or data mining libraries, is appreciated.



Please contact: with your CV, and Bachelor's and Master's degree transcripts, email to