|Position Title||A Computational Approach to Ramsey Theory|
|Summary||The undergraduate student intern will work with the mentor on parallel algorithms to make progress on a computationally intensive open problem in a branch of mathematics called Ramsey Theory. The problem being worked on is called the Party Problem.|
|Job Description||The student will work with the mentor to implement algorithms in a parallel and distributed computing environment to try to improve the lower bound on the Party Problem. The intern will start by learning about the Party Problem, a problem in Ramsey Theory which is an area of study within Graph Theory. The goal of the R(5, 5) instance of the Party Problem is to find the minimum number of people that need to be put into a room to guarantee that either there is a group of 5 people all of whom know each other or a group of 5 people none of whom know each other. We can model this using a complete graph on v vertices where each vertex represents a person and the edge between two vertices is red if those two people know each other and is blue otherwise. It is known that 43 <= R(5, 5) <= 49. In our research, we attempt to determine the value of R(5, 5) or increase the lower bound. After learning about the problem and the progress that the mentor and collaborators have made thus far, the student will work on developing new algorithms to make more progress and develop code to run on CUDA-enabled graphics cards. |
The intern will begin by working with the mentor to identify strategies to solve this computationally intensive problem. Over the course of the internship, we expect to encounter scalability problems with several methods of solving the problem. We expect to hit network saturation issues, memory usage issues, and file I/O issues because of the quantity of data required to solve the problem with our current proposed solution. The student will work with the mentor to identify and understand the scalability problems and try to find alternative ways to solve the problem that work around the scalability issues. The student will create a 12-24 node cluster using dual core systems at the mentor's institution (all necessary hardware is ready to be used). After creating the cluster, the student will explore solutions using both MPI and OpenMP. The student will also explore solutions involving CUDA, instead. By the end of the summer, the student will run the program they developed on the Ranger supercomputer at TACC with a 100,000 hour allocation (already awarded to the mentor and a co-PI through grant TG-DMS100031) or on a workstation with CUDA-enabled GPUs if it is more appropriate. If the program is run on CUDA-enabled GPUs, performance comparisons between running the program using CPUs only and using various GPUs only will be done. One of the student's contributions will be working alongside the mentor to help discover performance issues and find ways to work around them. The student will develop code with some help from the mentor that uses both OpenMP and MPI or CUDA to make progress on the problem. The student will also document all the strategies used, problems encountered, solutions to problems, and the results of running the program on Ranger or on the GPUs. If the lower bound on R(5, 5) is improved as a result of the project, the student will assist the mentor in writing a paper for publication. Regardless of whether the problem is solved, the student will give a talk presenting the results of his work during the academic year at the mentor and student's institution.
The student currently has limited knowledge of the application domain. However, he has some knowledge of graph theory from his Discrete Math course and the remaining required knowledge to successfully work on the project can be learned from the mentor in approximately one to two days because the problem itself is very simple. The simplicity of the problem will allow the student to focus on the parallel computing aspects of solving the problem rather than on a significant amount of domain knowledge that would need to be learned for other problems.
The student is currently completing the second course in the computer science major (data structures) which is his second course using the C++ language. Although privacy laws prevent me from revealing the student's grades, I can say with complete and total confidence that he has learned the skills necessary to write the programs necessary to make progress on the problem. We would like to increase the lower bound on R(5, 5). We would like to begin by testing the conjecture that R(5, 5) = 46 and that there is a self-complementary regular graph that will demonstrate that R(5, 5) > 45. To test this, we may need to generate and analyze all self-complementary graphs that are regular of degree 22. To accomplish this in a reasonable amount of time, we will need to utilize thousands of cores and will likely need to apply for an even more significant TeraGrid grant when we have results from the first grant that justify an additional and much larger time grant.
The outcomes the student will achieve are getting a good introduction to parallel computing, parallel programming with MPI and OpenMP, performance testing and scalability, an opportunity to do research one-on-one with a faculty member after his freshman year, and a chance to learn more about computational science and supercomputing. As stated, we hope to increase the lower bound on R(5, 5), but would be content to make significant progress testing the conjecture that there exists a self-complementary graph that is regular of degree 22 that does not contain a complete subgraph on 5 vertices.
|Conditions/Qualifications||C++ programming ability is required. No background in Ramsey theory is necessary. Must reside in Boston area and be willing to commute to Merrimack College.|
|Location||Dr. David Toth, Assistant Professor, |
Department of Computer Science
315 Turnpike Street
North Andover, MA 01845