a national resource for computational science education

Undergraduate Petascale Modules

Materials for teaching the application, implementation, and analysis of age-structured models in the study of populations.

Parallelization: Area Under a Curve (BW-UPEP Module)

This module presents some of the general ideas and basic principles of high-performance computing (HPC) as performed on a supercomputer. These concepts should remain valid even as the technical specifications of the latest machines continually change. Although this material is aimed at HPC supercomputers, if history is a guide, present HPC hardware and software will reach desktop machines in less than a decade.

Materials for teaching the construction of a cellular automaton model of microbial biofilms using Mathematica, as well as a parallel version using C with MPI.

This module provides a quick review of dynamic programming, but the student is assumed to have seen it before. The parallel programming environment is NVIDIA's CUDA environment for graphics cards (GPGPU - general purpose graphics processing units). The CUDA environment simultaneously operates with a fast shared memory and a much slower global memory, and thus has aspects of shared-memory parallel computing and distributed computing. Specifics for programming in CUDA are included where appropriate, but the reader is also referred to the NVIDIA CUDA C Programming Guide, and the CUDA API Reference Manual.

This module is largely stand-alone. It is "Part II" only in the sense that it does not contain the overview of dynamic programming seen in Part I, and does not recapitulate the introduction to CUDA. We will continue to refer the reader to various NVIDIA references where appropriate, particularly the NVIDIA CUDA C Programming Guide, and the CUDA API Reference Manual, and where we introduce new CUDA-specific ideas, will linger a bit longer by way of introduction. The algorithms described here are completely independent of Part I, so that a reader who already has some familiarity with CUDA and dynamic programming may begin with this module with little difficulty.

This directory contains student exercises and code examples that reinforce topics across multiple modules.

Blue Waters Undergraduate Petascale Education module introducing the n-body problem, algorithms and approaches used to simulate n-body systems, a serial implementation, and issues and approaches to parallel implementation.

Parallelization: Conway's Game of Life (BW-UPEP Module)

This module introduces the Party Problem, a problem in Ramsey theory (a subfield of mathematics), and compares the performance of a naive solution to the Party Problem implemented as a sequential program, an OpenMP program, and a CUDA program.
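As a rough illustration of what a naive sequential solution can look like, the sketch below brute-forces the smallest classic case of the Party Problem: whether every way of marking each pair of guests as "acquainted" or "strangers" contains three mutual acquaintances or three mutual strangers. The function name and the restriction to groups of three are assumptions made for this example, not the module's own code.

```python
from itertools import combinations, product

def every_party_has_trio(guests):
    """Return True if every acquaintance pattern among `guests` people
    contains three mutual acquaintances or three mutual strangers.

    Hypothetical helper for illustration; it tries all 2**(number of pairs)
    patterns, so it is only practical for very small parties.
    """
    pairs = list(combinations(range(guests), 2))
    trios = list(combinations(range(guests), 3))
    for labels in product((0, 1), repeat=len(pairs)):
        relation = dict(zip(pairs, labels))  # 1 = acquainted, 0 = strangers
        found = any(
            relation[(a, b)] == relation[(a, c)] == relation[(b, c)]
            for a, b, c in trios
        )
        if not found:
            return False  # this pattern avoids any uniform trio
    return True

six_guests = every_party_has_trio(6)   # 2**15 patterns to check
five_guests = every_party_has_trio(5)  # 2**10 patterns to check
```

Each pattern is checked independently of the others, which is the property that makes this kind of search a natural candidate for OpenMP or CUDA.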

  • Overview
  • The Tyranny of the Storage Hierarchy
  • Instruction Level Parallelism
  • Compiler Tricks

This module provides an introduction to the GPU architecture and the CUDA development environment, instructions on interfacing with the GPU hardware, and an emphasis on debugging C and CUDA codes with cuda-gdb.

Guided lesson to teach undergraduate and graduate students how to use OpenMP.

Jennifer Houchins' introduction to Scientific Computing Lab.

  • Strategy
  • Approximations
  • Error Analysis
  • Computer Arithmetic

This module teaches a method for evaluating the scalability of parallel programs. It includes a software toolchain, PetaKit, for automating the collection of performance data both for single runs and for parameter sweeps illustrating both strong and weak scaling.
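The quantities usually reported in a scaling study can be sketched in a few lines. The example below does not use PetaKit; the function names and the timings are hypothetical and serve only to show how speedup and efficiency are derived from run times.

```python
def strong_scaling(times):
    """Map core count -> (speedup, efficiency) for a fixed problem size.

    `times` maps core count to run time in seconds and must include 1 core.
    """
    t1 = times[1]
    return {p: (t1 / t, t1 / (t * p)) for p, t in sorted(times.items())}

def weak_scaling_efficiency(times):
    """Map core count -> efficiency when the problem size grows with the cores.

    Ideal weak scaling keeps the run time constant, so efficiency is t1 / tp.
    """
    t1 = times[1]
    return {p: t1 / t for p, t in sorted(times.items())}

# Hypothetical timings (seconds), invented for illustration only.
fixed_size_runs = {1: 100.0, 2: 52.0, 4: 28.0, 8: 16.0}
scaled_size_runs = {1: 100.0, 2: 104.0, 4: 110.0, 8: 125.0}

strong = strong_scaling(fixed_size_runs)
weak = weak_scaling_efficiency(scaled_size_runs)
```

With these made-up numbers, 8 cores give a strong-scaling speedup of 6.25 and a weak-scaling efficiency of 0.8.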

This module demonstrates the application of matrix operations to the modeling of populations.
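One common way matrices enter population modeling is an age-structured projection, in which the population vector is multiplied by a matrix of fertility and survival rates at each time step. The sketch below assumes that formulation (often called a Leslie matrix); the module itself may use a different model, and all rates and counts here are hypothetical.

```python
def project_population(matrix, population):
    """Advance the population one time step: next = matrix * population."""
    return [
        sum(rate * count for rate, count in zip(row, population))
        for row in matrix
    ]

# Hypothetical three-age-class model, invented for illustration.
# Row 0 holds fertility rates; the sub-diagonal holds survival rates.
projection_matrix = [
    [0.0, 1.5, 1.0],
    [0.5, 0.0, 0.0],
    [0.0, 0.8, 0.0],
]
current = [100.0, 50.0, 20.0]
next_step = project_population(projection_matrix, current)
```

Repeating the multiplication projects the population further forward in time.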

This module teaches matrix multiplication in the context of enumerating paths in a graph and the basics of programming in CUDA. It emphasizes the power of using shared memory when programming on GPGPU architectures.
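The link between matrix multiplication and paths can be seen in a small serial sketch: entry (i, j) of the k-th power of a graph's adjacency matrix counts the routes of k edges from vertex i to vertex j, where routes may revisit vertices. The function names and the example graph are assumptions for illustration; this is plain Python, not the module's CUDA code.

```python
def mat_mul(a, b):
    """Multiply two square matrices stored as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def count_routes(adjacency, length):
    """Return the matrix whose (i, j) entry counts routes of `length` edges."""
    n = len(adjacency)
    result = [[int(i == j) for j in range(n)] for i in range(n)]  # identity
    for _ in range(length):
        result = mat_mul(result, adjacency)
    return result

# Hypothetical example graph: a triangle on vertices 0, 1, 2.
triangle = [
    [0, 1, 1],
    [1, 0, 1],
    [1, 1, 0],
]
two_edges = count_routes(triangle, 2)
three_edges = count_routes(triangle, 3)
```

Every entry of the product is an independent dot product, which is why the operation maps well onto GPU threads.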

This module will introduce the basics of cellular automaton simulation with an application to studying the effect of fencing artificial watering points on adult cane toad invasion in Australia.

  • Shared Memory
  • Multithreading (OpenMP)
  • Distributed Multiprocessing (MPI)
  • Applications and Types of Parallelism
  • Multicore Madness

A calculus integration program that finds the area under a curve. Well suited to teaching the basics of OpenMP and MPI.

  • Serial Version
  • Shared Memory Version (OpenMP)
  • Distributed Memory Version (MPI)
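A serial version of this computation can be sketched as a sum of rectangle areas. The example below uses the midpoint rule, which is an assumption; the module's own code may sample the curve differently, and the function name is hypothetical.

```python
import math

def area_under_curve(f, a, b, rectangles):
    """Approximate the integral of f over [a, b] with midpoint rectangles."""
    width = (b - a) / rectangles
    total = 0.0
    for i in range(rectangles):
        midpoint = a + (i + 0.5) * width
        total += f(midpoint) * width  # area of one rectangle
    return total

# The exact area under sin(x) from 0 to pi is 2.
approx = area_under_curve(math.sin, 0.0, math.pi, 1000)
```

Each rectangle's area is independent of the others, so the loop can be split across threads or processes and the partial sums combined at the end.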

This module teaches the use of binary trees to sort through large data sets, different traversal methods for binary trees, including parallel methods, and how to scale a binary tree traversal on multiple compute cores.

Simulates the evolution of simple and complex forms of life based on simple rules.

  • Serial Version
  • Shared Memory Version (OpenMP)
  • Distributed Memory Version (MPI)
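Assuming this entry refers to Conway's Game of Life, one generation of the serial version can be sketched as below. The wrap-around edges and the function name are assumptions made for this example, not details taken from the module.

```python
def life_step(grid):
    """Return the next generation of a grid of 0 (dead) and 1 (alive) cells.

    Edges wrap around (an assumption), so every cell has eight neighbors.
    """
    rows, cols = len(grid), len(grid[0])
    new_grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            neighbors = sum(
                grid[(r + dr) % rows][(c + dc) % cols]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            )
            if grid[r][c] == 1 and neighbors in (2, 3):
                new_grid[r][c] = 1  # a live cell survives
            elif grid[r][c] == 0 and neighbors == 3:
                new_grid[r][c] = 1  # a dead cell becomes alive
    return new_grid

# A "blinker": three cells in a row that flip between horizontal and vertical.
blinker = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
after_one = life_step(blinker)
after_two = life_step(after_one)
```

Because each new cell depends only on the previous generation, the rows of the grid can be divided among threads or processes, with neighboring rows exchanged between steps in a distributed version.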

Epidemiology is the study of infectious disease. Infectious diseases are said to be "contagious" among people if they are transmittable from one person to another. Epidemiologists can use models to assist them in predicting the behavior of infectious diseases. This module will develop a simple agent-based infectious disease model, develop a parallel algorithm based on the model, provide a coded implementation for the algorithm, and explore the scaling of the coded implementation on high performance cluster resources.

This module presents the sieve of Eratosthenes, a method for finding the prime numbers below a certain integer. One can model the sieve for small integers by hand. For bigger integers, it becomes necessary to use a coded implementation. This code can be either serial (sequential) or parallel. Students will explore the various forms of parallelism (shared memory, distributed memory, and hybrid) as well as the scaling of the algorithm on multiple cores in its various forms, observing the relationship between run time of the program and number of cores devoted to the program. An assessment rubric, two exercises, and two student project ideas allow the student to consolidate her/his understanding of the material presented in the module.
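A serial version of the sieve can be sketched in a few lines. This is an illustrative Python sketch with a hypothetical function name, not the module's own implementation.

```python
def primes_below(limit):
    """Return all primes smaller than `limit` using the sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_prime = [True] * limit
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p < limit:
        if is_prime[p]:
            # Cross out multiples of p, starting at p*p because smaller
            # multiples were already crossed out by smaller primes.
            for multiple in range(p * p, limit, p):
                is_prime[multiple] = False
        p += 1
    return [n for n, flag in enumerate(is_prime) if flag]

small_primes = primes_below(30)
```

The crossing-out loop for a given prime touches independent array entries, which is one place where the work can be divided among cores.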

BW-UPEP module introducing the BLAST similarity search tool, its algorithm, and performance considerations. Serial and MPI versions of BLAST are benchmarked in this exercise.

This module teaches the basic principles of semi-classical transport simulation based on the time-dependent Boltzmann transport equation (BTE) formalism. It covers performance considerations for parallel implementations of multi-dimensional transport simulation, and numerical methods for efficient and accurate solution of the BTE for both electronic and thermal transport, using simple finite difference discretization and the stable upwind method.

This module teaches the principles of Fourier spectral methods, their utility in solving partial differential equations, and how to implement them in code. Performance considerations for several Fourier spectral implementations are discussed, and methods for effective scaling on parallel computers are explained.

BW UPEP Module from the Earlham College Cluster Computing Group, led by Charlie Peck.

Markov Chains have numerous applications in biology from ecology to bioinformatics. This module will explore some of these applications along with the need for high performance computing in solving some of the problems.

The purpose of this module is to:

  • Understand some fundamental physical processes governing sand movement in rivers
  • Implement a model for simulating these processes using C
  • Learn how to visualize these simulations using Paraview
  • See how massive simulations can be achieved using supercomputers

Blue Waters Undergraduate Petascale Education Module exploring the computational issues involved with scaling the size of a simulated n-body system.

This module gives a basic introduction to the CUDA architecture and programming model, OpenGL for 3D graphics, and the interoperability between the two for interactive, high performance scientific visualization.

Social Networks Module developed for the Undergraduate Petascale Education Program.

Download link for standalone BLAST, produces errors in Standard XML.

Module covering stochastic optimization.

This module will:

  • Introduce the suffix tree data structure and its many applications in string matching and bioinformatics
  • Describe how suffix trees are built on a serial computer
  • Discuss the challenges associated with building the tree in parallel
  • Explain one application in bioinformatics (pattern matching) that uses suffix trees
  • Develop a method to implement pattern matching on a distributed memory parallel computer
  • Describe how to analyze parallel performance and identify improvements

  • High Throughput Computing
  • GPGPU: Number Crunching in Your Graphics Card
  • Grab Bag (Scientific Libraries, I/O Libraries, Visualization)

Reviews and Refreshers

A brief refresher to the C language. Overviews some of the basic information needed to create programs in C.

Distributed Memory

A collection of code samples illustrating the ideas behind distributed memory parallelism.

A collection of presentation slides and notes discussing distributed memory parallelism.

General

A collection of code samples, mostly non-parallel, that can be used as the basis for exploring Petascale applications, parallel programming, and High Performance Computing.

A collection of presentations and notes discussing Petascale, High Performance Computing, and Computational Thinking.

Shared Memory

A collection of code samples illustrating shared memory programming with OpenMP.

A collection of presentation slides and notes discussing shared memory parallelism.

Data Intensive Apps and Data Management

BLAST is a tool for finding regions of local alignment between sequences.

Hybrid Shared-Distributed Memory

A collection of presentations about creating codes that are hybrids of different parallel programming models (e.g., MPI, OpenMP, and CUDA).

Tools


High Performance Linpack

Scalable Molecular Dynamics

Network Common Data Form.

Introduction to Parallel MATLAB.

Materials to introduce TotalView

GPGPU

Code examples and related materials for GPGPU.

Presentation materials (slides and notes) related to GPGPU.

Visualization

OpenACC

For OpenACC

BW2015

txt file from Galen Arnold

Hardware and Memory Hierarchy Presentation

Excel file showing first few steps of a heat flow model's vectors and the dependencies therein

Where Aaron ended up during the session creating a "Hello, World" that uses MPI+CUDA

Commands and code from the netCDF session at the 2015 Petascale Institute

Paraview Tutorial by Wesley Yu and Dinushka Herath
