Work So Far
So far, we have created a detailed specification for our project, including the user-facing API and the internal structures we will be using. We have also mostly completed a sequential implementation of the BDD library, which allows the user to construct BDDs using logical operators and perform simple analyses such as "one SAT" (find one satisfying assignment for the BDD) and "count SAT" (count the number of satisfying assignments). Our BDDs are maintained in a canonical form in order to save memory.
We also gave considerable thought to the design of the parallel hash table that will back the unique BDD table and the cache. This task turned out to be much more difficult than we initially expected, for a few reasons:
- The hardware we had access to does not support 128-bit compare-and-swap, which makes it significantly harder to implement a lock-free hash table, since we need to compare-and-swap a pointer together with a counter.
- We realized that we would have to implement our own garbage collector over the cache, which means we would need to maintain reference counts in order to keep the memory footprint manageable.
We also spent a good amount of time trying to implement our own fork-join parallelism model to handle work distribution. However, this turned out to be a dead end: implementing such a system is well beyond the scope of a 418 project, especially when combined with the rest of the project. We ended up deciding to use the Cilk library instead.
Updated Schedule
- Verify the correctness of the sequential algorithm by implementing n-Queens (Chris)
- Finish work on the sequential RelProd algorithm, which is used for model checking (Chris)
- Finish the lock-free hash table for the cache and uniques table (Matt)

4/25:
- Finish the reachability algorithm for model checking (Chris)
- Finish the cache and uniques table by adapting the lock-free hash table to the needs of the library (Matt)

4/29:
- Write test cases to verify the correctness of the model checker (Chris)
- Finish adding fork-join parallelism via Cilk (Matt)

5/2:
- Benchmark against BuDDy and test the library to ensure correctness, debugging as necessary (both)
- Gather data on the parallelism of our system, including speedup and memory footprint (both)

5/6:
- Final fine-tuning (both)
- Finish the final presentation for our project (both)
- Begin practicing the presentation (both)

5/9:
- Finish the final writeup (both)
Though we have given it much thought and done background research to ensure that it is reasonable, we have yet to actually test the code on a Xeon Phi using Cilk. Our hope is that Cilk will cover our fork-join needs, though we are a bit concerned that we will hit unforeseen roadblocks when actually running on the Phis. Our backup plan is to run the code on the GHC machines, which will still give us benchmarks, though we may not observe as large a speedup.
Another big concern is that we are somewhat blindly implementing BDD algorithms according to a few research papers we have read. So far, things have been going reasonably well, but if we eventually hit a correctness bug, it will be incredibly difficult to debug, since the algorithms in the papers are quite complex yet presented only at a high level. We will try to debug by manually inspecting the BDDs, though this becomes infeasible as we scale up. We may need help from Professor Bryant if we cannot solve an issue ourselves.
For our final presentation we hope to show:
- Implementing a simple BDD with our library, to show how easy it is to use.
- Performance benchmarks for creating and operating on large BDDs, compared against existing sequential libraries.
- STRETCH: A demonstration of a model checker verifying the correctness of a piece of parallel code (e.g. deadlock detection).