SCC members,

Last week we had a meeting of the "Lattice Hadron Physics Collaboration"
(24 physicists from 14 institutions including Jlab, MIT, and BU). We began
to discuss a more uniform approach to software for this group. Here are
some of our initial thoughts.

==============================================================================
QCD-API
==============================================================================

I described the API as being layered, keeping a flexible model at least
for now to accommodate the different styles in CPS/MILC/SZIN.

Level 1: Single node Linear Algebra Routines: SU(3), gamma algebra, etc.
         (For example, MILC has several hundred in \library written in C,
         m4 and a variety of assembly languages. A "single name space"
         leads to rather horrible names! What are the semantics common to
         MILC/CPS/SZIN?)

     1.5 Data parallel implementation hiding the per sublattice structure.

Level 2: Communication primitives (equivalent to a very small subset of MPI).
         (These will be low level routines for QCDOC & Myrinet, ...
         callable from MPI or MPI-like syntax: Send/Receive, fast nearest
         neighbor shift operations, global sums, etc.)

     2.5 Data parallel implementation hiding the separation between
         intranode indexing (and threads if necessary) and internode
         messages.

Level 3: Overlapping QCD Computation/Communication primitives.
         (e.g. SU3_mult_Shift operations: psi_x <---- U(x,mu) psi_x+mu.
         All basic ingredients to build complete Dirac operators,
         Hybrid MC, etc.)

Level 4: Complete Dirac inverters, gauge fixing, FFT, ...
         (Where does this stop? Include sources, data analysis tools,
         data format conversions, lattice layout remapping primitives, etc.)

Level 5: File I/O and execution environment?

------------------------------------------------------------------------------

COMMENT: (Please make additions and corrections)

It would be useful to convert this layered list into a block picture of
code modules --- next iteration.
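
To make the layering concrete, here is a minimal C sketch of how a Level 3
operation of the form psi_x <---- U(x,mu) psi_x+mu could be composed from a
Level 1 routine and a Level 2 shift. Everything in it is an assumption for
illustration only: the type and function names (su3_matrix, color_vector,
su3_mult_vec, shift_forward, su3_mult_shift) are hypothetical and are not
taken from MILC, CPS or SZIN; the lattice is one-dimensional and periodic
on a single node; spin indices are dropped; and the shift is a plain copy,
so nothing is actually overlapped with communication.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical types: one SU(3) link matrix and one color vector per site. */
typedef struct { double complex e[3][3]; } su3_matrix;
typedef struct { double complex c[3]; } color_vector;

/* Level 1: single-site linear algebra, out = U * v. */
void su3_mult_vec(color_vector *out, const su3_matrix *U,
                  const color_vector *v)
{
    for (int i = 0; i < 3; i++) {
        double complex sum = 0.0;
        for (int j = 0; j < 3; j++)
            sum += U->e[i][j] * v->c[j];
        out->c[i] = sum;
    }
}

/* Level 2 stand-in: nearest-neighbor shift on a periodic 1-d lattice,
   out[x] = in[x+1]. On a parallel machine the wrap-around sites are
   where messages between nodes would be sent. */
void shift_forward(color_vector *out, const color_vector *in, size_t n)
{
    assert(n > 0 && out != in);
    for (size_t x = 0; x < n; x++)
        out[x] = in[(x + 1) % n];
}

/* Level 3: combined operation, result[x] = U[x] * psi[x+1].
   tmp is caller-provided scratch space of length n. */
void su3_mult_shift(color_vector *result, const su3_matrix *U,
                    const color_vector *psi, color_vector *tmp, size_t n)
{
    shift_forward(tmp, psi, n);
    for (size_t x = 0; x < n; x++)
        su3_mult_vec(&result[x], &U[x], &tmp[x]);
}
```

A quick sanity check: with every U[x] set to the identity, su3_mult_shift
just returns psi shifted by one site. An application written only against
su3_mult_shift would not need to know whether the shift underneath is a
copy, as here, or a real message.
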

At this early stage it appears that the different platforms and software
strategies may regard different layers as being essential to the QCD-API
and to high performance. Sorting this all out is of course the major goal
of our grant!

For the SZIN model, Level 3 appears to be the basic Application Programmer's
"interface". It is hoped that new methods (e.g. new Dirac actions) can be
built directly at this level with few extensions and without appreciable
loss in efficiency.

As I understand it, in CPS the best routines are now built directly at
Level 4. Extensions to Level 4 are inevitable as we build a common
"share ware" code base. How do we share these QCD application code units?

MILC appears more often to reach down into Levels 1 and 2 directly. On the
other hand, it makes use of predefined "data parallel" computation and
messaging primitives to hide the annoying details of iteration loops and
sublattice decomposition. This is probably a very useful non-overlapping
version of what I called Level 3.

==============================================================================
SZIN to Q Upgrade
==============================================================================

The Jlab/MIT group discussed a development path from SZIN, replacing the
m4 macros by a new interface (actually a "compiler" written in Scheme
called Q). At the same time the first priority is to continue to develop
level routines to run QCD efficiently on the Phase I cluster being
configured this summer at Jlab and FNAL and to share this effort with
Fermilab. We are all aware that physics production must continue unabated
as we develop the software infrastructure.

This is in the very early stages, but we set up a "straw man" to shoot at.

1. Early Summer 2001: Robert Edwards: Write a manual for SZIN (as is) so
   that all of the "Lattice Hadron Physics Collaboration" can use it and
   give feedback on its design.

2. Summer 2001: Andrew Pochinsky: Continue to develop enough of Q to have
   a side by side comparison in early fall for one or two applications.

3. Xmas 2001: Beta-testing of the Q system. User feedback before the
   go-ahead on the m4 to Q conversion.

4. Mid 2002: Translate all SZIN applications into Q, document Q and
   encourage others to write in and/or extend Q itself to new applications.

COMMENTS:

One feature of the Q approach is its ability to be linked to optimizer
routines that directly build processor specific Level 1 C or assembly
routines. This might replace the large library (and its large name space)
by compile time code generation.

Another feature of SZIN which will not change is that the application code
(up to the m4 ----> Q "calls") is written in C, and therefore any C
application can be patched in directly. For example, a useful MILC
application might be used to save the initial effort at integration.

One question, among many, that was raised is whether there is a way to
co-ordinate MILC's plans and SZIN's plans so as to head eventually to a
single cluster code base.

Finally, the ongoing optimization of primitives at Levels 1 and 2 is quite
independent of this higher level. This is a shared process amongst all the
"MILC/CPS/SZIN" advocates.

Comparison of SZIN versus Q syntax:

  // Q
  // Here is a single node level with single site type operations
  // Basically, this is Level 1 above

  // Tangent parallel transport with Wilson factor
  Uq = proc(out Field(Fermion) result,
            int num, int d, Index(uq) idx,
            Field(Gauge) U,
            Field(Fermion) psi)
  {
    forall(i = 0 : num)
      result[i] = U[i][d]*(1+Gamma[d])*psi[idx[i].forwardF[d]];
  }

  // Proposed for Level 3
  Uq = proc(out Field(Fermion) result,
            int d,
            Field(Gauge) U,
            Field(Fermion) psi)
  {
    result = U[d]*(1+Gamma[d])*shift(psi,d);
  }

  // Current SZIN syntax equivalent of Level 3
  SPIN_PROJECT(tmp, psi, d);
  MULTIPLY(result, U(d), COMMUNICATE(tmp,d));

  // Straightforward change to just
  MULTIPLY(result, U(d), COMMUNICATE(SPIN_PROJECT(psi,d),d), REPLACE);

Rich