SMSCBOF
Introduction
The SC08 Blue Gene System Management Community Birds of a Feather Meeting is 5:30-7:00 pm US Central Time on Tuesday, November 18, 2008, in rooms 11A/11B.
This is meant to be a discussion forum with representatives from: Argonne National Laboratory, Juelich Research Centre, Lawrence Livermore National Laboratory, IBM, and Brookhaven National Laboratory/Stony Brook University.
We are still asking people to come prepared to talk for no more than 10 minutes about their configurations and the directions they plan to take their sites in. We'll then turn to discussing key issues. To prevent duplication and to share ideas in advance, we've set up this wiki.
Schedule
This is a rough schedule and nothing in it is fixed yet. Feel free to make suggestions and changes.
- 5:30-5:40 Opening remarks and introductions - Susan Coghlan, ALCF
  - Opening remarks
  - Community
  - SP-XXL, 2 slides, PPT
  - Consortium
  - Site Panelist Introductions
- 5:40-6:30 Site Presentations
- 6:30-6:55 Panel Issues Discussion w/ Stump the Experts
- 6:55-7:00 Round-up
  - e-mail list admins-wg@bgconsortium.org
  - wiki - William to ask for wiki.bgconsortium.org
  - Thank yous
Site Presentations
Configuration Details
- Model(s) / Driver(s)
- Size
- Topology
- Queuing System
- Network setup
- File System details
- OSes
- Workload description
- Monitoring / Notification System
- Other things that set your system(s) apart
Issues
- brief discussion of issues faced
Directions
- Cool tools under development
- Expansion plans
Panel Discussion Issues
Juelich Research Centre
Speaker: Jutta Docter (j.docter@fz-juelich.de)
Team lead of Blue Gene/P system administration
Experience with various supercomputers (IBM, Cray, Intel, ...) and systems (BG/P, BG/L, AIX, LoadLeveler, ...)
- RAS events
- diagnostics
- automatic monitoring
- hardware stability
- software support
- software enhancements
- documentation
Argonne
- support system integration
- diagnostics
- failure management
- Navigator / CLI tools
- monitoring
LLNL
- Cross compile environment & autoconf
- Interpreted languages
- Shared memory
- Effective use of 2nd core
- Effective use of SIMD floating point unit
- Scalability of Ethernet for parallel filesystems
Brookhaven National Lab / Stony Brook
Speaker: Nicholas D'Imperio (dimperio@bnl.gov)
Blue Gene Systems Coordinator, team lead for System Administration,
Applications Support, and System Software Development.
- LoadLeveler Issues
- User Accounting
- Offsite Cross Compiling
- Future Projects
