CS 378 Programming for Performance, Fall 2014
- Class CS 378 Programming for Performance (Unique Number 53153)
- Instructor Andrew Lenharth
- Lecture M,W 12:30-2:00 GDC 5.302
- Office Location ACES 4.124
- Office Hours Monday, Tuesday 4-5
- TA Soumyajit Gupta
- TA Office Location TA room desk 3 (sometimes 5)
- TA Office Hours Tuesday, Thursday 11-1, Friday 2:30-4:30
- 12/3 -- Final in GDC 1.304 Wed Dec 10 2-5
- 11/24 -- No class 11/26
- 10/22 -- HW4 updated with a much smaller number of nodes (because you probably don't have 100s of GB of RAM).
- 10/15 -- Send your code and writeup to the TA for Project 2 by email as normal. Sample outputs can either be compressed (mp3, etc) and sent, or truncated (say 3-4 seconds) and compressed (bzip/gzip) and sent, or (preferably) copied to /work/02118/lenharth/pr2/ on stampede (include your name or uid in the filenames). Do not email hundreds of MB.
- 10/14 -- On stampede, you can run 'idev -m 30' to get an interactive compute node for program development or timing. The value after '-m' is the number of minutes it will reserve for you. Please do this rather than run on the login nodes.
- 10/13 -- Project 2 due Wednesday 10/15
- 10/2 -- Often the first step in vectorization is to unroll a loop by the width of the vector. __m128 stores 4 floats, so try unrolling by 4. Once the loop is unrolled, it is usually easier to see where the vector operations go.
- 10/1 -- Use -march=native
- 9/24 -- I will respond to questions on Piazza when it emails me and lets me know about them. Emailing me directly is always a good option.
- 9/24 -- See extra resources or Piazza for links to previous semester's version of the PAPI-related homeworks, which have example PAPI code.
- 9/22 -- PAPI_LST_INS may not work on stampede. PAPI_LD_INS + PAPI_SR_INS should be equivalent. 'papi_avail' will show which counters are supported on the machine you are using. The numbers for the first couple of counters are easy to estimate, so you can verify that the results are believable (the growth curve especially).
- 9/22 -- Project 1's performance numbers updated. Target is 3 GFLOPS at 3000x3000.
- 9/17 -- Sample job submission script for TACC. This is the documentation for running on stampede. Please use the serial queue to submit jobs.
- 9/17 -- Homework 2 now due Saturday, midnight
- 9/10 -- Use the TACC user portal to create a TACC user account. Email the TACC account name to the TA so he can add you to the class allocation.
- 9/8 -- New TA
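
The 9/22 target of 3 GFLOPS at 3000x3000 can be turned into a wall-clock budget. The sketch below assumes the usual 2n^3 operation count for a dense n x n multiply (one multiply and one add per inner-loop step). The function names are illustrative only, and the project handout may count operations differently.

```c
#include <stddef.h>

/* Assumed operation count for a dense n x n matrix multiply:
   n^3 multiplies plus n^3 adds. */
double matmul_flops(size_t n) {
    double d = (double)n;
    return 2.0 * d * d * d;
}

/* Achieved GFLOPS, given the measured run time in seconds. */
double gflops(size_t n, double seconds) {
    return matmul_flops(n) / seconds / 1e9;
}

/* Run time in seconds needed to reach a target GFLOPS rate. */
double seconds_for_target(size_t n, double target_gflops) {
    return matmul_flops(n) / (target_gflops * 1e9);
}
```

Under that assumption, a 3000x3000 multiply is 54 billion operations, so reaching 3 GFLOPS means finishing in roughly 18 seconds.
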
- Homework 1: Timers 9/11 Midnight
- Homework 2: Performance Counters 9/20
- Project 1: Matrix Multiply 9/25
- Homework 3: Vector Loops
- Project 2: HRTF 10/15 (date changed). HRTF data coordinate system, HRTF data, sample input (stereo file is interleaved LRLRLR; separate files for the left and right channels are also included if you would rather use those)
- Homework 4: Graph Representations 10/27
- Project 3: SSSP 11/10 (Groups of 3!)
- Homework 6: Proposal 11/18
- Project 4: Scaling 12/1
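
The unrolling step from the 10/2 note can be sketched on a simple element-wise add. This is an illustrative example, not the homework's loop: add_scalar and add_unrolled4 are made-up names, and the code stays in plain C so it runs without SSE support. Each group of four statements marks where a single __m128 load, add, and store (for example _mm_loadu_ps, _mm_add_ps, and _mm_storeu_ps from xmmintrin.h) would go.

```c
#include <stddef.h>

/* Baseline: one float per iteration. */
void add_scalar(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

/* Unrolled by 4, the number of floats in one __m128. */
void add_unrolled4(const float *a, const float *b, float *out, size_t n) {
    size_t i = 0;

    /* Main loop: four independent adds per iteration. These four lines
       are the candidates for one vector load/add/store. */
    for (; i + 4 <= n; i += 4) {
        out[i + 0] = a[i + 0] + b[i + 0];
        out[i + 1] = a[i + 1] + b[i + 1];
        out[i + 2] = a[i + 2] + b[i + 2];
        out[i + 3] = a[i + 3] + b[i + 3];
    }

    /* Cleanup loop: handles the last n % 4 elements. */
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

The cleanup loop matters whenever the length is not a multiple of 4; dropping it is a common source of wrong results after vectorizing.
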