Performance and Measurement, Part 2
Measurement
Last time we looked at:
Producing Wrong Data Without Doing Anything Obviously Wrong! Todd Mytkowicz, Amer Diwan, Matthias Hauswirth, and Peter F. Sweeney. ASPLOS 2009.
445 references
- Measurement bias is significant
- Changing aspects of an experimental setup can introduce measurement bias. Measurement bias is unpredictable and there are no obvious ways to avoid it. Prior work in computer system evaluation does not adequately consider measurement bias.
- The paper discusses two techniques for dealing with measurement bias: experimental setup randomization and causal analysis.
- Measurement bias occurs for all benchmarks and architectures.
- Measurement bias due to link order can be large enough to change conclusions.
- Measurement bias due to UNIX environment size can lead to conflicting conclusions.
- To avoid measurement bias, it is important to use diverse evaluation workloads, randomize the experimental setup, conduct causal analysis, and collect more information from hardware manufacturers.
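Experimental setup randomization, the first of the paper's two techniques, can be sketched roughly as follows. This is a minimal illustration, not the paper's own harness: it varies only the UNIX environment size between trials, the variable name `PADDING_FOR_BENCH` and the helper names are made up for this example, and the command being timed is whatever the caller passes in.

```python
import os
import random
import subprocess
import sys
import time


def random_env(rng, max_padding=4096):
    """Copy the current environment and add a dummy variable of random length.

    PADDING_FOR_BENCH is a hypothetical name; its only job is to change
    the total environment size seen by the child process.
    """
    env = dict(os.environ)
    env["PADDING_FOR_BENCH"] = "x" * rng.randrange(max_padding)
    return env


def time_once(cmd, env):
    """Run cmd once under env and return the wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, env=env, check=True, stdout=subprocess.DEVNULL)
    return time.perf_counter() - start


def randomized_trials(cmd, trials=10, seed=0):
    """Time cmd `trials` times, each under a different random environment size."""
    rng = random.Random(seed)
    return [time_once(cmd, random_env(rng)) for _ in range(trials)]
```

For example, `randomized_trials([sys.executable, "-c", "sum(range(100000))"], trials=5)` returns five timings of a small placeholder workload. Running both the baseline and the optimized program through the same loop means a conclusion no longer rests on one arbitrary environment size. Link order would need a similar loop around the build step, which is not shown here.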
A sample blog post about this paper.
Another example:
Strangely, Matrix Multiplications on GPUs Run Faster When Given “Predictable” Data!
SIGPLAN Empirical Evaluation Guidelines
Here are the criteria by Berger, Blackburn, Hauswirth, and Hicks (2018):
- Clearly Stated Claims
  - Explicit Claims
  - Appropriately-Scoped Claims
  - Acknowledges Limitations
- Suitable Comparison
  - Appropriate Baseline for Comparison
  - Fair Comparison
- Principled Benchmark Choice
  - Appropriate Suite
  - Non-Standard Suite(s) Justified
  - Applications, Not (Just) Kernels
- Adequate Data Analysis
  - Sufficient Number of Trials
  - Appropriate Summary Statistics
  - Report Data Distribution
- Relevant Metrics
  - Direct or Appropriate Proxy Metric
  - Measures All Important Effects
- Appropriate and Clear Experimental Design
  - Sufficient Information to Repeat
  - Reasonable Platform
  - Explores Key Design Parameters
  - Open Loop in Workload Generator
  - Cross-Validation Where Needed
- Presentation of Results
  - Comprehensive Summary Results
  - Axes Include Zero
  - Ratios Plotted Correctly
  - Appropriate Level of Precision
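The "Adequate Data Analysis" items can be made concrete with a small sketch. The code below is one possible way to summarize repeated timings, not something the guidelines prescribe: it reports the spread alongside the center, attaches a percentile-bootstrap confidence interval to the median, and uses a geometric mean for normalized ratios such as speedups. All function names are invented for this example.

```python
import math
import random
import statistics


def bootstrap_ci(samples, stat=statistics.median, reps=2000, alpha=0.05, seed=0):
    """Approximate percentile-bootstrap confidence interval for stat(samples)."""
    rng = random.Random(seed)
    n = len(samples)
    # Resample with replacement, recompute the statistic, and sort the results.
    estimates = sorted(stat(rng.choices(samples, k=n)) for _ in range(reps))
    lo = estimates[int((alpha / 2) * reps)]
    hi = estimates[int((1 - alpha / 2) * reps) - 1]
    return lo, hi


def geometric_mean(ratios):
    """Geometric mean of positive ratios (for example, per-benchmark speedups)."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


def summarize(times):
    """Report center, spread, and extremes rather than a single number."""
    return {
        "n": len(times),
        "median": statistics.median(times),
        "mean": statistics.mean(times),
        "stdev": statistics.stdev(times),
        "min": min(times),
        "max": max(times),
        "median_ci95": bootstrap_ci(times),
    }
```

As a usage note on ratios: for one benchmark with a 2.0x speedup and another with a 0.5x slowdown, the arithmetic mean of the ratios is 1.25, which suggests an improvement, while `geometric_mean([2.0, 0.5])` is 1.0 up to rounding, which reflects that the two effects cancel.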