Testing

  • Static Testing [at compile time]
    • Static Analysis
    • Review
      • Walk-through [informal]
      • Code inspection [formal]
  • Dynamic Testing [at run-time]
    • Black-box testing [external state]
    • White-box testing [internal state]
  • Commonly, testing refers to dynamic testing

Complete Testing?

  • Poorly defined terms: “complete testing”, “exhaustive testing”, “full coverage”
  • The number of potential inputs are infinite
  • It is impossible to completely test a nontrivial system
    • Practical Limits: complete testing is prohibitive in time and cost [e.g., 30 branches, 50 branches, …]
    • Theoretical Limits: e.g. halting problem
  • We need formal testing criteria

Test Case

  • Test Case: [informally]
    • What you feed to software; and
    • What the software should output in response
  • Test Set: a set of test cases
  • Test Case: input values, expected results, prefix values, postfix values necessary to evaluate software under test
  • Expected Results: the result that will be produced when executing the test iff the program satisfies its intended behavior

Test Requirement & Coverage Criterion

  • Test Requirement: a specific element of a software artifact that a test case must satisfy or cover
    • ice cream cone flavors: vanilla, chocolate, mint
    • test requirement: test one chocolate cone
    • TR denotes a set of test requirements
  • A coverage criterion is a rule or collection of rules that impose test requirements on a test set
    • Coverage criterion is a recipe for generating TR in a systematic way
    • Flavor criterion [cover all flavors] More general
    • TR =

Measuring Test Sets

  • How do you know how good a test set is?
    • Testing an ice cream stand:
    • Test set 1:
      • 3 chocolate cones, 1 vanilla cone
    • Test set 2:
      • 1 chocolate cone, 1 vanilla cone, 1 mint cone

Coverage

  • Coverage: given a set of test requirements, , for a coverage criterion, , a test set satisfies iff for every test requirement , at least one satisfies

Coverage Level

  • Coverage Level: given a set of test requirements, TR, and a test set T, the coverage level is the ratio of the number of test requirements satisfied by T to the size of TR
    • TR =
    • Test set 1, T1 = { 3 chocolate cones, 1 vanilla cone }
    • Coverage level = 2/3 = 66.7%
  • Coverage levels help you evaluate how good a test set it, especially in the presence of infeasible test requirements

Graph Coverage

  • N: set of nodes
  • : set of initial nodes
  • : set of final nodes
  • : set of edges, e.g. and ;
    • is the predecessor and is the successor in

Path

  • A path is a sequence of nodes from a graph, G, whose adjacent pairs all belong to the set of edges, E, of G

Test Path

  • A test path is a path [possible of length 0] that starts at some node in and ends at some node in

Subpath

  • A subpath is a subsequence of a path
    • all edges in the subpath must connect consecutive nodes without skipping any in between from the path
  • is not a subpath of , but is

Connecting Test Cases and Test Paths

  • Connect test cases and test paths with a mapping from test cases to test paths
    • e.g., is the set of test paths corresponding to test case t
    • Usually, just write path, a is obvious from the context
    • Lift the definition of path to test set by defining
    • Each test case gives at least one test path
      • if the software is deterministic, then each test case gives exactly one test path;
      • otherwise, multiple test paths may arise from one test case

Paths & Semantics

  • Some paths in a control flow graph may not correspond to program semantics
    • example: the path followed by a true result from if(false)

Syntactical and Semantic Reachability

  • A node is syntactically reachable from if there exists a path from to

  • A node is semantically reachable if one of the paths from to can be reached on some input

  • Standard graph algorithms, like BFS and DFS, can compute syntactic reachability

  • Semantic reachability is undecidable

Node Coverage

  • For each node TR contains a requirement to visit node n
  • Node Coverage [NC]: _ contains each reachable node in
  • TR =

Edge Coverage

  • Edge Coverage [EC]: contains each reachable path of length up to 1, inclusive, in

Subsumption

  • Criteria Subsumption: a test criterion subsumes if and only if every set of test cases that satisfies criterion also satisfies
  • Must be true for every set of test cases
  • Subsumption is a rough guide for comparing criteria, although it’s hard to use in practice

Why does Edge Coverage Subsume Node Coverage?

  • Visiting every edge does require visiting every reachable node
  • If a node is not reachable it must not be part of the graph

Node and Edge Coverage

  • Edge coverage is slightly stronger than node coverage
  • NC and EC are only different when there is an edge and another subpath between a pair of nodes [ as in an “if-else” statement]

Edge Pair Coverage

  • Edge-Pair Coverage [EPC]: TR contains each reachable path of length up to 2, inclusive, in G

Coverage Tools

  • C/C++: Gcov, …

  • Java: JCov, JaCoCo, EclEmma, Covertura, Clover, JMockit, PIT, …

  • Python: Coverage.py, …

  • No good tool to measure coverage across multiple languages

EvoSuite

  • Examines Java byte code and automatically generates JUnit tests
  • A practical form of mutation testing
  • https://www.evosuite.org/

American Fuzzy Lop (AFL) & LibFuzzer

  • Both used by TensorFlow
  • AFL: “A security oriented fuzzer that employs a novel type of compile-time instrumentation and genetic algorithms to automatically discover clean, interesting test cases that trigger new internal stats in the targeted binary. This substantially Improves the functional coverage for the fuzzed code.‘”
  • LibFuzzer: “Tracks which areas of the code are reached, and generates mutations on the corpus of input data in order to maximize the code coverage.”

Simple Path

  • A path is simple if no node appears more than once in the path, except that the first and last nodes may b the same
  • Some properties of simple paths:
    • no internal loops;
    • can bound their length;
    • can create any path by composing simple paths;
    • many simple paths exist [too many!]
  • Examples:

Prime Path

  • Because there are so many simple paths, instead, consider prime paths, which are simple paths of maximal length
  • A path is prime if it is simple and does not appear as a proper subpath of any other simple path

Prime Path Coverage (PPC) & CPC

  • Prime Path Coverage [PPC]: TR contains each prime path in

  • There is a problem with using PPC as a coverage criterion: a prime path may be infeasible but contains feasible simple paths

    • How is this issue addressed?
  • Complete Path Coverage [CPC]: TR contains all paths in G