Error Localization

[!IMPORTANT!] This page is not up to date anymore. Current work on fault localization is being maintained on THIS WEBPAGE.

Implementation changes [DONE]

Here is the white board picture for our discussion on 6.June.2017
Click here for a larger version.

More types of statements [ALMOST DONE]

Should we handle assume statements separately?

Suppose we have an assume statement

To check if the assume statement is relevant or not, our algorithm checks if the following triple is unsatisfiable:

I think the triple will always be unsatisfiable, but the assume statement is never in the unsatisfiable core and hence never marked as relevant in our implementation.
Possible Solution:
Maybe we don't need to consider the relevancy of an assume statement . We should consider the branch after the assume statement. If the branch is relevant, then the assume statement is relevant.

Another possible criteria to determine the relevancy of an assume statement is :
is relevant if replacing this assume statement with (else) and the corresponding else branch does not lead us to the error anymore

2017-05-17 Discussion assume statements

do not try to find relevance criterion for assume statements
our analysis finds "only" error relevant value assignments
in a trace we cannot pinpoint the assume statement that causes the error -- the programmer could always introduce an assume false and the error would not be reachable on this trace
we take control flow structure of program as given and find only "faulty assignments"
TODO find good arguments (maybe in related work) why we do not analyze assumes

Why we don't analyze assume statements?

[Trying to provide some good arguments] We don't analyze assume statements because they don't have any impact on the state of the program. We are interested in transitions that can have an impact on the program states. We are interested in transitions that can potentially take the program in a state Q, such that if we execute the rest of the trace from the program state Q, the error is not reachable anymore.
A statement is relevant if changing it to havoc cause some assume to become restrictive. that means that the value we assign to the variable in the assignment is important to reach the error. Assume statements do not change the values of the variables.In our analysis we are interested in how different transition's effect on program states effect the reachability of error.

Call/return statements?

TODO check examples (better small ones with procedure calls, probably SV-COMP benchmarks form subcategory/folder recursive)
[Pending]

Comparison to related work [DONE]

Examples demonstrating the difference b/w our approach and existing ones:

Consider the following program:

x := 7
x := 7
assert(x >= 10)

Our algorithm says that the second statement is relevant but according to the Error Invariant Paper, since after the second statement, the error invariant is the same (inductive invariant) so the second statement is irrelevant but the first one is relevant.
But also intuitively the second statement should be reported relevant to the user because if we report the first statement as relevant and even if the user fix it, the program still has a bug in it.
In this example we have a different result and our result probably makes more sense.

Do others also consider assume statements?

At least not in the approach with error invariants.
In the error invariants paper, they calculate the error invariant for each location.
An invariant is a formula that is satisfied by all states that are reachable at the position by the execution of the trace till that point.
An error invariant is an invariant for a position in an error trace that only captures states that will still produce the error, if the execution of the trace is continued from that position.
Inductive error invariants are those error invariants that hold for consecutive positions in an error trace. They characterize statements in the trace that are not relevant for the error.

Several error traces [DONE]

Should we consider several error traces?

In the paper From Symptom to Cause: Localizing Errors in Counterexample Traces, the authors analyze multiple error traces.
They present a technique that localize an error cause in an error trace, report a single error trace per error cause, and generate multiple error traces per error cause.
Algo:
error trace t1 = [1,2,4,5]
correct trace t2 = [1,2,3,5,6,7,9]
Declare line 4 is the cause of the error.
Replace line 4 by halt (halt statement instructs the model checker to stop exploring paths throught the statement at line 4) [skip ?].
invoke the model checker again.
the model checker now returns a new error trace, t3 = [1,2,3,5,6,8,9].
Again compare t3 with the correct trace t2 and we see that line 8 is responsible for the error.
Another halt statement is introduced at line 8 and the model checker is invoked for the 3rd time. Now the model checker says there are no more error in the program.
In summary the author's algorithm automatically produces two error traces t1 and t3 with different causes for the error.

Proof of correctness of our Algorithm [DONE]

We proved that if an assignment statement Pi[i] of the form x:=t at position i is relevant then P => WP(Q, havoc(x)) does not hold. Where Q = not WP(false; pi[i+1, n]) and P = not WP(false; pi[i,n]).
But the way our algorithm actually works is, it checks if the hoare triple {P} pi[i] {Q} is valid (that is the triple (P, pi[i], -Q) is unsatisfiable ) and it checks if the statement x:=t is in the unsatisfiable core. If yes, then the statement is marked as relevant.

In case of the havoc statement, our algorithm checks if the hoare triple is satisfiable. If yes, then the havoc is relevant.

Add 2 lemmas stating:

If pi[i] is an assignment statement of the formx := t and the hoare triple {P} pi[i] {Q} is unsatisfiable and pi[i] is in the unsatisfiable core, then P => WP(Q, havoc(x)) does not hold.
if pi[i] is a havoc statement of the form havoc(x) and the hoare triple {P} pi[i] {Q} is satisfiable then P => WP(Q, havoc(x)) does not hold.

No need to state the lemmas. Just a paragraph explaining how we do it in practice

Terminology

(my email form 2017-02-14) How should we call the "normal" relevancy and the "Golden Frame" relevancy?

Some preliminary ideas.

"Assignment Relevancy" and "Havoc Relevancy"
"Restrictive Relevancy" and "Relaxing Relevancy"
"Input Relevancy"
Something that does not have "Relevancy" in its name.
"Error Enforcing" and "Error Admitting"
"Error Supporting"
"Error Fostering"
Something related to "Security"

Security Analysis

When is a bug a security bug?
One possible answer: When some (or only the last?) function whose result is influenced by user/network has the @ symbol. (Means that specific user input can trigger bug)

One task: Check the trunk/examples/programs/toy/faultlocalization/vadim/ examples that Vadim Mutilin form the Linux Verification Center has sent to us.

Improve Performance

Find bottlenecks in analysis of large counterexamples. Matthias: The bottleneck for the LDV examples seems to be the array quantifier elimination during the computation of WP for summaries (resp. the return statements).

The array quantifier elimination can be improved significantly (note for MH: explicit index-value comparison instead of simplifyDDA postprocessing).
Compute WP without explicit summaries (note for MH: use symbolic summary for return, compute preliminary WP up to call, instanciate symbolic summary, obtain final WP via post-processing).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly