Motivation

For any reasonably complex system, it is necessary to ensure that the system acts as intended, without unpredictable behaviours that may end up compromising the system and/or the systems surrounding it. There are many safety-critical systems whose malfunction can cause casualties and loss of life; for these systems we need to reliably prove that they produce predictable outputs for given inputs.

Formal Verification uses a range of mathematical tools to demonstrate the “correctness” of a system with respect to some desirable specification or property. The set of tools used for Formal Verification is collectively known as Formal Methods and includes theories such as logic, automata, type systems and formal languages. These can be used to check the behaviour of a system against a set of specifications in order to determine the correctness of the system. Formal Verification can be imagined as a layer of abstraction on top of systems (both software and hardware) that provides some guarantees as to the functioning of these systems.

Below we will be building up to the application of Formal Methods to the field of Machine Learning in order to Verify Neural Networks.

Note

The following is an effort to condense the Introduction to Neural Network Verification book in order to concisely summarize the material that is relevant to my research.

Introduction

Background

One of the first demonstrations of the possibilities of verification came from Alan Turing’s 1949 paper ‘Checking a large routine’. In the paper, a program for finding the factorial of a number was broken down into a simple sequence of instructions annotated with assertions that could be either $True$ or $False$. The truth values of these assertions were then checked in order to prove that the program is a faithful implementation of the mathematical factorial function. This is known as proving the functional correctness of a program. Although this is the gold standard for demonstrating correctness, it is generally not possible for Neural Networks, as the tasks they perform usually cannot be described mathematically. For a Neural Network we generally test for the following qualities:

Robustness: This class of properties is violated when small perturbations to the inputs result in changes to the output. It also measures how well a model performs on new and unseen test data.
Safety: This is a broad class of correctness properties that ensure that the program does not reach a ‘bad’ state. (e.g., a surgical robot should not be operating at 200mph)
Consistency: This class of properties is violated only when the algorithms are not consistent with the real world. (e.g., tracking an object in free fall should be consistent with the laws of gravitation)

We will mostly be testing for Robustness, as that is the set of properties that is most widely tested for. It can also encapsulate the Safety and Consistency classes (as there is no well-defined boundary between them) if the specifications are designed accordingly.

Note

Alan Turing had also thought extensively about ‘Networks that can Learn’, which can be seen from his 1948 paper Intelligent Machinery, where he proposes Boolean NAND networks that can learn over time.

Abstraction of Neural Network

We will not be dealing with the technicalities of Neural Networks, but will rather use well-rounded mathematical abstractions that can be rigorously tested. Since all Neural Networks can be thought of as Directed Acyclic Graphs (DAGs), we will be treating them as such. More specifically, they will be treated as dataflow graphs of operators over $R$. The shape of these graphs determines what specific architecture they belong to. Each node performs some computation, whose dependencies are the edges. Thus, a neural network will be associated with some function:
$f : R^{n} \to R^{m}$

For a Neural Network: $G = (V, E)$ , we have:
$V$ : finite set of nodes
$E \subseteq V \times V$ : set of edges
$V^{in} \subset V$ : input nodes
$V^{o} \subset V$ : output nodes
$n_{v}$ : total number of edges whose target is $v$
where $n = |V^{in}|$ and $m = |V^{o}|$ for the network.

All neural networks are finite DAGs. There are classes of neural networks called Recurrent Neural Networks (RNNs) that have loops, but the number of iterations depends on the size of the input: RNNs have self-loops that unroll based on the length of the input, which means that they reliably terminate. So while testing neural networks we will not test for loop termination and the explosion of possible program paths that comes with it.


For our purposes the following conditions must be met in a network:

all nodes must be reachable from some input node
every node can reach an output node
fixed total ordering on edges $E$ and another one on nodes $V$ .

Each node $v$ of the neural network is a function of the following form: $f_{v} : R^{n_{v}} \to R$

Each node takes in some vector, performs some computation on it and returns a single output number, which is then passed through an activation function (a non-linear function). For ease of analysis, the computation and the activation are treated as belonging to two different nodes. Each element of the input vector is the output of a previous node. These relations can be defined recursively, with the base case terminating at the input nodes. Therefore, for every non-input node $v \in V$ , we have:

$(v_{1}, v), (v_{2}, v), \dots, (v_{n_{v}}, v)$

Where each $(v_{i}, v)$ represents an edge connecting node $v_{i}$ to $v$ . The ordering of these edges and nodes determines which value is fed into which input of each computation. Each node has an output $out(v)$ defined as $out(v) = f_{v}(x_{1}, \dots, x_{n_{v}})$ , where $x_{i} = out(v_{i})$ for $i \in \{1, 2, \dots, n_{v}\}$ . The definition is recursive: substituting each input $x_{i}$ gives $out(v) = f_{v}(out(v_{1}), out(v_{2}), \dots, out(v_{n_{v}}))$
No sequence of operations is defined, only which nodes need which data to perform their computations. A modified version of these graphs is known as a computation graph. The $out(v_{i})$ values can be computed in any topological ordering of the graph nodes, since all the inputs of a node must be computed before the node itself. The total ordering ensures that graph nodes are referenced unambiguously and that computations are carried out in a valid sequence; below, the total ordering of nodes helps linearize the nodes into a topologically sorted sequence.

flowchart LR
1(( 1 ))
2(( 2 ))
3(( 3 ))
4(( 4 ))
5(( 5 ))
6(( 6 ))
7(( 7 ))
8(( 8 ))
9(( 9 ))
10(( 10 ))
11(( 11 ))
12(( 12 ))

1 & 2 & 3 & 4 --> 5 & 6 & 7
5 & 6 & 7 --> 8 & 9
8 & 9 --> 10 & 11 & 12

subgraph input
1
2
3
4
end

subgraph hidden1
5
6
7
end

subgraph hidden2
8
9
end

subgraph output
10
11
12
end
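
The recursive definition of $out(v)$ can be sketched in a few lines of Python. This is only an illustration under assumptions: the dictionary layout, the helper name `evaluate`, and the tiny two-input network are hypothetical and not taken from the book. It relies on the standard-library `graphlib.TopologicalSorter` (Python 3.9+) to visit each node after its predecessors.

```python
from graphlib import TopologicalSorter

def evaluate(nodes, edges, inputs):
    """Compute out(v) for every node of a dataflow graph.

    nodes:  maps each non-input node to its function f_v (takes a list of floats)
    edges:  maps each non-input node v to its ordered predecessors [v_1, ..., v_{n_v}]
    inputs: maps each input node to its value
    """
    out = dict(inputs)
    # static_order() yields a node only after all of its predecessors
    for v in TopologicalSorter(edges).static_order():
        if v in out:  # input node: its value is given
            continue
        out[v] = nodes[v]([out[u] for u in edges[v]])
    return out

# Hypothetical network: v3 = 2*x1 + x2, v4 = relu(v3)
nodes = {"v3": lambda x: 2 * x[0] + x[1], "v4": lambda x: max(0.0, x[0])}
edges = {"v3": ["v1", "v2"], "v4": ["v3"]}
result = evaluate(nodes, edges, {"v1": 1.0, "v2": -3.0})
```

The ordered predecessor lists play the role of the total ordering on edges: position $i$ in the list is input index $i$ of the node.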

All vector computations need to be linear: $f (x) = \sum_{i = 1}^{n} c_{i} x_{i} + b$ or piece-wise linear:

$$f(x) = \begin{cases} \sum_{i} c_{i}^{[1]} x_{i} + b^{[1]}, & x \in S_{1} \\ \quad \vdots \\ \sum_{i} c_{i}^{[m]} x_{i} + b^{[m]}, & x \in S_{m} \end{cases}$$

where $\cup_{i} S_{i} = R^{n}$ and $\cap_{i} S_{i} = \emptyset$ .

Note

Neural Networks are an instance of differentiable programs.

Defining Specifications

We define a language that specifies some properties about the functioning of a neural net. This will enable us to later on make statements and verify them based on the specifying language.

Specifications are generally of the form:

$$\{precondition\} \quad r \leftarrow f(x) \quad \{postcondition\}$$

where the precondition and the postcondition are statements that specify some property pertaining to the input and the output respectively. Properties dictate the input-output behaviour of the network (and not the internals). Specifications help in quantifying properties for accurate verification. Each specification can be thought of as being structured in the following way:

$$\underbrace{\text{for any inputs } x, y, \dots \text{ that } \dots}_{precondition} \text{ the neural network } G \text{ produces } \underbrace{\text{output that } \dots}_{postcondition}$$

Although not every specification needed for complete verification can be written down, multiple specifications can be combined to test for stronger properties. Preconditions are generally predicates (Boolean functions) defined over the set of variables that act as inputs to the system, and the postcondition is a Boolean predicate over the variables appearing in the precondition ( $x_{i}$ ) and the assigned variables ( $r_{i}$ ).
For any values of $x_{1}, \dots, x_{n}$ that make the precondition true, let $r_{1} = f(x_{1})$ , $r_{2} = g(x_{2}), \dots$ , where $f(), g(), \dots$ are the computations on the input; then the postcondition must also be true.

$$\begin{array}{l} \{precondition\} \\ r_{1} \leftarrow f(x_{1}) \\ r_{2} \leftarrow g(x_{2}) \\ \quad \vdots \\ \{postcondition\} \end{array}$$

If the postcondition is false, then the correctness property is not true, i.e., the property does not hold.

Consider: c is an actual greyscale image, each element of c is the intensity of a pixel $\in (0, 1)$ ; we can state the following specification about the brightness of c and its corresponding classification:

$$\begin{array}{l} \{|x - c| \leq 0.1\} \\ r_{1} \leftarrow f(x) \\ r_{2} \leftarrow f(c) \\ \{class(r_{1}) = class(r_{2})\} \end{array}$$

$r_{1}$ and $r_{2}$ are vectors whose elements are the probabilities of belonging to each class, with the labels corresponding to the indexes. $class(r_{1})$ and $class(r_{2})$ extract the indexes corresponding to the largest elements of the vectors $r_{1}$ and $r_{2}$ respectively.
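
To make the specification concrete, the sketch below searches for inputs that satisfy the precondition but break the postcondition by random sampling. This is testing, not verification: finding nothing proves nothing. The function `toy_f` is a hypothetical stand-in for a network, and the names `classify` and `find_counterexample` are made up for this illustration.

```python
import random

def classify(r):
    """class(r): index of the largest element of the output vector."""
    return max(range(len(r)), key=lambda i: r[i])

def find_counterexample(f, c, eps=0.1, trials=1000, seed=0):
    """Sample x with |x_i - c_i| <= eps, looking for class(f(x)) != class(f(c))."""
    rng = random.Random(seed)
    target = classify(f(c))
    for _ in range(trials):
        x = [ci + rng.uniform(-eps, eps) for ci in c]
        if classify(f(x)) != target:
            return x   # precondition holds, postcondition fails
    return None        # no violation found (NOT a proof of robustness)

# Hypothetical stand-in for a network: two scores from a 2-pixel "image"
def toy_f(x):
    return [x[0] + x[1], 1.0]

far = find_counterexample(toy_f, [0.9, 0.9])     # scores 1.8 vs 1.0: far from the boundary
near = find_counterexample(toy_f, [0.5, 0.52])   # scores 1.02 vs 1.0: close to the boundary
```

The gap between "no counterexample found" and "no counterexample exists" is exactly what the verification techniques in the following sections are meant to close.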

Info

Counterexamples: Valuations of the variables in the precondition that falsify the postcondition.

flowchart TD

A["Verification"]
B["Constraint Based"]
C["Abstraction Based"]

A --> B & C

Constraint-Based Verification

Constraint-Based Satisfaction

A correctness property is taken and encoded as a set of constraints, solving which will help us to decide whether the property holds or not.

Let $fv(F)$ be the set of free variables appearing in the formula $F$ .

Interpretation: $I$ of $F$ is a map from the variables present in $fv(F)$ to either $True$ or $False$ . $I(F)$ denotes the formula where the variables have been replaced with their corresponding interpretations.

Example:
Let $F ≜ (p \land q) \lor \neg r$ , which means that $F$ is syntactically defined to be equal to the Boolean formula (as opposed to being semantically equivalent).

$$\begin{aligned} fv(F) &= \{p, q, r\} \\ I &= \{p \mapsto True, q \mapsto False, r \mapsto True\} \\ I(F) &≜ (True \land False) \lor \neg True \end{aligned}$$

$eval(F)$ : denotes the simplest form of $F$ we can get by evaluating repeatedly.

$F$ is satisfiable (SAT) if there exists an $I$ s.t. $eval(I(F)) = True$ , in which case $I$ is a model of $F$ : $I ⊨ F$ .
$I \not\models F$ denotes that $I$ isn’t a model of $F$ .
$I \not\models F$ if and only if $I ⊨ \neg F$ .
$F$ is unsatisfiable (UNSAT) if $\forall I, eval(I(F)) = False$ .

Validity: If every interpretation $I$ is a model of $F$ , then $F$ is valid.
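
These definitions can be checked by brute force for small Boolean formulae. The sketch below represents a formula as a Python predicate over an interpretation (a dict from variable names to truth values) and enumerates all interpretations; the helper names are made up for this illustration, and the example formula is assumed to be $(p \land q) \lor \neg r$.

```python
from itertools import product

def models(formula, variables):
    """Yield every interpretation I (dict var -> bool) with eval(I(F)) = True."""
    for values in product([True, False], repeat=len(variables)):
        I = dict(zip(variables, values))
        if formula(I):
            yield I

def is_sat(formula, variables):
    return next(models(formula, variables), None) is not None

def is_valid(formula, variables):
    # F is valid iff its negation is unsatisfiable
    return not is_sat(lambda I: not formula(I), variables)

# Assumed example: F = (p and q) or not r
F = lambda I: (I["p"] and I["q"]) or not I["r"]
I = {"p": True, "q": False, "r": True}   # eval(I(F)) is False, so I is not a model
```

Enumeration is exponential in the number of variables, which is why practical solvers rely on algorithms such as DPLL instead.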

Boolean satisfiability theories (SAT) can be generalised to more complex theories to include reals, vectors, strings, arrays, etc.; the problem of determining whether statements within these theories are true is known as Satisfiability Modulo Theories or simply SMT. We will be extensively using a first-order logic system called Linear Real Arithmetic (LRA) as it can represent a large class of neural networks and is decidable.
In LRA, each propositional variable is replaced by a linear inequality of the form:

$$\sum_{i = 1}^{n} c_{i} x_{i} + b \leq 0 \quad \text{OR} \quad \sum_{i = 1}^{n} c_{i} x_{i} + b < 0$$

where $c_{i}, b \in R$ .

$$(\underbrace{x + y \leq 0}_{p} \land \underbrace{x - 2 y < 10}_{q}) \lor \underbrace{x \geq 100}_{r}$$

An interpretation $I$ of $F$ is an assignment of every free variable to a real number.

Encoding Neural Networks

We need to translate Neural Networks into a formula in LRA such that we can use SMT Solvers (specialised software that tests satisfiability for SMT problems).

$f_{v}$ : Function in node.
$I$ : Model.
$F_{v}$ : Encoding for node.
$R_{G}$ : Relational mapping for neural network.

Note

Generally, variables with a subscript $G$ refer to the overall neural network as a graph, and those with a subscript $v$ refer to individual nodes or a collection of nodes.

Whole Network can be defined as a binary relation:

$$R_{G} = \{(a, b) \mid a \in R^{n}, b = f_{G}(a)\}$$

$R_{v}$ and $f_{v}$ define the same for a single node $v$ in the network $G$ .

For a single neuron with corresponding function $f_{v} : R \to R$ , both can be defined as:

$$f_{v}(x) = x + 1 \quad \text{and} \quad R_{v} = \{(a, a + 1) \mid a \in R\}$$

Thus the encoding for a node with one input is as follows

$$F_{v} ≜ v^{o} = v^{in, 1} + 1$$

Models of $F_{v}$ will be of the form $\{v^{in, 1} \mapsto a, v^{o} \mapsto a + 1\}$ and have a one-to-one correspondence with the elements of $R_{v}$ .

For two inputs: The formula for the encoding of a node with the function $f (x) = x_{1} + 1.5 x_{2}$ can be represented as

$F_{v} ≜ v^{o} = v^{in, 1} + 1.5 v^{in, 2}$

Generalizing encoding for any single node:

Formalizing the operation $f_{v}$ of some node $v$ . We assume that the function $f_{v} : R^{n_{v}} \to R$ is piecewise linear, i.e., of the form

$$f_{v}(x) = \begin{cases} \sum_{j} c_{j}^{1} x_{j} + b^{1} & \text{if } S_{1} \\ \quad \vdots \\ \sum_{j} c_{j}^{l} x_{j} + b^{l} & \text{if } S_{l} \end{cases}$$

where $j$ ranges from $1$ to $n_{v}$ . This generalizes representations to allow any one linear operation corresponding to some predetermined condition $S_{i}$ . Each $S_{i}$ is defined as a formula in LRA over the elements of $x$ . Thus, the encoding for the single node can be written as:

$$F_{v} ≜ \bigwedge_{i = 1}^{l} \left[ S_{i} \Rightarrow \left( v^{o} = \sum_{j = 1}^{n_{v}} c_{j}^{i} \cdot v^{in, j} + b^{i} \right) \right]$$

If the condition $S_{i}$ is $True$ , then $v^{o}$ is equal to the $i^{th}$ linear expression over the inputs $v^{in, j}$ of the node. The statements are combined using a conjunction over all $l$ cases. Each clause joined by the conjunction is in $condition \Rightarrow assignment$ form, providing the functionality of an if statement. So the general way to read this condensed statement is:

$$\text{if } S_{1} \text{ then } \dots \text{ AND if } S_{2} \text{ then } \dots$$

Example (ReLU):

$$relu(x) = \begin{cases} x & \text{if } x > 0 \\ 0 & \text{if } x \leq 0 \end{cases}$$

The encoding for a simple ReLU function will be:

$$F_{v} ≜ (\underbrace{v^{in, 1} > 0}_{x > 0} \Rightarrow v^{o} = v^{in, 1}) \land (\underbrace{v^{in, 1} \leq 0}_{x \leq 0} \Rightarrow v^{o} = 0)$$

In order to ensure that we are making accurate representations of the actual system that we are modelling, we have to ensure that two qualities are always guaranteed for the encodings: Soundness & Completeness.

Soundness - This is to make sure that our model does not miss any behaviour of $f_{v}$ . Let $(a, b) \in R_{v}$ and let

$$I = \{v^{in, 1} \mapsto a_{1}, \dots, v^{in, n} \mapsto a_{n}, v^{o} \mapsto b\}$$

Then, for any such tuple of $a$ and $b$ , $I$ is a model of $F_{v}$ : $I ⊨ F_{v}$ . Soundness is the property of only being able to prove “true” things: any property of the system that is proven to be true will be true.

Completeness - Any model of $F_{v}$ maps to a behaviour of $f_{v}$ . If

$$I = \{v^{in, 1} \mapsto a_{1}, \dots, v^{in, n} \mapsto a_{n}, v^{o} \mapsto b\}$$

is a model of $F_{v}$ , then $(a, b) \in R_{v}$ . Completeness is the property of being able to prove all true things that can possibly exist in the system. This property can be used to determine counterexamples for any given model.
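
The two properties can be spot-checked for the ReLU encoding. In the sketch below the encoding $F_{v}$ is written as a Python predicate over an interpretation; the variable names `v_in_1` and `v_o` and the sample points are assumptions for illustration. Checking a handful of samples is of course only a sanity test, not a proof.

```python
def relu(x):
    return x if x > 0 else 0.0

def implies(a, b):
    return (not a) or b

def F_v(I):
    """Encoding of a ReLU node: a conjunction of (condition => assignment) clauses."""
    v_in, v_o = I["v_in_1"], I["v_o"]
    return implies(v_in > 0, v_o == v_in) and implies(v_in <= 0, v_o == 0)

samples = [-2.0, -0.5, 0.0, 0.5, 3.0]

# Soundness: every behaviour (a, relu(a)) of the node is a model of F_v
sound = all(F_v({"v_in_1": a, "v_o": relu(a)}) for a in samples)

# Completeness: an interpretation that is not a behaviour of the node is not a model
complete = not any(F_v({"v_in_1": a, "v_o": relu(a) + 1.0}) for a in samples)
```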

So far we have seen how to encode a singular node in a larger network. But to encode an entire Neural Net, we need to have the following structure:

flowchart TD
a[Encoding a Neural Network]
b[Encoding a formula for the nodes]
c[Encoding a formula for the edges]
a --> b
a --> c

Encoding the nodes:

for all non-input nodes we have:

$$F_{V} ≜ \bigwedge_{v \in V / V_{in}} F_{v}$$

Since the input nodes do not perform any operations, we are excluding them from the set using $V / V_{in}$ .

$$\text{the output of } v_{1} \text{ is } \dots \text{ AND the output of } v_{2} \text{ is } \dots$$

Encoding the edges:

For some node $v \in V / V_{in}$ there exists a total ordering of edges: $(v_{1}, v), (v_{2}, v), \dots$ The total ordering tells us which edge feeds into which corresponding input index of the node.

$$F_{o \to v} ≜ \bigwedge_{i = 1}^{n_{v}} v^{in, i} = v_{i}^{o}$$

$$F_{E} ≜ \bigwedge_{v \in V / V_{in}} F_{o \to v}$$

As per the total ordering of the input edges, $F_{o \to v}$ states that the output of the $i^{th}$ node from the previous layer becomes the $i^{th}$ input of the present node. The $v_{i}$ are the sequentially ordered nodes of the previous layer.

More concretely:

$$v_{i}^{o} \Longrightarrow v^{in, i}$$

The entire network can be represented as a conjunction of the edges and nodes:

$$F_{G} ≜ F_{V} \land F_{E}$$

Assuming that we have ordered input nodes in $V_{in}$ , let $(a, b) \in R_{G}$ and let

$$I = \underbrace{\{v_{1}^{o} \mapsto a_{1}, \dots, v_{n}^{o} \mapsto a_{n}\}}_{input} \cup \underbrace{\{v_{n + 1}^{o} \mapsto b_{1}, \dots, v_{n + m}^{o} \mapsto b_{m}\}}_{output}$$

Then there exists $I^{'}$ such that $I \cup I^{'} ⊨ F_{G}$ .

Note

Size of encoding is linear in the size of the Neural Network

Example:

flowchart LR

A["v₁"]
B["v₂"]
C(("v₃"))
D["v₄"]

A & B --> C --> D

The functions corresponding to nodes $v_{3}$ and $v_{4}$ are as follows:
$f_{v_{3}}(x) = 2 x_{1} + x_{2}$ and $f_{v_{4}}(x) = relu(x)$
The formulations for each of the corresponding nodes will be:

Node $v_{3}$ : $F_{v_{3}} ≜ v_{3}^{o} = 2 v_{3}^{in, 1} + v_{3}^{in, 2}$
Node $v_{4}$ : $F_{v_{4}} ≜ (v_{4}^{in, 1} > 0 ⟹ v_{4}^{o} = v_{4}^{in, 1}) \land (v_{4}^{in, 1} \leq 0 ⟹ v_{4}^{o} = 0)$

The formulations for the edges will be:

$$F_{o \to v_{3}} ≜ (v_{3}^{in, 1} = v_{1}^{o}) \land (v_{3}^{in, 2} = v_{2}^{o})$$

$$F_{o \to v_{4}} ≜ (v_{4}^{in, 1} = v_{3}^{o})$$

Encoding of $G$ : $F_{G} ≜ \underbrace{F_{v_{3}} \land F_{v_{4}}}_{F_{V}} \land \underbrace{F_{o \to v_{3}} \land F_{o \to v_{4}}}_{F_{E}}$
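
As a sanity check on this example, the encoding can be written as a Python predicate and evaluated on an interpretation obtained by running the network forward. This is only a sketch: the flat variable names such as `v3_in1` are an assumed naming scheme, and the node function $f_{v_{3}}(x) = 2 x_{1} + x_{2}$ is taken from the example.

```python
def implies(a, b):
    return (not a) or b

def F_G(I):
    """F_G = F_V and F_E for the network v1, v2 -> v3 -> v4."""
    F_v3 = I["v3_o"] == 2 * I["v3_in1"] + I["v3_in2"]
    F_v4 = (implies(I["v4_in1"] > 0, I["v4_o"] == I["v4_in1"])
            and implies(I["v4_in1"] <= 0, I["v4_o"] == 0))
    F_E = (I["v3_in1"] == I["v1_o"] and I["v3_in2"] == I["v2_o"]
           and I["v4_in1"] == I["v3_o"])
    return F_v3 and F_v4 and F_E

def model_from_inputs(a1, a2):
    """Run the network forward and record every variable of the encoding."""
    v3 = 2 * a1 + a2
    v4 = v3 if v3 > 0 else 0.0
    return {"v1_o": a1, "v2_o": a2, "v3_in1": a1, "v3_in2": a2,
            "v3_o": v3, "v4_in1": v3, "v4_o": v4}

I = model_from_inputs(1.0, 0.5)   # v3 = 2.5, v4 = 2.5
bad = dict(I, v4_o=0.0)           # wrong output for the same inputs
```

Here `I` is a model of the encoding while `bad` is not, mirroring the soundness and completeness requirements for single nodes.
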
So far we have been considering the linear operations that form the weights and biases for the nodes in a neural network, however, there are also non-linearities that are present.

How to deal with non-linear activations?

We can represent non-linear activation functions by over-approximating them. Over-approximation partitions the input domain of the function into intervals and bounds the outputs corresponding to each interval within some defined output range. For example, we can over-approximate the Sigmoid ( $σ(x)$ ) function by dividing its input into intervals that roughly correspond to its overall behaviour, i.e., inputs below a certain threshold give an output close to 0, inputs above a certain threshold give an output close to 1, and inputs in between give an output of roughly 0.5.

$$\begin{aligned} F_{v} ≜\ & (v^{in, 1} \leq -1 ⟹ 0 \leq v^{o} \leq 0.26) \\ \land\ & (-1 < v^{in, 1} \leq 0 ⟹ 0.26 < v^{o} \leq 0.5) \\ \land\ & (0 < v^{in, 1} \leq 1 ⟹ 0.5 < v^{o} \leq 0.73) \\ \land\ & (v^{in, 1} > 1 ⟹ 0.73 < v^{o} \leq 1) \end{aligned}$$

The above can be generalized to any monotonically increasing or decreasing function $f_{v}$ (which covers most commonly used activation functions). Assume $f_{v}$ is monotonically increasing, and sample a sequence of real values $c_{1} < \dots < c_{n}$ :

$$\begin{aligned} F_{v} ≜\ & (v^{in, 1} \leq c_{1} ⟹ lb < v^{o} \leq f_{v}(c_{1})) \\ \land\ & (c_{1} < v^{in, 1} \leq c_{2} ⟹ f_{v}(c_{1}) < v^{o} \leq f_{v}(c_{2})) \\ & \quad \vdots \\ \land\ & (c_{n} < v^{in, 1} ⟹ f_{v}(c_{n}) < v^{o} \leq ub) \end{aligned}$$

where $lb$ and $ub$ are the lower bound and upper bound of $f_{v}$ respectively. A point to note is that over-approximations give us soundness, but not completeness. Completeness essentially means that our encoding can find counterexamples, which we are abandoning in this case. Soundness means being able to prove correctness properties using an encoding, which is of highest priority for our models.
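
A minimal sketch of this interval construction for the sigmoid is given below. The helper names and the tuple layout are assumptions made for illustration. The output bounds are checked as closed intervals, which is slightly looser than the strict inequalities in the formula but keeps the encoding sound.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def interval_encoding(f, cs, lb, ub):
    """Return pieces (lo_in, hi_in, lo_out, hi_out) for an increasing function f.

    Each piece says: lo_in < v_in <= hi_in  =>  lo_out <= v_o <= hi_out.
    """
    ins = [-math.inf] + list(cs) + [math.inf]
    outs = [lb] + [f(c) for c in cs] + [ub]
    return [(ins[k], ins[k + 1], outs[k], outs[k + 1]) for k in range(len(cs) + 1)]

def satisfies(pieces, v_in, v_o):
    """Check whether {v_in, v_o} is a model of the conjunction of all pieces."""
    return all(not (lo < v_in <= hi) or (lo_o <= v_o <= hi_o)
               for lo, hi, lo_o, hi_o in pieces)

pieces = interval_encoding(sigmoid, [-1.0, 0.0, 1.0], lb=0.0, ub=1.0)

# Sound: every true behaviour of the sigmoid is a model of the encoding
sound = all(satisfies(pieces, x, sigmoid(x)) for x in [-5.0, -1.0, -0.3, 0.0, 0.7, 1.0, 4.0])

# Not complete: (0.5, 0.6) is a model although sigmoid(0.5) is about 0.62
spurious = satisfies(pieces, 0.5, 0.6)
```

Sampling more points $c_{i}$ tightens the intervals and removes some spurious models, at the cost of a larger formula.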

A Concrete Example for checking Robustness

$$f_{G} : R^{n} \to R^{2}$$

Where the output vectors in $R^{2}$ correspond to:

$$\begin{bmatrix} p(cat) \\ p(dog) \end{bmatrix}$$

Let the specification for this be:

$$\begin{array}{l} \{|x - c| \leq 0.1\} \\ r \leftarrow f_{G}(x) \\ \{r_{1} > r_{2}\} \end{array}$$

where $c$ is the original image and $x$ is the perturbed image. The formula generated to check this statement is called a Verification Condition (VC); if the VC is valid, the correctness property holds.

$$(\underbrace{p}_{precondition} \land \underbrace{G}_{neural\ network}) ⟹ \underbrace{q}_{postcondition}$$

inputs: $\{v_{1}, \dots, v_{n}\}$ , outputs: $\{v_{n + 1}, v_{n + 2}\}$ , Neural Network: $F_{G}$

$$\left( \underbrace{\bigwedge_{i = 1}^{n} |x_{i} - c_{i}| \leq 0.1}_{precondition} \land \underbrace{F_{G}}_{network} \land \underbrace{\bigwedge_{i = 1}^{n} x_{i} = v_{i}^{o}}_{network\ input} \land \underbrace{(r_{1} = v_{n + 1}^{o} \land r_{2} = v_{n + 2}^{o})}_{network\ output} \right) ⟹ \underbrace{(r_{1} > r_{2})}_{postcondition}$$

LRA does not support vector operations, hence the vector operations are decomposed into their constituent scalars. Absolute value operators are also not available in LRA, so we encode them as: $|x| \leq 5 \xrightarrow{LRA} (x \leq 5) \land (-x \leq 5)$ . We also need to connect the variables of $F_{G}$ with the inputs $x$ and the output $r$ .

For multiple Neural Networks, Correctness Properties have the form:

$$\begin{array}{l} \{P\} \\ r_{1} \leftarrow f_{G_{1}}(x_{1}) \\ r_{2} \leftarrow f_{G_{2}}(x_{2}) \\ \quad \vdots \\ r_{l} \leftarrow f_{G_{l}}(x_{l}) \\ \{Q\} \end{array}$$

For the above specification we have:

$$\left( P \land \bigwedge_{i = 1}^{l} F_{i} \right) ⟹ Q$$

For the above $F_{i}$ corresponds to $r_{i} \leftarrow f_{G_{i}} (x_{i})$ . $F_{i}$ combines encoding of the neural network $F_{G_{i}}$ along with connections with inputs and outputs, $x_{i}$ and $r_{i}$ , respectively.

$$F_{i} ≜ F_{G_{i}} \land \left( \bigwedge_{j = 1}^{n} x_{i, j} = v_{j}^{o} \right) \land \left( \bigwedge_{j = 1}^{m} r_{i, j} = v_{n + j}^{o} \right)$$

x_{i, j} \to v_{i}^{o} \to v_{i}^{in, 1}

(soundness) If $F$ is valid, then the correctness property is true. (completeness) If it is invalid, we know there is a model $I ⊨ \neg F$ , provided $F$ is encodable in LRA.

Assuming, input and output variables of the encoding of $G_{i}$ are $v_{1}, \dots, v_{n}$ and $v_{n + 1}, \dots, v_{n + m}$ ; each graph $G_{i}$ has unique nodes and therefore input/output variables.

$$\left( P \land \bigwedge_{i = 1}^{l} F_{i} \right) ⟹ Q$$

The above reads as: “If the precondition is true and we execute all $l$ networks, then the postcondition should be true”

Example:

$$\begin{array}{l} \{|x - 1| \leq 0.1\} \\ r \leftarrow f(x) \\ \{r \geq 1\} \end{array}$$

where $f(x) = x$ .

We take the input value $x$ to be equal to 0.99: $x = 0.99$ . This is a valid value of $x$ as per the defined precondition, since $|0.99 - 1| = 0.01 \leq 0.1$ .
Therefore, we have $r = f(0.99) = 0.99$ , which violates the postcondition $r \geq 1$ .
$F$ is invalid as $I ⊨ \neg F$ , thus the correctness property does not hold.
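
The counterexample can be checked mechanically. The following sketch spells out the precondition, the network and the postcondition of this example as plain Python functions; the function names are chosen for this illustration only.

```python
def f(x):
    return x   # the network in this example is the identity

def precondition(x):
    return abs(x - 1) <= 0.1

def postcondition(r):
    return r >= 1

def is_counterexample(x):
    """x falsifies the property if it satisfies P but f(x) violates Q."""
    return precondition(x) and not postcondition(f(x))
```

Calling `is_counterexample(0.99)` returns `True`, while an input such as `1.05` satisfies both the precondition and the postcondition.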

Info

In the formulations that we have gone over, disjunctions arise from the ReLU function, because of its active/inactive states over the possible inputs to the Neural Network. Disjunctions cause problems during solving: without them, LRA (and a similar system called MILP) is solvable in polynomial time. To solve LRA formulae with disjunctions, we can simplify the formulation by using lightweight techniques to discover whether each ReLU is active or not (abstraction-based verification). Alternatively, we can add additional bias to make all the ReLUs either always active or always inactive. Another thing to consider is that NNs verified in LRA may not really be robust when bit-level behaviour is considered.

DPLL (Davis-Putnam-Logemann-Loveland)

The DPLL algorithm checks the satisfiability of Boolean formulae and underlies modern SMT and SAT solvers. An extension of the DPLL algorithm is needed to handle first-order formulae over different theories.

Conjunctive Normal Form (CNF)

DPLL expects its input formulae to be in Conjunctive Normal Form (CNF), so all Boolean formulae must be written in CNF, as can be seen below:

$$C_{1} \land \dots \land C_{n}$$

where each sub-formula $C_{i}$ is called a clause and is of the form

$$l_{1} \lor \dots \lor l_{m_{i}}$$

where each $l_{i}$ is called a literal and is either a variable ( $p$ ) or its negation ( $\neg p$ ). Thus the structure of an entire input should be in the form:

flowchart TD

C_1((C₁))
C_2((C₂))
C_n((Cₙ))
l_1((l₁))
l_2((l₂))
l_m((lₘ))
p((p))
np((¬p))
d1([......])
d2([......])

C_1 --> l_1 & l_2 & l_m
l_1 --> p & np


subgraph Clauses
C_1 -. ∧ .- C_2 -. ∧ .- d1 -. ∧ .- C_n
end


subgraph Literals
l_1 -. ∨ .- l_2 -. ∨ .- d2 -. ∨ .- l_m
end

subgraph Variables
p
np
end

DPLL has the following two alternating phases

flowchart LR

A(["Deduction"])
B(["Search"])

A --> B --> A

Deduction

Boolean Constant Propagation (BCP)

The algorithm searches the Boolean CNF for unit clauses, i.e., clauses that contain only a single literal instead of a disjunction of multiple literals. For the formula to be satisfiable, that literal must be $True$ : the variable is set to $True$ if the literal is $p$ , or to $False$ if the literal is $\neg p$ .

$$(l) \land C_{2} \land \dots \land C_{n}$$

The BCP phase will look for all unit clauses and replace their literals with $True$ .

Example:

$$F ≜ (p) \land (\neg p \lor r) \land (\neg r \lor q)$$

$$\begin{aligned} \text{BCP:} \quad & (True) \land (\neg True \lor r) \land (\neg r \lor q) \equiv (r) \land (\neg r \lor q) \\ \text{BCP:} \quad & (True) \land (\neg True \lor q) \equiv q \\ \text{BCP:} \quad & (True) \end{aligned}$$

Thus $F$ is SAT with the model

$$\{p \mapsto True, q \mapsto True, r \mapsto True\}$$

Deduction + Search

DPLL:
	Data: A formula F in CNF
	Result: I ⊧ F or UNSAT
▹ Boolean Constant Propagation (BCP)
	while there is a unit clause (l) in F do:
		Let F be F[l ↦ True]
	if F is True then return SAT
▹ Search
	for every variable p in F do:
		If DPLL(F[p ↦ True]) is SAT then return SAT
		If DPLL(F[p ↦ False]) is SAT then return SAT
	return UNSAT

The model $I$ that is returned by DPLL when the input is SAT is maintained implicitly in the sequence of assignments to variables (of the form $[l \mapsto \dots]$ and $[p \mapsto \dots]$ ).

$$\begin{array}{lrl} & F ≜ & (p \lor r) \land (\neg p \lor q) \land (\neg q \lor \neg r) \\ \text{recursion 1:} & F_{1} = & F[p \mapsto True] = q \land (\neg q \lor \neg r) \\ \text{recursion 2:} & F_{2} = & F_{1}[q \mapsto True] = (\neg r) \\ & F_{3} = & F_{2}[r \mapsto False] = (True) \end{array}$$

DPLL returns SAT and then implicitly builds the model for $F$ :

$$\{p \mapsto True, q \mapsto True, r \mapsto False\}$$
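
A compact Python sketch of DPLL follows. The representation is an assumption made for illustration: a formula is a list of clauses, and a literal is a `(variable, polarity)` pair. Unlike the pseudocode, the search phase branches on a single variable per call, which is sufficient because both truth values are tried, and the model is returned explicitly instead of being kept implicit.

```python
def assign(clauses, var, value):
    """Return F[var -> value]: drop satisfied clauses, remove falsified literals."""
    result = []
    for clause in clauses:
        if (var, value) in clause:
            continue   # clause is True, drop it
        result.append([lit for lit in clause if lit[0] != var])
    return result

def dpll(clauses, model=None):
    """Return a (possibly partial) model as a dict, or None if UNSAT."""
    model = dict(model or {})
    # Deduction: Boolean constant propagation on unit clauses
    while True:
        unit = next((c[0] for c in clauses if len(c) == 1), None)
        if unit is None:
            break
        var, value = unit
        model[var] = value
        clauses = assign(clauses, var, value)
    if not clauses:
        return model   # every clause satisfied: SAT
    if any(len(c) == 0 for c in clauses):
        return None    # an empty clause cannot be satisfied
    # Search: branch on a variable that still occurs in F
    var = clauses[0][0][0]
    for value in (True, False):
        result = dpll(assign(clauses, var, value), {**model, var: value})
        if result is not None:
            return result
    return None

# F = (p or r) and (not p or q) and (not q or not r)
F = [[("p", True), ("r", True)], [("p", False), ("q", True)], [("q", False), ("r", False)]]
model = dpll(F)
```

On this input the sketch sets `p` in the search phase and then derives `q` and `r` by propagation, giving `p` and `q` true and `r` false.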

Partial Model: DPLL can terminate with SAT without assigning values to each and every variable; these incomplete models are called partial models. The unassigned variables are essentially don’t care variables and can be filled in any way we want. For $F ≜ p \land (q \lor p \lor \neg r) \land (p \lor \neg q)$ , $I = \{p \mapsto True\}$ is a partial model.

DPLL Modulo Theories

DPLL modulo theories, or DPLLᵀ, extends DPLL to formulae over mathematical theories such as LRA. We first treat the formula as purely Boolean by abstracting each theory atom into a Boolean variable, then incrementally add more and more theory information until we can conclusively prove SAT or UNSAT.

$$\begin{array}{rl} F ≜ & (x \leq 0 \lor x \leq 10) \land (\neg\, x \leq 0) \\ F^{B} ≜ & (p \lor q) \land (\neg p) \end{array}$$

flowchart LR

A[Original Formula
F]
B[Boolean Abstraction
Fᴮ]

A --B---> B --T---> A

DPLLᵀ Algorithm

Constraints, such as the relations between the different inequalities, are lost in the abstraction process ( $F \xrightarrow{B} F^{B}$ ). If $F^{B}$ is UNSAT, then $F$ is UNSAT; but if $F^{B}$ is SAT, it does not follow that $F$ is SAT.
The DPLLᵀ algorithm first uses DPLL to check whether $F^{B}$ is UNSAT, following the properties of abstraction. If $F^{B}$ is SAT, we take the model $I$ returned by DPLL( $F^{B}$ ) and map it to the formula $I^{T}$ in the theory that the formula was abstracted from (LRA in our case). If the theory solver deems $I^{T}$ satisfiable, $F$ is satisfiable. Otherwise, DPLLᵀ learns that $I^{T}$ is not a model, so it negates $I$ and conjoins it to $F^{B}$ . DPLLᵀ lazily learns more and more facts about the formula and refines the abstraction until the algorithm can decide SAT or UNSAT. The whole process is illustrated below:

flowchart LR

FU([F is UNSAT])
FB((Fᴮ))
I((I))
IT([Iᵀ])
FS([F is SAT])
NI((¬I))

FB --SAT---> I --T--> IT --SAT---> FS
FB --UNSAT---> FU
IT --UNSAT---> NI --∧---> FB

DPLL takes care of the disjunctions by taking clauses and searching for satisfiability by mapping literals to $True$ or $False$ values. DPLLᵀ assumes access to a theory solver that checks the satisfiability of conjunctions of linear inequalities. For LRA, the Simplex Algorithm can be used as the theory solver.

Example:

$$\begin{array}{rl} \text{LRA:} & (x \geq 10) \land ((x < 0) \lor (y \geq 0)) \\ F^{B}: & p \land (q \lor r) \\ I_{1} = & \underbrace{\{p \mapsto True, q \mapsto True\}}_{p \land q} : SAT \\ I_{1}^{T} = & \underbrace{x \geq 10}_{p} \land \underbrace{x < 0}_{q} : UNSAT \\ F^{B} \land \neg I_{1}: & p \land (q \lor r) \land \underbrace{(\neg p \lor \neg q)}_{\neg I_{1}} \\ I_{2} = & p \land \neg q \land r \\ I_{2}^{T}: & \{x \mapsto 10, y \mapsto 0\} : SAT \end{array}$$
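
The lazy refinement loop can be sketched in Python under strong simplifying assumptions. The Boolean side is solved by brute-force enumeration of total assignments (a stand-in for DPLL, so the blocking clause negates a full assignment instead of a partial model), and the theory solver is a toy that only understands one-variable bounds of the form `x < c` or `x >= c` (a stand-in for Simplex). All function names and the atom table are hypothetical.

```python
import math
from itertools import product

def bool_model(cnf, names):
    """Brute-force stand-in for DPLL: first total assignment satisfying every clause."""
    for values in product([True, False], repeat=len(names)):
        I = dict(zip(names, values))
        if all(any(I[n] == pol for n, pol in clause) for clause in cnf):
            return I
    return None

def theory_solve(I, atoms):
    """Toy theory solver for conjunctions of one-variable bounds."""
    lo, hi = {}, {}   # inclusive lower bounds, exclusive upper bounds
    for name, value in I.items():
        var, op, c = atoms[name]
        if (op == ">=") == value:   # a negated "<" atom is a ">=" bound
            lo[var] = max(lo.get(var, -math.inf), c)
        else:
            hi[var] = min(hi.get(var, math.inf), c)
    model = {}
    for var in set(lo) | set(hi):
        l, h = lo.get(var, -math.inf), hi.get(var, math.inf)
        if not l < h:
            return None   # contradictory bounds: I^T is UNSAT
        model[var] = l if l > -math.inf else h - 1.0
    return model

def dpll_t(cnf, atoms):
    """Find a Boolean model of F^B, check it in the theory, block it if it fails."""
    names = sorted(atoms)
    cnf = [list(clause) for clause in cnf]
    while True:
        I = bool_model(cnf, names)
        if I is None:
            return None   # F^B is UNSAT, so F is UNSAT
        model = theory_solve(I, atoms)
        if model is not None:
            return model  # I^T is satisfiable, so F is SAT
        cnf.append([(n, not v) for n, v in I.items()])   # conjoin the negation of I

atoms = {"p": ("x", ">=", 10), "q": ("x", "<", 0), "r": ("y", ">=", 0)}
F_B = [[("p", True)], [("q", True), ("r", True)]]
result = dpll_t(F_B, atoms)
```

On the example above the loop rejects the assignments in which both `p` and `q` hold and then finds a theory model with `x` at 10 and `y` at 0.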

How do we convert Boolean formulae into CNF? The usual approach applies De Morgan’s laws and distributivity, but this can blow up exponentially ( $O(exp)$ ). Instead we can use Tseitin’s Transformation, which is linear ( $O(n)$ ).

Tseitin’s Transformation

$$\underbrace{F}_{x \in S_{var}} \xrightarrow{\text{Tseitin’s Transformation}} \underbrace{F^{'}}_{x \in S_{var}^{'}} \text{ (CNF)}$$

$$S_{var} \subseteq S_{var}^{'}$$

Any model of $F^{'}$ is also a model of $F$ , if we disregard the interpretations of newly added variables. If $F^{'}$ is UNSAT, then $F$ is UNSAT, so we just need to invoke DPLL on $F^{'}$ .

Tseitin’s transformation changes a formula of computations into a set of instructions, each containing one or two variables, connected by a single unary or binary operator respectively. For example:

def f(x, y, z):
    return x + (2*y + 3)

The above function can be decomposed into a set of instructions, that when executed sequentially will provide the same result. For the above example, the function can be decomposed into the following instructions:

def f(x,y,z):
    t1 = 2 * y
    t2 = t1 + 3
    t3 = x + t2
    return t3

If we conjoin this sequence of instructions, we convert the complex computation from non-CNF to CNF with added temporary variables $t_{i}$ ; Tseitin’s transformation follows a similar procedure over Boolean formulae to generate a CNF.

Tseitin Step 1: NNF

Negation Normal Form (NNF) is achieved by pushing negation inwards so that $\neg$ only appears next to variables, e.g., $\neg p \lor \neg r$ instead of $\neg (p \land r)$ .

$$\begin{aligned} \neg (F_{1} \land F_{2}) &= \neg F_{1} \lor \neg F_{2} \\ \neg (F_{1} \lor F_{2}) &= \neg F_{1} \land \neg F_{2} \\ \neg\neg F_{1} &= F_{1} \end{aligned}$$
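
The three rewrite rules translate directly into a recursive function. In this sketch a formula is a nested tuple, `("var", name)`, `("not", F)`, `("and", F1, F2)` or `("or", F1, F2)`; this representation is an assumption chosen for illustration.

```python
def nnf(f):
    """Push negations inward until they sit only on variables."""
    op = f[0]
    if op == "var":
        return f
    if op in ("and", "or"):
        return (op, nnf(f[1]), nnf(f[2]))
    # op == "not"
    g = f[1]
    if g[0] == "var":
        return f                 # a literal is already in NNF
    if g[0] == "not":
        return nnf(g[1])         # double negation
    dual = "or" if g[0] == "and" else "and"   # De Morgan
    return (dual, nnf(("not", g[1])), nnf(("not", g[2])))

p, r = ("var", "p"), ("var", "r")
example = nnf(("not", ("and", p, r)))   # not (p and r)
```

For `not (p and r)` the result is the tuple form of $\neg p \lor \neg r$.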

Tseitin Step 2: Subformula Rewriting

Any part of $F$ that contains a conjunction or disjunction is called a subformula; individual literals are not counted as subformulae.

$$F ≜ \underbrace{\underbrace{(p \land q)}_{F_{1}} \lor \underbrace{(\underbrace{q \land \neg r}_{F_{2}} \land s)}_{F_{3}}}_{F_{4}}$$

$F_{1}, F_{2}$ are at the deepest level of nesting, and $F_{2}$ is a subformula of $F_{3}$ . All $F_{i}$ are subformulae of $F_{4}$ .

Assuming $F$ has $n$ subformulae:

For every subformula $F_{i}$ of $F$ , create a fresh variable $t_{i}$ . These variables are analogous to the temporary variables that were introduced to decompose a complex program.
Starting with the most deeply nested subformula: let $F_{i}$ be of the form $l_{i} \circ l_{i}^{'}$ , where $\circ$ is $\land$ or $\lor$ and $l_{i}, l_{i}^{'}$ are literals. One or both of $l_{i}$ and $l_{i}^{'}$ may be a new variable $t_{j}$ denoting a subformula $F_{j}$ of $F_{i}$ . Create the formula:

$$F_{i}^{'} ≜ t_{i} \Leftrightarrow (l_{i} \circ l_{i}^{'})$$

These formulae are analogous to the assignments to temporary variables in code, where $\Leftrightarrow$ is the logical analogue of variable assignment (=).

$$\begin{aligned} F_{1}^{'} &≜ t_{1} \Leftrightarrow (p \land q) \\ F_{2}^{'} &≜ t_{2} \Leftrightarrow (q \land \neg r) \\ F_{3}^{'} &≜ t_{3} \Leftrightarrow (t_{2} \land s) \\ F_{4}^{'} &≜ t_{4} \Leftrightarrow (t_{1} \lor t_{3}) \end{aligned}$$

Each $F_{i}^{'}$ is then converted to CNF using the following equivalences:

$$\begin{aligned} l_{1} \Leftrightarrow (l_{2} \lor l_{3}) &\equiv (\neg l_{1} \lor l_{2} \lor l_{3}) \land (l_{1} \lor \neg l_{2}) \land (l_{1} \lor \neg l_{3}) \\ l_{1} \Leftrightarrow (l_{2} \land l_{3}) &\equiv (\neg l_{1} \lor l_{2}) \land (\neg l_{1} \lor l_{3}) \land (l_{1} \lor \neg l_{2} \lor \neg l_{3}) \end{aligned}$$

$$F^{'} ≜ t_{n} \land \bigwedge_{i} F_{i}^{'}$$

Each $t_{i}$ is assigned true iff the subformula $F_{i}$ evaluates to true. The conjunct $t_{n}$ in $F^{'}$ says that $F$ must be true; $t_{n}$ is like a return statement.

$$F^{'} ≜ t_{n} \land F_{1}^{'} \land F_{2}^{'} \land F_{3}^{'} \land F_{4}^{'}$$
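
The procedure can be sketched in Python for formulae that are already in NNF. The nested-tuple representation (`("var", name)`, `("not", F)`, `("and", F1, F2)`, `("or", F1, F2)`), the `(name, polarity)` literals and the helper names are assumptions for illustration. Fresh variables are numbered in the order in which subformulae are completed, which reproduces $t_{1}, \dots, t_{4}$ for the running example.

```python
from itertools import count

def tseitin(f):
    """Return CNF clauses (lists of (name, polarity) literals) for an NNF formula f."""
    clauses, fresh = [], count(1)

    def neg(l):
        return (l[0], not l[1])

    def lit(g):
        """Return the literal standing for subformula g, emitting its defining clauses."""
        if g[0] == "var":
            return (g[1], True)
        if g[0] == "not":   # NNF: negation only appears on variables
            return (g[1][1], False)
        a, b = lit(g[1]), lit(g[2])
        t = (f"t{next(fresh)}", True)
        if g[0] == "and":   # t <=> (a and b)
            clauses.extend([[neg(t), a], [neg(t), b], [t, neg(a), neg(b)]])
        else:               # t <=> (a or b)
            clauses.extend([[neg(t), a, b], [t, neg(a)], [t, neg(b)]])
        return t

    clauses.append([lit(f)])   # assert that the whole formula is true
    return clauses

def satisfied(clauses, I):
    """True if interpretation I (dict name -> bool) satisfies every clause."""
    return all(any(I[n] == pol for n, pol in c) for c in clauses)

p, q, r, s = (("var", n) for n in "pqrs")
F = ("or", ("and", p, q), ("and", ("and", q, ("not", r)), s))
cnf = tseitin(F)
```

Each binary connective contributes exactly three clauses, so the size of the output grows linearly with the size of the input formula.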

Theory Solving

Theory solver for LRA receives $F$ formula as a conjunction of linear inequalities

i = 1 ⋀ n (j = 1 \sum m c_{ij} \cdot x_{j} \geq b_{i}) where c_{ij}, b_{i} \in R

Goal: Check SAT for $F$ and discover $I ⊨ F$ .

The Simplex Algorithm will be used as a theory solver. It expects formulae to be conjunctions of equalities of the form $\sum_{i} c_{i} \cdot x_{i} = 0$ and bounds of the form $l_{i} \leq x_{i} \leq u_{i}$, where $u_{i}, l_{i} \in R \cup \{\infty, -\infty\}$. The infinities are included so that variables with no upper or lower bound can also be represented.

Converting inequalities into simplex (slack) form:

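$$F \triangleq \bigwedge_{i = 1}^{n} \left( \sum_{j = 1}^{m} c_{ij} \cdot x_{j} \geq b_{i} \right)$$

Each inequality is replaced by an equality and a bound:

$$s_{i} = \sum_{j = 1}^{m} c_{ij} \cdot x_{j} \quad \land \quad s_{i} \geq b_{i}$$

where $s_{i}$ is a slack variable (analogous to the Tseitin temporary variables).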

Let $F_{s}$ be the simplex form of some formula $F$ . Then we have the following guarantees (analogue of Tseitin Transformation for non-CNF formulae):

Any model of $F_{s}$ is a model of $F$ , disregarding assignments to slack variables.
If $F_{s}$ is UNSAT, then $F$ is UNSAT.

Simplex Algorithm

Goal: Simplex is normally used to find a satisfying assignment that maximizes some objective function. For verification we only need to find any satisfying assignment, so we use just that part of Simplex.

The set of variables in simplex form is classified into two subsets:

Basic Variables: those that appear on the left hand side of the equality; initially basic variables are the slack variables.
Non-Basic Variables: all other variables.

In the example worked through later in this section, the basic variables are initially $\{s_{1}, s_{2}, s_{3}\}$ and the non-basic variables are $\{x, y\}$. As Simplex progresses, the equalities are rewritten, so some basic variables may become non-basic and vice versa.

Simplex simultaneously looks for a model and a proof of unsatisfiability. Each inequality defines a halfspace (it splits $R^{2}$ into two parts). Simplex starts from $I_{0} = \{x \mapsto 0, y \mapsto 0\}$ and adjusts the values until all of the bounds are satisfied. We fix a total order on the variables to make it easier to refer to specific variables, and assume variables are of the form $x_{1}, \dots, x_{n}$. Given a basic variable $x_{i}$ and a non-basic variable $x_{j}$, we use $c_{ij}$ to denote the coefficient of $x_{j}$ in the equality

$$x_{i} = \dots + c_{ij} \cdot x_{j} + \dots$$

For a variable $x_{i}$, $l_{i}$ and $u_{i}$ denote its lower and upper bounds. (Non-slack variables have no bounds.)

Two invariants that are maintained in Simplex:

$I$ always satisfies the equalities, so only bounds may be violated; initially true as all $0$ in $I$ .
Bounds of all non-basic variables are satisfied; initially true as they have no bounds.

Suppose $I (x_{i}) < l_{i}$ for some basic variable $x_{i}$. We repair $x_{i}$ by changing a non-basic variable $x_{j}$: pick any $x_{j}$ with $c_{ij} \neq 0$. If no such $x_{j}$ exists, the formula is UNSAT. Otherwise, increase $I (x_{j})$ by $\frac{l_{i} - I (x_{i})}{c_{ij}}$; as a result $I (x_{i})$ increases by $l_{i} - I (x_{i})$, barely satisfying the lower bound, i.e., $I (x_{i}) = l_{i}$. For instance, starting from $I (x_{j}) = 0$:

$$
\begin{aligned}
x_{j} \mapsto 0 &: \quad x_{i} = \dots + 0 + \dots \\
x_{j} \mapsto \frac{l_{i} - I (x_{i})}{c_{ij}} &: \quad x_{i} = \dots + c_{ij} \cdot \frac{l_{i} - I (x_{i})}{c_{ij}} + \dots = \dots + (l_{i} - I (x_{i})) + \dots
\end{aligned}
$$

so the new value of $x_{i}$ is the previous $I (x_{i})$ plus $l_{i} - I (x_{i})$, which is exactly $l_{i}$.

After updating $x_{j}$ , there is a chance that we might have violated the bounds of $x_{j}$ so we rewrite the equation such that $x_{j}$ becomes the basic variable and $x_{i}$ becomes the non-basic variable. The pivot operation follows as:

$$x_{i} = \sum_{k \in N} c_{ik} x_{k} \quad \Longrightarrow \quad x_{j} = \underbrace{\frac{x_{i}}{c_{ij}} - \sum_{k \in N \setminus \{j\}} \frac{c_{ik}}{c_{ij}} x_{k}}_{\text{replace } x_{j} \text{ with this}}$$

where $N$ is the set of indices of non-basic variables. For Simplex basic variables are dependent variables and non-basic variables are independent variables.

Example:

$$
\begin{aligned}
x + y &\geq 0 \\
-2 x + y &\geq 2 \\
-10 x + y &\geq -5
\end{aligned}
$$

Converting the set of inequalities into slack form:

$$
\begin{aligned}
s_{1} &= x + y & s_{1} &\geq 0 \\
s_{2} &= -2 x + y & s_{2} &\geq 2 \\
s_{3} &= -10 x + y & s_{3} &\geq -5
\end{aligned}
$$

Fix the ordering $x, y, s_{1}, s_{2}, s_{3}$ and start from the initial model:

$$I_{0} = \{x \mapsto 0, y \mapsto 0, s_{1} \mapsto 0, s_{2} \mapsto 0, s_{3} \mapsto 0\}$$

$s_{1}$ and $s_{3}$ are satisfied, $s_{2}$ is not: $I_{0} (s_{2}) = 0$ but $s_{2} \geq 2$ .

Modify the first variable in our ordering, $x$: decrease $I_{0} (x)$ to $-1$ to meet the bound on $s_{2}$.

$$I_{1} = \{x \mapsto -1, y \mapsto 0, s_{1} \mapsto -1, s_{2} \mapsto 2, s_{3} \mapsto 10\}$$

Now we pivot with unchanged bounds:

$$
\begin{aligned}
x &= 0.5 y - 0.5 s_{2} \\
s_{1} &= 1.5 y - 0.5 s_{2} \\
s_{3} &= -4 y + 5 s_{2}
\end{aligned}
$$

$I_{1} (s_{1}) = - 1 < 0$ , bound not satisfied.

The non-basic variables are now $y, s_{2}$ and the basic variables are $x, s_{1}, s_{3}$. To raise $s_{1}$ to its bound, $I (y)$ is increased by $\frac{1}{1.5} = \frac{2}{3}$:

$$I_{2} = \{x \mapsto -\frac{2}{3}, y \mapsto \frac{2}{3}, s_{1} \mapsto 0, s_{2} \mapsto 2, s_{3} \mapsto \frac{22}{3}\}$$

At this point we pivot $y$ with $s_{1}$

Simplex terminates as $I_{2} ⊨ F$ .

Simplex terminates because the variables are ordered and we always pick the first variable that violates its bounds (Bland’s Rule). This ensures that we never revisit the same set of basic and non-basic variables.
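The feasibility-only variant of Simplex described above can be sketched as follows. This is a minimal sketch under stated assumptions, not a reference implementation: the function name `simplex_feasible` and the dictionary-based tableau are choices made for this example, exact rationals are used to avoid rounding, and the choice of the non-basic variable uses the slightly stronger condition that $x_{j}$ must still have room to move within its own bounds.

```python
from fractions import Fraction

def simplex_feasible(rows, lower, upper, order):
    """rows maps each basic variable to {non_basic: coefficient};
    lower/upper map variables to bounds (absent means unbounded)."""
    rows = {b: {v: Fraction(c) for v, c in r.items()} for b, r in rows.items()}
    I = {v: Fraction(0) for v in order}          # initial interpretation
    while True:
        # First basic variable (in the fixed order) whose bound is violated.
        bad = next((v for v in order if v in rows and
                    (I[v] < lower.get(v, I[v]) or I[v] > upper.get(v, I[v]))), None)
        if bad is None:
            return I                              # all bounds hold: SAT
        target = Fraction(lower[bad] if I[bad] < lower.get(bad, I[bad]) else upper[bad])
        increase = target > I[bad]

        def usable(v):
            c = rows[bad].get(v, 0)
            if c == 0:
                return False
            if (c > 0) == increase:               # x_j has to go up
                return I[v] < upper.get(v, I[v] + 1)
            return I[v] > lower.get(v, I[v] - 1)  # x_j has to go down

        xj = next((v for v in order if v not in rows and usable(v)), None)
        if xj is None:
            return None                           # no way to repair: UNSAT
        c = rows[bad][xj]
        I[xj] += (target - I[bad]) / c
        I[bad] = target
        # Pivot: solve the row of `bad` for x_j, then substitute everywhere.
        row = rows.pop(bad)
        new_row = {k: -ck / c for k, ck in row.items() if k != xj}
        new_row[bad] = 1 / c
        for r in rows.values():
            cj = r.pop(xj, 0)
            for k, ck in new_row.items():
                r[k] = r.get(k, 0) + cj * ck
        rows[xj] = new_row
        for b, r in rows.items():                 # re-evaluate basic variables
            I[b] = sum(ck * I[k] for k, ck in r.items())

# The worked example in slack form.
rows = {"s1": {"x": 1, "y": 1}, "s2": {"x": -2, "y": 1}, "s3": {"x": -10, "y": 1}}
model = simplex_feasible(rows, {"s1": 0, "s2": 2, "s3": -5}, {},
                         ["x", "y", "s1", "s2", "s3"])
```

On the worked example the sketch performs the same two pivots as the text and returns a model with $x = -\frac{2}{3}$ and $y = \frac{2}{3}$; a `None` result stands for UNSAT.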

Note

Basic variables are dependent variables and Non-Basic variables are independent variables.

Using Simplex as the theory solver within DPLLᵀ allows us to solve LRA formulae, but this approach does not scale for neural networks because ReLUs are encoded as disjunctions. The SAT-solving part of DPLLᵀ considers every possible case of each disjunction (active: output $= x$; inactive: output $= 0$), leading to many calls to Simplex. To avoid this, we extend Simplex to Reluplex.

Reluplex

Equations in reluplex form:

equations (same as simplex)
bounds (same as simplex)
ReLU constraints of the form $x_{i} = \mathrm{relu} (x_{j})$ (also add the bound $x_{i} \geq 0$ implied by the ReLU)

We call Simplex on a weaker (less constrained) version of $F$ called $F^{'}$, which drops all ReLU constraints. Since

$$F \Rightarrow F^{'} \quad \text{(is valid)}$$

if $F^{'}$ is UNSAT, then $F$ is UNSAT.

If Simplex returns $I \models F^{'}$, it may not be that $I$ satisfies $F$. If $I \not\models F$, we pick one of the violated ReLU constraints

$$x_{i} = \mathrm{relu} (x_{j})$$

and modify $I$ to make sure that it is not violated. If any of $x_{i}$ or $x_{j}$ are basic, we pivot it with a non-basic variable. The pseudocode for Reluplex Algorithm can be found below:

Note: Without Case Splitting, reluplex might not terminate - it may get stuck in a loop where the Simplex satisfies all bounds but violates a relu, then satisfying that relu causes a bound to be violated and so on.

We count how many times we have attempted to fix each ReLU constraint. Once a constraint has been fixed more than $τ$ times, we split on it: $x_{i} = \mathrm{relu} (x_{j})$ is split into two cases, $F_{1} ≜ x_{j} \geq 0 \land x_{i} = x_{j}$ and $F_{2} ≜ x_{j} \leq 0 \land x_{i} = 0$. Reluplex is then recursively invoked on the two instances of the problem: $F \land F_{1}$ and $F \land F_{2}$.

If any of the instances are SAT, then $F$ is SAT.

$$F \equiv (F \land F_{1}) \lor (F \land F_{2})$$

Abstraction Based Verification

Approximate (or Abstract) techniques for verification can have two possible outcomes:

If they succeed, they can produce proofs of correctness for a neural network.
If they fail, we do not know whether the correctness property holds or not.

Abstraction-based verification requires a way to abstractly define a domain of inputs, which is then passed through a lifted version of the neural network in order to test its robustness. There are three main ways of abstracting input domains for verification: Interval Abstraction, Zonotope Domain Abstraction and Polyhedra Domain Abstraction.

flowchart TD

A["Abstraction Techniques"] --> B["Interval"] & C["Zonotope"] & D["Polyhedra"]

In Interval Abstraction, we define an interval for each variable independently of the others, thereby defining a rectangular region in some high-dimensional space (a hyperrectangle).

In Zonotope Domains, instead of defining hyperrectangles, the input domain is defined as a collection of parallelograms superimposed on each other, created using auxiliary variables called generators whose values range from $-1$ to $1$. Each generator has corresponding coefficient values which determine the resulting convex shape. The main difference between zonotopes and interval domains is that the generators let us encode relations between multiple variables.

In the Polyhedra Domain, we have a setup similar to that of Zonotope Domains. However, instead of the generator values ranging from $-1$ to $1$, a conjunction of linear inequalities over the generators determines the overall convex shape of the region we are defining.

These techniques have been described in more detail below:

Neural Interval Abstraction

Using the specifications that have been defined beforehand, we see that it defines a system to test for individual images through a particular neural network.

$$\{|x - c| \leq 0.1\} \quad r \leftarrow f (x) \quad \{\mathrm{class} (r) = 1\}$$

There is a great number of possible images $x$ that can be tested for, so we will try to lift the function $f (\dots)$ to work over a set of images instead in order to comprehensively test for possible examples. Thus we want to transform the neural network computation that is represented by $f$

$$f : R^{n} \to R^{m}$$

into a function over sets

$$f^{S} : P (R^{n}) \to P (R^{m})$$

where $P (S)$ is the power set of set $S$. Thus, the newly defined mapping from a set of elements $X$ to the set of all predictions for the elements of $X$ has the following form:

$$f^{S} (X) = \{y \mid x \in X, y = f (x)\}$$

where $X$ is the set of all inputs $x$ that are valid under the precondition of the specification. For our example this is:

$$X = \{x \mid |x - c| \leq 0.1\}$$

To verify our property, we simply check:

$$f^{S} (X) \subseteq \{y \mid \mathrm{class} (y) = 1\}$$

That is, all runs of $f$ on every image $x \in X$ result in the network predicting class $1$. We represent infinite sets of inputs using data structures that we can manipulate, called abstract domains. So we take a neural network and generate a version that can take a potentially infinite set of images; during this abstraction process we also lose some precision, as we will see soon. $f^{S}$ is called the concrete transformer of $f$.

Example: The function is $f (x) = x + 1$. The corresponding concrete transformer is $f^{S} (X) = \{x + 1 \mid x \in X\}$.

The Interval Abstract Domain considers an interval $[l, u]$ over $R$, where $l, u \in R$ and $l \leq u$, which denotes the set

$$\{x \mid l \leq x \leq u\}$$

We simplify the concrete transformer by considering only sets which have a nice form (Abstract Interpretation).

$$f^{a} (\underbrace{[l, u]}_{\text{input interval}}) = \underbrace{[l + 1, u + 1]}_{\text{output interval}}$$

$f^{a}$ is the abstract transformer of $f$. The interval $[l, u]$ is an infinite set (for $l < u$), so $f^{a}$ adds $1$ to an infinite set of reals. Extending the above to $n$ dimensions, we have an $n$-dimensional interval (hyperrectangle) region in $R^{n}$, i.e., the set of all $n$-ary vectors $\{x \in R^{n} \mid l_{i} \leq x_{i} \leq u_{i}\}$.

$$
\begin{aligned}
R &: \text{line interval} && ([l, u]) \\
R^{2} &: \text{rectangle interval} && ([l_{1}, u_{1}], [l_{2}, u_{2}]) \\
R^{3} &: \text{box interval} && ([l_{1}, u_{1}], [l_{2}, u_{2}], [l_{3}, u_{3}]) \\
&\;\;\vdots \\
R^{n} &: \text{hyperrectangle interval} && ([l_{1}, u_{1}], \dots, [l_{i}, u_{i}], \dots, [l_{n}, u_{n}])
\end{aligned}
$$

Soundness: We need to ensure that the $f^{a}$ we design is a sound approximation of $f^{S}$. The output of $f^{a}$ should be a superset of the output of $f^{S}$, as we do not want to miss any behaviour. For any interval $[l, u]$, we have $f^{S} ([l, u]) \subseteq f^{a} ([l, u])$. Equivalently, for any $x \in [l, u]$, we have $f (x) \in f^{a} ([l, u])$. In practice the inclusion is often strict: $f^{S} ([l, u]) \subset f^{a} ([l, u])$.

We are modifying the functions to take in intervals of inputs and output the interval that contains the set of all the mappings from the input interval. However, the interval domain cannot capture the relations between different dimensions (non-relational)

$$X = \{(x, x) \mid 0 \leq x \leq 1\}$$

The best we can do is the unit square between $(0, 0)$ and $(1, 1)$, denoted as the 2D interval $([0, 1], [0, 1])$. $X$ defines points where higher x-coordinates correspond to higher y-coordinates, but our abstract domain can only represent rectangles whose faces are parallel to the axes. So instead of capturing the relation between the two dimensions, we are simply saying that any value of $x$ in $[0, 1]$ can be paired with any value of $y$ in $[0, 1]$, overapproximating to the extent that we lose all information about the relation between the two elements of the input.

Example 1: Consider $f (x, y) = x + y$

$f^{S} : P (R^{2}) \to P (R)$ is defined as $f^{S} (X) = \{x + y \mid (x, y) \in X\}$
$f^{a}$ is defined as a function that takes two intervals (a rectangle) representing the range of values for $x$ and $y$

$$f^{a} ([l, u], [l^{'}, u^{'}]) = [l + l^{'}, u + u^{'}] \qquad f^{a} ([1, 5], [100, 200]) = [101, 205]$$

Take any $(x, y) \in ([l, u], [l^{'}, u^{'}])$. By definition, $l \leq x \leq u$ and $l^{'} \leq y \leq u^{'}$, so $l + l^{'} \leq x + y \leq u + u^{'}$. Thus $x + y \in [l + l^{'}, u + u^{'}]$, proving soundness.

Example 2: Consider $f (x, y) = x \times y$. For only positive inputs to $f^{a}$: $f^{a} ([l, u], [l^{'}, u^{'}]) = [l \times l^{'}, u \times u^{'}]$. However,

$$f^{a} ([-1, 1], [-3, -2]) = [3, -2]$$

We were expecting $l \times l^{'} \leq u \times u^{'}$ but we got $l \times l^{'} \geq u \times u^{'}$, which would mean that the result is not a proper interval. Thus we have to compute the interval in a different way:

$$f^{a} ([l, u], [l^{'}, u^{'}]) = [\min (B), \max (B)]$$

where

$$B = \{l \times l^{'}, l \times u^{'}, u \times l^{'}, u \times u^{'}\}$$

For the above numerical example we therefore have: $f^{a} ([-1, 1], [-3, -2]) = [\min (B), \max (B)] = [-3, 3]$
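The two binary transformers above translate directly into code. In this small sketch intervals are represented as `(l, u)` tuples, and the function names `add_a` and `mul_a` are illustrative choices rather than names from the book.

```python
def add_a(a, b):
    """Interval abstract transformer of f(x, y) = x + y."""
    (l, u), (l2, u2) = a, b
    return (l + l2, u + u2)

def mul_a(a, b):
    """Interval abstract transformer of f(x, y) = x * y.
    All four corner products are needed because signs may flip the order."""
    (l, u), (l2, u2) = a, b
    B = [l * l2, l * u2, u * l2, u * u2]
    return (min(B), max(B))

sum_interval = add_a((1, 5), (100, 200))      # (101, 205)
product_interval = mul_a((-1, 1), (-3, -2))   # (-3, 3)
```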

These were all basic abstract transformers that overapproximate for some given binary function.

Affine Function:

$f (x_{1}, \dots, x_{n}) = \sum_{i} c_{i} x_{i}$ where $c_{i} \in R$. We can define the abstract transformer as:

$$f^{a} ([l_{1}, u_{1}], \dots, [l_{n}, u_{n}]) = \left[ \sum_{i} l_{i}^{'}, \sum_{i} u_{i}^{'} \right]$$

where $l_{i}^{'} = \min (c_{i} l_{i}, c_{i} u_{i})$ and $u_{i}^{'} = \max (c_{i} l_{i}, c_{i} u_{i})$. Taking the min and max accounts for negative coefficients, which swap the roles of the lower and upper bounds.

Example: $f (x, y) = 3 x + 2 y$, $f^{a} ([5, 10], [20, 30]) = [3 \times 5 + 2 \times 20, 3 \times 10 + 2 \times 30] = [55, 90]$
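As a quick sketch of the affine transformer, again with intervals as `(l, u)` tuples and a hypothetical function name `affine_a`:

```python
def affine_a(coeffs, intervals):
    """Interval abstract transformer of f(x_1..x_n) = sum_i c_i * x_i."""
    lo = sum(min(c * l, c * u) for c, (l, u) in zip(coeffs, intervals))
    hi = sum(max(c * l, c * u) for c, (l, u) in zip(coeffs, intervals))
    return (lo, hi)

example = affine_a([3, 2], [(5, 10), (20, 30)])   # (55, 90)
negated = affine_a([-1], [(0, 1)])                # (-1, 0): bounds swap roles
```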

Monotonic Function:

For a monotonically increasing $f : R \to R$: $f^{a} ([l, u]) = [f (l), f (u)]$

Example: ReLU on the interval $[3, 5]$: $\mathrm{relu} (3) \leq \mathrm{relu} (x) \leq \mathrm{relu} (5)$ for all $x \in [3, 5]$. Therefore, $f^{a} ([3, 5]) = [\mathrm{relu} (3), \mathrm{relu} (5)]$

Composing Abstract Transformers:

For $(f \circ g) (x)$ we do not need to define a different transformer, we can define one for $f (f^{a})$ and one for $g (g^{a})$ and compose them to find a sound abstract transformer of $f \circ g$ : $f^{a} \circ g^{a}$

Example: $g (x) = 3 x$, $f (x) = \mathrm{relu} (x)$ and $h (x) = f (g (x))$. This is an affine function followed by a ReLU, with output in $R$.

$$h^{a} ([2, 3]) = f^{a} (g^{a} ([2, 3])) = f^{a} ([6, 9]) = [6, 9]$$

Abstractly Interpreting Neural Networks

For a network $G = (V, E)$ we have $f_{G} : R^{n} \to R^{m}$ where $n = |V^{in}|$, $m = |V^{out}|$

$f_{G}^{a}$ takes $n$ intervals and outputs $m$ intervals
for every input node $v_{i}$: $out^{a} (v_{i}) = [l_{i}, u_{i}]$ (note: we have a fixed ordering of nodes)
for every non-input node $v$: $out^{a} (v) = f_{v}^{a} (out^{a} (v_{1}), \dots, out^{a} (v_{k}))$ where $f_{v}^{a}$ is the abstract transformer of $f_{v}$ and $v$ has incoming edges $(v_{1}, v), \dots, (v_{k}, v)$
the output of $f_{G}^{a}$ is the tuple of intervals $out^{a} (v_{1}), \dots, out^{a} (v_{m})$ where $v_{1}, \dots, v_{m}$ are the output nodes

Example: $f_{v_{3}} (x) = 2 x_{1} + x_{2}$, $f_{v_{4}} (x) = \mathrm{relu} (x)$

$f_{G}^{a} ([0, 1], [2, 3])$ :

flowchart LR

v1((v₁))
v2((v₂))
v3((v₃))
v4((v₄))
o["[2,5]"]
v1 --[0,1]--> v3
v2 --[2,3]--> v3
v3 --[2,5]--> v4 --> o

$$
\begin{aligned}
out^{a} (v_{1}) &= [0, 1] \\
out^{a} (v_{2}) &= [2, 3] \\
out^{a} (v_{3}) &= [2 \times 0 + 2, 2 \times 1 + 3] = [2, 5] \\
out^{a} (v_{4}) &= [\mathrm{relu} (2), \mathrm{relu} (5)] = [2, 5]
\end{aligned}
$$
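The node-by-node evaluation above can be sketched as a loop over the nodes in topological order. The dictionary layout of `network` and the names `interpret`, `affine_a` and `relu_a` are assumptions made for this illustration.

```python
def affine_a(coeffs, intervals):
    lo = sum(min(c * l, c * u) for c, (l, u) in zip(coeffs, intervals))
    hi = sum(max(c * l, c * u) for c, (l, u) in zip(coeffs, intervals))
    return (lo, hi)

def relu_a(interval):
    l, u = interval
    return (max(0, l), max(0, u))       # ReLU is monotonic

# Non-input nodes in topological order: name -> (abstract transformer, predecessors)
network = {
    "v3": (lambda ivs: affine_a([2, 1], ivs), ["v1", "v2"]),
    "v4": (lambda ivs: relu_a(ivs[0]), ["v3"]),
}

def interpret(network, inputs, outputs):
    out = dict(inputs)                  # out^a of the input nodes
    for node, (f_a, preds) in network.items():
        out[node] = f_a([out[p] for p in preds])
    return [out[v] for v in outputs]

result = interpret(network, {"v1": (0, 1), "v2": (2, 3)}, ["v4"])   # [(2, 5)]
```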

Limitations

Interval Domain often overshoots and computes wildly overapproximated solutions.

Example 1:

flowchart LR

v1((v₁))
v2((v₂))
v3((v₃))
o["[-1,1]"]

v1 --[0,1]--> v2 --[-1,0]--> v3 --> o
v1 --[0,1]--> v3

$f_{v_{2}} (x) = - x$, $f_{v_{3}} (x) = x_{1} + x_{2}$

So we can see that $f_{G} (x) = 0$. The ideal abstract transformer would be $f_{G}^{a} ([l, u]) = [0, 0]$ for all $l, u \in R$. But $f_{G}^{a} ([0, 1])$ returns $[- 1, 1]$, since the interval domain does not know the relation between $- x$ and $x$.
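This loss of precision is easy to reproduce. In the sketch below the helper names `neg_a` and `add_a` are illustrative; the two arguments of the addition are treated as unrelated intervals, which is exactly what the interval domain does.

```python
def neg_a(interval):
    l, u = interval
    return (-u, -l)

def add_a(a, b):
    return (a[0] + b[0], a[1] + b[1])

x = (0, 1)
result = add_a(neg_a(x), x)   # (-1, 1), although -x + x is always 0
```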

Example 2:

flowchart LR

v1((v₁))
v2((v₂))
v3((v₃))
o1["[0,1]"]
o2["[0,1]"]

v1 --[0,1]--> v2 & v3
v2 --> o1
v3 --> o2

$f_{v_{2}}, f_{v_{3}}$ are both ReLUs, so for inputs in $[0, 1]$ the desired result is $f_{G} (x) = (x, x)$. However, the result is $f_{G}^{a} ([0, 1]) = ([0, 1], [0, 1])$. $f_{G}^{a}$ tells us that for inputs between $0$ and $1$, the neural network can output any $(x, y)$ with $0 \leq x, y \leq 1$, which is a loose approximation; we wanted $x = y$ with $0 \leq x \leq 1$.

We cannot capture the set of points where $x = y$ and $0 \leq x \leq 1$; the best we can give is $([0, 1], [0, 1])$. Syntactically, an abstract element in the interval domain is captured by constraints of the form:

$$\bigwedge_{i} l_{i} \leq x_{i} \leq u_{i}$$

Every inequality involves a single variable and so fails to capture relationships between variables, which is why the interval domain is called non-relational. To address these limitations we turn to zonotope-based abstraction methods.

Zonotope: Relational Abstract Domain

Assume that we have a set of $m$ real-valued generator variables $ϵ_{1}, \dots, ϵ_{m} \in [- 1, 1]$. A $1 D$ zonotope is the set of all points in the set

$$\left\{ c_{0} + \sum_{i = 1}^{m} c_{i} ϵ_{i} \;\middle|\; ϵ_{i} \in [- 1, 1] \right\}$$

where $c_{i} \in R$. For 1 generator variable:

$$\{c_{0} + c_{1} ϵ_{1} \mid ϵ_{1} \in [- 1, 1]\}$$

This is just defining an interval of the form $[c_{0} - c_{1}, c_{0} + c_{1}]$ , assuming $c_{1} \geq 0$ . In the defined interval, $c_{0}$ is the centre. For a one dimensional zonotope this can be interpreted as the point $c_{0}$ being stretched in both the available directions with the magnitude of $c_{1}$ .

c_{0} - c_{1} \leftarrow c_{0} \to c_{0} + c_{1}

Zonotopes are more expressive from $R^{2}$ and above. Similarly, for multiple generator variables, we can observe the same “stretching” effect. The initial coordinates are stretched into a line which represents a specific vector with endpoints at $(c_{10} + c_{11}, c_{20} + c_{21})$ and $(c_{10} - c_{11}, c_{20} - c_{21})$. The vector is then stretched by another generator into a parallelogram. The more generator variables there are, the more faces there are in the resulting zonotope. However, each zonotope can be described as a convex figure resulting from the summation of different parallelograms. The name zonotope is derived from zona, meaning ‘belt’ in Latin. Zonotopes are named such because we can trace an uninterrupted path through the parallel vectors that wrap around the figure like a belt.

In n-dimensions, a zonotope with m-generators is the set of all points

$$\left\{ \left( \underbrace{c_{10} + \sum_{i = 1}^{m} c_{1 i} ϵ_{i}}_{\text{first dimension}}, \dots, \underbrace{c_{n 0} + \sum_{i = 1}^{m} c_{ni} ϵ_{i}}_{n\text{th dimension}} \right) \;\middle|\; ϵ_{i} \in [- 1, 1] \right\}$$

Example: $(1 + ϵ_{1}, 2 + ϵ_{2})$, i.e., $(1 + ϵ_{1} + 0 ϵ_{2}, 2 + 0 ϵ_{1} + ϵ_{2})$. The centre is $(1, 2)$; the centre of a zonotope is always the vector of the constant coefficients.

$(2 + ϵ_{1}, 2 + ϵ_{1})$: the two dimensions are equal, so we get a line segment centred at $(2, 2)$. Zonotopes allow us to model relational sets with the help of the generator variables; this zonotope models a set of the form $(x, x)$.

$(2 + ϵ_{1}, 3 + ϵ_{1} + ϵ_{2})$: the coefficients of $ϵ_{1}$ are $(1, 1)$, so it stretches the centre $(2, 3)$ along the $(1, 1)$ vector

the coefficients of $ϵ_{2}$ are $(0, 1)$, so it stretches all points along the $(0, 1)$ vector

The result is the final zonotope; adding more generators adds more faces to the zonotope.

$$\left\{ \left( \underbrace{c_{10} + \sum_{i = 1}^{m} c_{1 i} ϵ_{i}}_{\text{first dimension}}, \dots, \underbrace{c_{n 0} + \sum_{i = 1}^{m} c_{ni} ϵ_{i}}_{n\text{th dimension}} \right) \;\middle|\; ϵ_{i} \in [- 1, 1] \right\}$$

We will use a compact notation to signify the above equation, it will be defined as a tuple of vectors of coefficients

(⟨ c_{10}, \dots, c_{1 m} ⟩, \dots, ⟨ c_{n 0}, \dots, c_{nm} ⟩)

for an even more compact notation

(⟨ c_{1 i} ⟩_{i}, \dots, ⟨ c_{ni} ⟩_{i})

where $i$ ranges from $0$ to $m$, the number of generators. We can compute the upper bound of the zonotope in the $j$-th dimension by solving the following optimization problem:

$$\max \; c_{j 0} + \sum_{i = 1}^{m} c_{ji} ϵ_{i} \quad \text{s.t. } ϵ_{i} \in [- 1, 1]$$

which can easily be solved by setting $ϵ_{i}$ to $1$ if $c_{ji} > 0$ and $- 1$ otherwise.

Similarly, we can compute the lower bound of the zonotope in the $j$-th dimension by minimizing instead of maximizing, solving the optimization problem by setting $ϵ_{i} = - 1$ if $c_{ji} > 0$ and $1$ otherwise.

Example: $(2 + ϵ_{1}, 3 + ϵ_{1} + ϵ_{2}) ⟹ (⟨ 2, 1, 0 ⟩, ⟨ 3, 1, 1 ⟩)$

Upper Bound in vertical dimension: $3 + ϵ_{1} + ϵ_{2}; ϵ_{1}, ϵ_{2} = 1 ⟹ 3 + 1 + 1 = 5$
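Because each generator independently takes its best value at $\pm 1$, both bounds reduce to the constant coefficient plus or minus the sum of the absolute generator coefficients. A small sketch, where a one-dimensional zonotope is a list `[c0, c1, ..., cm]` and the name `bounds` is an illustrative choice:

```python
def bounds(coeffs):
    """Lower and upper bound of c0 + sum_i c_i * eps_i with eps_i in [-1, 1]."""
    c0, gens = coeffs[0], coeffs[1:]
    radius = sum(abs(c) for c in gens)
    return (c0 - radius, c0 + radius)

horizontal = bounds([2, 1, 0])   # (1, 3)
vertical = bounds([3, 1, 1])     # (1, 5)
```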

$f (x, y) = x + y$

Define: $f^{a}$

$f^{a} (⟨ c_{10}, \dots, c_{1 m} ⟩, ⟨ c_{20}, \dots, c_{2 m} ⟩)$ (compare to $f^{a} ([l_{1}, u_{1}], [l_{2}, u_{2}])$ for intervals)

$= ⟨ c_{10} + c_{20}, \dots, c_{1 m} + c_{2 m} ⟩$

Example: for $(0 + ϵ_{1}, 1 + ϵ_{2})$, $f^{a} (⟨ 0, 1, 0 ⟩, ⟨ 1, 0, 1 ⟩) = ⟨ 1, 1, 1 ⟩$

The output zonotope is the set $\{1 + ϵ_{1} + ϵ_{2} \mid ϵ_{1}, ϵ_{2} \in [- 1, 1]\}$, which is the interval $[- 1, 3]$.

Affine Functions

$$f (x_{1}, \dots, x_{n}) = \sum_{j} a_{j} x_{j}$$

where $a_{j} \in R$.

$$f^{a} (⟨ c_{1 i} ⟩_{i}, \dots, ⟨ c_{ni} ⟩_{i}) = \left⟨ \sum_{j} a_{j} c_{j 0}, \dots, \sum_{j} a_{j} c_{jm} \right⟩$$

$f (x, y) = 3 x + 2 y$
$f^{a} (⟨ 1, 2, 3 ⟩, ⟨ 0, 1, 1 ⟩) = ⟨ f (1, 0), f (2, 1), f (3, 1)⟩ = ⟨ 3, 8, 11 ⟩$
$\{3 + 8 ϵ_{1} + 11 ϵ_{2} \mid ϵ_{1}, ϵ_{2} \in [- 1, 1]\}$
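The affine transformer simply applies $f$ coefficient-wise, which is why zonotopes are exact for affine functions. A sketch with each input dimension given as a coefficient list `[c0, ..., cm]`; the name `affine_z` is hypothetical.

```python
def affine_z(a, zonotope):
    """Zonotope abstract transformer of f(x_1..x_n) = sum_j a_j * x_j.
    zonotope is a list of n coefficient vectors of equal length m + 1."""
    width = len(zonotope[0])
    return [sum(aj * z[i] for aj, z in zip(a, zonotope)) for i in range(width)]

addition = affine_z([1, 1], [[0, 1, 0], [1, 0, 1]])   # [1, 1, 1]
weighted = affine_z([3, 2], [[1, 2, 3], [0, 1, 1]])   # [3, 8, 11]
```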

Abstract Transformer of Activation Functions

For interval domains we had: $\mathrm{relu}^{a} ([l, u]) = [\mathrm{relu} (l), \mathrm{relu} (u)]$
This formulation does not know how the inputs are related to their corresponding outputs. Geometrically, the interval domain approximates the ReLU with a box. The breadth of the box depends on the lower bound of the interval, i.e., whether $l$ is positive or negative. Using zonotopes, the ReLU abstract transformer is built using a 1-dimensional zonotope $⟨ c_{i} ⟩_{i}$ as input, with lower bound $l$ and upper bound $u$:

$$\mathrm{relu}^{a} (⟨ c_{i} ⟩_{i}) = \begin{cases} ⟨ c_{i} ⟩_{i} & \text{for } l \geq 0 \\ ⟨ 0 ⟩_{i} & \text{for } u \leq 0 \\ ? & \text{otherwise} \end{cases}$$

If the lower bound is greater than or equal to zero, we return the input; if the upper bound is less than or equal to zero, we return zero. Since zonotopes allow us to relate input and output, we can shear the rectangle to form a better approximation of ReLU.

The bottom face of the zonotope is $y = λ x$, for some slope $λ$. The top face is $y = λ x + u (1 - λ)$. For $λ = 0$, we get the base rectangle that is the interval domain for the ReLU. The steepness of the shear depends on the $λ$ parameter, which cannot be more than $u / (u - l)$ to ensure that the parallelogram covers the ReLU along the input range $[l, u]$. The distance between the top and bottom faces of the parallelogram is $u (1 - λ)$. Thus the centre of the zonotope sits at half this height above the bottom face:

$$η = \frac{u (1 - λ)}{2}$$

With this information, we can now complete the definition of $re l u^{a}$ as follows:

$$\mathrm{relu}^{a} (⟨ c_{i} ⟩_{i}) = \begin{cases} ⟨ c_{i} ⟩_{i} & \text{for } l \geq 0 \\ ⟨ 0 ⟩_{i} & \text{for } u \leq 0 \\ ⟨ λ c_{0}, λ c_{1}, \dots, λ c_{m}, 0 ⟩ + ⟨ η, 0, 0, \dots, η ⟩ & \text{otherwise} \end{cases}$$

we have added a new generator, $ϵ_{m + 1}$ , in order to stretch the parallelogram in the vertical axis; its coefficient is $η$ , which is half the height of the parallelogram. We also add the input zonotope scaled by $λ$ with coefficient $0$ for the new generator to ensure that we capture the relationship between the input and output.
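A sketch of this transformer is given below. It makes two assumptions that are not fixed by the text: $λ$ defaults to its largest allowed value $u / (u - l)$, and the output is always padded with a coefficient for the new generator (zero in the first two cases) so that every case returns a vector of the same length. The names `bounds` and `relu_z` are illustrative.

```python
def bounds(z):
    radius = sum(abs(c) for c in z[1:])
    return (z[0] - radius, z[0] + radius)

def relu_z(z, lam=None):
    """Zonotope abstract transformer of ReLU for a 1-D zonotope [c0, ..., cm].
    Returns m + 2 coefficients; the last belongs to the fresh generator."""
    l, u = bounds(z)
    if l >= 0:
        return list(z) + [0]            # ReLU is the identity here
    if u <= 0:
        return [0] * (len(z) + 1)       # ReLU is constantly zero here
    if lam is None:
        lam = u / (u - l)               # assumed choice: steepest sound slope
    eta = u * (1 - lam) / 2
    out = [lam * c for c in z] + [eta]  # scaled input plus new generator
    out[0] += eta                       # shift the centre up by eta
    return out

crossing = relu_z([0, 1])   # input x = eps_1 in [-1, 1]
```

For the input $ϵ_{1} \in [- 1, 1]$ this gives $λ = 0.5$ and $η = 0.25$, i.e., the output $0.25 + 0.5 ϵ_{1} + 0.25 ϵ_{2}$, whose bottom and top faces are $y = 0.5 x$ and $y = 0.5 x + 0.5$.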

Neural Polyhedron Abstraction

So far we have been approximating functions using a hyperrectangle with interval domains, which are non-relational. The zonotope domain allows us to approximate functions using a zonotope, e.g., a parallelogram, and capture relations between different dimensions. The polyhedron domain is a more expressive abstract domain, as it enables us to approximate functions using arbitrary convex polyhedra. A polyhedron in $R^{n}$ is a region made of straight faces. Convex polyhedra are shapes in which the line segment between any two points of the shape is completely contained in the shape; they can be specified as a set of linear inequalities. This approximation is more precise for ReLU functions than those afforded by the interval and zonotope domains, as the shapes used to approximate are not limited to hyperrectangles or parallelograms.

The interval domain abstracts a function using a hyperrectangle → the zonotope abstract domain approximates a function using zonotopes/parallelograms → the polyhedra abstract domain approximates a function using arbitrary convex polyhedra

Polyhedra are defined in a manner analogous to a zonotope abstractions, using a set of $m$ generator variables, $ϵ_{1}, \dots, ϵ_{m}$ which are then bounded by a set of linear inequalities instead of being limited to the interval $[- 1, 1]$ as is the case for zonotopes.

A zonotope in $R^{n}$ can be written as

$$\left\{ \left( c_{10} + \sum_{i = 1}^{m} c_{1 i} ϵ_{i}, \dots, c_{n 0} + \sum_{i = 1}^{m} c_{ni} ϵ_{i} \right) \;\middle|\; F (ϵ_{1}, \dots, ϵ_{m}) \right\}$$

where $F (ϵ_{1}, \dots, ϵ_{m})$ is a Boolean function that evaluates to $true$ iff all of its arguments are in $[- 1, 1]$. For a polyhedron, $F$ is instead defined as a set (conjunction) of linear inequalities over the generator variables, e.g.,

$$F (ϵ_{1}, ϵ_{2}) \equiv 0 \leq ϵ_{1} \leq 5 \land ϵ_{1} = ϵ_{2}$$

$F$ must define a bounded polyhedron over the generator variables, giving a lower and upper bound for each generator. For example, $ϵ_{1} \leq 0$ alone is not allowed, because it does not enforce a lower bound on $ϵ_{1}$. In one dimension, a polyhedron is simply a bounded interval.

To find the upper and lower bounds of a polyhedron, we need to solve a linear program which takes polynomial time in the number of variables and constraints.

To compute the lower bound of the $j$ -th dimension, we solve for the following:

$$\min \; c_{j 0} + \sum_{i = 1}^{m} c_{ji} ϵ_{i} \quad \text{s.t. } F$$

Similarly to calculate the upper bound of $j$ -th dimension, we just take the $max$ instead of minimizing the generator terms.

A given polyhedron in $R^{n}$

$$\left\{ \left( c_{10} + \sum_{i = 1}^{m} c_{1 i} ϵ_{i}, \dots, c_{n 0} + \sum_{i = 1}^{m} c_{ni} ϵ_{i} \right) \;\middle|\; F (ϵ_{1}, \dots, ϵ_{m}) \right\}$$

will be abbreviated as:

(⟨ c_{1 i} ⟩_{i}, \dots, ⟨ c_{ni} ⟩_{i}, F)

Example: Given polyhedron: ${(ϵ_{1}, ϵ_{2}) ∣ F (ϵ_{1}, ϵ_{2})}$

$F \equiv (0 \leq ϵ_{1} \leq 1) \land (ϵ_{2} \leq ϵ_{1}) \land (ϵ_{2} \geq 0)$

From the Boolean function we can see that a bounded region of the x-y plane is being defined. For ease of visualisation, we can replace $ϵ_{1}$ and $ϵ_{2}$ with $x$ and $y$, which gives us a set of inequalities that $x$ and $y$ must satisfy:

$$0 \leq x \leq 1 \qquad y \leq x \qquad y \geq 0$$

This set of inequalities defines the feasible region of an LP: a triangle situated in the first quadrant of the x-y plane. Thus the polyhedron comprises all of the points inside this triangle.

$(⟨ 0, 1, 0 ⟩, ⟨ 0, 0, 1 ⟩, F)$

We need to solve the following four linear programs to get the upper and lower bounds on the generators:

$$\max ϵ_{1} \text{ s.t. } F \qquad \min ϵ_{1} \text{ s.t. } F \qquad \max ϵ_{2} \text{ s.t. } F \qquad \min ϵ_{2} \text{ s.t. } F$$

Thus the valid interval domains of the generators are: $ϵ_{1} \in [0, 1]$ and $ϵ_{2} \in [0, 1]$ .

Abstract Transformers for Polyhedra

Affine Functions

For an affine function $f (x_{1}, \dots, x_{n}) = \sum_{j} a_{j} x_{j}$, where $a_{j} \in R$, we have the following abstract transformer:

$$f^{a} (⟨ c_{1 i} ⟩_{i}, \dots, ⟨ c_{ni} ⟩_{i}, F) = \left( \left⟨ \sum_{j} a_{j} c_{j 0}, \dots, \sum_{j} a_{j} c_{jm} \right⟩, F \right)$$

The set of linear inequalities does not change between the input and output of the function.

Example:

$f (x, y) = 3 x + 2 y$

$f^{a} (⟨ 1, 2, 3 ⟩, ⟨ 0, 1, 1 ⟩, F) = (⟨ 3, 8, 11 ⟩, F)$

The abstract transformer for ReLU generated in polyhedron domain:

$\mathrm{relu}^{a}$ in convex polyhedra

Top face: $y = \frac{u (x - l)}{u - l}$, from the equation of a straight line through $(x_{1}, y_{1})$ and $(x_{2}, y_{2})$, where $x_{1} = l$, $x_{2} = u$ and $y_{1} = \mathrm{relu} (l) = 0$, $y_{2} = \mathrm{relu} (u) = u$.

We need to compute the shaded area, which is bounded by $y = 0$ from below, $y = x$ from the right, and $y = \frac{u (x - l)}{u - l}$ from above. We define $\mathrm{relu}^{a}$ as

$$\mathrm{relu}^{a} (⟨ c_{i} ⟩_{i}, F) = (⟨ \underbrace{0, 0, \dots, 0}_{m + 1}, 1 ⟩, F^{'})$$

where

$$F^{'} \equiv F \land \left( ϵ_{m + 1} \leq \frac{u (⟨ c_{i} ⟩ - l)}{u - l} \right) \land (ϵ_{m + 1} \geq 0) \land (ϵ_{m + 1} \geq ⟨ c_{i} ⟩)$$

$l$ and $u$ are the lower and upper bounds of the input polyhedron that can be computed using linear programming.
$〈 c_{i} 〉_{i}$ is being used to denote the full term $c_{0} + \sum_{i = 1}^{m} c_{i} ϵ_{i}$ .
And a new generator, $ϵ_{m + 1}$ has been added, the new set of constraints $F^{'}$ relates this new generator to the input, effectively defining the shaded region.

Condensing the constraints into two dimensions:

$$\mathrm{relu}^{a} (⟨ 0, 1 ⟩, - 1 \leq ϵ_{1} \leq 1) = (⟨ 0, 0, 1 ⟩, F^{'})$$

where $ϵ_{1}$ determines $x$ and $ϵ_{2}$ determines $y$, thus

$$F^{'} \equiv (- 1 \leq ϵ_{1} \leq 1) \land \left( ϵ_{2} \leq \frac{ϵ_{1} + 1}{2} \right) \land (ϵ_{2} \geq 0) \land (ϵ_{2} \geq ϵ_{1})$$

Abstractly Interpreting Neural Networks

$G = (V, E)$, $f_{G} : R^{n} \to R^{m}$

$$f_{G}^{a} : n\text{-dimensional polyhedron} \to m\text{-dimensional polyhedron}$$

$f_{G}^{a} (⟨ c_{1 j} ⟩_{j}, \dots, ⟨ c_{nj} ⟩_{j}, F)$

For every input node $v_{i}$, we have $out^{a} (v_{i}) = (⟨ c_{ij} ⟩_{j}, F)$
For every non-input node $v$, we have $out^{a} (v) = f_{v}^{a} (p_{1}, \dots, p_{k}, ⋀_{i = 1}^{k} F_{i})$ where $f_{v}^{a}$ is the abstract transformer of $f_{v}$, $v$ has incoming edges $(v_{1}, v), \dots, (v_{k}, v)$ and $out^{a} (v_{i}) = (p_{i}, F_{i})$
The output of $f_{G}^{a}$ is the $m$-dimensional polyhedron $(p_{1}, \dots, p_{m}, ⋀_{i = 1}^{m} F_{i})$, where $v_{1}, \dots, v_{m}$ are the output nodes and $out^{a} (v_{i}) = (p_{i}, F_{i})$

Some abstract transformers for activation functions add new generators, we assume all of them were already in the polyhedron but with coefficients set to $0$ , they get non-zero coefficients only in the output of activation functions.

Abstract Interpretation based Verification

For verification using the Hoare triple to work, we require a sound representation of the set of values of $x$ in the abstract domain, along with the abstract transformer of the neural network $f$ applied to that representation, which results in an over-approximation of the values of $r$. We then need to check that all of these values of $r$ satisfy the postcondition.

The generic robustness property can be defined as:

$$\{∥ x - c ∥_{p} \leq ϵ\} \quad r \leftarrow f (x) \quad \{\mathrm{class} (r) = y\}$$

Here for the sake of generality, we do not specify a particular norm, instead we define an $l_{p}$ -norm. The main norms that we will be considering in this case are the $l_{2}$ -norm and the $l_{\infty}$ -norm both of which are distance metrics but serve very different purposes.

$$l_{p} \text{ norm } ∥ z ∥_{p}: \quad \begin{cases} l_{2} \text{ norm:} & ∥ z ∥_{2} = \sqrt{\sum_{i} |z_{i}|^{2}} \\ l_{\infty} \text{ norm:} & ∥ z ∥_{\infty} = \max_{i} |z_{i}| \end{cases}$$

The $l_{2}$-norm is the length of the straight line between two images in $R^{n}$, and the $l_{\infty}$-norm is the largest discrepancy between two corresponding pixels. Each is useful depending on what kinds of deviations we wish to capture. If there is a great deal of variability restricted to some locality of the image, we can use the $l_{2}$-norm, as it allows a small number of pixels to differ significantly in brightness. If the noise is random and spread out through the entire image, the $l_{\infty}$-norm is more suitable, as it bounds the maximum discrepancy that any pair of corresponding pixels in the two images can have.

Abstracting the Precondition

We have the precondition as: $\{x ∣ ∥ x - c ∥_{\infty} \leq ϵ\}$

Interval: $I = ([c_{1} - ϵ, c_{1} + ϵ], \dots, [c_{n} - ϵ, c_{n} + ϵ])$. The $l_{\infty}$-norm allows us to change each element of $c$ by up to $ϵ$ independently of the other dimensions, so this hyperrectangle represents the precondition precisely.

Applying the abstract transformer gives $f^{a} (I) = I^{'} = ([l_{1}, u_{1}], [l_{2}, u_{2}], \dots, [l_{m}, u_{m}])$, which represents all possible values of $r$, and possibly more.

We have to prove that $\forall r \in I^{'}$, $\text{class} (r) = y$. We see that if $l_{y} > u_{i}$ for all $i \neq y$, then $\forall r \in I^{'}, \text{class} (r) = y$.

In other words, if the lower bound of the $y$-th interval is larger than the upper bounds of all other intervals, then we know that the classification is always $y$.

Note: if $l_{y} \leq u_{i}$ for some $i \neq y$, this does not disprove the property; we simply cannot conclude anything, so this is a one-sided check. In simpler words, abstract interpretation gives us a range of probabilities for each class index. Verification succeeds only if the lower bound for class $y$ ( $l_{y}$ ) is higher than the upper bounds of all other classes ( $u_{i}$ ), i.e. the range for $y$ lies strictly above every other range without overlapping. If the range for $y$ overlaps with the range of another class, we cannot be sure which class wins inside the overlapping region, because the overlap may only be an artifact of the over-approximation.

Example:

$f^{a} (I) = I^{'} = ([0.1, 0.2], [0.3, 0.4])$: here $\text{class} (r) = 2$ for all $r \in I^{'}$, since $0.3 > 0.2$, so verification succeeds.

If instead $I^{'} = ([0.1, 0.2], [0.15, 0.4])$, the two intervals overlap in the 0.15 to 0.2 region. This means that we cannot conclusively say that $\text{class} (r) = 2$ for all $r \in I^{'}$, so verification fails: there exists $r \in I^{'}$, for example $r = (0.2, 0.15)$, whose first component is larger than its second.
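The check above can be sketched in a few lines. This is only an illustration: the function name is made up, classes are 0-indexed (so class 2 in the example is index 1), and a result of `False` means "unknown", not "not robust".

```python
def verify_class(intervals, y):
    """Return True if class index y is provably the largest output for every
    point in the box, i.e. l_y > u_i for all i != y. False means 'unknown'."""
    l_y = intervals[y][0]
    return all(l_y > u for i, (_, u) in enumerate(intervals) if i != y)

# The two examples from the text (class 2 is index 1 here).
print(verify_class([(0.1, 0.2), (0.3, 0.4)], y=1))    # intervals are disjoint
print(verify_class([(0.1, 0.2), (0.15, 0.4)], y=1))   # intervals overlap
```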

Verifying Robustness with Zonotopes

We can also check the $l_{\infty}$ robustness property using zonotopes. Since the precondition is a hyperrectangle, it can be represented precisely as a zonotope $z$. Abstractly interpreting the network gives:

$$f^{a} (z) = z^{'}$$

We want to ensure that dimension $y$ is greater than all others. This problem is akin to checking whether a 1D zonotope is always $> 0$.

$$z^{'} = (⟨ c_{1i} ⟩, \dots, ⟨ c_{mi} ⟩)$$

To check that dimension $y$ is greater than dimension $j$, we check whether the lower bound of the 1D zonotope $⟨ c_{yi} ⟩ - ⟨ c_{ji} ⟩$ is $> 0$.

Example: $z^{'} = (2 + ϵ_{1}, 4 + ϵ_{1} + ϵ_{2})$. For this region, we can see graphically that $y > x$ for any point $(x, y)$. To check $y > x$ mechanically, we subtract the $x$-dimension from the $y$-dimension:

$$(4 + ϵ_{1} + ϵ_{2}) - (2 + ϵ_{1}) = 2 + ϵ_{2}$$

The resulting 1D zonotope $(2 + ϵ_{2})$ denotes the interval $[1, 3]$, whose lower bound is greater than zero.

In general, for a zonotope with $m$ generators and $n$ dimensions

$$(c_{10} + c_{11} ϵ_{1} + \dots + c_{1m} ϵ_{m}, \; c_{20} + c_{21} ϵ_{1} + \dots + c_{2m} ϵ_{m}, \; \dots, \; c_{n0} + c_{n1} ϵ_{1} + \dots + c_{nm} ϵ_{m})$$

and a target dimension $y$, we subtract dimension $i$ from dimension $y$ for every $i \neq y$ and check whether the lower bound of the resulting 1D zonotope is $> 0$.

(We get one range per dimension $i \neq y$; if every range is $> 0$, then $y$ is the class.)
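A small sketch of this check, assuming each noise symbol $ϵ_{k}$ ranges over $[-1, 1]$ (as in the $[1, 3]$ example) and representing each dimension as a list of coefficients `[c0, c1, ..., cm]`. The function names and the 0-indexed dimensions are choices made for this illustration.

```python
def zonotope_lower_bound(coeffs):
    """Lower bound of the 1D zonotope c0 + c1*e1 + ... + cm*em, each e_k in [-1, 1]."""
    center, generators = coeffs[0], coeffs[1:]
    return center - sum(abs(c) for c in generators)

def verify_class_zonotope(zonotope, y):
    """zonotope: one coefficient list [c0, c1, ..., cm] per dimension.
    Returns True if dimension y provably exceeds every other dimension."""
    for i, dim in enumerate(zonotope):
        if i == y:
            continue
        # Subtract dimension i from dimension y, coefficient by coefficient.
        diff = [cy - ci for cy, ci in zip(zonotope[y], dim)]
        if zonotope_lower_bound(diff) <= 0:
            return False
    return True

# z' = (2 + e1, 4 + e1 + e2) from the example above.
z_prime = [[2, 1, 0], [4, 1, 1]]
print(verify_class_zonotope(z_prime, y=1))
```

Projected onto intervals, the two dimensions are $[1, 3]$ and $[2, 6]$, which overlap, so the interval check would be inconclusive here. The zonotope check succeeds because the shared $ϵ_{1}$ cancels in the subtraction.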

Verifying Robustness with Polyhedra

This is analogous to the zonotope case but requires invoking a linear program solver.

We represent the interval as a hyperrectangular polyhedron $Y$. Then we evaluate $f^{a} (Y)$, resulting in a polyhedron

$$Y^{'} = (⟨ c_{1i} ⟩, \dots, ⟨ c_{mi} ⟩, F)$$

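To check whether dimension $y$ is greater than dimension $j$ for every point of $Y^{'}$, we ask a linear program solver whether a counterexample exists, i.e. whether the following constraints are satisfiable:

$$F \land ⟨ c_{yi} ⟩ \leq ⟨ c_{ji} ⟩$$

If they are unsatisfiable, then $⟨ c_{yi} ⟩ > ⟨ c_{ji} ⟩$ holds for every assignment that satisfies $F$.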

Robustness in $l_{2}$ -norm

Let $c = (0, 0)$ and $ϵ = 1$. Then $\{x ∣ ∥ x - c ∥_{2} \leq ϵ\}$ defines the unit disk around $(0, 0)$.

This set cannot be represented precisely in the interval domain; the best we can do is the box $([- 1, 1], [- 1, 1])$. With polyhedra, we can use more and more faces to approximate the circle more accurately, but there is a precision-scalability trade-off that comes into the picture.
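To get a feel for the trade-off, here is a rough sketch that assumes the polyhedron is a regular polygon tightly enclosing the unit circle (an assumption made for illustration, not something the text prescribes). It reports how far the farthest over-approximated point lies from the origin; the exact set would give 1.

```python
import math

def circumscribed_polygon_radius(faces):
    """Distance from the origin to a vertex of the regular polygon with
    `faces` sides that tightly encloses the unit circle."""
    return 1.0 / math.cos(math.pi / faces)

for faces in (4, 8, 16, 64):
    r = circumscribed_polygon_radius(faces)
    print(f"{faces:3d} faces: farthest point at distance {r:.4f}")
```

With 4 faces this is exactly the interval box $([- 1, 1], [- 1, 1])$, whose corners lie at distance $\sqrt{2}$. More faces shrink the slack, at the cost of more constraints for the solver to handle.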

Robustness in NLP

Synonyms of words should not confuse neural networks. Let $S_{w}$ be the complete set of synonyms of a word $w$. The input is a vector whose $i$-th element is the embedding of the $i$-th token/word of the sentence.

Correctness Property:

$$\{x_{i} \in S_{c_{i}} \; \forall i\} \; r \leftarrow f (x) \; \{\text{class} (r) = y\}$$

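The precondition describes all vectors $x$ that are like $c$ but where some words are replaced by synonyms. The set of possible vectors $x$ is finite, but exponential in the length of the input sentence, so instead of enumerating it we over-approximate it in the interval domain:

$$([\min S_{c_{1}}, \max S_{c_{1}}], \dots, [\min S_{c_{n}}, \max S_{c_{n}}])$$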
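A minimal sketch of this abstraction, assuming each word is embedded as a single real number. The embedding values below are invented purely for illustration.

```python
from math import prod

def synonym_box(synonym_sets):
    """Interval abstraction ([min S_1, max S_1], ..., [min S_n, max S_n])."""
    return [(min(s), max(s)) for s in synonym_sets]

# Hypothetical 1D embeddings: each inner list holds a word c_i and its synonyms.
synonym_sets = [[0.2, 0.25, 0.31], [1.0], [0.7, 0.64]]

box = synonym_box(synonym_sets)
num_sentences = prod(len(s) for s in synonym_sets)

print(box)             # one interval per word position
print(num_sentences)   # number of concrete sentences the box must cover
```

The box needs one interval per word, while the number of concrete sentences is the product of the synonym-set sizes, which is what makes enumeration impractical for long sentences.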

Abstract Training of Neural Networks

Training data: $\{(x_{1}, y_{1}), \dots, (x_{m}, y_{m})\}$, where $x_{i} \in R^{n}$ and, for a binary classification model, the label is $y_{i} \in \{0, 1\}$.

We assume we have a family of functions represented as a parameterized function $f_{θ}$, where $θ$ is the vector of weights. Training searches the space of $θ$ for the best values of $θ$.

Family of affine functions: $f_{θ} (x) = θ_{1} + θ_{2} x_{1} + θ_{3} x_{2}$

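Solve the optimization problem: $\arg\min_{θ} \frac{1}{m} \sum_{i = 1}^{m} 1 [f_{θ} (x_{i}) \neq y_{i}]$

$$1 [b] = \begin{cases} 1 & \text{if } b \text{ is True} \\ 0 & \text{if } b \text{ is False} \end{cases}$$

This objective is difficult to optimize because the Boolean indicator function is non-differentiable, so we use the mean squared error (MSE) instead:

$$\arg\min_{θ} \frac{1}{m} \sum_{i = 1}^{m} (f_{θ} (x_{i}) - y_{i})^{2}$$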
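The following sketch compares the two objectives on the affine family above. The toy dataset is invented, and turning the real-valued output into a label by thresholding at 0.5 is an assumption added here so that the 0-1 loss can be evaluated.

```python
def f(theta, x):
    """Affine model f_theta(x) = theta_1 + theta_2*x_1 + theta_3*x_2."""
    return theta[0] + theta[1] * x[0] + theta[2] * x[1]

def zero_one_loss(theta, data):
    # Assumption: the output is mapped to a label by thresholding at 0.5.
    return sum((f(theta, x) >= 0.5) != bool(y) for x, y in data) / len(data)

def mse_loss(theta, data):
    return sum((f(theta, x) - y) ** 2 for x, y in data) / len(data)

# Hypothetical toy dataset: label is 1 unless both inputs are 0.
data = [((0.0, 0.0), 0), ((1.0, 0.0), 1), ((0.0, 1.0), 1), ((1.0, 1.0), 1)]

# Nudge theta_1 slightly and compare how the two losses react.
for theta in [(0.0, 1.0, 1.0), (0.01, 1.0, 1.0)]:
    print(theta, zero_one_loss(theta, data), mse_loss(theta, data))
```

The 0-1 loss does not move under the small change in $θ$, so it gives gradient-based search no direction to follow, whereas the MSE changes smoothly.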

The family of functions $f_{θ}$ is represented as a neural network graph $G_{θ}$, where every node $v$'s function $f_{v}$ may be parameterized by $θ$. Formally, we solve:

$$\arg\min_{θ} \frac{1}{m} \sum_{i = 1}^{m} L (θ, x_{i}, y_{i})$$

The loss function $L$ can also be represented as a neural network: $L$ is an extension of the graph $G_{θ}$ with an extra node at the very end that computes the loss.

$f_{θ} : R^{n} \to R$

flowchart LR

v1[v₁] & v2[v₂] --> A((...)) & B((...))
A & B--> vo[vₒ]

subgraph "intermediate nodes"
A
B
end

We can construct a graph for the loss function $L (θ, x, y)$ by adding an input node $v_{y}$ for the label $y$ and creating a new output node $v_{L}$ that compares the output of $f_{θ}$ (the node $v_{o}$ ) with $y$ .

flowchart LR

v1[v₁] & v2[v₂] -.- vo((vₒ)) --> vL[v_L]
vy[v_y] --> vL

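The input node $v_{y}$ takes in the label $y$, and $f_{v_{L}}$ encodes the loss function, e.g., MSE.

To minimize a function $g (θ)$ we use its gradient, the vector of partial derivatives:

$$\nabla g = (\frac{\partial g}{\partial θ_{1}}, \dots, \frac{\partial g}{\partial θ_{n}})$$

For a particular $θ^{0}$, the gradient evaluates to:

$$(\nabla g) (θ^{0}) = (\frac{\partial g (θ^{0})}{\partial θ_{1}}, \dots, \frac{\partial g (θ^{0})}{\partial θ_{n}})$$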

Gradient Descent:

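1. Start with $j = 0$ and a random value of $θ$, called $θ^{0}$
2. Set $θ^{j + 1}$ to $θ^{j} - η ((\nabla g) (θ^{j}))$
3. Set $j$ to $j + 1$ and repeat from step 2

Here $η > 0$ is the learning rate. For our training objective, the gradient is the average of the per-example gradients:

$$\frac{1}{m} \sum_{i = 1}^{m} \nabla L (θ, x_{i}, y_{i})$$

so the update step becomes: set $θ^{j + 1}$ to $θ^{j} - \frac{η}{m} \sum_{i = 1}^{m} \nabla L (θ^{j}, x_{i}, y_{i})$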
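A sketch of this update rule for the affine family $f_{θ} (x) = θ_{1} + θ_{2} x_{1} + θ_{3} x_{2}$ with squared-error loss. The toy dataset, the learning rate, the step count and the fixed (rather than random) starting point are all choices made for this illustration.

```python
def f(theta, x):
    return theta[0] + theta[1] * x[0] + theta[2] * x[1]

def mse(theta, data):
    return sum((f(theta, x) - y) ** 2 for x, y in data) / len(data)

def grad_L(theta, x, y):
    """Gradient of the squared error (f_theta(x) - y)^2 with respect to theta."""
    err = 2.0 * (f(theta, x) - y)
    return [err, err * x[0], err * x[1]]

def gradient_descent(data, theta, eta=0.1, steps=500):
    m = len(data)
    for _ in range(steps):
        # Sum the per-example gradients over the whole dataset.
        total = [0.0, 0.0, 0.0]
        for x, y in data:
            total = [t + g for t, g in zip(total, grad_L(theta, x, y))]
        # theta^{j+1} = theta^j - (eta / m) * sum of gradients
        theta = [t - eta / m * s for t, s in zip(theta, total)]
    return theta

# Hypothetical toy dataset; fixed start instead of a random one, for reproducibility.
data = [((0.0, 0.0), 0), ((1.0, 0.0), 1), ((0.0, 1.0), 1), ((1.0, 1.0), 1)]
theta0 = [0.0, 0.0, 0.0]
theta = gradient_descent(data, theta0)
print(mse(theta0, data), "->", mse(theta, data))
```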

SGD/mini-batch gradient descent:

1. Start with $j = 0$ and a random value of $θ$, called $θ^{0}$
2. Divide the dataset into a random set of $k$ batches: $B_{1}, B_{2}, \dots, B_{k}$
3. For $i$ from 1 to $k$ :
   - Set $θ^{j + 1}$ to $θ^{j} - \frac{η}{m} \sum_{(x, y) \in B_{i}} \nabla L (θ^{j}, x, y)$
   - Set $j$ to $j + 1$
4. Repeat from step 2

The number of batches $k$ (and hence the size of each batch) depends on how much data can be put into the GPU at any one point.
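The mini-batch variant can be sketched the same way. As before, the affine model, toy dataset, hyperparameters and the seeded random generator are assumptions made for this illustration.

```python
import random

def f(theta, x):
    return theta[0] + theta[1] * x[0] + theta[2] * x[1]

def mse(theta, data):
    return sum((f(theta, x) - y) ** 2 for x, y in data) / len(data)

def grad_L(theta, x, y):
    err = 2.0 * (f(theta, x) - y)
    return [err, err * x[0], err * x[1]]

def minibatch_sgd(data, theta, k=2, eta=0.1, epochs=300, seed=0):
    rng = random.Random(seed)   # seeded so the run is reproducible
    m = len(data)
    for _ in range(epochs):
        shuffled = list(data)
        rng.shuffle(shuffled)
        batches = [shuffled[i::k] for i in range(k)]   # k random batches
        for batch in batches:
            # Only the examples of the current batch contribute to this step.
            total = [0.0, 0.0, 0.0]
            for x, y in batch:
                total = [t + g for t, g in zip(total, grad_L(theta, x, y))]
            theta = [t - eta / m * s for t, s in zip(theta, total)]
    return theta

data = [((0.0, 0.0), 0), ((1.0, 0.0), 1), ((0.0, 1.0), 1), ((1.0, 1.0), 1)]
theta0 = [0.0, 0.0, 0.0]
theta = minibatch_sgd(data, theta0)
print(mse(theta0, data), "->", mse(theta, data))
```

Each step only touches one batch, so it is cheaper than a full gradient step, at the price of a noisier path towards the minimum.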

These optimization algorithms do not provide any robustness guarantees for the resulting networks, as they only minimize the average loss over the training points. The trained networks are often not robust to perturbations, and even when they are, abstract interpretation may fail to provide a proof due to its over-approximate nature. So we would like to train neural networks in such a way that they are friendly to abstract interpretation.

Redefining Optimization Objective for Robustness

(Robustness Optimization Objective)

For every $(x, y)$ in our dataset, we want the neural net to predict $y$ on all images $z$ such that $∥ x - z ∥_{\infty} \leq ϵ$ . This set can be characterized as

$$R (x) = \{z ∣ ∥ x - z ∥_{\infty} \leq ϵ\}$$

New Optimization Objective

$$\arg\min_{θ} \frac{1}{m} \sum_{i = 1}^{m} \max_{z \in R (x_{i})} L (θ, z, y_{i})$$

Instead of minimizing the loss for $(x_{i}, y_{i})$ , we minimize the loss for the worst-case perturbation of $x_{i}$ from the set $R (x_{i})$ . This is known as a robust optimization problem. Training the neural net using such an objective is known as adversarial training (very similar in appearance to minimax or other game playing techniques).
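For intuition, the inner maximization can be computed exactly in one special case: when the model is affine and the loss is the squared error, the loss is convex in $z$, so its maximum over the box $R (x)$ is attained at a corner. The sketch below relies on that assumption and enumerates all corners; the parameter values and dataset are made up.

```python
from itertools import product

def f(theta, x):
    return theta[0] + theta[1] * x[0] + theta[2] * x[1]

def loss(theta, x, y):
    return (f(theta, x) - y) ** 2

def worst_case_loss(theta, x, y, eps):
    """max over z in R(x) of L(theta, z, y), by enumerating the box corners.
    Only valid because the loss is convex in z (affine model, squared error)."""
    corners = product(*[(xi - eps, xi + eps) for xi in x])
    return max(loss(theta, z, y) for z in corners)

def robust_objective(theta, data, eps):
    return sum(worst_case_loss(theta, x, y, eps) for x, y in data) / len(data)

# Hypothetical parameters and toy dataset.
theta = [0.25, 0.5, 0.5]
data = [((0.0, 0.0), 0), ((1.0, 0.0), 1), ((0.0, 1.0), 1), ((1.0, 1.0), 1)]

print(robust_objective(theta, data, eps=0.0))   # ordinary average loss
print(robust_objective(theta, data, eps=0.1))   # worst-case average loss
```

A box in $n$ dimensions has $2^{n}$ corners, so this enumeration does not scale to images, and for general networks the maximum need not lie at a corner at all.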

Solving Robust Optimization via Abstract Interpretation

$R (x)$ can be represented precisely in the interval domain, as it is a set of images within an $l_{\infty}$ bound. We can over-approximate the inner maximization by abstractly interpreting $L$ on the entire set $R (x_{i})$ . By virtue of the soundness of the abstract transformer $L^{a}$ , we know:

$$(\max_{z \in R (x_{i})} L (θ, z, y_{i})) \leq u$$

where $L^{a} (θ, R (x_{i}), y_{i}) = [l, u]$ . That is, the upper bound $u$ of the abstract loss is a sound replacement for the inner maximization.

Thus the robust optimization objective can be rewritten as:

$$\arg\min_{θ} \frac{1}{m} \sum_{i = 1}^{m} \text{upper bound of } L^{a} (θ, R (x_{i}), y_{i})$$

Instead of treating $L^{a}$ as an abstract transformer, we can treat it as a function that takes a vector of inputs and returns upper and lower bounds. This is called flattening the abstract transformer.

Example: $\text{relu} (x) = \max (0, x)$

$\text{relu}^{a} ([l, u]) = [\max (0, l), \max (0, u)]$

Flattening $\text{relu}^{a}$ gives $\text{relu}^{af} : R^{2} \to R^{2}$ :

$\text{relu}^{af} (l, u) = (\max (0, l), \max (0, u))$ , which returns a tuple of values instead of an interval.

$$\arg\min_{θ} \frac{1}{m} \sum_{i = 1}^{m} L_{u}^{af} (θ, l_{i1}, u_{i1}, \dots, l_{in}, u_{in}, y_{i})$$

where $L_{u}^{af}$ returns only the upper bound of the output of $L^{af}$ , and

$$R (x_{i}) = ([l_{i1}, u_{i1}], \dots, [l_{in}, u_{in}])$$

SGD can optimize such objectives because all of the abstract transformers of the interval domain that are of interest for neural nets are differentiable almost everywhere. The same idea can also be adapted to zonotopes.
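A minimal sketch of a flattened abstract loss, assuming a tiny network made of one affine node followed by a relu and a squared-error loss. The network shape, function names, parameter values and dataset are all illustrative assumptions. Every function takes and returns plain numbers and uses only `+`, `*`, `min` and `max`, which is what makes the objective amenable to gradient-based training.

```python
def relu_af(l, u):
    """Flattened interval transformer for relu: numbers in, numbers out."""
    return max(0.0, l), max(0.0, u)

def affine_af(theta, bounds):
    """Flattened interval transformer for theta_1 + theta_2*x_1 + theta_3*x_2.
    bounds is [(l_1, u_1), (l_2, u_2)]."""
    lo = hi = theta[0]
    for w, (l, u) in zip(theta[1:], bounds):
        lo += min(w * l, w * u)
        hi += max(w * l, w * u)
    return lo, hi

def loss_upper_af(theta, bounds, y):
    """L_u^{af}: upper bound of the squared error over the whole box R(x)."""
    l, u = relu_af(*affine_af(theta, bounds))
    # (r - y)^2 is convex in r, so its maximum on [l, u] is at an endpoint.
    return max((l - y) ** 2, (u - y) ** 2)

def abstract_objective(theta, data, eps):
    total = 0.0
    for x, y in data:
        bounds = [(xi - eps, xi + eps) for xi in x]   # R(x) as intervals
        total += loss_upper_af(theta, bounds, y)
    return total / len(data)

# Hypothetical parameters and toy dataset.
theta = [0.25, 0.5, 0.5]
data = [((0.0, 0.0), 0), ((1.0, 0.0), 1), ((0.0, 1.0), 1), ((1.0, 1.0), 1)]
print(abstract_objective(theta, data, eps=0.0))
print(abstract_objective(theta, data, eps=0.1))
```

The cost of evaluating the bound grows linearly with the number of input dimensions, since each dimension contributes one interval.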

Example: $f : R \to R$

$f^{a}$ maps a 1D zonotope with $m$ generators $⟨ c_{0}, c_{1}, \dots, c_{m} ⟩$ to a 1D zonotope with $m$ generators $⟨ c_{0}^{'}, c_{1}^{'}, \dots, c_{m}^{'} ⟩$ , so the flattened transformer is

$f^{af} : R^{m + 1} \to R^{m + 1}$

Flattening does not work for the polyhedra domain, because its abstract transformers for activation functions invoke a black-box linear programming solver, which is not differentiable.

Neural networks trained with abstract interpretation are:

- more robust to perturbation attacks
- verifiably robust using abstract interpretation

A neural network could satisfy a correctness property, but we may not be able to verify it using an abstract domain. By incorporating abstract interpretation right into the training, we guide SGD towards neural nets that are more amenable to verification.

Note: $l_{p}$ robustness properties are closely related to the notion of Lipschitz continuity. For instance, a network $f : R^{n} \to R^{m}$ is K-Lipschitz under the $l_{2}$-norm if, for all inputs $x$ and $y$,

$$∥ f (x) - f (y) ∥_{2} \leq K ∥ x - y ∥_{2}$$

The smallest K satisfying the above is called the Lipschitz constant of $f$ . If we can bound K, then we can prove $l_{2}$-robustness of $f$ .
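As a rough sketch of why a bound on $K$ yields robustness, assume (beyond what is stated above) that $\text{class} (r)$ is the index of the largest output, and write $c$ for the centre point, $y$ for its class and $j$ for any other class:

```latex
\begin{aligned}
\|x - c\|_2 \le \epsilon
  &\;\Rightarrow\; \|f(x) - f(c)\|_2 \le K\epsilon \\
f(x)_y - f(x)_j
  &= \big(f(c)_y - f(c)_j\big) + \big(d_y - d_j\big),
  \qquad d = f(x) - f(c) \\
|d_y - d_j|
  &\le \sqrt{2}\,\|d\|_2 \le \sqrt{2}\,K\epsilon
  \qquad \text{(Cauchy-Schwarz)} \\
f(c)_y - f(c)_j > \sqrt{2}\,K\epsilon \;\; \forall j \ne y
  &\;\Rightarrow\; f(x)_y > f(x)_j \;\; \forall j \ne y
\end{aligned}
```

So if the margin at $c$ between class $y$ and every other class exceeds $\sqrt{2} K ϵ$, the prediction cannot change anywhere inside the $l_{2}$ ball of radius $ϵ$ around $c$.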
