вторник, 13 апреля 2010 г.

Vulnerability models in web applications

I've seen much confusion in understanding and using such terms as "taint propagation", "static analysis for vulnerabilities", "dynamic analysis". So I decided to clarify some important thing here (particularly with respect to web applications).
First of all, we should distinguish a model of program behavior and a model of vulnerability.
Every model of program behavior introduces its own terms for behavior specification. Depending on the level of abstraction, those terms could specify low-level (i.e. how program uses CPU registers) or high-level (i.e. dataflow dependencies) behavior.
The trivial models of program behavior are specifications in terms of source and byte- or binary code. Here are some other popular models: CFG (Control flow graph), various dependence graphs, program slices and pre- and postconditions, a set of possible values for each variable at every program point (hello, so-called string analysis).
What is most important is that these models have nothing to do with static or dynamic analysis. Static and dynamic analysis are the means to construct the model of program behavior. Of course, static analysis tends to yield complete but imprecise models while dynamic analysis yields precise but incomplete models. You can find concepts of both static and dynamic slices; there are concepts of pre and post conditions in static and runtime.

Now let us move to vulnerability models. Every vulnerability model is tightly bound to the model of program behavior. Indeed, the existence of vulnerability should be specified in terms of program behavior!
There is widely accepted model of vulnerabilities at the time - the non-interference model and its extensions. In simple words it tells us that untrusted input should not interfere (e.g. change the intended behavior) with critical operations. Okey, we have this requirement, so how do we check it in our program?

The simplest way is to follow the very definition of non-interference: let us mark untrusted input as taint source and critical operations as taint sinks. Now if a taint sink depends on a taint source, and input data is not sanitized (i.e. made trusted) we shall raise a vulnerability warning. We can see that this re-formulation of non-interference is perfectly expressed in terms of dependencies. Hence, automated tools that utilize this definition should be able to make program slices and/or build dependence graphs (does not matter in runtime or in static).

A more subtle (and in my opinion beautiful) way to express interference formally was proposed in early 2000s. Let us survey what SQL (and any other) injection is. It's when a user can mix control and data channels of a query, or in other terms, change its syntactic structure. So, the definition of non-interference could be re-formulated as follows: we should raise a vulnerability warning if syntactic structure of a critical query depends on the user input. We can see that this re-formulation of non-interference is perfectly expressed in terms of parse trees of string values, which program variables may hold at every execution point. Hence, automated tools that utilize this definition should be able to perform string analysis in static (at runtime there is no difficulty to obtain the values of the needed variable).

And finally, we can see now that taint propagation is not a method to detect vulnerabilities. Taint propagation is one of the means to determine data dependencies at runtime.

Some references:
Basic dependency model (aka taint propagation)
[2006] Pixy - Technical Report
[2006] Static Detection of Security Vulnerabilities in Scripting Languages
[2009] TAJ Effective Taint Analysis of Web Applications

[2004] Analysis of Perl Taint Mode
[2005] Automatically hardening web applications using precise tainting
[2005] Dynamic Taint Propagation for Java

Parse tree validation model:
[2005] Static Approximation of Dynamically Generated Web Pages
[2007] Sound and precise analysis of web applications for injection vulnerabilities

[2005] Combining Static Analysis and Runtime Monitoring to Counter SQL-Injection Attacks
[2005] Using Parse Tree Validation to Prevent SQL Injection Attacks
[2006] The Essence of Command Injection Attacks in Web Applications
[2006] Using Positive Tainting and SyntaxAware Evaluation to Counter SQL Injection Attacks

In the next post we will discuss some limitations of these models. Stay tuned!

Комментариев нет:

Отправить комментарий