As its very name suggests, a black-box scanner does not know how each input HTTP parameter is handled by the web application. From an attacker's perspective, for each input parameter we would like to know:
- the list of sensitive operations that are called with arguments derived from the input value (is the parameter "name" passed to echo, to mysql_query, or to both? a minimal illustration follows this list);
- the syntactic structure of the sensitive queries;
- the URLs whose HTTP responses contain the results of processing the input parameter (these are the pages we will check for evidence of successful exploitation).
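To make this concrete, here is a minimal, deliberately vulnerable sketch in Python (a hypothetical Flask handler backed by sqlite3; the route, database file and table are invented for illustration). The "name" parameter flows into both a SQL query and the HTML response, so both SQL injection and XSS vectors are relevant to it, and its processing results show up at this same URL:

```python
# Hypothetical, deliberately vulnerable handler: "name" reaches two sinks.
import sqlite3
from flask import Flask, request

app = Flask(__name__)
DB = "app.db"                                       # made-up application database

def init_db():
    with sqlite3.connect(DB) as conn:
        conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER, name TEXT)")

@app.route("/greet")
def greet():
    name = request.args.get("name", "")             # untrusted input parameter
    with sqlite3.connect(DB) as conn:
        # Sink 1: the parameter is concatenated into a SQL query (SQL injection).
        row = conn.execute(
            "SELECT id FROM users WHERE name = '" + name + "'").fetchone()
    # Sink 2: the parameter is reflected, unescaped, into the HTTP response (XSS).
    return "<p>Hello, " + name + "! (id = " + str(row) + ")</p>"

if __name__ == "__main__":
    init_db()
    app.run()
```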
In the general case it is impossible to derive this kind of information precisely with black-box analysis. That is why, in order to produce good results, black-box scanners have to:
- inject attacks of every type (XSS, SQL injection, command injection, XPath injection, RFI, and many more) into each input parameter, one after another;
- within a given attack type (e.g. SQL injection), guess the control characters (for SQL: quotes, the number of closing brackets, etc.) that must be prepended to the injected attack vector so that the resulting query stays syntactically correct (assuming the web application handles exceptions properly);
- re-crawl every web application interface after each injection in order to detect multi-module vulnerabilities (the ones that make second-order attacks possible); indeed, in the general case we do not know which interface will contain the results of processing the poisoned parameter. A rough sketch of this whole loop follows.
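Put together, undirected fuzzing is roughly the set of nested loops below. The payload lists and helper functions (crawl_all, inject, looks_exploited) are hypothetical; the point is the combinatorial cost: every parameter times every attack class times every guessed prefix, with a full re-crawl of the application after each injection.

```python
# Rough sketch of undirected fuzzing; all helpers and payloads are hypothetical.
import itertools

PAYLOADS = {
    "xss":  ["<script>alert(1)</script>"],
    "sqli": ["' OR '1'='1", "\" OR \"1\"=\"1", "') OR ('1'='1"],   # blindly guessed quoting
    "cmdi": ["; id", "| id"],
}

def undirected_fuzz(params, crawl_all, inject, looks_exploited):
    findings = []
    for param, (attack, vectors) in itertools.product(params, PAYLOADS.items()):
        for vector in vectors:
            inject(param, vector)                   # one blind injection
            for url, body in crawl_all():           # re-crawl everything per injection
                if looks_exploited(attack, vector, body):
                    findings.append((param, attack, url))
    return findings
```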
Let us call this way of testing "undirected fuzzing". If we could only obtain the data listed at the beginning of this post, we could perform "directed fuzzing" instead. We would inject into each input parameter only those attack vectors that are relevant to the processing path of its data. Moreover, we would inject only attack vectors that leave the queries generated by the web application syntactically correct, so we could concentrate on filter-bypassing techniques. Finally, after each injection we would know exactly which URLs to check for evidence of successful exploitation.
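A sketch of what directed fuzzing could look like, assuming we already have a per-parameter profile with its sinks, a syntax prefix that keeps the generated query well-formed, and the URLs that reflect the parameter (the profile layout and all helper names are hypothetical):

```python
# Sketch of directed fuzzing driven by per-parameter profiles; everything here
# (profile keys, helpers, payloads) is hypothetical.
def directed_fuzz(profiles, inject, fetch, looks_exploited):
    findings = []
    for profile in profiles:                        # one profile per input parameter
        for attack in profile["sinks"]:             # only attack classes its sinks justify
            prefix = profile["syntax_prefix"].get(attack, "")   # keeps the query well-formed
            for vector in filter_bypass_vectors(attack):
                inject(profile["param"], prefix + vector)
                for url in profile["reflect_urls"]:             # only pages that show results
                    if looks_exploited(attack, vector, fetch(url)):
                        findings.append((profile["param"], attack, url))
    return findings

def filter_bypass_vectors(attack):
    # Placeholder: the prefix already supplies the quote/bracket break-out, so
    # effort can go into bypassing filters rather than guessing the syntax.
    return {"sqli": [" UNION SELECT 1 -- -"],
            "xss":  ["<svg onload=alert(1)>"]}.get(attack, [])
```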
All of this useful information can be gathered by means of dynamic analysis. Suppose we have two components: a scanner and a server-side module (implemented either as part of an interpreter (Python, Perl, PHP, Ruby) or as web application instrumentation/hooks (.NET, Java)). Here is one possible workflow:
1. The scanner component performs crawling with form submission and authentication. At the same time the server-side module builds a trace of web application execution for each HTTP request. A minimal sketch of such a trace hook follows.
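The trace does not have to be elaborate: it is enough to record which sensitive operation was called and with what query. Below is a tiny tracing proxy around sqlite3 in Python standing in for real interpreter hooks or runtime instrumentation; all names are hypothetical.

```python
# Minimal stand-in for the server-side module: record every call to a sensitive
# operation into the execution trace of the current request.
import sqlite3

class TracingConnection:
    """Proxies a sqlite3 connection and logs every executed query."""

    def __init__(self, path, trace):
        self._conn = sqlite3.connect(path)
        self._trace = trace                         # one trace list per HTTP request

    def execute(self, sql, params=()):
        self._trace.append({"sink": "sql", "query": sql, "params": list(params)})
        return self._conn.execute(sql, params)

    def close(self):
        self._conn.close()

# Usage sketch: the trace is later shipped to the scanner together with the request.
trace = []
conn = TracingConnection(":memory:", trace)
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("SELECT id FROM users WHERE name = ?", ("alice",))
conn.close()
print(trace)
```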
2. Once crawling completes, the execution traces are sent to the scanner for further analysis. At this stage the scanner builds dependence graphs and analyses them. The goal is to infer:
- the list of sensitive operations that are called with arguments derived from the input value;
- the syntactic structure of the sensitive queries;
- the URLs whose HTTP responses show the results of processing the input parameter.
This information is used to create a plan for directed fuzzing (a simplified sketch follows).
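Dependence graphs are the proper tool for this inference; as a much simpler stand-in, the sketch below matches a unique marker value that the scanner submitted for each parameter during crawling against the recorded sink calls and response bodies. The trace and result structures are hypothetical.

```python
# Simplified stand-in for step 2: derive a directed-fuzzing plan by matching the
# unique marker the scanner sent for each parameter. All structures are hypothetical.
def build_fuzzing_plan(crawl_results):
    """crawl_results: dicts with 'param', 'marker', 'trace' and 'responses' (url -> body)."""
    plan = []
    for result in crawl_results:
        marker = result["marker"]                   # unique value injected while crawling
        hits = [entry for entry in result["trace"] if marker in entry["query"]]
        plan.append({
            "param": result["param"],
            "sinks": sorted({entry["sink"] for entry in hits}),    # sensitive operations reached
            "queries": [entry["query"] for entry in hits],         # their syntactic structure
            "reflect_urls": [url for url, body in result["responses"].items()
                             if marker in body],                   # pages showing the results
        })
        # A real implementation would also derive, from the recorded queries, the
        # prefix needed to keep each query well-formed during directed fuzzing.
    return plan
```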
3. The scanner performs directed fuzzing while the server-side module keeps collecting execution traces. Once fuzzing completes, the collected traces are sent to the scanner for the final analysis. The scanner uses several checks (sketched after this list) to detect whether each attack was successful:
- response analysis (good for XSS detection);
- parse tree analysis (checking whether the injection changed the syntactic structure of the generated query);
- control flow analysis (detecting exceptional conditions, if any).
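Hypothetical sketches of the three checks; the parse tree analysis is approximated here by a crude comparison of the token "shape" of the baseline and attacked queries taken from the traces.

```python
# Hypothetical success checks for step 3; the parse tree check is only a crude
# token-level approximation of real parse tree analysis.
import re

def xss_succeeded(payload, response_body):
    # Response analysis: the payload comes back unencoded in the HTTP response.
    return payload in response_body

def sql_structure_changed(baseline_query, attacked_query):
    # Parse tree analysis (approximated): reduce each query to its token shape
    # (string literals collapsed to a placeholder); a different shape means the
    # injected data altered the syntax of the generated query.
    def shape(query):
        return ["str" if t.startswith("'") else t
                for t in re.findall(r"'[^']*'|\w+|\S", query)]
    return shape(attacked_query) != shape(baseline_query)

def control_flow_anomaly(baseline_trace, attacked_trace):
    # Control flow analysis: the attacked request hit a different sequence of
    # sensitive operations (e.g. an exception cut the trace short).
    return [e["sink"] for e in attacked_trace] != [e["sink"] for e in baseline_trace]
```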
This is just an illustration of how one could leverage dynamic analysis to substantially improve black-box scanning results (and scanning time!).
I hope you have found this blog post useful and I'm always interested in hearing any feedback you have.