SQL
  Injection attacks target the core of a web application: its database. 
 Their most significant impact enables an attacker to retrieve, modify, 
 or delete arbitrary data. It is a serious threat to any application 
with  a database back-end and a threat that should be fully understood 
in  order to develop adequate countermeasures. 
Every
 web server  administrator must acknowledge techniques that can be used 
to identify  an SQL Injection vulnerability (see Tobias Glemser's 
Article SQL  Injection Attacks with PHP and MySQL, hakin9 03/2005) and 
assess the  scope of its risk. The basic methodology for an SQL 
Injection attack is  to identify a potential vector, then exploit that 
vector with customized  SQL queries - all through the web browser. 
Identification
 of the  potential for a vulnerability is important, but even more 
important is  the ability to evaluate its impact. In some cases, a SQL 
Injection  vector may offer nothing more than the capability to generate
 some  syntax errors, such as trying to convert strings to numeric 
values. In  other cases, the vector may enable the attacker to fully 
compromise a  database's information. Although the examples refer to 
MySQL databases,  the techniques apply to any database platform and, in 
most cases, can be  applied without modification. The core of these 
techniques targets the  SQL language. Certain database extensions merely
 make these techniques  easier to accomplish. 
To refresh the memory
SQL Injection tests can be classified into three categories based on which aspect of the query is targeted: 
-   attack
  the syntax of the query -  insert common SQL characters with the  
intention of generating errors to  identify potential attack vectors,  
 
-   attack  the syntax of the language - target the  SQL language 
itself in order to  generate database errors or perform  simple queries 
by manipulating  language constructs and semantic  identities,  
 
-   attack  the logic of the query - rewrite the  query to retrieve
 arbitrary data  from tables to which developers did  not intend access.
  
 
These techniques can be combined to 
assess a web  application and determine its vulnerability to SQL 
Injection attacks. In  the next sections the SQL Injection payloads are 
presented without the  entire URL as an example. This makes it easier to
 understand the  techniques without cumbersome parameters and text. 
This
 is also  because the injection of these payloads is quite simple. Given
 a URL of  the form http://site/page.cgi?a=foo&b=bar, a SQL 
Injection  attack replaces the vulnerable parameter's value with its 
payload:  http://site/page.cgi?a=&b=bar. As a further reminder, one 
has to  remember to encode spaces and other characters in the payload so
 that  they do not disrupt the syntax of the URL. 
Attack the syntax of the query
The
  single quote, while arguably the most popular character for 
identifying  SQL Injection vectors, is by no means the only character 
necessary to  generate a database error. This technique encompasses most
 fundamental  tests for potential vulnerabilities by using SQL language 
metacharacters  or formatting characters to disrupt the syntax of the 
original query.  For example, the following statements cannot be parsed 
into valid  queries because they have an ill-formed syntax due to an 
unterminated  single quote: 
-   SELECT foo FROM bar WHERE a = ''';,  
 
-   SELECT foo FROM bar WHERE a = '/*;,  
 
-   SELECT foo FROM bar WHERE a = ';--;,  
 
-   SELECT foo FROM bar WHERE a = '#;.  
 
While the most common example is the 
single quote  character (ASCII 0x27), many characters can be used to 
disrupt the  syntax including: 
-   unmatched parenthesis,  
 
-   semi-colon,  
 
-   comment delimiter - /*, #, or --.  
 
Validation filters that only prohibit
 single quote  characters (or some small set of characters) might 
prevent full  exploitation of a vulnerability, but such filters are 
often inadequate.  They may simply obscure more fundamental problems 
with the application's  database connection architecture. 
Quotes vs. slashes
PHP
  developers face several challenges and potentially confusing  
recommendations when creating strong input validation filters. PHP's  
magic_quotes() function automatically escapes all single quotes with a  
backslash character; however, if this feature is combined with a call to
  the strip_slashes() function, then the escape characters have been  
removed: 
-   SELECT foo FROM bar WHERE a = '\''; - single quote escaped,  
 
-   SELECT foo FROM bar WHERE a = '''; - backslash stripped, query ill-formed.  
 
The other danger of focusing on the 
single quote  character is that developers may not be aware of the full 
range of  characters and techniques available to an attacker for 
exploiting a SQL  query. The attacker can combine SQL functions to 
generate errors in the  syntax of a query. 
You can also use inherent SQL functions to generate errors. The SQL CHAR()
  function prints the ASCII equivalent of the argument. An attacker may 
 be able to inject quote characters by using odd or even amounts of CHAR(0x27)
  strings (hexadecimal 0x27 represents the ASCII code for the single  
quote). This is important, because the attack consists of alphanumeric  
characters plus the parentheses. Consequently, monitoring input for quote characters will not catch or block the attack. 
Variables may vary
Database-related
  errors can also be generated by attacking variable types. This is most
  effective against numeric values, but is also successful against date 
or  time variables. For example, here is a list of different values that
  you may try against parameters that expect decimal numbers: 
-   8-, 16-, 32- and 64-bit values - 256, 65536, etc.,  
 
-   integer overflows - 2^8 + 1, 2^16 +1, 2^32 + 1, or 2^64 + 1,  
 
-   unsigned vs. signed values - inserting negative values,  
 
-   floating-point overflows - for example 3.40282346638528860e+38, 1.79769313486231570e+308,  
 
-   alternate presentation - binary, octal, hexadecimal, or scientific notation.  
 
These numeric attacks often succeed 
in generating  errors because the variables used to track these values 
are not strongly  typed. In PHP the parameter type of all $_REQUEST variables is a string. This means that, although you can perform arithmetic operations on variables ($a = 1; $a++),
  the actual type of the variable may be considered a numeric string. 
The  variable may even be silently promoted from a number to a numeric  
string when the value would normally result in an overflow, inf (infinity), or NaN (not a number) equivalent. For example, PHP's is_numeric("1e308") function returns true (it is a number), but is_numeric("1e309")
  returns false - neither a number or numeric string because it is 
beyond  the double float type that PHP supports. A variable must be set 
to  numeric explicitly using the settype() function, but beware that large values may return a value of inf - which can also lead to errors in the query if it is expecting numerals. 
Fighting the synonyms
Robust
  input validation filters can be an effective countermeasures to these 
 techniques, but they are not sufficient. Database errors and other  
exceptions should be trapped and prevented from being sent to the  
browser. Verbose error information tends to provide useful information  
for malicious users targeting a database. As we will see a bit later,  
input validation filters may be inadequate. For example, we have already
  seen that the value 1e309 is not a number (for most languages and SQL 
 databases) and will generate an error in less secure applications. Yet 
 1e309 does not contain any characters that are normally malicious. It 
is  a purely alphanumeric value. 
Note that SQL is a rich language that provides an attacker to create many synonomous permutations. For example, CHAR(0x27) is equivalent to ASCII(0x27) which can also be written as x'27. We focus on using the CHAR(0x27)
  string to avoid raw quotes in the payload, but the specifics of each  
test are highly mutable. This also implies that syntax-based filtering -
  such as application-layer firewalls - must be very robust in order to 
 prevent these attacks. In fact, the combination of alternate encoding  
schemes (URL encoding, Unicode) and creative SQL will bypass most  
pattern-matching filters. Remember, CHAR(0x27) is the same as cH%41r(0x68-0x41). 
Semantic doppelgangers - attack the syntax of the language
In SQL, Shakespeare's observation of roses might look like the decidedly unpoetic: 
SELECT name FROM roses 
WHERE scent='sweet'; 
Whether
  a rose might be called shoe, bumblebee, or clock, its sweet-smelling  
attribute remains unchanged. SQL provides a rich set of functions that  
can be used to create semanticly equivalent queries that look quite  
different textually. This capability enables an attacker to identify and
  exploit injection vulnerabilities even when the server does not reveal
  error information or similar output. 
While
 it is useful to break  queries in order to find potential 
vulnerabilities, it is also  profitable to attack the query using the 
semantics of built-in SQL  functions. Thus, instead of attacking the 
parser of the application  language (PHP, JSP, etc.), the attack focuses
 on the SQL language  itself. This has the added benefit of not only 
identifying attack  vectors, but also provides more information about 
the input validation  filters used by the application. Another byproduct
 of this technique is  the ability to perform blind SQL Injection attacks, or attacks that do not rely on error generation in order to identify or exploit. 
Numeric data types
Numeric
  data types are the easiest candidates to test with this technique.  
Figure 1 shows the original example URL, while Figures 2 and 3 present  
modified addresses We are using an older, insecure version of FreznoShop
  online shopping system - releases newer than 1.4 branch are quite  
invulnerable. 
Consider the following list of name/value pairs: 
-   rowid = 111,  
 
-   rowid = 0x6f,  
 
-   rowid = 0157 (octal representation),  
 
-   rowid = 110+1 (use 110%2b1 in practice because the + stands for a space character in the URL),  
 
-   rowid = 112-1,  
 
-   rowid = MOD(111,112),  
 
-   rowid = REPEAT(1,3),  
 
-   rowid = COALESCE(NULL,NULL,111)
 
From
  a database's point of view, each one of these requests results in the 
 same value: 111. Also notice that none of these rely on the single 
quote  character. The first three look like numeric or alphanumeric 
strings,  the next two have apparently innocuous characters for the 
addition and  subtraction symbols, and the final three include 
parentheses and a  comma. If input validation were to focus on stripping
 the single quote,  then a vulnerable application would gain no benefit 
from such a  countermeasure. 
Raw parameters
This
 technique, which uses  semantic doppelgangers, enables the user to 
identify SQL Injection  vectors. If the result of each request is 
identical, then it can be  assumed that the application engine has 
parsed the raw parameter value  and inserted it into the underlying SQL 
query. For example, consider  this query for a rowid: 
SELECT foo FROM table 
WHERE rowid = 110+1; 
The
  database calculates 110+1 = 111 before resolving the rest of the 
query,  according to its order of operations. This bears the same result
 as the  original query: 
SELECT foo FROM table 
WHERE rowid = 111; 
Before
  we explain how to extend this attack to extract arbitrary data, let us
  first examine some other cases that can be used for error generation. 
 Even though this technique does not require us to generate database  
errors, such information is useful to determine versions and names of  
tables or columns. If the application's input validation filters have  
stripped quote characters, but not trapped database errors, then we can 
 target incorrect SQL function syntax. For example: 
-   BIN(-1),  
 
-   LIMIT a (this is useful because it does not require parentheses),  
 
-   MOD(0,a).  
 
Of course, numeric values should also be tested for boundary conditions as mentioned in the previous section. 
Premature termination characters
This
  technique lends itself to the creation of custom SQL queries. Such  
queries often do not require quote characters, but often require  
premature termination characters. Thus, a request might employ /* or -- in order to truncate additional, undesired statements. A string SELECT foo FROM table WHERE rowid = MOD(111,112)+UNION+SELECT+USER()/*; is a good example. 
String
  values present a greater challenge because there are fewer functions 
in  the SQL language that provide helpful semantic doppelgangers. The CONCAT() function is useful for these cases. In cases where the string argument only contains the letters a-f , the HEX() function can be used: 
-   op=add,  
 
-   op=HEX(2781),  
 
-   op=REVERSE(dda),  
 
-   LEAST(0x6d75736963,0x6e75736963),  
 
-   GREATEST(0x61,0x6d75736963).  
 
Once again, we have consciously 
chosen to avoid using  quote characters because they set off alarms or 
may be blocked. Yet this  doesn't prevent us from creating complex 
strings. The REVERSE(), LEAST(), and GREATEST() functions only need parentheses and commas. The following examples are all semantically identical: 
-   page.cgi?category=music,  
 
-   page.cgi?category=REVERSE(cisum),  
 
-   page.cgi?category=GREATEST(0x61,0x6d75736963),  
 
-   page.cgi?category=LEAST(0x6d75736963,0x6e75736963).
 
Countermeasures
The
 best  countermeasures for these attacks use input validation filters 
and  strong data types when assigning user-supplied values to query  
parameters. Even though 0x27 is a valid hexadecimal value, it should be 
 prohibited by the application because the raw value contains a  
non-numeric character (or possibly silently coaxed into 27 decimal).  
Likewise, octal 0157 should either be denied because of the leading  
zero, or the leading zero could be stripped so the value becomes 157  
decimal, which is merely a different row number. At the very least,  
developers should be aware of alternate bases and understand where they 
 are interpreted: either in the application language or in the database.
 
It's  very easy to handle all 
user-supplied data as strings, but if the data  are to be inserted into a
 query, then they should be explicitly assigned  (cast) to the 
appropriate data type. For interpreted languages such as  PHP, Perl, C#,
 or Visual Basic the assignment should be safe or generate  a conversion
 error. If the web application uses a compiled language  such as C or 
C++, then the type casting should be handled carefully and  checked for 
exceptions (think of format-string attacks). 
Attack the logic of the query
Breaking
  the syntax of a query is useful for identifying SQL Injection  
vulnerabilities, but it only demonstrates the existence of a problem.  
Arbitrary data access is the true risk associated with SQL Injection  
attacks. 
MySQL supports a 
specific comment macro that triggers on  the database version /*! */, 
where  is a 5-digit value that represents  the MySQL build. For example,
 version 3.23.02 looks like 32302, version  4.1.10 looks like 40110, and
 version 5.0.3 looks like 50003. The most  immediate way to test for 
embedded SQL attacks with MySQL is to combine  the comment extension 
with a statement that ensures the query will fail:  
-   /*!32302+AND+0+*/,  
 
-   /*!32302+AND+0+*//* (it may be necessary to terminate the query).  
 
Then, one can flip the query and 
ensure that it  succeeds in order to verify the injection vector - 
/*!32302+AND+1+*//*  (it may be necessary to terminate the query). 
UNION SELECT
Once
  a parameter has been identified as a vector for SQL Injection attacks,
  the next step is to determine the amount to which the database is  
exposed. This is accomplished by manipulating the logic of the original 
 query. Most basic queries are of the form SELECT foo FROM bar WHERE a=b; in which the b of a=bSELECT. The quickest technique is to use the UNION keyword.  clause is the parameter that can be manipulated. Consequently, the new query must consider the previous 
The UNION statement combines multiple SELECT statements and is supported by most databases. The basic form looks like SELECT foo FROM bar WHERE a=b UNION SELECT foo2 FROM bar2 WHERE c=d;. 
One useful UNION clause is to display the user name under which the database connection has been established. On MySQL you would do this with SELECT USER(). Inside a UNION clause the request might look like 
SELECT text FROM articles 
WHERE id=0 
UNION SELECT USER(); 
Several challenges present themselves when using UNION statements for SQL Injection attacks: 
-   the UNION clause should terminate the query to ensure valid syntax - any additional logic must be truncated,  
 
-   UNION statements require matching column counts in each SELECT clause.  
 
The first challenge is relatively 
easy to accomplish.  Simply use one of the common terminators described 
in the previous  section. This can be a comment delimiter (#, /*, --) in combination - if necessary - with a semicolon or single quote. 
Columns and bears
The
  second challenge is not difficult to overcome, but requires a few  
iterative steps remniscent of Goldilocks and the three bears. The  
injected UNION
  clause will either have too few columns or too many of them - what you
  need is a number that is just right! If you can observe the database's
  error messages, then you'll see something like The used SELECT statements have a different number of columns. 
Column undercounts can be fixed by adding extra columns or column place-holders to the SELECT statement (see Figure 4). For example, consider the following statements: 
-   SELECT user FROM mysql.user,  
 
-   SELECT 1,user FROM mysql.user,  
 
-   SELECT 1,1,user FROM mysql.user,  
 
-   SELECT user,user,user,user FROM mysql.user.  
 
Figure 4. A successful UNION SELECT attack  
Each one of these queries is designed to grab the user name (or names) from the default mysql.user
  table. The number of columns increases from one to four in each  
example. In practice, it is better to repeat the column name to ensure  
that the value is displayed in the application. The first placeholder  
works, but it's hard to tell which column the web application will  
display. 
Column overcounts can 
be addressed by using the CONCAT  statement. Overcounts occur when the 
first SELECT statement expects  fewer columns than your custom query. 
The CONCAT statement resolves this  by concatenating each column into a 
single string. Thus, multiple  columns are reduced to a single column. 
For example: 
SELECT foo FROM table 
WHERE a=b 
UNION SELECT CONCAT(*) 
FROM mysql.user; 
This can be combined with the undercount technique when necessary: 
SELECT foo,bar FROM table 
WHERE a=b 
UNION SELECT 1,CONCAT(*) 
FROM mysql.user; 
The major caveat is that any NULL value in one of the column results will cast the CONCAT string to NULL. 
Aim at rows
Once
  you have matched column counts for the query, the next step is often 
to  specify an arbitrary row to retrieve from a table. When the query  
returns multiple rows, often only the first one is displayed. To some  
degree, a good WHERE
  clause can help target specific rows, but only if the table's general 
 structure (column names) is known before-hand. A much easier method 
uses  offsets within the LIMIT clause. You can limit the result to one row by using LIMIT 1, but you can control which row is returned by adding the optional offset beginning with 0. For example: 
-   SELECT foo FROM table WHERE a=b UNION (SELECT CONCAT(*) FROM mysql.user LIMIT 0,1);,  
 
-   SELECT foo FROM table WHERE a=b UNION (SELECT CONCAT(*) FROM mysql.user LIMIT 1,1);,  
 
-   SELECT foo FROM table WHERE a=b UNION (SELECT CONCAT(*) FROM mysql.user LIMIT 2,1);.  
 
You can progress through the offsets until the query returns a NULL row. Unlike the previous examples of simple queries, it is necessary to place parentheses around the clause that contains the LIMIT statement. Otherwise it will be incorrectly applied to the entire query. 
Defence by statements
The
  use of prepared statements (also known as parameterized queries) or  
stored procedures are effective countermeasures to these techniques  
because they separate the logic of the query from the data of the query.
  Consequently, injection attacks can corrupt the original SQL query, 
but  will not be able to rewrite it in such a manner that arbitrary 
tables  or data can be accessed. 
A
 potential drawback of prepared  statements is that they require 
additional set-up within the  application. This could lead to a 
performance degradation; however, such  an impact may be minimal. The 
security gains are definitely good. 
Help yourself and separate
Inadequate
  input validation filters are an integral part of SQL Injection  
countermeasures, but they are often not the underlying problem. Strong  
data typing (assigning numbers to numeric data types, etc.) is also key,
  but string data always presents a challenge (see Frame Additional SQL tricks). 
Additional SQL tricks
Our
  core idea is to identify a SQL Injection vulnerability via creative 
use  of SQL formatting characters (syntax) or SQL functions (semantics),
  then exploit the vulnerability by attacking the SQL logic. Although it
  primarily focuses on numeric and string manipulation, other functions 
 can be used (or rather misused) to generate errors for vulnerability  
identification: 
-   INET_ATON(),  
 
-   INET_NTOA(),  
 
-   SOUNDEX().  
 
Enumeration is another important part
 of SQL Injection;  one that is beyond our scope here. Nevertheless, 
here are some simple  queries that can be used to further determine 
information about a  database: 
-   SHOW VARIABLES,  
 
-   SHOW STATUS,  
 
-   SHOW DATABASES,  
 
-   SHOW TABLES,  
 
-   DESCRIBE
 
-  
 
 
-    EXPLAIN 
 , 
-    EXPLAIN SELECT  FROM    
 , 
-    SHOW FULL COLUMNS FROM    
 , 
-    SELECT USER(),   
 
-    SELECT SESSION_USER(),   
 
-    SELECT CURRENT_USER(),   
 
-    SELECT SYSTEM_USER(),   
 
-    SELECT SUBSTRING_INDEX(USER(),'@',1),   
 
-    SHOW CHARACTER SET,   
 
-    SELECT CURDATE(),   
 
-    SELECT CURTIME().A more fundamental problem of SQL Injection is the lack of separation between the query's logic and data. The logic is defined by the developer and is expected to remain static. The data are collected from the user. When the data and logic intermingle, such as using string concatenation to build queries, then user-supplied data can manipulate the logic of the query. This is the higher risk compared to input validation, because a modified query provides access to arbitrary data in the database. A formatting character maliciously inserted into a stored procedure may merely produce a database error instead of exposing the actual data. This is not meant to imply that input validation is not important; however, any countermeasure to these types of attacks should focus equally on query construction and execution.Without a comprehensive understanding of the different techniques that attackers employ against web applications, developers will not create effective countermeasures. From an assessment perspective, auditors who do not adequately investigate the scope of a SQL Injection vulnerability present an inaccurate view of the application's risk - and if testing only relies on injecting single quote characters, then the assessment may be useless. SQL Injection attacks can be executed with many different characters
 
No comments:
Post a Comment