SQL Grammar - Word Order Matters

SQL Grammar – Word Order Matters

January 30, 2023
Web Application Firewall

Vercara Cyber Threat Intelligence Team

Just like in natural languages, SQL statements have a standard order of words. In English, a sentence uses the order: subject + verb + object. For SQL statements, it depends on the action but directionally looks like¹:

The SQL programming language is almost 50 years old but still one of the most relevant technologies used today. Injection attacks rank third (3rd) on OWASP’s Top 10 categories and occupy the second (2nd) most occurrences of common weakness enumerations (CWE). Anything from XPATH to Hibernate² may be impacted. SQL is one of the more common vulnerabilities within the broader category of injection based attacks.

Injection attacks attempt to embed a malicious SQL statement inside a legitimate payload submitted to the web server. Effects of a successful injection attack may result in the exfiltration, deletion or modification of data. This inevitably leads to monetary losses from reputational damage, regulatory fines, litigation costs, court ordered settlements and additional marketing/PR budget to repair the brand’s image.

Like all other computer languages, SQL implementations (e.g. ANSI SQL 2003) have keywords. The total number reserved words depend on the implementation. For example, SQLite has almost 150 words that are reserved. The keywords are not to be used as the names of tables, indices, columns, etc. Examples: LIKE, SELECT, FROM, TABLE, UNION.

These keywords can be legitimately used within data posted to a web server and this is what drives false positives.

Imagine a webpage displaying types of furniture. The GET query string to populate a category page may be referenced by: “?furniture=table&extension=drop%20leaf”. Ostensibly the values (table, drop leaf) will be inserted in a SQL statement returning all pieces of furniture of the selected style.

The injection attack attempts to insert characters or keywords to semantically change the statement. The malicious input may use a condition that always returns true (and 1=1) and/or in-line comments (double hyphen) to ignore the remaining portion of the text. Not commenting out the remainder of the string input may break the newly created malicious input. There are a lot of basic tutorials with examples of a simple injection attack so no need to recreate here.

The furniture store example is contrived but based on a real-world issue we solved. Many reserved words will be present in legitimate transactions posted to a web server. Simple pattern matching will throw too many false-positives – legitimate requests incorrectly blocked.

Relaxation rules are a way to let the WAF know of a false-positive that can be safely ignored. Relaxation Rules override the default countermeasure. A violation that was initially logged or blocked, can be inspected by the user and allowed within the policy. These rules inspect a URL, Cookie, Form Field, or Header searching for the specific keyword, special character, or pattern that should be allowed in context.

One common Relaxation Rule would be to override the blocking of all occurrences of “like” to only allowing “like” in a specific HTTP Request Header:

User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)
AppleWebKit/537.36 (KHTML, like Gecko) Chrome/108.0.0.0 Safari/537.36

As a convenience, the relaxation rules can be created from the logs dashboard within UltraWAF. Having to manually configure all the relaxation rules may be unreasonable and a grammar would be more useful. The objective of the grammar is to distinguish between legal and illegal sequence of characters – SQL characters/words in this case.

UltraWAF offers a Boolean option to enable “SQL Injection Grammar” allowing transactions to pass even when keywords are present but are not in a valid order. The following sentence would not be blocked even though there are at least two reserved words:

3 drop leaf table in stock

The idea is to analyze the string, recursively, to determine if the keywords are in a syntactically correct statement. A simple string for a grammar to reject would be:

drop table leaf

Words used more frequently, in English, have a higher chance of being a false-positive in a WAF. Some words commonly found in false positives would be “and”, “like”, or, “null”. Instead of naive word-matching and counting, the position of the words within the sentence matters. Just like natural languages, SQL statements also have a grammar.

Graphically these are often represented as a finite state machines. The SQLite drop table command looks like³:

The grammar can be evaluated using regular expressions. The first half of this regex may look like:

(?i)/drop table?:\s*(.*?)\s*(?:if exists?:\s*(.*))?$/

The takeaway is that a grammar can dramatically increase the accuracy of the SQL Injection rules by reducing the false positives (requests that are detected as an attack when they are not) and the false negatives (requests that are not detected as an attack when they are an attack). A calculation for this is the Matthews correlation Coefficient (MCC) which is the ratio between these measurements to find the “sweet spot” in tuning where you minimize false-positives and false-negatives⁴. The more accurate the WAF countermeasures and rules are, the lower the level of effort to tune and the lower Total Cost of Ownership.

1. https://www.sqlite.org/lang_expr.html
2. https://owasp.org/www-project-top-ten/
3. https://sqlite.org/lang_droptable.html
4. https://en.wikipedia.org/wiki/Phi_coefficient

Last Updated: March 27, 2024

Introducing UltraAPI: Bash bots and secure APIs.

SQL Grammar – Word Order Matters

Solutions

Industries

General