Skip to content

SESSION.counter is inaccurate #1273

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
csutherl opened this issue Nov 24, 2016 · 7 comments
Closed

SESSION.counter is inaccurate #1273

csutherl opened this issue Nov 24, 2016 · 7 comments
Assignees

Comments

@csutherl
Copy link

Using the SESSION collection configuration does not yield expected results for the SESSION counter, however the global counter works as expected. This issue affects both prefork and worker MPMs. Additionally, the tests that I am doing are all after applying the fix from PR #1224 because that PR corrects a lot of the concurrency issues that existed prior to the fix. One more piece of important information is that I can only seem to reliably reproduce this behavior on a physical machine; I guess because it's a concurrency issue that slower VMs I'm using just don't hit the race condition.

To reproduce, use the following mod_security configuration, start httpd with mod_security enabled, and make some requests.

# SESSION 
SecRule REQUEST_COOKIES:SESSIONID !^$  phase:5,id:118,nolog,pass,setsid:%{REQUEST_COOKIES.SESSIONID}
SecAction phase:5,id:119,nolog,pass,setvar:SESSION.my_counter=+1

I'm using the following loop to make 500 requests using five different sessions.

for i in `seq 1 500`;do
    curl -b "SESSIONID=testing$(shuf -i 1-5 -n 1)" localhost &>/dev/null &
done

After the requests are done, I'm checking the collection with modsec-sdbm-util to see if the counter values add up to the total number of requests, and they don't (I usually get ~490). Example:

# modsec-sdbm-util /var/lib/mod_security/default_SESSION -du | egrep '(^Key|my_counter)'
Key: "testing4", Value len: 212
                    my_counter: 99
Key: "testing2", Value len: 214
                    my_counter: 103
Key: "testing1", Value len: 214
                    my_counter: 104
Key: "testing5", Value len: 214
                    my_counter: 100
Key: "testing3", Value len: 212
                    my_counter: 90

The total number of all the my_counter variables should be equal to the number of requests.

@csutherl
Copy link
Author

csutherl commented Nov 24, 2016

Tagging @p0pr0ck5 and @bostrt here since we talked a bit about it on IRC.

@zimmerle
Copy link
Contributor

Hi @csutherl,

That is kind the expected behavior. To avoid the utilization of locks, each process (or threads) has it own version of the collection that is synced every once in a while. The idea is to reduce the idle wait caused by the utilization of a lock. With that strategy we got a closer value without have to hold the request for too long.

In ModSecurity version 3 you can choose the back-end to storage the collection values, among of other options you can also make this value straight by the utilization of a lock.

@zimmerle zimmerle self-assigned this Nov 25, 2016
@csutherl
Copy link
Author

Thanks for checking it out @zimmerle :) I just have a few questions in response to that.

Understood...I think. So, we've identified that persistent storage isn't 100% accurate here because there isn't any locking so there's really no way to have an accurate count across multiple processes/threads. PR #1224 is fixing that portion by adding locks and synchronizing access to the database on disk, but now I think there is an error in the retrieval/storage logic for the non-global collections. Did you take a look at PR #1274 as a proof that the logic is flawed? After that PR, I get an accurate count of sessions (total session counts are equal to the number of requests) in all of my test cases (with prefork and worker). Maybe it would help me understand what is going on there if you could you explain how real_col_name and col_name are used. If this is expected behavior then we'll leave it, but given that my hacky patch changes the behavior in a positive manner, I want to make sure all of our avenues are exhausted before giving up.

Additionally, we're working on finding the actual problem instead of that hack so the default configuration still works, so keep in mind that that's only meant to show the problem not be fix it permanently.

And lastly, unfortunately moving to mod_security v3 isn't an option, so I'm not even testing there at the moment.

@zimmerle
Copy link
Contributor

The PR #1224 was not yet merged. The fact that we don't have a lock does not means that we have an issue. If you put a lock there or a global lock, it will reduce significantly the throughput of your server. Reduce the throughput is not the expected behavior for the most of the cases. The fact that the counter are not precise is took into consideration (or should be) while the rules as being write. Generally speaking: the benefit of having precise counters does not worth the throughput reduce.

@csutherl
Copy link
Author

In response to your update: That is some interesting information, is it documented somewhere? I'm happy to use the system as is if we can get the desired result. Can you tell me how to predict counter values when writing the rules given the the counters are not precise? Is there some margin that we can use or something?

In addition to your question: Can you verify the patch proposed is an issue? The session counters are way off compared to the global counters due to the bug that I've fixed. Even without the lock code, deltas are never applied to session counters like they are for global ones so the values are way off.

@bostrt
Copy link

bostrt commented Nov 30, 2016

@zimmerle coty's latest commit in PR #1274 resolves a bug in the data persistence logic. The incorrect collection key name is being passed in and is fixed in PR.

Coty's PR #1274 can be treated completely separate that Mladen's PR.

zimmerle pushed a commit that referenced this issue May 11, 2017
…need a change, but the session counts are 100% accurate.
@zimmerle
Copy link
Contributor

The collection sync had an issues. Originally the collection name is given after: web app id + collection name. Some parts of the code was referring to collection name and others the prefix web app id was used. This issue was fixed by 2de5175

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants