What data should be logged?

Question

Imagine having a web application. You then decide that you want to create your own logging system, for whatever reason. What data should be logged to put a very good logging system in place?

I was thinking about the following:

Date and time of access for every user
User IP
Number of consecutive login attempts
Session length
Data entered in form fields (to see if anybody is trying an SQL injection)

What other data should be logged, especially as far as security is concerned?

Also, do you think the last item on the list makes sense? Of course, only non-sensitive data would be collected, for example search queries.

"Data entered in form fields" That can be problematic. For example you probably don't want to log passwords. — CodesInChaos, May 04 '12 at 20:17
I'd approach this from the other end: what do I want to be able to report on after the fact? A "very good logging system" is extremely subjective, IMHO. For example, the number of consecutive login attempts can be derived if you capture all incoming requests, so it doesn't necessarily need to be logged separately. With that said, is the point to build a logging system that replaces or supplements existing functionality? — bangdang, May 04 '12 at 20:38
@bangdang Actually, I hadn't really thought about this; I had both options in mind. Let's say you want to replace existing functionality, since that keeps the question more general. — user1301428, May 04 '12 at 20:41
Hmm. Capturing data that resolves who (source IP, source port, username, HTTP headers), what (HTTP headers, i.e. the URI, and the action), when (absolute and/or relative time), where (the URI, depending on how the logging is deployed), and how (HTTP method) serves as a good high-level guideline. I left "why" out because that's also very subjective and difficult to log. I'm sure there are plenty of other elements. — bangdang, May 04 '12 at 21:01

score 5 · Accepted Answer · edited Dec 24 '20 at 22:24

To expand upon Rory's recommendations, you really need to ask yourself what is driving your logging and what information you need to accomplish those goals.

For example, if you need user attribution then you probably need

Username
Timestamp
Source IP
GET string and possibly POST variables
Session IDs
Cookie information (expirations, tokens if appropriate, chocolate/sugar/gluten-free, etc)
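As a rough illustration of the attribution fields listed above, here is a minimal Python sketch that assembles one log record per request. The field names, the `SENSITIVE_FIELDS` set, and the `build_attribution_record` helper are all hypothetical and not tied to any particular framework; the masking of POST values reflects the concern raised in the comments about not logging passwords.

```python
import json
from datetime import datetime, timezone

# Hypothetical set of form-field names whose values should never be logged.
SENSITIVE_FIELDS = {"password", "password_confirm", "credit_card"}


def redact_fields(form):
    """Return a copy of the form data with sensitive values masked."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_FIELDS else value
        for name, value in form.items()
    }


def build_attribution_record(username, source_ip, method, path,
                             query_string, form, session_id, timestamp=None):
    """Assemble one log record covering the attribution fields listed above."""
    timestamp = timestamp or datetime.now(timezone.utc)
    return {
        "timestamp": timestamp.isoformat(),
        "username": username,
        "source_ip": source_ip,
        "method": method,
        "path": path,
        "query_string": query_string,
        "post_fields": redact_fields(form),
        "session_id": session_id,
    }


# Example request data (made up for illustration).
record = build_attribution_record(
    username="alice",
    source_ip="203.0.113.7",
    method="POST",
    path="/login",
    query_string="next=/home",
    form={"username": "alice", "password": "hunter2"},
    session_id="example-session-id",
    timestamp=datetime(2012, 5, 4, 20, 17, tzinfo=timezone.utc),
)

# One JSON object per line is easy to search and parse later.
print(json.dumps(record, sort_keys=True))
```

Whether to store the raw session ID or only a hash of it is a judgment call, since anyone who can read the log could otherwise reuse a live session.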

Are you looking for unauthorized access attempts?

Timestamp
Source IP
Action Performed (login, data query, etc)
Information related to the action (username, query string, etc.)
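If events with those fields are captured, the number of consecutive failed logins does not need its own log entry; it can be derived afterwards. The following Python sketch shows one way to do that. The event keys (`timestamp`, `source_ip`, `action`, `username`, `success`) and the `THRESHOLD` value are assumptions made for this example only.

```python
from collections import defaultdict

# Hypothetical alert threshold; tune it to the application.
THRESHOLD = 3


def consecutive_failures(events):
    """Count the current run of failed logins per (source_ip, username).

    `events` must be in chronological order. Each event is a dict with the
    assumed keys: timestamp, source_ip, action, username, success.
    """
    streaks = defaultdict(int)
    for event in events:
        if event["action"] != "login":
            continue  # only login attempts matter for this report
        key = (event["source_ip"], event["username"])
        if event["success"]:
            streaks[key] = 0  # a successful login resets the run
        else:
            streaks[key] += 1
    return dict(streaks)


# Made-up sample events.
events = [
    {"timestamp": "2012-05-04T20:00:00Z", "source_ip": "198.51.100.9",
     "action": "login", "username": "admin", "success": False},
    {"timestamp": "2012-05-04T20:00:05Z", "source_ip": "198.51.100.9",
     "action": "login", "username": "admin", "success": False},
    {"timestamp": "2012-05-04T20:00:07Z", "source_ip": "203.0.113.7",
     "action": "login", "username": "alice", "success": False},
    {"timestamp": "2012-05-04T20:00:09Z", "source_ip": "198.51.100.9",
     "action": "login", "username": "admin", "success": False},
    {"timestamp": "2012-05-04T20:00:12Z", "source_ip": "203.0.113.7",
     "action": "login", "username": "alice", "success": True},
    {"timestamp": "2012-05-04T20:00:15Z", "source_ip": "203.0.113.7",
     "action": "data_query", "username": "alice", "success": True},
]

streaks = consecutive_failures(events)
suspicious = [key for key, count in streaks.items() if count >= THRESHOLD]
print(suspicious)
```

Keying on the pair of source IP and username is only one option; grouping by IP alone would catch a single host trying many usernames.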

Do you have policy/contractual/regulatory/etc. requiring full session reconstruction? Well, that's a lot harder and will require all kinds of scary data on every request. This will likely require deep app integration and possibly need things like stack traces, variable dumps, packet captures, etc.

score 4 · Answer 2 · answered May 04 '12 at 22:06

Really you need to look at this the other way round - what do you need logging for? That should drive your decision on what to log.

Are you checking for suspicious behaviour from an IP or range of IPs?
Are you trying to monitor usage or performance stats?
Do you need to be able to help your users with their session if something goes wrong?
Do you need to work within a regulatory framework which specifies data handling?

etc.

score 2 · Answer 3 · answered May 04 '12 at 21:15

There is certainly value in logging the HTTP headers. Exactly which ones to log varies considerably depending on the specific web application.

Dave

But you need to be careful when logging HTTP headers, as they might contain sensitive data; e.g. the `HTTP_AUTHORIZATION` header might contain base64-encoded passwords if basic authentication is used. – Yoav Aner May 05 '12 at 08:32
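To make that caution concrete, here is a small Python sketch that masks credential-bearing headers before they reach the log, and shows why base64 offers no protection. The `SENSITIVE_HEADERS` set and the `loggable_headers` helper are hypothetical names chosen for this example; which headers to mask depends on the application.

```python
import base64

# Hypothetical deny-list of headers that commonly carry credentials or tokens.
SENSITIVE_HEADERS = {"authorization", "proxy-authorization", "cookie", "set-cookie"}


def loggable_headers(headers):
    """Return a copy of the headers with sensitive values masked."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


# Build a basic-authentication header the way a client would (made-up credentials).
token = base64.b64encode(b"alice:hunter2").decode("ascii")
headers = {
    "Host": "example.com",
    "User-Agent": "ExampleBrowser/1.0",
    "Authorization": "Basic " + token,
}

# Base64 is an encoding, not encryption: anyone reading the log can reverse it.
decoded = base64.b64decode(headers["Authorization"].split(" ", 1)[1]).decode("ascii")

safe = loggable_headers(headers)
print(safe)
```

A deny-list like this only covers the headers someone thought of in advance; an allow-list of headers known to be harmless is the stricter alternative.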
