Logfile format
W3Perl can be used with Apache, IIS, FTP, Squid or RealServer logs.
The package works by parsing log files, so output from any server that
writes a logfile can be read.
Several web logfile formats are supported, including the
standard Common Logfile Format (CLF), ECLF, NECLF, IIS and W3C (IIS server) formats, as well as many variations of these.
FTP, Squid and mail (Exim/Postfix/Sendmail) logfiles are also supported.
Your own logfiles can be parsed too: you describe each line with a format string built from predefined keywords such as %host, %page and %status,
using %null for any field that has no matching keyword.
A few examples are listed here:
| CLF | %host %null %login %date %hourshift %method %page %protocol %status %requetesize | www.lyot.obspm.fr - - [01/Jan/97:23:12:24 +0000] "GET /index.html HTTP/1.0" 200 1220 |
| ECLF | %host %null %login %date %hourshift %method %page %protocol %status %requetesize %referer %agent | www.lyot.obspm.fr - - [01/Jan/97:23:12:24 +0000] "GET /index.html HTTP/1.0" 200 1220 "http://www.w3perl.com/softs/" "Mozilla/4.01 (X11; I; SunOS 5.3 sun4m)" |
| IIS | %host %login %date %hour %null %null %null %null %null %requetesize %status %null %method %page | 129.142.90.150, -, 5/5/97, 14:33:27, W3SVC, RHINO, 194.182.141.6, 2601, 207, 1272, 200, 0, GET, /frabout.htm, -, |
| W3C (*) | %hour %host %method %page %status | 19:05:37 193.149.100.108 GET /images/ap.gif 304 |
| FTP | %date %transfert_time %host %requetesize %page %null %null %direction %null %login %method %null %null %status | Tue May 7 15:28:51 2002 920 mix.iap.fr 668499968 /ftp1/linux/redhat-7.3/valhalla-i386-disc1.iso b _ o a guest@unknown ftp 0 * c |
| RealServer | %host %null %login %date %hourshift %method %page %protocol %status %requetesize %agent | 62.123.125.30 - - [09/Apr/2003:16:32:10 +0200] "GET admin/xblib.js HTTP/1.0" 200 0 [Mozilla/5.0 (X11;U;Linux i686;en-US;rv:1.3a) Gecko/20021212] [] [UNKNOWN] 0 0 0 0 0 398 |
| Squid native | %date %elapsed %host %codestatus %requetesize %method %page %null %peerstatus %mimetype | 1042153466.411 120 4.1.200.248 TCP_REFRESH_HIT/304 258 GET http://www.voyages-sncf.com/img/seldate.gif - DEFAULT_PARENT/127.0.0.1 - ALLOW |
| Squid common | %host %null %login %date %hourshift %method %page %protocol %status %requetesize %codestatus | 6.20.235.223 - - [28/Feb/2008:00:29:27 -0500] "GET http://www.internet-direct.net:8080/news.html HTTP/1.1" 200 392 TCP_MISS:DIRECT |
| Squid ECLF | %host %null %login %date %hourshift %method %page %protocol %status %requetesize %referer %agent %codestatus | 6.20.235.223 - - [28/Feb/2008:00:29:27 -0500] "GET http://www.internet-direct.net:8080/news.html HTTP/1.1" 200 392 "http://www.three.com.hk/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; .NET CLR 1.1.4322)" TCP_MISS:DIRECT |
| Postfix/Sendmail | %date %null %module %id %message | Jan 7 12:09:36 portal postfix/lmtp[21014]: B43F5744258: to=<desti1@some.com>, relay=/var/lib/imap/socket/lmtp[/var/lib/imap/socket/lmtp], delay=1, status=sent (250 2.1.5 Ok) |
(*) The W3C format is detected automatically, because the format description is included in the logfile itself.
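To illustrate why the W3C format can be detected without configuration: W3C extended logfiles start with directive lines, and the "#Fields:" directive names each column. The sketch below reads that directive and builds a keyword string from it. The mapping table and the function name detect_format are assumptions made for this illustration; W3Perl's own detection code is not shown in this document and may work differently.

```python
# Assumed mapping from W3C field names to W3Perl-style keywords
# (illustrative only, not taken from the W3Perl source).
W3C_TO_KEYWORD = {
    "time": "%hour",
    "c-ip": "%host",
    "cs-method": "%method",
    "cs-uri-stem": "%page",
    "sc-status": "%status",
}


def detect_format(lines):
    """Build a keyword string from the first '#Fields:' directive, or return None."""
    prefix = "#Fields:"
    for line in lines:
        if line.startswith(prefix):
            names = line[len(prefix):].split()
            # Fields without a known keyword are mapped to %null.
            return " ".join(W3C_TO_KEYWORD.get(name, "%null") for name in names)
    return None


sample = [
    "#Version: 1.0",
    "#Fields: time c-ip cs-method cs-uri-stem sc-status",
    "19:05:37 193.149.100.108 GET /images/ap.gif 304",
]
print(detect_format(sample))  # %hour %host %method %page %status
```

A logfile without a "#Fields:" line makes detect_format return None, which is the case where a format string has to be supplied by hand.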
You can also define your own logfile format using the following keywords:
| Field | Description | Value (ex) |
| %host | Name or IP address of the remote host | www.lyot.obspm.fr or 145.238.44.5 |
| %date | Date, either as Day:Hour or just Day | 01/Jan/97:23:12:24 or 1998-02-02 |
| %time | Time of day | 23:12:24 |
| %hourshift | Offset from GMT | +0200 |
| %method | Method used to request the file | GET |
| %page | The file requested | /index.html or /sky/astro.gif |
| %protocol | Protocol used | HTTP/1.0 |
| %status | The status code | 200 |
| %requetesize | Number of bytes transferred for the requested file | 1345 |
| %agent | Browser and OS of the remote host | Mozilla/4.01 (X11; I; SunOS 5.3 sun4m) |
| %refer | The page the request came from | http://www.google.com/ |
| %virtualhost | Name of the server the request was sent to | www.lyot.obspm.fr |
| %query | Arguments of the request | q=w3perl&meta=lr%3D%26hl%3Den |
| %direction | FTP: outgoing or incoming transfer | o |
| %I | Input bandwidth | 56 |
| %O | Output bandwidth | 345 |
| %transfert_time | FTP: transfer time | 920 |
| %elapsed | Time taken to answer the request | 120 |
| %codestatus | Proxy status code | TCP_REFRESH_HIT/304 |
| %peerstatus | Proxy peer status | DEFAULT_PARENT/127.0.0.1 |
| %mimetype | Proxy MIME type | text/html |
| %id | Mail log id | B43F5744258 |
| %module | Mail log module | sendmail |
| %message | Mail log message | a-pit@roll.com H=smtp.dom.com [38.113.3.61] P=esmtp S=15923 id=01c78cbd$354bb4d0$6c822ecf@a-pitarch-v |
| %null | Any field not covered by this list | - |
For example, to describe the CLF format, use:
%host %null %login %date %hourshift %method %page %protocol %status %requetesize
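As a rough illustration of how such a keyword string lines up with a log entry, the sketch below pairs each keyword with one whitespace-separated token of the CLF sample line shown earlier. It is a simplification written for this page, not W3Perl's parser: the function name parse_line is hypothetical, and stripping brackets and quotes before splitting is an assumption that only works when no field contains spaces (so it would not handle %agent in ECLF).

```python
import re

CLF_FORMAT = "%host %null %login %date %hourshift %method %page %protocol %status %requetesize"
CLF_LINE = 'www.lyot.obspm.fr - - [01/Jan/97:23:12:24 +0000] "GET /index.html HTTP/1.0" 200 1220'


def parse_line(fmt, line):
    """Map each keyword of fmt to the matching token of line (hypothetical helper)."""
    keywords = [k.lstrip("%") for k in fmt.split()]
    # Assumption: drop the brackets and quotes wrapping the date and the request,
    # then treat every whitespace-separated token as one field.
    tokens = re.sub(r'[\[\]"]', "", line).split()
    if len(tokens) < len(keywords):
        return None  # the line does not have enough fields for this format
    # Fields tagged %null are ignored.
    return {k: v for k, v in zip(keywords, tokens) if k != "null"}


record = parse_line(CLF_FORMAT, CLF_LINE)
print(record["host"], record["page"], record["status"])
# www.lyot.obspm.fr /index.html 200
```

Each value is kept as a string; converting %status or %requetesize to integers would be a separate step.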