W3Perl development

Taking a break from new features. I will spend the next few weeks fixing problems, increasing speed, saving memory and improving the display.

MAR
04
3.05 Final
04.03.2009 | 3.05 Final | Author: Domisse
One month later, I'm still working on the 3.05 release. Many small improvements have been made thanks to user feedback.
- To avoid updating the main stats twice, a lock has been added to the main script.
- If you run the master script to update the stats and no output has been produced yet, the script switches to initialization mode.
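
The lock idea can be pictured with a short sketch. It is written in Python rather than Perl and is only an illustration: the lock file name `w3perl.lock` and the helper names are assumptions, not the actual W3Perl code.

```python
import os
import tempfile

def acquire_lock(lock_path):
    """Try to create the lock file atomically; return True on success."""
    try:
        # O_EXCL makes the call fail if the file already exists,
        # so only one process can create it.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as handle:
        handle.write(str(os.getpid()))
    return True

def release_lock(lock_path):
    """Remove the lock file so the next update can run."""
    os.remove(lock_path)

# Hypothetical lock location, for illustration only.
lock_file = os.path.join(tempfile.mkdtemp(), "w3perl.lock")
first = acquire_lock(lock_file)    # the first run gets the lock
second = acquire_lock(lock_file)   # a second run is refused
release_lock(lock_file)
```

A real script would also have to decide what to do with a stale lock left behind by a crashed run; that part is not covered here.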

Squid reports are now lighter (and therefore faster). Some file types are now rejected (numeric ones and those longer than 15 characters).
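
A minimal sketch of such a filter, in Python. It assumes that "number" means a purely numeric extension; the function name and the exact rule are illustrative, not taken from the W3Perl source.

```python
def is_valid_filetype(extension, max_length=15):
    """Reject extensions that are purely numeric or longer than max_length."""
    if extension.isdigit():
        return False
    return len(extension) <= max_length

print(is_valid_filetype("html"))                  # True
print(is_valid_filetype("12345"))                 # False
print(is_valid_filetype("averyveryverylongext"))  # False (20 characters)
```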

At last, the IIS installer seems to work fine. Some improvements remain to be done, but it can be used. The output directory is now created during installation, which allows the installer to grant the IIS user write access to it.

You may still run into the following errors:

Error message: "No log files found !"
- The logfile directory should be readable by the IIS user.

Error message: "404 - Page not found"
- The Perl CGI extension should be enabled in the IIS admin (Web Service Extension).

Error message: "can't exec : Make sure cmd.exe can be run by IIS User"
- The IIS user should be allowed to run cmd.exe (needed when running the master script from the web interface).

Error message: "Unable to open xxx file"
- The output directory should be readable and writable by the IIS user (this is set on the default output directory during installation).

More information about adding IIS 6.0 Metabase Compatibility in IIS 7.0:
- http://msdn.microsoft.com/en-us/magazine/cc163453.aspx
- http://msdn.microsoft.com/fr-fr/library/bb675150.aspx
- http://technet.microsoft.com/fr-fr/magazine/2008.03.iis7.aspx

I've been able to install the IIS package on IIS 7.0 (Windows Server 2008).
If UTF-8 is enabled, log filenames start with u_ex instead of ex.
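
A log parser therefore has to accept both prefixes. The Python sketch below shows one way to do it; it assumes date-based names such as `ex090304.log`, and the pattern and function name are illustrative only.

```python
import re

# Accepts the "ex" prefix as well as "u_ex" (used when UTF-8 logging is enabled).
IIS_LOG_NAME = re.compile(r"^(?:u_)?ex(\d+)\.log$", re.IGNORECASE)

def iis_log_date(filename):
    """Return the digits of the date part, or None if the name does not match."""
    match = IIS_LOG_NAME.match(filename)
    return match.group(1) if match else None

print(iis_log_date("ex090304.log"))    # 090304
print(iis_log_date("u_ex090304.log"))  # 090304
print(iis_log_date("access.log"))      # None
```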

FEB
28
Roadmap
28.02.2009 | Roadmap | Author: Domisse
As the package becomes more and more popular, I receive more feature requests, so many things still need to be done. Here is the list of the main 2008 improvements :

  • Frame to CSS display
  • Clickmap (still experimental)
  • City location (thanks to the GeoIPCity module)
  • Configuration file can be built online
  • Windows Abyss installer
  • JavaScript tag package

and 2009 planning :

  • Load-balancing server logfiles : merging reports
  • Clickmap working with floating page
  • Domino / Streaming server
  • Better installation for IIS
  • More speed
  • Running a Web Service ?
  • Build every graphic type output and choose one of them online
  • Graph enhancements (Flash ?)
  • More tests on the tag package
  • Search in the toolbar ?
  • SearchHost for session stats
  • Adding Cookbook and updating PPT
  • Cloud tag on referer (sentences ?)
  • Spam detection
  • XML output + Xslt
  • Blacklist Email
  • Multi-proc (WS)
  • Subversion repository ?

Problems to fix :

  • Agent stats to rewrite
  • Using cgi-perl (fastcgi) instead of cgi-bin
  • Applying Tablesort on search results
  • Sorting on increase/decrease array
  • Improving script stats with big logfiles
  • Better worldmap
  • Filtering on tablesort
  • Running refer/agent together if nb_proc > 1

FEB
24
3.05 fixes
24.02.2009 | 3.05 fixes | Author: Domisse
The new IIS installer is now working, but as nobody seems to be interested in it, I will switch to bug fixes.

On some Windows systems, the 'system' command in the master script (cron-w3perl.pl) fails when run from the web interface. I spent days on Google trying to find out why. When calling exec or system, Windows launches a new shell with cmd.exe. If your configuration allows cmd.exe to be run by the IIS user (IUSR_[hostname]), there is no problem. If not, you get a fatal error. Why some computers deny permission and others do not is a mystery. The only way to let the script run is to change the permissions on cmd.exe to grant the IUSR_XXX account read/write access ... not very safe.
Of course, you can run the master script from the Windows Start Menu rather than from the web interface to avoid this problem. Or, if you are using the web interface, you can still launch the individual scripts ... but not the master one.

Fixes :
- Email report cleaning
- Removed some extension checks
- Referer stats are now restricted to URIs without script parameters (these were causing memory crashes)
- Squid reports bypass HTML tree generation (it was designed for a single website, not thousands)
- Using fork rather than system in the master script (ActivePerl now supports some sort of forking).
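
Dropping script parameters from referers keeps the number of distinct entries small, which is what saves memory. Here is a minimal Python sketch of the idea; the function name and the example URLs are made up for illustration and this is not the W3Perl implementation.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit

def strip_script_parameters(referer):
    """Return the referer without its query string and fragment."""
    parts = urlsplit(referer)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))

# Hypothetical referers that differ only by their parameters.
referers = [
    "http://www.example.com/search.php?q=w3perl&page=1",
    "http://www.example.com/search.php?q=w3perl&page=2",
    "http://www.example.com/search.php?q=stats",
]
counts = Counter(strip_script_parameters(r) for r in referers)
print(counts)  # a single key with a count of 3 instead of three separate keys
```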

Internet Explorer 8.0 will soon be released and will introduce Web Slices, a nice feature which allows you to bookmark only a part of an HTML page. Very useful if you want to refresh information without going to the website. It looks like RSS with visual data. But it will make stats more difficult!
- http://googlexxl.blogspot.com/2009/01/creer-un-webslice.html
- http://msdn.microsoft.com/fr-fr/ie/cc963660.aspx

FEB
14
IIS Installation
14.02.2009 | IIS Installation | Author: Domisse
I've been working on improving the IIS installation for the last week. As a starting point, I used this script from Ken Robertson's blog. It uses ADSI rather than WMI to query the IIS metabase. Sadly, with IIS 7.0 everything changes and the IIS metabase no longer exists. But you can still use the IIS6 Metabase Compatibility feature on IIS 7.

WMI only works from IIS 6.0 onwards, whereas using ADSI to configure the web server works on IIS 5.0, 5.1 and 6.0. So if you want something that works from 5.0 to 7.0, you need to use both ADSI and WMI!

What about the IIS timeout? A complete review is available here, but of course it won't work on IIS 7.0. The installer changes the default value from 300 seconds to 10 hours.

The IIS ADSI Provider
The IIS ADSI provider exposes COM Automation objects that you can use within command-line scripts, ASP pages, or custom applications to change IIS configuration values stored in the IIS metabase. IIS ADSI objects can be accessed and manipulated by any language that supports automation, such as Microsoft® Visual Basic® Scripting Edition (VBScript), Microsoft JScript®, Perl, Microsoft Active Server Pages (ASP), Visual Basic, Java, or Microsoft C++.

The IIS WMI Provider
Windows Management Instrumentation (WMI) is a technology that allows administrators to programmatically manage and configure the Windows operating system and Windows applications. Beginning with IIS 6.0, IIS includes a WMI provider that exposes programming interfaces that can be used to query and configure the IIS metabase.

The first step is to list the IIS websites with a VBS script. Then you need to add a custom page in NSIS to let the user select one of the IIS websites found.
Within IIS, you should allow Perl to run in 'Web Service Extension'.
The IIS configuration file is then built according to the website you have chosen. You can still use the web admin to change some values, but it will work with the default IIS configuration file.

Feel free to test this new installer on IIS 6.

For IIS 7.0, you should select 'IIS6 Metabase Compatibility'; I will add WMI scripting very soon.

Useful links :
- http://weblogs.asp.net/krobertson/archive/2004/04/01/106002.aspx
- http://blogs.msdn.com/david.wang/archive/2005/07/13/HOWTO_Enumerate_IIS_Website_Configuration.aspx

FEB
03
3.05
03.02.2009 | 3.05 | Author: Domisse
A new stable release is out. This is mainly a bug-fix release. I'm not able to test everything (Squid/Email/FTP reports ...) with so many options (City/Intranet/Robots ...) and so many different logfile names and formats (daily/monthly and/or gzip files) on different environments (Ubuntu/Mandriva/Windows ...)! So your help is very welcome.

Lots of improvements have been made :

- Easier FTP installation for users. Upload w3perl's files to your provider and it should work from scratch.
- Many bug fixes on the Windows side. Upgrading is also easier.
- More configuration checks to avoid conflicts.
- The web admin interface has been improved.
- Many cosmetic changes, including a new sortable library, faster and more powerful.
- More tools, like a CSS switcher which allows you to use your own CSS stylesheet.
- More reports, like the referer list for each entry point.
- Resource files updated, with Thai language added and Spanish/German updated.
- Better support for HTMLDoc (PDF) and MIME::Lite (email)

Ideas for the next release :

- Improving Windows installer (installing/upgrading - choosing install directory).
- Support for load balancing server (with logresolvemerge or with a script to combine several reports)
- Support for Streaming server
- Support for SMTP Authenticated server
- Adding filtering in table
- Agent script to rewrite
- Improving script stats (saving memory).
- Support for multi-processor.

JAN
25
3.047 bis
25.01.2009 | 3.047 bis | Author: Domisse
3.047 has been updated. Mainly bug fixes and cosmetic changes. The next stable release should be available before the end of January.

- Support for 7zip compressed logfiles is available.
- The ReadParse function is now safer thanks to Johannes, who suggested that I remove this vulnerability.
- The Abyss and No server packages were faulty because Windows paths were not translated correctly.

JAN
18
3.047
18.01.2009 | 3.047 | Author: Domisse
Last development package before the next stable release.

Still testing the package and fixing small issues. To send emails, the domain name must be filled in with a literal string (not an IP address); also, the file sent was broken on Windows. Local hosts were not shown in the Cities stats. If resolv_users.csv is present, Intranet reports are improved (reverse DNS is no longer needed and reports can display usernames rather than hosts). A few cosmetic changes.

JAN
10
3.046 - Windows bug fixes
10.01.2009 | 3.046 | Author: Domisse
Oops ... there was a bug in 3.045. The trailing slash for Windows paths was missing, so a new package is available.

Many improvements on the Windows side :
- Upgrading your package on Windows is now easier: you just need to copy the new files into your current w3perl installation directory and run the upgrade.pl script to copy them into your cgi directory (in fact, the script does more than just copy files).
- A new field is available if your SMTP server requires authentication (to allow emails to be sent).
- Fixed some bugs on Windows (day range selection gave an invalid graph, hourly graphs were not shown).
- I've found HTMLDoc 1.8.27 freely available for Windows. The previous version, 1.8.24, was a bit buggy, so I hope the new one will solve many problems.
- I will also try to add the ability to choose the installation directory for IIS before 3.05.

I've added a few more checks in the web admin, and session stats now show percentages (as requested).

The next stable release should be available before the end of January.

DEC
31
W3Perl 2008 Download
30.12.2008 | W3Perl 2008 Download | Author: Domisse
Christmas time ... at last Flash is available for 64-bit Linux! If you want to play a few games :
- Brain stimulator
- Learn France geography

Time to extract some data from my own website.
First, how many times has the package been downloaded? To speed things up, I filter on the file extensions exe/gz/rpm/deb. With the precision set to 4, two graphs are computed for each package (daily and monthly stats). I saved the monthly CSV files for each package and combined them in Gnumeric to get this graph.

graph1

Results for 2008 :

- Around 11 000 downloads (all packages together, including dev packages), which means about 30 downloads per day.
- That is about a 300% increase compared to 2007.
- The download rate is the same from January to December ... no increase (variations are mainly due to the IIS package).
- IIS is the top package with 3250 downloads, but falling in December.
- The IIS and tarball packages account for 49% of the downloads (28% and 21%).
- The specific installers for Debian/Ubuntu and Mandriva account for only 7% each.
- Only 7% downloaded the development package.
- From the referer stats, 1430 downloads (8%) came from external websites (600 from toocharger.com and only 50 from freshmeat) (*)
- Release dates do not bring a lot of extra downloads (12/Feb, 24/Jun and 28/Oct), especially for the Windows packages.

(*) Most external websites host their own copy, so users download from their website and not from mine. It means more downloads, but I have no way of knowing how many.

Top ten

Pages                                     Occurrence  Percentage  Hosts
/download/w3perl-iis.exe                  3248        28.0 %      1216
/download/w3perl.tar.gz                   2463        21.2 %      1518
/download/w3perl.exe                      1909        16.4 %      1050
/download/w3perl-apache.exe               1180        10.2 %      579
/download/w3perl_3.03_all.deb             322         2.8 %       215
/download/w3perl-3.02-1mdv.noarch.rpm     297         2.6 %       89
/download/w3perl-dev.tar.gz               277         2.4 %       175
/download/w3perl_3.02_all.deb             274         2.4 %       217
/download/w3perl-dev.exe                  213         1.8 %       95
/download/w3perl-apache-dev.exe           211         1.8 %       85

[Graph: number of downloads by week]

[Graph: number of hosts downloading the package (peaks are release days)]

Number of hosts for w3perl.tar.gz

Date      Occurrence
2008-Jan  144
2008-Feb  168
2008-Mar  112
2008-Apr  122
2008-May  97
2008-Jun  146
2008-Jul  135
2008-Aug  109
2008-Sep  129
2008-Oct  138
2008-Nov  105
2008-Dec  113

[Graph: number of hosts downloading the package w3perl.tar.gz each month]
(*) I assume that one host downloads a specific package once per day. This is not quite true, as some hosts download more than that (some are repository servers, others may download more than one package). It gives a very low bound on the number of downloads (about 6000). The graphs above are only for the w3perl.tar.gz package.
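
The figures above can be cross-checked with a few lines of arithmetic. The Python sketch below only reuses numbers quoted in this post: the monthly host counts for w3perl.tar.gz and the rounded total of 11 000 downloads.

```python
# Monthly host counts for w3perl.tar.gz, copied from the table above.
monthly_hosts = {
    "2008-Jan": 144, "2008-Feb": 168, "2008-Mar": 112, "2008-Apr": 122,
    "2008-May": 97, "2008-Jun": 146, "2008-Jul": 135, "2008-Aug": 109,
    "2008-Sep": 129, "2008-Oct": 138, "2008-Nov": 105, "2008-Dec": 113,
}

total_hosts = sum(monthly_hosts.values())
busiest_month = max(monthly_hosts, key=monthly_hosts.get)

# "Around 11 000 downloads" spread over 2008, which had 366 days.
downloads_per_day = 11000 / 366

print(total_hosts)               # 1518, the Hosts figure for w3perl.tar.gz
print(busiest_month)             # 2008-Feb
print(round(downloads_per_day))  # 30
```

The monthly counts add up to the 1518 hosts listed for w3perl.tar.gz in the top ten table.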

DEC
23
3.044 Referer stats improved
23.12.2008 | Referer stats improved | Author: Domisse

- MacOS installation manual
At last, someone was nice enough to write a small installation guide for Mac OS X users. I hope it will be helpful for others. The next step would be to find someone who could package W3Perl! ;)

- How to get full reports on xml or cgi queries ?
By the way, if you need stats about a specific file type, you just have to include it in @extension. Any file whose extension is listed in @extension will be treated as a 'Page'. It can sometimes be useful to view xml or cgi files as 'html pages'.
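
The effect of @extension can be pictured with a small Python sketch. The list contents and the `is_page` helper are assumptions made for illustration; the real @extension array lives in the W3Perl configuration file and its exact format may differ.

```python
import posixpath

# Hypothetical stand-in for the @extension list of the configuration file.
extension = ["html", "htm", "php", "xml", "cgi"]

def is_page(uri, page_extensions):
    """Return True if the requested file has one of the 'page' extensions."""
    path = uri.split("?", 1)[0]  # ignore script parameters
    ext = posixpath.splitext(path)[1].lstrip(".").lower()
    return ext in page_extensions

print(is_page("/feed/news.xml", extension))           # True
print(is_page("/cgi-bin/search.cgi?q=x", extension))  # True
print(is_page("/img/logo.gif", extension))            # False
```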

- Requests for Streaming server support
W3Perl was able to parse RealAudio logfiles in the past. But it seems nobody uses the package with a RealAudio server anymore, so I dropped the support. Now I have requests for Icecast and Darwin streaming servers. The first one works without problems, as its logfile format is very similar to the CLF format. More work needs to be done for the second one.

- IIS Timeout
You may hit a timeout when using IIS, as the default value is 300 seconds. You can raise this setting with the following command: C:\Inetpub\AdminScripts>adsutil SET W3SVC/CGITimeout <value>
More information can be found here
Fortunately, there is a trick for the Apache web server. If you flush stdout, no timeout will occur, so the trick is to output some message from time to time.
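
The flushing trick looks roughly like this. The sketch is in Python with invented names and messages; it illustrates the pattern only and is not the W3Perl code.

```python
import sys
import time

def long_running_job(steps, out=None, delay=0.0):
    """Do the work in steps and flush a progress line after each one."""
    out = out or sys.stdout
    for step in range(1, steps + 1):
        time.sleep(delay)  # stands in for the real work
        out.write(f"Processing step {step}/{steps}...\n")
        out.flush()        # push the line out now instead of leaving it buffered
    return steps

done = long_running_job(3)
```

Without the explicit flush, the output may sit in a buffer until the script ends, and the web server sees nothing in the meantime.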

- Windows installer
Another request is to improve the Windows installer, as the package always uses C:. I'm using the registry to find where IIS is located, but to find the virtual directory path I need to query WMI. More VBS needs to be written!
Also on the Windows side, I now take care of paths containing space characters. They should be shortened, as Windows still supports 8.3 filenames ... so "Program Files" should be "Progra~1" in the config file.

Improvements :
- Display list of available configuration files in web admin
- Adding a description field for config filename in web admin
- Icecast streaming configuration file.
- Display list of entry points for each referer and list of referers for each entry point
- Thai language added (thanks to Maxime Carpentier)
- Windows pathnames are shortened when a space character is found
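
The path shortening can be approximated as in the Python sketch below. This is a deliberately naive guess (first six characters plus "~1"), with made-up function names. The real short name is assigned by the filesystem and may end in ~2 or higher, so on Windows the reliable way is to ask the system, for example through the Win32 GetShortPathName function.

```python
def shorten_component(name):
    """Naive 8.3-style guess for a path component containing spaces."""
    if " " not in name:
        return name
    return name.replace(" ", "")[:6] + "~1"

def shorten_windows_path(path):
    """Apply the naive shortening to every component of a Windows path."""
    return "\\".join(shorten_component(part) for part in path.split("\\"))

print(shorten_windows_path(r"C:\Program Files\w3perl"))  # C:\Progra~1\w3perl
```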

DEC
16
3.043 - Style switcher
16.12.2008 | Style switcher | Author: Domisse

You can now change the output style thanks to a style switcher. Cookies must be enabled in order to keep the style across the different pages. If you want to build your own CSS style, add your CSS name to @alternate_css_style. The filenames should be something like alternate_<number> and alternate_menu_<number>, with <number> being the array index. If you have something nice, send me your CSS and I will include it in the next release.

Bug fixes :
- Some wrong headers in tables.
- htmldoc v1.9 beta crashes on some styles; they are now bypassed.
- Fixed some wrong links in the Hosts area.
- The CSS menu was broken for old IE (< 6.0); the fix was applied only to the homepage! It is now available on all pages.
- Some fixes in the web admin interface (server protocol and port number)
- CSS menu display improved
- Empty document with PDF/Email when individual scripts had been run

DEC
07
3.042 - New sortable
07.12.2008 | New sortable | Author: Domisse

3.042 has been updated. I'm switching to a new sortable library which is much faster and more powerful. A footer can now be included at the bottom of the table, pagination is supported and you can even have some kind of filtering inside your table!

Someone asked me if W3Perl is multi-threaded. Sadly, it is not. But I have some plans to increase speed, as some scripts can be run simultaneously.

Bug fixes :
- Windows configuration files now have yearly stats enabled by default
- New option -b in cron-pages.pl to parse logfiles up to the current day (init)
- Flydraw is the default graphic tool on Ubuntu
- Fixed some wrong links in the Hosts area.
- Error in the configuration file if the PDF option was enabled (wrong check).

NOV
28
3.042
28.11.2008 | 3.042 | Author: Domisse

Nothing really new in 3.042. I've removed the fixperlpath.pl script, which is no longer required for installation. If you want to install W3Perl on your provider's host, it will work with the default path. You still need to know where you want to output the stats reports and where the input logfiles are located. But most providers supply this kind of information.

Bug fixes :
- Some German translations fixed.
- Removing space characters in configuration filenames
- CSS menu with real-time was broken
- Wrong background link on some installs.
- Filtering heatmap script call in stats report.

NOV
23
FTP Installation
23.11.2008 | FTP Installation | Author: Domisse

FTP installation no longer requires running an install script. You just need to upload the files to your provider's host in order to install the package. The prerequisites are the Perl path (the default should work with most providers), the cgi path, your server root path and the location of your logfiles. If logfiles are not available, you can still use the page tagging package, which builds daily logfiles thanks to a small JavaScript tag inserted in your pages.

A few questions I received :

Since the script does not use a database, where does it store info ?
- Data are stored in files. Logging to a database takes too much space with large logfiles, and querying huge tables could take too much time.

I guess this stats script has to be installed on each domain in order to track the stats, it can't be just one installation to track multiple domains
- If the domains are hosted on the same server, you only need to install w3perl once (this works only on Unix servers, as I use symbolic links). There will be a master/slave install. I haven't tested the install script in this case for a while, so it may not work properly, but I can fix any bugs found.

I just want to know though, using the real time application can you see individual users on your website and the stats for each user and how long they have been on.
- A picture is worth a thousand words ... There is no way to track 'users', only 'hosts'.

Real_time_hosts

The real-time report shown here (updated at 5:52 am) says :
- The last visitor was 82.234.9.232, who made 244 requests and viewed 24 pages. The pages seen are listed with their occurrence. This visitor mainly went to the forum, viewed 4 threads and edited one post. They also viewed the real-time report page twice. They spent 33 minutes on the website and their last page view was at 5:52 am (server time).
- There was also another visitor (91.205.124.12) who was reading an article about the Voyager 2 spacecraft. 66 pages were read in one hour (4:52 am to 5:52 am). If you want to know the time spent on each page, you'll have to check the session report.
- and so on ...

The page displays visitors from the last hour ... but you can raise this value by filling in the form and resubmitting the script.
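
The kind of per-host summary shown in this report can be sketched in a few lines of Python. The record layout (host, timestamp, page flag), the sample addresses and the function name are assumptions made for the example, not the real-time script itself.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def recent_hosts(requests, now, window=timedelta(hours=1)):
    """Summarise requests per host for the time window ending at 'now'."""
    summary = defaultdict(lambda: {"requests": 0, "pages": 0, "last_seen": None})
    for host, when, is_page in requests:
        if when > now or now - when > window:
            continue  # outside the requested window
        entry = summary[host]
        entry["requests"] += 1
        entry["pages"] += int(is_page)
        if entry["last_seen"] is None or when > entry["last_seen"]:
            entry["last_seen"] = when
    return dict(summary)

# Hypothetical sample data.
now = datetime(2008, 11, 23, 5, 52)
sample = [
    ("192.0.2.10", datetime(2008, 11, 23, 5, 20), True),
    ("192.0.2.10", datetime(2008, 11, 23, 5, 52), False),
    ("192.0.2.20", datetime(2008, 11, 23, 3, 0), True),  # older than one hour
]
report = recent_hosts(sample, now)
```

Passing a larger `window` corresponds to raising the value in the form.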

Bug fixes :
- Running session stats with an uncompressed Apache rotated logfile (.1) (usually only the current logfile is uncompressed).
- Add y-label on selected pages graph.
- Removing null entries in daily pages report.
- FTP documentation has been updated.
- Crash when all agent or referer values are empty/invalid.

NOV
18
3.041
18.11.2008 | 3.041 | Author: Domisse

This new development package is mainly a bug-fix release. The only new feature is a graph which shows the number of hosts per day for each page (when using the highest level of precision). A CSV file is also available; it allows you to use an external program (say, Excel) to build nicer graphs. I will soon plot the number of w3perl downloads since the beginning of the year.

NOV
12
3.04 Bugs
12.11.2008 | 3.04 bugs | Author: Domisse

The first reports about 3.04 are coming in.

A problem occurs if you are using a secure server (https rather than http), as the protocol was hardcoded. A new field is available in the configuration file so you can choose between a secure and a non-secure server.

Problems when updating agent stats are still there. I need to rewrite this script.

A fatal bug crashed the incremental script if a page name containing a space character was found. The extra space was removed and the name was then saved as two strings, confusing the incremental script when reloading the saved data.

Updating the CSS menu was really memory hungry. Instead of loading the file into an array, I now rename it first, then read it line by line to update the CSS menu. It is also much faster! (unlike in C, Perl works faster when parsing a file line by line!). Some parts of the code have also been removed to save memory. Thanks to Tom, who provided me with big daily logfiles (600 MB daily).
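
The rename-then-stream approach can be sketched as follows, in Python. The file names, the `.old` suffix and the `transform` callback are assumptions made for the example.

```python
import os
import tempfile

def update_file_streaming(path, transform):
    """Rename the file, then rewrite it line by line through 'transform'."""
    backup = path + ".old"
    os.replace(path, backup)  # rename the current file first
    with open(backup, "r", encoding="utf-8") as src, \
         open(path, "w", encoding="utf-8") as dst:
        for line in src:      # only one line is held in memory at a time
            dst.write(transform(line))
    os.remove(backup)

# Hypothetical CSS menu file, for illustration only.
workdir = tempfile.mkdtemp()
menu_path = os.path.join(workdir, "menu.css")
with open(menu_path, "w", encoding="utf-8") as handle:
    handle.write("a { color: red; }\nb { color: red; }\n")

update_file_streaming(menu_path, lambda line: line.replace("red", "blue"))
with open(menu_path, encoding="utf-8") as handle:
    updated = handle.read()
```

Memory use stays flat however large the file is, since the whole content is never loaded at once.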

The world map now shows smaller points, and duplicate points are removed to speed up graph building.
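
Removing duplicate points before drawing is cheap. Here is a minimal Python sketch; the coordinates are invented and the function name is an assumption.

```python
def unique_points(points):
    """Keep the first occurrence of each (x, y) point, preserving order."""
    seen = set()
    result = []
    for point in points:
        if point not in seen:
            seen.add(point)
            result.append(point)
    return result

# Hypothetical map coordinates: many hosts share the same location.
points = [(120, 45), (300, 80), (120, 45), (120, 45), (300, 80)]
print(unique_points(points))  # [(120, 45), (300, 80)]
```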

NOV
04
New powerpoint
04.11.2008 | PPT | Author: Domisse

A new PowerPoint presentation is available. The previous one was five years old and was in French! I'll try to update it from time to time.