
Advances, Systems and Applications

Table 2 Machine and Network Data sources

From: A Neuro-fuzzy approach for user behaviour classification and prediction

#

Log Type

Information Contained in the Log

1

ODBC Logging

• The user’s IP address

• User name

• Request date and time

• HTTP status code

• Bytes received

• Bytes sent

• Action carried out (for example, a download carried out by a GET command)

• The target (for example, the file that was downloaded)

• The time is recorded as local time

• You must specify the database to log to

• You must set up the database table manually to receive the data

2

Proxy Server Logs

• Apache with mod_proxy

• Apache Traffic Server

• HAProxy

• Internet Information Services (IIS) configured with a proxy module

• Nginx

• Privoxy

• Squid

• Varnish (reverse proxy only)

• WinGate

3

Browser History

All web browsers maintain the user's visit history in one form or another. Some browsers write a .dat file, while Chrome keeps the data in multiple places; browsers that use SQLite keep the database files in their own data folders on the system drive. These files can be analysed offline to extract useful information for web personalization purposes.

4

Server or Client-side Visit Logger App

Small applications can be developed to run on the server or client side to collect visitors' location and browsing-history information. Such applications are usually deployed on the client side, either as browser add-ons or embedded in the website's code. A widely used example is search engine analytics, where a small piece of code provided by the search engine is placed in the footer of each web page. That code gathers the visitor's geographical information and page-visit details, including the names of the pages visited, the referring website (the site the visitor came from before arriving at the current one) and, for referred visits, any search keywords used. It also records how long the visitor stays on each page, which gives an indication of users' interests, and the website's bounce rate, i.e. how many visitors left the website from the page on which they landed.
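
The stay-time and bounce-rate measures described in the last row can be illustrated with a minimal Python sketch. It is not the logger used in the paper: the record layout `(visitor_id, page, timestamp)`, the sample values, and the function names are all assumptions made for illustration, and each visitor is treated as a single visit.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical page-view records: (visitor_id, page, ISO timestamp).
# The layout and sample values are illustrative, not taken from the source.
PAGE_VIEWS = [
    ("v1", "/home", "2024-01-01T10:00:00"),
    ("v1", "/products", "2024-01-01T10:02:30"),
    ("v2", "/home", "2024-01-01T11:00:00"),
    ("v3", "/blog", "2024-01-01T12:00:00"),
    ("v3", "/home", "2024-01-01T12:01:00"),
    ("v3", "/contact", "2024-01-01T12:05:00"),
]


def group_by_visitor(page_views):
    """Return {visitor_id: [(timestamp, page), ...]} sorted by time."""
    visits = defaultdict(list)
    for visitor, page, stamp in page_views:
        visits[visitor].append((datetime.fromisoformat(stamp), page))
    for views in visits.values():
        views.sort()
    return dict(visits)


def bounce_rate(visits):
    """Share of visits that viewed only the landing page."""
    if not visits:
        return 0.0
    bounces = sum(1 for views in visits.values() if len(views) == 1)
    return bounces / len(visits)


def stay_times(visits):
    """Total seconds spent on each page, measured up to the next page view.

    The last page of a visit has no following view, so it is skipped.
    """
    totals = defaultdict(float)
    for views in visits.values():
        for (start, page), (end, _next_page) in zip(views, views[1:]):
            totals[page] += (end - start).total_seconds()
    return dict(totals)


visits = group_by_visitor(PAGE_VIEWS)
print(bounce_rate(visits))
print(stay_times(visits))
```

Because the time on the final page of a visit cannot be derived from page-view timestamps alone, this sketch leaves it out; a client-side logger could instead report it explicitly, for example when the page is closed.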