LSE Web server log analysis: Frequently Asked Questions


The log analysis from September 1999 onwards is produced by a new application which gives considerably more information. Answers to questions that School Web editors have posted are listed below:

What is the difference between 'HITS', 'FILES', 'PAGES' and 'VISITS'?
'Hits' records requests for individual files. 'Files' records only those requests where there was something to send back to the user, so requests for missing pages or incorrect addresses do not appear under 'Files'. 'Pages' is preferable to both, as it records Web page requests with image and some other file type requests excluded, and so is a better measure of site usage.

At present 'Pages' is available for monthly summaries for individual web sites but not for individual pages listed under the monthly report's 'URLs' list. It is possible to see 'Hits' on individual resources under 'URLs' and this information can be used to indicate broad trends in whether pages are becoming more popular or not. The caveat is that increasing the number of images on a page, for example, may bias the results. We are currently looking at whether 'Pages' can be listed under 'URLs' as well as 'Hits'.
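
The distinction between 'Hits', 'Files' and 'Pages' can be illustrated with a minimal Python sketch. It is not the log analysis application's own code: the list of page extensions, the choice to count a page only when it was served successfully, and the sample requests are all assumptions made for illustration.

```python
from collections import Counter
from os.path import splitext
from urllib.parse import urlsplit

# Assumption: which extensions count as "pages"; the real application's list may differ.
PAGE_EXTENSIONS = {".htm", ".html"}


def summarise(requests):
    """Count hits, files and pages for an iterable of (url, status) pairs."""
    totals = Counter()
    for url, status in requests:
        totals["hits"] += 1                  # every request is a hit
        if status == 200:
            totals["files"] += 1             # something was sent back to the user
            path = urlsplit(url).path
            extension = splitext(path)[1].lower()
            # Assumption: only successfully served pages are counted as pages.
            if extension in PAGE_EXTENSIONS or path.endswith("/"):
                totals["pages"] += 1         # a Web page, not an image or other file
    return totals


# Hypothetical requests: one page carrying two images, plus a mistyped address.
sample = [
    ("/library/index.html", 200),
    ("/library/logo.gif", 200),
    ("/library/photo.jpg", 200),
    ("/library/indx.html", 404),
]
totals = summarise(sample)
print(totals["hits"], totals["files"], totals["pages"])
```

In this sample a single page view produces three of the four hits, which is why adding images to a page raises 'Hits' without raising 'Pages'.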

'Visits' is an approximation of how many different users looked at the web site. It is only approximate because a change of visitor is inferred: after thirty minutes, any request from a particular Web address (for example, the one allocated to you by your Internet Service Provider when you connect via dial-up) is assumed to come from a different user than the one who connected half an hour earlier from that address. Some users will of course stay on for longer than thirty minutes and so be counted twice, but the data is still useful for illustrating broad trends in web usage. The only way to track users accurately would be to require them to log in to read any LSE Web page, and publicly accessible pages have no such requirement.
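
The thirty-minute rule can be sketched in Python as follows. This is one possible reading of the rule rather than the application's actual method: it assumes a new visit starts when an address makes a request more than thirty minutes after its previous request. The addresses and times are hypothetical.

```python
from datetime import datetime, timedelta

VISIT_TIMEOUT = timedelta(minutes=30)  # the thirty-minute rule described above


def count_visits(requests):
    """Count visits from an iterable of (address, timestamp) pairs.

    Assumption: a new visit starts when an address makes a request more
    than thirty minutes after its previous request.
    """
    last_seen = {}
    visits = 0
    for address, when in sorted(requests, key=lambda request: request[1]):
        previous = last_seen.get(address)
        if previous is None or when - previous > VISIT_TIMEOUT:
            visits += 1
        last_seen[address] = when
    return visits


# Hypothetical requests from two originating addresses on one morning.
day = datetime(1999, 9, 1)
sample = [
    ("192.0.2.10", day.replace(hour=9, minute=0)),
    ("192.0.2.10", day.replace(hour=9, minute=10)),  # same visit
    ("192.0.2.10", day.replace(hour=10, minute=0)),  # 50 minutes later: new visit
    ("192.0.2.20", day.replace(hour=9, minute=5)),
]
visit_total = count_visits(sample)
print(visit_total)
```

The first address is counted as two visits even though it may be the same person, which is the kind of double counting that makes the figure an approximation.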

How do I use the new statistics?
For each School unit included in the analysis there is an initial summary page with a list of the months included in the new analysis, beginning with 'Sep 1999'. Clicking on a month link takes you to the detailed analysis for that period.

Is there any change in the definition of 'hits'?
Yes. Previously the term 'hits' referred only to HTML page requests; the new analysis takes all files served into consideration when calculating hits. The actual number of distinct HTML Web page requests is listed as 'Total Pages'.

What is the difference between 'HITS' and 'FILES'?
HITS is the total number of HTTP requests that the server received during the reporting period. Any request made to the server is considered a hit. FILES is the number of hits that actually resulted in something being sent back to the user, such as an HTML page or image. 'Total Files' and '200 - OK' totals should be the same. If you add up the totals in the 'Hits by Response Code' section, it should be the same as the 'Total Hits' figure.
(source: Webalizer FAQs)
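
The two cross-checks described above can be written as a short Python sketch. The function name and the figures are made up for illustration and are not taken from any LSE report.

```python
def report_is_consistent(total_hits, total_files, hits_by_code):
    """Return True if both cross-checks hold for one month's figures.

    hits_by_code maps an HTTP response code to its count in the
    'Hits by Response Code' section.
    """
    # The response code totals should add up to 'Total Hits'.
    codes_match_hits = sum(hits_by_code.values()) == total_hits
    # 'Total Files' should equal the '200 - OK' total.
    files_match_ok = hits_by_code.get(200, 0) == total_files
    return codes_match_hits and files_match_ok


# Hypothetical monthly figures.
hits_by_code = {200: 9120, 304: 2210, 404: 170}
consistent = report_is_consistent(11500, 9120, hits_by_code)
print(consistent)
```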

What are the benefits of the new analysis?
The benefits of the new analysis are in the detail of the statistics. There are now reports on the following:

  • Daily Statistics - in both tabular and graphical format
  • Hourly Statistics - in both tabular and graphical format
  • URLs of the top 100 requested pages on your site
  • Entry Pages - the page at which users entered your site
  • Exit Pages - the page at which users left your site
  • Sites - the addresses from which users visited your site, with School addresses grouped under '158.143'
  • Visits - the number of distinct visitors to your site
  • Referrers - the pages that visitors were viewing immediately before entering your site
  • Search - the terms that users entered to search for information about your site, which will help you decide whether a particular part of your site needs promoting
  • User Agents - the types of browsers that visitors used to view your pages
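
As an illustration of the grouping mentioned under 'Sites', the following Python sketch collapses School addresses into a single '158.143' entry. It assumes addresses appear as dotted numeric strings (hostnames are not handled), and the sample addresses are hypothetical.

```python
from collections import Counter

SCHOOL_PREFIX = "158.143"  # the prefix named in the 'Sites' report description


def group_sites(addresses):
    """Tally requests per site, collapsing School addresses into one entry."""
    tally = Counter()
    for address in addresses:
        # Match whole address components so that "158.14.3.1" is not grouped.
        if address.startswith(SCHOOL_PREFIX + "."):
            tally[SCHOOL_PREFIX] += 1
        else:
            tally[address] += 1
    return tally


# Hypothetical originating addresses.
sample = ["158.143.10.4", "158.143.22.9", "192.0.2.10", "158.14.3.1"]
sites = group_sites(sample)
print(sites[SCHOOL_PREFIX], sites["192.0.2.10"])
```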
