Web Performance, Branding, and Social Media
In: Browsers| Technology| The Web
3 Sep 2008
Recently there was an outage at a hit-tracking vendor I was using to track the hits on my externally hosted blog, leaving me with a gap in my visitor data several hours long. While this was an inconvenience for me, I realized that this could be mission critical failure to an online business reliant on this data.
To resolve this, I used the PHP HTTP environment variables and the built-in function for converting IP addresses to IP numbers to create my own hit-tracker. It is a rudimentary tracking tool, but it provides me with the basic information I need to track visitors.
To begin, I wrote a simple PHP script to insert tracking data into a MySQL database. How do you do that? You use the gd features in PHP to draw an image, and insert the data into the database.
header ("Content-type: image/png");
include("dbconnect_logger.php");
$logtime = date("YmdHis");
$ipquery = sprintf("%u",ip2long($_SERVER['REMOTE_ADDR']));
$query2 = "INSERT into logger.blog_log values \
($logtime,$ipquery,'$HTTP_USER_AGENT','$HTTP_REFERER')";
mysql_query($query2) or die("Log Insert Failed");
mysql_close($link);
$im = @ImageCreate (1, 1)
or die ("Cannot Initialize new GD image stream");
$background_color = ImageColorAllocate ($im, 224, 234, 234);
$text_color = ImageColorAllocate ($im, 233, 14, 91);
// imageline ($im,$x1,$y1,$x2,$y2,$text_color);
imageline ($im,0,0,1,2,$text_color);
imageline ($im,1,0,0,2,$text_color);
ImagePng ($im);
?>
DROP TABLE IF EXISTS `blog_log`; CREATE TABLE `blog_log` ( `date` timestamp NOT NULL default '0000-00-00 00:00:00', `ip_num` double NOT NULL default '0', `uagent` varchar(200) default NULL, `visited_page` varchar(200) NOT NULL default '', UNIQUE KEY `date` (`date`,`ip_num`,`visited_page`) ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
Data should begin flowing to your database immediately. This sample snippet of code will allow you to pull data for a selected day and list each individual hit.
$query1 = "SELECT
bl.ip_num,
DATE_FORMAT(bl.date,'%d/%b/%Y %H:%i:%s') AS NEW_DATE,
bl.uagent,
bl.visited_page
FROM blog_log bl
WHERE
DATE_FORMAT(bl.date,'%Y%m%d') ='$YMD'
and uagent not REGEXP '(.*bot.*|.*crawl.*|.*spider.*|^-$|.*slurp.*|.*walker.*|.*lwp.*|.*teoma.*|.*aggregator.*|.*reader.*|.*libwww.*)'
ORDER BY bl.date ASC";
print "<table border=\"1\">\n";
print "<tr><td>IP</td><td>DATE</td><td>USER-AGENT</td><td>PAGE VIEWED</td></tr>";
while ($row = mysql_fetch_array($result1)) {
$visitor = long2ip($row[ip_num]);
print "<tr><td>$visitor</td><td nowrap>$row[NEW_DATE]</td><td nowrap>$row[uagent]</td><td>";
if ($row[visited_page] == ""){
print " --- </td></tr>\n";
} else {
print "<a href=\"$row[visited_page]\" target=\_blank\">$row[visited_page]</a></td></tr>\n";
}
}
mysql_close($link);
For my own purposes, this is an extension of the Geographic IP database I created a number of years ago. This application extracts IP address information from the five IP registrars, and inserts it into a database. Using the log data collected by the tracking bug above and the lookup capabilities of the Geographic IP database, I can quickly track which countries and ISP drive the most visitors to my site, and use this for general interest purposes, as well as the ability to isolate any malicious visitors to the site.
Stephen Pierzchala is one of a cadre of crazy Canucks living in the United States. A 10-year veteran of the Web performance field, Stephen also writes on topics as diverse as branding and reputation, bipolar, and Web technologies.