Posts Tagged ‘visitors’

Hit Tracking with PHP and MySQL

September 3rd, 2008 by smp | Comments | Filed in Technology

Recently there was an outage at a hit-tracking vendor I was using to track the hits on my externally hosted blog, leaving me with a gap in my visitor data several hours long. While this was only an inconvenience for me, I realized that it could be a mission-critical failure for an online business reliant on this data.

To resolve this, I used the PHP HTTP environment variables and the built-in function for converting IP addresses to IP numbers to create my own hit-tracker. It is a rudimentary tracking tool, but it provides me with the basic information I need to track visitors.
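As a quick illustration of the IP-number conversion, here is a minimal sketch built on PHP's `ip2long()` and `long2ip()`. The helper names `ip_to_number()` and `number_to_ip()` are made up for this example, and the reverse conversion assumes a 64-bit build of PHP.

```php
<?php
// Convert a dotted-quad IPv4 address to an unsigned IP number (as a string).
// ip2long() may return a negative integer on 32-bit builds of PHP, so the
// value is formatted with %u to keep it unsigned.
function ip_to_number($ip)
{
    $long = ip2long($ip);
    if ($long === false) {
        return false; // not a valid IPv4 address
    }
    return sprintf("%u", $long);
}

// Convert an IP number back to dotted-quad form (assumes 64-bit integers).
function number_to_ip($number)
{
    return long2ip((int) $number);
}

echo ip_to_number("192.0.2.1"), "\n";   // 3221225985
echo number_to_ip("3221225985"), "\n";  // 192.0.2.1
```

Storing the number rather than the dotted string makes range comparisons, such as matching an address against a block of addresses, a simple numeric test.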

To begin, I wrote a simple PHP script to insert tracking data into a MySQL database. How do you do that? The script is requested as an image: it logs the request in the database, then uses the GD features in PHP to draw a 1x1 PNG and return it.


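<?php
// Tracking "bug": log the request, then return a 1x1 PNG.
// Note: the mysql_* functions used here were removed in PHP 7;
// on newer versions use mysqli or PDO instead.
header("Content-type: image/png");

include("dbconnect_logger.php");   // connects to MySQL (expected to define $link)
$logtime = date("YmdHis");
// ip2long() can be negative on 32-bit PHP; %u forces an unsigned value.
$ipquery = sprintf("%u", ip2long($_SERVER['REMOTE_ADDR']));

// Read the headers from $_SERVER rather than relying on register_globals,
// and escape them, since both values are supplied by the client.
$uagent  = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
$referer = isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '';
$uagent  = mysql_real_escape_string($uagent);
$referer = mysql_real_escape_string($referer);

// The referrer of the image request is the page the visitor is viewing.
$query2 = "INSERT INTO logger.blog_log VALUES
           ($logtime, $ipquery, '$uagent', '$referer')";
mysql_query($query2) or die("Log Insert Failed");

mysql_close($link);

$im = @ImageCreate(1, 1)
    or die("Cannot Initialize new GD image stream");
$background_color = ImageColorAllocate($im, 224, 234, 234);
$text_color = ImageColorAllocate($im, 233, 14, 91);

// imageline ($im,$x1,$y1,$x2,$y2,$text_color);
imageline($im, 0, 0, 1, 2, $text_color);
imageline($im, 1, 0, 0, 2, $text_color);

ImagePng($im);
?>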

Next, I created the database table.


DROP TABLE IF EXISTS `blog_log`;
CREATE TABLE `blog_log` (
  `date` timestamp NOT NULL default '0000-00-00 00:00:00',
  `ip_num` double NOT NULL default '0',
  `uagent` varchar(200) default NULL,
  `visited_page` varchar(200) NOT NULL default '',
  UNIQUE KEY `date` (`date`,`ip_num`,`visited_page`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

It’s done. I can now log any request I want using this embedded tracker.

Data should begin flowing to your database immediately. This sample snippet of code will allow you to pull data for a selected day and list each individual hit.


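// $YMD holds the selected day as YYYYMMDD; validate it before use,
// since it is interpolated directly into the query.
$query1 = "SELECT
                bl.ip_num,
                DATE_FORMAT(bl.date,'%d/%b/%Y %H:%i:%s') AS NEW_DATE,
                bl.uagent,
                bl.visited_page
        FROM blog_log bl
        WHERE
                DATE_FORMAT(bl.date,'%Y%m%d') ='$YMD'
                AND bl.uagent NOT REGEXP '(.*bot.*|.*crawl.*|.*spider.*|^-$|.*slurp.*|.*walker.*|.*lwp.*|.*teoma.*|.*aggregator.*|.*reader.*|.*libwww.*)'
        ORDER BY bl.date ASC";

// Run the query; the original listing never assigned $result1.
$result1 = mysql_query($query1) or die("Log Query Failed");

print "<table border=\"1\">\n";
print "<tr><td>IP</td><td>DATE</td><td>USER-AGENT</td><td>PAGE VIEWED</td></tr>\n";
while ($row = mysql_fetch_array($result1)) {
        $visitor = long2ip($row['ip_num']);
        // Escape logged values before printing: both come from the client.
        $uagent = htmlspecialchars($row['uagent']);
        $page   = htmlspecialchars($row['visited_page']);
        print "<tr><td>$visitor</td><td nowrap>$row[NEW_DATE]</td><td nowrap>$uagent</td><td>";

        if ($row['visited_page'] == "") {
            print " --- </td></tr>\n";
        } else {
            print "<a href=\"$page\" target=\"_blank\">$page</a></td></tr>\n";
        }
}
print "</table>\n";

mysql_close($link);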
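The bot filter in the query can also be applied in PHP, for example to skip the insert for obvious crawlers before anything reaches the database. This is only a sketch: `is_probable_bot()` is a hypothetical helper, and its keyword list simply mirrors the REGEXP above (which MySQL matches case-insensitively for non-binary strings).

```php
<?php
// Returns true when a user-agent string looks like a crawler or feed reader.
// A lone "-" (no user agent supplied) is also treated as automated traffic.
function is_probable_bot($uagent)
{
    $pattern = '/bot|crawl|spider|^-$|slurp|walker|lwp|teoma|aggregator|reader|libwww/i';
    return preg_match($pattern, $uagent) === 1;
}

var_dump(is_probable_bot("Googlebot/2.1 (+http://www.google.com/bot.html)")); // bool(true)
var_dump(is_probable_bot("Mozilla/5.0 (Windows NT 10.0) Firefox/115.0"));     // bool(false)
```

A keyword list like this is a blunt instrument: it will also drop any legitimate browser whose user agent happens to contain one of the words.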

And that’s it. A few lines of code and you’re done. With a little tweaking, you can integrate the IP number data with a number of Geographic IP databases available for purchase to track by country and ISP, and using graphics applications for PHP, you can add graphs.

For my own purposes, this is an extension of the Geographic IP database I created a number of years ago. That application extracts IP address information from the five IP registrars and inserts it into a database. Using the log data collected by the tracking bug above and the lookup capabilities of the Geographic IP database, I can quickly see which countries and ISPs drive the most visitors to my site. I use this for general interest purposes, as well as to isolate any malicious visitors to the site.
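To give a feel for how an IP number is matched against a geographic database, here is a minimal sketch of a range lookup. It is not the actual Geographic IP application: the `$ranges` table and `lookup_range()` function are hypothetical, the ranges are the reserved documentation networks, and the labels are placeholders rather than real registry data. It assumes a 64-bit build of PHP, where `ip2long()` never returns a negative number.

```php
<?php
// Hypothetical lookup table: array(first IP number, last IP number, label).
// Rows must be sorted by first number and must not overlap.
$ranges = array(
    array(ip2long("192.0.2.0"),    ip2long("192.0.2.255"),    "EXAMPLE-A"),
    array(ip2long("198.51.100.0"), ip2long("198.51.100.255"), "EXAMPLE-B"),
    array(ip2long("203.0.113.0"),  ip2long("203.0.113.255"),  "EXAMPLE-C"),
);

// Binary search for the range containing $ipnum; returns its label or null.
function lookup_range($ranges, $ipnum)
{
    $lo = 0;
    $hi = count($ranges) - 1;
    while ($lo <= $hi) {
        $mid = (int) (($lo + $hi) / 2);
        list($first, $last, $label) = $ranges[$mid];
        if ($ipnum < $first) {
            $hi = $mid - 1;
        } elseif ($ipnum > $last) {
            $lo = $mid + 1;
        } else {
            return $label;
        }
    }
    return null;
}

echo lookup_range($ranges, ip2long("198.51.100.42")), "\n"; // EXAMPLE-B
```

In MySQL the same test is a BETWEEN comparison of the logged IP number against the start and end columns of whatever range table you load.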


GrabPERF Site Statistics | Web Analytics Index - Mar 08 2006

March 8th, 2006 by smp | Comments | Filed in GrabPERF, Web Performance

The Site Statistics | Web Analytics Index measurements have been running now for about 2.5 days, and I wanted to make some general comments on what I am seeing.

The methodology for testing is straightforward. I chose sites | services that allow you to create a free (if limited) account to track your Web visitors, and to make these statistics available for anyone to look at. Using this, a measurement was established against the landing page that visitors would see if they chose to look at these publicly available statistics.

I am using this blog as the placeholder for the tracking “bugs” used in this index (see the right-hand column).

Site Stat Services Index - Mar 08 2006

From the graph above, it is clear that ShinyStat is the performance leader in this space. They have the smallest overall page size as well as the fastest and most reliable performance.

It is important to note that services such as WebTrends, Omniture, WebSideStory and Coremetrics are not included, as they are beyond the reach of most bloggers and do not provide a public side to their data. Google Analytics is also not included, as it does not provide public access to the collected data.

The collected data is available in GrabPERF both as the Site Statistics Index and as individual measurements.


Google Analytics — Apparently, I have no visitors

November 15th, 2005 by smp | Comments | Filed in RANTING, Technology, Web Performance

So, you think that Google Analytics is hot and cool?

So, have you had any visitors yet?

I haven’t.

But hey, what about those silly Web server logs that say that I have had many, many visitors?

We are the Google; your data cannot be tracked; therefore, it is irrelevant.

So far, I am not impressed.

And I am not alone. [here]



Vlogging Mini-Conference — Monday, October 3, 2005

September 26th, 2005 by smp | Comments | Filed in smp

For all you local vloggers, there is a vlogging mini-conference coming up in Worcester, MA.

When: Monday, October 3rd, 6 p.m.
Who: You and your interested friends
What: “Meet the Vloggers” Worcester
Why: To learn more about videoblogging and build community
Directions to WPI: http://www.wpi.edu/About/Visitors/directions.html
Campus map to find the Campus Center:
http://www.wpi.edu/About/Visitors/Images/walkingmap.pdf
(The Campus Center is #6, behind Alumni Gym (#3) and next to Olin Hall
(#23). The star is where they are building a new Admissions building and the
circle in the center of campus is a fountain. These are just mentioned as
landmarks.)

Jan McLaughlin, a videoblogger (vlogger) from New York, will be visiting and
we will be presenting about videoblogging. This will include everything from
what equipment to use to how to drive traffic to your site, how to build
community through videoblogging and how to subscribe to news feeds using RSS
so you can watch other peoples’ videos. This will be a fun, interactive
presentation. I hope you can make it.

Thanks to Carl Weaver for the info.


Giving up logging my blog traffic

June 24th, 2005 by smp | Comments | Filed in smp

I should explain that. I am no longer inserting the blog traffic into my Web server log database. The amount of crap was getting ridiculous, and taking up too much space.

By doing this, I reduced about 50 days of logs from 450,000 rows to 81,000 rows, a better representation of the traffic that my other domains get.

I am continuing to monitor and capture traffic using a combination of AdSense counters, the built-in logging capabilities of b2evolution, and StatCounter. These sources will show the traffic that I am interested in, without losing the true visitors in among the idiots.



Search Referral Statistics — June 23, 2005

June 23rd, 2005 by smp | Comments | Filed in smp

Ok, I don’t have the most amazing traffic in the world, but here are the Search Engine results for the past 1100 visitors.

SEO Results -- June 23, 2005

Technorati is still out front!

Graph courtesy of StatCounter.



More Stupid Trackback and Comment Spammers

June 2nd, 2005 by smp | Comments | Filed in RANTING

Ok, I started to notice a dramatic and sudden increase in traffic to my site yesterday. Turns out that all of these folks were headed to the same place on this host:

/index.php?disp=stats

So, when I checked this out, they were all indicating referrals from the usual illicit medication and adult sites.

<sigh> More trackback and comment spam.

Now, I know that this page exists in b2evolution, and it is a way for visitors to view my traffic stats. However, a link to this page does not exist in my main display page. The only link to my stats is to my StatCounter stats.

Enter mod_rewrite.

A simple rule disposes of these morons.

RewriteCond %{QUERY_STRING} disp=stats
RewriteRule ^.*$ http://www.pierzchala.com:9080/ [R,L,NS]

Please do not attempt to load the redirected URL; you will get nothing. NADA! That port is set to be dropped by iptables, effectively hanging the client end as it attempts to make a TCP connection.

/sbin/iptables -A INPUT -p tcp -i eth0 -s 0/0 --dport 9080 -j DROP

I use iptables to handle a lot of these morons. As the only people who view this page are infected with some virus or spyware, I feel no shame in tying up their systems.


Interesting StatCounter “Feature”

May 4th, 2005 by smp | Comments | Filed in smp

I use StatCounter to track the visits to a few of my Web sites. Lately I have discovered a number of visitors that are logged as coming from Private IP Space addresses (10.0.0.0/8, etc.).

I know what’s happening here. These folks are behind proxy servers. When they request the StatCounter object, the request actually comes from the proxy server, and what ends up logged is their Private IP address, not the one on the external interface of their proxy server.

I also examine my Apache logs and can easily correlate these visitors to their external IP addresses.

A weird “feature”, but kind of cool, except if you are the security admin for these networks.
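If you want to flag these hits in your own logs, a check against the RFC 1918 private ranges is easy to write. The sketch below is one way to do it; `is_private_ip()` is a hypothetical helper and has nothing to do with how StatCounter works internally.

```php
<?php
// Returns true when $ip falls in one of the RFC 1918 private IPv4 ranges.
function is_private_ip($ip)
{
    $long = ip2long($ip);
    if ($long === false) {
        return false; // not a valid IPv4 address
    }
    // network address => prefix length
    $private = array(
        "10.0.0.0"    => 8,
        "172.16.0.0"  => 12,
        "192.168.0.0" => 16,
    );
    foreach ($private as $network => $bits) {
        // Keep only the network bits of both addresses and compare them.
        $mask = -1 << (32 - $bits);
        if (($long & $mask) === (ip2long($network) & $mask)) {
            return true;
        }
    }
    return false;
}

var_dump(is_private_ip("10.1.2.3"));   // bool(true)
var_dump(is_private_ip("172.32.0.1")); // bool(false)
```

Comparing masked values rather than numeric ranges keeps the check correct on both 32-bit and 64-bit builds of PHP.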


Benchmarking Web Sites — A Re-Examination

April 6th, 2005 by smp | Comments | Filed in smp

Back in November, I mentioned that I was working on the idea of new ways to benchmark the success of online businesses in today’s more mature operational environment. I am still working on the base ideas, but a colleague of mine has helped me coalesce some ideas, and they are now forming the foundation of the concepts my company will begin using internally to more deeply understand the various Web performance benchmarks we monitor.

For those who use the existing Web performance benchmarks to determine the success and failure of your online business, you understand how thin the veneer is on these benchmarks. They do not provide true insight into the operational success of an online business, and they are more likely to sow the seeds of distrust between IT and Business operations in the long-term by creating an artificial standard which becomes the goal.

If an online business truly wants to achieve and maintain exemplary Web performance numbers, it has to start with a strong foundation, and build on it. Why? The team I work with spends a lot of time trying to understand and reverse engineer the broken processes, designs, and architectures that were laid out in order to get big fast. After 3-4 years of technical starvation and underfunding, these online businesses are beginning to show strain; the temporary fix has become the permanently broken process.

The rush into the Web analytics space in the last few weeks is a key sign that companies now see value in, and want to exploit, the vast quantities of data that they collect on their traffic daily. Web analytics is an astoundingly complex field, but most people boil it down to a single concept: How many Unique Page Views did I get?

Unique Page Views is an outdated Web server analytics metric. It does not tell me anything about the business, other than it has a lot of traffic. Back in the “eyeballs are everything” period, this would have been a big deal. Now, I say so what, and start asking:

  • From where?
  • Dialup? Broadband?
  • How many were able to successfully complete their transactions?
  • What paths are most visited?
  • Average spend by connection type?
  • Average spend by hour?
  • etc.

Like Unique Page Views, the average Web performance and availability of a Web page or transaction does not accurately represent the overall health of any online business. Within the large populations of data that exist at the Web measurement firms, there is a wealth of data that could be used to clearly expose more important benchmarking statistics.

If you are from an online business, you already understand that the average performance over an artificially-defined period of time is a very inaccurate way to measure the success of the online business. However, it is the accepted standard in the field. Underlying those aggregated values, there are clearly-defined statistical methods which can be used to extract even more meaningful information from the mass of measurement data.
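A generic illustration of the problem with averages, using made-up numbers rather than any real measurement data: ten hypothetical response times, one of them very slow. The `percentile()` helper is a plain nearest-rank percentile written for this example, not a method taken from any measurement product.

```php
<?php
// Nearest-rank percentile: the smallest sample such that at least $p percent
// of the samples are less than or equal to it.
function percentile($samples, $p)
{
    sort($samples); // sorts a local copy; the caller's array is untouched
    $rank = (int) ceil($p * count($samples) / 100);
    return $samples[max($rank, 1) - 1];
}

// Hypothetical response times in milliseconds: nine normal loads, one very slow.
$times = array(800, 900, 1000, 1100, 1200, 1000, 900, 1100, 1000, 12000);

$mean = array_sum($times) / count($times);

echo "mean:   ", $mean, " ms\n";                  // 2100 ms
echo "median: ", percentile($times, 50), " ms\n"; // 1000 ms
echo "p90:    ", percentile($times, 90), " ms\n"; // 1200 ms
echo "p95:    ", percentile($times, 95), " ms\n"; // 12000 ms
```

The mean of 2100 ms is slower than nine of the ten samples, so it describes almost nobody's experience, while the median and the upper percentiles show both the typical load and the outlier.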

I would discuss more of the ideas and concepts that we are working on, but I know that I do get visitors from our competitors, so I will have to keep our ideas under wraps for right now.

But I want to hear yours. What does your online business use as a benchmark for success? Standard average performance and availability? Or something more complex that examines the performance data as a complete population, as opposed to an aggregated summary value? Does your firm tie business goals and objectives into the performance benchmarks so that people across the company can understand how the business is succeeding, and how delivering a good, bad, or downright awful online performance experience can affect the bottom line?

This is an exciting time to have access to large amounts of data on the health of the Internet.


Two Career Families and the Death of Unstructured Time

March 24th, 2005 by smp | Comments | Filed in Life

Travis Smith points to this AP article on the death of the family in the two-income world. [here]

Samantha and I have discussions about this topic on a regular basis. We are a one-income family by choice and by restriction (Visa restrictions prevent Samantha from working in the US). However, the concept of scheduling our children’s lives to the point where we are using other people to raise and rear them is broken and failed in my mind.

I am not anti-activity. I believe that a child who is to be properly socialized in today’s world needs to interact with and engage in some structured activities with other children/people.

However, as much as they drive me/us nuts, Samantha and I are the main caregivers to our children. I work 7AM - 4PM every day to ensure that I am home for supper, bathtime, stories and tuck-in.

On weekends, the boys and I try to do at least one activity together. As a family, we do one big outing every weekend. The boys are free to do what they want, when they want, with parental consent. They play. They build. They draw. Cameron is better with a hammer than I am, and I caught him using the cordless drill one day (he’s six), complete with eye-protection, and I did not object because he knows how to use it!

My boys are extremely imaginative. Kinnear tells incredibly inventive stories, and loves to give visitors a tour of the house, describing each room in detail (he’s 3.5).

So, will my kids be the MBAs of the future? Will they lead corporations and make decisions that change the course of history?

Probably not.

But my children will be able to think, adapt, and dream. And to me, the ability to do these things beats the structured, scheduled, controlled thing called “existence” that is described in the article. It is not a life; it is an existence.

Give your kids a life.
