Hiding tracker from spiders = cloaking?

     
1:15 am on Feb 22, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 5, 2002
posts: 1845
votes: 3


I use Piwik to track my website visitors. For performance, I'd rather not display the tracking code to spiders because it executes some PHP. Would Google detect that as cloaking?
4:21 am on Feb 22, 2014 (gmt 0)

Senior Member from US 

lucy24 (WebmasterWorld Senior Member, Top Contributor of All Time)

joined:Apr 9, 2011
posts:12702
votes: 244


I hope not, because my own htaccess contains the lines

RewriteCond %{REMOTE_ADDR} ^(65\.5[2-5]|131\.253\.[2-4]\d|157\.(5[4-9]|60)|199\.30\.[123]\d|207\.46|209\.8[45])\. [OR]
RewriteCond %{HTTP_USER_AGENT} ([a-z]Bot|facebook|pinterest|Seznam|Preview) [NC,OR]
RewriteCond %{HTTP_REFERER} cache
RewriteRule ^piwik/ - [F]

to buttress the general roboting-out of the piwik directory. (I don't keep the tracking code on the page itself. It's huge! There's a separate copy in each site's /piwik/ directory, and the database itself lives on my personal site.)

I also have a targeted rewrite of requests for piwik's tracking dot-- the one that lives inside <noscript> tags. In rare cases it happens to be the only image on a page, and then places like facebook go berserk asking for it. So instead I rewrite to a site-specific small logo-- the kind you'd use as a link.
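
A rewrite of the tracking dot along those lines might look roughly like the following. This is only a sketch, not the actual rule from the post: the logo path /images/small-logo.png and the list of user agents are assumptions, and it presumes the dot is requested as piwik/piwik.php, as in the standard <noscript> snippet. It would have to sit above the [F] rule, since otherwise the same visitors get a 403 before the rewrite is ever reached.

```apache
# Sketch only: the logo path and the user-agent list are assumptions.
# Serve a small site logo instead of the <noscript> tracking dot.
# The trailing "?" drops the original query string (idsite=... etc.).
RewriteCond %{HTTP_USER_AGENT} (facebook|pinterest) [NC]
RewriteRule ^piwik/piwik\.php$ /images/small-logo.png? [L]
```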
2:42 pm on Feb 22, 2014 (gmt 0)

Senior Member

WebmasterWorld Senior Member 10+ Year Member

joined:Dec 5, 2002
posts: 1845
votes: 3


Thanks, Lucy. Do you know if Piwik does any special reporting for spiders, so that I might want to leave the tracker in there for them? I remember something about that.
9:57 pm on Feb 22, 2014 (gmt 0)

Senior Member from US 

lucy24 (WebmasterWorld Senior Member, Top Contributor of All Time)

joined:Apr 9, 2011
posts:12702
votes: 244


What I remember is that piwik-- or, for that matter, GA-- is based on the premise that robots don't execute javascript, so only human visits will be tracked. In the case of major search engines this is no longer true. And previews, of course, execute everything. That's why I blocked the plainclothes bingbot.

Have you tried the piwik forums? Someone generally knows what's going on. There's also an area in your piwik prefs where you can tell it to ignore specified IPs. I don't know if you can set it for ranges; otherwise it's just "ignore me" and not useful for much else.

:: idly wondering if many people have interesting personal scripts masquerading under the name "piwik.js" since there would otherwise be no reason to even ask for the file ::