Can Mod rewrite screw up Google Analytics?

         

thesheep

10:49 am on Aug 3, 2006 (gmt 0)

10+ Year Member



I'm wondering if using Rewrite rules in an .htaccess file can interfere with the way Google analytics tries to track page requests.

I've used Google Analytics for some months now, without problem. Just over a week ago I put up a new version of a site, and switched to using 'friendly urls' via Apache's Mod_rewrite facility. Ever since this time, I appear to have completely stopped receiving Analytics data. The 'status' area on the Analytics homepage indicates that the site is receiving data, and I know from other web stats software that the site has had a lot of traffic. But all Google Analytics data is zero for the last week.

The Mod_rewrite rules look like this:

RewriteEngine On
RewriteBase /

#Make sure we always see the www:
RewriteCond %{HTTP_HOST} ^domain\.com [NC]
#Apparently SE spiders don't like redirects (?):
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteRule ^(.*) [domain.com...] [R=301,L]

RewriteRule ^([a-z]+)$ $1/ [R]
RewriteRule ^([a-z]+)/$ $1.php

#First deal with extra trailing slashes:
RewriteRule ^portfolio/([a-z]+)$ portfolio/$1/ [R]

#Now redirect portfolio pages
RewriteRule ^portfolio/([a-z]+)/$ portfolio.php?site=$1

jdMorgan

2:13 pm on Aug 3, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You're using default 302-Found (also termed 302-Moved Temporarily) redirects, and that may be contributing to the problem. You should also use the [L] flag for the sake of efficiency, unless you *know* you don't want it.

Be aware that an external redirect and an internal rewrite are two completely different things. An external redirect immediately sends a response to the client (browser, SE robot), telling it that the requested resource has moved, and giving it the new URL for that resource. The client then (usually) re-requests that resource using the new URL.

An internal rewrite simply changes the server resource used to respond to a requested URL. The client is unaware that this is happening. It helps very much to differentiate these two functions when documenting the code, and to observe that the syntax of these two mod_rewrite functions differs.
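
To make the syntax difference concrete, here is a minimal sketch for an .htaccess file. The "about" page name is hypothetical and used only for illustration, and www.example.com stands in for your canonical hostname:

```apache
RewriteEngine On
RewriteBase /

# External redirect: the substitution is a full URL and the [R] flag is set.
# The server answers with a 301 status and a Location header; the client
# then re-requests /about/ and the URL in the address bar changes.
RewriteRule ^about$ http://www.example.com/about/ [R=301,L]

# Internal rewrite: the substitution is a local path and there is no [R] flag.
# The server serves about.php in response to /about/; the client never
# sees the script name.
RewriteRule ^about/$ about.php [L]
```

The first rule costs the client a second request; the second is invisible outside the server.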

Fixing up:


RewriteEngine on
RewriteBase /
#
# Redirect non-canonical domain requests
RewriteCond %{HTTP_HOST} ^domain\.com [NC]
# Apparently SE spiders don't like redirects (?):
RewriteCond %{REQUEST_URI} !^/robots\.txt$
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
#
# Redirect "[a-z]+" page requests to append missing trailing slashes for URL consistency
RewriteRule ^([a-z]+)$ http://www.example.com/$1/ [R=301,L]
# Internally rewrite to scripts
RewriteRule ^([a-z]+)/$ $1.php [L]
#
# Redirect to append missing trailing slashes on portfolio page requests for URL consistency
RewriteRule ^portfolio/([a-z]+)$ http://www.example.com/portfolio/$1/ [R=301,L]
#
# Now internally rewrite portfolio page requests to portfolio script
RewriteRule ^portfolio/([a-z]+)/$ portfolio.php?site=$1 [L]

I also assume you've switched all the URLs in your Adwords account to indicate the new 'friendly URLs'? Otherwise, Google will never see a hit to the old URLs, because all requests are now to the new friendly ones.

Jim

thesheep

11:53 pm on Aug 3, 2006 (gmt 0)

10+ Year Member



Thanks for that, I think I'm gradually beginning to understand more. Seems that the mod_rewrite tutorial I read initially was too simplistic.

Maybe it's because it's late, but now I'm looking at the rules, I can't understand why this bit actually works:

RewriteRule (.*) [domain.com...] [R=301,L]

Presumably the .* will try to match as much as it can, including the domain name itself and the TLD. So then the first variable ($1) should consist of this whole string? According to this pattern matching rule, shouldn't

[domain.com...]

become

[domain.com...]

or has it got something to do with the 'rewrite base'?

Anyway thanks for your answer on the 302 vs 301 thing, good to know.

Adwords: don't have any set up, so I don't think that's the problem.

jdMorgan

12:26 am on Aug 4, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



RewriteRule matches against only the local URL-path; the protocol, hostname/domain, and query string are not available for matching in RewriteRule, but can be tested by, and back-referenced from, RewriteConds using the appropriate HTTP server variables.

In plain English, if you request "http://domain.com/page.html", then the RewriteRule pattern is applied only to "/page.html" if the code is located in httpd.conf, or to "page.html" if the code is in .htaccess.
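
As a rough sketch of how the two kinds of back-reference fit together in .htaccess (the request URL in the comments is only an example, and capturing the hostname this way is just one possible approach, not the rule set posted earlier):

```apache
RewriteEngine On

# Example request: http://domain.com/page.html?x=1
# In .htaccess the RewriteRule pattern below is tested against "page.html"
# only: no protocol, no hostname, no leading slash, no query string.

# The hostname is tested through a server variable in RewriteCond.
# The parenthesised group is captured as %1 (a RewriteCond back-reference).
RewriteCond %{HTTP_HOST} ^(domain\.com)$ [NC]
# $1 is the RewriteRule back-reference ("page.html" here), so the
# redirect target becomes http://www.domain.com/page.html
RewriteRule ^(.*)$ http://www.%1/$1 [R=301,L]
```

Because (.*) never sees the hostname, $1 cannot contain it; the domain appears in the target only because the substitution puts it there.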

For more information, see the documents cited in our forum charter [webmasterworld.com] and the tutorials in the Apache forum section of the WebmasterWorld library [webmasterworld.com].

Jim

thesheep

8:49 am on Aug 4, 2006 (gmt 0)

10+ Year Member



OK thanks. And thanks for the resources.

thesheep

9:39 am on Aug 7, 2006 (gmt 0)

10+ Year Member



OK I've put in place the new .htaccess file, with the correct 301 redirects, but still Google Analytics doesn't appear to be getting any data. The only cause I can think of is the .htaccess file - anyone have any similar experience?

jdMorgan

1:51 pm on Aug 7, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Suggestions:

1) Temporarily rename the .htaccess file so it won't run. Then try G Analytics again.
This will at least prove or disprove your theory, and focus your questions.

2) Examine your server access log and error log files and look for accesses from G Analytics.
What user-agent do they use? Do they provide a referrer? What IP address range do the requests come from?
Now, do you have any access-control code in .htaccess based on these parameters that might be blocking them?

If you answer those questions, then that may help you find an answer, and it would certainly help us suggest one.

Jim

thesheep

12:58 pm on Aug 10, 2006 (gmt 0)

10+ Year Member



OK I feel like a right idiot now.

I went to disable the .htaccess file like you suggested, checked my HTML source files again, and found I'd left off the JavaScript tracking code. Somehow it got deleted among the various version updates I was doing.

I think the reason I didn't double-check it before was that on the Google Analytics site it says 'receiving data' OK.

Anyway thanks for your help and patience. Analytics works fine now.