Forum Moderators: Robert Charlton & goodroi


Split Testing - keeping googlebot happy with multiple page versions


luke175

7:39 am on Aug 5, 2008 (gmt 0)

10+ Year Member



Every top web marketer will tell you that you need to test the heck out of a page to maximize revenue and conversions.

The problem is...what does Googlebot do when, every time it visits a site, there's a different page there due to split testing?

It would seem to me this would severely impact a site's ability to maintain decent rankings.

The only way I can think of to avoid it is to check for bots and, when one is detected, deliver a static version of the page. However, this is basically what a cloaking script does, and I imagine it would flag a site as suspicious as well.

Any thoughts on effectively split testing multiple landing pages and keeping Googlebot happy?

P.S. Yes, I know about google's website optimizer...believe it or not there are things it CAN'T do. Not to mention, I would prefer G not know everything about my sites...but that's just me.

tedster

8:45 am on Aug 5, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I'd say the key is not to serve a test version of the page to googlebot, or to any search engine bot. Only let them see a different page once you've established it as being better than the previous best version. So this means you don't do testing on the live url.

One approach is to allow the static, established page to take the lion's share of the traffic - but use a javascript replace() method to redirect some of that traffic, say 10% or so, to a test area on a different url. Googlebot will not follow the javascript replace(), and I use that particular method, rather than location.href or similar, to be sure that the test pages don't break the Back Button.

Further protection can come from a meta robots "noindex,nofollow,noarchive" and even a robots.txt disallow rule. And if by some chance a manual inspection comes along - well, you're doing something legitimate anyway, and only trying to keep the testing out of the index, which works in your favor.
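For reference, the two protections described here are just a standard robots meta tag in the head of each test page, plus a disallow rule for wherever the test pages live (the /test/ path below is only an example):

```html
<!-- in the <head> of every test-page variation -->
<meta name="robots" content="noindex,nofollow,noarchive">

<!-- and in robots.txt at the site root:
User-agent: *
Disallow: /test/
-->
```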

You could even get into user-agent detection and never insert the javascript replace() if you see googlebot, slurp, msnbot, askbot etc. While it may seem to the "purist" that this is cloaking, it is well within the spirit of the guidelines when done for testing purposes. That is, it is not deceptive in its intent, it is protective and beneficial to Google's resources.
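A rough client-side sketch of that idea (the pattern list and function names are illustrative, and this approximates the server-side check - though most search bots don't execute JavaScript anyway, so it's a belt-and-suspenders measure):

```javascript
// Skip the test redirect when the user-agent looks like a known crawler.
var BOT_PATTERN = /googlebot|slurp|msnbot|teoma|ask jeeves/i;

function looksLikeBot(userAgent) {
  return BOT_PATTERN.test(userAgent);
}

// Only fire the redirect for ordinary browsers.
if (typeof location !== "undefined" && typeof navigator !== "undefined" &&
    !looksLikeBot(navigator.userAgent)) {
  location.replace("testpage.htm"); // testpage.htm is a placeholder URL
}
```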

luke175

9:06 pm on Aug 5, 2008 (gmt 0)

10+ Year Member



Thanks for the response, I'm actually currently using user-agent detection.

Basically a PHP script that looks for a list of user-agents and for example if it sees "Google" then it would load "index_2.html" rather than "index.php".

I'm just wondering how the bots view something like this, as it is essentially cloaking, albeit for good reasons. The split-tested pages typically differ from the static page in only one element, like a sub-headline or image.

tedster

9:14 pm on Aug 5, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I would not suggest doing anything if a search bot user-agent is detected. Make all the switches to test new pages for some percentage of the regular visitors. Let the spiders see the default page that is also the same default page for your benchmark.

luke175

9:31 pm on Aug 5, 2008 (gmt 0)

10+ Year Member



Well, pretty much every split-testing solution I've tried, even ones costing thousands per month, operates similarly to what I've mentioned.

Essentially, the various versions of the pages are stored in a MySQL database and are pulled via the PHP script (index.php). So if nothing is done when an SE bot is detected, it will just see a blank page.

I learned this the hard way before setting up "index_2.html" as described above: my site was listed in Google as "index.php" and nothing else.

tedster

9:47 pm on Aug 5, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I hear you - and that's not something you want to do to any search engine spider. They should always get the established version of the page until a better version comes out of testing.

luke175

10:11 pm on Aug 5, 2008 (gmt 0)

10+ Year Member



Ok, here's a thought...

What if I left my standard static index page as is and added a JavaScript redirect at the top of it pointing to "splittest.php"? Additionally, I would add nofollow/noindex to each of the split test versions.

In this way I could direct all users to the split test for quickest results.

SE bots should in theory not follow the js and visitors with js disabled would also simply see the standard static page.

Am I missing something here or would this work?

tedster

10:22 pm on Aug 5, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Now you're going in the direction I've been trying to explain. I'd suggest using the javascript replace() method so you don't break the Back Button. I'd also suggest not testing all your javascript-enabled users, but only 25% or so. Split tests usually compare a new idea to an established benchmark page anyway, so why not let the benchmark continue to be served as the default?

luke175

10:54 pm on Aug 5, 2008 (gmt 0)

10+ Year Member



Thanks very much for all your help.

Is it possible you could point me to an example of the code I could use to do such a thing?

As far as testing a small portion rather than all users...I leave that up to my script to determine.

I'm testing a page that gets thousands of hits per day and the "action" I'm testing is to optin and download a whitepaper.

The script is set to test a predetermined amount of visitors until a statistically significant number is reached. If the new version is the winner then that continues to be served. If the control proves better than the new version then it will be served 100% of the time.

In my eyes, serving only 25% is not really needed as I'm not testing drastic items like the offer itself but rather layout changes like using a male or female photo, headline colors etc. While some of these may depress response, nothing would completely kill results...and if it did, the script would stop serving it rather quickly.

tedster

11:14 pm on Aug 5, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



The core idea is simplicity itself - the location.replace() line in the script box below is the core. The if statement can be as simple as looking at the timestamp of the request and only redirecting if the tenth-of-a-second digit is "6". That would randomly sample about 10% of the traffic.

Without any "if" statement, the script below will redirect all the javascript-enabled traffic to the test page - but that's the situation I think is unnecessary and potentially even risky. Keep serving the main traffic stream your usual page at the usual URL until your testing decides on the improved version.

<script type="text/javascript">
if ( /* place your sampling condition between the parentheses */ ) {
    location.replace("testpage.htm");
}
</script>

Sometimes I see people redirect with the line location.href="testpage.html" - and that puts the Back Button into a loop if they try to use it.
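Filling in the condition with the tenth-of-a-second idea described above gives something like this sketch (function name and the testpage.htm URL are placeholders):

```javascript
// Sample roughly 10% of visitors: redirect only when the
// tenth-of-a-second digit of the timestamp (ms since epoch) is 6.
function inTestSample(timestampMs) {
  return Math.floor(timestampMs / 100) % 10 === 6;
}

if (typeof location !== "undefined" && inTestSample(Date.now())) {
  location.replace("testpage.htm"); // replace() keeps the Back Button usable
}
```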

luke175

12:00 am on Aug 6, 2008 (gmt 0)

10+ Year Member



Tedster,

Thanks very much for all your help and the example code. I will try and implement this and see how it works out.

If I wanted to send a flat percentage of visitors to the split test (say 40%) what would I use instead of the above if statement?

Thanks again.

tedster

12:13 am on Aug 6, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



You could approach it in many ways - for example, use a counter to send over 4 out of every 10 visitors, or look at the last digit of the timestamp and send over traffic that matches any four of the possible ten final digits.
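The last-digit variant might look something like this (the digit set, names, and testpage.htm URL are just illustrative):

```javascript
// Send ~40% of visitors to the test page by matching the last digit
// of the millisecond timestamp against four of the ten possible digits.
var TEST_DIGITS = [0, 1, 2, 3]; // any four digits gives roughly 40%

function inFortyPercentSample(timestampMs) {
  return TEST_DIGITS.indexOf(timestampMs % 10) !== -1;
}

if (typeof location !== "undefined" && inFortyPercentSample(Date.now())) {
  location.replace("testpage.htm");
}
```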

But I don't want to get any further into a JavaScript discussion in this thread. We're getting too far outside the topical area of Google Search, so I'll leave the exact coding as a homework exercise.