
Forum Moderators: phranque

Are your test environments indexed in Google?

They shouldn't be: test.example.com, staging.example.com etc.

   
1:52 pm on Jul 24, 2013 (gmt 0)

WebmasterWorld Administrator buckworks is a WebmasterWorld Top Contributor of All Time 10+ Year Member



I just discovered that a site I'm helping with has thousands of pages from their staging server indexed in Google.

That creates massive amounts of duplicate content that can't be blamed on scrapers!

Check to make sure this isn't happening to you.

Pages can be kept out of the search engine indexes by adding the noindex directive to the <head> ... </head> section:

<meta name="robots" content="noindex">

Don't block spiders in robots.txt if you're doing this. Google et al. will only see the NOINDEX if they're able to spider the page.

Also, remember to remove the noindex directive when you publish the content to the main site.
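To check whether a page actually carries the directive (or still carries it after going live), something like the following could be used. It is only a sketch using Python's standard library; the names RobotsMetaParser and has_noindex are made up for illustration, and it inspects HTML you have already fetched rather than fetching anything itself.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so normalise them.
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_noindex(html: str) -> bool:
    """Return True if the HTML contains a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return "noindex" in parser.directives


if __name__ == "__main__":
    staging_page = '<head><meta name="robots" content="noindex"></head>'
    print(has_noindex(staging_page))
```

Run against staging pages it should report True, and against the published pages it should report False.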
4:33 pm on Jul 24, 2013 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



When possible, we try to block access to DEV sites with a white list of IPs. What Google can't read, Google can't index.
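A rough sketch of the white list idea, assuming the check happens in application code rather than in the web server or firewall config. The networks listed are documentation-only placeholder ranges, and status_for is a hypothetical helper name; substitute your own office or VPN addresses.

```python
import ipaddress

# Placeholder ranges (reserved for documentation); replace with real ones.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def status_for(client_ip: str) -> int:
    """Return 200 for white-listed clients and 403 Forbidden for everyone else."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Unparseable address: refuse rather than guess.
        return 403
    if any(addr in network for network in ALLOWED_NETWORKS):
        return 200
    return 403


if __name__ == "__main__":
    print(status_for("203.0.113.10"))  # inside the first placeholder range
    print(status_for("192.0.2.55"))    # not on the list
```

Any visitor not on the list, crawlers included, gets the 403 and never sees the content.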
4:40 pm on Jul 24, 2013 (gmt 0)

WebmasterWorld Administrator engine is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



This has been a big problem in the past: Google discovery of test and development sites.

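Googlebot has a voracious appetite, and simply using robots.txt will not work: it only stops crawling, and URLs that Google discovers some other way can still end up in the index.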

It has gotten to the stage where I've become paranoid: I avoid using Gmail, and I ask clients to make sure they don't have any toolbars installed on their machines.
5:30 pm on Jul 24, 2013 (gmt 0)

WebmasterWorld Administrator phranque is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



I prefer to use HTTP Basic Authentication to protect development/staging content from indexing.
This gives Googlebot a 401 status code.
As suggested by LifeinAsia, using a 403 Forbidden status code works just as well.
The meta robots noindex solution is effective for keeping the dev URLs out of the index, but it uses more server resources, since every page still has to be served to the crawler.
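For anyone who wants to see what the Basic Authentication check amounts to, here is a minimal sketch in Python. In practice this is usually configured in the web server rather than written by hand; the function name check_basic_auth, the realm, and the credentials below are all placeholders invented for the example.

```python
import base64
import hmac

# Placeholder credentials for illustration only; never hard-code real ones.
USERNAME = "staging"
PASSWORD = "change-me"


def check_basic_auth(authorization_header):
    """Return (status_code, response_headers) for an Authorization header value."""
    challenge = [("WWW-Authenticate", 'Basic realm="staging"')]
    # The scheme name is case-insensitive, so compare it in lower case.
    if not authorization_header or authorization_header[:6].lower() != "basic ":
        return 401, challenge
    try:
        raw = base64.b64decode(authorization_header[6:], validate=True)
        decoded = raw.decode("utf-8")
    except ValueError:
        # Covers malformed base64 and undecodable bytes.
        return 401, challenge
    user, sep, password = decoded.partition(":")
    ok = (
        sep == ":"
        and hmac.compare_digest(user.encode(), USERNAME.encode())
        and hmac.compare_digest(password.encode(), PASSWORD.encode())
    )
    return (200, []) if ok else (401, challenge)


if __name__ == "__main__":
    # A crawler sends no credentials, so it only ever receives the 401.
    print(check_basic_auth(None))
```

A request without valid credentials gets the 401 plus the WWW-Authenticate challenge and none of the page content.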
6:25 pm on Jul 24, 2013 (gmt 0)

WebmasterWorld Administrator buckworks is a WebmasterWorld Top Contributor of All Time 10+ Year Member



"to protect development/staging content from indexing"


This material is already indexed, alas.

There's no way to know what effect it's having on the main site's SEO, but it can't possibly be a good thing.
6:28 pm on Jul 24, 2013 (gmt 0)

WebmasterWorld Administrator lifeinasia is a WebmasterWorld Top Contributor of All Time 5+ Year Member Top Contributors Of The Month



If you haven't already done so, definitely submit a site removal request for the staging site through Webmaster Tools ASAP.
9:34 pm on Jul 24, 2013 (gmt 0)

WebmasterWorld Senior Member lucy24 is a WebmasterWorld Top Contributor of All Time Top Contributors Of The Month



"What Google can't read, Google can't index."

You would think so, wouldn't you? But g### thinks differently. Uncrawled pages can still be indexed, even if the index only reveals that the page exists, not what it says.
12:39 am on Jul 25, 2013 (gmt 0)

WebmasterWorld Senior Member g1smd is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



Move your test server to a different subdomain and set up HTTP Basic Authentication on it.

On the indexed test subdomain set up a site-wide page-by-page redirect to the main site. Leave the redirect in place for at least 3 months after the last request from anywhere is received.
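The page-by-page mapping could look roughly like this. It is a sketch under a few assumptions not stated above: that the old test host is test.example.com, that the main site lives at www.example.com over https, that paths are identical on both, and that a permanent (301) redirect is wanted. redirect_for is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = "test.example.com"   # the indexed test subdomain (assumed name)
MAIN_HOST = "www.example.com"   # assumed hostname of the main site


def redirect_for(url):
    """Return (301, target_url) for URLs on the old test host, else None."""
    parts = urlsplit(url)
    if (parts.hostname or "").lower() != OLD_HOST:
        return None
    # Keep the path and query string so each page maps to its own counterpart,
    # rather than sending everything to the home page.
    target = urlunsplit(("https", MAIN_HOST, parts.path or "/", parts.query, ""))
    return 301, target


if __name__ == "__main__":
    print(redirect_for("http://test.example.com/widgets/blue?size=2"))
```

Each old test URL is sent to the matching page on the main site, which is the point of redirecting page by page.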

The "noindex" meta tag is not enough to get you out of trouble. A few years back a company fulfilled several orders before they realised the price paid by the customer was way too low. Turns out the customer had noticed that the prices on the test subdomain were quite old and the site allowed you to place an order!
 
