Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
google indexing my "mirror" forum
a test board I setup
miracle




msg:3198921
 4:23 pm on Dec 24, 2006 (gmt 0)

My live board is at mydomain.com/forum (main domain is myforum.com). Today I made a copy of the live board at mydomain.com/testboard (copied the entire MySQL database and all the files over), because I want to perform a software upgrade on the testboard first before doing it on the live forums.

After I woke up, I found that Google had been indexing the testboard for almost 8 hours. Thinking about how they found my testboard URL, I realized that I had Google AdSense and Urchin code on the testboard. Is there a way to remove what it has indexed? I'm worried about duplicate content. :( Please advise, thanks in advance.

PS I already blocked all spiders in /robots.txt with:

user-agent: *
Disallow: /

 

mattg3




msg:3199055
 9:02 pm on Dec 24, 2006 (gmt 0)

PS I already blocked all spiders in /robots.txt with:

user-agent: *
Disallow: /

I assume that you only want to block your testboard in your /robots.txt, not the whole site?

miracle




msg:3199060
 9:15 pm on Dec 24, 2006 (gmt 0)

That's correct, I placed the robots.txt inside the testboard folder. Additionally, I have banned robots on the testboard. They haven't come back in over 3 hours. Do you know what happens to the content that was indexed by Google? :-S

mattg3




msg:3199073
 9:26 pm on Dec 24, 2006 (gmt 0)

Well, it gets indexed. I assume you just need to wait now; otherwise there is a removal tool. But you might just want to wait so you don't remove something that shouldn't be removed.

The advice given by G was to block the duplicate content you don't want indexed, otherwise they choose one version themselves.

If they choose the wrong one for a short period of time, move your testboard to a third directory and 301 the old testboard URLs to the real directory.

In fact, why not do this right now? 301 everything from the testboard to the real one and restart your test in another directory, this time removing AdSense.

Patrick Taylor




msg:3199086
 9:38 pm on Dec 24, 2006 (gmt 0)

I'm worried about duplicate content.

I wouldn't worry too much about duplicate content in this instance, and certainly wouldn't use a 301. If you've blocked any further crawling on the test board, I believe the pages will eventually go supplemental and then a long time later (a year) they will drop out of the index. In the meantime no harm will be done.

mattg3




msg:3199160
 1:01 am on Dec 25, 2006 (gmt 0)

and certainly wouldn't use a 301

How does a 301 hinder crawling? :\

How can Webmasters proactively address duplicate content issues?

[googlewebmastercentral.blogspot.com...]

# Block appropriately: Rather than letting our algorithms determine the "best" version of a document, you may wish to help guide us to your preferred version. For instance, if you don't want us to index the printer versions of your site's articles, disallow those directories or make use of regular expressions in your robots.txt file.
# Use 301s: If you have restructured your site, use 301 redirects ("RedirectPermanent") in your .htaccess file to smartly redirect users, the Googlebot, and other spiders.
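
To make the 301 suggestion concrete, here is a minimal Python sketch of the path mapping such a redirect performs. It is not a server config and not anyone's actual setup: the `/testboard/` and `/forum/` prefixes are assumed from the first post, and `permanent_redirect` is a hypothetical helper name. Keep in mind that a crawler has to be allowed to fetch the old URL in order to see the redirect at all.

```python
from typing import Optional, Tuple

TEST_PREFIX = "/testboard/"  # assumed path of the test copy
LIVE_PREFIX = "/forum/"      # assumed path of the live board


def permanent_redirect(path: str) -> Optional[Tuple[int, str]]:
    """Return (301, new_location) for test-board URLs, or None otherwise."""
    if path.startswith(TEST_PREFIX):
        # Keep everything after the prefix so each page maps to its twin.
        return 301, LIVE_PREFIX + path[len(TEST_PREFIX):]
    return None


if __name__ == "__main__":
    print(permanent_redirect("/testboard/viewtopic.php?t=5"))
    print(permanent_redirect("/forum/index.php"))
```

In an .htaccess file the same mapping would normally be expressed with a redirect directive rather than code; the function only illustrates which URL each test-board request would be sent to.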

encyclo




msg:3199188
 1:47 am on Dec 25, 2006 (gmt 0)

I placed the robots.txt inside the testboard folder.

Note that robots.txt can only be placed in the document root for your site, never in a folder: crawlers will never fetch it from there. You should exclude the testboard folder within your main robots.txt instead.
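
As a rough illustration of why the folder copy is never read, this Python sketch derives the one robots.txt location a crawler would request for any page on a host. `robots_url_for` is a hypothetical helper, and the example URLs reuse the placeholder domain from the first post.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url_for(page_url: str) -> str:
    """Build the robots.txt URL for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Only scheme and host are kept; the page's own folder is discarded.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


if __name__ == "__main__":
    print(robots_url_for("http://mydomain.com/testboard/index.php"))
    print(robots_url_for("http://mydomain.com/forum/viewtopic.php?t=1"))
```

Both calls give the same root-level URL, so a file at /testboard/robots.txt is simply never consulted.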

WiseWebDude




msg:3200236
 9:04 pm on Dec 26, 2006 (gmt 0)

Yup, you should use the robots.txt like:

user-agent: *
Disallow: /example-forum/

Having it in the actual folder is, as far as I know, worthless.
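
If you want to check a rule before uploading it, Python's standard `urllib.robotparser` can evaluate robots.txt text locally. This is only a sketch: the `/testboard/` path and the mydomain.com URLs are assumed from the first post, `is_allowed` is a hypothetical helper, and real crawlers may interpret edge cases differently from this parser.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Parse robots.txt text and report whether agent may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Rule from the first post: blocks the whole site when served from the root.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"

# Folder-scoped rule; "/testboard/" is an assumed path based on the thread.
BLOCK_TESTBOARD = "User-agent: *\nDisallow: /testboard/\n"

LIVE_URL = "http://mydomain.com/forum/index.php"
TEST_URL = "http://mydomain.com/testboard/index.php"

if __name__ == "__main__":
    for name, rules in (("block all", BLOCK_ALL),
                        ("block testboard", BLOCK_TESTBOARD)):
        print(name, "live:", is_allowed(rules, LIVE_URL),
              "test:", is_allowed(rules, TEST_URL))
```

The blanket `Disallow: /` rule shuts out the live forum as well, while the folder-scoped rule only blocks the test copy.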


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved