Home / Forums Index / Google / Google SEO News and Discussion
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
Ranking drop for duplicate content in regional subdomains
tilmes




msg:3721876
 10:55 am on Aug 13, 2008 (gmt 0)

I recently made a site using subdomains, e.g. australia.example.com, uk.example.com, to target regional markets. After a month, all of the indexed pages appear at the very end of Google's search results; they used to appear on the first page. I guess it is a Google penalty. How can I get out of the penalty for sure? With robots.txt, meta tags, or rules in the .htaccess file?

[edited by: Receptional_Andy at 10:57 am (utc) on Aug. 13, 2008]
[edit reason] Please use example.com - it can never be owned [/edit]

 

tedster




msg:3721887
 11:25 am on Aug 13, 2008 (gmt 0)

A couple of things:

1. New sites often start out with great rankings, then lose those positions after a short period, only to build them back slowly. It's like a test period to see if the site will really catch fire, I think, and most of the time that doesn't happen. We used to call this the Google Sandbox [webmasterworld.com].

2. As you may have just discovered, you should definitely be careful with regional pages that are too close to duplicates. Sure, you can exclude those subdomains through robots.txt or robots meta tags. But then, if they aren't even indexed, why do you want them online? If they have a real business purpose, I'd suggest tailoring them more to each region so they are no longer duplicates.

tilmes




msg:3721908
 12:14 pm on Aug 13, 2008 (gmt 0)

Hello tedster, thanks for your comments. I will try robots.txt first, but I could not find the code on the internet. How can I disallow only those subdomains from indexing with robots.txt? And could you say more about "making them more tailored"? I would be glad to keep those subdomains, because each one shows only ads related to its region.

tilmes




msg:3721937
 1:10 pm on Aug 13, 2008 (gmt 0)

I still cannot figure out how to disallow Googlebot from indexing the subdomains, because all the subdomains are generated from .htaccess and there are no subfolders for them.

activeco




msg:3721955
 1:45 pm on Aug 13, 2008 (gmt 0)

Bots usually request robots.txt on each subdomain, so generate that file there too.

tilmes




msg:3722316
 7:11 pm on Aug 13, 2008 (gmt 0)

The root directory is the same one where www.example.com is located. Can I have a robots.txt only for newyork.example.com?

activeco




msg:3722459
 9:43 pm on Aug 13, 2008 (gmt 0)

Probably, if you check the request and serve a corresponding file.
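To make that idea concrete (a minimal sketch, not anything from this thread): the server can inspect the Host header of the incoming request and return a different robots.txt body per subdomain. The hostnames and file contents below are only illustrations.

```python
# Sketch: choose a robots.txt body based on the request's Host header.
# Hostnames here (example.com and its subdomains) are hypothetical.
def robots_for_host(host: str) -> str:
    """Allow crawling on the www host, block everything on other subdomains."""
    host = host.lower().split(":")[0]  # normalize case and strip any port
    if host in ("www.example.com", "example.com"):
        return "User-agent: *\nDisallow:\n"    # empty Disallow = allow all
    return "User-agent: *\nDisallow: /\n"      # block the whole subdomain

print(robots_for_host("www.example.com"))
print(robots_for_host("newyork.example.com"))
```

The same decision could live in a CGI script, an application route, or (as activeco suggests below in this thread's actual approach) a mod_rewrite rule.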

tedster




msg:3722627
 2:04 am on Aug 14, 2008 (gmt 0)

The root directory is the same one where www.example.com is located

That can't really be true, from an HTTP request point of view. What you need is a file at this address: newyork.example.com/robots.txt

tilmes




msg:3722684
 5:17 am on Aug 14, 2008 (gmt 0)

newyork.example.com is generated by a rewrite rule in .htaccess. What can I write in robots.txt to allow indexing of www.example.com only? Because there can be only one robots.txt in the root directory for all the cities.

tedster




msg:3722696
 5:53 am on Aug 14, 2008 (gmt 0)

So then use a meta robots noindex tag on all the pages in the subdomains.
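For reference, the tag tedster means is the standard robots meta tag, placed in the `<head>` of every page served on the regional subdomains (this snippet is a generic illustration, not code from this thread):

```
<!-- In the <head> of every page on the regional subdomains -->
<meta name="robots" content="noindex">
```

Googlebot must still be able to crawl the page to see this tag, so it should not be combined with a robots.txt block on the same URLs.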

tilmes




msg:3722732
 6:51 am on Aug 14, 2008 (gmt 0)

Thanks for your answer. Do you know whether the meta noindex tag really works for duplicated content? The meta noindex tag is added now. So I just have to wait and see?

activeco




msg:3722791
 8:55 am on Aug 14, 2008 (gmt 0)

Or add these .htaccess rules:

RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^robots\.txt$ /norobots.txt [L]

This should rewrite all requests for robots.txt on any subdomain except www to the norobots.txt file.

norobots.txt:

User-agent: *
Disallow: /

tilmes




msg:3723590
 7:08 am on Aug 15, 2008 (gmt 0)

Hi activeco, great idea. I will try this and post here whether it worked or not. Thanks!

tilmes




msg:3728652
 7:09 am on Aug 22, 2008 (gmt 0)

Very strange: in Google Webmaster Tools there are now many URLs listed as restricted by robots.txt. And many URLs contain this ' ' in the HTML pages, like http://www.example.com/'pages.html'
They are all prohibited by robots.

[edited by: tedster at 7:21 am (utc) on Aug. 22, 2008]
[edit reason] switch to example.com - it can never be owned [/edit]

activeco




msg:3728773
 11:42 am on Aug 22, 2008 (gmt 0)

Probably some other rewrite rules, possibly combined with bad linking.


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
© Webmaster World 1996-2014 all rights reserved