
Google SEO News and Discussion Forum

    
Ranking drop for duplicate content in regional subdomains
tilmes (10+ Year Member)
Msg#: 3721874 posted 10:55 am on Aug 13, 2008 (gmt 0)

I recently built a site using subdomains, e.g. australia.example.com, uk.example.com, to target regional markets. After a month, all of the pages Google had indexed appear at the very end of the search results; they used to appear on the first page. I suspect it is a Google penalty. How can I be sure to get out of the penalty? With robots.txt, meta tags, or rules in the .htaccess file?

[edited by: Receptional_Andy at 10:57 am (utc) on Aug. 13, 2008]
[edit reason] Please use example.com - it can never be owned [/edit]

 

tedster (WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member)
Msg#: 3721874 posted 11:25 am on Aug 13, 2008 (gmt 0)

A couple of things:

1. New sites often start out with great rankings, then lose those positions after a short period and only build back slowly. I think of it as a test period to see whether the site will really catch fire, and most of the time that doesn't happen. We used to call this the Google Sandbox [webmasterworld.com].

2. As you may have just discovered, you need to be careful with regional pages that are near-duplicates of each other. Yes, you can exclude those subdomains through robots.txt or robots meta tags. But if they aren't even indexed, why keep them online? If they serve a real business purpose, I'd suggest tailoring each one to its region so they are no longer duplicates.

tilmes (10+ Year Member)
Msg#: 3721874 posted 12:14 pm on Aug 13, 2008 (gmt 0)

Hello tedster, thanks for your comments. I will try robots.txt first, but I could not find the right code anywhere on the internet. How can I disallow only those subdomains from being indexed with robots.txt? And could you go into more detail about "making them more tailored"? I would like to keep those subdomains, because each one shows only ads related to its own region.

tilmes (10+ Year Member)
Msg#: 3721874 posted 1:10 pm on Aug 13, 2008 (gmt 0)

I still cannot work out how to stop Googlebot from indexing the subdomains, because all the subdomains are generated by rewrite rules in .htaccess and there are no subfolders for them.
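(For context: a wildcard-subdomain setup like the one described usually hinges on a single rewrite in the main .htaccess. The sketch below is hypothetical; example.com, region.php, and the city/path parameters are illustrative names, not the poster's actual code.)

RewriteEngine On
# don't rewrite the handler script itself, or the rule would loop
RewriteCond %{REQUEST_URI} !^/region\.php
# leave the main www host alone
RewriteCond %{HTTP_HOST} !^www\. [NC]
# capture the city name from the hostname
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
# %1 is the city captured in the last matching RewriteCond above
RewriteRule ^(.*)$ /region.php?city=%1&path=$1 [L,QSA]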

activeco (10+ Year Member)
Msg#: 3721874 posted 1:45 pm on Aug 13, 2008 (gmt 0)

Bots usually request robots.txt on each subdomain, so generate that file there too.
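Each hostname counts as a separate site for robots purposes, so a crawler fetches a robots.txt from every subdomain it crawls, along the lines of:

http://www.example.com/robots.txt
http://australia.example.com/robots.txt
http://uk.example.com/robots.txt

Whatever each of those URLs returns governs only that host.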

tilmes (10+ Year Member)
Msg#: 3721874 posted 7:11 pm on Aug 13, 2008 (gmt 0)

The root directory is the same one where www.example.com is located. Can I have a robots.txt just for newyork.example.com?

activeco (10+ Year Member)
Msg#: 3721874 posted 9:43 pm on Aug 13, 2008 (gmt 0)

Probably, yes: if you check each incoming request and serve a corresponding file.

tedster (WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member)
Msg#: 3721874 posted 2:04 am on Aug 14, 2008 (gmt 0)

The root directory is the same one where www.example.com is located

That can't really be true from an HTTP request point of view. What you need is a file that answers at this address: newyork.example.com/robots.txt

tilmes (10+ Year Member)
Msg#: 3721874 posted 5:17 am on Aug 14, 2008 (gmt 0)

newyork.example.com is generated by a rewrite rule in .htaccess. What can I write in robots.txt to allow only www.example.com to be indexed? There can be only one robots.txt in the root directory for all the cities.

tedster (WebmasterWorld Senior Member, Top Contributor of All Time, 10+ Year Member)
Msg#: 3721874 posted 5:53 am on Aug 14, 2008 (gmt 0)

So then use a meta robots noindex tag on all the pages in the subdomains.
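A minimal sketch of that tag, placed inside the <head> of every page on the regional subdomains ("noindex" keeps the page out of the index; crawlers can still follow its links unless you also add "nofollow"):

<meta name="robots" content="noindex">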

tilmes (10+ Year Member)
Msg#: 3721874 posted 6:51 am on Aug 14, 2008 (gmt 0)

Thanks for your answer. Do you know whether the meta noindex tag really works against duplicate content? The meta noindex tag is in place now. So I just have to wait and see?

activeco (10+ Year Member)
Msg#: 3721874 posted 8:55 am on Aug 14, 2008 (gmt 0)

Or add these .htaccess rules:

# serve a blanket-disallow robots file to any subdomain except www
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^robots\.txt$ /norobots.txt [L]

This rewrites every request for robots.txt on any subdomain except www to the norobots.txt file.

norobots.txt:

User-agent: *
Disallow: /
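A quick way to verify the rewrite, assuming the subdomains already resolve (newyork is just one of the city hosts from this thread):

# should return the blanket Disallow: / rules from norobots.txt
curl http://newyork.example.com/robots.txt
# should return the site's normal robots.txt
curl http://www.example.com/robots.txt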

tilmes (10+ Year Member)
Msg#: 3721874 posted 7:08 am on Aug 15, 2008 (gmt 0)

Hi activeco, great idea. I will try this and post back here whether or not it worked. Thanks!

tilmes (10+ Year Member)
Msg#: 3721874 posted 7:09 am on Aug 22, 2008 (gmt 0)

Something strange here: in Google Webmaster Tools there are now very many URLs reported as restricted by robots.txt. And many URLs in the HTML pages carry stray quote marks, like http://www.example.com/'pages.html'. They are all blocked by robots.

[edited by: tedster at 7:21 am (utc) on Aug. 22, 2008]
[edit reason] switch to example.com - it can never be owned [/edit]

activeco (10+ Year Member)
Msg#: 3721874 posted 11:42 am on Aug 22, 2008 (gmt 0)

Probably some other rewrite rules are involved, possibly combined with bad internal linking.
