Ranking drop for duplicate content in regional subdomains

     
10:55 am on Aug 13, 2008 (gmt 0)

New User

10+ Year Member

joined:Oct 17, 2004
posts:16
votes: 0


I recently made a site using sub-domains, e.g. australia.example.com, uk.example.com, so it targets regional markets. After a month, all the pages Google had indexed were found at the very end of the search results. They used to appear on the first page. I guess it is a Google penalty. How can I get out of the Google penalty for sure? With robots.txt, meta tags, or code in the .htaccess file?

[edited by: Receptional_Andy at 10:57 am (utc) on Aug. 13, 2008]
[edit reason] Please use example.com - it can never be owned [/edit]

11:25 am on Aug 13, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


A couple things

1. New sites often start out with great rankings, then lose those positions after a short period only to build back slowly. It's like a test period to see if the site will really catch fire, I think, and most of the time that doesn't happen. We used to call this the Google Sandbox [webmasterworld.com].

2. As you may have just discovered, you should definitely watch it with regional pages that are too close to duplicates. Sure, you can exclude those subdomains through robots.txt or robots meta tags. Then, if they aren't even indexed, why do you want them online? If they have a real business purpose, then I'd suggest making them more tailored to each region so they are no longer duplicate.

12:14 pm on Aug 13, 2008 (gmt 0)

New User

10+ Year Member

joined:Oct 17, 2004
posts:16
votes: 0


Hello tedster, thanks for your comments. I would like to try robots.txt first, but I could not find the code on the internet. How can I disallow only those sub-domains from indexing with robots.txt? Could you be more detailed about "making them more tailored"? I would be glad to keep those sub-domains because they show only ads related to that region.
1:10 pm on Aug 13, 2008 (gmt 0)

New User

10+ Year Member

joined:Oct 17, 2004
posts:16
votes: 0


I still cannot work out how to disallow Googlebot from indexing the subdomains, because all the subdomains are generated from .htaccess and there are no sub-folders for them.
1:45 pm on Aug 13, 2008 (gmt 0)

Preferred Member

10+ Year Member

joined:June 13, 2004
posts:650
votes: 0


Bots usually request robots.txt on a subdomain, so generate that file too.
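
For instance, a minimal sketch of what the robots.txt served from one of those regional subdomains (reusing australia.example.com from the original post) could contain to keep that host out of the index:

User-agent: *
Disallow: /
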
7:11 pm on Aug 13, 2008 (gmt 0)

New User

10+ Year Member

joined:Oct 17, 2004
posts:16
votes: 0


The root directory is the same one where www.example.com is located. Can I have a robots.txt only for newyork.example.com?
9:43 pm on Aug 13, 2008 (gmt 0)

Preferred Member

10+ Year Member

joined:June 13, 2004
posts:650
votes: 0


Probably, if you check the request and serve the corresponding file.
2:04 am on Aug 14, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


The root directory is the same one where www.example.com is located

That can't really be true, from an HTTP request point of view. What you need is a file at this address: newyork.example.com/robots.txt

5:17 am on Aug 14, 2008 (gmt 0)

New User

10+ Year Member

joined:Oct 17, 2004
posts:16
votes: 0


newyork.example.com is generated from a rewrite rule in .htaccess. What can I write in robots.txt to allow indexing only of www.example.com? There can be only one robots.txt in the root directory for all the cities.
5:53 am on Aug 14, 2008 (gmt 0)

Senior Member

WebmasterWorld Senior Member tedster is a WebmasterWorld Top Contributor of All Time 10+ Year Member

joined:May 26, 2000
posts:37301
votes: 0


So then use a meta robots noindex tag on all the pages in the subdomains.
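
A minimal sketch of that tag, placed in the <head> of every page served on the regional subdomains (and left off the www pages):

<meta name="robots" content="noindex">
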
6:51 am on Aug 14, 2008 (gmt 0)

New User

10+ Year Member

joined:Oct 17, 2004
posts:16
votes: 0


Thanks for your answer. Do you know if the meta noindex tag really works for duplicated content? The meta noindex tag is added now. So then I just have to wait and see?
8:55 am on Aug 14, 2008 (gmt 0)

Preferred Member

10+ Year Member

joined:June 13, 2004
posts:650
votes: 0


Or make additional .htaccess rules:

# RewriteEngine On is assumed to be set already, since the subdomains
# themselves are generated by rewrite rules in this .htaccess
RewriteCond %{HTTP_HOST} ^([^.]+)\.yoursite\.com$ [NC]
RewriteCond %{HTTP_HOST} !^www\. [NC]
RewriteRule ^robots\.txt$ /norobots.txt [L]

This should rewrite all requests for robots.txt on any subdomain except www to the norobots.txt file.

norobots.txt:

User-agent: *
Disallow: /
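
One rough way to verify the rewrite once the subdomains are resolving (assuming curl, or simply a browser, is available) is to fetch the file from a regional host and from www and confirm that only the regional host returns the blocking rules:

curl http://newyork.example.com/robots.txt
curl http://www.example.com/robots.txt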

7:08 am on Aug 15, 2008 (gmt 0)

New User

10+ Year Member

joined:Oct 17, 2004
posts:16
votes: 0


Hi activeco, great idea. I will try this and will post here whether it worked or not. Thanks!
7:09 am on Aug 22, 2008 (gmt 0)

New User

10+ Year Member

joined:Oct 17, 2004
posts:16
votes: 0


Very strange here: in Google Webmaster Tools there are so many URLs restricted by robots.txt. And many URLs have this ' ' in the HTML pages, like http://www.example.com/'pages.html'. They are all prohibited by robots.

[edited by: tedster at 7:21 am (utc) on Aug. 22, 2008]
[edit reason] switch to example.com - it can never be owned [/edit]

11:42 am on Aug 22, 2008 (gmt 0)

Preferred Member

10+ Year Member

joined:June 13, 2004
posts:650
votes: 0


Probably some other rewrite rules, possibly combined with bad linking.
 
