Does Google Care?

Same content, different URL

ffctas

10:18 pm on Jan 20, 2003 (gmt 0)

10+ Year Member



I have two URLs: 1) widgets.com, 2) widget.net
Both point at the exact same content. Google seems to be picking up each URL separately. In other words, a search for "black widgets" shows the site highly ranked under both widgets.com and widget.net.
Is this OK with Google?
thanks
tom

nancyb

10:25 pm on Jan 20, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Supposedly, duplicate content is a "no-no" with Google, but with this last update I am seeing a lot of it. The first page, for one result I monitor, has five different listings for the same content. Hopefully this is just a minor glitch that G is working on.

I'm not too keen on reporting sites to G because sometimes it could just be a temporary problem with the site, but many here use the spam reporting tool to let them know.

skibum

6:03 am on Jan 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Many times they just combine the 2 URLs and see them as one. One MAJOR pharma drug site we worked with had mirror domains set up for regulatory purposes and no robots.txt to help Googlebot find the right one. They ALL got tossed, and it took six months or so to clear it up so that one remained in the index.

If ya wanna play it safe, put a robots.txt to exclude Google from the one with lower visibility. Ya never know what will happen.
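As a rough illustration of the robots.txt suggestion above, here is a minimal Python sketch that checks whether a given robots.txt would keep Googlebot off the lower-visibility domain. The robots.txt body, the `is_blocked` helper, and the choice of widget.net as the domain to exclude are all assumptions made for the example; none of them come from the posts.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt served only on the lower-visibility domain.
MIRROR_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /
"""


def is_blocked(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Assumed example: widget.net is the domain being excluded.
    print(is_blocked(MIRROR_ROBOTS_TXT, "Googlebot", "http://widget.net/"))
```

This only tests how the standard-library parser reads the rules. It says nothing about how Google treats the two domains in its index.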

daroz

5:03 pm on Jan 22, 2003 (gmt 0)

10+ Year Member



Here's a robots.txt question I can't seem to find an answer to...

If you have a site, say 'example.com', and also have 'example.org' and 'example.net' pointing to the same webspace (same file structure on the webserver, not separate virtual hosts), how do you format the robots.txt so that the spiders avoid the 'alternate' domains?

It looks like the spec calls for the robots.txt to use 'local' URLs, not full blown [etc.etc.etc...] URLs.

Yidaki

5:24 pm on Jan 22, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



No way to do this via robots.txt, daroz. Why not make a virtual host entry / folder for each domain and put a separate robots.txt file into each virtual host's root folder?

ffctas, that's a perfect way out of the Google index. ;)
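The underlying idea in the virtual-host suggestion is that each hostname ends up with its own robots.txt. A hedged sketch of that mapping in Python follows. It picks the robots.txt body from the request's Host header, which is a different mechanism from separate virtual host folders and is shown only to make the per-host idea concrete. `PRIMARY_HOST`, `ALLOW_ALL`, `BLOCK_ALL`, and `robots_txt_for_host` are hypothetical names, and treating example.com as the domain to keep is an assumption.

```python
# Assumption: example.com is the domain that should stay crawlable.
PRIMARY_HOST = "example.com"

# An empty Disallow permits everything; "Disallow: /" blocks everything.
ALLOW_ALL = "User-agent: *\nDisallow:\n"
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def robots_txt_for_host(host_header: str) -> str:
    """Pick the robots.txt body for the hostname in a request's Host header."""
    # Drop an optional port and normalise case.
    host = host_header.split(":")[0].strip().lower()
    # Treat the www. variant as the same site.
    if host.startswith("www."):
        host = host[len("www."):]
    return ALLOW_ALL if host == PRIMARY_HOST else BLOCK_ALL


if __name__ == "__main__":
    for name in ("example.com", "www.example.org", "example.net:80"):
        print(name, "->", repr(robots_txt_for_host(name)))
```

With separate virtual host folders the same result comes from two static files, one per document root, and no code is needed.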