Duplicate content and server url

     
3:38 pm on Aug 16, 2005 (gmt 0)

New User

10+ Year Member

joined:July 19, 2003
posts:2
votes: 0


I have a website that ranked on page 1 for our most popular search for a number of years. It is now on page 10. I have found a duplicate of the website: mysite.com, and the same website on the server under a different URL, servername.com/site/welcome.htm. Both are in the Google cache, and Copyscape shows them as duplicate content when I search for mysite.com. Does Google treat this as duplicate content?
1:34 pm on Aug 17, 2005 (gmt 0)

Preferred Member

10+ Year Member

joined:Sept 2, 2003
posts:412
votes: 0


If both sites are listed in Google and both are in the cache, I believe this counts as duplication. Do you own both sites?

Exp...

4:36 pm on Aug 17, 2005 (gmt 0)

New User

10+ Year Member

joined:July 19, 2003
posts:2
votes: 0


I don't own both sites, but both sites are on the same server. The duplicate site is owned by the company that hosts my website. The hosting company claims that the Google bots don't see it or index it as a duplicate site.
5:02 pm on Aug 17, 2005 (gmt 0)

Senior Member from NL 

WebmasterWorld Senior Member lammert is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month

joined:Jan 10, 2005
posts:2959
votes: 38


I have had exactly the same problem. My site was visible under both www.hostingcompany.com/account/ and www.mydomain.com/. Unfortunately the hosting company didn't allow me to use .htaccess under www.hostingcompany.com/account/ to do a 301 redirect, so I had to find another solution. This is what I did.

All my pages were created with SSI and SSI only worked on www.mydomain.com. I used a conditional header in each .shtml file with a meta refresh statement. This is the header for file /example.shtml:

<!--#IF EXPR="$SERVER_NAME!=www.mydomain.com" -->
<html>
<head>
<meta name="robots" content="noindex,follow">
<meta HTTP-EQUIV="refresh" content="10; URL=http://www.mydomain.com/example.shtml">
</head>
</html>
<!--#ENDIF -->
[ here the rest of the page ]

This trick also works when SSI is recognized at www.hostingcompany.com/account/. How it works:

The if statement is parsed by the SSI parser. If the request is for www.mydomain.com, the header is skipped. Otherwise a header is created with "noindex,follow", followed by a meta refresh to your actual page at www.mydomain.com. The 10 in the content attribute is the number of seconds before the redirect takes place. I tried 0 seconds, but that didn't work: Googlebot went directly to www.mydomain.com, ignored the "noindex,follow" robots tag and indexed the page under the first URL. This is the same behaviour as a 302 redirect. By waiting a few seconds, the bot picks up the noindex and the pages are removed from the index after some time.

If SSI doesn't work at www.hostingcompany.com/account/, as in my case, it works almost the same. The IF statement is not parsed but seen as an HTML comment, and the head block is passed straight through to the output.
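
To make the three cases easier to follow, here is a rough Python model of what ends up in front of the real page. It is only a simulation of the behaviour described above, not how the server implements SSI. The function name render_prefix and its parameters are made up for this illustration, and www.mydomain.com is the placeholder host used in this thread.

```python
CANONICAL_HOST = "www.mydomain.com"  # placeholder host from this thread

# The redirect block from the example header, with delay and target filled in.
HEADER_TEMPLATE = (
    "<html>\n"
    "<head>\n"
    '<meta name="robots" content="noindex,follow">\n'
    '<meta HTTP-EQUIV="refresh" content="{delay}; URL=http://{host}{path}">\n'
    "</head>\n"
    "</html>\n"
)


def render_prefix(server_name, path, ssi_enabled=True, delay=10):
    """Return the text that appears before the real page (hypothetical model)."""
    header = HEADER_TEMPLATE.format(delay=delay, host=CANONICAL_HOST, path=path)
    if not ssi_enabled:
        # SSI is not parsed: the directives pass through as plain HTML
        # comments, so the redirect block is always part of the output.
        opening = '<!--#IF EXPR="$SERVER_NAME!=%s" -->\n' % CANONICAL_HOST
        return opening + header + "<!--#ENDIF -->\n"
    if server_name != CANONICAL_HOST:
        # SSI is parsed and the request is for the wrong host name.
        return header
    # SSI is parsed and the request is for the real domain: header skipped.
    return ""
```

Calling render_prefix("www.mydomain.com", "/example.shtml") gives an empty string, while any other server name, or ssi_enabled=False, gives the noindex block with the delayed refresh.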

If your site uses another server side scripting language like ASP or PHP, you could do the same trick by sending a redirect header whenever the request is not for www.mydomain.com.
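
The same idea as a minimal sketch in Python, written as a WSGI wrapper because that is easy to try out locally. The thread only mentions ASP and PHP, so treat this as an illustration of the host check rather than something tested on a shared host. The names canonical_host_middleware and path_prefix are invented for this example, the 301 status is my assumption based on the 301 redirect mentioned earlier, and the /account prefix stands in for whatever directory the hosting company uses.

```python
CANONICAL_HOST = "www.mydomain.com"  # placeholder host from this thread


def canonical_host_middleware(app, canonical_host=CANONICAL_HOST, path_prefix=""):
    """Wrap a WSGI app so requests for any other host get a 301 redirect.

    path_prefix is a hypothetical setting for the account directory on the
    hosting company's URL (for example "/account"); it is stripped from the
    path before the redirect target is built.
    """
    def wrapper(environ, start_response):
        host = environ.get("HTTP_HOST") or environ.get("SERVER_NAME", "")
        host = host.split(":")[0].lower()  # drop an optional port number
        if host == canonical_host:
            # Request is for the real domain: serve the page as usual.
            return app(environ, start_response)

        path = environ.get("PATH_INFO", "/")
        if path_prefix and path.startswith(path_prefix):
            path = path[len(path_prefix):] or "/"
        query = environ.get("QUERY_STRING", "")
        location = "http://" + canonical_host + path
        if query:
            location += "?" + query
        start_response(
            "301 Moved Permanently",
            [("Location", location), ("Content-Length", "0")],
        )
        return [b""]

    return wrapper
```

With path_prefix="/account", a request for www.hostingcompany.com/account/example.shtml is answered with a redirect to http://www.mydomain.com/example.shtml, and requests for www.mydomain.com pass through untouched.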
