Duplicate content questions

pardo

7:38 am on Apr 24, 2003 (gmt 0)

10+ Year Member



One of our customers' websites has been fully spidered for the first time during the deep crawl. A few questions came up about this:

1. They have widgets.nl and widgets.com.
Some pages were spidered under both URLs, but physically there is only one domain; the other is a server redirect. Will this cause any trouble with 'duplicate content'?

2. 3 pages '../widgets.htm' and '../Widgets.htm' were spidered separately but are the same pages. Does Google also see this as 'duplicate content'?

3. We did manage to turn the dynamically driven website into 'static looking' pages. But because of some wrong internal links, there are still links to some dynamic pages. During the crawl, several pages were crawled in both forms, like '../widgets-blue-widgets.htm' and '../category.asp?id=3'.
Again, my question is whether this will be seen as duplicate content.

pardo

10:05 am on Apr 28, 2003 (gmt 0)

10+ Year Member



One disadvantage of a popular forum like this is how quickly one's topic moves down the page, because lots of interesting posts keep going to the top.

So here is a second try at my questions on duplicate content...

heini

10:09 am on Apr 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Pardo, how is the server redirect done? Doesn't sound like it worked perfectly?

Yidaki

10:22 am on Apr 28, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



>1. they have widgets.nl and widgets.com - a server redirect.

The redirect should at least return a 301 (Moved permanently) status code.
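
A rough way to check this yourself, sketched in Python: request the redirecting host without following the redirect and look at the status code and Location header. The host names are only the placeholder names from the question, and the helper names `fetch_redirect` and `is_permanent_redirect_to` are made up for this example.

```python
import http.client
from urllib.parse import urlsplit


def fetch_redirect(host, path="/", timeout=10):
    """Send a HEAD request (redirects are not followed) and
    return the raw (status code, Location header) pair."""
    conn = http.client.HTTPConnection(host, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def is_permanent_redirect_to(status, location, expected_host):
    """True only for a 301 whose Location header points at expected_host."""
    if status != 301 or not location:
        return False
    # urlsplit().hostname is lowercased, so compare against a lowercased host.
    return urlsplit(location).hostname == expected_host.lower()


# Example use (needs network access; hosts are placeholders from the thread):
# status, location = fetch_redirect("widgets.nl")
# print(status, location, is_permanent_redirect_to(status, location, "widgets.com"))
```

If this reports a 302 or a 200, the redirect is probably a temporary one or not a real HTTP redirect at all, which would explain both domains getting spidered.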

>2. 3 pages '../widgets.htm' and '../Widgets.htm' were spidered separately but are the same pages.

AFAIK Google is smart enough to merge the results instead of penalizing one or all of the pages. It works with most of my sites that are hosted on "case insensitive servers".
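
If you want to see how many such case variants were picked up, a small Python sketch like the one below can group crawled URLs that differ only in the case of the path. This only inspects your own URL list (from logs, for instance); it says nothing about how Google merges pages internally. The function name `case_variant_groups` and the sample URLs are assumptions for illustration.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def case_variant_groups(urls):
    """Group URLs whose host, path and query match once the path is lowercased.

    Returns only groups with more than one distinct spelling.
    """
    groups = defaultdict(set)
    for url in urls:
        parts = urlsplit(url)
        # hostname is already lowercased by urlsplit; lowercase the path too.
        key = (parts.hostname or "", parts.path.lower(), parts.query)
        groups[key].add(url)
    return [sorted(variants) for variants in groups.values() if len(variants) > 1]


# Hypothetical sample, using the placeholder names from the thread.
crawled = [
    "http://widgets.com/widgets.htm",
    "http://widgets.com/Widgets.htm",
    "http://widgets.com/about.htm",
]
duplicates = case_variant_groups(crawled)
```

Fixing the internal links so that only one spelling is used removes the problem at the source.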

>3. We did manage to turn the dynamically driven website into 'static looking' pages. But because of some wrong internal links, there are still links to some dynamic pages.

You should set up your robots.txt file to disallow crawling of either the static or the dynamic URLs.
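
As a sketch of what that could look like when blocking the dynamic URLs: the rules below assume the script sits at the site root as /category.asp, which is a guess since the real path was not given. The Python around it just uses the standard library's `urllib.robotparser` to confirm the rules behave as intended before you upload the file.

```python
from urllib.robotparser import RobotFileParser

# Assumed robots.txt content; /category.asp at the root is a hypothetical path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /category.asp
"""


def build_parser(robots_txt):
    """Parse robots.txt text into a RobotFileParser without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The dynamic URL should be blocked, the static-looking one should stay crawlable.
dynamic_allowed = parser.can_fetch("*", "http://widgets.com/category.asp?id=3")
static_allowed = parser.can_fetch("*", "http://widgets.com/widgets-blue-widgets.htm")
```

Disallow rules are prefix matches, so the single line covers every id value passed to the script. Blocking the dynamic form rather than the static one keeps the pages you actually want listed.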