
Home / Forums Index / Google / Google SEO News and Discussion
Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

    
Strange dynamic URL appearing on Google index
possible duplicate content penalty
treborito

5+ Year Member



 
Msg#: 30334 posted 2:39 am on Jul 11, 2005 (gmt 0)

My client's restaurant site has been hit hard by the Bourbon update and has yet to return to the very high rankings it has held for the past five years. We were number one on Google for several different key phrases for <location> Restaurants. We still hold the top positions for the same phrases on MSN and Yahoo. Google crawls our home page every day but only crawls the rest of the site on a monthly basis, as the site content does not change that often. I've already submitted a reinclusion request to Google but have yet to hear back. I'm not sure it was necessary, since we're still in the index.

I've added a permanent 301 redirect, switched to absolute links, and removed all duplicate content. As I was reviewing the site, I discovered a few things that seem strange. FYI: The site

1. Back in January, Google indexed some pages that do not exist on my site's server, e.g. <snip>

There are a handful of these dynamic URLs still showing up in the index. Any idea where they came from? I have two valid PHP pages on my site that ask the user to reserve a table or send in a comment. They contain an image verifier script via a formmail.php file that I downloaded from <snip>. My fear is that these URLs are being seen as duplicate content.

2. I have noticed that countless restaurant/wine related Web site directories have hijacked a lot of my site's content (wine list, menus) and posted it on their own sites (which, in turn, link back to our site). Could this duplicate content on these other Web sites be a contributing factor in my site's severe loss in ranking?

I recently found this thread, <snip>
about dealing with and finding duplicate sites.

Thanks for any assistance.

[edited by: trillianjedi at 3:10 pm (utc) on July 11, 2005]

[edited by: ciml at 3:34 pm (utc) on July 11, 2005]
[edit reason] No URL drops please as per TOS #13 [/edit]

 

Psazf

5+ Year Member



 
Msg#: 30334 posted 2:02 pm on Jul 11, 2005 (gmt 0)

Hi,

Sorry, but I clicked the url that you posted and it works!

<snip>

the page title is "<snip, example : an italian restaurant>"

Kind regards
PSAZF

[edited by: trillianjedi at 3:11 pm (utc) on July 11, 2005]

treborito

5+ Year Member



 
Msg#: 30334 posted 3:01 pm on Jul 11, 2005 (gmt 0)

Yes, I know the URL works, but it appears nowhere on my server. The URL appears dynamic, but there are no variables attributed to the particular origin page. <snip>. Just not sure why it's there... I have only the following two PHP pages that I created and that I see on my server:

<snip>

Any idea why these bogus (but live) dynamic URLs are appearing in the Google index?

[edited by: trillianjedi at 3:11 pm (utc) on July 11, 2005]
[edit reason] See above. Thread also being moved to google forum. [/edit]

jd01

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 30334 posted 4:10 pm on Jul 11, 2005 (gmt 0)

Hmmmm....

Google: There's almost nothing a competitor can do to harm your ranking or have your site removed from our index.

I wonder if I can think of one...

[webmasterworld.com...]

[webmasterworld.com...]

[webmasterworld.com...]

Not saying this is what happened, but it sure seems possible.

I wonder how many times this page will get indexed?
I wonder if they all 'count' as the same page, or duplicate content?

Justin

Edit: Attribution

theBear

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 30334 posted 4:59 pm on Jul 11, 2005 (gmt 0)

Justin, you think?

Google currently has many thousands of these against its own directory. They all end with ?il=1

Now what does one make of that?

I gave up trying to find a site using those urls so I assume that someone or something submitted them and went bye bye.

Some are fully indexed and now supplemental, some are url only, cache dates go back to last year and some have current cache dates.

BTW Justin, thanks for the sticky, I will give it a try.

I have two sites that are now immune to that problem, one left to go.

PS: In ref to the links, is that before or after the redirect hits Google?

I'm sure MikeNoLastName would say they will get indexed.

jd01

WebmasterWorld Senior Member 5+ Year Member



 
Msg#: 30334 posted 12:48 am on Jul 12, 2005 (gmt 0)

theBear:

PS: In ref to the links, is that before or after the redirect hits Google?

If you mean the code I sent you, the code will serve a 301 to the non-? version of the page, and then a 200 if the page exists and a 404 if not, which might be confusing to the little bots, but I need to be able to pass query strings and that one lets you.
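The code sent by sticky is not shown in the thread, so the following is only a minimal Python sketch of the behavior described: a request with a query string gets a 301 to the same path without it, and the clean path then gets a 200 if the page exists or a 404 if it does not. The `respond` function and the `KNOWN_PAGES` set are hypothetical names for illustration, and the sketch leaves out the part about still being able to pass query strings internally.

```python
from urllib.parse import urlsplit

# Hypothetical set of pages that really exist on the site (an assumption).
KNOWN_PAGES = {"/", "/menu.html", "/reserve.html"}


def respond(url, known_pages=KNOWN_PAGES):
    """Return (status, redirect_location) for a requested URL.

    Assumed behavior, not the original code: any request carrying a "?"
    is 301-redirected to the same path without it; otherwise the answer
    is 200 if the page exists and 404 if it does not.
    """
    parts = urlsplit(url)
    # A bare trailing "?" leaves parts.query empty, so check for it too.
    if parts.query or url.endswith("?"):
        return 301, parts.path or "/"
    if parts.path in known_pages:
        return 200, None
    return 404, None


if __name__ == "__main__":
    for requested in ("/menu.html?il=1", "/menu.html", "/nope.html"):
        print(requested, respond(requested))
```

With this arrangement a bot that requests a bogus ?il=1 style URL is always pointed at the parameter-free address, so only one version of each page can return a 200.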

If you are asking about other redirects/links I am not sure what you mean... the three links above serve a 200 on the first request, just like they should =).

The tough part with the whole situation is that if the content changes based on the ?blah=stuff, you want the pages indexed, but if it does not, then you don't. So, in some cases (where a script uses the parameters to serve the right information) it is right for SEs to index ?page=1, ?page=2, ?page=3 as different pages, but in other cases (where the content stays the same) they should not be indexed.
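One way to picture that distinction is a small Python sketch that reduces a URL to a canonical form by keeping only the parameters known to change the content. The function name `canonical_url`, the `significant_params` argument, and the example.com addresses are assumptions made for illustration; only the site owner knows which parameters actually matter.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def canonical_url(url, significant_params):
    """Drop every query parameter that does not change the page content.

    significant_params is a hypothetical whitelist supplied by the site
    owner, e.g. {"page"} for a paginated listing.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key in significant_params
    ]
    kept.sort()  # stable ordering so equivalent URLs compare equal
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), "")
    )


if __name__ == "__main__":
    # ?page=2 changes the content, ?il=1 does not (assumed for this example).
    print(canonical_url("http://example.com/list.php?page=2&il=1", {"page"}))
    print(canonical_url("http://example.com/about.php?il=1", set()))
```

Two requested URLs that reduce to the same canonical form are serving the same content, and those are the ones that should not be indexed separately.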

Unfortunately, there is not really an easy way for an SE to determine which ones should be indexed and which should be dropped, so that leaves it to us to protect our sites the best we can.

I believe the best way around this is to serve all pages as html and rewrite to any necessary script(s). Then you can catch any, uh, *bad* requests on the way in and decide what to do with them.

Personally, I use PHP and do not serve a file that needs parameters (or one that doesn't) as PHP unless I have to. I use mod_rewrite to pass the variables and serve all my pages as html. I initially started doing this to protect my scripts, but in hindsight, some of the other added benefits far outweigh the 'script hiding' aspect.
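As a rough illustration of that idea (not actual mod_rewrite rules), here is a minimal Python sketch of the routing logic: public .html addresses are mapped to internal scripts with the variables taken from the path, and anything arriving with a query string or an unknown path is rejected on the way in. The patterns, script paths, and the `route` function are all hypothetical.

```python
import re

# Hypothetical public URL patterns mapped to internal scripts.
# Each entry: (pattern, internal script, names for the captured groups).
ROUTES = [
    (re.compile(r"^/menu-(\d+)\.html$"), "/scripts/menu.php", ("page",)),
    (re.compile(r"^/reserve\.html$"), "/scripts/reserve.php", ()),
]


def route(request_uri):
    """Return (status, internal_script, params) for a public request.

    Requests carrying a "?" are treated as bad and get a 404 here; a 301
    to the clean path would be another reasonable choice.
    """
    path, sep, _query = request_uri.partition("?")
    if sep:
        return 404, None, {}
    for pattern, script, names in ROUTES:
        match = pattern.match(path)
        if match:
            # Variables come from the path, never from the visitor's query.
            return 200, script, dict(zip(names, match.groups()))
    return 404, None, {}


if __name__ == "__main__":
    for uri in ("/menu-2.html", "/menu-2.html?il=1", "/menu.php?page=2"):
        print(uri, route(uri))
```

Because the scripts are only reachable through the listed patterns, a made-up dynamic URL never resolves to a live page that could be indexed as a duplicate.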

Justin

Added: I use mod_rewrite to pass the variables and serve all my pages as html. This is not the same as parsing html as php.

treborito

5+ Year Member



 
Msg#: 30334 posted 1:43 am on Jul 16, 2005 (gmt 0)

I have listed the non-www URLs here with hopes that they will get crawled again and wiped out by the 301 redirect. I've placed them on another site of mine but it has yet to get crawled.

<snip>

[edited by: vitaplease at 3:38 pm (utc) on July 17, 2005]
[edit reason] no url drops please [/edit]


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved