Having word ID on the URL

         

skuba

12:05 am on Jan 15, 2004 (gmt 0)

10+ Year Member Top Contributors Of The Month



I am facing a major issue here.
I am already trying to get rid of regular expressions and spaces on the dynamic pages of this site.
Now I found out that Google also doesn't like the word ID, because it may look like a session ID.
What if ID is just part of a longer parameter like dept_id or sub_id?

Also, how long can a URL be and still be SE friendly?

The website I am talking about has LONG URLs, plus SPACES, plus SYMBOLS, plus the word ID.
It seems completely SE unfriendly.

Check it out:
detail.htm?stylepkey=11049&style_id=030%20COMBB0&dept_id=3&deptName=Parts&sub_id=50&subName=Campagnolo%20Parts&lprice=42.98&hprice=42.98

First I just wanted to have an ISAPI rewrite that replaces ?, &, = and spaces with -

Would that be enough?

We also have the problem of the product prices that have dots in them, and also the word ID in some of the parameters...
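To make the proposed substitution concrete, here is a minimal Python sketch of it. This is only an illustration of the character replacement, not ISAPI Rewrite syntax, and the function name flatten_url is made up for this example.

```python
import re

def flatten_url(url: str) -> str:
    """Replace ?, &, = and spaces (literal or %20) with hyphens."""
    # Treat an encoded space like a literal one before substituting.
    url = url.replace("%20", " ")
    # Collapse each run of separator characters into a single hyphen.
    return re.sub(r"[?&= ]+", "-", url)

original = ("detail.htm?stylepkey=11049&style_id=030%20COMBB0&dept_id=3"
            "&deptName=Parts&sub_id=50&subName=Campagnolo%20Parts"
            "&lprice=42.98&hprice=42.98")

print(flatten_url(original))
```

Note that the result still contains the dots from the prices and the _id parameter names, and that the server would still need a rule to map the flattened URL back to the real query string.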

Any ideas?

Thanks

Ledfish

2:30 am on Jan 15, 2004 (gmt 0)

10+ Year Member



Well, you'll definitely have problems with those URLs. More than two parameters is about the max from what I've heard.

And

ID is very often a problem regardless of whether it is by itself or coupled with another word such as idproduct or productid.

skuba

3:21 am on Jan 15, 2004 (gmt 0)

10+ Year Member Top Contributors Of The Month



Yes, but if I use a rewrite those URLs will look like static pages, so Google won't know whether they are parameters or not.
And what you said about ID is not completely true. If it were, Google wouldn't index pages with words like steroid, identity, idiot, etc...

millie

9:16 am on Jan 15, 2004 (gmt 0)

10+ Year Member



That Google hates session IDs is no secret and has been stated by Google representatives.

I believe it's not the "styleID" that's the problem, it's the next bit ... "ID=" which screams Session ID to a bot.

I understand that parameter names should not be longer than 10 characters for effective spidering of dynamic pages in Google.
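For anyone who wants to check their own URLs against the two rules of thumb mentioned so far (parameter names ending in "id", and names longer than 10 characters), here is a rough Python sketch. Both checks are forum lore from this thread, not documented Google behavior, and flag_parameters and MAX_NAME_LEN are names invented for the example.

```python
from urllib.parse import urlsplit, parse_qsl

# The limit mentioned above; an assumption from this thread, not a published rule.
MAX_NAME_LEN = 10

def flag_parameters(url: str) -> dict[str, list[str]]:
    """Return each query parameter name with the concerns raised in this thread."""
    flags: dict[str, list[str]] = {}
    query = urlsplit(url).query
    for name, _value in parse_qsl(query, keep_blank_values=True):
        reasons = []
        # Heuristic 1: the name ends in "id", so "...id=" appears in the URL.
        if name.lower().endswith("id"):
            reasons.append("ends in 'id'")
        # Heuristic 2: the name is longer than the suggested maximum.
        if len(name) > MAX_NAME_LEN:
            reasons.append(f"name longer than {MAX_NAME_LEN} characters")
        if reasons:
            flags[name] = reasons
    return flags

url = ("detail.htm?stylepkey=11049&style_id=030%20COMBB0&dept_id=3"
       "&deptName=Parts&sub_id=50&subName=Campagnolo%20Parts")
for name, reasons in flag_parameters(url).items():
    print(name, "->", ", ".join(reasons))
```

On the URL from the first post this flags style_id, dept_id and sub_id, which are exactly the parameters being argued about here.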

For the other SEs paid inclusion or trusted feed seems to be the only solution I can find for dynamic pages.

I can't help you with the ISAPI rewrite thing as I'm only just starting to play with this myself so haven't tested it fully, however it looks like it's the answer to my prayers.

skuba

6:03 pm on Jan 15, 2004 (gmt 0)

10+ Year Member Top Contributors Of The Month



There is no parameter on the URL that is just ID=... It is always part of a bigger word like dept_id, style_id, etc...
I guess that in that case Google won't bother, because the underscore connects words, so it looks like one word only.

skuba

11:32 pm on Feb 3, 2004 (gmt 0)

10+ Year Member Top Contributors Of The Month



Please, experts, help!
Thanks

Ledfish

1:32 pm on Feb 4, 2004 (gmt 0)

10+ Year Member



Skuba

Back in November we were really struggling with this problem. Everyone I talked to said that our URLs were fine, yet Google only had half a dozen of these pages in the index. Looking through my logs, I would notice the bot calling for the page without the query string. Of course that would give them an error, because the page couldn't display anything (other than an error message) without the query string data. For example, our URLs looked like this:

www.widgets.com/somefile.asp?idcat=32&idproduct=221

In December, we moved to a new host so we could use ISAPI Rewrite. Having had such bad experiences for several months, and because it was just as easy, we used ISAPI Rewrite to rewrite the URLs, completely removing "idcat" and "idproduct" from them, so they now look like this:

www.widgets.com/product32-221.html
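To show the mapping itself, here is a small Python sketch of translating between the two URL forms. It is not ISAPI Rewrite configuration, just the same idea expressed in code, and it assumes the first number is idcat and the second is idproduct, as in the example above. The names STATIC_URL, to_internal and to_public are made up for illustration.

```python
import re
from typing import Optional

# Public form: /product<idcat>-<idproduct>.html
STATIC_URL = re.compile(r"^/product(\d+)-(\d+)\.html$")

def to_internal(path: str) -> Optional[str]:
    """Map a static-looking path to the dynamic URL, or None if it doesn't match."""
    match = STATIC_URL.match(path)
    if match is None:
        return None
    idcat, idproduct = match.groups()
    return f"/somefile.asp?idcat={idcat}&idproduct={idproduct}"

def to_public(idcat: int, idproduct: int) -> str:
    """Build the static-looking path to use in links on the site."""
    return f"/product{idcat}-{idproduct}.html"

print(to_internal("/product32-221.html"))
print(to_public(32, 221))
```

The point is that the rewrite works in both directions: links on the site use the public form, and the server translates incoming requests back to the query string the script expects.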

Within 3-4 we went from 40 indexed pages (only about 6 had the dynamic URLs) to over 500. Currently Google has just about all 950 of our pages indexed.

Now, although many have different opinions on whether id is a problem, IMO the id= that you have is a problem. All I can really say for sure is that, in our experience, the query string and its content were, we believe, giving us problems getting indexed.

If you are struggling to get indexed, I highly suggest rewriting the URLs. You can always do some minor rewriting at first to get rid of the "?" and the "&", and if that doesn't get the pages indexed, rewrite them a little more.

Nova Reticulis

2:02 pm on Feb 5, 2004 (gmt 0)

10+ Year Member



Indeed, URL rewriting is the solution.

However, before you start rewriting, you should really consider which pages you want indexed, which you don't, and how you want your new URL map to look. Otherwise you'll sink in the ensuing confusion.

Our composite solution to URL rewriting troubles (as we have dynamic pages with multiple states and subcommands) was, after all, canonical URLs. We built a system that exposes a URL scheme that's totally classic: it looks entirely like static HTML URLs. In fact, we use a spider to download the entire site tree and then rsync it up to the production webserver, so we have practically zero dynamic content on the website.

skuba

5:22 pm on Feb 5, 2004 (gmt 0)

10+ Year Member Top Contributors Of The Month



LedFish, I am not completely sure that it was because you guys removed idcat from the URL that you started to get indexed. Google indexed your pages because you used URL rewriting to make your URLs look static.
I don't think it was because you removed idcat. The word id here is no different from words like steroid, idiot, etc... I think that ID as part of a word is not a problem.
My question here is this: in my case id is connected by an underscore, as in cat_id, dept_id. So I need to know if Google sees that as two separate words or just one. If it's just one, maybe it's not a problem.

Thanks

millie

5:38 pm on Feb 5, 2004 (gmt 0)

10+ Year Member



Does this help?

[webmasterworld.com...]

skuba

6:35 pm on Feb 5, 2004 (gmt 0)

10+ Year Member Top Contributors Of The Month



I still think that the word ID could be seen as a separate word even with an underscore, because Google is looking for things like sessionid, session_id....