Forum Moderators: Robert Charlton & goodroi
This was posted by a Google employee:
<i>I've been aching for a long time to mention somewhere official that sites shouldn't use "&id=" as a parameter if they want maximal Googlebot crawlage, for example. So many sites use "&id=" with session IDs that Googlebot usually avoids urls with that parameter.</i>
I've got a site with about 1,500 articles and all the URLs use "&id=9999" to identify the articles -- tons of them are in Google's index so I've never worried about the &id problem before. I had seen GoogleGuy post on it ages ago, and just let it slide because we were doing OK and because I figured eventually Google would move past that challenge, like all the other challenges they've nailed.
But in the last two big updates, including this recent one, we've seen traffic drop by 90%. In Allegra, our unique domain name was buried way deep below scrapers for a couple of months. In Bourbon, traffic is just way down and I haven't had time to figure out why.
Do you think I should change the id= parameter to article= or something like that? If so, is there any way to do it without completely losing all Google traffic for three months while Google figures out the change?
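If you do rename the parameter, the usual way to avoid losing your existing Google traffic is a permanent (301) redirect from every old URL to its new equivalent, so Google carries the old pages' standing over to the new addresses instead of treating them as brand-new pages. A minimal mod_rewrite sketch, assuming Apache with an .htaccess file, a hypothetical script named article.php, and a bare id= query string (all of these names are placeholders for your own setup):

```apache
# Hypothetical .htaccess sketch -- assumes mod_rewrite is enabled and
# that articles are served as /article.php?id=9999
RewriteEngine On

# If the query string is exactly "id=<digits>", capture the number...
RewriteCond %{QUERY_STRING} ^id=([0-9]+)$

# ...and issue a permanent redirect to the "article=" form of the URL.
# R=301 tells crawlers the move is permanent; L stops further rules.
RewriteRule ^article\.php$ /article.php?article=%1 [R=301,L]
```

If your URLs carry extra parameters besides id=, the condition would need to be loosened accordingly; this sketch only covers the simplest case.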
Any advice or thoughts would be appreciated. Thanks.

Again, my question boils down to this:
Does &id= only hurt your chances of being crawled, or can it also hurt your rankings and make you suffer for a month or two after every major update?
My site uses &id= and all the articles are in the index, so I'm wondering if it's worth a change.
The problem with using ID is that it gives Google another opportunity to add points to whatever similarity algo they use.
I'd say if your traffic is really low then it may be time to experiment; however, I would not change pages that have good inbound links.
Mod-rewrite is the ultimate answer, but I'd definitely avoid building a new site using ?id=. Better to use ?page= or something similar. If it's an existing site using ?id= and it's still nicely spidered, then it's probably not a worry.
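To sketch the mod-rewrite approach mentioned above: a rule can map a clean, parameter-free URL onto the real dynamic one internally, so visitors and Googlebot only ever see the static-looking form. This assumes Apache with mod_rewrite and a hypothetical article.php script; the path and parameter names are placeholders:

```apache
# Hypothetical .htaccess sketch -- mod_rewrite assumed enabled.
RewriteEngine On

# /articles/9999 is rewritten internally (no redirect, no [R] flag)
# to the real database-driven script. The address bar and the index
# only ever show the clean /articles/9999 form.
RewriteRule ^articles/([0-9]+)$ /article.php?article=$1 [L]
```

You'd then link to the /articles/9999 form everywhere on the site, so the dynamic URLs never get exposed to the crawler in the first place.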
You should not use &id= in your URLs.
My concern is that ANY parameter in the form of <some_thing>id=## may be getting nailed. For instance I use pages that have expertid=6 or answerid=12 or newsid=31, etc.
This is because I store much of my content in a database.
What I've noticed is that Google is not crawling some of my key pages that have such parameters in them. This has been going on for the past 3 or 4 months, I believe. Also, the pages in this form that Google has indexed now just show up as URL-only listings in the search results.
I don't think the problem is limited specifically to ?id=##.
It would be REALLY REALLY useful if Googleguy could clarify this one as the advice is not clear. In the worst case it almost makes me want to make all of these pages load as static html pages without any parameters. I don't want to have to do this unless I'm certain that this is the problem.
I have a bad feeling due to the way I see Googlebot avoiding pages like the ones I mentioned above.
It shouldn't be too difficult for Google to discern whether the parameter is a sessionid versus just an id that refers to an entry in a database driven site....
sr123
Based on what I've read, I assume there's no downside to going to static addresses without any parameters being explicitly passed.
I just hope that the change does more good than harm, and that it doesn't take forever to get reindexed.
Thanks to all of you for the advice/comments.
<i>What I've noticed is that Google is not crawling some of my key pages that have such parameters in them. This is for the past 3 or 4 months (I believe). Also, the pages in this form that Google has indexed now just show up as URL's in the search results.</i>
It is very clear in the Google webmaster guidelines that dynamic URLs take longer to crawl. This is because Googlebot limits itself to a specified number of pages on dynamic sites, to avoid overloading the server. A URL-only listing just means that the URL is known to Google but hasn't been fully crawled yet.