|Does that mean I should avoid rewriting dynamic URLs at all? |
That's our recommendation, unless your rewrites are limited to removing unnecessary parameters, or you are very diligent in removing all parameters that could cause problems.
Dynamic URLs vs. static URLs
This coming from Google? Wow, I'm really confused. But I know I will never serve your bot or any other bot a query string like you are suggesting. What exactly did y'all do over there? What are we missing here? Forget about the search engines for a moment: what about all the other reasons we may want a static-looking URI over a dynamic one?
Are you really that confident in your crawler's capabilities? I'm not, and I believe I have the support of many others when saying that we will not change what we've been doing for years just because you think you can crawl dynamic query strings more efficiently than rewritten ones.
[edited by: Robert_Charlton at 4:48 am (utc) on Sep. 23, 2008]
[edit reason] spliced post onto thread [/edit]
I can add something I'm dealing with right now that Google can't handle at all: optional parameters.
Google will think the following are 3 separate pages (URLs illustrative):
example.com/page?id=5
example.com/page?id=5&c=1
example.com/page?id=5&c=2
Where "c=" might just be some flag being passed that doesn't change the displayed content whatsoever. The content can be identical on all 3 pages, but you'll get dupe warnings in Webmaster Central if you run into this situation, and a big old mess to clean up.
It would be simple enough for them to allow us to specify these optional parameters to be ignored, either in Webmaster Central or in a meta object in the page.
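Until they do, a site can defend itself by canonicalizing its own URLs before the variants ever get indexed. A minimal Python sketch, assuming a hypothetical ignorable flag named "c" (the parameter names here are for illustration only):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Query parameters that do not change the displayed content
# (hypothetical names, for illustration).
IGNORABLE = {"c", "ref"}

def canonicalize(url):
    """Drop ignorable parameters and sort the rest, so that all
    variants of the same page collapse to one canonical URL."""
    scheme, netloc, path, query, _ = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(query) if k not in IGNORABLE]
    return urlunsplit((scheme, netloc, path, urlencode(sorted(kept)), ""))
```

Compare the canonical form against the requested URL and 301 when they differ, and the three variants above stop competing with each other.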
I'm going to preface this post with a few of the statements that are currently listed in Google's Webmaster Guidelines...
|Make a site with a clear hierarchy and text links. Every page should be reachable from at least one static text link. |
|If you decide to use dynamic pages (i.e., the URL contains a "?" character), be aware that not every search engine spider crawls dynamic pages as well as static pages. It helps to keep the parameters short and the number of them few. |
|Allow search bots to crawl your sites without session IDs or arguments that track their path through the site. These techniques are useful for tracking individual user behavior, but the access pattern of bots is entirely different. Using these techniques may result in incomplete indexing of your site, as bots may not be able to eliminate URLs that look different but actually point to the same page. |
That Google blog post sounds as if "they" are the only crawler out there. I have a great deal of respect for Google as a company and a brand, but this blog post crosses the line. So, you want us to just leave our sites as they may end up right out of the box? That is the most absurd post I've seen from Google ever, seriously. I'm actually at a loss for words and need to get back to work. This one isn't even worth the discussion, really.
|And always ranked well in Google without rewrite rules so some of this is old hat to a few of us but news to the rewrite rule junkies. |
Okay, that's one. Any others?
What is Yahoo!, Live and Ask.com's stance on this?
|If you are not able to figure out which parameters to remove, we'd advise you to serve us all the parameters in your dynamic URL and our system will figure out which ones do not matter. |
I'm looking at that Google blog article, and I think it's aimed at people who don't know anything. I'm wondering why they think anyone who knew what he was doing would take the trouble to use rewrites to create user-friendly urls and serve them up with session IDs.
Is this a hint that keyword rich URLs may not work (rank) so well in a future algo? I see this as a hint by the big G to SEO'ers out there... maybe it's just me...
>>Is this a hint that keyword rich URLs may not work (rank) so well in a future algo?<<
Keyword-rich URLs *might* be worth something, but if so it's too minuscule to fret over - with Google, anyway.
>>I see this as a hint by the big G to SEO'ers out there...<<
Swanny, I don't think it's aimed at SEO'ers at all; I think some G folks must have had a meeting and decided to make a well-intentioned goodwill gesture (which I really believe it is) to help out the mom 'n pops who really don't know anything beyond using stuff off the shelf and ready-made templates or software like FP.
There really are people who have been running web businesses for years who don't know what HTML and CSS are, much less anything about mod_rewrite - for real. Not only that, I've seen a web designer who does a LOT of business with 6 different URLs for the same page on their own site - yes, 6 - using different domains and without a clue about 301s.
I like the fact that their blog uses rewritten URLs.
It is dangerous, as it is often done badly, exposing as much Duplicate Content as it hides.
Additionally, if you don't redirect all the dynamic URLs over to the static-looking format, you increase the problem even more.
One thing that Google does is strip back longer URLs to see what a shorter URL will return.
That is, if they find a URL like somesite.com/folder/file.html they will strip it back to see what somesite.com/folder/ returns. For a rewritten URL, it might return nothing or some random Duplicate Content, whereas on a normal site that URL would have returned the index page from the folder or a bare file listing.
Another thing that badly implemented URL rewriting does, is to wrongly implement what happens when a URL does not exist.
The forum/blog/CMS returns an error message for the user to read, but does so with a "200 OK" HTTP status rather than a proper "404" in the HTTP header. That confuses the heck out of bots, as the site serves infinite Duplicate Content. Google now probes every site with a random filename like /noexist7382062706343724085264362790.html to see what is returned and to verify the 404 handling.
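The underlying fix is straightforward: when the rewritten path resolves to nothing, the handler must send a real 404 status even if the body is a friendly error page. A minimal sketch in Python (the content store is hypothetical):

```python
# Hypothetical content store: rewritten path -> page body.
PAGES = {"/widgets/blue-widget.html": "<h1>Blue Widget</h1>"}

def handle(path):
    """Return (status, body). Unknown paths get a genuine 404 status,
    so bots can tell error pages from real content."""
    if path in PAGES:
        return 200, PAGES[path]
    return 404, "<h1>Page not found</h1>"
```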
Badly done rewriting can also see a request for /robots.txt ending up being handed to the CMS and some random template-driven page being returned, which must upset their bot greatly.
And then there is the somesite.com/blog/post-124532-insert-any-words-you-like-here-it-does-not-matter-what-they-are-at-all.html problem...
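That last problem comes from rewrite rules that key only on the post ID and ignore the slug, so any words after the ID serve the same content at infinitely many URLs. A stricter approach checks the slug against the canonical one and redirects when it differs; sketched here with Python's re module (the URL scheme and lookup table are hypothetical):

```python
import re

# Hypothetical lookup: post ID -> the one true slug.
CANONICAL_SLUG = {"124532": "real-post-title"}

PATTERN = re.compile(r"^/blog/post-(\d+)-([a-z0-9-]+)\.html$")

def resolve(path):
    """Return ('ok', path) for the canonical URL, ('redirect', canonical)
    for a junk slug, or ('404', None) for an unknown post."""
    m = PATTERN.match(path)
    if not m or m.group(1) not in CANONICAL_SLUG:
        return ("404", None)
    canonical = "/blog/post-%s-%s.html" % (m.group(1), CANONICAL_SLUG[m.group(1)])
    if path == canonical:
        return ("ok", path)
    return ("redirect", canonical)
```

With a loose pattern that never compares the slug, every made-up slug returns 200 and the site leaks unlimited duplicate URLs; with the check, they all 301 to one address.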
Too many people rewrite URLs these days, and it makes it hard for SEs to assign priorities. It is particularly problematic with session IDs, which might not be removed if the URL does not look like it has any query string parameters.
It's shocking that the URI spec did not provide for a single session ID variable; every programmer out there seems to want to invent their own name.
Now that would have been a good idea: the session ID should always be called sessionid and the value should be exactly X or Y or Z characters long.
[edited by: g1smd at 11:14 am (utc) on Sep. 23, 2008]
As long as the name is the same, the size of the session variable would be irrelevant. You can bet that all decent SEs recognise most common session IDs, yet some sites continue to use totally new names for no reason at all: zero benefit to the site in this case; it will only hurt it.
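In practice that means everyone, crawlers and site owners alike, ends up maintaining lists of the common names. A rough Python sketch of that approach (the name list is illustrative, not exhaustive):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Common session-ID parameter names seen in the wild
# (illustrative, not exhaustive).
SESSION_PARAMS = {"sessionid", "sid", "phpsessid", "jsessionid", "oscsid"}

def strip_session_id(url):
    """Remove any query parameter whose name looks like a session ID."""
    scheme, netloc, path, query, frag = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(query) if k.lower() not in SESSION_PARAMS]
    return urlunsplit((scheme, netloc, path, urlencode(kept), frag))
```

Any site using a novel name falls straight through a list like this, which is exactly the "zero benefit, only hurt" case described above.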
Okay at this point I'm pretty confused with what to do. Yes, I have a dynamic site that re-writes and while an important reason (I thought!) was for rankings it also serves as a quick way for visitors to understand what a page is about. The links on my site are often passed along in print form (e.g., in a book footnote or reference area) and it just makes much more sense to use URLs that are understandable.
I guess what I'm most confused about is what problems re-writes can cause for a SE. Can someone summarize these?
Also, since my site has been using re-writes for a long time and has many, many pages, wouldn't I have to redirect all these SEF URLs to the dynamic address in order to not lose in rankings?
confused. Result: sticking with rewrite.
The blog post is both confused and confusing. If Google wants to make the claim that they can now handle dynamic URLs as well as they handle static URLs, that's fine. It may or may not be true, but they're free to make the claim. But telling us that we now should avoid rewriting URLs, just because they think they have the dynamic URL issues licked, is disingenuous.
Google put a lot of shoulds into that post, without giving any solid reasons (other than "it's hard to do").
I use rewritten URLs primarily because they're shorter and more user-friendly:
Every day I see people's eyes glaze over and their brain cells turn to mush when they try to decipher all those & and = parameters.
I will continue to use rewrites to create static-looking URLs for my users. I thought Google's point all along was to create sites for users. Now they're telling us what we should and shouldn't do for GoogleBot regardless of whether it's better or worse for our users.
The blog post also ignores the existence of other search engines. Even if Google truly does handle dynamic URLs just fine, other search engines do exist, and static URLs may perform better with other SEs.
Sorry, this doesn't pass the smell test.
This is not new news. Google reps have long been saying you don't need to rewrite your urls, just be smart about it. If a page needs 4-5 parameters, maybe that url needs its own new page? That has been our design philosophy.
My advice to folks thinking of changing back... don't. Google is pretty much saying what we already knew all along... that the content of your page is king.
As long as you have unique urls for each unique page of content you are fine.
Until Google is the ONLY search engine, keep using rewrite.
The others are not so intelligent.
There's also a subtle human friendly element to rewrite in some cases.
[edited by: amznVibe at 12:36 pm (utc) on Sep. 23, 2008]
If rewritten correctly the URL will be so much more user friendly as well, which is as important.
|This is not new news. Google reps have long been saying you don't need to rewrite your urls |
But maximillianos, it's new that Google is saying you should not rewrite. There's a pretty big difference between "you don't need to" and "don't do it."
I hope this doesn't turn into a kudzu monster like nofollow, which went from "use it for user-generated content that you don't trust" to "you should use it for all paid links or we'll punish you."
|But maximillianos, it's new that Google is saying you should not rewrite. There's a pretty big difference between "you don't need to" and "don't do it." |
If you read the article, they say you can still do it if you like, but just do it for the right reasons.
|I hope this doesn't turn into a kudzu monster like nofollow |
|, as a matter of fact, we at Google have made some progress in both areas. |
I heard a rumour saying that there still are some other search engines out there, hiding somewhere in the mountains. Let's all change our websites so those outlaws can't read our pages any longer.
We have always done very well with rewritten URLs and can't see that ever changing. I am with Robert Charlton on this one: I think it is a warning to those who really don't understand that, if it's not done right, it can cause duplicate issues.
With Yahoo and MSN it is much harder to get the rewrite correct and keep them from indexing both the dynamic URL and the rewritten URL, and I feel this is one place where so many sites are having real problems in Yahoo and/or MSN.
I feel this is a good warning to those who are just getting started: better to leave the URL as-is than to try the rewrite and develop problems down the road.
Having had sites in the index for years with rewritten URLs, there is no going back to the dynamic URLs without a tanking in all the SEs.
I love how Google thinks they're the only search engine and that the only reason we rewrite URLs is to please them.
Rewriting is a powertool, and it's a really great skill to have, and an excellent technique to employ. But it's SO EASY to do it badly! I wrote an article recently explaining how trivial it is to create loose-ended rewriting patterns that in essence allow an infinity of non-canonical URLs to show the same content.
Bad rewriting isn't just an epidemic problem. It's the norm. My cynical guess is that 75 - 80% of sites (and platforms) employing URL rewriting (or its runty cousin, redirection) do it badly. It's like the entire internet was given a pair of scissors, and now everyone has a mullet.
I interpret Google's stance as being helpful advice, not misleading or obtuse. They are saying, metaphorically, "Please put down the chainsaw. We'd rather receive uncut wood than watch you sever a limb with a tool you're not using properly."
They're also clearly stating that they understand how querystrings are used: that variables in the querystring are often negligible state identifiers, and that not all are keys that unlock new content.
Google HappyRank Points will be awarded to those SEO's who understand the difference between parameters that should be rewritten to appear static-looking before the "?" and those that should be kept as querystring pairs after the "?".
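One concrete way to draw that line: only parameters that select distinct content get rewritten into the path, while state and tracking parameters stay after the "?". A hypothetical URL-building sketch in Python (the parameter scheme is invented for illustration):

```python
from urllib.parse import urlencode

# Parameters that select distinct content go into the path;
# everything else (state, tracking) stays as a querystring.
# Hypothetical scheme for illustration.
CONTENT_PARAMS = ("category", "item")

def build_url(params):
    """Rewrite content-selecting parameters into a static-looking path
    and keep the remaining parameters after the '?'."""
    path = "/" + "/".join(params[k] for k in CONTENT_PARAMS if k in params)
    rest = {k: v for k, v in params.items() if k not in CONTENT_PARAMS}
    return path + ("?" + urlencode(rest) if rest else "")
```

The payoff is that a sort order or tracking flag never spawns a "new page" in the index, because it never appears in the static-looking part of the URL.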
How does Microsoft Live Search feel about rewritten URLs?
Well, I was at Web 2.0 Expo last week in New York where Nathan Buggia, Lead from Live Search's Webmaster Center, spoke on Advanced SEO for Developers. He stated outright that well-formed, static URLs with keywords can provide ranking benefit.
He said that endusers respond better to shorter URLs which contain keywords which reinforce the query that a user submits to a search engine. So, there's a definite usability/user-experience benefit to giving keyworded URLs some ranking weight.
Google's post mainly says "...don't rewrite your URLs because you might get it wrong... we want to know if a site is dynamic so we can treat those page URLs differently."
They do not mention how keywords in the URL may provide some ranking benefit -- which is sometimes harder to accomplish without some form of URL rewriting. As pageoneresults mentions above, this is somewhat counter to previous advice they've provided, even considering ongoing evolution of Google's ability to handle dynamic URLs successfully.
There are significant dangers with putting sessionids into parameters and even more with putting userids into parameters.
Firstly, just think about the problem of bookmarking: if someone bookmarks a page with sessionid parameters, many of these sites will send you a "session has expired" page when you return. Putting userids in URLs just asks for people to start nosing.
I agree that many sites implement rewriting badly - I've made one or two booboos in my time too, but done well it gives a URL which represents the content of the page, so that people reading the URL know what the page is about, and a page which can be bookmarked or linked to without problems.
[edited by: IanTurner at 2:40 pm (utc) on Sep. 23, 2008]
I think the entire point of the article is to avoid this problem:
You should never have a static-looking URL with a session ID, user ID, etc, in it. That's a problem that search engines can't effectively work around.
|There are significant dangers with putting sessionids into parameters and even more with putting userids into parameters. |
Absolutely. I should rephrase: you should never have a URL with a session ID or user ID in it. That's why we have cookies.
[edited by: mcavic at 2:51 pm (utc) on Sep. 23, 2008]
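For completeness, here's roughly what "that's why we have cookies" looks like on the wire: the session travels in a Set-Cookie header instead of in every URL the crawler sees. A minimal Python sketch:

```python
from http.cookies import SimpleCookie

def session_cookie_header(session_id):
    """Build a Set-Cookie header so the session ID travels in a
    cookie rather than in the URL."""
    cookie = SimpleCookie()
    cookie["sessionid"] = session_id
    cookie["sessionid"]["path"] = "/"
    cookie["sessionid"]["httponly"] = True
    return cookie.output(header="Set-Cookie:")
```

Bots that don't accept cookies simply see clean, session-free URLs, which is exactly what the Webmaster Guidelines quoted earlier ask for.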
|Okay at this point I'm pretty confused with what to do. Yes, I have a dynamic site that re-writes and while an important reason (I thought!) was for rankings it also serves as a quick way for visitors to understand what a page is about. The links on my site are often passed along in print form (e.g., in a book footnote or reference area) and it just makes much more sense to use URLs that are understandable |
Also, IMAGE matters... that's a huge variable when the masses are looking at 2 addresses: one with a rewrite that makes sense, and another that looks like hacker/spyware code! lol
Google admits as much when they say that the static-looking address "may have a slight advantage in click through rate".
slight my arse :)
|That's why we have cookies. |
Agree completely; that's the point to be stressed, re-written or not.
You know, I have a killer article to counter all the claims in that post from Google. Maybe I can reference some of the docs that are out there in relation to this.
An eye-tracking study of information usage in Web search:
Variations in target position and contextual snippet length
|Edward Cutrell and Zhiwei Guan from Microsoft Research have conducted an eyetracking study of search engine use (warning: PDF) that found that people spend 24% of their gaze time looking at the URLs in the search results. |
|We found that searchers are particularly interested in the URL when they are assessing the credibility of a destination. If the URL looks like garbage, people are less likely to click on that search hit. On the other hand, if the URL looks like the page will address the user's question, they are more likely to click. |
There is so much research out there to counter everything that post discusses, everything!
Was that some sort of linkbait for the week? Were you meeting a quota for posting information to the blog? Who thought this one up? And when will the official Webmaster Guidelines be updated to reflect these new suggestions that you are making? Inquiring minds (and investors) want to know.