Forum Library, Charter, Moderators: Robert Charlton & aakk9999 & brotherhood of lan & goodroi

Google SEO News and Discussion Forum

This 59 message thread spans 2 pages (this is page 2).
Google Says Don't Rewrite Dynamic URLs
incrediBILL




msg:3750108
 2:53 am on Sep 23, 2008 (gmt 0)

Google's Official Blog posted a nice article about dynamic URLs and what works well with Google and what fails miserably.

[googlewebmastercentral.blogspot.com...]

Which can Googlebot read better, static or dynamic URLs?

We've come across many webmasters who, like our friend, believed that static or static-looking URLs were an advantage for indexing and ranking their sites. This is based on the presumption that search engines have issues with crawling and analyzing URLs that include session IDs or source trackers. However, as a matter of fact, we at Google have made some progress in both areas. While static URLs might have a slight advantage in terms of clickthrough rates because users can easily read the urls, the decision to use database-driven websites does not imply a significant disadvantage in terms of indexing and ranking. Providing search engines with dynamic URLs should be favored over hiding parameters to make them look static.

So everyone having problems ranking with their rewrite rules can now see clearly what to do to fix those problems.

As a matter of fact, many ecommerce sites have used dynamic URLs with parameters since the beginning of ecommerce, basically, and have always ranked well in Google without rewrite rules, so some of this is old hat to a few of us but news to the rewrite rule junkies.

 

claus




msg:3750509
 3:17 pm on Sep 23, 2008 (gmt 0)

There are many ways to interpret this set of messages, IMHO

(1) Bragging rights
"Our search engine does this, does yours?". Pure marketing: a message to the general public and the other SEs that Google is now capable of "something".

(2) Easing the life of Jane and Joe
There's been a lot of talk (writing) about all this htaccess voodoo during the past few years. A lot of people simply don't get it because, well.. it *is* complicated. So these people risk messing up big time if they try.

(3) Forgiving (Sloppy) code
Because people in general don't always write very "tight" code (both PHP and ASP can be very forgiving), the resulting htaccess rules will sometimes also be "open to interpretation", meaning that even if you've got rewrite rules in place, the same page can still be displayed using a number of different URLs.

(4) Work in progress
There's "no significant" difference. Translating to: "We don't really know for sure but we believe that it's working all right according to our internal tests" - the more not-rewritten URLs found on the web, the more data Google will have to determine if the logic is good enough (and improve it).

(5) SE competition
"Our engine can do this now, so just dump all that rewrite stuff. We at Google don't really care if the other engines don't get it, you know." Trying to get another competitive advantage; an advantage to Google corp., not necessarily to webmasters or end users.

(6) Web data
Google has this idea that people shouldn't try to "hide anything" from the SE, ever. Not just in order "not to game the system" but also because all the things people do to hide how their web sites really work is a barrier to Google in creating search logic. It makes things a little more complicated than they have to be.

... and there are more.


But, the single issue that strikes me more than anything about this statement is this:

Google's "centre of the universe" mindset

When you go to my web site looking for, say, "red widgets" then -- if I have optimized my URLs -- you may be able to find them at a URL looking like this:

example.com/widgets/red/

Then, the next time around you may be looking for blue widgets, and you may find them here:

example.com/widgets/blue/

As a person, you will learn that logic pattern very quickly. You will know what to expect, and what action to take to get to what you want.

How? Well, this is exactly the type of pattern recognition that the human brain is built for, that's why! We're all masters at discovering such stuff. Much more adept at this task than any single search engine will ever be.

Think about that for a moment. Who does this benefit? Who does it *not* benefit?

The answer: It benefits the visitor/customer searching for blue or red widgets. It's simply convenience: It makes the search stage shorter and it brings the fulfillment stage closer.

It also benefits you, as web site owner. How? It builds loyalty. People will want to go directly to *your* site because they know exactly how and where to find stuff.

Well, here's the shocker ...

This - web site loyalty - is directly against the interests of Google (!)

Google has absolutely no interest in seeing any people prefer your site over another. To Google all that matters is that the web at large is as difficult to navigate as possible. Why? Well because that's what search engines are for.

As a normal person you should not have loyalty to any one web site; except google.com.

If you want "red widgets" you should type it in the search box. Then Google will present you with a list of possible matches. That's how the web is supposed to work, if it suits Google.

But that isn't really in your best interest as a web site owner. And for a consumer, going *directly* to the "red widgets" that are there for sure, bypassing the "Google best guess" altogether, is also the easiest and the safest route.

So,

Q: Which party does this *really* benefit?

A: Google corporation.

httpwebwitch




msg:3750548
 4:04 pm on Sep 23, 2008 (gmt 0)

well said, claus!

Purée all those messages together, and it's stating "Don't use power tools to make your URLs better, because all your URLs are belong to us."

incrediBILL




msg:3750595
 5:10 pm on Sep 23, 2008 (gmt 0)

Nothing Google stated is particularly new so I don't see what all the paranoia's about.

I have sites that use parameters and sites with rewrites and they both rank quite well and always have, it was never an either/or, they always performed about the same.

The site that doesn't use rewrites and uses parameters is the oldest, and none of the SEs ever had trouble ranking it, nor did they have trouble ranking ecommerce sites with similar configurations.

I never switched a site using parameters to rewrite rules, regardless of all the SEO hype, because there was simply no need. People find the content in the SE, click on it, and bookmark it if they like it. Type-in traffic to a domain name is one thing, but very few people (if any) remember your page names as well.

Besides, if it ranks on the first page, why risk changing it?

Most of my competitors use dynamic URLs as well and rank like bandits, so if you can't rank on the first page, rewrite rules are probably not going to help you anyway.

FWIW, depending on how your session IDs are implemented, allowing them to be indexed can cause far more problems than Google thinking you have duplicate pages. If the session ID is still active when the URL shows up in Google's index, someone else clicking that link could end up in the same session and compromise the original session holder's security. Now imagine multiple people click the same link with the same session ID embedded in it: they all share the same session file on the server, and all hell breaks loose.

I've seen that happen on a couple of sites (not mine) before Google got better about sorting out session IDs from the URL.

Besides, session IDs are best stored in cookies, not URLs, and visitors that reject the cookie really didn't want to do business with you in the first place.

BradleyT




msg:3750596
 5:10 pm on Sep 23, 2008 (gmt 0)

For those saying that 80% of all sites do URL rewrites incorrectly, can you give an example?

httpwebwitch




msg:3750616
 5:34 pm on Sep 23, 2008 (gmt 0)

@Bradley,
how about:

http://www.amazon.com/I-can-type-anything-i-want-in-here/dp/0545010225/

http://www.chapters.indigo.ca/books/i-can-type-anything-i-want-in-here/9781551929767-item.html

[edited by: Robert_Charlton at 8:17 pm (utc) on Sep. 23, 2008]
[edit reason] de-activated urls [/edit]

brycen




msg:3750638
 6:05 pm on Sep 23, 2008 (gmt 0)

Session IDs are a problem even for static URLs. Servers like Tomcat put session IDs in the URL when cookies don't work. Then each load of the page can look like a totally new URL to the search engine:

/awards.do;jsessionid=68B86DFF8E4A8597B210531C3431965D
/awards.do;jsessionid=0621414681C92E1A00A9428A7800AC30

See:
[webmasterworld.com...]

Google has a few million (apparently static) pages that differ only by sessionid in the index.
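For illustration, a minimal sketch of how a site or crawler might collapse such URLs into one, assuming the only session markers are the three names mentioned in this thread (`jsessionid`, `PHPSESSID`, `mscssid`). The function name `strip_session_id` and the `example.com` URL are hypothetical, and a real deployment would need its own list of parameter names:

```python
import re
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Session parameter names taken from this thread; extend for your own stack.
SESSION_PARAMS = {"jsessionid", "phpsessid", "mscssid"}

def strip_session_id(url: str) -> str:
    """Return url without path-style (;jsessionid=...) or query-style session IDs."""
    parts = urlsplit(url)
    # Tomcat-style path parameter, e.g. /awards.do;jsessionid=ABC123
    path = re.sub(r";jsessionid=[^/?#;]*", "", parts.path, flags=re.IGNORECASE)
    # Query-style session parameter, e.g. ?id=3&PHPSESSID=abc
    query = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in SESSION_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, path, urlencode(query), parts.fragment))

# Both Tomcat URLs quoted above collapse to the same address.
first = strip_session_id("/awards.do;jsessionid=68B86DFF8E4A8597B210531C3431965D")
second = strip_session_id("/awards.do;jsessionid=0621414681C92E1A00A9428A7800AC30")
```

With the session ID removed, both requests map to `/awards.do`, which is the single URL you would want indexed.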

[edited by: Robert_Charlton at 8:23 pm (utc) on Sep. 23, 2008]

swa66




msg:3750656
 6:44 pm on Sep 23, 2008 (gmt 0)

Funny to see that GOOG asks us not to do URL rewriting (in general terms at least) on a page that is obviously using URL rewriting itself.

[googlewebmastercentral.blogspot.com...]

Understand who can.

Receptional Andy




msg:3750675
 7:08 pm on Sep 23, 2008 (gmt 0)

I can understand where Google are coming from - there are many very pointless implementations of so-called "friendly" URLs that merely remove the question-mark and replace & symbols with forward slashes. Which is clearly a waste of time for everybody.

But that said, the blog post should never have made it past quality control IMO, and it causes as many problems as it aims to solve. It's badly worded:

...rewriting the URL to www.example.com/article/bin/answer.foo?language=en&answer=3 probably would not cause any problems

Make your mind up eh?

Misinformed:

...you can search for static URLs on Google by typing filetype:htm in the search field

...which actually finds a significant number of rewritten URLs.

Scaremongering:

Hiding your parameters...could cause a loss of valuable information

And by issuing blanket statements, the post is totally contrary to all the good advice out there about planning an effective URL structure. It is no excuse that the post is aimed at (and seemingly written by, as the closing of the opening paragraph implies) novices.

Cool URIs don't change [w3.org], Google, remember?

And for tin-foil-hat wearers: you know how Google has been doing all of that spidering where they manipulate GET parameters? And you know how it's very difficult to URL-hack re-written URLs? ...

Gomvents




msg:3750756
 8:46 pm on Sep 23, 2008 (gmt 0)

Keep doing your rewrites! Do not listen to this nonsense please...

g1smd




msg:3750841
 10:35 pm on Sep 23, 2008 (gmt 0)

Most CMS, blog and forum software, using rewrites or not, has significant flaws. I am totally bamboozled as to how Google ever makes sense of some sites.

I mentioned this retailer about 4 years ago, and they still have the same system in place. A typical URL after drilling down the categories to a product (about 3 or 4 clicks) and then one more click to a "related" product is like this:

www5.domain.com/abc/X6.aspx?GrpTyp=SIZ&ItemID=120f551&RefPage=X6&deptID=53006&catID=53028&cmOrigID=149cc25&cmPosID=2&CmCatId=53006¦53007¦53022¦53028¦crosssell

I managed to get to some pages where the URL had over 20 parameters.

This URL delivers the same content:

http://www5.domain.com/abc/X6.aspx?GrpTyp=SIZ&ItemID=120f551

but is never generated by the site navigation system. It has six fewer parameters.
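A rough sketch of how a site could reduce every variant to that short form, assuming (as the example above suggests) that only `GrpTyp` and `ItemID` select the content. The function name `canonical_url` and the host `www5.example.com` are hypothetical stand-ins:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: these are the only parameters that change what the page shows.
CONTENT_PARAMS = ("GrpTyp", "ItemID")

def canonical_url(url: str) -> str:
    """Keep only the content-selecting parameters, always in the same order."""
    parts = urlsplit(url)
    params = dict(parse_qsl(parts.query, keep_blank_values=True))
    kept = [(name, params[name]) for name in CONTENT_PARAMS if name in params]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

long_form = (
    "http://www5.example.com/abc/X6.aspx?GrpTyp=SIZ&ItemID=120f551"
    "&RefPage=X6&deptID=53006&cmPosID=2"
)
short_form = canonical_url(long_form)
```

The result could then be used as the target of a 301 redirect, so that tracking and navigation parameters never produce a second indexable address for the same product.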

.

That URL was only a couple of clicks away from this mess:

http://www5.domain.com/abc/x2.aspx?DeptID=53006&CatID=53006&cmAMS_T=G1&cmAMS_C=D2B&mscssid=5272986ac488238f5628bf2434867307ikMoVNoCzaIpxPnVNeVzcGW486BB3F0D93789AB66A97350D4BF2C84175D4529217

with session ID tacked on the end.

[edited by: g1smd at 11:14 pm (utc) on Sep. 23, 2008]

skipfactor




msg:3750902
 1:02 am on Sep 24, 2008 (gmt 0)

sounds like "stats"/GA problems to me...

docbird




msg:3750975
 5:00 am on Sep 24, 2008 (gmt 0)

I had a forum on a CMS site which mostly used rewritten URLs. It took me a while to rewrite the URLs for the forum [[i.e., it took a while before someone came up with code I could readily use for this!]]

The forum threads ranked just fine in google.
I switched to rewritten URLs chiefly for other reasons, as cited here.

Even with dynamic URLs, the CMS I used (Mambo, then Joomla) was prone to producing duplicates.

Nuttakorn




msg:3750986
 6:09 am on Sep 24, 2008 (gmt 0)

I think the number of rewritten URLs in Google's index is huge; Amazon alone has millions of pages that would need to be reindexed. IMO, it will take some time to change this. Google would also need to recalculate PageRank for all of those pages if we permanently redirect from the rewritten URLs to the dynamic URLs.

Bewenched




msg:3752096
 6:05 pm on Sep 25, 2008 (gmt 0)

All the problems we had with the "Big Daddy" rollout stemmed from having dynamic urls so we changed to re-writes. The problems resolved themselves over about a year or so. I have no intention of changing them back... certainly not with the Christmas shopping season coming. That would be suicide.

g1smd




msg:3752099
 6:14 pm on Sep 25, 2008 (gmt 0)

*** All the problems we had with the "Big Daddy" rollout stemmed from having dynamic urls ***

Was it "dynamic URLs" or was it actually "duplicate content issues"?

webdude




msg:3752153
 7:09 pm on Sep 25, 2008 (gmt 0)

Hey folks, don't you see? There is no need to rewrite dynamic URLS... Google will do it for you!

[webmasterworld.com...]

g1smd




msg:3752154
 7:17 pm on Sep 25, 2008 (gmt 0)

Shut up already! :-)

[There ya go!]

[edited by: g1smd at 7:29 pm (utc) on Sep. 25, 2008]

webdude




msg:3752162
 7:25 pm on Sep 25, 2008 (gmt 0)

So... where's the happy, winking, smiley face?

Well, he ain't winking... but I'll take it.

[edited by: me a few minutes ago]

badbadmonkey




msg:3753669
 8:50 am on Sep 28, 2008 (gmt 0)

As long as the name is the same the size of session variable would be irrelevant - you can bet that all decent SEs recognise most common session IDs, however some sites continue to use totally new names for no reason at all - zero benefit to the site in this case, it will only hurt it.

I'll take that bet - the default PHP session-ID parameter, for when cookies aren't accepted, is PHPSESSID.

Google doesn't recognize it; I had dozens of duplicate listings before I cloaked them for Google's benefit.

[google.com...]

Great huh?

trillianjedi




msg:3753692
 9:40 am on Sep 28, 2008 (gmt 0)

Great post Claus.

What's good for Google is not necessarily what's good for the site owner. A healthy pinch of salt with Google's advice on URLs is probably a good thing.

Marcia




msg:3753693
 9:40 am on Sep 28, 2008 (gmt 0)

For anyone who's trying to use the W3 validator to discover serious HTML errors, there are dynamic URLs from some shopping carts that will absolutely choke the validator and render it useless. Example: MIVA Merchant shopping cart.

In addition, for anyone concerned with usability and page load time, any idea how much extra weight some dynamic URLs add to the page file size and impede usable download time?

I really believe that Google is trying to help Mom 'n Pop, but unfortunately engineers can't address the issues in a helpful enough manner. They'll have to get some Mom 'n Pop webmasters who have half a clue to address some issues in terms that the low end of the ecommerce food chain will be able to understand and make use of.

RonPK




msg:3753753
 1:36 pm on Sep 28, 2008 (gmt 0)

In addition, for anyone concerned with usability and page load time, any idea how much extra weight some dynamic URLs add to the page file size and impede usable download time?

A few bytes extra hardly matter. Parsing rewrite rules probably also takes additional milliseconds. Nothing to worry about, imho.

g1smd




msg:3753754
 1:42 pm on Sep 28, 2008 (gmt 0)

*** there are dynamic URLs from some shopping carts that will absolutely choke the validator and render it useless ***

Mostly that is simply because & should be encoded as &amp; instead.
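As a small illustration of that fix, assuming the cart's links are generated by a script: escaping the URL before writing it into an `href` attribute turns each bare `&` into `&amp;`. The helper name `href_value` and the sample URL are made up; `html.escape` is the standard-library call.

```python
from html import escape

def href_value(url: str) -> str:
    """Escape a URL for use inside an HTML attribute value."""
    return escape(url, quote=True)

raw = "/cart.php?product=123&color=red"
link = '<a href="' + href_value(raw) + '">Red widget</a>'
```

The browser decodes `&amp;` back to `&` when following the link, so the server still receives the original query string.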

Marcia




msg:3753756
 1:55 pm on Sep 28, 2008 (gmt 0)

In addition, for anyone concerned with usability and page load time, any idea how much extra weight some dynamic URLs add to the page file size and impede usable download time?

A few bytes extra hardly matter. Parsing rewrite rules probably also takes additional milliseconds. Nothing to worry about, imho.


Sorry hoss, but milliseconds isn't the same thing as mega-minutes load time that's about as slow as finding 6 different colors and sizes for 6 different size feet.

How about 6 minutes load time on dial-up? For real! Is anyone worried about that yet? Or do you consider that to be an acceptable load time?

[edited by: tedster at 5:18 pm (utc) on Sep. 28, 2008]
[edit reason] fix formatting [/edit]

RonPK




msg:3753806
 5:02 pm on Sep 28, 2008 (gmt 0)

Let's say a non-rewritten URL is 15 characters longer than the rewritten version, and that there are 100 such URLs on a page (the good old Google max). So the page with non-rewritten URLs is 1500 bytes larger. On a 56k dialup line the 1500 extra bytes will take about 0.25 seconds extra load time.

Perhaps I'm missing the point?
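The arithmetic above can be checked with a short sketch. It assumes one byte per character and ignores compression and protocol overhead; the function name and the 48 kbit/s "effective throughput" figure are illustrative assumptions, not numbers from the post.

```python
def extra_load_seconds(extra_chars_per_url: int, urls_per_page: int, bits_per_second: int) -> float:
    """Extra transfer time caused by longer URLs, assuming 1 byte per character."""
    extra_bytes = extra_chars_per_url * urls_per_page
    return extra_bytes * 8 / bits_per_second

# 15 extra characters on each of 100 URLs = 1500 extra bytes.
nominal = extra_load_seconds(15, 100, 56_000)    # full nominal 56k line rate
effective = extra_load_seconds(15, 100, 48_000)  # assumed real-world throughput
```

At the nominal 56 kbit/s this works out to roughly 0.21 seconds, and at an assumed 48 kbit/s to 0.25 seconds, which is in line with the estimate given.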

Small Website Guy




msg:3754429
 1:11 pm on Sep 29, 2008 (gmt 0)

This thread hasn't convinced me that it's not worth the small programming effort required to rewrite urls in ASP.NET to avoid QueryStrings on pages you want Google to index.

pageoneresults




msg:3754439
 1:29 pm on Sep 29, 2008 (gmt 0)

This thread hasn't convinced me that it's not worth the small programming effort required to rewrite urls in ASP.NET to avoid QueryStrings on pages you want Google to index.

I don't think it has convinced those of us who know better. :)

Now that I sit here and think about it, the sheer volume of misconfigured redirects out there has to be a challenge for the bots. I've seen them and they are not pretty. From multiple 302s, to 301 loops to improper server headers for 404 files. It would be interesting to know what percentage of the indices can be classified as content due to improper redirects?

Knowing what I know now, I surely wouldn't leave anything up to the bots to figure out, not if I can help it anyway. Rewriting will always be priority one when working with dynamic websites no matter what Google or any of the other SEs may say. From my perspective, it is not about them anymore, it's about the user, something G have always stated in their written information.

Those query laden URI strings look more like spoofs than anything else. I want to see a study that places a variety of URI strings in front of users and then has them rate and give their personal opinions on how they would react to them in various media. For example, how do you think a query laden URI string is going to be received via email? How do you instruct your link partners on how to link? What happens when marketing needs to share a link with someone via telecon? There are so many cons when working with dynamic URI strings out of the box.

Nah, I'm not convinced either. Not even remotely convinced. :)

ogletree




msg:3755524
 1:16 pm on Sep 30, 2008 (gmt 0)

I read the other day on the Google blog that we should not use url rewrites because Google has no problem with dynamic content. I logged into my Google Sitemap today and I have a big WARNING that says:
All the URLs in your Sitemap are marked as having dynamic content. Because dynamic content is difficult for search engines to crawl and index, this may impact your site's performance in search results. Check your Sitemap to make sure your site information is correct.

[edited by: Receptional_Andy at 4:38 pm (utc) on Sep. 30, 2008]
[edit reason] Moved from another location [/edit]

ogletree




msg:3755872
 7:15 pm on Sep 30, 2008 (gmt 0)

I think y'all are missing the point. They are specifically talking to those who do not know better. Most people who implement URL rewrites have no clue what they are doing.

Google has a real problem with ignorant SEOs implementing static URLs when they don't know what they are doing. Most of these sites already have a big problem with their current URLs. These SEOs are adding a new problem by adding static URLs. SEOs need to fix their current URLs before they start using static ones. You should never implement static URLs unless you eliminate the dynamic URLs. When you build a site you should be in charge of your URLs. Every single way to bring up a page should be known and controlled. If you have page.php?var1=something&var2=something2 and you change it to page/var1/something/var2/something2.htm, you need to make sure the first URL no longer returns a 200 code. You should set up your site to make it 301 to the new URL. Also, you need to make sure that page.php?var2=something2&var1=something does not bring up the same page. Same goes for the static page. Google has a way of finding URLs you never thought of, so enforce your URLs. If you have a lot of programmers working on the same site, they tend to just do whatever they want without regard to SEO.

An SEO should set up rules for the website's code and make sure they are enforced. Companies need to set up a naming policy and enforce it. Case can also be an issue on IIS servers, where upper and lower case can bring up the same page. Make sure you always use lower case.
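The rules above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function name `resolve` is hypothetical, the page and parameter names come from the example in this post, and redirecting a reordered query string to the canonical address is just one way of making sure it "does not bring up the same page". It also lowercases parameter values, which is only safe if your values are case-insensitive.

```python
from urllib.parse import urlsplit, parse_qsl, quote

# Parameter names and their fixed order, taken from the example above.
PARAM_ORDER = ("var1", "var2")

def resolve(request_url: str):
    """Return (status, location); location is None when no redirect is needed."""
    parts = urlsplit(request_url)
    path = parts.path
    if path.lower() == "/page.php":
        params = dict(parse_qsl(parts.query))
        if set(params) != set(PARAM_ORDER):
            return 404, None  # unknown or missing parameters: not a valid page
        segments = "/".join(
            name + "/" + quote(params[name], safe="") for name in PARAM_ORDER
        )
        # Old dynamic URL, in any parameter order: 301 to the one static URL.
        return 301, ("/page/" + segments + ".htm").lower()
    if path != path.lower():
        # Mixed-case request: 301 to the lowercase form.
        return 301, path.lower()
    return 200, None

static_url = "/page/var1/something/var2/something2.htm"
in_order = resolve("/page.php?var1=something&var2=something2")
reordered = resolve("/page.php?var2=something2&var1=something")
```

Only the lowercase static URL answers with a 200; every other known way of reaching the page is redirected to it, and anything unrecognized gets a 404.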

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved