No, I think not. I use something like that on the site in my profile, and a lot of pages two levels deep have a PR4 like the homepage. I think the rule of PR-1 per directory level is only used by the toolbar to calculate a temporary PR if none is returned by the toolbar's query. When a page is indexed, its PR is calculated according to the usual rules, without regard to the number of slashes in the URL.
Right, I think the same. In the end, it's how the pages are linked within the site that matters in deciding PR. If the site architecture has a matrix structure (every page linked to every other page), then in all probability the sub-pages will have the same PR as the home page, or at most one less, no matter how many slashes are in the URL. Guess I'm right, am I?
> I think the rule of PR-1 per directory level is only used by the toolbar to calculate the temporary PR if none is returned by the toolbar's query.
Yes, and that temporary guess has nothing to do with real PageRank (other than being a reasonable predictor in typical sites).
McMohan, if all pages in the site link to all others, and if the incoming links to the site point at the home page, then all pages apart from the home page should have the same PR. If the site is large, then the internal pages could be even lower than one less than the home page.
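To make that concrete, here is a toy power-iteration PageRank (my own sketch, not Google's actual algorithm) showing that rank follows the link graph, not directory depth:

```python
# Toy power-iteration PageRank over a tiny site, illustrating that
# rank depends only on the link structure, not on URL depth.
# This is a simplified sketch, not Google's real implementation.

def pagerank(links, damping=0.85, iterations=50):
    """links: dict mapping page -> list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {}
        for p in pages:
            incoming = sum(rank[q] / len(links[q])
                           for q in pages if p in links[q])
            new[p] = (1 - damping) / n + damping * incoming
        rank = new
    return rank

# A shallow page and a "deep" URL, both linked the same way:
site = {
    "/": ["/about.html", "/a/b/c/deep.html"],
    "/about.html": ["/"],
    "/a/b/c/deep.html": ["/"],
}
pr = pagerank(site)
# /about.html and /a/b/c/deep.html end up with identical PR
# despite very different directory depth; the home page is higher
# because it receives both internal links.
```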
|Yes, and that temporary guess has nothing to do with real PageRank (other than being a reasonable predictor in typical sites). |
How do you know it is a temporary PR?? If the page is in the index, I would guess that the page has real PR, and not estimated PR.
So I ask again: if the inner page is not linked from any outside site and ends up in the index, is that page's rank reduced by 1 for each subdirectory it is away from the root?
So, if www.site.com has PR5:
www.site.com/index.asp has a PR4 (is in the Google index)
www.site.com/index.asp/id/1/page/2 has a PR1 or PR0 (also in the index)
>To end up in the index it must be linked from some internal page. It will then get its PR calculated based on the one internal link. Again without regard of the directory structure.
Not necessarily. Googlebot could find it if it had no internal link, but an external link.
Sorry rfgdxm1, but you have to read the quote and my reply together. I quoted PaulPaul, who wrote:
|If the inner page is not linked from any outside site and ends up in the index |
I replied to that:
To end up in the index it must be linked from some internal page. It will then get its PR calculated based on the one internal link. Again without regard of the directory structure.
From PaulPaul's assumptions that there is a) no link from outside and b) the page is in the index, I concluded that it must then be linked from some internal page.
I believe that's a perfectly valid conclusion.
The inner page is only linked from the main site, through a site map or inner link. Do these pages get a real PR or an estimated one? I would guess it is real, based on the PR of the main site.
Once the page makes it into the index it gets a real PR based on the page that links to it. If it is linked from the main page then the PR will be calculated based on the main page's PR.
Here are a few things that webmasters should know:
- PageRank is on a page-by-page basis, so the number of slashes doesn't matter
- Writing dynamic urls as if they were static used to be the "right way" to present dynamic urls, but that's changing, at least for Google. Google is getting better about crawling dynamic urls, and we'd prefer to see dynamic urls in all their glory instead of written as if they were static. You'll see an increasing number of hosts where we could crawl deeply into a site through all kinds of dynamic urls. Google was pretty much the first to crawl dynamic urls, and we want to do it right without causing webmasters to rework their site. Of course, if you have a url with 15 parameters and only two of them actually mean anything, it's always a good idea to shorten a url whenever possible by trimming out the unneeded parameters.
Changing dynamic urls to appear static will be less important over time as Google crawls dynamic urls better. We also estimate host load by taking into account whether a url is dynamic or not. Keeping dynamic urls written as dynamic will help us estimate the load for your server and keep a bot from hitting your server too often.
Hope that makes sense. It's a change from what we've been recommending to webmasters, so I wanted to give people a heads-up.
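GoogleGuy's point about trimming unneeded parameters can be sketched like this (a hypothetical example; the URL and parameter names are invented):

```python
# Sketch: strip query parameters that don't affect page content,
# keeping only the ones that matter. Parameter names are made up.
from urllib.parse import urlparse, parse_qsl, urlencode, urlunparse

def trim_params(url, keep):
    """Return url with only the query parameters named in `keep`."""
    parts = urlparse(url)
    params = [(k, v) for k, v in parse_qsl(parts.query) if k in keep]
    return urlunparse(parts._replace(query=urlencode(params)))

long_url = ("http://www.site.com/page.asp"
            "?id=7&page=2&sessid=abc123&tracker=xyz&theme=blue")
print(trim_params(long_url, keep={"id", "page"}))
# http://www.site.com/page.asp?id=7&page=2
```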
|Writing dynamic urls as if they were static used to be the "right way" to present dynamic urls, but that's changing, at least for Google. Google is getting better about crawling dynamic urls, and we'd prefer to see dynamic urls in all their glory instead of written as if they were static |
That is HUGELY important news! Boy, does it make things easier for my shopping cart software... Excellent... Excellent! :)
Googleguy, does "load time" have anything to do with the ranking algo? Just curious, hope I don't sidetrack the thread.
I used to have trouble getting dynamic pages in and was considering making them appear static. But instead I have made a "directory.shtml" page that is dynamically generated to create a list of links to every other dynamic page on the site, and this has done great things for me in terms of getting my dynamic content listed.
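That kind of generated directory page is simple to produce; a minimal sketch with made-up URLs:

```python
# Sketch: build a single "directory" page linking to every dynamic
# page on the site, so crawlers can find them all from one URL.
# The store.asp URLs here are hypothetical.

def build_directory_page(urls):
    links = "\n".join(f'<li><a href="{u}">{u}</a></li>' for u in urls)
    return f"<html><body><ul>\n{links}\n</ul></body></html>"

page = build_directory_page([
    "/store.asp?deptid=1",
    "/store.asp?deptid=2",
])
```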
Thanks for the reply GoogleGuy. It will help, as I hate using those rewrites :)
I have noticed dynamic URLs ranking fairly high since the last update. Still, I wonder if "static-looking" pages would rank a little higher.
I think I will try with some test pages, since PR is "Page-Based" right? :)
Thanks again for the info GG.
In the Static vs Dynamic debate, Google is just a tiny part of the bigger issue.
- Encourage people to link to you.
We pretty much know that people are less likely, by a wide margin, to link to a dynamic page. No one trusts dynamic urls to remain the same (the URL is UI).
- Allow proxy caches to cache you.
Many proxies will not cache dynamic urls. Thus, you reduce bandwidth to what is essentially static content anyway.
- Allow browsers to keep static caches.
Same is true for browsers.
- Slows down rogue bots.
If you have any sort of session id in your dynamic strings, you know rogue bots can spend hours upon hours toying with the urls. They can beat your site to death trying to download it.
I still think it is wise to switch your dynamic URLs to static anywhere and everywhere you can. A static-URL site will always work better for all concerned than a dynamic-URL based site - including Google.
GoogleGuy, we know Google is the single most important SE to optimize for these days, but it's not the only one: most other SEs still don't seem to like query strings very much, and they will stop crawling your site earlier if they come across dynamic URLs.
I'm glad to hear that Google is getting better and better at crawling weird URLs, because that will help improve the recall of its search results. But I'm afraid my appreciation for URL rewriting techniques will not be influenced by your preference to "see dynamic URLs in all their glory" --as long as you don't start penalizing webmasters for using "static-looking" URLs, of course!
I agree with Brett 100%.
What about this one: static URLs can be meaningful / intuitive.
Which one would you like more: DummyBikeShop.com/store.asp?deptid=4&brandid=23&modelid=19&action=view, or a readable static equivalent?
[added:] I also deeply respect and appreciate Google's concern about not overloading a server when crawling dynamic sites, but webmasters can always do server-side caching of their dynamic content, can't they? [added]
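On that server-side caching point, a minimal sketch (my own, with made-up names and paths) of caching dynamic output so repeated requests do no dynamic work:

```python
# Minimal sketch of server-side caching: render a dynamic page once,
# then serve the cached copy until it goes stale. Names/paths are
# hypothetical, not any particular shopping-cart package.
import os
import time

CACHE_DIR = "cache"
MAX_AGE = 3600  # seconds before a cached page is considered stale

def render_page(dept_id):
    # Stand-in for the real dynamic page generator.
    return f"<html><body>Department {dept_id}</body></html>"

def get_page(dept_id):
    os.makedirs(CACHE_DIR, exist_ok=True)
    path = os.path.join(CACHE_DIR, f"dept_{dept_id}.html")
    if os.path.exists(path) and time.time() - os.path.getmtime(path) < MAX_AGE:
        with open(path) as f:
            return f.read()          # cache hit: serve the static copy
    html = render_page(dept_id)      # cache miss: regenerate
    with open(path, "w") as f:
        f.write(html)
    return html
```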
One negative to consider is the additional overhead imposed by a plethora of rewrite rules used for the static-to-dynamic translation, assuming you use mod_rewrite to staticize the dynamic URLs.
Another is the decreased maintainability of the site, since you're now translating to static on the front end while the server translates back.
Get a rule wrong and your logs only show the URL that was GET'ed, leaving you scratching your head wondering what exact parameters your true .php or .pl or whatever script saw.
These won't apply in all cases and can usually be worked around easily, but they're just some things to weigh.
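For anyone weighing this, here is the kind of mod_rewrite rule being discussed (a sketch only; the paths, script name, and parameter names are made up to match the earlier DummyBikeShop example):

```apache
# Sketch: map a static-looking URL like /store/4/23/19.html onto the
# real dynamic script. Paths and parameter names are hypothetical.
RewriteEngine On
RewriteRule ^store/([0-9]+)/([0-9]+)/([0-9]+)\.html$ \
    /store.asp?deptid=$1&brandid=$2&modelid=$3 [L]
```

When a rule misbehaves, mod_rewrite's own rewrite logging is the usual way to see which parameters the script actually received.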
savvy1, I guess the first drawback can be easily overcome by intelligently caching content on the server side... There are many ways to do "URL rewriting"; one of those is to actually create static pages/directories (if the content update rate is low enough, of course): that way the server won't have to "translate" anything.
"Decreased maintainability": you're right, but hey!, nobody told us life would be easy on the Web. ;)
PS: Isn't this stuff getting a bit off-topic here?
Usability guru Jakob Nielsen wrote an Alertbox column about the importance of static, nice-looking URLs back in '99:
Yeah, but that was 1999 and he also said:
|It is likely that domain names only have 3-5 years left as a major way of finding sites on the Web |
|Keeping dynamic urls written as dynamic will help us to estimate the load for your server and keep a bot from hitting your server too often. |
This confirms what I've observed over the last 18 months as I converted 100,000 pages from dynamic to static (really static -- not just apparently static). Google crawled the static pages much faster, and got much deeper into the site before that month's crawl ended. My server load from Google's crawler went from "problematic" to "ho-hum/zero" even as Google collected pages at several per second -- much faster than before.
Also, since I converted to static, FAST, Inktomi, and Directhit have been crawling deep. I've seen specific evidence, from an engineer's statement at FAST and a statement from Inktomi, that while they may crawl a dynamic page, they are not inclined to follow the links on that page. My experience confirms this. AltaVista is also very quirky on dynamic pages -- they may crawl them wildly on occasion, but the pages never ended up in the index to any significant degree.
The non-Google crawlers are much less predictable than Google, and will crawl off of old doorway pages they found on your site six months earlier, but it's still possible that things can happen to make the other engines significant all of a sudden. This happened to me with Inktomi in early August, due to a change in emphasis at Inktomi that suddenly blossomed when their new update went live at MSN.
So the bottom line is, you're still better off with static pages even though Google is getting better at crawling. Don't keep all your eggs in one basket.
|Yeah, but that was 1999 and he also said: |
|It is likely that domain names only have 3-5 years left as a major way of finding sites on the Web |
And he was (somehow) right. A way of finding sites that is much more important than domain names today is ... Google!
These are all good points about static vs. dynamic URLs, and webmasters have to take all these factors into account when they decide on the layout and architecture of their site.
I think Google's goal is (and should be!) to improve our crawling without requiring webmasters to take extra steps or do more work. We are getting better about that with dynamic urls. But each webmaster has to decide what works best for their site.