In case no one else has noticed, you can get the same Rank type page for your current PageRank from iwebtools like this:
You can also strip out the &features=Rank parameter to get the full XML file:
I checked all 7 of my sites and all current page ranks match the <RK> tag in the XML file.
Further proof that <RK> really is Future/Effective Page Rank?
The only difference is the CH value. Did we ever determine what that was?
[edited by: jatar_k at 9:32 pm (utc) on Feb. 17, 2006]
[edit reason] removed specifics [/edit]
just one question: What is going on/wrong when I receive "Pagerank unknown" from the iweb tool?
Ok, I have tried to read through this the best I can. I am not nearly as gifted as many of you that have posted, but at a glance I managed to figure out how to look at the BD xml and compare it to a non BD ip. I would like clarification, guidance, or anything else that can clear this one small point up for me.
<R N="2" L="2" is really something that is baffling me.
On the non bd ip i see this
Looks like a winner to me. However, I get a completely different result when looking at BD.
This page does not, and has not ever existed on the domain. So why is it there?
To further add to the confusion, the "RK" for the non BD is showing the correct PR for the page that has been there the whole time, which has been a 5 for about a year, yet the BD "RK" is giving this page that does not and has not ever existed an RK of 3.
I am really hoping someone knows what I am saying because i am trying to understand what the differences are here and compare them to bd and non bd dc's.
*edit It makes me wonder if i should create the page that BD seems to think exists and see what happens in the serps.
Sorry for the extra post, but to further this along, am i correct to assume that R N="1" L="1 all the way to R N="10" L="10 in some fashion calculate the PR? Even if it is a rough calculation, having a dead link showing up in R N="1-10" L="1-10 could disrupt the way a web page's PR is calculated? Maybe even affect serps? The serps may be a stretch, but surely it would have to affect predicted, rough future PR?
also what about the discrepancy between the “RK” on the BD and non BD ip’s? If a BD ip address is giving a dead link an RK of 3 and non BD is giving it an RK of 5 would that not also indicate a slight “devaluation” overall?
|The only difference is the CH value. Did we ever determine what that was? |
|am i correct to assume that R N="1" L="1 all the way to R N="10" L="10 in some fashion calculate the PR? |
I suggest you read the entire thread again. Some of your questions have already been answered.
|That looks like the formula google usually uses, the index page is almost always 1 point higher than the internal pages on the same domain. Right? |
Yeah, sort of. The formula is not that simple but usually leads to the effect you observed.
Hanu, thanks. I have done so, but I still would like to know why i have a bd data center showing me a subpage on the domain that I am looking up that has never existed. I have read this a couple of times and I guess I am either thick in the head or stuck on that particular thing for some reason.:P Maybe you can point me to the post that has that info?
I didn't say all of your questions were already covered, I said some of them were. I doubt anyone can help you with the "page that never existed" problem without having access to the specifics. I have the feeling that G does not "invent" a URL out of the blue. It might be a bug in BD in which case I wouldn't spend any time on this issue. Have you tried looking at the cached copy of that page on the BD DC, i.e. "cache:http://domain.com/pagethatneverexisted.html"?
yes i did, i think i know what it is. The site was purchased back in 1999 and bd did not have a cache, but someone else did. Lo and behold, there are 160 links on big daddy to that page. More than likely the previous owner had that page. I went ahead and made the page yesterday, can't hurt to see what happens. Thanks again.
After posting the above, I did a bd search using the site operator for that page. It was not listed yesterday, however, it is this morning? Not only was the tbpr not the same, it was actually higher than the rk3. In fact it was twice that. It has a tbpr of 6. I think that the pr6 is from my home page in some weird way personally.
I have noticed in the XML generated using the tool that was posted previously along with the site:domain.com command there is a tag that reads
<M>240</M> which corresponds to the number of pages G shows.
Just thought I would let everyone know what that tag means.
Also, the number of pages that G shows on BD for our sites has gone down dramatically, and I know this has been discussed before, but what is everyone's take on this? Do you think they will go up once everything has settled down?
Hanu, either I'm silly or your code doesn't work in IE. It definitely generates wrong checksums and URLs. I replaced the pipe-characters as requested but I always receive forbidden. What exactly do the functions calculate (do you have any url at hand explaining them)?
Has anyone explained these tags yet:
<L TAG="link:" />
<C SZ="9k" CID="SkCXAwDSf9QJ" TAG="cache:" ENC="" />
<RT TAG="related:" />
SZ seems to mean size. What about that "related-"issue? I noticed that quite recently the results of related:www.mydomain.com, which normally revealed basically my dmoz-neighbours, have completely changed.
I also noticed
<CRAWLDATE>16. Febr. 2006</CRAWLDATE>
is that a general info or does it have to do with the google-sitemaps framework?
Diving deep down into the result-pages with start=>700 I found e.g. some very old google-groups-postings of mine archived. The RK-tag assigned a value of 4 to some of them, whereas the toolbar shows 0.
So unless we assume specific filters concerning these groups-archives I'd say it is sufficiently disproved that this value shows future TBPR. It is quite unlikely that such old pages would get a new value assigned the next time.
As far as I see, right now, a PR update has already started in two datacenters:
I think this is a good opportunity to compare xml results of these two DCs with other DCs to check if the <RK> values are related to future PR
I saw a change on some internal pages on the other two and this one as well from 0 to 3 which is what is expected
The tool predicted a 4
My home page did not change but the tool showed a 2 level increase for it from 4 to 6
The checksum code is only tested in FF. It's the same code included in G's toolbar for FF and it's used to compute the ch parameter, which is a checksum/hash of the q parameter itself. Google uses the hash to make it more difficult to programmatically query the PageRank using scripts or so. It was cracked several times in the past and can hardly be considered a secret anymore, considering that the source to the hash algo is publicly available in Google's toolbar for Firefox.
Thx, maybe I should install FF some day.
I also have to correct my above comment concerning TBPR. The RK-value shows the general PR of any of googles newsgroups, whereas any particular posting comprises a databasequery over the groups which is not indexed as an idempotent URL. So the value also seems to be fine in these entries.
The new PR matches my <RK> values as they were a week or so ago - the <RK> values have moved on since then for me - so it does indicate to me that those <rk> values are pretty on the mark - although the export was a couple of weeks ago.
|...so it does indicate to me that those <rk> values are pretty on the mark |
Despite the contrary opinions, I am almost confident that <rk> values largely reflect current PR (or future TB PR).
For example, <RK> for a newly built site (1 mo old) was 4, and the update shows a PR4.
My blog page had a PR3, <RK> = 5, and now the update shows a PR5.
However, there is one weird thing: my homepage had a PR4, <RK> shows 6, but the update shows PR5. But this might be another story, since it had a google directory PR5 while toolbar PR was 4.
It looks like the data for the newly displayed PR is a couple of weeks old though - so your homepage may already have moved on since then.
I don't know exactly what my <RK> values were at Jan 29th to Feb 4th - which appears to be the period the newly displayed toolbar PR is taken from.
This PR update is only on the non-BD dcs too - I would therefore assume that after BD roll out there may be another PR update.
> since it had a google directory PR5 while toolbar PR was 4.
is it really necessary to point out again that the directory-bar covers a scale from 1-7 instead of 1-10?
[edited by: Oliver_Henniges at 10:18 am (utc) on Feb. 19, 2006]
>>>>is it really necessary to point out again that the directory-bar covers a scale from 1-7 instead of 1-10?
It is 1-10! (Well technically 1-11 as Google is 11 in the Directory :))
Yahoo 9, yahooligans.yahoo.com 8, Alltheweb 8 etc
No, unless there were some very recent changes, this was established years ago. The green bar in the directory shows a scale from 1 to 7 and thus helps considerably to decide whether your TBPR is at the bottom or top of the discrete steps of the 10-point scale.
It is against the TOS to post URLs but you may google for chris raimondi, an author who is quite famous to have described this.
Oh - that old chestnut. On the whole I never look or think about Directory PR as it is not the best/most recent indicator - so sorry, I stand corrected :)
Directory PR update timescales are very independent of displayed TBPR - I have no idea when that was last updated - last summer?
I think for selomelo it might have been the case that this latest toolbar update is already 2 weeks out of date.
This toolbar update has followed the <rk> values for what I can remember when I checked around the beginning of Feb.
Going back to your earlier question regarding those tags:-
It is just how the serp page appears in the xml format - eg:-
<L TAG="link:" /> Displays the domain name - eg:www.example.com
<C SZ="13k" CID="8zdUBHMdJEoJ" TAG="cache:" /> Is the size followed by the cache link (the CID is the 12-character alphanumeric ID that Google has for every page; I forget what it is called).
<RT TAG="related:" /> Just refers to the Similar Pages link.
So the xml produces the last line as you would see in the normal serp display:-
www.example.com 20k Cached Similar Pages
The <CRAWLDATE>16. Febr. 2006</CRAWLDATE> is the fresh crawl date that appears next to pages sometimes.
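For anyone who wants to pull these fields out with a script instead of reading the XML by eye, here is a minimal sketch in Python. The tag names (M, R, U, RK, C, CRAWLDATE) are the ones reported in this thread, but the sample document, its values, and the way the tags nest (including the GSP, RES and HAS wrappers) are my assumptions for illustration, so adjust it to whatever the DC actually returns.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample. Only the tag names come from the thread;
# the nesting and all values are assumptions for illustration.
SAMPLE = """<GSP>
  <RES>
    <M>240</M>
    <R N="1" L="1">
      <U>http://www.example.com/</U>
      <RK>5</RK>
      <CRAWLDATE>16. Febr. 2006</CRAWLDATE>
      <HAS>
        <L TAG="link:" />
        <C SZ="13k" CID="8zdUBHMdJEoJ" TAG="cache:" />
        <RT TAG="related:" />
      </HAS>
    </R>
    <R N="2" L="2">
      <U>http://www.example.com/page.html</U>
      <RK>3</RK>
    </R>
  </RES>
</GSP>"""


def parse_results(xml_text):
    """Return (estimated page count from <M>, list of per-result dicts)."""
    root = ET.fromstring(xml_text)
    total = root.findtext(".//M")  # None if the tag is missing
    results = []
    for r in root.iter("R"):
        rk = r.findtext("RK")
        cache = r.find(".//C")  # the cache entry, wherever it is nested
        results.append({
            "n": int(r.get("N")),
            "url": r.findtext("U"),
            "rk": int(rk) if rk is not None else None,
            "size": cache.get("SZ") if cache is not None else None,
            "crawldate": r.findtext("CRAWLDATE"),
        })
    return (int(total) if total is not None else None), results


if __name__ == "__main__":
    total, results = parse_results(SAMPLE)
    print("pages reported:", total)
    for row in results:
        print(row)
```

Missing tags come back as None rather than raising, since not every result seems to carry a crawl date or cache entry.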
Very interesting, I am currently analyzing this. Kudos to Hanu!
Want to add something:
If you use the "Multiple DC Current PR Tool" from SEO logs you get the RK values of a domain on 48 DCs with 3 BD DCs listed first.
On one domain I have half those DCs as PR4 and other half PR7 (including the BD ones). About two months ago the actual PR for this site would be 4 (and also shows 4 in current non-updated-TBPR). This was before I started to get links.
Thus I believe that those DCs showing 4 do not yet have the new BD infrastructure exported to them and Google perhaps did not care to update the PR values the last 6-8 weeks. Can someone confirm this - that the DCs showing the (usually) lower or older value are the non-BD-updated ones?
There is also another new site that I started linking to recently, and some PR7 links were added just recently. The BD DC RK shows 7 for this other site and the non-BD DC RK shows 5.
So, I believe:
BD DC RK: 0-3 weeks old
non-BD DC RK: 5-10 weeks old (approximate)
I have checked the RK values of 30 of my sites and I actually believe that it is the real PR thing (0-3 weeks old). Also the BD DC RK value seems to be the best "PR prediction", as can be seen from 30+ forum posters on other forums.
Oh yes. Another swedish guy just recently made a FF plugin that shows the PageRank but from the RK value "Live PR". His tool sometimes shifts DC; I will contact him about this to get it fixed to use a BD DC.
I believe, hope I am right, that instead of 3-4 months we can see the PR of 0-3 weeks ago.
What's puzzling me is the fact that some tools (not the iweb one mentioned above) which claim to predict my future pagerank come to different values than the ones listed in the RK-value of the xml-file. Do these tools suck meanwhile? Has the update settled already?
> 1-10 vs. 1-7
I think Oliver is right. The directory's scale used to be 1-7 making it possible to interpolate between the toolbar PR and the directory PR and derive a more accurate reading of the internal PR. OTOH, who knows how up-to-date the directory PR is. The interpolation technique only makes sense if the directory PR and the toolbar PR are updated at the same time.
|Oh yes. Another swedish guy just recently made a FF plugin that shows the PageRank but from the RK value "Live PR". |
What's the name of that extension? I tried "Live PR" but couldn't find anything.
Thank you Oliver for correcting me:
|is it really necessary to point out again that the directory-bar covers a scale from 1-7 instead of 1-10? |
Meanwhile, I checked the Google directory more closely, and discovered a nice feature that shows, I guess, the PR value more accurately with 1/4 increments, and I just wanted to share what I saw:
The graphical PR indicator has a total width of 40 px, consisting of two components (pos.gif and neg.gif) and the PR is represented something like:
<img src="/images/pos.gif" width=22, ....neg.gif" width=18...
These are the values for my site, and I think it is equal to a PR of 5.5.
When I said that G directory displays a 5 while my TB PR was 4, I was referring to my visual observation (half green and half grey). Now the green bar seems wider. :)
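The arithmetic behind that 5.5 reading, as a small Python sketch. It assumes the green part of the 40 px bar grows linearly on a 0-10 scale, which is only a guess from the image widths and not anything Google documents; if the directory bar really runs 1-7, as argued earlier in the thread, the mapping would be different. The function name is made up for illustration.

```python
BAR_WIDTH_PX = 40  # pos.gif width + neg.gif width, as observed in the directory


def bar_to_pr(pos_width_px, scale_max=10):
    """Convert the green (pos.gif) width into a PR estimate.

    Assumption: the bar is linear from 0 to scale_max, so one pixel
    is worth scale_max / 40, i.e. 0.25 on a 10-point scale.
    """
    return pos_width_px * scale_max / BAR_WIDTH_PX


if __name__ == "__main__":
    print(bar_to_pr(22))  # width=22 from the example above
    print(bar_to_pr(1))   # value of a single pixel step
```

With 40 px over 10 points, each pixel is a quarter point, which is where the 1/4 increments come from.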
"What's the name of that extension? I tried "Live PR" but couldn't find anything."
Search for "Raketforskning", Swedish for Rocket Science :)
I told the guy that made this tool and he has now submitted an update (LivePR 0.9.3) to Mozilla for review. The change is to only get Bigdaddy DCs (220.127.116.11, 18.104.22.168 or 22.214.171.124).
He has also updated the "Live Pagerank" online tool on his site and it now displays the current TBPR and RK values next to each other on over 40 DCs, nice tool.
I spoke to the guy and asked him on what basis he can back up RK as being the live PR. He believes that the current PR value that is now being exported is old and that it is the RK that is the actual PR that Google uses.
I am myself not sure about this. A few times the RK seems to be too high ... I am not sure.
What are your ideas Hanu? Dayo_UK? Others?
Jim, I just read this on a website. I can't post the link, but can sticky it if you like:
"The actual value in the "Real PageRank?" column is the RK value on the XML page where U matches the URL (the first match with or without the www prefix). If there is no match, then "Unknown" is displayed."
Sorry, that didn't make much sense, but it related to two PR scores: the current one showing on the toolbar, and a Real PageRank, which hasn't been displayed on the toolbar yet. It states that the RK value is the latter.
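The matching rule in that quote can be sketched in a few lines of Python, which may also explain the "Pagerank unknown" result asked about earlier: no U in the XML matched the URL. This is not the tool's actual code. The function names are hypothetical, and the lowercasing of the host and the dropping of a trailing slash are my own assumptions; the quote only mentions the www prefix.

```python
from urllib.parse import urlsplit


def normalize(url):
    """Lowercase the host, drop a leading 'www.' and any trailing slash.

    Only the www handling is from the quoted description; the rest is assumed.
    Query strings are ignored in this sketch.
    """
    parts = urlsplit(url if "://" in url else "http://" + url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def real_pagerank(target_url, results):
    """results: (U, RK) pairs in the order they appear in the XML."""
    wanted = normalize(target_url)
    for u, rk in results:
        if normalize(u) == wanted:
            return rk  # first match wins, with or without www
    return "Unknown"


if __name__ == "__main__":
    # Hypothetical (U, RK) pairs for illustration only.
    sample = [
        ("http://example.com/other.html", "3"),
        ("http://www.example.com/", "5"),
    ]
    print(real_pagerank("example.com", sample))
    print(real_pagerank("http://www.example.com/missing.html", sample))
```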