Given that the pages show PR=0, does this mean they are not part of the index?
A page with PR=0 is not necessarily unindexed. At the moment, PR0 can mean:
1. PageRank is below PR1
2. not indexed
3. penalized
That last one is unlikely for just a few new pages in a site. What do you see if you try to view the cache (from the toolbar when the PR0 page is open, or with a search for 'cache:www.mydomain.com/subpage123.htm')? If you see a cache, then you know the page is in the index. If not, Google could still have it in the index (e.g. because a meta tag prevents search engines from caching it). Try a search on your site for a word or quoted string that appears on that page, like:
site:www.mydomain.com "new product, blue widgets"
On the home page of the site in your profile, I saw a link to a page with a URL like 'www.mydomain.com/page.asp?id=3'. Google doesn't like parameters named 'id' because they could indicate a session ID (although in this case the page is indexed and has a PR3).
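If you want to scan a list of your own URLs for this, here is a minimal Python sketch. The function name and the extra parameter names 'sid' and 'sessionid' are my assumptions for illustration; only 'id' comes from the discussion above, and nothing here reflects how Google actually decides.

```python
from urllib.parse import urlsplit, parse_qs

# Only "id" is mentioned in this thread; "sid" and "sessionid" are
# assumed extras for illustration.
SUSPECT_PARAMS = {"id", "sid", "sessionid"}

def suspect_params(url):
    """Return the sorted query parameter names in url that look like session IDs."""
    # urlsplit only finds the host and query reliably when a scheme is present.
    if "://" not in url:
        url = "http://" + url
    query = urlsplit(url).query
    names = parse_qs(query, keep_blank_values=True).keys()
    return sorted(n for n in names if n.lower() in SUSPECT_PARAMS)

print(suspect_params("www.mydomain.com/page.asp?id=3"))
print(suspect_params("www.mydomain.com/subpage123.htm"))
```

The first call reports the 'id' parameter and the second reports nothing, so any URL that shows up in the first group is a candidate for renaming the parameter.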
Another reason for not indexing can be duplicate content. Are the pages with PR0 almost identical to other pages (e.g. slightly different product name but same description and price)?
It gets very confusing when we don't fully understand what direction Google is taking. My guess is we just chose a very bad time to make important site changes. Let's just hope for speedy recoveries.
Mack.
[edited by: mack at 2:35 am (utc) on July 4, 2003]
Other pages on an old site that were moved with a 301 redirect are ranking but not showing PR yet. This isn't unusual now.
There were some threads about the id= parameter a few months ago. As long as there's that history, I'd change the id to something else or get rid of it, unless you can find some hard evidence that it causes no problems. Why invite potential problems if it isn't necessary?
It appears that Google is fluxing with two distinct data sets.
One is more current and up to date, the other is just a sad case of flux probably brought on by them trying to meld the fresh/deep bot into the mix and update on the fly.
For some reason, it appears as if the two are trying to be blended together. So far this has been unsuccessful, since whenever they bring in the bad set, it really screws the results up.
I think until they are done doing whatever it is they are doing, getting a new site listed is going to be tough.
I use this search that Googleguy mentioned in another thread to confirm what Google has spidered:
site:mydomain.org -qwerrew
Helpful, if discouraging. I am absolutely at a loss as to what I can do to get the bots to index the rest of the site. Sites that I've done previously are getting crawled, but on the new one the bot stops by, requests the robots.txt file and then moves on.
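When the bot fetches robots.txt and then leaves, one thing worth ruling out is that the file itself blocks it. A rough Python sketch using the standard library's robots.txt parser follows; the sample file contents and the page URLs are made up for illustration, so substitute your own.

```python
from urllib.robotparser import RobotFileParser

def allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Hypothetical robots.txt content, for illustration only.
SAMPLE = """\
User-agent: *
Disallow: /private/
"""

print(allowed(SAMPLE, "Googlebot", "http://www.mydomain.org/index.htm"))
print(allowed(SAMPLE, "Googlebot", "http://www.mydomain.org/private/page.htm"))
```

With this sample the index page is allowed and the page under /private/ is not. If your real file allows everything you care about, then robots.txt is not what is stopping the crawl.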
I need to build pagerank (backlinks) but my two best links are from sites that use jump menus (which Google ignores). I've gone with an (almost) flat file site layout and use a site map. ("Here little spidey spider"...(;-})
Google has set up some criteria that combines newness with low page rank and ignores anything except the index page. Oh, well....
Interesting, they caught my meta description for the first line of the SERP. On all of my other sites the DMOZ description gets used.
Of course, I've been working that as hard as I can. My two highest-PR linking sites use "jump menus", so Google ignores them. I'm trying to get the state site to drop its jump menu in favor of simple HTML with text. I have also been trying to get all county sites in the state to have mutual link pages (called something other than "links", hopefully...(;-}). It seems like a reasonable and straightforward plan to me, but it might backfire. (There are twenty county sites.)
But I think that the more specific point (at least for nonprofit sites) is that Google has instituted a "new site with modest PR gets index page listing only" policy, and we can only assume that the usual strategies will eventually allow us to break through.
Makes one tempted to start hitting guestbooks (shudder)....
I used it on my domain without the www and found 906
I used it with the www. and found 1120
I'm not sure what this means, or how far back they count pages as being spidered. The pages for the most part are way old, with old titles and tags.
Is there a code to check for, say, this past month's spider visits?
Thanks!
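Whether a search operator can answer that isn't settled in this thread, but your own server access log can. Below is a minimal Python sketch that assumes an Apache-style "combined" log format and a user agent string containing "Googlebot". The sample log lines, IP addresses, and function name are invented for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumes the Apache "combined" log format; adjust the pattern for other formats.
LINE_RE = re.compile(
    r'\[(?P<ts>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+)[^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines, now, days=30):
    """Return (timestamp, path) pairs for Googlebot requests in the last `days` days."""
    cutoff = now - timedelta(days=days)
    hits = []
    for line in lines:
        m = LINE_RE.search(line)
        if not m or "googlebot" not in m.group("agent").lower():
            continue
        # Month abbreviations are parsed with the current locale (English by default).
        when = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S %z")
        if when >= cutoff:
            hits.append((when, m.group("path")))
    return hits

# Made-up sample lines, for illustration only.
SAMPLE_LOG = [
    '192.0.2.10 - - [02/Jul/2003:10:15:32 +0000] "GET /robots.txt HTTP/1.0" 200 120 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.10 - - [15/May/2003:08:00:00 +0000] "GET /index.htm HTTP/1.0" 200 5120 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.55 - - [03/Jul/2003:09:00:00 +0000] "GET /index.htm HTTP/1.1" 200 5120 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

now = datetime(2003, 7, 4, tzinfo=timezone.utc)
for when, path in googlebot_hits(SAMPLE_LOG, now):
    print(when.date(), path)
```

On the sample data only the July request for /robots.txt is reported: the May visit is older than 30 days and the third line is an ordinary browser. Keep in mind that the user agent string can be faked, so treat the result as a rough indication.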