I have 11 pages that were added at the end of July. The home page links to one of these, and this page links to the other 10. These ten link to one another and back to the level 1 page. There are links to all pages in a site map which is one level below the home page.
The page linked to from the home page shows as PR1 in the google bar, which is about what I would expect (PR1 or 2), but all the sub pages are greyed out and not listed in Google.
Any suggestions? Is there anything definitive available which describes how Google now works? Have searched, but haven't found anything.
The majority of my pages were indexed by Google when the old deepcrawl/monthly update procedure was standard. But of the pages added subsequently, only the level 1 pages have been indexed (toolbar PR1) and none of the level 2 pages (greyed out).
Does Google now have a PR or level cutoff point below which it does not index?
Google visits my site every week, and possibly more frequently, but my stats only report weekly figures. It typically takes over 30 pages. Also, pages that are already indexed are re-indexed within a few days of being changed.
Since adding the pages I mentioned, Googlebot has taken my index page, my site map, and the level 1 page of the group, all of which have links to the remaining pages - but Googlebot doesn't ask for these.
This is not an isolated incident. The 11 pages I mentioned were just an example. I have also re-instated a group of pages which were previously indexed but removed as they might have been considered contentious during the invasion of Iraq. Only the level 1 page of this group has been indexed, and the level 2 pages again ignored.
You say "passing on PR from a PR6 page is easier than from a PR1 page." I don't know what you mean by that, as I am not talking about pages that have a PR0, but pages that are greyed out and never indexed.
Yes, I would love to obtain more PR through links, but mine is not a commercial site and there are limited opportunities. To overcome this to some extent, my site is optimised so that incoming PR is directed away from the index page to the frontline troops. My index page is PR2, my level 1 pages (about 40) are PR2 or PR1, and almost all my indexed level 2 pages (about 150) have a PR1. (Toolbar approximations.)
What I am asking is: has there been a change in the way Google works? Does it ignore pages that are only linked from a PR1? In which case lots of little sites like mine are in dead schtuck.
Obviously I haven't made myself clear. Google visits me regularly and takes 30+ pages per week. I also have a PR5 back link plus a couple of PR3s. A large number of my pages have high search rankings in Google.
I need to know why certain specific pages are not taken since the change in Google working practices.
Deep spidering is not the issue. There are only three levels on my site. And the pages that Google is currently not taking are no deeper or different in type than the ones it already has taken using deepcrawl. So it would appear that something has changed in the way Google works since it dropped deepcrawl. Has Google introduced a cut off point for spidering based on PR? Is Google now limiting the number of pages it is taking into its database? Or has its spidering become less efficient?
Prior to the demise of deepcrawl Google took ALL my pages every month, but now it is being very picky about what it takes, and as I mentioned in an earlier message, has not taken some reinstated pages that it had accepted earlier.
The number of indexed pages we have is not decreasing and the PR0's are not dropped. More time may lead to deeper spidering, or perhaps on your homepage you could create a new section to "highlight" direct links to a few of the pages that you want to bring to Google's attention.
Check your logs to determine Google's path through your site and you may be able to determine a problem page or pages.
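For anyone who wants to try the log check, here is a minimal sketch in Python. It assumes the server writes the usual common/combined access log layout and that the bot identifies itself with "Googlebot" in the user-agent string; the function name `googlebot_path` is made up for illustration. It simply lists, in time order, which URLs the bot requested, so you can see where its path through the site stops.

```python
import re
from datetime import datetime

# Assumed layout: common/combined access log format. Adjust the pattern
# if your host writes logs differently.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)")?'
)


def googlebot_path(lines):
    """Return (timestamp, path, status) for each Googlebot request, oldest first."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue  # skip lines that do not fit the assumed format
        agent = match.group("agent") or ""
        if "googlebot" not in agent.lower():
            continue  # keep only requests whose user agent names Googlebot
        when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        hits.append((when, match.group("path"), int(match.group("status"))))
    hits.sort(key=lambda hit: hit[0])
    return hits
```

Feed it the lines of a raw log file (for example `googlebot_path(open("access.log"))`, where the file name is whatever your host provides) and compare the paths it returns with the pages you expected the bot to request.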
Get more links to your site and raise your PR. The higher your PR the more you will get crawled. And passing on PR from a PR6 page is easier than from a PR1 page.
Just get more links. To your index page, to your other pages. Anywhere!
Raise the PR of the pages by getting backlinks (from a high PR site), or by linking to them from an index page with a high PR.
Is there anyone in this world with facts (other than a very few at Google)? NO
Thanks for a couple of useful ideas.
HarryM could you add a few inbound links from pages outside of your domain to the inner pages that Google isn't indexing? That would be interesting to see.
Unfortunately this is a personal non-commercial site and it is extremely difficult to get incoming links. Would that I could! Also these particular pages are being hosted for an interest group and may be replaced.
>> Check your logs to determine Google's path through your site and you may be able to determine a problem page or pages
Will do at end of week. I don't get the logs normally, just stats, but can get them on demand.
Many thanks to the above. However I am less enamoured of the people who continue to say that I should throw PR at the problem. Perhaps they don't understand how insulting that sounds - sort of teaching a grandmother to suck eggs.
>> However I am less enamoured of the people who continue to say that I should throw PR at the problem. Perhaps they don't understand how insulting that sounds - sort of teaching a grandmother to suck eggs.
In fairness to those posters, they are actually correctly identifying the problem which does actually answer the question that you asked originally.
How deeply google crawls your site depends on PR.
So, logic dictates that the only way you can get those pages crawled is to link to them from the higher PR pages of your site. You say you have a PR5 and a few PR3 inbound links. If you have a decent linking structure you will have some PR4 pages in there.
Is there any way you can link from those, rather than the PR1 page you are trying to link from?
I would guess your site map is (or at least should be) a PR4. That would do the trick.
On one of our lower PR sites, we get all the deep pages crawled by just shoving them in the site map.
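To make the "link from your stronger pages" argument concrete, here is a rough sketch using the simplified PageRank iteration from the original published description (damping factor 0.85). It is only an illustration: the toy site below is an assumption loosely modelled on the structure described at the top of the thread, external inbound links are ignored, the scores are raw values rather than toolbar PR, and nothing here claims to be how Google actually decides what to crawl.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by repeated iteration over a dict of page -> outgoing links."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the same small base share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its damped rank equally among its outgoing links.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical site: home links to a hub page and a site map; the hub links to
# ten sub pages; the sub pages link to each other and back to the hub; the
# site map links to everything.
subs = [f"sub{i}" for i in range(1, 11)]
site = {
    "home": ["hub", "sitemap"],
    "hub": list(subs),
    "sitemap": ["home", "hub"] + subs,
}
for sub in subs:
    site[sub] = ["hub"] + [other for other in subs if other != sub]

ranks = pagerank(site)
for page in ["home", "sitemap", "hub", "sub1"]:
    print(f"{page:8s} {ranks[page]:.4f}")
```

Editing the `site` dictionary (for instance adding a sub page to the list of links from "home") and re-running shows how the internal linking changes the share each page receives.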
I think the point is that even for a personal website, a home page of PR2 is pretty ordinary. I have a personal website with some pages PR5 and the rest PR4.
Getting links isn't THAT hard, even for a personal website. Surely your pages are about something? There are tons of directories and web pages providing links to just about everything under the sun - even directories of personal websites. Try searching for "subject of page" and "add link" or "add url" or "submit site" and associated variations. Works for me.
>> So it would appear that something has changed in the way Google works since it dropped deepcrawl.
Yes, that's exactly it. The Deepbot got sacked. At the same time, the Freshbot got promoted to "Deepfreshbot". It's all in the threads and confirmed by a Google employee, eg in Msg #209 here: [webmasterworld.com...]
The name was coined in msg #43 here: [webmasterworld.com...]
>> Has Google introduced a cut off point for spidering based on PR?
The Deepbot was simply another bot than Freshbot - it crawled deep "by default". Freshbot does not crawl deep by default, but it has been promoted now, so it can crawl deep as well.
The deep pages that you got indexed by Deepbot will of course not disappear from the index as Google values index size, but pages that were not there after Deepbot got sacked will need to be relevant to the new bot in order to get spidered.
Please read posts #13 and #31 (first two lines, page three) from this thread carefully and make your own conclusions: [webmasterworld.com...]
There's another one here (#24):
-it might be hard to interpret it out of context, but note these statements: "I noticed a few pages indexed because it looked as though they have their own links" and "my main advice is still to get a few more links"
You'll find a few more recent posts by the same author emphasizing the value of inbound links if you search thoroughly. As links=PR (well, sort of) you get the advice that you need to increase PR. To do that, you need to get links.
>> The Deepbot was simply another bot than Freshbot - it crawled deep "by default". Freshbot does not crawl deep by default
That was exactly the sort of information I was looking for. I had searched WW World but hadn't come up with anything. I suspect it is my fault this thread has gone on for so long by not making my original question more specific.
It was suggested by trillianjedi
>> You say you have a PR5 and a few PR3 inbound links. If you have a decent linking structure you will have some PR4 pages in there.
I have a decent linking structure which admirably suited the situation before the demise of deepcrawl and gave me excellent page rankings. All pages were themed, no page was more than 2 levels deep, all pages were linked from sitemaps, no sitemap was over 100 links (as Googleguy suggested), and great care had been taken to optimise the PR at the deepest level - which is where it used to count.
Now that I know how things have changed I can alter my linking structure to push the PR back to the linking pages. No doubt Google will be doing something different in a few months' time and I will once again be trying to compensate. :)
I just checked google for my website. The last time I checked I had a #3 link when people searched for 'jobs for web designers'. Now I can't find my site anywhere... except if I search directly on the domain name, then mostly my sitemap shows up. Sigh. I never was all that good at the keyword/description/seo thing, but was doing ok.
I'm kind of bummed. Guess I'll get over it. Maybe it's time to hire someone to do my SEO for me. I have over 425 pages, so I've tended to use SSI for the same header and footer on most of them (since I tweak so much, I don't want to change 400 pages every time!).
Maybe this is all a result of the loss of the G deepbot, though someone said you don't lose pages because G likes the large index. I'm open to suggestions.
I know how you feel. My site was going along fine with a lot of traffic until Deepbot was replaced. Now my home page has reduced from PR3 to PR2 and I have the problem that new pages are not getting indexed.
With Deepbot I could guarantee that all my pages would be taken every month; now it's only 30 pages a week.
But I think I have a solution for the new pages that are not getting indexed because they are linked from a page with low PR. I am temporarily linking them from the home page to see if that helps.
So what's the trick now?
My index page and robots.txt get fetched daily but my new pages are completely ignored... even those with PR1...
Is there any other solution than getting incoming links to every page in my site?
What I find kinda interesting is Harry is complaining he cannot get links because this is a personal site not commercial. Others have said they cannot get links because their sites are commercial not personal.
>> What I find kinda interesting is Harry is complaining he cannot get links because this is a personal site not commercial.
No, I am not complaining, just stating a perception. Yes, I can get links from sites that approach me to promote their products, but none are relevant - and could be dodgy. Most of the bigger sites don't link to personal sites, and the sites that will frequently have low PR.
I have no problem with low PR per se. My pages get good results from search engines, frequently in the number one spot, and I get a healthy traffic. PR is not as useful as a good keyword related page.
The problem I have now is that Googlebot is ignoring my new pages and I need more PR to correct that situation. In the last few days I have updated every page in the site, but when these changes will get into the database - if ever - is completely unknown.
From an initial look at my logs (not yet completed) it seems as if the new Googlebot works completely haphazardly. It takes pages apparently at random. They are at different levels, in different themed areas, and frequently not linked to each other. On one day it took the same page twice.
Ah, for the good old days when Deepcrawl prowled the web. :)
If you create an index on a page that has a higher PR, the bot should pick up the new pages within a reasonable time after adding them.
I have a commercial site with PR4 on the index page and PR1 on some internal pages. The bot is not crawling anything except robots.txt and the index page, and sometimes some old files that are no longer on the server...
What is a reasonable time? I have been seeing this behaviour for 3 weeks.
I think getting links from external sites for each page in my site is impossible...
I don't know if this would help, but I have a (free) private Link Exchange on my site (the page has a PR4). It would at least be another external link to your site.
>> I'm kind of bummed. Guess I'll get over it.
I'm over it! I had some great revelations about my pages after posting that... making some major changes....
I would be interested in this. Could you send me a PM on where I can find your site?
"I'm over it! I had some great revelations about my pages after posting that... making some major changes.... "
What kind of changes did you make? Did you increase your page rank by getting more links from relevant sites?