Just looking for information on why Google might decide to ignore a site for several months. I'll give some info --
Currently Google only looks at the index page and goes no further. The index page has text links, image links, etc. Its PR is 5/10, and most internal pages have a PR of 4/10. However, a saturation check shows only the homepage listed in Google. Furthermore, I've checked the logs, and Googlebot has only ever requested robots.txt (there is none) and the homepage. There are no strange or robot meta tags, only keyword, description, and abstract, which are all probably ignored by Google anyway.
I've emailed google and asked if the site was banned, and they replied that it was not banned, and to just wait for the googlebot to crawl the site.
Nevertheless, since roughly June of this year, the site has never been deeply crawled by Google.
Other engines seem to crawl it OK, but not spectacularly. Inktomi is the best so far, with 40 pages crawled. Alta Vista and Alltheweb only index the pages I submit, but go no further. Ask Jeeves (Teoma) seems to crawl all pages.
Any other info: the site is roughly half HTML and half CFM pages. It has a site map in text, linked directly in text from the index page.
anyway any information and suggestions would be greatly appreciated. This has been stumping me for a while now. Thanks in advance
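For anyone wanting to repeat the log check described above (which URLs Googlebot has actually requested), here is a minimal sketch. It assumes the server writes Apache-style combined log format and that the crawler's user agent contains "Googlebot"; the function name `googlebot_paths` and the sample lines are made up for illustration, not taken from the site in question.

```python
import re
from collections import Counter

# Assumed layout (combined log format):
# host ident user [time] "METHOD path PROTO" status size "referer" "user-agent"
LOG_LINE = re.compile(
    r'^\S+ \S+ \S+ \[[^\]]+\] "(?:GET|HEAD|POST) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_paths(lines):
    """Count (path, status) pairs requested by agents that look like Googlebot."""
    counts = Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and "googlebot" in m.group("agent").lower():
            counts[(m.group("path"), m.group("status"))] += 1
    return counts

# Invented sample lines, for illustration only.
sample = [
    '192.0.2.1 - - [10/Dec/2003:06:25:01 +0000] "GET /robots.txt HTTP/1.0" 404 208 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '192.0.2.1 - - [10/Dec/2003:06:25:03 +0000] "GET / HTTP/1.0" 200 5120 "-" "Googlebot/2.1 (+http://www.googlebot.com/bot.html)"',
    '198.51.100.7 - - [10/Dec/2003:06:30:00 +0000] "GET /sitemap.html HTTP/1.1" 200 2048 "-" "Mozilla/4.0 (compatible; MSIE 6.0)"',
]

for (path, status), n in sorted(googlebot_paths(sample).items()):
    print(path, status, n)
```

If the only paths that ever show up are /robots.txt and /, that matches the index-only crawl described in this thread. With a real log you would pass an open file object instead of the sample list.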
You say internal pages are PR4 but then you say they are not listed in Google. Which is it to be?
The toolbar estimates the PR of pages not in the index - typically PR of one less than the page.
As for the original problem - are the links of the front page static or dynamic? If they are dynamic do they have id=blah in the links, or do you use sessionids?
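A quick way to answer the static-versus-dynamic question is to pull the links out of the front page and look at their query strings. The sketch below is only an illustration: the list of parameter names in `SESSION_KEYS` is my own assumption about what counts as an id or session-style parameter, and the sample HTML is invented.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse, parse_qs

# Assumed list of parameter names that suggest ids or session tracking.
SESSION_KEYS = {"id", "sid", "sessionid", "phpsessid", "cfid", "cftoken"}

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag fed to the parser."""
    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)

def classify(href):
    """Label a link as static, dynamic, or dynamic with an id/session parameter."""
    query = parse_qs(urlparse(href).query, keep_blank_values=True)
    if not query:
        return "static"
    if any(key.lower() in SESSION_KEYS for key in query):
        return "dynamic (id/session-style parameter)"
    return "dynamic"

# Invented front-page fragment, for illustration only.
SAMPLE_HTML = """
<a href="/about.html">About</a>
<a href="/products.cfm?category=3">Products</a>
<a href="/item.cfm?id=25">Item</a>
<a href="/cart.cfm?CFID=123&amp;CFTOKEN=456">Cart</a>
"""

collector = LinkCollector()
collector.feed(SAMPLE_HTML)
for href in collector.hrefs:
    print(href, "->", classify(href))
```

Links that land in the last category are the ones worth a closer look if a spider stops at the index page.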
Submit all the pages - that's a blo*dy good idea - certainly worth a try. I'm having this index-only crawl problem at the moment. Index is PR 7, rest of site PR6. It's been up for two years and previously deep crawled, but nothing for months. I was having non-www, www duplicate probs., so I searched for and corrected all the non-www links on the web I could find, and also set up a proper 301. What's happening is the pages listed as non-www are gradually being dropped, but without a proper crawl they're not being reinstated as www. I'm now down to only 12 pages in the Google index. It's been one hell of an end to the year :(
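For reference, the non-www to www 301 mentioned above comes down to a small rule: if the request arrives on the bare host, answer with a permanent redirect to the same path on the www host. The sketch below shows that rule as a plain function; `canonical_redirect` and `www.example.com` are placeholders of mine, and in practice the rule would usually live in the web server configuration rather than in application code.

```python
from urllib.parse import urlsplit, urlunsplit

def canonical_redirect(url, canonical_host="www.example.com"):
    """Return (301, target) if url is on the bare host, otherwise None.

    canonical_host is a placeholder; any port in the original URL is dropped.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # The bare host is the canonical host without its leading "www.".
    bare = canonical_host[4:] if canonical_host.startswith("www.") else canonical_host
    if host == bare and host != canonical_host:
        target = urlunsplit(
            (parts.scheme, canonical_host, parts.path or "/", parts.query, "")
        )
        return 301, target
    return None

print(canonical_redirect("http://example.com/page.html?x=1"))
print(canonical_redirect("http://www.example.com/page.html"))
```

The first call yields a 301 to the www address with the path and query string preserved; the second returns None because the URL is already on the canonical host.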
p.s. all static, nothing fancy, standard links, intuitive navigation.
p.p.s.
The toolbar estimates the PR of pages not in the index - typically PR of one less than the page
about the links, all links are static, just regular old html, nothing fancy at all
about the PR issue, yep I guess the internal pages are just an estimate, because they're not listed when I do a search for them in google, yet they all have a PR of 4 or so
perhaps I'll submit the site map and see if that gets me anywhere..
essex boy, how long was your site in that state for, before you decided to submit each page individually? what have your results been since then?
I had been under the impression that submitting a whole slew of pages under one domain could be viewed as spam?
I had been under the impression that submitting a whole slew of pages under one domain could be viewed as spam?
Yeah, this worried me a bit. But it's likely to be an automated process, and while the guidance says it's only necessary to submit the index (the spider will do the rest ;) - there's nothing in the guidance that actually prohibits the submission of sub-pages / multiple pages as far as I can see. Anyway, to be on the safe side I just submitted half a dozen select pages that are well linked to the rest of the site. Added: Don't forget it's a manual submission tool provided by Google - they frown on automated submission software, but it's highly improbable you could be penalised for using something they provide.
Unless someone is using a different toolbar from me, this gives PR0.
[yahoo.com...]
I see PR0 when I type any unindexed page on any existing URL, whether it has a special page for 404s or not.
What you are seeing must be something else.
Does this page
[yahoo.com...]
actually exist? If not, surely this is an entirely different case.
I've noticed that if you go to a page with a real variable name which has PR, like:
www.blah.com/index.php?id=25
if that page has pagerank, you can type in an imaginary value, and Google will still give a PR score. eg.
www.blah.com/index.php?id=anynumber
But if you type:
www.blah.com/index.php/imaginarypage
you will get PR0
Interesting though - if the PR of a page can be estimated from the general PR of a site, how does Google 'know' whether that page does actually exist? Unless there's some remnant of its prior existence in the DB somewhere.
I submitted individual pages via Google's own submit page - never use a commercial submitter. I did this on the 8th of December and by the 15th they were listed. Breathtakingly quick.
Google does NOT punish the multiple submission of a single URL so I see no reason why you would be punished for submitting several URLs from the same site.
I had no problem. All pages rank well although I'm yet to receive any page rank for them.
Confirmed: Google states: "We do not penalize sites for 'over-submitting'"
Whilst at the same time hinting 'but don't bother with submitting' ;)
thanks again guys, glad that I'm not the only one experiencing this problem, and glad to see that google was relatively fast for you with indexing your hand submitted pages
my site does have a site map, so I'll submit that, and a handful of other important ones, and see if the spider makes up its mind to dig a little further this time. If not, then I'll be hand submitting every page in the site lol
slydog, I have no idea why the pages inside the site are showing up with a PR but not in the index, but nevertheless, that is what is happening. The idea about it being an estimate sounds good to me. The pages are definitely not indexed though, google confirmed this when I emailed them and asked them if they had a problem with the site.