Google has had the home page in its index since inception but has not indexed any of the site's other pages.
I'm using simple CSS buttons, no Flash or anything like that..... is it simply that it looks to them like I have no content or something?
Y! etc etc have all pages indexed.
PS. Someone was told about the URL and linked to it before the site went live.
Thanks anyway.
PS. Perhaps I should have also mentioned that my simple CSS buttons are just that: simple HTML divs with classes (with background images).
PPS. The home page is effectively a site map. The site only has six pages (+ home). The home page has 6 buttons, each built like this:
<div id="index-main">
  <div id="index-body">
    <div id="buttonLeft">
      <div id="button1">
        <div id="logo1"></div>
        <div class="button">
          <div class="button-text"><a href="site/about-us.php" title="Find out more about us, our people, our philosphy">about us</a></div>
        </div>
      </div>
      <div id="button2">
        <div id="logo2"></div>
        <div class="button">
          <div class="button-text"><a href="site/location.php" title="Where you can get our services">locations</a></div>
        </div>
      </div>
      <!-- buttons 3 to 6 follow the same pattern -->
Is this too complicated?
Maybe they don't like my spelling ;)
What is the home page's PR? Was the site without internal links for a long period of time? When did Googlebot last fetch the home page and get a 200 response? Finally, did Googlebot ever get an error fetching the home page and, if so, when?
is it simply that it looks to them like I have no content or something?
Posting the exact form of the code you use to link to internal pages from the home page might be helpful.
Have you cut and pasted the URLs directly out of the code into your browser (with the requisite prefix) to make sure there's no misspelling?
Have you looked at the raw logs to confirm that Googlebot has not attempted to fetch these other URLs?
As far as cut and paste goes... surely if the buttons work there should be no problem? I tried it as suggested and there's no problem there.
I'll go back and check logs (when I get to a machine with less rigid firewalls!).
I may add a "deep" link from another site and see if that makes a difference.
This is the first time I've made a "splash"-type / minimalist-content home page with CSS, and I thought maybe this was the problem.
(in fact I have fully indexed PR0s).
Me too. I also have a PR0 that Googlebot can go months without checking, presumably because it sat unchanged for a long time, and Google kept adjusting the frequency downward.
In fact, just to refresh my memory, I went to look. Googlebot visited this 1-page site on:
2005/05/04
2005/07/01
2005/07/15
2005/09/10
2005/09/29
2006/01/04
2006/03/13
2006/03/18
I expect Googlebot to return by July :-)
If it had internal pages that also had sat there unchanged for a long time, I would expect their Googlebot frequency to be even lower.
The sites that I have built this way have had no difficulty getting picked up and usually get a PR3 within 6 months.
I hope this makes sense; I'm bleary-eyed at this time of the morning.
Best of luck
Col :-)
(when I get to a machine with less rigid firewalls!)
I place much more faith in the raw logs when debugging a problem, but perhaps your reporting software will simply tell you:
a) have there been any failed page fetches at all and by whom?
b) what IP addresses have fetched your internal pages?
In both cases, look for 66.249.6?.* and presume that to be Googlebot.
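If your reporting software can't answer that, a rough sketch of scanning the raw log directly might look like this (PHP, assuming the usual combined log format; the log path is just a placeholder):

<?php
// Rough sketch only: list every request in the raw access log coming from
// Google's 66.249.* range, printing the status code and the URL asked for,
// so you can see whether the internal pages were ever fetched and what
// response they got.
$log = '/path/to/access.log';                  // placeholder path
foreach (file($log) as $line) {
    if (strpos($line, '66.249.') !== 0) {      // keep only lines whose IP starts with 66.249.
        continue;
    }
    // combined format: IP - - [date] "METHOD /url HTTP/1.x" status size "referer" "agent"
    if (preg_match('/"(?:GET|HEAD) (\S+) HTTP[^"]*" (\d{3})/', $line, $m)) {
        echo $m[2] . '  ' . $m[1] . "\n";      // status code, then URL
    }
}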
Another sanity check is to Google for:
site:www.yerdomain.com the
or some other common word known to be on the internal pages.
Which brings me to:
has not indexed any of the site's other pages.
What did you base that assertion on?
My site is exactly the same: 6 months old, and a site:www.mydomain.com search only shows the home page indexed.
All I have is a simple home page with some links to other pages.
For a while I could see the other pages indexed on BigD but that is no longer the case.
My server logs show Googlebot hitting robots.txt and Sitemap.xml daily, and sometimes some of the other pages.
I rank on page 1 of the SERPs for certain keyword combinations.
Yesterday I put in a 301 redirect from non-www to www and am hoping that this might correct it.
The old Googlebot used to rip through and index really well, but the combination of Bigdaddy and Mozillabot has made Google's indexing grind to a halt; they are getting worse than Yahoo. Even my old PR6 site has heaps of new pages which haven't been indexed (new pages on that site used to get crawled and indexed within 48 hours).
I have tried lots of things - Google Sitemaps, submitting URLs to Google's submit-URL page, putting more links to individual pages out there, just about anything to help Google 'find' new sites and pages. Nothing gets indexed though, even though the crawler is a regular fixture on the site.
Maybe things will improve once Bigdaddy has settled in? MSN has no problems crawling and indexing new pages, usually within a week.
I can tell you (from web stats) there have been 17 hits and 7 robots.txt requests this month from G; the most recent was on index.php about 4 hours ago.
I have no Google ads, so I can assume these are real hits.
Curiously though they are asking for "/index.php" not "/". I shall have to look at my external links.....
The fix is to either
- add the <base href="http://www.domain.com/"> tag to every page and set up a 301 redirect from non-www to www (all internal links should then START with a / and count from the root; see the sketch below), OR
- hard code every link on every page with the full domain-and-page URL (this latter option will increase bandwidth quite a lot, and should be avoided).
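If it helps, a minimal sketch of the non-www to www 301 in .htaccess (assuming Apache with mod_rewrite enabled, and using example.com as a placeholder domain) would be something like:

RewriteEngine On
# send any request for the bare domain to the www hostname with a single 301,
# so only one hostname ends up in the index
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]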
g1smd, I was just about to suggest that before I read your post. But I wasn't aware that it would increase bandwidth (besides the obvious fact that all the pages will be crawled rather than just one). Could you explain why, please?
Adding the <base href="http://www.your-domain-name.com/"> tag, just once on every page, does the same job using a lot fewer characters.
Links to pages should always be full path for any small to middling site. Links to images can be relative, but there is no reason to create indistinct structure and a pile of reasons not to use relative links.
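To illustrate the three forms being discussed (example.com is just a placeholder, and the about-us page from earlier in the thread is the target):

<!-- once, inside <head> on every page -->
<base href="http://www.example.com/">

<!-- fully hard-coded absolute link -->
<a href="http://www.example.com/site/about-us.php">about us</a>

<!-- root-relative link: starts with / and resolves against the <base> -->
<a href="/site/about-us.php">about us</a>

<!-- purely relative link: resolves against the current page's directory -->
<a href="site/about-us.php">about us</a>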
a pile of reasons not to use relative links
Thanks, Sam.
PS. I have the logs (at last) but have not gone through them yet - also I have checked my external and internal links and my .htaccess etc. (in case I had accidentally put it in an ErrorDocument directive) and have no idea why G would be hitting my /index.php directly (rather than just the root).
I should add that I like relative links as they make testing easier - for me anyway....
...and my IP gets blocked by my site if I run Xenu (just kidding - sort of) as it does not obey robots.txt.... but seriously - it's a 7-page site (and I use .htaccess to add the www).
1) Once it looked just at /robots.txt.
2) Three times it looked at /robots.txt and /index.php.
3) Twice it looked at /robots.txt, the root (just /), and the sub-pages: the five /site/somepage.php pages and the one /contact/.
The IPs all start with 66., and one IP occurred twice: once in the 2nd category and once in the 3rd. The order has been 1, 2, 2, 3, 3, 2.
I'm not a real bot watcher but it looks a little odd to me - as if there are two separate strands for different reasons.
I added a deep link from another site but have not checked to see if that page has been spidered yet.
There was a thread in which GG expressed a preference for absolute links, for maximum simplicity for the spiders; I can't find it now though.
On the CSS thing, check the text-only version of the cache to see if those links show up; there was a strange thread a while ago claiming some CSS dropdowns weren't showing up in the text-only cache.....
Hi, following up on my earlier post (the 6-month-old site with only the home page indexed):
Update 2nd Apr: Default Google now shows 4 pages of my site. I did the 301 on March 22nd. Yesterday I ran Linksleuth and fixed one outbound link; I also had a broken link in Sitemap.html (not my Google sitemap.xml). Hopefully this will help me in the SERPs.
I have a site which is a few months old, and although Googlebot has crawled many of its pages with a 200 response, only a few of them (4 out of ~200) are indexed.
Until I got an incoming link from a PR6 site, I had only the home page in the index. Today, the 4 pages that are indexed are the only ones that have inbound links from external sites.
It may indicate that G gives higher priority to pages that have external links, while those that don't just have to wait....