If it makes a difference to any answers here, we are also using Google AdWords.
Googlebot visits us daily and fetches robots.txt and also /. We are indexed on Google, with our meta tag content appearing in the Google results when I search on our domain (oddly similar to my user name here).
robots.txt contains:
# basically allow everything
User-agent: *
Disallow: /somedirectoryorother/
It validates, and I am content that it ought to allow full spidering of everything outside that one directory.
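For anyone who wants to double-check a file like this rather than trust a validator, here is a minimal sketch using Python's standard urllib.robotparser. The example.com URLs and the /profiles/page.html path are placeholders made up for illustration, and Google's own parser may differ in edge cases, so treat this as a sanity check only.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the post, reproduced as a string.
ROBOTS_TXT = """\
# basically allow everything
User-agent: *
Disallow: /somedirectoryorother/
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com and the paths below are hypothetical.
    for url in (
        "http://example.com/",
        "http://example.com/profiles/page.html",
        "http://example.com/somedirectoryorother/secret.html",
    ):
        print(url, is_allowed(ROBOTS_TXT, "Googlebot", url))
```

Only the last URL should come back as blocked, which matches the intent of the file: everything is open except the one directory.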
Belt and braces, I also have
<meta name="robots" content="index,follow">
on every page I want indexed (though this was a recent addition and seems not to help).
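To confirm that the tag really reaches the served page intact, a small sketch along these lines can pull the robots directives out of a page's HTML. It uses only the standard html.parser module; the sample HTML string is invented for the example.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "index,follow".
            self.directives.extend(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def robots_directives(html):
    """Return the list of robots meta directives found in html."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    # Hypothetical page source, for illustration only.
    sample = '<html><head><meta name="robots" content="index,follow"></head></html>'
    print(robots_directives(sample))
```

An empty list would mean the tag is missing or malformed on that page.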
Is it that Googlebot only visits home pages for a while before it gets around to deep spidering? Or am I missing something fundamental?
May I be specific in my questions over this, please:
Do you mean "internal links" within the site?
Or do you mean "external links" from the site?
Or do you mean "Internal links on the homepage"?
Or do you mean "External links" on the home page?
Or do you mean "links from other sites to ours"?
I think ThomasB is referring to incoming links.
And yet I have an obscure personal site on my ISP's obscure server that is fully spidered, though I think no-one ever links to it. Well, two things do, according to Google. One is dreambook (yup, vanity guestbook, but it seems they have a use) and the other is a personal site of some other nut, too.
Both of those link only to the homepage of the very specialist and uninteresting personal site.
My commercial site is linked to. Google sees the sites that link to it and the links. And yet Googlebot only hits the / page and does not burrow deeper.
At present we're throwing content at it, along with links within the site and from the site outwards.
I imagine Googlebot ignores Google AdWords links in, but if you search on my user name here there are currently 144 hits, of which I claim 80% as "mine". Among that lot are several genuine permanent links.
Anyway, that was a side issue. Last night Google sent two different bots. One was quite bright and could read gzipped files. It hit / and nothing else.
Regular Googlebot arrived and hit three or four more pages. While they are not yet indexed in the SERP, I am starting to think that just maybe Googlebot visits several times to make sure it isn't going to make a fool of itself indexing a site that then changes character immediately or goes away, and then schedules a deep spidering event for the future once it's done a sample spidering. OK, that is a pure guess based on what I would design.
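For tracking which pages each bot actually requests from night to night, a tally of the access log is more reliable than eyeballing it. This is a rough sketch that assumes an Apache-style "combined" log format; the sample log lines, IP addresses, dates and the /about.html path are all invented for illustration. Note that matching on "Googlebot" will also catch Googlebot-Image unless a narrower token is passed.

```python
import re
from collections import Counter

# Matches the request, status, size, referrer and user-agent fields
# of a combined-format log line (an assumed format, adjust as needed).
LOG_LINE = re.compile(
    r'"(?:GET|HEAD) (?P<path>\S+) [^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

# Invented sample data, not real log entries.
SAMPLE_LOG = [
    '192.0.2.1 - - [01/Jan/2000:02:00:00 +0000] "GET /robots.txt HTTP/1.1" '
    '200 80 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.1 - - [01/Jan/2000:02:00:01 +0000] "GET / HTTP/1.1" '
    '200 5120 "-" "Googlebot/2.1 (+http://www.google.com/bot.html)"',
    '192.0.2.7 - - [01/Jan/2000:02:05:00 +0000] "GET /about.html HTTP/1.1" '
    '200 2048 "-" "Mozilla/5.0"',
]


def paths_by_bot(lines, bot_token="Googlebot"):
    """Count requests per path for user agents containing bot_token."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.search(line)
        if match and bot_token.lower() in match.group("agent").lower():
            hits[match.group("path")] += 1
    return hits


if __name__ == "__main__":
    for path, count in sorted(paths_by_bot(SAMPLE_LOG).items()):
        print(count, path)
```

If the list of paths grows beyond /robots.txt and / over successive visits, that would at least be consistent with the "sample first, deep crawl later" guess above, though it would not prove it.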
We have had the Google image bot visit as well. No images yet, though we only expect the basic site images to appear and not the images of the dating site members. Those are protected from spidering, as are their profiles.
We submitted the site to Google on the 7th of April. Maybe we are just expecting miracles :)
Now I am an 80% happy bunny. I can work on my content and specialised meta tags to my heart's content.
One oddity, though. A search on link:http://somedomainorother produces no "links in" under Google's regime but produces a small but perfectly formed list under Yahoo (which, by coincidence, was a good little robot last night).
Google finds sites that link to us if I search for my domain (well, actually my user id here). It simply has not "registered" those links. So my next question is "Is registering links a separate process from the main spidering and listing?"