Forum Moderators: Robert Charlton & goodroi
/GoogleSitemap.xml
Http Code: 200 Date: Oct 17 04:26:23 Http Version: HTTP/1.1 Size in Bytes: 1425
Referer: -
Agent: Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Hasn't visited at all today.
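For anyone wanting to check their own logs for the same pattern, here is a minimal sketch that counts Googlebot requests per path. It assumes log records have already been split into (path, user-agent) pairs, as in the log excerpt above; adjust the parsing to your server's actual log format.

```python
# Count Googlebot hits per requested path from access-log records.
# Assumes records are (path, user_agent) tuples; real logs will need
# a parsing step that matches your server's log format.
from collections import Counter

def googlebot_hits(records):
    """Return a Counter of paths requested by Googlebot."""
    hits = Counter()
    for path, agent in records:
        if "Googlebot" in agent:
            hits[path] += 1
    return hits

# Illustrative sample, mirroring the log excerpt above
sample = [
    ("/GoogleSitemap.xml",
     "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"),
    ("/robots.txt",
     "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"),
    ("/index.html", "Mozilla/5.0 (Windows NT 6.1)"),
]
print(googlebot_hits(sample))
```

If the counter only ever shows robots.txt and the sitemap, that confirms the pattern described in this thread: discovery files are fetched but content pages are not.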
robots.txt is fine
User-agent: *
Disallow:
Sitemap: http://www.example.com/GoogleSitemap.xml
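For reference, a minimal GoogleSitemap.xml of the kind that robots.txt line points at would look like this (example.com per the moderator's edit, and the lastmod date is purely illustrative):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>http://www.example.com/</loc>
    <lastmod>2009-10-17</lastmod>
  </url>
</urlset>
```

Note that listing a URL in a sitemap only tells Google the URL exists; it is no guarantee of crawling or indexing, which is exactly the behaviour being discussed here.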
I wouldn't normally give this early indexing a second thought, but this week Google seems to be on short rations.
Any suggestions anyone?
[edited by: tedster at 6:28 pm (utc) on Oct. 20, 2009]
[edited by: Robert_Charlton at 6:30 pm (utc) on Oct. 20, 2009]
[edit reason] switch to example.com - it can never be owned [/edit]
Thanks for the reply, but I disagree. I only posted because this is an exceptional case. I have enough sites that have been INDEXED without links to know that this is not why this site hasn't been indexed. External linking may boost how your content ranks once it's in the index, but it won't influence a spidering visit to a new site. I actually have plenty of sites ranked with little or no external links, so I know links aren't 100% essential to ranking either.
Respectfully John
Remember that URLs submitted via the submission form or via a sitemap are the lowest priority possible. Spidering of sites with external links will always sit higher in the queue.
Now the site in question was a well-established site and not a new one, so my take on this is that Google wanted to index content from the site and used the toolbar to do so.
With a new site featuring a sales letter they might not be so keen.
That said, there are ways that Google can find a site without external linking. See this discussion....
Why is Google indexing my entire web server?
[webmasterworld.com...]
I wouldn't depend on this sort of chance path, though, to get Googlebot spidering the site. As Andy explains, Google does prioritize its resources. Currently, Google is revising its entire infrastructure, which probably makes it unlikely that an unlinked site will receive much attention.
What puzzles me is this returning to the sitemap and robots.txt without reading the URLs it acknowledges are in the sitemap.
If Google is busy with Caffeine or some other activity, why keep hitting these files at all?
If it has no intention of indexing content why go bashing on the front door?
I have suspected myself that this may be Google following other priorities, hence my asking whether anyone with a SIMILAR new site is experiencing the same thing right now, not at some other time or in some other situation.
A slightly different question: why then is the bot re-visiting the sitemap and the robots.txt?
They're slightly different things - one is about discovering new URLs to add to the queue; the other is actually retrieving and storing the content.
Frequent hits on robots.txt are common - Google needs to know what content it can retrieve prior to retrieving it.
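The permissive robots.txt quoted earlier can be checked programmatically. A quick sketch using Python's standard-library robots.txt parser, feeding it the rules locally rather than fetching them over the network:

```python
# Verify what the robots.txt rules quoted above allow Googlebot to fetch.
# Rules are parsed locally here; a live check would use set_url() + read().
from urllib.robotparser import RobotFileParser

rules = [
    "User-agent: *",
    "Disallow:",
    "Sitemap: http://www.example.com/GoogleSitemap.xml",
]
rp = RobotFileParser()
rp.parse(rules)

# An empty Disallow directive means nothing is blocked, so any path is crawlable.
print(rp.can_fetch("Googlebot", "http://www.example.com/any/page.html"))
```

This rules out a robots.txt block as the cause, which matches the "robots.txt is fine" observation at the top of the thread.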
I'm afraid I don't have much other than anecdotal experience of trying to get sites indexed without links - if I want something indexed, I link to it.
I do have a site a couple of weeks old with no links to it (not indexed). I'll see whether that changes if I submit a sitemap.
I am grinding away a little on this one because it is so unusual, and I would like to work out the answer and store it away for future use. I have occasionally been approached by site owners whose site has gone completely moribund and who have eventually removed the site and rebuilt it completely - a bit drastic!
I will double/triple check robots/htaccess/sitemap etc. today to see if I can spot an error before I resort to artificially linking it to see if that frees up the stoppage.
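As part of that double-checking, here is a small sketch that validates a sitemap: it confirms the XML is well-formed and that every <loc> entry is an absolute URL (a common sitemap mistake). The inline sample is a placeholder standing in for the real GoogleSitemap.xml.

```python
# Sanity-check a sitemap: well-formed XML, and every <loc> holds an absolute URL.
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"

def sitemap_urls(xml_text):
    """Return the list of <loc> URLs, raising on malformed XML or relative URLs."""
    root = ET.fromstring(xml_text)  # raises ParseError if the XML is broken
    urls = [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc")]
    bad = [u for u in urls if not urlparse(u).scheme]
    if bad:
        raise ValueError("relative or malformed <loc> entries: %r" % bad)
    return urls

# Placeholder content standing in for the real GoogleSitemap.xml
sample = """<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/</loc></url>
  <url><loc>http://www.example.com/page2.html</loc></url>
</urlset>"""
print(sitemap_urls(sample))
```

Running this against the live file would catch a malformed sitemap before blaming Google's crawl priorities.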
Thanks for your input
I'd be hoping that the 2nd-level sub-pages will all be indexed and ranking within a week or less.