I have a niche market news aggregation site that went live on 1 April.
After 2 months of fine-tuning, we submitted a manual sitemap to Google through our Google Webmaster Tools account on June 4. It lists 68 pages: 61 news pages and 7 site pages, which include the homepage (showing the current date's news) and the archive access.
The Google sitemap data states that Googlebot has visited the site once, on June 2 (before the sitemap was submitted). 7 of the 68 pages are indexed.
Analysis of the daily logs shows site access from IP addresses attributed to Google in Mountain View, California, Google in Michigan, and two international Google addresses.
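One way to confirm that addresses like these really belong to Googlebot, rather than trusting a geolocation lookup, is the reverse-then-forward DNS check that Google describes for verifying its crawler. The following is only a minimal sketch of that check: the function name and the injectable lookup parameters are my own illustrative choices, the accepted host suffixes are an assumption based on Google's published guidance, and the forward lookup used here handles IPv4 only.

```python
import socket

# Assumption: genuine Googlebot hosts reverse-resolve to one of these domains.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Return True if `ip` reverse-resolves to a Google host name
    and that host name resolves back to the same IP (IPv4 only)."""
    try:
        hostname = reverse_lookup(ip)[0]
    except OSError:  # covers socket.herror and socket.gaierror
        return False

    # The reverse name must sit under a Google-owned domain.
    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    try:
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False

    # The forward lookup must include the original address.
    return ip in addresses
```

Calling `is_verified_googlebot(ip)` for each distinct address in the logs would separate genuine crawler traffic from anything merely claiming a Googlebot user agent. The lookup parameters exist only so the check can be exercised without network access.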
Google Webmaster Tools sitemap data is:
The cache is from June 5. Per Webmaster Tools: Googlebot last home page access: June 2
Pages indexed: 7/68 (homepage, feedlist, 2 April news dates, 3 May news dates)
Log data for June to date shows two IP addresses for Googlebot; combined, they account for 26 visits / 42 files / 57 hits.
I am trying to make sense of the apparent discrepancy between what the Google sitemap reporting states, the server log data, the pages crawled, and the pages indexed.
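For anyone wanting to reproduce a tally like this from raw logs, here is a rough sketch. It assumes the server writes Apache-style "combined" log lines in chronological order, and it uses my own working definitions, which may not match what a given log analyzer reports: a hit is any request, a file is a request answered with status 200, and a visit is a run of requests from one IP with no gap longer than 30 minutes. The function name and the 30-minute threshold are illustrative, not taken from any particular tool.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: Apache/Nginx "combined" log format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

# Assumption: a new visit starts after 30 minutes of inactivity.
VISIT_GAP = timedelta(minutes=30)


def summarize_googlebot(lines):
    """Count visits, files and hits per IP for requests whose
    user agent mentions Googlebot. Lines must be in time order."""
    stats = defaultdict(
        lambda: {"visits": 0, "files": 0, "hits": 0, "last": None}
    )
    for line in lines:
        m = LOG_LINE.match(line)
        if not m or "Googlebot" not in m["agent"]:
            continue
        when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        entry = stats[m["ip"]]
        entry["hits"] += 1                      # every request is a hit
        if m["status"] == "200":
            entry["files"] += 1                 # successful responses only
        if entry["last"] is None or when - entry["last"] > VISIT_GAP:
            entry["visits"] += 1                # first request or long gap
        entry["last"] = when
    # Drop the bookkeeping timestamp before returning.
    return {
        ip: {k: v for k, v in e.items() if k != "last"}
        for ip, e in stats.items()
    }
```

Feeding it an open log file, for example `summarize_googlebot(open("access.log"))` with a hypothetical file name, gives per-IP counts that can be set against the dates Webmaster Tools reports.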
Suggestions?