Forum Moderators: Robert Charlton & goodroi


WMT shows 1343 pages indexed - actual pages indexed 10

         

captcontent

4:41 pm on Sep 11, 2010 (gmt 0)

10+ Year Member



We had a new site go live on August 23, 2010 with just over 2000 pages, as shown in the XML sitemap that was submitted that same day. Within a few days we started to see some obscure pages showing up in the index, i.e. register, site map, forget password, maps... but nothing with any real content. As of this writing, almost three weeks later, not even the home page shows. In fact, site:www.example.com is not even in the index.

The strange part is that WMT shows 1343 pages indexed. The same home page was indexed in Bing within 3 days of submitting.

The only clue is that the crawl errors page shows a 401 that was encountered on August 23. We had to open up the site for some testing for a few hours, and Google somehow found it and got a taste. Password protection was soon removed.

Any ideas on where to start or what to do or not do?

Thanks!

tedster

8:31 pm on Sep 11, 2010 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



At this point I'd stop depending on Google's reports - monitor them as secondary input, yes, but go to your own server logs to see what is actually happening in the real world, not filtered by Google's sometimes questionable data.

I'd suggest looking for googlebot's crawl (both its requests and your server's responses) as well as tracking any actual traffic sent directly from Google Search. These clues can form your base for untangling what might be going on.
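As a rough illustration of that kind of log check, here is a minimal Python sketch. It assumes the server writes Apache "combined" format access logs, which may not match your setup. The function name `summarize_google_activity` and the referrer test are made up for this example. Note that the Googlebot match relies on the user-agent string alone, which can be spoofed.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Assumed layout: Apache "combined" log format. Adjust the pattern if your
# server logs in a different format.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def summarize_google_activity(lines):
    """Return (status counts for Googlebot requests, count of visits referred by Google)."""
    bot_statuses = Counter()
    referrals = 0
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        if "Googlebot" in m.group("agent"):
            # User-agent match only; it does not prove the request came from Google.
            bot_statuses[m.group("status")] += 1
            continue
        host = urlparse(m.group("referer")).hostname or ""
        if host == "google.com" or host.startswith("www.google."):
            referrals += 1
    return bot_statuses, referrals
```

To try it on a real file, something like `summarize_google_activity(open("access.log"))` would do, where `access.log` stands in for wherever your raw logs live. A pile of 401 or other non-200 statuses in the first result would be worth chasing.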

captcontent

11:19 am on Sep 13, 2010 (gmt 0)

10+ Year Member



Tedster, I wanted to follow up and let you know that the home page has made it into the index and other pages are starting to come through as well.

The interesting part is that your suggestion to research the raw logs led to much dialogue with our ISP. Our stats package is showing almost 30 MB of bandwidth every morning at approximately the same time from Googlebot. Seems fairly consistent here.

Only issue is that none of this traffic is showing on my server logs!

phranque

4:47 am on Sep 14, 2010 (gmt 0)

WebmasterWorld Administrator 10+ Year Member Top Contributors Of The Month



perhaps your server access log is only showing requests for one possible hostname yet your bandwidth is being used by requests for another hostname or other hostnames.
does your site serve content for example.com as well as www.example.com?
have you tried a site:example.com -inurl:www search?
is your server on a dedicated IP address?
if so what happens when you request that url?
perhaps your non-canonical requests are not being logged or are being logged elsewhere.
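one way to test the hostname theory, sketched in Python under a big assumption: that the server can be made to log the virtual host as the first field of each line (as in Apache's "vhost_combined" format). the function name `googlebot_bytes_by_host` is made up for this example, and the user-agent match can be fooled by spoofed bots.

```python
import re
from collections import defaultdict

# Assumed layout: Apache "vhost_combined" style, i.e. "host:port" followed by
# the usual combined-format fields. This is an assumption about the log setup.
VHOST_RE = re.compile(
    r'^(?P<vhost>[^\s:]+)(?::\d+)? \S+ \S+ \S+ \[[^\]]+\] '
    r'"[^"]*" (?P<status>\d{3}) (?P<size>\d+|-) '
    r'"[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_bytes_by_host(lines):
    """Tally Googlebot request counts and response bytes per logged hostname."""
    totals = defaultdict(lambda: {"requests": 0, "bytes": 0})
    for line in lines:
        m = VHOST_RE.match(line)
        if not m or "Googlebot" not in m.group("agent"):
            continue
        entry = totals[m.group("vhost").lower()]
        entry["requests"] += 1
        if m.group("size") != "-":  # "-" means no body was sent
            entry["bytes"] += int(m.group("size"))
    return dict(totals)
```

if the totals for example.com (or some other hostname) account for the bandwidth that is missing from the www.example.com log, that points to the non-canonical requests being logged separately or not at all.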

captcontent

5:36 pm on Sep 15, 2010 (gmt 0)

10+ Year Member



phranque,

I checked with our ISP about the server logs and they had no idea, as both of our log entries matched. I am checking to see if the bandwidth usage is calibrated daily. We are on a dedicated box and serve both versions of the URL. I will look into the possibility of serving more than one hostname, as we do have more than one, but I am pretty sure all requests go through the same pipe.

It has been three weeks and Google has indexed 95% of the new site (just over 2000 pages) and is crawling at least 4 levels deep on a site with no links to speak of. We feel good about that.

Thank you