If the irrelevant pages are from your own site, a robots.txt or a few judicious robots meta tags with noindex should help out.
Slurp accounted for 73% of spider traffic (more than 8000 requests), whereas GoogleBot accounted for only 20%. The Mozilla version of GoogleBot made up only 0.15%.
The funny thing, though, is that I'm not doing particularly well on Yahoo.
So I checked a few searches, and noticed one of my pages had a 1/25/2005 date in the SERPS while showing a title that was changed on 1/26/2005. Go figure. Looks like a lot of things are updating.
Googlebot could crawl your site 4X faster with 4X less bandwidth usage if your site supports GZIP. So few sites support GZIP that Google has no real incentive to switch over to a faster crawler. When properly set up, GZIP would speed up many, many websites.
To date, the optional GZIP request correlates with Googlebot's indication of the HTTP 1.1 protocol.
When Googlebot uses the HTTP 1.0 protocol, it is definitely not requesting GZIP-compressed content.
The protocol indicator 1.0/1.1 is almost adjacent to the page size in bytes so it's very convenient to use as a "GZIP" flag when reviewing your logs.
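If you'd rather not eyeball the logs, that check can be scripted. Here is a rough sketch, assuming an Apache-style combined log where the request line (e.g. "GET /page.html HTTP/1.1") is the first quoted field and the user agent string contains "Googlebot". The function name googlebot_protocols is made up for illustration.

```shell
# Count Googlebot requests per HTTP protocol version.
# Assumption: combined log format, request line is the first quoted field.
googlebot_protocols() {
    grep -i 'googlebot' "$1" \
        | awk -F'"' '{
              # $2 is the request line; its last word is the protocol.
              n = split($2, req, " ")
              count[req[n]]++
          }
          END { for (p in count) print p, count[p] }' \
        | sort
}
```

Call it as googlebot_protocols access.log. Going by the correlation described above, the HTTP/1.1 count would be the requests that may have asked for GZIP, and the HTTP/1.0 count the ones that did not.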
Even though this capability is available in virtually all web server software, only about 6% of all web hosts, and therefore webmasters, support this virtually free technology, which improves performance 4X and reduces bandwidth 4X.
As a 56K modem user, I'd sure like to see dynamic GZIP compression fully supported. WebmasterWorld unfortunately does not use GZIP; Google does for its SERPS.
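The 4X figure is easy to sanity-check against your own pages. Below is a minimal sketch, assuming you have saved a page to a local file. The function name gzip_savings is hypothetical, and command-line gzip at its default level only approximates what a web server's compression module would do, so treat the result as an estimate.

```shell
# Print: original bytes, gzipped bytes, percent saved.
# Assumption: $1 is a locally saved copy of one of your pages.
gzip_savings() {
    # Wrapping wc in arithmetic expansion strips any padding it emits.
    orig=$(( $(wc -c < "$1") ))
    comp=$(( $(gzip -c "$1" | wc -c) ))
    if [ "$orig" -eq 0 ]; then
        echo "empty file: $1" >&2
        return 1
    fi
    echo "$orig $comp $(( (orig - comp) * 100 / orig ))"
}
```

Call it as gzip_savings index.html. Text-heavy HTML with repetitive markup tends to shrink a lot, while images and other already-compressed files hardly shrink at all.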
But to temper all our hopes here a little bit: Google is really sick in many ways this time (SandBox, overrating links - link farm impact, hilltop oligarchy, big sites oligarchy, 2x32...). And if they are not able to sort this out in some way, they will drive it into the wall.
It's time to apply some cure now.
Or they should at least abandon their ugly sandbox.
Simply search for your keyword again with 13x -adfs and see how good the SERPS could be...
As of January 29, I am now listed as #1 for my most important keyword. Before the recent deep crawl, I was the runner-up for almost a year, with an on-topic, non-commercial site being #1.
I need to check other SERPS, but it seems the recent deep crawl is finding its way into the results.
tail -f access.log | grep -i googlebot
There are more sophisticated solutions, though. Some log analysis tools can do "live stats", for example. There are also some CRM packages that offer visitor chat functionality, which gives you a live view of your site's visitors. But these tools usually exclude spiders.
I personally use a tool called "What's on?", which monitors all of my sites' current visitors and which I keep open constantly. It's the only one I found that does this stuff, and it has a few bugs and glitches, especially when it comes to DNS grouping and geotargeting. There are probably other tools as well.