In a way this is a follow-on from this thread, which made me look at my Ask traffic more carefully:
[webmasterworld.com...]
This from my stats for this month:
Robot                 Pages     Bandwidth
Jeeves                123872    2.32 GB
Inktomi Slurp         2129      41.75 MB
Googlebot (Google)    800       12.43 MB
Last month Jeeves indexed 159129 pages and used 2.04 GB, so the pace seems to be accelerating if anything. It is there all the time.
Google has it about right - I have somewhere between 900 and 1000 pages in the site proper.
What Ask is so busily indexing is basically Amazon. I have an Amazon shop that is really only intended to have about 5-6 sections of books relevant to my topic, organised in a way very different from the way Amazon does it, but Ask is following everything - every obscure author you (n)ever heard of.
It does send traffic in reasonable numbers, but it's mostly irrelevant traffic looking for obscure authors for whom I now rank very highly in Ask - without ever meaning to! These visitors do not buy much, maybe a couple of books a month - I have checked the pages they land on and mostly the books are not available - and the AdSense income from those pages is not brilliant.
I am caught between disallowing Ask from that directory altogether and leaving well enough alone - in a way I hate to lose rankings, even relatively useless ones!
I could plaster AdSense all over it in a very aggressive way, but I really do not want to do that for the pages that have value to my users.
Has anyone else experienced this? Does it eventually slow down? What would you do?
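For the "disallow Ask from that directory" option, here is a minimal sketch of what the robots.txt rules might look like, checked with Python's standard urllib.robotparser before uploading. The directory name /shop/ and the page URLs are hypothetical placeholders, and "Teoma" is assumed to be the user-agent token Ask's crawler matches on, so confirm both against your own logs.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep Teoma (Ask) out of the shop directory only,
# and leave every other robot unrestricted.
SHOP_ROBOTS_TXT = """\
User-agent: Teoma
Disallow: /shop/

User-agent: *
Disallow:
"""


def build_parser(text):
    """Parse robots.txt text and return a ready-to-query parser."""
    robots = RobotFileParser()
    robots.parse(text.splitlines())
    return robots


shop_parser = build_parser(SHOP_ROBOTS_TXT)

# /shop/ and the file names below are placeholders, not real paths.
shop_page = "http://www.example.com/shop/some-author.html"
site_page = "http://www.example.com/index.html"

print("Teoma, shop page:    ", shop_parser.can_fetch("Teoma", shop_page))
print("Teoma, site page:    ", shop_parser.can_fetch("Teoma", site_page))
print("Googlebot, shop page:", shop_parser.can_fetch("Googlebot", shop_page))
```

This only shows how a well-behaved robot should read the rules. It says nothing about how quickly Ask would drop the pages it has already indexed.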
I added this:
User-agent: teoma
Crawl-Delay: 240
at various points in this file:
<Files .htaccess>
order allow,deny
deny from all
</Files>
ErrorDocument 403 http://www.example.com/errordocs/403.html
ErrorDocument 404 http://www.example.com/errordocs/404.html
RewriteEngine On
RewriteCond %{HTTP_HOST} ^example\.com [NC]
RewriteRule (.*) http://www.example.com/$1 [R=301,L]
and can't make it work.
Where should it be?
User-agent: *
Disallow:
Always validate your robots.txt file. A link to a validator is available from the robots.txt forum.
The second snippet should be in your .htaccess file, not in robots.txt. Of course, this applies only to Apache web servers. If you need help with that, post in the Apache forum.
[webmasterworld.com ]
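As a rough way to verify the robots.txt side locally, the sketch below feeds the two robots.txt snippets from this thread to Python's standard urllib.robotparser (the crawl_delay method needs Python 3.6 or later). It assumes the crawler identifies itself with the token "Teoma". Whether Ask actually honours Crawl-Delay is not something this check can tell you.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rules from this thread, combined into one file.
DELAY_ROBOTS_TXT = """\
User-agent: teoma
Crawl-Delay: 240

User-agent: *
Disallow:
"""

rp = RobotFileParser()
rp.parse(DELAY_ROBOTS_TXT.splitlines())

# Delay in seconds that a compliant Teoma crawler should wait between fetches.
delay = rp.crawl_delay("Teoma")

# Upper bound on fetches per day if the delay is respected (86400 s in a day).
max_fetches_per_day = 86400 // delay

print("Teoma crawl delay (s):", delay)
print("Max fetches per day:  ", max_fetches_per_day)
print("Googlebot crawl delay:", rp.crawl_delay("Googlebot"))
```

If the delay is respected, 240 seconds works out to at most 360 fetches a day, far below the volumes reported earlier in the thread.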
I've had 24,000 hits in four days and it has used almost 3 GB of bandwidth. I wouldn't mind if it indexed some of the pages it has cached, but currently I have only a few pages in the index, and our home page was last updated about a year ago!
I can only conclude that either it has imposed some sort of penalty on us and is still caching pages without including them in the index, or it updates its index at a snail's pace.
Either way, I think we need to block it, for all the good it does. We see next to no traffic from Ask anyway.