
Forum Moderators: Ocean10000 & incrediBILL

Altavista

Your Thoughts

   
12:16 pm on May 30, 2003 (gmt 0)

10+ Year Member



What are your thoughts on the Altavista spider and search engine? I am getting SLAMMED by their spider (at least once a minute for the last 3 days). Do you think that they will help with driving more traffic to my site, or are they just another 2nd tier search engine? BTW, here is what shows in the log.

5/30/2003,5:42:48 AM, ,216.39.48.20,Scooter/3.2,mailto:crawl-support@av.com

1:05 pm on May 30, 2003 (gmt 0)

WebmasterWorld Senior Member wilderness is a WebmasterWorld Top Contributor of All Time 10+ Year Member Top Contributors Of The Month



AV has been quite inactive for some time.
They're just getting rolling again.

There is an interesting ongoing thread over in alt.webmaster or alt.html about who is the #1 SE.
The results are a bit surprising.

5:28 pm on May 30, 2003 (gmt 0)

10+ Year Member



A couple of days ago I told Scooter to take a hike... I have no interest in seeing AltaVista visit me.

Many emails to AV about some of the Scooter bots disregarding robots.txt have gone unanswered, so it's simple: If you don't behave, you're not welcome.

No big loss, either... I get more quality referrals from Google Thailand than all referrals in total from all the global AV sites.

balam
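
For anyone wanting to do the same, a minimal sketch of turning Scooter away with mod_rewrite might look like the following. It assumes mod_rewrite is available and that the rules live in the site's top-level .htaccess; it is not necessarily how the ban described above was set up. robots.txt is deliberately left reachable so the crawler can still read it.

```apache
RewriteEngine On
# Any user-agent that begins with "Scooter" (Scooter/3.2, Scooter/3.3, ...)
RewriteCond %{HTTP_USER_AGENT} ^Scooter
# Answer everything except robots.txt with 403 Forbidden.
# In .htaccess context the path is matched without a leading slash.
RewriteRule !^robots\.txt$ - [F]
```

The dash means "no substitution"; with [F] the request is simply refused.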

6:04 pm on May 30, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



I get about 5 - 7% referrals from AV across several sites. If you can rank well in AV it can still send some traffic but it varies with niche.

It is ironic that we spent years complaining that AV did not spider deep and now that it is doing so people are complaining about that! :)

6:16 pm on May 30, 2003 (gmt 0)

10+ Year Member



My big complaint is Scooter is told to stay away from all my image directories, but they happily dip in anyways...

balam
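
Since robots.txt is only a request, one option is to enforce the same restriction on the server. A rough sketch, assuming the images sit under a directory named images/ (a placeholder, not a path from this thread) and the rules go in the top-level .htaccess:

```apache
RewriteEngine On
# Only applies to user-agents that begin with "Scooter"
RewriteCond %{HTTP_USER_AGENT} ^Scooter
# images/ is a hypothetical directory name; substitute the real ones.
# Any request under it gets 403 Forbidden.
RewriteRule ^images/ - [F]
```

In server or virtual-host context the pattern would need a leading slash (^/images/), because the full URL path is matched there.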

12:29 am on May 31, 2003 (gmt 0)

10+ Year Member



I've got Scooter allowed in, but I've also got it lumped in with a number of agents that are not allowed to get non-HTML files. This is especially important at my site as it includes a number of very large binary datasets in numerous locations and the robots have proven too stupid to understand that downloading them is a waste of bandwidth.

RewriteCond %{HTTP_USER_AGENT} .*Ask.Jeeves.* [OR]
RewriteCond %{HTTP_USER_AGENT} .*FAST.WebCrawl.* [OR]
RewriteCond %{HTTP_USER_AGENT} .*ia_archiver.* [OR]
RewriteCond %{HTTP_USER_AGENT} .*InfoSeek.* [OR]
RewriteCond %{HTTP_USER_AGENT} .*Inktomi.* [OR]
RewriteCond %{HTTP_USER_AGENT} .*Scooter.* [OR]
RewriteCond %{HTTP_USER_AGENT} .*Slurp.* [OR]
RewriteCond %{HTTP_USER_AGENT} .*Teoma.* [OR]
RewriteCond %{HTTP_USER_AGENT} .*VoilaBot.* [OR]
RewriteCond %{HTTP_USER_AGENT} .*Google.*
RewriteRule !.*(html|htm|txt|/)$ /www/msgs/badagent.html [F]

5:18 am on May 31, 2003 (gmt 0)

10+ Year Member



Oooo, thanks for the code, rbs10025, and as is the habit around here,

Welcome to Webmaster World!

But, it's not enough to save Scooter... :) AV just irks me too much to let them come back. If they become any sort of player in the SE game, I suppose I'll have to rethink my position, but until that day...

So, now that I've taken this step, can anyone tell me what will happen next? My site's been completely (over)indexed by AV, but now that they're not welcome, what becomes of me in their index?

I suppose I'll be progressively dropped over the next (couple of?) months, since AV can no longer verify the existence of any of my pages, yes?

balam

4:57 am on Jun 1, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



rbs:

Welcome to the Webmasterworld!

Sweet code- thanks for posting it. Think I shall have to borrow it! :)

As I posted in another forum, I am having probs on the AV serps. One site- a site with 14,000 pages in Google!- has one page- the index- in AV. Another VERY popular site is not even there at all, and hasn't been for 3-4 years. I do NOT know what AV's problem is.

I seem to remember having a problem with their spider going where it did not belong, but I never banned it. But I have not seen a Scooter around in a LONG time.

You know who else has a VERY poor spider- always going where he does not belong? Jeeves.

dave

11:07 pm on Jun 1, 2003 (gmt 0)

10+ Year Member



I couldn't agree more! Altavista...are you listening?

I wrote altavista late last year about their misbehaving picture bot. Their reply, clearly fresh out of a can, had nothing to do with my question.

I wrote them back and put s p a c e s in the words that a program might search for to automate a reply. Guess what...no reply. So their picture bot was sent away for good.

Then, in March of this year, I wrote them about their seemingly useless Basic Submit.

Again, I got a response that was all about Express Inclusion, nothing about the Basic Submit that I asked about.

I again, very politely, asked them my question, and once again I received a blurb about spamming their search engine; something that had nothing to do with my question.

This was my response:

WOW! No wonder your search engine is no longer an important part of today's SEO strategy. You people cannot even answer a question. Truthfully I do not need an answer to my question because as I said before, since you introduced the Express Inclusion program, we have gotten zero (0) well optimized and information filled sites added to AltaVista's database.

Truthfully most SEO's do not even waist time with AltaVista, and now we see why.

Good Day, mysterious unnamed canned-response person.

5:10 am on Jun 3, 2003 (gmt 0)

10+ Year Member



guillermo5000,

i'm sure that would have come across except for the
misspelling of waste... sorry to do that but as a business
owner, i place a great amount of weight on proper
commumication capabilities... many others do, too... that
one word would have tossed you into my questionable
catagory...

no disrespect intended... i hope you understand...

5:22 am on Jun 3, 2003 (gmt 0)

10+ Year Member



Wow! I guess I don't have your superior commumication skills.

5:35 am on Jun 3, 2003 (gmt 0)

WebmasterWorld Senior Member jdmorgan is a WebmasterWorld Top Contributor of All Time 10+ Year Member



Since it seems the code posted above may "get around"...

Adding start anchors to speed up processing where possible, and removing some unneeded stuff, such as ".*" on unanchored patterns and redundant ua strings such as Inktomi/Slurp:

RewriteCond %{HTTP_USER_AGENT} Ask.Jeeves [OR]
RewriteCond %{HTTP_USER_AGENT} ^FAST-WebCrawl [OR]
RewriteCond %{HTTP_USER_AGENT} ^ia\_archiver [OR]
RewriteCond %{HTTP_USER_AGENT} InfoSeek [OR]
RewriteCond %{HTTP_USER_AGENT} ^Scooter [OR]
RewriteCond %{HTTP_USER_AGENT} Slurp [OR]
RewriteCond %{HTTP_USER_AGENT} ^Teoma [OR]
RewriteCond %{HTTP_USER_AGENT} VoilaBot [OR]
RewriteCond %{HTTP_USER_AGENT} ^Googlebot
RewriteRule !\.(html|htm|txt)$ /www/msgs/badagent.html [F]

The original code as posted will disallow all of these user-agents from all subdirectories; if you've copied it, make sure that's what you want to do. Otherwise, remove the "|/" at the end of the RewriteRule as shown above.

IIRC, the issue with Scooter was the re-deployment of Scooter/1.0 to spider images. It did not properly obey robots.txt. I haven't seen any problems with later versions of Scooter, but of course, YMMV.

Jim

5:53 am on Jun 3, 2003 (gmt 0)

10+ Year Member



guillermo5000,

apologies, dude... there is a difference between chatter in these and other forums and business oriented email... if what you posted was what you sent to them via email, oh well...

again, apologies... there aren't and grammar (not spelling!) checkers for these forums... heck, i can't even figure out how to click on the link so that it carries me to only the new posts instead of having to wade thru all the previous posts that i've already read and still maintain a link to the past postings...

8:40 pm on Jun 3, 2003 (gmt 0)

10+ Year Member



IIRC, the issue with Scooter was the re-deployment of Scooter/1.0 to spider images. It did not properly obey robots.txt. I haven't seen any problems with later versions of Scooter, but of course, YMMV.

Indeed it does...

216.39.48.114 - - [01/May/2003:07:28:17 -0800] "GET /robots.txt HTTP/1.1" 200 2347 "-" "Scooter/3.3.vscooter"
216.39.48.114 - - [01/May/2003:07:28:17 -0800] "GET /someimage.jpg HTTP/1.1" 200 41331 "http://www.mysite.com/somefile.shtml" "Scooter/3.3.vscooter"

But that's not the only reason I'm unhappy with Scooter. There are numerous 'burps'...

216.39.48.34 - - [05/May/2003:17:44:10 -0700] "GET /inde HTTP/1.0" 302 306 "-" "Scooter/3.3"

And then there's Scooter/3.3_SF, who has an unhealthy fascination with 4 pages of mine. Fetched on an (almost?) daily basis, three of these pages have not changed at all (not even re-uploaded, so the "Last-Modified" date hasn't changed) since they were added to the site a couple of years ago. The fourth is updated about every six months... I'd love to know what warrants such attention. A page of mine that automagically updates itself every two hours is steadfastly ignored...

(Actually, the page is updated with my own, very well behaved bot...)

I don't forget Scooter/3.2, but 3.2 forgets me. Months go by before it bothers to re-index the site... That's some fresh database AV has.

Jim, do you know when Scooter/1.0 was redeployed?

Meanwhile, in other news...

Thrust!

i place a great amount of weight on proper commumication capabilities... [...] that one word would have tossed you into my questionable catagory...

Parry!

I guess I don't have your superior commumication skills.

Oooo... Stumble!

there aren't and grammar (not spelling!) checkers

Enter the dogs...
If years of Usenet taught me anything, it's that you DON'T call up people for spelling or grammatical errors - or a distinct lack of understanding of what the SHIFT key is for ;) - because it only turns a big magnifying glass on you and your posts. Plus, spelling & grammar checkers offer nothing when youse gotsta actooally speek to a client.

balam

8:47 pm on Jun 3, 2003 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It's kind of funny that the most successful search engine has a rep that is active at WebmasterWorld. I would take a search engine more seriously if they were here answering questions at WebmasterWorld. I've never seen anyone from Altavista, just Google and Inktomi.

This 18 message thread spans 2 pages.
 
