How do you get that data from Altavista? It occurs to me that given the right range of queries, a merge, etc. one could get a figure.
Is there an easier way? I think I might have found one, but I'd like to hear expert opinions before giving it all away...:)
Cheers,
Han Solo
For all of the lucky contestants, go to the listings altavista site at listings.altavista.com, and try searching for
-rse
Don't ask me why this does what it does... Does this do something in Unix or OpenVMS that I should know about?
And if there are other letter combos that give a higher number I'd like to know about it...but this was the biggest number I could get.
Cheers,
Han Solo
I bow before the wisdom of the mighty...and the grand prize goes to msgraph.
Thanx for the tip.
The question remains: why does this work, and what are we really seeing? And also, thanks for the information.
Cheers,
Han Solo
Well, for one thing, it is the "listings" DB, and it seems like everything is in this one.
URL:http:// is pulling
http://domain.com - Last modified on: 03-Sep-00 - 28154 bytes - English
As for -rse displaying 2 million fewer pages, I think it has something to do with what you said (Unix or OpenVMS). Maybe those 2M do not have whatever that code is on their server?
I guess it now shows that AV pulls whatever it needs from "listings" to fill up the whatever_specific AV indexes around the globe.
I'm keeping an eye out for the next re-index of Main AV to see if the results match up with the current "listings" results. If they do, then it is a good prediction tool to see how you will be listed in the Main index at the next update.
I figure that Main AV pulls about 100M of "listings" results into theirs.
The reason for this is some of it is considered spam, and therefore has an added penalty to it...I'm not going to give specifics, but I will say if you search for the highly contested industry term that so many, many companies compete over (just look at the bids on goto to see how crazy some are)...
Then compare the results between the 2 sources. Notice anything changed? Should be...and you can figure out who made the cut, and who didn't...aka, what I believe to be the companies they marked as "spammers", and the ones who play by their rules.
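For anyone who wants to do that comparison less by eyeball, here is a minimal sketch. It assumes you have copied the result URLs from both sources into two lists by hand; the function name and the example domains are made up for illustration, and nothing here talks to AltaVista.

```python
def compare_listings(listings_urls, main_urls):
    """Split the 'listings' results into URLs that also appear in the
    main index and URLs that do not (order of 'listings' is kept)."""
    main = set(main_urls)
    made_cut = [u for u in listings_urls if u in main]
    dropped = [u for u in listings_urls if u not in main]
    return made_cut, dropped


# Hypothetical, hand-copied results; these domains are placeholders.
listings = ["http://a.example/", "http://b.example/", "http://c.example/"]
main_index = ["http://c.example/", "http://a.example/"]

made_cut, dropped = compare_listings(listings, main_index)
print("made the cut:", made_cut)
print("dropped:", dropped)
```

A URL missing from the main index only shows that it did not carry over; whether that is a spam penalty is still a guess.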
To me, listings represents everything they could store off of the internet, and successfully categorize...I don't honestly believe they have the resources to duplicate what google/fast do with the # of documents they have sorted.
Cheers,
Han Solo
Did AV stop indexing submitted ftp URLs at some point? I haven't paid any attention in a long time, but I guess I missed that change.
I think it's the "search or not search" search, since NL supports boolean searches. It won't work on AV, though, due to their timeouts and only partial support for boolean logic.
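To show why a "term OR NOT term" query would return the whole index on an engine with full boolean support, here is a toy sketch. The three documents and the tiny inverted index are invented for the example; it does not describe how NL or AV actually evaluate queries.

```python
# Toy inverted index: term -> set of document ids. All data is made up.
docs = {
    1: "search engine index",
    2: "unix and openvms",
    3: "search the web",
}
all_ids = set(docs)

index = {}
for doc_id, text in docs.items():
    for term in text.split():
        index.setdefault(term, set()).add(doc_id)


def term_or_not_term(term):
    """Evaluate 'term OR NOT term' against the toy index."""
    has_term = index.get(term, set())
    lacks_term = all_ids - has_term   # NOT term
    return has_term | lacks_term      # OR


# Whatever term is used, the union covers every document.
print(len(term_or_not_term("search")), "of", len(all_ids), "documents")
```

Because every document either contains the term or does not, the reported count is the size of the whole index, which is what makes the query usable as a counter.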
I searched for http on the power search [altavista.com] with "The search should include all the words".
Search found 0 returns, but ignored 1,098,877,583. These are the results using Opera and Netscape.
The kicker is that the same search with IE "ignores" only 383,496,896.
Did it about 3 times with consistent results...
Go figure...
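For scale, here is the plain arithmetic on the two "ignored" counts quoted above. The variable names are mine, and this says nothing about why the browsers disagree.

```python
# "Ignored" counts as reported in the post above.
ignored_opera_netscape = 1_098_877_583
ignored_ie = 383_496_896

gap = ignored_opera_netscape - ignored_ie
ratio = ignored_opera_netscape / ignored_ie

print(f"difference: {gap:,}")
print(f"ratio: {ratio:.2f}")
```

The difference comes to 715,380,687, so the IE figure is only about a third of what Opera and Netscape report.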
That would be interesting, though. Why don't they publicize that they have a billion documents, too? I'm thinking they might not. Can anyone else duplicate the feat of getting the billion to show up?
Thanks for the info, I'll try it again later, to see if the numbers change.
Cheers,
Han Solo