>And what is the difference between "All" and "WebmasterWorld" index
There's WebmasterWorld.com [WebmasterWorld.com] and SearchengineWorld.com [searchengineworld.com].
There are free packages that can handle this size pretty well (2-3 million up to 10 million pages). There are currently ~96k WebmasterWorld pages indexed at Google, so that's a no-brainer.
50k queries / hour? Could be an interesting experiment. I'd put the search engine, robot and indexer on 1 to n extra machine(s).
>- have all the bells and whistles possible
What bells and whistles? You won't need site, anchor, link or similar advanced search commands. Link popularity? Not needed here. Phrase support, customizable ranking parameters, ...
Give me full robot access and I'd set something up for ya (mind you: hosted on a Mac). Just for fun and testing. If you're interested, sticky me.
Here is a hint: none of these will work:
[searchtools.com...]
Closest I have seen yet is asp seek. Unfortunately, the crawler is junk and would need to be rewritten. It also has little in the way of configurability.
> currently ~96k
Ya, we only allow a subset to be indexed. Many forums are blocked...
mnogo
>Closest I have seen yet is asp seek
aspseek is a mnogo clone.
I run mnogo search engines with index sizes from 100k to 2.4 million pages and have found them good enough for my needs. Better than the current WebmasterWorld site search anyway. Speed is an issue with mnogo as well as with aspseek, or at least a challenge.
Could be very interesting!
How about Gigablast? [gigablast.com...]
check - handle 2-3 million pages with room to grow to 10 million,
check - have all the bells and whistles possible
check - give a response time under 2 seconds under moderate 50k an hour max load conditions.
free - and cost under $500.
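For what it's worth, here is a rough back-of-envelope on the 50k an hour / 2 second figures quoted in this thread. It is only a sketch: it assumes queries arrive evenly across the hour (real traffic is burstier), and the function names are made up for illustration.

```python
# Back-of-envelope load estimate for the figures quoted in the thread.
QUERIES_PER_HOUR = 50_000     # max load named in the requirements
MAX_RESPONSE_SECONDS = 2.0    # response-time ceiling named in the requirements


def queries_per_second(queries_per_hour: float) -> float:
    """Average arrival rate, assuming traffic is spread evenly over the hour."""
    return queries_per_hour / 3600.0


def queries_in_flight(qps: float, response_seconds: float) -> float:
    """Little's law: average in-flight requests = arrival rate * time in system."""
    return qps * response_seconds


qps = queries_per_second(QUERIES_PER_HOUR)
in_flight = queries_in_flight(qps, MAX_RESPONSE_SECONDS)
print(f"{qps:.1f} queries/s, about {in_flight:.1f} queries in flight at the 2 s limit")
```

So the target is roughly 14 searches a second on average, with up to about 28 being worked on at once if each takes the full 2 seconds. Peaks would be higher than that average.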