Who are the crawlers, and who do I need to submit each and every page to?
I am mainly concerned with the English-speaking web. Which crawlers and indexers need to be submitted to independently, even though they belong to the same company, e.g. lycos.co.uk and lycos.com, AV, Excite, and others?
Below are lists of who I think are the crawlers and who I think are the indexers, along with comments on the different country versions, which pull from different resources.
Please edit or add to the lists, engines, or comments on feeds.
Crawlers (submit the base URL - they find the internal links and index those pages)
Google (draws from Yahoo and ODP, all country versions the same)
Fast (only one version - don't know if it feeds off anyone, though possibly fed by Lycos and Hotbot).
Ink (feeds partners' algos, which manipulate the results differently; is fed by partners)
AV (each directory has different crawlers)
Direct Hit (only one, feeds partners - MSN, Lycos, etc.)
Northern Light (own maintained db - own crawls, feeds no-one)
Indexers (need to submit each page)
Anzwers (only one - feeds itself and Ink.)
Lycos (many, different regionals - feeds itself, is fed by Fast, DH, and different resources regionally)
Excite (feeds off partners and itself, different regionals with different resources)
Sympatico (feeds off partners, Lycos, Fast, DH and more)
Hotbot (feeds off Lycos, Fast, itself)
Webcrawler (feeds off Excite and partners)
Canada.com (feeds off direct submissions, now closed unfortunately; feeds Ink.)
There are more, such as AOL, Netscape, TERRA (Latino), Share (European) and others.
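To see which engines a single submission might eventually reach, the feed relationships in the lists above can be written down as a small graph. This is only a rough sketch: the `DRAWS_FROM` table and the `downstream` helper are made-up names for illustration, and the entries are the guesses from this post, not confirmed partnerships.

```python
from collections import deque

# Consumer -> sources it draws on, as tentatively listed in this post.
# These are the poster's guesses, not confirmed partnerships.
DRAWS_FROM = {
    "Lycos": ["Fast", "Direct Hit"],
    "Hotbot": ["Lycos", "Fast"],
    "Sympatico": ["Lycos", "Fast", "Direct Hit"],
    "Webcrawler": ["Excite"],
    "MSN": ["Direct Hit"],
}


def downstream(source, draws_from=DRAWS_FROM):
    """Return every engine that directly or indirectly draws on `source`."""
    reached = set()
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for consumer, sources in draws_from.items():
            # Follow each feed edge once, so cycles cannot loop forever.
            if current in sources and consumer not in reached:
                reached.add(consumer)
                queue.append(consumer)
    return reached


if __name__ == "__main__":
    for engine in ("Fast", "Direct Hit", "Excite"):
        print(engine, "->", sorted(downstream(engine)))
```

If the guesses hold, getting into an engine near the top of the chain matters more than submitting to every portal that merely repackages its results.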
Then obviously there are the big directories.
Looksmart & Partners (.com & regionals - language and domain dependent)
Small national directories (some domain and language dependent)
Any advice or knowledge that anyone is willing to impart on these matters would be gratefully received.
AV: submit each page. Occasionally goes on a full crawl. Appears to be dependent on click-pop at the moment. They are also in the middle of a full crawl, but most feel this is not an inclusion crawl and that they are instead running a data strip-mining operation.
INK: submit each page. They can go on full crawls too, but those crawled pages are rarely added.
WiseNut, AllTheWeb (Lycos), Google: submit the root, and they should get the whole site from just that.
Excite: our prayers are with you.
DirectHit: no one has put together a comprehensive review of DH yet. Submit each page, and make sure to clean cookies and click the mouse out of it at Hotbot and MSN.
NorthernLight: much like AV and Ink. It does help to submit each and every page.
Teoma: Not enough data to determine. We've heard (from many people) that Teoma will take submissions via email (don't do it, give them a break), but each one is a hand review and add for them.
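The rundown above boils down to a per-engine rule: submit every page, or submit only the root. A minimal sketch of that as a lookup, assuming the hypothetical names `SUBMIT_POLICY` and `urls_to_submit` and placeholder example.com URLs. Excite and Teoma are left out on purpose, since no clear rule is given for them.

```python
from urllib.parse import urlsplit

# Per-engine rule taken from the advice above; labels are illustrative.
SUBMIT_POLICY = {
    "AV": "each_page",
    "Ink": "each_page",
    "DirectHit": "each_page",
    "NorthernLight": "each_page",
    "WiseNut": "root_only",
    "AllTheWeb": "root_only",
    "Google": "root_only",
}


def urls_to_submit(engine, page_urls):
    """Return the URLs worth submitting to `engine` for the given pages."""
    if engine not in SUBMIT_POLICY:
        raise KeyError(f"no submission advice recorded for {engine!r}")
    if SUBMIT_POLICY[engine] == "each_page":
        return list(page_urls)
    # root_only: reduce every page to its site root, keeping first-seen order.
    roots = []
    for url in page_urls:
        parts = urlsplit(url)
        root = f"{parts.scheme}://{parts.netloc}/"
        if root not in roots:
            roots.append(root)
    return roots


if __name__ == "__main__":
    pages = [
        "http://www.example.com/index.html",
        "http://www.example.com/widgets/blue.html",
    ]
    for engine in SUBMIT_POLICY:
        print(engine, urls_to_submit(engine, pages))
```

Asking for an engine with no recorded advice raises a `KeyError` rather than guessing, which matches the "not enough data" verdict on Teoma.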