Worth Building a UK Search Engine?


TypicalSurfer - 12:09 pm on Oct 9, 2012 (gmt 0)


discarding 'non-UK' sites


A web crawler can be designed to store every link it finds but only crawl those that meet certain criteria (the TLD, in this case). So it wouldn't be a matter of discarding documents; you just don't crawl off-target pages.

crawl seed list > collected links stored in a crawl db > crawl db cleaned of unwanted TLDs > crawl selected URLs > repeat
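
A rough sketch of that loop in Python, in case it helps. Everything named here (is_target, crawl, fetch_links, UK_SUFFIXES, FAKE_WEB) is made up for illustration, and the ".uk" suffix check is only an assumption about what "UK" would mean; a real crawler would need a fuller policy plus politeness, robots.txt handling and so on. Note that this version keeps every link in the crawl db and filters at selection time, rather than deleting off-target entries from the db.

```python
from urllib.parse import urlsplit

# Hypothetical suffix list: assumes "UK site" means a hostname ending in .uk.
UK_SUFFIXES = (".uk",)


def is_target(url, suffixes=UK_SUFFIXES):
    """Return True if the URL's hostname ends with one of the wanted suffixes."""
    host = urlsplit(url).hostname or ""
    return host.lower().endswith(tuple(suffixes))


def crawl(seeds, fetch_links, suffixes=UK_SUFFIXES, max_rounds=3):
    """Run the seed > store > clean > crawl loop for a limited number of rounds.

    fetch_links(url) is supplied by the caller and returns the links found
    on that page, so this sketch does no network access itself.
    """
    crawl_db = set(seeds)  # every link found so far, on-target or not
    crawled = set()        # URLs that were actually fetched
    for _ in range(max_rounds):
        # "Clean" step: pick only uncrawled URLs on a wanted TLD.
        selected = sorted(
            u for u in crawl_db - crawled if is_target(u, suffixes)
        )
        if not selected:
            break
        for url in selected:
            crawl_db.update(fetch_links(url))  # store all found links
            crawled.add(url)
    return crawl_db, crawled


# Tiny in-memory "web" standing in for real pages (illustrative data only).
FAKE_WEB = {
    "http://example.co.uk/": [
        "http://example.com/",
        "http://shop.example.co.uk/",
    ],
    "http://shop.example.co.uk/": ["http://example.org.uk/about"],
    "http://example.com/": ["http://never-reached.co.uk/"],
}

db, crawled = crawl(["http://example.co.uk/"], lambda u: FAKE_WEB.get(u, []))
```

Here the .com link is stored in the crawl db but never fetched, so anything linked only from it is never discovered. That is the trade-off of not crawling off-target pages.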


Thread source: http://www.webmasterworld.com/uk_ireland_search_engines/4447195.htm