Forum: Search Engine Spider and User Agent Identification
Category: The Search Engine World
Moderator: incrediBILL & Ocean10000
Previous Moderator: volatilegx (founding moderator: littleman)
Founded: Nov 2, 1999
Overview:
Spiders are small independent programs that go out and download websites. They take the website data (the same data that is viewed in a browser) and use it for various purposes. Our theme here is mainly search engine promotion, so we are mostly concerned with search engine spiders.
PREMODERATED FORUM
Every thread must be approved by a moderator before it is published. Please see the guidelines below for reasons why posts may not be approved. We try to make pre-moderation decisions in a timely manner, but because we are volunteer staff and not always available, a decision can take as long as 12-24 hours.
The moderators often edit post titles and may not always send a note to explain. Title edits are made to attract more clicks to your thread, to clarify differences between similar topics, and to help similar discussions appear as clearly non-duplicate to the search engines.
Topics Covered:
Spiders, spider IPs, and other spider topics. Spider design, care & feeding are also welcome.
Additionally, some spiders hide behind the default user agents of various programming libraries [webmasterworld.com] or behind common browser user agents. The scope of the forum has therefore expanded to include generic user agent identification and elimination as part of the spider identification process.
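As a rough illustration of what identifying a library default user agent can look like, here is a minimal Python sketch. The prefix list and the function name `is_library_default_ua` are assumptions made for this example, not a list maintained by this forum; the prefixes are the defaults that several common HTTP libraries and tools send when the caller does not set a user agent.

```python
# Hypothetical, deliberately short list of default User-Agent prefixes
# (lowercased) sent by common HTTP libraries and command-line tools.
LIBRARY_UA_PREFIXES = (
    "python-urllib/",
    "python-requests/",
    "java/",
    "libwww-perl/",
    "curl/",
    "wget/",
    "go-http-client/",
)


def is_library_default_ua(user_agent: str) -> bool:
    """Return True if the User-Agent starts with one of the known prefixes."""
    ua = user_agent.strip().lower()
    # str.startswith accepts a tuple and matches any of its entries.
    return ua.startswith(LIBRARY_UA_PREFIXES)


if __name__ == "__main__":
    for ua in ("Python-urllib/3.11", "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"):
        print(ua, "->", is_library_default_ua(ua))
```

A prefix match like this only catches spiders that never bothered to change the library default. Spiders posing as common browsers need other signals, which is where much of the discussion in this forum comes in.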
Posting Guidelines:
The WebmasterWorld Terms of Service [webmasterworld.com] remain in full effect in this forum.
IP addresses tend to change ownership over time, so unless the IP information is expressly owned by a search engine, such as Google or Yahoo, it needs to be obfuscated in the D block (the last octet) of the IP address.
Any IP address or reverse DNS information not expressly belonging to a search engine should be masked as follows: