
Forum Library, Charter, Moderators: brotherhood of lan & mack

New To Web Development Forum

    
How do you know if your site accepts "Spiders"?
ojgibbins

10+ Year Member



 
Msg#: 259 posted 10:52 pm on Jan 14, 2003 (gmt 0)

How do you know if your site accepts "SPIDERS"?

The site in question is <snip>

I built it and the guy is trying to promote it... The company is asking does my site accept spiders.

Cheers

Owen

 

Dreamquick

WebmasterWorld Senior Member 10+ Year Member



 
Msg#: 259 posted 11:50 pm on Jan 14, 2003 (gmt 0)

Does the site accept spiders?

Unless you have explicitly barred spiders from your site, the answer is normally yes. However, "can spiders crawl your site properly?" is probably the next question you need to ask...

That will depend on how it was created - some design techniques and construction methods are more suited to being accessed by spiders than others.

For example, regular text-heavy HTML can be *very* spider-friendly if used correctly; equally, a badly thought-out HTML structure can be extremely spider-unfriendly. Until recently Flash was another extremely spider-unfriendly technology. The key thing to remember is that spiders are looking for text content within a page - if a spider can get to that content easily and you have put some thought into it then you will often be surprised how well you can do...
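To make the "spiders are looking for text" point concrete, here is a rough Python sketch of what a text-only spider might be able to read from a page. It is only an illustration using the standard library's html.parser; real search engine spiders are far more sophisticated, and the two sample pages (including the intro.swf file name) are made up for the example.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects the text a simple text-only crawler could read from a page."""

    # Content inside these tags is code or styling, not readable text.
    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that sits outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html):
    """Return the readable text of an HTML string, joined with spaces."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


# Hypothetical pages: one built from plain text, one built around a Flash movie.
text_page = "<html><body><h1>Widgets</h1><p>Hand-made blue widgets.</p></body></html>"
flash_page = '<html><body><object data="intro.swf"></object><script>var x = 1;</script></body></html>'

print(repr(visible_text(text_page)))
print(repr(visible_text(flash_page)))
```

The text-based page yields its heading and paragraph, while the Flash-only page yields nothing at all for a text-only reader to work with.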

As I said a second ago, at the moment text content is king because search engines can get access to it so easily. From this point of view, your goal in creating spider-friendly pages should really be to put the best text content you can into your pages and have it nicely formatted and laid out so that users can make the best use of it.

While you are there you might also want to get those pages as close to perfect in terms of markup validation as you can - again something which will generally benefit your users.

This has been very user-focused so far hasn't it?

There's a reason for that - search engines exist to crawl websites designed for *people* rather than sites designed purely for search engines. A document that works well for a user is exactly what a spider must be able to understand if its search engine is to have even the slightest chance of making it into the limelight.

As long as you don't try anything too bizarre, spiders really do try their best to crawl pages, because that is their objective in life - to crawl as many pages as they can handle so that their search engine can provide the best results it can.

I hope some of that has been helpful.

- Tony

jdMorgan

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member



 
Msg#: 259 posted 12:01 am on Jan 15, 2003 (gmt 0)

Owen,

Welcome to WebmasterWorld [webmasterworld.com]!

Here are some resources for you:
A robots.txt tutorial [searchengineworld.com].
The robots exclusion standard [robotstxt.org].

Also, you may want to try a site search (link at top of screen) here on WebmasterWorld for robots and spider-related topics.

HTH,
Jim

Slade

10+ Year Member



 
Msg#: 259 posted 1:27 am on Jan 15, 2003 (gmt 0)

When you're ready to test, also try:

Search Engine Tools [searchengineworld.com] (there is a simulated spider here)
GigaBlast [webmasterworld.com] thread - they do real time spidering, while you watch! (or did when the thread was posted)

wilderness

WebmasterWorld Senior Member, WebmasterWorld Top Contributor of All Time, 10+ Year Member, Top Contributors Of The Month



 
Msg#: 259 posted 2:13 am on Jan 15, 2003 (gmt 0)

<snip>The company is asking does my site accept spiders.>

The only reasons your site, or any site, wouldn't accept spiders would be if you either ask them to stay out by using "Disallow" in your robots.txt
[searchengineworld.com...]
or
you have placed access restrictions in your .htaccess file.
[baremetal.com...] (couldn't find a reference in WebmasterWorld Glossary)
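
If you want to check the robots.txt side yourself, here is a minimal sketch using Python's standard urllib.robotparser module. The robots.txt contents, the "ExampleBot" spider name, and the example.com URLs are all made up for illustration; substitute your own file and addresses. Note that it only covers robots.txt rules and says nothing about .htaccess restrictions, which are enforced by the server itself.

```python
from urllib.robotparser import RobotFileParser


def spider_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt files.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"
BLOCK_PRIVATE = "User-agent: *\nDisallow: /private/\n"
NO_RULES = ""

for name, rules in [("block all", BLOCK_ALL),
                    ("block /private/", BLOCK_PRIVATE),
                    ("no rules", NO_RULES)]:
    home = spider_allowed(rules, "ExampleBot", "http://example.com/index.html")
    private = spider_allowed(rules, "ExampleBot", "http://example.com/private/page.html")
    print(f"{name}: home page allowed={home}, private page allowed={private}")
```

An empty robots.txt (or one with no matching rules) leaves everything open, which matches the "normally yes" answer given earlier in the thread.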


All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved