Forum Moderators: goodroi


Do search engines crawl and spider registered content?

Spiders, crawlers, indexing, registered content

         

Oimachi2

10:27 am on May 11, 2006 (gmt 0)

10+ Year Member Top Contributors Of The Month



Hi,

I would like to know whether search engines crawl and index content that is only available to registered users.

Let's say, for example, I had some very precious information that I wanted people to stumble upon, but that they would then have to pay a subscription fee to view.

Would Google and the rest still see the relevant content and list it in their search results?

Thanks

Dijkgraaf

8:32 pm on May 11, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



If a user can get to it without logging in, then so can Googlebot.

What you could do is allow Googlebot to index it but not cache it, and do some nifty detection to see whether it is Googlebot or a user visiting, so that only users are forced to subscribe and log in. But there is no 100% reliable way of doing this that wouldn't cause some issues.
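For the "index but not cache" part, here is a rough sketch in Python, assuming your server lets you set response headers. The noarchive directive asks search engines not to show a cached copy of the page. The function name robots_headers is made up for illustration; the same directive can also go in a <meta name="robots" content="noarchive"> tag in the page head.

```python
def robots_headers(allow_index=True, allow_cache=False):
    """Build an X-Robots-Tag header for a page (hypothetical helper)."""
    directives = []
    if not allow_index:
        directives.append("noindex")    # keep the page out of the index
    if not allow_cache:
        directives.append("noarchive")  # index it, but show no cached copy
    if not directives:
        directives.append("all")        # no restrictions
    return {"X-Robots-Tag": ", ".join(directives)}


# Default: indexable, but not cached.
headers = robots_headers()
```

Whether each engine honours the directive is up to that engine, so treat this as a request rather than a guarantee.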

Oimachi2

4:22 am on May 12, 2006 (gmt 0)

10+ Year Member Top Contributors Of The Month



Hi Dijkgraaf,

Thank you for your reply. However, I'm still not that clear on the answer...

So if I don't modify the robots files or do any tweaking, and just publish my pages as they are, but as registered content, will Google crawl and index them or not?

Thanks,
Bruno

Dijkgraaf

11:38 pm on May 12, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



What do you mean by "registered content"? Do you mean that the user needs to register and log in?

If users need to log in to see the page, and you haven't done any User Agent detection to allow bots to visit without logging in, then those pages will not get indexed.
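To give an idea of what User Agent detection looks like, here is a minimal sketch in Python. The names BOT_TOKENS, looks_like_bot and requires_login are invented for this example, and the token list is only illustrative. It trusts the User-Agent string as sent, which anyone can fake, so it is a starting point rather than real protection.

```python
# Substrings that the crawlers we want to admit put in their User-Agent.
# Illustrative list only; adjust it to the bots you care about.
BOT_TOKENS = ("googlebot", "bingbot", "msnbot", "slurp")


def looks_like_bot(user_agent):
    """Return True if the User-Agent header claims to be a known crawler."""
    ua = (user_agent or "").lower()
    return any(token in ua for token in BOT_TOKENS)


def requires_login(user_agent, is_logged_in):
    """Decide whether to send this visitor to the login page."""
    if is_logged_in:
        return False
    # Let claimed crawlers through so the page can be indexed.
    return not looks_like_bot(user_agent)
```

In a real site you would call requires_login at the top of each protected page and redirect to the login form when it returns True.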

Oimachi2

4:12 am on May 13, 2006 (gmt 0)

10+ Year Member Top Contributors Of The Month



Yes, that's what I mean: users have to log in to see the content, like a subscription.

How do I do this? "User Agent detection to allow bots to visit without logging in"

Thanks!

Dijkgraaf

9:17 am on May 13, 2006 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



It would depend on what scripting language you are using. Do a search for "user agent detection" plus the name of your scripting language and you will find various resources.
However, do be aware that it is very easy to spoof the user agent, so this does open up the possibility of users accessing your content without registering.
You might want to combine User Agent detection with a check against the IP addresses that belong to the bots you want to let in.
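
One way to do the IP check without keeping a list of addresses is a reverse DNS lookup followed by a forward lookup, which is the method Google describes for verifying Googlebot. Below is a hedged sketch in Python for Googlebot only. The function names are made up for this example, the lookups shown are IPv4-only, and the resolver functions are passed in as parameters so the logic can be tried without touching the network.

```python
import socket

# Hostname suffixes Google documents for its crawlers.
ALLOWED_SUFFIXES = (".googlebot.com", ".google.com")


def reverse_lookup(ip):
    """IP address -> hostname, or None if there is no PTR record."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(host):
    """Hostname -> list of IPv4 addresses (empty list on failure)."""
    try:
        return socket.gethostbyname_ex(host)[2]
    except OSError:
        return []


def is_verified_googlebot(user_agent, ip,
                          reverse=reverse_lookup, forward=forward_lookup):
    """Check the User-Agent claim, then confirm it through DNS."""
    if "googlebot" not in (user_agent or "").lower():
        return False
    host = reverse(ip)
    if not host or not host.lower().rstrip(".").endswith(ALLOWED_SUFFIXES):
        return False
    # The hostname must resolve back to the same address.
    return ip in forward(host)
```

DNS lookups are slow, so in practice you would probably cache the result per IP address instead of checking on every request.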