This is my first post to the forum. I just found it the other day and have enjoyed reading posts for a week or two. I hope this post is in the right forum group.
I am in the process of changing my site to require the acceptance of cookies from visiting browsers.
At my site, Perl scripts create the HTML page served up by combining HTML content and HTML structure together. In order to determine whether a visiting browser has cookies enabled, I have had to insert a call to a cookie-setting script that then redirects to my other HTML-building scripts. The cookie-setting script sets a cookie. The HTML-building scripts check whether the cookie was accepted before serving up the usual combined HTML page.
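Roughly, the flow looks like this (a simplified sketch, not my actual code; my scripts are in Perl and the script names here are made up, but the idea is the same):
[pre]
<?php
// cookie_test.php (made-up name): set a test cookie, then bounce back
// to this same script to see whether the browser returned it.
if (!isset($_GET['checked'])) {
    setcookie("cookie_test", "1");                      // try to set a cookie
    header("Location: /cookie_test.php?checked=1");     // second pass
    exit;
}

if (isset($_COOKIE['cookie_test'])) {
    header("Location: /build_page.php");        // cookie came back: build the page as usual
} else {
    header("Location: /need_cookies.html");     // cookie refused: ask the visitor to enable cookies
}
exit;
?>
[/pre]
In my real setup the second script is one of the HTML-building scripts rather than a static page, but the cookie check itself is the same.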
Anyway it's working well (though I have not yet uploaded all the changes to my actual site) except for one thing...
My site will no longer allow visits from browsers or user agents that do not accept cookies.
But I DO want to allow visits from spiders, search engine robots, and other such user agents.
Will spiders and robots accept cookies and return them?
If not, will my strategy mean that all such spiders stop spidering my site content once my site rejects them and keeps redirecting them to a page asking them to accept cookies, a page to which they obviously will not respond?
Will I have to include a pass-through list of spiders in my cookie-setting code so as to let them in? Is there an easier way? If not, what is the most effective strategy for letting the most valuable spiders in?
Any insight on any or all of the above questions would be very appreciated. What I am really looking for is just some direction and not so much step by step instructions on how to do things.
Thanks.
Carlos
I was just thinking about these very issues last night.
I understand everything that you are doing, but I just don't understand the motivation.
Is there a specific reason you want 100% cookied? It strikes me that making an exception for spiders would instantly undermine this approach.
Here's something I did in PHP; maybe it will give you an idea of what to do in Perl.
[pre]
<?php
/* Use this to start a session only if the UA is *not* a search engine,
   to avoid duplicate content issues with URL propagation of SIDs */
$searchengines = array("Google", "Fast", "Slurp", "Ink", "ia_archiver", "Atomz", "Scooter");
$is_search_engine = 0;

foreach ($searchengines as $val) {
    if (strstr($HTTP_USER_AGENT, $val)) {
        $is_search_engine++;
    }
}

if ($is_search_engine == 0) { // Not a search engine
    /* You can put anything in here that needs to be
       hidden from search engines */
    session_start();
} else { // Is a search engine
    /* Put anything you want only for search engines in here */
    $foo = $bar;
}
?>
[/pre]
Nick
But jatar_k's point is good. Why enforce the use of cookies for just accessing the pages? Opening up your pages for spiders will allow anybody who wants to surf the site without cookies to do so by changing the User-Agent string.
Don't force people to use cookies but provide some additional benefits when people actually use cookies. Then it's up to them to decide whether the additional benefits make up for the lack of privacy.
Let me address the reason for cookies....
The problem I was having was trying to come up with some idea as to how many new visitors I was actually getting from my Apache web logs. "Hits" just didn't cut it for reasons that everyone here probably knows about.
I needed some way to "tag" a visitor so that if they went off to other sections of my site and looked around they would not be counted as a new visitor for every page they saw.
I also decided to do my own page-access logging, recording stats on whether visitors had cookies enabled, Javascript turned on, and other such things, as a way to help me use such technologies to better enhance my site in the future.
When I came around to recording the HTTP_REFERER field I realized that many times this field was empty. Again I needed a way to record what page a visitor was on and what paths they took around my site, so that I could better target my offerings or revise my pages to improve the overall site.
So I once again decided to use the cookie as a way to create my own referer field when a visitor is visiting areas of my site.
All in all cookies just seemed like the best alternative.
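To give an idea of what I mean (again just a sketch; my actual scripts are Perl and the file and variable names here are made up), the tagging and logging part in PHP terms would be roughly:
[pre]
<?php
// Hypothetical snippet included at the top of each page script.
session_name("visitor_id");       // non-persistent cookie that tags the visitor
session_start();

$current = $_SERVER['REQUEST_URI'];
$last    = isset($_SESSION['last_page']) ? $_SESSION['last_page'] : '-';   // my own "referer"

$line = sprintf("%s\t%s\t%s\t%s\n",
    gmdate("Y-m-d H:i:s"),
    session_id(),                 // same ID for the whole visit
    $current,
    $last);

$fp = fopen("/path/to/visit.log", "a");   // hypothetical log location
fwrite($fp, $line);
fclose($fp);

$_SESSION['last_page'] = $current;        // remember for the next page view
?>
[/pre]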
If anyone is interested, I uploaded a page I have been working on explaining more about why I am using cookies <edit>URL removed</edit> Please bear in mind that my site is NOT operational and that this page has not yet been incorporated into my site or made public. It's just for those of you here who might want more info on why I am using cookies. I am still working out my cookie strategy and am open to further input on why I should or shouldn't.
Today as I was thinking about things it became clear that checking for spiders and letting only some in was a losing proposition for my site, in terms of the time it would take to add valuable spiders to my list, make sure the good ones get in, and keep the list up to date, not to mention the slowdown in serving each page while checking through an ever-growing list.
It also became clear after doing some research on the User Agent field that this value is not really that reliable and that many spiders even use names like "Mozilla" in this field. I had thought that I could limit my cookie-requiring code to just those User Agents that turned out to be the major browsers, identifying them by words like Mozilla or MSIE or some such, but that didn't seem like a good approach either.
I think I have settled on just setting a cookie, checking to see if the cookie was accepted, logging the results for future reference, and letting everyone in regardless of cookies being enabled or not.
Since upwards of 90% of people browse with cookie acceptance enabled, I figure my use of cookies will still be about 90% or so accurate. If so, this will give me more than enough stats to determine which pages I should revise or drop, or which advertising avenues prove to be the most profitable in terms of new visitors.
For password-protected areas of my site I will definitely require 100% cookies, since I won't want search engines to spider those areas anyway and since cookies seem like the best way to create a session ID of sorts, allowing people who have successfully logged in to visit the various private content.
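For the protected pages the check itself is simple enough (again just a sketch; the login script and the session flag are made-up names):
[pre]
<?php
// Hypothetical check at the top of a members-only page.
session_start();

if (empty($_SESSION['logged_in'])) {
    // No valid session came back (no cookie, or never logged in):
    // send the visitor to the login form instead.
    header("Location: /login.php");
    exit;
}

// ...serve the private content here...
?>
[/pre]
The hypothetical login script would set that flag after a successful password check.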
Most of my cookies will be non-persistent by the way.
I very much like having control over how a new visitor is determined and counted and cookies seem to give me the ability to record things that I want and in the way I want.
Only thing I haven't figured out yet is how to work with proxy caches to still count new visitors when they cache my site pages :). For now I am declaring most of my pages uncacheable until I can build up some stats to give me some handle on how to improve my site.
Anyway I hope you all don't mind the long reply but in case anyone was interested I thought I would express things a bit more fully.
Carlos
[edited by: carlos123 at 12:09 am (utc) on Sep. 7, 2002]
[edited by: NFFC at 1:01 pm (utc) on Sep. 8, 2002]
[edit reason] URL removed [/edit]
You could have a login-only section protected by htaccess that is not accessible unless signed in, and have all of the public, spiderable information outside of that.
You could design your own server-side tracking with Perl or PHP.
You could get a better stats package to create better reports from your logs.
I don't know the best answer for you but I think, at the moment, you should investigate a little more before you jump into it. There are a lot of options.
Maybe I will indeed rethink my strategy some more. I thought of using session IDs in the URL but have tended to avoid these because it seems that search engine spiders have trouble with them (unless I feed non-session-ID URLs to spiders and session URLs to everyone else).
Thanks again for the input.
Carlos
PS. By the way the URL I posted was indeed accessible but people were apparently copying the ending period along with the URL part - which made it inaccessible. I deleted the period.
Putting [url] tags around the URL will prevent the system from treating the trailing period as part of the URL when a sentence ends with a URL.
What's a webbug?
A webbug is a tiny little insect living in fibre glass cables and eating tcp packets.
Seriously, it's a little transparent 1x1 graphic that gets referenced from any (possibly cached) HTML page and that must not be cached by any proxy server. You can achieve that with the following little PHP script:
[pre]
<?php
// Serve an uncacheable 1x1 transparent GIF.
header("Content-Type: image/gif");
header("Expires: Mon, 26 Jul 1997 05:00:00 GMT");  /* date in the past */
$now = gmdate("D, d M Y H:i:s");
header("Last-Modified: $now GMT");                 /* always modified */
header("Cache-Control: no-store, no-cache");       /* HTTP/1.1 */
header("Cache-Control: must-revalidate, post-check=0, pre-check=0", false);
header("Pragma: no-cache");                        /* HTTP/1.0 */

function hex2bin($s) {
    $bin = "";
    for ($i = 0; $i < strlen($s); $i += 2) {
        $bin .= chr(hexdec(substr($s, $i, 2)));
    }
    return $bin;
}

print hex2bin('47494638396101000100800000ffffff000000'
            . '21f90401000000002c00000000010001000002024401003b');
?>
[/pre]
[edited by: mark_roach at 6:31 pm (utc) on Sep. 7, 2002]
[edit reason] split long string to prevent scroll problem [/edit]
I have indeed been thinking of putting in a webbug now that I know what you were referring to :). I understand your code too (that's scary! Am I becoming a geek? :)) though I had never seen anyone create an image on the fly like that. Very interesting.
Only thing I don't understand is the relationship between your PHP code and the cached HTML page.
I am assuming that the file containing your PHP code is something you reference in an HTML img tag, instead of the usual true image that such tags reference.
Now that I think about it, that's a very interesting possibility. Name a Perl or PHP script with an image name, make it uncacheable, and have the cached page reference back to it. When it is accessed, record all the usual HTTP_USER_AGENT, REMOTE_ADDR, and other variables.
Does anyone know if that would work? It's a little off from the original topic of this thread, but I'm just curious. If it works it would certainly allow me to gain the benefits of having my pages cached while still giving me a more or less accurate insight into how many of my pages are being seen.
Of course one would have to tell their Apache server to run any files ending in .gif or .jpg or other such extension as CGI scripts, as long as such files were placed in something like the cgi-bin where no true image files would be.
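Building on your script above, the logging half might look something like this (untested sketch; the log path and fields are just examples):
[pre]
<?php
// Logging half of the "image" script; the GIF-serving half is
// exactly as in the script above.
$line = sprintf("%s\t%s\t%s\t%s\n",
    gmdate("D, d M Y H:i:s"),
    $_SERVER['REMOTE_ADDR'],
    isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '-',
    isset($_SERVER['HTTP_REFERER'])    ? $_SERVER['HTTP_REFERER']    : '-');

$fp = fopen("/path/to/webbug.log", "a");   // hypothetical log file
fwrite($fp, $line);
fclose($fp);

// ...then send the no-cache headers and print the 1x1 GIF as above.
?>
[/pre]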
Hmmmm....very interesting possibilities here....if it's workable.
Thanks.
Carlos
Put <img src="spacer.png" height="1" width="1" alt=""/> in your HTML code and use mod_rewrite to rewrite spacer.png to counter.php. That should hide it from most folks.
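For example, something like this in an .htaccess file (assuming mod_rewrite is available; the file names are just placeholders matching the example above):
[pre]
RewriteEngine On
# Serve the PHP counter script whenever the "image" is requested
RewriteRule ^spacer\.png$ counter.php [L]
[/pre]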
Of course the referer of that webbug will be your cached page, not the one people came from. Otherwise you got the idea. It does work and it's used out there.
An absolutely AWESOME technique is all I can say. Your input has helped me put the pieces together. An ah hah! type of moment.
This forum is something else. I just got through reading the original Stanford paper outlining the inner workings of Google at
[www-db.stanford.edu ]
I found the link by looking up Brett (I can't remember his last name) from Search Engine World in the search engines and then after reading his incredibly informative articles searching for "analysis of Google ranking".
It all started by links or info I found on this forum.
What a great resource this is!
The Stanford article gave me some very valuable insights that supported much of what Brett was saying in his articles, about the value of anchor text, the size of pages and its relationship to whether Google likes to index them, and all kinds of things. I'm actually surprised it's still up on the web. I would have thought Google would want it taken down.
It's an awesome read for anyone who is interested, though if I did not have an intense interest in the subject matter it would be a bit too academic to swallow.
I will definitely be changing things at my site. Not only will I be adding a webbug as you mentioned, Andreas, but I will also be making some of my large pages much smaller.
Thanks again for everyone's valuable input.
Carlos