How do search engines deal with cookies and/or GET?
Generally speaking, when a search engine comes to a site, does it accept or ignore cookies?
In the case of a site I'm working on at the moment, if the engine doesn't accept cookies, the site switches to passing a session ID via GET, so how will the engine deal with that? I know that engines used to ignore URLs with a QUERY_STRING, because they could get sucked into a loop that ended up bringing both servers down, but what about now?
Do they ignore pages with a QUERY_STRING altogether? Do they follow the links anyway, but use some kind of intelligence to avoid loops? Or do they strip the QUERY_STRING and index the pages anyway? If it's the latter, they'll end up creating a load of useless session files on my machine, and I want to avoid that.
Any help appreciated. Ta.
Yes, query strings are still ignored by all the majors.
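If the worry is stray session files, one option is simply not to start a session when the visitor looks like a spider. Here is a rough sketch in Python, not anything the engines publish: the user-agent fragments, the `sid` parameter name and the helper names are all assumptions for illustration, so check your own logs for what actually hits your site.

```python
# Illustrative user-agent fragments only (an assumption, not an official list).
SPIDER_FRAGMENTS = ("googlebot", "scooter", "slurp", "architextspider")


def looks_like_spider(user_agent):
    """Return True if the User-Agent header contains a listed spider fragment."""
    ua = (user_agent or "").lower()
    return any(fragment in ua for fragment in SPIDER_FRAGMENTS)


def session_id_for(user_agent, new_session):
    """Return a session id for browsers and None for spiders.

    new_session is a hypothetical callable that creates the session
    (and its file) and returns the id. It is never called for spiders.
    """
    if looks_like_spider(user_agent):
        return None
    return new_session()


def link(path, session_id, has_cookie):
    """Build a link, adding ?sid=... only when there is a session and no cookie."""
    if session_id is None or has_cookie:
        return path
    separator = "&" if "?" in path else "?"
    return f"{path}{separator}sid={session_id}"
```

With this, a spider gets clean links with no session ID in them, and a browser that refuses cookies still gets the GET fallback.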
Brett, am I misunderstanding your post when you say that:
or am I taking it out of context?
Excite, Google, Infoseek, Webcrawler and Hotbot are all setting cookies from the submit page. What am I missing?
Hello Herb, welcome to the board.
Brett is talking about spiders accepting cookies when crawling a site. Getting them set on your machine when you visit a search engine page is a different story.
But if you push Brett, I'm sure he'll tell you that the cookies set on your machine by the engines also do nothing, except maybe tell them that one of the elements of a real browser is present. I have turned off cookies many times at submission without ill effects.
Is this still the case?
So if you submit many index pages and clean cookies between submissions, the engines won't penalize you for over submitting and the submission goes through okay?
Yes 2Much, you are right on, this is correct :)
<if you submit many index pages and clean cookies between submissions, the engines won't penalize you for over submitting and the submission goes through okay?>
Surely not? Would cookies have anything to do with this?
If you oversubmit in a given period from completely different machines on independent IPs you'll still get hammered for oversubmission, cookies or no cookies.
I should have qualified my last post - oversubmission is only a problem where submit limits apply, of course (AV, Excite, Hotbot...)
2_much, there is a bit of debate about it right now. I think it is wise to accept the cookie. The cookie itself isn't much of a burden on resubmission; your IP is the key. It is easier for them to tag a submission with your IP address than with a cookie. The only two engines I worry about in that regard are Alta and Excite. I've been convinced Alta is tracking IPs and tossing submissions after #xyz per day from the same IP (I say it is 50, but don't quote me, because I submit far more than that some days).
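To make the IP point concrete, here is a minimal sketch of what a per-IP daily cap could look like on the receiving end. This is only a guess at the general idea, not how Alta or anyone else actually does it. The class name is made up, and the default limit of 50 is just the unconfirmed figure mentioned above.

```python
from collections import defaultdict
from datetime import date


class DailyIpLimiter:
    """Count submissions per (IP, day) and reject those over the limit."""

    def __init__(self, limit=50):  # 50 is an unconfirmed guess, not a known value
        self.limit = limit
        self.counts = defaultdict(int)  # maps (ip, day) to submissions accepted

    def accept(self, ip, day=None):
        """Return True and count the submission, or False if the IP is over the limit."""
        day = day or date.today()
        key = (ip, day)
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True
```

Notice that nothing here looks at a cookie: clearing cookies between submissions would not change the count, which is why the IP matters more.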