Forum Moderators: lawman
The initial click on a profile link would be picked up by your CGI program, which would return a little box with another "Click for profile" button. This is easy to do by carrying the initial link in a hidden field in the new little box, and changing the usual "Submit" label on the form button to "Click for profile."
This way you need two clicks to get to the profile. Bots are too dumb to handle this without special programming. I use this on my site when some domain seems to be a bot. The title of the box I return is called "Robot Roadblock" and I explain, with an apology, that you may not be a bot after all, but I'm trying to save my bandwidth, so please click again to continue, and aren't we all lucky that bots are too dumb to click again?
With this little gizmo installed, you'd keep the profiles away from the bots, and out of Google's cache, and give forum participants more confidence that they can change their profile at any time -- such as going from full ID back to anonymous -- and do so under conditions where this change might actually be useful and effective.
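For anyone who wants to try the two-click idea, here is a minimal sketch in Python of what the CGI side might look like. The function names, the form action path /cgi-bin/profile.cgi, and the field name "target" are all made up for illustration; only the "Robot Roadblock" title, the hidden field, and the "Click for profile" button come from the description above.

```python
import html

ROADBLOCK_TITLE = "Robot Roadblock"


def render_roadblock(target_url, action="/cgi-bin/profile.cgi"):
    """Return the intermediate page shown on the first click.

    The original link travels in a hidden form field, and the submit
    button is relabelled "Click for profile". The action path and the
    field name "target" are hypothetical.
    """
    safe_target = html.escape(target_url, quote=True)
    safe_action = html.escape(action, quote=True)
    return (
        f"<html><head><title>{ROADBLOCK_TITLE}</title></head><body>\n"
        "<p>Sorry, you may not be a bot after all, but I'm trying to save "
        "bandwidth. Please click again to continue.</p>\n"
        f'<form method="post" action="{safe_action}">\n'
        f'<input type="hidden" name="target" value="{safe_target}">\n'
        '<input type="submit" value="Click for profile">\n'
        "</form></body></html>\n"
    )


def handle_request(method, target_url):
    """First click (GET) gets the roadblock; the form POST gets the profile."""
    if method.upper() == "POST":
        return ("profile", target_url)
    return ("roadblock", render_roadblock(target_url))
```

A bot that only follows links never submits the form, so it never reaches the "profile" branch. A bot that is specially programmed to submit forms would get through, which matches the caveat above.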
The current setup protects members' email addresses and allows them to reveal as much or as little information about themselves as they wish. That is as far as I am willing to take it since it is completely in the control of members themselves.
We've had several requests to allow members' profiles to be indexed. After much discussion, I have put in place a system where a robots noindex meta tag is present on the profiles of members with fewer than 100 posts. After that, we remove the noindex tag, because it does help the link popularity of the sites they link to. If a member is willing to contribute that much, we will go to those ends to support them.
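The threshold rule described above could be sketched roughly like this in Python. This is only an illustration, not the forum's actual code; the function name and the constant are assumptions, and the 100-post cutoff is the figure given in the post.

```python
# Assumed name; the value 100 is the cutoff stated in the post.
NOINDEX_POST_THRESHOLD = 100


def robots_meta(post_count, threshold=NOINDEX_POST_THRESHOLD):
    """Return the meta tag to place in a profile page's <head>.

    Profiles below the threshold ask robots not to index them;
    profiles at or above it emit no tag and can be indexed.
    """
    if post_count < threshold:
        return '<meta name="robots" content="noindex">'
    return ""
```

So a member with 99 posts gets the noindex tag, and the tag disappears at post number 100.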
One point about it being under the member's complete control: This was true prior to October, before Google went "deep," and it's true for under-100 members, assuming that the noindex meta is respected by all bots, but consider the hypothetical case of an over-100 member who has a full profile ID.
Google comes along and caches this profile. Have you ever tried to get Google to purge a cached file that has been removed or modified on the original site? -- it's difficult to get their attention! If the member wants to vacate the profile for some reason (bad guys are looking for him?), it's probably going to stay cached at Google for anywhere from 30 to 60 more days, despite the member's best efforts.
And that member will be contacting you, because Google will want to verify the situation with the actual website administrator. I went through this last August with Google; I'm assuming that things haven't improved since then.
I don't know how many bots respect the noindex meta, but I'm not confident that many do. I think you need to carry another byte in the user preferences -- a Y/N flag for whether you want your profile protected from spidering. And if the answer is "Y" then BOTH the noindex meta and the two-click sequence should be used.
But it's your call. I think this forum is the best I've seen, and based on your track record, whatever you decide is probably the best solution in any case.
And you're one click away from reading the profile. Top ranking, too!
Now I just have a one-pix of me in there. Your "border=1" doesn't frame my best side, but most won't notice.
Any policy on this?
Thanks. It wasn't easy, but you can hardly even tell that I've lost most of my hair!
By the way, is it within policy parameters to generate this transparent one-pix with a CGI program? You don't test for a graphics extension, and I know it would work. I can do some HTTP_USER_AGENT and HTTP_FROM logging this way, and get an idea of which bots don't respect the noindex meta.
And if that's okay, then is it okay to generate my portrait (one-pix or any size) with a CGI program and at the same time plant a cookie? Unlike Google, I'll have it expire early -- in 2037 instead of 2038. (Just kidding; I don't really want to plant a cookie. But I think the bot monitoring would be useful.)
inktomi3-not.server.ntl.com - - [13/May/2001:15:44:39 -0400] "GET /portrait.gif HTTP/1.0" 200 43
Technically, according to webbug critic and privacy expert Richard Smith, it's not a webbug unless the one-pix is from an off-site domain. The one-pixel GIF under discussion sits on a profile page here but is served from my own domain, so this use is indeed a webbug. When I use it on my home page, it's not a webbug.
Another thing: while there is plenty of time for a CGI program to catch all the environment variables and do logging, and then spit out the 43 bytes for a one-pixel transparent GIF to keep Apache and the remote browser happy, the HTTP_REFERER variable is almost worthless in this case.
About 999 times out of a thousand, the referer for this profile GIF would be:
In other words, it's referred by the profile page itself. Once in a while you get something more interesting, but it's very rare. I've been log watching for years on this. Too bad, because I'd be curious about how many profile viewers come in locally, as opposed to how many come in from a link provided by bots.
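Here is a rough Python sketch of the kind of CGI program described above: collect a few environment variables for logging, then hand back a 43-byte transparent GIF. The byte string is one commonly used 1x1 transparent GIF, not necessarily the one I serve, and the function names are invented for the example.

```python
# One widely circulated 1x1 transparent GIF, 43 bytes long.
PIXEL_GIF = (
    b"GIF89a"                                 # signature and version
    b"\x01\x00\x01\x00"                       # logical screen: 1 x 1
    b"\x80\x00\x00"                           # global color table flag, bg, aspect
    b"\x00\x00\x00\xff\xff\xff"               # two-entry color table
    b"!\xf9\x04\x01\x00\x00\x00\x00"          # graphic control ext, index 0 transparent
    b",\x00\x00\x00\x00\x01\x00\x01\x00\x00"  # image descriptor
    b"\x02\x02D\x01\x00"                      # LZW-compressed pixel data
    b";"                                      # trailer
)


def log_fields(environ):
    """Pick out the CGI variables worth logging; "-" marks a missing one."""
    return {
        "remote_host": environ.get("REMOTE_HOST")
        or environ.get("REMOTE_ADDR", "-"),
        "user_agent": environ.get("HTTP_USER_AGENT", "-"),
        "from": environ.get("HTTP_FROM", "-"),
        "referer": environ.get("HTTP_REFERER", "-"),
    }


def pixel_response():
    """Return (headers, body) for the one-pixel GIF."""
    headers = [
        ("Content-Type", "image/gif"),
        ("Content-Length", str(len(PIXEL_GIF))),
    ]
    return headers, PIXEL_GIF
```

In a real CGI script you would pass os.environ to log_fields, append the result to a log file, and write the headers and body to standard output.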
I've been using this technique to rotate some cartoons and boxes with each one of my no-cache refreshes, to keep the home page more interesting. If you use a "CGI-pretend GIF" in this context, as a trigger to rearrange your "no-cache meta" home page, you also need one more trick to guarantee that there's no way your "pretend GIF" gets cached anywhere. Because if it's cached anywhere, it won't trip the next time for that user. Even a no-cache meta on a page doesn't prevent the GIFs themselves from getting cached.
That simple trick is this:
Instead of calling [img src="blah.blah/cgi-bin/blah.cgi"] you should instead call [img src="blah.blah/cgi-bin/blah.cgi/12345"] in which 12345 is the process ID or some other number that changes every time. This number ends up in PATH_INFO, where you just ignore it. The thing about this is that there isn't a cache system on earth, at least in my experience, that doesn't think it has to GET the new image. The number on the end makes it look like a new URL to the cache. If you're rearranging the page anyway, it's easy to slap a new number on the end of this URL.
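The trick above might be sketched like this in Python, for the program that generates the page. The helper names are invented, and combining the process ID with a counter is just one assumed way to get "some other number that changes every time"; a bare process ID would repeat if one long-running process builds many pages.

```python
import itertools
import os

# Per-process counter, so tokens differ even within a single process.
_counter = itertools.count(1)


def next_token():
    """Return a value that changes on every call (process ID plus a counter)."""
    return f"{os.getpid()}-{next(_counter)}"


def cache_busting_src(script_url, token=None):
    """Append a throwaway path segment so caches see a new URL each time.

    The extra segment arrives at the CGI program in PATH_INFO,
    where it is simply ignored.
    """
    if token is None:
        token = next_token()
    return f"{script_url.rstrip('/')}/{token}"


def img_tag(script_url, token=None):
    """Build the image tag for the page being generated."""
    return f'<img src="{cache_busting_src(script_url, token)}">'
```

Calling cache_busting_src("blah.blah/cgi-bin/blah.cgi", 12345) gives back the second URL shown above, and calling it with no token gives a different URL on each page build.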
Now go out there, webmasters, and get your hands dirty!
I'm still receiving e-mail bombs, at least one per week. Hint: e-mail viruses would be a topic worthy of its own forum.
PS - This is a bit of the story so many have asked about.
Jeez, I actually *like* Google as a search engine too! I use them almost all the time. (Not the tool bar, just google.com) I do really like them...
...but I can't help sniggering and guffawing at almost anything that pokes fun at the large, successful and/or bloated... be they corporate or government, 'good guys' or bad.
I think a sense of healthy skepticism and humor is an important thing when dealing with the big guys of the world.
Disabling the picture option would be a rather extreme reaction to a few webbugs. Disabling the profile entirely might make more sense, as no doubt there are those with profiles who will, at some point in our googled future, wish they had never filled them out.
(It's too late for me anyway; my FBI and CIA files from the late 1960s were so complete that everything about me in Google's cache is rather like a big stack of footnotes.)
All you have to do is change your "border=1" to perhaps a "border=3". At that point even a one-pix is so noticeable that it defeats any purpose in using one, as the link to a noticeable GIF frequently betrays the person who placed the GIF.
You should also check for a ".gif" or ".jpg" extension to discourage images generated by CGI programs.
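An extension check of the sort suggested above could look something like this in Python. It is only a sketch with made-up names, and it tests nothing but the text of the URL, which is why it discourages CGI-generated images rather than preventing them.

```python
import posixpath
from urllib.parse import urlsplit

# Extensions named in the suggestion above.
ALLOWED_EXTENSIONS = {".gif", ".jpg"}


def has_image_extension(url):
    """Return True if the URL's path ends in an allowed image extension.

    Only the path is examined; the query string is ignored.
    """
    path = urlsplit(url).path
    extension = posixpath.splitext(path)[1].lower()
    return extension in ALLOWED_EXTENSIONS
```

This rejects a picture URL like http://example.com/cgi-bin/blah.cgi but accepts http://example.com/portrait.gif. Note that it would also accept http://example.com/cgi-bin/blah.cgi/x.gif, and a server can be configured to run a program for any file name, so a determined member can still get around it.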
As to the "spirit of the board," it seems to me that there are many "spirits" on this board (as well as other boards, too), and it's the very fact of multiple spirits that makes it interesting and informative.