Forced by a court order? Can you say 'overkill'? A quick email to the right person would have done it, but I guess he's on vacation ;)
There is no such thing as a "quick email to someone" in this country anymore. If someone so much as looks at you wrong, you sue them. It's now the American way.
|If someone so much as looks at you wrong, you sue them. It's now the American way. |
Your sentiment is well taken. I've noticed a propensity for relatively minor disputes to escalate into full-fledged litigation within a matter of days. It seems to be especially true online, although I have no evidence to support my general observations. Maybe it's not as bad as it seems.
Specific to the quote above...
Within the rapidly rotting urban core of most major U.S. cities, looking at someone the wrong way is more likely to get you stabbed or shot. ;-)
A court order is written proof that the request has been made. That's necessary to show that swift action has been taken by the appropriate parties. Emails get lost, emails get sent to the wrong person, and they sometimes are not admissible in court...
By the way, this isn't something minor. Social security numbers could be used for identity theft, costing millions in fraud. No one wants to be responsible and/or held liable for such damage.
"Oh no, my house is on fire! It's ok though, I've called my lawyer and he is going to help me get a court order so that the town fire department puts it out and I can sue them if they don't."
the court order might have been taken to cover their @sses, at least not to be in deeper trouble should students sue.
> at least not to be in deeper trouble should students sue.
Bingo. They needed to make sure it was nuked throughout the Google network.
So, quick email to the right person at Google? Who exactly would that be and how exactly would someone running a school's website (who probably knows nothing about how search engines work let alone who Matt Cutts is) go about finding that?
And once they sent it, how seriously would G have taken that email? I mean they did just leave that information out there and it then becomes public. Right, isn't that what they say on most privacy issues? Everybody knows that when you build a webpage that a search engine will find it, right?
And exactly how long would it have taken them to get to that email after going through the probably 1000 or so they have gotten this week from other website owners who want to know why their site is either not in G or not ranking well with G.
I think we forget sometimes that we have greater knowledge (and sometimes privilege) than most people. A court order did what needed to be done quickly and efficiently, getting it to the right eyes without the big company bureaucracy.
I love it, a school suing because THEY were incompetent.
Regardless of GET/POST, which a single line of code checking for POST would've solved, do you see a ROBOTS.TXT file for the school?
I sure don't, as this would've been useful telling the bot to stay away from sensitive content. So it comes as no surprise that those sensitive documents also didn't contain meta tags with NOARCHIVE, NOCACHE, NOINDEX, NOFOLLOW or anything else that would've EASILY stopped Google or any other legit crawler from indexing.
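To make the robots.txt point concrete, here is a minimal sketch in Python of how a well-behaved crawler would read such a file. The file contents, the directory names and the school.example URLs are all made up for illustration; nothing here is taken from the school's actual site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for illustration only; the paths are assumptions.
ROBOTS_TXT = """\
User-agent: *
Disallow: /students/
Disallow: /reports/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text the way a compliant crawler would."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A compliant bot asks before fetching each URL.
blocked = parser.can_fetch("Googlebot", "http://school.example/students/scores.html")
allowed = parser.can_fetch("Googlebot", "http://school.example/index.html")
print(blocked, allowed)
```

Keep in mind this only asks polite crawlers to stay away. It does nothing against a bot that ignores the file, so it is no substitute for real access control on the sensitive pages.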
Their security over sensitive information is 100% lax and heads should roll.
Not to mention Google's website has very explicit emergency removal instructions and the whole website could've been removed just like WebmasterWorld got delisted once upon a time.
I'm not sure about North Carolina, but in California a security breach that exposed personal information like SSNs could've ended up with the people in the school responsible for such lax security being charged with a crime.
Ignorance of the internet or the law is no excuse.
Cover their asses? They ought to be legally liable for putting them there in the first place, just as these banks who "lose" laptops with millions of unencrypted SSNs on them ought to be.
I saw this on dig last night. It is so funny how they stood by their IT guys. Saying "we have a secure system". Google did nothing wrong. The Internet is getting bigger and more important every day but the number of people that understand it is not. We were talking in a chat room the other day about how some guy came to a forum and said he had just got an SEO client and wanted to know what he needed to do. You have people put up ecom sites that know nothing about security. You have companies that don't want to pay what it costs to get somebody good in IT. You have companies that won't even hire an IT person. They just pick the only person in the office that has had the least amount of trouble on their computer to become the IT person.
Since when was someone allowed to steal my stuff because I left my door unlocked?
I am often bugged by the fact that people increasingly feel that you are responsible for telling SEs to stay out. Any other part of the law and that would not be the case.
Did the school make a mistake? Yes they did as anyone could have gotten to the info with or without an SE.
If this school had thrown this in the trash and a newspaper printed the information, people would be in an uproar at both the school (for not shredding the info) and the newspaper (for printing what should not have been printed).
Automated processes do not excuse you from creating breaches of privacy. People should not have to tell people/companies to stay out of their space.
So just out of curiosity, why haven't search engines provided the funds to ensure the websites and pages they are cataloging are allowed to be cataloged? Seems a little cheap on their side too, as they are making billions off doing so.
|So just out of curiosity, why haven't search engines provided the funds to ensure the websites and pages they are cataloging are allowed to be cataloged? Seems a little cheap on their side too, as they are making billions off doing so. |
Google wrote an extensive "Webmaster Help Center" which this school, and most others, do not read. You can't go door to door and make sure everyone read it, plainly available, yet ignored.
Did you miss the point I made about them not even having a robots.txt on the school's website?
This isn't a Google issue at all, this is a 100% incompetent webmaster issue!
This is not the same as breaking into your house. This is no different than somebody going by the front of every house and taking a picture. If you accidentally put your SS# on your front window it is not Google's fault. They are only taking pictures of stuff that is put out to the public. You can't just put a sign on the side of the road and expect people to not look at it.
You said "why haven't search engines provided the funds to ensure the websites and pages they are cataloging are allowed to be cataloged?" (I still don't know how to do that quote thing).
What do you mean by this? Currently the system is opt-out by way of robots.txt or other things like meta tags and NOFOLLOW directives, etc. Are you saying you'd prefer an opt-in system, where SEs are not allowed to index a site unless it explicitly gives them permission?
Also, the difference between a newspaper printing sensitive info and Google indexing it is that a human reviews everything in a newspaper, so there is somebody to blame if something sensitive were printed. With an automated system like an SE, we don't have this check, so isn't it understandable that something sensitive could get into the index? And isn't it reasonable enough to have safeguards like robots.txt and an emergency removal process (which apparently Google does)?
|Did you miss the point I made about them not even having a robots.txt on the school's website? |
Then what you are saying is that if a thief writes a booklet on how to install a lock he can't pick and leaves it at the library for me to find, and I then fail to read the book and fail to put the lock on my house, it is my fault he robbed my house and he can't be prosecuted?
Yep, I am saying it is not enough. Those measures take knowledge. Google (and other SEs) make more than enough money to set up a system that idiot proofs it. Any other industry in the world that handles potentially dangerous material is required to do so.
I have often marveled at the fact that while they have measures that tell them where to stay out of, they still have not taken steps to allow you to tell them where they are allowed in. It might throw a kink in their business model, eh? Why should SEs be inherently allowed in? It seems to me that it would be better for everyone if you had to make an effort to let them in rather than to keep them out.
It would force people to learn the system or at least reduce the damage done by people who don't learn it.
|The school apparently was unaware that a GET statement worked the same on their software as a POST. |
Isn't it Programming 101 to make sure your software works with only one method?
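For anyone wondering what "works with only one method" looks like in practice, here is a rough Python sketch. The handler name, the form field and the response text are all hypothetical, and it is not tied to any particular web framework or to whatever software the school was running.

```python
from typing import Dict, Tuple


def handle_report_request(method: str, form: Dict[str, str]) -> Tuple[int, str]:
    """Return (status_code, body) for a hypothetical score-report page."""
    # The "single line" check: refuse anything that is not a POST, so a
    # plain link (which crawlers and browsers fetch with GET) never
    # reaches the sensitive code path.
    if method.upper() != "POST":
        return 405, "Method Not Allowed"

    # Hypothetical lookup; a real system would also authenticate the user.
    student_id = form.get("student_id", "unknown")
    return 200, "report for student " + student_id


print(handle_report_request("GET", {"student_id": "42"}))
print(handle_report_request("POST", {"student_id": "42"}))
```

Refusing GET keeps link-following crawlers out of that page, but it is not security by itself. Anyone can send a POST, so the login check still has to do the real work.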
|Those measures take knowledge. |
It's a SCHOOL, an institution of KNOWLEDGE, if of all places they can't RTM, then we're all lost and the future is in peril.
Sorry, I can't agree with you at all as ignorance is never an excuse, especially when it's all spelled out in black and white, in a reasonably easy to access location where it makes sense to be found, such as Google's website resource pages for webmasters.
Your theft analogy doesn't play for me in this case as the tools to stop the indexing were readily available and simply not used. First off, it's not theft as Google is just a librarian. If you don't want your book to be in the library's index you simply don't sell a copy of your book to the library. In this instance, the library only indexes what's available and robots.txt, NOINDEX, NOARCHIVE, NOFOLLOW, etc. tell the librarians where to go and where not to go.
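As a rough illustration of how a crawler might honor those page-level directives, here is a small Python sketch using only the standard library. The sample page and the helper names are assumptions for the example, not anything from Google's actual crawler.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            for directive in content.split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def robots_directives(html: str) -> set:
    """Return the set of lower-cased robots directives found in a page."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page; the markup is made up for illustration.
PAGE = (
    '<html><head><meta name="robots" content="NOINDEX, NOARCHIVE, NOFOLLOW">'
    "</head><body>sensitive report</body></html>"
)

directives = robots_directives(PAGE)
may_index = "noindex" not in directives
print(sorted(directives), may_index)
```

A crawler that respects the tag would skip indexing this page. Like robots.txt, it only works on crawlers that choose to obey it.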
This is all WEB 101, this is not a secret, this is not a mystery, there are thousands if not hundreds of thousands of web pages devoted to these topics, not to mention hundreds of webmaster forums discussing it daily.
To be blunt, it's not hard for a 12 year old to figure out, so professional IT and educators that can't do this is shameful and again, ignorance is no excuse.
This sounds to me like someone posted a simple HTML page with all the students' info on it. Even if it was password protected, why would one student's password give access to all the other students' test scores and social security numbers?
|It's a SCHOOL, an institution of KNOWLEDGE |
Gee, I didn't know they taught SEO in grade school. I can't even find a decent college course in it. Must be the funding in your area that allows such things.
I never said the school was not liable. I just think that G (and other SEs) are responsible as well.
The theft analogy does work because lack of knowledge does not excuse the crime on either side.
G makes money off cataloging this stuff. I ask again, why should search engines be allowed in without asking permission? It would be safer for everyone involved if it were the other way around. Having to tell them to stay out is no different than being required to tell a thief to stay out of your house.
Am I stupid if I don't have a lock on my door? Yes, I am. Is the thief still a thief? Yes, he is.
The librarian thing doesn't work either because it would be like a librarian that goes and actively searches in other people's property for books to add. Still a crime if I did not say they could take the book to be cataloged in their library.
The librarian also doesn't come to your house and help him/her self to your books unless you explicitly invite them over.
If SEs were opt-in via robots.txt, then at least the IT people, who probably have better training and would be more reasonably expected to know the nuances of search engines, could be responsible for getting their websites included in the index.
If someone accidentally exposed this kind of sensitive info, the engines wouldn't by default broadcast it all over the web for everyone to access.
This is what the schools, businesses, etc. PAY their IT people for. They are supposed to have some clue as to how to protect their data.
I am no programmer and I know how to keep someone from accessing a page/directory, etc. It's basic..BASIC stuff.
There is simply no excuse for their security protocols to allow any unknown entity access to that info and, as another poster mentioned, why on earth would one person's username/password give access to the entire list? Simply inexcusable.
|Gee, I didn't know they taught SEO in grade school. |
This isn't SEO we're discussing, this is simple indexing and spidering.
|The librarian also doesn't come to your house and help him/her self to your books unless you explicitly invite them over. |
Another bad metaphor as the internet is a different type of medium and if the librarian didn't scan the internet, nobody would know where anything existed in the first place and it would be quite useless, as useless as a library without a card index.
Besides, arguing the validity of opt-in vs opt-out is silly in this instance as the site wasn't properly secured in any way shape or form. So what if Google stayed out, there are a LOT of other things crawling that don't even identify themselves as bots that would've gotten that information, probably did, and probably still will do so if they didn't fix the site as now it's a worldwide target for hackers with the exposure of taking it public.
Sometimes these people simply don't think and they're teaching our kids.
I weep for the future.
[edited by: incrediBILL at 9:31 pm (utc) on June 26, 2006]
> why should search engines be allowed in without asking permission?
Because it is part and parcel of connecting a server to the Web. You may join this community (the Web) only if you agree to leave your front door open at all times, and to post your own guards where you deem appropriate.
Google (and almost all other search engines) comply with the Standard for Robots Exclusion, a document that was published and became the de facto standard well before Google even existed. So like it or not, the policy for the current Web is opt-out, and you agree to those terms the minute you open your firewall to Web traffic.
It's also a sad fact that should the Web switch to an opt-in model, almost all old resources --and maybe even almost all old sites-- would vanish from search, because they neither opt-in nor opt-out.
Google and the other SEs would be wise to create a 'filter' for nnn-nn-nnnn -formatted numbers, and an absolute exclusion for pages where such numbers occur in conjunction with either "SSN" or "Social Security Number" or any similar or derivative phrases.
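A filter like that could be sketched roughly as follows in Python. The pattern, the label list and the function names are my own assumptions for illustration; a real search engine would need something far more careful about false positives and other number formats.

```python
import re

# Matches numbers formatted nnn-nn-nnnn.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Phrases that suggest the number really is a Social Security Number.
# This short list is an assumption; "similar or derivative phrases"
# would need a longer one.
LABEL_PATTERN = re.compile(r"\bSSN\b|\bsocial\s+security\s+number\b", re.IGNORECASE)


def looks_like_ssn(page_text: str) -> bool:
    """True if the page contains any nnn-nn-nnnn formatted number."""
    return SSN_PATTERN.search(page_text) is not None


def should_exclude(page_text: str) -> bool:
    """True if an SSN-shaped number appears together with an SSN label."""
    return looks_like_ssn(page_text) and LABEL_PATTERN.search(page_text) is not None


# 000-12-3456 is deliberately not a valid SSN (area 000 is never issued).
print(should_exclude("Student: Jane Doe, SSN 000-12-3456"))
print(should_exclude("Part number 000-12-3456 is back in stock"))
print(should_exclude("Our Social Security Number policy is on file."))
```

The first function is the softer "filter" and the second is the absolute exclusion when the number shows up next to a telltale phrase.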
But *anyone* who allows Social Security Numbers, or credit card numbers, or any such personal data to be exposed or to leave a secure computing environment should be put in jail, IMO. I'm not willing to have my personal info put on some idiot's laptop for the sake of corporate convenience. I think we need a law requiring that each piece of personal info be stored on a *separate* secure server, and be 'correlated' only by an encrypted key. And I don't care one whit if it 'inconveniences' the financial industry -- they make plenty off me already.
Sorry, I come down against the school on this one -- If you want to get on the 'information superhighway,' then learn to drive first.
>> Gee, I didn't know they taught SEO in grade
>> school. I can't even find a decent college
>> course in it. Must be the funding in your area
>> that allows such things.
This isn't an SEO issue; if you did find an SEO course it wouldn't be likely to teach you anything that would have prevented this. However, there are any number of classes on server administration and security. Reading the statements in followup on this from the school's CTO, as well as the qualifications and experience that person brought to the job (available on the school's site), makes me feel that some of those classes would have been a good idea.
|If you want to get on the 'information superhighway,' then learn to drive first. |
Want to chip in with me and send them a copy of "Webmastering for Dummies"?
Hey Brett, want to donate a 6 month membership to their IT guy?
You could even start a campaign to help schools like:
"No School Website Left Behind" :)
Comparing a website to an (unlocked) house makes no sense. There are general rules that you and I both know. A search engine finds websites and indexes them. Sometimes, it is just 'how things work'.
The school made the (big) mistake of allowing GET variables. But then again, being liable these days is nonexistent (unless of course you are a puny individual).
I can point out several threads where we make fun of people for not knowing the difference between SEs and the Internet. Why is it that there are several very qualified internet professionals in this thread who are doing just that?
Putting a website up on the INTERNET is something completely different than putting it out for a SEARCH ENGINE. Certain status quos were established very early on when only a specialized group of people put up websites. That time has long passed and it is time to look at the status quo.
Robots.txt is SEO. Bottom of the barrel, easy peasy SEO, but SEO nonetheless. You put a robots.txt on a website in order to tell a search engine robot where not to go. It is there for no other purpose than to Optimize your website for Search Engines, i.e. SEO.
All these people that you think should know this stuff... Exactly when and where should they learn it? Would this be the IT guy who got his degree 20 years ago (the internet has only been around for the public for almost 12) or the newbie kid whose professor barely taught himself HTML and thinks SEs still use meta tags? Does it go in with networking or learning HTML or while you are learning ColdFusion? Is it the designer's job or the computer administrator's job or is it the programmer's job?
These schools should pay for it (in the US, right)? Come on! I know what I charge an hour and I know what my sister-in-law who is a teacher makes in a year. The numbers are vastly different. So you are saying that a school should invest in a $60K-100K IT guy, heck, a whole staff of IT people, when they can barely afford to pay a teacher $30K or so. Let's cram a few more kids in a classroom so that Google can go on making a few more billion dollars. Heck, for a school, the website is only a service for their students and parents and produces not a penny of income for them. Screw 'em. God forbid the SEs make one penny less.
Just because things have always been one way does not mean they should always be that way. Times have changed and it is high time the search engines change with them.
All those people who clamor on and on about W3C compliance and how SEs should boot anyone who is not, should have no problem with saying that anyone who doesn't have a robots.txt should not be indexed.