Google SEO News and Discussion Forum

Remove confidential csv file from google cache

 9:49 pm on Jan 30, 2008 (gmt 0)

I found a csv file containing confidential info on customers - name, address, phone nos.

This IMO is a serious breach of confidentiality.

I have asked the webhost to remove the file but this I'm sure is going to stay in google's cache.

What can be done here? How can I get google to remove the file from the cache? Do I have to be one of the customers named in the csv file?

This is an international list of customers on a US hosted server. I'm in UK.



 11:52 pm on Jan 30, 2008 (gmt 0)

Set up a Webmaster Tools account, if you don't already have one. In the account there is a url removal tool - you often get results in a couple days.


 11:55 pm on Jan 30, 2008 (gmt 0)

But what if this is not my site?

And does the site hosting the file have to remove it first?


 12:28 am on Jan 31, 2008 (gmt 0)

If the web page has been removed from the server and the url returns a 404 page not found error then you can request google to remove the page from the index.

Sign in to your Google account, go to https://www.google.com/webmasters/tools/removals?pli=1 and follow the instructions.

Remember the URL of the page to be removed must return a 404 error otherwise this tool will not work.
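Before submitting the request, it may help to confirm what the server actually returns for the file. The following is only a rough sketch, assuming a plain HEAD request is enough to see the status; the function names `fetch_status` and `ready_for_removal` are made up for illustration and are not part of any Google tool.

```python
import urllib.error
import urllib.request


def fetch_status(url, timeout=10):
    """Send a HEAD request and return the HTTP status code the server gives."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the code is on the exception.
        return err.code


def ready_for_removal(status):
    """Assumption taken from this thread: the removal tool wants a 404."""
    return status == 404
```

For example, `ready_for_removal(fetch_status("http://example.com/data/customers.csv"))` (a placeholder URL) would be False for as long as the host keeps serving the file, even if directory browsing has been switched off.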


 3:34 am on Jan 31, 2008 (gmt 0)

A list of names, addresses and phone numbers? Ever heard of a phone book? How is this so egregious that you feel the need to intervene?


 4:24 am on Jan 31, 2008 (gmt 0)

Odd, but my name and number aren't in the phone book, yet my medical insurance company has them. Could be a simple case of HIPAA compliance.


 8:44 am on Jan 31, 2008 (gmt 0)

I would go with Tedster and keepontruckin's approach. Webmaster Tools is your safest bet. If you don't have access to it, kindly ask the webhost or site owner (if you are not the one).

As an additional measure I might have replaced the file with different (junk) content. If the 404 does not have a fast enough effect then a new caching of the csv file might be quicker.


 10:43 am on Jan 31, 2008 (gmt 0)

jomaxx is not understanding the severity here.

I found a competitor site had stored all his customer transactions in csv files on his website. The website 'closed down' a while ago but he had left the csv files in browsable directories.

IMO this is a clear breach of confidentiality and I don't know what country jomaxx is in but here in the UK we have strict data protection laws which prevent this kind of thing happening.

Now to get this resolved.

The site is hosted by a US hoster. I filed an abuse claim last night and tried to contact their support but they close down at 6pm.

I'm pretty sure the hosters will remove the files as soon as they read the abuse report but the google cache I guess takes longer.

1. I have no idea who the webmaster is of the site. As I say the website 'pages' have gone so there is no contact email address

2. A whois lookup returns neulevel.biz. This is some kind of obfuscation / domain owner privacy (LOL) thing which prevents me from getting a name or number of the person in charge of the site.

If I was a customer on that list what procedure can I follow to identify the site owner?

[edited by: Frank_Rizzo at 10:51 am (utc) on Jan. 31, 2008]


 8:45 am on Feb 1, 2008 (gmt 0)

If you found the file through a search, you could try reporting that search through the "Dissatisfied? Help us improve" link at the bottom of the page. Someone does read all those messages so it might get some action. Since you are an uninvolved third party, I can't think of anything else that you could do.


 10:38 am on Feb 1, 2008 (gmt 0)

Here's an update.

After 24hrs the webhosts did a partial fix. All they did was to remove the facility to browse directories.

Unfortunately the files are still physically in place, so the Google removal tool rejects the request because the files do not return a 404.

I have now sent a more stern request to the webhosters informing them of their unacceptable resolution. I have also made a legal enquiry as to what to do next.


 2:12 pm on Feb 1, 2008 (gmt 0)

>>>>here in the UK we have strict data protection laws which prevent this kind of thing happening.

Actually, doesn't seem to prevent it at all. Case in point, your complaint.

I suspect the webhost and Google are going to tell you to take a long walk off of a short dock. UK laws don't apply to the US thankfully.

It sounds like you may be making a mountain out of a molehill. Does anyone besides you really care? Has anyone besides you even found this information? Probably not; so while it's technically available if you know the exact URL, it's quite likely that nobody would ever find it.


 2:28 pm on Feb 1, 2008 (gmt 0)

How about opening up another thread to discuss the moral issues whilst I try and get this serious issue resolved?


 5:06 pm on Feb 1, 2008 (gmt 0)

I don't understand what you're saying, precisely. This customer info is not your info nor on your servers? Or is it on your servers?

What you need to do is inform those people about the problem, if this is nothing you have direct control over. Inform them of their rights to protect their information and who to contact.

Take a look around here or even give the ICO a call for their recommendation: [ico.gov.uk...]

Again, assuming this is not your info and is not anything you have control over, you can't do much except encourage the individuals whose personal information may be illegally available to exercise their rights.


 5:13 pm on Feb 1, 2008 (gmt 0)

I found the files via google - anyone can. They contain customer contact details which are supposed to be protected by a data protection act here in the UK.

The site owner is not contactable - he deleted his site and is using an anonymous type whois. When I say deleted his site he deleted homepage, contact details etc. but left behind csv files of sensitive data.

The webhosters have been asked to move the files, or at the very least put them in a non-browsable location.

I am not on the list. But I am concerned about the way those customers are exposed to fraud, spam and scams.

There are over 4000 names on the list. They are a certain type of customer who I would consider "vulnerable". The product they purchased was a bit of a scam.

The list is valuable to certain people who pay top dollar for that information. Just trying to be a good samaritan here.

Thanks for the ICO link. I'll pursue that avenue too.

[edited by: Frank_Rizzo at 5:14 pm (utc) on Feb. 1, 2008]


 5:50 pm on Feb 1, 2008 (gmt 0)

I can't resist asking this... why do you keep bringing up the privacy laws in the UK? The server is in the US and you don't know who the domain owner is, so do you have some reason to think that UK law applies in this case?

P.S. web.archive.org may have a cached copy of the original website if you need to look at it. Heck, they may even have cached the data file as well.


 5:57 pm on Feb 1, 2008 (gmt 0)


I have been informed that the domain owner is very likely to be a certain person running a limited company in the UK.

If this is the case (that he is a UK citizen running a UK company) he has quite likely contravened an act or law which is designed to protect users' data online.

All of us who handle customer data here have a duty of care to protect customer information. We are required to do so by law. The fact that data is stored on an overseas server is of no matter.


 10:45 pm on Feb 2, 2008 (gmt 0)

There is a lot of sensitivity to data protection issues here in the UK, after various bits of the government have had numerous data losses and breaches in the last few months... losses involving data on at least 30% of the UK population.

All trademarks and copyrights held by respective owners. Member comments are owned by the poster.
WebmasterWorld is a Developer Shed Community owned by Jim Boykin.
© Webmaster World 1996-2014 all rights reserved