Generate representational datasets

         

tomm

11:31 am on Nov 7, 2008 (gmt 0)



Hey all,

We have decided to outsource the development of our commercial website to another company, and they are asking for a copy of our databases so they can run performance tests.
However, my boss doesn't want any data to leak...

I would like to generate datasets whose volume is representative of the amount of data in our databases.
Thanks.

eelixduppy

4:43 pm on Nov 7, 2008 (gmt 0)



I'd just copy the database, then replace all confidential data with dummy data in one query. As long as the structure doesn't change, the content of the tables shouldn't matter that much.
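To make the "one query" idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `customers` table, its columns, and the dummy values are all assumptions made up for illustration; the real schema and database engine will differ, and the in-memory database only stands in for the copy you would hand over.

```python
import sqlite3

def mask_customers(conn):
    """Overwrite the confidential columns of every row with dummy values.

    The dummy values are derived from the row id, so they stay distinct
    and the row count (the data volume) does not change.
    """
    conn.execute(
        """
        UPDATE customers
        SET name  = 'Customer ' || id,
            email = 'user' || id || '@example.com',
            phone = '555-0100'
        """
    )
    conn.commit()

# In-memory database standing in for the *copy* of the real one.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "id INTEGER PRIMARY KEY, name TEXT, email TEXT, phone TEXT)"
)
conn.executemany(
    "INSERT INTO customers (name, email, phone) VALUES (?, ?, ?)",
    [
        ("Real Name A", "a@real.invalid", "111"),
        ("Real Name B", "b@real.invalid", "222"),
    ],
)

mask_customers(conn)
rows = conn.execute(
    "SELECT id, name, email, phone FROM customers ORDER BY id"
).fetchall()
```

Always run this against the copy, never the live database. A constant dummy value (like the phone number here) makes every row identical in that column, which can skew index and query performance, so deriving values from the id is usually the safer choice for performance tests.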

ZydoSEO

9:16 pm on Nov 7, 2008 (gmt 0)

WebmasterWorld Senior Member 10+ Year Member



Yes... It's typically very simple to jumble the data: pair one person's SSN with another person's first name, another person's last name, yet another person's birthday, another person's street address, yet another person's City/State/Zip, and so on, to hide all confidential data.
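A rough sketch of that jumbling, again with Python's sqlite3 module: each column is shuffled independently, so the values in a row no longer belong to the same person. The `people` table, its columns, and the helper name `shuffle_columns` are hypothetical, chosen only for this example.

```python
import random
import sqlite3

def shuffle_columns(conn, table, key, columns, seed=None):
    """Shuffle each listed column independently across all rows.

    Table and column names are interpolated into the SQL text, so they
    must come from trusted code, never from user input.
    """
    rng = random.Random(seed)
    ids = [r[0] for r in conn.execute(f"SELECT {key} FROM {table} ORDER BY {key}")]
    for col in columns:
        values = [
            r[0] for r in conn.execute(f"SELECT {col} FROM {table} ORDER BY {key}")
        ]
        rng.shuffle(values)  # a different permutation for every column
        conn.executemany(
            f"UPDATE {table} SET {col} = ? WHERE {key} = ?", zip(values, ids)
        )
    conn.commit()

conn = sqlite3.connect(":memory:")  # stands in for the copy of the database
conn.execute(
    "CREATE TABLE people ("
    "id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT, ssn TEXT)"
)
conn.executemany(
    "INSERT INTO people (first_name, last_name, ssn) VALUES (?, ?, ?)",
    [
        ("Ann", "Archer", "000-00-0001"),
        ("Bob", "Baker", "000-00-0002"),
        ("Cy", "Carter", "000-00-0003"),
        ("Di", "Draper", "000-00-0004"),
    ],
)

query = "SELECT first_name, last_name, ssn FROM people ORDER BY id"
before = conn.execute(query).fetchall()
shuffle_columns(conn, "people", "id", ["first_name", "last_name", "ssn"], seed=42)
after = conn.execute(query).fetchall()
```

Keep in mind that shuffling only breaks the link between values: every real SSN or address is still somewhere in the table, and on a small table a row can end up unchanged by chance. For values that are sensitive on their own, replacing them with dummy data is the safer option.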

Gtlnd

1:04 pm on Nov 10, 2008 (gmt 0)



Hi,

What I would propose, like ZydoSEO, is "data anonymising": you keep the exact same volume of data in your database(s) but change some of the variables related to your data (and keep others).

This approach gives you a database that is about as representative of the original as possible, since you decide which changes to make depending on the level of confidentiality you need.
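One way to picture "change some variables and keep others" is a per-column policy. The sketch below is only an illustration in plain Python: the column names, the three actions (`keep`, `placeholder`, `blank`), and the function name `anonymise_rows` are assumptions, not an established tool.

```python
# Hypothetical policy: which columns are kept and which are changed.
POLICY = {
    "id": "keep",            # needed for joins and foreign keys
    "city": "keep",          # judged non-confidential in this example
    "email": "placeholder",  # replaced by a distinct dummy value per row
    "ssn": "blank",          # removed entirely
}

def anonymise_rows(rows, policy):
    """Return new rows with every column handled according to the policy.

    Columns missing from the policy get the strictest action ("blank").
    The number of rows is unchanged, so the data volume is preserved.
    """
    result = []
    for index, row in enumerate(rows, start=1):
        out = {}
        for col, value in row.items():
            action = policy.get(col, "blank")
            if action == "keep":
                out[col] = value
            elif action == "placeholder":
                out[col] = f"{col}_{index}"
            else:
                out[col] = None
        result.append(out)
    return result

sample = [
    {"id": 1, "city": "Lyon", "email": "a@real.invalid", "ssn": "000-00-0001"},
    {"id": 2, "city": "Nice", "email": "b@real.invalid", "ssn": "000-00-0002"},
]
anonymised = anonymise_rows(sample, POLICY)
```

Defaulting unknown columns to the strictest action means a column added later is not leaked just because nobody updated the policy.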