iris-Disguise is a tool for Data Anonymization on InterSystems IRIS.
iris-Disguise helps you build anonymized dumps of production data that you can use for performance testing, security testing, debugging, and development.
"Data anonymization is a type of information sanitization whose intent is privacy protection. It is the process of removing personally identifiable information from data sets, so that the people whom the data describe remain anonymous." (Wikipedia)
iris-Disguise provides a set of anonymization strategies:
- Destruction: sometimes the fastest and best way to anonymize data is to replace all values with the word CONFIDENTIAL
- Randomization: generates purely random data
- Faking: replaces data with random but plausible fake values
- Partial Masking: hides part of the data
- Scramble: given "ABCDEFG", returns something like "GEFBDCA"
- Shuffling: mixes values within the same column
Make sure you have git and Docker Desktop installed.
Open a terminal and clone (or git pull) the repo into any local directory, as shown below:
git clone https://github.com/henryhamon/iris-disguise.git
Open the terminal in this directory and run:
docker-compose build
Alternatively, install the package with ZPM:
zpm:USER>install iris-disguise
To run the unit tests, open an IRIS terminal:
docker-compose exec iris iris session iris -U IRISAPP
set ^UnitTestRoot = "/opt/irisbuild/src/iris/dc/Test/Disguise/"
Do ##class(%UnitTest.Manager).RunTest("","/loadudl")
This strategy replaces an entire column with a single word ('CONFIDENTIAL' is the default).
Do ##class(dc.Disguise.Strategy).Destruction("classname", "propertyname", "Word to replace")
The third parameter is optional. If not provided, the word 'CONFIDENTIAL' will be used.
This strategy scrambles all the characters in a property.
Do ##class(dc.Disguise.Strategy).Scramble("classname", "propertyname")
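To make the idea concrete, here is a minimal Python sketch of what scrambling a single value means. It is only an illustration of the technique, not the code iris-Disguise runs; the function name `scramble` and the optional `rng` argument are assumptions made for this example.

```python
import random
from typing import Optional


def scramble(value: str, rng: Optional[random.Random] = None) -> str:
    """Return the characters of `value` in a random order (hypothetical helper)."""
    rng = rng or random.Random()
    chars = list(value)
    rng.shuffle(chars)  # in-place random permutation of the characters
    return "".join(chars)


# The result has the same characters as the input, in a different order
# most of the time (the original order can still come up by chance).
print(scramble("ABCDEFG"))
```

Note that scrambling keeps the length and the character set of each value, so short or highly distinctive values may still be guessable.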
Shuffling rearranges all the values of a given property across records. It is not a masking strategy, because it works "vertically": values are exchanged between rows rather than altered. This strategy is useful for relationships because referential integrity is preserved. In the current version, this method works only on one-to-many relationships.
Do ##class(dc.Disguise.Strategy).Shuffling("classname", "propertyname")
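The "vertical" behaviour can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the library's implementation: rows are modelled as plain dictionaries, and `shuffle_column` is a hypothetical helper name.

```python
import random
from typing import Any, Dict, List, Optional


def shuffle_column(rows: List[Dict[str, Any]], column: str,
                   rng: Optional[random.Random] = None) -> List[Dict[str, Any]]:
    """Redistribute the values of one column across rows (hypothetical helper)."""
    rng = rng or random.Random()
    values = [row[column] for row in rows]
    rng.shuffle(values)  # permute the column, leaving every value unchanged
    # Build new rows: all other columns stay where they were.
    return [{**row, column: value} for row, value in zip(rows, values)]


people = [
    {"id": 1, "city": "Lisbon"},
    {"id": 2, "city": "Recife"},
    {"id": 3, "city": "Osaka"},
]
print(shuffle_column(people, "city"))
```

Because every value in the shuffled column already existed in that column, a reference to another record still points at a real record afterwards.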
This strategy obfuscates part of the data. For example, a credit card number can be replaced by 456X XXXX XXXX X783.
Do ##class(dc.Disguise.Strategy).PartialMasking("classname", "propertyname", prefixLength, suffixLength, "mask")
prefixLength, suffixLength, and mask are optional. If they are not provided, the default values are used.
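As a rough illustration of how the three parameters interact, the Python sketch below keeps the first `prefix_length` and last `suffix_length` characters and masks everything in between. It is not iris-Disguise's code: the function name, the default values used here, and the choice to leave whitespace unmasked (so that the credit card example above is reproduced) are all assumptions.

```python
def partial_mask(value: str, prefix_length: int = 1, suffix_length: int = 1,
                 mask: str = "X") -> str:
    """Mask the middle of `value`, keeping a visible prefix and suffix.

    Hypothetical helper; the defaults are illustrative, not the library's.
    """
    n = len(value)
    if prefix_length + suffix_length >= n:
        return value  # nothing left in the middle to mask
    middle = value[prefix_length:n - suffix_length]
    # Assumption: whitespace is preserved so the grouping stays readable.
    masked = "".join(ch if ch.isspace() else mask for ch in middle)
    return value[:prefix_length] + masked + value[n - suffix_length:]


print(partial_mask("4567 1234 5678 9783", 3, 3, "X"))  # 456X XXXX XXXX X783
```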
This strategy will generate purely random data. There are three types of randomization: integer, numeric and date.
Do ##class(dc.Disguise.Strategy).Randomization("classname", "propertyname", "type", from, to)
type: "integer", "numeric" or "date". "integer" is the default.
from and to are optional; they define the range of randomization. For the integer type the default range is 1 to 100. For the numeric type the default range is 1.00 to 100.00.
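The three types and their ranges can be sketched in Python as follows. This is a conceptual example only: `randomize` is a hypothetical name, the integer and numeric defaults mirror the ranges stated above, and because no default date range is given in this document the sketch requires both bounds for dates.

```python
import random
from datetime import date, timedelta
from typing import Optional


def randomize(kind: str = "integer", start=None, end=None,
              rng: Optional[random.Random] = None):
    """Return a random value of the requested kind within [start, end]."""
    rng = rng or random.Random()
    if kind == "integer":
        low = 1 if start is None else start
        high = 100 if end is None else end
        return rng.randint(low, high)  # both bounds inclusive
    if kind == "numeric":
        low = 1.0 if start is None else start
        high = 100.0 if end is None else end
        return round(rng.uniform(low, high), 2)  # two decimal places
    if kind == "date":
        if start is None or end is None:
            raise ValueError("this sketch needs explicit start and end dates")
        offset = rng.randint(0, (end - start).days)
        return start + timedelta(days=offset)
    raise ValueError("unknown kind: " + kind)


print(randomize("integer"))
print(randomize("numeric", 10, 20))
print(randomize("date", date(2020, 1, 1), date(2020, 12, 31)))
```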
The idea of Faking is to replace data with random but plausible values. iris-Disguise provides a small set of methods to generate fake data.
Do ##class(dc.Disguise.Strategy).Fake("classname", "propertyname", "type")
type: "firstname", "lastname", "fullname", "company", "country", "city" or "email"
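The idea behind faking is simply to draw replacement values from pools of realistic-looking data. The Python sketch below shows that idea for four of the types; the sample pools, the `fake` function, and the e-mail format are invented for illustration and say nothing about the data iris-Disguise actually ships with.

```python
import random
from typing import Optional

# Tiny, made-up sample pools used only for this illustration.
FIRST_NAMES = ["Ana", "Bruno", "Carla", "Diego"]
LAST_NAMES = ["Silva", "Souza", "Costa", "Lima"]


def fake(kind: str, rng: Optional[random.Random] = None) -> str:
    """Return a plausible fake value of the given kind (hypothetical helper)."""
    rng = rng or random.Random()
    first = rng.choice(FIRST_NAMES)
    last = rng.choice(LAST_NAMES)
    if kind == "firstname":
        return first
    if kind == "lastname":
        return last
    if kind == "fullname":
        return first + " " + last
    if kind == "email":
        # example.com is reserved for documentation, so no real inbox is hit.
        return (first + "." + last).lower() + "@example.com"
    raise ValueError("unsupported kind in this sketch: " + kind)


print(fake("fullname"))
print(fake("email"))
```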
Another way to use iris-Disguise is to wear the disguise glasses. Make your persistent class extend dc.Disguise.Glasses, then change the type of each property you want anonymized to the data type that matches the strategy of your choice. After that, just call the DisguiseProcess method on your class. All the values will be replaced using the strategy of each property's data type.
Data types:
- PartialMaskString
- RandomInteger
- RandomNumeric
- FakeString: FieldStrategy parameters: "FIRSTNAME", "LASTNAME", "FULLNAME", "COMPANY", "COUNTRY", "CITY" and "EMAIL"
- String: FieldStrategy parameters: "DESTRUCTION", "SCRAMBLE" and "SHUFFLING"
Icon by Flaticon from www.flaticon.com
- Henry "HammZ" Hamon (GitHub)