Dictionaries allow us to reuse values from previous transformations, not only across different jobs but also within the same execution. This keeps the resulting dataset consistent: a given input will always produce the same output value.
Dictionaries are common to all taps within the same project. Therefore, we will be able to transform data consistently across multiple databases.
When running or editing a rule, the dictionary usage options appear in the configuration panel on the right.
How to use
The results we obtain depend on the dictionary configuration we set. These are the available options:
Do not use dictionary
Value matches are ignored during the transformation job. The output values can be completely different even if the input value is identical.
As we can see, although Edith was anonymized previously, the second time it appears it becomes another value (Valerie). The same happens with the last name.
Reuse in the same entity and field
Matches in the same entity and field will be transformed in the same way. Even if there are matches in other entities or fields, they will be ignored.
In this case, the name is transformed into Melanie both the first and the second time it appears.
This happens because, after the first occurrence, the value is stored in the dictionary, so when the entity and the field match again, the value is transformed in the same way. The same happens with the surname Smith, which in both occurrences is found in the entity Customers and in the field Last name.
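The behaviour described above can be pictured as a lookup keyed on the entity, the field and the original value together. The following is only a minimal sketch under that assumption, not Gigantic's actual implementation: the `transform` function, the replacement pool and the `Employees`/`Surname` pair are hypothetical names introduced for illustration.

```python
import itertools

# Hypothetical pool of replacement surnames; a real job would use a proper generator.
REPLACEMENTS = itertools.cycle(["Jones", "Brown", "Taylor"])

# Assumed layout: (entity, field, original value) -> replacement value.
dictionary = {}

def transform(entity, field, value):
    """Reuse a stored replacement only when entity, field and value all match."""
    key = (entity, field, value)
    if key not in dictionary:
        dictionary[key] = next(REPLACEMENTS)
    return dictionary[key]

first = transform("Customers", "Last name", "Smith")
second = transform("Customers", "Last name", "Smith")
elsewhere = transform("Employees", "Surname", "Smith")  # hypothetical second entity

print(first, second, elsewhere)  # Jones Jones Brown
```

Because the entity and field are part of the key, the same surname in a different entity is treated as a new value.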
Reuse by label or in the same entity and fields
Matches on fields that share the same label will be transformed in the same way. Fields that do not share a label can still match on the combination of entity and field (as in the previous case), in which case the result is also identical.
Matches in other entities or columns are ignored unless they share the same label.
In this case, Randal is transformed into Mark in both tables because, even though the occurrences are in different entities and fields, they share the same label "person/name".
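One way to read this option is that the lookup scope is the label when a field has one, and the entity and field combination otherwise. The sketch below illustrates that reading only. The `LABELS` table, the entity and field names, and the replacement pool are assumptions made up for the example, and the product may resolve scopes differently.

```python
import itertools

# Hypothetical replacement pool.
REPLACEMENTS = itertools.cycle(["Mark", "Percy", "Melanie"])

# Hypothetical label assignment; fields without a label are simply absent.
LABELS = {
    ("Customers", "First name"): "person/name",
    ("Employees", "Name"): "person/name",
}

dictionary = {}

def scope_for(entity, field):
    """Use the label as the scope if there is one, else the entity and field."""
    label = LABELS.get((entity, field))
    if label is not None:
        return ("label", label)
    return ("field", entity, field)

def transform(entity, field, value):
    key = (scope_for(entity, field), value)
    if key not in dictionary:
        dictionary[key] = next(REPLACEMENTS)
    return dictionary[key]

in_customers = transform("Customers", "First name", "Randal")
in_employees = transform("Employees", "Name", "Randal")
unlabelled = transform("Orders", "Contact", "Randal")  # no label, so a separate scope

print(in_customers, in_employees, unlabelled)  # Mark Mark Percy
```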
Reuse in all fields
Values that have already been stored in the dictionary will be reused regardless of the entity and field in which they are found or the label assigned to them.
Although they share neither entity, field nor label, all occurrences of Susan will always become Percy, no matter where they are found.
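Under this option the lookup can be thought of as keyed on the original value alone. A minimal sketch, assuming a plain in-memory mapping (the function name, the locations and the replacement pool are hypothetical):

```python
import itertools

REPLACEMENTS = itertools.cycle(["Percy", "Mark", "Melanie"])  # hypothetical pool

dictionary = {}  # assumed layout: original value -> replacement value

def transform(entity, field, value):
    """Entity and field are accepted but deliberately ignored in the lookup."""
    if value not in dictionary:
        dictionary[value] = next(REPLACEMENTS)
    return dictionary[value]

# Hypothetical locations where the same value appears.
locations = [("Customers", "First name"), ("Employees", "Name"), ("Orders", "Contact")]
results = {transform(entity, field, "Susan") for entity, field in locations}

print(results)  # {'Percy'}
```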
Save new transformations in the dictionary
If this option is enabled, transformations are stored and can be reused in subsequent jobs.
If this option is not enabled, the transformations carried out during the job are discarded when it finishes, so they only take effect during the job itself.
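The difference between the two settings can be sketched as a job that always works on an in-memory copy of the dictionary and writes it back only when saving is enabled. Everything here is an assumption for illustration: the `run_job` function, the `save_new` flag, the JSON file used as storage and the `anon-N` placeholder values are not part of the product.

```python
import json
import os
import tempfile

def run_job(values, store_path, save_new=True):
    """Transform values, reusing stored entries; persist new ones only if save_new."""
    stored = {}
    if os.path.exists(store_path):
        with open(store_path) as fh:
            stored = json.load(fh)

    working = dict(stored)  # in-job copy, so the job is always internally consistent
    output = []
    for value in values:
        if value not in working:
            working[value] = f"anon-{len(working) + 1}"  # placeholder transformation
        output.append(working[value])

    if save_new:
        with open(store_path, "w") as fh:
            json.dump(working, fh)
    return output

path = os.path.join(tempfile.mkdtemp(), "dictionary.json")

unsaved = run_job(["Susan", "Susan"], path, save_new=False)  # consistent, not stored
stored_after_unsaved = os.path.exists(path)

run_job(["Susan"], path, save_new=True)                      # stored for later jobs
reused = run_job(["Susan", "Randal"], path, save_new=False)
```

In this sketch the first job leaves no trace, while the value stored by the second job is picked up again by the third.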
Overwrite current dictionary
If this option is active, the current project dictionary will be cleared before the rule is run, so no previously stored values will be reused.
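As a rough illustration of this option, the sketch below clears an in-memory dictionary before the rule runs, so a value that was transformed earlier receives a fresh replacement. The `run_rule` function, the `overwrite` flag and the counter-based placeholder values are hypothetical.

```python
import itertools

_counter = itertools.count(1)  # stand-in for a real replacement generator

def run_rule(values, dictionary, overwrite=False):
    """Run a rule against the dictionary, optionally clearing it first."""
    if overwrite:
        dictionary.clear()  # previously stored values are no longer reused
    output = []
    for value in values:
        if value not in dictionary:
            dictionary[value] = f"anon-{next(_counter)}"
        output.append(dictionary[value])
    return output

project_dictionary = {}
first = run_rule(["Edith"], project_dictionary)
second = run_rule(["Edith"], project_dictionary)                  # reuses the stored value
third = run_rule(["Edith"], project_dictionary, overwrite=True)   # starts from scratch

print(first, second, third)  # ['anon-1'] ['anon-1'] ['anon-2']
```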
Gigantic does not store any source data in its database. Dictionary entries are hashed with a cryptographic function, so the process cannot be reversed to recover the original data.
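To illustrate the idea of hashed entries, the sketch below keys the dictionary on a digest of the source value instead of the value itself. The text does not say which cryptographic function is used, so HMAC-SHA-256 from Python's standard library and the `SECRET_KEY` value are assumptions chosen purely for the example.

```python
import hashlib
import hmac

SECRET_KEY = b"example-project-key"  # hypothetical key, for illustration only

def entry_key(value: str) -> str:
    """Return a hex digest that stands in for the source value in the dictionary."""
    return hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()

dictionary = {}
dictionary[entry_key("Susan")] = "Percy"  # only the digest is stored, not "Susan"

# A later lookup hashes the incoming value again and finds the same entry.
lookup = dictionary.get(entry_key("Susan"))
miss = dictionary.get(entry_key("Randal"))
```

The same input always yields the same digest, which is what allows matches to be found without keeping the original value.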