Posts

Showing posts with the label Metadata

Tokenizing Sensitive Information - PII Protection

The only way to protect sensitive information is to remove the sensitive values everywhere they are not absolutely needed. Data designers can remove the fields completely or change the field values so that they are useless in the case of data theft. Data tokenization and data encryption are two possible solutions to this issue. Both approaches must be implemented so that they return the same non-PII value for a given PII value every time they are invoked. We're going to talk about tokenization here. Tokenized field values must be changed in a repeatable way so that the attributes are still useful for joining data in queries or reports. This means every data set with the same value for the same PII field will have the same replaced value, which lets us retain the ability to join across datasets or tables using sensitive data fields. Every PII field has a typecode or a key. That type is used whenever...
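The repeatability requirement above can be sketched in a few lines of Python. This is only an illustration, assuming keyed hashing (HMAC-SHA256) as the tokenization mechanism; the post itself may use a different approach, such as a token lookup service. The `TYPE_KEYS` table, the `tokenize` function, and the sample records are hypothetical names invented for this example.

```python
import hashlib
import hmac

# Hypothetical per-type secrets, one per PII typecode. Real keys would come
# from a secrets manager, never from source code.
TYPE_KEYS = {
    "email": b"demo-key-for-email",
    "ssn": b"demo-key-for-ssn",
}


def tokenize(pii_type: str, value: str) -> str:
    """Return the same non-PII token every time for a given (type, value)."""
    key = TYPE_KEYS[pii_type]
    # Normalize first so trivially different spellings map to one token.
    normalized = value.strip().lower()
    digest = hmac.new(key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{pii_type}_{digest[:32]}"


# Two datasets that share a customer but were tokenized independently.
orders = [{"email": "Ann@example.com", "total": 40}]
tickets = [{"email": "ann@example.com ", "issue": "refund"}]

tok_orders = [{**row, "email": tokenize("email", row["email"])} for row in orders]
tok_tickets = [{**row, "email": tokenize("email", row["email"])} for row in tickets]

# The join still works on the tokenized column, with no raw PII involved.
joined = [
    {**o, **t}
    for o in tok_orders
    for t in tok_tickets
    if o["email"] == t["email"]
]
print(joined)
```

Using a separate key per PII type means the same raw string tokenized as two different field types produces unrelated tokens, so a token only joins against the same kind of field.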

The Future is Zero PII in Lakes and Analytical Stores

The only way to protect PII is to remove it from your Lake or other Analytical Stores. New regulations and laws create stiff penalties for data leaks and give consumers or customers the right to know all the places their data is used. We want to remove PII to meet new regulations while still retaining enough information to join across datasets. Recorded Talk. Speaker Notes not yet available...

Slice Splunk simpler and faster with better metadata

Splunk is a powerful event log indexing and search tool that lets you analyze large amounts of data. Event and log streams can be fed to the Splunk engine, where they are scanned and indexed. Splunk supports full text search plus highly optimized searches against metadata and extracted data fields. Extracted fields are outside the scope of this missive. Each log/event record consists of the log/event data itself and information about the log/event, known as metadata. For example, Splunk knows the originating host for each log/event. Queries can efficiently filter by full or partial host names without having to specifically put the host name in every log message. Message counts with metadata wildcards: One of the power features of metadata is that Splunk will provide a list of all metadata values and the number of matching messages as part of the result of any query. A Splunk query returns matching log/event records and the number of records in e...
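The idea of filtering on metadata wildcards and getting per-value message counts back can be modeled with a toy Python sketch. This is not Splunk's API or query language; it is a small in-memory imitation of the behavior described above. The `search` function, the sample events, and the choice of `host` and `source` as the metadata fields are all assumptions made for illustration.

```python
from collections import Counter
from fnmatch import fnmatch

# Each record carries the raw message plus metadata stored alongside it,
# so the host name never has to appear in the message text itself.
events = [
    {"host": "web-01", "source": "access.log", "raw": "GET /index.html 200"},
    {"host": "web-01", "source": "error.log", "raw": "upstream timed out"},
    {"host": "web-02", "source": "access.log", "raw": "GET /cart 500"},
    {"host": "db-01", "source": "postgres.log", "raw": "checkpoint complete"},
]


def search(records, text="", host="*"):
    """Return matching records plus a count of matches per metadata value."""
    matches = [
        r for r in records
        if fnmatch(r["host"], host) and text in r["raw"]
    ]
    counts = {
        field: Counter(r[field] for r in matches)
        for field in ("host", "source")
    }
    return matches, counts


# A partial host name acts as the filter; the counts summarize what matched.
matches, counts = search(events, host="web-*")
for field, by_value in counts.items():
    print(field, dict(by_value))
```

The counts come back with every query, so a broad wildcard search doubles as a quick survey of which hosts and sources are producing the matching messages.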