A "filter" corresponds to a type of sensitive information. Airlock has filters for sensitive information such as names, addresses, ages, and many others.
Airlock provides predefined filters that are ready to use, as well as custom filters that let you identify sensitive information beyond what the predefined filters cover. An example of a custom filter is one that identifies your patient account numbers, where the structure of an account number is specific to your organization.
Each filter identifies and redacts a specific type of sensitive information. For example, there is a filter for phone numbers, a filter for US Social Security numbers, and a filter for person names. You can enable any combination of these filters based on the types of sensitive information you need to redact.
This section of the documentation describes the filters available in Airlock. The configuration options vary from filter to filter, depending on the type of sensitive information. For instance, only the zip code filter has an option to truncate the zip code.
A selection of filters and their configurations is called a policy. A policy describes how to de-identify a document.
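To make the idea of a policy concrete, the sketch below models one as a plain Python dictionary and lists the filters it enables. The key names (`filters`, `enabled`, `truncate`) and the filter names are hypothetical and chosen only for illustration; they are not Airlock's documented policy schema.

```python
import json

# Hypothetical policy layout: every key and filter name here is an
# illustrative assumption, not Airlock's actual policy format.
policy = {
    "name": "example-policy",
    "filters": {
        "phoneNumber": {"enabled": True},
        "ssn": {"enabled": True},
        # Only the zip code filter carries a truncation option.
        "zipCode": {"enabled": True, "truncate": True},
        "age": {"enabled": False},
    },
}


def enabled_filters(policy):
    """Return the names of the filters the policy turns on, sorted."""
    return sorted(
        name
        for name, config in policy["filters"].items()
        if config.get("enabled", False)
    )


if __name__ == "__main__":
    print(enabled_filters(policy))
    print(json.dumps(policy, indent=2))
```

The point of the sketch is that a policy is just data: a set of filters, each with its own options, that together describe how a document is de-identified.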
Airlock uses several methods to identify person names.
Identifies ages such as
Identifies Bitcoin addresses such as
Identifies common cities
Identifies common counties
Identifies VISA, American Express, MasterCard, and Discover credit card numbers.
Identifies dates in many formats such as May 22, 1999
Identifies driver's license numbers for all 50 US states
Identifies email addresses
Identifies common hospital names and their abbreviations
Identifies international bank account numbers
Identifies IPv4 and IPv6 addresses
Identifies network MAC addresses
Identifies US passport numbers
Identifies phone numbers and phone number extensions
Identifies sections in text denoted by
Identifies US SSNs and TINs
Identifies US state names and abbreviations
Identifies UPS, FedEx, and USPS tracking numbers
Identifies vehicle identification numbers
Identifies US zip codes
Custom Filter Types of Sensitive Information
In addition to the predefined types of sensitive information listed in the table above, you can also define your own types of sensitive information. Through custom identifiers and dictionaries, Airlock can identify many other types of information that may be sensitive in your use case. For example, if you have patient identifiers that follow a pattern of AA-00000, you can define a custom identifier for this sensitive information.
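As a rough illustration of what a custom identifier for the AA-00000 pattern has to match, the following sketch uses a Python regular expression. It assumes the pattern means two uppercase letters, a hyphen, and five digits; the function name and the replacement text are hypothetical, and this is not how Airlock itself is configured.

```python
import re

# Assumed interpretation of AA-00000: two uppercase letters, a hyphen,
# then exactly five digits, bounded so longer tokens are not matched.
PATIENT_ID = re.compile(r"\b[A-Z]{2}-\d{5}\b")


def redact_patient_ids(text, replacement="[REDACTED]"):
    """Replace every substring matching the assumed pattern."""
    return PATIENT_ID.sub(replacement, text)


if __name__ == "__main__":
    print(redact_patient_ids("Patient AB-12345 was seen on Monday."))
```

The word boundaries keep the expression from matching inside longer tokens such as AB-123456, which is usually what you want for a fixed-width identifier.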
Airlock can be configured to identify sensitive information based on custom dictionaries. When a term in the dictionary is found in the text, Airlock treats the term as sensitive information and applies the given filter strategy.
Custom dictionaries support fuzziness to accommodate misspellings. The replacement strategy for a custom dictionary has a sensitivityLevel that controls the amount of allowed fuzziness.
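One common way to implement this kind of fuzziness is edit distance: a word matches a dictionary term if it can be turned into the term with a small number of single-character changes. The sketch below is a minimal illustration under that assumption. The level names and the mapping from level to maximum distance are hypothetical, and Airlock's actual matching algorithm and sensitivityLevel values may differ.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


# Hypothetical mapping from a sensitivity level to the number of
# single-character edits tolerated; not Airlock's documented values.
MAX_DISTANCE = {"off": 0, "low": 1, "medium": 2}


def find_dictionary_terms(text, dictionary, sensitivity_level="low"):
    """Return (token, term) pairs for words within the allowed distance.

    Only single-word dictionary terms are handled in this sketch.
    """
    limit = MAX_DISTANCE[sensitivity_level]
    matches = []
    for token in text.split():
        word = token.strip(".,;:!?").lower()
        for term in dictionary:
            if edit_distance(word, term.lower()) <= limit:
                matches.append((token, term))
                break
    return matches


if __name__ == "__main__":
    terms = ["amoxicillin"]
    print(find_dictionary_terms("Prescribed amoxicilin daily.", terms, "low"))
```

With the level set to "off" only exact matches are found, while "low" also catches the one-letter misspelling in the example. Allowing more edits catches more misspellings but also raises the chance of flagging unrelated words.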