Frequently Asked Questions (FAQ)
This page answers the most common questions about Philterd Data Services. If you cannot find the answer you are looking for, please reach out for support.
What is Philterd Data Services?
Philterd Data Services is a sophisticated, cloud-native platform offering a comprehensive suite of data capabilities specifically designed for the identification, protection, and management of Personally Identifiable Information (PII), Protected Health Information (PHI), and other categories of sensitive data.
Our platform lets users securely upload data in various formats, such as raw text and documents, for automated redaction and risk assessment. Philterd Data Services was built to be more than a redaction tool: it is a holistic data management platform that helps organizations understand and manage their data's risk profile.
What can Philterd Data Services do for me?
Philterd Data Services provides a robust set of tools to help you safeguard the PII and PHI contained within your organizational data. Key capabilities include:
- Automated Redaction: Efficiently remove or mask sensitive information from text, documents, and datasets based on highly configurable policies.
- Risk Assessments: Perform quantitative and qualitative analyses on your documents to identify the presence and concentration of sensitive information, helping you understand your data's risk profile.
- Highly Customizable Configurations: Tailor redaction operations and risk assessments to meet your specific compliance requirements and business needs through flexible, JSON-based policies.
- Auditability and Transparency: Maintain a verifiable record of all data protection activities with our immutable, cryptographically-verifiable redaction ledgers.
Learn about contexts, policies, and disambiguation features of redaction.
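This page does not describe how the redaction ledgers are implemented. As a general illustration of what "cryptographically verifiable" can mean, here is a minimal hash-chain sketch in Python. The entry fields (document_id, action, filter) and the function names are hypothetical and are not the Philterd Data Services API.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(previous_hash: str, payload: dict) -> str:
    """Hash an entry together with the hash of the entry before it."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((previous_hash + body).encode("utf-8")).hexdigest()


def append_entry(ledger: list, payload: dict) -> None:
    """Append a payload, chaining it to the current tail of the ledger."""
    previous_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    ledger.append({
        "payload": payload,
        "previous_hash": previous_hash,
        "hash": entry_hash(previous_hash, payload),
    })


def verify_ledger(ledger: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    previous_hash = GENESIS_HASH
    for entry in ledger:
        if entry["previous_hash"] != previous_hash:
            return False
        if entry["hash"] != entry_hash(previous_hash, entry["payload"]):
            return False
        previous_hash = entry["hash"]
    return True


# Hypothetical entries, for illustration only.
ledger = []
append_entry(ledger, {"document_id": "doc-1", "action": "redact", "filter": "email-address"})
append_entry(ledger, {"document_id": "doc-1", "action": "redact", "filter": "ssn"})
print(verify_ledger(ledger))  # True

ledger[0]["payload"]["filter"] = "phone-number"  # simulate tampering
print(verify_ledger(ledger))  # False
```

Because each entry's hash covers the previous entry's hash, changing any earlier record invalidates every check from that point on, which is what makes after-the-fact edits detectable.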
When is Philterd Data Services right for me?
Philterd Data Services is the ideal solution for organizations and individuals who need to manage documents containing sensitive information. You should consider using our platform if:
- You handle PII or PHI: You need to redact or manage Personally Identifiable Information (PII) or Protected Health Information (PHI) to comply with regulations such as HIPAA.
- Security is a Top Priority: You require a platform that prioritizes data sovereignty and does not transmit your data to third-party AI services (like OpenAI or Google Cloud AI).
- You need Auditability: Your workflows require a verifiable, immutable record of all redaction activities for compliance and auditing purposes.
- You want to understand Data Risk: You need more than just redaction; you want to quantify the risk profile of your documents through automated risk assessments.
- You require High Customizability: You need to tailor redaction rules to specific domains (e.g., medical, legal) or use custom lists of terms unique to your organization.
- You want to Automate Redaction: You are looking to integrate sophisticated PII/PHI identification and redaction into your own applications or workflows via a robust API.
How does Philterd Data Services use my data?
At Philterd Data Services, we prioritize your data's privacy and security. We adhere to the following strict data usage principles:
- Limited Purpose: We do not use your data for any purpose other than to perform the redaction and risk assessment operations you explicitly request.
- No Third-Party Sharing: We do not share, sell, or disclose your data to any third parties.
- No AI Training: We do not use your data to train our artificial intelligence or machine learning models.
- Strict Internal Governance: We maintain rigorous internal controls to ensure your data is handled according to the highest security standards.
We use no third-party services to process your data. For more detailed information, please refer to our full Privacy Policy.
Are my uploaded documents shared with anyone else?
No. Philterd Data Services is built on a foundation of data sovereignty and security. All redaction and analysis operations are performed exclusively in our cloud environment.
A key differentiator of our service is that we do not use any external, third-party APIs or services for redaction. Your documents are never transmitted to external services or companies. All AI and machine learning capabilities used by Philterd Data Services have been and will continue to be developed in-house, ensuring that your sensitive information remains within our controlled environment throughout the entire processing lifecycle.
For more information, please refer to our Privacy Policy or contact us.
What types of documents can be redacted?
Currently, Philterd Data Services supports the redaction and risk assessment of the following document types:
- PDF Documents: .pdf files. PDF files must be "searchable", meaning the PDF should contain text and not images of text. For examples, see below.
- Microsoft Word Documents: .docx files.
- Plain Text Files: .txt and other similar text-based formats.
Important Note regarding PDFs: To be processed successfully, PDF documents must be text-based (containing selectable text) and not scanned images. A simple way to verify this is to open the PDF and attempt to highlight text with your cursor. If you can highlight and select the text, the document is compatible. If the document is an image (e.g., a scan without OCR), our service will not be able to identify and redact the text within it at this time.
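The highlight test described above can also be approximated in code before uploading. The sketch below is not part of Philterd Data Services. It assumes the third-party pypdf library is installed for the extraction step, and the function names and the 20-character threshold are arbitrary choices made for illustration.

```python
def has_selectable_text(page_texts, min_chars=20):
    """Return True if any page yielded at least min_chars of non-whitespace text.

    The threshold is an arbitrary assumption; tune it for your documents.
    """
    for text in page_texts:
        if len("".join((text or "").split())) >= min_chars:
            return True
    return False


def extract_page_texts(path):
    """Extract the text layer of each page (assumes `pip install pypdf`)."""
    from pypdf import PdfReader  # imported lazily so the check above has no dependencies

    reader = PdfReader(path)
    return [page.extract_text() for page in reader.pages]


# A scanned page typically yields an empty string; a text-based page does not.
print(has_selectable_text(["", "   "]))  # False
print(has_selectable_text(["Patient: Jane Doe, admitted 2024-01-05"]))  # True
```

For a real file, the two functions would be combined as has_selectable_text(extract_page_texts("report.pdf")), where the file name is a placeholder. Note that this check passes as soon as one page contains text, so a document that mixes text pages with scanned pages would still need a manual look.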
Examples of Suitable PDF Documents
Note how in these example PDFs the text can be selected and highlighted with the cursor.
Examples of Unsuitable PDF Documents
Note how in these example PDFs the text cannot be selected and highlighted with the cursor.
What languages are supported by Philterd Data Services?
Philterd Data Services uses two categories of filters to identify sensitive information:
- Deterministic Filters (Pattern-Based): These filters use regular expressions and pattern matching to identify sensitive information based on specific formats and structures (e.g., Social Security Numbers, credit card numbers, phone numbers, email addresses). Deterministic filters are language-agnostic and will work with text in any language, as long as the data follows recognizable patterns.
- Non-Deterministic Filters (Named Entity Recognition - NER): These filters use advanced machine learning models to identify entities such as person names, locations, and organizations within text. Currently, the non-deterministic filters support English only. We are continuously working to expand language support for these NER-based filters.
For optimal results with non-English text, we recommend configuring your policies to primarily use deterministic filters that match the specific patterns relevant to your data.
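As a rough illustration of why pattern-based filters are language-agnostic, the sketch below applies Python's re module to a Spanish sentence. The patterns, the placeholder format, and the redact helper are simplified assumptions for illustration, not the filters Philterd Data Services actually uses.

```python
import re

# Simplified, hypothetical patterns; production filters need far more care.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email-address": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def redact(text: str) -> str:
    """Replace every pattern match with a placeholder naming its type."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub("{{REDACTED-" + label + "}}", text)
    return text


# The surrounding words are Spanish, but the patterns still match.
print(redact("Mi correo es ana.lopez@example.com y mi número es 123-45-6789."))
# Mi correo es {{REDACTED-email-address}} y mi número es {{REDACTED-ssn}}.
```

The patterns key on the shape of the data (digits and separators, or an @ between two name-like parts), so the language of the surrounding sentence does not matter. A person's name has no such fixed shape, which is why names rely on the NER-based filters instead.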
Is Philterd Data Services guaranteed to find and redact all sensitive information within my documents?
No. Philterd Data Services is not guaranteed to find and redact all sensitive information within your documents. While it employs advanced algorithms and machine learning models to identify and redact sensitive information, it is subject to limitations such as the quality of the training data, the complexity of the document structure, and the presence of ambiguous or context-dependent information.
Please see Mistakes for important information.
How much time does it take to process a document?
Often just seconds, but sometimes it can take a few minutes. Processing times vary based on the complexity and size of the document, the number of redactions performed, and the configuration of the selected policy. A policy configured to find many types of PII may take longer to process than a policy that is configured to find only a single type of PII.
What's the difference between Philterd Data Services and the open source Philter and Phileas projects?
Philterd Data Services is a managed, cloud-native platform that builds upon the core capabilities of the open-source Philter and Phileas projects.
While the open-source projects provide the foundational engines for PII/PHI identification and redaction, Philterd Data Services offers several significant advantages:
- Managed Infrastructure: We handle the complex tasks of deploying, scaling, and maintaining the redaction infrastructure, allowing you to focus on your data.
- Enhanced Capabilities: Our platform includes advanced features not found in the open-source versions, such as support for Microsoft Word documents.
- Centralized Management: Manage all your redaction policies, audit logs, and risk assessments through a single, intuitive web dashboard.
- Enterprise-Grade Security and Compliance: Philterd Data Services is architected for high security and is HIPAA-compliant, featuring encrypted storage, rigorous access controls, and detailed auditability.
- Quantitative Risk Assessments: Go beyond simple redaction with sophisticated tools that analyze and quantify the sensitive data risk within your documents.
The Philterd team maintains the Philter and Phileas open source projects, and much of Philterd Data Services is built upon those two projects.
- Learn more about Philter or launch it in your own cloud
- View the open source Phileas project on GitHub
How much does Philterd Data Services cost and how is it billed?
Philterd Data Services operates on a subscription-based billing model, processed on a monthly cycle. Our pricing structure consists of two components:
- Flat Monthly Platform Fee: A base fee that provides access to the platform and its core features.
- Usage-Based Pricing: A variable fee based on the volume of data processed (e.g., number of documents or characters redacted).
For detailed and up-to-date pricing information, please visit our Pricing page. Currently, we require a valid credit card for account registration, and this is the primary method for monthly billing.
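To make the two components concrete, a monthly bill can be modeled as the flat fee plus the usage volume multiplied by a per-unit rate. The figures below are invented for illustration and are not actual prices; the function name and the choice of "documents" as the billing unit are also assumptions.

```python
from decimal import Decimal


def monthly_cost(platform_fee: Decimal, units_processed: int, rate_per_unit: Decimal) -> Decimal:
    """Flat platform fee plus a usage charge of units_processed * rate_per_unit."""
    return platform_fee + rate_per_unit * units_processed


# Hypothetical figures: a 50.00 platform fee and 0.01 per document.
print(monthly_cost(Decimal("50.00"), 1000, Decimal("0.01")))  # 60.00
```

With zero usage in a month, the result is just the platform fee, which matches the description of the flat fee as the base cost of access.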
Why does the pricing include a monthly fee in addition to usage-based pricing?
The monthly platform fee is essential for maintaining the high-quality infrastructure and value-added features we provide. Specifically, this fee covers:
- Secure Data Storage: The costs associated with the encrypted storage of your documents and their associated metadata.
- Infrastructure and Bandwidth: The operational expenses for the high-performance servers and network bandwidth required to process and deliver your documents.
- Premium Platform Features: Access to features that do not have separate usage charges, such as the redaction ledgers, contexts, disambiguation, and policy management.
Is Philterd Data Services HIPAA certified?
It is important to clarify that there is no official "HIPAA certification" granted by a governing body to software companies. Instead, HIPAA compliance is a matter of an organization demonstrating that it has implemented the required administrative, physical, and technical safeguards.
Philterd Data Services demonstrates its commitment to HIPAA compliance through:
- Extensive Training: Our team is small and specialized, and every member has completed comprehensive HIPAA training.
- Regular Risk Assessments: We perform ongoing risk assessments that analyze our platform's configuration, code base, and operational procedures.
- Industry-Standard Safeguards: We implement industry-leading best practices and technical safeguards designed specifically for the protection of sensitive health information.
Is Philterd Data Services HIPAA compliant?
Yes. Philterd Data Services is architected and operated to satisfy the requirements of the Health Insurance Portability and Accountability Act (HIPAA) as it applies to electronic Protected Health Information (ePHI).
Our compliance measures include:
- Encryption at Rest: All documents and metadata are stored using strong, industry-standard encryption.
- Encryption in Transit: All data transmitted between your systems and our platform is protected by secure, encrypted protocols (TLS/SSL).
- Access Controls: Rigorous internal access controls ensure that only authorized processes and personnel have access to the necessary data.
For a more in-depth look at our security posture, please visit our Security documentation page.
Does Philterd Data Services provide a BAA?
Yes. We provide a Business Associate Agreement (BAA) for customers who process PHI on our platform. You can request a BAA from your My Account page. Please do not use Philterd Data Services to process PHI without a signed BAA.
Can Philterd Data Services run in my own cloud?
Please contact us to discuss your requirements and options for an on-premises deployment.