Getting Started
Launching Phirestream
Follow your cloud provider’s steps for launching Phirestream in your cloud. Once Phirestream has been launched and its virtual machine is running, continue with this guide to configure Phirestream.
Phirestream can be used with self-managed Apache Kafka clusters and managed hosting services, such as Amazon MSK, Confluent Cloud, and Instaclustr.
Configuring Phirestream
With Phirestream now running, we can configure it. In this section we configure how Phirestream listens for incoming data to redact and where to find the downstream Apache Kafka brokers.
Open the Phirestream configuration file at /opt/phirestream/config/application.properties. Set the value of the kafka.bootstrap.servers property to the location of your Apache Kafka broker(s). Use the command below to restart Phirestream so that the change takes effect. (For a full list of the available Phirestream settings, see Settings.)
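The configuration edit itself (as opposed to the restart) can also be scripted. The following is a minimal Python sketch under stated assumptions: the helper name set_property is hypothetical, it only handles simple key=value lines, and the broker addresses in the usage note are placeholders rather than values from this guide.

```python
from pathlib import Path


def set_property(text: str, key: str, value: str) -> str:
    """Return properties text with `key` set to `value`.

    Replaces the first uncommented `key=...` line, or appends a new line
    if the key is not present. Only the `=` separator is handled.
    """
    lines = text.splitlines()
    for i, line in enumerate(lines):
        stripped = line.strip()
        # Skip comment lines, which start with '#' or '!' in properties files.
        if stripped.startswith(("#", "!")):
            continue
        name, sep, _ = stripped.partition("=")
        if sep and name.strip() == key:
            lines[i] = f"{key}={value}"
            break
    else:
        # The key was not found, so add it at the end.
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


def update_config(path: str, servers: str) -> None:
    """Rewrite the file at `path` with kafka.bootstrap.servers set to `servers`."""
    config = Path(path)
    config.write_text(set_property(config.read_text(), "kafka.bootstrap.servers", servers))
```

For example, update_config("/opt/phirestream/config/application.properties", "broker1:9092,broker2:9092") would point Phirestream at two placeholder brokers. Phirestream still needs to be restarted afterwards.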
Once Phirestream restarts, we are ready to publish and redact text. Phirestream’s API endpoint is accessible at https://phirestream:8080, where phirestream is the IP address or DNS name of the Phirestream virtual machine.
Using Phirestream to Redact Text
The following command will publish a single message to Phirestream. In this request, the text George Washington was president is being published to the Apache Kafka topic mytopic.
Phirestream implements Apache Kafka’s REST API. This means that Phirestream can be a drop-in solution for redacting text in your streaming data pipelines.
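As an illustration of what such a publish request might look like from code, here is a hedged Python sketch. It assumes that the endpoint accepts the produce format used by the Confluent REST Proxy v2 API (a POST to /topics/{topic} with a records array and the application/vnd.kafka.json.v2+json content type). That format is an assumption, not something this guide confirms for Phirestream, and the helper name build_publish_request is hypothetical.

```python
import json
import urllib.parse
import urllib.request


def build_publish_request(base_url: str, topic: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that publishes `text` to `topic`.

    Assumption: the server accepts the Confluent REST Proxy v2 produce format.
    """
    payload = {"records": [{"value": text}]}
    # Percent-encode the topic so it is safe to place in the URL path.
    url = f"{base_url.rstrip('/')}/topics/{urllib.parse.quote(topic, safe='')}"
    return urllib.request.Request(
        url=url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
        method="POST",
    )


request = build_publish_request(
    "https://phirestream:8080", "mytopic", "George Washington was president"
)
# To actually send it, assuming the host is reachable and its certificate is trusted:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Building the request separately from sending it makes it easy to inspect the URL, headers, and body before anything goes over the network.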
Consuming the Redacted Text
Now, we will use Apache Kafka to consume from the mytopic topic to get the redacted message:
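One way to consume the topic programmatically is sketched below in Python. It assumes the third-party kafka-python package is installed and that the brokers are reachable without authentication. The function names and the default timeout are illustrative choices, not part of Phirestream.

```python
from typing import Iterator, Optional


def decode_value(raw: Optional[bytes]) -> str:
    """Decode a Kafka message value as UTF-8 text (empty string for null values)."""
    return "" if raw is None else raw.decode("utf-8")


def consume_text(topic: str, bootstrap_servers: str, timeout_ms: int = 5000) -> Iterator[str]:
    """Yield the text of each message in `topic`, starting from the earliest offset.

    Iteration stops once no new message arrives within `timeout_ms`.
    """
    # Imported here so the module loads even when kafka-python is not installed.
    from kafka import KafkaConsumer  # third-party: pip install kafka-python

    consumer = KafkaConsumer(
        topic,
        bootstrap_servers=bootstrap_servers,
        auto_offset_reset="earliest",
        consumer_timeout_ms=timeout_ms,
        value_deserializer=decode_value,
    )
    try:
        for record in consumer:
            yield record.value
    finally:
        consumer.close()
```

Calling consume_text("mytopic", "broker1:9092"), where broker1:9092 is a placeholder broker address, would yield the redacted messages as plain strings.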
The output of the command is a single message with the following content:
You are now ready to redact more streaming text with Phirestream!
Summary
In this example we can see that Phirestream received the request, redacted the person’s name as sensitive information, and published the modified data to Apache Kafka.
The types of sensitive information that are identified by Phirestream are defined in files called filter profiles. A filter profile specifies the types of sensitive information and how to redact those types. Phirestream selects which filter profile to apply based on the name of the Apache Kafka topic. In the example above, the topic name was default so the filter profile named default was applied.
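The idea of choosing a filter profile by topic name can be pictured with a small Python sketch. This is purely illustrative: the dictionary layout, the entity and strategy names, and the function select_filter_profile are all hypothetical and do not reflect Phirestream’s actual filter profile format or implementation. What happens when no profile matches a topic is not covered here, so the sketch simply returns None in that case.

```python
from typing import Dict, Optional

# Hypothetical in-memory stand-in for filter profiles, keyed by profile name.
# Each profile maps a type of sensitive information to a redaction strategy.
PROFILES: Dict[str, Dict[str, str]] = {
    "default": {"person-name": "redact"},
}


def select_filter_profile(
    topic: str, profiles: Dict[str, Dict[str, str]]
) -> Optional[Dict[str, str]]:
    """Return the profile whose name equals the topic name, or None if absent."""
    return profiles.get(topic)


profile = select_filter_profile("default", PROFILES)
```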
You are now ready to begin using Phirestream to manage sensitive information in your streaming text!