Kafka Source Template
The Kafka source template generates an OpenTelemetry configuration that can be sent to a remotely managed OpenTelemetry collector (otelcol). By creating this source template and pushing the configuration to the appropriate OpenTelemetry agent, you can ensure the collection of Kafka logs and metrics in Sumo Logic.
Fields created by the source template
When you create a source template, the following fields are automatically added (if they don't already exist):
- sumo.datasource. Fixed value of kafka.
- messaging.system. Fixed value of kafka.
- deployment.environment. This is a user-configured field set at the time of collector installation. It identifies the environment where the Kafka env resides, such as dev, prod, or qa.
- messaging.cluster.name. User configured. Enter a name to uniquely identify your Kafka cluster. This cluster name will be shown in the Sumo Logic dashboards.
- messaging.node.name. Includes the value of the hostname of the machine which is being monitored.
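As a rough illustration of the fields listed above, the following Python sketch builds the set of key/value pairs you would expect to see on data from one monitored host. The function name `expected_fields` and the sample values `dev` and `kafka-dev-01` are hypothetical; the collector itself populates these fields, so this is only a way to picture the result.

```python
import socket


def expected_fields(environment, cluster_name):
    """Build the documented fields for one monitored host (illustrative only)."""
    return {
        "sumo.datasource": "kafka",                   # fixed value
        "messaging.system": "kafka",                  # fixed value
        "deployment.environment": environment,        # set at collector installation
        "messaging.cluster.name": cluster_name,       # user configured
        "messaging.node.name": socket.gethostname(),  # hostname of the monitored machine
    }


if __name__ == "__main__":
    # Hypothetical sample values; substitute your own environment and cluster name.
    for key, value in expected_fields("dev", "kafka-dev-01").items():
        print(f"{key} = {value}")
```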
Prerequisites
For metrics collection
The Kafka metrics receiver collects Kafka metrics (brokers, topics, partitions, and consumer groups) from the Kafka server. This app has been tested with the following Kafka versions: 2.x and 3.x.
For logs collection
In this section, you'll configure logging in Kafka. By default, Kafka logs (server.log and controller.log) are stored in the directory called /opt/Kafka/kafka_<VERSION>/logs. Make a note of this logs directory.
Ensure that the otelcol has adequate permissions to access all log file paths. Execute the following command:
sudo setfacl -R -m d:u:otelcol-sumo:r-x,u:otelcol-sumo:r-x,g:otelcol-sumo:r-x <PATH_TO_LOG_FILE>
For Linux systems with ACL support, the otelcol install process should have created the ACL grants necessary for the otelcol system user to access default log locations. You can verify the active ACL grants using the getfacl command. If ACL is not installed in your Linux environment, install it.
In rare cases the required ACL may not be supported, for example on a Linux distribution that Sumo Logic does not officially support. In this case, you can run the following command to explicitly grant the permissions.
sudo setfacl -R -m d:u:otelcol-sumo:r-x,d:g:otelcol-sumo:r-x,u:otelcol-sumo:r-x,g:otelcol-sumo:r-x <PATH_TO_LOG_FILE>
Run the above command for all log files that need to be ingested and do not reside in the default location.
If Linux ACL support is not available, traditional Unix-style user and group permissions must be modified. It should be sufficient to add the otelcol system user to the specific group that has access to the log files.
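One way to confirm that the permissions described above took effect is to run a small check as the collector's system user. The Python sketch below is an assumption-laden helper, not part of the otelcol tooling: the function name `check_log_access` is hypothetical, and it only reports whether the current process can read each path it is given.

```python
import os
import sys


def check_log_access(paths):
    """Return {path: status}, where status is 'ok', 'missing', or 'unreadable'."""
    results = {}
    for path in paths:
        try:
            os.stat(path)
        except (FileNotFoundError, NotADirectoryError):
            results[path] = "missing"
            continue
        except PermissionError:
            # A parent directory is not searchable by the current user.
            results[path] = "unreadable"
            continue
        # os.access checks read permission for the user running this script.
        results[path] = "ok" if os.access(path, os.R_OK) else "unreadable"
    return results


if __name__ == "__main__":
    # Pass the log file paths as arguments, for example server.log and controller.log.
    for path, status in check_log_access(sys.argv[1:]).items():
        print(f"{status:10} {path}")
```

Run it as the user the collector runs under (for example, through sudo -u with that user) so the result reflects the collector's permissions and not your own.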
For Windows systems, the collected log files should be accessible by the SYSTEM group. Use the following set of PowerShell commands if the SYSTEM group does not have access.
$NewAcl = Get-Acl -Path "<PATH_TO_LOG_FILE>"
# Set properties
$identity = "NT AUTHORITY\SYSTEM"
$fileSystemRights = "ReadAndExecute"
$type = "Allow"
# Create a new rule
$fileSystemAccessRuleArgumentList = $identity, $fileSystemRights, $type
$fileSystemAccessRule = New-Object -TypeName System.Security.AccessControl.FileSystemAccessRule -ArgumentList $fileSystemAccessRuleArgumentList
# Apply new rule
$NewAcl.SetAccessRule($fileSystemAccessRule)
Set-Acl -Path "<PATH_TO_LOG_FILE>" -AclObject $NewAcl
Configuring the Kafka source template
Follow these steps to set up and deploy the source template to a remotely managed OpenTelemetry collector.
Step 1: Set up remotely managed OpenTelemetry collector
In this step, we'll install the collector and add a uniquely identifiable tag to these collectors.
- Classic UI. In the main Sumo Logic menu, select Manage Data > Collection > OpenTelemetry Collection.
  New UI. In the Sumo Logic top menu select Configuration, and then under Data Collection select OpenTelemetry Collection. You can also click the Go To... menu at the top of the screen and select OpenTelemetry Collection.
- On the OpenTelemetry Collection page, click + Add Collector.
- In the Set up Collector step:
- Choose your platform (for example, Linux).
- Enter your Installation Token.
- Under Tag data on Collector level, add a new tag to identify these collectors as having Kafka running on them (for example, application = Kafka).
- Leave the Collector Settings at their default values to configure collectors as remotely managed.
- Under Generate and run the command to install the collector, copy and run the installation command in your system terminal where the collector needs to be installed.
- After installation is complete, click Next to proceed.
- Select a source template (for example, Kafka source template) to start collecting logs from all linked collectors, then proceed with the data configuration.
To revisit this screen later: From the Classic UI, select Manage Data > Collection > Source Template. From the New UI, select Configuration > Source Template.
Step 2: Configure the source template
In this step, you will configure the YAML required for Kafka collection. Below are the inputs required for configuration:
- Name. Name of the source template.
- Description. Description for the source template.
- Server file log path. Enter the path to the server log file for your Kafka instance.
- Controller file log path. Enter the path to the controller log file for your Kafka instance.
- Endpoint. The URL of the broker endpoint (default: localhost:9092).
- Fields/Metadata. You can provide any custom fields to be tagged with the data collected. By default, Sumo Logic tags _sourceCategory with the value otel/kafka. You need to provide the value for webengine.cluster.name.
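Before entering the Endpoint value, it can help to confirm that the broker address is reachable from the host where the collector runs. The following Python sketch is a minimal check under stated assumptions: the helper names `split_endpoint` and `broker_reachable` are hypothetical, only plain `host:port` endpoints are handled (no IPv6 literals), and a successful TCP connection does not prove the listener is actually a Kafka broker.

```python
import socket


def split_endpoint(endpoint, default_port=9092):
    """Split 'host:port' into (host, port); fall back to the default port 9092."""
    host, sep, port = endpoint.rpartition(":")
    if not sep:
        return endpoint, default_port
    return host, int(port)


def broker_reachable(endpoint, timeout=3.0):
    """Return True if a TCP connection to the endpoint succeeds within the timeout."""
    host, port = split_endpoint(endpoint)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, calling `broker_reachable("localhost:9092")` on the Kafka host returns a boolean that indicates whether something is listening on the default endpoint.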
Advanced options for log collection can be used as follows:
- Timestamp Format. By default, Sumo Logic will automatically detect the timestamp format of your logs. However, you can manually specify a timestamp format for a source by configuring the following:
  - Timestamp locator. Use a Go regular expression to match the timestamp in your logs. Ensure the regular expression includes a named capture group called timestamp_field.
  - Layout. Specify the exact layout of the timestamp to be parsed. For example, %Y-%m-%dT%H:%M:%S.%LZ. To learn more about the formatting rules, refer to this guide.
  - Location (Time zone). Define the geographic location (timezone) to use when parsing a timestamp that does not include a timezone. The available locations depend on the local IANA Time Zone database. For example, America/New_York. See more examples here.
- Processing Rules. You can add processing rules for logs/metrics collected. To learn more, refer to Processing Rules.
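To make the timestamp options above more concrete, here is a small Python sketch that tries out a locator expression against a sample line. It rests on several assumptions: the sample line is made up and follows what is assumed to be Kafka's default log4j layout (`[date time,millis] LEVEL message`), the matching layout for the form would then be `%Y-%m-%d %H:%M:%S,%L`, and Python's `%f` stands in for `%L` because Python has no millisecond directive. The named-group syntax `(?P<timestamp_field>...)` is the same in Go and Python, so the expression can be prototyped here before pasting it into the form.

```python
import re
from datetime import datetime
from zoneinfo import ZoneInfo

# Made-up sample line in the assumed default Kafka log format; adjust to your logs.
SAMPLE = "[2024-01-15 10:23:45,123] INFO Sample startup message (kafka.server.KafkaServer)"

# Timestamp locator: the named capture group must be called timestamp_field.
LOCATOR = re.compile(
    r"^\[(?P<timestamp_field>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]"
)

# Python counterpart of the layout %Y-%m-%d %H:%M:%S,%L (%f accepts 1 to 6 digits).
PY_LAYOUT = "%Y-%m-%d %H:%M:%S,%f"


def extract_timestamp(line, location=None):
    """Return the parsed timestamp, or None if the locator does not match."""
    match = LOCATOR.search(line)
    if match is None:
        return None
    parsed = datetime.strptime(match.group("timestamp_field"), PY_LAYOUT)
    if location is not None:
        # Attach an IANA zone when the log line itself carries no timezone.
        parsed = parsed.replace(tzinfo=ZoneInfo(location))
    return parsed


if __name__ == "__main__":
    print(extract_timestamp(SAMPLE))
```

Passing a location such as `America/New_York` relies on the IANA time zone database being available on the machine, which mirrors the note in the Location option.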
Step 3: Push the source template to the desired remotely managed collectors
A new source template will always be created with the latest version of the source template.
Follow the steps below to create a data collection configuration to gather the required logs and link them to all the collectors with the help of collector tags.
- Complete the source template form with the name and file path for your logs (for example, the server log or controller log), then click Next.
- Under Link Collectors, you will have the option to link the collectors using the collector name or by adding tags to find the group of collectors (for example, application = Kafka).
- Preview and confirm the collectors that will be linked (fetched automatically) to the newly created source template.
- Click Next to complete the source template creation. In the background, the system will apply the configuration to all the linked collectors and will start collecting the respective telemetry data from the remote host (in this example, it would start collecting Kafka logs and metrics).
- Click the Log Search or Metrics Search icons to search for and analyze your data collected for this source template.
Refer to the changelog for information on periodic updates to this source template.