
Elasticsearch - Classic Collector


The Elasticsearch app is a unified logs and metrics app that helps you monitor the availability, performance, health, and resource utilization of your Elasticsearch clusters. Preconfigured dashboards provide insight into cluster health, resource utilization, sharding, garbage collection, and search, index, and cache performance.

Sample log messages

{
"type":"server",
"timestamp":"2021-07-12T05:12:07,101+0000",
"level":"WARN",
"component":"o.e.c.NodeConnectionsService",
"cluster.name":"elasticsearch",
"node.name":"elasticsearch-master-0",
"cluster.uuid":"pQ372ZkIQiaHkSVp6hlxZw",
"node.id":"7PdqQlHYRjqbzClkTeoVdA",
"message":"failed to connect to {elasticsearch-master-1}{OfUoMAwoRoKr2sAlYAYuEA}{RnYfI0DUT9uqtF4h5aVDQg}{10.42.1.143}{10.42.1.143:9300}{dim}{ml.machine_memory=2147483648, ml.max_open_jobs=20, xpack.installed=true} (tried [1] times)"
}
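As a rough illustration of the structure shown above, the following Python sketch parses one JSON server log line and pulls out the fields most useful for triage. The `summarize` helper and the shortened `SAMPLE` message are assumptions made for this example; they are not part of the app.

```python
import json

# Shortened copy of the sample message above (the "message" text is abbreviated).
SAMPLE = (
    '{"type":"server",'
    '"timestamp":"2021-07-12T05:12:07,101+0000",'
    '"level":"WARN",'
    '"component":"o.e.c.NodeConnectionsService",'
    '"cluster.name":"elasticsearch",'
    '"node.name":"elasticsearch-master-0",'
    '"message":"failed to connect to {elasticsearch-master-1} (tried [1] times)"}'
)


def summarize(line: str) -> dict:
    """Parse one JSON log line and keep a few fields (hypothetical helper)."""
    record = json.loads(line)
    level = record.get("level")
    return {
        "level": level,
        "component": record.get("component"),
        "cluster": record.get("cluster.name"),
        "node": record.get("node.name"),
        # Treat WARN and ERROR lines as worth a closer look.
        "is_problem": level in {"WARN", "ERROR"},
    }


if __name__ == "__main__":
    print(summarize(SAMPLE))
```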

Collecting logs and metrics for the Elasticsearch app

Configuring log and metric collection for the Elasticsearch app includes the following tasks.

Configure Collection for Elasticsearch

In Kubernetes environments, we use the Telegraf Operator, which is packaged with our Kubernetes collection. You can learn more about it here. The diagram below illustrates how data is collected from Elasticsearch in a Kubernetes environment. Four services in the architecture shown below make up the metric collection pipeline: Telegraf, Telegraf Operator, Prometheus, and Sumo Logic Distribution for OpenTelemetry Collector.


[Diagram: Elasticsearch data collection in a Kubernetes environment]

The first service in the metrics pipeline is Telegraf, which collects metrics from Elasticsearch. We run Telegraf as a sidecar in each pod we want to collect metrics from; that is, Telegraf runs in the same pod as the containers it monitors. Telegraf uses the Elasticsearch input plugin to obtain metrics. For simplicity, the diagram doesn’t show the input plugins. The Telegraf Operator injects the Telegraf sidecar container. Prometheus pulls metrics from Telegraf and sends them to the Sumo Logic Distribution for OpenTelemetry Collector, which enriches metadata and sends metrics to Sumo Logic.

In the logs pipeline, Sumo Logic Distribution for OpenTelemetry Collector collects logs written to standard out and forwards them to another instance of Sumo Logic Distribution for OpenTelemetry Collector, which enriches metadata and sends logs to Sumo Logic.

Follow the instructions below to set up log and metric collection:

Prerequisites

It’s assumed that you are using the latest Helm chart version. If not, upgrade using the instructions here.

Configure Logs Collection

This section explains the steps to collect Elasticsearch logs from a Kubernetes environment.

  1. (Recommended Method) Add labels on your Elasticsearch pods to capture logs from standard output on Kubernetes.

    1. Apply the following labels to the Elasticsearch pods:
    environment = "dev_CHANGE_ME"
    component = "database"
    db_system = "elasticsearch"
    db_cluster = "elasticsearch_on_k8s_CHANGE_ME"
    db_cluster_address = "ENV_TO_BE_CHANGED"
    db_cluster_port = "ENV_TO_BE_CHANGED"
    2. Enter values for the following parameters (marked CHANGE_ME or ENV_TO_BE_CHANGED above):
    • environment. This is the deployment environment where the Elasticsearch cluster resides. For example, dev, prod, or QA. While this value is optional, we highly recommend setting it.
    • db_cluster. Enter a name to identify this Elasticsearch cluster. This cluster name will be shown in the Sumo Logic dashboards.
    • db_cluster_address. Enter the cluster hostname or IP address that is used by the application to connect to the database. It could also be the load balancer or proxy endpoint.
    • db_cluster_port. Enter the database port. If not provided, a default port will be used.
    note

    db_cluster_address and db_cluster_port should exactly reflect the DB client configuration in your application, especially if you instrument it with OT tracing. The values of these fields should exactly match the connection string used by the database client (reported as values for the net.peer.name and net.peer.port metadata fields).

    For example, if your application uses “elasticsearch-prod.sumologic.com:3306” as the connection string, the field values should be set as follows: db_cluster_address=elasticsearch-prod.sumologic.com db_cluster_port=3306

    If your application connects directly to a given Elasticsearch node, rather than the whole cluster, use the application connection string to override the value of the “host” field in the Telegraf configuration: host=elasticsearch-prod.sumologic.com

    Pivoting to Tracing data from Entity Inspector is possible only for “Elasticsearch address” Entities.

    • Do not modify the following values, as doing so will prevent the Sumo Logic apps from functioning correctly.
      • component: “database” - This value is used by Sumo Logic apps to identify application components.
      • db_system: “elasticsearch” - This value identifies the database system.
    • See this doc for more parameters that can be configured in the Telegraf agent globally.
    3. The Sumologic-Kubernetes-Collection automatically captures logs from stdout and sends them to Sumo Logic. For more information on deploying Sumologic-Kubernetes-Collection, visit here.
    4. Verify logs in Sumo Logic.
  2. (Optional) Collecting Elasticsearch Logs from a Log File. Follow the steps below to capture Elasticsearch logs from a log file on Kubernetes.

    1. Determine the location of the Elasticsearch log file on Kubernetes. This can be determined from the log4j2.properties file for your Elasticsearch cluster along with the mounts on the Elasticsearch pods.
    2. Install the Sumo Logic tailing sidecar operator.
    3. Add the following annotation in addition to the existing annotations.
    annotations:
      tailing-sidecar: sidecarconfig;<mount>:<path_of_Elasticsearch_log_file>/<Elasticsearch_log_file_name>

    Example:

    annotations:
      tailing-sidecar: sidecarconfig;data:/usr/share/elasticsearch/logs/gc.log
    4. Make sure that the Elasticsearch pods are running and annotations are applied by using the command:
    kubectl describe pod <Elasticsearch_pod_name>
    5. Sumo Logic Kubernetes collection will automatically start collecting logs from the pods having the annotations defined above.
    6. Verify logs in Sumo Logic.
  3. Add a Field Extraction Rule (FER) to normalize the fields in Kubernetes environments. This step is not needed if you are using the application components solution Terraform script. Labels created in Kubernetes environments are automatically prefixed with pod_labels. To normalize these so that the app works, create a Field Extraction Rule, if one has not already been created for Database Application Components. To do so:

    1. Classic UI. In the main Sumo Logic menu, select Manage Data > Logs > Field Extraction Rules.
      New UI. In the top menu select Configuration, and then under Logs select Field Extraction Rules. You can also click the Go To... menu at the top of the screen and select Field Extraction Rules.
    2. Click the + Add button on the top right of the table.
    3. The Add Field Extraction Rule form will appear.
    4. Enter the following options:
    • Rule Name. Enter the name as App Observability - Database.
    • Applied At. Choose Ingest Time.
    • Scope. Select Specific Data, and then enter the following keyword search expression:
    pod_labels_environment=* pod_labels_component=database pod_labels_db_system=* pod_labels_db_cluster=*
    • Parse Expression. Enter the following parse expression:
    if (!isEmpty(pod_labels_environment), pod_labels_environment, "") as environment
    | pod_labels_component as component
    | pod_labels_db_system as db_system
    | if (!isEmpty(pod_labels_db_cluster), pod_labels_db_cluster, null) as db_cluster
    5. Click Save to create the rule.
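To make the effect of the Field Extraction Rule easier to picture, here is a minimal Python approximation of what the scope and parse expression do to a record's fields. It is only a sketch: Sumo Logic evaluates the rule in its own query language at ingest time, and the function names `in_scope` and `normalize`, as well as the sample cluster name, are assumptions made for this example.

```python
REQUIRED = ("pod_labels_environment", "pod_labels_db_system", "pod_labels_db_cluster")


def in_scope(fields: dict) -> bool:
    """Approximate the scope: three labels present and component equal to 'database'."""
    has_labels = all(fields.get(name) for name in REQUIRED)
    return has_labels and fields.get("pod_labels_component") == "database"


def normalize(fields: dict) -> dict:
    """Approximate the parse expression: copy pod_labels_* values to unprefixed fields."""
    out = dict(fields)
    # Empty or missing environment becomes an empty string.
    out["environment"] = fields.get("pod_labels_environment") or ""
    out["component"] = fields.get("pod_labels_component")
    out["db_system"] = fields.get("pod_labels_db_system")
    # Empty or missing db_cluster becomes None (null in the rule).
    out["db_cluster"] = fields.get("pod_labels_db_cluster") or None
    return out


if __name__ == "__main__":
    record = {
        "pod_labels_environment": "prod",
        "pod_labels_component": "database",
        "pod_labels_db_system": "elasticsearch",
        "pod_labels_db_cluster": "es_prod_example",  # hypothetical cluster name
    }
    if in_scope(record):
        print(normalize(record))
```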

Configure Metrics Collection

This section explains the steps to collect Elasticsearch metrics from a Kubernetes environment, where we use the Telegraf Operator, which is packaged with our Kubernetes collection. You can learn more about this here. Follow the steps listed below to collect metrics from a Kubernetes environment:

  1. Set up Kubernetes Collection with the Telegraf Operator.
  2. On your Elasticsearch Pods, add the following annotations:
    annotations:
      telegraf.influxdata.com/class: sumologic-prometheus
      prometheus.io/scrape: "true"
      prometheus.io/port: "9273"
      telegraf.influxdata.com/inputs: |+
        [[inputs.elasticsearch]]
          servers = ["http://<USER_CHANGE_ME>:<PASS_CHANGE_ME>@localhost:9200"]
          http_timeout = "5s"
          local = true
          cluster_health = true
          cluster_stats = true
          cluster_stats_only_from_master = false
          indices_include = ["_all"]
          indices_level = "cluster"
          [inputs.elasticsearch.tags]
            environment = "ENV_TO_BE_CHANGED"
            component = "database"
            db_system = "elasticsearch"
            db_cluster = "ENV_TO_BE_CHANGED"
            db_cluster_address = "ENV_TO_BE_CHANGED"
            db_cluster_port = "ENV_TO_BE_CHANGED"
  3. Enter values for the following parameters (marked CHANGE_ME or ENV_TO_BE_CHANGED above):
    • telegraf.influxdata.com/inputs. This contains the required configuration for the Telegraf Elasticsearch input plugin. Refer to this doc for more information on configuring the Elasticsearch input plugin for Telegraf. Note: Because Telegraf runs as a sidecar, the host should always be localhost.

    • In the input plugins section, that is [[inputs.elasticsearch]]:

      • servers - The URL to the Elasticsearch server. This can be a comma-separated list to connect to multiple Elasticsearch servers. Please see this doc for more information on additional parameters for configuring the Elasticsearch input plugin for Telegraf.
    • In the tags section, which is [inputs.elasticsearch.tags]:

      • environment. This is the deployment environment where the Elasticsearch cluster identified by the value of servers resides. For example, dev, prod, or QA. While this value is optional, we highly recommend setting it.
      • db_cluster. Enter a name to identify this Elasticsearch cluster. This cluster name will be shown in the Sumo Logic dashboards.
      • db_cluster_address. Enter the cluster hostname or IP address that is used by the application to connect to the database. It could also be the load balancer or proxy endpoint.
      • db_cluster_port. Enter the database port. If not provided, a default port will be used.
      note

      db_cluster_address and db_cluster_port should exactly reflect the DB client configuration in your application, especially if you instrument it with OT tracing. The values of these fields should exactly match the connection string used by the database client (reported as values for the net.peer.name and net.peer.port metadata fields).

      For example, if your application uses “elasticsearch-prod.sumologic.com:3306” as the connection string, the field values should be set as follows: db_cluster_address=elasticsearch-prod.sumologic.com db_cluster_port=3306

      If your application connects directly to a given Elasticsearch node, rather than the whole cluster, use the application connection string to override the value of the “host” field in the Telegraf configuration: host=elasticsearch-prod.sumologic.com

      Pivoting to Tracing data from Entity Inspector is possible only for “Elasticsearch address” Entities.

    • Do not modify the following values, as doing so will prevent the Sumo Logic app from functioning correctly.

      • telegraf.influxdata.com/class: sumologic-prometheus. This instructs the Telegraf Operator what output to use. This should not be changed.
      • prometheus.io/scrape: "true". This ensures our Prometheus will scrape the metrics.
      • prometheus.io/port: "9273". This tells Prometheus which port to scrape. This should not be changed.
      • telegraf.influxdata.com/inputs
      • In the tags section [inputs.elasticsearch.tags]
        • component: “database” - This value is used by Sumo Logic apps to identify application components.
        • db_system: “elasticsearch” - This value identifies the database system.
      • See this doc for more parameters that can be configured in the Telegraf agent globally.
  4. Sumo Logic Kubernetes collection will automatically start collecting metrics from the pods having the labels and annotations defined in the previous step.
  5. Verify metrics in Sumo Logic.
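The note about db_cluster_address and db_cluster_port can be sketched in Python as follows: split the application's connection string into host and port, then render the tags section with those values. This is only an illustration under stated assumptions. The helpers `cluster_tags` and `render_tags` are hypothetical, the cluster name `es_prod_example` is made up, and the split handles plain `host:port` strings only (no URL schemes or IPv6 literals).

```python
def cluster_tags(connection_string: str, environment: str, db_cluster: str) -> dict:
    """Build the tag values from a 'host:port' connection string (hypothetical helper)."""
    host, _, port = connection_string.partition(":")
    tags = {
        "environment": environment,
        "component": "database",       # must not be changed
        "db_system": "elasticsearch",  # must not be changed
        "db_cluster": db_cluster,
        "db_cluster_address": host,
    }
    # The port is optional; when it is omitted, a default port is used.
    if port:
        tags["db_cluster_port"] = port
    return tags


def render_tags(tags: dict) -> str:
    """Render the tags as the [inputs.elasticsearch.tags] section of the Telegraf input."""
    lines = ["[inputs.elasticsearch.tags]"]
    lines += [f'  {key} = "{value}"' for key, value in tags.items()]
    return "\n".join(lines)


if __name__ == "__main__":
    tags = cluster_tags("elasticsearch-prod.sumologic.com:3306", "prod", "es_prod_example")
    print(render_tags(tags))
```

Deriving both values from the same connection string the application uses helps keep them consistent with the net.peer.name and net.peer.port values mentioned in the note.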

Installing the Elasticsearch app

To install the app, do the following:

note

Next-Gen App: To install or update the app, you must be an account administrator or a user with Manage Apps, Manage Monitors, Manage Fields, Manage Metric Rules, and Manage Collectors capabilities, depending on the content types included in the app.

  1. Select App Catalog.
  2. In the 🔎 Search Apps field, run a search for your desired app, then select it.
  3. Click Install App.
    note

    Sometimes this button says Add Integration.

  4. Click Next in the Setup Data section.
  5. In the Configure section of your respective app, complete the following fields.
    1. Is K8S deployment involved. Specify whether the resources being monitored are partially or fully deployed on Kubernetes (K8s).
  6. Click Next. You will be redirected to the Preview & Done section.

Post-installation

Once your app is installed, it will appear in your Installed Apps folder, and dashboard panels will start to fill automatically.

Each panel slowly fills with data matching the time range query received since the panel was created. Results will not immediately be available but will be updated with full graphs and charts over time.

As part of the app installation process, the following fields will be created by default:

  • component
  • environment
  • db_system
  • db_cluster
  • pod
  • db_cluster_address
  • db_cluster_port

Additionally, if you're using Elasticsearch in the Kubernetes environment, the following additional fields will be created by default during the app installation process:

  • pod_labels_component
  • pod_labels_environment
  • pod_labels_db_system
  • pod_labels_db_cluster
  • pod_labels_db_cluster_address
  • pod_labels_db_cluster_port

For information on setting up fields, see Fields.

Viewing Elasticsearch dashboards

All dashboards have a set of filters that you can apply to the entire dashboard. Use these filters to drill down and examine the data to a granular level.

  • You can change the time range for a dashboard or panel by selecting a predefined interval from a drop-down list, choosing a recently used time range, or specifying custom dates and times. Learn more.
  • You can use template variables to drill down and examine the data on a granular level. For more information, see Filtering Dashboards with Template Variables.
  • Most Next-Gen apps allow you to provide the scope at installation time. The scope consists of a key (_sourceCategory by default) and a default value for this key. Based on your input, the app dashboards will be parameterized with a dashboard variable, allowing you to change the dataset queried by all panels. This eliminates the need to create multiple copies of the same dashboard with different queries.

Overview

The Elasticsearch - Overview dashboard shows the health of Elasticsearch clusters, shard analysis, resource utilization of Elasticsearch hosts and clusters, and search and indexing performance.


Total Operations Stats

The Elasticsearch - Total Operations Stats dashboard provides information on the operations of the Elasticsearch system.


Thread Pool

The Elasticsearch - Thread Pool dashboard analyzes thread pool operations to help manage the memory consumption of nodes in the cluster.


Resource

The Elasticsearch - Resource dashboard monitors the JVM memory, network, disk, and CPU of Elasticsearch nodes.


Performance Stats

The Elasticsearch - Performance Stats dashboard provides performance statistics such as latency and translog operations and size.


Indices

The Elasticsearch - Indices dashboard monitors Index operations, size and latency. It also provides analytics on doc values, fields, fixed bitsets, and terms memory.


Documents

The Elasticsearch - Documents dashboard provides analytics and monitoring on Elasticsearch documents.


Caches

The Elasticsearch - Caches dashboard allows you to monitor query cache size, evictions and field data memory size.


Errors And Warnings

The Elasticsearch - Errors And Warnings dashboard shows errors and warnings by Elasticsearch component.


Garbage Collection

The Elasticsearch - Garbage Collection dashboard provides information on the garbage collection of the Java Virtual Machine.


Login And Connections

The Elasticsearch - Login And Connections dashboard shows the geolocation of client connection requests, failed connection logins, and the count of failed login attempts.


Operations

The Elasticsearch - Operations dashboard allows you to monitor server stats and events such as node up/down, index creation/deletion. It also provides disk usage and cluster health status.


Queries

The Elasticsearch - Queries dashboard provides analytics on slow queries and query shards.


Create monitors for the Elasticsearch app

From your App Catalog:

  1. From the Sumo Logic navigation, select App Catalog.
  2. In the Search Apps field, search for and then select your app.
  3. Make sure the app is installed.
  4. Navigate to the What's Included tab and scroll down to the Monitors section.
  5. Click Create next to the pre-configured monitors. In the create monitors window, adjust the trigger conditions and notification settings based on your requirements.
  6. Scroll down to Monitor Details.
  7. Under Location, click New Folder.
    note

    By default, monitors are saved in the root folder. To make maintenance easier, create a new folder in the location of your choice.

  8. Enter Folder Name. Folder Description is optional.
    tip

    Including the app version in the folder name helps you track versions across future updates.

  9. Click Create. Once the folder is created, click on Save.

Elasticsearch Alerts

| Alert Type (Metrics/Logs) | Alert Name | Alert Description | Trigger Type (Critical / Warning) | Alert Condition | Recover Condition |
| --- | --- | --- | --- | --- | --- |
| Metrics | Elasticsearch - Cluster Red | This alert fires when the Elasticsearch cluster status is RED. | Critical | >=3 | <3 |
| Metrics | Elasticsearch - Cluster Yellow | This alert fires when the Elasticsearch cluster status is YELLOW. | Warning | >=2 | <2 |
| Metrics | Elasticsearch - Disk Out of Space | This alert fires when the disk usage is over 90%. | Critical | >90 | <=90 |
| Metrics | Elasticsearch - Disk Space Low | This alert fires when the disk usage is over 80%. | Warning | >80 | <=80 |
| Metrics | Elasticsearch - Healthy Data Nodes | This alert fires when a data node is missing from the Elasticsearch cluster. | Critical | <3 | >=3 |
| Metrics | Elasticsearch - Healthy Nodes | This alert fires when a node is missing from the Elasticsearch cluster. | Critical | <3 | >=3 |
| Metrics | Elasticsearch - Heap Usage Too High | This alert fires when the heap usage is over 90%. | Critical | >90 | <=90 |
| Metrics | Elasticsearch - Heap Usage Warning | This alert fires when the heap usage is over 80%. | Warning | >80 | <=80 |
| Metrics | Elasticsearch - Initializing Shards Too Long | This alert fires when Elasticsearch has been initializing shards for 5 minutes. | Warning | >0 | <=0 |
| Metrics | Elasticsearch - Pending Tasks | This alert fires when Elasticsearch has pending tasks. | Warning | >0 | <=0 |
| Metrics | Elasticsearch - Relocating Shards Too Long | This alert fires when Elasticsearch has been relocating shards for 5 minutes. | Warning | >0 | <=0 |
| Metrics | Elasticsearch - Unassigned Shards | This alert fires when Elasticsearch has unassigned shards. | Critical | >0 | <=0 |
| Logs | Elasticsearch - Query Time Too Slow | This alert fires when queries are slow to execute. | Critical | >0 | <=0 |
| Logs | Elasticsearch - Query Time Slow | This alert fires when query time is greater than 5 ms. | Warning | >0 | <=0 |
| Logs | Elasticsearch - Too Many Slow Query | This alert fires when there are too many slow queries in 5 minutes. | Warning | >100 | <=100 |
| Logs | Elasticsearch - Error Log Too Many | This alert fires when there are too many error logs. | Critical | >1000 | <=1000 |
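
To show how the alert and recover conditions in the table pair up, here is a small Python sketch of the disk and heap usage thresholds and the cluster status thresholds. It only mirrors the numeric conditions listed above; the function names are assumptions for this example, and the actual monitors are evaluated by Sumo Logic, not by code like this.

```python
def usage_severity(percent: float) -> str:
    """Classify disk or heap usage using the table's thresholds (>90 and >80)."""
    if percent > 90:
        return "Critical"
    if percent > 80:
        return "Warning"
    # At or below 80, both alerts are in their recover range.
    return "OK"


def status_severity(status_value: int) -> str:
    """Classify the cluster status value using the table's thresholds (>=3 and >=2)."""
    if status_value >= 3:
        return "Critical"  # Cluster Red
    if status_value >= 2:
        return "Warning"   # Cluster Yellow
    return "OK"


if __name__ == "__main__":
    for value in (75, 85, 95):
        print(value, usage_severity(value))
    for value in (1, 2, 3):
        print(value, status_severity(value))
```

Note that the boundaries are exclusive for usage (exactly 90% is still a Warning) and inclusive for cluster status, matching the conditions in the table.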

Copyright © 2025 by Sumo Logic, Inc.