MySQL - OpenTelemetry Collector
The Sumo Logic app for MySQL is a unified logs and metrics app that helps you monitor the availability, performance, and resource utilization of MySQL database clusters. Preconfigured dashboards and searches provide insight into the health of your MySQL clusters, replication status, error logs, query performance, slow queries, InnoDB operations, and failed logins.
This app supports MySQL version 8.0.
We use the OpenTelemetry Collector to collect both MySQL metrics and MySQL logs.
The diagram below illustrates the components of MySQL collection for each database server. The OpenTelemetry Collector runs on the same host as MySQL and uses the MySQL receiver to obtain MySQL metrics and the Sumo Logic OpenTelemetry exporter to send the metrics to Sumo Logic. MySQL logs are collected with a filelog receiver and sent to Sumo Logic.
This app includes built-in monitors. For details on creating custom monitors, refer to Create monitors for MySQL app.
Log and metrics types​
The Sumo Logic app for MySQL assumes the default MySQL Error log file format for error logs, and the MySQL Slow Query file format for slow query logs. For a list of metrics that are collected and used by the app, see MySQL Metrics.
- The MySQL - Overview dashboard is based on logs from both the Error and Slow Query log formats, so as to correlate information between the two.
- Dashboards in the Metrics folder are based on MySQL metrics.
- Dashboards in the Logs folder are based on MySQL logs from both the Error and Slow Query log formats.
- Dashboards based on the Error format:
- MySQL - Error Logs
- MySQL - Failed Logins
- MySQL - Replication
- Dashboard based on Slow Query format:
- MySQL - Slow Queries
- MySQL General Health is based on logs from the MySQL Error log format.
- MySQL Replication is based on logs from the MySQL Error log format.
- MySQL Slow Queries is based on logs from the MySQL Slow Query log format.
The MySQL app dashboards dependent on error logs are based on the message types ERROR, NOTE, Warning, and Info. For more details on the MySQL log file format, see here.
Fields creation in Sumo Logic for MySQL​
The following fields are created as part of the MySQL app installation, if they are not already present:
- `db.cluster.name`. User configured. Enter a name to identify this MySQL cluster. This cluster name is shown in the Sumo Logic dashboards.
- `db.system`. Has the fixed value `mysql`.
- `sumo.datasource`. Has the fixed value `mysql`.
- `db.node.name`. Has the value of the hostname of the machine being monitored.
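The app-installation flow sets these fields for you, but for reference, here is a minimal, illustrative sketch of how such fields can be attached as resource attributes in an OpenTelemetry Collector configuration using the resource processor. The cluster name and processor name shown are placeholders, and the configuration generated during installation may differ.

```yaml
# Illustrative sketch only - the configuration generated during app installation sets these fields.
processors:
  resource/mysql_fields:
    attributes:
      - key: db.cluster.name
        value: my-mysql-cluster   # placeholder: replace with your own cluster name
        action: upsert
      - key: db.system
        value: mysql              # fixed value
        action: upsert
      - key: sumo.datasource
        value: mysql              # fixed value
        action: upsert
      # db.node.name is typically populated with the hostname of the monitored machine
```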
Prerequisites​
For metric collection​
For metrics collection, you can find the prerequisites here.
For log collection​
Configure MySQL to log to local files. MySQL logs written to a local file are collected by the filelog receiver of the Sumo Logic OpenTelemetry Collector. To configure the MySQL log files, locate your local my.cnf configuration file in the database directory.
- Open my.cnf in a text editor.
- Set the following parameters in the [mysqld] section:
[mysqld]
log_error = /var/log/mysql/error.log
slow_query_log=1
slow_query_log_file = /var/log/mysql/mysql-slow.log
long_query_time=2
- Error Logs. By default, error logs are enabled and are written to the file specified by the log_error key.
- Slow Query Logs. slow_query_log=1 enables logging of slow queries to the file specified by slow_query_log_file. Setting long_query_time=2 causes queries that take more than two seconds to execute to be logged. The default value of long_query_time is 10 seconds.
- General Query Logs. We do not recommend enabling general_log for performance reasons. These logs are not used by the Sumo Logic MySQL app.
- Save the my.cnf file.
- Restart the MySQL server: sudo mysql.server restart
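The log paths configured above are the same paths you will supply in Step 2 below, where they are read by the collector's filelog receiver. As an illustration only (receiver names and additional options in the generated configuration may differ), the mapping looks roughly like this:

```yaml
# Illustration only - the include paths must match log_error and slow_query_log_file in my.cnf.
receivers:
  filelog/mysql_error:
    include:
      - /var/log/mysql/error.log        # log_error
  filelog/mysql_slow:
    include:
      - /var/log/mysql/mysql-slow.log   # slow_query_log_file
```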
For Linux systems with ACL support, the otelcol install process should have created the ACL grants necessary for the otelcol system user to access the default log locations. You can verify the active ACL grants using the getfacl command. If ACL support is not installed in your Linux environment, install it.
The required ACL may not be supported in some rare cases, for example, on a Linux distribution that is not officially supported by Sumo Logic. In this case, you can run the following command to explicitly grant the permissions:
sudo setfacl -R -m d:u:otelcol-sumo:r-x,d:g:otelcol-sumo:r-x,u:otelcol-sumo:r-x,g:otelcol-sumo:r-x <PATH_TO_LOG_FILE>
Run the above command for all log files in the directory that need to be ingested and are not residing in the default location.
If Linux ACL support is not available, traditional Unix-style user and group permissions must be modified. It should be sufficient to add the otelcol system user to the specific group that has access to the log files.
On Windows systems, the log files to be collected must be accessible by the SYSTEM group. If the SYSTEM group does not have access, run the following PowerShell commands:
$NewAcl = Get-Acl -Path "<PATH_TO_LOG_FILE>"
# Set properties
$identity = "NT AUTHORITY\SYSTEM"
$fileSystemRights = "ReadAndExecute"
$type = "Allow"
# Create new rule
$fileSystemAccessRuleArgumentList = $identity, $fileSystemRights, $type
$fileSystemAccessRule = New-Object -TypeName System.Security.AccessControl.FileSystemAccessRule -ArgumentList $fileSystemAccessRuleArgumentList
# Apply new rule
$NewAcl.SetAccessRule($fileSystemAccessRule)
Set-Acl -Path "<PATH_TO_LOG_FILE>" -AclObject $NewAcl
Collection configuration and app installation​
To set up data collection and install the app, select the app from the App Catalog and click Install App, then follow the steps below.
Step 1: Set up collector​
If you want to use an existing OpenTelemetry Collector, you can skip this step by selecting the Use an existing Collector option.
To create a new Collector:
- Select the Add a new Collector option.
- Select the platform where you want to install the Sumo Logic OpenTelemetry Collector.
This will generate a command that you can execute in the machine environment you need to monitor. Once executed, it will install the Sumo Logic OpenTelemetry Collector.
Step 2: Configure integration​
In this step, you will configure the YAML file required for MySQL collection.
Below are the required inputs:
- Endpoint. Enter the URL of the server that needs to be monitored.
- User Name. Enter the MySQL username.
- Error File Log Path. Enter the path to the error log file for your MySQL instance.
- Slow Transaction File Log Path. Enter the path to the slow query log file for your MySQL instance.
- Tags. db.cluster.name.
You can add any custom fields that you want to tag along with the data ingested in Sumo Logic.
Click on the Download YAML File button to get the YAML file.
For the Linux platform, click Download Environment Variables File to get the file containing the password, which should be set as an environment variable.
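For orientation, the downloaded YAML wires the inputs above into the MySQL and filelog receivers, roughly along the lines of the sketch below. This is an assumption-laden illustration, not the generated file: the receiver names, exporter settings, username placeholder, and the MYSQL_PASSWORD environment variable name are assumptions, and the downloaded file remains the source of truth.

```yaml
# Rough, assumed sketch - use the downloaded YAML file, not this block.
receivers:
  mysql:
    endpoint: localhost:3306             # Endpoint input (placeholder)
    username: sumo_monitor               # User Name input (placeholder)
    password: ${env:MYSQL_PASSWORD}      # assumed variable, set via the environment variables file
    collection_interval: 1m
  filelog/error:
    include:
      - /var/log/mysql/error.log         # Error File Log Path input
  filelog/slow:
    include:
      - /var/log/mysql/mysql-slow.log    # Slow Transaction File Log Path input
exporters:
  sumologic:                             # exporter from the Sumo Logic collector distribution
service:
  pipelines:
    metrics/mysql:
      receivers: [mysql]
      exporters: [sumologic]
    logs/mysql:
      receivers: [filelog/error, filelog/slow]
      exporters: [sumologic]
```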
Step 3: Send logs and metrics to Sumo Logic​
Once you have downloaded the YAML file as described in the previous step, follow the steps below for your platform.
- Linux
- Windows
- macOS
- Chef
- Ansible
- Puppet
- Copy the YAML file to the /etc/otelcol-sumo/conf.d/ folder on the MySQL instance that needs to be monitored.
- Place the environment variables file in the following directory: /etc/otelcol-sumo/env/
- Restart the collector using:
sudo systemctl restart otelcol-sumo
- Copy the YAML file to the C:\ProgramData\Sumo Logic\OpenTelemetry Collector\config\conf.d folder on the machine that needs to be monitored.
- Restart the collector using:
Restart-Service -Name OtelcolSumo
- Copy the YAML file to the /etc/otelcol-sumo/conf.d/ folder on the MySQL instance that needs to be monitored.
- Restart the otelcol-sumo process using:
otelcol-sumo --config /etc/otelcol-sumo/sumologic.yaml --config "glob:/etc/otelcol-sumo/conf.d/*.yaml"
- Copy the YAML file into your Chef cookbook files directory: files/<downloaded_yaml_file>.
- Use a Chef cookbook_file resource in a recipe to manage the YAML file:
cookbook_file '/etc/otelcol-sumo/conf.d/<downloaded_yaml_file>' do
  mode 0644
  notifies :restart, 'service[otelcol-sumo]', :delayed
end
- Use a Chef cookbook_file resource in a recipe to manage the environment variables file:
cookbook_file '/etc/otelcol-sumo/env/<downloaded_env_file>' do
  mode 0600
  notifies :restart, 'service[otelcol-sumo]', :delayed
end
- Add the recipe to your collector setup to start collecting. Every team typically has its own established way of applying the Chef recipe. The resulting Chef recipe should look something like:
cookbook_file '/etc/otelcol-sumo/conf.d/<downloaded_yaml_file>' do
  mode 0644
  notifies :restart, 'service[otelcol-sumo]', :delayed
end

cookbook_file '/etc/otelcol-sumo/env/<downloaded_env_file>' do
  mode 0600
  notifies :restart, 'service[otelcol-sumo]', :delayed
end
- Place the file into your Ansible playbook files directory.
- Run the Ansible playbook using:
ansible-playbook -i inventory install_sumologic_otel_collector.yaml -e '{"installation_token": "<YOUR_TOKEN>", "collector_tags": {<YOUR_TAGS>}, "src_config_path": "files/conf.d", "src_env_path": "files/env"}'
- Place the file into your Puppet module files directory: modules/install_otel_collector/files/<downloaded_yaml>.
- Use a Puppet file resource to manage the YAML file:
file { '/etc/otelcol-sumo/conf.d/<downloaded_yaml_file>':
  ensure => present,
  source => 'puppet:///modules/install_otel_collector/<downloaded_yaml_file>',
  mode   => '0644',
  notify => Service[otelcol-sumo],
}
- Use a Puppet file resource to manage the environment variables file:
file { '/etc/otelcol-sumo/env/<downloaded_env_file>':
  ensure => present,
  source => 'puppet:///modules/install_otel_collector/<downloaded_env_file>',
  mode   => '0600',
  notify => Service[otelcol-sumo],
}
- Apply the Puppet manifest. Every team typically has its own established way of applying the Puppet manifest. The resulting Puppet manifest should look something like:
node 'default' {
  class { 'install_otel_collector':
    installation_token => '<YOUR_TOKEN>',
    collector_tags     => { <YOUR_TAGS> },
  }
  service { 'otelcol-sumo':
    provider => 'systemd',
    ensure   => running,
    enable   => true,
    require  => Class['install_otel_collector'],
  }
  file { '/etc/otelcol-sumo/conf.d/<downloaded_yaml_file>':
    ensure => present,
    source => 'puppet:///modules/install_otel_collector/<downloaded_yaml_file>',
    mode   => '0644',
    notify => Service[otelcol-sumo],
  }
  file { '/etc/otelcol-sumo/env/<downloaded_env_file>':
    ensure => present,
    source => 'puppet:///modules/install_otel_collector/<downloaded_env_file>',
    mode   => '0600',
    notify => Service[otelcol-sumo],
  }
}
After successfully completing the steps above, Sumo Logic will start receiving data from your host machine.
Click Next. This will install the app (dashboards and monitors) to your Sumo Logic Org.
Dashboard panels will start to fill automatically. Note that each panel fills with data matching its time range query, received since the panel was created. Results won't be available immediately, but within about 20 minutes you'll see full graphs and maps.
Sample log messages​
{
log.file.name:"mysqld-error.log",
log:"2022-10-14T09:16:02.430542Z 63707 [Note] [MY-010926] [Server] Access denied for user 'root'@'ip-10-0-16-79' (using password: YES)"
}
Sample metrics​
{"queryId":"A","_source":"sumo_hosted_collector_otel_mysql","operation":"deleted","metric":"mysql.row_operations","db.cluster.name":"sumoCluster_otel","_collectorId":"000000000C59BB2E","deployment.environment":"sumodev_otel","_sourceId":"0000000000000000","unit":"1","db.system":"mysql","_sourceHost":"sumoOtelMysql","_collector":"sumo_hosted_collector_otel_mysql","max":0,"min":0,"avg":0,"sum":0,"latest":0,"count":4}
Sample queries​
This sample query is from the Top 10 Slow Queries by Average Execution Time panel.
db.system=mysql db.cluster.name={{db.cluster.name}} "User@Host" "Query_time"
| parse regex "(?<query_block># User@Host:[\S\s]+?SET timestamp=\d+;[\S\s]+?;)" multi
| parse regex field=query_block "# User@Host: \S+?\[(?<user>\S*?)\] @ (?<host_name>\S+)\s\[(?<ip_addr>\S*?)\]" nodrop
| parse regex field=query_block "# User@Host: \S+?\[(?<user>\S*?)\] @\s+\[(?<ip_addr>\S*?)\]\s+Id:\s+(?<Id>\d{1,10})" nodrop | parse regex field=query_block "# User@Host: \S+?\[(?<user>\S*?)\] @ (?<host_name>\S+)\s\[(?<ip_addr>\S*?)\]\s+Id:\s+(?<Id>\d{1,10})" | parse regex field=query_block "Schema: (?<schema>(?:\S*|\s)?)\s*Last_errno[\s\S]+?Query_time:\s+(?<query_time>[\d.]*)\s+Lock_time:\s+(?<lock_time>[\d.]*)\s+Rows_sent:\s+(?<rows_sent>[\d.]*)\s+Rows_examined:\s+(?<rows_examined>[\d.]*)\s+Rows_affected:\s+(?<rows_affected>[\d.]*)\s+Rows_read:\s+(?<rows_read>[\d.]*)\n" nodrop | parse regex field=query_block "Schema: (?<schema>(?:\S*|\s)?)\s*Last_errno[\s\S]+?\s+Killed:\s+\d+\n" nodrop | parse regex field=query_block "Query_time:\s+(?<query_time>[\d.]*)\s+Lock_time:\s+(?<lock_time>[\d.]*)\s+Rows_sent:\s+(?<rows_sent>[\d]*)\s+Rows_examined:\s+(?<rows_examined>[\d]*)\s+Rows_affected:\s+(?<rows_affected>[\d]*)\s+" nodrop | parse regex field=query_block "Query_time:\s+(?<query_time>[\d.]*)\s+Lock_time:\s+(?<lock_time>[\d.]*)\s+Rows_sent:\s+(?<rows_sent>[\d]*)\s+Rows_examined:\s+(?<rows_examined>[\d]*)"Â
| parse regex field=query_block "# Bytes_sent:\s+(?<bytes_sent>\d*)\s+Tmp_tables:\s+(?<tmp_tables>\d*)\s+Tmp_disk_tables:\s+(?<temp_disk_tables>\d*)\s+Tmp_table_sizes:\s+(?<tmp_table_sizes>\d*)\n" nodrop | parse regex field=query_block "# Bytes_sent:\s+(?<bytes_sent>\d*)\n" nodrop | parse regex field=query_block "SET timestamp=(?<set_timestamp>\d*);(?:\\n|\n)(?<sql_cmd>[\s\S]*);" nodrop
| fields -query_block
| avg(query_time) as avg_time, sum(query_time) as total_time, min(query_time) as min_time, max(query_time) as max_time, avg(rows_examined) as avg_rows_examined, avg(rows_sent) as avg_rows_sent, avg(Lock_Time) as avg_lock_time, count as frequency group by sql_cmd, db.cluster.name
| sort by avg_time | limit 10
This sample metrics query is from the FSync Op Count panel.
sumo.datasource=mysql deployment.environment=* db.cluster.name=* db.node.name=* metric=mysql.operations operation=fsyncs | sum
Viewing MySQL dashboards​
All dashboards have a set of filters that you can apply to the entire dashboard. Use these filters to drill down and examine the data to a granular level.
- You can change the time range for a dashboard or panel by selecting a predefined interval from a drop-down list, choosing a recently used time range, or specifying custom dates and times. Learn more.
- You can use template variables to drill down and examine the data on a granular level. For more information, see Filtering Dashboards with Template Variables.
Overview​
The MySQL - Overview dashboard gives you an at-a-glance view of the state of your database clusters by monitoring key cluster information such as errors, failed logins, queries executed, slow queries, lock waits, and uptime.
Use this dashboard to:
- Quickly identify the state of a given database cluster.
Error Logs​
The MySQL - Error Logs dashboard provides insight into database error logs by specifically monitoring database shutdown/start events, errors over time, errors, warnings, and crash recovery attempts.
Use this dashboard to:
- Quickly identify errors and patterns in logs for troubleshooting.
- Monitor trends in error logs and identify outliers.
- Ensure that server start, server stop and crash recovery events are in line with expectations.
- Dashboard filters allow you to narrow a search to a specific database cluster.
Failed Logins​
The MySQL - Failed Logins dashboard provides insights into all failed login attempts by location, users and hosts.
Use this dashboard to:
- Monitor all failed login attempts and identify any unusual or suspicious activity.
Replication​
The MySQL - Replication dashboard provides insights into the state of database replication.
Use this dashboard to:
- Quickly determine reasons for replication failures.
- Monitor replication status trends.
Slow Queries​
The MySQL - Slow Queries dashboard provides insights into all slow queries executed on the database.
Note: Slow queries are queries that take 10 seconds or more to execute (10 seconds is the default value in the MySQL configuration and can be altered), and excessive slow queries are those that take 15 seconds or more to execute.
Use this dashboard to:
- Identify all slow queries
- Quickly determine which queries have been identified as slow or excessive slow queries.
- Monitor users and hosts running slow queries.
- Determine which SQL commands are slower than others.
- Examine slow query trends to determine if there are periodic performance bottlenecks in your database clusters.
Performance and Resource Metrics​
The MySQL - Performance and Resource Metrics dashboard allows you to monitor the performance and resource usage of your database clusters.
Use this dashboard to:
- Understand the behavior and performance of your database clusters.
- Monitor key operational metrics around threads running, innodb waits, and locks.
- Monitor query execution trends to ensure they match up with expectations.
- Dashboard filters allow you to narrow a search to a specific database cluster.
Create monitors for MySQL app​
From your App Catalog:
- From the Sumo Logic navigation, select App Catalog.
- In the Search Apps field, search for and then select your app.
- Make sure the app is installed.
- Navigate to What's Included tab and scroll down to the Monitors section.
- Click Create next to the pre-configured monitors. In the create monitors window, adjust the trigger conditions and notifications settings based on your requirements.
- Scroll down to Monitor Details.
- Under Location click on New Folder.
note
By default, the monitor will be saved in the root folder. To make maintenance easier, create a new folder in the location of your choice.
- Enter Folder Name. Folder Description is optional.
tip
Including the app version in the folder name helps you track versioning for future updates.
- Click Create. Once the folder is created, click on Save.
MySQL alerts​
| Alert Name | Alert Description and Conditions | Alert Condition | Recover Condition |
|---|---|---|---|
| MySQL - Excessive Slow Query Detected Alert | This alert is triggered when we detect that the average time to execute a query is more than 15 seconds over a 24-hour time period. | Count >= 100 | Count < 100 |
| MySQL - Follower replication lag detected Alert | This alert is triggered when we detect that the average replication lag is greater than or equal to 900 seconds within a 5-minute time interval. | Count >= 900 | Count < 900 |
| MySQL - High average query run time Alert | This alert is triggered when the average run time of SQL queries for a given schema is greater than or equal to one second within a 5-minute time interval. | Count >= 10 | Count < 10 |
| MySQL - High Innodb buffer pool utilization Alert | This alert is triggered when we detect that the InnoDB buffer pool utilization is high (>=90%) within a 5-minute time interval. | Count >= 90 | Count < 90 |
| MySQL - Instance down Alert | This alert is triggered when we detect that a MySQL instance is down within the last 5-minute interval. | Count >= 1 | Count < 1 |
| MySQL - Large number of statement errors Alert | This alert is triggered when we detect that there are 5 or more statement errors within a 5-minute time interval. | Count >= 5 | Count < 5 |
| MySQL - Large number of statement warnings Alert | This alert is triggered when we detect that there are 20 or more statement warnings within a 5-minute time interval. | Count >= 20 | Count < 20 |
| MySQL - No index used in the SQL statements Alert | This alert is triggered when we detect that there are 5 or more statements not using an index in the SQL query within a 5-minute time interval. | Count >= 5 | Count < 5 |