- Scaling the MySQL Enterprise Monitor with Many Monitored Agents
When monitoring many servers or components using the MySQL Enterprise Monitor (formerly known as the MySQL Network Monitoring and Advisory Service), the default settings for the Service Manager components often need some adjusting to allow throughput to scale. This article discusses the various steps to take to properly tune the Service Manager.
Overview of the Tuning Aspects
There are a number of working parts within the Service Manager component of the MySQL Enterprise Monitor, such as a MySQL Server, Apache Tomcat application server, and in versions before 1.2.0, there is also an Apache/PHP installation—this was removed in 1.2.0 when the Dashboard was ported to jsp pages served by Tomcat.
MySQL Service Agents ideally are installed onto the server for each instance that you monitor. MySQL Service Agents connect to the Service Manager in parallel to pull their scheduled jobs, and to push any data they have collected for the instance which they monitor back to the Service Manager. This means that the higher the number of agents that are connected, the higher number of threads that will be hitting the Apache Tomcat server. This in turns puts higher load on the MySQL Server as data is stored and read (via the Enterprise Dashboard UI). Each agent collects a significant amount of data for each instance that is monitored. Therefore, when monitoring many agents, the overall database size can increase dramatically, depending on how many rules are enabled for each of them. This increases access time for certain features when monitoring over long periods of time. When trying to scale to hundreds of monitored instances, it is recommended that you have a dedicated system on which to install the Service Manager component. Such a dedicated system should have at least 4GB of memory, good CPU power (i.e., 64bit, multi-core), and plenty of disk space available.
When tuning Tomcat, make sure there are enough threads available within the application server to service all of the agent connections. For version 1.1.1 and before, this is set within the /opt/mysql/network/monitoring/apache-tomcat/conf/server.xml file on Unix like systems. On Windows systems, set the C:\Program Files\MySQL\Network\Monitoring\apache-tomcat\conf\server.xml file. For version 1.2.0 and above, set the /opt/mysql/enterprise/monitor/apache-tomcat/conf/server.xml file on Unix like systems. On Windows systems, set the C:\Program Files\MySQL\Enterprise\monitor\apache-tomcat\conf\server.xml file. The related section for which to look in the appropriate XML configuration is as follows:
Connector port="18080" maxHttpHeaderSize="8192" maxThreads="150" minSpareThreads="25" maxSpareThreads="75" enableLookups="false" redirectPort="18443" acceptCount="100" ... Connector port="18443" maxThreads="150" minSpareThreads="25" maxSpareThreads="75" enableLookups="false" disableUploadTimeout="true" acceptCount="100" scheme="https" secure="true"For each of the maxThreads variables in the above sections, double the value up to 300.
Tomcat is also a Java based application server, so we have a setting within the application that sets the max Java heap space it can take. For version 1.1.1 and before, this is set on Unix like systems in the /opt/mysql/network/monitoring/apache-tomcat/bin/catalina.sh file. On Windows systems, set the C:\Program Files\MySQL\Network\Monitoring\apache-tomcat\bin\catalina.bat file. For version 1.2.0 or higher, set /opt/mysql/enterprise/monitor/apache-tomcat/bin/catalina.sh file on Unix like systems. On Windows systems, set the C:\Program Files\MySQL\Enterprise\monitor\apache-tomcat\bin\catalina.bat file. The related section for which to look in the appropriate XML configuration is as follows:
# Set custom JAVA_OPTS JAVA_OPTS='-Xmx512m'
This should likely be doubled so the application server can use 1Gb of RAM, like this:
# Set custom JAVA_OPTS JAVA_OPTS='-Xmx1024m'
The default settings for the MySQL server supplied with the Service Manager are quite conservative with regards to memory settings and such. For version 1.1.1 and before, the my.cnf options file for the MySQL server is within the /opt/mysql/network/monitoring/mysql/ directory on Unix like systems. On Windows systems, it's located in the C:\Program Files\MySQL\Network\Monitoring\mysql directory. For version 1.2.0 or higher, you'll find the configuration file in the /opt/mysql/enterprise/monitor/mysql directory on Unix like systems. On Windows systems, look in the C:\Program Files\MySQL\Enterprise\monitor\mysql directory.
We recommend setting the following options to at least these settings:
innodb_buffer_pool_size = 512M innodb_flush_log_at_trx_commit = 2 query_cache_size = 32M thread_cache_size = 32 table_cache = 256 tmp_table_size = 32M max_heap_table_size = 32M
For Linux systems, also set this option in the options file:
innodb_flush_method = O_DIRECT
Depending on the amount of memory available to the Service Manager machine, and how well the InnoDB buffer pool memory is used (which can be monitored with an agent connecting to the Service Manager's MySQL instance as well), the innodb_buffer_pool_size variable may need to be increased even more. When increasing this variable, the innodb_log_file_size variable should be increased as well. For instance, with a 512M innodb_buffer_pool_size, set innodb_log_file_size to 128M so that the total across both log files created is half of the size of the buffer pool. To change the innodb_log_file_size you need to totally shut down the Service Manager, remove the current ib_logfile* files within the data directory (from the above paths), change the value within the my.cnf file, and restart the Service Manager.
As mentioned earlier, when monitoring a high number of instances, enabling all rule for each instance can trigger a great deal of data to be collected and sent back to the Service Manager. So be prudent when enabling rules for servers. Pick those which are of most interest to the instances in question; do not just enable all rules for All Servers. You should also set up and use the data purging facility within the Service Manager, available within the Global Settings screen, under the Data Purge Behavior heading. When storing information on a large number of agents, it's recommended to keep only about 2-3 weeks of information on the agents before purging it.