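We are trying to debug a problem with a MariaDB cluster.
We are running MariaDB 10.0.19 on c4.large instances in Amazon EC2; the operating system is Ubuntu 14.04 (Trusty).
Three machines are clustered together and replicate fine (we can run create database foo; on one machine and see it appear on the others, and so on). However, when we try to restore a database from a dump with all three machines running and in sync, we get an error: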
    $ du -sh *.sql
    2.7G    sqldump.sql
    $ cat sqldump.sql | sudo mysql
    ERROR 1047 (08S01) at line 4361: WSREP has not yet prepared node for application use
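The error appears to be related to how long the import takes. If we run service mysql stop on two of the three nodes in the cluster and run the import against the remaining node, it works fine. We can then start the other machines one at a time and the data is copied over via SST, so this looks like a problem with the Galera configuration.
This does not only happen during large MySQL imports: it also happens during routine use with small transactions. However, a large import is our most reliable way of reproducing the problem.
System memory usage during the import is not particularly high, and neither is CPU usage. Network traffic is well below the capacity of the machines' links, and in our tests there was no traffic other than SSH connections.
Can anyone help us understand what might be causing this?
Here are more details about the machines in the cluster and the MariaDB configuration: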
Ubuntu:
    $ lsb_release -a
    No LSB modules are available.
    Distributor ID: Ubuntu
    Description:    Ubuntu 14.04.2 LTS
    Release:        14.04
    Codename:       trusty
Kernel:
    $ uname -a
    Linux servername.domain 3.13.0-53-generic #89-Ubuntu SMP Wed May 20 10:34:39 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
MySQL configuration (IP addresses, domains, etc. in wsrep_cluster_address deliberately obfuscated):
    $ find /etc/mysql/ -name "*.cnf" -exec cat {} \; | egrep -v "^#" | grep -v "^$"
    [mysqld]
    server-id = 965424531
    bind-address = *
    max_connections = 500
    max_connect_errors = 1000000
    innodb_buffer_pool_size = 2635M
    log_bin = /var/lib/mysql/mysql/mysql-bin
    expire_logs_days = 7
    sync_binlog = 1
    binlog_format = MIXED
    log-slave-updates = 1
    slow_query_log = 1
    slow_query_log_file = /var/log/mysql/mysql-slow.log
    [mysqld]
    innodb_use_native_aio = 0
    innodb_flush_method = O_DSYNC
    [client]
    [mysqld]
    [mysqld_safe]
    syslog
    [mariadb]
    [client]
    port = 3306
    socket = /var/run/mysqld/mysqld.sock
    [mysqld_safe]
    socket = /var/run/mysqld/mysqld.sock
    nice = 0
    [mysqld]
    user = mysql
    pid-file = /var/run/mysqld/mysqld.pid
    socket = /var/run/mysqld/mysqld.sock
    port = 3306
    basedir = /usr
    datadir = /var/lib/mysql
    tmpdir = /tmp
    lc-messages-dir = /usr/share/mysql
    skip-external-locking
    bind-address = 127.0.0.1
    key_buffer = 16M
    max_allowed_packet = 16M
    thread_stack = 192K
    thread_cache_size = 8
    myisam-recover = BACKUP
    query_cache_limit = 1M
    query_cache_size = 16M
    log_error = /var/log/mysql/error.log
    expire_logs_days = 10
    max_binlog_size = 100M
    [mysqldump]
    quick
    quote-names
    max_allowed_packet = 16M
    [isamchk]
    key_buffer = 16M
    !includedir /etc/mysql/conf.d/
    [mysqld]
    wsrep_provider=/usr/lib/galera/libgalera_smm.so
    wsrep_debug=ON
    wsrep_cluster_name="clustername"
    wsrep_cluster_address="gcomm://10.0.X.X,10.0.X.X"
    wsrep_sst_method=xtrabackup-v2
    wsrep_sst_auth=sstuser:sstpassword
    wsrep_node_address="10.0.1.10"
    wsrep_node_name="servername.domain"
    binlog_format=ROW
    wsrep_on=ON
    default_storage_engine=InnoDB
    innodb_autoinc_lock_mode=2
    innodb_doublewrite=1
    query_cache_size=0
    innodb_log_file_size = 256M
Cluster status:
    $ sudo mysql
    Welcome to the MariaDB monitor.  Commands end with ; or \g.
    Your MariaDB connection id is 288
    Server version: 10.0.19-MariaDB-1~trusty-wsrep-log mariadb.org binary distribution, wsrep_25.10.r4144

    Copyright (c) 2000, 2015, Oracle, MariaDB Corporation Ab and others.

    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

    MariaDB [(none)]> show status like '%wsrep%';
    +------------------------------+---------------------------------------------------+
    | Variable_name                | Value                                             |
    +------------------------------+---------------------------------------------------+
    | wsrep_local_state_uuid       | e856afdc-18af-11e5-a3a6-efccde439ba4              |
    | wsrep_protocol_version       | 7                                                 |
    | wsrep_last_committed         | 45764                                             |
    | wsrep_replicated             | 2031                                              |
    | wsrep_replicated_bytes       | 1527494811                                        |
    | wsrep_repl_keys              | 9973524                                           |
    | wsrep_repl_keys_bytes        | 79839767                                          |
    | wsrep_repl_data_bytes        | 1447525060                                        |
    | wsrep_repl_other_bytes       | 0                                                 |
    | wsrep_received               | 1478                                              |
    | wsrep_received_bytes         | 13040                                             |
    | wsrep_local_commits          | 1750                                              |
    | wsrep_local_cert_failures    | 0                                                 |
    | wsrep_local_replays          | 0                                                 |
    | wsrep_local_send_queue       | 0                                                 |
    | wsrep_local_send_queue_max   | 2                                                 |
    | wsrep_local_send_queue_min   | 0                                                 |
    | wsrep_local_send_queue_avg   | 0.001140                                          |
    | wsrep_local_recv_queue       | 0                                                 |
    | wsrep_local_recv_queue_max   | 7                                                 |
    | wsrep_local_recv_queue_min   | 0                                                 |
    | wsrep_local_recv_queue_avg   | 0.043302                                          |
    | wsrep_local_cached_downto    | 45564                                             |
    | wsrep_flow_control_paused_ns | 3956186469                                        |
    | wsrep_flow_control_paused    | 0.005006                                          |
    | wsrep_flow_control_sent      | 0                                                 |
    | wsrep_flow_control_recv      | 41                                                |
    | wsrep_cert_deps_distance     | 4.487445                                          |
    | wsrep_apply_oooe             | 0.000000                                          |
    | wsrep_apply_oool             | 0.000000                                          |
    | wsrep_apply_window           | 1.000000                                          |
    | wsrep_commit_oooe            | 0.000000                                          |
    | wsrep_commit_oool            | 0.000000                                          |
    | wsrep_commit_window          | 1.000000                                          |
    | wsrep_local_state            | 4                                                 |
    | wsrep_local_state_comment    | Synced                                            |
    | wsrep_cert_index_size        | 11438                                             |
    | wsrep_causal_reads           | 0                                                 |
    | wsrep_cert_interval          | 0.000000                                          |
    | wsrep_incoming_addresses     | ,                                                 |
    | wsrep_evs_delayed            |                                                   |
    | wsrep_evs_evict_list         |                                                   |
    | wsrep_evs_repl_latency       | 0.00059098/0.000958534/0.00469729/0.000375612/732 |
    | wsrep_evs_state              | OPERATIONAL                                       |
    | wsrep_gcomm_uuid             | 8bcfefe4-25f7-11e5-be32-062acc002ed5              |
    | wsrep_cluster_conf_id        | 88                                                |
    | wsrep_cluster_size           | 3                                                 |
    | wsrep_cluster_state_uuid     | e856afdc-18af-11e5-a3a6-efccde439ba4              |
    | wsrep_cluster_status         | Primary                                           |
    | wsrep_connected              | ON                                                |
    | wsrep_local_bf_aborts        | 0                                                 |
    | wsrep_local_index            | 2                                                 |
    | wsrep_provider_name          | Galera                                            |
    | wsrep_provider_vendor        | Codership Oy <info@codership.com>                 |
    | wsrep_provider_version       | 3.9(rXXXX)                                        |
    | wsrep_ready                  | ON                                                |
    | wsrep_thread_count           | 2                                                 |
    +------------------------------+---------------------------------------------------+
    57 rows in set (0.00 sec)
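First of all, huge transactions and LOAD DATA INFILE on a Galera cluster are still a known limitation. If you have to run them, it is advisable to split them into smaller transactions (5k-10k trx or smaller); YMMV.
Try increasing wsrep-max-ws-size.
Set innodb_flush_log_at_trx_commit = 0 on all nodes.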
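
For the first suggestion (splitting a large load into smaller transactions), here is a minimal Python sketch, not a definitive implementation. It assumes the data is loaded through a PEP 249 (DB-API) connection rather than by piping the dump into the mysql client, and it reads the 5k-10k figure as rows per transaction, which is an interpretation on my part. The names chunked and load_in_batches are hypothetical helpers made up for illustration.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(rows: Iterable, size: int) -> Iterator[List]:
    """Yield successive lists of at most `size` items from `rows`."""
    if size <= 0:
        raise ValueError("size must be positive")
    it = iter(rows)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load_in_batches(conn, insert_sql: str, rows: Iterable,
                    batch_size: int = 5000) -> int:
    """Insert `rows` via a DB-API connection, committing once per batch.

    Each commit ends one transaction, so no single transaction carries
    more than `batch_size` rows. Returns the number of commits issued.
    """
    cur = conn.cursor()
    commits = 0
    for batch in chunked(rows, batch_size):
        cur.executemany(insert_sql, batch)  # one batch of rows
        conn.commit()                       # close the transaction here
        commits += 1
    return commits
```

The placeholder style in insert_sql depends on the driver (for example %s for the common MySQL drivers, ? for sqlite3), and the connection must not be in autocommit mode for the per-batch commit to define the transaction boundary.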