博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
rbr mysql_Lossless RBR for MySQL 8.0?
阅读量:7000 次
发布时间:2019-06-27

本文共 7054 字,大约阅读时间需要 23 分钟。

Lossless RBR

TL/DR: There’s been talk of moving the next release of MySQL to minimal RBR: I’d like to suggest an alternative: lossless RBR

For MySQL 5.8 there was talk / suggestions about moving to minimal

RBR as the default configuration ( http://mysqlserverteam.com/planning-the-defaults-for-mysql-5-8/

).

I’m not comfortable with this

because it means that by default you do not have the full data in

the binlog stream of the changes to a table.

The use of minimal RBR is an optimisation, it is done deliberately

in busy environments where the

size of written data is large and it is not convenient/possible to

keep all data.

Additionally the performance of full RBR can in

some cases be an issue especially for “fat/wide” tables etc. It is true that minimal

RBR helps considerably here. There are several issues it resolves:

reduces network bandwidth between master and slaves

reduces disk i/o reading / writing the binlog files

reduecs disk storage occupied by said binlogs

There was also a comment about enabling IDEMPOTENT

mode by default on a slave.

This is a mode which basically ignores most errors. That does not

seem wise

. As a DBA by default you want the server to not lose or munge data. There are times when you may decide to forgo that requirement, but the DBA should decide and the default behaviour should be safe.

Thus the idea of lossless RBR

came to mind. What would this involve compared to the current modes of FULL or MINIMAL RBR?

INSERT

s are unchanged (as now): you get the full row

DELETE

s are as per minimal RBR: The primary key is sent and the matching row is removed. IFF on a slave the pks differed and more than one row would be deleted this should be treated as an error.

UPDATE

s: Send the pk + full new image, thus ensuring that all new data is sent. This reduces the event size by ~ 1/2 so would be especially good for fat tables and tables where large updates go through. If the PK columns do not change then it should be sufficient to send the new row image and pk column names etc

Related to this behaviour it would be most convenient to implement an existing FR ( bug#69223

) to require

that table definitions via CREATE/ALTER TABLE MUST HAVE

a PK.

I’ve seen

several issues where a developer has not thought a primary key was important

(they often forget replication) and this would trigger problems. Inserts would work fine but any updates that happened afterwards would trigger a problem, not on the master but on all slaves.

I think that by default this

behaviour should be enabled. There may be situations where it needs to be disabled but they are likely to be rather limited.

This new mode LOSSLESS RBR is clearly a mix between full and minimal and it

ensures that data pushed into a slave will always be complete.

I

think that is a better target to aim for with MySQL 8.0 than the suggested MINIMAL RBR.

You may know that I do not like IDEMPOTENT mode much. I have created several

FRs to add counters to “lost/ignored events” so we can see the

impact of using this mode (usually it is used after an outage to

keep replication going even if this may mean some data is not being

updated correctly. Usually this is better than having a slave with

100% stale data.)

I would really also like to see you adding a “safe recovery mode”

where statements which won’t damage the slave more are accepted.

The examples are in bug#54250 but basically include:

INSERT

s with duplicate key: convert to UPDATE

DELETE

s with row not found: ignore as the data has gone anyway

UPDATE

s with non-matching PK: convert to INSERT

UPDATE

s with non-matching columns: update what you can. (This is likely to happen with full RBR as minimal RBR should never generate this type of error.)

[ For each of these 4 states: add counters to indicate how many

times this has happened, so we can see if we’re “correcting” or “fixing” errors or not. ]

You’ll notice that lossless RBR would work perfectly with this even

after a crash as you’ll have all the data you need, so you’ll never

make the state of the database any worse than it was before.

I would like to see the FRs I’ve made regarding improving RBR

being implemented as whether lossless RBR becomes a new replication

mode or not they would help DBAs both diagnose and fix problems

more easily than now.

It is probably also worth noting that FULL RBR is actually useful for a variety of scenarios, for example for exporting changes to other non-MySQL systems. We miss for this the definition of tables, and current systems need to extract that out of band which is a major nuisance. Exporting to external systems may not have happened that frequently in the past, but as larger companies use MySQL this becomes more and more important. For this type of system FULL RBR is probably needed even though it may not be used on the upstream master. I would expect that in most cases LOSSLESS RBR would also serve this purpose pretty well and reduce the replication footprint

. The only environment that may need traditional FULL RBR is where auditing

of ALL changes in a table is needed and thus both the before and after images are required.

Is it worth adding yet another replication mode to MySQL? That is a good question and it may not be worth the effort. However the differences between FULL and LOSSLESS RBR should be minimal: the only difference is the amount of data that’s pushed into the binlog so the scope of changes etc should be more limited. Improving replication performance seems to be a good goal: we all need that, but over-optimising should be considered more carefully. I think we are still missing the monitoring metrics which help us diagnose and be better aware of issues in RBR and the “tools” or improvements which would make recovery easier. Unless you live in the real world of systems which break it is hard to understand why these “obscure” edge cases matter that much. The responses like: “just restart mysqld” may make sense in some environments, but really are not realistic in systems that run 24x7x365. With replication it is similar: stopped replication is worse than replication that is working, but where data may not be complete. Depending on the situation you may tolerate that “incomplete data” (temporarily) while gaining the changes which your apps need to see. However, it is vitally important to be able to measure

the “damage” and that is why counters like the ones indicated above are so vital. It allows you to distinguish 1 broken row, or 1,000,000 and decide on how to prioritise and deal with that as appropriate.

While I guess the MySQL replication developers are busy I would certainly be interested in hearing their thoughts on this possible new replication mode and would definitely prefer it over the suggested minimal RBR as a default for 8.0. Both FULL and MINIMAL RBR have their place, but perhaps LOSSLESS would be a better default? What do you think?

注意:本文来自网友投稿。本站无法对本文内容的真实性、完整性、及时性、原创性提供任何保证,请您自行验证核实并承担相关的风险与后果!

CoLaBug.com遵循[CC BY-SA 4.0]分享并保持客观立场,本站不承担此类作品侵权行为的直接责任及连带责任。您有版权、意见、投诉等问题,请通过[eMail]联系我们处理,如需商业授权请联系原作者/原网站。

转载地址:http://ubevl.baihongyu.com/

你可能感兴趣的文章
C#设计模式之三抽象工厂模式(AbstractFactory)【创建型】
查看>>
Linux I2C驱动分析(三)----i2c_dev驱动和应用层分析 【转】
查看>>
三分钟彻底禁用、隐藏Android设备底部虚拟按钮(亲测有效)
查看>>
零复制(zero copy)技术
查看>>
java语言编程使用正则表达式来实现提取(美团 3-5年经验 15-30k 北京 hadoop高级工程)中的3-5和15-30...
查看>>
[js高手之路]匀速运动与实例实战(侧边栏,淡入淡出)
查看>>
adb protocol failure【转】
查看>>
25.Linux-Nor Flash驱动(详解)
查看>>
Appium 解决中文输入问题
查看>>
ping正常但是ssh到linux服务器很卡的解决方法
查看>>
php sqlserver及xdebug扩展配置
查看>>
【转】curl命令总结,Http Post_Get 常用
查看>>
jmeter压测前清理内存
查看>>
Windows2012R2版本区别
查看>>
PAT Basic 1073. 多选题常见计分法
查看>>
在 ReactNative 的 App 中,集成 Bugly 你会遇到的一些坑
查看>>
MII_GMII_RGMII_RMII_SMII_SSMII_TBI_RTBI比较
查看>>
inotify+rsync实现实时同步
查看>>
解决微信OAuth2.0网页授权回调域名只能设置一个的问题
查看>>
设计模式:原型模式
查看>>