Sharding

The basic idea is to cut a database into multiple parts and put them on different database servers, to relieve the performance pressure on a single database. Roughly speaking, for a database with massive data: if there are many tables and a lot of data, vertical partitioning is suitable, i.e. closely related tables (for example, those belonging to the same module) are split out and placed on one server. If there are not many tables but each table holds a great deal of data, horizontal partitioning is suitable, i.e. a table's rows are split across multiple database servers according to some rule (for example, a hash of the ID). In practice the two situations are usually mixed, and the choice must be made according to the actual situation; vertical and horizontal partitioning may also be combined, so that the original database is split into a matrix-like array of database servers that can be expanded almost without limit.


In particular, when vertical and horizontal partitioning are applied at the same time, the partitioning strategy changes subtly. When only vertical partitioning is considered, the tables grouped together may keep arbitrary relationships with one another, so tables can be grouped by "functional module". Once horizontal partitioning is introduced, however, the relationships between tables become heavily restricted: usually only one master table (the table whose ID is hashed) is allowed to keep relationships with its several child tables. In other words, when vertical and horizontal partitioning coexist, the vertical cut can no longer follow "functional modules"; it requires a finer granularity, and this granularity coincides with, is in fact identical to, the concept of an "aggregate" in Domain-Driven Design: the master table of each shard is exactly the aggregate root of an aggregate! Cutting this way, you will find the database becomes too fragmented (there are many shards, but few tables in each shard). To avoid managing too many data sources and to make full use of each database server's resources, consider placing two or more shards whose business is similar and whose data grows at a similar rate (the master tables' row counts are of the same order of magnitude) into the same data source. Each shard remains independent, with its own master table, hashed by its own master table's ID; the only requirement is that their hash modulus (i.e. the number of nodes) must be the same.

Common sharding middleware

Easy-to-use components:

* Dangdang's sharding-jdbc <https://github.com/dangdangdotcom/sharding-jdbc>
* Mogujie's TSharding <https://github.com/baihui212/tsharding>

Powerful, heavyweight middleware:

* sharding <https://github.com/go-pg/sharding>
* TDDL (smart-client style, Taobao) <https://github.com/alibaba/tb_tddl>
* Atlas (Qihoo 360) <https://github.com/Qihoo360/Atlas>
* Cobar (developed by Alibaba's B2B department) <https://github.com/alibaba/cobar>
* MyCAT (open source, built on Alibaba's Cobar) <http://www.mycat.org.cn/>
* Oceanus (58.com's database middleware) <https://github.com/58code/Oceanus>
* OneProxy (developed by Fang Xin, chief architect of Alipay) <http://www.cnblogs.com/youge-OneSQL/articles/4208583.html>
* Vitess (database middleware developed by Google) <https://github.com/youtube/vitess>

Problems that sharding must solve

1. Transactions

There are two feasible approaches to the transaction problem: distributed transactions, or transactions jointly controlled by the application and the database.

* Scheme 1: use distributed transactions
  * Advantage: managed by the database itself; simple and effective
  * Disadvantage: high performance cost, which grows as shards multiply
* Scheme 2: joint control by the application and the database
  * Principle: split one distributed transaction spanning multiple databases into several small transactions, each on a single database, and let the application coordinate those small transactions.
  * Advantage: better performance
  * Disadvantage: the application needs a flexible transaction-control design; if Spring's transaction management is already in use, changing it will be difficult.
2. Cross-node joins

As long as data is partitioned, cross-node joins are unavoidable, although good design and partitioning can reduce them. The common solution is to query in two steps: the first query's result set yields the IDs of the associated records, and a second request is issued with those IDs to fetch the associated data.
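The two-query approach can be sketched in Python, simulating the shards as in-memory lists; all table names, fields and data below are illustrative, not from any particular middleware:

```python
# Simulated data: orders live on one node; users are hashed across two shards.
orders = [
    {"order_id": 1, "user_id": 101},
    {"order_id": 2, "user_id": 202},
]
user_shards = [
    [{"user_id": 202, "name": "bob"}],    # shard 0 (user_id % 2 == 0)
    [{"user_id": 101, "name": "alice"}],  # shard 1 (user_id % 2 == 1)
]

# Query 1: fetch the driving rows and collect the IDs to join on.
user_ids = {o["user_id"] for o in orders}

# Query 2: fetch the associated rows by ID, each routed to its own shard.
users = {}
for uid in user_ids:
    for u in user_shards[uid % 2]:
        if u["user_id"] == uid:
            users[uid] = u

# Join on the application side.
joined = [{**o, "name": users[o["user_id"]]["name"]} for o in orders]
```

In a real system the second step would be one `IN (...)` query per shard rather than per ID, to keep round trips low.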

3. Cross-node count, order by, group by and aggregate functions

These are all variants of the same problem, because they must be computed over the entire data set, and most proxies do not merge the partial results automatically. The solution is similar to the cross-node join: obtain the results on each node, then merge them on the application side. Unlike a join, each node's query can run in parallel, so this is often much faster than one huge table. But if the result set is large, the application's memory consumption becomes a problem.
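A minimal sketch of application-side merging for cross-node aggregates, with the per-shard queries issued in parallel; the shards are simulated as in-memory lists of values, and all names are illustrative:

```python
from concurrent.futures import ThreadPoolExecutor

# Each shard holds part of the rows of one logical table.
shards = [[3, 1, 4], [1, 5], [9, 2, 6]]

def shard_aggregate(rows):
    """What each node computes locally: COUNT and SUM over its own rows."""
    return {"count": len(rows), "sum": sum(rows)}

# Issue the per-shard queries in parallel, then merge on the application side.
with ThreadPoolExecutor() as pool:
    partials = list(pool.map(shard_aggregate, shards))

total_count = sum(p["count"] for p in partials)
total_sum = sum(p["sum"] for p in partials)
# AVG must be rebuilt from SUM/COUNT; averaging the per-shard averages is wrong.
avg = total_sum / total_count
```

Note the AVG comment: it is the classic trap here, since per-shard averages weighted by unequal row counts do not merge by simple averaging.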

4. Data migration, capacity planning and scaling

A scheme from Taobao's integrated business platform team exploits the forward compatibility of moduli that are multiples of 2 (a number whose remainder mod 4 is 1 also has remainder 1 mod 2) to allocate data, avoiding row-level migration; but table-level migration is still required, and both the scale of expansion and the number of partitioned tables are constrained. Generally speaking, none of these schemes is ideal; each has drawbacks of one kind or another, which in itself reflects the difficulty of scaling a sharded system.

5. Transactions revisited

Distributed transactions

Reference: [A study of distributed transactions, two-phase commit, one-phase commit, the Best Efforts 1PC pattern and transaction compensation](http://blog.csdn.net/bluishglc/article/details/7612811)

* Advantages
  * Based on two-phase commit, it guarantees the atomicity of cross-database operations as far as possible and is the strictest transaction implementation in a distributed system.
  * Simple to implement, with little work: most application servers and some standalone distributed transaction coordinators have done a great deal of encapsulation, so the difficulty and effort of introducing distributed transactions into a project are negligible.
* Disadvantages
  * It is the enemy of horizontal system scalability. A distributed transaction based on two-phase commit must coordinate multiple nodes at commit time, which maximizes the commit latency and objectively lengthens the transaction's execution time. This raises the probability of conflicts and deadlocks when transactions access shared resources, and as the number of database nodes grows the trend becomes ever more serious: it becomes the chains that prevent the system from scaling horizontally at the database level. This is the main reason why many sharded systems do not use distributed transactions.

Transactions based on the Best Efforts 1PC pattern

See the implementation in spring-data-neo4j
<https://github.com/SpringSource/spring-data-graph/blob/master/spring-data-neo4j/src/main/java/org/springframework/data/neo4j/transaction/ChainedTransactionManager.java>.
Given the performance advantage of the Best Efforts 1PC pattern and its relatively simple implementation, it has been adopted by sharding frameworks and projects.

Transaction compensation (idempotence)

For systems with high performance requirements and low consistency requirements, real-time consistency is often not critical; as long as eventual consistency is reached within an allowed time window, a transaction compensation mechanism becomes feasible. Transaction compensation was originally proposed for handling "long transactions", but it is also a good reference for guaranteeing consistency in distributed systems. Unlike rolling back a transaction immediately when an error occurs during execution, transaction compensation is an after-the-fact check-and-remedy measure; it only expects an eventually consistent result within an allowed time window. Its implementation is closely tied to the system's business, and there is no standard procedure. Common implementations include reconciliation checks of the data, log-based comparison, and periodic synchronization with an authoritative data source.

6. The ID problem

Once the database is partitioned onto multiple physical nodes, we can no longer rely on the database's own primary-key generation mechanism. On one hand, an ID generated by a single partition is not guaranteed to be globally unique; on the other hand, the application needs to obtain the ID before inserting the data, in order to route the SQL.
Some common primary-key generation strategies

UUID

Using a UUID as the primary key is the simplest solution, but its drawbacks are also very obvious. Because a UUID is very long, besides taking up a lot of storage space, the main problem is the index: both building the index and querying through it suffer performance problems.

Maintaining a Sequence table

The idea of this scheme is also very simple: create a Sequence table whose structure is similar to:

CREATE TABLE `SEQUENCE` (
  `table_name` varchar(18) NOT NULL,
  `nextid` bigint(20) NOT NULL,
  PRIMARY KEY (`table_name`)
) ENGINE=InnoDB

Whenever a new record needs an ID for some table, take nextid from the Sequence table and write nextid + 1 back for the next use. This scheme is simple too, but its drawbacks are obvious: since every insert must access this table, it easily becomes the bottleneck of system performance; it is also a single point of failure: once its database goes down, the whole application stops working. Master-slave synchronization has been proposed, but that only addresses the single point; it does not solve the access pressure of a 1:1 read/write ratio.
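The fetch-and-increment step can be sketched as follows. The in-memory dict and lock below are stand-ins for the SEQUENCE table and the row lock a real MySQL implementation would take with SELECT ... FOR UPDATE inside a transaction; the table name is illustrative:

```python
import threading

# Stand-in for the SEQUENCE table: one nextid counter per logical table.
sequence_table = {"orders": 1}
lock = threading.Lock()

def next_id(table_name: str) -> int:
    # The lock plays the role of the row lock that SELECT ... FOR UPDATE
    # would take in MySQL, serializing concurrent ID requests.
    with lock:
        nid = sequence_table[table_name]       # SELECT nextid WHERE table_name = ...
        sequence_table[table_name] = nid + 1   # UPDATE SEQUENCE SET nextid = nextid + 1
        return nid

ids = [next_id("orders") for _ in range(3)]
```

A common mitigation for the bottleneck described above is to reserve a batch (e.g. increment nextid by 1000 at once) and hand out IDs from the batch in application memory.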

Twitter's distributed auto-increment ID algorithm: Snowflake
<http://blog.sina.com.cn/s/blog_6b7c2e660102vbi2.html>

In a distributed system there are many occasions where a globally unique ID must be generated. Twitter's Snowflake meets this need, and the implementation is very simple. Leaving configuration aside, the core of an ID is 41 bits of millisecond timestamp, 10 bits of machine ID, and a 12-bit sequence number within the millisecond:

* 0 --- 0000000000 0000000000 0000000000 0000000000 0 --- 00000 --- 00000 --- 000000000000

In the bit string above, the first bit is unused (it could also serve as the sign bit of the long), the next 41 bits are the time in milliseconds, then 5 bits of datacenter ID and 5 bits of machine ID (not really a machine identifier, actually a thread identifier), and finally a 12-bit count within the current millisecond, which adds up to exactly 64 bits: one Long.

The advantages are that, on the whole, the IDs are sorted by increasing time, no ID collisions occur across the distributed system (the datacenter and machine IDs distinguish the generators), and it is efficient. Tests show Snowflake can generate about 260,000 IDs per second, which fully meets the need.
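The bit layout above can be sketched in a few lines. This is a simplified illustration of the Snowflake idea, not Twitter's production code; the epoch value and the overflow handling are illustrative assumptions:

```python
import time

class Snowflake:
    """Sketch: 41-bit timestamp | 5-bit datacenter | 5-bit worker | 12-bit sequence."""
    EPOCH = 1288834974657  # custom epoch in ms (illustrative)

    def __init__(self, datacenter_id: int, worker_id: int):
        assert 0 <= datacenter_id < 32 and 0 <= worker_id < 32
        self.datacenter_id = datacenter_id
        self.worker_id = worker_id
        self.sequence = 0
        self.last_ts = -1

    def next_id(self) -> int:
        ts = int(time.time() * 1000)
        if ts == self.last_ts:
            # Same millisecond: bump the 12-bit counter; on overflow, spin
            # until the next millisecond.
            self.sequence = (self.sequence + 1) & 0xFFF
            if self.sequence == 0:
                while ts <= self.last_ts:
                    ts = int(time.time() * 1000)
        else:
            self.sequence = 0
        self.last_ts = ts
        return ((ts - self.EPOCH) << 22) | (self.datacenter_id << 17) \
            | (self.worker_id << 12) | self.sequence

gen = Snowflake(datacenter_id=1, worker_id=3)
a, b = gen.next_id(), gen.next_id()
```

Because the timestamp occupies the high bits, IDs from one generator are time-ordered, and the datacenter/worker bits keep generators from colliding with each other.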

7. Cross-shard sorting and paging

Generally, paging requires sorting by a specified field. When the sort field is the sharding field, the sharding rule easily locates the specified shard; when it is not, things get more complicated. For the final result to be accurate, the data must be sorted and returned on each shard node, then the result sets returned by the different shards must be collected, re-sorted on the application side, and finally returned to the user:

[Figure: each shard returns its sorted first page; the application merges them and returns page 1]

This is the simplest case (taking the data of the first page) and seems to have little impact on performance. However, if you want the data of, say, page 10, it becomes much more complicated:

[Figure: to return page 10, each shard must return its first 10 pages for the application to merge]

Some readers may wonder why this cannot be as simple as taking the first page (sort each shard's first 10 rows and merge). It is not hard to see why: since the data on each shard node may be randomly distributed, for the sort to be accurate, the first N pages of data from all shard nodes must be merged and sorted before the overall ordering can be produced. Obviously such an operation consumes resources, and the further back the user pages, the worse the system performs.
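The merge step can be sketched as follows: to serve page p with page size s, ordered by a non-shard field, each shard must return its first p*s rows, which the application merges before slicing out the requested page. The shards are simulated as pre-sorted in-memory lists:

```python
import heapq

# Each shard returns its rows already sorted by the paging field.
shard_results = [
    [1, 4, 7, 10, 13],
    [2, 5, 8, 11, 14],
    [3, 6, 9, 12, 15],
]

def page_across_shards(shards, page, size):
    """Fetch 1-based page `page`: each shard must supply its first page*size rows."""
    needed = page * size
    partials = [s[:needed] for s in shards]   # what each node sends back
    merged = list(heapq.merge(*partials))     # global re-sort on the app side
    return merged[(page - 1) * size : page * size]

page3 = page_across_shards(shard_results, page=3, size=2)
```

Note how `needed = page * size` grows with the page number: that is exactly the cost described above, since fetching page 10 forces every shard to ship ten pages of rows.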
How can paging be handled for a sharded database? Several approaches:

Provide paging in the front-end application and let the user see only the first n pages. This restriction is usually reasonable in business terms: looking at the later pages rarely makes sense (if the user really must, ask them to narrow the query conditions and search again).

When a background batch task needs to fetch data in batches, the page size can be increased, e.g. fetching 5,000 records at a time, which effectively reduces the number of pages (offline access generally goes to a standby database anyway, to avoid impacting the primary).

When the sharded design is drawn up, there is generally a supporting big-data platform that aggregates the records of all shards, so some paging queries can be delegated to the big-data platform.

8. Sharding strategy

Once the sharding dimension is determined, how are records assigned to the different databases? There are generally two ways:

* By value range: for example, users with Id 1-9999 go to the first database, 10000-20000 to the second, and so on.
* By modulo: for example, user Id mod n; records with remainder 0 go to the first database, remainder 1 to the second, and so on.
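The two assignment rules can be sketched directly; the range boundaries and shard counts below are illustrative:

```python
def shard_by_range(user_id: int, range_size: int = 10000) -> int:
    """Range sharding: Ids 0..9999 -> db 0, 10000..19999 -> db 1, and so on."""
    return user_id // range_size

def shard_by_mod(user_id: int, n: int) -> int:
    """Mod sharding: the remainder of Id mod n picks the database."""
    return user_id % n
```

The range rule keeps a user's neighbors together and lets new databases be appended for new Id ranges; the mod rule spreads all users evenly from day one.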
Comparison:

| Metric | Range sharding | Mod sharding |
| --- | --- | --- |
| Number of databases | Small at first; can grow on demand with users/business | Fixed up front by the mod factor; usually fairly large |
| Access performance | Few databases at first: full-database queries consume few resources, single-database queries slightly slower | Many databases at first: full-database queries consume more resources, single-database queries slightly better |
| Adjusting the number of databases | Fairly easy: usually just add databases for new users; splitting an old database affects only that database | Hard: changing the mod factor migrates data across all databases |
| Data hotspots | Old and new users shop at different frequencies, so hotspots exist | Old and new users spread evenly across the databases; no hotspots |

In practice, mod sharding is chosen more often because it is simple to handle. When re-sharding later, the number of databases is usually multiplied to make data migration easy: for example, 4 databases initially, split into 8, then 16. That way, for any given database, half of its data moves to the new database and the rest stays put; by contrast, adding only one database at a time would force a large-scale reshuffle of all the data.
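The claim that doubling the shard count moves only half of each database's data can be checked directly: going from n to 2n shards, a row's new shard is either its old shard (it stays) or its old shard plus n (it moves). A small self-check, not production migration code:

```python
old_n, new_n = 4, 8  # doubling the shard count

moved = stayed = 0
for i in range(10000):
    old_shard, new_shard = i % old_n, i % new_n
    # The new shard is always old_shard or old_shard + old_n.
    assert new_shard in (old_shard, old_shard + old_n)
    if new_shard == old_shard:
        stayed += 1
    else:
        moved += 1
```

Exactly half the rows relocate, and every moving row goes to a single predictable target shard, which is what makes the migration script simple.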

One more note: with mod sharding the record count of each database is usually fairly even, but some databases contain "super IDs" whose records far outnumber those of other IDs. For example, in an advertising scenario, one big advertiser's ads may account for a very large share of the total. If you shard by advertiser Id mod n, some databases will hold far more records than others; such super IDs need a dedicated database of their own.

9. Number of shards

The number of shards depends first on how many records a single database can handle. Generally speaking, once a single MySQL database exceeds 50 million records, or a single Oracle database exceeds 100 million, the DB comes under heavy pressure (processing capacity further depends on the number of columns, the access patterns and the record length).

Given that premise, if the number of shards is too small, the goals of spreading storage and relieving DB pressure are not met. If the number is too large, each database holds fewer records and single-database access performs well, but a query spanning many databases must touch each of them: in concurrent mode this consumes precious thread resources, and in serial mode the execution time rises sharply.

Finally, the number of shards directly affects the hardware investment: each shard typically runs on a separate physical machine, so one more database means one more machine. The concrete number must be weighed comprehensively; 4-8 shards are generally recommended for an initial split.

10. Routing transparency

Sharding in a sense means the DB schema has changed, which inevitably affects the application; but the change has nothing to do with the business, so sharding should be kept as transparent to application code as possible, with the sharding logic handled in the data access layer (DAL). Achieving this completely is difficult; as for what the DAL should be responsible for and what the application should, here are some suggestions:

For single-database access, e.g. a query whose condition specifies the user Id, the SQL only needs to reach one particular database. The DAL should route to that database automatically; when the databases are split again, only the mod factor changes and application code is unaffected.

For simple multi-database queries, the DAL is responsible for aggregating the records returned by each database, still transparently to the upper layers.

11. Use a framework, or build your own?

There are quite a few sharding middlewares on the market today. Proxy-based ones include MySQL Proxy and Amoeba; based on the Hibernate framework there is Hibernate Shards; JDBC-based there is Dangdang's sharding-jdbc <https://github.com/dangdangdotcom/sharding-jdbc>; MyBatis-based, in a Maven-plugin style, there is Mogujie's TSharding <https://github.com/baihui212/tsharding>; and Cobar Client works by rewriting Spring's iBatis template class. Each of these frameworks has its own strengths and weaknesses, and an architect can choose after thorough research combined with the project's actual situation. On the whole, though, I am personally cautious about choosing a framework. On one hand, most frameworks lack validation by success stories, so their maturity and stability are questionable. On the other hand, whether a framework open-sourced from a successful commercial product (such as some of Alibaba's and Taobao's open-source projects) suits your project requires the architect's in-depth research and analysis. Of course, the final choice must be based on a comprehensive weighing of the project's characteristics, the team's situation, the technical threshold, the learning cost and so on.


Author: jackcooper
Link: http://www.jianshu.com/p/32b3e91aa22c
Source: Jianshu (简书)
Copyright belongs to the author. For commercial reprints, please contact the author for authorization; for non-commercial reprints, please credit the source.