One, Data governance architecture

       
The architecture has three levels: strategy and governance assurance, big data management, and big data applications and services. Strategy and governance assurance covers data strategy planning and evaluation, data governance organization and responsibilities, and data policies and management processes. Big data management covers data standards management, architecture and model management, quality assurance, life cycle management, and security management. Big data applications and services cover data analysis, open sharing, and data services. The architecture is designed according to the national standard Data Management Capability Maturity Model (DCMM).

Two, Discovering and rectifying data quality problems

The process has three stages: analyzing the source business system data, developing quality inspection rules and inspecting the source system data, and analyzing the impact of quality problems to formulate corrective measures.

Source business system analysis stage.

Inputs: the source system operation manual, the requirements analysis document, the database design, and the source system data.

Process: analyze the business processes, logic, and relationships; determine the relationships between database tables and fields; and analyze the business and data associations between the source system and the other systems associated with it.

Outputs: the database structure (primary and foreign keys, constraints, relationships between tables, field lengths and types, and so on) and a business description (the business meaning of tables and fields, and the business rules).
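As an illustration of the database-structure output, here is a minimal sketch that reads column types, primary keys, and foreign keys from a database catalog. It assumes the source system can be inspected through SQLite; the `customer` and `account` tables and the `describe_table` helper are hypothetical, and a real source system would use its own catalog views.

```python
import sqlite3

def describe_table(conn, table):
    """Collect column and foreign-key metadata for one table (SQLite only)."""
    cols = conn.execute(f"PRAGMA table_info({table})").fetchall()
    fks = conn.execute(f"PRAGMA foreign_key_list({table})").fetchall()
    return {
        # table_info rows: (cid, name, type, notnull, dflt_value, pk)
        "columns": [
            {"name": c[1], "type": c[2], "not_null": bool(c[3]), "primary_key": bool(c[5])}
            for c in cols
        ],
        # foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
        "foreign_keys": [
            {"column": fk[3], "ref_table": fk[2], "ref_column": fk[4]} for fk in fks
        ],
    }

# Hypothetical source schema, created in memory for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name VARCHAR(50) NOT NULL);
CREATE TABLE account (
    id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    balance NUMERIC
);
""")

info = describe_table(conn, "account")
for column in info["columns"]:
    print(column)
print(info["foreign_keys"])
```

The resulting dictionary is one possible machine-readable form of the structure description; the business description still has to come from the manuals and from business staff.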

Data quality inspection stage.

Inputs: the outputs of the analysis stage and the business reports.

Process: write the quality inspection rules and, with the rules as the core, design inspection programs or scripts that run automatically and in batches. Pay particular attention to the data tables referenced by the business reports.

Outputs: a list of quality problems and the location of each problem.
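The inspection stage can be sketched as a small rule engine: each rule is a named check on one column, and a batch run yields a problem list with the location of every failure. This is only a sketch; the `Rule` class, the `run_rules` function, the `account` table, and the two sample rules are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Rule:
    name: str                      # hypothetical rule identifier
    column: str                    # column the rule inspects
    check: Callable[[Any], bool]   # returns True when the value is acceptable

def run_rules(table: str, rows: list[dict], rules: list[Rule]) -> list[dict]:
    """Apply every rule to every row and collect the problems with their location."""
    problems = []
    for row_no, row in enumerate(rows, start=1):
        for rule in rules:
            value = row.get(rule.column)
            if not rule.check(value):
                problems.append({
                    "table": table, "row": row_no, "column": rule.column,
                    "rule": rule.name, "value": value,
                })
    return problems

rules = [
    Rule("customer_id_not_empty", "customer_id", lambda v: v not in (None, "")),
    Rule("balance_non_negative", "balance",
         lambda v: isinstance(v, (int, float)) and v >= 0),
]

rows = [
    {"customer_id": "C001", "balance": 120.5},
    {"customer_id": "", "balance": 30.0},       # violates the first rule
    {"customer_id": "C003", "balance": -5.0},   # violates the second rule
]

problems = run_rules("account", rows, rules)
for problem in problems:
    print(problem)
```

Keeping the rules as data rather than hard-coded checks means the rule list can be reviewed by business staff and re-run unchanged against other tables.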

Data quality analysis stage.

Inputs: the outputs of the analysis stage and the inspection stage.

Process: analyze the impact of each quality problem on the reports, analyze its cause, and develop a solution.

Outputs: an analysis report and a rectification plan.

The data quality analysis report is an important basis for the subsequent big data governance platform.

Three, Data standard construction process

Build the framework of the basic data standards and the indicator data standards, determine the scope of standardization, and standardize the important indicators (attributes).

Construction process: business staff and technical staff both take part in sorting out and preparing the standards, and in supplementing and improving the standardization scheme.


Content of the basic data standard framework: following the data standards and specifications of the People's Bank of China, the attributes are divided into business attributes, technical attributes, and management attributes. Business attributes include the standard topic, standard category, standard subclass, standard Chinese name, standard English name, business definition, business rules, fusion rules, [illegible item], relationship with related standards, and the source and basis of the standard. Technical attributes include the data type, data format, code encoding rules, and value range. Management attributes include the standard definer, standard manager, standard user, description of feedback results, and the application area and application systems of the standard.
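One way to picture a standard entry is as a record that groups the three kinds of attributes, with the technical attributes usable for checking values. The sketch below is an assumption-laden illustration: the `DataStandard` class, its field names, the use of a regular expression for the data format, and the sample gender-code entry are all hypothetical, and only a subset of the attributes listed above is modeled.

```python
import re
from dataclasses import dataclass, field

@dataclass
class DataStandard:
    # Business attributes (subset)
    topic: str
    chinese_name: str
    english_name: str
    business_definition: str
    # Technical attributes
    data_type: type
    data_format: str = ""      # regular expression; an assumption of this sketch
    value_range: tuple = ()    # allowed codes; empty means unrestricted
    # Management attributes (subset)
    definer: str = ""
    manager: str = ""
    users: list = field(default_factory=list)

    def conforms(self, value) -> bool:
        """Check a value against the technical attributes of the standard."""
        if not isinstance(value, self.data_type):
            return False
        if self.data_format and not re.fullmatch(self.data_format, str(value)):
            return False
        return not self.value_range or value in self.value_range

# Hypothetical entry for illustration only.
gender = DataStandard(
    topic="Customer",
    chinese_name="性别代码",
    english_name="gender_code",
    business_definition="Gender of a natural-person customer",
    data_type=str,
    data_format=r"\d",
    value_range=("0", "1", "2", "9"),
    definer="Business department",
    manager="Data governance office",
)

print(gender.conforms("1"))  # True
print(gender.conforms("7"))  # False: outside the value range
print(gender.conforms(1))    # False: wrong data type
```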


Indicator standard construction: screen the important business indicators => develop the indicator standard framework (determine the indicator classification system; the attributes form a standardized definition template) => formulate the indicator standards (indicator definition, statistical caliber, rules, and data sources; the technical department defines the standardized attributes, and the business departments revise and confirm the indicator standards).

Indicator data standard framework: business attributes, technical attributes, and management attributes.

Four, Building the data platform

        Hierarchical relationship: attribute (field) => entity (object, table) => topic (a collection of entities, a business topic) => theme (a collection of topics, a business domain).
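The four levels of the hierarchy can be sketched as nested records. The class names `Entity`, `Topic`, and `Theme`, the `paths` helper, and the sample customer domain are hypothetical and only illustrate the containment relationship.

```python
from dataclasses import dataclass, field

@dataclass
class Entity:
    name: str                                             # object / table
    attributes: list[str] = field(default_factory=list)   # fields

@dataclass
class Topic:
    name: str                                             # business topic
    entities: list[Entity] = field(default_factory=list)

@dataclass
class Theme:
    name: str                                             # business domain
    topics: list[Topic] = field(default_factory=list)

    def paths(self) -> list[str]:
        """List every attribute with its full theme/topic/entity path."""
        return [
            f"{self.name}/{t.name}/{e.name}/{a}"
            for t in self.topics for e in t.entities for a in e.attributes
        ]

theme = Theme("Customer domain", [
    Topic("Customer profile", [
        Entity("customer", ["customer_id", "customer_name"]),
    ]),
])

for path in theme.paths():
    print(path)
```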

       
Model evolution process: basic model => logical model => physical model. The basic model starts from national or industry standards and is tailored to the business situation of the implementing organization: through a business matching process of retaining, adding, and merging, the basic model is formed and the business topic model framework is divided up. Logical model design requires copying (fields whose business meaning and field name both differ between systems), integrating (fields with the same meaning but different names), and splitting (fields with different meanings but the same name). After that, the attributes are matched, retained, added, merged, and split, and the source systems are mapped onto the topics, entities, and attributes of the basic model.
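The copy, integrate, and split decisions can be illustrated by comparing field names and business meanings across systems. This is a simplified sketch: the `classify` function, the exact-string comparison of meanings, and the sample `core` and `crm` fields are assumptions, and in practice the business meaning has to be confirmed by business staff rather than matched as text.

```python
from collections import defaultdict

def classify(fields):
    """fields: list of (system, field_name, business_meaning) tuples.

    Returns {(system, field_name): action} where action is one of
    "split", "integrate", or "copy".
    """
    names_by_meaning = defaultdict(set)
    meanings_by_name = defaultdict(set)
    for _system, name, meaning in fields:
        names_by_meaning[meaning].add(name)
        meanings_by_name[name].add(meaning)

    actions = {}
    for system, name, meaning in fields:
        if len(meanings_by_name[name]) > 1:
            actions[(system, name)] = "split"       # same name, different meanings
        elif len(names_by_meaning[meaning]) > 1:
            actions[(system, name)] = "integrate"   # same meaning, different names
        else:
            actions[(system, name)] = "copy"        # distinct name and meaning
    return actions

# Hypothetical fields from two source systems.
fields = [
    ("core", "cust_nm", "customer name"),
    ("crm", "client_name", "customer name"),
    ("core", "status", "account status"),
    ("crm", "status", "campaign status"),
    ("core", "open_dt", "account opening date"),
]

actions = classify(fields)
for key, action in actions.items():
    print(key, action)
```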

          After the model is designed, data mapping and ETL work follow: using the source table data, the reference mapping file, the development specification, and the loading strategy, develop and run the ETL jobs that populate the target tables.
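A mapping-driven load can be sketched as follows. The sketch assumes the mapping file reduces to a target-column to source-column dictionary and that the loading strategy is a full refresh; the table names, column names, and the `load` function are hypothetical, and SQLite stands in for the real source and target databases.

```python
import sqlite3

# Hypothetical mapping-file content: target column -> source column.
MAPPING = {
    "customer_id": "cust_no",
    "customer_name": "cust_nm",
}

def load(conn, source, target, mapping):
    """Full-refresh load: empty the target table, then copy the mapped columns."""
    target_cols = ", ".join(mapping.keys())
    source_cols = ", ".join(mapping.values())
    conn.execute(f"DELETE FROM {target}")
    conn.execute(
        f"INSERT INTO {target} ({target_cols}) SELECT {source_cols} FROM {source}"
    )
    conn.commit()
    return conn.execute(f"SELECT COUNT(*) FROM {target}").fetchone()[0]

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE src_customer (cust_no TEXT, cust_nm TEXT, branch TEXT);
CREATE TABLE int_customer (customer_id TEXT, customer_name TEXT);
INSERT INTO src_customer VALUES ('C001', 'Alice', 'B01'), ('C002', 'Bob', 'B02');
""")

loaded = load(conn, "src_customer", "int_customer", MAPPING)
print(loaded)
print(conn.execute("SELECT * FROM int_customer ORDER BY customer_id").fetchall())
```

An incremental strategy would replace the `DELETE` with a filter on a change timestamp or a merge on the business key; which one applies depends on the loading strategy chosen for each table.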

          These steps complete the integration layer. For data applications, the integration layer provides a unified business view, detailed data, comprehensive data, a stable data model, and complete historical data.

          The summary layer serves common data access requirements: public indicators are extracted, a dimensional model made up of dimensions and indicators is formed, and the data that meets those requirements is processed in advance.
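Pre-processing for the summary layer amounts to aggregating detailed rows by a chosen set of dimensions. A minimal sketch, assuming a single additive indicator; the `summarize` function and the sample `branch`, `product`, and `amount` columns are hypothetical.

```python
from collections import defaultdict

def summarize(rows, dimensions, measure):
    """Sum one indicator over every combination of the given dimensions."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[d] for d in dimensions)
        totals[key] += row[measure]
    return dict(totals)

# Hypothetical detail rows from the integration layer.
detail = [
    {"branch": "B01", "product": "loan", "amount": 100.0},
    {"branch": "B01", "product": "loan", "amount": 50.0},
    {"branch": "B02", "product": "deposit", "amount": 70.0},
]

summary = summarize(detail, ["branch", "product"], "amount")
print(summary)  # {('B01', 'loan'): 150.0, ('B02', 'deposit'): 70.0}
```

Reports that only need totals per branch and product can then read the summary directly instead of scanning the detailed data each time.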