STATUS  REPORT  ON

                       DATABASE  VERIFICATION  PROJECT

              SPONSORED  BY NORTEL TO B E COLLEGE (D UNIVERSITY )

 
 

I.  OBJECTIVE OF THE PROJECT :

       1. To understand the verification problems associated with
   the Nortel's legacy database
              2. To findout the verification requirements understanding
    the table structures and constraints.
                      Also to collect data values for existing  tables
    record by record.
              3. How to avoid transaction anomalies on existing tables?
              4. To understand the nature & impact of new or changed functions
        on the existing tables of the legacy database .
              5. Suggesting  design guidelines for new database
    tables to be  created .

II.  DELIVERABLES :

              AT THE END OF FIRST YEAR OF THE PROJECT (APRIL, 1999)  BEC HAS DELIVERED
TO NORTEL AN INTERACTIVE PACKAGE REALIZING THE OBJECTIVES SPECIFIED IN FIRST THREE POINTS
NOTED UNDER I. THE COMPLETE PACKAGE IS APPENDED TO THIS REPORT ALONG WITH "README FILE"
AND ILLUSTRATIONS.

              DETAILS OF IMPLEMENTATION ISSUES ARE NOTED BELOW.
 

III.  PHASE-I  TASKS:
         1). BEC  has attempted to achieve  first three  objectives
                                  in  Phase-I.
                        by developing different interactive software modules described below:
 

        *Collecting/Displaying/Modifying  Table Structures existing in
             the legacy Database from field/design office.
                      *Collecting Data Records against Tables maintaining value-range
                                   and type constraints from actual users.
 
                      * The  problem arises in  legacy data bases
     due to incorrect design of table structures,composite attributes,
                          etc. as invented by Codd in 70s. In relational theory Codd
                          also pointed out functional anomalies caused due to the following:
 
       1. Inproper choice of key/primary key   .
       2. Not use of normalized table structures.
       3. Random use of composite and vector
      attribute(s)/field(s)  in tables.
                            4. Improper definition of attribute
      constraints, functinal dependencies, null values,etc.
    Codd developed Normalization method to overcome above difficulties.
 

              2). BEC has attempted to tackle the Table Structure Verification  problem
                using two separate  approaches .
 
   i) Using codd's relational theory for verifying table structures in
                        1NF,2NF,3NF normal forms.Therefore
 
                     * We have developed table Normalization /Verification
         algorithm (upto 3NF)  assuming  the following:
 

    1. Any composite attribute can  be splitted into
       atomic data-items to ensure 1NF constraint .
                         2. Any vector attribute ( delimitted by '$')will
       not perticipate in the primary key determination.
                         3. Functinal Dependencies between primary key and non-key
       attributes can be collected  through users
       interactions .
                         4. Existing table structures can be decomposed into
       multiple 3NF tables,of course, ensuring lossless join(based on
       ABU's algorithm).
 

       ii)  Since for legacy database tables  functional dependencies  and
     primary keys may  not be  known we have to apply some reverse engineering
     method to determine  primary keys and functional dependencies.
     The machine learning method applied here is the'Combined Entropy
     approach with uniqueness testing' .To apply this method data-records
            for each will be required.The interactive package has taken care
            of that.
 

    ACHIEVEMENTS :
 
 

  * BEC has been able to develop an interactive software package which is
    capable of perfomming the following activities .

    1. Collection/Modification/Deletion of table structures
       used in legacy database  of  NORTEL .
                         2. Collection of data-records based on those structures for
       applying machine  learning approach .
                         3. For any given table structure and data set
              it is possible by the package  to determine
       the primary keys  using 'Combined Entropy & Uniqueness
       Testing Approach ' .
                          4. Once the primary key of a table is determined ,
       verification of table structure for 1NF,2Nf,3NF can
       be done automatically with the help of the  package
       in an interactive manner .
 
     NOTE :
           We  could test the s/w package with only 'trkmem'
  table structure because dataset of  384 records
  were supplied for that table only . Although the structures
  of  other two tables, namely 'sbsinv' & 'cdmaconf',were
  known  tests could not be performed for want of sufficient
                 data-records.
 
 

  PROPOSED APPROACH FOR  SECOND PHASE :
 
 
 

        1. With the interactive Software Package we will be able to collect
           a good number of table structures and  data records from users.
           These collections will help us in understanding the system requirements.
           Identification of problems areas will also be easy.
 
 2. For automatic verification it is possible to develope a tool based on
   decission-tree/decision-list  methods of machine learning approach.

  The usefulness  of proposed  method has been already tested by "trkmem" table.
  The beauty of the  proposed m/c learning approach is that for any existing
  table structure ,it will create a  decission trees/lists .Once such
  decission trees become known for any legacy database tables,they will act
        as verification templates against any further changes proposed in the
  table structures/operations on dataset . The process of verification
  can be made automatic. The proposed algorithm will be able to fulfil
         real-time requirements also.