MOMIS Todo guidelines

Sat Jan 13 18:07:42 MET 2001

This document summarizes all aspects of the MOMIS prototype that are to be developed or studied.

index

Future projects          Future projects
Theoretical aspects      Techniques and studies about MOMIS
Product re-engineering   Things to do on the existing software
Implementation aspects   Technologies and libraries for the MOMIS prototype


Keywords

Future projects

pDL   Digital Library Integration          End of 2002 (?)
pM    MOMIS Query manager implementation   End of 2003 (?)
pA    MOMIS Automatic Integration          after 2002 (?)

Theoretical aspects

  1. (pDL) Combining XML and Domain Ontologies for Digital Libraries access
    This will require studies about
    • Ontology compatibility (the minimum set of data structures needed to represent information in any ontology).
      We need to study and define an interface (like a wrapper) to access any ontology.
    • Multilingual ontologies
    • Ontology versioning issues (what if the reference ontology changes every month?)
    • Ontology extension techniques (how to allow an ontology to grow with correctness assurances)
  2. (pA) Extend odl_i3 (language and/or classes) to support annotation
  3. (pA) Allow the user to define new meanings as combinations of Wordnet meanings.
    This will solve common problems such as
    • Phone number
    • person code vs product code
    Moreover, since such new meanings will be defined on top of standard meanings, they will be recognized with little effort.
    I suggest introducing an Annotation hierarchy, where an annotation can be an AnnotationWordnetMeaning or an AnnotationComposed, that is, a composition of other Annotations. In this way we will have an extensible structure, and we will have to develop a technique to compute the matching between Annotations.
  4. (pA) Move the annotation capabilities to the Wrappers
    Todo:
    • Create a wrapper initialization tool with the SLIM capabilities (Guerra already developed an initialization module for the XML wrapper) at least for the JDBC wrapper
    • Study how to express the annotation information.
      The alternatives are to use XML in addition to the odl_i3 OR to extend odl_i3 to support annotations
    • Study how to identify the Wordnet synsets.
      Different Wordnet versions could be used.
      Moreover, Wordnet is not the only usable ontology.
      We need to study a format that simplifies ontology management as much as possible.
      (It may be interesting to study something like wrappers for ontologies, so that they all share the same interface.)
  5. (pM) Choose the way to represent union in odl_i3.
    Currently a union can be defined between interfaceBody definitions.
    Is this the correct solution? (There are problems in type resolution when defining foreign keys and complex objects.)
    Should it be possible to define a union directly on a single attribute?
  6. (pM) Extend OLCD (ODB-Tools) to support the union and/or the or operator.
  7. (pA) Using MOMIS and Mobile Agents to easily discover, integrate and query interesting data sources
    Claudio Sartori (Professor), Alberto Corni (Ph.D. Student), Gianluca Potenza (undergraduate Student)
  8. (pA) Dynamic integration
    Study of techniques to dynamically integrate sources, choosing between
    • keeping the same global schema interface as much as possible
      This means that it will be possible to build applications on an integrated schema even if the data sources change dynamically.
    • always dynamically computing the best global schema that represents the currently integrated sources
  9. (pM) Problems related to complex attribute mapping:
    1. Is it possible to map local attributes defined inside complex structs?
      For example, consider two sources
        s1 {
          struct pallet {
            string     name;
            boolean[5] freeSlots;
          }
          Interface pallet1() {
            attribute Struct pallet st;
          }
        }
      
        s2 {
          Interface pallet2() {
            attribute string name;
          }
        }
      
      The global schema for these two sources should contain a single global class Pallet composed of the local classes s1.pallet1 and s2.pallet2.
      The correct mapping table should be something like:
        Pallet {
          globalAttribute name {
            s2.pallet2.name,
            s1.pallet1.st.name
          }
          globalAttribute freeSlot {
            s1.pallet1.st.freeSlots
          }
        }
      
      Has this problem ever been considered?
    2. The problem with complex mapping in the context of MOMIS is that it is unclear what it implies for a global attribute to be of complex type and for a mapping element to be of type complexMapping. The structure created so far seems to assume a mapping of this kind: a generic global attribute is of complex type if its domain is represented by another global class. In particular, a global attribute maps elements whose domain is not a base type but a type represented by another global class. Currently in MOMIS, if a mapping element belonging to a global attribute is of type complexMapping, its domain is therefore represented by another global class, and the complexMapping instance should contain all the information about the referenced class.
      This kind of mapping, however, raises some problems: the information about the referenced global class is held at the mapping element level (that is, each mapping element must state which global class it refers to) rather than at the global attribute level. This information needs to be moved to the global attribute level, so that the reference in use becomes visible. The next problem is keeping the instances of the two global classes correctly aligned, that is, the instances containing elements with an attribute of type complexMapping and those containing the corresponding referenced attribute.
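
The mapping table of item 9.1 could be represented, as a minimal sketch, by letting every global attribute hold dotted paths into the local classes, so that a path can traverse a struct-valued attribute such as st. Everything named here (ComplexMappingSketch, GlobalClass, resolve) is hypothetical and not part of the MOMIS code base, and local instances are simply modelled as nested maps.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ComplexMappingSketch {

    /** A global class: global attribute name -> dotted paths into local classes. */
    static class GlobalClass {
        final String name;
        final Map<String, List<String>> mappings = new LinkedHashMap<>();

        GlobalClass(String name) { this.name = name; }

        GlobalClass map(String globalAttribute, String... localPaths) {
            mappings.put(globalAttribute, Arrays.asList(localPaths));
            return this;
        }
    }

    /**
     * Follows a path such as "st.name" through nested maps (hypothetical
     * stand-in for struct values). Returns null when a step is missing
     * or when the path tries to descend into a non-struct value.
     */
    static Object resolve(Map<String, Object> instance, String path) {
        Object current = instance;
        for (String step : path.split("\\.")) {
            if (!(current instanceof Map)) return null;
            current = ((Map<?, ?>) current).get(step);
        }
        return current;
    }

    public static void main(String[] args) {
        // The mapping table from the text, with struct fields reached by path.
        GlobalClass pallet = new GlobalClass("Pallet")
            .map("name", "s2.pallet2.name", "s1.pallet1.st.name")
            .map("freeSlot", "s1.pallet1.st.freeSlots");

        // An instance of s1.pallet1 whose attribute st is a struct.
        Map<String, Object> st = new HashMap<>();
        st.put("name", "P-01");
        st.put("freeSlots", new boolean[5]);
        Map<String, Object> pallet1 = new HashMap<>();
        pallet1.put("st", st);

        // Strip the "source.class." prefix, then walk the remaining path.
        String localPath = pallet.mappings.get("name").get(1);
        String prefix = "s1.pallet1.";
        System.out.println(resolve(pallet1, localPath.substring(prefix.length())));
    }
}
```

Keeping the path on the mapping element leaves open the question raised in item 9.2, namely where the reference to another global class should be recorded; the sketch only covers struct traversal inside a single local class.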

Product re-engineering

  1. In the shared classes of MOMIS there are duplicated data structures, e.g. the data structures that describe Classes and Attributes are defined both in the odl and in the globalschema packages.
    We need to remove this redundancy.
  2. (after having defined a stable odi3 architecture)
    Re-engineering of the Query Manager to be fully integrated with the CORBA architecture.
    This work has to take performance issues into account.
    • Resource usage: the Query Manager currently uses DB2 to store data temporarily, which consumes a lot of resources; this allocation should be limited.
    • Scalability: possibly porting to a cluster of machines, with the related load-balancing issues.
    • Portability: the engine should be accessible through different query languages, like OQLi3 or XML-QL.
    The goal of this study is to design an advanced query engine with an optimizer for distributed sources that uses the information produced by the integration module. THIS IS A BIG PROBLEM that requires skills in DBMS and distributed DBMS, and will produce a lot of modular software.
  3. Currently, the attribute list of each Interface is managed as a Vector of Attribute objects, since each Vector groups the attributes with the same name generated by the unions in different interfaceBody of the same interface.
    This does not help the object view of the ODL_i3 classes. A dedicated object, for example AttributeList, should therefore be defined to play this role.
  4. Management of the Join Rules.
    To take Join Rules into account during clustering, the best solution is to manage all kinds of relationships between schema classes in the same way.
    The first step should be to build a hierarchy of relationships that distinguishes between intensional and extensional ones. We already know how to manage intensional relationships. Extensional relationships will then be divided into extensional axioms and join maps.
  5. Mapping table management.
    It would be better to make the mapping between elements more flexible. For instance, an and mapping should be applicable to two generic mapping elements, each of which could be an and mapping, a simple mapping, an or mapping and so on.
    We should define a hierarchy of mapping elements and let the designer combine them freely.
  6. TUNIM, management of the inherited local attributes.
    To give the mapping as much flexibility as possible, it is necessary to instantiate all inherited attributes.
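
The hierarchy of mapping elements suggested in item 5 could take the shape of a composite, where and and or elements accept any other mapping element as an operand. The following is only an illustration under stated assumptions: the type names (MappingElement, SimpleMapping, AndMapping, OrMapping), the sample attribute paths and the textual rendering are all hypothetical, and no evaluation semantics for and/or is implied.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.StringJoiner;

public class MappingHierarchySketch {

    /** Common supertype: every mapping element can list the local attributes it uses. */
    interface MappingElement {
        List<String> localAttributes();
    }

    /** Leaf element: refers to a single local attribute. */
    static final class SimpleMapping implements MappingElement {
        final String localAttribute;

        SimpleMapping(String localAttribute) { this.localAttribute = localAttribute; }

        public List<String> localAttributes() {
            return Collections.singletonList(localAttribute);
        }

        @Override public String toString() { return localAttribute; }
    }

    /** Shared behaviour of composed elements; operands may themselves be composed. */
    static abstract class ComposedMapping implements MappingElement {
        final String operator;
        final List<MappingElement> operands;

        ComposedMapping(String operator, MappingElement... operands) {
            this.operator = operator;
            this.operands = Arrays.asList(operands);
        }

        public List<String> localAttributes() {
            List<String> all = new ArrayList<>();
            for (MappingElement e : operands) all.addAll(e.localAttributes());
            return all;
        }

        @Override public String toString() {
            StringJoiner joiner = new StringJoiner(" " + operator + " ", "(", ")");
            for (MappingElement e : operands) joiner.add(e.toString());
            return joiner.toString();
        }
    }

    static final class AndMapping extends ComposedMapping {
        AndMapping(MappingElement... operands) { super("and", operands); }
    }

    static final class OrMapping extends ComposedMapping {
        OrMapping(MappingElement... operands) { super("or", operands); }
    }

    public static void main(String[] args) {
        // An and mapping whose second operand is itself an or mapping.
        MappingElement m = new AndMapping(
            new SimpleMapping("s1.person.first_name"),
            new OrMapping(new SimpleMapping("s1.person.last_name"),
                          new SimpleMapping("s2.employee.surname")));
        System.out.println(m);
        System.out.println(m.localAttributes());
    }
}
```

With this shape, adding a new kind of mapping only requires a new subclass, and the designer can nest elements to any depth.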

Implementation aspects

technologies and libraries for the MOMIS prototype
  1. 13 jan 2001 Serialization and deserialization using XML
    Add to the specification that ALL SERIALIZED CLASSES MUST IMPLEMENT THE EMPTY CONSTRUCTOR (that is, a constructor with no arguments).
  2. Tuning of the ODB-Tools CORBA Server interface (there are some performance problems)
  3. Wordnet server: add the "where is the wordnet database" configuration parameter.
  4. Make an animated demonstration of the SI-Designer using "video grabber" software (Daniela Nasi suggested HyperCam)
  5. Export MOMIS data in XML
    There are two possibilities:
    1. Produce a global schema DTD (in XML or, better, in XML Schema) to export an integrated schema, and then produce a tool derived from the QueryManager to query such a schema using XML-QL.
      The integrated data source will be exported as XML.
    2. Export (and import) all information about schema integration (local source structure, thesaurus, mapping table, join map, etc.) in XML. This will allow any tool that includes an XML parser to easily access the integrated sources or to implement a wrapper without using ODL_i3.
      This will lead to the definition of an XML DTD for describing the integration phases, able to carry at least the ODLi3 information.
  6. ODB-Tools: make ODB-Tools portable.
    Study of the kernel algorithms of ODB-Tools, and cost/benefit analysis of rewriting the ODB-Tools kernel (only the kernel) in Java.
    If the kernel were integrated in SI-Designer...
    • It would be possible to track inconsistencies and to develop a GUI that shows them graphically
  7. Foreign key type consistency check problem.
  8. New technologies
    • Check whether C++ has new features such as garbage collection. Sometimes Java is really too slow; some C++ (not C) kernel objects should improve performance...
    • Explore the Java Enterprise Objects.
    • Explore the Iona ORB, freeware support for CORBA, and MICO, which should be an ORB by GNU.
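
The empty-constructor rule of item 1 can be motivated with a small sketch: a generic deserializer has to create the object before it can fill in its fields, and without a no-argument constructor it has no way to do so. The code below is an assumption-laden illustration, not the serializer used by the prototype: the classes Attribute and BadAttribute, the method fromXml and the one-element XML layout are all hypothetical; only the standard JAXP parser and Java reflection are used.

```java
import java.io.ByteArrayInputStream;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;

public class EmptyConstructorSketch {

    /** Follows the rule: it declares a constructor with no arguments. */
    static class Attribute {
        String name;
        String type;
        Attribute() { }
        Attribute(String name, String type) { this.name = name; this.type = type; }
    }

    /** Violates the rule: its only constructor needs an argument. */
    static class BadAttribute {
        String name;
        BadAttribute(String name) { this.name = name; }
    }

    /**
     * Builds an object from a one-element XML document whose XML attributes
     * carry the values of the String fields of the target class.
     */
    static <T> T fromXml(String xml, Class<T> type) throws Exception {
        Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
            .getDocumentElement();
        // The object must exist before its fields can be set:
        // this lookup fails when the class has no empty constructor.
        Constructor<T> ctor = type.getDeclaredConstructor();
        ctor.setAccessible(true);
        T instance = ctor.newInstance();
        NamedNodeMap attrs = root.getAttributes();
        for (int i = 0; i < attrs.getLength(); i++) {
            Field field = type.getDeclaredField(attrs.item(i).getNodeName());
            field.setAccessible(true);
            field.set(instance, attrs.item(i).getNodeValue());
        }
        return instance;
    }

    public static void main(String[] args) throws Exception {
        Attribute a = fromXml("<Attribute name=\"freeSlots\" type=\"boolean\"/>", Attribute.class);
        System.out.println(a.name + " : " + a.type);
        try {
            fromXml("<BadAttribute name=\"x\"/>", BadAttribute.class);
        } catch (NoSuchMethodException e) {
            System.out.println("cannot deserialize BadAttribute: no empty constructor");
        }
    }
}
```

A check of this kind could also be run once over all serializable classes, so that a missing empty constructor is reported at build time rather than at the first deserialization.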

The MOMIS Home Page