This document summarizes all aspects of the MOMIS prototype that are
still to be developed or studied.
Keywords
- Ontologies
- XML
- Dynamic Intelligent integration
Future projects
pDL | Digital Library Integration | End of 2002 (?)
pM | MOMIS Query manager implementation | End of 2003 (?)
pA | MOMIS Automatic Integration | after 2002 (?)
Theoretical aspects
- (pDL) Combining XML and Domain Ontologies for Digital Libraries
access
This will require studies about
- Ontology compatibility (the minimum set of data structures
needed to represent information in any ontology).
We need to study and define an interface (like a wrapper)
to access any ontology
- Multi-lingual ontologies
- Ontology versioning issues (what if the reference ontology
changes every month?)
- Ontology extension techniques (how to allow an ontology to
grow with correctness guarantees)
- (pA) Extend odl_i3 (language and/or classes) to support annotation
- (pA) Allow the user to define new meanings as combinations of
WordNet meanings.
This will solve common problems such as
- Phone number
- person code vs product code
Moreover, since such new meanings will be defined on top of
standard meanings, they will be recognized with little
effort.
I suggest introducing an Annotation hierarchy, where
an annotation can be an AnnotationWordnetMeaning or an
AnnotationComposed, that is, a composition of other Annotations.
In this way we will have an extensible structure, and we will have
to develop a technique to compute the
matching between Annotations.
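The suggested hierarchy could be sketched in Java as follows. Only the names Annotation, AnnotationWordnetMeaning and AnnotationComposed come from the note above; the baseMeanings() method, the string identifiers and the overlap-based match() score are assumptions made for illustration, not a proposed matching technique.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch of the proposed Annotation hierarchy. */
abstract class Annotation {
    /** Identifiers of all basic meanings this annotation is built on (hypothetical method). */
    abstract Set<String> baseMeanings();

    /** Toy matching score (an assumption): shared base meanings divided by all base meanings. */
    static double match(Annotation a, Annotation b) {
        Set<String> common = new HashSet<>(a.baseMeanings());
        common.retainAll(b.baseMeanings());
        Set<String> all = new HashSet<>(a.baseMeanings());
        all.addAll(b.baseMeanings());
        return all.isEmpty() ? 0.0 : (double) common.size() / all.size();
    }
}

/** Leaf annotation: a single standard meaning. */
class AnnotationWordnetMeaning extends Annotation {
    private final String meaningId; // placeholder identifier, not a real synset offset

    AnnotationWordnetMeaning(String meaningId) {
        this.meaningId = meaningId;
    }

    @Override
    Set<String> baseMeanings() {
        return Collections.singleton(meaningId);
    }
}

/** Composite annotation: a new meaning defined as a combination of other annotations. */
class AnnotationComposed extends Annotation {
    private final List<Annotation> parts;

    AnnotationComposed(Annotation... parts) {
        this.parts = Arrays.asList(parts);
    }

    @Override
    Set<String> baseMeanings() {
        Set<String> result = new HashSet<>();
        for (Annotation part : parts) {
            result.addAll(part.baseMeanings());
        }
        return result;
    }
}
```

Because a composed annotation is reduced to standard meanings, a "phone number" annotation built from "phone" and "number" can be compared with any other annotation without extra vocabulary.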
- (pA) Move the annotation capabilities to the wrappers
Todo:
- Create a wrapper initialization tool with the SLIM
capabilities, at least for the JDBC wrapper (Guerra already
developed an initialization module for the XML wrapper)
- Study how to express the annotation information.
The alternatives are to use XML in addition to odl_i3
or to extend odl_i3 to support annotations
- Study how to identify the WordNet synsets.
Different WordNet versions could be used.
Moreover, WordNet is not the only usable ontology.
We need to study a format that simplifies ontology
management as much as possible.
(It may be interesting to study something like wrappers for
ontologies, so that they all share the same interface.)
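A wrapper-like interface for ontologies might look like the sketch below. Every name in it (OntologyWrapper, MeaningRef, InMemoryOntology, OntologyLookup) is hypothetical; the only idea taken from the notes is that all ontologies share one interface and that the ontology version has to travel with each meaning identifier.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical common interface that every ontology wrapper would expose. */
interface OntologyWrapper {
    String name();                        // for example "WordNet"
    String version();                     // identifiers may change between releases
    List<String> meaningsOf(String term); // meaning identifiers, empty if the term is unknown
}

/** Version-qualified reference to a meaning, so stored annotations stay unambiguous. */
final class MeaningRef {
    final String ontology;
    final String version;
    final String meaningId;

    MeaningRef(String ontology, String version, String meaningId) {
        this.ontology = ontology;
        this.version = version;
        this.meaningId = meaningId;
    }

    @Override
    public String toString() {
        return ontology + "/" + version + "#" + meaningId;
    }
}

/** Toy in-memory wrapper, used only to exercise the interface. */
class InMemoryOntology implements OntologyWrapper {
    private final String name;
    private final String version;
    private final Map<String, List<String>> meanings = new HashMap<>();

    InMemoryOntology(String name, String version) {
        this.name = name;
        this.version = version;
    }

    void add(String term, String meaningId) {
        meanings.computeIfAbsent(term, k -> new ArrayList<>()).add(meaningId);
    }

    @Override
    public String name() {
        return name;
    }

    @Override
    public String version() {
        return version;
    }

    @Override
    public List<String> meaningsOf(String term) {
        List<String> found = meanings.get(term);
        return found == null ? Collections.<String>emptyList() : found;
    }
}

class OntologyLookup {
    /** Resolves a term through any wrapper into version-qualified references. */
    static List<MeaningRef> resolve(OntologyWrapper wrapper, String term) {
        List<MeaningRef> refs = new ArrayList<>();
        for (String id : wrapper.meaningsOf(term)) {
            refs.add(new MeaningRef(wrapper.name(), wrapper.version(), id));
        }
        return refs;
    }
}
```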
- (pM) Choose the way to represent union in
odl_i3.
Currently a union can be defined between
interfaceBody.
Is this the correct solution? (There are problems in type
resolution when defining foreign keys and complex objects.)
Should it be possible to define a union directly on a single
attribute?
- (pM) Extend OLCD (ODB-Tools) to support the union and/or
the or operator.
- (pA) Using MOMIS and Mobile Agents to easily discover, integrate and
query interesting data sources
Claudio Sartori (Professor), Alberto Corni (Ph.D. student),
Gianluca Potenza (undergraduate student)
- (pA) Dynamic integration
Study of techniques to dynamically integrate sources, choosing
between
- keeping the global schema interface as stable as
possible.
This means that it will be possible to build applications
on an integrated schema even if the data sources change
dynamically.
- always recomputing the best global schema that
represents the currently integrated sources
- (pM) Problems related to complex attribute mapping:
- Is it possible to map local attributes defined on complex Structs?
For example, consider two sources
s1 {
    struct pallet {
        string name;
        boolean[5] freeSlots;
    }
    Interface pallet1() {
        attribute Struct pallet st;
    }
}
s2 {
    Interface pallet2() {
        attribute string name;
    }
}
The global schema for these two sources should contain
a single global class Pallet composed of the local
classes s1.pallet1 and s2.pallet2.
The correct mapping table should be something like:
Pallet {
    globalAttribute name {
        s2.pallet2.name,
        s1.pallet1.st.name
    }
    globalAttribute freeSlot {
        s1.pallet1.st.freeSlots
    }
}
Has this problem ever been considered?
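One way to read the mapping table above is that each local attribute is a dotted path that may descend into a Struct. The Java sketch below resolves such paths over nested maps standing in for local objects. The class ComplexMappingSketch, its methods and the use of maps as objects are assumptions made for illustration; they do not describe the current MOMIS data structures.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ComplexMappingSketch {
    /** Follows a dotted path such as "st.name" through nested maps; null if a step is missing. */
    static Object resolve(Map<String, Object> object, String path) {
        Object current = object;
        for (String step : path.split("\\.")) {
            if (!(current instanceof Map)) {
                return null;
            }
            current = ((Map<?, ?>) current).get(step);
        }
        return current;
    }

    /** Mapping table of the global class Pallet: global attribute -> (local class -> path). */
    static Map<String, Map<String, String>> palletMappingTable() {
        Map<String, Map<String, String>> table = new LinkedHashMap<>();

        Map<String, String> name = new LinkedHashMap<>();
        name.put("s2.pallet2", "name");
        name.put("s1.pallet1", "st.name");
        table.put("name", name);

        Map<String, String> freeSlot = new LinkedHashMap<>();
        freeSlot.put("s1.pallet1", "st.freeSlots");
        table.put("freeSlot", freeSlot);

        return table;
    }

    /** Reads a global attribute from a local object; null when the local class does not map it. */
    static Object globalValue(String globalAttribute, String localClass,
                              Map<String, Object> localObject) {
        Map<String, String> perClass = palletMappingTable().get(globalAttribute);
        if (perClass == null) {
            return null;
        }
        String path = perClass.get(localClass);
        return path == null ? null : resolve(localObject, path);
    }
}
```

With this reading, the global attribute name is available from both sources, while freeSlot is simply undefined for objects of s2.pallet2.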
- The problem with complex mapping in MOMIS is that it is not
clear what it implies for a global attribute to be of complex
type and for a mapping element to be of type complexMapping.
The structure created so far seems to assume a mapping of this
kind: a generic global attribute is of complex type if its
domain is represented by another global class. In particular, a
global attribute maps elements whose domain is not a base type
but a type represented by another global class. Currently in
MOMIS, if a mapping element belonging to a global attribute is
of type complexMapping, it therefore has a domain represented by
another global class, and the complexMapping instance should
contain all the information about the referenced class.
This kind of mapping raises some problems, however: the
information about the referenced global class is held at the
mapping element level (that is, each mapping element must state
which global class it refers to) and not at the global attribute
level. This information must be moved to the global attribute
level so that the reference in use becomes visible. The next
problem is keeping the instances of the two global classes
correctly aligned, that is, the instances containing elements
with an attribute of type complexMapping and those containing
the corresponding referenced attribute.
Product re-engineering
- In the shared classes of MOMIS there are duplicated data structures, e.g., the data
structures that describe Classes and Attributes are defined
both in the odl and in the globalschema
packages.
We need to remove this redundancy.
- (after having defined a stable odi3 architecture)
Re-engineering of the Query Manager to be fully integrated with
the CORBA architecture.
This work has to take performance issues into account.
- Limit the allocation of resources: currently the Query
Manager uses DB2 to store data temporarily, which consumes a
lot of resources.
- (scalability) Eventually, porting to a cluster of
machines, with the related load-balancing issues.
- Portability: the engine should be accessible
through different query languages such as OQLi3 or XML-QL
The goal of this study is to design an advanced query engine,
with an optimizer for distributed sources, that uses the
information produced by the integration module. THIS IS A BIG
PROBLEM that requires skills in DBMS and distributed DBMS and
will produce a lot of modular software.
- Currently the attribute list of each Interface is managed
as a Vector of Attribute objects, where each single Vector
groups the attributes with the same name generated by the
unions in the different interfaceBody of the same interface.
This does not help the object view of the ODL_i3 classes. A
dedicated object, for example AttributeList, should therefore
be defined to play this role.
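A dedicated object of this kind could be sketched as below. Only the names Attribute and AttributeList come from the note; the fields, the method names and the choice of a map keyed by attribute name are assumptions, and the real Attribute class certainly carries more information.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Minimal stand-in for the ODL_i3 attribute description. */
class Attribute {
    final String name;
    final String type;

    Attribute(String name, String type) {
        this.name = name;
        this.type = type;
    }
}

/** Groups the same-named attributes coming from the different interfaceBody of one interface. */
class AttributeList {
    private final Map<String, List<Attribute>> byName = new LinkedHashMap<>();

    void add(Attribute attribute) {
        byName.computeIfAbsent(attribute.name, k -> new ArrayList<>()).add(attribute);
    }

    /** Attribute names in insertion order. */
    Set<String> names() {
        return Collections.unmodifiableSet(byName.keySet());
    }

    /** All definitions of one attribute name, one per interfaceBody that declares it. */
    List<Attribute> variants(String name) {
        List<Attribute> found = byName.get(name);
        return found == null
                ? Collections.<Attribute>emptyList()
                : Collections.unmodifiableList(found);
    }

    /** True when the attribute is defined more than once, that is, through a union. */
    boolean isUnion(String name) {
        return variants(name).size() > 1;
    }
}
```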
- Management of the Join Rules.
To take Join Rules into account during clustering, the best
solution is to manage all kinds of relationships between schema
classes in the same way.
The first step should be to build a hierarchy of relationships
that distinguishes between intensional and extensional ones. We
already know how to manage the intensional relationships.
Extensional relationships will then be divided into extensional
axioms and join maps.
- Mapping table management.
It would be better to make the mapping between elements more
flexible. For instance, an "and" mapping should be defined
between two generic mapping elements, each of which could itself
be an "and" mapping, a simple mapping, an "or" mapping,
and so on.
We should define a hierarchy of mapping elements and let the
designer combine them as he prefers.
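Such a hierarchy is naturally a composite structure. The sketch below is only an illustration: the class names are hypothetical, and the semantics chosen here ("and" concatenates two values, "or" takes the first available one) are assumptions, not the semantics defined by the MOMIS mapping table.

```java
import java.util.Map;

/** A node of the mapping hierarchy; it computes a value from one local record. */
interface MappingElement {
    /** Returns the mapped value for a record (attribute name -> value), or null if unavailable. */
    String evaluate(Map<String, String> localRecord);
}

/** Leaf: reads a single local attribute. */
class SimpleMapping implements MappingElement {
    private final String localAttribute;

    SimpleMapping(String localAttribute) {
        this.localAttribute = localAttribute;
    }

    @Override
    public String evaluate(Map<String, String> localRecord) {
        return localRecord.get(localAttribute);
    }
}

/** Assumed semantics: both operands are required and their values are concatenated. */
class AndMapping implements MappingElement {
    private final MappingElement left;
    private final MappingElement right;

    AndMapping(MappingElement left, MappingElement right) {
        this.left = left;
        this.right = right;
    }

    @Override
    public String evaluate(Map<String, String> localRecord) {
        String l = left.evaluate(localRecord);
        String r = right.evaluate(localRecord);
        return (l == null || r == null) ? null : l + " " + r;
    }
}

/** Assumed semantics: the first operand that yields a value wins. */
class OrMapping implements MappingElement {
    private final MappingElement left;
    private final MappingElement right;

    OrMapping(MappingElement left, MappingElement right) {
        this.left = left;
        this.right = right;
    }

    @Override
    public String evaluate(Map<String, String> localRecord) {
        String l = left.evaluate(localRecord);
        return l != null ? l : right.evaluate(localRecord);
    }
}
```

Since every operand is just a MappingElement, the designer can nest the elements freely, for example an "or" whose second operand is an "and" of two simple mappings.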
- TUNIM, management of the inherited local
attributes.
To make the mapping as flexible as possible, all inherited
attributes must be instantiated.
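Instantiating the inherited attributes amounts to flattening the inheritance chain of each local class, so that every attribute can be mapped individually. A minimal sketch, assuming single inheritance and attributes identified by name only (the class LocalSchemaClass is hypothetical):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical description of a local class with an optional parent class. */
class LocalSchemaClass {
    final String name;
    final LocalSchemaClass parent; // null for a root class
    final List<String> ownAttributes;

    LocalSchemaClass(String name, LocalSchemaClass parent, String... ownAttributes) {
        this.name = name;
        this.parent = parent;
        this.ownAttributes = Arrays.asList(ownAttributes);
    }

    /** Inherited attributes first, then the class's own; a redefined attribute is listed once. */
    List<String> allAttributes() {
        Set<String> result = new LinkedHashSet<>();
        if (parent != null) {
            result.addAll(parent.allAttributes());
        }
        result.addAll(ownAttributes);
        return new ArrayList<>(result);
    }
}
```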
Implementation aspects
technologies and libraries for the MOMIS prototype
- 13 jan 2001
Serialization and deserialization using XML
Add to the specification that ALL SERIALIZED
CLASSES MUST PROVIDE AN EMPTY CONSTRUCTOR (that is, a
constructor with no arguments).
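The reason for this rule can be seen in a reflective deserializer: it has to create the object before it knows any field value, so it can only call a constructor without arguments. The sketch below is not the MOMIS serialization code. The XML layout (object and field elements) and the classes SourceDescription and XmlReflectiveReader are assumptions, and only String fields are handled.

```java
import java.io.ByteArrayInputStream;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Example serialized class (hypothetical). */
class SourceDescription {
    String name;
    String url;

    SourceDescription() {
        // the empty constructor the reader below relies on
    }
}

class XmlReflectiveReader {
    /** Rebuilds an object from <object class="..."><field name="...">value</field></object>. */
    static Object read(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                    .getDocumentElement();
            Class<?> type = Class.forName(root.getAttribute("class"));
            // Throws NoSuchMethodException when the class has no no-argument constructor.
            Constructor<?> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            Object instance = constructor.newInstance();
            NodeList fields = root.getElementsByTagName("field");
            for (int i = 0; i < fields.getLength(); i++) {
                Element fieldElement = (Element) fields.item(i);
                Field target = type.getDeclaredField(fieldElement.getAttribute("name"));
                target.setAccessible(true);
                target.set(instance, fieldElement.getTextContent()); // String fields only
            }
            return instance;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot deserialize object", e);
        }
    }
}
```

A class that only offers constructors with arguments makes getDeclaredConstructor() fail, which is the failure the specification rule is meant to prevent.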
- Tuning of the ODB-Tools CORBA Server interface (there are
some performance problems)
- WordNet server: add the "where is the WordNet
database" configuration parameter.
- Prepare an animated demonstration of the SI-Designer using
"video grabber" software (Daniela Nasi suggested
HyperCam)
- Export MOMIS data in XML
There are two possibilities:
- Produce a global schema DTD (in XML or, better, in
XML Schema) to export an integrated schema, and then
produce a tool derived from the Query Manager to query such
a schema using XML-QL.
The integrated data source will be exported as XML.
- Export (and import) all information about schema
integration (local source structure, thesaurus, mapping
table, join map, etc.) in XML. This will allow any tool that
includes an XML parser to easily access the integrated
sources or to implement a wrapper without using ODL_i3.
This will lead to the definition of an XML DTD for the
description of the integration phases, able to carry at
least the ODLi3 information.
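As an illustration of the second possibility, the sketch below writes a mapping table as XML with the standard Java DOM and transformer APIs. The element names (globalClass, globalAttribute, localAttribute) and the class MappingTableXmlExport are invented for this example; the actual DTD is still to be defined.

```java
import java.io.StringWriter;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class MappingTableXmlExport {
    /** Serializes one global class: global attribute name -> list of local attribute paths. */
    static String export(String globalClass, Map<String, List<String>> mappings) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("globalClass");
            root.setAttribute("name", globalClass);
            doc.appendChild(root);

            for (Map.Entry<String, List<String>> entry : mappings.entrySet()) {
                Element globalAttribute = doc.createElement("globalAttribute");
                globalAttribute.setAttribute("name", entry.getKey());
                for (String localPath : entry.getValue()) {
                    Element localAttribute = doc.createElement("localAttribute");
                    localAttribute.setTextContent(localPath);
                    globalAttribute.appendChild(localAttribute);
                }
                root.appendChild(globalAttribute);
            }

            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Cannot export mapping table", e);
        }
    }
}
```

Any tool with an XML parser could then read the table back without knowing ODL_i3, which is the benefit the second possibility aims at.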
- ODB-Tools: make ODB-Tools portable.
Study of the kernel algorithms of ODB-Tools and cost/benefit
analysis of rewriting the ODB-Tools kernel (only the kernel)
in Java.
If the kernel were integrated in SI-Designer...
- It would be possible to track inconsistencies and to
develop a GUI that shows them graphically
- Foreign key type consistency
check problem.
- New technologies
- Check whether C++ has new features such as garbage
collection. Sometimes Java is really too slow; some C++ (not
C) kernel objects should improve performance...
- Explore the Java Enterprise Objects.
- Explore the Iona ORB, freeware support for CORBA, and
MICO, which should be an ORB by GNU.