In my DRP, I explained how data integration can be implemented using a logic-based theorem prover or reasoner. The framework is based on ontology translation at the high level, plus various syntax wrappers that store and retrieve data from the underlying sources at the low level. That work focused primarily on the scenario of one source and one target. But what about multiple sources? The problem becomes interesting once we observe that each model transformation reasons independently over objects in its source domain, yet objects are quite likely shared across domains. For example, the person with account name "firstname.lastname@example.org" at Amazon.com is very likely the same person identified as "email@example.com" at Google.com, but how do we know for sure? When we retrieve data from multiple sources, we therefore need a way to reconcile references to objects, so that we can correctly remove redundant information from the answer set and add complementary information to it. In the future, we would like this mechanism to further our goal of detecting and reconciling inconsistencies between data sources, such as a possibly incorrect birthdate for "firstname.lastname@example.org".
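To make the reconciliation step concrete, here is a minimal sketch in Python. It assumes, for simplicity, that a shared key (here, an email address) has already been established between the two sources; the actual matching of distinct identifiers such as "firstname.lastname@example.org" and "email@example.com" is exactly the hard problem left open above. All record data, attribute names, and the `reconcile` function are illustrative assumptions, not part of the framework itself.

```python
# Hypothetical sketch: merging answer sets from two sources while
# (a) removing redundant information, (b) adding complementary
# information, and (c) surfacing inconsistencies instead of hiding them.

def reconcile(records_a, records_b, key="email"):
    """Unify records that share `key`; collect attribute conflicts."""
    merged = {r[key]: dict(r) for r in records_a}
    conflicts = []
    for r in records_b:
        k = r[key]
        if k not in merged:
            merged[k] = dict(r)
            continue
        for attr, value in r.items():
            existing = merged[k].get(attr)
            if existing is None:
                merged[k][attr] = value  # complementary information
            elif existing != value:
                # inconsistency, e.g. conflicting birthdates
                conflicts.append((k, attr, existing, value))
    return list(merged.values()), conflicts

# Illustrative data only.
amazon = [{"email": "firstname.lastname@example.org",
           "birthdate": "1980-01-01"}]
google = [{"email": "firstname.lastname@example.org",
           "birthdate": "1980-01-02", "city": "Seattle"}]
answers, conflicts = reconcile(amazon, google)
```

Here the two source records collapse into one answer, the `city` attribute is carried over from the second source, and the conflicting birthdates are reported as an inconsistency rather than silently resolved.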
In this discussion, I will review the theoretical framework for ontology-based data integration. I will then show how we might extend this framework to accommodate reference reconciliation and inconsistency checking. The intuition so far is to further develop our novel idea of combining backward and forward chaining in a single reasoning task. This work is still under development, so your feedback and suggestions are very welcome.
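The combined-chaining intuition can be sketched as follows. This is not the actual prover; it is a toy over ground atoms, where rules are (premises, conclusion) pairs, and all predicate and constant names are made-up examples. The idea it illustrates: forward chaining saturates the retrieved facts with reconciliation rules (e.g. deriving a "sameAs" link), and backward chaining then answers the query in a goal-directed way over the saturated set.

```python
# Illustrative combination of forward and backward chaining.
# Atoms are tuples; a rule is ([premise_atoms], conclusion_atom).

def forward_chain(facts, rules):
    """Saturate: apply rules until no new facts are derived."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if all(p in facts for p in premises) and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def backward_prove(goal, facts, rules):
    """Goal-directed: prove `goal` by recursing on rule premises."""
    if goal in facts:
        return True
    return any(conclusion == goal and
               all(backward_prove(p, facts, rules) for p in premises)
               for premises, conclusion in rules)

# Hypothetical scenario: two accounts known to share an email address.
facts = {("account", "amazon", "a1"),
         ("account", "google", "g1"),
         ("sameEmail", "a1", "g1")}
rules = [([("sameEmail", "a1", "g1")], ("sameAs", "a1", "g1"))]

saturated = forward_chain(facts, rules)
print(backward_prove(("sameAs", "a1", "g1"), saturated, rules))  # True
```

In the real framework the rules would come from the ontology translation and would not be ground, but the division of labor is the same: forward chaining does the cross-source reconciliation once, and backward chaining keeps query answering goal-directed.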