Source selection in large scale data contexts: An optimization approach

  • Alexandra Pomares
  • Claudia Roncancio
  • Van Dat Cung
  • José Abásolo
  • María Del Pilar Villamil
  • Institut polytechnique de Grenoble
  • Universidad de los Andes Colombia

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution > peer-review

2 Scopus citations

Abstract

This paper presents OptiSource, a novel approach to source selection that reduces the number of data sources accessed during query evaluation in large-scale distributed data contexts. These contexts are typical of large-scale Virtual Organizations (VOs), where autonomous organizations share data about a group of domain concepts (e.g. patient, gene). The instances of such concepts are constructed from non-disjoint fragments provided by several local data sources. These sources overlap in an uncontrolled way, which makes data location uncertain. This fact, together with the absence of reliable statistics on source contents and the large number of sources, makes current proposals unsuitable in terms of response quality and/or response time. OptiSource optimizes source selection by taking advantage of organizational aspects of VOs to predict the benefit of using a source. It uses an optimization model to identify the sets of sources that maximize benefit and minimize the number of sources to contact while satisfying resource constraints. Tests performed with the OptiSource prototype show that the precision and recall of source selection are substantially improved.
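
This record does not give the optimization model itself, so the following is only a toy sketch of the kind of selection problem the abstract describes, not the OptiSource formulation. It assumes each source has a single predicted benefit and a single resource cost, and it uses exhaustive search in place of a real combinatorial optimization method. The names `Source`, `select_sources`, `benefit`, `cost`, and `budget` are all hypothetical.

```python
from itertools import combinations
from typing import NamedTuple


class Source(NamedTuple):
    name: str     # hypothetical source identifier
    benefit: int  # assumed predicted benefit of querying this source
    cost: int     # assumed resource cost of contacting this source


def select_sources(sources, budget):
    """Return the names of the subset with the highest total benefit whose
    total cost fits the budget, preferring fewer sources when benefits tie."""
    best, best_key = (), (0, 0)  # the empty selection is the baseline
    for k in range(1, len(sources) + 1):
        for subset in combinations(sources, k):
            if sum(s.cost for s in subset) > budget:
                continue  # violates the resource constraint
            # Compare by benefit first, then by smaller subset size.
            key = (sum(s.benefit for s in subset), -len(subset))
            if key > best_key:
                best, best_key = subset, key
    return [s.name for s in best]


# Illustrative data only; these values do not come from the paper.
SOURCES = [
    Source("A", benefit=6, cost=3),
    Source("B", benefit=4, cost=2),
    Source("C", benefit=2, cost=1),
    Source("D", benefit=6, cost=4),
]

print(select_sources(SOURCES, budget=5))
print(select_sources(SOURCES, budget=3))
```

With a budget of 3, both {A} and {B, C} reach the same total benefit, and the sketch keeps the single source, mirroring the stated goal of contacting as few sources as possible. Exhaustive search grows exponentially with the number of sources, so it is only usable for a handful of sources.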

Original language: English
Title of host publication: Database and Expert Systems Applications - 21st International Conference, DEXA 2010, Proceedings
Pages: 46-61
Number of pages: 16
Edition: PART 1
State: Published - 2010
Event: 21st International Conference on Database and Expert Systems Applications, DEXA 2010 - Bilbao, Spain
Duration: 30 Aug 2010 to 03 Sep 2010

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Number: PART 1
Volume: 6261 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 21st International Conference on Database and Expert Systems Applications, DEXA 2010
Country/Territory: Spain
City: Bilbao
Period: 30/08/10 to 03/09/10

Keywords

  • Combinatorial Optimization
  • Large Scale Data Mediation
  • Source Selection
