A spatio-temporal access method based on snapshots and events

of 10
All materials on our website are shared by users. If you have any questions about copyright issues, please report us to resolve them. We are always happy to assist you.
Information Report
Category:

Career

Published:

Views: 12 | Pages: 10

Extension: PDF | Download: 0

Share
Description
A spatio-temporal access method based on snapshots and events
Tags
Transcript
  A Spatio-temporal Access Method based on Snapshotsand Events  ∗ GilbertoGuti´errezR.UniversidaddelB´ıoB´ıo/UniversidaddeChileBlancoEncalada2120,Santiago/Chileggutierr@dcc.uchile.clGonzaloNavarroCenterforWebResearchDepartmentofComputerScienceUniversidaddeChileBlancoEncalada2120,Santiago/Chilegnavarro@dcc.uchile.clAndreaRodr´ıguezT.UniversidaddeConcepci´onCenterforWebResearchEdmundoLarenas215,Concepci´on/Chileandrea@udec.clAlejandroGonz´alezO.UniversidaddelB´ıoB´ıoAvenidaLaCastillaS/NChill´an/Chilealejandro.gonzalez@dmrconsulting.intJos´eOrellanaV.UniversidaddelB´ıoB´ıoAvenidaLaCastillaS/NChill´an/Chile jose.orellana@gmail.com ABSTRACT This paper describes a new spatio-temporal access method(SEST-Index) that combines two approaches for modelingspatio-temporal information: snapshots and events. Thismethod makes it possible to not only process  time slice   and interval   queries, but also queries about events. The SEST-Index implementation uses an R-tree structure for storingsnapshots and a log data structure for storing events thatoccur between consecutive snapshots. Experimental resultsthat compare SEST-Index and HR-tree show that, for achange frequency between 1% and 13%, SEST-Index re-quires less storage space than HR-tree, and for a changefrequency between 1% and 7%, SEST-Index outperformsHR-tree for  interval   queries. In addition, as SEST-Indexis an event-oriented structure, event queries are efficientlyanswered. In order to decrease the storage space for fre-quencies of change above 20%, this work explores alterna-tives that optimize the space of the log structure withoutaffecting the efficiency of query answers. ∗ Gilberto Guti´errez is partially funded by Direcci´on de In-vestigaci´on, Universidad del B´ıo-B´ıo, Grant 043318 3/R.Gonzalo Navarro and Andrea Rodr´ıguez are funded by Nu-cleus Millenium Center for Web Research, Grant P04-067-F,Mideplan, Chile Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee. GIS’05,  November 4, 2005, Bremen, Germany.Copyright 2005 ACM 1-59593-146-5/05/0011 ... $ 5.00. Categories and Subject Descriptors H.2 [ DATABASE MANAGEMENT ]: Database appli-cations  - Spatio-temporal databases  General Terms Algorithms, Performance Keywords Spatio-temporal access methods, R-trees, temporal events. 1. INTRODUCTION Space and time are two inherent attributes of any objectof the real world. Thus, an object is characterized by itsposition and extent at any instant in time [4]. These ob- jects make up the type of spatio-temporal data that need tobe managed in some computer applications. For example,an application of a fleet of taxis needs to store informationabout where and when each of its cars has been. This allowsus to answer queries such as,  Which were the cars located at the shopping center at 6:00 pm ?   or  Which are the closest cars to the car number BB-3545 (which needs assistance)?  .Other applications relate to transportation, environment,social (e.g., demographic and health) and multimedia sys-tems. Spatio-temporal applications have been classified intothree categories depending on the type of data they manage[8].1. Applications that deal with continuos changes, such asthe movement of a car on a highway.2. Applications that include objects located in space andthat can change their position by means of a modi-fication of their geometric shape. For example, the  changes in the administrative boundaries of a city overtime. In this type of applications the changes in thegeometric shape of the objects occur in a discrete man-ner.3. Applications that integrate both previous behaviors.This type of applications appears in the environmentalarea where it is necessary to model the movements of the objects and their geometric shapes in time.This work proposes a new access method (SEST-Index)that is adequate for applications that belong to the secondcategory, thus it supports discrete changes to the locationand shape of the objects. This access method is based onproducing snapshots after a certain number of changes thatoccur over objects and on storing the events that producethese changes in a data structure called  log . Consequently,it allows the representation of (1) temporal snapshots   and (2) events on objects  , described in [16]. This approach has beendiscussed in others studies [2, 3], but it has been discardeda priori by arguing that it is not easy to determine howmany events determine a new snapshot and that extra timeis required for query processing. The number of snapshotsrepresents a tradeoff between space and answer time, sincea larger number of snapshots decreases the answer time of aquery while increasing the storage space. Inversely, a smallernumber of snapshots decreases the space while increasingthe answer time. This work explores and experimentallyevaluates the combination of snapshots and events for thefollowing reasons:1. Both snapshots and events are considered complemen-tary and relevant information for spatio-temporal ap-plications. Interesting queries exist for objects’ statesand for events over objects. For example,  when did an object enter a region?   and  how many objects move out of a region within a given time interval?  .2. The frequency of snapshots can be adjusted dependingon the type of applications and the change frequency of objects. For example, there may be applications whereit is not of interest to query about objects’ states oversome period of time.3. The data structure for snapshots and changes or eventsare independent, and so are the improvements that canbe obtained in either structure. Moreover, integrationof existing spatial access methods for handling snap-shots into this approach can be easily achieved.There exist various spatio-temporal access methods forthis same category of applications. Some are RT-tree [17],HR-tree (Historical R-tree) [6, 7], 3D R-tree [14], HR + -tree [10], MV3R-tree [11] and OLQ (Overlapping LinearQuadtree)[15], among others that are designed to only an-swer  time slice   and  time interval   queries about the history of the special attributes of objects. Unlike these previous stud-ies, this work aims to define a new access method that canefficiently answer time slice, time interval and event-basedqueriers.The organization of this article is as follows. In Sec-tion 2 the advantages and disadvantages of the main spatio-temporal access methods are discussed. Section 3 describesthe proposed access method, considering data structures andalgorithms for query processing and updates. In Section 4an experimental evaluation is presented. Section 5 presentssome variants of SEST-Index. Finally, Section 6 presentsconclusions and future work. 2. SPATIO-TEMPORALACCESSMETHODS This section describes the main spatio-temporal accessmethods available for applications of category 2 (Section 1).Figure 1 shows an example of the evolution of a set of spatio-temporal objects in different instants of time. Forsimplicity, an example in a two-dimensional space is con-sidered. In Figure 1, the axes  x  and  y  represent the two-dimensional space while  t  corresponds to the temporal di-mension. In instant of time  t 1 , objects  O 1  and  O 2  are in-serted. In instant  t 2 , object  O 3  is inserted while object  O 1 moves and  O 2  changes its shape. In instant  t 5 , object  O 1 moves again and object  O 2  changes its form until it com-pletely disappears. A  time slice   query is shown in Figure1. This query is expressed in the following way:  find objects that appear in the rectangle   q   at instant   t 3 . Figure 1: An example of the evolution of spatio-temporal objects According to [12] and [5], it is possible to classify thespatio-temporal access methods into the following three cat-egories: •  Methods that treat time as another dimension. •  Methods that incorporate the temporal informationwithin the node structure without considering time asanother dimension. •  Methods based on overlapping and multiversion of thestructure. In this case, the temporal dimension is sep-arated from the spatial dimension.The 3D R-tree [14] considers time as another axis alongwith the spatial coordinates. Using this approach, an ob- ject that initially remains at ( x i ,y i ) during time interval[ t i ,t j ) and then at ( x j ,y j ) during time interval [ t j ,t k ) can bemodeled by two line segments in a three-dimensional space[( x i ,y i ,t i ) , ( x i ,y i ,t j )) and [( x j ,y j ,t j ) , ( x j ,y j ,t k )), which canbe indexed by a 3D R-tree. This idea works well if all thefinal limits of the time intervals are known in advance. Adisadvantage of this approach is the inefficiency of process-ing time slice queries. Advantages of this approach are theefficiency in the use of space and the efficiency in processingtime interval queries.  RT-tree [17] is a structure belonging to the second cate-gory. In this structure, the temporal information is kept inthe nodes of the R-tree. This is an extension of the infor-mation content that an R-tree normally has. The tempo-ral information plays a secondary role because the search isguided by the spatial information. In this way, queries withtemporal conditions cannot be efficiently processed [6].HR-tree [7, 6] and MR-tree [17] belong to the third cat-egory. Both are based on the concept of overlapping. Thebasic idea is that, given two trees, the most recent tree cor-responds to an evolution of the older tree. HR-tree is one of the most studied methods for which evaluations have beenmade and for which variants exist. The major advantage of the HR-tree is its efficiency in processing  time slice   queries.The major disadvantage is the excessive space that it re-quires to store the structure. For example, if only an objectof each leaf node moves in instant  t i , the tree is completelyduplicated at time instant  t i +1 . MV3R-tree [11] also be-longs to the third category, using a multiversion approach.MV3R-tree uses two structures: a MVR-tree (Multi-versionR-tree) [11] for processing timeslice queries (where HR-treehas advantage) and an auxiliary 3D R-tree for processinglong interval queries (where 3D R-tree has advantage). 3. PROPOSED METHOD: SEST-INDEX The idea behind our method consists in maintaining thesnapshots of the database for certain instants of time and alog to store the events occurred between consecutive snap-shots. When an object undergoes a change in its spatial at-tribute at a given time instant, it generates a change event.The log is store in time-order and allows us to reconstructwhatever the state of the database was between two consec-utive snapshots (see Figure 2). The proposed access methodconsiders that, for each snapshot, the spatial components of the live objects in the database are stored in an R-tree datastructure. As it was explained previously, changes are main-tained in the log. For example, in Figure 2 the state of theobjects in the snapshot  t 0  are stored in  R 0 , and the eventsthat modify the geometry of objects in the temporal interval( t 0 ,t i ), are stored in log  L 0 . Thus, to recover the state of the database at an instant  t  with  t 0  < t < t i , we start fromthe R-tree in instant  t 0  and update objects’ attributes (i.e.,location) with the information of log  L 0 Figure 2: General outline of SEST-Index 3.1 Structure description The structure of the R-tree is the same as the proposal in[1], and the data structure of the log is a linked list of blocks.The entries in the blocks are tuples with the following struc-ture:  < t,Geometry,Oid,Op > , where  t  corresponds to thetime in which the modification (for example the insertionof a new object) happened, and  Oid  is the object identifier. Geometry  corresponds to the values in the spatial compo-nent of the object, which depends on the type of spatial ob- jects that need to be stored. For example, if the objects arepoints in a two-dimensional space, then  Geometry  will be apair of coordinates ( x,y ). If the objects are polygons, then Geometry  will be a set of points that defines the polygon orits approximation, for example, the polygon’s MBR (Mini-mum Bounding Rectangle). Finally,  Op  indicates the typeof operation (i.e., type of event or type of modification). Forthis work,we consider just two types of operations:  move in (an object moves to a new position) and  move out  (an ob- ject leaves its current position). Thus an object creation ismodeled as a  move in , an object deletion as a  move out , andan object changing its position or its shape as a  move out followed by a  move in . The entries in the log are orderedby the attribute  t .The whole index is a data structure  A , which is a se-quence of snapshots  A i  such that  A i .t  is the correspondingtime instant of the  i -th snapshot,  A i .R  is the R-tree forsuch snapshot, and  A i .L  is the log of events between timeinstants  A i .t  and  A i +1 .t . A parameter  d  defines the maxi-mum allowed distance (in disk blocks or changes) betweenconsecutive snapshots. 3.2 Operations 3.2.1 Time slice queries To process a time slice query, the first step is to find thecorresponding snapshot according to the time instant  t  thatis specified in the query. This snapshot corresponds to thelatest time instant within [0 ,t ] when a snapshot has beenstored. Using this snapshot, a spatial search is done usingthe method defined in [1] for an R-tree. The set of objectsobtained from this query is updated with the entries in thecorresponding log (Figure 3).1:  TimeSliceQuery(Rectangle  q , Time  t ) 2:  { R  will be a set that stores the objects of the answer } 3:  Find the last entry  i  in  A  such that  A i .t  ≤  t 4:  R  = SearchRtree( A i .R,q )  { All the objects in  A i .R  that in-tersect  q  at instant  A i .t  are retrieved } 5:  for  each entry  e  ∈  A i .L  so that  e.t  ≤  t  do 6:  if   e.Geometry Intersect ( q )  then 7:  if   e.Op  =  Move in  then 8:  R  =  R ∪{ e.Oid } 9:  else 10:  R  =  R −{ e.Oid } 11:  end if  12:  end if  13:  end for 14:  return  R Figure 3: Algorithm to process a  time slice   query 3.2.2 Time interval query Similar to the process for time slice queries, objects fromthe R-tree are retrieved by starting at a previous referencepoint that is closest to the starting instant  t 1  of the timeinterval of the query. Then, the process continues with all  entries in the log whose time instant is less than or equal tothe upper bound  t 2  of the time interval of the query (Figure4).1:  IntervalQuery(Rectangle  q , Time  t 1 , Time  t 2 ) 2:  G  =  ∅ { G  is a set that stores objects of the answer } 3:  Search last entry  i  in  A  such that  A i .t  ≤  t 1 4:  R  = SearchRtree( A i .R,q )  { All objects of   A i .R  that intersect q  in the instant  A i .t  are retrieved } 5:  L  =  A i .L 6:  k  =  i 7:  Update  R  with the changes in the log  L  which are found inthe interval [ t,t 1 ] (just like a TimeSliceQuery) 8:  G  =  R 9:  if   all the entries in log  L  were processed  then 10:  k  =  k  + 1 11:  L  =  A k .L 12:  end if  13:  Let  t s  be the next instant after  t 1  stored in Log  L 14:  while  t s  ≤  t 2 ∧  entries remain to be processed in  L  do 15:  Update  R  with the changes in Log  L  ocurred in  t s 16:  G  =  G ∪ R 17:  if   all the entries in log  L  were processed  then 18:  k  =  k  + 1 19:  L  =  A k .L 20:  end if  21:  Let  t s  be the next instant stored in Log  L 22:  end while 23:  return  G Figure 4: Algorithm to process an  Interval   typequery 3.2.3 Queries about events With SEST-Index it is possible to process queries aboutevents. For example, given an area  q   and a time  t , find howmany objects  move in  the area  q   and how many objects move out  the area  q   at time instant  t . A simple implemen-tation of this type of queries consists in using an array  B to locate the log block  Bid  containing the events occurredat time  t . Each entry of the array  B  occupies a small spaceand, therefore, it is possible to keep various entries in a diskblock. For example, for blocks of size 1024 bytes, it is possi-ble to store approximately 128 entries. Notice that this im-plementation doesn’t need to access an R-tree for processinga query about events. Figure 5 describes the algorithm. 3.2.4 Updating the structure This method aims to update the index with the changesover objects occurred in a particular time instant. Assumewe have a list with all changes that have occurred at a timeinstant and with all objects that can be found ”live” in thedatabase just before this time instant. What the algorithmdoes is to update the ”live” objects at the new time instantthat is being inserted in the database. It then verifies if theparameter  d  satisfies the threshold condition, that is, if thesize (in blocks) defined for the logs is reached by the newchanges. In such case, a new snapshot is created, whichimplies creating a new R-tree and a new entry in  A . If   d  hasnot been reached, the changes are simply stored in the log.Figure 6 describes this algorithm. 4. EXPERIMENTAL EVALUATION With the aim of evaluating the performance of SEST-Index, it is compared with HR-tree in terms of disk usage1:  EventQuery(Rectangle  q , Time  t , Array  B ) 2:  Search the entry  i  in array  B  such that  B i .t  =  t 3:  L  =  B i .Bid 4:  terminated  =  false 5:  oi  = 0  { quantity of objects that entered  q  at time  t  } 6:  os  = 0  { quantity of objects that left  q  at time  t  } 7:  while  not terminated  do 8:  for  each entry  e  ∈  L  so that  e.t  ≤  t  do 9:  if   e.t  =  t  ∧  e.Geometry Intersect ( q )  then 10:  if   e.Op  =  Move in  then 11:  oi  =  oi  + 1 12:  else 13:  os  =  os  + 1 14:  end if  15:  end if  16:  end for 17:  if   e.t > t  then 18:  terminated  =  true 19:  else 20:  L  = next log block 21:  end if  22:  end while 23:  return  ( oi,os ) Figure 5: Algorithm to process queries about  events 1:  InsertChanges(Time  t , Changes  c , SnapShot  F  )  { t  isthe instant of time in which the changes happen,  c  is a listwith the changes occurred at  t  and  F   corresponds to the ”live”elements in the instant immediately preceding  t } 2:  Let  i  be the last entry of   A . 3:  Update  F   with the changes of   c 4:  Insert the elements of   c  at the end of log  A i .L 5:  if   the total changes in  c  plus the stored changes in the log A i .L  is greater than  d  then 6:  { Create a new snapshot in SEST-Index }  A i +1 .t  =  tA i +1 .R  = R-tree with the live objects in  F A i +1 .L  =  ∅ 7:  end if  Figure 6: Algorithm to update the structure  Figure 7: Space usage with 3000 objects and number of blocks read for the different types of queriesdescribed in the previous section (time slice, time intervaland event-based queries). In the absence of real data, the ex-periment uses data obtained with the spatio-temporal datagenerator GSTD[13]. Sets with 1000, 3000 and 5000 points(objects), respectively, were created. These points are dis-tributed uniformly within a region in time instant 0.0. Sub-sequently, they are moved randomly during the next 50 timeinstants until reaching time instant 1.0. Thus, in this eval-uation, we consider change events related to the movementof objects. Four percentages of object mobility (change fre-quency) were considered: 1, 3, 5 and 7 percent per instant of time. Three values for the parameter  d  were also studied: 4,8, and 12 disk blocks with a block size of 1024 bytes. Disk us-age was measures by the number of blocks used by the datastructures after inserting the objects and their changes. Ac-cess time was defined as the average number of blocks readfor performing 100 random queries.The tests were performed on a Pentium 4 computer with1.6 Ghz, 1 Gb of RAM and Linux Operating System. 4.1 Space usage Figure 7 shows that SEST-Index uses less disk space thanHR-tree, and this behavior is more pronounced as the num-ber of objects or the percentage of changes decreases. An ex-treme case in wich SEST-Index utilizes more disk space thanHR-tree occurs when the selected  d  is so low that SEST-Index is only able to store the changes occurred in a singletime instant in each log. In such a case, a new snapshot foreach time instant is created and the structure degeneratesinto many individual R-trees.A simple idea for reducing even more the space used bySEST-Index is to consider the basic HR-tree strategy, thatis, at the instant of creating a new R-tree in a snapshot,reuse part of the R-tree of the previous snapshot. This ideawas evaluated, but little space was saved (around 5%). 4.2 Time slice queries 100 queries were performed with rectangles (query win-dows) formed with 5% and 10% of the dimension in eachaxis. The results shown for 3000 objects are similar to theresults obtained for 1000 and 5000 objects.We can observe in Figures 8 and 9 that SEST-Index readsmore blocks than HR-tree in this type of query. For mostcases, SEST-Index needs, on average, twice of the number of disk accesses required by HR-tree. SEST-Index always takeslonger to answer to a time slice query than HR-tree, since itdoes not only read an R-tree but also the corresponding log.Nevertheless, as the parameter  d  gets smaller, the disadvan-tages of SEST-Index decrease because small logs imply lessblocks read in order to answer a query. Figure 8: Blocks read by a time slice query witha query window formed by 5% of the dimension ineach axisFigure 9: Blocks read by a time slice query with aquery window formed by 10% of the dimension ineach axis 4.3 Time interval queries As for time slice queries, the evaluation of time intervalqueries uses 100 random query windows formed with 5% and10% of the dimension in each axis. Only results for 3000objects are shown, since similar conclusions were reachedfor 1000 and 5000 objects.We can observe in Figures 10, 11, 12 and 13 that SEST-Index reads less blocks than HR-tree when the change fre-quency is low (less than 5%). As the query interval grows,the advantage of SEST-Index increases. This advantage isdue to the fact that SEST-Index only reads one R-tree andsequentially a log. HR-tree, in contrast, must read an R-treefor each time instant within the query interval. 4.4 Queries about events In the same way of previous evaluations, 100 queries wereperformed with rectangles (query windows) formed with 5%and 10% of the dimension in each axis. The results shownfor 3000 objects are similar than those obtained for 1000 and5000 objects. The queries about events were processed usingthe algorithm in Figure 5. In this algorithm, the number of blocks that were read depends only on the percentage of change frequency and not on the parameter  d  or the size of the query window. This can be clearly seen in Figure 14.
Recommended
View more...
We Need Your Support
Thank you for visiting our website and your interest in our free products and services. We are nonprofit website to share and download documents. To the running of this website, we need your help to support us.

Thanks to everyone for your continued support.

No, Thanks
SAVE OUR EARTH

We need your sign to support Project to invent "SMART AND CONTROLLABLE REFLECTIVE BALLOONS" to cover the Sun and Save Our Earth.

More details...

Sign Now!

We are very appreciated for your Prompt Action!

x