Towards Querying and Visualization of Large Spatio-Temporal Databases

Sharma, Sugam
Computer Science

In any database model, data analysis can be eased by extracting a smaller set of the data of interest, called a subset, from the much larger original dataset. A subset improves system performance because further analysis no longer has to iterate through the huge parent data. A subset, its specification, or the formal process for its extraction can be complex. In the database community, subsets are extracted through SQL-like queries; in the Geographic Information System (GIS) community, they are extracted through visualization. Both are iterative processes. An SQL query can be a composition of subqueries, and each subquery can be seen as an iterative step toward the extraction of the desired subset. For this to work, subqueries should result in relations that have the same structure as the relations in the given data model. Although it may not be immediately obvious, visualization can be iterative too. Each community works in its own compartment: its subprocesses are either only subqueries or only visual interactions. Mixing these two kinds of subprocesses would put greater expressive power in the hands of users.
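
The role of subqueries as iterative steps can be illustrated with classical SQL. The following is a minimal sketch using Python's standard `sqlite3` module; the `rainfall` table, its columns, the sample rows, and the function name `heavy_rain_counties` are hypothetical and chosen only for illustration. The inner subquery produces a relation, which is why the outer query can consume it as the next step.

```python
import sqlite3

# Hypothetical sample data: (county, year, rainfall in mm).
ROWS = [
    ("Story", 1999, 900.0),
    ("Story", 2001, 800.0),
    ("Story", 2002, 1000.0),
    ("Polk", 2001, 600.0),
    ("Polk", 2003, 700.0),
]


def heavy_rain_counties(rows, threshold):
    """Return (county, average mm) for counties whose average rainfall
    since 2000 exceeds the threshold, ordered by county name."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE rainfall (county TEXT, year INTEGER, mm REAL)")
    conn.executemany("INSERT INTO rainfall VALUES (?, ?, ?)", rows)
    # Step 1 (inner subquery): extract a smaller subset; it is still a relation.
    # Step 2 (outer query): aggregate over that subset instead of the full table.
    query = """
        SELECT county, AVG(mm) AS avg_mm
        FROM (SELECT county, year, mm FROM rainfall WHERE year >= 2000) AS recent
        GROUP BY county
        HAVING AVG(mm) > ?
        ORDER BY county
    """
    result = conn.execute(query, (threshold,)).fetchall()
    conn.close()
    return result
```

Because the subquery result has the same structure as any other relation, further steps could be stacked on top of it in the same way.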

The Parametric Data Model is well known for handling multidimensional parametric data, such as spatial, temporal, or spatio-temporal data. In the parametric approach, an object is modeled as a single tuple, creating a one-to-one correspondence between an object in the real world and a tuple in the database. The parametric approach relies on its own query language, ParaSQL, which mimics classical SQL but is richer. It is also simpler and avoids self-join operations, and hence enhances performance. Attribute values are defined as functions, which also allows large values. In the existing prototype of the Parametric Data Model, executing a query produces output as a stream of raw text that cannot be queried further. This is unlike classical databases, where a subset provides additional strength to the system; the prototype lacks this functionality. The real power of ParaSQL lies in the where clause, yet previous versions of the prototype had only a very simple implementation of it. This research work expands that implementation to harness its hidden potential. Exploratory visual analysis is an important aspect of any spatio-temporal database system because it supports preliminary investigation, but previous versions of the prototype completely lacked visualization functionality.
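
The idea of one tuple per object, with attribute values defined as functions, can be sketched in a few lines of Python. This is an illustrative assumption, not the prototype's implementation and not ParaSQL syntax: the class `ParamTuple`, its methods, the employee data, and the choice of a purely temporal domain with half-open intervals are all hypothetical. Each attribute is stored as a list of `(start, end, value)` pieces, and `restrict` returns an object of the same structure, so its result can be restricted again.

```python
from dataclasses import dataclass


@dataclass
class ParamTuple:
    """One real-world object as one tuple (hypothetical sketch).

    attrs maps an attribute name to a list of (start, end, value) pieces,
    each describing the value over the half-open interval [start, end).
    """
    key: str
    attrs: dict

    def value_at(self, attr, t):
        """Evaluate the attribute function at time t (None if undefined)."""
        for start, end, value in self.attrs[attr]:
            if start <= t < end:
                return value
        return None

    def restrict(self, attr, predicate):
        """Keep only the times at which predicate holds for attr.

        Every attribute is clipped to that time domain, and the result is
        again a ParamTuple, so it can be queried further.
        """
        keep = [(s, e) for s, e, v in self.attrs[attr] if predicate(v)]
        new_attrs = {}
        for name, pieces in self.attrs.items():
            clipped = []
            for s, e, v in pieces:
                for ks, ke in keep:
                    lo, hi = max(s, ks), min(e, ke)
                    if lo < hi:  # non-empty overlap
                        clipped.append((lo, hi, v))
            new_attrs[name] = clipped
        return ParamTuple(self.key, new_attrs)


# Hypothetical object: the whole history lives in a single tuple.
john = ParamTuple(
    "John",
    {
        "salary": [(0, 10, 50), (10, 20, 70)],
        "dept": [(0, 15, "Toys"), (15, 20, "Shoes")],
    },
)
high_paid = john.restrict("salary", lambda v: v >= 60)
```

In this sketch the department history during the high-salary period is read from the same tuple, with no join of the data against itself, and `high_paid` can be passed to `restrict` again.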

This work ensures that the output of a ParaSQL query (possibly a subset) is a relation with the same format as the relations in the model, rather than plain text. It also expands the power of the where clause, aiming for clean logic and a more generic design. Some important basic steps are taken to introduce visualization in a way that is conducive to the structures of the Parametric Data Model. The richness of GIS visualization serves as the foundation for the visual functionality of the Parametric Data Model: the query is executed on the parametric side, while the results are visualized on the GIS side. This integration equips the Parametric Data Model with visualization functionality. GIS visualization also offers click-based selection of a subset and its persistence, and the persisted subset can later be consumed by the Parametric Data Model. This research work thus establishes two-way communication between the two communities, Parametric Data Model and GIS, in which the output of one can serve as the input for the other, and is an attempt to bring them together.

Data, GIS, Model, Parametric, Query, Visualization