Members of the OGSA-DAI team collaborated closely with members of the VOTES project team at NeSC, University of Glasgow.
Two members of the OGSA-DAI team travelled to Glasgow to meet with members of several e-Science projects. Discussions with the VOTES team identified functionality that was missing from OGSA-DAI and that would be of use to the VOTES project and, more generally, to many other users of OGSA-DAI. The OGSA-DAI team assigned staff to work on this functionality and to liaise with the VOTES team to ensure it met their requirements. The collaboration started on 17 September 2007 and lasted for six weeks.
The VOTES project had a scenario that required access to information stored in relational databases, and required that this information be coordinated and integrated across multiple databases: in essence, a cross-database join.
The goal of the interaction was the development of OGSA-DAI 3.0 activities for performing the following tasks:
The developed activities were:
The TupleMergeJoin activity joins two independently produced tuple streams. These streams can come from relational resources or from any other tuple-producing activity.
The SQLNestedInClauseQuery and SQLNestedInClauseJoin activities are designed to use the nested IN clause feature of SQL to produce a more efficient operation. These activities can be used when the tuples extracted from one data resource can be used to identify the tuples to extract from the other data resource.
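The general merge-join technique behind an activity of this kind can be sketched as follows. This is a minimal illustration in Python, not the OGSA-DAI implementation: the function name `merge_join` and its key-extractor parameters are hypothetical, and the sketch assumes that both streams arrive sorted in ascending order on the join key.

```python
from typing import Any, Callable, Iterable, Iterator, Tuple

Row = Tuple[Any, ...]


def merge_join(
    left: Iterable[Row],
    right: Iterable[Row],
    left_key: Callable[[Row], Any],
    right_key: Callable[[Row], Any],
) -> Iterator[Row]:
    """Join two tuple streams, each sorted ascending on its join key.

    Hypothetical helper for illustration only. Each stream is read once,
    so the inputs can be generators rather than in-memory lists.
    """
    left_iter, right_iter = iter(left), iter(right)
    l = next(left_iter, None)
    r = next(right_iter, None)
    while l is not None and r is not None:
        lk, rk = left_key(l), right_key(r)
        if lk < rk:
            l = next(left_iter, None)    # left key too small: advance left
        elif lk > rk:
            r = next(right_iter, None)   # right key too small: advance right
        else:
            # Buffer every right tuple that shares this key ...
            group = []
            while r is not None and right_key(r) == lk:
                group.append(r)
                r = next(right_iter, None)
            # ... and pair the buffer with every left tuple that shares it.
            while l is not None and left_key(l) == lk:
                for g in group:
                    yield l + g
                l = next(left_iter, None)
```

For example, joining `[(1, "Ann"), (2, "Bob")]` with `[(2, "x"), (3, "z")]` on the first column of each stream yields the single tuple `(2, "Bob", 2, "x")`. Only the tuples sharing the current key are buffered, which is why this approach suits streamed data.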
If the desired output is simply a set of tuples from the second data resource then the SQLNestedInClauseQuery activity can be used in conjunction with the SQLQuery activity to achieve the desired result.
If the desired output is a set of tuples obtained by joining tuples from the first data resource with tuples obtained from the second data resource then the SQLNestedInClauseJoin activity can be used in conjunction with the SQLQuery activity to achieve the desired result.
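The two usage patterns described above can be illustrated with a small sketch. It uses Python's standard `sqlite3` module with two in-memory databases standing in for the two data resources; it is not the OGSA-DAI activity API. The function names, the `{in_list}` placeholder convention, the table and column names, and the assumption that the join key is the first column of each result set are all hypothetical.

```python
import sqlite3


def nested_in_clause_query(first_conn, first_sql, second_conn, second_sql_template):
    """Return tuples from the second resource only.

    The keys come from the first column of the first query's result and are
    bound into the '{in_list}' placeholder of the second query's IN clause.
    """
    keys = [row[0] for row in first_conn.execute(first_sql)]
    if not keys:
        return []
    placeholders = ", ".join("?" for _ in keys)
    sql = second_sql_template.format(in_list=placeholders)
    return second_conn.execute(sql, keys).fetchall()


def nested_in_clause_join(first_conn, first_sql, second_conn, second_sql_template):
    """Return first-resource tuples joined with matching second-resource tuples.

    Assumes the join key is the first column of both result sets.
    """
    by_key = {}
    for row in first_conn.execute(first_sql):
        by_key.setdefault(row[0], []).append(row)
    if not by_key:
        return []
    placeholders = ", ".join("?" for _ in by_key)
    sql = second_sql_template.format(in_list=placeholders)
    second_rows = second_conn.execute(sql, list(by_key)).fetchall()
    return [f + s for s in second_rows for f in by_key[s[0]]]


def make_demo_databases():
    """Build two independent in-memory databases with hypothetical data."""
    first = sqlite3.connect(":memory:")
    first.execute("CREATE TABLE patients (id INTEGER, name TEXT, region TEXT)")
    first.executemany(
        "INSERT INTO patients VALUES (?, ?, ?)",
        [(1, "Ann", "north"), (2, "Bob", "south"), (3, "Cy", "north")],
    )
    second = sqlite3.connect(":memory:")
    second.execute("CREATE TABLE results (patient_id INTEGER, result TEXT)")
    second.executemany(
        "INSERT INTO results VALUES (?, ?)",
        [(1, "a"), (2, "b"), (3, "c"), (3, "d"), (5, "e")],
    )
    return first, second


FIRST_SQL = "SELECT id, name FROM patients WHERE region = 'north' ORDER BY id"
SECOND_SQL = (
    "SELECT patient_id, result FROM results "
    "WHERE patient_id IN ({in_list}) ORDER BY patient_id, result"
)
```

With the demo data, `nested_in_clause_query` returns only the `results` rows for patients 1 and 3, while `nested_in_clause_join` returns those rows prefixed with the matching `patients` columns. The second database only has to return rows whose keys were found in the first, rather than its whole table. A very long key list would probably need to be split into batches, since databases limit the size of a single statement.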
The scenario and functions required were well-defined initially and gave a good foundation for the activities. From the scenario, the activities were specified and then implementation was carried out by the OGSA-DAI developers.
Although well defined, the activities could be implemented in several ways, and evaluating the different implementation options proved to be one of the most time-consuming tasks. This process highlighted some issues with the current sub-workflow API.
The activities were tested on both Axis/OGSA-DAI and GT/OGSA-DAI against both indexed and non-indexed databases. The activities were tested for functionality and correctness. Subsequent tests were concerned with performance issues, including how database indexing affected the performance. These performance tests are to be revisited in the future.
The testing found a potential deadlock situation in workflows using the SQLNestedInClauseQuery activity. Further information can be found in the documentation for the activities.
For distribution, the activities have been packaged into the OGSA-DAI 3.0 Extension Pack 1 release.
The work of developing the documentation and packaging the extension pack was shared among members of the OGSA-DAI team. From this process, a method for releasing further extension packs has been developed.
Another possible extension, raised by this project and subsequent projects, is an SQLQuery-like activity for accessing web services.
Alistair Grant, OGSA-DAI development team, EPCC:
Anthony Stell, VOTES developer, University of Glasgow: