Chapter 5. Usage statistics

5.1. Usage statistics
5.1.1. Why are we doing this?
5.1.2. Opt-out
5.1.3. How to opt-out
5.1.4. What is sent?
5.1.5. How is the data sent?
5.1.6. When is the data sent?
5.1.7. What will the data be used for?
5.1.8. Data protection policy
5.1.9. Feedback
5.1.10. Updates to policy
5.1.11. Accreditation

5.1. Usage statistics

OGSA-DAI contains statistics collection functionality. This sends statistics from your host to an OGSA-DAI server at test.ogsadai.org.uk. This functionality is enabled by default. This page outlines the information we collect, why we collect it and how to opt out of the collection process.

5.1.1. Why are we doing this?

The OGSA-DAI project receives government funding and must demonstrate that the e-Science community is taking up and using OGSA-DAI and that OGSA-DAI is making a valuable contribution to projects world-wide.

To this end, we use support provided by the Globus Toolkit that allows OGSA-DAI installations to send us generic usage statistics. This data is as generic as possible (see Section 5.1.4, “What is sent?”). By participating in this, you help our funders to justify continuing their support for the software on which you rely.

5.1.2. Opt-out

We are using opt-out rather than opt-in. The reason is that we need this data: it is a requirement for funding. We are sure our fellow users would be willing to help show that Grid Computing works and is in use. Realistically, however, we know that if setting up usage statistics reporting required any additional effort, far fewer users would actually report the data. To be effective, reporting must require zero additional effort. By not opting out, and allowing these statistics to be reported back, you are explicitly supporting the further development of OGSA-DAI.

5.1.3. How to opt-out

If you must opt out of usage reporting, perform the steps described below.

5.1.3.1. How to opt-out - OGSA-DAI GT

  1. Edit the OGSA-DAI GT JNDI configuration file.
    • If you have not yet deployed OGSA-DAI, edit the file:
      deploy/jndi-config.xml

    • If you have deployed OGSA-DAI GT onto Tomcat, edit the file:
      TOMCAT/webapps/wsrf/WEB-INF/etc/dai/jndi-config.xml

    • If you have deployed OGSA-DAI GT onto a Globus Toolkit standalone container, edit the file:
      GT/etc/dai/jndi-config.xml

    • These paths assume that the OGSA-DAI administrator used the default OGSA-DAI GAR name of dai.
  2. Comment out, or remove, the environment element of the form:
    <environment name="ogsadai/misc/uk.org.ogsadai.USAGE_MONITOR_URL" 
                 value="test.ogsadai.org.uk:4810"
                 type="java.lang.String"/>
    
    Do not worry if the URL differs from the one shown above. What matters is that you comment out or remove the entry whose name is ogsadai/misc/uk.org.ogsadai.USAGE_MONITOR_URL.
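
If you manage several installations, step 2 can be scripted. The following is a minimal sketch in Python, not part of OGSA-DAI: the function name comment_out_usage_monitor is hypothetical, and the sketch assumes the entry is written as a single self-closing environment element, as in the form shown above. It works on the text of the file rather than re-serialising the XML, so the rest of the configuration is left untouched.

```python
import re

# Name of the JNDI entry that enables usage reporting (taken from step 2).
USAGE_MONITOR_NAME = "ogsadai/misc/uk.org.ogsadai.USAGE_MONITOR_URL"

# Existing XML comments; the capturing group keeps them in the split result.
_COMMENT = re.compile(r"(<!--.*?-->)", re.DOTALL)

# A self-closing <environment .../> element carrying the usage monitor name.
# [^>]* also matches newlines, so multi-line elements are handled.
_ELEMENT = re.compile(
    r"<environment\b[^>]*\bname=\""
    + re.escape(USAGE_MONITOR_NAME)
    + r"\"[^>]*/>"
)


def comment_out_usage_monitor(xml_text: str) -> str:
    """Return xml_text with the usage monitor entry wrapped in an XML comment.

    Hypothetical helper: entries that are already commented out are skipped,
    so running it twice gives the same result as running it once.
    """
    parts = _COMMENT.split(xml_text)
    # Even indices are text outside comments; odd indices are comments.
    for i in range(0, len(parts), 2):
        parts[i] = _ELEMENT.sub(
            lambda m: "<!-- " + m.group(0) + " -->", parts[i]
        )
    return "".join(parts)
```

To use it, read the jndi-config.xml file for your deployment, pass its contents to the function, and write the result back. Keep a backup copy of the original file, and restart the container afterwards if your deployment requires a restart to pick up configuration changes.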

5.1.4. What is sent?

The following information is sent:

  • Component identifier.
  • Usage data format identifier.
  • Time stamp.
  • Source IP address.
  • Source hostname (to differentiate between hosts with identical private IP addresses).
  • A list of the names of activities in OGSA-DAI workflows submitted to an OGSA-DAI server. Only the activity names are sent, not their content or arguments. For example, database queries passed as activity arguments are not sent.

5.1.5. How is the data sent?

Each message is sent as a single unencrypted UDP packet. UDP is connectionless and unacknowledged, so we may lose some data, but the sender never waits for a reply from our server. This drastically reduces the possibility that usage statistics reporting can adversely affect the operation of the software.
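
The "send and forget" behaviour can be illustrated with a short Python sketch. This is not the OGSA-DAI or Globus Toolkit implementation: the function name send_usage_datagram is hypothetical, and the payload is treated as opaque bytes because the packet layout is not described on this page.

```python
import socket


def send_usage_datagram(payload: bytes, host: str, port: int) -> bool:
    """Best-effort send of one UDP datagram (hypothetical helper).

    No connection is set up and no reply is awaited. Any network error is
    swallowed so that a reporting failure cannot disturb the caller.
    """
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(payload, (host, port))
        return True
    except OSError:
        # Covers unresolvable hosts, unreachable networks and similar errors.
        return False
```

A return value of True only means that the datagram was handed to the local network stack. It does not mean that the packet arrived, which is the trade-off described above.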

5.1.6. When is the data sent?

Data is sent when:

  • An OGSA-DAI server is started.
  • An OGSA-DAI server is contacted.
  • A workflow is submitted to an OGSA-DAI server.

5.1.7. What will the data be used for?

The OGSA-DAI project can gain an idea as to the number of downloads of the OGSA-DAI distribution from our project WWW site. This does not however tell us who is actually using OGSA-DAI. The usage statistics tell us just that - how many users have deployed and are actively using OGSA-DAI.

Collecting activity names gives us an idea of which OGSA-DAI functionality our users use. The number of tasks that the OGSA-DAI team is asked to undertake far outweighs the effort available, so information on the functionality in use allows us to determine where best to focus our development effort. It also allows us to focus our testing and tutorial development.

Our intent is that the data that we get is generic enough that we do not compromise privacy. We record the IP address only for counting purposes, to know how many sites there are, but we will not produce site-specific statistics, i.e. the data will not be used to produce statements such as "IP 123.456.789.012 submitted 42 workflows last month". The raw data collected will not be made available to anyone outwith the OGSA-DAI project.

Generalised information and statistics about the use of OGSA-DAI software based on the data collected may be made available periodically to our funders and partners, e.g. the number of OGSA-DAI services active in the last month. Individual users or machines will not be identified. This information may also be placed on our public web pages.
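
As an illustration of the kind of aggregate figure described above, the following Python sketch counts distinct sites seen within a time window and returns only the total. It is a sketch under stated assumptions, not our actual processing: the function name count_active_sites, the record layout of (timestamp, IP address, hostname) tuples and the 30-day window are all hypothetical.

```python
from datetime import datetime, timedelta


def count_active_sites(records, now, window=timedelta(days=30)):
    """Count distinct (IP address, hostname) pairs seen within the window.

    records is an iterable of (timestamp, ip, hostname) tuples (assumed
    layout). Only the total is returned; no per-site figures are produced.
    """
    cutoff = now - window
    sites = {(ip, host) for ts, ip, host in records if ts >= cutoff}
    return len(sites)
```

The hostname is included in the key for the reason given in Section 5.1.4: hosts at different sites may share identical private IP addresses.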

5.1.8. Data protection policy

Under the United Kingdom Data Protection Act 1998, we have a legal duty to protect any information we collect from you. We comply with the University of Edinburgh Data Protection Policy[1] with regard to the handling and storage of personal data. We do not share or disclose any personal information collected by the project with any third party without permission, and we do not sell or rent it. However, we may disclose information when legally compelled to do so, in other words, when we, in good faith, believe that the law requires it or for the protection of our legal rights. As a UK Higher Education Institution, we are committed to publishing certain information under the Freedom of Information (Scotland) Act 2002 (please see the University's publication scheme for more details). However, we consider the information collected as part of the usage statistics as exempt from this publication scheme under the Data Protection Act 1998.

5.1.9. Feedback

Feedback from our user communities will be useful in determining how we take usage statistics collection forward. If you have concerns or objections, we ask that you be specific in your feedback. For example, "Our site has a policy against sending such data" is good information for us to know in the future. A link to such a policy would be even better.

If you have any feedback about the usage statistics collection, or have questions about the gathering or storage of your data, please send email to info@ogsadai.org.uk or write to us:


c/o OGSA-DAI Project,
EPCC,
The University of Edinburgh,
James Clerk Maxwell Building,
Mayfield Road,
Edinburgh,
EH9 3JZ,
UNITED KINGDOM.

5.1.10. Updates to policy

If this information about usage statistics collection changes in any way, we will place an updated version on our website.

This policy was last updated on 10th August 2007.

5.1.11. Accreditation

This page is an OGSA-DAI-specific customisation of the Globus Toolkit's page on usage statistics at: http://www.globus.org/toolkit/docs/4.0/Usage_Stats.html