Data Retention Policy

Hi all,

I would like to start a discussion about a data retention policy for the Flukso website. Looking at energy consumption data, it is possible to infer quite a lot about someone's life. I blogged about it here:

http://gonium.net/md/2009/11/10/power-metering-with-flukso/

Although I don't think that Bart will sell our energy consumption data, I think it is important to have a policy in place for this. In a private email discussion with Bart I came up with the following requirements:

(1) I would like to be able to delete my data at any time.
(2) The data must not be shared without my express consent. In the case of the community, you could simply say that the graphs can be shared by default. But access to the logged raw values should be restricted. In the same way, I would like to be able to switch the community sharing off (opt-out).
(3) I would like to be notified if the policy changes.

Don't get me wrong - I think it is useful to share the energy consumption data and see how other people are doing. But I want to be able to opt-out. In addition, it would be nice to have an API to get the data out of the Flukso website - I know that Bart has some thoughts about it.

What do you think?
-Mathias

geertvanbommel's picture

You are right in the sense that there is no policy at the moment. It would be great if the community had a say in this in order to be comfortable with the direction of this project.

Personally, the policy doesn't bother me so much. I would have more privacy concerns if my energy provider installed such a meter. In the end, I'm still able to pull the plug (literally) with this installation.

Perhaps the details of the graphs should be limited to trusted members somehow (e.g. limit access to month/year only), but that would be more for security reasons.

Geert

paci's picture

I agree that such a policy should be implemented.

I was surprised when I connected for the first time that I did not
find an option to control whether my data is public or not.
[I'd go with public, but that's not the point]

At least some basic option to control data retention policy
would be appreciated for sure.

On a side note, I would also love to be able to download my
data in a CSV format (or similar) to do my own data mining.

Thanks,
/paci - a flukso fan

icarus75's picture

I've been giving the data retention and privacy topic some thought based on your input. Here's my view:

(1) Users retain ownership of data submitted to the Flukso platform via their Fluksometer. They have the right to delete this data at all times via a simple request.
Note: This could either be a data flushing operation or the deletion of their Flukso account.

(2) Users determine the 'circle of trust' they allow their data to be shared with in graphical form. These are:
[a] Me: Nobody else can see my data stream.
[b] Flukso: People within the Flukso community can add my data stream to their chart.
[c] Web: My data stream can be shared with the web.
Note: I've implemented [a] and [b] in Rev 71/72 and activated this on the Flukso site today. Head over to 'My account' and click on the 'privacy' tab. In case of [a] your data stream will not be drawn on other people's graphs and all statistics will be marked to them as 'prv'. Conversely, nobody else's data stream will be drawn on your chart. The latter was done to discourage lurking. Case [b] is set as the default when joining Flukso.
I have received requests for option [c]. These users would like to share their graphs with people who don't have a Flukso account. This option is not available yet. It would entail changes in the way the URLs are constructed. Each user would need to view their charts via a dedicated URL e.g. www.flukso.net/dash/[username]/day for option [c] to work.

(3) Only the owner of the data will be given access to their data in raw format.
Note: This will most likely be implemented through an HTTP REST style API with a choice of fetching the data in JSON or CSV format. Maybe a binary rrd format would also be interesting. Let me know your thoughts on this. Some restrictions will be set on the frequency at which data can be fetched through the API.

(4) Future Flukso features might require statistical processing on a set of data streams. Flukso will take proper care to anonymize any data by applying these statistical techniques on a data set of sufficient size.

(5) Users will be notified of any changes to this data retention policy.
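
To make the rules in (2) concrete, here is a rough Python sketch of how the visibility check could be expressed. This is only an illustration and not the code running on the Flukso site; the names Circle, can_draw and stat_label are made up for this example, and option [c] is included even though it is not available yet.

```python
from enum import Enum


class Circle(Enum):
    """Hypothetical names for the three 'circle of trust' options."""
    ME = "me"          # option [a]
    FLUKSO = "flukso"  # option [b], the default when joining
    WEB = "web"        # option [c], not available on the site yet


def can_draw(viewer, viewer_circle, owner, owner_circle):
    """Return True if the owner's stream may appear on the viewer's chart.

    viewer is None for a visitor without a Flukso account.
    """
    if viewer == owner:
        return True  # users always see their own stream
    if viewer is None:
        return owner_circle is Circle.WEB
    if viewer_circle is Circle.ME:
        return False  # private users see nobody else (discourages lurking)
    return owner_circle in (Circle.FLUKSO, Circle.WEB)


def stat_label(value, visible):
    """Show a statistic, or the 'prv' marker when the stream is hidden."""
    return str(value) if visible else "prv"
```

For example, can_draw("alice", Circle.ME, "bob", Circle.FLUKSO) is False because alice has chosen [a], and stat_label(42, False) gives 'prv'.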

How does this sound to you?

Cheers,
Bart.

geertvanbommel's picture

Sounds good to me.
Looking forward to the REST API :-)

Happy metering!
Geert

gonium's picture

Hi Bart,

Thanks for considering this. You have covered all the points of my initial comment, and more -- so I'm happy :-)

As for the API: A REST API would be nice. If you want to have a look at some code implementing a REST API in Ruby's Rack, drop me a note. Although I think you will prefer Erlang... As for the export format: JSON would be fine with me, but I don't see the value in exporting binary rrd. I am interested in the raw, uncompressed data -- I will implement my own storage suiting my needs. This *could* be an RRD database, but I don't think the API should reflect this. If it is straightforward for you it doesn't hurt, but it's not a requirement from my point of view.
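
To illustrate what I mean by raw data in JSON or CSV, here is a small Python sketch. It is purely an assumption on my side about how the export could look: the (timestamp, watts) layout, the column names and the sample values are invented, and none of this reflects an existing Flukso API.

```python
import csv
import io
import json


def to_json(readings):
    """Serialise (unix_timestamp, watts) pairs as a JSON array of arrays."""
    return json.dumps([[ts, watts] for ts, watts in readings])


def to_csv(readings):
    """Serialise the same pairs as CSV with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["timestamp", "watts"])  # hypothetical column names
    writer.writerows(readings)
    return buf.getvalue()


# Invented sample values, one reading per minute.
readings = [(1258000000, 312), (1258000060, 298)]
print(to_json(readings))
print(to_csv(readings))
```

Either form carries the same raw values, so the client is free to put them into whatever storage it likes.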

Cheers,
-Mathias