Flukso site down again

by fafot on Mon, 07/07/2014 - 02:29

It looks like every month or so there is a problem with the Flukso site. Last time it was a site certificate that had to be renewed, and I am wondering what the problem is now.

As Bazzle wrote last time, we are relying on the data, and the whole idea of the Flukso meter is to have an instrument that is not computer dependent. I know that some people have a computer backup for these kinds of problems, but IMO this defeats the purpose of the Flukso meter for me and many others.

The problem today started at 9:00 am Australian Eastern time, that is 23:00 GMT on 6 July 2014. Now we have a Flukso team here: http://pvoutput.org/liveteam.jsp?o=gss&d=desc&tid=860 so it is much easier to identify that the problem is a Flukso site problem and not a particular user problem.

I am wondering if Flukso can do something in the future to prevent this kind of problem?

no more power bills | Mon, 07/07/2014 - 03:33

I have noticed Flukso is down.
It is great to be able to check through PVOutput that it is a common problem.

How long will it be down for?

mozreactor | Mon, 07/07/2014 - 04:47

Thanks for the team heads-up. Joined! Stopped reporting for me also at 0900 GMT+10.

bazzle | Mon, 07/07/2014 - 05:09

When I asked last time, *** jumped down my throat.
But yes seems to be down again. I was looking in at work to check solar and saw the same.
Unfortunate :(

cwjames | Mon, 07/07/2014 - 05:14

Likewise, thanks for the heads-up. Also joined the Flukso team on PVOutput (Starlight), and yes, mine also stopped reporting at 07:00 am WST (GMT+8). Makes it quicker to work out what's going on, thanks...

roadstar | Mon, 07/07/2014 - 07:26

I guess I will also just have to be patient until it comes up again.

gebhardm | Mon, 07/07/2014 - 08:29

You guys down under, could you imagine that the problem is somewhere else? You are half a planet away with A LOT of servers routing FLM data in-between... Couldn't it be there is an issue on PVOutput side retrieving data via RFC only once? As the FLM dash does not show any loss of data, it is JUST a matter of gaining data "again" if a certain interval has not been transmitted - that is asynchronous communication...

cwjames | Mon, 07/07/2014 - 09:01

All is working now; however, when it had failed I was UNABLE to read ANY data direct from the Flukso site. I could log on to the Flukso site but not read any data, hence I was able to leave the post above. Where the problem was I don't know. As you say, there are lots of network bits between us and Flukso; however, there was comms of sorts that enabled us to log on to the Flukso site as mentioned ???

no more power bills | Mon, 07/07/2014 - 09:17

Mine is back but I am missing the data on PVOutput.
How do I get that back on today's graph?

cwjames | Mon, 07/07/2014 - 09:29

You will need to download the data from Flukso then upload it to PVOutput. This can be done manually or with scripts. Good luck; I normally do it manually, as I have not succeeded to date in using curl or any other method to download from Flukso.

gebhardm | Mon, 07/07/2014 - 09:40

...raise an issue at PVOutput to get the option for "defined retrieval via the Flukso API"... (and learn that the internet is not an always-on infrastructure) - therefore, start to take care for your own data... (but we had that discussion already)
See also https://www.flukso.net/content/internet-interruption-missing-data-pvoutp...

icarus75 | Mon, 07/07/2014 - 10:11

New code was pushed to the live server last night (CET). All was looking fine after the upgrade. But at 1AM there was a hiccup in the log rotation. This caused the server to start buffering all the API logs in RAM, resulting in a crash of the VM at around 2AM. API logs have been disabled for the time being.

Please note that Flukso was designed with network/server glitches in mind. Every 5min the FLM tries to push its buffered readings to the Flukso server. If this fails, it will keep these readings in its RAM buffer and try to push them again 5min later. Once the network or server issue has been solved, all buffered readings will be synced with the server within 5min max. The platform has got built-in resilience.
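
A minimal sketch of the buffer-and-retry behaviour described above, in Python. This is not the FLM's actual code: the class name `ReadingBuffer`, the `push` callable and the (timestamp, watts) reading format are hypothetical and only illustrate the idea of keeping readings in RAM until a push succeeds.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

Reading = Tuple[int, int]  # (unix timestamp, watts); hypothetical format


class ReadingBuffer:
    """Keeps readings in RAM until a push to the server succeeds."""

    def __init__(self, push: Callable[[List[Reading]], bool]) -> None:
        # push() is assumed to return True when the server accepted the batch
        self._push = push
        self._pending: Deque[Reading] = deque()

    def add(self, reading: Reading) -> None:
        """Store a new reading until the next sync."""
        self._pending.append(reading)

    def sync(self) -> bool:
        """Meant to be called every 5 minutes: try to push everything still pending."""
        if not self._pending:
            return True
        if self._push(list(self._pending)):
            self._pending.clear()
            return True
        return False  # keep the readings and retry at the next interval

    def pending(self) -> int:
        """Number of readings not yet accepted by the server."""
        return len(self._pending)
```

With a `push` that fails while the server is down, readings simply pile up in the buffer; the first successful `sync()` after recovery delivers all of them in one batch.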

A new method of storing readings locally on the FLM's flash, called tmpo, is in the works as well. With tmpo, the FLM will be able to persist all its readings at the highest resolution for months (perhaps years, to be tested), even with power cuts in between. A first version of the tmpo daemon is already present in the r241 alpha release. The uplink to flukso.net is still in the works. To further develop the uplink, code on flukso.net has to be upgraded in different places. If you wonder how that is working out, goto line 1.

fafot | Mon, 07/07/2014 - 12:43

Icarus - Tx for the explanation.

Gebhardm -
"You guys down under, could you imagine that the problem is somewhere else? You are half a planet away with A LOT of servers routing FLM data in-between... Couldn't it be there is an issue on PVOutput side retrieving data via RFC only once?"

Well it is very hard to imagine this after looking first at the Flukso site that gave no readings. So I am wondering what the "half a planet away", "A LOT of servers" or "issue on PVOutput side" has to do with it originally????
It would be smart of you to check the facts first before giving useless advice!

gebhardm | Mon, 07/07/2014 - 12:59

...be my guest...

gebhardm | Mon, 07/07/2014 - 13:34

fafot - maybe for you a quote from the PVOutput documentation

"Flukso

The Flukso auto uploader reads energy production or consumption data from your Flukso API account and automates the data upload to PVOutput every 5 minutes. Energy data is calculated based on the power data at 5 minute intervals. "

This indicates a pull mode that does not retry missed requests; certainly, when the flukso-API is not available, then data is "lost", but not by Flukso (as icarus75 explained) but by PVOutput; whose fault is it now? PVoutput may change this behaviour, but you surely would have to pay for it...

So, I strongly object against demands for a reliability that is justified by not one FLM community member through paying for a Fluksometer.... (you may ask what you have to pay for an AWS account offering an SLA including 99.9% reliability - which is still ca. 9 hours downtime per year)
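
The downtime figure can be checked with a short calculation. This is only a sketch, and the function name is made up for illustration: 99.9% availability over a 365-day year leaves 0.1% of 8760 hours, which is about 8.76 hours.

```python
def allowed_downtime_hours(availability_percent: float, days: int = 365) -> float:
    """Hours a service may be down in the period and still meet its availability target."""
    hours_in_period = days * 24
    return hours_in_period * (1 - availability_percent / 100)


print(round(allowed_downtime_hours(99.9), 2))  # 8.76
```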

https://www.flukso.net/content/how-mqtt-nicely-designed-gauges shows an alternative to rely on own demand - or better, contribute yourself!

icarus75 | Mon, 07/07/2014 - 13:55

Guys, let's keep our spirits high, blood pressure low and communication polite.

Rik already contacted PVOutput to ask whether they could implement a retry policy in the event of a failed API call to api.flukso.net. I think that should be a short-term solution to future network or server hiccups.

frumper | Mon, 07/07/2014 - 14:49

bb has made magic happen: if there is a Flukso outage, PVOutput will now grab the last 24 hours of data.

refresh and be happy.
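
A rough sketch of what such a backfill could look like. This is not PVOutput's actual implementation; `missing_intervals` and the timestamp handling are hypothetical. The idea is to compare the 5-minute slots expected over the last 24 hours with the ones already stored, and to re-request only the gaps.

```python
from typing import Iterable, List

INTERVAL_S = 5 * 60        # 5-minute reporting interval
WINDOW_S = 24 * 60 * 60    # look back over the last 24 hours


def missing_intervals(now: int, stored: Iterable[int]) -> List[int]:
    """Return the 5-minute slot timestamps in the last 24 hours that have no stored reading."""
    latest_slot = now - (now % INTERVAL_S)            # align to the interval grid
    first_slot = latest_slot - WINDOW_S + INTERVAL_S  # oldest slot inside the window
    have = {t - (t % INTERVAL_S) for t in stored}     # slots that already hold data
    return [t for t in range(first_slot, latest_slot + 1, INTERVAL_S) if t not in have]
```

Each returned timestamp marks a slot that would have to be fetched again from the data source once it is reachable.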

bazzle | Mon, 07/07/2014 - 15:01

@gebhardm. If you can't say anything nice or constructive :(

@icarus75 .. Is he your spokesperson? I've sent a lot of business your way. It's also fair for them to ask, as above and as I have, about the service continuity.

These chaps PM me on other forums about what to get, how to set it up and what to do when there are issues. What do I tell them.. quote Gebhardm?

Quote.
So, I strongly object against demands for a reliability that is justified by not one FLM community member through paying for a Fluksometer.... (you may ask what you have to pay for an AWS account offering an SLA including 99.9% reliability - which is still ca. 9 hours downtime per year)

GB0099 | Mon, 07/07/2014 - 15:07

Thanks BB also for such quick resurrection of today's data, I saw remotely via the PV app that there had been a problem but now it's all back and normal with the 5 min intervals

Great work !!

icarus75 | Mon, 07/07/2014 - 15:41

@frumper I wouldn't call that magic. Just healthy engineering practice. Design for failure. Because according to Murphy, it will happen. Sooner, rather than later.

@bazzle If you read my first comment carefully, then you should agree that the Flukso platform as a whole (FLM + flukso.net) is designed with continuity in mind from the very start. And with every release of the FLM or flukso.net we try to improve on it even more. The very reason PVOutput can now call the Flukso API to get the missing data indicates that the Flukso platform as a whole recovers gracefully.

As for @gebhardm, I can vouch that he is a fine lad, trying to help people in any way he can. Which I assume all of you are!

I have seen forum and mailing list discussions get all out of control for no obvious reason. It has not happened on this forum yet. And I sincerely hope it will never happen. Fluksonians live in different time zones, have different cultures and mother tongues. What seems like a normal reply to one person, might offend another one. So I would like to ask you all to let bygones be bygones. And have a beer when the Belgians play their semi-final. Oh shoot, we got kicked out by Argentina... :)

fafot | Mon, 07/07/2014 - 23:11

@gebhardm
To finish this discussion on my side, what annoyed me was your statement "As the FLM dash does not show any loss of data, it is JUST a matter of gaining data "again" ".

When I looked at the Dash at 23:00 GMT there was no data displayed at all. That was 6 hours before your first message. I would not have posted my original message if I had seen data on the Dash. So when you posted your message saying that it may be a PVOutput problem and the other stuff, it was like saying there was never a problem here.
And yes, all the other things you suggested for data retrieval are fine IF there is data to retrieve.

BTW, I personally don't mind paying for reliability, and that is why I bought the FLM.

@Icarus: I am happy that you took care of us and asked PVOutput to implement the "retry" policy, and that it worked. I also like the idea of the new method of storing readings locally on the FLM's flash, so that all the current outages will become a thing of the past.

frumper | Mon, 07/07/2014 - 23:28

@icarus75 the magic is that I asked bb at 7pm local time and he implemented the fix within a few minutes, for free.

icarus75 | Tue, 08/07/2014 - 06:30

@fafot The current implementation already guards against server/network outages and auto-syncs readings with the server when communication between FLM and flukso.net is re-established. Tmpo will only improve on this by persisting data on the FLM to flash and keeping this data at the highest resolution for a much longer period. Network/server outages can and will happen sporadically. What matters is that Flukso as a whole auto-recovers from these hiccups. And from now on PVOutput will do so as well.