Sort of design/philosphy/way to do it sort of question about remotely gathering data and putting in a database

List overview All Threads
Download

newer

older

Linux (Debian) boot problem

2600 meeting at 18:00 on Friday...

Chris Green

29 Sep 2014 29 Sep '14

1:25 p.m.

I have a Beaglebone Black which gathers various bits of data on our boat in France. At present these comprise some temperatures and battery voltages and currents but there might be other things added.

I want to store these in a sqlite database on my desktop machine here at home. The mechanics of doing this are easy enough, I store the data in a file named for the date and time on the Beaglebone Black and at intervals my desktop machine mounts the directory where these files are saved (using sshfs), reads the data, inserts it into the database and then deletes the file. This means that if the connection fails for a while then nothing is lost, it's probably not by any means perfect but it's good enough, my life doesn't depend on the data! :-)

At the moment the receiving end just assumes there are ten values sent and that they are in a specific order. E.g. there's a file there at the moment whose name is "2014-09-29_15:05.od" and whose contents are:-

'2014-09-29 15:05', 22.125, 22.75, 13.6285714286, 13.8876923077, 34.36875, 0.253125, 14.85, 8.803125, 8

I'm not really happy with this because it means that any changes I do have to be 'synchronised' at the two ends to keep their knowledge of which value is which in step.

Is there a standard way of labelling values in this sort of situation? Should I send a file something like:-

DateTime, '2014-09-29 15:05' InsideTemperature, 22.125 OutsideTemperature, 22.75 LeisureVolts, 13.6285714286 StarterVolts, 13.8876923077 LeisureAmps1, 34.36875 etc.

Then the first column of this data is the database field name and the second column is the data. If a field name doesn't exist then the receiving program/script will create a new column. This rather depends on there always being a DateTime field so maybe the receiving script could reject files with no DateTime value.

Does this seem a reasonable way to do it or are there better ways, or should I just stay with the existing set-up where the order is significant?

-- Chris Green

Show replies by date

Ewan Slater

29 Sep 29 Sep

1:44 p.m.

Hi Chris,

Would the data have any value to you if it wasn't timestamped?

Is there any way of getting the file created with a self describing format like JSON or XML, so you know by looking at the file what each value represents?

I'd be tempted to look at a NoSQL database since these have a more flexible schema and are a lot less bothered about some records having different structure to others.

Cheers,

Ewan

On 29 September 2014 14:25, Chris Green cl@isbd.net wrote:

...

I have a Beaglebone Black which gathers various bits of data on our boat in France. At present these comprise some temperatures and battery voltages and currents but there might be other things added.

I want to store these in a sqlite database on my desktop machine here at home. The mechanics of doing this are easy enough, I store the data in a file named for the date and time on the Beaglebone Black and at intervals my desktop machine mounts the directory where these files are saved (using sshfs), reads the data, inserts it into the database and then deletes the file. This means that if the connection fails for a while then nothing is lost, it's probably not by any means perfect but it's good enough, my life doesn't depend on the data! :-)

At the moment the receiving end just assumes there are ten values sent and that they are in a specific order. E.g. there's a file there at the moment whose name is "2014-09-29_15:05.od" and whose contents are:-
'2014-09-29 15:05', 22.125, 22.75, 13.6285714286, 13.8876923077, 34.36875, 0.253125, 14.85, 8.803125, 8
I'm not really happy with this because it means that any changes I do have to be 'synchronised' at the two ends to keep their knowledge of which value is which in step.

Is there a standard way of labelling values in this sort of situation? Should I send a file something like:-
DateTime,           '2014-09-29 15:05'
InsideTemperature,  22.125
OutsideTemperature, 22.75
LeisureVolts,       13.6285714286
StarterVolts,       13.8876923077
LeisureAmps1,       34.36875
etc.
Then the first column of this data is the database field name and the second column is the data. If a field name doesn't exist then the receiving program/script will create a new column. This rather depends on there always being a DateTime field so maybe the receiving script could reject files with no DateTime value.

Does this seem a reasonable way to do it or are there better ways, or should I just stay with the existing set-up where the order is significant?

-- Chris Green

main@lists.alug.org.uk http://www.alug.org.uk/ http://lists.alug.org.uk/mailman/listinfo/main Unsubscribe? See message headers or the web site above!

Chris Green

2:06 p.m.

On Mon, Sep 29, 2014 at 02:44:58PM +0100, Ewan Slater wrote:

...

Hi Chris,

Would the data have any value to you if it wasn't timestamped?

No, not really, it's the trends in battery voltage etc. which are important.

...

Is there any way of getting the file created with a self describing format like JSON or XML, so you know by looking at the file what each value represents?

I can create it in whatever format I want as it's a Python (just my preferred language) script that creates it. I prefer something as human readable as possible though so I'm not all *that* keen on XML, it does seem to be flavour of the moment though.

I've just had a harder look at JSON (I've never really investigated it before) and that actually looks quite sensible, I can understand it without having to think too hard so writing my data in JSON format seems like a good idea to me. Thanks for that suggestion.

...

I'd be tempted to look at a NoSQL database since these have a more flexible schema and are a lot less bothered about some records having different structure to others.

There's not much in the way of lightweight (or any!) NoSQL databases in the Ubuntu repositories, I'm not keen on going outside if I can avoid it.

I think in reality just using JSON to transfer the data will give me the structure I need.

-- Chris Green

Ewan Slater

2:27 p.m.

There are certainly packages for Mongo, Redis and BDB in the 14.04 repositories (haven't checked earlier). And something called WhiteDB which claims to be a lightweight NoSQL library.

YAML might also work for you as a self describing data format

On 29 September 2014 15:06, Chris Green cl@isbd.net wrote:

...

On Mon, Sep 29, 2014 at 02:44:58PM +0100, Ewan Slater wrote:

...
Hi Chris,

Would the data have any value to you if it wasn't timestamped?

No, not really, it's the trends in battery voltage etc. which are important.

...
Is there any way of getting the file created with a self describing format like JSON or XML, so you know by looking at the file what each value represents?

I can create it in whatever format I want as it's a Python (just my preferred language) script that creates it. I prefer something as human readable as possible though so I'm not all *that* keen on XML, it does seem to be flavour of the moment though.

I've just had a harder look at JSON (I've never really investigated it before) and that actually looks quite sensible, I can understand it without having to think too hard so writing my data in JSON format seems like a good idea to me. Thanks for that suggestion.

...
I'd be tempted to look at a NoSQL database since these have a more flexible schema and are a lot less bothered about some records having different structure to others.

There's not much in the way of lightweight (or any!) NoSQL databases in the Ubuntu repositories, I'm not keen on going outside if I can avoid it.

I think in reality just using JSON to transfer the data will give me the structure I need.

-- Chris Green

main@lists.alug.org.uk http://www.alug.org.uk/ http://lists.alug.org.uk/mailman/listinfo/main Unsubscribe? See message headers or the web site above!

Chris Green

4:06 p.m.

On Mon, Sep 29, 2014 at 03:27:52PM +0100, Ewan Slater wrote:

...

There are certainly packages for Mongo, Redis and BDB in the 14.04 repositories (haven't checked earlier). And something called WhiteDB which claims to be a lightweight NoSQL library.

I just searched for noSQL in Synaptic and it didn't show me Mongo or Redis. I can't find BDP. I saw WhiteDB but as it's just C libraries it isn't a lot of use for a simple Python program.

Redis might be interesting.

...

YAML might also work for you as a self describing data format

Yes, possibly, my look at JSON took me to YAML as well.

-- Chris Green

Ewan Slater

7:21 p.m.

Odd that you couldn't see them. I was going off this list http://packages.ubuntu.com/trusty/database/

BDB = Berkeley DB = Berkeley

On 29 September 2014 17:06, Chris Green cl@isbd.net wrote:

...

On Mon, Sep 29, 2014 at 03:27:52PM +0100, Ewan Slater wrote:

...
There are certainly packages for Mongo, Redis and BDB in the 14.04 repositories (haven't checked earlier). And something called WhiteDB which claims to be a lightweight NoSQL library.

I just searched for noSQL in Synaptic and it didn't show me Mongo or Redis. I can't find BDP. I saw WhiteDB but as it's just C libraries it isn't a lot of use for a simple Python program.

Redis might be interesting.

...
YAML might also work for you as a self describing data format

Yes, possibly, my look at JSON took me to YAML as well.

-- Chris Green

main@lists.alug.org.uk http://www.alug.org.uk/ http://lists.alug.org.uk/mailman/listinfo/main Unsubscribe? See message headers or the web site above!

Mark Rogers

3:54 p.m.

On 29 September 2014 14:25, Chris Green cl@isbd.net wrote:

...

'2014-09-29 15:05', 22.125, 22.75, 13.6285714286, 13.8876923077, 34.36875, 0.253125, 14.85, 8.803125, 8

Silly question, but since this is just CSV data, why not stick with it but put a header line above it listing the fields?

-- Mark Rogers // More Solutions Ltd (Peterborough Office) // 0844 251 1450 Registered in England (0456 0902) @ 13 Clarke Rd, Milton Keynes, MK1 1LG

Chris Green

4:22 p.m.

On Mon, Sep 29, 2014 at 04:54:48PM +0100, Mark Rogers wrote:

...

On 29 September 2014 14:25, Chris Green cl@isbd.net wrote:

...
'2014-09-29 15:05', 22.125, 22.75, 13.6285714286, 13.8876923077, 34.36875,

0.253125, 14.85, 8.803125, 8

Silly question, but since this is just CSV data, why not stick with it but put a header line above it listing the fields?

Not a silly question at all, I could very easily do as you say and it seems at first glance to be a good solution.

Is there any easy *programmatic* way to do an SQL insert from a CSV with headings?

-- Chris Green

Chris Green

30 Sep 30 Sep

10:55 a.m.

On Mon, Sep 29, 2014 at 05:22:46PM +0100, Chris Green wrote:

...

On Mon, Sep 29, 2014 at 04:54:48PM +0100, Mark Rogers wrote:

...
On 29 September 2014 14:25, Chris Green cl@isbd.net wrote:

...
'2014-09-29 15:05', 22.125, 22.75, 13.6285714286, 13.8876923077, 34.36875,

0.253125, 14.85, 8.803125, 8

Silly question, but since this is just CSV data, why not stick with it but put a header line above it listing the fields?

Not a silly question at all, I could very easily do as you say and it seems at first glance to be a good solution.

Is there any easy *programmatic* way to do an SQL insert from a CSV with headings?

I've come up with an even simpler and more obvious solution.

The data I'm collecting isn't huge, there's about 700kBytes at the moment which is from the last couple of years (with gaps). I can simply do everything at the remote end on the Beaglebone Black and rsync the database across at intervals. The BBB is a newer one with a 4Gb emmc, currently it's 44% full so there's a couple of Gb to spare still, quite a few years' data!

So my programs which gather the data on the BBB will simply write it directly to a sqlite3 database, all the configuration (like names for ADC channels) can live in the database.

It will probably be a good idea to have some sort of historical backup when rsyncing the database across to home so that if it gets scrambled on the BBB I don't lose all my old data.

-- Chris Green

3938

Age (days ago)

3939

Last active (days ago)

main@lists.alug.org.uk

8 comments

3 participants

tags (0)

participants (3)

Chris Green
Ewan Slater
Mark Rogers