Everything I am doing here is on Linux, but as it has the potential to
be cross-platform, feel free to tell me it's off topic and I'll go
elsewhere with my tail between my legs.
Until last week I had zero understanding of NoSQL. I spent some time
going through tutorials (using OrientDB - I have no idea how much of
this is generic and how much is DB-specific) and now have a broad
understanding of a graph database and the concepts of vertices and
edges.
I'm hoping for some advice on a sensible schema to get started on a
specific application.
All I want to do is log data coming from sensors. I haven't decided
exactly what yet, but assume environmental stuff like temperatures,
plus other stuff like counters (e.g. energy usage), speeds, etc. They'll
be polled periodically from various devices and stored in the database.
In most cases a set of values will have a logical grouping (e.g. they
all relate to a single room), but others might be more generic.
Once I have the data, there are two things I want to do with it.
First, display current data on a web page which is constantly updating
and always on display. Second, run reports over the historical data.
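To make that concrete, and assuming a schema along the lines I describe
further down (per-sample vertex classes like Temperature, each linked to
a Room vertex by a MeasuredIn edge, with a timestamp property), I imagine
those two needs boil down to queries roughly like these - syntax guessed
from the tutorials, so corrections welcome. For the live page, the latest
reading for a given room:

    SELECT FROM (
      SELECT expand(in('MeasuredIn')) FROM Room WHERE name = 'Kitchen'
    ) ORDER BY timestamp DESC LIMIT 1

And for the historical reports, everything of one type over a time range:

    SELECT FROM Temperature
      WHERE timestamp BETWEEN '2016-03-01 00:00:00' AND '2016-03-31 23:59:59'
      ORDER BY timestamp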
With MySQL I would probably have a script which polled all the sensors
and stored everything in one record indexed by time, because it's
trivial to dump even large amounts of data into a table that way.
Whether or not that would be the right approach in MySQL, I'm sure it
would be wrong in OrientDB.
My first thought was to create a generic Sample class (type,
timestamp, value) and extend it to create new classes for specific
types of data (Temperature, etc). I'd create a new vertex for each
sample and (having already created a vertex for each room) link them
with edges. However, I have no idea how efficient that would be when it
comes to extracting all the latest data, or historic data ordered by
time, so any guidance would be very gratefully received.
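In case it helps, here is that idea translated into the sort of console
commands the tutorials use (names and types are just my first guess, and
I've used the subclass itself in place of the "type" field):

    CREATE CLASS Room EXTENDS V
    CREATE PROPERTY Room.name STRING

    CREATE CLASS Sample EXTENDS V ABSTRACT
    CREATE PROPERTY Sample.timestamp DATETIME
    CREATE PROPERTY Sample.value DOUBLE

    CREATE CLASS Temperature EXTENDS Sample
    CREATE CLASS EnergyUsage EXTENDS Sample

    CREATE CLASS MeasuredIn EXTENDS E

    CREATE INDEX Sample.timestamp NOTUNIQUE

I'm assuming an index on the timestamp is what keeps the time-ordered
queries fast, but I don't know whether an index defined on the parent
Sample class covers the subclasses, or whether each subclass needs its
own.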
Also, I have no idea how well it would perform. A handful of samples
every minute would be fine, I'm sure, but what if it were more like
50-100? And what if they were stored every second? That might not be
relevant here, but this is a learning exercise and I want to understand
this stuff. (50-100 samples per second into a MySQL DB would be no
problem even on a Pi, after all, assuming they were stored in a single
record as described above.)
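For what it's worth, I imagine each stored sample ends up as a small
batch like the one below, run once per reading from the polling script
('Kitchen' and 21.5 are obviously just placeholders, and I'm guessing at
the batch syntax):

    BEGIN
    LET room = SELECT FROM Room WHERE name = 'Kitchen'
    LET s = CREATE VERTEX Temperature SET timestamp = sysdate(), value = 21.5
    CREATE EDGE MeasuredIn FROM $s TO $room
    COMMIT

I don't know whether running that 50-100 times a second, one vertex plus
one edge at a time, is reasonable, or whether there's a bulk-loading
approach I should be using instead.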
--
Mark Rogers // More Solutions Ltd (Peterborough Office) // 0844 251 1450
Registered in England (0456 0902) @ 13 Clarke Rd, Milton Keynes, MK1 1LG