GNU tar does have direct support for incremental backups; see

https://www.gnu.org/software/tar/manual/html_node/Incremental-Dumps.html
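As a rough sketch of what that looks like in practice: GNU tar keeps its state in a snapshot file passed via --listed-incremental. The first run against a new snapshot file produces a full dump, and later runs with the same snapshot file archive only what changed. The Python wrapper below is purely illustrative; the function names and the example paths are my own, and it assumes the "tar" on your PATH is GNU tar.

```python
import subprocess


def incremental_tar_command(archive, snapshot, source_dir):
    """Build a GNU tar command line for a listed-incremental dump.

    If `snapshot` does not exist yet, GNU tar writes a full (level 0) dump
    and creates the snapshot file. On later runs with the same snapshot
    file it archives only what changed since the previous run.
    """
    return [
        "tar",
        "--create",
        f"--file={archive}",
        f"--listed-incremental={snapshot}",
        source_dir,
    ]


def run_incremental_backup(archive, snapshot, source_dir):
    """Run the dump; raises CalledProcessError if tar exits non-zero."""
    subprocess.run(incremental_tar_command(archive, snapshot, source_dir), check=True)


if __name__ == "__main__":
    # Hypothetical paths, only to show the shape of the command.
    print(" ".join(incremental_tar_command("home.1.tar", "home.snar", "/home")))
```

Each run wants a fresh archive name but the same snapshot file, and the archives need restoring in order.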

But it's probably *much* easier and more reliable to use an existing system like restic

https://restic.net

which stores data in a deduplicating repository rather than tar-style archives, handles deltas very quickly and supports encryption for offsite backups.

Systems like restic and rsync are likely to be much faster than a homebrew system (although homebrew systems are fun to write).

S

On Sat, 6 Jul 2019 at 20:24, Mark Rogers <mark@more-solutions.co.uk> wrote:
On Sat, 6 Jul 2019 at 12:32, Steve Mynott <steve.mynott@gmail.com> wrote:
> I did a simple "strace dpkg -S /usr/bin/411toppm|grep open" and looked at some of the paths and came up with
>
> basename `grep -rl "^/usr/bin/411toppm$" /var/lib/dpkg/info` .list

Interesting, thanks. If I understand correctly this would only work on
the live system (or at least a similar system with the same packages
installed) but I could, at a push, look at the files in the
var/lib/dpkg/info directory in the tarball. Whilst that would be slow
if done for each file separately, I could just extract and parse all
the .list files in that directory (of the tarball) up front which
would be pretty workable.
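A minimal sketch of that "parse all the .list files up front" idea, using Python's tarfile module. It assumes the tarball holds a root filesystem with the usual var/lib/dpkg/info layout; the function names are made up for illustration.

```python
import tarfile

INFO_PREFIX = "var/lib/dpkg/info/"


def normalise(name):
    """Tar members may be stored as './var/...', '/var/...' or 'var/...'."""
    return "/".join(part for part in name.split("/") if part not in ("", "."))


def load_list_files(tar_path):
    """Return {installed_path: package} from every .list file in the tarball.

    Directories such as /usr/bin appear in many .list files, so for those
    the last package seen wins; only regular files map to one owner.
    """
    owner = {}
    # "r:*" lets tarfile detect gzip/bzip2/xz compression by itself.
    with tarfile.open(tar_path, "r:*") as tar:
        for member in tar:
            name = normalise(member.name)
            if not (member.isfile()
                    and name.startswith(INFO_PREFIX)
                    and name.endswith(".list")):
                continue
            # The package name may carry an architecture suffix, e.g. "libc6:amd64".
            package = name[len(INFO_PREFIX):-len(".list")]
            with tar.extractfile(member) as fh:
                for line in fh.read().decode("utf-8", "replace").splitlines():
                    if line:
                        owner[line] = package
    return owner
```

With that dictionary built once, looking up the owner of a path is a plain owner.get("/usr/bin/411toppm"), with no need for the live dpkg database.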

> I've tended to use rsync -n to compare directories of files and a scripting language (or go) is probably a better tool than shell.

I'm working with a Python script that can directly parse the
compressed tarball. I found a sample script that calculates checksums
of all files in a tarball pretty quickly, which I'm using as a base
because "tar -jtv" doesn't give enough information to detect whether a
file has changed, so my intention is to build on that. Parsing the
.list files on the fly should be an easy add.
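For reference, checksumming every file in a compressed tarball can be sketched roughly as below. This is not the sample script mentioned above, just one plausible shape for it; the function names and the choice of MD5 as the default (picked so the results are comparable with dpkg's .md5sums files) are assumptions.

```python
import hashlib
import tarfile


def tarball_checksums(tar_path, algorithm="md5"):
    """Return {member_name: hex_digest} for every regular file in the tarball."""
    sums = {}
    # Members are read in archive order, so a compressed stream is only
    # decompressed once from start to finish.
    with tarfile.open(tar_path, "r:*") as tar:
        for member in tar:
            if not member.isfile():
                continue  # skip directories, symlinks, devices
            digest = hashlib.new(algorithm)
            with tar.extractfile(member) as fh:
                # Hash in 1 MiB chunks so large files are never held in memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            sums[member.name] = digest.hexdigest()
    return sums


def changed_files(old_sums, new_sums):
    """Names present in both mappings whose digests differ."""
    common = old_sums.keys() & new_sums.keys()
    return sorted(name for name in common if old_sums[name] != new_sums[name])
```

Running tarball_checksums() on two backups and passing the results to changed_files() gives the list of files whose contents differ, which a "tar -jtv" listing alone cannot reliably show.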

And in fact, I can compare just the list of .list files in that
directory to get a diff of installed packages, so that's a good start.
(But I do need to then exclude the files from each package from
further comparisons, so I'd still need to parse them.)
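The package diff needs only the member names, not the file contents, so it can be sketched quite compactly. The function names are hypothetical, and it assumes both tarballs are root filesystem backups containing var/lib/dpkg/info.

```python
import tarfile

INFO_PREFIX = "var/lib/dpkg/info/"


def normalise(name):
    """Tar members may be stored as './var/...', '/var/...' or 'var/...'."""
    return "/".join(part for part in name.split("/") if part not in ("", "."))


def installed_packages(tar_path):
    """Return the set of package names that have a .list file in the tarball."""
    packages = set()
    with tarfile.open(tar_path, "r:*") as tar:
        for member in tar:
            name = normalise(member.name)
            if name.startswith(INFO_PREFIX) and name.endswith(".list"):
                packages.add(name[len(INFO_PREFIX):-len(".list")])
    return packages


def package_diff(old_tar, new_tar):
    """Return (added, removed) package names between two backups."""
    old = installed_packages(old_tar)
    new = installed_packages(new_tar)
    return sorted(new - old), sorted(old - new)
```

This only shows packages appearing or disappearing. A package upgraded in place keeps the same .list name, so version changes would need the dpkg status file or the checksums.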

[Actually on a quick dig, the .md5sums files are probably more useful
than the .list files for my purposes.]
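A rough sketch of using the .md5sums files this way. It assumes the usual md5sum-style layout of one "<32 hex digits>  <path relative to />" entry per line; the function names are made up, and `actual` is assumed to be a mapping of normalised relative paths to MD5 digests computed from the tarball contents.

```python
import tarfile

INFO_PREFIX = "var/lib/dpkg/info/"


def normalise(name):
    """Tar members may be stored as './var/...', '/var/...' or 'var/...'."""
    return "/".join(part for part in name.split("/") if part not in ("", "."))


def parse_md5sums(text):
    """Parse md5sum-style lines into {relative_path: md5_hex}."""
    sums = {}
    for line in text.splitlines():
        digest, sep, path = line.partition("  ")
        if sep and len(digest) == 32:
            sums[path] = digest
    return sums


def load_md5sums(tar_path):
    """Merge every .md5sums file found in the tarball into one mapping."""
    expected = {}
    with tarfile.open(tar_path, "r:*") as tar:
        for member in tar:
            name = normalise(member.name)
            if (member.isfile()
                    and name.startswith(INFO_PREFIX)
                    and name.endswith(".md5sums")):
                with tar.extractfile(member) as fh:
                    text = fh.read().decode("utf-8", "replace")
                expected.update(parse_md5sums(text))
    return expected


def classify(actual, expected):
    """Split files into (modified packaged files, files no package lists)."""
    modified = sorted(p for p in actual if p in expected and actual[p] != expected[p])
    unpackaged = sorted(p for p in actual if p not in expected)
    return modified, unpackaged
```

Two caveats: not every package ships an .md5sums file, and configuration files are generally not listed in them, so "unpackaged" here really means "not covered by any .md5sums entry".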

Thanks for that starting point.
--
Mark Rogers // More Solutions Ltd (Peterborough Office) // 0844 251 1450
Registered in England (0456 0902) 21 Drakes Mews, Milton Keynes, MK8 0ER


--
Steve Mynott <steve.mynott@gmail.com>
cv25519/ECF8B611205B447E091246AF959E3D6197190DD5