On 26 January 2018 at 12:16, Mark Rogers <mark@more-solutions.co.uk> wrote:
Any comments? For example, would it make more sense to open the .gz files directly in Python rather than piping them in? (I assume that zcat is efficient and so are pipes, and I'm unlikely to achieve anything better myself.)
To answer my own question, I had a thought about this.
gzip files can be concatenated to produce a valid gzip file. Therefore, when multiple files are being combined I can simply concatenate them, unless a file spans a time boundary (eg 1hr), in which case it will need to be processed line by line.
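As a quick sanity check of the concatenation property (throwaway filenames, any shell with gzip installed):

    printf 'one\n' | gzip > a.gz
    printf 'two\n' | gzip > b.gz
    cat a.gz b.gz > combined.gz
    zcat combined.gz    # prints "one" then "two" - both members decompress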
The source log files have filenames that tell me when they start, so I can tell fairly easily whether a file needs line-by-line processing: if a later file starts in the same period, this one cannot cross the boundary and can be concatenated as-is; otherwise it has to be split.
I haven't scripted it yet, but this should get me pretty close to the raw performance of the disks, and it's probably something bash will handle fine using a combination of zgrep for the edge cases and cat for the rest (at the cost of easy cross-platform implementation).
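For what it's worth, here's a rough sketch of how I'd expect the bash version to look. The filename pattern (access-YYYYmmddHHMMSS.log.gz) and the assumption that each line's first field starts with YYYYmmddHH are mine, not taken from the real logs, and I've used zcat + awk rather than zgrep for the boundary files, since routing lines to the right hour depends on the log format:

    #!/usr/bin/env bash
    # Rough sketch only - adjust the filename pattern and the per-line
    # timestamp field to match the real log format.
    set -euo pipefail
    mkdir -p out

    files=(access-*.log.gz)   # glob sorts lexically, so list order == time order

    for i in "${!files[@]}"; do
        f=${files[$i]}
        next=${files[$((i + 1))]:-}

        hour=$(basename "$f" | sed -E 's/^access-([0-9]{10}).*/\1/')
        next_hour=""
        [ -n "$next" ] && next_hour=$(basename "$next" | sed -E 's/^access-([0-9]{10}).*/\1/')

        if [ "$hour" = "$next_hour" ]; then
            # A later file starts in the same hour, so this one can't cross
            # the boundary: append the raw gzip member, no decompression.
            cat "$f" >> "out/$hour.gz"
        else
            # Possible boundary-spanning file: decompress and route each
            # line to the hour in its timestamp (field layout is a guess).
            zcat "$f" | awk '{ h = substr($1, 1, 10)
                               print | ("gzip >> out/" h ".gz") }'
        fi
    done

Appending via gzip >> relies on the same multi-member property as the cat path: each append just adds another valid gzip member to that hour's output file, so the two paths can be mixed freely.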