Re: [ALUG] Manipulating files from bash

23 Sep 2017

Typically the way I deal with something like that is to chain a few simple grep/sed commands,
inspecting/massaging the data as we go. For example, to create your minimal format:
---
# write the data file
cat > data <<EOM
label: dos
label-id: 0xe3f4f21a
device: /dev/sdd
unit: sectors
/dev/sdd1 : start=        8192, size=       85622, type=c
/dev/sdd2 : start=       94208, size=     5521408, type=83
/dev/sdd3 : start=     5615616, size=    25499648, type=83
EOM
# we only want lines starting with /dev; get rid of everything up to and including size=,
# remove the default type=83, and strip the "type=" label from any other type
# I built up the command-line, and the -e sections, one by one.
grep '^/dev/' data | sed -e 's/.*, size= */,/' -e 's/, type=83//' -e 's/ type=//' > data2
# on the last line, remove the first number and add a semicolon at the end
# btw, what's that semicolon? Isn't that meant to be a comma?
(head -n -1 data2; tail -n 1 data2 | sed -E -e 's/^,[0-9]+/,/' -e 's/$/;/') > data3
---
and then put it in a script (with 'set -euo pipefail' at the top) and be done.
For quick one-off things that's fine, especially if you're going to look at the data before blindly feeding it back to fdisk.
It's easy once you know the basics of grep/sed and regular expressions, and oddly satisfying, I find.
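To make the "put it in a script" step concrete, here is a minimal sketch of what such a script could look like. It assumes bash and a sed that supports -E; the function name to_sfdisk_script is made up for illustration. It uses the same grep/sed expressions as above, but handles the last line with sed's `$` address instead of head/tail, so no intermediate files are needed.

```shell
#!/usr/bin/env bash
# Stop on errors, on unset variables, and on a failure anywhere in a pipeline.
set -euo pipefail

# to_sfdisk_script (hypothetical name): reads `sfdisk -d` output on stdin
# and writes the minimal sfdisk input script on stdout.
to_sfdisk_script() {
  # 1. keep only the partition lines
  # 2. drop everything up to and including "size=", drop the default
  #    type=83, and strip the "type=" label from any other type
  # 3. on the last line only ($), blank the size and append ";"
  grep '^/dev/' \
    | sed -e 's/.*, size= */,/' -e 's/, type=83$//' -e 's/ type=//' \
    | sed -E -e '$s/^,[0-9]+/,/' -e '$s/$/;/'
}

# Possible usage (not run here; device and file names are placeholders):
#   sudo sfdisk -d /dev/sdd | to_sfdisk_script > data3
```

A side effect of pipefail is that the script aborts if grep finds no /dev lines at all, which is probably what you want.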
In awk, you can combine all of these into a single program (a bit ugly because of the "last line is different" requirement):
---
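# Requires GNU awk: gensub() is a gawk extension.
/^\/dev\// {
  SIZE=$6                                # e.g. "85622," (the trailing comma is kept)
  TYPE=gensub(/type=83/, "", "g", $7)    # drop the default type entirely
  TYPE2=gensub(/type=/, "", "g", TYPE)   # otherwise keep only the value, e.g. "c"
  # print one line behind, so the last line can be treated differently in END
  if (length(PREV)) {
    print PREV
  }
  PREV=sprintf(",%s%s", SIZE, TYPE2)
}
# last partition: replace its size with ";" (this assumes its type is 83)
END { print gensub(/^,[0-9]+,/, ",;", "g", PREV) }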
---
In both approaches, there is lots of scope for errors. It could be that the grep picked up more than it should (e.g. if you forgot the ^),
or that one of the sed changes didn't actually change anything. And what if the output of sfdisk changes in a future
version (mine shows "Id" instead of "type")? Or what if on some systems it produces extra output you don't cope with?
What if you didn't pass the device to sfdisk, and you end up with conflicting data from partitions for multiple disks?
sfdisk prints some fixed-width right-aligned columns; what happens when those numbers get really large, do the columns run into each other?
So if you want to make something more robust, you'd have to add a bunch of extra checking. You can do that
by splitting up the commands more and adding verification steps, and/or by checking the produced output to ensure
it has the correct format and desired number of lines.
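As an illustration of checking the produced output, here is a rough sketch of a verification function. The name check_sfdisk_script and the exact rules (only ",SIZE", ",SIZE,TYPE" or ",;" lines, with ",;" last) are my assumptions based on the minimal format in this thread, not anything sfdisk itself requires.

```shell
# check_sfdisk_script FILE EXPECTED_LINES  (hypothetical helper)
# Returns 0 if FILE looks like the minimal sfdisk script, 1 otherwise.
check_sfdisk_script() {
  script_file=$1
  expected_lines=$2

  # 1. The number of lines must equal the number of partitions we expect.
  actual_lines=$(awk 'END { print NR }' "$script_file")
  if [ "$actual_lines" -ne "$expected_lines" ]; then
    echo "expected $expected_lines lines, got $actual_lines" >&2
    return 1
  fi

  # 2. Every line must be ",SIZE", ",SIZE,TYPE" or ",;".
  #    grep -v prints (and succeeds on) any line that does NOT match.
  if grep -Ev '^,([0-9]+(,[0-9a-f]+)?|;)$' "$script_file" >&2; then
    echo "unexpected line(s) in $script_file" >&2
    return 1
  fi

  # 3. The last line must be the "rest of disk" entry.
  if [ "$(tail -n 1 "$script_file")" != ",;" ]; then
    echo "last line of $script_file is not ',;'" >&2
    return 1
  fi
}
```

For example, `check_sfdisk_script data3 "$(grep -c '^/dev/' data)"` would compare the result against the number of partition lines in the original dump, before anything gets fed back to sfdisk.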
It helps if you can produce better input in the first place. In your case, have a look at e.g.
 `sudo partx --noheadings --bytes --show --output SIZE,TYPE /dev/sda`
which prints just the info you are looking for, in 2 columns, making it easier to parse.
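With two clean columns the parsing becomes a plain read loop. The sketch below makes a few assumptions you would need to verify on your own system: that the type column for a DOS label looks like "0x83", that the size is in bytes (because of --bytes), and that sectors are 512 bytes. The function name partx_to_sfdisk is made up. The "last line becomes ,;" step is left out and would still be applied afterwards.

```shell
# partx_to_sfdisk [SECTOR_BYTES]  (hypothetical helper)
# stdin: two whitespace-separated columns, size in bytes and partition type.
# stdout: one ",SIZE_IN_SECTORS[,TYPE]" line per partition.
partx_to_sfdisk() {
  sector_bytes=${1:-512}          # assumption: 512-byte sectors by default
  while read -r size type; do
    # refuse anything that is not a plain byte count
    case $size in
      ''|*[!0-9]*) echo "bad size: '$size'" >&2; return 1 ;;
    esac
    type=${type#0x}               # assumption: types are printed like 0x83
    case $type in
      ''|83) echo ",$((size / sector_bytes))" ;;        # default type, omit it
      *)     echo ",$((size / sector_bytes)),$type" ;;
    esac
  done
}

# Possible usage (not run here):
#   sudo partx --noheadings --bytes --show --output SIZE,TYPE /dev/sda | partx_to_sfdisk
```

Because the loop rejects a non-numeric size outright, an unexpected change in the partx output shows up as an error rather than as a silently wrong partition table.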
If you find you need lots of extra checking or complex logic, or need to be portable (macOS and the BSDs have subtle differences
in their grep/sed/awk), it may be easier to do it in a more general programming language (python, perl, or whatever) where you can
fully parse the text file (with error checking) into an internal representation, and then generate the desired output from that
representation; that way you can enforce correctly formatted output. Often that ends up more readable/maintainable.
And with some luck, there may be existing libraries (https://www.linuxvoice.com/issues/005/pyparted.pdf)
to help you do what you need to do. But it's more skills to learn, more effort, and comes with its own set of issues
(version differences, having to set up environments for libraries etc).
— Martijn
...
On 22 Sep 2017, at 17:05, Mark Rogers mark@more-solutions.co.uk wrote:
There are lots of ways to process text files (eg sed, awk, etc); I'm
after suggestions on the best/easiest way to achieve something fairly
trivial to describe.
Below is example output from sfdisk -d:
label: dos
label-id: 0xe3f4f21a
device: /dev/sdd
unit: sectors
/dev/sdd1 : start=        8192, size=       85622, type=c
/dev/sdd2 : start=       94208, size=     5521408, type=83
/dev/sdd3 : start=     5615616, size=    25499648, type=83
I want to convert this into a new script to go back into sfdisk but
with some changes; I want the start values skipped (so sfdisk will put
partition boundaries wherever it thinks is best), I want the last
partition not to have a size (ie "rest of disk"), and I don't really need
any of the settings at the top.
In fact, a suitable (minimal) script to go into sfdisk would be simply:
 ,85622,c
 ,5521408
 ,;
I don't really mind at this point whether I end up with something more
like the first format or minified like the latter. In fact to be
honest I'm more interested just in learning the best ways to play with
text files like this.
-- 
Mark Rogers // More Solutions Ltd (Peterborough Office) // 0844 251 1450
Registered in England (0456 0902) 21 Drakes Mews, Milton Keynes, MK8 0ER
