Over the past few months I've noticed that when I run grep it sometimes appears to hang. I had put it down to doing "grep -r" down a rather large file hierarchy, but now it's happening with a perfectly simple grep:-
www-data$ ls *.txt
sidebar.txt start.txt
www-data$ grep xxx *.txt
www-data$ grep fred *.txt
www-data$ grep 'CLEARFLOAT' *.txt
... and there it sits, still (like 5 minutes later). The two .txt files are only a few tens of bytes long. It's not using any processor time, it's just stuck.
I have just done it again in another terminal window, quotes removed but otherwise the same:-
www-data$ grep fred *.txt
www-data$ grep CLEARFLOAT *.txt
^C
www-data$
... and with a different user (in the same directory):-
chris$ cd /home/www-data/www/wiki/data/pages
chris$ grep fred *.txt
chris$ grep CLEARFLOAT *.txt
^C
chris$
So, what am I missing?
On 20-Jun-2012 16:36:43 Chris Green wrote:
> Over the past few months I've noticed that when I run grep it sometimes appears to hang, I had put it down to doing "grep -r" down a rather large file hierarchy but now it's happening with a perfectly simple grep:-
> www-data$ ls *.txt
> sidebar.txt start.txt
> www-data$ grep xxx *.txt
> www-data$ grep fred *.txt
> www-data$ grep 'CLEARFLOAT'_*.txt
> ... and there it sits, still (like 5 minutes later). The two .txt files are only a few tens of bytes long. It's not using any processor time, it's just stuck.
> I have just done it again in another terminal window, quotes removed but otherwise the same:-
> www-data$ grep fred *.txt
> www-data$ grep CLEARFLOAT_*.txt
> ^C
> www-data$
> ... and with a different user (in the same directory):-
> chris$ cd /home/www-data/www/wiki/data/pages
> chris$ grep fred *.txt
> chris$ grep CLEARFLOAT_*.txt
> ^C
> chris$
> So, what am I missing?
> -- Chris Green
You're not giving it anything to grep in! If you give a grep command in the form
grep <regexp> <filenames>
then it will grep, in all the files matching <filenames>, for lines matching <regexp>.
However, you are only giving it
grep <regexp>
and now it is waiting for lines to arrive via stdin, so it is hanging because it is waiting for input. However, where is stdin coming from?
To take a slight modification of your example, if I do
grep CLEARFLOAT*
it will simply hang (until I press Ctrl-C).
But if I now feed it with input lines from the keyboard (using "<< EOT") I shall see:
cat << EOT | grep CLEARFLOAT*
line 1
line 2
line containing CLEARFLOAT_BOATING.txt
line 4
line 5
EOT
line containing CLEARFLOAT_BOATING.txt
Does this help?
Ted.
-------------------------------------------------
E-Mail: (Ted Harding) Ted.Harding@wlandres.net
Date: 20-Jun-2012 Time: 19:26:23
This message was sent by XFMail
-------------------------------------------------
On 20 June 2012 19:26, Ted Harding Ted.Harding@wlandres.net wrote:
> On 20-Jun-2012 16:36:43 Chris Green wrote:
>> www-data$ grep xxx *.txt
>> www-data$ grep fred *.txt
>> www-data$ grep 'CLEARFLOAT'_*.txt
>> ... and there it sits, still (like 5 minutes later). The two .txt files are only a few tens of bytes long. It's not using any processor time, it's just stuck.
> You're not giving it anything to grep in! If you give a grep command in the form
> However, you are only giving it
> grep <regexp>
> and now it is waiting for lines to arrive via stdin, so it is
Ted, if you look at the original email, there are spaces between the search term and the input filenames. I cannot see anything wrong with what Chris is doing.
Chris, you could try strace to see if there's some weirdness going on, but I suspect it won't really be of any help. Could you try it in a different shell (e.g. zsh, tcsh)? Also maybe try it in a different terminal emulator, like xterm or rxvt.
Sorry, don't really know what else to suggest.
Regards, Srdjan
On Wed, 20 Jun 2012, Srdjan Todorovic wrote:
> If you look at the original email, there are spaces after the search term and the input filenames.
Actually, the character between the search term and the input filename appears to be something weird - neither a normal space nor a normal underscore.
On 20 June 2012 20:12, Dan vi5u0-alug@yahoo.co.uk wrote:
> On Wed, 20 Jun 2012, Srdjan Todorovic wrote:
>> If you look at the original email, there are spaces after the search term and the input filenames.
> Actually, the character between the search term and the input filename appears to be something weird - neither a normal space nor a normal underscore.
That may be true, but it still works in xterm / bash.
$ echo CLEARFLOAT > cgTest.txt
$ grep 'CLEARFLOAT' *.txt
finds it in that file.
I copy & pasted the grep line from the original email.
Regards, Srdjan
On Wed, 20 Jun 2012 20:12:07 +0100 (BST) Dan vi5u0-alug@yahoo.co.uk allegedly wrote:
> On Wed, 20 Jun 2012, Srdjan Todorovic wrote:
>> If you look at the original email, there are spaces after the search term and the input filenames.
> Actually, the character between the search term and the input filename appears to be something weird - neither a normal space nor a normal underscore.
Yep - not a space character (hex 20) but two very odd characters (hex C2 and A0).
Mick
---------------------------------------------------------------------
blog: baldric.net
fingerprint: E8D2 8882 F7AE DEB7 B2AA 9407 B9EA 82CC 1092 7423
---------------------------------------------------------------------
On 20 June 2012 20:51, mick mbm@rlogin.net wrote:
> Yep - not a space character (hex 20) but two very odd characters (hex C2 and A0).
Probably doesn't really matter - copy it from your webmail in the web browser and paste to the terminal. And it works. Probably some unicode stuff. The terminal emulator / shell should surely still know what to do with such a character if it was configured to support unicode?
Regards, Srdjan
On 20-Jun-2012 20:07:58 Srdjan Todorovic wrote:
> On 20 June 2012 20:51, mick mbm@rlogin.net wrote:
>> Yep - not a space character (hex 20) but two very odd characters (hex C2 and A0).
> Probably doesn't really matter - copy it from your webmail in the web browser and paste to the terminal. And it works. Probably some unicode stuff. The terminal emulator / shell should surely still know what to do with such a character if it was configured to support unicode?
> Regards, Srdjan
I have to say that, when I saw Chris's original email, I saw an underscore separating "CLEARFLOAT" and "*.txt"; and when I pasted Chris's grep command into my xterm it indeed hung.
If interpreted as an underscore, it will have the effect of making CLEARFLOAT_*.txt into a single word, so the <filename> part is missing and it is bound to hang.
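The single-word effect described above can be reproduced without grep. The sketch below is illustrative only and not taken from the thread: `count_words` is a hypothetical helper that reports how many arguments the shell passed to it, and the separator bytes are written out explicitly in octal (`\302\240` is assumed here to be the UTF-8 encoding of a non-breaking space). It assumes a POSIX-style shell such as sh or bash.

```shell
#!/bin/sh
# count_words is a hypothetical helper: it prints the number of
# arguments the shell handed to it after parsing the command line.
count_words() {
    echo "$#"
}

space=' '
nbsp=$(printf '\302\240')   # assumed UTF-8 bytes of U+00A0 (hex C2 A0)

set -f                      # turn globbing off so *.txt stays literal in this demo
# eval makes the shell re-parse the string exactly as if it had been typed.
with_space=$(eval "count_words CLEARFLOAT${space}*.txt")
with_nbsp=$(eval "count_words CLEARFLOAT${nbsp}*.txt")
set +f

echo "separated by a plain space: $with_space word(s)"
echo "separated by an NBSP:       $with_nbsp word(s)"
```

With a plain space the shell should report two words (the pattern and the file glob). With the non-breaking space it should report one, which would leave grep with a pattern, no file names, and nothing to do but read stdin.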
For what it's worth, when I paste out of the email into the command
echo "grep 'CLEARFLOAT'_*.txt" | od -t x1
I get
0000000 67 72 65 70 20 27 43 4c 45 41 52 46 4c 4f 41 54
0000020 27 5f 2a 2e 74 78 74 0a
and the byte corresponding to the underscore is the 7th back from the end (the last, "0a", is the newline), so has hex code 5F which is ASCII for plain underscore.
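For comparison, the three candidate separators can be dumped side by side. This is a sketch rather than part of the original exchange: `dump_bytes` is a hypothetical helper, the sample strings are made up for illustration, and the non-breaking space is assumed to be UTF-8 encoded (octal `\302\240`, i.e. hex C2 A0).

```shell
#!/bin/sh
# dump_bytes is a hypothetical helper: it prints the bytes of its
# argument as space-separated hex values on a single line.
dump_bytes() {
    # Unquoted on purpose: word splitting collapses od's padding and newlines.
    set -- $(printf '%s' "$1" | od -An -t x1)
    echo "$*"
}

nbsp=$(printf '\302\240')   # assumed UTF-8 bytes of U+00A0

echo "plain space: $(dump_bytes "T *")"
echo "underscore:  $(dump_bytes "T_*")"
echo "NBSP:        $(dump_bytes "T${nbsp}*")"
```

Only the first variant contains hex 20, the one byte the shell treats as a word separator here. The underscore shows up as 5f and the non-breaking space as the pair c2 a0.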
Anyway, it looks as though, for whatever reason, something has converted what should have been a space, with ASCII hex code 20, into something which was not, thereby depriving grep of input (see below).
But why, I wonder, does the other space (the one after "grep") not also come out weird?
In the raw file of Chris's email, I see at that point:
grep 'CLEARFLOAT'=A0*.txt
and, in the headers:
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
meaning that the character at that position has hex encoding A0 in iso-8859-1, aka Latin1 (which contains the Western European accented letters). In this encoding, it seems that code A0 is "NBSP", i.e. non-breaking space. See Wikipedia:
http://en.wikipedia.org/wiki/ISO-8859-1
So, somehow, the ordinary space that should have been there was in fact a non-breaking space. Maybe it was copied from a web-page?
Best wishes, Ted.
-------------------------------------------------
E-Mail: (Ted Harding) Ted.Harding@wlandres.net
Date: 20-Jun-2012 Time: 22:07:26
This message was sent by XFMail
-------------------------------------------------
On Wed, Jun 20, 2012 at 08:51:44PM +0100, mick wrote:
> Yep - not a space character (hex 20) but two very odd characters (hex C2 and A0).
It's a nonbreakspace and I think you have solved my problem! :-)
I have set up my keyboard to send nonbreakspace when I do SHIFT-space and, of course, I hadn't released the shift key when typing that space after CLEARFLOAT.
In 'old fashioned' ISO-8859 nonbreakspace is the single byte 0xA0 but in UTF-8 it's the two-byte sequence 0xC2 0xA0.
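A stray non-breaking space of this kind can be detected and replaced before a command is run. The following is a minimal sketch under stated assumptions, not something proposed in the thread: `has_nbsp` and `fix_nbsp` are hypothetical helper names, the input is assumed to be UTF-8 (so the character is the byte pair written below in octal), and `LC_ALL=C` is used so that sed treats the input as raw bytes.

```shell
#!/bin/sh
nbsp=$(printf '\302\240')   # assumed UTF-8 bytes of U+00A0 (hex C2 A0)

# has_nbsp (hypothetical): succeeds if the text contains a UTF-8 NBSP.
has_nbsp() {
    case $1 in
        *"$nbsp"*) return 0 ;;
        *)         return 1 ;;
    esac
}

# fix_nbsp (hypothetical): prints the text with every UTF-8 NBSP
# replaced by an ordinary space (hex 20).
fix_nbsp() {
    printf '%s\n' "$1" | LC_ALL=C sed "s/$nbsp/ /g"
}

# Simulate the command line as typed with SHIFT held down on the space.
typed="grep CLEARFLOAT${nbsp}*.txt"

if has_nbsp "$typed"; then
    echo "non-breaking space found"
    fixed=$(fix_nbsp "$typed")
else
    fixed=$typed
fi
echo "$fixed"
```

For input in ISO-8859-1 the same idea should apply with the single byte `\240` in place of the two-byte sequence.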
On Wed, Jun 20, 2012 at 07:33:46PM +0100, Srdjan Todorovic wrote:
> Ted, If you look at the original email, there are spaces after the search term and the input filenames. I cannot see anything wrong with what Chris is doing.
Ah, good, I'm not going completely crazy then! :-)
> Chris, You could try strace to see if there's some weirdness going on, but I suspect it won't really be of any help. Could you try it in a different shell? (ie. zsh, tcsh, etc...) Also maybe try in different terminal emulator, like xterm, rxvt etc.
> Sorry, don't really know what else to suggest.
I've tried in an xterm and I have even tried on a remote system running a different Linux distribution - same thing happens. I get the feeling that it's something to do with the string CLEARFLOAT.
On Wed, Jun 20, 2012 at 07:26:25PM +0100, Ted Harding wrote:
> On 20-Jun-2012 16:36:43 Chris Green wrote:
>> Over the past few months I've noticed that when I run grep it sometimes appears to hang, I had put it down to doing "grep -r" down a rather large file hierarchy but now it's happening with a perfectly simple grep:-
>> www-data$ ls *.txt
>> sidebar.txt start.txt
>> www-data$ grep xxx *.txt
>> www-data$ grep fred *.txt
>> www-data$ grep 'CLEARFLOAT'_*.txt
>> ... and there it sits, still (like 5 minutes later). The two .txt files are only a few tens of bytes long. It's not using any processor time, it's just stuck.
>> I have just done it again in another terminal window, quotes removed but otherwise the same:-
>> www-data$ grep fred *.txt
>> www-data$ grep CLEARFLOAT_*.txt
>> ^C
>> www-data$
>> ... and with a different user (in the same directory):-
>> chris$ cd /home/www-data/www/wiki/data/pages
>> chris$ grep fred *.txt
>> chris$ grep CLEARFLOAT_*.txt
>> ^C
>> chris$
>> So, what am I missing?
>> -- Chris Green
> You're not giving it anything to grep in! If you give a grep command in the form
> grep <regexp> <filenames>
> then it will grep, in all the files matching <filenames>, for lines matching <regexp>.
> However, you are only giving it
> grep <regexp>
> and now it is waiting for lines to arrive via stdin, so it is hanging because it is waiting for input. However, where is stdin coming from?
> To take a slight modification of your example, if I do
> grep CLEARFLOAT*
> it will simply hang (until I press Ctrl-C).
> But if I now feed it with input lines from the keyboard (using "<< EOT") I shall see:
> cat << EOT | grep CLEARFLOAT*
> line 1
> line 2
> line containing CLEARFLOAT_BOATING.txt
> line 4
> line 5
> EOT
> line containing CLEARFLOAT_BOATING.txt
> Does this help?
Aarrgh, where did that underscore come from! That's what's causing the problem.
It should have read:-
grep CLEARFLOAT *.txt
.... and it *does* say that:-
chris$ grep CLEARFLOAT *.txt
Still hangs. I don't know where those underscores in your reply came from but I don't think they're in my commands.