I want a regular expression to select text between /x and /, e.g. I'm wanting to look at something like:-
/x This is a load of multi-line text and some more and more /
... and get the three lines of text (with/without the end bits).
However I want it to be non-greedy so that:-
/x Here is some text / and some more /
... will return only the "Here is some text" bit.
So, a simple regular expression like '^/x[^/]*/' gets me what I want.
However, I'm trying to make this a bit cleverer so that embedded / characters don't break it. So the next try is '^/x.*/\n'. This works to an extent, but it's 'greedy', so giving it the second example returns both "Here is some text" and "and some more".
Is there a way of saying "not /\n", or any other way of saying 'non-greedy'?
This is using PHP regular expression searches.
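For anyone following along, here is a minimal sketch of how the two attempts above behave. It assumes each closing / sits at the end of a line (the exact line layout of the examples is a guess), and it uses ~ as the delimiter and the s modifier so that . can cross newlines; neither of those comes from the original post.

```php
<?php
// Assumed layout (not confirmed by the post): each closing "/" ends a line.
$text = "/x Here is some text /\nand some more /\n";

// "~" is the delimiter here, so the slashes need no escaping.
// Greedy .* (with s, so that . also matches newlines) runs to the LAST "/\n".
preg_match('~^/x.*/\n~s', $text, $greedy);

// The negated class stops at the first "/", whatever follows it.
preg_match('~^/x[^/]*/~', $text, $negated);

// An embedded "/" is where the negated class falls down.
$embedded = "/x see a/b for details /\n";
preg_match('~^/x[^/]*/~', $embedded, $broken);

var_dump($greedy[0] === $text);                      // greedy swallowed both parts
var_dump($negated[0] === "/x Here is some text /");  // stopped at the first "/"
var_dump($broken[0] === "/x see a/");                // cut short at the embedded "/"
```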
On 07 Feb 18:40, Chris Green wrote:
I want a regular expression to select text between /x and /, e.g. I'm wanting to look at something like:-
/x This is a load of multi-line text and some more and more /
<snippity class="lots of info on what you've tried" />
This is using PHP regular expression searches.
And PHP regular expressions are odd at the best of times!
Anyways...
How's about:

brettp@erwin:~/temp/cg$ cat test.txt
/x
Line one
Line two / Line three
/
Not a line that should be in the output
/
Another line that should not be in the output
/
brettp@erwin:~/temp/cg$ cat test.php
<?php
$filename = "test.txt";
$fh = fopen($filename, "r");
$data = fread($fh, filesize($filename));
$hasmatch = preg_match("/(^|[\r\n]+)\/x[\r\n]+(.*?)[\r\n]+\/[\r\n]+/s", $data, $matches);
print $data;
print "====\n";
if ($hasmatch > 0) {
    print $matches[2];
    print "\n";
}
?>
brettp@erwin:~/temp/cg$ php5 test.php
/x
Line one
Line two / Line three
/
Not a line that should be in the output
/
Another line that should not be in the output
/
====
Line one
Line two / Line three
brettp@erwin:~/temp/cg$
Hope that gives you the pointer you need.
Cheers,
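The lazy-quantifier idea in the script above can also be tried on an inline string. This is only a sketch: extractBlock() is a hypothetical helper name, the sample data layout is assumed, and ~ is used as the delimiter so the slashes need no escaping.

```php
<?php
// Hypothetical helper, not part of the posted script.
// Returns the text between a line holding only "/x" and the first line
// holding only "/", or null when no such block is found.
function extractBlock(string $data): ?string
{
    // (.*?) is lazy, so it stops at the FIRST newline-slash-newline;
    // the s modifier lets . match newlines inside the block.
    $pattern = '~(?:^|[\r\n]+)/x[\r\n]+(.*?)[\r\n]+/[\r\n]+~s';
    return preg_match($pattern, $data, $m) === 1 ? $m[1] : null;
}

// Assumed sample data: an embedded "/" inside the block, then the terminator.
$data = "/x\nLine one\nLine two / Line three\n/\nNot wanted\n/\n";
print extractBlock($data) . "\n";
```

A slash in the middle of a line does not end the block, because the pattern only accepts a / that has newlines on both sides.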
On Wed, Feb 08, 2012 at 10:06:50AM +0000, Brett Parker wrote:
On 07 Feb 18:40, Chris Green wrote:
[big snip]
brettp@erwin:~/temp/cg$ cat test.php
<?php
$filename = "test.txt";
$fh = fopen($filename, "r");
$data = fread($fh, filesize($filename));
$hasmatch = preg_match("/(^|[\r\n]+)\/x[\r\n]+(.*?)[\r\n]+\/[\r\n]+/s", $data, $matches);
Now that's just silly! :-)
print $data;
print "====\n";
if ($hasmatch > 0) {
    print $matches[2];
    print "\n";
}
?>
brettp@erwin:~/temp/cg$ php5 test.php
/x
Line one
Line two / Line three
/
Not a line that should be in the output
/
Another line that should not be in the output
/
====
Line one
Line two / Line three
brettp@erwin:~/temp/cg$
Hope that gives you the pointer you need.
It turns out that one can turn on 'ungreedy' very simply *within* the RE. My code is actually in a wiki plugin, several layers away from the actual PHP preg_match(), and the delimiters are added between my call and the preg_match().
After digging around in the PHP PCRE documentation I found that what I needed was:-
'(?U)^/x.*/\n'
which works perfectly so far, allowing embedded / characters but stopping at the first end-of-line /. It's the (?U) that tells the RE to be ungreedy. I can't do the standard U after the final delimiter because the delimiters get added later.
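In case it helps anyone else, here is a rough sketch of the situation. wikiSearch() is a made-up stand-in for the wiki layer, and the ~ delimiters and ms modifiers it adds are assumptions; the real plugin may add something different.

```php
<?php
// Hypothetical stand-in for the wiki layer: it adds the delimiters and
// modifiers itself, so the caller can only influence the pattern body.
// The "~" delimiter and the "ms" modifiers are assumptions.
function wikiSearch(string $re, string $text): ?string
{
    return preg_match('~' . $re . '~ms', $text, $m) === 1 ? $m[0] : null;
}

$text = "/x Here is some text /\nand some more /\n";

// Default (greedy): runs on to the last "/\n".
$plain = wikiSearch('^/x.*/\n', $text);

// (?U) at the start of the body flips the quantifiers to ungreedy.
$ungreedy = wikiSearch('(?U)^/x.*/\n', $text);

// An embedded "/" is skipped; only a "/" at end of line ends the match.
$embedded = wikiSearch('(?U)^/x.*/\n', "/x see a/b for details /\nmore /\n");

// Under (?U) the meaning of "?" is inverted, so .*? is greedy again.
$inverted = wikiSearch('(?U)^/x.*?/\n', $text);

var_dump($plain === $text);
var_dump($ungreedy === "/x Here is some text /\n");
var_dump($embedded === "/x see a/b for details /\n");
var_dump($inverted === $text);
```

Where the pattern body can be edited freely, writing .*? on the one quantifier is the narrower change, since (?U) flips every quantifier that follows it.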