Regex-related query

ANewLearner

New Member
I have a text file in which some text needs to be removed. That text is enclosed within a defined tag (234 in the example below). The text to be removed is not fixed; it can occur multiple times in a single line, or it can span multiple lines.

Example -
aaaaaaaaaaa234textA234aaaaa
234text234
blank line
bbbbbb234textB234bbbbbb
ccccc234te
xtC234
dddd234TextD234dddd

With the expression "(\d)([\s\S]*?)(\d)", I can find the text and replace it with "" as expected. The problem occurs when a line ends with the tag (234) and that same line either starts with the tag or continues a tagged span from the previous line, as in lines 2 and 6 of the example above. Lines 2 and 6 leave a blank line behind, so the output contains 7 lines, while I expect only 5 (lines 2 and 6 should be removed entirely).
Please suggest.
 

tgundhus

Member
From your example above, I guess you would like it to be
Code:
(\d+)([\s\S]*?)(\d+)
The added "+" makes each \d match one or more digits, so the whole tag (234) is matched as a unit.
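To see the difference, here is a small sketch in Python (my choice of language; the thread does not say which tool or editor is in use). It lists what each pattern matches on the first line of the example. The variable names are only for illustration.

```python
import re

line = "aaaaaaaaaaa234textA234aaaaa"

# Single-digit form: each \d matches one digit only, so the tagged
# span is consumed in several small pieces.
single = [m.group(0) for m in re.finditer(r"(\d)([\s\S]*?)(\d)", line)]

# Multi-digit form: \d+ takes the whole tag, so one match covers
# the opening tag, the enclosed text, and the closing tag.
multi = [m.group(0) for m in re.finditer(r"(\d+)([\s\S]*?)(\d+)", line)]

print(single)  # ['23', '4textA2', '34']
print(multi)   # ['234textA234']
```

Replacing every match with "" gives the same result for this line with either pattern, but only the "+" version treats the tag as a whole.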
However, as for your actual issue, as I understand your request, you could solve it by adding an optional line break at the beginning:
Code:
([\r\n]?\d+)([\s\S]*?)(\d+)
This will generate the output:
Code:
aaaaaaaaaaaaaaaa
blank line
bbbbbbbbbbbb
ccccc
dddddddd
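For anyone who wants to check this, a minimal sketch in Python (an assumption on my part; the same pattern should behave alike in most regex engines). The sample text is the example from the question, with "blank line" kept as literal text, and `strip_tagged` is just a hypothetical helper name.

```python
import re

text = (
    "aaaaaaaaaaa234textA234aaaaa\n"
    "234text234\n"
    "blank line\n"
    "bbbbbb234textB234bbbbbb\n"
    "ccccc234te\n"
    "xtC234\n"
    "dddd234TextD234dddd"
)

# Optional leading line break, opening tag, lazy body, closing tag.
TAGGED = re.compile(r"([\r\n]?\d+)([\s\S]*?)(\d+)")

def strip_tagged(s):
    """Remove every tagged span, plus a line break directly before it."""
    return TAGGED.sub("", s)

result = strip_tagged(text)
print(result)
# aaaaaaaaaaaaaaaa
# blank line
# bbbbbbbbbbbb
# ccccc
# dddddddd
```

A few caveats. `[\r\n]?` consumes only one character, so with Windows line endings (\r\n) a stray \r would be left behind; `(?:\r?\n)?` in its place would likely be more robust. If the text outside the tags can contain digits, writing the tag literally (234 instead of \d+) is safer. And when a tagged span starts a line but is followed by more text on that line, the leading line break is still removed, so the remainder joins the previous line.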
 