Post by Stefan on Sept 21, 2022 11:56:39 GMT -5
George, Robert, et al,
I think the common search routine used by FIND (and other commands) may not be working correctly when used with a regular expression.
Perhaps I'm missing something elementary, but all my research on the Net suggests this is most likely an error in the way SPFLite positions itself after a search with RegEx.
Take a line of data like:
....+....1....+....2....+....3
ABCDEABCEDAFG ABCDEAABB AABBAABBABB
(1) FIND "ABCDEA" => Finds "ABCDEA" in col 1 to 6
(2) RFIND => Finds "ABCDEA" in col 15 to 20
Note that search (2) started from the byte AFTER the last found char from the search (1).
Now let's repeat the same exercise with a regular expression.
....+....1....+....2....+....3
ABCDEABCEDAFG ABCDEAABB AABBAABBABB
(1) FIND R"A[^A]*A" => Finds "ABCDEA" in col 1 to 6, same as before
(2) RFIND => Finds "ABCEDA" in col 6 to 11
Note how the last 'A' found in (1) becomes the first 'A' in (2), even though it has already been 'accounted for' in the first string found.
The RegEx is the standard sequence used to capture quoted strings which have no embedded quotes. I've merely used the letter 'A' instead of a quotation mark for readability.
The RegEx says:
A [^A] * A
Find the char A, followed by zero or more chars that are not A, followed by A.
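For anyone who wants to check the pattern outside SPFLite, here is a minimal sketch in Python. It assumes Python's re module treats this simple pattern the same way SPFLite's RegEx engine does, which I have not confirmed. The helper name first_match_cols and the quoted sample text are made up for illustration.

```python
import re

line = "ABCDEABCEDAFG ABCDEAABB AABBAABBABB"

def first_match_cols(pattern, text):
    """Return (start_col, end_col), 1-based and inclusive, of the first match, or None."""
    m = re.search(pattern, text)
    if m is None:
        return None
    # m.start() is 0-based; m.end() is exclusive, so it already equals the 1-based last column.
    return (m.start() + 1, m.end())

# The 'A' version used in this post.
a_cols = first_match_cols(r"A[^A]*A", line)

# The same construction with a real quotation mark (hypothetical sample text).
quoted = re.findall(r'"[^"]*"', 'say "hello" and "bye"')

print(a_cols)
print(quoted)
```

Under that assumption the first match covers col 1 to 6, the same as FIND in step (1), so the pattern itself behaves as described; only the restart position is in question.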
As you can see, both the regular and the RegEx search return the exact same length of string as a result.
The regular search starts AFTER the previously found string, i.e. in col 7.
The RegEx search starts AT the last found character, i.e. in col 6. So it either resumes 1 char too early, or possibly resumes 1 char AFTER the FIRST char found by the search in step (1), i.e. in col 2. Either starting point would produce the match in col 6 to 11.
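To make the candidate restart positions concrete, here is a rough Python sketch that resumes the same pattern from each of them. This only simulates the possibilities with Python's re module; it is not SPFLite's actual search routine, and the helper next_match_cols is a hypothetical name.

```python
import re

line = "ABCDEABCEDAFG ABCDEAABB AABBAABBABB"
pattern = re.compile(r"A[^A]*A")

def next_match_cols(start_col):
    """Search from a 1-based start column; return (start_col, end_col) inclusive, or None."""
    m = pattern.search(line, start_col - 1)
    return (m.start() + 1, m.end()) if m else None

# Step (1): the initial FIND from col 1.
first = next_match_cols(1)

# Candidate restart points for the RFIND in step (2):
resume_after_match = next_match_cols(first[1] + 1)       # col 7, as the literal search does
resume_at_last_char = next_match_cols(first[1])          # col 6, the last char already matched
resume_after_first_char = next_match_cols(first[0] + 1)  # col 2, one past the match start

print(first, resume_after_match, resume_at_last_char, resume_after_first_char)
```

Resuming from col 7 gives col 11 to 15, while resuming from either col 6 or col 2 gives col 6 to 11, which is what RFIND reported. So the observed result cannot by itself distinguish between the two suspected restart positions.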
Either way, I think there is a bug here.
Can anyone convince me otherwise?