|
Post by Stefan on Jan 8, 2023 6:42:59 GMT -5
George,
This is a biggie and possibly/probably messy so feel free to say No.
Sorry, your response to my DS suggestion sparked a thought...
The general search routine allows a wide range of restrictions on where it searches or what constitutes a 'hit', e.g. only certain bounds, certain lines, certain text colour, only the LEFT-most occurrence on each line, etc.
Could one also specify "certain syntax context"?
Use case: User wishes to change AAA to BBB on all lines but not within quoted strings.
How might it work? Maybe using an S'string' notation (similar to a Picture or Format string), where S is for SCOPE. The 'string' part would specify the scope/context for the search. Possible values might be:
- Q means quoted string
- C means comments
- s1,s2 means anything between string s1 and string s2
- s1 means anything from s1 to end of line
- ^ means NOT, valid as first char only
The Q option represents the three SPF quotes (simplifies entry within the already quoted S-string).
The C option could obtain values from an associated AUTO file. If not, the 's1,s2' format can be used to achieve a similar effect.
What would it look like?
CHANGE ALL AAA BBB S'^Q'
would perform the changes as usual but NOT touch any AAA within quoted strings.
Limitation: If BOUNDS limits are in effect and a defined scope exceeds the limits (left or right) all bets are off.
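To make the S'^Q' idea concrete, here is a minimal Python sketch of "change everywhere except inside quoted strings". It is an illustration only, not SPFLite code. The names `quoted_spans` and `change_outside_quotes` are hypothetical, the three quote characters are assumed to be single quote, double quote and back quote, and a hit is treated as protected if it overlaps a quoted string at all (one possible rule among several).

```python
QUOTES = "'\"`"  # assumption: the three SPF quote characters


def quoted_spans(line, quotes=QUOTES):
    """Return (start, end) pairs, end exclusive, one per quoted string."""
    spans = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch in quotes:
            close = line.find(ch, i + 1)
            if close == -1:
                break  # unterminated quote: treat the rest as unquoted
            spans.append((i, close + 1))
            i = close + 1
        else:
            i += 1
    return spans


def change_outside_quotes(line, old, new):
    """Replace old with new, skipping hits that overlap a quoted string."""
    if not old:
        return line
    spans = quoted_spans(line)
    out = []
    pos = 0
    while True:
        hit = line.find(old, pos)
        if hit == -1:
            break
        hit_end = hit + len(old)
        protected = any(hit < e and hit_end > s for s, e in spans)
        out.append(line[pos:hit])
        out.append(old if protected else new)
        pos = hit_end
    out.append(line[pos:])
    return "".join(out)


print(change_outside_quotes('AAA = "AAA" + AAA', "AAA", "BBB"))
```

Note that this naive scan would be confused by an apostrophe in ordinary text (e.g. "don't"), which is the kind of ambiguity a real implementation would have to settle.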
|
|
|
Post by George on Jan 8, 2023 9:59:32 GMT -5
Stefan: Yes, quite extensive. Especially since all the optional filtering is done AFTER the basic string is located. Now it would require additional string searches (for the s1,s2 variation); comment processing (for the C type) and of course the Q type. All of course in either NOT ^ or non-NOT mode.
And then there's possible overlapping:
CHANGE ALL "AS" "ASOF" S"^STC, XPQ"
where the line is
ABC DEF ASTC GHI XPQ
Is the AS not inside the scope? Or inside?
Bound to be other "hmmm, what to do here?" type items.
And S"s1, s2" -- breaks the parser as is. And what if we want S1 and S2 to be independently quoted?
An interesting item to ponder.
George
|
|
|
Post by Stefan on Jan 8, 2023 10:23:17 GMT -5
To my mind, the "AS" is outside the 'STC,XPQ' range because the "A" is outside said range. Imagine replacing the 'STC' and 'XPQ' strings with a double quote each. The A is outwith the resultant quoted string of " GHI ". I don't know your parser well enough, but I figure the breakage arises from the comma. If so, feel free to use a different character.
Note: This is based on my assumption that you would 'ingest' the entire S'...' string as a text literal and then subsequently subdivide/parse it.
This might also provide a way to handle quoted s1 and s2 strings. I've occasionally used this 'double-parse' approach when data became too "hairy" to parse in one go.
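The two points in this post (take the whole scope operand as one literal and subdivide it afterwards, and treat a hit as inside the scope only if every character of it is inside) can be sketched in a few lines of Python. This is an illustration under assumptions, not SPFLite's parser: `parse_scope`, `scope_span` and `inside` are hypothetical names, the separator is a parameter because the thread has not settled on one, and the delimiter strings themselves are counted as part of the scope.

```python
def parse_scope(operand, sep=","):
    """Second parse pass: split an already-extracted scope literal.

    Returns (negate, s1, s2); an empty s2 means "from s1 to end of line".
    """
    negate = operand.startswith("^")
    body = operand[1:] if negate else operand
    s1, _, s2 = body.partition(sep)
    return negate, s1, s2


def scope_span(line, s1, s2):
    """Return (start, end), end exclusive, of the first s1..s2 range, or None."""
    start = line.find(s1)
    if start == -1:
        return None
    if s2 == "":
        return (start, len(line))
    close = line.find(s2, start + len(s1))
    if close == -1:
        return None
    return (start, close + len(s2))


def inside(span, hit_start, hit_end):
    """A hit is inside the scope only if it lies entirely within the span."""
    return span is not None and span[0] <= hit_start and hit_end <= span[1]


line = "ABC DEF ASTC GHI XPQ"
negate, s1, s2 = parse_scope("^STC,XPQ")
span = scope_span(line, s1, s2)
hit = line.find("AS")
print(negate, span, inside(span, hit, hit + len("AS")))
```

Under this rule the "AS" in ASTC counts as outside the STC..XPQ range, because its "A" precedes the start of the span.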
|
|
|
Post by George on Jan 8, 2023 10:40:10 GMT -5
Stefan: The AS problem is probably moot. I had to work at creating that example. Parsing isn't a major problem, probably use the | character like SPLIT/JOIN do already.
The whole scope thing would be done in one function, so it should be at least easy to think about and try out.
George
|
|
|
Post by George on Jan 11, 2023 15:34:19 GMT -5
Stefan: How about this, I think it's achievable.
CHANGE ALL XXX YYY S(^Q)
This changes all XXX to YYY which are NOT within a quoted string. Comments?
|
|
|
Post by Stefan on Jan 12, 2023 6:31:42 GMT -5
George, Your proposed approach looks ok to me, but I'd ask you to consider these thoughts... I appreciate that you're concerned about compliance with the existing Parser and minimising its complexity, but I just wonder if...
Q syntax
- Given that the 'Comment' syntax relies on the presence of an AUTO file (understandable given up to 9 user-chosen variations), why not require the same for Quotes? Then you can just use the quotes (default or otherwise) indicated in the AUTO file and eliminate the need to parse for special quotes. As Robert pointed out, these things are 'language'-, i.e. filetype-related.
Users may not have an AUTO file for TXT files yet, but they could easily add one with just a QUOTED line if they want to use SCOPE with that file type.
S syntax
- I think e.g. S'S,abc|def' looks a bit complex. Would it not be simpler to drop the 'S,' altogether and specify a pair of delimiting strings simply as S'ABC|DEF'? The absence of a leading 'Q' or 'C' char tells you that you're dealing with the 'string' form of the operand. The NOT (^) character can still follow immediately after the S' string.
- If you must keep the 'S,' format, you could go the same way as you proposed for 'special' quotes and use no separator between the 'S' and the following string at all (e.g. SABC|DEF).
It's no biggie, but I feel slightly uneasy about "comma" as a separator, possibly because in SPFLITE syntax, a comma usually separates two or more terms of a similar type, e.g. X,NX etc.
Maybe a 'space' between the 'S' and <str1> is preferable (and space can't occur in <str1> and <str2> anyway).
The format S<blank><str1>|<str2> or even S(<str1>|<str2>) would feel more 'natural' to users.
You already support the concept of 'escaped' characters (e.g. P'>\>' meaning "UCASE" followed by actual ">"), so it shouldn't be hard (famous last words) to enter a '<str1>' string which includes a '()'.
Robert,
We already have several techniques to 'select' lines for further processing. You know at least as well, probably better than I.
And while I think Regular Expressions are great, they are not the sort of thing that 90+% of users will spend any time to learn, write, practice and test(!) just to enter one fancy command.
They'll just step through using RFIND/RCHANGE, maybe in combination with one or more EXCLUDE commands to get the job done.
SCOPE is intended to limit the effect of a command like FIND or CHANGE, not to rigid BOUNDS or to certain lines, but by surgically ignoring certain strings on every line. The SCOPE applies WITHIN a line and provides the ability to change just a part of a line in a semi-smart way.
This suggestion is not seeking to boil the ocean with one command, just looking to address a wish to "CHANGE ALL AAA to BBBBB without mucking up my literals and comments."
|
|
|
Post by George on Jan 12, 2023 9:51:56 GMT -5
Robert: Stefan: Right off, if this goes forward, we have to think differently. Why? Because there's no way I'm going to perform any major surgery on the common Search routine.
This additional test, just like WORD, PREFIX, LEFT etc., is done AFTER the initial string search is performed. So any proposal that restricts the initial string search itself is, in my mind, a non-starter.
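The "search first, filter afterwards" arrangement described here can be pictured with a small Python sketch. It is only a model of the idea, not the actual Search routine: `find_hits`, `is_word`, `outside_double_quotes` and `search` are hypothetical names, and the scope test is simply one more predicate applied to hits that the plain string search has already located.

```python
import re


def find_hits(line, target):
    """Plain string search: every non-overlapping (start, end) hit."""
    if not target:
        return []
    hits = []
    pos = line.find(target)
    while pos != -1:
        hits.append((pos, pos + len(target)))
        pos = line.find(target, pos + len(target))
    return hits


def is_word(line, start, end):
    """WORD-style test: the hit is not glued to letters or digits."""
    before_ok = start == 0 or not line[start - 1].isalnum()
    after_ok = end == len(line) or not line[end].isalnum()
    return before_ok and after_ok


def outside_double_quotes(line, start, end):
    """Scope-style test: the hit does not overlap a "..." literal."""
    spans = [m.span() for m in re.finditer(r'"[^"]*"', line)]
    return not any(start < e and end > s for s, e in spans)


def search(line, target, filters):
    """Locate all hits first, then keep those passing every filter."""
    return [h for h in find_hits(line, target)
            if all(f(line, *h) for f in filters)]


line = 'AAA "AAA" XAAA'
print(search(line, "AAA", [is_word]))
print(search(line, "AAA", [is_word, outside_double_quotes]))
```

In this shape, adding a scope test does not touch `find_hits` at all, which matches the constraint that the common search itself stays as it is.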
I agree with Stefan, hanging this on RegEx is problematic. I know myself I struggle to create one, always wondering if it is a complete and accurate specification of what I want. Sheesh! We even provide SPFTEST for users to experiment with because we recognize how unfamiliar most users are with RegEx. With RegEx this would simply languish unused.
e.g. what's the RegEx for a string - not in quotes - I have no idea even where to start to code this.
Syntax? I'm open to anything, the S"..." was chosen because it works with the parser. Stuff like WHERE STRING doesn't because the parser is full ISPF syntax, key words can be in any order. S"STRING" would work, but we're back to RegEx again.
Not sure where we go.
George
|
|
|
Post by George on Jan 12, 2023 11:38:21 GMT -5
Robert: How easily you toss off some complex changes with a simple "suppose we did A, then we could do B, and THEN we could ..."
Like supporting Multi-line comments, or totally revising the color handling.
I don't try to be negative, but it's a long stretch between lofty and ideal designs, and the trenches where the grungy coding gets done. I simply can't and won't stop looking at every idea with a view of "if this went ahead as it's described, could I do it?"
We've been through this conversation many times over the years, why the surprise at my reaction and comments?
All I want is something that's achievable, right now, with minimal disruption. I am simply not prepared to tackle any major revisions any more.
George
|
|
|
Post by George on Jan 12, 2023 15:10:57 GMT -5
Our problem has always been that I take your 'ideas' as specific proposals. I am not a "blue sky, let's just think about what we could do" guy. True, I don't take them as 'marching orders' and head off doing the code, but I do take them as a future road map, and I can't ignore the bumps in the road.
Your solution using colors is great. What about stuff in single quotes, Back quotes etc.? I'm sure RegEx can do it, but my bet is it would be a lot more complex statement.
Could I have done it - off the cuff? - No, I'd have to pop open Help on RegEx, maybe experiment using SPFTest to be sure, and then issue the commands.
And then, would I trust that I'm doing it correctly, enough that I'd actually trust a CHANGE command? Certainly not.
Should I become better at RegEx? Maybe. But I am sure that the majority of our users would be in the same boat as me - Nah! someday, maybe.
If we provide something here, I think it has to be an integral part of our search routine, not some bolt-on macro that is dependent on RegEx.
I agree with Stefan, we're just trying to enhance CHANGE so it doesn't mess up literals and comments. Let's not expand it into some generalized, swiss-army knife, capable-of-doing-everything new facility.
What do we need? - right now some simple syntax to specify this new requirement. Based on our simpler requirements, I'm sure I can handle the code part, but designing a simple syntax is a problem.
George
|
|
|
Post by Stefan on Jan 13, 2023 4:20:11 GMT -5
George,
My suggestions regarding your proposed syntax were just that - suggestions. I sought to make the S'....' argument more like existing operands and thus simpler to remember and use. But if they break the Parser code, then stick with what you had. I'd rather have a working but slightly "eccentric" SCOPE feature than no SCOPE feature.
As for RegEx, there are examples of ways to exclude strings within delimiters (quotes or otherwise) on the web. Basically an alternation that searches for the delimiters, ignoring everything between them, and then searches for the target string. It does get complex when you have embedded delimiters or multiple different delimiters.
Macros aren't very helpful because the regex in FIND R'....' does not distinguish between 'captured' and 'uncaptured' regex groups. It always returns ALL "hits" it finds, not just 'captured' hits, so you need extra code to do the hard work yourself. May as well just use normal FIND in the first place.
|
|
|
Post by mueh on Jan 13, 2023 7:51:07 GMT -5
Robert: Your CHANGE statement with -GRAY worked only with SPFLite up to 8.5. For the new version it must be as shown in the picture. Rc is not found in yellow text. Thanks for your suggestion, which I personally like.
|
|
|
Post by mueh on Jan 13, 2023 10:01:03 GMT -5
Robert: Here is the picture with F Rc -GRAY all. It should find the GRAY occurrences on your system as well. Since V10 the SPFLite doc says, under -color-names: You may specify any one of the defined color names. When color names are entered with a prefix character of - it means to locate text that does not consist entirely of the named color. For example F "ABC" -RED would look for ABC only when ABC is not entirely RED.
|
|
|
Post by George on Jan 13, 2023 11:30:36 GMT -5
I think my latest color fix went a bit too far in its resetting of the Attr line. A bit too big of a hammer. I'll have another go.
George
[UPDATE]
Looks much better.
[/UPDATE]
|
|
|
Post by Stefan on Jan 13, 2023 11:32:04 GMT -5
R,
This is just the sort of occasion I thought of when I said Regex isn't easy to make 'reliable', especially for the casual user.
At a glance, the FIND R'\".+\"' +GRAY should work fine ("Find anything between double quotes") but it has limitations/pitfalls.
It also finds "literals" within comments and will stumble if there is more than one literal on the same line, e.g. (REXX)
WHEN G._UsrCmd~CaselessPos("X") > 0 THEN G._UsrCmd = "X" /* 'EXIT' command */
To the FIND command there is only one quoted literal, running from the first double quote on the line to the last:
WHEN G._UsrCmd~CaselessPos("X") > 0 THEN G._UsrCmd = "X" /* 'EXIT' command */
If I trusted this design and intended to change the name of variable _UsrCmd, my code would be trashed in some places.
You'd need R'\"[^"]+\"' to get this to work, but this would also stumble over embedded quotes, so yet more complexity is required. And the readability of the RegEx gets horrible if you're dealing with a more symbol-focused language like C++ with lots of escaped symbols.
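The difference between the two patterns is easy to see on the REXX line above. The following sketch uses Python's `re` module, which is not SPFLite's RegEx engine, so it only illustrates the greedy versus negated-class behaviour and says nothing about SPFLite specifics such as the +GRAY operand.

```python
import re

line = ('WHEN G._UsrCmd~CaselessPos("X") > 0 THEN G._UsrCmd = "X" '
        "/* 'EXIT' command */")

# Greedy: .+ runs on to the LAST double quote on the line.
greedy = re.findall(r'".+"', line)

# Negated class: each match stops at the next double quote.
per_literal = re.findall(r'"[^"]+"', line)

print(greedy)
print(per_literal)
```

The greedy form yields a single match that swallows the second `G._UsrCmd`, while the negated-class form yields the two separate `"X"` literals.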
If SPFLite Regex distinguished between 'captured' and 'non-captured' groups, you could use an alternation, e.g.
To find XXXX but only if not in single quotes or double quotes use something like \'[^']+\'|\"[^"]+\"|(XXXX)
But most users would never get that far and a FIND or CHANGE command in a macro can't do this because it would return the first of any of the three terms that it finds.
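For comparison, this is how the alternation behaves in an engine that does expose capture groups. The sketch below uses Python's `re` module rather than SPFLite's RegEx, and `unquoted_hits` and `change_unquoted` are hypothetical helper names. The quoted alternatives consume whole literals, so only a match that fills group 1 is a genuine unquoted XXXX.

```python
import re

# Same alternation as in the post: 'quoted' | "quoted" | (target)
PATTERN = re.compile(r"'[^']+'|\"[^\"]+\"|(XXXX)")


def unquoted_hits(line):
    """Spans of XXXX that are not inside single or double quotes."""
    return [m.span(1) for m in PATTERN.finditer(line)
            if m.group(1) is not None]


def change_unquoted(line, new):
    """Replace only the captured hits; quoted literals are copied back."""
    return PATTERN.sub(
        lambda m: new if m.group(1) is not None else m.group(0), line)


line = "XXXX = 'XXXX' + \"XXXX\" + XXXX"
print(unquoted_hits(line))
print(change_unquoted(line, "YYYY"))
```

As the post says, this still does not cope with embedded or escaped quotes; those would need further alternatives in the pattern.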
|
|
|
Post by George on Jan 13, 2023 13:51:06 GMT -5
Sounds like the idea of a SCOPE / WHERE addition to current commands is basically dead then, since consistently determining STRINGS and COMMENTS is too problematic.
A full proper rewrite of colorization to do it properly, via exact parsing, for all flavors of languages etc. is simply not going to happen.
George
|
|