|
Post by Stefan on Oct 24, 2023 10:17:53 GMT -5
George,
Observed with versions 2.7.23080, 3.0.23267 and 3.0.23295
I noticed that some files I save get bigger. Looks like MINLEN works as always, but PRESERVE OFF is not stripping trailing blanks before SAVE. This problem has existed for quite a while. Near as I can figure it, something went wrong between May 2021 (files are saved correctly) and September 2021 (file lines are all equal to MINLEN).
My profiles are LOCKED but most specify MINLEN 100 and PRESERVE OFF
1) Pick a file with short'ish lines and no trailing blanks
2) Load it into EDIT, using a profile specifying a lengthy MINLEN and PRESERVE OFF.
3) SAVE the file
4) Load file into Notepad
5) Use Notepad find/replace to change all single " " to "~"
And there's the issue for all to see.
UPDATE George,
The only way I can avoid creating a file with trailing blanks is to code MINLEN 0 in the profile before loading the file. Any value greater than 0 leaves trailing blanks when lines are shorter than the value, regardless of PRESERVE OFF. It's as if PRESERVE OFF treats blanks added by MINLEN differently to other blanks.
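The Notepad check in the steps above can also be scripted. What follows is a minimal Python sketch and not part of SPFlite: the function name `trailing_blank_report` is made up for illustration, and it assumes the saved file is plain text that can be read as Latin-1.

```python
def trailing_blank_report(path):
    """Return (total_lines, lines_with_trailing_blanks, longest_line_length)."""
    total = padded = longest = 0
    # newline="" keeps the original line terminators so they can be stripped here
    with open(path, "r", encoding="latin-1", newline="") as f:
        for raw in f:
            line = raw.rstrip("\r\n")        # drop only the end-of-line characters
            total += 1
            longest = max(longest, len(line))
            if line != line.rstrip(" "):     # at least one trailing blank remains
                padded += 1
    return total, padded, longest
```

If a file saved with MINLEN 100 and PRESERVE OFF reports nearly every line as padded, and no line shorter than 100, that would match the symptom described in this post.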
|
|
|
Post by Robert on Oct 24, 2023 10:38:11 GMT -5
Stefan, based on your description, it's working right. Why?
1. MINLEN means to enforce a minimum length. If you say MINLEN 100, you are asking for lines to be no shorter than 100.
2. PRESERVE OFF means to NOT retain any trailing blanks after the last non-blank, which are past the MINLEN. When MINLEN is omitted (that is, 0), there is NO minimum, so zero-length lines can be created. MINLEN 0 is the default because that is how text files are commonly created and maintained.
I believe the effect you have observed is actually SPFlite handling MINLEN and PRESERVE correctly.
R
|
|
|
Post by Stefan on Oct 24, 2023 11:07:19 GMT -5
R.
Hmmm, that's not my previous experience, but your post prompted a fresh look at the HELP doc.
It says...
Preserve ON/OFF Handling
If PRESERVE ON is specified, trailing blanks are retained as-is and written to the file. If PRESERVE OFF is specified, trailing blanks will be removed before writing the text to the file.
It also says...
The trimming action performed by PRESERVE OFF or PRESERVE C occurs when the file is saved or when END is processed, but during a SAVE operation when you continue editing the file, any existing trailing blanks are not removed from the then-current edit session. These trailing blanks will be removed when you close your file; this will be evident the next time you open your file again.
So it should strip trailing blanks on SAVE and END. However, I may have changed my modus operandi. I seem to have adopted a tendency to explicitly SAVE the file, rather than END'ing the session and letting AUTOSAVE ON handle the rest.
So I tried it both ways. If I have a file with trailing blanks, make a change and issue the END command, all the trailing blanks are saved as well - all of them. And that definitely did not use to happen.
I reckon something has changed.
|
|
|
Post by Robert on Oct 24, 2023 11:17:22 GMT -5
I think the part about existing blanks NOT being removed at SAVE time, when they DO get removed at END time, is unwise; it's a bug waiting to happen. The problem with removing blanks at SAVE time is that it (a) could be time-consuming for large files, and (b) might create surprising results.
In my way of thinking, the PRESERVE and MINLEN ought to be 'continuously' applied, so that the ending-blank state of any given line is always the same, regardless of whether SAVE or END is performed or not. Deferring the trimming action to SAVE or END creates potential inconsistencies, and is not well-defined.
This is a topic that deserves some keen analysis work.
R
|
|
|
Post by Stefan on Oct 24, 2023 13:58:10 GMT -5
"I think the part about existing blanks NOT being removed at SAVE time, when they DO get removed at END time, is unwise; it's a bug waiting to happen. Problem with removing blanks at SAVE time is (a) could be time-consuming for large files, and (b) might create surprising results."
R.
I don't think it works that way. PRESERVE OFF used to remove all trailing blanks (whether created by a MINLEN value or just left over by editing) when writing the file to disc (SAVE or END).
The HELP description merely states that trailing blanks are removed 'when/during writing to disc', but they are not removed from the in-memory version of the edited file when the SAVE command is issued. That makes sense as the user is likely to continue editing after the SAVE, and would want to retain all bytes until the edit session is closed.
And I think it still works that way, except that I reckon NO trailing blanks are being removed regardless of the PRESERVE OFF setting. I also doubt that users would notice the blank removal overhead, even with large files. It is probably more than offset by having to write less data to the media.
|
|
|
Post by Robert on Oct 24, 2023 15:31:11 GMT -5
Keep in mind that with MINLEN, its only possible action, when applied, is to ADD trailing blanks, not remove them. This part I *think* should always be enforced during editing. However, this is a design choice, which was made a long time ago.
The removal of blanks via PRESERVE OFF does need to be done only at SAVE/END time, because during editing, there is no way to know if the user intends to add more nonblank data after some 'current' span of trailing blanks, so it would be crazy to trim them before the user was done with what they were doing.
But just to be clear, if I say MINLEN 100, and PRESERVE OFF, and the line has a length of 120, and there are blanks from column 90 to 120, then the line length should end up being 100 (that is the MINIMUM length, so it's not supposed to be shorter than that) and then the PRESERVE OFF will ensure that blanks in columns 101 to 120 will not be preserved.
Is that your understanding too? If not, there is some confusion here.
R
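Robert's worked example can be checked with a few lines of Python. This is only a model of the behavior he describes (trim first, then pad up to MINLEN), not SPFlite's actual code; the function name `save_line_trim_then_pad` is hypothetical.

```python
def save_line_trim_then_pad(line, minlen, preserve_off=True):
    """Model of Robert's reading: trim trailing blanks, then pad up to MINLEN."""
    if preserve_off:
        line = line.rstrip(" ")      # PRESERVE OFF: drop all trailing blanks
    return line.ljust(minlen)        # MINLEN: pad with blanks if shorter than minlen

# Robert's example: a 120-column line with data in columns 1-89
# and blanks in columns 90-120.
line = "x" * 89 + " " * 31
saved = save_line_trim_then_pad(line, minlen=100)
print(len(saved))  # 100
```

Under this reading the saved line keeps 11 blanks of padding (columns 90 to 100), which is exactly the behavior Stefan is reporting as a change.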
|
|
|
Post by George on Oct 24, 2023 15:51:01 GMT -5
I'll kick in here. I agree with Stefan.
MINLEN was intended to eliminate the problem of FIND strings with trailing blanks not finding things at end-of-line. So it affects the in-memory version only. And ONLY the in-memory copy.
PRESERVE was to handle what was written to the file. It is handled ONLY during the write process. If just a SAVE, the in-memory copy was to remain untouched.
Without looking at code, this seems to be a bug, I'll check it out. The revision a while back that changed how Profiles are handled was pretty significant. I'm sure this is yet another fallout.
Stay tuned.
George
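The FIND problem George mentions can be illustrated with a small Python sketch. It uses plain substring matching as a stand-in for FIND, which is a simplification, and the function name `find_in_lines` is invented for this example.

```python
def find_in_lines(lines, needle, minlen=0):
    """Return indexes of lines that contain needle once padded to minlen."""
    return [i for i, line in enumerate(lines) if needle in line.ljust(minlen)]

lines = ["the cat sat", "a cat", ""]

# Searching for "cat " (with a trailing blank) misses "cat" at end of line...
print(find_in_lines(lines, "cat "))             # [0]
# ...unless every in-memory line is padded to a minimum length.
print(find_in_lines(lines, "cat ", minlen=20))  # [0, 1]
```

The padding only needs to exist in the in-memory copy for the search to succeed, which is consistent with the view that MINLEN need not affect what is written to the file.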
|
|
|
Post by Robert on Oct 24, 2023 17:38:10 GMT -5
George, I am very surprised by your comments above.
You: "MINLEN was intended to eliminate the problem of FIND strings with trailing blanks not finding things at end-of-line. So it affects the in-memory version only. And ONLY the in-memory copy."
According to the Help, it was intended to avoid zero-length lines, which are a problem for FIND strings. But I seem to recall when we put this thing in, there was also a question about insisting on RECFM F files having actual fixed sizes. For instance, someone might want actual card-image data to be RECFM F LRECL 80 MINLEN 80. Correct me if I'm wrong, but I seem to recall that you don't actively prevent RECFM F to vary from its LRECL size. Wasn't MINLEN an attempt to avoid that?
According to the Help (which I wrote), MINLEN is continuously maintained; any time any editing action changes the line length, Edit is supposed to jump in and ensure that the MINLEN is maintained, so lines never get shorter than MINLEN columns. And, it clearly says that no matter what editing actions you take, MINLEN will always override the result, so that the minimum length is always maintained. This means that, if the Help explanation is followed in the code, Preserve should never override MINLEN - it's the other way around.
Does everyone agree with that understanding?
R
|
|
|
Post by George on Oct 24, 2023 19:48:30 GMT -5
Robert: MINLEN only affects in memory contents. I'll not quibble about why it was created.
It does not prevent RECFM F from being longer - IN MEMORY. If length is too long, it is caught at write time.
PRESERVE is only done at write time, and never alters the working data.
What Stefan is describing is a bug, plain and simple.
George
|
|
|
Post by George on Oct 25, 2023 8:45:01 GMT -5
Looking at the code, the MINLEN value is applied to the written lines AFTER the PRESERVE processing.
This is NOT what I thought, and I feel it is incorrect. I've re-read all the HELP and it doesn't really spell this out either.
So ... before I 'correct' this, how about some comments?
I feel, since MINLEN was created to avoid zero length lines, it should apply to the working memory data only, and PRESERVE should be the 'winner' when it comes to writing the file.
George
|
|
|
Post by Robert on Oct 25, 2023 10:35:30 GMT -5
George, believe it or not, I am happy to hear all this. I was about to apologize for being a senile fuddy-duddy.
I believe what you see as "incorrect" actually IS correct. Here's why.
Since you re-read all the Help, you know that MINLEN 'n' when 'n' is > 0 means that no lines will ever get shorter than 'n' columns. It makes sense that PRESERVE is done first, when it is set to OFF, because OFF means to trim trailing blanks.
So, PRESERVE OFF + MINLEN 100 means:
(1) Trim trailing blanks
(2) For any lines < 'n', pad them with blanks so that their length IS 'n'
It has to be done in this order. Reason: If PRESERVE OFF (that is, "trim") were done last, PRESERVE would have to know about MINLEN and its 'n' value, so that it wouldn't "over-trim". Because PRESERVE OFF (trim) is done first, it can blindly remove all trailing blanks without needing to know the 'n' value of MINLEN. Then, when MINLEN is applied, it doesn't need to know anything about PRESERVE.
Again, MINLEN was in fact added to avoid zero-length lines, but ALSO to force fixed-length line sizes, especially in cases where the file is being edited with RECFM U instead of RECFM F.
Finally, note in the Help that MINLEN is described as a continuous, on-going process, NOT applied JUST at SAVE/END time. When MINLEN > 0 is in effect, you should ACTIVELY see lines being padded with blanks. If you didn't, and could create lines shorter than 'n', then 'n' wouldn't actually be the MINIMUM length - the minimum would be anything you typed a line to be, and thus MINLEN would be meaningless.
R
|
|
|
Post by Robert on Oct 25, 2023 11:05:28 GMT -5
Hmm ... I see the problem. It's the timing of the two actions.
To be compliant with MINLEN Help, it has to be applied as a continuous process. But, for "PRESERVE to be done first, THEN MINLEN", it's a problem, because if MINLEN is always happening behind the scenes, PRESERVE *can't* be done first, because MINLEN is always in there doing its thing.
That means you'd have to do this:
1. Apply MINLEN on a continuous basis
2. During SAVE/END time, apply PRESERVE OFF if needed, THEN apply MINLEN again, one last time
Do you agree?
R
|
|
|
Post by Stefan on Oct 25, 2023 11:11:11 GMT -5
Guys, I've read Robert's post 3 times and still cannot fathom what he's trying to say (probably my fault). I don't understand Robert's insistence on "PRESERVE first then MINLEN" because that is not how it worked for several years, before I raised this thread.
I am not questioning any effect the interplay of MINLEN and PRESERVE has on the 'in-memory' data of the file. But I do maintain that the effect MINLEN has on the records once the file has been written back to disc is definitively different to what it used to do.
Leaving aside why MINLEN and PRESERVE were originally introduced (and I was here at the time), the point remains that suddenly the effect is different. I do not recall any discussion about RECFM=U and fixed length records - which in any case doesn't make sense. What is an undefined record of fixed length, if not a fixed length record?
I agree with George when he said...
"MINLEN was intended to eliminate the problem of FIND strings with trailing blanks not finding things at end-of-line. So it affects the in-memory version only. And ONLY the in-memory copy. PRESERVE was to handle what was written to the file. It is handled ONLY during the write process"
...as well as with Robert pointing out that it also avoids the issue with zero-length lines in files.
MINLEN 'n' pads records, presumably after they are loaded into memory to a minimum length (there's no implication of fixed length - some records may well exceed the MINLEN value). It is indeed a continuous process, keeping the in-memory lines at a minimum (but crucially not fixed!) length.
PRESERVE is not involved in the data loading, nor during the process of editing. It is only required before data is written back to external storage and then only if the profile option is PRESERVE C or PRESERVE OFF.
PRESERVE OFF should remove (a) MINLEN padding and (b) any other trailing blanks the user may have introduced, before the records are written back to disc. My testing seems to suggest that it does still do this for (b) but no longer for (a).
|
|
|
Post by George on Oct 25, 2023 12:16:22 GMT -5
OK, here goes. And the judge's decision is final.
MINLEN is applied continuously, every change to a line is padded if needed. That has been true since MINLEN was created and has never been altered. This will not change.
MINLEN was designed to avoid the NULL line problems and others related to blanks at end of line. It was never intended to assist in creating fixed length records. Why use RECFM U LRECL 0 and MINLEN to create a file when RECFM F LRECL nn does that? Whether you want no EOL or CRLF is up to the EOL parameter.
PRESERVE was ALWAYS supposed to control the trimming of trailing blanks, it will be put back to that purpose.
So File Write will do PRESERVE processing LAST. MINLEN has no part in writing the file.
George
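The ruling above can be summed up in a toy Python model. It is a sketch of the stated design, not SPFlite's implementation: the class and method names are hypothetical, PRESERVE C is not modeled, and writing lines exactly as they stand in memory under PRESERVE ON is an assumption, since the thread does not spell that case out.

```python
class EditBuffer:
    """Toy model: MINLEN pads lines in memory; PRESERVE acts only at write time."""

    def __init__(self, minlen=0, preserve="OFF"):
        self.minlen = minlen
        self.preserve = preserve
        self.lines = []

    def load(self, file_lines):
        # MINLEN is applied as lines come into memory...
        self.lines = [ln.ljust(self.minlen) for ln in file_lines]

    def set_line(self, index, text):
        # ...and again on every change to a line (continuous padding).
        self.lines[index] = text.ljust(self.minlen)

    def lines_to_write(self):
        # PRESERVE is processed last; MINLEN plays no part in the write.
        # The in-memory lines are left untouched either way.
        if self.preserve == "OFF":
            return [ln.rstrip(" ") for ln in self.lines]
        return list(self.lines)


buf = EditBuffer(minlen=10, preserve="OFF")
buf.load(["abc", "hello world!  "])
print([len(ln) for ln in buf.lines])  # [10, 14]
print(buf.lines_to_write())           # ['abc', 'hello world!']
```

After the write, `buf.lines` still holds the padded lines, so editing can continue after a SAVE with the in-memory copy unchanged.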
|
|
|
Post by Robert on Oct 25, 2023 14:22:08 GMT -5
To answer the "why" question, I *think* it might be a matter of the EOL setting. RECFM F ends a line at the LRECL 'n' size, while RECFM U LRECL 0 ends with an EOL of CRLF or whatever. Someone might want a fixed length record of size 'n' but still use an EOL other than NONE. Someone that wanted RECFM U but fixed size might have some special purpose for the data, perhaps in an embedded system or something. It's uncommon, but not unheard of, either.
It's the usual "who would ever want to do THAT?" question, and sure as s*** somebody DOES want to. We shouldn't arbitrarily restrict unusual files just because we personally might not use them.
R
===> I probably should note that actually, I don't have an axe to grind on this topic. I just want to make sure we are doing this correctly. Things like maintaining line lengths and trimming trailing blanks are fundamental issues. They are things that should have been settled a long time ago. I am a little concerned that at this late date a problem like this was found.
|
|