|
Post by George on Jun 26, 2021 10:11:59 GMT -5
OK, corrected: words ending in 'x or 'xx will not UC the ending contraction.
George
|
|
|
Post by George on Jun 26, 2021 10:32:16 GMT -5
Robert: Yeah, it's a bit of a kluge - look at the 'word', if it's < 3 characters and preceded by a ' ignore it.
George
|
|
|
Post by George on Jun 26, 2021 11:19:44 GMT -5
OK, the full process is: 1) LC the whole string; 2) parse the string into words, using all the punctuation characters as delimiters; 3) take each 'word' from the parse and UC its 1st character in the string.
Your description of "scan the string looking for places to change LC to UC." seems pretty vague. You can't just skip the next character if preceded by a ' -- you'd miss fully quoted 'words'.
My 'fix' was to insert step 2A and ignore short words preceded by a ' before they get to step 3. I guess that could be enhanced by checking for a trailing space. But then you get into last-word-of-a-string (no space), or perhaps a different trailing delimiter such as --- they sure didn't, but I also wouldn't
We don't need that level of correctness.
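For anyone following along, here is a rough sketch of steps 1, 2, 2A and 3 in Python rather than PB. It is an illustration only, not the actual routine: the name `title_case`, the `DELIMS` set (ASCII punctuation plus whitespace) and the hand-rolled word scan are all assumptions standing in for PB's parse with a delimiter string.

```python
import string

# Assumption: "all the punctuation characters" = ASCII punctuation + whitespace.
DELIMS = set(string.punctuation) | set(string.whitespace)


def title_case(text):
    """Hypothetical sketch of the LC / parse / skip / UC process."""
    chars = list(text.lower())              # step 1: LC the whole string
    i, n = 0, len(chars)
    while i < n:
        if chars[i] in DELIMS:              # skip over delimiters
            i += 1
            continue
        start = i                           # step 2: find the next 'word'
        while i < n and chars[i] not in DELIMS:
            i += 1
        length = i - start
        # step 2A: ignore a word of < 3 characters preceded by an apostrophe
        after_apostrophe = start > 0 and chars[start - 1] == "'"
        if length < 3 and after_apostrophe:
            continue
        chars[start] = chars[start].upper()  # step 3: UC the 1st character
    return "".join(chars)


print(title_case("it's the DOG'S bone"))
print(title_case("the 'quoted' word"))
print(title_case("say 'no' now"))
```

The last call shows the trade-off: a short, fully quoted word such as 'no' is treated like a contraction ending and stays lowercase, while longer quoted words are still capitalized.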
George
|
|
|
Post by George on Jun 26, 2021 14:56:07 GMT -5
Robert: I'm not going to belabor this. I use the tools that PB provides, which is to quickly parse the whole string into 'words' using a delimiter string of all the common delimiters. That means "it's" becomes two words - 'it' and 's'.
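To make the splitting behaviour concrete, here is a small Python sketch (not PB code). The helper name `parse_words` and the delimiter string are assumptions for illustration; the point is only that once the apostrophe is in the delimiter list, the parse has no way of knowing that 's' belonged to "it's".

```python
import string

# Assumption: the "common delimiters" are ASCII punctuation plus whitespace.
DELIMS = string.punctuation + string.whitespace

# Map every delimiter to a space so str.split() can do the parsing.
_TO_SPACE = str.maketrans({c: " " for c in DELIMS})


def parse_words(text):
    """Hypothetical helper: split text into 'words' on any delimiter."""
    return text.translate(_TO_SPACE).split()


print(parse_words("it's"))
print(parse_words("'quoted' words, too"))
```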
I am not going to start scanning the string letter by letter looking for sequences like your LETTER APOSTROPHE LETTER sequence.
TitleCase is just not worth it. What we have is 99% effective, and I will not be re-writing this.
George
|
|