Thanks so much, can’t wait to see the revision. Once you’re under the hood, please see my note above about the code generally.
What’s most interesting is that the results for same number type differ based on whatever else it finds in the column, which is something I didn’t expect and probably a clue on what’s happening to throw it off. You would think that each cell’s data is interpreted on its own but apparently no.
And just to finish off, here’s a screenshot of how it works fine when the numbers are plain strings, but only if they are by themselves in the column.
I found a regex site that specifically discusses formatting 10 digit phone numbers and suggests using this pattern. I then tested it on one of the online regex test sites and both our patterns match, whereas when I tested yours, they didn’t. Maybe that’s where the problem lies.
^(\+\d{1,2}\s?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$
See here:
@pavilion. I worked through the suggested regex pattern and most of what it matches does not apply in this case. Also, it lacks 3 required capture groups and does not enable multiline mode.
About the only thing it does differently is separators. The following is my regex pattern modified to use the same separators pattern. You can copy and paste this in my earlier script to see if it makes a difference. It does work in my testing, although my earlier regex pattern still works for me
set thePattern to "(?m)^(\\d{3})[\\s.-]?(\\d{3})[\\s.-]?(\\d{4})$"
FWIW, my understanding of the suggested regex pattern is:
^(\+\d{1,2}\s?)?\(?\d{3}\)?[\s.-]?\d{3}[\s.-]?\d{4}$
^ = start of line
( = begin capture group
\+ = match “+”
\d{1,2} = match 1 or 2 digits
\s? = match 0 or 1 whitespace characters
)? = end capture group - once or none
\(? = match 0 or 1 open parentheses
\d{3} = three digits
\)? = match 0 or 1 closing parentheses
[\s.-]? = march 0 or 1 whitespace or dot or dash
\d{3} = three digits
[\s.-]? = march 0 or 1 whitespace or dot or dash
\d{4} = 4 digits
$ = end of line
BTW, I did test my earlier regex pattern here and it worked as expected. I eliminated below the second backslashes for testing on this site:
(?m)^(\d{3})\.?(\d{3})\.?(\d{4})$
The pattern is crap, for several reasons (see @peavine’s dissection of it for more detail).
- This “match phone number” string is supposedly intended for international phone numbers. While a leading
+
sign is now often used to introduce a country prefix, there are still regions using00
for that. And there are quite a few countries having three digit prefixes. - Assuming only spaces, dots and dashes as group separators as in
[\s.-]
is overly narrow – what about slashes? - It uses no capturing groups, as already pointed out by @peavine. So, it does not help in reformatting, only in matching.
- And it, of course, only works for US phone numbers. Apparently everybody assumes those are the only relevant ones
@peavine is correct in their statement that the RE (?m)^(\d{3}).?(\d{3}).?(\d{4})$
works ok. However, they’re missing a backslash in front of the dots, perhaps because Discourse requires two of those outside code. As it stands now, this RE will match anything between \d{3}
, '\d{3}and
\d{4}`. For example, 123456789011 would also match and be rewritten as (123) 567-9011.
Also: I posted a modified JavaScript solution in the other thread. It handles both formats in any number of columns (and has two glitches):
chrillek. Thanks for making note of this. When I posted this yesterday the pattern did have backslashes before the dots but apparently the forum software deleted them. I’ve edited the post to display the pattern as a script, which retains the desired backslashes. I retested the pattern and it matches 1234567890 and 123.456.7890 but not 12345678901
@pavilion. I reviewed and retested my script in post 67, and it works as expected under every circumstance I could envision. The only exception is when one of the cells in a target column had been merged with another cell. Sorry I couldn’t help.
I’m kind of confused at this point as to what exact pattern I should be using. If you’ve updated the original script to this:
then it still doesn’t work for me, with the same issue on the 123.456.7890 format.
Everything I’ve tried still leaves me with the same results as my earlier screenshots, with the data being truncated after the space into the next column. Can’t understand how the two of us are getting different results from the same formula. And moreover, if the issue is just the regex syntax, at worst wouldn’t it just not recognize the number and skip it, and not recognize the text and skip it? I thought the regex is just to be able to identify what cells get processed and what cells get ignored. But here, both the text and the 123.456.7890 are getting identified and just being mis-processed, no?
OK, well here’s something to ponder and hopefully help me. I tried something different and now it works.
I copied 5 cells of each phone number column from the original sheet and pasted by themselves into a brand new sheet with no other data, and the script as you last posted it worked perfectly. But somehow when the script runs on the full spreadsheet it doesn’t.
Not sure why this is happening?
UPDATE: Not so fast. After I copied the entire columns over, the problem reappeared. I am going to see what there might be in that second column causing the issue and report back.
Appreciate your input tremendously. After testing this again my spreadsheet the issue is not the 123.456.7890 number format at all because when I isolate all the numbers from the column it works as expected. The problem is the text, and a simply test will show it to you. Please run the script on a column of two cells with text that contains more than one word, e.g. A1- Text Message, A2 = Data Message and then you’ll see what I’m experiencing on the full column. But what made this harder to understand is that once the column contains both types, the numbers and the multi-word text, then ALL the entries are formatted badly, not just the text.
OK, here’s the issue. the 123.456.7890 format in the second column gets processed perfectly by your script when there is not text in other cells (or more specifically, text with a space between words). Once there are those types of cells in the column, the whole thing gets formatted badly.
As a test, run your script on a column of cells with two words in them and you’ll see exactly what I mean.
UPDATE: That’s not exactly right either. I am uploading a very simple test spreadsheet, if you could kindly run the script on each of the columns and let me know if you can makes heads or tails of what’s going on!
Test Sheet.zip (265.8 KB)
You’ll see the first Column A of only numbers works fine, but the second two columns of only text and mixed text and numbers do not.
I’d been under the impression up till now that each column contained its own type of data and that column “B” only contained US-style telephone numbers.
If a cell has a space in it, the text is going to be split into separate cells when pasted back, in the same way the reformatted phone numbers were before the space in them was replaced with a non-break space. An extra line needs to be inserted into peavine’s script, just before the clipboard’s set, to replace every space in the entire text with a non-break space:
set theString to (theString's stringByReplacingOccurrencesOfString:space withString:(character id 160) options:1024 range:{0, theString's |length|()})
But I don’t know why normal spaces are confusing Numbers here.
Just a wild guess: Numbers is tearing pasted data as it is a text file that is told to read. Therefore, it tries to find a „separator“ like comma, semicolon or tab. In their absence, it considers a space to be the separator. Adding a comma to the end of the first line fixes this behavior. I think.
Hi @chrillek.
Yes. I think you’re right. I’ve just been testing with my multi-column script here and it works perfectly if I change the non-break spaces back to ordinary spaces because it uses tabs as the column separators.
One more thing: Appending a comma to the first line before setting the clipboard values seems to fix all the issues. And it’s bloody simple to fix this first line after pasting…
So, here’s my final take on this problem in JavaScript. It works just fine with the sample CSV posted by @pavilion here Format Telephone Number in Numbers - #91 by pavilion
The only grievance I have are the completely useless calls to display()
. That is, they’d be useless if Apple’s stuff were working sanely.
/* Anonymous, autoexecuting function to do the work */
(() => {
/* Set up some constants:
- the columns to search/replace in 'columnIndices' ('A' is 0, 'B' is 1 etc)
- the 'app' constant (like the tell block in AS)
- the table to work with
- the regular expression to find values that have to be modified
- the regular expression to match the phone numbers
- the replacement string to change the phone numbers
- and the 'currentApplication' which is needed (?) to work with the clipboard
*/
const columnIndices = [0,2]; // Select columns A and C
const app = Application("Numbers");
const table = app.documents[0].sheets[0].tables[0]; //in 1st doc: 1st table's 1st sheet
const testRE = /^[0-9.]{10,12}$/; //rows of 10 to 12 characters only containing digits and dots
const dateRE = /^(\d{3})\.?(\d{3})\.?(\d{4})$/; //phone numbers with and without dots
const replaceString = "($1) $2-$3"; // correctly formatted phone numbers
const ca = Application.currentApplication();
ca.includeStandardAdditions = true;
/* Loop over the columns in 'columnIndices */
columnIndices.forEach(c => {
const currentCol = table.columns[c];
/* Get all values for the current column. That is an 'Array' in JavaScript */
const vals = currentCol.cells.value();
/* Loop over the values and create a new 'Array' containing
- the unmodified originals for any undefined/empty values or those not matching the 'test' condition
- the corrected phone numbers
*/
const newVals = vals.map(v => {
if (!v || ! testRE.test(v)) return v;
const nv = (v+"").replace(dateRE, replaceString);
return nv;
});
/* Set the clipboard to a newline separated string. Append a comma to the
first line to stop Numbers from treating the phone numbers as strings
*/
ca.setTheClipboardTo(newVals.join('\n').replace('\n',',\n'));
app.activate();
/* Set the selection for the table */
table.selectionRange = currentCol;
delay(0.3);
/* Send the "paste without format" keystroke */
const se = Application('System Events');
se.keystroke('v',{using: ["shift down", "option down", "command down"]});
delay(0.3);
/* fix the first row */
currentCol.cells[0].value = newVals[0];
})
})()
I’m sure some kind soul can recreate this in AppleScript if they need it.
@pavilion. Thanks for the test spreadsheet, which allowed me to reproduce the issue.
I tested Nigel’s suggestion, which seemed to fix the issue with my script. Also, I found a line of my script that could cause a problem (not related to the current issue), and I have fixed that also. So, my suggestion for testing is:
use framework "Foundation"
use scripting additions
-- formatColumn("A")
-- delay 5 -- test lower values
-- formatColumn("B")
-- delay 5 -- test different values
formatColumn("C")
on formatColumn(theColumn)
tell application "Numbers" to tell table 1 of sheet 1 of document 1
set format of column theColumn to text
set theValues to value of every cell in column theColumn
set selection range to range (theColumn & "1:" & theColumn & "1")
end tell
set theArray to current application's NSArray's arrayWithArray:theValues
set theString to theArray's componentsJoinedByString:linefeed
set thePattern to "(?m)^(\\d{3})\\.?(\\d{3})\\.?(\\d{4})$"
set theString to (theString's stringByReplacingOccurrencesOfString:thePattern withString:("($1)" & character id 160 & "$2-$3") options:1024 range:{0, theString's |length|()})
set thePattern to "(?m)^<null>$"
set theString to (theString's stringByReplacingOccurrencesOfString:thePattern withString:"" options:1024 range:{0, theString's |length|()})
set theString to (theString's stringByReplacingOccurrencesOfString:space withString:(character id 160) options:1024 range:{0, theString's |length|()})
set the clipboard to (theString as text)
delay 0.2 -- test different values
tell application "System Events" to tell process "Numbers"
set frontmost to true
click menu item "Paste" of menu "Edit" of menu bar 1
end tell
end formatColumn
Works like an absolute charm! So appreciative for all your help (as well as @chrillek) and hope others in need of the same solution will find this super useful.
p.s. I assume there’s nothing left to tweak in the script to make it any faster?