Keeping backups of your SMS messages is useful for having copies of important conversations, or for moving messages off your device to external storage. The app I chose to use was SyncTech's SMS Backup & Restore. This app backs up SMS, MMS, and call logs to local storage, Google Drive, Dropbox, or OneDrive. You can even view the XML backups in your browser with the .xsl file provided.
Note: Browsers may block the XML files because of CORS settings. In Firefox, you can allow this by setting the relevant preference to false in about:config.
In the app settings, you can choose to store emojis and special characters, but the option notes that it "stores invalid XML characters" and that the backup won't be usable outside the app itself. At first I was fine with that: I didn't need to back up emojis and could leave the option disabled. But it wasn't long before I realized that emojis add a small something that the emoji-less backups didn't convey, something I never thought I would need.
Trying to view a backup with emojis in Firefox led to this problem:
These numbers are supposed to be surrogate pairs representing emoji characters, but these big numbers fall in the surrogate range (55296 to 57343, or U+D800 to U+DFFF), which is outside of XML's valid character set.
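To make the surrogate pair idea concrete, here is a small sketch of how two such numbers combine into one emoji code point. The function name `combine_surrogates` is made up for illustration, and the sample values are simply the decimal forms of 0xD83D and 0xDE00; none of this is taken from the app or from smstoxml.

```python
def combine_surrogates(high: int, low: int) -> str:
    """Combine a UTF-16 high/low surrogate pair into a single character."""
    if not (0xD800 <= high <= 0xDBFF and 0xDC00 <= low <= 0xDFFF):
        raise ValueError("not a valid surrogate pair")
    # Each surrogate carries 10 bits of the code point above U+FFFF.
    code_point = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
    return chr(code_point)


# 55357 and 56832 are the decimal forms of 0xD83D and 0xDE00.
emoji = combine_surrogates(55357, 56832)
print(hex(ord(emoji)))  # 0x1f600
```

A valid XML file would store that emoji either as the character itself or as the single reference for its code point, never as the two surrogate halves.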
Well, we can escape the ampersand of the invalid XML characters so they are treated as literal text instead of being parsed as character references:
Because I wanted to make a tool to manipulate the backups, I decided to use an XML parser instead of doing a simple regex replacement. The problem with fixing parsing issues in XML is that you can't actually parse the XML to fix it. The Python libraries I tried either couldn't parse the invalid XML or just stripped out what I was trying to fix, so I had to escape the invalid characters before feeding the file through the parser.
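The pre-escaping step can be sketched roughly like this. It is a minimal illustration and not smstoxml's actual code: the function name `escape_surrogate_refs` is hypothetical, the sample `<sms>` element is invented, and it assumes the references are written in decimal form (such as `&#55357;`).

```python
import re
import xml.etree.ElementTree as ET

# Matches decimal numeric character references such as &#55357;
NUMERIC_REF = re.compile(r"&#(\d+);")


def escape_surrogate_refs(text: str) -> str:
    """Turn &#N; into &amp;#N; when N is a UTF-16 surrogate (invalid in XML)."""
    def replace(match):
        value = int(match.group(1))
        if 0xD800 <= value <= 0xDFFF:
            # Escape only the ampersand; the rest of the reference is kept.
            return "&amp;" + match.group(0)[1:]
        return match.group(0)  # valid references are left untouched
    return NUMERIC_REF.sub(replace, text)


# Invented sample data for illustration only.
raw = '<smses><sms address="5550123" body="Hi &#55357;&#56832;" /></smses>'
root = ET.fromstring(escape_surrogate_refs(raw))
print(root[0].get("body"))
```

After this step the parser sees the references as ordinary text, so the document can be loaded and the surrogate numbers are still there to be converted afterwards.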
Old sms.xml layout:
New sms.xml layout:
I wrote a Python script called smstoxml that fixes the invalid XML characters. It can manipulate both the SMS and the call backup files. In addition to converting emojis to valid XML, smstoxml can:
* - only for SMS file
+ - only for call file
Running smstoxml on the exported XML file will convert the emojis into valid XML, which can then be viewed in a browser:
Make sure that the modified sms.xsl is in the same directory as the XML backup for viewing.
Convert the valid emoji characters back into invalid XML so the app can correctly restore the backup:
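As an illustration of what that reverse conversion involves (this is a sketch of the idea, not the smstoxml command or its code), each character above U+FFFF is split back into a high and a low surrogate and written as two decimal references. The function name `to_surrogate_refs` is hypothetical.

```python
def to_surrogate_refs(text: str) -> str:
    """Replace each character above U+FFFF with two decimal surrogate references."""
    out = []
    for ch in text:
        code_point = ord(ch)
        if code_point > 0xFFFF:
            offset = code_point - 0x10000
            high = 0xD800 + (offset >> 10)    # top 10 bits
            low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
            out.append(f"&#{high};&#{low};")
        else:
            out.append(ch)
    return "".join(out)


print(to_surrogate_refs("Hi \U0001F600"))  # Hi &#55357;&#56832;
```

Something like this would have to run on the serialized file text rather than on parsed attribute values, because an XML serializer would escape the new ampersands again.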
Sometimes the backup will contain the same phone number written in different styles across messages.
+1-123-555-0123 may be found in the file in these formats for
To normalize them for easier parsing, viewing, filtering, etc.:
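One way such normalization could work is sketched below. This is only an assumption about a reasonable approach, not how smstoxml does it: the function name `normalize_number`, the default country code of 1, and the 10-digit national number length are all choices made for this example.

```python
import re


def normalize_number(number: str, default_country_code: str = "1") -> str:
    """Return the number as '+<country code><digits>' when it looks like a phone number."""
    digits = re.sub(r"\D", "", number)  # drop spaces, dashes, parentheses
    if number.strip().startswith("+"):
        return "+" + digits
    if len(digits) == 10:
        # Assumed: a 10-digit number is missing the country code.
        return "+" + default_country_code + digits
    if len(digits) == 11 and digits.startswith(default_country_code):
        return "+" + digits
    # Short codes and alphanumeric senders are left unchanged.
    return number


for sample in ["+1-123-555-0123", "(123) 555-0123", "11235550123"]:
    print(normalize_number(sample))
```

With every address in one form, grouping or filtering by contact becomes a plain string comparison.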
To delete messages from certain contacts, use:
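For readers curious what such a filter looks like in code, here is a rough sketch using Python's standard library. It is not the smstoxml command: the function name `remove_contacts` is hypothetical, and it assumes the backup has a root element with a `count` attribute and `<sms>` children whose `address` attribute holds the phone number.

```python
import xml.etree.ElementTree as ET


def remove_contacts(root, blocked):
    """Remove <sms> elements whose address is in `blocked`; return how many were removed."""
    removed = 0
    # Iterate over a copy so removing elements doesn't disturb the loop.
    for sms in list(root.findall("sms")):
        if sms.get("address") in blocked:
            root.remove(sms)
            removed += 1
    # Assumed: the root's count attribute should reflect the remaining messages.
    if root.get("count") is not None:
        root.set("count", str(int(root.get("count")) - removed))
    return removed


# Invented sample data for illustration only.
sample = ET.fromstring(
    '<smses count="3">'
    '<sms address="+11235550123" body="a" />'
    '<sms address="+19995550100" body="b" />'
    '<sms address="+11235550123" body="c" />'
    '</smses>'
)
print(remove_contacts(sample, {"+11235550123"}))  # 2
```

Exact matching like this only works reliably once the numbers have been normalized to a single format.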
smstoxml can be downloaded on GitHub.