How To Remove Weird Encoding From Txt File
Solution 1:
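The encoding you are looking at is uuencode. In Python, you would use the uu module to decode this blob, or (in Python 2 only) simply stringdata.decode('uu'). In Python 3, str has no decode method; use codecs.decode(data, 'uu') on a bytes object instead. Note that the uu module was deprecated in Python 3.11 and removed in 3.13.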
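As a minimal sketch of the decoding step, the following uses binascii.a2b_uu from the standard library, which decodes one uuencoded line at a time and does not depend on the uu module. The function name decode_uu is made up for this illustration, and the code assumes the blob has the usual "begin <mode> <name>" header and a closing "end" line.

```python
import binascii


def decode_uu(text):
    """Decode the first uuencoded blob found in `text` and return its bytes.

    Hypothetical helper for illustration; assumes a standard
    "begin <mode> <name>" header line and a terminating "end" line.
    """
    lines = iter(text.splitlines())

    # Skip any leading text up to and including the "begin" header.
    for line in lines:
        if line.startswith("begin "):
            break
    else:
        raise ValueError("no 'begin' line found")

    chunks = []
    for line in lines:
        if line == "end":
            break
        # Each body line encodes up to 45 bytes; a2b_uu decodes one line.
        chunks.append(binascii.a2b_uu(line))
    return b"".join(chunks)
```

Calling decode_uu on the full text of the file returns the embedded binary as a bytes object, which you can then write out with open(name, 'wb') if you want to inspect it.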
uuencode is a legacy format which was originally used to embed binaries in email (which then only permitted 7-bit US-ASCII; the format also has some concessions for interoperability with big-iron systems of the day which used their own bewildering character encodings). These days, you would expect to see base64 in this role.
I posted an answer to the follow-up question which shows how to remove uuencode blobs while reading from a filehandle or iterating over a bunch of lines of text.
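The general idea of dropping uuencode blobs while iterating over lines can be sketched as a small generator. This is not the code from the linked answer, only an illustration under stated assumptions: the name strip_uu_blobs is hypothetical, a blob is taken to start at a line of the form "begin <octal mode> <name>" and to run through the next line consisting solely of "end".

```python
import re

# A uuencode header looks like "begin 644 filename"; the mode is octal.
_BEGIN = re.compile(r"begin [0-7]{3,4} \S")


def strip_uu_blobs(lines):
    """Yield the lines of `lines`, skipping any uuencoded blobs.

    Hypothetical helper for illustration. Works on any iterable of
    strings, including an open text file.
    """
    in_blob = False
    for line in lines:
        stripped = line.rstrip("\r\n")
        if in_blob:
            # Discard blob lines; the "end" line closes the blob.
            if stripped == "end":
                in_blob = False
            continue
        if _BEGIN.match(stripped):
            in_blob = True
            continue
        yield line
```

Because it accepts any iterable of lines, you can pass it an open filehandle directly, for example outfile.writelines(strip_uu_blobs(infile)). One consequence of this design is that a blob with no closing "end" line swallows the rest of the input.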
Solution 2:
The problem can be solved efficiently with the sed command, as described here: sed command - apply in all text (.txt) files of folder