Something strange happened to one of my UTF-8 encoded XML files. My Ubuntu 14.04 desktop now treats it as a binary file, and every editor displays it as a mass of "strange" characters. Here is my case:
k6ps@laptop520:~/Allalaadimised/File_problem$ ll
kokku 308
drwxrwxr-x 2 k6ps k6ps 4096 dets 15 11:02 ./
drwxr-xr-x 5 k6ps k6ps 20480 dets 15 10:58 ../
-rw-r--r-- 1 k6ps k6ps 134587 dets 15 10:58 bad_file.xml
-rw-r--r-- 1 k6ps k6ps 131930 dets 15 10:58 good_file.xml
k6ps@laptop520:~/Allalaadimised/File_problem$ file -bi good_file.xml
application/xml; charset=utf-8
k6ps@laptop520:~/Allalaadimised/File_problem$ file -bi bad_file.xml
application/octet-stream; charset=binary
k6ps@laptop520:~/Allalaadimised/File_problem$ head -n 3 good_file.xml
<?xml version="1.0" encoding="UTF-8"?>
<logbook>
<threadset name="First">
k6ps@laptop520:~/Allalaadimised/File_problem$ head -n 3 bad_file.xml
|I��+ˮ���|+��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�" )��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��]֊ՙ�z�")��l��
... and a lot more characters like these. When I open the file in vi or SciTE, I get lots of characters like these:
|I^[ ß+Ë®ýþö|+ÆÜ^Pl<8c>ò]Ö<8a>Õ<99><98>z»"^\)<9d>Ãl<8c>ò]Ö<8a>Õ<99><98>z»"^
\)<9d>Ãl<8c>ò]Ö<8a>Õ<99><98>z»"^\)<9d>Ãl<8c>ò]Ö<8a>Õ<99><98>z»"^\)<9d>Ãl<8c>ò]Ö<8a>Õ<99>
<98>z»"^\)<9d>Ãl<8c>ò]Ö<8a>Õ<99><98>z»"^\)<9d>Ãl<8c>ò]Ö<8a>Õ<99><98>z»"^<98>z»"^\)<9d
... and at the bottom it says:
"bad_file.xml" [Incomplete last line][converted] 138 lines, 214920 characters
Hexdump output:
k6ps@laptop520:~/Allalaadimised/File_problem$ hexdump -C bad_file.xml | head -n 15
00000000 7c 49 1b a0 df 2b cb ae fd fe f6 7c 2b c6 dc 10 ||I...+.....|+...|
00000010 6c 8c f2 5d d6 8a d5 99 98 7a bb 22 1c 29 9d c3 |l..].....z.".)..|
*
00001000 5a ea 54 45 9b f8 9e ce 16 35 89 bd 8f 08 cb 82 |Z.TE.....5......|
00001010 6c 8c f2 5d d6 8a d5 99 98 7a bb 22 1c 29 9d c3 |l..].....z.".)..|
*
00002000 29 b8 f0 21 4a ea 00 19 28 46 53 c5 d1 73 f5 a9 |)..!J...(FS..s..|
00002010 6c 8c f2 5d d6 8a d5 99 98 7a bb 22 1c 29 9d c3 |l..].....z.".)..|
*
00003000 5c 56 80 41 f9 ef 98 3c e3 7e 7c ee 3a 20 94 82 |\V.A...<.~|.: ..|
00003010 6c 8c f2 5d d6 8a d5 99 98 7a bb 22 1c 29 9d c3 |l..].....z.".)..|
*
00004000 ad cc 1c 5f 40 22 8b f6 9b bb aa ea 45 de 21 ee |..._@"......E.!.|
00004010 6c 8c f2 5d d6 8a d5 99 98 7a bb 22 1c 29 9d c3 |l..].....z.".)..|
*
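In the hexdump above, each `*` line means that hexdump collapsed a run of lines identical to the one before it, so most 16-byte rows within each 0x1000-byte stretch appear to be the same. A minimal Python sketch for quantifying that is shown below. The function name `summarize_pages` and the 4096-byte page and 16-byte block sizes are assumptions read off the dump offsets, not anything the file format is known to guarantee.

```python
from collections import Counter


def summarize_pages(data: bytes, page_size: int = 4096, block_size: int = 16):
    """Return one (offset, total_blocks, most_repeated_count) tuple per page.

    page_size and block_size are assumptions taken from the offsets
    visible in the hexdump (0x1000 steps, 16 bytes per row).
    """
    summary = []
    for offset in range(0, len(data), page_size):
        page = data[offset:offset + page_size]
        # Split the page into fixed-size blocks, one per hexdump row.
        blocks = [page[i:i + block_size] for i in range(0, len(page), block_size)]
        # Count how often the most frequent block occurs in this page.
        _, repeats = Counter(blocks).most_common(1)[0]
        summary.append((offset, len(blocks), repeats))
    return summary
```

To run it against the damaged file, one could read it in binary mode, for example `summarize_pages(open("bad_file.xml", "rb").read())`, and print each tuple. A page that is almost entirely one repeated block carries very little distinct data, which is useful to know before attempting any recovery.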
I've tried opening the file in various editors with different encodings and converting it with iconv, but no luck so far. Unfortunately I'm very inexperienced with system-level issues, so could anybody please suggest what I could try in order to recover the text from that file?
k6ps
Try strings bad_file.xml to dump the bare strings. If you don't see any, the file might be damaged in a way no one can recover (at least without knowing what happened to it).
/home/k6ps/.Private ecryptfs 472157480 407729008 40421204 91% /home/k6ps
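If the strings tool is not at hand, the same idea can be approximated in a few lines of Python. This is only a rough sketch of what strings does (runs of printable ASCII of at least four characters, which is its default minimum length); the function name `printable_runs` is made up for illustration.

```python
import re


def printable_runs(data: bytes, min_len: int = 4):
    """Return runs of printable ASCII bytes that are at least min_len long."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]
```

Calling `printable_runs(open("bad_file.xml", "rb").read())` returns a list; if it comes back empty or contains only short random-looking fragments, no readable XML text survives in the file as it currently is on disk.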