RHEL5 treats a lone surrogate in UTF-8 text as legal!
On RHEL5 the behavior is wrong: iconv prints no error message for a lone surrogate:
$ echo -e 'P\xed\xa0\x80Q' | iconv -t UTF-8 >/dev/null
$
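The 3-byte sequence ed-a0-80 encodes U+D800, a lone high surrogate, which is illegal in UTF-8. But we do not get any message! Since no -f option is given, iconv takes the input encoding from the current locale, which is: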
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
On RHEL6 we get the expected message for the same command and locale as above:
iconv: illegal input sequence at position 1
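For reference, the claim that ed-a0-80 is U+D800 can be checked by hand with shell arithmetic. This is only a minimal sketch of the 3-byte UTF-8 decoding rule (1110xxxx 10xxxxxx 10xxxxxx); the helper names decode3 and is_surrogate are made up for illustration and are not part of iconv or glibc, and the script does not validate that the bytes are a well-formed lead/continuation sequence.

```shell
#!/bin/sh
# Decode a 3-byte UTF-8 sequence (1110xxxx 10xxxxxx 10xxxxxx) into a
# decimal code point. decode3 is a hypothetical helper for illustration;
# it does not check that the lead and continuation bytes are well formed.
decode3() {
    echo $(( (($1 & 0x0F) << 12) | (($2 & 0x3F) << 6) | ($3 & 0x3F) ))
}

# Exit status 0 if the code point lies in the surrogate range
# U+D800..U+DFFF (55296..57343), which UTF-8 must not encode.
is_surrogate() {
    [ "$1" -ge 55296 ] && [ "$1" -le 57343 ]
}

cp=$(decode3 0xED 0xA0 0x80)
printf 'U+%04X\n' "$cp"
if is_surrogate "$cp"; then
    echo "surrogate: not valid in UTF-8"
fi
```

Run as is, this prints U+D800 followed by the surrogate notice, which is consistent with the error that RHEL6 reports at position 1 (the offset of the ed byte after the leading P).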
Does anyone have any clues or a resolution for this case?