Tree
- Tree:
bebcf6d224d05a5d97c2371ec9d4d0e496b4d902
- Date:
- Message:
- got-notify-http: fix unicode handling

JSON strings are made of Unicode code points, of which only \, " and control characters have to be escaped, and the whole document MUST be encoded in UTF-8. The current code generates invalid strings for non-ASCII characters, so it has to be made UTF-8 aware.

tedu's isu8cont() can't be used since it allows surrogate pairs and overlong sequences, which will cause decoding errors on the receiving side. Similarly, mbtowc() depends on the current locale and could cause issues in -portable.

Instead, bundle Björn Höhrmann's "Flexible and Economical UTF-8 Decoder" and use it to parse the text. A decoding error results in the replacement character U+FFFD being emitted and the bytes considered so far being discarded; the decoder is then restarted with the next byte.

Git commit messages don't carry the notion of an encoding, but it's reasonable to expect UTF-8 (which is a superset of ASCII). For other, more exotic encodings, the commit id can be used to manually extract the data.

ok stsp@
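The escaping rules described in the message can be illustrated with a minimal sketch. This is not the code from the commit and not Björn Höhrmann's table-driven decoder: it uses a small hand-written strict decoder instead, and the names `u8dec`, `u8_feed` and `json_escape` are hypothetical. It follows a literal reading of the message, so on a decoding error every byte consumed so far, including the offending one, is dropped and replaced by a single U+FFFD. Emitting U+FFFD for a sequence truncated at the end of the input is an additional assumption. Unlike a DFA that rejects at the earliest possible byte, this sketch only detects overlong forms and surrogates once the sequence is complete, so the number of replacement characters it emits may differ from the real implementation.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

enum { U8_ACCEPT, U8_REJECT, U8_MORE };

/* Hypothetical decoder state; not the bundled decoder from the commit. */
struct u8dec {
	uint32_t cp;	/* code point accumulated so far */
	uint32_t min;	/* smallest code point valid for this length */
	int need;	/* continuation bytes still expected */
};

/* Feed one byte; the decoder is ready for a new sequence after a reject. */
static int
u8_feed(struct u8dec *d, unsigned char b)
{
	if (d->need == 0) {
		if (b < 0x80) {
			d->cp = b;
			return U8_ACCEPT;
		}
		if (b >= 0xc2 && b <= 0xdf) {
			d->cp = b & 0x1f; d->min = 0x80; d->need = 1;
		} else if ((b & 0xf0) == 0xe0) {
			d->cp = b & 0x0f; d->min = 0x800; d->need = 2;
		} else if (b >= 0xf0 && b <= 0xf4) {
			d->cp = b & 0x07; d->min = 0x10000; d->need = 3;
		} else
			return U8_REJECT; /* stray continuation, 0xc0/0xc1, > 0xf4 */
		return U8_MORE;
	}
	if ((b & 0xc0) != 0x80) {
		d->need = 0;
		return U8_REJECT;
	}
	d->cp = (d->cp << 6) | (b & 0x3f);
	if (--d->need > 0)
		return U8_MORE;
	/* Reject overlong forms, surrogates and values past U+10FFFF. */
	if (d->cp < d->min || d->cp > 0x10ffff ||
	    (d->cp >= 0xd800 && d->cp <= 0xdfff))
		return U8_REJECT;
	return U8_ACCEPT;
}

/*
 * Write s[0..len) to out as a quoted, NUL-terminated JSON string.
 * Returns 0 on success, -1 if out is too small.
 */
static int
json_escape(const char *s, size_t len, char *out, size_t outsz)
{
	struct u8dec d = { 0, 0, 0 };
	size_t i, start = 0, o = 0;

	if (outsz < 3)
		return -1;
	out[o++] = '"';
	for (i = 0; i < len; i++) {
		int st = u8_feed(&d, (unsigned char)s[i]);

		if (st == U8_MORE)
			continue;
		/* Worst case: 6-byte escape, closing quote and NUL. */
		if (outsz - o < 8)
			return -1;
		if (st == U8_REJECT) {
			memcpy(out + o, "\xef\xbf\xbd", 3); /* U+FFFD */
			o += 3;
		} else if (d.cp == '"' || d.cp == '\\') {
			out[o++] = '\\';
			out[o++] = (char)d.cp;
		} else if (d.cp < 0x20) {
			o += snprintf(out + o, 7, "\\u%04x", (unsigned)d.cp);
		} else {
			/* Valid sequence: copy the original bytes verbatim. */
			memcpy(out + o, s + start, i + 1 - start);
			o += i + 1 - start;
		}
		start = i + 1;
	}
	if (d.need > 0) {	/* assumption: truncated tail becomes U+FFFD */
		if (outsz - o < 5)
			return -1;
		memcpy(out + o, "\xef\xbf\xbd", 3);
		o += 3;
	}
	out[o++] = '"';
	out[o] = '\0';
	return 0;
}
```

Valid non-ASCII text such as "Björn" passes through byte for byte, since JSON only requires escaping \, " and the control characters below U+0020. Because the decoder never calls mbtowc() or consults the locale, its result depends only on the input bytes.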
Makefile.am
tog.1
tog.c