26 items tagged “unicode”
ftfy—fix unicode that’s broken in various ways (via) I shipped a small web UI wrapper around the excellent Python FTFY library, which can take broken unicode strings and suggest a sequence of operations that can be applied to get back sensible text. # 9th January 2018, 3:22 am
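To give a feel for what one such repair operation might look like, here is a minimal standard-library sketch. It is not ftfy's algorithm or API; it only reverses a single common kind of breakage (UTF-8 bytes that were wrongly decoded as Latin-1), and the helper name `repair_mojibake` is made up for illustration.

```python
def repair_mojibake(text: str) -> str:
    """Undo one kind of breakage: UTF-8 bytes wrongly decoded as Latin-1.

    Hypothetical helper for illustration only; returns the input
    unchanged when the reversal does not apply.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


original = "café"
# Simulate the breakage: encode as UTF-8, then decode with the wrong codec.
broken = original.encode("utf-8").decode("latin-1")

print(broken)
print(repair_mojibake(broken))
```

Real-world breakage often stacks several wrong decodings on top of each other, which is why a tool that can propose a whole sequence of such steps is useful.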
I’m concerned that this character will open the floodgates for an open-ended set of PILE OF POO emoji with emotions, such as CRYING PILE OF POO, PILE OF POO WITH LOOK OF TRIUMPH, PILE OF POO SCREAMING IN FEAR, etc. Is there really any need to add a range of emotions to PILE OF POO? I personally think that changing PILE OF POO to a de facto SMILING PILE OF POO was wrong, but adding FROWNING PILE OF POO as a counterpart is even worse. If this is accepted then there will be no neutral, expressionless PILE OF POO, so at least a PILE OF POO WITH NO FACE would be required to be encoded to restore some balance.
The idea that our 5 committees would sanction further cute graphic characters based on this should embarrass absolutely everyone who votes yes on such an excrescence. Will we have a CRYING PILE OF POO next? PILE OF POO WITH TONGUE STICKING OUT? PILE OF POO WITH QUESTION MARKS FOR EYES? PILE OF POO WITH KARAOKE MIC? Will we have to encode a neutral FACELESS PILE OF POO?
Check out “The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)” by Joel Spolsky: http://www.joelonsoftware.com/ar...[... 55 words]
Reexamining Python 3 Text I/O. Python 3.1’s IO performance is a huge improvement over 3.0, but still considerably slower than 2.6. It turns out it’s all to do with Python 3’s unicode support: when you read a file into a string, you’re asking Python to decode the bytes from UTF-8 (the new default encoding) into unicode at the same time. If you open the file in binary mode, Python 3 will read raw bytes into a bytestring instead, avoiding the conversion overhead and performing only 4% slower than the equivalent code in Python 2.6.4. # 28th January 2010, 1:28 pm
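The difference between the two modes can be sketched as follows. This is an illustration of the behaviour described above, not the benchmark from the linked article; the sample text and temporary file are assumptions made for the example, and the encoding is passed explicitly rather than relying on any default.

```python
import os
import tempfile

payload = "naïve café ☃\n" * 3

# Write some UTF-8 encoded bytes to a temporary file.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(payload.encode("utf-8"))
    path = tmp.name

try:
    # Text mode: the bytes are decoded into a str while reading.
    with open(path, "r", encoding="utf-8", newline="") as f:
        as_text = f.read()

    # Binary mode: raw bytes come back, with no decoding step.
    with open(path, "rb") as f:
        as_bytes = f.read()
finally:
    os.remove(path)

print(type(as_text).__name__, len(as_text))
print(type(as_bytes).__name__, len(as_bytes))
```

The byte count is larger than the character count because the non-ASCII characters take more than one byte each in UTF-8.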
Understanding Bidirectional (BIDI) Text in Unicode. It turns out you need to sanitise user input to ensure there are no unicode characters that switch your site’s regular text to RTL. # 15th March 2009, 4:37 am
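One way to approach that sanitisation is to strip the explicit directional control characters from user input. The sketch below is an assumption about how this could be done, not the method from the linked article; the name `strip_bidi_controls` is hypothetical, and the character list covers the explicit bidi formatting characters I am aware of.

```python
# Explicit bidirectional formatting characters.
BIDI_CONTROLS = {
    "\u061c",  # ARABIC LETTER MARK
    "\u200e",  # LEFT-TO-RIGHT MARK
    "\u200f",  # RIGHT-TO-LEFT MARK
    "\u202a",  # LEFT-TO-RIGHT EMBEDDING
    "\u202b",  # RIGHT-TO-LEFT EMBEDDING
    "\u202c",  # POP DIRECTIONAL FORMATTING
    "\u202d",  # LEFT-TO-RIGHT OVERRIDE
    "\u202e",  # RIGHT-TO-LEFT OVERRIDE
    "\u2066",  # LEFT-TO-RIGHT ISOLATE
    "\u2067",  # RIGHT-TO-LEFT ISOLATE
    "\u2068",  # FIRST STRONG ISOLATE
    "\u2069",  # POP DIRECTIONAL ISOLATE
}


def strip_bidi_controls(text: str) -> str:
    """Remove explicit bidi control characters; ordinary letters are kept."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


comment = "nice post \u202egnihtemos"
print(repr(strip_bidi_controls(comment)))
```

Stripping these characters leaves genuine right-to-left text such as Hebrew or Arabic letters intact, although it can change how mixed-direction text is laid out, so a site with many RTL users may prefer to balance or isolate the controls instead.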
UnicodeDictWriter—write unicode strings out to Excel compatible CSV files using Python. Stuart Langridge and I spent quite a while this morning battling with Excel. The magic combination for storing unicode text in a CSV file such that Excel correctly reads it is UTF-16, a byte order mark and tab delimiters rather than commas. # 20th August 2008, 12:19 pm
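In modern Python 3 the same combination can be sketched with the standard `csv` module. This is not the original UnicodeDictWriter class (which targeted Python 2); the sample rows and temporary file are made up for the example. Python's `utf-16` codec writes a byte order mark automatically, using the machine's native byte order.

```python
import codecs
import csv
import os
import tempfile

rows = [
    {"name": "Zoë", "city": "Kraków"},
    {"name": "村上", "city": "東京"},
]

fd, path = tempfile.mkstemp(suffix=".csv")
os.close(fd)

# UTF-16 (BOM written by the codec) plus tab delimiters instead of commas.
with open(path, "w", encoding="utf-16", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["name", "city"], delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)

# Read the raw bytes back to check what was actually written.
with open(path, "rb") as f:
    raw = f.read()
os.remove(path)

has_bom = raw[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)
decoded = raw.decode("utf-16")

print(has_bom)
print(decoded)
```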
PortingDjangoTo3k. Martin von Loewis has started assembling a patch. His write-up illustrates some key differences between Python 2.X and Python 3—it looks like Django’s unicode handling is going to require the most work. # 19th June 2008, 5:53 pm
Sam Ruby: Ruby 1.9 Strings—Updated. A follow-up to yesterday’s post: Sam’s principal complaints about Ruby 1.9’s character encoding support were down to a bug which has now been fixed. # 29th December 2007, 7:34 pm
I definitely like Python 3K’s Unicode support better [...] In fact, I think I prefer Ruby 1.8’s non-support for Unicode over Ruby 1.9’s “support”. The problem is one that is all too familiar to Python programmers. You can have a fully unit tested library and have somebody pass you a bad string, and you will fall over.
Ruby 1.9—Right for You? Dave Thomas on the just-released Ruby 1.9. It’s a development release that breaks backwards compatibility in a few minor ways, but new features include the YARV virtual machine (hence significant speed improvements) and unicode support via associating encodings with bytestrings. # 26th December 2007, 12:09 pm
The larger question is why on earth, in 2007 and ten years after XML came out, we are still using text files that don’t label their encoding?
Sam Ruby: 2to3. Sam’s report on an attempt to port the Universal Feed Parser to Python 3.0. The 2to3 tool does most of the work, but it seems the unicode changes can be pretty tricky. # 3rd September 2007, 1:38 am
UnicodeBranch: Porting Applications. A checklist for porting Django applications to handle the new unicode changes. If your application only handles ASCII text at the moment you shouldn’t have to change a thing. # 4th July 2007, 2:41 pm
Translations of My hovercraft is full of eels in many languages (via) Great for unicode testing. # 27th April 2007, 11:14 am