Dealing with Unicode and ASCII using Python
Dealing with character encodings is (sometimes) hard. It's especially confusing for those who've never done it before. Converting text from Unicode to ASCII can be tricky.
A lot of times, I'll import some data from a text file, and I just want to convert everything to ASCII and ignore anything that's not ASCII (like MS Word's smart quotes). Luckily, this is fairly easy when the data is a byte string (a Python 2 str, or bytes in Python 3):
    mystring = mystring.decode('ascii', 'ignore')
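As a minimal sketch of the same idea under Python 3 (an assumption; the one-liner above was written for Python 2 byte strings), the example below drops non-ASCII characters from both bytes and already-decoded text. The sample strings are made up for illustration.

```python
# Assumed sample: UTF-8 bytes containing Word-style smart quotes.
raw = "\u201cHello\u201d, world".encode("utf-8")

# Bytes to text: any byte outside the ASCII range is silently dropped.
ascii_text = raw.decode("ascii", "ignore")

# Text that is already a str has no .decode() in Python 3,
# so encode with 'ignore' first and then decode back to str.
text = "na\u00efve caf\u00e9"
ascii_only = text.encode("ascii", "ignore").decode("ascii")

print(ascii_text)  # Hello, world
print(ascii_only)  # nave caf
```

Note that 'ignore' deletes characters rather than replacing them, so accented letters simply vanish.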
There are tons of great Python resources (and code!) for all your character encoding needs. In no particular order, here are a few I've found:
- A Crash Course in Character Encoding
- Dive Into Python's Chapter on Unicode
- Beautiful Soup gives you Unicode, Dammit, and there's the companion: ASCII, Dammit
- There's also unaccent.py, which seems to convert various Unicode characters to their ASCII equivalents.
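For a rough idea of what converting characters to their ASCII equivalents can look like, here is a sketch using only the standard library's unicodedata module. It is not how unaccent.py is implemented (I haven't reproduced that script here), and the function name to_ascii_equivalent is a hypothetical one chosen for this example.

```python
import unicodedata


def to_ascii_equivalent(text):
    """Hypothetical helper: approximate text with plain ASCII characters."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # The combining marks are not ASCII, so 'ignore' drops them
    # and leaves the base letters behind.
    return decomposed.encode("ascii", "ignore").decode("ascii")


print(to_ascii_equivalent("na\u00efve caf\u00e9"))  # naive cafe
```

Characters with no decomposition, such as smart quotes, are still dropped rather than mapped to a plain quote, so a dedicated lookup table would be needed for those.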
There are probably more, but most of these have helped me get the job done.