As of rev 5609, the unicode branch of Django has been merged into trunk. It's mostly transparent, they say, but i've had a few difficulties with it. Here are a couple of simple tips to help.
First, it seems to help if you convert your templates, and even python files, to UTF-8.
I of course use vim, and vim has very good unicode support. If your LANG is set correctly, vim should use the proper encoding; in my case, that's fr_CA.UTF-8. You can add set enc=utf-8 to your ~/.vimrc to make double-sure.
UTF-8 is neat, because up until you write a non-ascii character, the file will look, walk and talk like an ascii file. The second you enter a char which is outside the basic ascii range, that character gets written as a sequence of two or more bytes. Hence the -8 in UTF-8: it's built out of 8-bit units, but it can still encode any Unicode character.
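To make that concrete, here's a small sketch of how the bytes come out. It assumes Python 3 (which postdates this post), and utf8_bytes is just a name i made up for the illustration:

```python
# Sketch (Python 3): how UTF-8 stores ASCII vs. non-ASCII characters.
# utf8_bytes is a hypothetical helper, not part of any library.

def utf8_bytes(text):
    """Return the UTF-8 encoding of text as a list of byte values."""
    return list(text.encode("utf-8"))

ascii_only = "velo"
accented = "vélo"

# Pure ASCII text produces identical bytes under ASCII and UTF-8.
same_as_ascii = ascii_only.encode("utf-8") == ascii_only.encode("ascii")

print(same_as_ascii)           # True
print(utf8_bytes(ascii_only))  # 4 characters, 4 bytes
print(utf8_bytes(accented))    # 4 characters, 5 bytes: the é takes two
```

The é is the only character that costs more than one byte; everything else is byte-for-byte what an ascii file would contain.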
So, in the shell, make sure your file is actually UTF-8 by doing something like:
    esj@titan:520@~$ file yo
    yo: UTF-8 Unicode text
Note that if you haven't put any non-ascii characters in there, it'll still say ASCII.
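If you'd rather check from python, something along these lines gives a rough equivalent. It's a sketch assuming Python 3; describe_encoding is a hypothetical helper of mine, and it only tells ASCII, UTF-8 and "something else" apart, which is far less than what file does:

```python
# Sketch (Python 3): classify raw bytes as ASCII, UTF-8, or neither.
# describe_encoding is a hypothetical helper, not a standard function.

def describe_encoding(data):
    """Return 'ASCII', 'UTF-8' or 'not UTF-8' for a bytes object."""
    try:
        data.decode("ascii")
        return "ASCII"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "UTF-8"
    except UnicodeDecodeError:
        return "not UTF-8"

print(describe_encoding(b"hello"))                  # ASCII
print(describe_encoding("vélo".encode("utf-8")))    # UTF-8
print(describe_encoding("vélo".encode("latin-1")))  # not UTF-8
```

To run it on a real file, read the file in binary mode (open(path, "rb").read()) and pass the result in.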
Then, make sure the file is properly tagged inside as UTF-8. In python, add this line:
    # -*- coding: utf-8 -*-
as the first line of the file (or second, if your first is #!/usr/bin/env python).
For templates, make sure there's a header line in the resulting html that says
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
(django tells the browser the charset in the response headers anyway, but i guess that if the page gets saved to disk, there had better be a header inside it). If all this is done properly, you don't have to escape characters, like &eacute; for an é, in your python code, your templates, or your data.
One last thing that bit me in the rear: when django objects get translated into strings, like for example an entry from this blog:
    class Entry(models.Model):
        ...blabla...
        def __str__(self):
            return str(self.headline)
Or maybe you rely on headline's type (CharField) to translate itself into a str, and you don't call str at all. Either way, you'll get UnicodeError exceptions with non-ascii data. Since the merge, headline comes back as a unicode object, and turning it into a str goes through python's default codec, which is plain ascii. That codec throws its arms in the air and proclaims that it shouldn't be seeing characters >127 in the data. Do this instead:
    from django.utils.encoding import smart_str
    (...)
            return smart_str(self.headline)
and it works.
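You can see the underlying problem without Django at all. The sketch below assumes Python 3, where the conversion has to be spelled out as an explicit encode; to_bytestring is a hypothetical stand-in i wrote for the illustration, not Django's smart_str, and it only shows the general idea of picking UTF-8 explicitly instead of falling back to ascii:

```python
# Sketch (Python 3): why an ASCII encode fails and an explicit UTF-8 one works.
# to_bytestring is a hypothetical helper, NOT Django's smart_str.

def to_bytestring(value, encoding="utf-8"):
    """Encode a value to bytes with an explicit encoding; pass bytes through."""
    if isinstance(value, bytes):
        return value
    return str(value).encode(encoding)

headline = "Le vélo, c'est bien"

# Encoding through ASCII is what blows up on characters > 127.
try:
    headline.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

print(ascii_failed)             # True
print(to_bytestring(headline))  # UTF-8 bytes, no exception
```

The point is only that the encoding gets chosen on purpose; what the real smart_str does beyond that is best checked in Django's own docs.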
by wiswaud
on 7 August 2007
Tags:
django, english, geeky, python, web