Convert Zend Framework project to UTF-8

If you still have a Zend Framework project lying around with mixed charsets, it’s now time to clean this up! If you decide to switch to Unicode, change the character set everywhere throughout your project. After that there should no longer be any need for conversions such as utf8_encode() or utf8_decode().
Here’s my quick step-by-step tutorial…

Convert all files to UTF-8

This is the first tricky part. There are tons of different ways to convert the character set of your files. You could simply open each file in your text editor, e.g. TextMate on OS X, and use "Save As..." to re-save it as UTF-8, one file at a time.
But let’s do it the easy way, I mean, the platform-independent way (well, in case you find bash and iconv on your Windows box :) )
I wrote a small script that converts your *.php files in-place using iconv:
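The script itself is not reproduced here. As a stand-in, here is a minimal sketch of the same idea written in PHP instead of bash. It assumes the iconv extension is loaded and that your old files are ISO-8859-1; the function names are made up for this example, and unlike the script described here it does not ask for confirmation per file.

```php
<?php
// Hypothetical helper: re-encode one file in place.
// Returns true only if the file was actually rewritten.
function convertFileToUtf8($path, $fromCharset = 'ISO-8859-1')
{
    $content = file_get_contents($path);
    if ($content === false) {
        return false;
    }
    // Content that is already valid UTF-8 (plain ASCII included) is left
    // alone, so running the conversion twice does not double-encode it.
    if (preg_match('//u', $content) === 1) {
        return false;
    }
    $converted = iconv($fromCharset, 'UTF-8', $content);
    if ($converted === false) {
        return false;
    }
    return file_put_contents($path, $converted) !== false;
}

// Hypothetical helper: walk a directory tree and convert every *.php file.
// Returns the list of files that were rewritten.
function convertTreeToUtf8($dir, $fromCharset = 'ISO-8859-1')
{
    $changed = array();
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if ($file->isFile() && $file->getExtension() === 'php') {
            if (convertFileToUtf8($file->getPathname(), $fromCharset)) {
                $changed[] = $file->getPathname();
            }
        }
    }
    return $changed;
}
```

Calling convertTreeToUtf8('.') from a small CLI script would process the current directory. iconv writes no byte order mark, so the result is UTF-8 without BOM. The validity check is a heuristic: a legacy file whose bytes happen to form valid UTF-8 is skipped. Work on a copy or a clean checkout so you can diff the result.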

You can now simply run the script in your current directory and change all files recursively by confirming each, e.g.:

Convert Database tables

Change your CREATE TABLE statements for new tables:
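A sketch of what such a statement can look like, kept in a PHP string so it can be handed to your database connection. The table and its columns are invented for illustration; only the table options at the end matter.

```php
<?php
// Hypothetical table: the DEFAULT CHARSET / COLLATE options are the point.
$createArticles = <<<SQL
CREATE TABLE articles (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    title VARCHAR(255) NOT NULL,
    PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_general_ci;
SQL;

// Pass the statement to whatever connection the project uses,
// e.g. $db->query($createArticles) on a Zend_Db adapter.
```

MySQL's utf8 stores at most three bytes per character. On MySQL 5.5.3 or later, utf8mb4 covers the full Unicode range, if you need that.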

That’s how you convert existing tables in a different character set:
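For existing tables, MySQL offers ALTER TABLE ... CONVERT TO CHARACTER SET. The sketch below only builds the statements for a list of table names so you can review them before running anything; the function name and the default collation are assumptions for this example.

```php
<?php
// Hypothetical helper: build one ALTER TABLE statement per table name.
function buildConvertTableSql(array $tables, $charset = 'utf8', $collation = 'utf8_general_ci')
{
    $statements = array();
    foreach ($tables as $table) {
        // Backticks guard reserved words; embedded backticks are doubled.
        $quoted = '`' . str_replace('`', '``', $table) . '`';
        $statements[] = "ALTER TABLE $quoted CONVERT TO CHARACTER SET $charset COLLATE $collation";
    }
    return $statements;
}

foreach (buildConvertTableSql(array('articles', 'users')) as $sql) {
    echo $sql, ";\n";
}
```

CONVERT TO re-encodes the column data, so it assumes the data is stored correctly in the table's current character set. If UTF-8 bytes were previously written into latin1 columns, the data ends up double-encoded. Take a backup first.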

Layout script

Add the following meta-tag to your layout script (usually located in /application/views/layouts/layout.phtml):
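The tag in question is the usual Content-Type declaration. A minimal excerpt of how it could sit in the layout script; the surrounding markup is a placeholder, and your layout will contain more than this:

```php
<?php /* application/views/layouts/layout.phtml (excerpt) */ ?>
<head>
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
```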

Bootstrap definitions

Set the correct charset for your database connection in the database adapter used by your ZF project:
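A minimal sketch for a MySQL connection through Zend_Db, assuming Zend Framework is on your include path. Host, credentials and database name are placeholders.

```php
<?php
require_once 'Zend/Db.php';

// Connection values are placeholders for this example.
$db = Zend_Db::factory('Pdo_Mysql', array(
    'host'     => 'localhost',
    'username' => 'dbuser',
    'password' => 'secret',
    'dbname'   => 'myproject',
    'charset'  => 'utf8',   // honoured by ZF 1.8 and later
));

// On older ZF versions, issue the statement yourself instead:
// $db->query('SET NAMES utf8');
```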

Also set the encoding for your Zend_View object:
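Roughly like this in your bootstrap, again assuming Zend Framework is on the include path. How the view object is created depends on your setup; registering it with the viewRenderer helper is one common way.

```php
<?php
require_once 'Zend/View.php';
require_once 'Zend/Controller/Action/HelperBroker.php';

$view = new Zend_View();
$view->setEncoding('UTF-8');   // picked up by $view->escape()

// Hand the configured view to the viewRenderer action helper.
$viewRenderer = Zend_Controller_Action_HelperBroker::getStaticHelper('viewRenderer');
$viewRenderer->setView($view);
```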

form-tags / Zend_Form

Change your form-tags and add accept-charset="utf-8":

If you’re using Zend_Form, specify this attribute as follows:
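With Zend_Form the attribute can be set like any other form attribute. A short sketch, assuming Zend Framework is on the include path (a bare form is used here, your own form class works the same way):

```php
<?php
require_once 'Zend/Form.php';

$form = new Zend_Form();
$form->setMethod('post')
     ->setAttrib('accept-charset', 'utf-8');
```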


Zend_Mail also has to be aware of your charset:
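Zend_Mail takes the charset as its constructor argument. A minimal sketch with placeholder addresses and texts; note that the file containing these strings has to be saved as UTF-8 as well.

```php
<?php
require_once 'Zend/Mail.php';

$mail = new Zend_Mail('utf-8');
$mail->setFrom('noreply@example.com', 'Jörg Example')
     ->addTo('reader@example.com')
     ->setSubject('Grüße aus Zürich')
     ->setBodyText('Plain-text body with umlauts: äöü');

// $mail->send();   // left out here: needs a configured mail transport
```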

Final Cleanup

As final cleanup, do some search-replace:
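The exact search patterns are not shown here, so the following is only a guess at useful ones: leftover utf8_encode()/utf8_decode() calls and hard-coded ISO-8859-1 strings. The sketch just reports matches with file and line number and leaves the replacing to you; the function name is made up for this example.

```php
<?php
// Hypothetical helper: list lines in *.php / *.phtml files that still
// mention the old charset or the conversion functions.
function findCharsetLeftovers($dir)
{
    $patterns = array(
        '/\butf8_(?:en|de)code\s*\(/i',
        '/iso-8859-1/i',
    );
    $hits = array();
    $files = new RecursiveIteratorIterator(
        new RecursiveDirectoryIterator($dir, FilesystemIterator::SKIP_DOTS)
    );
    foreach ($files as $file) {
        if (!$file->isFile() || !in_array($file->getExtension(), array('php', 'phtml'), true)) {
            continue;
        }
        foreach (file($file->getPathname()) as $index => $line) {
            foreach ($patterns as $pattern) {
                if (preg_match($pattern, $line)) {
                    // Report each line once, even if several patterns match.
                    $hits[] = $file->getPathname() . ':' . ($index + 1) . ': ' . trim($line);
                    break;
                }
            }
        }
    }
    return $hits;
}
```

Printing the result of findCharsetLeftovers('application') gives you a checklist to work through by hand.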

7 Responses

  1. ProTom
    Jun 25, 2009 - 08:55 PM

    Super instructions. Helped me very much to convert an existing project to UTF-8. I had a hard time with the conversion until I found out that I have to use UTF-8 (no BOM) for all text sources.

  2. Darryl
    Aug 11, 2009 - 09:56 PM

    Thanks! I always forget about the dbAdapter part!

  3. Joe Devon
    Nov 28, 2009 - 02:56 AM

    Nice tutorial. Let me add a few things.

    Default charset in apache:

    Default charset in php.ini:

    Default CSS charset:

    FYI on header charsets vs meta charsets:
    “The HTTP header is the preferred method, and it overrides the tag if present.”

    Re the form charset:
    Note: The accept-charset attribute does not work properly in Internet Explorer. If accept-charset="ISO-8859-1", IE will send data encoded as "Windows-1252".
    (but the good side is this is NOT a security issue)

    Last but not least, instead of issuing a SET NAMES query yourself (which means a call to the DB whether the request needs a connection or not), as of ZF 1.8 this will work in your app.ini:
    db.params.charset = utf8

    If you’re running an older version, go with this:
    db.params.driver_options.1002 = "SET NAMES utf8"


  4. Florian
    Mar 15, 2010 - 10:17 AM


  5. riedi
    Jul 04, 2010 - 09:11 PM

    Thank you, very helpful!
    Please note that you have to set the charset manually if you call htmlentities() yourself:

    htmlentities($element->getLabel(), ENT_COMPAT, 'UTF-8');

  6. Klaus
    Jul 17, 2011 - 11:52 AM

    This is fabulous!
    Saved my life and hours of work.
    Greetz Klaus

  7. Slava
    Feb 06, 2012 - 03:52 PM

    And what if it fails with an error like this:
     line 1′ in /home/~/library/Zend/Controller/Response/Abstract.php:282 Stack trace: #0 /home/~/library/Zend/Controller/Response/Abstract.php(300): Zend_Controller_Response_Abstract->canSendHeaders(true) #1 /home/~/library/Zend/Controller/Response/Abstract.php(728): Zend_Controller_Response_Abstract->sendHeaders() #2 /home/~/library/Zend/Controller/Front.php(984): Zend_Controller_Response_Abstract->sendResponse() #3 /home/~/library/Zend/Application/Bootstrap/Bootstrap.php(77): Zend_Controller_Front->dispatch() #4 /home/~/library/Zend/Application.php(335): Zend_Application_Bootstrap_Bootstrap->run() #5 /home/~/public/index.php(31): Zend_Application->run() #6 {main} thrown in /home/~/library/Zend/Controller/Response/Abstract.php on line 282
    If I switch back to ASCII or to UTF-8 with BOM, everything works.
