| author | Chris McDonough <chrism@agendaless.com> | 2009-04-11 18:35:40 +0000 |
|---|---|---|
| committer | Chris McDonough <chrism@agendaless.com> | 2009-04-11 18:35:40 +0000 |
| commit | 77a146c26fab9594b4a401fc44f1ee5b8373bbea (patch) | |
| tree | be1270ae2f1b291760070a799d215175fa1230e7 /docs | |
| parent | 012f0e34d6e2f3238b0e5d16d045f292579d3822 (diff) | |
- The default request charset encoding is now ``utf-8``. As a result,
the request machinery will attempt to decode values from the utf-8
encoding to Unicode automatically when they are obtained via
``request.params``, ``request.GET``, and ``request.POST``. The
previous behavior of BFG was to return a bytestring when a value was
accessed in this manner. This change will break form handling code
in apps that rely on values from those APIs being considered
bytestrings. If you are manually decoding values from form
submissions in your application, you'll either need to change the
code that does that to expect Unicode values from
``request.params``, ``request.GET`` and ``request.POST``, or you'll
need to explicitly reenable the previous behavior. To reenable the
previous behavior, add the following to your application's
``configure.zcml``::

    <subscriber for="repoze.bfg.interfaces.INewRequest"
                handler="repoze.bfg.request.make_request_ascii"/>

See also the documentation in the "Views" chapter of the BFG docs
entitled "Using Views to Handle Form Submissions (Unicode and
Character Set Issues)".
Diffstat (limited to 'docs')
| -rw-r--r-- | docs/narr/views.rst | 125 |
1 files changed, 125 insertions, 0 deletions
diff --git a/docs/narr/views.rst b/docs/narr/views.rst
index 3fbe8ef60..f52b0619b 100644
--- a/docs/narr/views.rst
+++ b/docs/narr/views.rst
@@ -576,5 +576,130 @@ these will be resolved by the static view as you would expect.
 <http://pythonpaste.org/modules/urlparser.html>`_ for more information about ``urlparser.StaticURLParser``.
+Using Views to Handle Form Submissions (Unicode and Character Set Issues)
+-------------------------------------------------------------------------
+
+Most web applications need to accept form submissions from web
+browsers and various other clients. In :mod:`repoze.bfg`, form
+submission handling logic is always part of a :term:`view`. For a
+general overview of how to handle form submission data using the
+:term:`WebOb` API, see `"Query and POST variables" within the WebOb
+documentation
+<http://pythonpaste.org/webob/reference.html#query-post-variables>`_.
+:mod:`repoze.bfg` defers to WebOb for its request and response
+implementations, and handling form submission data is a property of
+the request implementation. Understanding WebOb's request API is the
+key to understanding how to process form submission data.
+
+There are some defaults that you need to be aware of when trying to
+handle form submission data in a :mod:`repoze.bfg` view. Because
+having high-order (non-ASCII) characters in data contained within form
+submissions is exceedingly common, and because the UTF-8 encoding is
+the most common encoding used on the web for non-ASCII character data,
+and because working and storing Unicode values is much saner than
+working with an storing bytestrings, :mod:`repoze.bfg` configures the
+:term:`WebOb` request machinery to attempt to decode form submission
+values into Unicode automatically from the UTF-8 character set
+implicitly. This implicit decoding happens when view code obtains
+form field values via the :term:`WebOb` ``request.params``,
+``request.GET``, or ``request.POST`` APIs.
+
+For example, let's assume that the following form page is served up to
+a browser client, and its ``action`` points at some :mod:`repoze.bfg`
+view code::
+
+.. code-block: xml
+
+   <html xmlns="http://www.w3.org/1999/xhtml">
+     <head>
+       <meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/>
+     </head>
+     <form method="POST" action="myview">
+       <div>
+         <input type="text" name="firstname"/>
+       </div>
+       <div>
+         <input type="text" name="lastname"/>
+       </div>
+       <input type="submit" value="Submit"/>
+     </form>
+   </html>
+
+The ``myview`` view code in the :mod:`repoze.bfg` application *must*
+expect that the values returned by ``request.params`` will be of type
+``unicode``, as opposed to type ``str``. The following will work to
+accept a form post from the above form:
+.. code-block:: python
+
+   def myview(context, request):
+       firstname = request.params['firstname']
+       lastname = request.params['lastname']
+
+But the following ``myview`` view code *may not* work, as it tries to
+decode already-decoded (``unicode``) values obtained from
+``request.params``:
+
+.. code-block:: python
+
+   def myview(context, request):
+       # the .decode('utf-8') will break below if there are any high-order
+       # characters in the firstname or lastname
+       firstname = request.params['firstname'].decode('utf-8')
+       lastname = request.params['lastname'].decode('utf-8')
+
+For implicit decoding to work reliably, you must ensure that every
+form you render that posts to a :mod:`repoze.bfg` view is rendered via
+a response that has a ``;charset=UTF-8`` in its ``Content-Type``
+header; or, as in the form above, with a ``meta http-equiv`` tag that
+implies that the charset is UTF-8 within the HTML ``head`` of the page
+containing the form.  This must be done explicitly because all known
+browser clients assume that they should encode form data in the
+character set implied by ``Content-Type`` value of the response
+containing the form when subsequently submitting that form; there is
+no other generally accepted way to tell browser clients which charset
+to use to encode form data. If you do not specify an encoding
+explicitly, the browser client will choose to encode form data in its
+default character set before submitting it. The browser client may
+have a non-UTF-8 default encoding. If such a request is handled by
+your view code, when the form submission data is encoded in a non-UTF8
+charset, eventually the WebOb request code accessed within your view
+will throw an error when it can't decode some high-order character
+encoded in another character set within form data e.g. when
+``request.params['somename']`` is accessed.
+
+If you are using the ``webob.Response`` class to generate a response,
+or if you use the ``render_template``* templating APIs, the UTF-8
+charset is set automatically as the default via the ``Content-Type``
+header. If you return a ``Content-Type`` header without an explicit
+charset, a WebOb request will add a ``;charset=utf-8`` trailer to the
+``Content-Type`` header value for you for response content types that
+are textual (e.g. ``text/html``, ``application/xml``, etc) as it is
+rendered. If you are using your own response object, you will need to
+ensure you do this yourself.
+
+To avoid implicit form submission value decoding, so that the values
+returned from ``request.params``, ``request.GET`` and ``request.POST``
+are returned as bytestrings rather than Unicode, add the following to
+your application's ``configure.zcml``::
+
+   <subscriber for="repoze.bfg.interfaces.INewRequest"
+               handler="repoze.bfg.request.make_request_ascii"/>
+
+You can then control form post data decoding "by hand" as necessary.
+For example, when this subscriber is active, the second example above
+will work unconditionally as long as you ensure that your forms are
+rendered in a request that has a ``;charset=utf-8`` stanza on its
+``Content-Type`` header.
+
+.. note:: The behavior that form values are decoded from UTF-8 to
+   Unicode implicitly was introduced in :mod:`repoze.bfg` 0.7.0.
+   Previous versions of :mod:`repoze.bfg` performed no implicit
+   decoding of form values (the default was to treat values as
+   bytestrings).
+
+.. note:: Only the *values* of request params obtained via
+   ``request.params``, ``request.GET`` or ``request.POST`` are decoded
+   to Unicode objects implicitly by :mod:`repoze.bfg`. The keys are
+   still strings.
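
The charset mismatch that the added documentation warns about can be reproduced with plain Python, without BFG or WebOb installed. The following is a minimal sketch, not the request machinery itself: `decode_form_value` is a hypothetical helper that stands in for the implicit UTF-8 decoding step, and the sample name is made up for illustration. It shows a UTF-8-encoded value decoding cleanly, and the same value encoded by a client that defaulted to Latin-1 failing to decode as UTF-8.

```python
# -*- coding: utf-8 -*-
# Sketch only: decode_form_value is a hypothetical stand-in for the
# implicit decoding step; it is not part of repoze.bfg or WebOb.


def decode_form_value(raw, charset="utf-8"):
    """Decode one raw (byte) form value using the given charset."""
    return raw.decode(charset)


# A made-up form value containing one high-order (non-ASCII) character.
name = u"Jos\u00e9"

# What a client sends when the form page declared charset=UTF-8.
utf8_bytes = name.encode("utf-8")

# What a client might send if it fell back to a Latin-1 default.
latin1_bytes = name.encode("latin-1")

# The UTF-8 submission round-trips to the original text.
decoded = decode_form_value(utf8_bytes)

# The Latin-1 submission cannot be decoded as UTF-8.
try:
    decode_form_value(latin1_bytes)
except UnicodeDecodeError as exc:
    mismatch_error = exc
else:
    mismatch_error = None
```

With the `make_request_ascii` subscriber active, a view receives the raw bytes and would perform a decode step of this kind itself, so the same failure applies if the form page was not served with a UTF-8 charset.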
