| author | Chris McDonough <chrism@agendaless.com> | 2009-02-06 03:53:10 +0000 |
|---|---|---|
| committer | Chris McDonough <chrism@agendaless.com> | 2009-02-06 03:53:10 +0000 |
| commit | a97a702c5c3b1494028f32ffc837e35e3e8f606e (patch) | |
| tree | f9c02669438cfecf7a28dcd66eb6881978f66ba1 | |
| parent | 980e5261d9d5221f9d88b9e46918ec3a7735f9e0 (diff) | |
Revert my decision to make ``model_path`` return a tuple; it
still returns a string, albeit a quoted one. An additional API
(``model_path_tuple``) now also exists, which can be used to get a
model path as a tuple.
- The ``repoze.bfg.traversal.model_path`` API now returns a *quoted*
string rather than a string represented by a series of unquoted
elements joined via ``/`` characters. Previously it returned a
string or unicode object representing the model path, with each
segment name in the path joined together via ``/`` characters,
e.g. ``/foo /bar``. Now it returns a string in which each segment is
a UTF-8 encoded and URL-quoted element, e.g. ``/foo%20/bar``. This
change was (as discussed briefly on the repoze-dev mailing list)
necessary to accommodate model objects which themselves have
``__name__`` attributes that contain the ``/`` character.
If none of the models in your model graph have ``__name__``
attributes that contain high-order Unicode characters or values that
require URL-quoting, this change won't cause any issues. However, if
you have code that currently expects ``model_path`` to return an
unquoted string, or you have an existing application with data
generated via the old method, and you're too lazy to change
anything, you may wish to replace the BFG-imported ``model_path`` in
your code with this function (this is the code of the "old"
``model_path`` implementation)::
    from repoze.bfg.location import lineage

    def i_am_too_lazy_to_move_to_the_new_model_path(model, *elements):
        rpath = []
        for location in lineage(model):
            if location.__name__:
                rpath.append(location.__name__)
        path = '/' + '/'.join(reversed(rpath))
        if elements:
            suffix = '/'.join(elements)
            path = '/'.join([path, suffix])
        return path
- The ``repoze.bfg.traversal.find_model`` API no longer implicitly
converts a full path passed to it as a Unicode object into a UTF-8
string. Callers should either use prequoted path strings returned by
``repoze.bfg.traversal.model_path``, or tuple values returned by
``repoze.bfg.traversal.model_path_tuple``, or they should follow the
guidelines about passing a string ``path`` argument described in the
``find_model`` API documentation.
- Each argument contained in ``elements`` passed to
``repoze.bfg.traversal.model_path`` will now have any ``/``
characters contained within quoted to ``%2F`` in the returned
string. Previously, ``/`` characters in elements were left unquoted
(a bug).
- A ``repoze.bfg.traversal.model_path_tuple`` API was added. This API
is an alternative to ``model_path`` (which returns a string);
``model_path_tuple`` returns a model path as a tuple (much like
Zope's ``getPhysicalPath``).
- A ``repoze.bfg.traversal.quote_path_segment`` API was added. This
API will quote an individual path segment (string or unicode
object). See the ``repoze.bfg.traversal`` API documentation for
more information.
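The relationship between the three APIs described above can be sketched as follows. This is a minimal illustration written for Python 3, not the code from this commit: it assumes the standard library's ``urllib.parse.quote`` as a stand-in for the module's own ``_url_quote``, it omits the segment cache, and the ``Node`` class is a hypothetical location-aware model invented for the example.

```python
from urllib.parse import quote


class Node:
    """Hypothetical location-aware model with __name__ and __parent__."""

    def __init__(self, name=None, parent=None):
        self.__name__ = name
        self.__parent__ = parent


def quote_path_segment(segment):
    # safe='' makes quote() escape '/' as %2F as well; a str segment is
    # encoded as UTF-8 before quoting (the default for urllib.parse.quote).
    return quote(segment, safe='')


def model_path_tuple(model, *elements):
    # Walk up to the root collecting names; the root's own name is ignored.
    names = []
    node = model
    while node.__parent__ is not None:
        names.append(node.__name__)
        node = node.__parent__
    # The leading empty string marks the path as absolute.
    return ('',) + tuple(reversed(names)) + elements


def model_path(model, *elements):
    # Quote each segment separately, then join, so that a '/' inside a
    # name can never be confused with a segment separator.
    quoted = [quote_path_segment(s) for s in model_path_tuple(model, *elements)]
    return '/'.join(quoted) or '/'


root = Node()
foo = Node('foo ', root)
bar = Node('bar', foo)
print(model_path_tuple(bar, 'a/b'))  # ('', 'foo ', 'bar', 'a/b')
print(model_path(bar, 'a/b'))        # /foo%20/bar/a%2Fb
```

Quoting per segment before joining is what lets the string form round-trip: the only unquoted ``/`` characters left in the result are separators.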
| -rw-r--r-- | CHANGES.txt | 84 |
| -rw-r--r-- | docs/api/traversal.rst | 11 |
| -rw-r--r-- | repoze/bfg/tests/test_traversal.py | 55 |
| -rw-r--r-- | repoze/bfg/tests/test_url.py | 2 |
| -rw-r--r-- | repoze/bfg/traversal.py | 213 |
| -rw-r--r-- | repoze/bfg/url.py | 72 |
6 files changed, 310 insertions, 127 deletions
diff --git a/CHANGES.txt b/CHANGES.txt index c240b4736..67d0adffa 100644 --- a/CHANGES.txt +++ b/CHANGES.txt @@ -1,32 +1,78 @@ -Next release -============ +Next Release +================== Backwards Incompatibilities --------------------------- -- The ``repoze.bfg.traversal.model_path`` API now returns a tuple - instead of a string. Previously it returned a string representing - the model path, with each segment name in the path joined together - via ``/`` characters, e.g. ``/foo/bar``. Now it returns a tuple, - where each segment is an element in the tuple e.g. ``('', 'foo', - 'bar')`` (the leading empty element indicates that this path is - absolute). This change was (as discussed on the repoze-dev - maillist) necessary to accomodate model objects which themselves - have names that contain the ``/`` character. See the API - documentation for ``repoze.bfg.traversal.model_path`` for more - information. +- The ``repoze.bfg.traversal.model_path`` API now returns a *quoted* + string rather than a string represented by series of unquoted + elements joined via ``/`` characters. Previously it returned a + string or unicode object representing the model path, with each + segment name in the path joined together via ``/`` characters, + e.g. ``/foo /bar``. Now it returns a string, where each segment is + a UTF-8 encoded and URL-quoted element e.g. ``/foo%20/bar``. This + change was (as discussed briefly on the repoze-dev maillist) + necessary to accomodate model objects which themselves have + ``__name__`` attributes that contain the ``/`` character. + + For people that have no models that have high-order Unicode + ``__name__`` attributes or ``__name__`` attributes with values that + require URL-quoting with in their model graphs, this won't cause any + issue. 
However, if you have code that currently expects + ``model_path`` to return an unquoted string, or you have an existing + application with data generated via the old method, and you're too + lazy to change anything, you may wish replace the BFG-imported + ``model_path`` in your code with this function (this is the code of + the "old" ``model_path`` implementation):: + + from repoze.bfg.location import lineage + + def i_am_too_lazy_to_move_to_the_new_model_path(model, *elements): + rpath = [] + for location in lineage(model): + if location.__name__: + rpath.append(location.__name__) + path = '/' + '/'.join(reversed(rpath)) + if elements: + suffix = '/'.join(elements) + path = '/'.join([path, suffix]) + return path - The ``repoze.bfg.traversal.find_model`` API no longer implicitly - converts unicode path representations into a UTF-8 string. Callers - should either use path tuples or use the guidelines about passing a - string ``path`` argument described in its API documentation. + converts unicode representations of a full path passed to it as a + Unicode object into a UTF-8 string. Callers should either use + prequoted path strings returned by + ``repoze.bfg.traversal.model_path``, or tuple values returned by the + result of ``repoze.bfg.traversal.model_path_tuple`` or they should + use the guidelines about passing a string ``path`` argument + described in the ``find_model`` API documentation. + +Bugfixes +-------- + +- Each argument contained in ``elements`` passed to + ``repoze.bfg.traversal.model_path`` will now have any ``/`` + characters contained within quoted to ``%2F`` in the returned + string. Previously, ``/`` characters in elements were left unquoted + (a bug). Features -------- -- The ``find_model`` API now accepts "path tuples" (see the above note - regarding ``model_path``) as well as string path representations as - a ``path`` argument. +- A ``repoze.bfg.traversal.model_path_tuple`` API was added. 
This API + is an alternative to ``model_path`` (which returns a string); + ``model_path_tuple`` returns a model path as a tuple (much like + Zope's ``getPhysicalPath``). + +- A ``repoze.bfg.traversal.quote_path_segment`` API was added. This + API will quote an individual path segment (string or unicode + object). See the ``repoze.bfg.traversal`` API documentation for + more information. + +- The ``repoze.bfg.traversal.find_model`` API now accepts "path + tuples" (see the above note regarding ``model_path_tuple``) as well + as string path representations (from + ``repoze.bfg.traversal.model_path``) as a ``path`` argument. - Add ` `renderer`` argument (defaulting to None) to ``repoze.bfg.testing.registerDummyRenderer``. This makes it diff --git a/docs/api/traversal.rst b/docs/api/traversal.rst index 1fcc9a2c6..dfb4ac77f 100644 --- a/docs/api/traversal.rst +++ b/docs/api/traversal.rst @@ -13,10 +13,19 @@ .. autofunction:: model_path - .. autofunction:: traversal_path(path) + .. autofunction:: model_path_string + + .. autofunction:: quote_path_segment .. autofunction:: virtual_root .. note:: A function named ``model_url`` used to be present in this module. It was moved to :ref:`url_module` in version 0.6.1. +Secondary APIs +~~~~~~~~~~~~~~ + +.. automodule:: repoze.bfg.traversal + + .. 
autofunction:: traversal_path(path) + diff --git a/repoze/bfg/tests/test_traversal.py b/repoze/bfg/tests/test_traversal.py index 1486e5b91..ebac680d4 100644 --- a/repoze/bfg/tests/test_traversal.py +++ b/repoze/bfg/tests/test_traversal.py @@ -412,6 +412,46 @@ class ModelPathTests(unittest.TestCase): baz.__parent__ = bar baz.__name__ = 'baz' result = self._callFUT(baz, 'this/theotherthing', 'that') + self.assertEqual(result, '/foo%20/bar/baz/this%2Ftheotherthing/that') + + def test_root_default(self): + root = DummyContext() + root.__parent__ = None + root.__name__ = None + request = DummyRequest() + result = self._callFUT(root) + self.assertEqual(result, '/') + + def test_nonroot_default(self): + root = DummyContext() + root.__parent__ = None + root.__name__ = None + other = DummyContext() + other.__parent__ = root + other.__name__ = 'other' + request = DummyRequest() + result = self._callFUT(other) + self.assertEqual(result, '/other') + +class ModelPathTupleTests(unittest.TestCase): + def _callFUT(self, model, *elements): + from repoze.bfg.traversal import model_path_tuple + return model_path_tuple(model, *elements) + + def test_it(self): + baz = DummyContext() + bar = DummyContext(baz) + foo = DummyContext(bar) + root = DummyContext(foo) + root.__parent__ = None + root.__name__ = None + foo.__parent__ = root + foo.__name__ = 'foo ' + bar.__parent__ = foo + bar.__name__ = 'bar' + baz.__parent__ = bar + baz.__name__ = 'baz' + result = self._callFUT(baz, 'this/theotherthing', 'that') self.assertEqual(result, ('','foo ', 'bar', 'baz', 'this/theotherthing', 'that')) @@ -434,6 +474,21 @@ class ModelPathTests(unittest.TestCase): result = self._callFUT(other) self.assertEqual(result, ('', 'other')) +class QuotePathSegmentTests(unittest.TestCase): + def _callFUT(self, s): + from repoze.bfg.traversal import quote_path_segment + return quote_path_segment(s) + + def test_unicode(self): + la = unicode('/La Pe\xc3\xb1a', 'utf-8') + result = self._callFUT(la) + 
self.assertEqual(result, '%2FLa%20Pe%C3%B1a') + + def test_string(self): + s = '/ hello!' + result = self._callFUT(s) + self.assertEqual(result, '%2F%20hello%21') + class TraversalContextURLTests(unittest.TestCase): def _makeOne(self, context, url): return self._getTargetClass()(context, url) diff --git a/repoze/bfg/tests/test_url.py b/repoze/bfg/tests/test_url.py index dee86c05f..b9733f802 100644 --- a/repoze/bfg/tests/test_url.py +++ b/repoze/bfg/tests/test_url.py @@ -40,7 +40,7 @@ class ModelURLTests(unittest.TestCase): result = self._callFUT(context, request, 'this/theotherthing', 'that') self.assertEqual( result, - 'http://example.com/context/this/theotherthing/that') + 'http://example.com/context/this%2Ftheotherthing/that') def test_unicode_in_element_names(self): self._registerContextURL() diff --git a/repoze/bfg/traversal.py b/repoze/bfg/traversal.py index ee565f85e..1aba34918 100644 --- a/repoze/bfg/traversal.py +++ b/repoze/bfg/traversal.py @@ -1,3 +1,4 @@ +import re import urllib from zope.component import getMultiAdapter @@ -11,7 +12,6 @@ from repoze.bfg.location import LocationProxy from repoze.bfg.location import lineage from repoze.bfg.lru import lru_cache -from repoze.bfg.url import _urlsegment from repoze.bfg.interfaces import IContextURL from repoze.bfg.interfaces import ILocation @@ -54,35 +54,47 @@ def find_root(model): return model def find_model(model, path): - """ Given a model object and a tuple representing a path (such as - the return value of ``model_path``), return an context in this - application's model graph at the specified path. The model passed - in *must* be :term:`location`-aware. If the first element in the - path tuple is the empty string (for example ``('', 'a', 'b', - 'c')``, the path is considered absolute and the graph traversal - will start at the graph root object. 
If the first element in the - path tuple is not the empty string (for example ``('a', 'b', - 'c')``), the path is considered relative and graph traversal will - begin at the model object supplied to the function. No + """ Given a model object and a string or tuple representing a path + (such as the return value of ``model_path`` or + ``model_path_tuple``), return an context in this application's + model graph at the specified path. The model passed in *must* be + :term:`location`-aware. If the path cannot be resolved (if the + respective node in the graph does not exist), a KeyError will be + raised. + + This function is the logical inverse of ``model_path`` and + ``model_path_tuple``; it can resolve any path string or tuple + generated by ``model_path`` or ``model_path_tuple``. + + Rules for passing a *string* as the ``path`` argument: if the + first character in the path string is the with the ``/`` + character, the path will considered absolute and the graph + traversal will start at the root object. If the first character + of the path string is *not* the ``/`` character, the path is + considered relative and graph traversal will begin at the model + object supplied to the function as the ``model`` argument. If an + empty string is passed as ``path``, the ``model`` passed in will + be returned. Model path strings must be escaped in the following + manner: each Unicode path segment must be encoded as UTF-8 and as + each path segment must escaped via Python's ``urllib.quote``. For + example, ``/path/to%20the/La%20Pe%C3%B1a`` (absolute) or + ``to%20the/La%20Pe%C3%B1a`` (relative). The ``model_path`` + function generates strings which follow these rules (albeit only + absolute ones). + + Rules for passing a *tuple* as the ``path`` argument: if the first + element in the path tuple is the empty string (for example ``('', + 'a', 'b', 'c')``, the path is considered absolute and the graph + traversal will start at the graph root object. 
If the first + element in the path tuple is not the empty string (for example + ``('a', 'b', 'c')``), the path is considered relative and graph + traversal will begin at the model object supplied to the function + as the ``model`` argument. If an empty sequence is passed as + ``path``, the ``model`` passed in itself will be returned. No URL-quoting or UTF-8-encoding of individual path segments within the tuple is required (each segment may be any string or unicode - object representing a model name). If an empty sequence is passed - as ``path``, the ``model`` passed in itself will be returned. If - the path cannot be resolved, a KeyError will be raised. - - .. note:: It is also permissible to pass a string to this function - as the ``path``, as long as each Unicode path segment is encoded - as UTF-8 and as long as each path segment is escaped via Python's - ``urllib.quote``. For example, ``/path/to%20the/La%20Pe%C3%B1a`` - (absolute) or ``to%20the/La%20Pe%C3%B1a`` (relative). - ``find_model`` will consider a string path absolute if it starts - with the ``/`` character; it will consider the path relative to - the ``model`` passed in if it does not start with the ``/`` - character. If an empty string is passed as ``path``, the - ``model`` passed in will be returned. - - .. note:: This function is the logical inverse of ``model_path``; - it can resolve any path tuple generated by ``model_path``. + object representing a model name). Model path tuples generated by + ``model_path_tuple`` can always be resolved by ``find_model``. """ if hasattr(path, '__iter__'): # it's a tuple or some other iterable @@ -90,7 +102,7 @@ def find_model(model, path): # unicode and it expects path segments to be utf-8 and # urlencoded (it's the same traverser which accepts PATH_INFO # from user agents; user agents always send strings). 
- path = [_urlsegment(name) for name in path] + path = [quote_path_segment(name) for name in path] if path: path = '/'.join(path) or '/' else: @@ -116,19 +128,63 @@ def find_interface(model, interface): return location def model_path(model, *elements): + """ Return a string object representing the absolute physical path + of the model object based on its position in the model graph, e.g + ``'/foo/bar``. Any positional arguments passed in as ``elements`` + will be appended as path segments to the end of the model path. + For instance, if the model's path is ``/foo/bar`` and ``elements`` + equals ``('a', 'b')``, the returned tuple will be + ``/foo/bar/a/b``. The first character in the string will always + be the ``/`` character (a leading ``/`` character in a path string + represents that the path is absolute). + + Model path strings returned will be escaped in the following + manner: each unicode path segment will be encoded as UTF-8 and as + each path segment will be escaped via Python's ``urllib.quote``. + For example, ``/path/to%20the/La%20Pe%C3%B1a``. + + This function is a logical inverse of ``find_model``: it can be + used to generate path references that can later be resolved via + ``find_model``. + + The ``model`` passed in *must* be :term:`location`-aware. + + .. note:: Each segment in the path string returned will use the + ``__name__`` attribute of the model it represents within the + graph. Each of these segments *should* be a unicode or string + object (as per the contract of :term:`location-awareness`). + However, no conversion or safety checking of model names is + performed. For instance, if one of the models in your graph has a + ``__name__`` which (by error) is a dictionary, the ``model_path`` + function will attempt to append it to a string and it will cause a + TypeError. 
A single exception to this rule exists: the + :term:`root` model may have a ``__name__`` attribute of any value; + the value of this attribute will always be ignored (and + effectively replaced with a leading ``/``) when the path is + generated. + """ + path = [ quote_path_segment(name) for name in + model_path_tuple(model, *elements) ] + path = '/'.join(path) or '/' + return path + +def model_path_tuple(model, *elements): """ Return a tuple representing the absolute physical path of the model object based on its position in the model graph, e.g ``('', 'foo', 'bar')``. Any positional arguments passed in as ``elements`` will be appended as elements in the tuple representing the the model path. For instance, if the model's path is ``('', 'foo', 'bar')`` and elements equals ``('a', 'b')``, - the returned tuple will be ``('foo', 'bar', 'a', b')``. The - ``model`` passed in *must* be :term:`location`-aware. The first - element of this tuple will always be the empty string (a leading - empty string element in a path tuple represents that the path is - absolute). This function is the logical inverse of - ``find_model``: it can be used to generate path references that - can later be resolved via ``find_model``. + the returned tuple will be ``('', 'foo', 'bar', 'a', b')``. The + first element of this tuple will always be the empty string (a + leading empty string element in a path tuple represents that the + path is absolute). + + This function is a logical inverse of ``find_model``: it can be + used to generate path references that can later be resolved via + ``find_model``. + + The ``model`` passed in *must* be :term:`location`-aware. .. 
note:: Each segment in the path tuple returned will equal the ``__name__`` attribute of the model it represents within the @@ -155,6 +211,7 @@ def model_path(model, *elements): path.extend(elements) return tuple(path) + def virtual_root(model, request): """ Provided any model and a request object, return the model object @@ -185,7 +242,7 @@ def virtual_root(model, request): @lru_cache(500) def traversal_path(path): - """ Given a PATH_INFO string (slash-separated path elements), + """ Given a ``PATH_INFO`` string (slash-separated path segments), return a tuple representing that path which can be used to traverse a graph. The PATH_INFO is split on slashes, creating a list of segments. Each segment is URL-unquoted, and decoded into @@ -227,6 +284,14 @@ def traversal_path(path): (u'archives', u'<unprintable unicode>') + .. note:: This function does not generate the same type of tuples + that ``model_path_tuple`` does. In particular, the leading empty + string is not present in the tuple it returns, unilke + ``model_path_tuple``. As a result, tuples generated by + ``traversal_path`` are not resolveable by the ``find_model`` API. + ``traversal_path`` is a function mostly used by the internals of + :mod:`repoze.bfg` and by people writing their own traversal + machinery, as opposed to users writing applications in BFG. """ path = path.rstrip('/') path = path.lstrip('/') @@ -246,6 +311,34 @@ def traversal_path(path): clean.append(segment) return tuple(clean) +_segment_cache = {} + +def quote_path_segment(segment): + """ Return a quoted representation of a 'path segment' (such as + the string ``__name__`` attribute of a model) as a string. If the + ``segment`` passed in is a unicode object, it is converted to a + UTF-8 string, then it is URL-quoted using Python's + ``urllib.quote``. If the ``segment`` passed in is a string, it is + URL-quoted using Python's ``urllib.quote``. If the segment passed + in is not a string or unicode object, an error will be raised. 
+ The return value of ``quote_path_segment`` is always a string, + never Unicode.""" + # The bit of this code that deals with ``_segment_cache`` is an + # optimization: we cache all the computation of URL path segments + # in this module-scope dictionary with the original string (or + # unicode value) as the key, so we can look it up later without + # needing to reencode or re-url-quote it + result = _segment_cache.get(segment) + if result is None: + if segment.__class__ is unicode: # isinstance slighly slower (~15%) + result = _url_quote(segment.encode('utf-8')) + else: + result = _url_quote(segment) + # we don't need a lock to mutate _segment_cache, as the below + # will generate exactly one Python bytecode (STORE_SUBSCR) + _segment_cache[segment] = result + return result + _marker = object() class ModelGraphTraverser(object): @@ -328,7 +421,7 @@ class TraversalContextURL(object): for location in lineage(self.context): name = location.__name__ if name: - rpath.append(_urlsegment(name)) + rpath.append(quote_path_segment(name)) if rpath: path = '/' + '/'.join(reversed(rpath)) + '/' else: @@ -343,4 +436,48 @@ class TraversalContextURL(object): app_url = request.application_url # never ends in a slash return app_url + path +always_safe = ('ABCDEFGHIJKLMNOPQRSTUVWXYZ' + 'abcdefghijklmnopqrstuvwxyz' + '0123456789' '_.-') +_safemaps = {} +_must_quote = {} + +def _url_quote(s, safe = ''): + """quote('abc def') -> 'abc%20def' + + Faster version of Python stdlib urllib.quote which also quotes + the '/' character. + Each part of a URL, e.g. the path info, the query, etc., has a + different set of reserved characters that must be quoted. + + RFC 2396 Uniform Resource Identifiers (URI): Generic Syntax lists + the following reserved characters. + + reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | + "$" | "," + + Each of these characters is reserved in some component of a URL, + but not necessarily in all of them. 
+ + Unlike the default version of this function in the Python stdlib, + by default, the quote function is intended for quoting individual + path segments instead of an already composed path that might have + '/' characters in it. Thus, it *will* encode any '/' character it + finds in a string. + """ + cachekey = (safe, always_safe) + try: + safe_map = _safemaps[cachekey] + if not _must_quote[cachekey].search(s): + return s + except KeyError: + safe += always_safe + _must_quote[cachekey] = re.compile(r'[^%s]' % safe) + safe_map = {} + for i in range(256): + c = chr(i) + safe_map[c] = (c in safe) and c or ('%%%02X' % i) + _safemaps[cachekey] = safe_map + res = map(safe_map.__getitem__, s) + return ''.join(res) diff --git a/repoze/bfg/url.py b/repoze/bfg/url.py index 99b4cdabf..0c6031c26 100644 --- a/repoze/bfg/url.py +++ b/repoze/bfg/url.py @@ -1,11 +1,13 @@ """ Utility functions for dealing with URLs in repoze.bfg """ -import re import urllib from zope.component import queryMultiAdapter from repoze.bfg.interfaces import IContextURL +from repoze.bfg.traversal import TraversalContextURL +from repoze.bfg.traversal import quote_path_segment + def model_url(model, request, *elements, **kw): """ Generate a string representing the absolute URL of the model (or @@ -58,7 +60,6 @@ def model_url(model, request, *elements, **kw): context_url = queryMultiAdapter((model, request), IContextURL) if context_url is None: # b/w compat for unit tests - from repoze.bfg.traversal import TraversalContextURL context_url = TraversalContextURL(model, request) model_url = context_url() @@ -68,7 +69,7 @@ def model_url(model, request, *elements, **kw): qs = '' if elements: - suffix = '/'.join([_urlsegment(s) for s in elements]) + suffix = '/'.join([quote_path_segment(s) for s in elements]) else: suffix = '' @@ -112,69 +113,4 @@ def urlencode(query, doseq=False): return urllib.urlencode(newquery, doseq=doseq) -_segment_cache = {} - -def _urlsegment(s): - """ The bit of this code that deals with 
``_segment_cache`` is an - optimization: we cache all the computation of URL path segments in - this module-scope dictionary with the original string (or unicode - value) as the key, so we can look it up later without needing to - reencode or re-url-quote it """ - result = _segment_cache.get(s) - if result is None: - if s.__class__ is unicode: # isinstance slighly slower (~15%) - result = _url_quote(s.encode('utf-8')) - else: - result = _url_quote(s) - # we don't need a lock to mutate _segment_cache, as the below - # will generate exactly one Python bytecode (STORE_SUBSCR) - _segment_cache[s] = result - return result - - -always_safe = ('ABCDEFGHIJKLMNOPQRSTUVWXYZ' - 'abcdefghijklmnopqrstuvwxyz' - '0123456789' '_.-') -_safemaps = {} -_must_quote = {} - -def _url_quote(s, safe = '/'): - """quote('abc def') -> 'abc%20def' - - Faster version of Python stdlib urllib.quote. See - http://bugs.python.org/issue1285086 for more information. - - Each part of a URL, e.g. the path info, the query, etc., has a - different set of reserved characters that must be quoted. - - RFC 2396 Uniform Resource Identifiers (URI): Generic Syntax lists - the following reserved characters. - - reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | - "$" | "," - - Each of these characters is reserved in some component of a URL, - but not necessarily in all of them. - - By default, the quote function is intended for quoting the path - section of a URL. Thus, it will not encode '/'. This character - is reserved, but in typical usage the quote function is being - called on a path where the existing slash characters are used as - reserved characters. 
- """ - cachekey = (safe, always_safe) - try: - safe_map = _safemaps[cachekey] - if not _must_quote[cachekey].search(s): - return s - except KeyError: - safe += always_safe - _must_quote[cachekey] = re.compile(r'[^%s]' % safe) - safe_map = {} - for i in range(256): - c = chr(i) - safe_map[c] = (c in safe) and c or ('%%%02X' % i) - _safemaps[cachekey] = safe_map - res = map(safe_map.__getitem__, s) - return ''.join(res) |
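The string and tuple rules spelled out in the new ``find_model`` docstring in the diff above can be illustrated with a small sketch. It is written for Python 3 and is not the BFG traverser: the model graph is assumed to be plain nested dictionaries, and the explicit ``root`` argument is a hypothetical simplification (the real function locates the root from the model's lineage).

```python
from urllib.parse import unquote


def find_model(model, path, root):
    """Sketch: resolve a path string or path tuple against nested dicts."""
    if isinstance(path, str):
        # A leading '/' means the path is absolute.
        absolute = path.startswith('/')
        # Split on '/' first and unquote afterwards, so that a %2F inside
        # a segment is restored to '/' without creating a new segment.
        segments = [unquote(s) for s in path.split('/') if s]
    else:
        path = tuple(path)
        # A leading empty string means the tuple path is absolute.
        absolute = bool(path) and path[0] == ''
        # Tuple segments are used as-is: no quoting or encoding needed.
        segments = list(path[1:] if absolute else path)
    node = root if absolute else model
    for name in segments:
        node = node[name]  # KeyError if the node does not exist
    return node


graph = {'foo ': {'a/b': {'La Pe\xf1a': 'leaf'}}}
start = graph['foo ']
print(find_model(start, '/foo%20/a%2Fb/La%20Pe%C3%B1a', graph))  # leaf
print(find_model(start, ('a/b', 'La Pe\xf1a'), graph))           # leaf
```

An empty string or an empty tuple has no segments and is not absolute, so the ``model`` passed in is returned unchanged, matching the behaviour the docstring describes.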
