1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
|
.. _traversal_chapter:
Traversal
=========
:term:`traversal` is a :term:`context finding` mechanism that is used
by :mod:`repoze.bfg`. :term:`traversal` is the act of finding a
:term:`context` and a :term:`view name` by walking over an *object
graph*, starting from a :term:`root` object, using a :term:`request`
object as a source of path information.
In this chapter, we'll provide a high-level overview of traversal,
explain the concept of an *object graph*, and show how traversal might
be used within an application.
.. index::
pair: traversal; high-level overview
A High-Level Overview of Traversal Mechanics
--------------------------------------------
:term:`Traversal` is dependent on information in a :term:`request`
object. The :term:`request` object contains URL path information in
the ``PATH_INFO`` portion of the :term:`WSGI` environment. The
``PATH_INFO`` portion of the WSGI environment is the URL data in a
request following the hostname and port number, but before any query
string elements or fragments, for example the ``/a/b/c`` portion of
the URL ``http://example.com/a/b/c?foo=1``.
Traversal treats the ``PATH_INFO`` segment of a URL as a sequence of
path segments. For example, the ``PATH_INFO`` string ``/a/b/c`` is
treated as the sequence ``['a', 'b', 'c']``. Traversal pops the first
element (``a``) from the path segment sequence and attempts to use it
as a lookup key into an object graph supplied by an application. If
that succeeeds, the :term:`context` temporarily becomes the object
found via that lookup. Then the next segment (``b``) is popped from
the sequence, and the object graph is queried for that segment; if
that lookup succeeds, the :term:`context` becomes that object. This
process continues until the path segment sequence is exhausted or any
lookup for a name in the sequence fails. In either case, a
:term:`context` is found.
The results of a :term:`traversal` also include a :term:`view name`.
The :term:`view name` is the *first* URL path segment in the set of
``PATH_INFO`` segments "left over" in the path segment list popped by
the traversal process.
The combination of the :term:`context` object and the :term:`view
name` found via traversal is used later in the same request by a
separate :mod:`repoze.bfg` subsystem -- the "view lookup" subsystem --
to find a :term:`view callable` later within the same request. How
:mod:`repoze.bfg` performs view lookup is explained within the
:ref:`views_chapter` chapter.
.. index::
single: object graph
single: traversal graph
single: model graph
The Object Graph
----------------
When your application uses :term:`traversal` to resolve URLs to code,
your application must supply an *object graph* to :mod:`repoze.bfg`.
At system startup time, the :mod:`repoze.bfg` :term:`Router` is
configured with a callback known as a :term:`root factory`, supplied
by the application developer as the ``root_factory`` argument to a
:term:`Configurator`.
Here's an example of a simple root factory:
.. code-block:: python
:linenos:
class Root(dict):
def __init__(self, request):
pass
Here's an example of using this root factory within startup
configuration, by passing it to an instance of a :term:`Configurator`
named ``config``:
config = Configurator(root_factory=Root)
Making a declaration like this at startup means that your
:mod:`repoze.bfg` application will call the root factory (in this
case, the class ``Root``) to generate a root object whenever a request
enters the application. Usually a root factory for a traversal-based
application will be more complicated than the above ``Root`` object;
in particular it may be associated with a database connection or
another persistence mechanism.
A root factory is passed a :term:`request` object and it is expected
to return an object which represents the root of the object graph.
All :term:`traversal` will begin at this root object. The root object
is usually a *mapping* object (such as a Python dictionary, or at
least a class which has many of the same methods as a dictionary).
.. warning:: In :mod:`repoze.bfg` 1.0 and prior versions, the root
factory was passed a term WSGI *environment* object (a dictionary)
while in :mod:`repoze.bfg` 1.1+ it is passed a :term:`request`
object. For backwards compatibility purposes, the request object
passed to the root factory has a dictionary-like interface that
emulates the WSGI environment, so code expecting the argument to be
a dictionary will continue to work.
If a :term:`root factory` is passed to the :mod:`repoze.bfg`
:term:`Configurator` constructor as the value ``None``, a *default*
root factory is used.
.. sidebar:: Emulating the Default Root Factory
For purposes of understanding the default root factory better,
we'll note that you can emulate the default root factory by using
this code as an explicit root factory in your application setup:
.. code-block:: python
:linenos:
class Root(object):
def __init__(self, request):
pass
config = Configurator(root_factory=Root)
The default root factory is just a really stupid object that has no
behavior or state. Using :term:`traversal` against an application
that uses the object graph supplied by the default root object is
not very interesting, because the default root object has no
children.
Items contained within the object graph are sometimes analogous to the
concept of :term:`model` objects used by many other frameworks (and
:mod:`repoze.bfg` APIs often refers to them as "models", as well).
They are typically instances of Python classes.
.. index::
single: traversal behavior
.. _traversal_behavior:
:mod:`repoze.bfg` Traversal Behavior
-------------------------------------
We need to use an analogy to clarify how :mod:`repoze.bfg` traversal
works against an arbitrary object graph.
Let's imagine an inexperienced UNIX computer user, wishing only to use
the command line to find a file and to invoke the ``cat`` command
against that file. Because he is inexperienced, the only commands he
knows how to use are ``cd``, which changes the current directory and
``cat``, which prints the contents of a file. And because he is
inexperienced, he doesn't understand that ``cat`` can take an absolute
path specification as an argument, so he doesn't know that you can
issue a single command command ``cat /an/absolute/path`` to get the
desired result. Instead, this user believes he must issue the ``cd``
command, starting from the root, for each intermediate path segment,
*even the path segment that represents the file itself*. Once he gets
an error (because you cannot successfully ``cd`` into a file), he knows
he has reached the file he wants, and he will be able to execute
``cat`` against the resulting path segment.
This inexperienced user's attempt to execute ``cat`` against the file
named ``/fiz/buz/myfile`` might be to issue the following set of UNIX
commands:
.. code-block:: text
cd /
cd fiz
cd buz
cd myfile
The user now know he has found a *file*, because the ``cd`` command
issues an error when he executed ``cd myfile``. Now he knows that he
can run the ``cat`` command:
.. code-block:: text
cat myfile
The contents of ``myfile`` are now printed on the user's behalf.
:mod:`repoze.bfg` is very much like this inexperienced UNIX user as it
uses :term:`traversal` against an object graph. In this analogy, we
can map the ``cat`` program to the :mod:`repoze.bfg` concept of a
:term:`view callable`: it is a program that can be run against some
:term:`context`. The file being operated on in this analogy is the
:term:`context` object; the context is the "last node found" in a
traversal. The directory structure is the object graph being
traversed. The act of progressively changing directories to find the
file as well as the handling of a ``cd`` error as a stop condition is
analogous to :term:`traversal`.
The object graph is traversed, beginning at a root object, represented
by the root URL (``/``); if there are further path segments in the
path info of the request being processed, the root object's
``__getitem__`` is called with the next path segment, and it is
expected to return another graph object. The resulting object's
``__getitem__`` is called with the very next path segment, and it is
expected to return another graph object. This happens *ad infinitum*
until all path segments are exhausted. If at any point during
traversal any node in the graph doesn't *have* a ``__getitem__``
method, or if the ``__getitem__`` of a node raises a :exc:`KeyError`,
traversal ends immediately, and the node becomes the :term:`context`.
The object graph consists of *container* nodes and *leaf* nodes.
There is only one difference between a *container* node and a *leaf*
node: *container* nodes possess a ``__getitem__`` method while *leaf*
nodes do not. The ``__getitem__`` method was chosen as the signifying
difference between the two types of nodes because the presence of this
method is how Python itself typically determines whether an object is
"containerish" or not.
Each container node is presumed to be willing to return a child node
or raise a ``KeyError`` based on a name passed to its ``__getitem__``.
No leaf-level instance is required to have a ``__getitem__``. If
leaf-level instances happen to have a ``__getitem__`` (through some
historical inequity), you should subclass these node types and cause
their ``__getitem__`` methods to simply raise a ``KeyError``. Or just
disuse them and think up another strategy.
Usually, the traversal root is a *container* node, and as such it
contains other nodes. However, it doesn't *need* to be a container.
Your object graph can be as shallow or as deep as you require.
Traversal "stops" when :mod:`repoze.bfg` either reaches a leaf level
model instance in your object graph or when the path segments implied
by the URL "run out". The object that traversal "stops on" becomes
the :term:`context`.
.. _how_bfg_traverses:
How :mod:`repoze.bfg` Processes a Request Using Traversal
---------------------------------------------------------
When a user requests a page from your :mod:`repoze.bfg` -powered
application, the system uses this algorithm to determine which Python
code to execute:
#. The request for the page is presented to the :mod:`repoze.bfg`
:term:`router` in terms of a standard :term:`WSGI` request, which
is represented by a WSGI environment and a ``start_response``
callable.
#. The router creates a :term:`request` object based on the WSGI
environment.
#. The :term:`root factory` is called with the :term:`request`. It
returns a :term:`root` object.
#. The router uses the WSGI environment's ``PATH_INFO`` information
to determine the path segments to traverse. The leading slash is
stripped off ``PATH_INFO``, and the remaining path segments are
split on the slash character to form a traversal sequence, so a
request with a ``PATH_INFO`` variable of ``/a/b/c`` maps to the
traversal sequence ``[u'a', u'b', u'c']``. Note that each of the
path segments in the sequence is converted to Unicode using the
UTF-8 decoding (if the decoding fails, a :exc:`TypeError` is
raised).
#. :term:`Traversal` begins at the root object returned by the root
factory. For the traversal sequence ``[u'a', u'b', u'c']``, the
root object's ``__getitem__`` is called with the name ``a``.
Traversal continues through the sequence. In our example, if the
root object's ``__getitem__`` called with the name ``a`` returns
an object (aka "object ``a``"), that object's ``__getitem__`` is
called with the name ``b``. If object A returns an object when
asked for ``b``, "object ``b``"'s ``__getitem__`` is then asked
for the name ``c``, and may return "object ``c``".
#. Traversal ends when a) the entire path is exhausted or b) when any
graph element raises a :exc:`KeyError` from its ``__getitem__`` or
c) when any non-final path element traversal does not have a
``__getitem__`` method (resulting in a :exc:`NameError`) or d)
when any path element is prefixed with the set of characters
``@@`` (indicating that the characters following the ``@@`` token
should be treated as a :term:`view name`).
#. When traversal ends for any of the reasons in the previous step,
the the last object found during traversal is deemed to be the
:term:`context`. If the path has been exhausted when traversal
ends, the :term:`view name` is deemed to be the empty string
(``''``). However, if the path was *not* exhausted before
traversal terminated, the first remaining path element is treated
as the view name.
Any subsequent path elements after the view name are deemed the
:term:`subpath`. The subpath is always a sequence of path
segments that come from ``PATH_INFO`` that are "left over" after
traversal has completed. For instance, if ``PATH_INFO`` was
``/a/b`` and the root returned an "object ``a``", and "object
``a``" subsequently returned an "object ``b``", the router deems
that the context is "object ``b``", the view name is the empty
string, and the subpath is the empty sequence. On the other hand,
if ``PATH_INFO`` was ``/a/b/c`` and "object ``a``" was found but
raised a ``KeyError`` for the name ``b``, the router deems that
the context is "object ``a``", the view name is ``b`` and the
subpath is ``('c',)``.
#. If a :term:`authorization policy` is configured, the router
performs a permission lookup. If a permission declaration is
found for the view name and context implied by the current
request, the :term:`authorization policy` is consulted to see if
the "current user" (as determined by the the active
:term:`authentication policy`) can perform the action. If he can,
processing continues. If he cannot, the :term:`forbidden view` is
called (see also :ref:`changing_the_forbidden_view`).
#. Armed with the context, the view name, and the subpath, the router
performs a view lookup. It attempts to look up a view from the
:mod:`repoze.bfg` :term:`application registry` using the
:term:`view name`, the :term:`context`, and the :term:`request`.
If a view function is found, it is called with the context and the
request. It returns a response, which is fed back upstream. If a
view is not found, the :term:`not found view` is called (see
:ref:`changing_the_notfound_view`).
In either case, the result is returned upstream via the :term:`WSGI`
protocol.
.. image:: modelgraphtraverser.png
.. index::
pair: traversal; example
A Traversal Example
-------------------
Let's pretend the user asks for
``http://example.com/foo/bar/baz/biz/buz.txt``. Let's pretend that the
request's ``PATH_INFO`` in that case is ``/foo/bar/baz/biz/buz.txt``.
Let's further pretend that when this request comes in that we're
traversing the following graph::
/--
|
|-- foo
|
----bar
Here's what happens:
- :mod:`repoze.bfg` traverses the root, and attempts to find "foo",
which it finds.
- :mod:`repoze.bfg` traverses "foo", and attempts to find "bar", which
it finds.
- :mod:`repoze.bfg` traverses bar, and attempts to find "baz", which
it does not find ("bar" raises a :exc:`KeyError` when asked for
"baz").
The fact that it does not find "baz" at this point does not signify an
error condition. It signifies that:
- the :term:`context` is "bar" (the context is the last item found
during traversal).
- the :term:`view name` is ``baz``
- the :term:`subpath` is ``('biz', 'buz.txt')``
Because it's the "context", :mod:`repoze.bfg` examines "bar" to find
out what "type" it is. Let's say it finds that the context is an
``Bar`` type (because "bar" happens to be an instance of the class
``Bar``).
Using the :term:`view name` (``baz``) and the type, it asks the
:term:`application registry` this question:
- Please find me a :term:`view callable` registered using a
:term:`view configuration` with the name "baz" that can be used for
the class ``Bar``.
Let's say it finds no matching view type. It then returns the result
of the :term:`not found view`. The request ends.
However, for this graph::
/--
|
|-- foo
|
----bar
|
----baz
|
biz
The user asks for ``http://example.com/foo/bar/baz/biz/buz.txt``
- :mod:`repoze.bfg` traverses "foo", and attempts to find "bar", which
it finds.
- :mod:`repoze.bfg` traverses "bar", and attempts to find "baz", which
it finds.
- :mod:`repoze.bfg` traverses "baz", and attempts to find "biz", which
it finds.
- :mod:`repoze.bfg` traverses "biz", and attempts to find "buz.txt"
which it does not find.
The fact that it does not find "buz.txt" at this point does not
signify an error condition. It signifies that:
- the :term:`context` is "biz" (the context is the last item found
during traversal).
- the :term:`view name` is "buz.txt"
- the :term:`subpath` is an empty sequence ( ``()`` ).
Because it's the "context", :mod:`repoze.bfg` examines "biz" to find
out what "type" it is. Let's say it finds that the context is a
``Biz`` type (because "biz" is an instance of the Python class
``Biz``).
Using the :term:`view name` (``buz.txt``) and the type, it asks the
:term:`application registry` this question:
- Please find me a :term:`view callable` registered with a :term:`view
configuration` with the name ``buz.txt`` that can be used for class
``Biz``.
Let's say that question is answered "here you go, here's a bit of code
that is willing to deal with that case", and returns a :term:`view
callable`. The view callable is passed the "biz" object as the
"context" and the current :term:`WebOb` :term:`request` as the
"request". It returns a :term:`response`.
There are two special cases:
- During traversal you will often end up with a :term:`view name` that
is the empty string. This indicates that :mod:`repoze.bfg` should
look up the :term:`default view`. The default view is a view that is
registered with no name or a view which is registered with a name
that equals the empty string.
- If any path segment element begins with the special characters
``@@`` (think of them as goggles), the value of that segment minus
the goggle characters is considered the :term:`view name`
immediately and traversal stops there. This allows you to address
views that may have the same names as model instance names in the
graph unambiguously.
.. index::
pair: debugging; not found errors
.. index::
pair: traversal; unicode
Traversal and Unicode
---------------------
The traversal machinery by default attempts to first URL-unquote and
then Unicode-decode each path element in ``PATH_INFO`` from its
natural byte string (``str`` type) representation. URL unquoting is
performed using the Python standard library ``urllib.unquote``
function. Conversion from a URL-decoded string into Unicode is
attempted using the UTF-8 encoding. If any URL-unquoted path segment
in ``PATH_INFO`` is not decodeable using the UTF-8 decoding, a
TypeError is raised. A segment will be fully URL-unquoted and
UTF8-decoded before it is passed it to the ``__getitem__`` of any
model object during traversal.
References
----------
A tutorial showing how :term:`traversal` can be used to create a
:mod:`repoze.bfg` application exists in :ref:`bfg_wiki_tutorial`.
|