author Chris McDonough <chrism@agendaless.com> 2008-07-20 19:13:42 +0000
committer Chris McDonough <chrism@agendaless.com> 2008-07-20 19:13:42 +0000
commit 0bc787d4999460d4219c621f83c619ca7c4552c2
tree 1f6c47221ffede2b68ef6c288c61ae6027a1cfa1 /docs/narr/traversal.rst
parent b54cdb6d0951a28b7d7bf4f585a4059cc5e6b18a
More docs; fix autogen app model root creation.
Diffstat (limited to 'docs/narr/traversal.rst')
-rw-r--r-- docs/narr/traversal.rst 269
1 files changed, 269 insertions, 0 deletions
diff --git a/docs/narr/traversal.rst b/docs/narr/traversal.rst
new file mode 100644
index 000000000..3e7c6125b
--- /dev/null
+++ b/docs/narr/traversal.rst
@@ -0,0 +1,269 @@
+Traversal
+=========
+
+In many popular web frameworks, a "URL dispatcher" is used to
+associate a particular URL with a bit of code (known somewhat
+ambiguously as a "controller" or "view" depending upon the particular
+vocabulary religion to which you subscribe). These systems allow the
+developer to create "urlconfs" or "routes" to controller/view Python
+code using pattern matching against URL components. Examples:
+`Django's URL dispatcher
+<http://www.djangoproject.com/documentation/url_dispatch/>`_ and the
+`Routes URL mapping system <http://routes.groovie.org/>`_.
+
+It is, however, possible to map URLs to code differently, using object
+graph traversal. The venerable Zope and CherryPy web frameworks offer
+graph-traversal-based URL dispatch. ``repoze.bfg`` also provides
+graph-traversal-based dispatch of URLs to code. Graph-traversal-based
+dispatching is useful if you like the URL to be representative of an
+arbitrary hierarchy of potentially heterogeneous items.
+
+Non-graph-traversal-based URL dispatch can easily handle URLs such as
+``http://example.com/members/Chris``, where it's assumed that each
+item "below" ``members`` in the URL represents a member in the system.
+You just match everything "below" ``members`` to a particular view.
+Such systems are not very good, however, at telling apart URLs such as
+``http://example.com/members/Chris/document`` and
+``http://example.com/members/Chris/stuff/page``, wherein you'd like the
+``document`` in the first URL to represent, e.g., a PDF document, and
+``/stuff/page`` in the second to represent, e.g., an OpenOffice
+document in a "stuff" folder. It takes more pattern matching
+assertions to make URLs like these work in URL-dispatch-based
+systems, and some assertions just aren't possible. For example,
+URL-dispatch-based systems don't deal very well with URLs that
+represent arbitrary-depth hierarchies.
+
+Graph traversal works well if you need to divine meaning out of these
+types of "ambiguous" URLs and URLs that represent arbitrary-depth
+hierarchies. Each URL segment represents a single traversal through
+an edge of the graph. So a URL like ``http://example.com/a/b/c`` can
+be thought of as a graph traversal on the example.com site through the
+edges "a", "b", and "c".
+
+Finally, if you're willing to treat your application models as a graph
+that can be traversed, it also becomes trivial to provide "row-level
+security" (in common relational parlance): you just attach a security
+declaration to each instance in the graph. This is not as easy in
+frameworks that use URL-based dispatch.
+
+Graph traversal is materially more complex than URL-based dispatch,
+however, if only because it requires the construction and maintenance
+of a graph, and it requires the developer to think about mapping URLs
+to code in terms of traversing the graph. (How's *that* for
+self-referential! ;-) That said, for developers comfortable with Zope,
+in particular, and comfortable with hierarchical data stores like
+ZODB, mapping a URL to a graph traversal is a natural way to think
+about creating a web application.
+
+In essence, the choice to use graph traversal vs. URL dispatch is
+largely religious. Graph traversal dispatch probably just doesn't
+make any sense when you possess completely "square" data stored in a
+relational database. However, when you have a hierarchical data
+store, graph traversal can provide advantages over using URL-based
+dispatch.
+
+The Model Graph
+---------------
+
+Users interact with your ``repoze.bfg``-based application via a
+"router", which is itself a WSGI application. At system startup time,
+the router is configured with a root object from which all traversal
+will begin. The root object is a mapping object, such as a Python
+dictionary. In fact, all items contained in the graph are either leaf
+nodes (these have no ``__getitem__``) or container nodes (these do
+have a ``__getitem__``).
+
+Items contained within the graph are analogous to the concept of
+``model`` objects used by many other frameworks (and ``repoze.bfg``
+refers to them as models, as well). They are typically instances of
+classes. Each containerish instance is willing to return a child or
+raise a KeyError based on a name passed to its ``__getitem__``. No
+leaf-level instance is required to have a ``__getitem__``.
+
+``repoze.bfg`` traverses the model graph in order to find a *context*.
+It then attempts to find a *view* based on the type of the context.
+
+How ``repoze.bfg`` Processes a Request Using Traversal
+------------------------------------------------------
+
+When a user requests a page from your ``repoze.bfg``-powered
+application, the system uses this algorithm to determine which Python
+code to execute:
+
+ 1. The request for the page is presented to ``repoze.bfg``'s
+ "router" in terms of a standard WSGI request, which is
+ represented by a WSGI environment and a start_response callable.
+
+ 2. The router creates a `WebOb <http://pythonpaste.org/webob/>`_
+ request object based on the WSGI environment.
+
+ 3. The router uses the WSGI environment's ``PATH_INFO`` variable to
+ determine the path segments to traverse. The leading slash is
+ stripped off ``PATH_INFO``, and the remaining path segments are
+ split on the slash character to form a traversal sequence, so a
+ request with a ``PATH_INFO`` variable of ``/a/b/c`` maps to the
+ traversal sequence ``['a', 'b', 'c']``.
+
+ 4. Traversal begins at the root object. For the traversal sequence
+ ``['a', 'b', 'c']``, the root object's ``__getitem__`` is called
+ with the name ``a``. Traversal continues through the sequence.
+ In our example, if the root object's ``__getitem__`` called with
+ the name ``a`` returns an object (aka "object A"), that object's
+ ``__getitem__`` is called with the name ``b``. If object A
+ returns an object when asked for ``b``, object B's
+ ``__getitem__`` is then asked for the name ``c``, and may return
+ object C.
+
+ 5. Traversal ends when a) the entire path is exhausted, or b) any
+ graph element raises a KeyError from its ``__getitem__``, or c) an
+ object reached before the final path element has no
+ ``__getitem__`` method (resulting in an AttributeError), or d) a
+ path element is prefixed with the characters ``@@`` (indicating
+ that the characters following the ``@@`` token should be treated
+ as a "view name").
+
+ 6. When traversal ends for any of the reasons in the previous step,
+ the last object found during traversal is deemed to be the
+ "context". If the path has been exhausted when traversal ends,
+ the "view name" is deemed to be the empty string (``''``).
+ However, if the path was not exhausted before traversal
+ terminated, the first remaining path element is treated as the
+ view name. Any subsequent path elements after the view name are
+ deemed the "subpath". For instance, if ``PATH_INFO`` was
+ ``/a/b`` and the root returned an "A" object, and the "A" object
+ returned a "B" object, the router deems that the context is
+ "object B", the view name is the empty string, and the subpath is
+ the empty sequence. On the other hand, if ``PATH_INFO`` was
+ ``/a/b/c`` and "object A" was found but raised a KeyError for the
+ name ``b``, the router deems that the context is object A, the
+ view name is ``b`` and the subpath is ``['c']``.
+
+ 7. If a security policy is configured, the router performs a
+ permission lookup. If a permission declaration is found for the
+ view name and context implied by the current request, the
+ security policy is consulted to see if the "current user" (also
+ determined by the security policy) can perform the action. If he
+ can, processing continues. If he cannot, an HTTPUnauthorized
+ error is raised.
+
+ 8. Armed with the context, the view name, and the subpath, the
+ router performs a view lookup. It attempts to look up a view
+ from the ``repoze.bfg`` application registry using the view name
+ and the context. If a view factory is found, it is called with
+ the context and the request. It returns a response, which is fed
+ back upstream. If a view is not found, a generic WSGI
+ ``NotFound`` application is constructed.
+
+In either case, the result is returned upstream via the WSGI protocol.
+
+A Traversal Example
+-------------------
+
+Let's pretend the user asks for
+``http://example.com/foo/bar/baz/biz/buz.txt``. Let's pretend that the
+request's ``PATH_INFO`` in that case is ``/foo/bar/baz/biz/buz.txt``.
+Let's further pretend that when this request comes in that we're
+traversing the following graph::
+
+   /--
+    |
+    |-- foo
+         |
+         ----bar
+
+Here's what happens:
+
+ - bfg traverses the root, and attempts to find foo, which it finds.
+
+ - bfg traverses foo, and attempts to find bar, which it finds.
+
+ - bfg traverses bar, and attempts to find baz, which it does not
+ find ('bar' raises a ``KeyError`` when asked for baz).
+
+The fact that it does not find "baz" at this point does not signify an
+error condition. It signifies that:
+
+ - the "context" is bar (the context is the last item found during
+ traversal).
+
+ - the "view name" is ``baz``
+
+ - the "subpath" is ``['biz', 'buz.txt']``
+
+Because it's the "context", bfg examines "bar" to find out what "type"
+it is. Let's say it finds that the context is an ``IBar`` type (because
+"bar" happens to have an attribute attached to it that indicates it's
+an ``IBar``).
+
+Using the "view name" ("baz") and the type, it asks the "application
+registry" (configured separately, via "configure.zcml") this question:
+
+ - Please find me a "view" (controller in some religions) with the
+ name "baz" that can be used for the type ``IBar``.
+
+Let's say it finds no matching view type. It then returns a NotFound.
+The request ends. Everyone is sad.
+
+But! For this graph::
+
+   /--
+    |
+    |-- foo
+         |
+         ----bar
+              |
+              ----baz
+                     |
+                     biz
+
+The user asks for ``http://example.com/foo/bar/baz/biz/buz.txt``:
+
+ - bfg traverses foo, and attempts to find bar, which it finds.
+
+ - bfg traverses bar, and attempts to find baz, which it finds.
+
+ - bfg traverses baz, and attempts to find biz, which it finds.
+
+ - bfg traverses biz, and attempts to find "buz.txt", which it does
+ not find.
+
+The fact that it does not find "buz.txt" at this point does not
+signify an error condition. It signifies that:
+
+ - the "context" is biz (the context is the last item found during traversal).
+
+ - the "view name" is "buz.txt"
+
+ - the "subpath" is the empty list []
+
+Because it's the "context", bfg examines "biz" to find out what "type"
+it is. Let's say it finds that the context is an ``IBiz`` type (because
+"biz" happens to have an attribute attached to it that indicates
+it's an ``IBiz``).
+
+Using the "view name" ("buz.txt") and the type, it asks the
+"application registry" (configured separately, in "configure.zcml")
+this question:
+
+ - Please find me a "view" (controller in some religions) with the name
+ "buz.txt" that can be used for type ``IBiz``.
+
+Let's say that question is answered "here you go, here's a bit of code
+that is willing to deal with that case", and a view is returned. It is
+passed the "biz" object as the "context" and the current WebOb request
+as the "request". It returns a response.
+
+There are two special cases:
+
+- During traversal you will often end up with a "view name" that is
+ the empty string. This indicates that ``repoze.bfg`` should look up
+ the *default view*. The default view is a view that is registered
+ with no name or a view which is registered with a name that equals
+ the empty string.
+
+- If any path segment element begins with the special characters
+ ``@@`` (think of them as goggles), that segment is considered the
+ "view name" immediately and traversal stops there. This allows you
+ to address views that may have the same names as model instance
+ names in the graph umambiguously.
+
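
The traversal steps described in the added document (splitting ``PATH_INFO``, walking ``__getitem__`` calls, then deriving the context, view name, and subpath) can be sketched in a few lines of Python. This is only an illustration written from the prose above, not the code this commit ships. The names ``Node`` and ``traverse`` are hypothetical, and the sketch assumes that a leaf is detected by the absence of a ``__getitem__`` attribute and that the ``@@`` prefix is stripped from the resulting view name.

```python
# Illustrative sketch only. `Node` and `traverse` are hypothetical names
# chosen for this example; they are not part of the repoze.bfg API.


class Node(dict):
    """A container model: its __getitem__ returns a child or raises KeyError."""


def traverse(root, path_info):
    """Return (context, view_name, subpath) for a PATH_INFO string."""
    # Strip the leading slash and split into a traversal sequence.
    segments = [s for s in path_info.lstrip('/').split('/') if s]
    context = root
    for i, name in enumerate(segments):
        rest = segments[i + 1:]
        # An "@@" prefix marks the segment as a view name immediately.
        if name.startswith('@@'):
            return context, name[2:], rest
        # A leaf node has no __getitem__, so traversal cannot continue.
        getitem = getattr(context, '__getitem__', None)
        if getitem is None:
            return context, name, rest
        try:
            context = getitem(name)
        except KeyError:
            # The container has no such child: the name is the view name.
            return context, name, rest
    # The whole path was consumed: default view, empty subpath.
    return context, '', []


# The deeper graph from the example: / -> foo -> bar -> baz -> biz
biz = Node()
baz = Node(biz=biz)
bar = Node(baz=baz)
foo = Node(bar=bar)
root = Node(foo=foo)

context, view_name, subpath = traverse(root, '/foo/bar/baz/biz/buz.txt')
# context is biz, view_name == 'buz.txt', subpath == []
```

Running the same function against the shallower graph (root, foo, bar) stops at ``bar`` with the view name ``baz`` and the subpath ``['biz', 'buz.txt']``, which matches the first walkthrough. View lookup by context type is left out because it depends on the application registry.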