diff options
| author | Casey Duncan <casey.duncan@gmail.com> | 2010-12-31 13:48:07 -0700 |
|---|---|---|
| committer | Casey Duncan <casey.duncan@gmail.com> | 2010-12-31 13:48:07 -0700 |
| commit | f4c5f1a60612749ef36aae01d9a3a559b6acdfff (patch) | |
| tree | c7dd9ddc3fcf752b16fd59bad29571bf44d4d246 /docs/narr | |
| parent | 6f7b6c0221f8c120a22b630eea26905744d2b6b4 (diff) | |
| download | pyramid-f4c5f1a60612749ef36aae01d9a3a559b6acdfff.tar.gz pyramid-f4c5f1a60612749ef36aae01d9a3a559b6acdfff.tar.bz2 pyramid-f4c5f1a60612749ef36aae01d9a3a559b6acdfff.zip | |
add Much ado about traversal chapter from Rob Miller, with light adaptations. Also remove some now redundant overview content in the Traversal chapter, which is now only details.
Diffstat (limited to 'docs/narr')
| -rw-r--r-- | docs/narr/muchadoabouttraversal.rst | 293 | ||||
| -rw-r--r-- | docs/narr/traversal.rst | 50 |
2 files changed, 312 insertions, 31 deletions
diff --git a/docs/narr/muchadoabouttraversal.rst b/docs/narr/muchadoabouttraversal.rst new file mode 100644 index 000000000..52b6dd3a7 --- /dev/null +++ b/docs/narr/muchadoabouttraversal.rst @@ -0,0 +1,293 @@ +.. _much_ado_about_traversal_chapter: + +======================== +Much Ado About Traversal +======================== + +Introduction +------------ + +A lot of folks who have been using Pylons (and, therefore, Routes-based +URL matching) are being exposed for the first time, via :app:`Pyramid`, +to new ideas such as ":term:`traversal`" and ":term:`view lookup`" as a +way to route incoming HTTP requests to callable code. This has caused a +bit of consternation in some circles. Many think that traversal is hard +to understand. Others question its usefulness; URL matching has worked +for them so far, why should they even consider dealing with another +approach, one which doesn't fit their brain and which doesn't provide +any immediately obvious value? + +This chapter is an attempt to counter these opinions. Traversal and +view lookup *are* useful. There are some straightforward, real-world +use cases that are much more easily served by a traversal-based approach +than by a pattern-matching mechanism. Even if you haven't yet hit one +of these use cases yourself, understanding these new ideas is worth the +effort for any web developer so you know when you might want to use +them. Especially because (WARNING: Bold Assertion Ahead) these ideas +are *not* particularly hard to understand. In fact, :term:`traversal` +is a straightforward metaphor easily comprehended by anyone who's ever +used a run-of-the-mill file system with folders and files. + +.. note:: + + Those of you who are already familiar with traversal and view lookup + conceptually, may want to skip directly to the + :ref:`traversal_chapter` chapter, which discusses the technical + details. + +URL Matching +------------ + +Let's take a step back. The problem we're trying to solve is +simple. We have an HTTP request for a particular path that +has been routed to our web application. The requested path will +possibly invoke a specific callable function defined somewhere in our +app, or it may point to nothing in which case a 404 response should be +generated. What we're trying to do is figure out is which callable +function, if any, should be invoked for a given requested path. + +URL matching (or :term:`URL dispatch` in :app:`Pyramid` parlance) +approaches this problem by parsing the URL path and comparing the +results to a set of registered "patterns", defined by a set of regular +expressions, or some other URL path templating syntax. Each pattern is +mapped to a callable function somewhere; if the request path matches a +specific pattern, the associated function is called. If the request +path matches more than one pattern, some conflict resolution scheme is +used, usually a simple order precedence so that the first match will +take priority over any subsequent matches. If a request path doesn't +match any of the defined patterns, we've got a 404. + +Just in case it's not crystal clear, we'll give an example. Using +:app:`Pyramid`'s syntax, we might have a match pattern such as +``/{userid}/photos/{photoid}``, mapped to a ``photo_view()`` function +defined somewhere in our code. Then a request for a path such as +``/joeschmoe/photos/photo1`` would be a match, and the ``photo_view()`` +function would be invoked to handle the request. Similarly, +``/{userid}/blog/{year}/{month}/{postid}`` might map to a +``blog_post_view()`` function, so +``/joeschmoe/blog/2010/12/urlmatching`` would trigger the function, +which presumably would know how to find and render the ``urlmatching`` +blog post. + +Historical Refresher +-------------------- + +Okay, we've got :term:`URL dispatch` out of the way, soon we'll dig in +to the supposedly "harder to understand" idea of traversal. Before we +do, though, let's take a trip down memory lane. If you've been doing +web work for a while, you may remember a time when we didn't have these +fancy web frameworks. Instead, we had general purpose HTTP servers that +primarily served files off of a file system. The "root" of a given site +mapped to a particular folder somewhere on the file system. Each +segment of the request path represented a subdirectory. The final path +segment would be either a directory or a file, and once the server found +the right file it would package it up in an HTTP response and send it +back to the client. So serving up a request for +``/joeschmoe/photos/photo1`` literally meant that there was a +``joeschmoe`` folder somewhere, which contained a ``photos`` folder, +which in turn contained a ``photo1`` file. If at any point along the +way we find that there is not a folder or file matching the requested +path, we return a 404 response. + +As the web grew more dynamic, however, a little bit of extra +complexity was added. Technologies such as CGI and HTTP server +modules were developed. Files were still looked up on the file +system, but if the file ended with (for example) ``.cgi`` or ``.php``, +or if it lived in a special folder, instead of simply sending the file +to the client the server would read the file, execute it using an +interpreter of some sort, and then send the output from this process +to the client as the final result. The server configuration specified +which files would trigger some dynamic code, with the default case +being to just serve the static file. + +Traversal (aka Resource Location) +--------------------------------- + +You with me so far? Good. Because if you understand how serving +files from a file system works, then you pretty much understand +traversal. And if you understand that a server might do something +different based on what type of file a given request specifies, then +you pretty much understand view lookup. + +Wait... what!?! + +.. index:: + single: traversal overview + +The only difference between file system lookup and traversal is that a +file system lookup is stepping through nested directories and files in +a file system tree, while traversal is stepping through nested +dictionary-type objects in an object tree. Let's take a detailed look +at one of our example paths, so we can see what I mean: + +With ``/joeschmoe/photos/photo1``, we've got 4 segments: ``/``, +``joeschmoe/``, ``photos/`` and ``photo1``. With file system +lookup we have a root folder (``/``) containing a nested folder +(``joeschmoe``), which contains ANOTHER nested folder (``photos``), +which finally contains a JPG file ("photo1"). With traversal, we +have a dictionary-like root object. Asking for the ``joeschmoe`` key +gives us another dictionary-like object. Asking this in turn for the +``photos`` key gives us yet another mapping object, which finally +(hopefully) contains the resource that we're looking for within its +values, referenced by the ``photo1`` key. + +In pure Python terms, then, the traversal or "resource location" +portion of satisfying the ``/joeschmoe/photos/photo1`` request +will look like this:: + + get_root()['joeschmoe']['photos']['photo1'] + +Where ``get_root()`` is some function that returns our root traversal +resource. If all of the specified keys exist, then the returned object +will be the resource that is being requested, analogous to the JPG file +that was retrieved in the file system example. If a :exc:`KeyError` is +generated anywhere along the way, we get a 404. (Well, this isn't +precisely true, as you'll see when we learn about view lookup below, but +the basic idea holds.) + +What is a "resource"? +--------------------- + +Okay, okay... files on a file system I understand, you might say. But +what are these nested dictionary things? Where do these objects, these +"resources", live? What *are* they? + +Well, since :app:`Pyramid` is not a highly opinionated framework, there +is no restriction on how a resource is implemented; the developer can do +whatever he wants. One common pattern is to persist all of the +resources, including the root, in a database. The root object stores +the ids of all of its subresources, and provides a ``__getitem__`` +implementation that fetches them. So ``get_root()`` fetches the unique +root object, while ``get_root()['joeschmoe']`` returns a different +object, also stored in the database, which in turn has its own +subresources and ``__getitem__`` implementation, etc. These resources +could be persisted in a relational database, one of the many "NoSQL" +solutions that are becoming popular these days, or anywhere else, it +doesn't matter. As long as the returned objects provide the +dictionary-like API (i.e. as long as they have an appropriately +implemented ``__getitem__`` method) then traversal will work. + +In fact, you don't need a "database" at all. You could trivially +implement a set of objects with ``__getitem__`` methods that search +for files in specific directories, and thus precisely recreate the +older mechanism of having the URL path mapped directly to a folder +structure on the file system. Traversal is in fact a superset of file +system lookup. + +View Lookup +----------- + +At this point we're nearly there. We've covered traversal, which is +the process by which a specific resource is retrieved according to a +specific URL path. But what is this "view lookup" business? + +View lookup comes from a simple realization, namely, that there is more +than one possible action that you might want to take for a single +resource. With our photo example, for instance, you might want to view +the photo in a page, but you might also want to provide a way for the +user to edit the photo and any associated metadata. We'll call the +former the ``view`` view, and the latter will be the ``edit`` view +(Original, I know.) :app:`Pyramid` has a centralized view registry +where named views can be associated with specific resource types. So in +our example, we'll assume that we've registered ``view`` and ``edit`` +views for photo objects, and that we've specified the ``view`` view as +the default, so that ``/joeschmoe/photos/photo1/view`` and +``/joeschmoe/photos/photo1`` are equivalent. The edit view would +sensibly be provided by a request for ``/joeschmoe/photos/photo1/edit``. + +Hopefully it's clear that the first portion of the edit view's URL path +is going to resolve to the same resource as the non-edit version, +specifically the resource returned by +``get_root()['joeschmoe']['photos']['photo1']``. But traveral ends +there; the ``photo1`` resource doesn't have an ``edit`` key. In fact, +it might not even be a dictionary-like object, in which case +``photo1['edit']`` would be meaningless. When :app:`Pyramid`'s resource +location has resolved to a *leaf* resource but the entire request path +has not yet been expended, the next path segment is treated as a view +name. The registry is then checked to see if a view of the given name +has been specified for a resource of the given type. If so, the view +callable is invoked, with the resource passed in as the ``context`` +object; if not, we 404. + +This is a slight simplification, but to summarize you can think of a +request for ``/joeschmoe/photos/photo1/edit`` as ultimately converted +into the following piece of Python:: + + context = get_root()['joeschmoe']['photos']['photo1'] + view_callable = registry.get_view(context, 'edit') + view_callable(context, request) + +That's not too hard to conceptualize, is it? + +Use Cases +--------- + +Let's come back around to look at why we even care. Yes, maybe +traversal and view lookup isn't mind-bending rocket science. But URL +matching is easier to explain, and it's good enough, right? + +In some cases, yes, but certainly not in all cases. So far we've had +very structured URLs, where our paths have had a specific, small +number of pieces, like this:: + + /{userid}/{typename}/{objectid}[/{view_name}] + +In all of the examples thus far, we've hard coded the typename value, +assuming that we'd know at development time what names were going to +be used ("photos", "blog", etc.). But what if we don't know what +these names will be? Or, worse yet, what if we don't know *anything* +about the structure of the URLs inside a user's folder? We could be +writing a CMS where we want the end user to be able to arbitrarily add +content and other folders inside his folder. He might decide to nest +folders dozens of layers deep. How would you construct matching +patterns that could account for every possible combination of paths +that might develop? + +It may be possible, but it's tricky at best. And your matching +patterns are going to become quite complex very quickly as you try +to handle all of the edge cases. + +With traversal, however, it's straightforward. You want 20 layers of +nesting? No problem, :app:`Pyramid` will happily call ``__getitem__`` +as long as it needs to, until it runs out of path segments or until it +gets a :exc:`KeyError`. Each resource only needs to know how to fetch +its immediate children, the traversal algorithm takes care of the rest. + +The key advantage of traversal here is that the structure of the +resource tree can live in the database, and not in the code. It's +simple to let users modify the tree at runtime to set up their own +personalized directory structures. + +Another use case in which traversal shines is when there is a need to +support a context-dependent security policy. One example might be a +document management infrastructure for a large corporation, where +members of different departments have varying access levels to the +various other departments' files. Reasonably, even specific files +might need to be made available to specific individuals. Traversal +does well here because the idea of a resource context is baked right +into the code resolution and calling process. Resource objects can +store ACLs, which can be inherited and/or overridden by the +subresources. + +If each resource can thus generate a context-based ACL, then whenever +view code is attempting to perform a sensitive action, it can check +against that ACL to see whether the current user should be allowed to +perform the action. In this way you achieve so called "instance based" +or "row level" security which is considerably harder to model using a +traditional tabular approach. :app:`Pyramid` actively supports such a +scheme, and in fact if you register your views with guard permissions +and use an authorization policy, :app:`Pyramid` can check against a +resource's ACL when deciding whether or not the view itself is available +to the current user. + +In summary, there are entire classes of problems that are more easily +served by traversal and view lookup than by :term:`URL dispatch`. If +your problems aren't of this nature, great, stick with :term:`URL +dispatch`. But if you're using :app:`Pyramid` and you ever find that +you *do* need to support one of these use cases, you'll be glad you have +traversal in your toolkit. + +.. note:: + It is even possible to mix and match :term:`traversal` with + :term:`URL dispatch` in the same :app:`Pyramid` application. See the + :ref:`hybrid_chapter` chapter for details. diff --git a/docs/narr/traversal.rst b/docs/narr/traversal.rst index 2d7878265..e8949880c 100644 --- a/docs/narr/traversal.rst +++ b/docs/narr/traversal.rst @@ -3,34 +3,22 @@ Traversal ========= -:term:`Traversal` provides an alternative to using :term:`URL dispatch` to -map a URL to a :term:`view callable`. It is the act of locating a -:term:`context` resource by walking over a :term:`resource tree`, starting -from a :term:`root` resource, using a :term:`request` object as a source of -path information. Once a context resource is found, a view callable is -looked up and invoked. - -Using :term:`Traversal` to map a URL to code is optional. It is often less -easy to understand than URL dispatch, so if you're a rank beginner, it -probably makes sense to use URL dispatch to map URLs to code instead of -traversal. In that case, you can skip this chapter. - -.. index:: - single: traversal overview - -A High-Level Overview of Traversal ----------------------------------- - A :term:`traversal` uses the URL (Universal Resource Locator) to find a -:term:`resource`. This is done by mapping each segment of the path portion -of the URL into a set of nested dictionary-like objects called the -:term:`resource tree`. You might think of this as looking up files and -directories in a file system. Traversal walks down the path until it finds a -published "directory" or "file". The resource we find as the result of a -traversal becomes the :term:`context`. A separate :term:`view lookup` -subsystem is used to then find some view code willing "publish" the context +:term:`resource` located in a :term:`resource tree`, which is a set of +nested dictionary-like objects. Traversal is done by using each segment +of the path portion of the URL to navigate through the :term:`resource +tree`. You might think of this as looking up files and directories in a +file system. Traversal walks down the path until it finds a published +"directory" or "file". The resource we find as the result of a +traversal becomes the :term:`context`. Then, the :term:`view lookup` +subsystem is used to find some view code willing "publish" this resource. +Using :term:`Traversal` to map a URL to code is optional. It is often +less easy to understand than :term:`URL dispatch`, so if you're a rank +beginner, it probably makes sense to use URL dispatch to map URLs to +code instead of traversal. In that case, you can skip this chapter.` + .. index:: single: traversal details @@ -76,7 +64,7 @@ element cannot be resolved to a resource. In either case, a :term:`context` resource is chosen. Traversal "stops" when it either reaches a leaf level resource in your -resource tree or when the path segments implied by the URL "run out". The +resource tree or when the path segments from the URL "run out". The resource that traversal "stops on" becomes the :term:`context`. If at any point during traversal any resource in the tree doesn't have a ``__getitem__`` method, or if the ``__getitem__`` method of a resource raises @@ -88,11 +76,11 @@ The results of a :term:`traversal` also include a :term:`view name`. The segments "left over" in the path segment list popped by the traversal process *after* traversal finds a context resource. -The combination of the context resource and the :term:`view name` found via -traversal is used later in the same request by a separate :app:`Pyramid` -subsystem -- the :term:`view lookup` subsystem -- to find a :term:`view -callable` later within the same request. How :app:`Pyramid` performs view -lookup is explained within the :ref:`views_chapter` chapter. +The combination of the context resource and the :term:`view name` found +via traversal is used later in the same request by the :term:`view +lookup` subsystem to find a :term:`view callable`. How :app:`Pyramid` +performs view lookup is explained within the :ref:`views_chapter` +chapter. .. index:: single: object tree |
