lucy24 - 5:31 pm on Feb 4, 2013 (gmt 0)
The only reference is the past couple years of posts in this forum :) but I can pick it apart.
The "path" part of the request consists of zero or more iterations of aaa followed by a single optional bbb. The aaa and bbb pieces have to be in parentheses so the * and ? can apply to the whole group. But they do not need to be captured separately, so the ?: (which makes a group non-capturing) saves the server a nano-minim of resources.
In the first element
each piece is a directory: one or more non-slash non-periods followed by a single slash
The second element
is a filename: one or more non-slash non-periods followed by a period plus extension, where "\.html" would be replaced by whatever extension you really use.
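For anyone who wants to poke at the two elements outside the server, here is a rough Python sketch. The full pattern is my reconstruction from the description above (directories as `[^./]+/`, filename as `[^./]+\.html`), and the anchors, the name `is_page_path` and the sample paths are assumptions for illustration only. Python's `re` is not the same engine mod_rewrite uses, but these constructs behave the same way in both.

```python
import re

# Assumed reconstruction of the pattern described above:
#   (?:[^./]+/)*       zero or more directories, each ending in one slash
#   (?:[^./]+\.html)?  one optional filename with a literal ".html" extension
# The (?: ... ) groups are non-capturing; they exist only so * and ? apply
# to the whole piece.
PATH = re.compile(r"^(?:[^./]+/)*(?:[^./]+\.html)?$")

def is_page_path(path: str) -> bool:
    """True if path is only directories plus an optional .html filename."""
    return PATH.match(path) is not None

# Hypothetical sample paths, written without a leading slash.
samples = [
    "", "dir/", "dir/sub/page.html",               # expected to match
    "dir/page.php", "dir/sub", "my.dir/page.html",  # expected to fail
]
for s in samples:
    print(repr(s), is_page_path(s))
```

Note that the empty string and a bare directory both match, since the filename piece is optional.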
If you use extensionless URLs, all pages can be expressed even more simply as
without the [^./]+/ component. Anchors are crucial. Inside square brackets (a character class), . is a literal period.
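Both of those points are easy to check for yourself. A small Python sketch, where the directory-only pattern and the helper name `matches` are hypothetical and chosen only to show the effect:

```python
import re

# Inside square brackets, "." is a literal period, not "any character".
assert re.fullmatch(r"[.]", ".") is not None
assert re.fullmatch(r"[.]", "x") is None

# Hypothetical directory-only pattern, with and without anchors.
ANCHORED = re.compile(r"^(?:[^./]+/)*$")
UNANCHORED = re.compile(r"(?:[^./]+/)*")

def matches(pattern: re.Pattern, path: str) -> bool:
    """True if the pattern is found anywhere in path."""
    return pattern.search(path) is not None

# Without anchors, a starred group is satisfied by an empty match,
# so the pattern "matches" a path it was never meant to accept.
print(matches(UNANCHORED, "bad.name/x.php"))  # True
print(matches(ANCHORED, "bad.name/x.php"))    # False
print(matches(ANCHORED, "dir/sub/"))          # True
```

Without the ^ and $, everything in the pattern is optional, so it succeeds on any request at all.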
Using this formulation, the server will only have to backtrack once, if at all: when it captures a [^./]+ group and then runs into a . dot instead of a / slash.