Lars Hjemli 939d32fda7 Redesign the caching layer
The original caching layer in cgit has no upper bound on the number of
concurrent cache entries, so when cgit is traversed by a spider (like the
googlebot), the cache might end up filling your disk. Also, if any error
occurs in the cache layer, no content is returned to the client.

This patch redesigns the caching layer to avoid these flaws by:
* giving the cache a bounded number of slots
* disabling the cache for the current request when errors occur

The cache size limit is implemented by hashing the querystring (the cache
lookup key) and generating a cache filename based on this hash modulo the
cache size. In order to detect hash collisions, the full lookup key (i.e.
the querystring) is stored in the cache file (separated from its associated
content by ascii 0).
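
The collision check described above can be sketched as follows. This is only
an illustration of the "key, ascii 0, content" layout, not the actual cgit
code; the function names pack_entry and unpack_entry are hypothetical.

```python
def pack_entry(key: str, content: bytes) -> bytes:
    """Serialize a cache entry as: lookup key, NUL byte, content."""
    key_bytes = key.encode("utf-8")
    if b"\0" in key_bytes:
        raise ValueError("lookup key must not contain a NUL byte")
    return key_bytes + b"\0" + content


def unpack_entry(raw: bytes, expected_key: str):
    """Return the cached content, or None if the slot belongs to another key."""
    # Split at the first NUL only; the content itself may contain NUL bytes.
    stored_key, sep, content = raw.partition(b"\0")
    if not sep or stored_key != expected_key.encode("utf-8"):
        return None  # hash collision or malformed file: treat as a cache miss
    return content
```

Because two different querystrings can map to the same slot, a reader has to
compare the stored key with its own key before trusting the content.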

The cache filename is the reversed 8-digit hexadecimal representation of

  hash(key) % cache_size

which should make the filesystem lookup pretty fast (if directory content
is indexed/sorted); reversing the representation avoids the problem where
all keys have equal prefix.
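
A minimal sketch of the filename scheme, for illustration only. The text above
does not name the hash function, so a 32-bit FNV-1a hash is assumed here as a
stand-in; fnv1a_32, slot_to_name and slot_filename are hypothetical names.

```python
def fnv1a_32(data: bytes) -> int:
    """32-bit FNV-1a hash (an assumed stand-in for the real hash function)."""
    h = 0x811C9DC5
    for byte in data:
        h ^= byte
        h = (h * 0x01000193) & 0xFFFFFFFF
    return h


def slot_to_name(slot: int) -> str:
    """Format a slot number as 8 hex digits, then reverse the string."""
    return f"{slot:08x}"[::-1]


def slot_filename(key: str, cache_size: int) -> str:
    """Map a lookup key (the querystring) to its cache filename."""
    if cache_size <= 0:
        raise ValueError("cache_size must be positive")
    return slot_to_name(fnv1a_32(key.encode("utf-8")) % cache_size)
```

With a small cache size the unreversed names would all start with a long run
of zeros (slot 0x2a is "0000002a"); after reversal the varying digits come
first ("a2000000"), so the names differ early in the string.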

There is a new config option, cache-size, which sets the upper bound for
the cache. Default value for this option is 0, which has the same effect
as setting nocache=1 (hence nocache is now deprecated).

Included in this patch is also a new testfile which verifies that the
new option works as intended.

Signed-off-by: Lars Hjemli <hjemli@gmail.com>

                       cgit - cgi for git


This is an attempt to create a fast web interface for the git scm, using a
builtin cache to decrease server io-pressure.


Installation

Building cgit involves building a proper version of git. How to do this
depends on how you obtained the cgit sources:

a) If you're working in a cloned cgit repository, you first need to
initialize and update the git submodule:

  $ git submodule init     # register the git submodule in .git/config
  $ $EDITOR .git/config    # if you want to specify a different url for git
  $ git submodule update   # clone/fetch and checkout correct git version

b) If you're building from a cgit tarball, you can download a proper git
version like this:

  $ make get-git


When either a) or b) has been performed, you can build and install cgit like
this:

  $ make
  $ sudo make install

This will install cgit.cgi and cgit.css into "/var/www/htdocs/cgit". You can
configure this location (and a few other things) by providing a "cgit.conf"
file (see the Makefile for details).


Dependencies:
  -git 1.5.3
  -zip lib
  -crypto lib
  -openssl lib


Apache configuration

A new Directory section will probably have to be added for cgit, for example
something like this:

  <Directory "/var/www/htdocs/cgit/">
      AllowOverride None
      Options ExecCGI
      Order allow,deny
      Allow from all
  </Directory>


Runtime configuration

The file /etc/cgitrc is read by cgit before handling a request. In addition
to runtime parameters, this file also contains a list of the repositories
displayed by cgit.

A template cgitrc is shipped with the sources, and all parameters and default
values are documented in this file.


The cache

When cgit is invoked it looks for a cachefile matching the request and
returns it to the client. If no such cachefile exists (or if it has expired),
the content for the request is written into the proper cachefile before the
file is returned.

If the cachefile has expired but cgit is unable to obtain a lock for it, the
stale cachefile is returned to the client. This is done to favour page
throughput over page freshness.
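
The decision logic described above can be sketched roughly as follows. This is
an illustration under stated assumptions, not the cgit implementation: the
function serve and its parameters are hypothetical, the cachefile is modelled
as a (content, created_at) tuple, and a missing cachefile is simply
regenerated without modelling the lock.

```python
def serve(entry, now, ttl, try_lock, generate):
    """Decide what to return for one request.

    entry    -- (content, created_at) for the matching cachefile, or None
    now, ttl -- current time and cache lifetime, in the same time unit
    try_lock -- callable returning True if the cachefile lock was obtained
    generate -- callable producing fresh content for the request

    Returns (content_for_client, cachefile_state_after_request).
    """
    if entry is not None and now - entry[1] < ttl:
        return entry[0], entry  # fresh cachefile: return it as-is
    if entry is not None and not try_lock():
        return entry[0], entry  # expired but locked: return the stale copy
    content = generate()  # missing, or expired with the lock held
    return content, (content, now)
```

Returning the stale copy when the lock is busy means a request does not have
to wait for another process that is already regenerating the same page.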

The generated content contains the complete response to the client, including
the HTTP headers "Last-Modified" and "Expires".


The missing features

* Submodule links in the directory listing page have a fixed format per
  repository. This should probably be extended to a generic map between
  submodule path and url.

* Branch- and tag-lists in the summary page can get very long; they should
  probably only show something like the ten "latest modified" branches and
  a similar number of "most recent" tags.

* There should be a new page for browsing refs/heads and refs/tags, with links
  from the summary page whenever the branch/tag lists overflow.

* The log-page should have more/better search options (author, committer,
  pickaxe, paths) and possibly support arbitrary revision specifiers.

* A set of test-scripts is required before cgit-1.0 can be released.

Patches/bugreports/suggestions/comments are always welcome. Please feel free
to contact the author: hjemli@gmail.com