The original caching layer in cgit has no upper bound on the number of
concurrent cache entries, so when cgit is traversed by a spider (like the
googlebot), the cache might end up filling your disk. Also, if any error
occurs in the cache layer, no content is returned to the client.
This patch redesigns the caching layer to avoid these flaws by
* giving the cache a bounded number of slots
* disabling the cache for the current request when errors occur
The cache size limit is implemented by hashing the querystring (the cache
lookup key) and generating a cache filename based on this hash modulo the
cache size. In order to detect hash collisions, the full lookup key (i.e.
the querystring) is stored in the cache file (separated from its associated
content by ascii 0).
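The collision check described above could look roughly like the following. This is a minimal sketch, not the code from the patch: `slot_matches_key()` is a hypothetical helper that works on an in-memory copy of a cache file and assumes the `<key> NUL <content>` layout.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: buf holds the raw bytes of a cache file, laid out as
 * <key> '\0' <content>. Returns a pointer to the content if the stored key
 * equals the requested key, or NULL on a collision or a malformed file.
 */
static const char *slot_matches_key(const char *buf, size_t len,
                                    const char *key, size_t *content_len)
{
    /* Find the ASCII 0 that terminates the stored key. */
    const char *sep = memchr(buf, '\0', len);
    size_t keylen = strlen(key);

    if (!sep)
        return NULL;    /* no separator: not a valid slot */
    if ((size_t)(sep - buf) != keylen || memcmp(buf, key, keylen) != 0)
        return NULL;    /* another key hashed to this slot */

    *content_len = len - keylen - 1;
    return sep + 1;
}
```

A NULL return means the slot belongs to a different querystring, so the caller would regenerate the page instead of serving the slot.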
The cache filename is the reversed 8-digit hexadecimal representation of
hash(key) % cache_size
which should make the filesystem lookup pretty fast (if directory content
is indexed/sorted); reversing the representation avoids the problem where
all keys have equal prefix.
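As an illustration of the naming scheme, here is a small sketch. The commit message does not name the hash function, so 32-bit FNV-1a is used purely as a stand-in, and `hash_key()` and `slot_filename()` are hypothetical names. `cache_size` must be greater than 0, since 0 disables the cache.

```c
#include <stdint.h>
#include <stdio.h>

/* Stand-in hash (32-bit FNV-1a); an assumption, not taken from the patch. */
static uint32_t hash_key(const char *key)
{
    uint32_t h = 2166136261u;

    for (; *key; key++) {
        h ^= (unsigned char)*key;
        h *= 16777619u;
    }
    return h;
}

/* Write the reversed 8-digit hex form of hash(key) % cache_size into out. */
static void slot_filename(const char *key, uint32_t cache_size, char out[9])
{
    char hex[9];
    int i;

    snprintf(hex, sizeof(hex), "%08x", (unsigned)(hash_key(key) % cache_size));
    for (i = 0; i < 8; i++)
        out[i] = hex[7 - i];    /* least significant digit comes first */
    out[8] = '\0';
}
```

With a small cache size the unreversed names would all start with a run of zeros; after reversal the varying digit comes first, which is the prefix problem the message refers to.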
There is a new config option, cache-size, which sets the upper bound for
the cache. Default value for this option is 0, which has the same effect
as setting nocache=1 (hence nocache is now deprecated).
Included in this patch is also a new testfile which verifies that the
new option works as intended.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This global variable is used by the config parsing callback to keep track
of the currently configured repository. If it is not reset to NULL when
the config parser is finished, and neither `url` nor `r` is specified on the
querystring, cgit will wrongly consider the last configured repo as
selected.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
The functions found in cache.c are only used by cgit.c, so there's no
point in rebuilding all object files when the cache interface is changed.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
With the matching Makefile change, this makes sure that only cgit.o and cgit
proper need to be rebuilt when VERSION has been modified.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This struct is used when generating http headers, and as such is another
small step towards the goal of the whole cleanup series; to invoke each
page/view function with a function pointer.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
All html-functions can be quite easily separated from the rest of cgit, so
let's do it; the only issue was html_filemode, which uses some git-defined
macros so the function is moved into ui-shared.c::cgit_print_filemode().
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Using the functions offered by libgit feels like the right thing to do. Also,
make sure that config errors get properly reported.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This removes the global variable which is used to keep track of the
currently selected repository, and adds a new variable in the cgit_context
structure.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This removes another big set of global variables, and introduces the
cgit_prepare_context() function which populates a context-variable with
compile-time default values.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This struct will hold all the cgit runtime information currently found in
a multitude of global variables.
The first cleanup removes all querystring-related variables.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
When no branch is specified and the repository does not have a default branch,
use the first branch.
Also, print sensible error messages when the repository does not contain any
branches and when invalid branch names are specified.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
The new view mimics the output from `git format-patch`, making it possible
to cherry-pick directly from cgit with something like `curl $url | git am`.
Inspired by a patch to `git-apply` by Mike Hommey:
http://thread.gmane.org/gmane.comp.version-control.git/67611/focus=67610
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This makes the log searching more explicit, using a dropdown box to specify
the commit field to match against.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This enables the new urls $repo/refs, $repo/refs/heads and $repo/refs/tags,
which can be used to print _all_ branches and/or tags.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
* 'master' of git://git.klever.net/patchwork/cgit:
link raw blob from tree file view
fix: changed view link to blob in summary.
allow selective enabling of snapshots
shorten snapshot names to repo basename
introduce cgit_repobasename
added snapshot filename to the link
add plain uncompressed tar snapshot format
introduced .tar.bz2 snapshots
compress .tar.gz using gzip as a filter
added a chk_non_negative check
css: adjust vertical-align of commit info th cells
add support for snapshot tarballs
Conflicts:
ui-summary.c
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This file implements the tag-command, i.e. printing of annotated tags.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
The snapshot configuration parameter can now be a
space/slash/comma/colon/semicolon/pipe-separated list of snapshot suffixes as
listed in ui-snapshot.c.
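A sketch of how such a list might be parsed into a bitmask. The suffix table and the `SNAP_*` flags below are assumptions made for illustration; the authoritative list is the one in ui-snapshot.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bit flags, one per snapshot format. */
enum {
    SNAP_TAR     = 1 << 0,
    SNAP_TAR_GZ  = 1 << 1,
    SNAP_TAR_BZ2 = 1 << 2,
    SNAP_ZIP     = 1 << 3
};

/* Assumed suffix table, not copied from ui-snapshot.c. */
static const struct {
    const char *suffix;
    int bit;
} snap_suffixes[] = {
    { ".tar",     SNAP_TAR },
    { ".tar.gz",  SNAP_TAR_GZ },
    { ".tar.bz2", SNAP_TAR_BZ2 },
    { ".zip",     SNAP_ZIP },
};

/* Turn a value such as ".tar.gz|.zip" into a mask of enabled formats. */
static int parse_snapshots(const char *value)
{
    static const char delims[] = " /,:;|";
    const size_t count = sizeof(snap_suffixes) / sizeof(snap_suffixes[0]);
    int mask = 0;

    while (*value) {
        size_t len, i;

        value += strspn(value, delims);     /* skip leading separators */
        len = strcspn(value, delims);       /* length of the next token */
        for (i = 0; i < count; i++) {
            const char *s = snap_suffixes[i].suffix;

            if (strlen(s) == len && strncmp(value, s, len) == 0)
                mask |= snap_suffixes[i].bit;
        }
        value += len;
    }
    return mask;
}
```

Unknown suffixes are silently ignored in this sketch; whether the real parser reports them is not stated in the commit message.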
Signed-off-by: Michael Krelin <hacker@klever.net>
- reworked cgit_print_snapshot to use a list of supported archivers and pick
one for the suffix supplied
- moved printing of snapshot links into ui-snapshot and made it iterate through
the said list
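The archiver list could be modelled as a table of suffix/function pairs, as in this sketch. The struct layout, the stub writers and `find_archiver()` are hypothetical names chosen for illustration and are not taken from ui-snapshot.c.

```c
#include <stddef.h>
#include <string.h>

typedef int (*archive_fn)(const char *prefix);

/* Stubs standing in for the real archive writers. */
static int write_tar(const char *prefix)     { (void)prefix; return 0; }
static int write_tar_gz(const char *prefix)  { (void)prefix; return 0; }
static int write_tar_bz2(const char *prefix) { (void)prefix; return 0; }
static int write_zip(const char *prefix)     { (void)prefix; return 0; }

struct archiver {
    const char *suffix;
    archive_fn write;
};

static const struct archiver archivers[] = {
    { ".tar",     write_tar },
    { ".tar.gz",  write_tar_gz },
    { ".tar.bz2", write_tar_bz2 },
    { ".zip",     write_zip },
};

#define N_ARCHIVERS (sizeof(archivers) / sizeof(archivers[0]))

/* Pick the archiver whose suffix ends the requested filename, if any. */
static const struct archiver *find_archiver(const char *filename)
{
    size_t flen = strlen(filename);
    size_t i;

    for (i = 0; i < N_ARCHIVERS; i++) {
        size_t slen = strlen(archivers[i].suffix);

        if (flen > slen &&
            strcmp(filename + flen - slen, archivers[i].suffix) == 0)
            return &archivers[i];
    }
    return NULL;
}
```

Printing the snapshot links then becomes a loop over the same table, so adding a format only requires one new table entry.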
* lh/menu:
Add ofs argument to cgit_log_link and use it in ui-log.c
Add trim_end() and use it to remove trailing slashes from repo paths
Do not include current path in the "tree" menu link
Add setting to enable/disable extra links on index page
Change S/L/T to summary/log/tree
Change "files" to "tree"
Include querystring as part of cached filename for repo summary page
Add more menuitems on repo pages
When adding support for the h parameter to the summary page (passing current
branch between pages), the builtin cache returned basically random results
for summary page since the cached filename didn't honour the querystring.
This fixes the issue for now, but someday it might be worthwhile to generate
'canonical' filenames in the cache for all pages, i.e. something a bit more
clever than just including the querystring.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
A new script, gen-version.sh, is now invoked from 'make version' to generate
the file VERSION. This file contains a version identifier generated by
git-describe and is included in the Makefile.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This adds a new function used to generate links to the diff page and uses
it everywhere such links appear (except for single files in the diffstat
displayed on the commit page: this is now a link to the tree page).
The updated diff-page now expects zero, one or two revision specifiers, in
parameters head, id and id2. Id defaults to head unless otherwise specified,
while head (as usual) defaults to repo.defbranch. If id2 isn't specified, it
defaults to the first parent of id.
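The defaulting rules can be sketched with plain strings. This is only an illustration: the real code resolves commits through libgit, whereas the hypothetical `resolve_diff_revs()` below spells "first parent" with git's `rev^1` revision syntax.

```c
#include <stdio.h>

struct diff_revs {
    char id[128];
    char id2[128];
};

/* Apply the defaults: head <- defbranch, id <- head, id2 <- first parent of id. */
static void resolve_diff_revs(const char *head, const char *id,
                              const char *id2, const char *defbranch,
                              struct diff_revs *out)
{
    if (!head)
        head = defbranch;
    if (!id)
        id = head;
    snprintf(out->id, sizeof(out->id), "%s", id);
    if (id2)
        snprintf(out->id2, sizeof(out->id2), "%s", id2);
    else
        snprintf(out->id2, sizeof(out->id2), "%s^1", id);
}
```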
The most important change is of course that now all repo pages (summary, log,
tree, commit and diff) have support for passing on the current branch and
revision, i.e. the road is now open for a 'static' menu with links to all
of these pages.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This teaches ui-log to prefer id=sha1 and fall back to h=rev if no
id-parameter is specified. With this change, summary, log, commit and tree
views now pass the current branch using the h parameter and the current revision
using id parameter.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This adds a function to generate links to the commit page and extends said
page to use id from the querystring as the primary revision specifier (falling
back to h).
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This teaches ui-tree to show both trees and blobs, thereby making ui-view
superfluous. At the same time, ui-tree is extended to honour the specified
path instead of requiring a tree/blob sha1.
This makes it possible to use repo-urls like '/pub/scm/git/git.git' and
even add path specifications, like '/pub/scm/git/git.git/log/documentation'.
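One way to split such a url into repo, page and path is to match it against the registered repo urls, as sketched below. The function name, the `struct request` layout and the assumption that repos are registered without a leading slash are all hypothetical.

```c
#include <stddef.h>
#include <string.h>

struct request {
    const char *repo;   /* matching registered repo url */
    char page[32];      /* e.g. "log"; empty if the url names only a repo */
    const char *path;   /* remainder after the page, or NULL */
};

/* Returns 1 and fills req if url starts with a registered repo, else 0. */
static int parse_repo_url(const char *url, const char **repos, size_t nrepos,
                          struct request *req)
{
    size_t i;

    if (*url == '/')
        url++;
    for (i = 0; i < nrepos; i++) {
        size_t rlen = strlen(repos[i]);
        const char *rest, *slash;
        size_t plen;

        if (strncmp(url, repos[i], rlen) != 0)
            continue;
        rest = url + rlen;
        if (*rest != '\0' && *rest != '/')
            continue;   /* prefix matched only part of a component */

        req->repo = repos[i];
        req->page[0] = '\0';
        req->path = NULL;
        if (*rest == '/')
            rest++;
        if (*rest == '\0')
            return 1;   /* repo only: the caller picks a default page */
        slash = strchr(rest, '/');
        plen = slash ? (size_t)(slash - rest) : strlen(rest);
        if (plen >= sizeof(req->page))
            return 0;
        memcpy(req->page, rest, plen);
        req->page[plen] = '\0';
        if (slash && slash[1] != '\0')
            req->path = slash + 1;
        return 1;
    }
    return 0;
}
```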
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
The commitdiff will be generated against the first parent, and the
diff page also gets the benefit of repo.defbranch.
Cleaned up some bad whitespace in cgit.h while at it.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Pages which expect head to be specified in the querystring can now be
given a default value, configurable per repository (via repo.defbranch,
which defaults to "master").
Currently, only the log page actually works without parameters, but the
defbranch is bound to be exploited.
This also removes some dead code from shared.c
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Also, let the makefile define the name of the installed cgi and
use that definition as a default value for cgit_script_name variable.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Pass CGIT_CONFIG from makefile during build, to enable stuff like
make CGIT_CONFIG=/var/cgit/cgit.conf
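On the C side this kind of build-time setting is usually consumed through a preprocessor fallback. The sketch below assumes the Makefile passes the path as a quoted `-DCGIT_CONFIG=...` define; `cgit_config_path()` is a hypothetical accessor.

```c
/* Fall back to a built-in path unless the build defined CGIT_CONFIG. */
#ifndef CGIT_CONFIG
#define CGIT_CONFIG "/etc/cgitrc"
#endif

/* Hypothetical accessor for the compiled-in config file location. */
static const char *cgit_config_path(void)
{
    return CGIT_CONFIG;
}
```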
Noticed by Kristian Høgsberg <krh@bitplanet.net>
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This enables path-filtering in log-view, and adds a link per entry in
tree-view to show the log for each file/directory.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
A link is added next to each parent of a commit, leading to the new
diff-functionality in ui-diff.c.
Also added support for a path-parameter to filelevel diffs accessed via the
diffstat.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This enables customizing the number of commits shown per page in log view. It
also changes the default from 100 to 50, mainly due to the more cpu
intensive log pages (number of files/lines changed) but also since 100
log messages require excessive scrolling.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Snapshots can now be enabled/disabled by default for all repositories in
cgitrc with param "snapshots". Additionally, any repo can override the
default setting with param "repo.snapshots".
By default, no snapshotting is enabled.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Make a link from the commit viewer to a snapshot of the corresponding tree.
Currently only zip-format is supported.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This makes cgit read all repo-info from the configfile, instead of scanning for
possible git-dirs below a common root path. This is primarily done to get
better security (separate physical path from logical repo-name).
In /etc/cgitrc each repo is registered with the following keys:
repo.url
repo.name
repo.path
repo.desc
repo.owner
Note:
* Required keys are repo.url and repo.path, all others are optional
* Each occurrence of repo.url starts a new repository registration
* Default value for repo.name is taken from repo.url
* The value of repo.url cannot contain characters with special meaning for
urls (i.e. one of /?%&), while repo.name can contain anything.
Example:
repo.url=cgit-pub
repo.name=cgit/public
repo.path=/pub/git/cgit
repo.desc=My public cgit repo
repo.owner=Lars Hjemli
repo.url=cgit-priv
repo.name=cgit/private
repo.path=/home/larsh/src/cgit/.git
repo.desc=My private cgit repo
repo.owner=Lars Hjemli
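The registration rules can be sketched as a line-at-a-time parser. The struct layout, buffer sizes and `add_config_line()` are hypothetical, and the sketch assumes the caller has already stripped the trailing newline.

```c
#include <stdio.h>
#include <string.h>

#define MAX_REPOS 16

struct repo {
    char url[64];
    char name[64];
    char path[128];
    char desc[128];
    char owner[64];
};

static int key_is(const char *line, size_t klen, const char *key)
{
    return strlen(key) == klen && strncmp(line, key, klen) == 0;
}

/* Feed one "key=value" line; returns the new repo count, or -1 if full. */
static int add_config_line(struct repo *repos, int count, const char *line)
{
    const char *eq = strchr(line, '=');
    const char *value;
    size_t klen;
    struct repo *r;

    if (!eq)
        return count;   /* not a key=value line: ignore */
    klen = (size_t)(eq - line);
    value = eq + 1;

    if (key_is(line, klen, "repo.url")) {
        /* Each repo.url starts a new registration. */
        if (count >= MAX_REPOS)
            return -1;
        r = &repos[count];
        memset(r, 0, sizeof(*r));
        snprintf(r->url, sizeof(r->url), "%s", value);
        snprintf(r->name, sizeof(r->name), "%s", value); /* default name */
        return count + 1;
    }
    if (count <= 0)
        return count;   /* repo.* key before any repo.url */
    r = &repos[count - 1];
    if (key_is(line, klen, "repo.name"))
        snprintf(r->name, sizeof(r->name), "%s", value);
    else if (key_is(line, klen, "repo.path"))
        snprintf(r->path, sizeof(r->path), "%s", value);
    else if (key_is(line, klen, "repo.desc"))
        snprintf(r->desc, sizeof(r->desc), "%s", value);
    else if (key_is(line, klen, "repo.owner"))
        snprintf(r->owner, sizeof(r->owner), "%s", value);
    return count;
}
```

A repo that ends up with an empty path after parsing would be rejected by the caller, since repo.path is a required key.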
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This adds the ability to show a search box in any pageheader with correct href and
hidden form data, but does not enable the box on any pages.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Make sure we chdir(2) back to the original getcwd(2) when a page
has been generated. Also, if the cgit_cache_root does not exist,
try to create it.
This is a feature intended to ease testing/debugging.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This adds support for the following options to cgit:
--root=<path>
--cache=<path>
--nocache
--query=<querystring>
--repo=<reponame>
--page=<pagename>
--head=<branchname>
--sha1=<sha1>
--ofs=<number>
On startup, /etc/cgitrc is parsed, followed by argument parsing and
finally querystring parsing.
If --nocache is specified (or set in /etc/cgitrc), caching is disabled and
cgit instead generates pages to stdout.
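A sketch of how the options listed above might be matched. `struct options`, `opt_value()` and `parse_arg()` are hypothetical names; the patch itself may organise this differently.

```c
#include <stdlib.h>
#include <string.h>

struct options {
    const char *root, *cache, *query, *repo, *page, *head, *sha1;
    int nocache;
    int ofs;
};

/* Return the text after prefix if arg starts with it, else NULL. */
static const char *opt_value(const char *arg, const char *prefix)
{
    size_t n = strlen(prefix);

    return strncmp(arg, prefix, n) == 0 ? arg + n : NULL;
}

/* Apply one command-line argument; returns 0 for an unknown option. */
static int parse_arg(struct options *o, const char *arg)
{
    const char *v;

    if (strcmp(arg, "--nocache") == 0) { o->nocache = 1; return 1; }
    if ((v = opt_value(arg, "--root=")))  { o->root = v;  return 1; }
    if ((v = opt_value(arg, "--cache="))) { o->cache = v; return 1; }
    if ((v = opt_value(arg, "--query="))) { o->query = v; return 1; }
    if ((v = opt_value(arg, "--repo=")))  { o->repo = v;  return 1; }
    if ((v = opt_value(arg, "--page=")))  { o->page = v;  return 1; }
    if ((v = opt_value(arg, "--head=")))  { o->head = v;  return 1; }
    if ((v = opt_value(arg, "--sha1=")))  { o->sha1 = v;  return 1; }
    if ((v = opt_value(arg, "--ofs=")))   { o->ofs = atoi(v); return 1; }
    return 0;
}
```

Calling `parse_arg()` for each `argv` entry after the config file has been read, and before the querystring is parsed, gives the precedence order described above.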
The combined effect of these two changes makes testing/debugging a lot
less painful.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
An embarrassing thinko in cgit_check_cache() would truncate valid cachefiles
in the following situation:
1) process A notices a missing/expired cachefile
2) process B gets scheduled, locks, fills and unlocks the cachefile
3) process A gets scheduled, locks the cachefile, notices that the cachefile
now exists/is not expired anymore, and continues to overwrite it with an
empty lockfile.
Thanks to Linus for noticing (again).
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
Add a global variable, cgit_max_lock_attemps, to avoid the possibility of
infinite loops when failing to acquire a lockfile. This could happen on
broken setups or under crazy server load.
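A bounded retry loop of this kind might look as follows. This is a sketch and not the actual cache_lock(): `lock_with_limit()` is a hypothetical name, and creating the lockfile with `O_CREAT | O_EXCL` is an assumption about the locking scheme.

```c
#include <fcntl.h>
#include <sched.h>

/* Try to create lockpath exclusively, giving up after max_attempts tries. */
static int lock_with_limit(const char *lockpath, int max_attempts)
{
    int fd = -1;    /* initialised, so a zero-iteration loop returns -1 */
    int attempt;

    for (attempt = 0; attempt < max_attempts; attempt++) {
        fd = open(lockpath, O_WRONLY | O_CREAT | O_EXCL, 0600);
        if (fd >= 0)
            break;
        sched_yield();  /* let the lock holder make progress */
    }
    return fd;
}
```

Initialising `fd` before the loop is what guards against returning an uninitialised value when no attempt is made at all.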
Incidentally, this also fixes a lurking bug in cache_lock() where an
uninitialized returnvalue was used.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This enables internal caching of page output.
Page requests are split into four groups:
1) repo listing (front page)
2) repo summary
3) repo pages w/symbolic references in query string
4) repo pages w/constant sha1's in query string
Each group has a TTL specified in minutes. When a page is requested, a cached
filename is stat(2)'ed and st_mtime is compared to time(2). If TTL has expired
(or the file didn't exist), the cached file is regenerated.
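The expiry test can be sketched like this. The group names and the TTL values are hypothetical placeholders; only the stat(2)/time(2) comparison follows the description above.

```c
#include <sys/stat.h>
#include <time.h>

enum page_group {
    GROUP_REPOLIST,         /* 1) repo listing (front page) */
    GROUP_SUMMARY,          /* 2) repo summary */
    GROUP_SYMBOLIC_REF,     /* 3) symbolic references in query string */
    GROUP_CONSTANT_SHA1,    /* 4) constant sha1's in query string */
    GROUP_COUNT
};

/* Hypothetical TTLs in minutes, one per group. */
static const int group_ttl[GROUP_COUNT] = { 5, 5, 5, 60 };

/* True if more than ttl_minutes have passed between mtime and now. */
static int ttl_expired(time_t mtime, time_t now, int ttl_minutes)
{
    return difftime(now, mtime) > ttl_minutes * 60.0;
}

/* A missing cache file counts as expired, so it gets (re)generated. */
static int cachefile_expired(const char *path, enum page_group group)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return 1;
    return ttl_expired(st.st_mtime, time(NULL), group_ttl[group]);
}
```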
When generating a cached file, locking is used to avoid parallel processing
of the request. If multiple processes try to acquire the same lock, the ones
that fail to get the lock serve the (expired) cached file. If the cached file
doesn't exist, the process instead calls sched_yield(2) before restarting the
request processing.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>
This enables basic cgit functionality, using libgit.a and xdiff/lib.a from
git + a custom "git.h" + openssl for sha1 routines.
Signed-off-by: Lars Hjemli <hjemli@gmail.com>