Pages like /commit?h=wip&id=8a335ce618ba77fbf05148d6f8be17bd48ba4340
were being marked as dynamic, because of h=wip, when they should be
static, because of id=.
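
As a rough illustration of the intended rule, the sketch below classifies
a request by its query string: anything that pins an exact object with
id= is treated as static, regardless of h=. The function name and the
handling of requests that carry neither parameter are assumptions made
for illustration, not cgit's actual code.

```python
from urllib.parse import parse_qs, urlsplit


def is_static_page(url: str) -> bool:
    """Return True when the URL pins an exact object via id= (hypothetical helper)."""
    query = parse_qs(urlsplit(url).query)
    # An explicit object id never changes, so it takes priority over
    # h=<branch>, which on its own would make the page dynamic.
    return "id" in query
```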
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
We've long supported negative ttls, for infinite cache, except the
documentation incorrectly showed one of our defaults as being 5 and not
-1. As well, with a negative ttl, we were actually making the HTTP
Expires header go backwards. This changes it to go ahead ten years
instead.
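
A minimal sketch of that rule, assuming ttl values are expressed in
minutes and that "ten years" is approximated as 3650 days; both the
function name and the constant are illustrative rather than taken from
the cgit source.

```python
# Ten years, approximated as 3650 days (assumption for illustration).
TEN_YEARS_IN_SECONDS = 10 * 365 * 24 * 60 * 60


def expires_timestamp(now: int, ttl_minutes: int) -> int:
    """Return the Unix time to advertise in the Expires header."""
    if ttl_minutes < 0:
        # A negative ttl means "cache forever": move the expiry far into
        # the future instead of going backwards in time.
        return now + TEN_YEARS_IN_SECONDS
    return now + ttl_minutes * 60
```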
Further, we add a cache-about-ttl option to set a different ttl for
about pages, which are now increasingly being filtered through markdown
or just sent statically anyway.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
My dmesg is filled with the oom killer bringing down processes while the
Bingbot downloads every snapshot for every commit of the Linux kernel in
tar.xz format. Sure, I should be running with memory limits, and now I'm
using cgroups, but a more general solution is to prevent crawlers from
wasting resources like that in the first place.
Suggested-by: Natanael Copa <ncopa@alpinelinux.org>
Suggested-by: Julius Plenz <plenz@cis.fu-berlin.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Features:
- update to git v1.8.3.
- expanded set of default filters to include markdown, restructuredtext, and
man pages.
- better sample configuration file in man page.
- "readme" may now be specified multiple times, and cgit will choose the first
one it finds.
- "readme" no longer needs a branch name. If prefixed with simply ":" it will
use the default branch.
- "branch-sort" allowing branches to be sorted either by "age" or "name", for
kernel.org.
- "enable-index-owner" allowing the owner column to be disabled in the index
page.
- print submodule revision next to submodule link.
- integrate more closely with git apis, such as strbuf.
- rely on git test harness and git makefiles.
- more robust test suite.
- more robust makefile dependency accounting.
- pager navigation is now unordered list.
- span tag wraps commit directions.
Behavior changes:
- HOME is no longer passed as an environment variable to any filter api
scripts.
- "about-filter" now receives the filename being filtered as argv[1]. This may
disrupt existing scripts, so adjust accordingly.
- gitconfig and gitattributes are no longer loaded from any system directories
or home directories.
Security:
- CVE-2013-2117: disallow directory traversal when readme is set to filesystem
path.
Bug fixes:
- ssdiff now correctly manages tab expansion.
- support unannotated tags in http git clone.
- lots of cleanups of global variables and memory leaks.
- do not rely on gettext/libintl.
- better C standard compliance.
- make several functions and variables static.
- improved constification.
- remove unused functions.
- fix colspan values to correct width.
- fix out-of-bounds memory accesses with virtual_root="".
- cache repo config more precisely.
- die when write fails.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Now this is possible in cgitrc -
readme=:README.md
readme=:readme.md
readme=:README.mkd
readme=:readme.mkd
readme=:README.rst
readme=:readme.rst
readme=:README.html
readme=:readme.html
readme=:README.htm
readme=:readme.htm
readme=:README.txt
readme=:readme.txt
readme=:README
readme=:readme
readme=:INSTALL.txt
readme=:install.txt
readme=:INSTALL
readme=:install
Suggested-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Using the url= query string, it was possible to request arbitrary
files from the filesystem if the readme for a given page was set to a
filesystem file. The following request would return my /etc/passwd
file:
http://git.zx2c4.com/?url=/somerepo/about/../../../../etc/passwd
http://data.zx2c4.com/cgit-directory-traversal.png
This fix uses realpath(3) to canonicalize all paths, and then compares
the base components.
This fix introduces a subtle timing attack, whereby a client can check
whether or not strstr is called using timing measurements in order
to determine if a given file exists on the filesystem.
This fix also does not account for filesystem race conditions (TOCTOU)
in resolving symlinks.
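
The canonicalize-then-compare idea can be sketched as follows, using
Python's os.path.realpath as a stand-in for realpath(3). The helper name
is hypothetical, and like the fix described above this sketch does
nothing about races between the check and the later open.

```python
import os


def is_within(base_dir: str, requested: str) -> bool:
    """Return True if requested, once canonicalized, stays under base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, requested))
    # commonpath compares whole path components, so a sibling such as
    # /srv/repo-evil is not mistaken for something inside /srv/repo.
    return os.path.commonpath([base, target]) == base
```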
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
The readme variable may now contain multiple space-delimited entries,
which per usual are either a filepath or a git ref filepath. If multiple
are specified, cgit will now select the first one in the list that
exists. This is to make it easier to specify multiple default readme
types in the main cgitrc file and have them automatically get applied to
each repo based on what exists.
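
A small sketch of the selection rule, under the assumption that entries
are split on whitespace. The exists callback is a placeholder: for git
ref filepaths a real implementation would have to look the path up in
the repository rather than on disk.

```python
import os


def pick_readme(readme_value: str, exists=os.path.exists):
    """Return the first listed readme candidate that exists, else None."""
    for candidate in readme_value.split():
        if exists(candidate):
            return candidate
    return None
```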
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
This gives the about-filter API the same semantics as source-filter,
where the filter receives the filename so it can decide what to do next
with it.
While we're at it, plug a memory leak.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
If the readme value begins with ":", and has no specified branch before
it, use the repository's default branch.
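
One way to picture the parsing, assuming the first ":" separates the
ref from the path and that a value without ":" is a plain filesystem
path. The function name and return shape are hypothetical.

```python
def split_readme(value: str, default_branch: str):
    """Split a readme value into (ref, path); ref is None for filesystem paths."""
    ref, sep, path = value.partition(":")
    if not sep:
        # No colon at all: treat the value as a filesystem path.
        return None, value
    # An empty ref (value starts with ":") falls back to the default branch.
    return (ref or default_branch), path
```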
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
The number of odd cases in which git will try to read config is far too
great to keep putting a bandaid over each one, so we'll just unset it.
If it turns out that scripts really liked to know about $HOME, we can
always reset it in the filter forks.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
We've now added quite a few config keys for repositories, but we've
forgotten to update the printing of it for cache files. Synchronize the
two.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
By using the standard library's printf, cache_ls does not redirect its
output to the cache when we change the process' stdout file descriptor
to point to the cache file. Fix this by using "htmlf" in the same way
that we do for writing HTTP headers.
Signed-off-by: John Keeping <john@keeping.me.uk>
This means that we can avoid hardcoding the number of headers we expect
CGit to generate in test cases and simply remove whatever headers happen
to be there when we are checking body content.
Signed-off-by: John Keeping <john@keeping.me.uk>
If we fail to write HTML output once, there's no point carrying on so
just write a failure message once and die. By using Git's die_errno
function we also let the user know in what way the write failed.
Signed-off-by: John Keeping <john@keeping.me.uk>
This helps projects that have a large number of tags to display them all
using custom CSS.
The default stylesheet has not been updated since what is useful for
projects with a lot of tags is not the same as what is useful for
projects with only a small number of decorations per commit.
Suggested-by: Konstantin Ryabitsev <mricon@kernel.org>
Signed-off-by: John Keeping <john@keeping.me.uk>
When building the "test" target we depend on both cgit and building the
Git tools. By doing this with two targets we end up running make in the
git/ directory twice, concurrently if using parallel make, which causes
us to build more than we need and potentially builds incorrectly if
multi-step build-then-move operations overlap.
Fix this by instead calling back into the makefile so that we alter the
"cgit" target to also build the Git tools.
Signed-off-by: John Keeping <john@keeping.me.uk>
Commit fb3655d (use struct strbuf instead of static buffers, 2013-04-06)
broke the logic in cache.c::cache_ls by failing to set slot->cache_name
before calling open_slot.
While fixing this, also free the strbufs added by that commit once we're
done with them.
Signed-off-by: John Keeping <john@keeping.me.uk>
We try to stick to POSIX shell in the tests but a "function" keyword has
found its way into t0109. Remove it.
This makes the tests work with dash again.
Signed-off-by: John Keeping <john@keeping.me.uk>
It's a bit tedious to have to do this here too. If we encounter other
issues with $HOME down the line, I'll look into adding some nice utility
functions to handle this, or perhaps giving up on the hope that we could
keep $HOME defined for scripts.
This commit additionally adds a test case, should the issue surface
again.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
When creating the GIT-VERSION-FILE that we use to test that the version
of Git in git/ is the same as in the CGit Makefile, Git applies the
transform "s/-/./g" to the version string. This doesn't affect released
versions but does change RC version numbers such as 1.8.3-rc0.
While CGit should only refer to a released Git version in general, it is
useful to developers who want to test upcoming Git releases if the tests
do work with RCs, so change t0001 to apply the same transform to our
Makefile version before comparing it to the contents of
GIT-VERSION-FILE.
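
For illustration only, the same "s/-/./g" transform expressed as a
function; the name is made up, and the test itself is a shell script
rather than Python.

```python
def normalize_git_version(version: str) -> str:
    """Apply the "s/-/./g" transform to a version string."""
    # Released versions contain no dashes and pass through unchanged.
    return version.replace("-", ".")
```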
Signed-off-by: John Keeping <john@keeping.me.uk>
Commit fb3655d (use struct strbuf instead of static buffers -
2013-04-06) introduced a regression in the "section-from-path" handling
when the configured value is negative. By changing the "rel" variable
so that it includes a trailing slash, counting slashes from the end of
the string no longer gives the same answer as it did before.
Fix this by ensuring that "rel" does not have a trailing slash.
Reported-by: Julius Plenz <plenz@cis.fu-berlin.de>
Signed-off-by: John Keeping <john@keeping.me.uk>
When testing modifications in Git that affect CGit, it is annoying to
have t0001 failing simply because the Git version has a ".dirty" suffix
when the version of Git there does indeed match that specified in the
CGit makefile. Stop this by stripping the ".dirty" suffix from the
GIT_VERSION variable.
Note that this brings the "Git version" behaviour in line with the
"submodule version" case which does not check if the working tree in
git/ is modified.
Signed-off-by: John Keeping <john@keeping.me.uk>
By default, Git's test suite puts the trash directories and test-results
directory into its own directory, not that containing the tests being
run. This is less convenient for inspecting test failures, so set the
output directory to CGit's tests/ directory instead.
Note that there is currently a bug in Git whereby it will create the
trash directories in our tests/ directory regardless of the value of
TEST_OUTPUT_DIRECTORY, and then fail to remove them once the tests are
done. This change does currently affect the location of the
test-results/ directory though.
Signed-off-by: John Keeping <john@keeping.me.uk>
In order to ensure that we don't access $HOME at some point after
initial startup when rendering a specific view, run the strace test on a
range of different pages.
This ensures that we don't end up reading a configuration later for some
specific view.
Signed-off-by: John Keeping <john@keeping.me.uk>
Several options must be specified prior to scan-path. This is a
consistent source of user confusion. Document these facts.
Suggested-by: Lukas Fleischer <cgit@cryptocrack.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
In cgit_print_snapshot_links() we strip leading "v" and "V", while we
currently only prepend a lower case "v" when parsing a snapshot file
name. This results in broken snapshot links for tags that start with an
upper case "V". Avoid this by prepending a "V" as a fallback.
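
The fallback order can be sketched like this. Trying the bare version
first is an assumption, and ref_exists is a hypothetical callback
standing in for a lookup against the repository's refs.

```python
def resolve_snapshot_ref(version: str, ref_exists):
    """Return the first ref matching the version, trying v and V prefixes."""
    for candidate in (version, "v" + version, "V" + version):
        if ref_exists(candidate):
            return candidate
    return None
```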
Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
Note that we cannot use skip_all here since some tests have already been
executed when ZIP tests are reached. Use test prerequisites to skip
everything using unzip(1) if the binary is not available instead.
Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
"-i" isn't part of the POSIX standard and doesn't work on several
platforms such as OpenBSD. Use a temporary file instead.
Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
When set to "name", branches are sorted by name, which is the current
default. When set to "age", branches are sorted by the age of the
branch.
This feature was requested by Konstantin Ryabitsev for use on
kernel.org.
Proposed-by: Konstantin Ryabitsev <mricon@kernel.org>
Without '&&' between operations, we will not detect if strace or cgit
exit with an error status, which would cause a false positive test
status in this case.
Signed-off-by: John Keeping <john@keeping.me.uk>
getenv() returns a NULL pointer if the specified variable name cannot be
found in the environment. However, some setenv() implementations crash
if a NULL pointer is passed as second argument. Only restore variables
that are not NULL.
See commit d96d2c98eb for a related patch.
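
The same guard, sketched in Python, where None plays the role of the
NULL pointer returned by getenv(). The helper name and the dictionary
of saved values are illustrative assumptions.

```python
import os


def restore_env(saved: dict) -> None:
    """Restore saved environment variables, skipping ones that were unset."""
    for name, value in saved.items():
        # None means the variable was not set when it was saved, so there
        # is nothing to restore and nothing unsafe to pass along.
        if value is not None:
            os.environ[name] = value
```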
Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
Some tar(1) versions do not support auto detection of the compression
type. Explicitly specify "-z" to decompress a ".tar.gz" archive.
Signed-off-by: Lukas Fleischer <cgit@cryptocrack.de>
With the latest changes to prevent git from accessing configuration
files that it should not, it's important to be sure that we won't
have further breakage in the future.
Use strace to implement a test to make sure cgit does not access()
anything built from $HOME.
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
This allows tests to run in parallel as well as letting us use "prove"
or another TAP harness to run the tests.
Git's test framework requires Git to be fully built before letting any
tests run, so add a new target to the top-level Makefile which builds
all of Git instead of just libgit.a and make the "test" target depend on
that.
Signed-off-by: John Keeping <john@keeping.me.uk>
While doing any kind of git loading, unset HOME variables and set
NOSYSTEM variables so that cgit does not load any settings that a user
may have set for his own /usr/bin/git usage.
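
A sketch of the idea, operating on a copy of the environment.
GIT_CONFIG_NOSYSTEM and GIT_ATTR_NOSYSTEM are environment variables
that Git itself documents; which exact variables this commit touches,
and the inclusion of XDG_CONFIG_HOME, are assumptions here.

```python
def git_safe_env(env: dict) -> dict:
    """Return a copy of env with user and system Git settings disabled."""
    safe = dict(env)
    # Drop the variables Git uses to locate per-user configuration.
    safe.pop("HOME", None)
    safe.pop("XDG_CONFIG_HOME", None)
    # Ask Git to skip system-wide config and attributes files.
    safe["GIT_CONFIG_NOSYSTEM"] = "1"
    safe["GIT_ATTR_NOSYSTEM"] = "1"
    return safe
```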
This fixes a fatal error introduced with git 1.8, whereupon git would
fatally exit when failing to access particular files.
The result of this is that only repo-local configuration files are
accessed:
zx2c4@thinkpad ~/Projects/cgit $ HOME=/root QUERY_STRING="url=foo/log" \
CGIT_CONFIG=tests/trash/cgitrc strace -e access ./cgit >/dev/null
access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
access("repos/foo/.git/objects", X_OK) = 0
access("repos/foo/.git/refs", X_OK) = 0
access("repos/foo/.git/config", R_OK) = 0
access("repos/foo/.git/config", R_OK) = 0
access("repos/foo/.git/objects/b3/bafdbf0183f4897ef8b1319cb8c490ed54717e", F_OK) = 0
access("repos/foo/.git/objects/b3/bafdbf0183f4897ef8b1319cb8c490ed54717e", F_OK) = 0
access("repos/foo/.git/objects/b3/bafdbf0183f4897ef8b1319cb8c490ed54717e", F_OK) = 0
access("repos/foo/.git/objects/b3/bafdbf0183f4897ef8b1319cb8c490ed54717e", F_OK) = 0
+++ exited with 0 +++
Reported-by: Ferry Huberts <ferry.huberts@pelagic.nl>
Tested-by: Jason A. Donenfeld <Jason@zx2c4.com>
Tested-by: Ferry Huberts <ferry.huberts@pelagic.nl>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Use "struct strbuf" from Git to remove the limit on file path length.
Notes on scan-tree:
This is slightly involved since I decided to pass the strbuf into
add_repo() and modify it whenever a new file name is required, which
should avoid any extra allocations within that function. The pattern
there is to append the filename, use it and then reset the buffer to its
original length (retaining a trailing '/').
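
The append, use, reset pattern can be illustrated with a Python
bytearray standing in for struct strbuf. The function and parameter
names are hypothetical and this is not a translation of add_repo().

```python
def visit_files(base: bytearray, names, use) -> None:
    """Call use() on base + name for each name, reusing one buffer."""
    base_len = len(base)  # length of the directory prefix, trailing "/" included
    for name in names:
        base.extend(name)    # append the filename in place
        use(bytes(base))     # use the full path
        del base[base_len:]  # reset the buffer to its original length
```

Because the buffer is truncated back to the prefix after every use, the
caller gets it back unchanged and no per-file path has to be allocated
separately.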
Notes on ui-snapshot:
Since write_archive modifies the argv array passed to it we
copy the argv_array values into a new array of char* and then free the
original argv_array structure and the new array without worrying about
what the values now look like.
Signed-off-by: John Keeping <john@keeping.me.uk>
After this change there is one remaining call 'fmt("%s", delim)' in
ui-shared.c, but it is needed as delim is stack allocated and so cannot
be returned from the function.
Signed-off-by: John Keeping <john@keeping.me.uk>