robots.txt: disallow access to snapshots

My dmesg is filled with the oom killer bringing down processes while the
Bingbot downloads every snapshot for every commit of the Linux kernel in
tar.xz format. Sure, I should be running with memory limits, and now I'm
using cgroups, but a more general solution is to prevent crawlers from
wasting resources like that in the first place.

Suggested-by: Natanael Copa <ncopa@alpinelinux.org>
Suggested-by: Julius Plenz <plenz@cis.fu-berlin.de>
Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
Jason A. Donenfeld 2013-05-28 14:17:00 +02:00
parent 830eb6f6ff
commit 23debef621
2 changed files with 4 additions and 0 deletions


@@ -78,6 +78,7 @@ install: all
$(INSTALL) -m 0644 cgit.css $(DESTDIR)$(CGIT_DATA_PATH)/cgit.css
$(INSTALL) -m 0644 cgit.png $(DESTDIR)$(CGIT_DATA_PATH)/cgit.png
$(INSTALL) -m 0644 favicon.ico $(DESTDIR)$(CGIT_DATA_PATH)/favicon.ico
$(INSTALL) -m 0644 robots.txt $(DESTDIR)$(CGIT_DATA_PATH)/robots.txt
$(INSTALL) -m 0755 -d $(DESTDIR)$(filterdir)
$(COPYTREE) filters/* $(DESTDIR)$(filterdir)

robots.txt (new file)

@@ -0,0 +1,3 @@
User-agent: *
Disallow: /*/snapshot/*
Allow: /
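
To see which URLs these three lines are meant to block, here is a minimal Python sketch of wildcard matching as described in RFC 9309: `*` matches any run of characters, the longest matching pattern wins, and an Allow wins a tie. It is an illustration, not part of cgit; real crawlers may interpret wildcards differently, and Python's standard `urllib.robotparser` is avoided because it does not expand `*` inside paths. The helper names and the example paths are hypothetical.

```python
import re

# Rules copied from the robots.txt above (the "User-agent: *" group).
RULES = [
    ("disallow", "/*/snapshot/*"),
    ("allow", "/"),
]


def pattern_to_regex(pattern):
    """Translate a robots.txt path pattern into a compiled regex.

    '*' matches any run of characters; a trailing '$' anchors the end.
    """
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body + ("$" if anchored else ""))


def is_allowed(path, rules=RULES):
    """Return True if `path` may be crawled under `rules`.

    The longest matching pattern wins; on a tie, allow wins.
    A path that matches no rule at all is allowed.
    """
    best_len = -1
    allowed = True
    for kind, pattern in rules:
        # re.match anchors at the start of the path, like a prefix rule.
        if pattern_to_regex(pattern).match(path):
            is_allow = kind == "allow"
            if len(pattern) > best_len or (len(pattern) == best_len and is_allow):
                best_len = len(pattern)
                allowed = is_allow
    return allowed


# Hypothetical cgit URLs, for illustration only.
print(is_allowed("/linux.git/snapshot/linux-3.9.tar.xz"))
print(is_allowed("/linux.git/log/"))
```

Under these assumptions the snapshot URL is refused and the log page stays crawlable, which is the behaviour the commit message asks for. The trailing `Allow: /` is redundant for a crawler that allows unmatched paths by default, but it makes the intent explicit.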