Added fetchers concept: separate scripts to fetch the feeds

Fetchers claim to be a certain client and try to send the same headers
as that client. That is better than a plain curl request with a fake
user agent, because curl does not send the other headers the original
client sends, so its traffic stands out.
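
How a caller might use these scripts can be sketched as follows. This is a minimal sketch, not code from this commit: `fetch_feed` is a hypothetical helper, and it assumes that `RANDRSS_ROOT` points at the checkout and that every fetcher takes `url output` as its two arguments, as the usage messages of the scripts indicate.

```shell
# Hypothetical helper (not part of this commit): run the named fetcher
# for a feed URL and write the response body to an output file.
# Assumes RANDRSS_ROOT is set and fetchers live in $RANDRSS_ROOT/fetchers.
fetch_feed() {
    fetcher="$RANDRSS_ROOT/fetchers/$1"
    if [ ! -x "$fetcher" ]; then
        echo "unknown fetcher: $1" 1>&2
        return 1
    fi
    # Every fetcher is called the same way: <fetcher> url output
    "$fetcher" "$2" "$3"
}
```

Because all fetchers share one calling convention, the caller only needs the fetcher name to switch the client being imitated, for example `fetch_feed firefox "$url" "$outfile"`.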
2017-08-11 12:57:30 +02:00
parent 8a80aa0d6d
commit 3a723b9440
6 changed files with 40 additions and 9 deletions

fetchers/chrome (executable file, 10 lines changed)

@@ -0,0 +1,10 @@
#!/bin/sh
# Tries, more or less, to look like Chrome
if [ $# -ne 2 ] ; then
    echo "usage: $0 url output" 1>&2
    exit 1
fi
# Pick one user agent at random from the list (the selection could be randomized better)
useragent=$(shuf -n 1 "$RANDRSS_ROOT/fetchers/chrome_agents")
curl "$1" -H 'Accept-Encoding: gzip, deflate, br' -H 'Accept-Language: en-US,en;q=0.8' -H 'Upgrade-Insecure-Requests: 1' -H "User-Agent: $useragent" -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8' -H 'Connection: keep-alive' -H 'Cache-Control: max-age=0' --compressed > "$2"

fetchers/chrome_agents (regular file, 1 line changed)

@@ -0,0 +1 @@
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.78 Safari/537.36
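
The fetcher scripts prepend `User-Agent: ` themselves when they build the `-H` option, so each line of an agents file is expected to hold only the header value. A minimal sketch of a more forgiving lookup follows. It assumes a hypothetical helper `pick_agent` that is not part of this commit; the helper strips a leading `User-Agent:` prefix if an entry happens to carry one.

```shell
# Hypothetical helper (not part of this commit): print one random line
# from the agents file given as $1, without any "User-Agent:" prefix.
pick_agent() {
    # shuf -n 1 selects a single random line; sed removes the prefix
    # and any whitespace after it, and leaves other lines unchanged.
    shuf -n 1 "$1" | sed 's/^User-Agent:[[:space:]]*//'
}
```

A fetcher could then use `useragent=$(pick_agent "$RANDRSS_ROOT/fetchers/chrome_agents")` and pass `-H "User-Agent: $useragent"` without the header name appearing twice.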

fetchers/firefox (executable file, 11 lines changed)

@@ -0,0 +1,11 @@
#!/bin/sh
set -x  # trace each command to stderr (debugging aid)
# Tries, more or less, to look like Firefox
if [ $# -ne 2 ] ; then
    echo "usage: $0 url output" 1>&2
    exit 1
fi
# Pick one user agent at random from the list (the selection could be randomized better)
useragent=$(shuf -n 1 "$RANDRSS_ROOT/fetchers/firefox_agents")
curl "$1" -H "User-Agent: $useragent" -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8' -H 'Accept-Language: en-US,en;q=0.5' -H 'Accept-Encoding: gzip, deflate, br' --compressed -H 'Connection: keep-alive' -H 'Upgrade-Insecure-Requests: 1' > "$2"
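
Both fetchers read their agent list from `$RANDRSS_ROOT/fetchers/<name>_agents`. If that variable is unset or the file is missing, `shuf` fails and `useragent` ends up empty. One possible guard is sketched below. The helper name `agents_file` is hypothetical and not part of this commit.

```shell
# Hypothetical helper (not part of this commit): print the path of the
# agents file for the fetcher named in $1, or fail with a message if
# RANDRSS_ROOT is unset or the file is missing or empty.
agents_file() {
    if [ -z "${RANDRSS_ROOT:-}" ]; then
        echo "RANDRSS_ROOT is not set" 1>&2
        return 1
    fi
    file="$RANDRSS_ROOT/fetchers/${1}_agents"
    # -s is true only for an existing file with a size greater than zero
    if [ ! -s "$file" ]; then
        echo "agents file missing or empty: $file" 1>&2
        return 1
    fi
    printf '%s\n' "$file"
}

# Possible use inside a fetcher:
#   list=$(agents_file firefox) || exit 1
#   useragent=$(shuf -n 1 "$list")
```

Failing early keeps a fetcher from sending a request whose user agent does not match the client it claims to be.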

fetchers/firefox_agents (regular file, 2 lines changed)

@@ -0,0 +1,2 @@
Mozilla/5.0 (X11; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0
Mozilla/5.0 (Windows NT 10.0; WOW64; rv:55.0) Gecko/20100101 Firefox/55.0