Commit c17892fd authored by Lysander Trischler

Always process Who Follows Resources

parent aac923b3
=========
useragent
=========
``useragent`` is a Twtxt ``User-Agent`` HTTP request header analyzer, which
helps to discover new people to follow and to check whether certain people are
able to receive mentions.

It reads an Nginx access log file from stdin and writes simple statistics to
stdout. I use the following Nginx config for my *twtxt.txt* file::

    log_format twtxt '$time_iso8601 "$request" $status "$http_user_agent"';

    server {
        location = /twtxt.txt {
            access_log /somewhere/twtxt.log twtxt;
        }
    }

That said, ``useragent`` should work fine with any access log format where the
``User-Agent`` is logged in double quotes at the very end of a line. The
webserver must also escape or encode any double quotes inside the header value
(Nginx hex-encodes double quotes as ``\x22``).
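To make the log format requirement concrete, here is a minimal sketch of how the trailing quoted field could be pulled out of a log line. The helper name ``extractUserAgent`` is hypothetical and not taken from the ``useragent`` source, and the sketch assumes that only the ``\x22`` escape needs to be reversed:

```go
package main

import (
	"fmt"
	"strings"
)

// extractUserAgent returns the content of the last double-quoted field of a
// log line. Hypothetical helper for illustration only. It relies on the
// webserver escaping embedded double quotes, so the last two raw quote
// characters on the line always delimit the User-Agent value.
func extractUserAgent(line string) (string, bool) {
	line = strings.TrimRight(line, "\r\n")
	if !strings.HasSuffix(line, `"`) {
		return "", false // the User-Agent is not the last field
	}
	body := line[:len(line)-1]
	start := strings.LastIndex(body, `"`)
	if start < 0 {
		return "", false // no opening quote found
	}
	// Assumption: only Nginx's \x22 escape is turned back into a quote.
	return strings.ReplaceAll(body[start+1:], `\x22`, `"`), true
}

func main() {
	line := `2021-01-01T12:00:00+00:00 "GET /twtxt.txt HTTP/1.1" 200 "twtxt/1.2.3 (+https://example.com/twtxt.txt; @somebody)"`
	ua, ok := extractUserAgent(line)
	fmt.Println(ua, ok)
}
```

Searching from the end of the line means the quoted request field earlier on the line does not get in the way.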
Supported ``User-Agent`` Formats
================================
* `Official single user client format
<https://twtxt.readthedocs.io/en/latest/user/discoverability.html>`_, e.g.
``twtxt/1.2.3 (+https://example.com/twtxt.txt; @somebody)``
* `Extended multi user client format
<https://dev.twtxt.net/doc/useragentextension.html>`_, e.g.
``twtxt/0.1.0@abcdefg (~https://example.com/whoFollows?token=randomtoken123; contact=https://example.com/support)``
* Old twtd multi user client format with 2-5 followers, e.g.
``twtxt/0.1.0@69ac73b (Pod: example.com Followers: hugo kate Support: https://example.com/support)``
* Old twtd multi user client format with 6 or more followers, e.g.
``twtxt/0.1.0@37fd365 (Pod: example.com Followers: eugen hugo kate lieschen richard and 3 more... https://example.com/whoFollows?uri=https://example.org/twtxt.txt&nick=steffi&token=OzcdPbe6Z Support: https://example.com/support)``
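As an illustration of the simplest of these formats, the official single user client format can be matched with one regular expression. This is a sketch only: the pattern, the ``singleUser`` type and ``parseSingleUser`` are assumptions made for this example and are not necessarily how ``useragent`` parses the header:

```go
package main

import (
	"fmt"
	"regexp"
)

// singleUserRE matches "twtxt/<version> (+<url>; @<nick>)".
// Illustrative pattern, not taken from the useragent source.
var singleUserRE = regexp.MustCompile(`^twtxt/(\S+) \(\+(\S+); @(\S+)\)$`)

// singleUser holds the parts of a single user client User-Agent.
type singleUser struct {
	Version string
	URL     string
	Nick    string
}

// parseSingleUser reports whether ua is in the single user client format
// and, if so, returns its components.
func parseSingleUser(ua string) (singleUser, bool) {
	m := singleUserRE.FindStringSubmatch(ua)
	if m == nil {
		return singleUser{}, false
	}
	return singleUser{Version: m[1], URL: m[2], Nick: m[3]}, true
}

func main() {
	u, ok := parseSingleUser("twtxt/1.2.3 (+https://example.com/twtxt.txt; @somebody)")
	fmt.Println(u.Nick, u.URL, ok)
}
```

The multi user formats carry more fields and would need their own patterns.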
In case of both old twtd formats, all followers directly found in the
``User-Agent`` header are extracted and their *twtxt.txt* URLs constructed from
the support URL or Who Follows Resource URL.
The Who Follows Resources are not queried, but the latest URLs (assuming newest
log records are always appended to the access log) of each encountered hostname
are printed, so operators can manually query their followers.
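Keeping only the latest Who Follows Resource URL per hostname can be sketched with a map that is overwritten as log records are read in order. The function name ``latestPerHost`` is hypothetical. Unlike the tool itself, which prints an error for invalid URLs, this sketch silently skips them:

```go
package main

import (
	"fmt"
	"net/url"
)

// latestPerHost maps each hostname to the last Who Follows Resource URL seen
// for it. Input order is assumed to be log order, oldest first.
// Hypothetical helper for illustration only.
func latestPerHost(whoFollowsURLs []string) map[string]string {
	latest := make(map[string]string)
	for _, raw := range whoFollowsURLs {
		u, err := url.Parse(raw)
		if err != nil || u.Hostname() == "" {
			continue // skip unparsable or host-less URLs in this sketch
		}
		latest[u.Hostname()] = raw // later records overwrite earlier ones
	}
	return latest
}

func main() {
	urls := []string{
		"https://example.com/whoFollows?token=old",
		"https://example.org/whoFollows?token=other",
		"https://example.com/whoFollows?token=new",
	}
	for host, u := range latestPerHost(urls) {
		fmt.Println(host, u)
	}
}
```

Because Go map iteration order is unspecified, the hosts may be printed in any order.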
Example Usage
=============
::

    $ useragent < /somewhere/twtxt.log
    Twtxt UAs: 16841 Non-Twtxt UAs: 1709
    343 @kate → http://example.com/user/kate/twtxt.txt
    4309 @eugen → http://example.com/user/eugen/twtxt.txt
    34 @lieschen → http://example.com/user/lieschen/twtxt.txt
    34 @hugo → http://example.com/user/hugo/twtxt.txt
    9902 @richard → http://example.com/user/richard/twtxt.txt
    983 @peter → https://example.org/peter.txt
    32900 @somebody → https://example.com/twtxt.txt
    34 http://example.com/whoFollows?followers=8&token=gLvOWbFYT

@@ -32,7 +32,8 @@ func main() {
 			singleUsers[twtxtURL] = ua.TwtxtNicks[i]
 			urlCounter[twtxtURL]++
 		}
-	} else if ua.WhoFollowsURL != "" {
+	}
+	if ua.WhoFollowsURL != "" {
 		u, err := url.Parse(ua.WhoFollowsURL)
 		if err != nil {
 			fmt.Printf("ERROR: Invalid Who Follows Resource: %v\n", err)
......