-
Notifications
You must be signed in to change notification settings - Fork 9
/
README
445 lines (311 loc) · 15.2 KB
/
README
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
This file contains outdated information and will be updated soon!
Tor Metrics
===========
Tor Metrics aggregates publicly available data about the Tor network and
visualizes that data on a website.
This software package, metrics-web, contains (1) the code to aggregate Tor
network data, (2) the code to generate graphs and .CSV output, and (3) the
code for a dynamic web application. metrics-web is based on Java, Ant,
PostgreSQL, R, Apache HTTP Server, and Apache Tomcat.
This README explains all necessary steps to install metrics-web including
any databases (Section 1), the graphing engine (Section 2), and the web
application (Section 3).
1. Installing the metrics database
==================================
The metrics database contains data about the Tor Network coming from
different sources, including the Tor directory authorities, Torperf
performance measurement installations, and others.
1.1. Preparing the operating system
===================================
This README describes the steps for installing metrics-web on a Debian
GNU/Linux Jessie server. Instructions for other operating systems may
vary.
In the following it is assumed that sudo (or root) privileges are available.
Start by adding a metrics user that will be used to execute all commands
that do not require root privileges.
$ sudo adduser metrics
The database importer and website sources will be installed in
/srv/metrics.torproject.org/ that is created as follows:
$ sudo mkdir /srv/metrics.torproject.org/
$ sudo chmod g+ws /srv/metrics.torproject.org/
$ sudo chown metrics:metrics /srv/metrics.torproject.org/
Clone the metrics-web Git repository:
$ cd /srv/metrics.torproject.org/
$ git clone git://git.torproject.org/metrics-web metrics
Install OpenJDK 7, Ant 1.9.4, and PostgreSQL 9.4 that are necessary for
setting up the metrics database.
$ sudo apt-get install openjdk-7-jdk ant postgresql-9.4
Setting up the graphing engine (cf. 2.) requires installing R 2.8 or
higher as well as the ggplot2 library.
$ sudo apt-get install r-base r-cran-rserve r-cran-ggplot2 r-cran-reshape \
r-cran-scales r-cran-java
Check the versions of the newly installed tools.
$ java -version
java version "1.7.0_101"
OpenJDK Runtime Environment (IcedTea 2.6.6) (7u101-2.6.6-2~deb8u1)
OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)
$ ant -version
Apache Ant(TM) version 1.9.4 compiled on October 7 2014
$ psql --version
psql (PostgreSQL) 9.4.8
Now prepare the library folder for all ant projects.
$ cd /srv/metrics.torproject.org/metrics/
$ mkdir shared/lib
Download .jar files listed below. Metrics usually uses Debian stable
provided libraries, but you can also just download them elsewhere.
Copy or link the following jars, annotated with file names in Debian
stable packages, to /srv/metrics.torproject.org/metrics/shared/lib:
commons-codec-1.9.jar
[/usr/share/java/commons-codec-1.9.jar in libcommons-codec-java]
commons-compress-1.9.jar
[/usr/share/java/commons-compress-1.9.jar in libcommons-compress-java]
commons-lang-2.6.jar
[/usr/share/java/commons-lang-2.6.jar in libcommons-lang-java]
gson-2.2.4.jar
[/usr/share/java/gson.jar in libgoogle-gson-java]
jstl1.1-1.1.2.jar
[/usr/share/java/jstl1.1-1.1.2.jar in libjstl1.1-java]
junit4-4.11.jar
[/usr/share/java/junit4-4.11.jar in junit4]
postgresql-jdbc3-9.2.jar
[/usr/share/java/postgresql-jdbc3-9.2.jar in libpostgresql-jdbc-java]
REngine.jar
[/usr/lib/R/site-library/Rserve/java/REngine.jar in r-cran-rserve]
Rserve.jar
[/usr/lib/R/site-library/Rserve/java/Rserve.jar in r-cran-rserve]
servlet-api-3.0.jar
[/usr/share/java/servlet-api-3.0.jar in libservlet3.0-java]
standard-1.1.2.jar
[/usr/share/java/standard-1.1.2.jar in libjakarta-taglibs-standard-java]
xz-1.5.jar
[/usr/share/java/xz-1.5.jar in libxz-java]
DescripTor is provided by The Tor Project and can be found here:
https://dist.torproject.org/descriptor/
Download the tar.gz file with the version number listed in build.xml.
The README inside the tar.gz file has all the information about DescripTor
and explains how to verify the downloaded files.
Copy descriptor-<version>.jar to /srv/metrics.torproject.org/shared/lib
1.2. Configuring the database
=============================
The first step in setting up the metrics database is to configure the
PostgreSQL database and import a database schema.
Start by creating a new metrics database user. There is no need to give
the metrics user superuser privileges or allow it to create databases or
new roles. You will be prompted for the password.
$ sudo -u postgres createuser -P metrics
Create a new database tordir owned by user metrics.
$ sudo -u postgres createdb -O metrics tordir
Import the metrics database schema.
$ sudo -u metrics psql -f /srv/metrics.torproject.org/metrics/modules/legacy/tordir.sql tordir
Confirm that the database now contains tables to hold metrics data. In
the following, => will be used as the database prompt.
$ sudo -u metrics psql tordir
=> \dt+
=> \q
1.3. Importing relay descriptor tarballs
========================================
In most cases it makes sense to populate the metrics database with
archived relay descriptors from the official metrics website.
Download the relay descriptor tarballs from the CollecTor website at
https://collector.torproject.org/archive/relay-descriptors/
and extract them to /srv/metrics.torproject.org/archives/ . The database
importer can process v3 votes, v3 consensuses, server descriptors, and extra-infos.
Edit the config file ~/metrics-web/config (or create it if it's not there)
to contain the following five lines (be sure to remove the linebreak in
the line defining the JDBC string and insert the real password there):
ImportDirectoryArchives 1
DirectoryArchivesDirectory archives/
KeepDirectoryArchiveImportHistory 1
WriteRelayDescriptorDatabase 1
RelayDescriptorDatabaseJDBC
jdbc:postgresql://localhost/tordir?user=metrics&password=password
Compile and run the Java database importer.
$ cd /srv/metrics.torproject.org/metrics/
$ ./run-web.sh
The database import will take a while. Once it's complete, check that the
database tables now contain metrics data:
$ sudo -u metrics psql tordir
=> \dt+
=> \q
It's safe to delete the relay descriptor files in ~/metrics-web/archives/
once they are imported.
An alternative to importing relay descriptor tarballs directly into the
database is to convert them into a data format that psql's \copy command
can process. Look for the config option WriteRelayDescriptorsRawFiles in
/srv/metrics.torproject.org/config.template for more information on this
experimental feature.
In a future version of metrics-web it may also be possible to update local
relay descriptor tarballs from the official metrics server via rsync and
import only the changes into the metrics database. The idea is to simply
rsync the data/ directory from the CollecTor server and have all information
available. However, this feature is not implemented yet.
1.4. Importing relay descriptors from a local Tor data directory
================================================================
In order to keep the data in the metrics database up-to-date, the metrics
database importer can import the cached descriptors from a local Tor data
directory.
Configure a local Tor client to fetch all known descriptors as early as
possible by adding these config options to its torrc file:
DownloadExtraInfo 1
FetchUselessDescriptors 1
FetchDirInfoEarly 1
FetchDirInfoExtraEarly 1
Tell the metrics database importer where to find the cached descriptor
files. One way to achieve this is to add symbolic links to
/srv/metrics.torproject.org/archives/ like this. Tor's data directory is
assumed to be /srv/tor/ here.
$ cd /srv/metrics.torproject.org/archives/
$ ln -s /srv/tor/cached-* .
Add a crontab entry for the database importer to run once per hour:
15 * * * * cd /srv/metrics.torproject.org/metrics/ && ./run-web.sh
1.5. Pre-calculating relay statistics
=====================================
The relay graphs on the metrics website rely on pre-calculated statistics
in the metrics database. These statistics are not calculated after every
completed import, which would usually be once per hour. In general it's
sufficient to pre-calculate statistics 2 or 4 times a day.
Calculate statistics manually after large imports (this may take a while):
$ sudo -u metrics psql tordir -c 'SELECT * FROM refresh_all();'
If the metrics database gets updated automatically, write a script and add
a crontab entry for pre-calculating statistics every 6 or 12 hours.
1.6. Importing sanitized bridge descriptors
===========================================
The metrics database can store aggregate statistics about running bridges
and bridge usage. These statistics are added by parsing sanitized bridge
descriptors available on the official metrics website.
Download a sanitized bridge descriptor tarball from the metrics website at
https://collector.torproject.org/archive/bridge-descriptors/ and extract
it to, e.g., /srv/metrics.torproject.org/bridges/bridge-descriptors-2011-05
Edit /srv/metrics.torproject.org/config to contain the following options:
ImportSanitizedBridges 1
SanitizedBridgesDirectory bridges/
KeepSanitizedBridgesImportHistory 1
WriteBridgeStats 1
Note that the bridge usage statistics require parsing relay descriptors of
the same time period in order to filter bridges that have been running as
relays from the results. When parsing sanitized bridge descriptors for
the first time it may be necessary to delete the relay descriptor import
history in /srv/metrics.torproject.org/stats/archives-import-history and
import all relay descriptors once again.
Run the database import:
$ ./run-web.sh
1.7. Importing Torperf performance data
=======================================
Torperf measures the performance of the Tor network as users experience
it. Torperf's measurement data are available on the metrics website and
can be imported into the metrics database, too.
Download the Torperf measurement files from the metrics website at
https://collector.torproject.org/archive/torperf/ and put them in a
subdirectory, e.g., /srv/metrics.torproject.org/torperf/ .
Edit /srv/metrics.torproject.org/config to contain the following options:
ImportWriteTorperfStats 1
TorperfDirectory torperf/
Run the database import:
$ ./run-web.sh
2. Installing the graphing engine
=================================
The metrics graphing engine generates custom graphs of Tor network data
based on user-provided parameters. The graphing engine requires the
metrics database to be installed as described in the previous section.
The graphing engine uses R and Rserve to generate its graphs. Rserve is a
TCP/IP server that makes it easy for other tools to use R without spawning
their own R process. Rserve also pre-loads R code and R libraries which
saves time when processing user requests.
In this configuration, Rserve will run in the context of the metrics user.
Start Rserve, this time with the metrics-web-specific configuration that
includes pre-loading the graph code:
$ cd /srv/metrics.torproject.org/rserve/ && ./start.sh
Add a crontab entry to start Rserve on reboot:
@reboot cd /srv/metrics.torproject.org/metrics/website/rserve/ && ./start.sh
Rserve will pre-load the graph code at startup. If changes are made to
the graph code, Rserve must be restarted:
$ cd /srv/metrics.torproject.org/metrics/website/rserve/
$ killall Rserve; ./start.sh
3. Installing the metrics website
=================================
The metrics website lets web users search parts of the metrics database
and visualizes custom graphs.
Note that the description here has a few specific parts that only apply to
the official metrics website. These parts should be changed when setting
up a non-official metrics website.
3.1. Configuring Apache HTTP Server
===================================
The Apache HTTP Server is used as the front-end web server that serves
static resources itself and forwards requests for dynamic resources to
Apache Tomcat.
Start by installing Apache 2:
$ sudo apt-get install apache2
Disable Apache's default site.
$ sudo a2dissite 000-default
Enable mod_rewrite to tell Apache where to find static resources on disk.
Also enable mod_proxy to forward requests to Tomcat.
$ sudo a2enmod rewrite proxy_http
Create a new virtual host configuration and store it in a new file
/etc/apache2/sites-available/metrics.torproject.org with the following
content:
<VirtualHost *:80>
ServerName metrics.torproject.org
ServerAdmin [email protected]
ErrorLog /var/log/apache2/error.log
CustomLog /var/log/apache2/access.log combined
ServerSignature On
<IfModule mod_proxy.c>
<Proxy *>
Order deny,allow
Allow from all
</Proxy>
ProxyPass / http://127.0.0.1:8080/metrics/ retry=15
ProxyPassReverse / http://127.0.0.1:8080/metrics/
ProxyPreserveHost on
</IfModule>
</VirtualHost>
Enable the new virtual host.
$ sudo a2ensite metrics.torproject.org
Restart Apache just to be sure that all changes are effective.
$ sudo service apache2 restart
3.2. Configuring Apache Tomcat
==============================
Apache Tomcat will process requests for dynamic resources, including web
pages and graphs.
Install Tomcat 8:
$ sudo apt-get install tomcat8
Replace Tomcat's default configuration in /etc/tomcat8/server.xml with the
following configuration:
<Server port="8005" shutdown="SHUTDOWN">
<Service name="Catalina">
<Connector port="8080" maxHttpHeaderSize="8192"
maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
enableLookups="false" redirectPort="8443" acceptCount="100"
connectionTimeout="20000" disableUploadTimeout="true"
compression="off" compressionMinSize="2048"
noCompressionUserAgents="gozilla, traviata"
compressableMimeType="text/html,text/xml,text/plain" />
<Engine name="Catalina" defaultHost="metrics.torproject.org">
<Host name="metrics.torproject.org" appBase="webapps"
unpackWARs="true" autoDeploy="true"
xmlValidation="false" xmlNamespaceAware="false">
<Alias>metrics.torproject.org</Alias>
<Valve className="org.apache.catalina.valves.AccessLogValve"
directory="logs" prefix="metrics_access_log." suffix=".txt"
pattern="%l %u %t %r %s %b" resolveHosts="false"/>
</Host>
</Engine>
</Service>
</Server>
Be sure to replace *.torproject.org with something else, unless this is
a re-installation of the official metrics website.
Update the paths starting with /srv/metrics.torproject.org/ in
/srv/metrics.torproject.org/metrics/website/etc/web.xml to the correct
paths in /srv/metrics.torproject.org/. The default paths in that file are
correct for the official metrics website setup which is slightly different
than the one described here.
Now generate the web application.
$ ant war
Create a symbolic link to the ernie.war file:
$ sudo ln -s /srv/metrics.torproject.org/metrics/website/metrics.war /var/lib/tomcat8/webapps/
Tomcat will now attempt to deploy the web application automatically.
Whenever the metrics website needs to be redeployed, generate a new .war
file and Tomcat will reload the web application automatically.
Restart Tomcat to make all configuration changes effective:
$ sudo service tomcat8 restart
The metrics website should now work.