Commit a026943

Add Minerva algorithm to Arlo
feat: add athena_sample_sizes option

Add an athena_sample_sizes() shim, to call Athena using the same call signature as bravo_sample_sizes()

Calculate Athena p-values

Define get_athena_test_statistics() shim patterned after bravo.get_test_statistics()
Still following earlier integration work based on ab525ad from 2020-05-05

Move shim to new audit_math location

Based on Arlo changes in the meantime.

Dirty patch to bravo.py for using athena

Allow user to choose audit_math implementation at startup.

The default is now to use Athena, but the $ARLO_ALGORITHM environment variable can be used to select an algorithm at run time. E.g.:

    ARLO_ALGORITHM=bravo ./run-dev.sh

N.B.: to avoid having to add too much other logic before we decide on the best way to integrate this, we simply conditionally replace bravo.bravo_sample_sizes with athena_sample_sizes:

    if ALGORITHM == "athena":
        bravo_sample_sizes = athena_sample_sizes

Also catch up with API updates in athena repo.

Switch to minerva, adapt to Athena api changes

FIXME: bravo tests still broken
Need to check the minerva test results.

Temp README changes; logging setup; more tests

Temp: run black, note issues

Resolve lint, typing errors; fstring logging

Turned off logging-fstring-interpolation in .pylintrc. I think the possible tiny performance penalty is offset by the readability gains, as noted at pylint-dev/pylint#2354 (comment)

Clarify README and auth0.md; fix .pylintrc format

Add athena from git to Pipfile

Note changes in Pipfile.lock - not sure if you want the rest of the packages to be updated, or to have specific version numbers.

Add logging during startup

Fix typo, get node, cli buildpacks to build

Specify Heroku version python-3.7.8 in runtime.txt

Fix Pipfile.lock syntax

Clean up some logging
1 parent 747238f commit a026943
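
The commit message describes selecting the algorithm through `$ARLO_ALGORITHM`, but the `server/config.py` change that defines `ALGORITHM` is not shown in this diff. A minimal sketch of how such a setting might be read follows. The `read_algorithm()` helper, the `SUPPORTED_ALGORITHMS` tuple, and the choice of `minerva` as the default are all assumptions made for illustration (the message first names Athena as the default and later switches to Minerva).

```python
import os
from typing import Mapping, Optional

# Hypothetical list of accepted values; the real config module may differ.
SUPPORTED_ALGORITHMS = ("minerva", "bravo")


def read_algorithm(
    environ: Optional[Mapping[str, str]] = None, default: str = "minerva"
) -> str:
    """Return the audit algorithm named by $ARLO_ALGORITHM, or the default."""
    env = os.environ if environ is None else environ
    value = env.get("ARLO_ALGORITHM", default).strip().lower()
    if value not in SUPPORTED_ALGORITHMS:
        raise ValueError(
            f"ARLO_ALGORITHM must be one of {SUPPORTED_ALGORITHMS}, got {value!r}"
        )
    return value
```

Rejecting unknown values at startup, rather than silently falling back, is one way to keep a mistyped algorithm name from quietly running the wrong audit math.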

9 files changed, with 315 additions and 9 deletions.
.pylintrc

Lines changed: 1 addition & 0 deletions
@@ -163,6 +163,7 @@ disable=print-statement,
         duplicate-code,
         broad-except,
         no-else-raise,
+        logging-fstring-interpolation,
 
 
 
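
The check disabled above flags f-strings passed to logging calls because an f-string is rendered even when the record is then discarded, whereas `%`-style arguments are only formatted if the record is actually emitted. A small sketch of that difference, using a hypothetical `Expensive` class and logger name that are not part of Arlo:

```python
import logging


class Expensive:
    """Counts how many times it is rendered to a string."""

    def __init__(self) -> None:
        self.renders = 0

    def __str__(self) -> str:
        self.renders += 1
        return "expensive"


logger = logging.getLogger("fstring_demo")  # hypothetical logger name
logger.setLevel(logging.WARNING)  # DEBUG records are discarded

lazy, eager = Expensive(), Expensive()
logger.debug("value: %s", lazy)  # never formatted: DEBUG is disabled
logger.debug(f"value: {eager}")  # formatted before debug() is even called
```

After this runs, `lazy.renders` is 0 and `eager.renders` is 1, which is the small cost the commit message accepts in exchange for readability.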
Pipfile

Lines changed: 1 addition & 0 deletions
@@ -40,6 +40,7 @@ sqlalchemy = "*"
 typing-extensions = "*"
 pytest-testmon = "*"
 sentry-sdk = {extras = ["flask"], version = "*"}
+athena = {editable = true,git = "https://github.com/filipzz/athena.git"}
 
 [requires]
 python_version = "3.8"

Pipfile.lock

Lines changed: 5 additions & 0 deletions
(Generated file; diff not rendered.)

README.md

Lines changed: 32 additions & 5 deletions
@@ -109,11 +109,17 @@ Rather than manually config the environment, you can also run the setup script d
 
 ### Creating Organizations and Administrators
 
-Organizations are, for example, the State of
+Arlo identifies and authenticates three classes of end-user:
+organization administrators, jurisdiction administrators,
+and audit boards.
+
+Organizations identify administrators and jurisdictions for whom they
+administer audits. Jurisdictions identify their own administrators,
+as well as audit boards. Audit boards enter ballot-by-ballot auditing data.
+
+Thus, organizations are, for example, the State of
 Massachusetts. Administrators are individual users that administer
-audits for an organization. All authentication is done via auth0 with
-email addresses, so users in the Arlo database also need to be
-mirrored in the appropriate auth0 tenant user database.
+audits for an organization.
 
 To create an organization in the database:
 
@@ -127,6 +133,26 @@ Then, to create an administrator for the organization:
 
 which returns the `user_id`.
 
+This can be automated via:
+
+    org=$(python -m scripts.create-org MyOrg)
+    python -m scripts.create-admin $org [email protected]
+
+The email addresses authorized to administer jurisdictions are identified
+via the `filesheet.csv` file uploaded by the organization administrator.
+
+After the jurisdiction admin creates audit boards for each audit,
+they download the `Audit Board Credentials for Data Entry.pdf` files.
+Each one contains a URL for an audit board, with an embedded
+authentication token.
+
+All authentication is done using OAuth 2.0
+(e.g. via [Auth0](https://auth0.com/)), with
+email addresses, so users in the Arlo database should typically also be
+configured in the appropriate auth0 tenant user database.
+
+For design details, see [Arlo's use of Auth0](docs/auth0.md).
+
 ### Resetting the Database When Upgrading Arlo
 
 If you're upgrading Arlo, right now the only way is to destroy and
@@ -155,10 +181,11 @@ We recommend Ubuntu 18.0.4.
 
 #### Automatic configuration and setup
 
-If you would just like to run Arlo and do not wish to setup a custom configuration, you can run `pipenv run python -m scripts.setup-dev`, which provides interactive configuration. The script optionally installs VotingWorks' [nOAuth](https://github.com/votingworks/nOAuth) locally, runs it, and configures Arlo to use it. It creates the necessary audit administrator and jurisdiction administrator credentials discussed above, and launches a dev instance of Arlo. Once you have navigated to `localhost:3000` in your broswer, you should be able to log in as an audit admin using the credentials you configured earlier in the script.
+If you would just like to run Arlo and do not wish to set up a custom configuration, you can run `pipenv run python -m scripts.setup-dev`, which provides interactive configuration. The script optionally installs VotingWorks' [nOAuth](https://github.com/votingworks/nOAuth) locally, runs it, and configures Arlo to use it. It creates the necessary audit administrator and jurisdiction administrator credentials discussed above, and launches a dev instance of Arlo. Once you have navigated to `localhost:3000` in your browser, you should be able to log in as an audit admin using the credentials you configured earlier in the script.
 
 #### Troubleshooting
 
+- Beware: if you run `make format-server`, it will run black in a way that changes all files under the current directory without providing a backup.
 - Postgres is best installed by grabbing `postgresql-server-dev-10` and `postgresql-client-10`.
 - `psychopg2` has known issues depending on your install (see, e.g., [here](https://github.com/psycopg/psycopg2/issues/674)). If you run into issues, switch `psychopg2` to `psychopg2-binary` in the Pipfile
 - `pipenv install` can hang attempting to get [a lock on the packages it's installing](https://github.com/pypa/pipenv/issues/3827). To get around this, add the `--skip-lock` flag in the Makefile (the first line should be `pipenv install --skip-lock`).

docs/auth0.md

Lines changed: 4 additions & 4 deletions
@@ -12,13 +12,13 @@ Auth0 is used for authentication. Key things to keep in mind:
 - we use two separate Auth0 tenants, one for audit administrators, one
   for jurisdiction administrators, each with its own single
   application, so we can use completely different login screens for
-  both, specifically 2FA for audit administrators and passwordless for
-  jurisdiction administrators.
+  both, specifically 2FA for audit administrators, and both 2FA
+  and passwordless for jurisdiction administrators / audit boards.
 
 - setting up auth0 passwordless requires either creating users via the
   Management API, or letting anyone sign in and filtering on our
-  end. We'll start with the latter, we may do the former at some
-  point.
+  end. We'll start with the latter, creating URLs for audit boards on
+  the fly, but we may do the former at some point.
 
 - right now we're using "Universal Login", where Auth0 controls the
   login page. It's not clear that's the right way forward for Arlo, as

runtime.txt

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+python-3.7.8

server/audit_math/bravo.py

Lines changed: 31 additions & 0 deletions
@@ -9,10 +9,14 @@
 import math
 from decimal import Decimal, ROUND_CEILING
 from collections import defaultdict
+import logging
 from typing import Dict, Tuple, Optional
 from scipy import stats
 
 from .sampler_contest import Contest
+from .shim import minerva_sample_sizes, get_minerva_test_statistics  # type: ignore
+
+from ..config import ALGORITHM
 
 
 def get_expected_sample_sizes(
@@ -122,6 +126,27 @@ def get_test_statistics(
                     Decimal((1 - winners[winner]["swl"][cand]) / 0.5) ** votes
                 )
 
+    logging.debug(f"bravo test_stats: T={T}")
+
+    if ALGORITHM == "minerva":
+        for winner, winner_res in winners.items():
+            for loser, loser_res in losers.items():
+                res = get_minerva_test_statistics(
+                    0.1,
+                    winner_res["p_w"],
+                    loser_res["p_l"],
+                    sample_results[winner],
+                    sample_results[loser],
+                )
+                logging.debug(
+                    f"minerva test_stats {res=} for: {winner_res['p_w']=}, {loser_res['p_l']=}, {sample_results[winner]=}, {sample_results[loser]=}"
+                )
+                T[(winner, loser)] = 1.0 if res is None else 1.0 / res
+
+        logging.debug(f"minerva test_stats return: T={T}")
+        return T
+
+    # else.....
     return T
 
 
@@ -486,4 +511,10 @@ def compute_risk(
 
         if raw > alpha:
             finished = False
+    logging.debug(f"samples {sample_results}, measurements {measurements}")
     return measurements, finished
+
+
+# Quick-and-dirty way to switch between auditing algorithms: override the function
+if ALGORITHM == "minerva":
+    bravo_sample_sizes = minerva_sample_sizes
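
The Minerva branch in `get_test_statistics()` stores the reciprocal of each p-value so that the rest of `bravo.py` can keep treating `T` as a BRAVO-style statistic. A minimal sketch of that mapping, using hypothetical helper names; it assumes, from the `if raw > alpha` context line, that `compute_risk()` compares `1 / T` against the risk limit, which is not fully visible in this diff:

```python
from typing import Optional


def p_value_to_test_statistic(p_value: Optional[float]) -> float:
    """Map a p-value onto a BRAVO-style statistic T = 1 / p.

    A missing p-value (no ballots sampled yet) is treated as T = 1.0,
    i.e. no evidence either way, matching the hunk above.
    """
    return 1.0 if p_value is None else 1.0 / p_value


def risk_limit_met(t_stat: float, alpha: float) -> bool:
    """Assumed stopping rule: the pair is done once 1 / T <= alpha."""
    return 1.0 / t_stat <= alpha
```

For example, a p-value of 0.0625 gives `T = 16.0`, which satisfies a risk limit of 0.1, while a p-value of 0.125 gives `T = 8.0`, which does not.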

server/audit_math/shim.py

Lines changed: 195 additions & 0 deletions
@@ -0,0 +1,195 @@
+"""shim.py: Shim code to interface between the calling conventions expected by the current
+bravo_sample_sizes() code with the API currently provided by the athena module.
+
+Over time we expect both the Arlo and the Athena calling conventions to change, so
+this is a very temporary solution.
+
+TODO: Is it worth finding a way to keep the Audit objects cached?
+  Or is it better to make them up for each pairwise estimate as we go?
+"""
+
+import logging
+import math
+from typing import Any
+from athena.audit import Audit  # type: ignore
+
+
+def make_election(risk_limit, p_w: float, p_r: float) -> Any:
+    """
+    Transform fractional shares to an athena Audit object for a two-candidate contest.
+
+    Inputs:
+        risk_limit  - the risk-limit for this audit
+        p_w         - the fraction of vote share for the winner
+        p_r         - the fraction of vote share for the loser / runner-up
+    """
+
+    # calculate the undiluted "two-way" share of votes for the winner
+    p_wr = p_w + p_r
+    p_w2 = p_w / p_wr
+
+    contest_ballots = 100000
+    winner = int(contest_ballots * p_w2)
+    loser = contest_ballots - winner
+
+    contest = {
+        "contest_ballots": contest_ballots,
+        "tally": {"A": winner, "LOSER": loser},
+        "num_winners": 1,
+        "reported_winners": ["A"],
+        "contest_type": "PLURALITY",
+    }
+
+    contest_name = "ArloContest"
+    election = {
+        "name": "ArloElection",
+        "total_ballots": contest_ballots,
+        "contests": {contest_name: contest},
+    }
+
+    audit = Audit("minerva", risk_limit)
+    audit.add_election(election)
+    audit.load_contest(contest_name)
+
+    return audit
+
+
+def get_minerva_test_statistics(
+    risk_limit: float, p_w: float, p_r: float, sample_w: int, sample_r: int,
+) -> Any:
+    """
+    Return Minerva p-value
+    TODO: refactor to pass in integer vote shares to allow more exact calculations, incorporate or
+    track round schedule over time, and handle sampling without replacement.
+
+    Inputs:
+        risk_limit  - the risk-limit for this audit
+        p_w         - the fraction of vote share for the winner
+        p_r         - the fraction of vote share for the loser
+        sample_w    - the number of votes for the winner that have already
+                      been sampled
+        sample_r    - the number of votes for the runner-up that have
+                      already been sampled
+
+    Outputs:
+        p_value     - p-value for given circumstances
+
+    FIXME: need new Minerva-specific test cases - are these exactly right?
+    Vs Athena Test cases from https://github.com/gwexploratoryaudits/brla_explore/pull/10/files/988f068e65fd955c8e5d1512865ef5e95a1d7b3c..94693c67aa33a1c642a98336ca5b7fcd32c1ce33#
+    test26: pass
+    >>> get_minerva_test_statistics(0.1, 0.224472184613, 0.12237580158, 50, 36)
+    0.08762086910131112
+
+    test27: fail
+    >>> get_minerva_test_statistics(0.1, 0.224472184613, 0.12237580158, 49, 37)
+    0.12450655512929908
+
+    FIXME: Should this be 1.0? Or nothing, indicating "None"?
+    >>> get_minerva_test_statistics(0.1, 0.224472184613, 0.12237580158, 0, 0)
+    >>> get_minerva_test_statistics(0.1, 0.75, 0.25, 7, 0)
+    0.05852766346593508
+    """
+
+    # calculate the undiluted "two-way" share of votes for the winner
+    p_wr = p_w + p_r
+    p_w2 = p_w / p_wr
+
+    audit = make_election(risk_limit, p_w, p_r)
+
+    if sample_w or sample_r:
+        round_sizes = [sample_w + sample_r]
+        audit.add_round_schedule(round_sizes)
+        audit.set_observations(round_sizes[0], round_sizes[0], [sample_w, sample_r])
+    else:
+        round_sizes = []
+
+    if round_sizes:
+        status = audit.status[audit.active_contest]
+        risk = status.risks[0]
+    else:
+        risk = None
+
+    logging.info(
+        f"shim get_minerva_test_statistics: margin {(p_w2 - 0.5) * 2} (pw {p_w} pr {p_r}) (sw {sample_w} sr {sample_r}) risk {risk}"
+    )
+
+    return risk
+
+
+def minerva_sample_sizes(
+    risk_limit: float,
+    p_w: float,
+    p_r: float,
+    sample_w: int,
+    sample_r: int,
+    p_completion: float,
+) -> int:
+    """
+    Return Minerva round size based on completion probability, assuming the election outcome is correct.
+    TODO: refactor to pass in integer vote shares to allow more exact calculations, incorporate or
+    track round schedule over time, and handle sampling without replacement.
+
+    Inputs:
+        risk_limit   - the risk-limit for this audit
+        p_w          - the fraction of vote share for the winner
+        p_r          - the fraction of vote share for the loser
+        sample_w     - the number of votes for the winner that have already
+                       been sampled
+        sample_r     - the number of votes for the runner-up that have
+                       already been sampled
+        p_completion - the desired chance of completion in one round,
+                       if the outcome is correct
+
+    Outputs:
+        sample_size  - the expected sample size for the given chance
+                       of completion in one round
+
+    >>> minerva_sample_sizes(0.1, 0.6, 0.4, 56, 56, 0.7)
+    244
+
+    # FIXME: check this
+    >>> minerva_sample_sizes(0.1, 0.6, 0.4, 0, 0, 0.7)
+    111
+    >>> minerva_sample_sizes(0.1, 0.6, 0.4, 0, 0, 0.9)
+    179
+    """
+
+    # calculate the undiluted "two-way" share of votes for the winner
+    p_wr = p_w + p_r
+    p_w2 = p_w / p_wr
+
+    audit = make_election(risk_limit, p_w, p_r)
+
+    pstop_goal = [p_completion]
+
+    if sample_w or sample_r:
+        round_sizes = [sample_w + sample_r]
+        audit.add_round_schedule(round_sizes)
+        audit.set_observations(round_sizes[0], round_sizes[0], [sample_w, sample_r])
+    else:
+        round_sizes = []
+
+    if round_sizes:
+        status = audit.status[audit.active_contest]
+        below_kmin = status.min_kmins[0] - sample_w
+    else:
+        below_kmin = 0
+
+    res = audit.find_next_round_size(pstop_goal)
+    next_round_size_0 = res["future_round_sizes"][0]
+
+    next_round_size = next_round_size_0 + 2 * below_kmin
+
+    size_adj = math.ceil(next_round_size / p_wr)
+
+    logging.info(
+        f"shim sample sizes: margin {(p_w2 - 0.5) * 2} (pw {p_w} pr {p_r}) (sw {sample_w} sr {sample_r}) pstop {p_completion} below_kmin {below_kmin} raw {next_round_size} scaled {size_adj}"
+    )
+
+    return size_adj
+
+
+if __name__ == "__main__":
+    import doctest
+
+    doctest.testmod()
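
The shim reduces every winner/loser pair to a synthetic two-candidate contest and then scales the resulting round size back up to the full ballot pool. The arithmetic can be sketched on its own, without the `athena` dependency. The function names below are hypothetical and only restate the expressions used in `make_election()` and `minerva_sample_sizes()`:

```python
import math
from typing import Tuple


def two_way_share(p_w: float, p_r: float) -> float:
    """Winner's share among ballots cast for either of the two candidates."""
    return p_w / (p_w + p_r)


def synthetic_tally(
    p_w: float, p_r: float, contest_ballots: int = 100000
) -> Tuple[int, int]:
    """Winner and loser counts for the stand-in two-candidate contest."""
    winner = int(contest_ballots * two_way_share(p_w, p_r))
    return winner, contest_ballots - winner


def scale_to_all_ballots(two_way_round_size: int, p_w: float, p_r: float) -> int:
    """Inflate a two-way round size, since only a p_w + p_r fraction of
    sampled ballots is expected to show a vote for either candidate."""
    return math.ceil(two_way_round_size / (p_w + p_r))
```

With `p_w = 0.375` and `p_r = 0.125`, the two-way share is 0.75, the synthetic tally is 75000 to 25000, and a two-way round size of 111 scales to 222 ballots.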
