Sunday, August 19, 2007
Rop Gonggrijp and I proposed and ran a panel session, on the first afternoon of the recent e-voting Dagstuhl seminar, on the relationship between science and policy, the press, and politics. The other people on the panel, besides Rop and me, were J. Alex Halderman, Dan Wallach, and Richard Sietmann, a member of the German press who was attending the event. The main point of the panel was to emphasize to the audience that scientific work in this area cannot be divorced from politics. We also offered many war stories and advice on interacting with government, vendors, and the media.
My main recommendation was that every scientist doing research with potential media interest should read "A Scientist's Guide to Talking with the Media" by Richard Hayes and Daniel Grossman. I have found this book invaluable. It has informed my interactions not only with the media, but also with the general public, and even with other experts. Clearly communicating the core of complex ideas, like those inherent in the "hot" topics of e-voting or applied formal methods, is critical. It is an added bonus when you can combine the disciplines, as convincing computer scientists of the use and utility of modern applied formal methods is just as hard as educating government ministers about the dangers of currently available commercial e-voting computers, but for entirely different reasons.
I am encouraged by young researcher/activists like Alex and Dan. The world needs more activist scientists. These Ph.D. students and young not-yet-tenured professors are very brave to grab the e-voting bull by the horns, and I'm proud of them for that. Perhaps the generational apathy of the past 20 years can, in some small way, be counteracted by the all-consuming passion of the modern hacker/scientist? I know I'll keep making a fuss.
Friday, August 17, 2007
2007 Electronic Voting Technology Workshop
Last week I was presenting a paper at EVT07. Here are my notes/observations.
2007 USENIX/ACCURATE Electronic Voting Technology Workshop (http://www.usenix.org/events/evt07/tech/) 6th August:
EVT Session Analysis I: The first talk of the day was from the Dutch group "We don't trust voting computers". This was a well-presented talk, with video clips showing how easy it was to change the EEPROM on a voting computer in 60 seconds, and of a reprogrammed voting machine which could play chess. The second talk was similar, but was about Diebold machines in the US. The lock can be picked in less than 10 seconds. The third talk was about issues with technologies for disabled voters, namely the DV eSlate (http://evote.cs.ucdavis.edu/). All three types of voting machines have radio emissions from raster displays, which can be detected and analyzed.
EVT Session Design I: The first talk showed how to use hash chains to prevent/detect tampering with an audit log. This assumes that the information in the audit log can safely be made public (in encrypted form) and then shared between servers for redundancy. The second talk was about how to reduce the complexity of voting machine software by pre-rendering the user interface. My talk about verification of electronic vote counting for PRSTV was the last before lunch. This was the only paper at EVT to discuss formal methods, and it drew a mixed reaction. Several cryptographers suggested that a list of votes cast could be made public (in encrypted form) to allow each candidate/party to count the votes for themselves. I was asked to explain PRSTV in more detail. Also, some cryptographers suggested that it is better to verify the result with independent counts than to verify the count process or the software. However, I don't know of any existing cryptographic schemes which work well with PRSTV from a vote counting perspective.
At present, in the paper-based system, ballots are not made public, only the number of votes held by each candidate in each round and the proportion of transfers in each round. It was suggested that randomization, which is part of PRSTV, could be done in advance with a predefined table of random numbers. However, this could compromise the anonymity of the votes. So, I still think that the actual vote counting process needs to be formally verified.
EVT Session Auditing and Transparency: The first talk of this session described the confidentiality requirements of contracts between voting machine vendors and election agencies in the US, which prohibit almost all use/analysis of the voting machines/software, except where necessary for the purpose of casting a vote. Also, the contracts themselves are confidential. In some cases the audit logs are confidential. Some vendors reserve the right to update the election software at any time, e.g. just before an election. The second talk of the session described how to calculate the optimal sample size for auditing. The next talk described how to automate the auditing of ballots (http://itpolicy.princeton.edu/voting). The last talk of the session described an experiment in which college students were unable to successfully complete an audit that required recounting of VVPAT (voter verified paper audit trail) receipts, meaning that VVPAT receipts need to be redesigned for usability.
EVT Session Analysis II: The first talk of this session described a software tool called Pioneer, which is designed to run on voting machines, checking that none of the software runs slower than expected. The idea is that if the software is compromised or acting maliciously, then it will take longer than it should to perform certain calculations. The next paper described a ballot layout attack against optical scan vote terminals; candidate name indexes can be swapped in memory, but the VVPAT receipt will appear to be correct.
The last paper of this session showed how voting certification standards discourage good database design by requiring additional documentation if the design is more secure. The data model for GEMS, which is the Election Management Software used by Diebold, does not even comply with first normal form. Different parts of the tally get stored in different parts of the database. It has been shown to give inconsistent reporting of election results, i.e. count the same set of votes on two different days and get two different results! Negative numbers of votes were stored in some parts of the database. Microsoft JET was used as the database engine, although it does not guarantee "absolute data integrity". Law professor Candice Hoke makes the point that the legislation needs to specify some minimum set of technical/quality standards, e.g. the database schema must be at least as good as 2NF, or equivalent. How best to express these technical standards in legal form is an open question.
EVT Session Design II: The first talk proposed a scheme whereby a voter using a direct recording electronic (DRE) voting machine can choose either to cast the vote as normal or to challenge the machine to decrypt the vote correctly, just by adding one question to the voting process. The next paper described a more complex scheme with two half-receipts for each vote. The voter retains one half-receipt: enough to verify the vote, but not enough to reveal it. The last talk described three different non-cryptographic schemes: Three Ballot (which would not work with PRSTV), VAV (Vote, Anti-Vote, Vote) and Twin. Although VAV might work with PRSTV, it has some usability issues, i.e. the need to fill out the ballot paper three times.
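To make the hash-chain idea from the first Design I talk a little more concrete, here is a minimal sketch in Python. It is not the scheme presented at the workshop; the entry layout, the `GENESIS` value, and the function names are assumptions chosen only for illustration. Each log entry stores the hash of the previous entry, so changing any earlier entry breaks every hash that follows it.

```python
import hashlib
import json

# Hypothetical starting value used as the "previous hash" of the first entry.
GENESIS = "0" * 64


def entry_hash(prev_hash: str, message: str) -> str:
    """Hash a message together with the hash of the entry before it."""
    payload = json.dumps({"prev": prev_hash, "msg": message}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log: list, message: str) -> None:
    """Append a new entry that is chained to the current end of the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({
        "msg": message,
        "prev": prev_hash,
        "hash": entry_hash(prev_hash, message),
    })


def verify(log: list) -> bool:
    """Recompute every hash in order; any edited entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["msg"]):
            return False
        prev_hash = entry["hash"]
    return True


if __name__ == "__main__":
    log = []
    for event in ["polls opened", "ballot cast", "polls closed"]:
        append(log, event)
    print(verify(log))             # chain is intact
    log[1]["msg"] = "ballot removed"
    print(verify(log))             # tampering is detected
```

A chain like this only detects edits to entries that are already covered by a later hash. Detecting truncation of the end of the log would need the latest hash to be held somewhere else as well, which is presumably where sharing the log between servers comes in.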
See also: http://benlog.com/articles/2007/08/06/electronic-voting-technology-2007/
=====
USENIX Security 07 Technical Sessions (http://www.usenix.org/events/sec07/tech/tech.html) 8-10th August:
Web Security Session: The first talk was about SIF (servlets with information flow), which is based on JIF. The next talk was about using tokens to weight the value of clicks on a syndication site. Then there was a talk about execution-based analysis of untrusted websites using a kind of virtual machine.
Privacy Session: The privacy session opened with a talk describing how variable bit rate encoding, which leads to more efficient packet sizes, can also leak data, so that encryption alone is not enough. The second talk of that session applied the same analysis to pervasive computing devices. The third talk described how to infer data from anonymised documents, e.g. most people can be identified just by their gender, date of birth and ZIP code.
Authentication Session: A very interesting talk about relay attacks on smartcard payment systems, including a video clip from the BBC TV programme 'Watchdog' and a proposed solution using distance bounding. The next talk showed techniques that can be used to discover graphical passwords by finding hot spots in images. The third talk was about a way of encoding a user-specified time delay into offline passwords so as to make dictionary attacks about 3.59 times harder.
Threats Session: Two talks about spam and one about botnets. The first talk was about using image shingling on screenshots to profile the 'scam' sites. The second talk was about establishing IP reputation so that overloaded mail servers could bias towards legitimate mail senders. The third talk was about a BotHunter tool for detecting bots, i.e. remote-controlled malware.
Analysis Session: The analysis session included a talk on integrity checking of cryptographic file systems, and a talk on reverse engineering of message protocols.
The last talk in that session, awarded 'best' in conference, was about detecting variations in different implementations of the same protocol, e.g. two different HTTP servers. This was done by extracting symbolic logic from the binaries (assembly code) using an intermediate language and weakest precondition analysis, and then finding a set of inputs for which the two implementations differ. This could be used either to find error conditions or to fingerprint the binary executables.
Invited Talks on Electronic Voting: There were two invited talks on Computer Security and Voting. The first was a general introduction to the topic by David Dill and the second was a discussion by David Wagner and others of their review of electronic voting machines in California. After the first talk, someone asked a question about randomisation in PR-STV, which is used in Cambridge, MA, for city elections. In the second talk, David Wagner described some of the more basic security flaws in Diebold, Sequoia and Hart machines, and then said that those were less harmful than the ones which could not be publicly disclosed! All three systems have very weak or no use of encryption and all three are vulnerable in different ways to malicious viral code. The review team focussed only on the security code and could not comment on other parts of the code, e.g. vote counting.
Obfuscation Session: Interesting talks on both software and hardware obfuscation. The obfuscation technique for software is too much like a hack; it is an example of what malware developers might do to avoid detection. The hardware obfuscation is a legitimate way to protect intellectual property using physical variation in fabricated circuits.
Network Security and Privacy Session: The first talk showed that connection time is the bottleneck in cellular networks, which are thus vulnerable to denial of service attacks that use up the pool of connection IDs. The second talk discussed various possible attacks on dense urban Wi-Fi networks.
The last talk was about data mining of anonymised network data by profiling data flows for different websites.
Work In Progress Session: Some of the WIP talks discussed virtualisation. One talk described how to use virtual machines to isolate malware. Another talk showed how to use virtual machines to subvert software licenses, using a form of replay attack. Yet another talked about using VMs to detect kernel-modifying rootkits.
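Returning to the Privacy Session's observation that gender, date of birth and ZIP code together identify most people: one rough way to measure this on a dataset is to count how many records are unique under those three fields. The sketch below does just that. The records and field names are made up for illustration and do not come from the talk.

```python
from collections import Counter


def unique_fraction(records, quasi_identifiers=("gender", "dob", "zip")):
    """Return the fraction of records whose combination of
    quasi-identifier values appears exactly once in the dataset."""
    if not records:
        return 0.0
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    counts = Counter(keys)
    unique = sum(1 for k in keys if counts[k] == 1)
    return unique / len(records)


if __name__ == "__main__":
    # Hypothetical "anonymised" records: names removed, other fields kept.
    records = [
        {"gender": "F", "dob": "1970-01-02", "zip": "02139"},
        {"gender": "M", "dob": "1970-01-02", "zip": "02139"},
        {"gender": "F", "dob": "1985-06-30", "zip": "95616"},
        {"gender": "F", "dob": "1985-06-30", "zip": "95616"},
    ]
    # Two of the four records are the only ones with their combination.
    print(unique_fraction(records))
```

Any record counted as unique here could be linked back to a person by anyone holding another dataset with the same three fields plus a name, which is why removing names alone is not enough.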
Thursday, August 09, 2007
Dagstuhl Seminar "Frontiers of e-Voting"
A few weeks ago I participated in the Dagstuhl Seminar "Frontiers of e-Voting," organized by David Chaum, Miroslaw Kutylowski, Ron Rivest, and Peter Ryan.
I always enjoy my Dagstuhl visits, and this particular seminar was no exception. In fact, I found the mix of people and personalities at this event quite refreshing, as it wasn't just a "castle" full of the same-old people who all know each other and agree. We witnessed cryptographers arguing with activist/computer scientists, political scientists drinking with theoreticians, and hacker/activists playing pool with PhD students of all ilks.
I have subscribed to several participant blogs as a result of these interactions, including those (co-)run by Michael Alvarez and Ian Brown. Already a couple of posts have caught my eye, and I'll comment upon them here in later posts.
I was asked to have a highly active role on the first day. In the morning I chaired the first session of the whole event. It focused on a summary of existing voting systems that have been constructed by attendees. The following systems were discussed: CIVS, Adder, Digishuff, Scantegrity/PunchScan, Pret a Voter, CyberVote, the Belgian Voting System, Hack-a-Vote, U.E. (used in Brazil), and KOA. Each person summarizing a system had to explain the principles of the system (i.e., why it was built), when it was created, whether the project is still running, if/how the source/system are available, what license it is available under, what size and kind of team created the system and at what cost, how the system has been used in real elections, what balloting systems/styles are supported (FPTP = first past the post, STV = single-transferable vote, etc.), what programming language(s) and operating system(s) were used, how scalable the system is thought to be, and what lessons were learned.
I was struck by how many systems were "open source", but how few speakers actually knew if the source was really available, what license was used, and how the system was designed and built. This does not bode well as an indicator of software system quality. I, of course, also asked every person how the system was specified and checked for "correctness". Were requirements gathered and analyzed? Were the system and its architecture designed? Were assertions used in some way? How was the system tested? In general, the answer was always either "we make no claims that this system implements anything" or "the correctness of the software system is completely unimportant". The first of these claims is the honest one. The design and implementation of these experimental systems, without fail, have been haphazard, at best.
Even one of my favorite systems, CIVS, which is implemented in a research programming language called JIF (Java + Information Flow) and uses very advanced distributed systems and cryptography constructs, contains little documentation and no assertions, according to one of its authors.
The latter claim comes from the e-voting sub-community that focuses on end-to-end schemes. End-to-end voting systems are meant to provide very strong guarantees about a given election's integrity and voter privacy, typically by virtue of very smart algorithm design (and that's not just computational algorithms; these systems involve people, organizations, physical artifacts like ballots, etc.) coupled with interesting cryptography. The mantra of this group is "verify the election, not the software". Unfortunately, nearly universally these systems do use hardware and software to print and scan ballots, sometimes collect a voter's choices, transmit ballots, perform vote tallies, and report results. When something goes wrong, in essence, the crypto catches it, but where is the fault? At the moment, no one really knows how this auditability and culpability-management works. Furthermore, if the software and hardware are of poor quality, then the fault is likely to be found there in these relatively complex systems. So, while the mantra is "verify the election, not the software", what is actually meant is "verify the election and make sure the software is of high quality, don't just verify the software then trust the election"...a sentiment with which I completely agree.
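As a small illustration of what "were assertions used in some way?" can mean in practice, here is a minimal sketch of a first-past-the-post tally with its preconditions and postconditions written as executable checks. It is not taken from any of the systems discussed at the seminar; the function name and the choice of checks are assumptions made for the example.

```python
def tally_fptp(ballots, candidates):
    """Count first-past-the-post ballots, checking simple pre- and
    postconditions along the way (hypothetical example, not from any
    real voting system)."""
    # Preconditions: the candidate list has no duplicates, and every
    # ballot names a known candidate.
    assert len(set(candidates)) == len(candidates), "duplicate candidate"
    assert all(b in candidates for b in ballots), "ballot for unknown candidate"

    counts = {c: 0 for c in candidates}
    for b in ballots:
        counts[b] += 1

    # Postconditions: every ballot is counted exactly once, and no
    # candidate ends up with a negative number of votes.
    assert sum(counts.values()) == len(ballots), "votes lost or duplicated"
    assert all(n >= 0 for n in counts.values()), "negative vote count"
    return counts


if __name__ == "__main__":
    print(tally_fptp(["a", "b", "a"], ["a", "b", "c"]))
```

Runtime checks like these are far weaker than a formal proof that the counting code meets its specification, and Python drops `assert` statements when run with `-O`, but they at least write down what the code is supposed to guarantee.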