Counting encrypted ballots

Now the voting is over and we have an encrypted blockchain. The task is to count the ballots and produce the voting results.

A brief digression. A prime number, starting with 2, is assigned to each voting choice: 

Essentially, when submitting their ballots, voters send these prime numbers, encrypted with the public key, to the voting system.

It is also important to note that the ElGamal system is multiplicatively homomorphic:

This means that encrypted message m1 multiplied by encrypted message m2 is equal to the result of encrypting the product of m1 and m2.

This enables us to multiply encrypted data without decrypting it. Consequently, the voting results can be saved in the following form:

where n is the number of ballots cast.

By performing multiplication on encrypted data we get the product of all votes cast, also in encrypted form.

Decrypting the results

As you remember, we used threshold encryption (Shamir’s Secret Sharing scheme). Any k participants, who know the coordinates of k different points of a polynomial, can reconstruct the polynomial and all of its constants, including the last one, which is the shared secret. Essentially, the task in hand is to solve a system of equations to reconstruct the original polynomial and its constant term. This type of problem is often solved using the Lagrange interpolating polynomial – an approach that we adopted, as well. The underlying idea is simple: this is a polynomial of the lowest degree which assumes the given values in the given set of points, i.e., which interpolates the function by known points. Below we provide some more details on this.

To decrypt the data, we start by identifying the set of trusted representatives, Pi, that have parts of the keys and will participate in recovering the original text.

Next, we introduce the set π ͼ {1,..n}, |π|=k.

To recover the original text, we calculate the modified key, ai. 

where ai is the modified key of representative Pi, li is a multiplier in the Lagrangian function, which depends on node numbers and the total number of nodes in π ͼ {1,..n}.

To decrypt the voting results, we need to apply keys ai of representatives Pi sequentially.

Counting the votes

The result is something that looks like this:

where k1, k2, k3 represent the number of people who have voted for each of the candidates and N is the number that we have decrypted. Now that we know the prime numbers used (3, 5, 7) and N, we need to factor the equation and obtain k1, k2, k3.

Let us consider a simple case in which we have the following equation:

How do we find x if we know α, β and M? The easiest method would be to multiply α by itself until we get β, since what we are dealing with here is a residue group modulo M and the order (number of elements) of the group is equal to M, which means that we will need to perform a maximum of M multiplications to find β using this kind of enumeration:

This sort of enumeration can be accelerated using Shanks’ algorithm, which is also known as the baby-step giant-step algorithm. The algorithm is based on the following idea:

First, for some k we calculate αk, α2k, α3k, etc. These are the so-called giant steps:

Next, at the second stage of the algorithm, we calculate the products β⋅α, β⋅α2 β⋅α3, etc. These are the baby steps:

The above diagrams demonstrate that at some point β⋅αt becomes equal to some ym, which was computed in the previous step. Thus,

from which it follows that

Careful readers will have noticed that this describes the algorithm for finding one x, while we have many unknowns, since there are many choices in an election. This is why the algorithm was slightly modified to turn it into a multidimensional Shanks’ algorithm. A value table (giant steps) was developed for products of α in the following form: 

And the baby step in Shanks’ algorithm was performed as follows: for all jh=1..m,h=1..k, check whether the number β*α1j1*α2j2*αkjk is found in the table. If the number is found, the result is (i1m-j1, i2m-j2,..ikm-jk). The principle is basically the same as above, except that we are working with the product of voter choices from the start.

Vote counting performance

Shanks’ algorithm accelerates computation, but the issue of performance in large-scale elections remains open. We carried out a number of one-core computation experiments (to get a feel for the magnitude of things) and the results were as follows:

It can be assumed that, with good optimization and using all processor cores, it would take a reasonable amount of time to process 1,000,000 votes using one computer with 5 choices or 100,000 votes with 6 choices.

It is also important to note that the vote and the subsequent count can be sharded, i.e., assembled into metablocks (in the form of separate blockchains), e.g., 10,000 votes to each metablock. After this, the factoring can be performed separately for each block and the results can then be added together. This will enable the system to be scaled to any size.

Did this answer your question?