Inclusion-Exclusion principle, which will be called from now also the principle, is a famous and very useful technique in combinatorics, probability and counting.

For the purpose of this article, at the beginning the most common application of the principle, which is counting the cardinality of sum of $$n$$ sets, will be considered. Later, more applications will be given.

**Cardinality of the sum of sets**

For a given set $$A$$ of $$n$$ sets $$A_1, A_2, \ldots, A_n$$, let $$S_n$$ be the sum of these sets, more formally, $$S_n = \bigg|\bigcup_{i=1}^n A_i\bigg|$$

The principle gives a direct formula for computing $$S_n$$. It is important to notice that since any two of given sets can have a non-empty intersection, the cardinality of their sum can be smaller than the sum of their cardinalities. This is the reason why inclusion-exclusion principle is so useful.

**The formula:**

$$$S_n = \sum_{i = 1}^n |A_i| - \sum_{1 \leq i < j \leq n} |A_i \cap A_j| + \sum_{1 \leq i < j < k \leq n} |A_i \cap A_j \cap A_k| - \ldots + (-1)^{n - 1} \cdot |A_1 \cap \ldots \cap A_n|$$$

If for a given set $$A$$ of $$n$$ sets $$A_1, A_2, \ldots, A_n$$, the cardinality of intersection of sets in any subset of $$A$$ can be computed efficiently, then the formula can be used in order to get the value of $$S_n$$.

A very important thing to notice is that the formula can be quite long. Actually, each subset of $$A$$ is considered exactly once in the formula, thus the formula for a set of $$n$$ sets has $$2^n$$ components. This is very important fact to keep in mind. Inclusion-exclusion principle has may different applications, but in some of them the complexity cost of using it is too big to be practical.

**Example:**

Let $$A$$ be a set of $$4$$ sets:

$$A_1 = {1,2,3}$$

$$A_2 = {2,3,4}$$

$$A_3 = {1,3,5}$$

$$A_4 = {2, 3}$$

The goal is to compute the cardinality of sum of all the above subsets. For the purpose of the example, let $$A_{i, j}, A_{i, j, k}, A_{i, j, k, l}$$ denote correspondently the intersection of sets $$A_i$$ and $$A_j$$, the intersection of sets $$A_i, A_j, A_k$$, and the intersection of sets $$A_i, A_j, A_k, A_l$$.

Then the formula for the above example looks like this:

$$$|A_1| + |A_2| + |A_3| + |A_4| - |A_{1,2}| - |A_{1,3}| - |A_{1, 4}| - |A_{2,3}| - |A_{2,4}| - |A_{3,4}| + |A_{1,2,3}| + |A_{1,2,4}| + |A_{1,3,4}| + |A_{2,3,4}| - |A_{1,2,3,4}|$$$

Since in the example the cardinality of each above intersection can be computed just by looking at the sets in the intersection, the formula is transformed to:

$$$3 + 3 + 3 + 2 - 2 - 2 - 2 - 1 - 2 - 1 + 1 + 2 + 1 + 1 - 1 = 5$$$

This example might seem trivial, because computing the cardinality of sum of all sets is as straightforward like computing the cardinality of their intersections. However, in many cases the complexity of computing these two values differ significantly. At the end of this article, there will be an example problem demonstrating such case. But first, a very handy method of implementing the formula will be given.

**Implementation:**

For the purpose of the article, let

```
int intersectionCardinality(vector<int> indices)
```

be a function returning the cardinality of intersection of sets $$A_i$$, where i is in the indices vector.

Then the value of the formula can be computed by generating all subsets of $$A$$, which is the set of sets $$A_1, \ldots, A_n$$, and for each such subset, by computing the cardinality of its intersection and adding this value with an appropriate sign to the final result.

```
int n; // the number of sets in the set A
int result = 0; //final result, the cardinality of sum of all subsets of A
for(int b = 0; b < (1 << n); ++b)
{
vector<int> indices;
for(int k = 0; k < n; ++k)
{
if(b & (1 << k))
{
indices.push_back(k);
}
}
int cardinality = intersectionCardinality(indices);
if(indices.size() % 2 == 1) result += cardinality;
else result -= cardinality;
}
cout << result << endl; //printing the final result
```

This method of iterating over all subsets of some set using indicator vector is a super handy method and it is important to remember and master it. The idea is pretty simple, since there are $$2^n$$ subsets of a set of n elements, each integer in a range $$[0, 2^n)$$ represents an indicator vector of one such subset when this integer is interpreted as a binary number.