In this story, we're going to discuss the importance of the union-find data structure (aka disjoint sets) in our lives. We'll also discuss how to implement the underlying operations, along with some additional tricks that make our lives even easier.

Union-Find addresses the range of problems that can be modeled as disjoint sets. Two sets A and B are disjoint if `intersection(A, B) = empty`. The same principle applies when we deal with n elements. There are many possible real-life examples of disjoint sets:

- Social network (Find all the users which are connected directly or indirectly)
- Semiconductor conductivity (Are two points electrically connected?)
- Percolation (How many sites should be open before it percolates?)
- Pixels in a digital photo (Are two pixels in a digital maze connected?)
- and many more ..
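As a quick illustration of the disjointness definition above, Python's built-in `set` type can check it directly. This is only a sketch; the group names and the helper `are_disjoint` are made up for the example.

```python
# Hypothetical friend groups; the names are illustrative only.
group_a = {"alice", "bob"}
group_b = {"carol", "dave"}
group_c = {"bob", "erin"}


def are_disjoint(a: set, b: set) -> bool:
    """Two sets are disjoint when their intersection is empty."""
    return len(a & b) == 0


print(are_disjoint(group_a, group_b))  # True: no shared members
print(are_disjoint(group_a, group_c))  # False: "bob" is in both
print(group_a.isdisjoint(group_b))     # True: built-in equivalent
```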

**Are two highlighted points connected in the maze below?**

Now, we're slightly familiar with the importance of UF. Let's look at how it can be implemented using actual code examples.

The basic API for a disjoint set with Union-Find operations can be defined as follows:

```
class UF:
    def __init__(self, n: int) -> None: ...           # initialize a union-find data structure with N singleton objects (0 to N - 1)
    def union(self, p: int, q: int) -> None: ...      # add a connection between p and q
    def find(self, p: int) -> int: ...                # component identifier for p (0 to N - 1)
    def connected(self, p: int, q: int) -> bool: ...  # are p and q in the same component?
```

To implement UF using the Quick-Union approach, consider the following premise:

- Integer array `id[]` of length `N`.
- Interpretation: `id[i]` is the parent of `i`.
- Root of `i` is `id[id[id[...id[i]...]]]` (keep following parents until the value stops changing).

The Union-Find queries translate to this model as follows:

- **Find** -> What is the root of p?
- **Connected** -> Do p and q have the same root?
- **Union** -> To merge components containing p and q, set the id of p's root to the id of q's root.

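```
class UF:
    def __init__(self, n):
        # every element starts as its own root
        self.id = [0]*n
        for i in range(n):
            self.id[i] = i

    def find(self, i):
        # follow parent links until we reach a root
        while(i != self.id[i]):
            i = self.id[i]
        return i

    def connected(self, p, q):
        return self.find(p) == self.find(q)

    def union(self, p, q):
        idp = self.find(p)
        idq = self.find(q)
        if idp == idq:
            return
        # hang p's root under q's root
        self.id[idp] = idq
```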
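A minimal usage sketch of the approach above. The class is repeated in compact form so the snippet runs on its own; the name `QuickUnionUF` and the sample unions are assumptions made for illustration.

```python
class QuickUnionUF:
    """Parent-pointer union-find, repeated here so the sketch is self-contained."""

    def __init__(self, n):
        self.id = list(range(n))  # every element starts as its own root

    def find(self, i):
        while i != self.id[i]:
            i = self.id[i]
        return i

    def connected(self, p, q):
        return self.find(p) == self.find(q)

    def union(self, p, q):
        root_p, root_q = self.find(p), self.find(q)
        if root_p != root_q:
            self.id[root_p] = root_q


uf = QuickUnionUF(6)
uf.union(0, 1)
uf.union(1, 2)
uf.union(3, 4)

print(uf.connected(0, 2))  # True: 0-1 and 1-2 were merged
print(uf.connected(0, 3))  # False: {0, 1, 2} and {3, 4} are separate
print(uf.connected(4, 5))  # False: 5 is still a singleton
```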

**Isn't that very easy?** That's the beauty of union-find, being powerful yet simple.

But how efficient is it? Let's look at the worst-case time complexity of each operation in the approach above:

- Union - `O(N)` (includes the cost of finding roots)
- Find - `O(N)`
- Connected - `O(N)`

Hence, a series of N union-find operations on a set of N sites can take O(N^2) time in the worst case. The main factors contributing to the worst case in the above approach are:

- Trees can get tall (imagine a long chain of `id[id[id[...id[i]...]]]`).
- Find/connected are too expensive (could be N array accesses).
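To make the tall-tree problem concrete, here is a small sketch that builds the worst case on purpose and counts how many parent links a single find has to follow. The helper names `find_with_cost` and `union` and the size `n = 1000` are assumptions for this example.

```python
def find_with_cost(id_, i):
    """Return (root, hops), where hops counts parent links followed from i."""
    hops = 0
    while i != id_[i]:
        i = id_[i]
        hops += 1
    return i, hops


def union(id_, p, q):
    """Unweighted union: always hang p's root under q's root."""
    root_p, _ = find_with_cost(id_, p)
    root_q, _ = find_with_cost(id_, q)
    if root_p != root_q:
        id_[root_p] = root_q


n = 1000
id_ = list(range(n))

# Adversarial order: each union hangs the whole existing tree under a new root,
# so the tree degenerates into a chain 0 -> 1 -> 2 -> ... -> n - 1.
for k in range(n - 1):
    union(id_, 0, k + 1)

root, hops = find_with_cost(id_, 0)
print(root, hops)  # 999 999: finding the root of 0 walks the entire chain
```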

Can we do better? Yes: a couple of interesting techniques can further optimize the union-find operations.

In order to tackle the problem of having skewed trees, we can consider weighting.

- Modify quick-union to avoid tall trees.
- Keep track of size of each tree (number of objects).
- Balance by linking root of smaller tree to root of larger tree.
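The three points above can be sketched as follows. This is a weighted quick-union without path compression; the class name `WeightedQuickUnionUF`, the `size` array, and the `depth` helper are assumptions introduced for the example.

```python
class WeightedQuickUnionUF:
    def __init__(self, n):
        self.id = list(range(n))
        self.size = [1] * n  # number of elements in the tree rooted at i

    def find(self, i):
        while i != self.id[i]:
            i = self.id[i]
        return i

    def union(self, p, q):
        root_p, root_q = self.find(p), self.find(q)
        if root_p == root_q:
            return
        # Make root_p the root of the larger tree, then link the smaller under it
        if self.size[root_p] < self.size[root_q]:
            root_p, root_q = root_q, root_p
        self.id[root_q] = root_p
        self.size[root_p] += self.size[root_q]

    def depth(self, i):
        """Number of parent links between i and its root."""
        d = 0
        while i != self.id[i]:
            i = self.id[i]
            d += 1
        return d


n = 1000
uf = WeightedQuickUnionUF(n)
# An order that would produce a long chain if trees were linked without weighting
for k in range(n - 1):
    uf.union(0, k + 1)

max_depth = max(uf.depth(i) for i in range(n))
print(max_depth)  # 1: every singleton was linked under the larger tree's root
```

With weighting, the depth of any node is known to be at most lg N, so find and connected stay logarithmic even for unlucky input orders.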

The second trick is path compression: we can modify the algorithm so that subsequent queries are faster. Just after computing the root of p, set the `id[]` of each examined node to point to that root.

An easier way to implement this is to make every other node in the path point to its grandparent.

```
id[i] = id[id[i]]; # only single line of code
```
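For comparison, here is a sketch of both variants on a plain parent array: the two-pass version that points every examined node at the root, and the one-line grandparent version. The function names `find_compress` and `find_halve` are assumptions for this example.

```python
def find_compress(id_, i):
    """Two-pass compression: find the root, then point every node on the path at it."""
    root = i
    while root != id_[root]:   # first pass: locate the root
        root = id_[root]
    while i != root:           # second pass: rewire each node on the path
        parent = id_[i]
        id_[i] = root
        i = parent
    return root


def find_halve(id_, i):
    """One-pass variant: point each visited node at its grandparent."""
    while i != id_[i]:
        id_[i] = id_[id_[i]]
        i = id_[i]
    return i


chain_a = [1, 2, 3, 4, 4]  # 0 -> 1 -> 2 -> 3 -> 4 (root)
print(find_compress(chain_a, 0), chain_a)  # 4 [4, 4, 4, 4, 4]

chain_b = [1, 2, 3, 4, 4]
print(find_halve(chain_b, 0), chain_b)     # 4 [2, 2, 4, 4, 4]
```

The two-pass version flattens the path completely, while the one-pass version roughly halves it on each call and needs no second loop.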

Incorporating the two tricks mentioned above, we get WQUPC (Weighted Quick Union with Path Compression).

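```
class UF:
    def __init__(self, n):
        self.id = [0]*n
        self.weights = [1]*n  # number of elements in the tree rooted at each index
        for i in range(n):
            self.id[i] = i

    def find(self, i):
        while(i != self.id[i]):
            # path compression: point i at its grandparent, then move up
            self.id[i] = self.id[self.id[i]]
            i = self.id[i]
        return i

    def connected(self, p, q):
        return self.find(p) == self.find(q)

    def union(self, p, q):
        idp = self.find(p)
        idq = self.find(q)
        if idp == idq:
            return
        # weighting: link the root of the smaller tree to the root of the larger tree
        if self.weights[idp] > self.weights[idq]:
            self.id[idq] = idp
            self.weights[idp] += self.weights[idq]
        else:
            self.id[idp] = idq
            self.weights[idq] += self.weights[idp]
```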
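As a usage sketch, the structure above can count groups of directly or indirectly connected users, as in the social network example. The class is repeated in compact form so the snippet runs on its own; the `count` field and the `friendships` data are assumptions added for illustration.

```python
class UF:
    def __init__(self, n):
        self.id = list(range(n))
        self.weights = [1] * n
        self.count = n  # assumed extra field: number of components

    def find(self, i):
        while i != self.id[i]:
            self.id[i] = self.id[self.id[i]]  # point i at its grandparent
            i = self.id[i]
        return i

    def union(self, p, q):
        idp, idq = self.find(p), self.find(q)
        if idp == idq:
            return
        # Make idq the root of the larger (or equal) tree
        if self.weights[idp] > self.weights[idq]:
            idp, idq = idq, idp
        self.id[idp] = idq
        self.weights[idq] += self.weights[idp]
        self.count -= 1


# Hypothetical friendships between users 0..7
friendships = [(0, 1), (1, 2), (3, 4), (5, 6), (6, 3), (2, 0)]

uf = UF(8)
for a, b in friendships:
    uf.union(a, b)

print(uf.count)                  # 3 groups: {0, 1, 2}, {3, 4, 5, 6}, {7}
print(uf.find(0) == uf.find(2))  # True
print(uf.find(0) == uf.find(7))  # False
```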

**Amortized analysis.** Starting from an empty data structure, any sequence of M union-find ops on N objects makes `≤ c ( N + M lg* N )` array accesses, where `lg* N` is the iterated log function.

- Analysis can be improved to `N + M α(M, N)`
- Simple algorithm with fascinating mathematics

Hence, in theory, **WQUPC is not quite linear; in practice, however, it is linear.** (The iterated log function grows extremely slowly; for example, `lg*(2^65536) = 5`.)
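To get a feel for how slowly the iterated log grows, here is a small sketch that counts how many times the base-2 log must be applied before the value drops to 1 or below. The function name `lg_star` is an assumption for this example.

```python
import math


def lg_star(n):
    """Iterated log: how many times log2 must be applied until n <= 1."""
    count = 0
    while n > 1:
        n = math.log2(n)
        count += 1
    return count


print(lg_star(2))           # 1
print(lg_star(16))          # 3  (16 -> 4 -> 2 -> 1)
print(lg_star(65536))       # 4
print(lg_star(2 ** 65536))  # 5
```

For any input size a real machine can hold, `lg_star` is at most 5, which is why the bound behaves like a linear one in practice.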

A famous example involves performing `10^9` union-find operations on `10^9` sites/elements: Weighted Quick Union with Path Compression reduced the processing time from `30 years` to `6 seconds`. This unblocked solutions to a wide array of real-world challenges. It's a good demonstration of why it pays to put more emphasis on designing good algorithms.

Credits: Photo by Eric Prouzet on Unsplash

In this article, we'll try to understand generic programming. We'll start with the underlying challenges which led to the rise of generic programming in the first place. Alongside, we'll see some examples of how it can be implemented using Java Generics.

So, let's say we're in the early days of computers and, while developing a ticketing system, we've just worked out how to implement a Queue for storing the names of people waiting for tickets. The interface looks as follows:

```
public class QueueOfStrings
QueueOfStrings() // create an empty queue
void push(String item) // insert a new string into the queue
String pop() // remove and return the string least recently added
boolean isEmpty() // is the queue empty?
int size() // number of strings in the queue
```

Initially, we only needed a queue of people's names so we could issue tickets in order, hence we defined `QueueOfStrings`. However, the demand for more queues suddenly increased, and now the business needs queues for phone numbers, **motor vans** (for parking slots), etc. So, how can we provide all of these queues? Should we just copy the implementation and change the names and types?

There are several ways to accomplish that. Let's discuss a few and evaluate them one by one below.

Attempt 1: Implement a separate queue class for each type. For example, if we need a queue of MotorVan objects, we can define our interface as below:

```
public class QueueOfMotorVans
QueueOfMotorVans() // create an empty queue
void push(MotorVan item) // insert a new motor van into the queue
MotorVan pop() // remove and return the motor van least recently added
boolean isEmpty() // is the queue empty?
int size() // number of motor vans in the queue
```

We can then copy the underlying implementation, replacing Strings with MotorVans. However, this method has two underlying issues:

- Rewriting code is tedious and error-prone.
- Maintaining cut-and-pasted code is tedious and error-prone.

**Fun Fact** - This was the most reasonable approach until Java 1.5.

Attempt 2: Implement a queue with items of type Object. This can help to store all types of objects whether it's Oranges or MotorVans.

```
QueueOfObjects s = new QueueOfObjects();
MotorVan mv = new MotorVan();
Orange or = new Orange();
s.push(or);
s.push(mv);
MotorVan a = (MotorVan) (s.pop()); // ClassCastException at run-time: the first item out is an Orange
```

As the comment notes, popping objects is error-prone because the client has to cast every item it takes out. Worse, we cannot catch such mistakes in our application at compile-time (the cast is only checked at run-time), so the compiler cannot help us ensure the correctness of the solution.

Attempt 3: Java generics. We can use the following syntax to define the generic definition of the queue - `Class Name<T>`. Now we can create a queue for theoretically anything with just a single line of code, as shown below.

```
Queue<MotorVan> s = new Queue<MotorVan>();
MotorVan mv = new MotorVan();
Orange or = new Orange();
s.push(mv);
s.push(or); // compile-time error: an Orange is not a MotorVan
MotorVan a = s.pop(); // no cast needed
```

This implementation approach helps us to:

- Avoid casting in the client.
- Discover type mismatch errors at compile-time instead of run-time.

Generic programming makes more complex problems tractable by simplifying how we deal with different types of real-world objects. We can see day-to-day examples of generics in the C++ STL and the Java collections library. More powerful real-world examples can be seen in video games, animations, etc.

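**Caveats**: In Java, arrays are covariant and generics are invariant. Java also does not allow creating a generic array directly (`new T[n]` does not compile), so an array-backed implementation sometimes needs a casting trick such as `(T[]) new Object[n]`.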