A blog about software development and other software related matters

Blog Archive

Wednesday, May 2, 2007

Dicsrete math & Java

In my Bachelor's degree i had the pleasure to study discrete math, discrete math handles group theory which address well hmm... groups :).
A group is a math entity that has a proper definition and a collection of operations which are operable on them such as: intersection, union, disjunction, subtraction.

At this point you must be thinking "well this all doesn't matter to us Java folks ..", well in fact we can think of Java collections and sets as groups (not exactly since we wont insist on a binary operation), thinking of them this way can result in some clean code implementation.
For example imagine that we need to find which changes were made to an arbitrary list of uniquely identifiable objects (list which are new, removed or old), the simplistic approach will look like:


public void printChanges(List older, List newer) {
for(Object value:older){
if(newer.contains(value)){
log.info(value.toString()+" is old");
} else {
log.info(value.toString()+" was removed");
}
}

for(Object value:newer){
if(!older.contains(value)){
log.info(value.toString()+" was added");
}
}
}

Its not the most elegant code since it contains loops and conditionals, in simple cases this might not be so bad but in more complex cases keeping trace on this kind of code is not easy, as for run time its about o(n).

Now lets see how the groups approach might work, our input consists of two object groups and we are seeking for three other groups that contain elements from these two, the most easy one to detect is the intersection of the two which match the old objects.
Finding the removed and the new is the same symmetric problem which is to find the objects that exists in one group but doesn't exist on the other, in groups lingo the operation that finds such objects is called subtraction.
Now lets take a look at the group oriented implementation:


//making use of org.apache.commons.collections
public void printChanges(Collection older, Collection newer) {
final Collection removed = CollectionUtils.subtract(older, newer);
final Collection added = CollectionUtils.subtract(newer, older);
final Collection old = CollectionUtils.intersection(older, newer);
log.info("removed: "+removed);
log.info("added: "+added);
log.info("old: "+old);
}

Its easy to see that this implementation is much more easy to follow since there are no loops or conditionals (code complexity is lower), as for runtime its also o(n).

9 comments:

Unknown said...

I can't really see where groups are involved here, as this all relates to operations upon sets -- or did I miss something?

BTW, you can easily avoid resorting to a custom CollectionUtils implementation, as standard methods removeAll(Collection) and retainAll(Collection) can do the job for you.

jc said...

nice post, the commons collections tools are a must in doing any large group manipulation.

fyi - your feed link on your page has a double slash which makes it return a 404 instead of your feed

ronen said...

jc & antoine first thank you for the comments.
First it's has to do a lot with groups since
that these operations are used in widely in many theorems which are proved on groups.
Still you are correct that these operations are a part of set theory: http://math.comsci.us/sets/
(A group is a private case of a set).
As for resorting to custom API thats the whole point! :)
You want to be able to use more sophisticated tools and not work with the low level building blocks (using r*All will look like more the first example that iv shown), besides i think that using commons is as standard as using Spring, Hibernate etc..

Oh and ill fix up that link

Anonymous said...

if(newer.contains(value)){
log.info(value.toString()+" is old");
} else if(!newer.contains(value)){
log.info(value.toString()+" was removed");

i think you dont need to check that it doesnt contain the value in the else if statement, when you have done "if(newer.contains(value))"

ronen said...

Your absolutely right :), didn't notice that redundant call.

Anonymous said...

I hope this doesn't come across as too pedantic, but from a mathematician's point of view, this really doesn't have anything to do with group theory. Groups have a very particular definition, whereas sets are a general tool used throughout mathematics.

If you know a little basic networking, here's an analogy...It's a little like pointing to TCP/IP and saying, "isn't HTTP wonderful?"

So, yes, group theory uses theorems about sets, but so do a lot of other areas of maths. Set theory is one of the lower layers in the maths "stack".

But you're quite correct in all but your terminology: thinking about programming problems in terms of set operations can be a powerful technique.

Anonymous said...

Why is this o(n)? You're looping through one list (size n) and then looping over another list (size n), shouldn't that be o(n^2)?

If you were using hashmaps maybe this would be reduced to o(n) as the lookup is more like o(1).

ronen said...

In retrospective i guess that i should have used sets and not groups its simply that the first thing that popped to my head when thinking about this was the use that iv made with such operations in groups theory.
As for the analogy i see things a bit more like inheritance and not like stacks, a group extends the properties of the set but still its a nice analogy.
As for the run time you might be right :), it depends on how much contains costs (in ArrayList it costs O(n)), so in some cases it costs o(n^2) and not o(n).

ronen said...

Just to clarify, in the first example there are two separated loops and not a nested one (i had a typo and left a bracket out by mistake).