Friday, August 14, 2009

Algorithmists vs. Mathematicians

This is something I learned very early in the competition, before I had interacted with any other competitors. It's really more of a personal discovery rather than something directly related to the competition, but I think it's a good place to start.

I am not a mathematician, statistician, or actuary. In fact, math was generally my worst subject when I was in grade school. Still, I did take quite a few math courses in college: 3 semesters of calculus, linear algebra, and differential equations. I would have avoided these courses like the plague, but they were required for one or more of the majors I blithely skipped between as an undergraduate. Once again, my grades in these courses were not stellar. But I found that I understood the theories we studied at least as well as most of my classmates. I even helped others figure out how to do homework problems. Why, then, was my own work so poor? My conclusion: I lacked the ability (or perhaps desire) to do dozens of contrived problems based on the single assumption that the method(s) used should be drawn from the chapter we were were studying at the time. After the first few problems I would get bored and start looking for shortcuts – some of which worked, but most of which failed. I once spent half of the allotted time for a calculus test attempting a derivation that would greatly simplify the bulk of the problems presented. I could see that such a formula must exist, and was much more interested in finding it than doing the problems. Unfortunately I didn't finish the derivation, and subsequently failed the test. We learned the formula a couple of months later. At the time I didn't understand the point of hiding that formula. To be honest, I still don't.

What does this have to do with the Netflix competition, you ask? Am I just using this as an excuse to rail against poor educational methods or cover up my character flaws? Good question.

One reason I stumbled across this particular life lesson was because I had been reading books by Wirth (particularly Algorithms + Data Structures = Programs) and Hunter (Metalogic: An Introduction to the Metatheory of Standard First Order Logic).

The academic study of algorithms can be a complex business. It often overlaps combinatorics in ways less obvious than combinatorial optimization; but that's a discussion for another day. In the study of algorithms there is much discussion of provability, efficiency (in terms of big O notation), and elegance. What it all comes down to, though, is finding a reliable, relatively efficient, sequence of steps by which a particular family of problems can be solved. An algorithmist (yes, I choose to use that term) rarely assumes that any algorithm is perfect, especially when seen in the harsh light of real world problems. Every act of publishing an algorithm contains an implicit invitation, or even challenge, to improve upon it. Where would we be if nobody had ever challenged the “bubble sort”, after all? Computer science students, in my day at least, were encouraged (even expected) to look for better algorithms than those presented in the textbooks. Discovery of novel algorithms, even if they only fit a small subset of problems, was noteworthy. This was all considered good practice for life-after-college. Why, then, did the related field of mathematics seemed to discourage rather than encourage the same sort of thinking?

Metalogic (which also overlaps combinatorics) revealed the intriguing thought that mathematics was just another logical system, and, as such, could be scrutinized from a perspective that was different than the one I had been so painfully taught. Unfortunately this was before the publication of Gödel, Escher, Bach, which would have saved me a lot of mental calisthenics. Still, the Hunter book lead to Gödel. I delighted in Gödel's incompleteness theorem, willfully misinterpreting his conclusions to support my contention that mathematics was deeply and inherently flawed.

So, I went on my merry way, believing that mathematics was a necessary evil: something to be suffered through.

It was clear from my earliest exposure to the Netflix data that I would need to learn or re-learn some fairly basic math. For example: I certainly knew about gradient methods for problem solving, but finding the proper gradient for some of the formulas I was starting to come up with would require more knowledge of multivariate calculus than I could muster. In the past I had relied on co-workers to handle that sort of thing. It was equally clear that the bits and pieces of linear algebra that I remembered were not going to be adequate – I could barely remember how to construct and solve simple linear regression problems. That's what SAS and SPSS are for, right? Except I wasn't allowed to use commercial software.

What happened next was surprising. I did sit down and grind through some old math texts; just enough to pick up what I needed. (Coincidentally I also read How to Solve It: Modern Heuristics at about the same time – if you've read it you'll know why that's relevant). Then some competitors started to publish Netflix Prize-related papers. These papers were often riddled with formulas (ugh, I thought, more math to slog through). My own notes rarely contained formulas, but had copious quantities of pseudo-code interspersed with the text. So, have you figured out my “brilliant” revelation yet? Here it is: these mathematicians were doing exactly the same thing that I was doing – only the syntax was different. All I had to do was translate the formulas into pseudo-code (or actual code) and what they were trying to say suddenly became very clear. I had a sudden urge to smack myself in the forehead with my palm. How stupid had I been all these years!? But this was applied math, right? Not one of the “pure” pursuits. So, out of morbid curiosity I started reading all sorts of math-related papers (gotta love Google, arxiv.org, and the web in general). Guess what I found? Mathematicians really do promote the same sort of thinking, and have the same sort of expectations, that algorithmists do. It's true that mathematicians have been around longer and explored more of their “world”, and in some cases this has lead to stagnation in how it is taught and performed, but, sans baggage, it's still all about finding solutions to problems. Who would have guessed.

5 comments:

  1. Bill, that's a great post. The whole idea of a duality between code and mathematical notation reminds me of a quote from Abelson & Sussman, which is: "First, we want to establish the idea that a computer language is not just a way of getting a computer to perform operations but rather that it is a novel formal medium for expressing ideas about methodology. Thus, programs must be written for people to read, and only incidentally for machines to execute."

    So with this point of view, code and mathematical notation are both ways of conveying the same idea...though in journals, you're bound to see more mathematical notation, and only an occasional snippet of pseudocode. But like someone who speaks two languages, sometimes there's an idiom in one domain that's hard to translate to the other, so both domains have their strengths and weaknesses, I suppose. Now that more mathematically-oriented tools like Mathematica, R are available, along with upcoming language(s) like Fortress, perhaps this will blur the line between code and mathematical notation, so that they're even more freely interchangable, and ideas can more freely flow between the two.

    ReplyDelete
  2. I'm like you. Encapsulation priciple: why we should be able to do manual calcs when we invented calculators to speed up the workflow and encapsulate the complexity? Math is for boring people, smart people use Logic!

    ReplyDelete

  3. thanks for this post, keep it up for updating us, i am waiting for ur new article.
    IPL 2015 Cricket live score
    mps computers
    Harjinder Singh
    thanks again

    ReplyDelete
  4. I like your stuff, classic right, Now... everything with digital, perdita di peso thanks for remind me for this memories

    women need need exercise specially more 40 age, because their hormone
    upper body exercises for woman for your body lower lower body exercises for women and abdominal too abdominal muscle exercises for woman this special therapy if you do not have a time to exercises testosterone replacement therapy

    ReplyDelete