Tuesday, September 23, 2008

Python Silliness

So I was implementing the K-Means and the Fuzzy C-Means algorithms in Python for a class assignment and something just wasn't working. I double-checked my logic several times and everything seemed right. And indeed, the logic was right; I just got into trouble using a Python shortcut.

The * operator in Python is overloaded to allow you to do cool things like

a = 'h'*5 # stores 'hhhhh'

So naturally when I needed to initialize a list of lists, I did something like

t = [[]]*3

and expected to get a list of three separate empty lists. Here's where I made my error. If you do something like this

t[0].append('hi')

you might think you should get

t = [['hi'], [], []]

That would make sense, but what you actually get is

t = [['hi'], ['hi'], ['hi']]

and that can mess up your logic if you're not careful. What I really wanted was the more verbose

t = [[] for i in range(3)]
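Here's a small side-by-side sketch of the two ways to build the list, in case you want to try it yourself. The names shared and separate are just ones I made up for illustration.

```python
# Repeating the outer list reuses the one inner list three times.
shared = [[]] * 3
shared[0].append('hi')
print(shared)    # [['hi'], ['hi'], ['hi']]

# The comprehension evaluates [] once per iteration, so each slot
# gets its own list.
separate = [[] for _ in range(3)]
separate[0].append('hi')
print(separate)  # [['hi'], [], []]
```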

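In retrospect, it makes sense that the * operator just creates more references to the same object I'm multiplying, not completely new objects. Thus when I change the underlying object, the change shows up through every reference. The string example behaves the way you'd expect only because strings are immutable, so there's no way to change one in place. But still this wasn't entirely intuitive and is a nasty pitfall to watch out for.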
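If you want to check this for yourself, the is operator tells you whether two names refer to the same object. A quick sketch (the variable u is my own addition for comparison):

```python
t = [[]] * 3
u = [[] for _ in range(3)]

# `is` compares identity, not contents.
print(t[0] is t[1])  # True: both slots refer to one list
print(u[0] is u[1])  # False: each slot has its own list

# Assigning to a slot rebinds only that slot; it does not touch the
# shared list, so the other two slots stay empty.
t[0] = ['bye']
print(t)             # [['bye'], [], []]
```

So the surprise only bites when you mutate the shared object in place (append, extend, and so on), not when you assign a new object to one position.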

--Arkajit

1 comment:

Anonymous said...

Whoa. Freaky.