One of the fun things to do with Python is to use the language to bend the rules of programming. One neat way of using your resources wisely is to create objects in the right places, but hold off on any costly data gathering operations until you absolutely need the data.
So-called ‘late data initialisation’ is useful in certain kinds of programs. I document here the best pattern I’ve found to turn a Python list into a late data class:
    class newList(list):
        @property
        def data(self):
            # Generate the data on first access only.
            if self.populate:
                self.populate()
            return self

        def populate(self):
            print("Generating Now")
            for x in range(42):
                self.append(x)
            # Clobber the method on this instance so it never runs again.
            self.populate = None

        def __getitem__(self, key):
            return super(newList, self.data).__getitem__(key)

        def __len__(self):
            return super(newList, self.data).__len__()

        def __iter__(self):
            return super(newList, self.data).__iter__()
Basically, populate can be any piece of code which calls whatever it needs in order to gather the data. It’ll only be called once, and to avoid needing a flag variable set up in __init__ we just clobber the method attribute on the instance. Sneaky!
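To see why the clobbering is safe, here is a minimal sketch of the trick on its own. The Loader class is hypothetical and only there for illustration; the point is that assigning through self creates an instance attribute that hides the method for that one object, while the class and every other instance keep the real method.

```python
class Loader(object):
    """Hypothetical class, only here to show the shadowing trick."""

    def populate(self):
        # Assigning through self creates an instance attribute that hides
        # the method defined on the class, for this one object only.
        self.populate = None


first = Loader()
second = Loader()
first.populate()

print(first.populate)              # None
print(callable(second.populate))   # True
print("populate" in vars(first))   # True
print("populate" in vars(second))  # False
```

So each list gets generated once on its own schedule, and a fresh instance always starts with a working populate.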
When the list is used for an iteration (a for loop), or if you delve into it directly for a specific item, or need to know its length, then we generate the data behind our object and have it contained within the list object we’re inheriting from. No need for secondary list object variables dangling off of self. Ugly! The catch is that this pattern needs a new wrapper method around super for every other use you want to put the object to (i.e. string, representation, slicing, dicing, mincing or conditionalising), to make sure the data is generated if that’s the first way the list gets used.
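One way to cut down on the hand-written wrappers is to generate them in a loop. This is only a sketch under my own assumptions, not a finished answer: LazyList, _LAZY_METHODS, _ensure_loaded and _make_lazy are all made-up names, and the list of wrapped methods is just a starting set. It also swaps the clobbered method for a class-level flag, which still avoids writing an __init__.

```python
class LazyList(list):
    """Hypothetical variant of newList that wraps list methods in a loop."""

    # Names of list methods that should trigger loading before they run.
    _LAZY_METHODS = (
        "__getitem__", "__len__", "__iter__", "__contains__",
        "__repr__", "__eq__", "__reversed__", "index", "count",
    )

    _loaded = False  # class-level default, so no __init__ is needed

    def populate(self):
        # Stand-in for the costly data gathering step.
        self.extend(range(42))

    def _ensure_loaded(self):
        if not self._loaded:
            self._loaded = True  # set first, so populate() can safely use self
            self.populate()


def _make_lazy(name):
    original = getattr(list, name)  # the plain list implementation

    def wrapper(self, *args, **kwargs):
        self._ensure_loaded()
        return original(self, *args, **kwargs)

    wrapper.__name__ = name
    return wrapper


for _name in LazyList._LAZY_METHODS:
    setattr(LazyList, _name, _make_lazy(_name))


items = LazyList()
print(list.__len__(items))  # 0, the raw length before any wrapped access
print(len(items))           # 42
print(7 in items)           # True
```

Anything not named in _LAZY_METHODS (sort, copy, + and so on) still goes straight to the underlying list without loading, so the tuple has to cover every way you expect the object to be used first.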
What are your thoughts about this pattern? Do you know a better pattern that fixes the __init__ issue?