![[personal profile]](https://www.dreamwidth.org/img/silk/identity/user.png)
itertools, which gives you the building blocks for pretty much the entire STL <algorithm> library, complete with built-in exceptions.
Python's index builtin, which is a list method that takes an object x as its argument, returns the index of the first position in the list where x is found. This is all well and good if all you care about is verifying that a certain element is in the sequence -- in fact, it's overkill, as "if x in seq" is designed for this. But what do you do if you have a sequence of objects with various attributes, and you want those for which some predicate about an attribute is true, e.g. "every Child whose Age is greater than 12"?
In C++, this is why we have find_if, part of the <algorithm> library. Give find_if the beginning and the end of the sequence you want to search, and a predicate to test with, and find_if will return a pointer to the first item in the sequence for which the predicate returns true (i.e., the first child older than 12), or the last item in the sequence if the predicate never returns true. (This is why calls to find_if usually look like find_if(foo.begin(), foo.end(), ...), because .end() returns an iterator one past the last data item in the sequence.)
Unfortunately, writing predicates in C++ is a whole lot of pain in the ass. The <functional> library gives you a whole raft of template functions (and functors, which I'll get into in a moment) which will let you perform interesting operations on POD ("plain ol' data") types, e.g. ints and chars. So if you've got a list<age>, and age is just a typedef for unsigned int, you're fine. If not, then things get complicated.
See, that predicate I mentioned has to be a function object, not just a function. Now, the STL provides some nice adaptors to regular old functions that will turn them into function objects for you, such as ptr_fun. If all you need to do is, say, determine whether the absolute value of a value in your list is greater than 100, you can use ptr_fun, possibly compose1 and the bind family of functor adapters (bind1st and bind2nd) to compose your predicate, e.g.
Suppose that you have a list of pairs of integers, and that you want to identify the first element in the list whose .first member is greater than 0. The STL provides a Unary Functions (and their slightly more self-knowledgeable cousins, Adaptable Unary Functions) from which a function object can inherit. All you need to do (she said, with a strong sense of irony) is specialize the base class's template parameters for the appropriate argument and return values, then overload the operator() method. So, in this case, we'd want:
And now that we have this, we can use it where we used ptr_fun(abs) above. Voila. Or perhaps I should say "whew". That's a lot of work, and a lot of not-particularly-newbie-ish knowledge about the STL.
Compare this to the Python approach, where functions are data just like anything else -- and you can create functions on the fly using lambda. Now, consider the situation I was in this afternoon: I have an XML document which contains multiple elements with the tag <foo>, and each of these elements has a name attribute. I want to find the first <foo> whose name is "bar". The xml.dom library has a method, .getElementsByTagName, which ... does exactly what it says, and returns those elements as a list. But I don't want a list; I just want the first item in that list which has a name equal to "bar".
At long last, this is where itertools comes in. Specifically, itertools.ifilter.
I'll let you read the code, and then point out what I'm doing.
And that's it. Yes, really. ifilter(pred, seq) creates a generator which will return for me all elements in seq for which pred returns true. I create my predicate on the fly using a lambda expression ("for everything which gets passed to me, call the .getAttribute method on it, and if the result of that is equal to "bar", return True, otherwise return False"), pass in the sequence, and because all I care about is the first element for which my predicate is true, I call .next() to get the first thing the generator spits out. (Otherwise node would be a generator object, not an XML node.) It behaves just like find_if, with a whole lot less hassle.
Now, that said, find_if and ifilter both let you operate on arbitrary slices of a sequence (using slicing in Python, iterator arithmetic in C++), but I don't see a good way to operate on conditional slices of a sequence in Python (though the syntactic complexity of doing it in C++ leads me to question whether that's actually a good way). But I'll think on this, and if I come up with something good, I'll let you know.
(Thanks to
yoctohedron for the productive line of inquiry.)
Python's index builtin, which is a list method that takes an object x as its argument, returns the index of the first position in the list where x is found. This is all well and good if all you care about is verifying that a certain element is in the sequence -- in fact, it's overkill, as "if x in seq" is designed for this. But what do you do if you have a sequence of objects with various attributes, and you want those for which some predicate about an attribute is true, e.g. "every Child whose Age is greater than 12"?
In C++, this is why we have find_if, part of the <algorithm> library. Give find_if the beginning and the end of the sequence you want to search, and a predicate to test with, and find_if will return a pointer to the first item in the sequence for which the predicate returns true (i.e., the first child older than 12), or the last item in the sequence if the predicate never returns true. (This is why calls to find_if usually look like find_if(foo.begin(), foo.end(), ...), because .end() returns an iterator one past the last data item in the sequence.)
Unfortunately, writing predicates in C++ is a whole lot of pain in the ass. The <functional> library gives you a whole raft of template functions (and functors, which I'll get into in a moment) which will let you perform interesting operations on POD ("plain ol' data") types, e.g. ints and chars. So if you've got a list<age>, and age is just a typedef for unsigned int, you're fine. If not, then things get complicated.
See, that predicate I mentioned has to be a function object, not just a function. Now, the STL provides some nice adaptors to regular old functions that will turn them into function objects for you, such as ptr_fun. If all you need to do is, say, determine whether the absolute value of a value in your list is greater than 100, you can use ptr_fun, possibly compose1 and the bind family of functor adapters (bind1st and bind2nd) to compose your predicate, e.g.
compose1(bind2nd(greater<int>(), 100), ptr_fun(abs))But this is already kind of hard to read, and if you need to operate on sequences of anything more complicated than a primitive, you're going to have to write a helper function. Or, more properly, a helper functor.
Suppose that you have a list of pairs of integers, and that you want to identify the first element in the list whose .first member is greater than 0. The STL provides a Unary Functions (and their slightly more self-knowledgeable cousins, Adaptable Unary Functions) from which a function object can inherit. All you need to do (she said, with a strong sense of irony) is specialize the base class's template parameters for the appropriate argument and return values, then overload the operator() method. So, in this case, we'd want:
template<class T, class U> struct getFirst : unary_function<pair<T, U>, T> { T operator()(pair<T, U> x) { return x.first } };
And now that we have this, we can use it where we used ptr_fun(abs) above. Voila. Or perhaps I should say "whew". That's a lot of work, and a lot of not-particularly-newbie-ish knowledge about the STL.
Compare this to the Python approach, where functions are data just like anything else -- and you can create functions on the fly using lambda. Now, consider the situation I was in this afternoon: I have an XML document which contains multiple elements with the tag <foo>, and each of these elements has a name attribute. I want to find the first <foo> whose name is "bar". The xml.dom library has a method, .getElementsByTagName, which ... does exactly what it says, and returns those elements as a list. But I don't want a list; I just want the first item in that list which has a name equal to "bar".
At long last, this is where itertools comes in. Specifically, itertools.ifilter.
I'll let you read the code, and then point out what I'm doing.
from itertools import ifilter from xml.dom.minidom import parse doc = parse("foo.xml") node = ifilter(lambda x: "bar" == x.getAttribute("name"), doc.getElementsByTagName("foo")).next()
And that's it. Yes, really. ifilter(pred, seq) creates a generator which will return for me all elements in seq for which pred returns true. I create my predicate on the fly using a lambda expression ("for everything which gets passed to me, call the .getAttribute method on it, and if the result of that is equal to "bar", return True, otherwise return False"), pass in the sequence, and because all I care about is the first element for which my predicate is true, I call .next() to get the first thing the generator spits out. (Otherwise node would be a generator object, not an XML node.) It behaves just like find_if, with a whole lot less hassle.
Now, that said, find_if and ifilter both let you operate on arbitrary slices of a sequence (using slicing in Python, iterator arithmetic in C++), but I don't see a good way to operate on conditional slices of a sequence in Python (though the syntactic complexity of doing it in C++ leads me to question whether that's actually a good way). But I'll think on this, and if I come up with something good, I'll let you know.
(Thanks to
![[livejournal.com profile]](https://www.dreamwidth.org/img/external/lj-userinfo.gif)
(no subject)
Date: 2006-03-15 03:37 am (UTC)(no subject)
Date: 2006-03-15 03:39 am (UTC)x.erase(remove_if(x.begin(), x.end(), is_odd), x.end());
(no subject)
Date: 2006-03-15 06:25 am (UTC)(I mean, yes, I could have also written a function like int getFirst(pair<int, int>), which is indeed less complicated, but also type-bound. And when you need to use equal_to or greater or one of those conditions, you do need a functor, which means either ptr_fun or writing your own structs.)
(no subject)
Date: 2006-03-15 06:29 am (UTC)(no subject)
Date: 2006-03-15 06:55 am (UTC)Don't get me wrong here: I have a lot of admiration for the flexibility and power of the STL, and as I said above, there are places where the STL idiom trumps the Python idiom in terms of flexibility. But the tradeoff in terms of baroque syntax and ramp-up time can be a real pain in the ass.
(no subject)
Date: 2006-03-15 06:58 am (UTC)Well, there is a way to use class methods. That's the way.
(no subject)
Date: 2006-03-15 09:48 pm (UTC)And, all that aside, what I said about readability still holds.
Grf. My ideal language has lambdas, functions as first-order data, and templates. But it doesn't exist.
(no subject)
Date: 2006-03-15 11:50 pm (UTC)For what templates were originally intended to do, the C++ template facility works just fine and is admirably readable. However, the capabilities it was discovered to bring to the table... the syntax for those capabilities is accidental. That it exists, I'm grateful. That it's as legible as Perl code written after a one-night bender, I curse it.
Insert my standard rant here about C++ template metaprogramming. Your ideal language does exist, and is Scheme and/or Common LISP.
Lambdas? Yep. Functions as first-order data? Yep. Templates? Yep.
C++ template metaprogramming is just a bad-syntax way of doing what LISP and Scheme programmers have been doing for years with macros. :)
(no subject)
Date: 2006-03-16 01:22 am (UTC)(no subject)
Date: 2006-03-15 08:53 am (UTC)List.exists : ('a -> bool) -> 'a list -> bool
A friend of mine and I worked on a year ago where we used those functors you described quite heavily. It was truly sick, but a wonder to behold.
(no subject)
Date: 2006-03-15 08:54 am (UTC)