maradydd: (Default)
"You are Zaphod Beeblebrox?" it squeaked.

"Yeah," said Zaphod, "but don't shout it out or they'll all want one."

"The Zaphod Beeblebrox?"

"No, just a Zaphod Beeblebrox, didn't you hear I come in six-packs?"
--Douglas Adams, The Restaurant at the End of the Universe
maradydd: (Default)
One of the cooler things about living in Belgium is that it is basically impossible to live in a reasonably-sized town and not be within a couple of blocks of a bakery. (We're a block from one, and within four blocks of two more.) This has had a really positive impact on my life in terms of breakfast. Every morning, post-caffeine, I hike over to our nearest bakery, which is also a candy shop, and pick up a bunch of fresh pastries to start enochsmiles' and my day.

This morning, the baker -- a short, apple-cheeked woman who in thirty years will look like every cartoon Mrs. Claus you've ever seen -- was laying out a tray of Santa-shaped chocolates as I walked in. "Oh, Sinterklaas?" I asked. "Nee," said the baker. "Kerstman!" This threw me, since I knew that the English "Santa Claus" is a borrowing of the Dutch "Sinterklaas", which of course is a contraction of "Sint-Nicolaas" (St. Nicholas). Come to find out, after Anglophone culture borrowed Sinterklaas and morphed him into Santa Claus, Dutch (and, by extension, Belgian) culture borrowed him right back as Kerstman ("Christmas man"). So now we have two St. Nicholases (Nicholi?), one who brings presents on December 5th, one who brings presents on December 25th.
maradydd: (Default)
Seriously. I wish I'd had this as an example when I was teaching about prestige dialects in sociolinguistics; he knows the theory and the practice.

maradydd: (Default)
...but the kriek is so tasty.

Anyway, I was recently snarked at on IRC for uttering the following sentence:

I.e., you start with the functional spec, you hand it in to the professor, he grades it, then you do whatever the next bit of the process is?
The snarker in question took issue with my use of the word 'he' as the anaphor, or "pronoun that refers to a previously introduced noun", for "professor". I remarked that in my dialect, 'he' is the commonly accepted third-person singular gender-neutral pronoun. "Oh, complete with gender-neutral penis?" snarked the snarker. I offered to use 'it', and remarked that while 'they' has become more common when the referent of a pronoun is known to be gendered but that gender is unknown, using a plural pronoun plays hell with my sense of number.

Thinking further about the original sentence, it occurred to me that "it grades it" also plays merry hell with my sense of syntax -- in this case, anaphora resolution.

Terminology time! An anaphor, plural anaphora, is a pronoun which refers to something. (Why, yes, there can be pronouns which don't refer to anything -- the "weather 'it'" of "It's raining" is one example.) A referent is the noun to which an anaphor refers. Anaphora resolution is the process of matching up anaphora and their referents in a sentence.

So. In our example sentence, we have three possible referents: you, the person being addressed; the professor, a third person animate entity of currently-unknown gender; and the functional spec, a third person inanimate entity, thus of no (or, neutral) gender (well, at least in English).

We also have a semantic structure which we need to encode: grades(the professor, the functional spec). (I've placed this in predicate logic format. Since both referents are definite, as opposed to "some professor" or "all the functional specs", we don't need to use any of predicate logic's nifty symbols. We also know that both anaphora will be singular, because both our referents are singular.)

I encoded this semantic structure with the phrase "he grades it", which is a complete sentence being used as a phrase. Syntactically, I would encode it as [IP [NP He] [I' [I [-Pst]] [VP [V grades] [NP it]]]]. (Sorry, not really sure how to do trees on LJ.)

So, let's look at the possible ways to resolve "he grades it", given our current scope, or "what referents do we have available?" Both 'he' and 'it' are third person pronouns, so that rules out you. The professor could be 'it', but that means that the functional spec would have to be 'he', which isn't possible, because as we already said, the functional spec is inanimate, and 'he' applies to animate referents. Thus, the functional spec resolves to 'it', and as we already ruled out you, the professor must be 'he'. There is only one syntactically legal resolution for all the anaphora in the sentence.

But some people object to calling the professor 'he' when we don't know whether the professor is male or female, because they argue that the speaker is assuming that only men are smart enough to be professors. WTFever, I'm a chick and you're listening to me school you, so you already know that I know better than that; STFU and keep reading. I'm going to explain why 'he' is a more reasonable anaphor for that position than any alternative that was put forward.

As you'll recall, two options were discussed: 'it' and 'they'. We'll take 'it' first, because it's the general case.

All we really know, when the phrasing is "it grades it", is that whatever 'it' is, it is not grading itself -- otherwise 'itself' would be the second anaphor. We also know that you can't be the referent either, so we have two possible assignments: grades(the professor, the functional spec) or grades(the functional spec, the professor). Since 'it' can be either animate or inanimate ("Who put the dog in the trash compactor?" "I put it there."), "it grades" is an acceptable phrasing even though "grades" needs an animate ACTOR, so this syntactic coding is acceptable. Thus, the syntax parts of the brain pass a validated parse tree to the semantics parts of the brain to perform anaphora resolution. It is more likely that a professor will grade a functional spec than vice versa; in fact, the latter idea is kind of silly, so that reading is "marked". (In optimality theory terms, we might say that it falls hors de combat.) Having to determine which reading is more likely is an extra step that the 'he' case does not require.
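For the programmers in the audience, here is a minimal sketch of that extra semantic step. It assumes the two readings that survived the syntactic check are already in hand, and it uses a made-up scoring rule (graders should be animate, graded things should not be) as a stand-in for whatever the semantics parts of the brain actually do. The names `CANDIDATES`, `ANIMATE` and `plausibility` are mine, purely for illustration.

```python
# The two readings of "it grades it" that pass the syntactic check,
# written as (grader, thing graded).
CANDIDATES = [
    ("the professor", "the functional spec"),
    ("the functional spec", "the professor"),
]

# Assumed animacy of each referent.
ANIMATE = {"the professor": True, "the functional spec": False}


def plausibility(reading):
    """Hypothetical score: one point if the grader is animate,
    one point if the thing being graded is inanimate."""
    grader, graded = reading
    score = 0
    if ANIMATE[grader]:
        score += 1
    if not ANIMATE[graded]:
        score += 1
    return score


# Picking the likelier reading is the extra work that "he grades it" avoids.
best = max(CANDIDATES, key=plausibility)
print(best)
```

The silly reading is never ruled out here, only outscored, which is roughly what "marked" is getting at.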

Now to consider 'they'. Remember, we had three possible referents, all singular. 'They' is a plural pronoun. Syntactically speaking, 'they' does not fit anywhere in the tree, because there is no plural referent for it to refer to. I'll be honest, I'm not quite sure what happens next, because I know very little about how the brain processes language, but my best guess is that 'they' gets downgraded to 'it' (number being the most common difference between 'they' and the possible referents) and the same process as before occurs. (Of course, now that 'they' is becoming more common as an anaphor for 'singular gendered animate of currently unknown gender', people may be rewriting their own syntax rules.)

Anyway, in the end, this gets me thinking about computational linguistics and how to write language generators that generate correct and non-confusing syntax. In the 'he grades it' case, we created an encoding using anaphora which had only one valid reading upon decoding. In the 'it grades it' case, the encoding has two possible readings and must be further decoded by a different piece of the language mechanism. In the 'they grade it' case, there's actually no strictly valid reading at first (due to number disagreement), and other encodings have to be tried. It is thus important for a language generator to consider which encoding will be computationally cheapest to decode, before it transmits a sentence to a listener.
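As a rough sketch of the comparison above, the three encodings can be checked by brute force: give each referent a small feature bundle, give each pronoun the features it demands, and count how many assignments survive. The feature names and the `readings` helper are my own assumptions for illustration, not a claim about how any real generator (or brain) does it.

```python
from itertools import permutations

# Assumed feature bundles for the three referents in scope.
REFERENTS = {
    "you": {"person": 2, "number": "sg", "animate": True},
    "the professor": {"person": 3, "number": "sg", "animate": True},
    "the functional spec": {"person": 3, "number": "sg", "animate": False},
}

# Features each pronoun requires of its referent.
# A missing key means the pronoun does not care about that feature.
PRONOUNS = {
    "he": {"person": 3, "number": "sg", "animate": True},
    "it": {"person": 3, "number": "sg"},
    "they": {"person": 3, "number": "pl"},
}


def compatible(pronoun, referent):
    """True if the referent has every feature the pronoun requires."""
    required = PRONOUNS[pronoun]
    features = REFERENTS[referent]
    return all(features[name] == value for name, value in required.items())


def readings(subject, obj):
    """All (subject referent, object referent) pairs that pass the feature check.
    The two referents must differ, since otherwise a reflexive would be used."""
    return [
        (s, o)
        for s, o in permutations(REFERENTS, 2)
        if compatible(subject, s) and compatible(obj, o)
    ]


for subject in ("he", "it", "they"):
    print(subject, "...", "it:", len(readings(subject, "it")), "reading(s)")
```

Under these assumptions 'he' leaves one reading, 'it' leaves two, and 'they' leaves none, so a generator that wanted the cheapest decode would pick the encoding with exactly one surviving reading.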

Either that, or English needs a pronoun which signifies 'singular gendered animate of currently-unknown gender', and I'll let getting that into the language be your problem. Until then, the OED and I will say 'he grades it' until you tell me that your professor is a woman.

ETA: ... and of course this is interesting to me as a computer scientist, because it hints at a potentially NP-complete problem embedded in our neurological language framework: "most effective assignment of anaphora". Of course, n is not particularly large in most cases, but we are talking about encodings that have to be decoded in realtime, and as the number of referents and anaphora in a sentence increases, the number of possible encodings rises as a permutation, which gets very large very fast...
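To put rough numbers on "very large very fast": assuming each anaphor has to pick a different referent, the raw count of assignments for k anaphora and n referents is n!/(n-k)!. The sketch below just does that arithmetic; `assignment_count` is a name I made up, and counting candidates says nothing by itself about which complexity class the problem lands in.

```python
from math import perm  # available in Python 3.8 and later


def assignment_count(referents, anaphora):
    """Number of ways to give each anaphor a distinct referent,
    before any feature checks prune the candidates."""
    return perm(referents, anaphora)


# (referents in scope, anaphora in the sentence)
for n, k in [(3, 2), (5, 3), (10, 5)]:
    print(f"{n} referents, {k} anaphora: {assignment_count(n, k)} assignments")
```

For the example sentence that is 6 raw assignments, which person, number and animacy then cut down to one or two; with 10 referents and 5 anaphora it is already 30240.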
maradydd: (Default)
It should not take three and a half fucking hours to convince a computer to print out a PDF generated from a PostScript file generated from LaTeX. Even if that computer is a crippled Windows box in a business centre of an apartment complex.

I mean, really, Ghostscript doesn't exactly make it clear how to configure GSview to look for fonts in weird places (*grumblegrumbledirectorypermissionsyouasshats*), nor is it particularly obvious where one goes about finding the PostScript font binaries so that the .ps will load up at all, and it certainly doesn't help that the folks maintaining the Ghostscript mirrors don't seem to have checked the md5sums of their own archive recently (note to archive maintainers: when I download the same file several times and I keep getting the same incorrect hash, it is probably your fault, kthxbye), but Jesus Cluny Fuck, it takes talent to write LaTeX that incredibly awful.

Sigh. Off to go do some more editing. deannahoak, sorry for disappearing for so long.
(Disproven hypothesis, below the cut)
And with that little bit of investigation done, back to work.
maradydd: (Default)
BoingBoing reports on an unintentionally (?) hilarious lead sentence for an article about monks who wanted to watch the Pope's funeral on TV:
Catholic monks living on an island off the coast of Wales have flown in a satellite dish to watch the Pope's funeral.
They refer to the sentence as a "crime against the English language," an assessment which I find, well, unfortunate. It's merely an example of my personal favourite kind of structural ambiguity, particle verb/prepositional phrase homonymy. (A similar example appears in the anecdotal World War II ambiguous headline, "British push bottles up German rear", though there you've also got homonymy on the third-person singular verb/zero-derivation noun "push" at work. See also several entries in the tombstone game, particularly "The Sheep Look Up John Brunner".)

If this were Language Log or Semantic Compositions, I'd have something pithy and meaningful to say about ambiguity and what a fascinating and important problem it is (just ask oralelk), but (1) this isn't, (2) I'm tired, and (3) the humour value entertains me more than the research about it, at least while I'm working in a different field. But it is a neat problem, and one which I find especially confounding because I don't see a visible pattern in what makes one reading normal and one funny; clearly no one else does either, otherwise it wouldn't be an open problem. (On the other hand, given that my current avenue of research deals with extracting patterns from human intuition without requiring humans to specify why their intuition says what it does, maybe it's related after all.)
maradydd: (Default)
Weber: (6:36pm) Actually, here's a lesson on presuppositions. The phrase "her BDSM porn" carries the presupposition that she *has* BDSM porn. You can't interpret the phrase if the presupposition isn't true. It's distinct from normal assertions in that it survives if you put a negation or question around it; e.g. "I don't like her BDSM porn", or "Do you think her BDSM porn is any good?"
----
SETTING: My office, after the Language and Society final. LEAH and MEREDITH are doing a Mad Lib to unwind.

LEAH: Give me a noun.
MEREDITH: Consonant.
LEAH: A celebrity.
MEREDITH: Chomsky.
LEAH: Okay, a food.
MEREDITH: Beans!*
LEAH: A liquid?
MEREDITH: L.**

* "Beans, I like" is a commonly used example of topicalisation. At least it is here.
** /l/ and /r/ are classed as "liquid" consonants.
