Small world

Jan. 2nd, 2008 07:52 pm
maradydd: (Default)
[personal profile] maradydd
There's a post up on BoingBoing today (ok, yesterday for me) about open vs. closed search algorithms, suggesting that the search algorithms used by Google, Yahoo et al. are bad because of their lack of transparency. It invokes a comparison to an important concept in computer security: "security through obscurity" is dangerous because an effective encryption scheme should be equally hard to break whether or not you know the internals of the algorithm that generated the ciphertext.

I think comparing this to search is a bad (or at best misleading) idea, and expounded on this in the comments. But I'm far more entertained by the fact that the two best comments on the post so far come from two sources with whom I am tangentially familiar, albeit from totally different directions: [livejournal.com profile] jrtom and [livejournal.com profile] radtea. Small damn world!

(no subject)

Date: 2008-01-04 06:05 pm (UTC)
From: [identity profile] enochsmiles.livejournal.com
Where does your entropy for "the random version of Algorithm A" come from?

Hand-waving?

(no subject)

Date: 2008-01-04 06:22 pm (UTC)
From: [identity profile] jrtom.livejournal.com
I'm not a crypto algorithms expert, so let me make sure that I understand what you mean by 'entropy' in this case. I'm guessing that you mean "what means does A employ to generate its random sequence?"; if not, please clarify.

*shrug* In any case, I'm not sure it really matters that much, but I had in mind something like a simple LCG, whose outputs are in the range [1,n] where n is the number of web pages you've got.
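
For concreteness, a rough sketch of the kind of generator I have in mind (toy constants and made-up names of my own choosing; no claim that any real search engine works this way):

    # A plain LCG: state -> (a*state + c) % m, with the result folded into
    # [1, n] to pick a "random" slot among n pages.  The constants are the
    # textbook ANSI C example values, here purely for illustration.

    def make_lcg(seed, a=1103515245, c=12345, m=2**31):
        state = seed
        def step():
            nonlocal state
            state = (a * state + c) % m
            return state
        return step

    def random_rank(step, n):
        return step() % n + 1       # fold the raw output into [1, n]

    next_state = make_lcg(seed=42)
    n_pages = 1_000_000             # stand-in for "number of pages indexed"
    print([random_rank(next_state, n_pages) for _ in range(5)])

The point being that everything after the seed is completely determined by a, c, and m.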

As I said, though, there are additional sources of entropy in this system, those being the other users' queries (which cause invocations of the LCG that you don't see) and the addition of pages to the system (which causes the LCG to get reset with a different value for n, at times that you can't observe or predict with precision, even if you were the only one submitting queries).

As a non-expert who hasn't studied crypto in a long time, I'm assuming that an attack would require being able to generate an uninterrupted sequence of observations of the output of the LCG. I'm quite possibly mistaken on that point.

(no subject)

Date: 2008-01-04 06:40 pm (UTC)
From: [identity profile] enochsmiles.livejournal.com
"what means does A employ to generate its random sequence?"

Close enough. I'm talking about the quality of the randomness of the "random" sequence. See Shannon.

but I had in mind something like a simple LCG

pwnd.

You could have said "use a cryptographically secure PRNG" (which reduces to "use the random oracle in my proof", but since we're practitioners here and not theorists, we don't care -- we just have to recognize that the output of the PRNG may, in fact, be deterministic despite what we think now), or perhaps "use a noisy diode or some other truly random source of entropy", though that's... expensive, in terms of entropy, and probably infeasible given the system we're talking about. But LCGs are not suitable for this, or for most use cases that require a secure entropy source. (I'm rather swamped with some work right now, or I'd dig up the nice LCG lattice work that was done... a while ago; not only are LCGs deterministic, they're also easily precomputable for many inputs, some of which do not even produce random sequences!)
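
To make "easily precomputable" concrete, here's a throwaway sketch (my own toy code; it assumes the attacker sees raw, untruncated LCG outputs and already knows the modulus -- the real attacks don't need either, these assumptions just keep the example short):

    # Recovering an LCG's multiplier and increment from three consecutive
    # raw outputs when the modulus m is known.  (Given a few more outputs,
    # m itself falls out of a gcd; this is just the easiest case.)
    # Uses pow(x, -1, m) for the modular inverse, so Python 3.8+.

    def lcg_outputs(seed, a, c, m, count):
        out, s = [], seed
        for _ in range(count):
            s = (a * s + c) % m
            out.append(s)
        return out

    def crack_known_modulus(s0, s1, s2, m):
        # s2 - s1 == a * (s1 - s0)  (mod m), so a is one modular inverse away;
        # this needs gcd(s1 - s0, m) == 1, which holds in the run below.
        a = ((s2 - s1) * pow((s1 - s0) % m, -1, m)) % m
        c = (s1 - a * s0) % m
        return a, c

    m = 2**31
    observed = lcg_outputs(seed=123456789, a=1103515245, c=12345, m=m, count=4)
    a, c = crack_known_modulus(*observed[:3], m)
    print("recovered a, c:", a, c)                  # the textbook constants above
    print("predicted:", (a * observed[2] + c) % m, "actual:", observed[3])

Truncating or folding the outputs makes the attacker's life harder, but not cryptographically hard.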

But I think my point has been made. We don't know how to design algorithms that produce truly random output from a fixed input seed -- sure, we can do things like change the input seed before the expected period of the algorithm elapses, etc., or we can go the expensive route and produce one random bit of input per random bit of output (as is needed to make one-time pads unconditionally secure), but even if you're talking about the "one-time-pad" equivalent of a search algorithm, you still have that problem of truly random "randomness" to contend with.
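
If you actually do want the "cryptographically secure PRNG" route in practice, the cheap version is to lean on the operating system's pool, roughly like this (a sketch, assuming the kernel's pool was seeded with real entropy in the first place, which is exactly the part implementors keep getting wrong):

    import secrets

    def random_rank(n):
        # Draw a rank in [1, n] from the OS CSPRNG (secrets sits on top of
        # os.urandom).  Unpredictable to an outside observer *provided* the
        # kernel's pool was seeded with real entropy to begin with.
        return secrets.randbelow(n) + 1

    print([random_rank(1_000_000) for _ in range(5)])

None of which helps, of course, if the underlying pool is predictable -- which is the point of the SSL story below.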

This isn't a trivial problem. The first version of SSL, in Netscape, was broken because of a flawed PRNG (one much stronger on paper than an LCG, but poorly seeded), and we're still seeing these problems in systems being designed right now. Dismissing the last 63 years of research into the entropy of a presumed-random stream of data is a mistake that far too many implementors make.

(no subject)

Date: 2008-01-04 06:58 pm (UTC)
From: [identity profile] jrtom.livejournal.com
pwnd

Oh, quite. I rather expected that. :)

I should have clarified: I know that LCGs are not, shall we say, precisely the cutting edge of PRNGs. I'd assumed that a successful attack would at least be facilitated by being able to observe an uninterrupted sequence of outputs; perhaps that's irrelevant.

I still believe that, due to certain practical concerns, subverting Algorithm A would be at least very difficult (in the same way that breaking the output of a good crypto algorithm via brute force is very difficult), but I acknowledge that I may not have technically demonstrated that it's _impossible_.

Rest assured, if I ever need a good (i.e., cryptographically sound) source of randomness, I'm not going to assume that I already know how to do that. :)

(no subject)

Date: 2008-01-04 07:02 pm (UTC)
From: [identity profile] maradydd.livejournal.com
Not necessarily an uninterrupted sequence of outputs; ISTR that polynomial interpolation is a common first step in figuring out a PRNG's seeds. In any case, attacks that work on stream ciphers are frequently useful against PRNGs as well (maybe not in the general case, but there are equivalences between certain stream ciphers and certain PRNGs).
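
As a toy illustration of the interpolation idea (a made-up quadratic congruential generator over a small, known prime; not any deployed PRNG): a few consecutive outputs give you (state, next-state) pairs, and Lagrange interpolation over the field recovers the update polynomial, after which every future output is predictable.

    # Toy quadratic congruential generator: s -> 3*s^2 + 7*s + 5 (mod 101).
    # The "attacker" sees only the outputs below and knows the prime modulus.
    # Needs Python 3.8+ for pow(x, -1, p).

    P = 101
    outputs = [2, 31, 75, 33]                       # s0..s3

    def lagrange_coeffs(points, p):
        # Coefficients (lowest degree first) of the unique polynomial of
        # degree < len(points) through the given points, working mod p.
        n = len(points)
        coeffs = [0] * n
        for i, (xi, yi) in enumerate(points):
            basis = [1]                             # running product of (x - xj), j != i
            denom = 1
            for j, (xj, _) in enumerate(points):
                if j == i:
                    continue
                new_basis = [0] * (len(basis) + 1)
                for k, b in enumerate(basis):
                    new_basis[k] = (new_basis[k] - xj * b) % p
                    new_basis[k + 1] = (new_basis[k + 1] + b) % p
                basis = new_basis
                denom = (denom * (xi - xj)) % p
            scale = (yi * pow(denom, -1, p)) % p
            for k, b in enumerate(basis):
                coeffs[k] = (coeffs[k] + scale * b) % p
        return coeffs

    pairs = list(zip(outputs, outputs[1:]))         # (s_i, s_{i+1}) points
    f = lagrange_coeffs(pairs, P)
    print("recovered update polynomial:", f)        # -> [5, 7, 3]
    nxt = sum(c * pow(outputs[-1], k, P) for k, c in enumerate(f)) % P
    print("predicted next output:", nxt)            # -> 69

An LCG is just the degree-one case of the same trick.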

(no subject)

Date: 2008-01-04 07:04 pm (UTC)
From: [identity profile] enochsmiles.livejournal.com
Rest assured, if I ever need a good (i.e., cryptographically sound) source of randomness, I'm not going to assume that I already know how to do that. :)

Alas, many of your colleagues will.

(And you're right that it makes it harder on the attacker if they don't have an uninterrupted stream, but you're vastly overestimating how much harder, especially in the LCG case.)

(no subject)

Date: 2008-01-04 07:00 pm (UTC)
From: [identity profile] enochsmiles.livejournal.com
It occurs to me that we're running up against something I see in academia/industry time and time again. One field knows that something is trivially breakable, while another field has no notion that there's anything wrong with it. But the two fields don't talk, because field A doesn't see anything interesting in field B, and field B doesn't see how field A applies (because, likewise, field A is uninteresting to field B).

A lot of my success in research has simply been the putting-together of an insight from field A, a problem from field B, and an observation from field C to answer a question in field D. I sort of think that's cheating, but hey, it tends to advance the fields involved, either by leading them to better answers, or showing them what research has utility (gasp) to other fields.

And this is all within the field of computer science! Science in general has gotten too big, too fragmented, and too specialized to do anything but continue to build more narrow tunnels toward nothing. It's quite frustrating, but I'm getting well off topic here.
