Cyrus POP3 Issue

Henrique de Moraes Holschuh hmh at debian.org
Fri Mar 4 12:52:49 EST 2005


On Fri, 04 Mar 2005, Marco Colombo wrote:
> First of all, let me state that I don't really believe that attacking the
> internal state of the kernel PRNG is practical at all. It is possible,
> in theory. Using /dev/urandom is pretty safe for many real-world uses.

Which *was* my whole point: that in real-world usage, we would be better off
with SASL using /dev/urandom.  OTOH, gnupg and the openssl x509 stuff had
better use /dev/random...
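
To make the practical difference concrete, here is a rough Python sketch of
the two device nodes (just the standard Linux devices; this is an
illustration, not how libsasl actually reads them):

def session_key(nbytes=16):
    # /dev/urandom never blocks; output quality degrades gracefully when
    # the accounted entropy runs low, which is the trade-off at issue here.
    with open("/dev/urandom", "rb") as f:
        return f.read(nbytes)

def long_term_key(nbytes=16):
    # /dev/random blocks until the kernel has accounted roughly nbytes*8
    # bits of fresh entropy -- what you want for gnupg or X.509 key
    # material, at the price of stalling under load.
    with open("/dev/random", "rb") as f:
        return f.read(nbytes)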

> We're discussing theory here.

If we are going to go down that road, the explanation is very simple: we
have to use /dev/random always.  There is at least one far-fetched scenario
where the attacker can keep the entropy pool empty (even the primary one
that the kernel reserves for IP traffic and other high-priority entropy
needs), and thus continuously break the PRNG and guess all session keys.

That scenario assumes a lot of stuff, like a complete SHA break, a complete
PRNG break, the capability to observe almost all of the PRNG output ordered
in time, etc.

So why bother going down the pure-theory road, if it is useless for assessing
just how safe (or unsafe) a real-world decision is?  We have to, at the very
least, mix both approaches.

> Henrique de Moraes Holschuh wrote:
> >Which, in a Cyrus server, won't be helpful since there is a lot of disk
> >activity and the PRNG state changes quite often, provided you don't keep it
> >drained (and thus without activity because nobody can talk to the server!).
> >
> >Using random instead of urandom goes a long way toward keeping the pool
> >drained (the kernels usually protect a small amount of entropy and won't
> >let urandom drain it completely).
> 
> Well, the problem is simple. Let's say your server generated 100 session
> keys in the last minute. If the system collected at least 100*key size
> random bits, you could have used /dev/random. If not, and you used
> /dev/urandom, the attacker has fewer bits to guess. Let E be this number. 
> Assuming that he knows the pool state _now_ (that is, he collects enough 
> keys and is able to break SHA1 and reverse the PRNG algorithm to obtain 
> its state), in order to get the pool state _one minute_ ago, he has to
> guess E bits only, which is easier than guessing all 100 keys. Every
> _single_ bit you got out of /dev/urandom for your session keys when
> /dev/random would have blocked _halves_ the attacker's effort.

That is not taking into account that there are small amounts of entropy
entering the pool, which WILL mess with blind backwards-in-time predictions,
since you cannot adjust for those (if it is non-blind, i.e. you have some
output blocks, you may be able to adjust).  That will not make the attack
impossible, but it will increase the search space somewhat (I have no idea by
how much).

AND that also assumes no catastrophic reseeds, which completely reset the
PRNG and kill any possibility of blind predictions into the past.

> >The hard facts are that most kernel entropy accounting is not anywhere 
> >close
> >to exact enough, so we cannot rely on it at anything close to a border
> >condition (such as the entropy pool being empty, which is the only time
> >urandom differs too much from random).  You *are* already depending on the
> >security of the PRNG for small amounts of time (when the entropy in the
> >pool is "close" to zero).
> 
> Well, the accounting needs not to be accurate at all. _Underestimating_
> the randomness in the pool is perfectly safe, and that's what the kernel

Indeed.  The problem is that the errors are not always to the safe side.
The fact that all Linux kernels in the 2.4.x series will *over*estimate
entropy fed *from userspace* by a factor of 4 doesn't exactly fill me with
confidence that the other entropy accounting done by those kernels is ok.

> Since there's no way to make sure bits are "truly" random at all (all
> tests may fail, all devices may be somehow biased, and so on) you can
> only reach some level of confidence. Extracting N bits from a "good"
> source, injecting them into the pool, and accounting only M entropy bits,
> where M<<N, is the only thing you can do. The smaller M, the better.
> It's a tradeoff: your confidence vs. the cost of those N bits.

Correct.
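
For what it's worth, that under-crediting is exactly what a userspace feeder
can do on Linux through the RNDADDENTROPY ioctl: hand the kernel N bits of
material while adding only M bits to the entropy counter.  A minimal sketch
(needs root; the ioctl number is the usual value on Linux/x86, so verify it
against your kernel headers, and hw_read below is just a placeholder for
your own source):

import fcntl, os, struct

RNDADDENTROPY = 0x40085203  # _IOW('R', 0x03, int[2]) on common Linux ports

def credit_entropy(data, claimed_bits):
    # Inject len(data)*8 bits of material, but credit only claimed_bits
    # to the kernel's entropy counter (the M << N accounting above).
    buf = struct.pack("ii%ds" % len(data), claimed_bits, len(data), data)
    fd = os.open("/dev/random", os.O_WRONLY)
    try:
        fcntl.ioctl(fd, RNDADDENTROPY, buf)
    finally:
        os.close(fd)

# e.g. 64 bytes (512 bits) from a hardware source, credited as only 128 bits:
# credit_entropy(hw_read(64), 128)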

> The kernel is quite conservative in estimating randomness, that's why
> the pool empties so easily, especially on headless servers.

THAT is what I am NOT completely at ease with.  I sure hope it is being
quite conservative.

> And no, you're not depending on the security of the PRNG+hash when you're
> using /dev/random, assuming they're not buggy.

Even if the entropy pool has zero bits of randomness but the kernel thinks
it still has some?  That was what I was talking about...

> >Given that fact, IMHO you are much better off using urandom anyway, because
> >the system will still work well enough close to an empty entropy pool to
> >regen that pool a bit.
> 
> I'm not following you. Unless you're implying that the only source of
> randomness is the result of the activity that _needs_ the random bits.

Not really.  But most of the activity in a headless Cyrus server needs
TCP/IP to happen, and the last time I checked, my boxes stopped talking to
the network properly when I drained them completely (2.4.25 or so at that
time, I think; I should test that with 2.6.10).  Local processes doing a lot
of IO without needing the network would avoid this issue, obviously.  As
would someone typing at a keyboard or moving a mouse around, etc.

> Well, first I think disk activity is much bigger for already connected
> clients. I know some clients out there make a new connection to the

ONLY if the network layer does not starve due to a lack of entropy.  If it
no longer does, then you are perfectly correct.

> Second, giving up security only to squeeze some more entropy bits from
> your disks isn't a winning strategy, IMHO.

Well, that's what I am trying to argue.  That in the SASL case (relatively
short-lived session keys, typically on headless servers), you are not
sacrificing too much security by using urandom, and that the small loss of
security in that case is outweighed by the cost of starving the entire
machine of entropy.

Which STILL means disabling APOP if you don't need it is a very good idea.
In fact, IMHO APOP should be disabled by default, since it is not widely
used (people who would deploy APOP usually know better and deploy IMAP
instead, in my experience).
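
On the Cyrus side, assuming your release's imapd.conf supports the allowapop
switch (check imapd.conf(5) for your version before relying on it), turning
APOP off is a one-liner:

# /etc/imapd.conf
allowapop: no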

> >If, and only if, you have no new entropy going in.  If you have a few bits
> >of entropy going in, they have a reduced search space (which IS already a
> >very bad thing)... unless the kernel does a catastrophic reseed of the 
> >PRNG.
> >Which on newer Linux kernels for example, it does as often as it possibly
> >can.
> 
> I know that every single bit of entropy going in changes the state, but
> the attacker needs only to try the possible values. The reduced search
> space _is_ the problem.

Yes.  The fact that those bits come in at a variable rate helps a bit, but
they are easy to find in the stream if you have a perfect break.
Catastrophic reseeds force you to break the thing again, though, since the
previous state is useless in that case.

> foresee. That is, entropy bits. In order to truly randomly reseed the PRNG,
> you need N truly random bits, where N is the size of the seed.

Exactly.

> >>It happened in the past, some software used to generate 128 bit keys was
> >>buggy and the actual number of possible combinations was much much smaller

Netscape was the one, wasn't it?

> >>(say 2^20). But the same PRNG and the same hash function are used by
> >>/dev/random as well, so it isn't much safer in this respect.
> >
> >It won't be any safer, you mean.
> 
> It's _likely_ safer. It depends on the kind of bug. If the bug lies
> in the routine that computes the next state, /dev/random may be safe,
> since it changes the state by external means. If the bug is in
> output generation, both /dev/random and /dev/urandom are affected.

True.

> > Well, I am not afraid of the SHA search space, in Linux, BSDs and
> > Solaris, that have somewhat good PRNGs.  The state of the entropy pool
> > changes unpredictably quite often enough.
> 
> What do you mean by "good" PRNG? Not invertible? That's not what you'd

One whose output is not easy to predict [while NOT drained], unlike, say,
that crap some unnamed OSes used a few years ago.

> expect from a PRNG (I expect it to be _easily_ invertible, but I don't
> think I can prove it mathematically). That's why a crypto-strong hash
> function must be used (its job is to be not easily invertible).

Yes.

Well, there are cryptographically secure PRNGs, but AFAIK the one in most
kernels ain't cryptographically secure, because such PRNGs are too slow.  I
wish I could call only cryptographically secure PRNGs "good enough", but
that's not where we are today.

If the PRNGs in Linux 2.6, and in at least FreeBSD and Solaris, are
cryptographically secure, that would be very good news to me.

> No, the entropy pool does _not_ change often enough. This is the base
> assumption for all this discussion. If it changes often enough, you can
> use /dev/random.

True. I was off on a tangent about "good PRNGs".  Sorry about that.

> >At least so far.  Someone was talking about removing "that useless 
> >feature", so keep an eye for it to make sure they don't.
> 
> Well something must be done about the "feature". If all you need is a PRNG
> whose seed is randomly changed now and then (a la /dev/urandom, when the
> pool is almost empty), you'd better implement your own in user space, and
> _always_ use that.

I'd rather have it fixed where it will do the most good, which is in the
kernel.  I don't want to see what happens in the Windows world happening in
the Unix world...

> Let me explain better. Let's say your Cyrus server needs 100 kbit of
> random bits per second. The kernel gets 25 kbit/s (_you_ should
> underestimate them as well, btw, adding less to the counter). So, every
> 100000 bits

That source has an entropy of H > 0.998 per output bit, and I am already
underestimating it as H = 0.99, which is good enough for my needs.  I could
be really paranoid and use 0.75 or even 0.5, but it is a slow source, so
that would hurt.

> of key space, there are only 25000 random bits the attacker needs to
> guess. That is, if you're using 128-bits session keys, you could just use
> 32-bits session keys.

Plus whatever the kernel managed to get from other sources.  But yes, you
are mostly correct.

But are you talking about a fully-compromised /dev/urandom above, or about a
non-compromised /dev/urandom?  For a fully-compromised urandom, I agree with
you.
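
For the fully-compromised case, the back-of-the-envelope arithmetic behind
that 128-vs-32-bit figure looks like this (the rates are Marco's hypothetical
numbers from the example above, not measurements):

demand_bits_per_s  = 100000  # session-key material the server hands out
entropy_bits_per_s = 25000   # fresh entropy the kernel actually collects
key_bits = 128

# Once the pool is drained and urandom is just stretching known state,
# only this fraction of each key is new to the attacker:
effective_bits = key_bits * entropy_bits_per_s // demand_bits_per_s
print(effective_bits)  # 32 -- a "128-bit" key with a 32-bit search space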

> So, first, what's the point in making your application believe it's
> generating 128-bit keys, when the search space is, per key, only 32bits?

None.

> Secondly, assuming you're fine with it, the question is: why don't you
> _always_ drain 25kbit/s only, if that is enough for you?

I draw less than that most of the time, and for *my* case telling SASL to
use /dev/random is fine.  Heck, I always have about 100k bits of entropy
buffered and waiting for a load spike, and I could easily make that 100M
bits if I needed to and wanted to waste that much real memory (pages with
random data are obviously locked and cannot be swapped).
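
For reference, the kernel side of that accounting can be watched under
/proc; a rough sketch (note that the units of poolsize differ between 2.4,
where it is bytes, and 2.6, where it is bits):

def random_counter(name):
    # e.g. "entropy_avail" (bits currently accounted) or "poolsize"
    with open("/proc/sys/kernel/random/" + name) as f:
        return int(f.read())

print("pool size:", random_counter("poolsize"))
print("entropy available now:", random_counter("entropy_avail"))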

We are talking defaults here, for people without HRNGs.  If that was not
clear, I apologise.

> Why don't you use your private PRNG, extract 100kbit/s from it, and get
> only 25kbit/s from the kernel, _always_? By using /dev/urandom, you're
> asking for more than, by definition, you need.

Good point.  But I would actually argue that we should enhance /dev/urandom
so it can be configured to do exactly that.
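
The userspace variant of that idea is simple enough to sketch.  This is only
an illustration of the entropy budgeting, not a vetted CSPRNG design; a real
one should use a reviewed construction:

import hashlib

class PrivatePRNG:
    # Draw a fixed entropy budget from the kernel, then stretch it with a
    # hash in counter mode; reseed on a schedule instead of hitting the
    # kernel pool for every single key.
    def __init__(self, seed_bytes=32):
        self.seed_bytes = seed_bytes
        self.reseed()

    def reseed(self):
        with open("/dev/random", "rb") as f:  # blocking, accounted entropy
            self.state = f.read(self.seed_bytes)
        self.counter = 0

    def read(self, nbytes):
        out = b""
        while len(out) < nbytes:
            out += hashlib.sha256(self.state +
                                  self.counter.to_bytes(8, "big")).digest()
            self.counter += 1
        return out[:nbytes]

# One 32-byte reseed from /dev/random can back many session keys:
# prng = PrivatePRNG(); key = prng.read(16)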

> As far as the application is concerned, the following can happen:
> read [a] gets 1024 random bits, of which 1024 are truly random
> read [b] gets 1024 random bits, of which none is truly random

True.

> So yes, I am one who thinks that /dev/urandom is a useless feature
> (almost). Run a user space daemon instead, which applications can register
> to, and ask for the quality they need, and no more.

I'd say that /dev/urandom *would* be a useless feature if we could trust
people to know what they are doing when writing their PRNGs.  We cannot, so
we had better improve urandom to have better resource-reservation semantics.

It *could* be taught to cap the amount of real entropy handed out per
calling process, if one is willing to write the code.  That would fix the
issue you talk about.  And IMHO it would be a safer (in the real world) way
to go about it.

But yes, it does not have to be done in kernel space.  In which case we are
probably better off removing urandom or making it a named pipe managed by
that user space daemon.  *And* teaching SASL to use the daemon instead of
mucking with /dev/random (and there we go arguing it all again ;-) ).

-- 
  "One disk to rule them all, One disk to find them. One disk to bring
  them all and in the darkness grind them. In the Land of Redmond
  where the shadows lie." -- The Silicon Valley Tarot
  Henrique Holschuh