Edgewall Software
Modify

Opened 13 years ago

Closed 13 years ago

Last modified 3 years ago

#10397 closed defect (fixed)

hex_entropy() produces same digest under forking server

Reported by: songofacandy@… Owned by: Remy Blank
Priority: normal Milestone: 0.12.3
Component: general Version: 0.12.2
Severity: major Keywords:
Cc: Branch:
Release Notes:
API Changes:
Internal Changes:

Description

I use Trac with customized forking server that imports imports trac before fork to reduce initialization time.

On that server, trac produces same cookie for many people.

I found Trac stores random number generator on global variable. This makes all of forked server produces same hexdigest.

I think RNG should not initialized at import time. Following patch fixes it.

--- a/trac/util/__init__.py
+++ b/trac/util/__init__.py
@@ -554,9 +554,9 @@
     return info
 
 # -- crypto utils
-_entropy = random.Random()
 
 def hex_entropy(bytes=32):
+    _entropy = random.Random()
     return sha1(str(_entropy.random())).hexdigest()[:bytes]

Attachments (3)

10397-thread-safe-random-r10828.patch (3.1 KB ) - added by Remy Blank 13 years ago.
Per-thread random generator.
10397-lazy-random-r10828.patch (836 bytes ) - added by Remy Blank 13 years ago.
Single lazily-initialized generator.
10397-use-urandom-r10832.patch (2.1 KB ) - added by Remy Blank 13 years ago.
Use os.urandom() if available.

Download all attachments as: .zip

Change History (20)

comment:1 by Jun Omae, 13 years ago

Milestone: 0.12.3
Version: 0.12.2

The code, _entropy = random.Random(), has been committed in [10115] against a plugin which calls random.seed() (see comment:ticket:8969:16).

However, I understand that the code has a problem in your situation.

I think it would be better as the following.

  • trac/util/__init__.py

    diff --git a/trac/util/__init__.py b/trac/util/__init__.py
    a b  
    558558_entropy = random.Random()
    559559
    560560def hex_entropy(bytes=32):
    561     return sha1(str(_entropy.random())).hexdigest()[:bytes]
     561    s = '%s/%s/%s' % (os.getpid(), time.time(), _entropy.random())
     562    return sha1(s).hexdigest()[:bytes]
    562563
    563564
    564565# Original license for md5crypt:

comment:2 by Remy Blank, 13 years ago

We could also use os.urandom() as an entropy source, with a fallback in case it raises NotImplementedError (as mentioned in the doc).

comment:3 by songofacandy@…, 13 years ago

I believe os.urandom() is the best way to produce cookie. random.Random() uses os.urandom at __init__ and fallback to time.time if it raises NotImprementedError.

It is one reason for I against having _entropy as a global. Sharing one entropy between forked processes seems very bad at security point of view.

in reply to:  3 comment:4 by Remy Blank, 13 years ago

Replying to songofacandy@…:

It is one reason for I against having _entropy as a global. Sharing one entropy between forked processes seems very bad at security point of view.

Well, your use case is somewhat special. The normal use case is for each process to fully initialize its environment. The issue comes from sharing an initialized environment.

Although… come to think of it, the Random instance isn't marked as thread-safe in the documentation, which suggests creating one instance per thread. So we actually have an issue also in the common multi-threaded case.

We could use a thread-local object, and instantiate a new Random instance in each thread on first use. This would ensure that each thread has its own independent and properly seeded generator. But it also sounds a bit complicated. Better ideas welcome.

comment:5 by INADA Naoki <songofacandy@…>, 13 years ago

random.Random() is not a cryptographic RNG so reusing it repeatedly is also dangerous in normal web server like mod_wsgi.

in reply to:  5 ; comment:6 by Remy Blank, 13 years ago

Replying to INADA Naoki <songofacandy@…>:

random.Random() is not a cryptographic RNG so reusing it repeatedly is also dangerous in normal web server like mod_wsgi.

We're passing its output through SHA1, so it should be fine.

comment:7 by Remy Blank, 13 years ago

Owner: set to Jun Omae

Heh, I had a bit of an over-engineering session here ;)

First, a small correction to what I wrote in comment:4. Python's random number generator is actually thread-safe. The documentation only suggests creating separate generators for each thread if the generators shouldn't share state.

I have implemented two rather complicated fixes for this issue:

And then, there's Jun's patch above, which is very simple, readable, and probably works just as well. So unless anyone objects, we're going to go with Jun's solution.

by Remy Blank, 13 years ago

Per-thread random generator.

by Remy Blank, 13 years ago

Single lazily-initialized generator.

in reply to:  6 ; comment:8 by INADA Naoki <songofacandy@…>, 13 years ago

Replying to rblank:

Replying to INADA Naoki <songofacandy@…>:

random.Random() is not a cryptographic RNG so reusing it repeatedly is also dangerous in normal web server like mod_wsgi.

We're passing its output through SHA1, so it should be fine.

No! SHA1 is just a hash. It makes difficult to attack but not impossible.

Session key maker should take entropy repeatedly. Random() takes entropy only at initialization time.

comment:9 by INADA Naoki <songofacandy@…>, 13 years ago

FYI: https://www.owasp.org/index.php/Guide_to_Cryptography#Reversible_Authentication_Tokens

The only way to generate secure authentication tokens is to ensure there is no way to
predict their sequence. In other words: true random numbers.
It could be argued that computers can not generate true random numbers, but using new
techniques such as reading mouse movements and key strokes to improve entropy has 
significantly increased the randomness of random number generators. It is critical 
that you do not try to implement this on your own; use of existing, proven implementations 
is highly desirable.

in reply to:  9 ; comment:10 by Jun Omae, 13 years ago

Replying to INADA Naoki <songofacandy@…>:

FYI: https://www.owasp.org/index.php/Guide_to_Cryptography#Reversible_Authentication_Tokens […]

Thanks for the information! Our implemention for the token looks it is able to predict the sequence, in my patch also.

Another patch is here; Always use os.urandom() directly if available. Instead of 'Random()` class which isn't thread safe. When the method isn't available, it is probably still able to predict….

Thoughts?

  • trac/util/__init__.py

     
    555555
    556556# -- crypto utils
    557557
    558 _entropy = random.Random()
     558try:
     559    os.urandom(4)
     560    def hex_entropy(bytes=32):
     561        return sha1(os.urandom(8)).hexdigest()[:bytes]
     562except NotImplementedError:
     563    def hex_entropy(bytes=32):
     564        s = '%s/%.16e/%s' % (os.getpid(), time.time(), id({}))
     565        return sha1(s).hexdigest()[:bytes]
    559566
    560 def hex_entropy(bytes=32):
    561     return sha1(str(_entropy.random())).hexdigest()[:bytes]
    562567
    563 
    564568# Original license for md5crypt:
    565569# Based on FreeBSD src/lib/libcrypt/crypt.c 1.2
    566570#

in reply to:  8 ; comment:11 by Remy Blank, 13 years ago

Replying to INADA Naoki <songofacandy@…>:

No! SHA1 is just a hash. It makes difficult to attack but not impossible.

I defy you to predict the next session key, given only the current session key (in a reasonable time). You can't, because you would have to derive the current state of the random generator from the session key, which is computationally difficult because the hash is a (cryptographically secure) one-way function.

Our current implementation of hex_entropy() isn't broken. I'm not against using urandom(), which is indeed the safest way to go, provided:

  • It's not massively slower than what we currently have. It shouldn't be, and a quick test should show that.
  • The fallback if urandom() isn't available is at least as good as what we have today. Jun's implementation in comment:10 isn't, because it removes the entropy generator in that case, and only relies on os.getpid() and time.time(). This is insufficient, as the time can be predicted, and the other items are constant and could be leaked by other means.

Also, Jun, you could simply add os.getpid(), time.time() and _entropy.random(), and take the str() of the sum (saves a few string operations).

I'm also open to completely new implementations (for the fallback), provided they have been shown secure. That is, I only want to exchange our non-proven, but probably reasonably secure implementation with a new one if the latter has better characteristics.

in reply to:  10 comment:12 by INADA Naoki <songofacandy@…>, 13 years ago

Replying to jomae:

I love your patch.

in reply to:  11 comment:13 by INADA Naoki <songofacandy@…>, 13 years ago

Sorry, I may wrong. I don't know it's safe or not now.

Replying to rblank:

Replying to INADA Naoki <songofacandy@…>:

No! SHA1 is just a hash. It makes difficult to attack but not impossible.

I defy you to predict the next session key, given only the current session key (in a reasonable time). You can't, because you would have to derive the current state of the random generator from the session key, which is computationally difficult because the hash is a (cryptographically secure) one-way function.

Our current implementation of hex_entropy() isn't broken. I'm not against using urandom(), which is indeed the safest way to go, provided:

  • It's not massively slower than what we currently have. It shouldn't be, and a quick test should show that.
  • The fallback if urandom() isn't available is at least as good as what we have today. Jun's implementation in comment:10 isn't, because it removes the entropy generator in that case, and only relies on os.getpid() and time.time(). This is insufficient, as the time can be predicted, and the other items are constant and could be leaked by other means.

Also, Jun, you could simply add os.getpid(), time.time() and _entropy.random(), and take the str() of the sum (saves a few string operations).

I'm also open to completely new implementations (for the fallback), provided they have been shown secure. That is, I only want to exchange our non-proven, but probably reasonably secure implementation with a new one if the latter has better characteristics.

in reply to:  11 ; comment:14 by Jun Omae, 13 years ago

Replying to comment:7:

First, a small correction to what I wrote in comment:4. Python's random number generator is actually thread-safe. The documentation only suggests creating separate generators for each thread if the generators shouldn't share state.

Sorry, now I just read it.

Replying to comment:11:

Our current implementation of hex_entropy() isn't broken. I'm not against using urandom(), which is indeed the safest way to go, provided:

Ok. I agree that the current implementation which uses Random.random() with SHA-1 makes it enough difficult to predict using a rainbow table and some other approachs.

Fixing the original issue; +1 for 10397-lazy-random-r10828.patch, I think a same entropy should not be shared between forked processes.

BTW, hex_entropy() cannot generate a token longer than 40 bytes. But we pass 64 in the unit test at trac/util/tests/__init__.py. Should we fix the method?

in reply to:  14 comment:15 by Remy Blank, 13 years ago

Replying to jomae:

Fixing the original issue; +1 for 10397-lazy-random-r10828.patch, I think a same entropy should not be shared between forked processes.

Are you sure? Your patch in comment:1 is much simpler, and solves the issue as well.

To be honest, I would prefer using os.urandom() in hex_entropy(), and fall back to your impelmentation if the former isn't available. Path coming shortly.

BTW, hex_entropy() cannot generate a token longer than 40 bytes. But we pass 64 in the unit test at trac/util/tests/__init__.py. Should we fix the method?

Yes, we should. That was my fault, there.

comment:16 by Remy Blank, 13 years ago

Owner: changed from Jun Omae to Remy Blank

10397-use-urandom-r10832.patch is my last attempt. It uses os.urandom() if available, and falls back to a technique similar to comment:1 if not. It exposes the random sequence generator as urandom(), and should generate different sequences in different processes, therefore solving the original issue.

Last edited 13 years ago by Remy Blank (previous) (diff)

by Remy Blank, 13 years ago

Use os.urandom() if available.

comment:17 by Remy Blank, 13 years ago

Resolution: fixed
Status: newclosed

Yes, I'm pretty sure using os.urandom() is the best solution. Patch applied in [10834].

Modify Ticket

Change Properties
Set your email in Preferences
Action
as closed The owner will remain Remy Blank.
The resolution will be deleted. Next status will be 'reopened'.
to The owner will be changed from Remy Blank to the specified user.

Add Comment


E-mail address and name can be saved in the Preferences .
 
Note: See TracTickets for help on using tickets.