The issue
I am currently designing the backend for a SPA (Single Page Application), which I'm planning to construct in a fairly RESTful manner. The backend will ideally be just a thin layer between the client and the database. Almost all data in the database will be keyed to a specific user, which is why I need some form of authentication system.
Since we live in a universe where humans are fairly predictable and machines can be really, really fast, we obviously have to make password hashing and password verification slow (normally using a key stretching scheme such as bcrypt). This is all fine and dandy, but it complicates life for us poor souls who want to design fast and snappy applications using a RESTful backend, because ideally such a backend would not store any session data, but would instead authenticate every single request individually (using, for example, Basic Access Authentication).
However, this would mean hashing the user's password for every single request, which would add a painful penalty to every request (and possibly make DDoS attacks easier). Of course, this issue can be solved by caching user credentials in memory at the backend, but that solution doesn't feel very clean and raises a few other issues, such as handling cache retirement and forcing the client to store credentials in plain text.
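To put a rough number on that penalty, here is a minimal sketch that times one deliberately slow password hash against one plain SHA-256 hash. It uses PBKDF2 from the Python standard library as a stand-in for bcrypt (which is not in the standard library); the iteration count and the helper names `stretched_hash` and `time_call` are assumptions picked for illustration, not a recommendation.

```python
import hashlib
import os
import time


def stretched_hash(password: bytes, salt: bytes, iterations: int = 200_000) -> bytes:
    """Deliberately slow hash; PBKDF2 stands in for bcrypt here (assumption)."""
    return hashlib.pbkdf2_hmac("sha256", password, salt, iterations)


def time_call(fn, *args) -> float:
    """Return the wall-clock seconds a single call to fn(*args) takes."""
    start = time.perf_counter()
    fn(*args)
    return time.perf_counter() - start


salt = os.urandom(16)

# Cost of verifying a password on every request.
slow = time_call(stretched_hash, b"correct horse battery staple", salt)

# Cost of one ordinary, unstretched hash, for comparison.
fast = time_call(lambda: hashlib.sha256(b"correct horse battery staple").digest())

print(f"key stretching: {slow * 1000:.1f} ms, single SHA-256: {fast * 1000:.4f} ms")
```

The exact figures depend on the machine, but the stretched hash should come out several orders of magnitude slower, and that gap is the cost each request would pay.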
So in short, my issue is that I need some form of system to handle the authentication of users in a snappy way, while ideally still avoiding any server side state.
My proposed solution
First of all, the traffic between the client and the server will be encrypted using bog standard TLS, so we should not be vulnerable to eavesdroppers or man-in-the-middle attacks.
The solution that I have thought of is to issue a token to the client upon initial successful authentication (using credentials sent in plain text over TLS). The token contains the information the server needs to authenticate the user, and would be calculated in the following way:
key = a random key generated when the server starts, never to be shared
nonce = just some random bytes grabbed out of the air
token = nonce + timestamp + user_id + HMAC(key, nonce + timestamp + user_id)
This would allow the server to check whether the token is valid by simply validating the HMAC for every request, which is very cheap to do (about as cheap as doing a lookup in a hash table, but entirely sidestepping the need for a hash table in the first place). If the token is valid and has not expired (the timestamp must not be too old), I let the request proceed and let the database handle the authorisation issue.
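The issue-and-verify steps described above can be sketched in Python using only the standard library. This is a sketch under assumptions rather than a finished implementation: the field widths, the choice of SHA-256, the three-day lifetime, the integer `user_id`, and the names `issue_token` and `verify_token` are all picked for illustration.

```python
import hashlib
import hmac
import os
import struct
import time
from typing import Optional

KEY = os.urandom(32)              # random key generated when the server starts
NONCE_LEN = 16                    # assumed nonce size in bytes
FIELDS_LEN = 16                   # 8-byte timestamp + 8-byte user id
MAC_LEN = 32                      # SHA-256 digest size
MAX_AGE_SECONDS = 3 * 24 * 3600   # assumed token lifetime (three days)


def issue_token(user_id: int, now: Optional[float] = None) -> bytes:
    """Build nonce + timestamp + user_id + HMAC(key, nonce + timestamp + user_id)."""
    timestamp = int(time.time() if now is None else now)
    # Fixed-width fields keep the concatenation unambiguous.
    payload = os.urandom(NONCE_LEN) + struct.pack(">QQ", timestamp, user_id)
    mac = hmac.new(KEY, payload, hashlib.sha256).digest()
    return payload + mac


def verify_token(token: bytes, now: Optional[float] = None) -> Optional[int]:
    """Return the user id if the token is authentic and fresh, else None."""
    if len(token) != NONCE_LEN + FIELDS_LEN + MAC_LEN:
        return None
    payload, mac = token[:-MAC_LEN], token[-MAC_LEN:]
    expected = hmac.new(KEY, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many MAC bytes matched.
    if not hmac.compare_digest(mac, expected):
        return None
    timestamp, user_id = struct.unpack(">QQ", payload[NONCE_LEN:])
    current = time.time() if now is None else now
    if current - timestamp > MAX_AGE_SECONDS:
        return None
    return user_id
```

A request handler would call `verify_token` on the token sent with each request and proceed only when it returns a user id. No per-token state is kept on the server; the only secret is `KEY`.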
This feels like a way to authenticate users efficiently and securely, while requiring surprisingly few LoCs to implement and totally avoiding any kind of state in the backend (the server application would use a constant amount of memory throughout its lifespan).
My true question
Now this scheme may seem totally smashing at first sight, but after looking at it for a while I see two potential issues with it that bring up questions:
There is no way to retire a token, except waiting for it to expire. Is this actually an issue, or is it me being paranoid? Seeing as the token must be stored client side, security is immediately compromised if any malicious party gets access to a client before the token expires. Sure, the danger is contained to the timespan that the token is valid, but for this SPA that timespan might be counted in days.
How do we scale? The secret server side key would have to be shared amongst all servers in the cluster, but how would we do this securely, and when should we retire a server side key?
There might also be other issues, but these are the two glaring ones that stood out to me. What are your thoughts on this scheme and these issues?
Note that this question is not a general question about secure token generation or session management (as that has been answered many times here), but about the peripheral issues regarding this specific scheme, issues that I have not seen discussed.