First, my apologies for the math, and for overly simplifying the math!
The difference between DHE and ECDH in two bullet points:
- DHE uses modular arithmetic to compute the shared secret.  
- ECDH performs the same exchange, but over the points of an elliptic curve (a type of algebraic curve) rather than over integers modulo a prime.
The overall method in both cases is still Diffie–Hellman. (Or are we calling it Diffie–Hellman–Merkle these days?)
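To make "same method, different math" a bit more concrete, here is a rough sketch of the elliptic-curve version on a textbook-sized toy curve. The curve, base point, and private values are assumptions I picked for illustration and are far too small to be secure; the function names are mine, not from any library. Where classic DH computes `g^a mod p`, the curve version computes `a * G` by repeated point addition.

```python
# Toy curve y^2 = x^3 + 2x + 2 over the integers mod 17 (illustration only).
P_MOD, A_COEF = 17, 2
G = (5, 1)        # base point; it has order 19 on this curve
ORDER = 19
INF = None        # the "point at infinity", i.e. the group identity


def point_add(p1, p2):
    """Add two points on the toy curve."""
    if p1 is INF:
        return p2
    if p2 is INF:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return INF  # p2 is the inverse of p1
    if p1 == p2:
        # Tangent slope (point doubling); pow(x, -1, m) needs Python 3.8+
        slope = (3 * x1 * x1 + A_COEF) * pow((2 * y1) % P_MOD, -1, P_MOD)
    else:
        # Slope of the line through two distinct points
        slope = (y2 - y1) * pow((x2 - x1) % P_MOD, -1, P_MOD)
    slope %= P_MOD
    x3 = (slope * slope - x1 - x2) % P_MOD
    y3 = (slope * (x1 - x3) - y1) % P_MOD
    return (x3, y3)


def scalar_mult(k, point):
    """Double-and-add: the curve analogue of pow(g, k, p)."""
    result, addend = INF, point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


# Private scalars, fixed here so the demo is repeatable; normally random.
a, b = 3, 7
A = scalar_mult(a, G)              # Alice's public point, sent to Bob
B = scalar_mult(b, G)              # Bob's public point, sent to Alice
alice_shared = scalar_mult(a, B)   # a * (b * G)
bob_shared = scalar_mult(b, A)     # b * (a * G)
print(alice_shared == bob_shared)
```

Both sides land on the same point because `a * (b * G)` and `b * (a * G)` are the same group element, exactly as `(g^b)^a` and `(g^a)^b` are in the modular version.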
Perfect forward secrecy is achieved by using temporary key pairs to secure each session: they are generated as needed, held in RAM during the session, and discarded after use.
The "permanent" key pairs (the ones validated by a Certificate Authority) are used for identity verification and for signing the temporary keys as they are exchanged, not for securing the session.
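A minimal sketch of the "temporary key pair per session" idea, using plain modular DH. Everything here is an assumption for illustration: the group parameters are toy-sized, `run_session` and `ephemeral_keypair` are made-up names, and the signing step with the permanent key is only marked by a comment, not implemented.

```python
import hashlib
import secrets

# Toy parameters (assumed for illustration; far too small for real use).
P = 2**127 - 1   # a Mersenne prime
G = 3


def ephemeral_keypair():
    """Generate a fresh key pair intended for one session only."""
    private = secrets.randbelow(P - 3) + 2   # random value in [2, P - 2]
    public = pow(G, private, P)
    return private, public


def run_session():
    """Simulate one handshake and return the key each side derived."""
    client_priv, client_pub = ephemeral_keypair()
    server_priv, server_pub = ephemeral_keypair()

    # In a real protocol the server would sign server_pub with its
    # permanent (certificate) key here, so the client can check who
    # it is talking to. That step is omitted in this sketch.

    client_secret = pow(server_pub, client_priv, P)
    server_secret = pow(client_pub, server_priv, P)

    client_key = hashlib.sha256(client_secret.to_bytes(16, "big")).digest()
    server_key = hashlib.sha256(server_secret.to_bytes(16, "big")).digest()

    # Stand-in for discarding the temporary private keys. Note that
    # `del` only drops the reference; it does not wipe memory.
    del client_priv, server_priv
    return client_key, server_key


key1_client, key1_server = run_session()
key2_client, key2_server = run_session()
print(key1_client.hex())
print(key2_client.hex())
```

Each call draws new random private values, so the two sessions should print unrelated keys. Once the temporary private values are gone, a later compromise of the permanent key gives an attacker nothing to recompute old session keys from.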
Does that explain things a bit better?
Edit: To examine your examples in detail...
"secret session key that's never shared"
Well, this is the definition of DH key exchange, but it isn't related to perfect forward secrecy. DH allows both parties to independently calculate what the shared secret will be, without ever transmitting the shared secret itself over the still-insecure channel.
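Here is the usual small-number illustration of that, as a sketch. The values of `p`, `g`, `a`, and `b` are toy numbers chosen for the example (real deployments typically use primes of 2048 bits or more), and `wire` is just my name for "everything an eavesdropper gets to see".

```python
# Public parameters, known to everyone including an eavesdropper.
p, g = 23, 5

a = 6    # Alice's private value, never transmitted
b = 15   # Bob's private value, never transmitted

A = pow(g, a, p)   # Alice sends A to Bob
B = pow(g, b, p)   # Bob sends B to Alice

wire = [A, B]      # the only values that cross the insecure channel

alice_secret = pow(B, a, p)   # (g^b)^a mod p
bob_secret = pow(A, b, p)     # (g^a)^b mod p

print(wire, alice_secret, bob_secret)   # [8, 19] 2 2
```

Both sides arrive at `2`, yet only `8` and `19` were ever sent. With numbers this small an eavesdropper could brute-force the private values, which is exactly why real parameters are enormous.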
"session key that changes based on random input from both users"
...Certainly both sides of the connection will use local sources of randomness to derive their temporary session keys, but I think the above phrasing misses the point: perfect forward secrecy is achieved by discarding the session keys after use.
"session key that is derived from a shared secret that only the 2 users know"
By now you're thinking "How does this fact give us perfect forward secrecy?" To belabor the point: perfect forward secrecy is achieved by discarding the session keys after use.