HITCON CTF 2014: Crazy 500 "polyglot" writeup

This challenge required us to submit a single program that can be compiled and run as five different programming languages. The task presented us with

Just `cat flag`

$ python2 --version
Python 2.7.6
$ python3 --version
Python 3.4.0
$ gcc --version
gcc (Ubuntu 4.8.2-19ubuntu1) 4.8.2
$ ruby --version
ruby 1.9.3p484 (2013-11-22 revision 43786) [x86_64-linux]
$ ghc --version
The Glorious Glasgow Haskell Compilation System, version 7.6.3

and a file upload field below.

This suggested we needed to write a program that prints the file flag to stdout in all of

  • Python2;
  • Python3;
  • C;
  • Ruby;
  • Haskell

on an x86_64 machine running Linux.

When uploading a Python file, the output looks similar to the following:

Executing as Python 2...
Checking output of Python 2...
  ...Correct!

Executing as Python 3...
Checking output of Python 3...
  ...Correct!

Executing as C...
Checking output of C...
  Error: Output of C doesn't match the content of flag

We quickly noted that this status output potentially leaked a single bit of information about the flag file’s contents with each try. For example,

import sys
s = open('flag').read()
if len(s) > 0x100:
    print s

is a Python2 script that leaks whether the file is longer than 256 bytes: the check reports “Correct!” only if the condition holds and the file is printed. Similarly, one can extract each byte of the file (using binary search, for instance). We quickly wrote a script to obtain the flag file and got the following output:

    I'm the flag file
  BUT I don't really contains the flag.  
====!@#$%^#$#)%#)%(#)===
  To get the flag, you need to pass the challenge.    

Damn! We got trolled…
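
The byte-by-byte extraction mentioned above can be sketched roughly as follows. This is not the script we used during the CTF, only a minimal local simulation: SECRET, length_oracle and byte_oracle are hypothetical stand-ins for "upload a Python script with this condition and see whether the server answers Correct!", and the upper bound on the file length is an assumption.

```python
# Local stand-in for the server. On the real service, each oracle call would
# be one upload whose status line ("Correct!" or "Error") yields one bit.
SECRET = b"I'm the flag file\n"  # placeholder contents for this demo


def length_oracle(n):
    """Hypothetical: True iff the file is longer than n bytes."""
    return len(SECRET) > n


def byte_oracle(index, threshold):
    """Hypothetical: True iff byte `index` of the file is > threshold."""
    return SECRET[index] > threshold


def leak_length(upper=1 << 16):
    """Binary search for the length, assuming it is at most `upper`."""
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi) // 2
        if length_oracle(mid):
            lo = mid + 1  # length is in (mid, hi]
        else:
            hi = mid      # length is in [lo, mid]
    return lo


def leak_byte(index):
    """Recover one byte with 8 oracle queries."""
    lo, hi = 0, 255
    while lo < hi:
        mid = (lo + hi) // 2
        if byte_oracle(index, mid):
            lo = mid + 1  # byte value is in (mid, hi]
        else:
            hi = mid      # byte value is in [lo, mid]
    return lo


recovered = bytes(leak_byte(i) for i in range(leak_length()))
```

With eight uploads per byte, a file of a few hundred bytes needs a couple of thousand submissions, which is tedious but feasible with a script.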

So it seemed we really had to write a working polyglot. We tried many, many things during the painful process, and in the end came up with the following:

a = 1;
b = 0;
a
    # pragma = 1; main = readFile "flag" >>= putStr {-
#define b ;
b
#if 0
eval((0 and '' or '__import__("sys").stdout.write(open("flag").read())'))
exec('cat flag')
#endif
__asm__(".section .text\n.globl main\nmain:\npush $0x67616c66\nmov %rsp, %rdi\nxor %rsi, %rsi\nmov $2, %rax\nsyscall\nmov %rax, %rdi\nmov %rsp, %rsi\nmov $0x100, %rdx\nxor %rax, %rax\nsyscall\nmov $1, %rdi\nmov %rsp, %rsi\nmov %rax, %rdx\nmov $1, %rax\nsyscall\nxor %rdi, %rdi\nmov $60, %rax\nsyscall\n");
#define X -}

There are many rewarding ideas wrapped up in this solution. Most notably,

  • all languages share the usage of " for strings, so we tried to embed as much code as possible in strings;
  • the same code can be used for Python2 and Python3;
  • Ruby and Python have very similar syntax, so we can satisfy both of them in almost the same way;
  • Haskell can define binary operators consisting of almost arbitrary characters;
  • we avoid C code by using inline assembly (note that this is a nice workaround for C’s lack of eval and friends);
  • stderr is ignored, so we can throw runtime errors in Python and Ruby after the file content was printed.

Let’s see in detail what the program means for each language.

C

a = 1;
b = 0;

These are variable declarations. The missing type is assumed to be int, so this initializes the integers a and b with some values.

a
    # pragma = 1; main = readFile "flag" >>= putStr {-
#define b ;
b

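The key idea to the whole thing is the #pragma in this snippet. gcc ignores it because “= 1; main = ...” is certainly not a pragma name it knows, and the #define turns the lone b into a semicolon, so this whole snippet simply becomes “a;” after preprocessing. At file scope, gcc accepts that as one more implicit-int declaration of a.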

Next follows the Python and Ruby code that is removed by the preprocessor because of the #if 0. Then there is the actual C code (the inline assembly) and a final #define that gcc doesn’t really care about.
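
To make the inline assembly a little less opaque, here is a small Python sketch that decodes the two things worth knowing about it: the constant pushed onto the stack and the order of the system calls. The syscall numbers are the standard Linux x86_64 ones; the variable names are ours and purely illustrative.

```python
import struct

# The immediate from `push $0x67616c66`.
imm = 0x67616C66

# x86_64 is little-endian and `push imm32` sign-extends the value to 64 bits,
# so these are the 8 bytes that end up at %rsp.
on_stack = struct.pack("<q", imm)

# The zero upper bytes double as the string terminator for open().
filename = on_stack.split(b"\x00", 1)[0]

# Linux x86_64 system call numbers that the assembly loads into %rax.
SYSCALLS = {0: "read", 1: "write", 2: "open", 60: "exit"}

# Values of %rax before each `syscall` instruction, in program order.
sequence = [SYSCALLS[n] for n in (2, 0, 1, 60)]

print(filename, sequence)
```

In other words, the assembly opens "flag" read-only, reads up to 0x100 bytes onto the stack, writes them to file descriptor 1 and exits with status 0.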

Python[23]

a = 1;
b = 0;

This initializes the variables a and b.

a
    # pragma = 1; main = readFile "flag" >>= putStr {-
#define b ;
b
#if 0

In these lines, the Python interpreter checks a and b for existence, but does nothing else with them. The lines starting with # are comments.

eval((0 and '' or '__import__("sys").stdout.write(open("flag").read())'))

This is where the magic happens for Python. The 0 and '' or '...' construction is used to distinguish Ruby from Python. Python treats the integer 0 as false, so “0 and ''” yields 0 and the “or” falls through to the second string, which is then evaluated and prints the file contents.
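
The selection logic can be checked in isolation. The following sketch evaluates the expression under Python’s rules and, for contrast, under an emulation of Ruby’s truthiness; ruby_truthy and ruby_and_or are hypothetical helpers written for this illustration, and the payload string is a harmless stand-in.

```python
payload = "PYTHON_PAYLOAD"  # stand-in for the real '__import__("sys")...' string

# Python: 0 is falsy, so `0 and ''` short-circuits to 0,
# and `0 or payload` then yields the payload.
python_choice = 0 and '' or payload


def ruby_truthy(value):
    """Emulate Ruby: only nil and false are falsy (None and False here)."""
    return value is not None and value is not False


def ruby_and_or(a, b, c):
    """Emulate Ruby's `a and b or c`, which groups as `(a and b) or c`."""
    a_and_b = b if ruby_truthy(a) else a
    return a_and_b if ruby_truthy(a_and_b) else c


# Ruby: 0 is truthy, and so is the empty string, so '' is selected.
ruby_choice = ruby_and_or(0, '', payload)

print(repr(python_choice), repr(ruby_choice))
```

So Python hands the payload to eval, while Ruby would hand it an empty string.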

exec('cat flag')
#endif
__asm__("...");
#define X -}

Lines starting with # are ignored; the others are syntactically correct function calls. The call to exec throws an exception because the shell command “cat flag” is a syntax error in Python, but the server doesn’t look at stderr, so the resulting crash is not a problem.
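
The order of events matters here: the output is written first and the crash happens afterwards. A minimal Python 3 sketch of that sequence, with stdout captured in a buffer and a made-up string in place of the real file contents, might look like this:

```python
import io
from contextlib import redirect_stdout

captured = io.StringIO()
raised = None

with redirect_stdout(captured):
    # Step 1: the eval'd payload writes to stdout (demo text, not the flag).
    eval('__import__("sys").stdout.write("flag contents\\n")')

    # Step 2: "cat flag" is not valid Python, so exec raises SyntaxError.
    # In the polyglot nothing catches it, and the traceback goes to stderr.
    try:
        exec('cat flag')
    except SyntaxError as err:
        raised = err

print(repr(captured.getvalue()), type(raised).__name__)
```

By the time the exception is raised, stdout already holds everything the checker compares against.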

Ruby

This is very much like Python, except in

eval((0 and '' or '...'))

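the 0 is coerced to true (in Ruby, everything but false and nil is true). The expression therefore evaluates to the empty string, which is also true in Ruby, so the eval goes through without errors and without printing anything, and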

exec('cat flag')

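is executed. Since Ruby’s exec replaces the interpreter process with cat, the __asm__ line is never actually reached; it only has to parse as a method call. Even if it were executed, the resulting exception would only show up on stderr, which is okay just like in Python.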

Haskell

This is the author’s favorite language :-).

a = 1;
b = 0;

…just defines two constants a and b, like before.

a
    # pragma = 1; main = readFile "flag" >>= putStr {-

This is the most important part: It defines an infix operator # as the binary function returning the constant 1, as in:

(#) :: a -> b -> Integer
x # y = 1

(As an insignificant detail: Note that the inferred type is actually Num c => a -> b -> c because 1 :: Num a => a is polymorphic.)

The left-hand side is spread over two lines, as signified by the second line’s indentation.

Recall that anything after “# pragma” is ignored by all other languages. Because of this, we can use the rest of this line to define our main function (using the >>= operator, because do-notation is an abomination) and start a Haskell multi-line comment using {-. Thus, everything between the next line and the closing -} (which is the very last token in the file) is ignored by ghc.

In $\Sigma$…

there’s our flag!

Executing as Python 2...
Checking output of Python 2...
  ...Correct!

Executing as Python 3...
Checking output of Python 3...
  ...Correct!

Executing as C...
Checking output of C...
  ...Correct!

Executing as Ruby...
Checking output of Ruby...
  ...Correct!

Executing as Haskell...
Checking output of Haskell...
  ...Correct!

All correct! Here's your flag: HITCON{SPacE0rNLiSAhArDPRoBleM}