a file is a file is a file.

Ever wondered how software finds out that a file named “filename” is a PDF, a JPEG, or a movie? There are several thousand, probably hundreds of thousands, of file formats out there. Some of them are used many times a day without us even noticing. We just move an image from A to B, not caring about what constitutes an image file or what makes a JPEG different from a PNG.
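One common answer is the "magic number": many formats begin with a fixed byte signature, and a tool can simply compare the first few bytes of a file against a table of known signatures. Here is a minimal sketch of that idea in Python. The table and the `sniff` function are made up for illustration, cover only a handful of formats, and are far cruder than what real tools do.

```python
# Byte signatures ("magic numbers") that a few common formats place at offset 0.
# Illustrative table only; real detectors know thousands of formats.
MAGIC_AT_START = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"%PDF-": "pdf",
    b"MZ": "windows executable",
    b"PK\x03\x04": "zip",
}


def sniff(data: bytes) -> str:
    """Guess a format by looking only at the first bytes of the file."""
    for magic, name in MAGIC_AT_START.items():
        if data.startswith(magic):
            return name
    return "unknown"


if __name__ == "__main__":
    import sys

    # Read just the first few bytes of the file given on the command line.
    with open(sys.argv[1], "rb") as handle:
        print(sniff(handle.read(16)))
```

Note that the file name and extension play no role here: only the content decides.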

Now, for purely academic reasons, there is one file that is many (no, not the Borg). It’s a file that is described like this:

“CorkaMIX.exe is simultaneously a valid:

* Windows Portable Executable binary
* Adobe Reader PDF document
* Oracle Java JAR (a CLASS inside a ZIP)/Python script
* HTML page

It serves no purpose, except proving that files format not starting at offset 0 are a bad idea. Many files (known as polyglot) already combines various langages in one file, however it’s most of the time at source level, not binary level.”

Source 1: http://code.google.com/p/corkami/downloads/detail?name=CorkaMIX.zip
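The quote's remark about formats that do not start at offset 0 can be made concrete. A Windows executable has to begin with `MZ`, but Adobe Reader accepts a `%PDF-` header anywhere within the first 1024 bytes, and a ZIP reader locates the archive from the end-of-central-directory record near the end of the file. So several signatures can coexist in one file. The sketch below only checks whether those signatures are present; it does not validate anything, the HTML check is a loose heuristic of my own, and the function name and the toy byte string are invented for illustration (this is not how CorkaMIX itself is built).

```python
def plausible_formats(data: bytes) -> list[str]:
    """List formats whose signature appears where a lenient parser would look."""
    found = []
    # PE: the DOS header signature must sit at offset 0.
    if data.startswith(b"MZ"):
        found.append("pe")
    # PDF: Adobe Reader tolerates the header anywhere in the first 1024 bytes.
    if b"%PDF-" in data[:1024]:
        found.append("pdf")
    # ZIP: the end-of-central-directory record lives near the END of the file
    # (22 fixed bytes plus a comment of up to 65535 bytes).
    if b"PK\x05\x06" in data[-(22 + 65535):]:
        found.append("zip")
    # HTML: browsers are forgiving, so just look for a tag anywhere (heuristic).
    if b"<html" in data.lower():
        found.append("html")
    return found


# A toy byte string that merely carries all four signatures.
toy = (
    b"MZ" + b"\x00" * 62          # looks like the start of an executable
    + b"%PDF-1.3\n"               # PDF header, not at offset 0
    + b"<html>hi</html>"          # some markup in the middle
    + b"PK\x05\x06" + b"\x00" * 18  # empty ZIP end-of-central-directory record
)
print(plausible_formats(toy))
```

A detector that stops at the first match at offset 0 would call this toy file an executable and never notice the rest.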