Guess the language

C, 297 bytes, 43.194351% matched (v2)

This is the first non-golf challenge I've competed in. Surprisingly, the golfing languages are rather easy to tell apart, with about 60% matching accuracy per language.

The code requires its input as a UTF-8 string; the results are based on version 2 of the supplied dataset. The code does not require <LF> to be replaced with actual newlines.

#define S(x)!!strstr(p,#x)
f(char*p){return S(#d)?:S(voi)?0:S(mai)|S(utc)?:S(mbd)|S(impor)|S(input)|S(def)|S(rang)?2:S(log)|S(fun)|S(=>)|S(lert)?3:S(<?)?4:S(echo)|S(sed)?5:S(+++)?6:S(<-)?7:S($_)|S(say)?8:S(\342)|S(\303)?9:S(->)|S(map)?10:S(@#)|S(]])|S([#)?11:S(V)|S(Q)?12:S(Z)|S(Y)?13:S(.)?14:15;}

Mapping table:

 0. java
 1. c
 2. python
 3. javascript
 4. php
 5. bash
 6. brainf*
 7. haskell
 8. perl
 9. apl
10. ruby
11. wolfram
12. pyth
13. matl
14. golfscript
15. cjam
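
For readers who find the golfed ternary chain hard to follow, here is an ungolfed sketch of the same first-match rule cascade, ported to Python. It is not the submitted code: the names `RULES`, `LANGUAGES`, and `classify_c_port` are made up for illustration, and the port assumes the input is encoded to UTF-8 bytes so that the `\342` and `\303` checks (lead bytes 0xE2 and 0xC3) behave like they do in the C version. The C original relies on the GNU `a ?: b` extension, which is why a hit on `#d`, `mai`, or `utc` yields 1.

```python
# Labels in the same order as the mapping table above.
LANGUAGES = ["java", "c", "python", "javascript", "php", "bash", "brainf*",
             "haskell", "perl", "apl", "ruby", "wolfram", "pyth", "matl",
             "golfscript", "cjam"]

# Ordered rules: the first group with any substring hit decides the label.
RULES = [
    ((b"#d",), 1),
    ((b"voi",), 0),
    ((b"mai", b"utc"), 1),
    ((b"mbd", b"impor", b"input", b"def", b"rang"), 2),
    ((b"log", b"fun", b"=>", b"lert"), 3),
    ((b"<?",), 4),
    ((b"echo", b"sed"), 5),
    ((b"+++",), 6),
    ((b"<-",), 7),
    ((b"$_", b"say"), 8),
    ((b"\xe2", b"\xc3"), 9),   # octal \342 and \303 in the C source
    ((b"->", b"map"), 10),
    ((b"@#", b"]]", b"[#"), 11),
    ((b"V", b"Q"), 12),
    ((b"Z", b"Y"), 13),
    ((b".",), 14),
]


def classify_c_port(source: str) -> int:
    """Return the label of the first rule whose substrings occur in source."""
    data = source.encode("utf-8")
    for needles, label in RULES:
        if any(needle in data for needle in needles):
            return label
    return 15  # fallback when nothing matches: cjam


if __name__ == "__main__":
    sample = "public static void main(String[] a){}"
    print(LANGUAGES[classify_c_port(sample)])
```

Because the rules are checked in order, the earlier groups take priority: a snippet containing both `+++` and `<-` is labelled brainf* rather than haskell.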

The percentage is based on my hits/total calculation: 3916 hits/9066 total.
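
As a quick check of the arithmetic, the quoted figure can be recomputed from the two counts given above. The variable names are illustrative only.

```python
# Counts quoted in the post.
hits = 3916
total = 9066

# Share of correctly labelled inputs, as a percentage.
match_rate = 100 * hits / total

if __name__ == "__main__":
    print(f"{match_rate:.6f}% matched")
```

The result agrees with the 43.194351% in the header to about five decimal places; the last digit presumably differs because of rounding in whatever tool produced the original figure.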


Python 3, 278 bytes (previously 271), 25.049636% matched (v2, unverified)

def f(c):
 try:compile(c,'','exec');return 5
 except:
  for j in range(9):
   if any(l in c for l in [['echo'],['require'],['Main','string'],['document','alert','var ','function'],['String'],['def ','lambda','print '],['main','int','char'],['+++','<<<'],[]][j]):break
 return j

map:

0 = bash
1 = ruby
2 = c#
3 = javascript
4 = java
5 = python
6 = c
7 = brainf*
8 = cjam

Much better golfed (though probably still not great), and it finally broke the 25% barrier! Inputs are expected to have <LF> replaced with a newline (\n).
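
An ungolfed sketch of the same idea may make the control flow easier to see: first try to compile the input as Python, and only fall back to the keyword groups if that fails. The names `KEYWORD_GROUPS`, `LABELS`, and `classify_py` are invented for this sketch, and `except Exception` stands in for the bare `except` of the golfed version.

```python
# Keyword groups in the same order as the golfed list; index = label.
KEYWORD_GROUPS = [
    ["echo"],                                    # 0 bash
    ["require"],                                 # 1 ruby
    ["Main", "string"],                          # 2 c#
    ["document", "alert", "var ", "function"],   # 3 javascript
    ["String"],                                  # 4 java
    ["def ", "lambda", "print "],                # 5 python
    ["main", "int", "char"],                     # 6 c
    ["+++", "<<<"],                              # 7 brainf*
]
LABELS = ["bash", "ruby", "c#", "javascript", "java", "python", "c",
          "brainf*", "cjam"]


def classify_py(code: str) -> int:
    """Guess a label: Python if it compiles, else the first keyword hit."""
    try:
        compile(code, "", "exec")  # syntax check only, nothing is executed
        return 5
    except Exception:
        pass
    for label, keywords in enumerate(KEYWORD_GROUPS):
        if any(keyword in code for keyword in keywords):
            return label
    return 8  # no keyword matched: fall through to cjam


if __name__ == "__main__":
    print(LABELS[classify_py("int main(){return 0;}")])
```

One consequence of this design is that any snippet that happens to be syntactically valid Python 3, whatever language it was written in, is labelled python before the keyword groups are consulted.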