Integer.valueOf Arabic number works fine but Float.valueOf the same number gives NumberFormatException

It seems that Float.parseFloat() does not support Eastern Arabic numerals. As an alternative, you can use the NumberFormat class:

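import java.text.NumberFormat;
import java.util.Locale;

// Arabic locale with the "arab" numbering system selected through the Unicode locale extension.
Locale EASTERN_ARABIC_NUMBERS_LOCALE = new Locale.Builder()
                                                 .setLanguage("ar")
                                                 .setExtension('u', "nu-arab")
                                                 .build();
// parse() throws the checked java.text.ParseException if the text cannot be parsed.
float f = NumberFormat.getInstance(EASTERN_ARABIC_NUMBERS_LOCALE)
                      .parse("۱٫۵")
                      .floatValue();
System.out.println(f);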

OUTPUT:

1.5

Answer

Float.valueOf("۱") performs no check for other scripts or characters; it only accepts the ASCII digits 0-9. Integer.valueOf, by contrast, uses Character.digit() to get the value of each digit in the string.
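
To see the difference side by side, here is a minimal sketch. The class name ArabicDigitParsingDemo and the helper floatAccepts are made up for illustration; only Integer.valueOf and Float.valueOf come from the JDK.

```java
final class ArabicDigitParsingDemo {

    /** Hypothetical helper: true if Float.valueOf accepts the text, false if it throws. */
    static boolean floatAccepts(String text) {
        try {
            Float.valueOf(text);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String one = "\u06F1"; // EXTENDED ARABIC-INDIC DIGIT ONE, written as an escape

        System.out.println(Integer.valueOf(one)); // prints 1
        System.out.println(floatAccepts(one));    // prints false
        System.out.println(floatAccepts("1"));    // prints true
    }
}
```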

Research/Explanation

I debugged the statement Float.valueOf("۱") with the IntelliJ debugger. If you step into FloatingDecimal.java, this appears to be the code that determines which characters count as digits:

        digitLoop:
        while (i < len) {
            c = in.charAt(i);
            if (c >= '1' && c <= '9') {
                digits[nDigits++] = c;
                nTrailZero = 0;
            } else if (c == '0') {
                digits[nDigits++] = c;
                nTrailZero++;
            } else if (c == '.') {
                if (decSeen) {
                    // already saw one ., this is the 2nd.
                    throw new NumberFormatException("multiple points");
                }
                decPt = i;
                if (signSeen) {
                    decPt -= 1;
                }
                decSeen = true;
            } else {
                break digitLoop;
            }
            i++;
        }

As you can see, there is no check for other scripts; only the ASCII digits 0-9 are accepted.
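
Because that loop only recognises ASCII digits, one possible workaround is to map every Unicode decimal digit to its ASCII equivalent before calling Float.parseFloat. The following is only a sketch under a few assumptions: the class AsciiDigitNormalizer and the method toAsciiDigits are hypothetical names, the input uses U+066B (ARABIC DECIMAL SEPARATOR) as its decimal point, and only characters in the Basic Multilingual Plane are handled.

```java
final class AsciiDigitNormalizer {

    /**
     * Hypothetical helper: replaces each Unicode decimal digit with its ASCII
     * counterpart and the Arabic decimal separator (U+066B) with '.'.
     * Works char by char, so digits outside the BMP are left untouched.
     */
    static String toAsciiDigits(String text) {
        StringBuilder sb = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isDigit(c)) {
                // Character.digit yields 0..9 for any decimal digit character.
                sb.append((char) ('0' + Character.digit(c, 10)));
            } else if (c == '\u066B') {
                sb.append('.');
            } else {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "\u06F1\u066B\u06F5"; // the text "۱٫۵" written with escapes
        System.out.println(Float.parseFloat(toAsciiDigits(input))); // prints 1.5
    }
}
```

The NumberFormat approach is probably the better choice when the locale is known; this variant may be handy when the script of the input is not known in advance.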

While stepping through Integer.valueOf execution,

public static int parseInt(String s, int radix)

executes with s = "۱" and radix = 10.

The parseInt method then calls Character.digit('۱',10) to get the digit value of 1.

See Character.digit()
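
To illustrate why going through Character.digit() makes the integer parser script-agnostic, here is a deliberately simplified sketch of the accumulation step. It is not the JDK's actual parseInt code: the class DigitAccumulationSketch and the method parseUnsigned are hypothetical, and sign handling and overflow checks are left out.

```java
final class DigitAccumulationSketch {

    /** Simplified illustration: non-negative values only, no overflow checks. */
    static int parseUnsigned(String s, int radix) {
        if (s.isEmpty()) {
            throw new NumberFormatException("empty string");
        }
        int result = 0;
        for (int i = 0; i < s.length(); i++) {
            // Character.digit returns the numeric value of the digit, or -1 if it is not one.
            int digit = Character.digit(s.charAt(i), radix);
            if (digit < 0) {
                throw new NumberFormatException("not a digit: " + s.charAt(i));
            }
            result = result * radix + digit;
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Character.digit('\u06F1', 10));            // prints 1
        System.out.println(parseUnsigned("\u06F1\u06F2\u06F3", 10));  // prints 123
    }
}
```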


The specification of Float.valueOf(String) says:

Leading and trailing whitespace characters in s are ignored. Whitespace is removed as if by the String.trim() method; that is, both ASCII space and control characters are removed. The rest of s should constitute a FloatValue as described by the lexical syntax rules:

FloatValue:
  Signopt NaN
  Signopt Infinity
  Signopt FloatingPointLiteral
  Signopt HexFloatingPointLiteral
  SignedInteger
...

The closest lexical rule to what you have is SignedInteger, which consists of an optional sign, and then Digits, which can only be 0-9.

Digits:
  Digit
  Digit [DigitsAndUnderscores] Digit

Digit:
  0
  NonZeroDigit

NonZeroDigit:
  (one of)
  1 2 3 4 5 6 7 8 9

On the other hand, Integer.valueOf(String) refers to Integer.parseInt(String), which simply says:

The characters in the string must all be decimal digits, except that the first character may be an ASCII minus sign

"Decimal digits" is broader than 0-9; anything in the DECIMAL_DIGIT_NUMBER can be used, for example "१२३" (shameless plug).

More precisely, .
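
As a quick check of the DECIMAL_DIGIT_NUMBER claim, this sketch tests the Unicode category of a few characters with Character.getType. The class DecimalDigitCategoryDemo and the helper isDecimalDigit are hypothetical names; the superscript two is included only as a counter-example of a numeric character that is not a decimal digit.

```java
final class DecimalDigitCategoryDemo {

    /** Hypothetical helper: true if the char is in the Unicode category Nd. */
    static boolean isDecimalDigit(char c) {
        return Character.getType(c) == Character.DECIMAL_DIGIT_NUMBER;
    }

    public static void main(String[] args) {
        String devanagari = "\u0967\u0968\u0969"; // Devanagari digits one, two, three

        System.out.println(isDecimalDigit('\u0967'));       // prints true
        System.out.println(Integer.parseInt(devanagari));   // prints 123

        // SUPERSCRIPT TWO (U+00B2) is numeric but not a decimal digit.
        System.out.println(isDecimalDigit('\u00B2'));       // prints false
        System.out.println(Character.digit('\u00B2', 10));  // prints -1
    }
}
```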


So, this is behaving as specified; whether you consider this to be a correct specification is another matter.