Authentication via RADIUS : MSCHAPv2 Error 691
Wouldn't you know it: about a month after we abandoned the idea of using a RADIUS server to control access to the SBC, we found a possible solution. Of course, at this point we are too busy with other projects to go back and try it, so I can't say with 100% certainty that it fixes the problem, but once I explain I think you will agree this is the answer.
One of my colleagues was at a Microsoft conference when it dawned on him that MSCHAPv2 relies on NTLM to generate the password challenges and responses. Plain old MSCHAP and MSCHAPv2 (i.e. not EAP-MSCHAPv2 or PEAP), when used by Windows RAS services, will use NTLMv1 by default.
As many of you have probably already guessed, we, like many administrators, have disabled NTLMv1 on our DCs, so the DCs will only accept NTLMv2 requests. This explains why the failure I kept getting was a "bad password" error: the password response being sent to the DCs was in NTLMv1 format and was being rejected.
Once we realized this, I was able to do some more research and I found the following article:
This article describes the same behavior I was experiencing, including the E=691 error code I have mentioned. It also provides a workaround to force RAS services to use NTLMv2 when building an MSCHAPv2 response. Funny how easy it is to find these articles once you know precisely what the issue is.
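For anyone trying to spot the same failure in their own logs or captures, here is a minimal Python sketch that picks apart an MS-CHAPv2 failure string in the "E=... R=... C=... V=... M=..." layout defined in RFC 2759. The function name `parse_mschap_failure` and the sample string are hypothetical and made up for illustration; they are not taken from the SBC or from the article. Only the error-code names come from the RFC.

```python
import re

# Failure codes listed for the MS-CHAPv2 Failure packet in RFC 2759.
MSCHAP_ERRORS = {
    646: "ERROR_RESTRICTED_LOGON_HOURS",
    647: "ERROR_ACCT_DISABLED",
    648: "ERROR_PASSWD_EXPIRED",
    649: "ERROR_NO_DIALIN_PERMISSION",
    691: "ERROR_AUTHENTICATION_FAILURE",
    709: "ERROR_CHANGING_PASSWORD",
}


def parse_mschap_failure(message):
    """Parse 'E=691 R=1 C=<hex> V=3 M=<text>' into a small dict.

    Hypothetical helper for illustration only.
    """
    # M= is free text and runs to the end of the string, so split it off first.
    head, sep, text = message.partition(" M=")
    fields = dict(re.findall(r"\b([ERCV])=(\S+)", head))
    if "E" not in fields:
        raise ValueError("no E= field in failure message")
    code = int(fields["E"])
    return {
        "code": code,
        "name": MSCHAP_ERRORS.get(code, "unknown"),
        "retry_allowed": fields.get("R") == "1",
        "challenge": fields.get("C"),
        "version": fields.get("V"),
        "message": text if sep else None,
    }


if __name__ == "__main__":
    # Made-up sample string, not copied from a real capture.
    sample = "E=691 R=1 C=00112233445566778899AABBCCDDEEFF V=3 M=Authentication failed"
    print(parse_mschap_failure(sample))
```

Note that E=691 only says authentication failed. It does not by itself say whether the cause is a wrong password or an NTLMv1 response being refused by the DC, which is why this one was so hard to track down.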
Again, I have yet to verify that this is the issue, as I haven't had time to rebuild the RADIUS servers I already decommissioned, but I do plan on trying this out if I get a chance. I just wanted to post this possible solution in case someone else stumbles across the same issue. If anyone can confirm this before I do, please let me know.
Edit - Confirmed
We were asked to take on a bigger role with these SBCs, so we came back to this project and brought up a Windows RADIUS server again. This time we applied the registry key described in the link above. After taking a packet capture of the communication between the RADIUS server and the SBCs, I can now see "Access Accept" messages being sent back to the SBCs. So I can confirm that, at least in our scenario, the issue we were having is as described above.
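If you want to check a capture the same way without reading every packet by eye, the RADIUS reply type is the first byte of the UDP payload. Below is a rough Python sketch that decodes the 20-byte RADIUS header laid out in RFC 2865. The function name `radius_header` and the sample bytes are hypothetical; getting the raw payload out of your capture tool is assumed and not shown.

```python
import struct

# Packet type codes from RFC 2865 (only the ones relevant to access control).
RADIUS_CODES = {
    1: "Access-Request",
    2: "Access-Accept",
    3: "Access-Reject",
    11: "Access-Challenge",
}


def radius_header(payload):
    """Decode the fixed RADIUS header from a raw UDP payload.

    Hypothetical helper for illustration only.
    """
    if len(payload) < 20:
        raise ValueError("RADIUS header is 20 bytes; payload is too short")
    # Code (1 byte), Identifier (1 byte), Length (2 bytes, big-endian),
    # Authenticator (16 bytes).
    code, identifier, length, authenticator = struct.unpack("!BBH16s", payload[:20])
    return {
        "code": code,
        "name": RADIUS_CODES.get(code, "other"),
        "identifier": identifier,
        "length": length,
        "authenticator": authenticator.hex(),
    }


if __name__ == "__main__":
    # Made-up 20-byte packet: code 2, identifier 7, length 20, zeroed authenticator.
    sample = bytes([2, 7]) + (20).to_bytes(2, "big") + bytes(16)
    print(radius_header(sample)["name"])
```

Before the fix, the replies to the SBC would be expected to decode as Access-Reject (code 3); after it, they should decode as Access-Accept (code 2).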
NTLMv1 was disabled on the DCs. Setting a registry key to force the RADIUS server to use NTLMv2 fixed the issue.