DUDE! Where’s My Auth Request? Adventures in RADIUS Load Balancing

June 16th, 2019

Environment:

Greenfield multi-tiered Citrix ADC deployment, Citrix Gateways on processing tier configured with multi-factor authentication with RADIUS.

Products:

  • Citrix ADC (NetScaler) MPX 8905 running 12.1 b52.15
  • PingID via RADIUS

Issue:

This particular issue has come up a few times now in my configs across various RADIUS-capable authentication products. Essentially, when setting up RADIUS on the ADC, authentication servers configured to authenticate directly against the RADIUS server work fine in testing, but the same authentication pointed at an LBVIP for RADIUS fails with no response. The authentication attempt flat out disappears, with no discernible error in the flow when analyzing dumps.

I’ve seen this occur on both routable VIPs owned by the ADC and APIPA VIP addresses, but have only managed to sort out the resolution in this recent APIPA use case. In other instances where routable IPs were used for the VIP (appliances running 11.1, if it’s of interest), the authentication server could authenticate against another routable VIP for the same RADIUS servers hosted on another Citrix ADC, but not against its own VIP with the exact same config.

The RADIUS VIP was created following Citrix guidance, ensuring the lbmethod uses the proper token and rule configuration to load balance the servers correctly without introducing errors similar to what was being experienced.

add lb vserver lbvs_pingID_RADIUS_1812 RADIUS 169.254.100.4 1812 -lbmethod TOKEN -rule CLIENT.UDP.RADIUS.USERNAME

Firewall rules were of course checked, and traffic from the UDP-ECV monitors was seen traversing the firewall successfully from the SNIP to the back-end server. The RADIUS server, however, never received an authentication request at all. As is often the case, the customer and I were assuming a network issue beyond the ADC, but things just weren’t adding up.

After running variations of traces via nstcpdump.sh, including nstcpdump.sh port 1812, we could see monitor traffic leaving the SNIP. However, when testing RADIUS using the “Test RADIUS Reachability” function on the authentication server, we saw only the authentication request from the NSIP toward the APIPA LBVIP for RADIUS and nothing else after it. It would appear that, at the application level, the traffic flow simply dies off between client and VIP.
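To make the missing leg easier to spot, the trace can be narrowed to the VIP itself. This is only a sketch: it assumes nstcpdump.sh passes a standard tcpdump-style filter expression through, and it reuses the VIP address from the vserver definition above.

```
# Run from the appliance shell (type "shell" at the CLI prompt first).
# Shows only RADIUS traffic to or from the APIPA LBVIP.
nstcpdump.sh host 169.254.100.4 and port 1812
```

In the broken state you would expect to see only the NSIP-to-VIP request here, with no matching SNIP-to-server packet in the broader port 1812 trace.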

Root Cause & Resolution:

Root cause is a bit challenging to determine. I was unable to find rhyme or reason for this behaviour, as LDAP APIPA VIPs on the same appliance work perfectly fine when brokering authentication. This matter seemed to be unique to RADIUS. So I’m chalking this up to Citrix ADC being… Citrix ADC.

After exhausting multiple avenues of troubleshooting, including tweaking our management traffic PBR, toggling MBF on and then off (because MBF is a routing crutch and has few solid use cases), creating a PBR to ensure the SNIP sourced the requests to those back-end servers, adding an APIPA SNIP, creating a new VIP entirely, and toggling various load balancing features, the final resolution lay in creating a net profile that forces traffic to source from the SNIP and binding it to the LBVIP.

Once this net profile was created under System > Network > Net profile and bound to the LBVIP, our authentication requests actually went through the rest of the traffic flow we’d expect (NSIP -> VIP -> SNIP -> RADIUS SRV, and then back again for the response) instead of completely disappearing into the ether. When testing authentication actions, the trace from nstcpdump.sh port 1812 on the appliance now exhibits the expected flow.
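For those who prefer the CLI over the GUI, the same change can be sketched roughly as follows. The profile name np_radius_snip and the address 10.0.0.10 are placeholders I made up for illustration; substitute the SNIP that can actually reach your RADIUS servers. The vserver name is the one from the config above.

```
# Placeholder name and SNIP address; replace with your own values.
add netProfile np_radius_snip -srcIP 10.0.0.10

# Bind the net profile to the RADIUS load balancing vserver.
set lb vserver lbvs_pingID_RADIUS_1812 -netProfile np_radius_snip

# Confirm the profile exists and the vserver references it.
show netProfile np_radius_snip
show lb vserver lbvs_pingID_RADIUS_1812
```

Treat this as a starting point rather than a verified config for your build. The idea is simply that the vserver no longer picks its own source address for back-end traffic and instead uses the SNIP named in the profile.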

Hope this helps others!

 

 
