Kerberos is Easy – Part 1

I’m about to say two words that bring tears and frustration to application developers and administrators alike.  Are you ready?  Okay… Kerberos authentication.  There, I said it, and if that was not bad enough I’m going to say another phrase that will cause rage and a fit that makes Lewis Black look calm… Kerberos is easy.  That boom you hear was me dropping the mic; yeah, I said Kerberos was easy.  So if you’re still reading you are more than likely saying this guy is full of $%*! and you’d be somewhat right, but that’s a topic for another time.  Now, the reason I say Kerberos is easy is that most problems can be traced back to a fuzzy understanding of how APM authentication and Single Sign-On (SSO) work, or to one of the following common Kerberos issues:

  • Service Principal Name (SPN) issues
  • DNS issues
  • Time issues
In this blog series we’ll examine how F5 APM and Kerberos work together as well as how to troubleshoot the three most common issues that plague Kerberos implementations.  So let’s get started.

APM and Kerberos

When it comes to Kerberos and F5 Access Policy Manager (APM), the two pieces of advice below will save you a lot of time and hopefully some hair; for me it’s too late… Kerberos took the best of me a long time ago.

  1. Client authentication is completely separate from server authentication
  2. For the love of all that is holy please don’t troubleshoot client authentication and server authentication issues at the same time

Authentication with a Full Proxy

So if you’ve ever seen me present, I typically pound the drum over and over again about how F5 is a “dual-stack full proxy”, which is fancy marketing speak for two completely separate TCP stacks.  What I mean by this is that the client’s TCP connection is completely separate from the server’s. This gives the F5 a lot of power but it also confuses many administrators who are new to the BIG-IP platform.

[Figure: Full proxy]

So for authentication this means that auth requests to a website do not just “pass-through” the F5 to the server.  The user authenticates to the F5 and then the F5 authenticates to the server. When we’re dealing with Kerberos this means that Kerberos can refer to two authentication options:

  • client authentication to the F5
  • server authentication from the F5 using Kerberos Constrained Delegation

Now we typically refer to the client side as AAA authentication and the server side as Single Sign-On. If you’re new to this I highly recommend you check out Brett Smith’s Single Sign On (SSO) using Kerberos post on DevCentral.  This will walk you through how to configure the APM AAA object and how to configure the Kerberos SSO object.

Break The Issue Down

Once you’ve set up the Kerberos AAA authentication and the Kerberos SSO you’ll typically fire up a browser and try to authenticate against APM and expect to see your protected website.  Now, if you’re one of the lucky few this will just work and fireworks will start going off in the distance as a parade in your honor proceeds past the water cooler.  For the rest of us this will fail in epic fashion and we’ll start channeling Lewis Black again as we yell profanity at our computer; I promise you that 1/100 times this will make things work so I strongly encourage this practice.  At this point take a deep breath and start breaking the problem into bite-size pieces that we can address.

1. Did APM authenticate the user correctly?

If you look under Manage Sessions in the APM menu and you see a green ball with your username in the table then Kerberos AAA is working.  If not, we now know what to focus on and we can start troubleshooting this aspect of the APM policy without worrying about the Kerberos SSO, which you should ignore until this step works.

2. Did your browser present a pop-up asking for username/password?

This is typically a telltale sign that Kerberos SSO did not work.  Now, you can troubleshoot this with your APM policy as is, but I like to build a separate APM policy that only deals with Kerberos SSO and does not perform client authentication. This allows me to quickly progress through multiple testing iterations without constantly logging in (keep in mind that Kerberos SSO can be used with any form of client authentication from SAML to Forms Auth to Basic).  I credit Kevin Stewart for this trick and it has greatly reduced the time it takes for me to solve Kerberos issues.  So what does this policy look like?

[Figure: KCD-VPE variable-assign]

Pretty simple: you only need a Variable Assign that sets the APM session variables the Kerberos SSO configuration is looking for:

  • session.logon.last.domain
  • session.sso.token.last.username

Up Next

In the next post we’ll walk through the steps you’d take to troubleshoot Kerberos authentication and then we’ll tackle Kerberos SSO.

Featured image courtesy of renee.

APM Troubleshooting with ADTest

Overview

When I first started working in IT it drove me crazy when users would verify that their Internet connection was working by opening a browser and trying to get to Google.  Ideally they should have used ping and worked their way outward, pinging their gateway, then their exit router and then a public DNS server, to determine if their Internet connection was working. Yeah, right!

Well, I have this same feeling when someone who’s new to APM configures AD authentication and then immediately opens a browser and tries to authenticate to an APM application only to find it doesn’t work.  The problem with this approach is that the APM login page will give you very little data as to why this did not work.  A better method is to use a tool on the F5 that can test AD authentication and confirm that your network, DNS and firewall settings are all correct, so AD authentication stands a good chance of being successful.  Such a tool exists via the CLI and it’s called adtest.

I’ve mentioned adtest before in a few of my posts but I’d like to give you a little more insight into how I use this tool.  When working with Active Directory there are a few things you need to know regarding how APM performs authentication and data retrieval.  When using the Active Directory AAA object, APM will use Kerberos for authentication and LDAP for AD Query processes.  APM authenticates the user via their credentials and not via the service account configured in the AD AAA object.  So for Kerberos authentication to work you must have DNS configured correctly on your BIG-IP and you need to ensure the BIG-IP can reach port 88 on the Active Directory Domain Controllers.  Note that adtest does not use the data in your AD AAA object; instead, you enter these options yourself on the CLI when you run the adtest command.

So let’s get down to it:

adtest -t auth -d 10 -r f5guru.com -u gututest

So what does this command do?

  • -t auth tells adtest that we’ll be authenticating against AD versus querying.
  • -d 10 sets the debug level to its highest setting so we can see all errors that may occur
  • -r defines the realm we’ll be authenticating against
  • -u defines the username we’re authenticating with

So what can go wrong when using this command?

1. DNS not configured or misconfigured

If adtest states that it can’t find the defined realm then either the BIG-IP DNS configuration is not correct or your DNS infrastructure does not have the correct service records to point Kerberos clients to the correct KDC.  If your DNS infrastructure is missing the correct service records you can use the -h flag to specify the Active Directory domain controller’s name.

adtest -t auth -d 10 -h ad01.f5guru.com -r f5guru.com -u gututest
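Before reaching for -h, it can help to see what DNS actually publishes for the realm. The sketch below is my own illustration, not something adtest ships with: the function names kdc_srv_name and lookup_kdcs are made up, and it assumes dig is available on the box and that the realm name matches the AD DNS domain. `_kerberos._tcp.<realm>` is the standard SRV name for locating a KDC; whether adtest queries exactly that record is an assumption on my part.

```shell
#!/bin/sh
# Hypothetical helpers (not part of adtest): inspect the Kerberos SRV record.

# Build the standard KDC SRV record name for a realm.
# Assumption: the Kerberos realm matches the AD DNS domain name.
kdc_srv_name() {
    realm_lc=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
    printf '_kerberos._tcp.%s\n' "$realm_lc"
}

# Ask DNS for the record. An empty answer means no KDC is published
# under that name, which is when the -h flag becomes useful.
lookup_kdcs() {
    dig +short SRV "$(kdc_srv_name "$1")"
}

# Example, run from the BIG-IP CLI:
#   lookup_kdcs f5guru.com
```

If the lookup comes back empty, that points at the DNS side rather than at your credentials.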

2. Firewall Issues

If the F5 is not on the same Layer 2 network as the preferred Active Directory Domain Controller then there is a good chance our Kerberos request will traverse a firewall and/or IPS solution.  While adtest does not have specific error messages that would indicate a firewall issue exists, you would probably start to suspect one once you’ve verified that your routing, DNS and test credentials are all correct.  A definitive way to prove this is to take a packet capture via the BIG-IP CLI.  An easy way to capture and review traffic on the BIG-IP CLI is with tshark (on TMOS 11.3 and higher).  The command below listens for all Kerberos traffic:

tshark -i 0.0 -d tcp.port==88,kerberos -R kerberos -nVXs0

Note: tshark will create temporary files in /tmp/ that will need to be deleted once you’re done.  Otherwise you may fill up your BIG-IP disk and cause bigger problems than authentication not working.

3. adtest Works but APM AD AAA Does Not

While this can be caused by many things, I typically see it boil down to the following issues:

  1. APM AD AAA configuration does not match adtest CLI arguments
  2. APM AD AAA is using a pool of AD servers

Obviously the first issue can be fixed by re-entering your entire APM AD AAA configuration object. I know this seems silly but it happens more than you’d think.  The second issue stems from the fact that adtest may not perform authentication against the same Active Directory Domain Controllers you’ve specified in your APM AD AAA configuration.  When adtest uses the -r flag it queries DNS to obtain a KDC server for authentication.  An easy way to ensure the Active Directory Domain Controllers you’ve selected work with APM is to perform an adtest against each domain controller using the -h flag:

adtest -t auth -d 10 -h ad01.f5guru.com -r f5guru.com -u gututest

adtest -t auth -d 10 -h ad02.f5guru.com -r f5guru.com -u gututest

adtest -t auth -d 10 -h ad03.f5guru.com -r f5guru.com -u gututest
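With more than a handful of domain controllers, typing these out by hand gets old. Here is a minimal sketch of a helper that generates one command per controller; adtest_per_dc is a name I made up, and it only prints the commands (using the same flags shown above) so you can review them and paste them in one at a time.

```shell
#!/bin/sh
# Hypothetical helper: print one adtest command per domain controller.
# Usage: adtest_per_dc REALM USER DC [DC...]
adtest_per_dc() {
    realm=$1
    user=$2
    shift 2
    # Everything left in "$@" is a domain controller hostname.
    for dc in "$@"; do
        printf 'adtest -t auth -d 10 -h %s -r %s -u %s\n' "$dc" "$realm" "$user"
    done
}

# Same realm, user and controllers as the examples above.
adtest_per_dc f5guru.com gututest \
    ad01.f5guru.com ad02.f5guru.com ad03.f5guru.com
```

Printing rather than executing is a deliberate choice here: it keeps each run separate, so it stays obvious which controller produced which debug output.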

I hope this post helps give you an idea of how I use this tool and I’ll keep updating this page as I find new ways to troubleshoot with adtest.

BIG-IP Troubleshooting 101

When you work with any technology there comes a point where the “it’s a black box” approach is no longer valid and you have to dig in a little deeper and understand how the product works. With F5 BIG-IP this means understanding how traffic flows through the appliance and how to monitor and watch it.

TMOS – Client and Server Traffic
The F5 BIG-IP Traffic Management Operating System (TMOS) is a dual-stack full proxy, which means the client terminates its TCP connection at the BIG-IP and the BIG-IP then makes a new TCP connection to the backend server. So as far as the client is concerned the F5 is the server, and as far as the backend server is concerned the F5 is the client. So when you are troubleshooting it is important to understand there will always be client-side traffic and server-side traffic. The names are pretty self-explanatory but can be misleading in the troubleshooting process. What I mean by this is that you’ll more than likely need to look at both client-side and server-side traffic to gain a better understanding of how the application behaves.

LTM Monitors
LTM monitors are used to evaluate the health of a pool member or a node. They run at a set interval (5 seconds by default for most monitor types) and will mark a pool member or node down when the timeout expires without a successful check (16 seconds by default, which is three intervals plus one second; both settings are configurable). If you are unsure why a monitor is failing the first place to look is the Local Traffic Manager logs. These logs are accessible via the GUI (System -> Logs) or the CLI (less /var/log/ltm) and will give you some basic information such as:
– when the resource was marked offline
– if the resource is flapping

The LTM log will not, however, tell you why the monitor failed. To determine this you typically need to run a synthetic request using a CLI-based tool such as curl or the TMSH monitor test command. Please note: if you cannot access the application using these steps the F5 is probably not at fault, no matter how much the application owner swears everything works on the server 🙂
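As a rough sketch of what a synthetic request with curl can look like: the helper below (http_status is a name I made up, and the address and host name in the example are placeholders) sends a GET with an explicit Host header and prints only the HTTP status code. It assumes the monitored service speaks plain HTTP; an HTTPS pool member would need a different URL scheme.

```shell
#!/bin/sh
# Hypothetical helper: print the HTTP status code for a monitor-style GET.
# Usage: http_status IP PORT HOST_HEADER [PATH]
http_status() {
    # -s             suppress the progress meter
    # -o /dev/null   discard the response body
    # --max-time 5   give up after five seconds
    # -w             print only the status code when the transfer ends
    curl -s -o /dev/null --max-time 5 \
        -H "Host: $3" \
        -w '%{http_code}\n' \
        "http://$1:$2${4:-/}"
}

# Example (placeholder address and host name):
#   http_status 192.0.2.10 80 www.example.com /
```

A code of 000 means curl never got an HTTP response at all, which points toward the network rather than the application.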

If this is a new application deployment I typically see monitor failures resulting from:
– Networking/firewall issues
  – Does the BIG-IP have a Self-IP on the same network as the server?
  – If not, does the BIG-IP have a route to that network?
– Application issues
  – Is the web server using name-based virtual hosts? If so, what HTTP Host header is it expecting?
  – Does the host OS have a firewall installed/configured?

If this is an existing application you need to answer the tried and true “what changed” question. In these scenarios I typically work my way down this checklist to see where the problem lies:
– can I ping the server?
– can I telnet to the port?
– can I run a synthetic request using a CLI tool like curl?
– does the web server respond with the correct website? (This matters if you’re using customized HTTP monitors, which I highly recommend.)
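The checklist above can be strung together into one function that stops at the first failing step. This is only a sketch under a few assumptions: check_server is a hypothetical name, nc stands in for the manual telnet test (do that step by hand if nc is not on your system), the service is plain HTTP, and the expected text is whatever string your own HTTP monitor looks for.

```shell
#!/bin/sh
# Hypothetical helper: walk the checklist in order, report the first failure.
# Usage: check_server IP PORT HOST_HEADER EXPECTED_TEXT
check_server() {
    ip=$1
    port=$2
    host=$3
    expect=$4
    url="http://$ip:$port/"

    # Step 1: can I ping the server?
    ping -c 3 "$ip" >/dev/null 2>&1 ||
        { echo "FAIL: cannot ping $ip"; return 1; }

    # Step 2: can I connect to the port? nc -z opens and closes the TCP
    # connection without sending data (assumed stand-in for telnet).
    nc -z -w 3 "$ip" "$port" >/dev/null 2>&1 ||
        { echo "FAIL: cannot connect to $ip:$port"; return 1; }

    # Step 3: does a synthetic request get an answer?
    body=$(curl -s --max-time 5 -H "Host: $host" "$url") ||
        { echo "FAIL: no HTTP response from $url"; return 1; }

    # Step 4: is it the correct website?
    printf '%s' "$body" | grep -qF -- "$expect" ||
        { echo "FAIL: response does not contain '$expect'"; return 1; }

    echo "OK: $url answered with the expected content"
}

# Example (placeholder values):
#   check_server 192.0.2.10 80 www.example.com "Welcome"
```

Stopping at the first failure is the point: a server that cannot be pinged is a different conversation from one that returns the wrong page.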

These steps will usually lead me to the underlying issue or point me to the team that manages the device/server with the issue.

I will follow up with an additional post that provides examples of synthetic transactions as well as a 201 post regarding APM troubleshooting and ASM troubleshooting.