When last we saw our heroes, they were battling Hadoop authentication in DPaaS. But our intrepid knights were able to overcome that crafty foe and move on to the next obstacle: authorization. And that’s where our story continues…
Authorization is the control of access to resources or actions for an entity that’s requesting them. Implementing a proper authorization layer within our system is what will give our users the peace of mind that they are in control of their data & processes and are properly isolated from one another. As the operators of the system, we gain the same peace of mind and more. We get more control over the use of our finite resources. Better yet, we can take ourselves out of the critical path of our users’ needs — more automatic security means more self-service. That’s a big win-win.
Which systems need authorization?
Just like authentication, the answer here is necessarily “all of them!” again. Fortunately our TechOps team has already borne a large part of the authorization burden. We already have tight controls on who can log into which systems on our network, so we could really focus our attention on the authorization aspect of the DPaaS system that we built and maintain. We tackled three critical resources:
- HDFS files
- DPaaS jobs
- Yarn applications
Who’s doing what now?
Just knowing which systems need authorization wasn’t enough. We needed a paradigm for how we were going to give users access to these resources. That meant identifying entities that have similar access needs — i.e. groups of users — and identifying what kind of access they need — i.e. read and/or write.
As with any medium to large organization, we already have plenty of groupings of AppNexus employees. Technically we maintain those groupings in an HR database and they’re made available for query via LDAP. An important feature of our system was twofold:
- We didn’t want to incur operational burden as a result of AppNexians being added or moved between teams.
- We didn’t want to be affected by any re-organizations.
As such, we decided to create our own distinct DPaaS groups that are linked to feature sets. The groups would still be in LDAP (more on that later), and populated by HR groups. For example, if we have a feature set that is all about video, we would create a
dpaas_video group. If the people who are initially working on video features are in HR group Portland Team X then we’d map that HR group to the DPaaS group. If later there’s a dedicated team related to video, then it’d simply be a matter of mapping the new HR team to the existing DPaaS team, and we wouldn’t have to deal with anything at the individual employee level.
Read or write?
We decided to implement some standard best practices that any solid production system should.
- Humans shouldn’t have direct write access to production data.
- Only peer-reviewed code that’s been deployed through standard (and audited) practices should ever manipulate production data directly.
- Only those people who own a given feature set should have access to deploy code for that feature.
To continue the example from above, the
dpaas_video group would have a bunch of humans in it, and there would be a parallel
dpaas_video_rw group which only has a non-human (system) user in it. That system user (e.g.
dpaas_video_user) gets write access to the data that is owned by the
dpaas_video team. This system user will exist in LDAP, be a member of the corresponding
dpaas_video_rw group, but won’t have any credentials, so no one can actually “log in” as this user. More on this last bit later.
So let’s bring the who and what together…
To implement authorization on HDFS we used the ACLs that are provided natively with it. We evaluated third party tools such as Apache Ranger and Apache Sentry, but found them to be too nascent for us to adopt. The built-in features already available were rich enough for us for the time being. (For reference, the details can be found here).
There were two pieces needed: HDFS NameNodes needed to get group membership information from somewhere and then the right ACLs had to be applied to HDFS directories.
Hadoop has a built-in mechanism for ascertaining a user’s group mappings. You can go to any Hadoop installation and type
hdfs groups and it’ll tell you what groups Hadoop thinks you belong to. The mechanism is controlled by the
hadoop.security.group.mapping configuration parameter. The two main choices for where Hadoop can get group information are from either LDAP or from the Linux shell itself. There is also a very convenient
org.apache.hadoop.security.CompositeGroupsMapping which allows you to define an array of mapping providers. This is what we used because we still had some legacy groups that were defined on the Linux boxes themselves that we didn’t want to disturb during development and deployment.
I won’t go into a lot of detail here as to how we configured the group mapping; there’s pretty good information out there about the various mapping providers. One little detail worth mentioning is that within LDAP we created a new “organizational unit” (i.e.
dpaas. That way we could namespace the groups that had to do with our application, and not pick up all the groups that are defined in LDAP for a given user, of which there could be very many depending on your organization.
1 2 3 4 5
So once the group-mapping-provider’s configuration was deployed we had something like this (continuing the example from above):
1 2 3
We needed to set the ACLs on the HDFS directories to achieve our authorization goals. The salient portions look something like this:
1 2 3 4 5 6
This means that the
dpaas_video team owns the directory (that’s the group with humans in it), but they don’t have write privileges. Only members of the
dpaas_video_user: a non-human system account) team can write.
ACLs are set by running:
hdfs dfs -setfacl --set user::rwx,group::r-x,group:dpaas_video_rw:rwx,... /video
All of this is one-time setup which we wrote scripts for.
A DPaaS “job” is some complete unit of work that runs on our system. We’ve written a workflow engine which kicks off jobs when they’re ready to be run. The picture below is a tiny excerpt of our complete dependency graph. Each node is a job that has dependencies upstream and dependents downstream.
Since DPaaS jobs are our own invention, we needed our own implementation of authorization. Each job has an explicit group ownership (e.g.
dpaas_video). Only humans in a given group should create/edit/delete jobs for that group. The DPaaS job API would talk to a separate web service that assists with authorization (and authentication) to ascertain the group membership for the user making a request. If the user is a member of the group, then they’re authorized to operate on the job owned by that group.
There are two aspects to authorization of jobs running on the cluster. First is managing the user with which the job runs on the cluster, and second is access to compute resources on the cluster.
The DPaaS jobs discussed above each represent some number of Yarn applications that will run on our cluster. Each Yarn application has to have a user associated with it — the user it’s being run as. We’d run the applications as the system user associated with the group that owns the job. This is the same user that’d be in the read/write group with write privileges to the HDFS directory owned by that group.
But we said that the system user won’t have any credentials and won’t be able to “log in” in any way. So how did we accomplish this? We used Hadoop’s facility for proxying users. We created a DPaaS system super-user that has proper Kerberos credentials and is also authorized to proxy any other user. In
core-site.xml you can define the following configuration parameters:
1 2 3 4 5 6 7 8
This means that
dpaas_superuser can now run any Hadoop API as any other user — this was discussed in some detail in part 1 of this series.
Putting it all together
So here’s the whole workflow from end-to-end with all the authorization pieces in place. We’ll continue the example users and groups from above:
- The user (
bob_in_portland) defines a job in the DPaaS UI and sets the job ownership to
dpaas_video, the run-as user to
dpaas_video_user, and hits “save”.
- The UI makes the appropriate call to the API service which calls the Auth service to ask whether
bob_in_portlandis a member of
- The Auth service looks up in LDAP and returns all the
bob_in_portlandis a member of.
bob_in_portlandis a member of
dpaas_videoso he may create a job owned by that team. We also have a mapping that maintains that the
dpaas_video_useris a valid run-as user for the
dpaas_videoteam’s jobs, so that also checks out.
- The job is created in the database.
- The workflow engine picks up the job at the appropriate time and executes it on Hadoop. The workflow engine runs as the
dpaas_superuserwhich then proxies as the run-as user set in the job definition (
- The job attempts to write its output somewhere under the
- The HDFS NameNode sees that write and queries LDAP to see what groups the
dpaas_video_useris a member of — it gets back
dpaas_video_rwis authorized to write to the
/videodirectory, so the job runs successfully.
That’s it — end-to-end!
Building a super useful and unique system within a large organization is a dual-edged blade. Of course it’s great to make something that’s awesome and garners your team a bunch of respect, but then other people want access to it. This is often the next step in many successful systems’ lifetimes. Going from a closed and insular “black box” to an open user-facing service is a tricky endeavor. Security is a critical step on that path, and we are happy to share some of the challenges we faced and choices we made along the way.
Authentication bonus material — Environment Isolation
One fun piece of the authentication puzzle that we didn’t tackle in part 1 is environment isolation. Hadoop has a few built-in power users for its subsystems:
yarn, and others. The issue is that the
hdfs user you’ve deployed to your test environment is just as powerful as the
hdfs user you deployed to production. But you definitely don’t want test environment users to be used in prod. How do you get around this?
Well… you could create different users for each environment —
hdfs_prod, etc. but that’s a lot more users to manage, it’s not really “native” to what Hadoop expects for its user name, and would incur a lot of custom config at best. At worst it might not even work, which is why we didn’t go down this path.
Hadoop has a facility for translating your Kerberos principals (i.e. your authenticated usernames) to the local username — the config parameter
hadoop.security.auth_to_local (documentation can be found here). We know that the environment string appears in the hostname portion of the system user’s principal — i.e. it’s
hdfs/namenode1.prod.appnexus.com@APPNEXUSDOMAIN.COM — so we can filter that out per environment. We did something like this:
1 2 3 4 5 6 7 8 9 10 11
Following the document linked above — this rule says that if the principal has 2 parts, then check for “prod.appnexus.com” and if it exists, just remove that hostname. The only other rule available is for one part principals, which is what humans’ principals look like. Since there are no other rules, this will fail to find a match, and thus it won’t authenticate.
Let’s do a few examples:
- Bob from our examples above has Kerberos principal
bob_in_portland@APPNEXUSDOMAIN.COM. When he makes a call to the NameNode his principal will fall into the second rule because there is only 1 field in the non-domain part of his principal, i.e.
bob_in_portland. Therefore we take it as-is.
hdfssystem user from our backup production NameNode is making a call to the primary. It uses its principal
hdfs/namenode2.prod.appnexus.com@APPNEXUSDOMAIN.COM. This falls into the first category because it has 2 fields in the non-domain part, i.e.
namenode2.prod.appnexus.com. It takes these two fields and concatenates them together with the
@symbol, so we now have
firstname.lastname@example.org(the domain’s been stripped off). This is passed through the regular expression and matches. Then it can be passed into the
sedexpression which pulls off the
hdfspart in the one capture group, and that’s the username implied by the original principal.
hdfssystem user from our dev environment tries to make a call to the production NameNode. It uses
hdfs/namenode2.dev.appnexus.com@APPNEXUSDOMAIN.COM. Again this is the first category and pulls out
email@example.com. This doesn’t match the regex, so the rules engine moves on to the next. The next rule doesn’t match either because the principal has 2 parts, and thus no rules match. In the end that principal can’t authenticate with this NameNode, which is precisely what we want.