
· 8 min read
Jens Langhammer

authentik is an open source Identity Provider that unifies your identity needs into a single platform, replacing Okta, Active Directory, and Auth0. Authentik Security is a public benefit company building on top of the open source project.


There are two ways to judge an application:

  1. Does it do what it’s supposed to do?
  2. Is it easy to run?

This post is about the second.

Using containers is not a best practice in itself. As an infrastructure engineer by background, I’m pretty opinionated about how to set up containers properly. Doing things the “right” way makes things easier not just for you, but for your users as well.

Below are some common mistakes that I see beginners make with containers:

  1. Using one container per application
  2. Installing things at runtime
  3. Writing logs to files instead of stdout

Mistake #1: One container per application

There tend to be two mindsets when setting up containers:

  • The inexperienced usually think 1 container = 1 application
  • The more experienced think 1 container = 1 service

Your application usually consists of multiple services, and to my mind these should always be separated into their own containers (in keeping with the Single Responsibility Principle).

For example, authentik consists of four components (services):

  • Server
  • Worker
  • Database
  • Cache

With our deployment, that means you get four different containers because they each run one of those four services.
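For example, a stripped-down Docker Compose file for this kind of layout might look like the sketch below. It is not authentik's actual compose file; the image names and settings are illustrative only.

```yaml
# One container per service: each process gets its own container,
# so each can be scaled, scheduled, and restarted independently.
services:
  server:
    image: example/app-server:2023.10.1
  worker:
    image: example/app-server:2023.10.1
    command: worker
  postgresql:
    image: postgres:16-alpine
  redis:
    image: redis:7-alpine
```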

Why you should use one container per service

At the point where you need to scale, or need High Availability, having different processes in separate containers enables horizontal scaling. Because of how authentik deploys, if we need to handle more traffic we can scale the server component up to 50 instances, rather than having to scale up everything. This wouldn’t work if all those components were bundled together.

Additionally, if you’re using a container orchestrator (whether that’s Kubernetes or something simpler like Docker Compose) and everything is bundled together, the orchestrator can’t distinguish between components because they’re all inside the black box of your container.

Say you want to start up processes in a specific order. This isn’t possible if they’re in a single container (unless you rebuild the entire image). If those processes are separate, you can just tell Docker Compose to start them up in the order you want, or you can run specific components on specific servers.
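Here is a hedged sketch of what that ordering looks like in Docker Compose; the service and image names are illustrative, and `condition: service_healthy` requires a healthcheck on the dependency:

```yaml
services:
  postgresql:
    image: postgres:16-alpine
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U postgres"]
      interval: 5s
      retries: 10
  server:
    image: example/app-server:2023.10.1
    depends_on:
      postgresql:
        condition: service_healthy  # don't start the server until the database is ready
```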

Of course, your application architecture and deployment model need to support this setup, which is why it’s critical to think about these things when you’re starting out. If you’re reading this and thinking, I have a small-scale, hobby project, this doesn’t apply to me—let me put it this way: you will never regret setting things up the “right” way. It’s not going to come back to bite you if your situation changes later. It also gives users who install the application a lot more freedom and flexibility in how they want to run it.

Mistake #2: Installing things at runtime

Your container image should be complete in itself: it should contain all code and dependencies—everything it needs to run. This is the point of a container—it’s self-contained.

I’ve seen people set up their container to download an application from the vendor and install it into the container on startup. While this does work, what happens if you don’t have internet access? What if the vendor has shut down and that URL now points to a malicious bit of code?

If you have 100 instances downloading files at startup (or end up scaling to that point), this can lead to rate limiting, failed downloads, or your internet connection getting saturated—it’s just inefficient and causes problems that can be avoided.
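The fix is to move the download into the image build, so every container starts from an image that already contains everything it needs. A minimal sketch, with an illustrative URL and paths:

```dockerfile
FROM debian:12-slim
RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates curl \
    && rm -rf /var/lib/apt/lists/*
# Fetch the application once, at build time, instead of at every container start.
RUN curl -fsSL https://example.com/releases/app-1.2.3.tar.gz | tar -xz -C /opt
CMD ["/opt/app/bin/app"]
```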

Also, don’t use :latest

This leads me to a different but related bad practice: using the :latest tag. It’s a common pitfall for folks who use containers but don’t necessarily build them themselves.

It’s easy to get started with the :latest tag, and it’s understandable to want the latest version without having to go into files and manually edit everything. But what can happen is that you pull an update and suddenly the tag points to a new version that breaks things.

I’ve seen this happen where you’re just running something on a local server and your disk is full, so you empty out your Docker images. The next time you pull, you get a new version that no longer works, and you’re stuck trying to figure out what version you were on before.

Instead: Pin your dependencies

You should be pinning your dependencies to a specific version, and updating to newer versions intentionally rather than by default.

The most reliable way to do this is with a process called GitOps:

  • In the context of Kubernetes, all the YAML files you deploy are stored in a central Git repository.
  • You have software in your Kubernetes cluster that automatically pulls the files from your Git repo and installs them into the cluster.
  • Then you can use a tool like Dependabot or Renovate to automatically create PRs with a new version (if there is one) so you can test and approve it, and it’s all captured in your Git history.
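As a rough example of what this looks like in practice, the pinned manifest below lives in Git, and Renovate or Dependabot proposes a pull request whenever a newer tag is published. The names and image are illustrative, not authentik-specific.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-server
spec:
  replicas: 2
  selector:
    matchLabels:
      app: app-server
  template:
    metadata:
      labels:
        app: app-server
    spec:
      containers:
        - name: server
          # Renovate/Dependabot opens a PR bumping this tag when a new release appears.
          image: example/app-server:2023.10.1
```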

GitOps might be a bit excessive if you’re only running a small hobby project on a single server, but in any case you should still pin a version.

For a long time, authentik purposefully didn’t have a :latest tag, because people would use it inadvertently (sometimes not realizing they had an auto-updater running). Suddenly something wouldn’t work and there wasn’t really a way to downgrade.

We have since added it by popular request. This is how authentik’s version tags work:

  • Our version number has three segments reflecting the date of the release, so the latest currently is 2023.10.1.
    • You can either use 2023.10.1 as the tag, pinning to that specific version
    • You can pin to 2023.10, which means that you always get the latest patch version, or
    • You can use 2023, which means you always get the latest version within that year.
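In practice, the three levels of pinning look like this (assuming authentik's usual image name on GitHub Container Registry):

```shell
docker pull ghcr.io/goauthentik/server:2023.10.1   # exact release
docker pull ghcr.io/goauthentik/server:2023.10     # latest patch release of 2023.10
docker pull ghcr.io/goauthentik/server:2023        # latest release within 2023
```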

The principle is roughly the same with any project using SemVer: you could just lock to v1, which means you get the latest v1 with all minor patches and fixes, without breaking updates. Then you switch to v2 when you’re ready.

With this approach you are putting some trust in the developer not to publish any breaking changes with the wrong version number (but you’re technically always putting trust in some developer when using someone else’s software!).

Mistake #3: Writing logs to files instead of stdout

This is another issue on the infrastructure side that mainly happens when you put legacy applications into containers. It used to be standard that applications put their log output into a file, and you’d probably have a system daemon set up to rotate those files and archive the old ones. This was great when everything ran on the same server without containers.

A lot of software still logs to files by default, but this makes collecting and aggregating your services’ logs much harder. Docker (and containers in general) expects you to log to standard output so your orchestration platform can route the logs to your monitoring tool of choice.

Docker captures each container’s standard output in a JSON file, along with timestamps and which container each line came from, and reads that file back when you ask for the logs. You can also set up log forwarding with both Docker and Kubernetes: if you have a central logging server, a logging driver or plugin takes the standard output of a container and sends it to that server.
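As a hedged example, routing a container's stdout to a central log server can be as simple as picking a logging driver in your Compose file; the address and names below are placeholders:

```yaml
services:
  server:
    image: example/app-server:2023.10.1
    logging:
      driver: fluentd          # any driver your logging stack supports
      options:
        fluentd-address: logs.internal:24224
        tag: app-server        # lets the collector tell containers apart
```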

Not logging to stdout just makes things harder for everyone, including debugging: instead of simply running docker logs plus the name of the container, you have to exec into the container, find the log files, and then read through them to start debugging.

This bad practice is arguably the easiest one to work around

As an engineer you can easily redirect logs from a file back into standard output, but there’s no real reason not to do it the “correct” way.
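One common workaround, used by the official nginx image among others, is to symlink the application's log files to the container's stdout and stderr at build time. A sketch, with illustrative image and paths:

```dockerfile
FROM example/app-base:1.2.3
# Anything the application writes to these files now ends up on stdout/stderr,
# where Docker can collect it.
RUN ln -sf /dev/stdout /var/log/app/access.log \
    && ln -sf /dev/stderr /var/log/app/error.log
```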

There aren’t many use cases where there’s an advantage to writing your logs directly to a file instead of stdout—in fact the main one is for when you’re making the first mistake (having your whole application in one container)! If you’re running multiple services in one container, then you’ll have logs from multiple different processes in one place, which could be easier to work with in a file vs stdout.

Even if you specifically want your logs to exist in a file: by default, docker logs just reads from the JSON file Docker already writes the logs to, so you’re not losing anything by logging to stdout. And you can configure Docker’s logging driver to write the logs into a plain text file wherever you want.

It’s a little simplistic, but I’d encourage you to check out The Twelve-Factor App which outlines good practices for making software that’s easy to run.

Are you doing containers differently and is it working for you? Let us know in the comments, or send us an email at hello@goauthentik.io!

· 7 min read
Jens Langhammer

authentik is an open source Identity Provider that unifies your identity needs into a single platform, replacing Okta, Active Directory, and Auth0. Authentik Security is a public benefit company building on top of the open source project.


Another security breach for Okta

Late last week, on October 20, Okta publicly shared that they had experienced a security breach. Fortunately, the damage was limited. However, the incident highlights not only how incredibly vigilant vendors (especially huge vendors of security solutions!) must be, but also how risky the careless following of seemingly reasonable requests can be.

We now know that the breach was enabled by a hacker who used stolen credentials to access the Okta support system. This malicious actor then collected session tokens that were included in HAR files (HTTP Archive Format) that customers had uploaded to the Okta support system. A HAR file is a JSON-based archive that records a browser session: the requests, responses, cookies, and other session data exchanged while it was captured. It is not rare for a support team troubleshooting an issue to request a HAR file from their customer: Zendesk does it, Atlassian does it, Salesforce as well.

So the problem was not the HAR file itself; it was what was in the file, and left in the file. Compounding the damage is our collective training not to second-guess support teams; especially the support team at one of the world’s most renowned identity protection vendors.

But it is not all on Okta; every customer impacted by this hack, including 1Password (who communicated the breach to Okta on September 29), BeyondTrust (who communicated the breach on October 2), and Cloudflare (October 18), was "guilty" of uploading HAR files that had not been scrubbed clean and still included session tokens and other sensitive access data. (Cleaning a HAR file is not always a simple task; there are tools like Google's HAR Sanitizer, but even tools like that don't 100% guarantee that the resulting file will be clean.)
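As a rough illustration of what "scrubbing" means, the jq sketch below blanks out cookies and strips cookie and authorization headers from a HAR file before it is shared. It is a sketch only; as noted above, no tool guarantees the resulting file is fully clean.

```shell
# Strip cookies and auth-related headers from a HAR file before uploading it.
jq '(.log.entries[].request.cookies, .log.entries[].response.cookies) = []
    | (.log.entries[].request.headers, .log.entries[].response.headers)
        |= map(select(.name | ascii_downcase | IN("cookie", "set-cookie", "authorization") | not))' \
  session.har > session.sanitized.har
```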

Target the ancillaries

An interesting aspect of this hack was that it exploited a less-considered vulnerability: support teams, which are not the typical entry point for hackers.

But security engineers know that hackers go in at the odd, unexpected angles. A classic parallel is when someone wants data that a CEO has, they don’t go to the CEO, they go to (and through) the CEO’s assistant!

Similarly, the support team at Okta was used as the entry point. Once the hacker gained control of a single customer’s account, they worked to take control of the main Okta dashboard and the entire support system. This lateral-to-go-up movement through access control layers is a common technique of hackers.

It’s the response… lesson not yet learned

The timing of Okta's response, not great. The initial denial of the incident, not great. And then, to add insult to injury, there’s what can objectively be labeled an abysmal “announcement” blog from Okta on October 20.

Everything from the obfuscatory title to the blog’s brevity to the actual writing… and importantly, the lack of any mention at all of BeyondTrust, the company that informed Okta on October 2nd that they suspected a breach of the Okta support system.

“Tracking Unauthorized Access to Okta's Support System” has to be the lamest of all confession titles in the history of security breach announcements.

Not to acknowledge that their customers first informed them seems like willful omission, and it absolutely illustrates that Okta has not yet learned their lesson about transparency, trusting their customers and security partners, and the importance of moving more quickly towards full disclosure. Ironically, BeyondTrust thanks Okta for their efforts and communications during the two-week period of investigation (and denial).

Back to the timing; BeyondTrust has written an excellent article about the breach, with a rather damning timeline of Okta’s responses.

“We raised our concerns of a breach to Okta on October 2nd. Having received no acknowledgement from Okta of a possible breach, we persisted with escalations within Okta until October 19th when Okta security leadership notified us that they had indeed experienced a breach and we were one of their affected customers.”(source)

The BeyondTrust blog provides important details about the persistence and ingenuity of the hacker.

“Within 30 minutes of the administrator uploading the file to Okta’s support portal an attacker used the session cookie from this support ticket, attempting to perform actions in the BeyondTrust Okta environment. BeyondTrust’s custom policies around admin console access initially blocked them, but they pivoted to using admin API actions authenticated with the stolen session cookie. API actions cannot be protected by policies in the same way as actual admin console access. Using the API, they created a backdoor user account using a naming convention like existing service accounts.”

Oddly, the BeyondTrust blog about the breach does a better job of selling Okta (by highlighting the things that went right with Okta) than the Okta announcement blog. For example, in the detailed timeline, BeyondTrust points out that one layer of prevention succeeded when the hacker attempted to access the main internal Okta dashboard: because Okta still views dashboard access as a new sign-in, it prompted for MFA, thus thwarting the login attempt.

Cloudflare’s revelation of their communications timeline with Okta shows another case of poor response timing by Okta, another situation where the customer informed the breached vendor first, and the breached company took too long to publicly acknowledge the breach.

“In fact, we contacted Okta about the breach of their systems before they had notified us.” … “We detected this activity internally more than 24 hours before we were notified of the breach by Okta.” (source)

In their blog about this incident, Cloudflare provides a helpful set of recommendations to users, including sensible suggestions such as monitoring for new Okta users created, and reactivation of Okta users.

Which just takes us back to the rather lean response by Okta; their customers wrote much more informative and helpful responses than Okta themselves.

Keep telling us

We can’t be reminded often enough about keeping our tokens safe.

This incident at Okta is parallel to the breach at Sourcegraph that we recently blogged about, in which a token was inadvertently included in a GitHub commit, and thus exposed to the world. With Okta, it was session tokens included in an uploaded HAR file, exposed to a hacker who had already gained access to the Okta support system.

But talk about things that keep security engineers up at night; timing was tight on this one.

The initial breach attempt was noticed by BeyondTrust within only 30 minutes of their having uploaded a HAR file to Okta Support. By default (and this is a good, strong, industry-standard default) Okta session tokens have a lifespan of two hours. However, with hackers moving as quickly as these, 2 hours is plenty long for the damage to be done. So, the extra step of scrubbing clean any and all files that are uploaded would have saved the day in this case.

Keep your enemies close, but your tokens even closer.

Stay vigilant out there

Lessons learned abound with every breach. Each of us in the software and technology area watch and learn from each attack. In the blog by BeyondTrust, they provide some valuable steps that customers and security teams can take to monitor for possible infiltration.

Strong security relies on multiple layers, enforced processes, and defense-in-depth policies.

“The failure of a single control or process should not result in breach. Here, multiple layers of controls -- e.g. Okta sign on controls, identity security monitoring, and so on, prevented a breach.” (source)

A writer on Hacker News points out that Okta has updated their documentation about generating HAR files, to tell users to sanitize the files first. But whether HAR files or GitHub commits, lack of MFA or misuse of APIs, we all have to stay ever-vigilant to keep ahead of malicious hackers.

Addendum

This blog was edited to provide updates about the 1Password announcement that they too were hacked, and to clarify that the hacker responsible for obtaining session tokens from the HAR files had originally gained entry into the Okta support system using stolen credentials.

· 13 min read
Jens Langhammer

authentik is an open source Identity Provider that unifies your identity needs into a single platform, replacing Okta, Active Directory, and Auth0. Authentik Security is a public benefit company building on top of the open source project.


Let’s say you’re working at a small startup: You’re the CTO, your CEO is a good friend, and you have a couple of developers working with you from a previous company. You’re building your initial tech stack, and you start – where else? – with GitHub.

The pricing is simple enough. There’s a pretty feature-rich free plan, but you’re willing to pay up because the Team plan includes features for restricting access to particular branches and protecting secrets.

But the enterprise plan, the plan that costs more than four times as much per user per month – the plan that seems targeted at, well, enterprises – promises “Security, compliance, and flexible deployment.”

Is security… not for startups?

The feature comparison bears this out: Only the enterprise plan offers single sign-on (SSO) functionality as part of the package – a feature that security experts have long agreed is essential. But don’t get mad at GitHub.

Do you want Box? You’ll have to pay twice as much for external two-factor authentication.

Do you want Mailtrap? The team, premium, and business plans won’t do. Only the enterprise plan, which costs more than $300 more per month than the team plan, offers SSO.

Do you want Hubspot’s marketing product, but with SSO? Prepare to pay $2,800 more per month than the next cheapest plan.

And these are only a few examples. SSO.tax, a website started by Rob Chahin, gathers many more. If you look through, you’ll see companies like SurveyMonkey and Webflow even restrict SSO to enterprise plans with a Contact Us option instead of a price.

"pricing page"

· 7 min read
Jens Langhammer

authentik is an open source Identity Provider that unifies your identity needs into a single platform, replacing Okta, Active Directory, and Auth0. Authentik Security is a public benefit company building on top of the open source project.


As a young security company, we’ve been working on our implementation of SCIM (System for Cross-domain Identity Management), which I’ll share more about below. SCIM is in many ways a great improvement on LDAP, but we’ve run into challenges in implementation and some things just seem to be harder than they need to be. Is it just us?

"authentik admin interface"

· 8 min read
Jens Langhammer

authentik is a unified identity platform that helps with all of your authentication needs, replacing Okta, Active Directory, Auth0, and more. Building on the open-source project, Authentik Security Inc is a public benefit company that provides additional features and dedicated support.


We have provided M2M communication in authentik for the past year, and in this blog we want to share some more information about how it works in authentik, and take a look at three use cases.

What is M2M?

Broadly speaking, M2M communication is the process by which machines (devices, laptops, servers, smart appliances, or more precisely the client interface of any thing that can be digitally communicated with) exchange data. Machine-to-machine communication is an important component of IoT, the Internet of Things; M2M is how all of the “things” communicate. So M2M is more about the communication between the devices, while IoT is the larger, more complex, overarching technology.

Interestingly, M2M is also implemented as a communication process between business systems, such as banking services or payroll workflows. One of the first fields to heavily utilize M2M was the oil and gas industry; everything from monitoring the production (volume, pressure, etc.) of gas wells, to tracking fleets of trucks and sea vessels, to the health of pipelines can be done using M2M communication.

Financial systems, analytics, really any work that involves multi-machine data processing, can be optimized using M2M.

“Machine to machine systems are the key to reliable data processing with near to zero errors” (source)

Where there is communication in software systems, there is both authentication and authorization. The basic definition of the terms is that authentication is about assessing and verifying WHO (the person, device, thing) is involved, while authorization is about what access rights that person or device has. So we choose to use the phrase “machine-to-machine communication” in order to capture both of those important aspects.

Or we could use fun terms like AuthN (authentication) and AuthZ (authorization).

So in some ways you can think of M2M as being like an internal API, with data (tokens and keys and certs and all thing access-related) being passed back and forth, but specifically for authentication and authorization processes.
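For instance, the widely used OAuth2 client_credentials grant boils M2M down to a single token request. Everything in this sketch (URL, client ID, secret, scope) is a placeholder rather than an authentik-specific value:

```shell
curl -X POST https://auth.example.com/token \
  -d grant_type=client_credentials \
  -d client_id=build-server \
  -d client_secret=REDACTED \
  -d scope=read:inventory
# The JSON response contains an access_token that this machine then presents
# to other services to prove who it is and what it may do.
```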

"Screenshot of authentik UI"

· 9 min read
Jens Langhammer

authentik is an open source Identity Provider that unifies your identity needs into a single platform, replacing Okta, Active Directory, and Auth0. Authentik Security is a public benefit company building on top of the open source project.


Legacy security vendors that rely on black box development can't keep up with open source. It's an oft-discussed topic—the ability of open source communities to quickly jump in and collectively solve problems and innovate solutions—but it is just as commonly believed that "serious" security software companies must be built on proprietary software.

In this blog, we will take a closer look at the pros and cons of the various source availability types of SSO and other security software.

"mike-kononov-lFv0V3_2H6s-unsplash.jpg"

· 7 min read
Jens Langhammer

Access tokens make identity management and authentication relatively painless for our end-users. But, like anything to do with access, tokens also can be fraught with risk and abuse.

The recent announcement from Sourcegraph that their platform had been penetrated by a malicious hacker using a leaked access token is a classic example of this balance of tokens being great… until they are in the wrong hands.

This incident prompts all of us in the software industry to take yet another look at how our security around user identity and access can be best handled, to see if there are lessons to be learned and improvements to be made. These closer looks are aimed not only at how our own software and users utilize (and protect) access tokens, but also at how such incidents are caught, mitigated, and communicated.


· 3 min read
Jens Langhammer

📣 We are happy to announce that the first authentik Enterprise release is here! 🎉

The Enterprise release of authentik provides all of the functionality that we have spent years building in our open source product, plus dedicated support and account management. This Enterprise version is available in Preview mode in our latest release, 2023.8.

This is an exciting step for us, as we grow the team and the company and our user base. We officially became a company just last fall (I wrote about it in November 2022, in “The next step for authentik”), and this release is another move forward in maturing authentik into the SSO and identity management app of choice.

One thing we want to acknowledge, up front, is that we would never have been able to achieve this goal without the years of support from our open source community. You all helped build authentik into what it is today, and that’s why all of our Enterprise-level features will be open core and source available, always.

· 5 min read
Jens Langhammer

There’s been a lot of discussion about licensing in the news, with Red Hat and now Hashicorp notably adjusting their licensing models to be more “business friendly,” and Codecov (proudly, and mistakenly) pronouncing they are now “open source.”

“Like the rest of them, they have redefined ‘Open’ as in ‘Open for business’”—jquast on Hacker News

This is a common tension when you’re building commercially on top of open source, so I wanted to share some reflections from my own experience of going from MIT, to GPL, back to MIT.

"Photo by <a href="https://unsplash.com/@gcalebjones?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Caleb Jones</a> on <a href="https://unsplash.com/photos/J3JMyXWQHXU?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText">Unsplash</a>"

· 12 min read
Jens Langhammer

Identity – whether we’re talking about internal authentication (think Auth0) or external authentication (think Okta) – has become boring.

Little else proves this better than the fact that Okta and Auth0 are now the same company and that their primary competitor, Microsoft AD, survives based on bundling and momentum. Identity has become a commodity – a component you buy off the shelf, integrate, and ignore.

Of course, taking valuable things for granted isn’t always bad. We might regularly drive on roads we don’t think much about, for example, but that doesn’t make them any less valuable.

The danger with letting identity become boring is that we’re not engaging in the problem and we’re letting defaults drive the conversation rather than context-specific needs. We’re not engaging in the solution because we’re not encouraging a true buy vs. build discussion.

My pitch: Let’s make identity fun again. And in doing so, let’s think through a better way to decide whether to build or buy software.
