Discovering Exfiltrated Credentials

Anderson

Mar 17, 2023

We regret to inform you that a malicious third party may have had read access to the secret data you stored in our system.

Any time we get a communication like the above, as recently occurred with CircleCI and LastPass, three security processes need to execute as quickly as possible. One, identify any exposed credential data. Two, audit the use of any exposed credential data. Three, create and install replacement credential data. This blog will demonstrate a detective solution for Step 2, which you can put in place before a security incident is declared.

In a Continuous Integration and Delivery system, the implications of exposed credentials can be severe. CI/CD tasks necessarily hold privileges to create and delete resources. In a reduced-risk world, build and deployment systems use short-lived credentials across pre-configured trust boundaries. In the world that we live in, APIs get integrated with authentication tokens that never expire.

This blog will provide an example of an advanced, proactive detection for identifying exfiltrated credentials as used in Continuous Integration/Continuous Delivery systems.

Detecting the Risk

To detect authentication token exfiltration, we must first establish the criteria for unauthorized use. Then we can start forming hypotheses (and tests!) to confirm or invalidate our ideas. If our theories pan out, we codify them as detections.

Let’s begin by stating our hope for the ideal state: “I expect all activity from $SpecialIdentity in Infrastructure as a Service (IaaS) to be the result of $SomeTask executing in my Continuous Integration and Continuous Delivery (CICD) system.” With enough shared metadata between our IaaS and CICD systems, we could assert this accurately under all conditions. In practice, the data that we have may be insufficient to create formal proofs.

In a typical environment, the CICD and IaaS systems align exclusively via timestamps. There is no additional metadata that can be used to cross-reference events in the systems. Using this constraint, we will write technical expectations.

Our hypothesis is built with these expectations:

IaaS Activity
1. We will only use IaaS event logs for $SpecialIdentity
2. All IaaS sessions of interest will contain at least two events. We will ignore any single event that has no neighbor within ten minutes before or after it.
3. Events separated by more than ten minutes are separate sessions.
CICD Activity
1. We will only use CICD event logs for $SomeTask
2. If $SomeTask executes fully in CICD, CICD will call IaaS within ten minutes.
3. It is acceptable for $SomeTask to execute without calling IaaS. The CICD test suite might sometimes fail.
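
The IaaS expectations above amount to a simple sessionization rule. Here is a minimal sketch in Python (not the SQL used later in this blog), assuming the IaaS log has already been filtered down to $SpecialIdentity and reduced to event timestamps. The helper name iaas_sessions and the sample timestamps are hypothetical.

```python
from datetime import datetime, timedelta

MAX_GAP = timedelta(minutes=10)  # session tolerance from the expectations above

def iaas_sessions(timestamps, max_gap=MAX_GAP):
    """Group event timestamps into sessions and keep sessions with 2+ events."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= max_gap:
            sessions[-1].append(ts)   # within ten minutes of the previous event
        else:
            sessions.append([ts])     # first event, or gap longer than ten minutes
    return [s for s in sessions if len(s) >= 2]  # ignore lone events

# Hypothetical timestamps, expressed as minutes after an arbitrary start time.
start = datetime(2023, 1, 1, 12, 0)
events = [start + timedelta(minutes=m) for m in (0, 4, 14, 40, 60, 65)]
sessions = iaas_sessions(events)
print([[int((ts - start).total_seconds() // 60) for ts in s] for s in sessions])
# prints [[0, 4, 14], [60, 65]]
```

The event at minute 40 has no neighbor within ten minutes, so it is dropped. The sketch treats a gap of exactly ten minutes as the same session, since only gaps of more than ten minutes separate sessions.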

Implementing our hypothesis with SQL

Panther with Snowflake provides a query capability that matches a pattern in a series of rows. This capability is called MATCH_RECOGNIZE. If you have used Regular Expressions, you’ll likely find MATCH_RECOGNIZE familiar. In this example, we will use the regular expression repetition operators * (meaning zero or more) and + (meaning one or more).

The MATCH_RECOGNIZE portion of our query

The PATTERN() statement declares how to group events along our timeline. Our PATTERN says: zero or more rows identified as row_with_CICD_events, followed by one or more rows identified as same_IaaS_session. Using the repetition operators, that is row_with_CICD_events* same_IaaS_session+. Later in our SQL statement, we use this event grouping to decide when to alert and when to ignore.

The DEFINE statement declares how rows should be classified for the PATTERN.

The HAVING statement declares that we only want groupings where we did not find the CICD events happening before the IaaS events.
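
If the regular expression comparison helps, the grouping and filtering described above can be imitated with an ordinary regex in Python. This is only an analogy, not MATCH_RECOGNIZE itself. It assumes each time-ordered row has already been classified and encoded as one character. The single-letter encoding and the function names are hypothetical.

```python
import re

# One character per time-ordered row (hypothetical encoding for this sketch):
#   "C" = row classified as row_with_CICD_events
#   "I" = row classified as same_IaaS_session
#   "x" = row that satisfies neither definition
PATTERN = re.compile(r"C*I+")  # zero or more CICD rows, then one or more IaaS rows

def groupings(labels):
    """Return (matched_rows, num_CICD_events) for each non-overlapping match."""
    return [(m.group(), m.group().count("C")) for m in PATTERN.finditer(labels)]

def alerts(labels):
    """Keep only groupings with no CICD rows, like HAVING num_CICD_events = 0."""
    return [g for g in groupings(labels) if g[1] == 0]

print(groupings("CIIIxII"))  # prints [('CIII', 1), ('II', 0)]
print(alerts("CIIIxII"))     # prints [('II', 0)]
```

The second grouping has IaaS rows with no CICD row in front of them, which is the situation we want to surface.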

Putting it all together with tests

We will assert that our detection works as expected by stubbing in three sets of test data. For this detection, the alert condition is when the num_CICD_events column in our final output has a value of 0.

These are our test data with explanations about what is demonstrated by them:

01:00 hour data
- This data is an alert scenario, indicated by the 0 value in the num_CICD_events column
- This data is a CICD event that is followed by an IaaS event.
- The 11-minute gap between the CICD event and the IaaS event is greater than our threshold.
02:00 hour data
- This data is not an alert scenario
- This data is a CICD event that is followed by an IaaS event
- There is a 6-minute gap between the two systems, which is inside our tolerance.
- The IaaS session in this data lasts a total of 15 minutes. This demonstrates that IaaS sessions can hold arbitrary lengths, as long as there is no gap between events longer than 10 minutes.
03:00 hour data
- This data is not an alert scenario
- This data is a single IaaS event followed by a single CICD event followed by several more IaaS events
- There is a 4-minute gap between the CICD event and the following IaaS events, which is within our tolerance.
- The single IaaS event before the CICD event is ignored because we consider that every IaaS session of interest will have at least two events.
- This data will become an alert scenario if you uncomment the SendMFAToken IaaS event
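
The three scenarios above can also be replayed outside the warehouse. The following Python sketch is an assumption-laden stand-in for the SQL, not a copy of it. The timestamps are hypothetical values chosen only to reproduce the gaps described above, and detect is a hypothetical helper that counts CICD events in the ten minutes before each IaaS session begins.

```python
from datetime import datetime, timedelta

GAP = timedelta(minutes=10)  # tolerance used throughout this blog

def at(hhmm):
    """Build a timestamp on an arbitrary, hypothetical day."""
    return datetime.strptime("2023-01-01 " + hhmm, "%Y-%m-%d %H:%M")

def detect(rows):
    """Return (session_start, session_end, num_CICD_events) per IaaS session."""
    cicd = sorted(ts for ts, source in rows if source == "CICD")
    sessions = []
    for ts in sorted(ts for ts, source in rows if source == "IaaS"):
        if sessions and ts - sessions[-1][-1] <= GAP:
            sessions[-1].append(ts)   # gap of ten minutes or less: same session
        else:
            sessions.append([ts])     # otherwise a new session starts
    results = []
    for s in sessions:
        if len(s) < 2:
            continue                  # lone IaaS events are ignored
        # Count CICD events that ran at most ten minutes before the session began.
        n = sum(1 for c in cicd if timedelta(0) <= s[0] - c <= GAP)
        results.append((s[0], s[-1], n))
    return results

# Hypothetical rows matching the descriptions above.
hour_01 = [(at("01:00"), "CICD"), (at("01:11"), "IaaS"), (at("01:13"), "IaaS")]
hour_02 = [(at("02:00"), "CICD"), (at("02:06"), "IaaS"),
           (at("02:14"), "IaaS"), (at("02:21"), "IaaS")]
hour_03 = [(at("03:00"), "IaaS"), (at("03:20"), "CICD"),
           (at("03:24"), "IaaS"), (at("03:26"), "IaaS"), (at("03:28"), "IaaS")]
# Stands in for uncommenting the SendMFAToken event in the 03:00 hour data.
send_mfa_token = [(at("03:02"), "IaaS")]

for first, last, n in detect(hour_01 + hour_02 + hour_03):
    print(first.strftime("%H:%M"), last.strftime("%H:%M"), n,
          "ALERT" if n == 0 else "ok")
# prints:
# 01:11 01:13 0 ALERT
# 02:06 02:21 1 ok
# 03:24 03:28 1 ok
```

Adding send_mfa_token to the 03:00 hour rows creates a two-event IaaS session with no CICD event before it, so detect(hour_03 + send_mfa_token) reports a session with num_CICD_events of 0.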

Putting it all together for the Real World

A tangible application for this detection is GitHub Actions jobs integrated with AWS. In this detection, we want to alert if the AWS IAM Role arn:aws:iam::123456789012:role/DeploymentUpdateGitHubRole has activity that was not preceded by a pull_request.merge event in the repo panther-labs/example-repo. At the end of the statement, we set HAVING num_CICD_events = 0 so that our output contains only suspicious windows of IAM activity.
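
Before any pattern matching can happen, both logs have to be narrowed to the identity and task of interest and placed on one timeline. Here is a rough Python sketch of that selection step. It assumes CloudTrail-style records with eventTime and userIdentity.sessionContext.sessionIssuer.arn fields, and GitHub audit log records with action, repo, and an @timestamp in epoch milliseconds. Field names in your own pipeline may differ, and the sample records are hypothetical.

```python
from datetime import datetime, timezone

ROLE_ARN = "arn:aws:iam::123456789012:role/DeploymentUpdateGitHubRole"
REPO = "panther-labs/example-repo"

def iaas_rows(cloudtrail_records):
    """Keep only AWS events issued under the deployment role."""
    rows = []
    for rec in cloudtrail_records:
        session = rec.get("userIdentity", {}).get("sessionContext", {})
        if session.get("sessionIssuer", {}).get("arn") == ROLE_ARN:
            ts = datetime.strptime(rec["eventTime"], "%Y-%m-%dT%H:%M:%SZ")
            rows.append((ts.replace(tzinfo=timezone.utc), "IaaS"))
    return rows

def cicd_rows(github_records):
    """Keep only pull request merges in the repo of interest."""
    rows = []
    for rec in github_records:
        if rec.get("action") == "pull_request.merge" and rec.get("repo") == REPO:
            # Assumed: @timestamp is epoch milliseconds.
            ts = datetime.fromtimestamp(rec["@timestamp"] / 1000, tz=timezone.utc)
            rows.append((ts, "CICD"))
    return rows

def timeline(cloudtrail_records, github_records):
    """Merge both sources into one time-ordered list of (timestamp, source)."""
    return sorted(iaas_rows(cloudtrail_records) + cicd_rows(github_records))

# Hypothetical sample records, trimmed to the fields this sketch reads.
other_role = "arn:aws:iam::123456789012:role/SomeOtherRole"
cloudtrail = [
    {"eventTime": "2023-03-17T02:06:00Z",
     "userIdentity": {"sessionContext": {"sessionIssuer": {"arn": ROLE_ARN}}}},
    {"eventTime": "2023-03-17T02:07:00Z",
     "userIdentity": {"sessionContext": {"sessionIssuer": {"arn": other_role}}}},
]
github = [
    {"@timestamp": 1679018400000, "action": "pull_request.merge", "repo": REPO},
    {"@timestamp": 1679018460000, "action": "pull_request.create", "repo": REPO},
]
rows = timeline(cloudtrail, github)
print([(ts.strftime("%H:%M"), source) for ts, source in rows])
# prints [('02:00', 'CICD'), ('02:06', 'IaaS')]
```

Only the merge event and the event from the deployment role survive the filters, which leaves a timeline suitable for the grouping logic described earlier.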

Concluding Thoughts

Detecting unauthorized use of authentication tokens is critical for security in CI/CD systems, especially in the event of a security incident. By establishing criteria for unauthorized use and codifying them as detections, security teams can proactively monitor for potential threats and take action before any damage occurs. The example presented here demonstrates how such detections can be implemented and tested. We hope that you will integrate this detective approach as a part of your security practice.