SIEM CI/CD Challenges – Getting to Happily Ever After
Aug 29, 2022 | By Page Glave
Next up in our detection-as-code (DaC) series, we're looking at the challenges we ran into once Jenkins was up and running to manage our detections in Panther. If you haven't yet, check out our earlier blogs on DaC - putting theory into practice and integrating SIEM with CI/CD to get a feel for what we're doing.
As a team, we are focused on managing what we can via code. This focus helps us automate as much as possible, so we have less alert fatigue and more time for the fun stuff, like threat hunting and pushing our program forward. Getting Panther into our CI/CD pipeline was a big step in this process. However, it hasn't been without growing pains.
The key pain points can be summed up as:
- Required Features
- Scalability Expansions
The initial Jenkins job did what we needed to manage our custom detections, but there were two additional pieces of functionality that were critical to establish DaC.
The first was a way to update our release version as Panther released updates to the panther-analysis repo. We were initially over-engineering our solution by uploading all the things instead of just our FloQast detections. With a little bit of trial and error, we were able to tweak a parameter (the target folder) in the Jenkins job, and voilà, problem solved.
The other critical addition was ensuring custom schemas were uploaded. Luckily, the Python-based panther_analysis_tool already had the capability to upload schemas, so with a few lines of code, this feature was also addressed. Keeping our custom schemas organized and easy to update was vital to keeping up with our need to ingest custom log sources.
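To make those two fixes a little more concrete, here is a minimal sketch of the kind of commands a pipeline stage could run. It is not our actual Jenkins job: the folder names floqast_detections and schemas are hypothetical, and the subcommands and --path flag reflect the panther_analysis_tool CLI as we understand it, so check them against the version you have installed.

```python
import shlex
from pathlib import Path
from typing import List


def build_pat_commands(detections_dir: str, schemas_dir: str) -> List[List[str]]:
    """Build panther_analysis_tool invocations for one pipeline run.

    Pointing --path at a single target folder keeps the upload limited
    to our own detections instead of the whole panther-analysis repo.
    Both folder arguments are assumptions made for illustration.
    """
    detections = Path(detections_dir).as_posix()
    schemas = Path(schemas_dir).as_posix()
    return [
        # Run the detection unit tests before anything is uploaded.
        ["panther_analysis_tool", "test", "--path", detections],
        # Push custom schemas so new log sources can be parsed.
        ["panther_analysis_tool", "update-custom-schemas", "--path", schemas],
        # Upload only the detections in the target folder.
        ["panther_analysis_tool", "upload", "--path", detections],
    ]


if __name__ == "__main__":
    # Print the commands rather than running them, so the sketch is
    # safe to try without Panther credentials.
    for command in build_pat_commands("floqast_detections", "schemas"):
        print(shlex.join(command))
```

A Jenkins stage could pass each of these lists to subprocess.run with check=True so a failed test stops the upload. Building the commands separately from running them also makes the target folder easy to change in one place.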
With these two problems solved, the CI/CD pipeline for Panther was working very well. So of course, we had to see what else we could do.
With detections managed via code, our focus turned to custom log ingestion. I'll touch on it briefly here, but a dedicated blog about log ingestion will be coming in the future. Panther has the option to ingest logs from a variety of sources. We generally use AWS S3 buckets for our custom sources. Luckily, FloQast has this really cool auto-deployment of repos via Slack that we were able to use as the basis for our custom log sources. With Lambdas built off a repo from GRAIL (GitHub Repository Automation to Increase Legerity) and FloQast's infrastructure-as-code approach using Terraform, we were able to standardize our ingestion process and decrease time to ingestion. This code is in its own repo and uses a Jenkins job to push the Lambdas to our development or production environment.
Custom sources require an AWS IAM (Identity and Access Management) role and SNS (Simple Notification Service) notifications to get the logs to Panther. These can be set up through the GUI if you have adequate access to the AWS account, but that can easily lead to sprawl as each source gets its own role. Roles created through the GUI also can't be managed programmatically, so you have to delete them manually. Plus you have to click like 3 times during setup, and who wants to have to click things? Panther provides infrastructure-as-code information in the panther-auxiliary repo. We took that foundation to deploy Terraform to create the role and SNS topic. We've chosen to make a single IAM role per bucket to limit the roles created, and since it's all in code, we can easily track and destroy the roles. We have a separate Jenkins job for this bit of functionality, but the code lives in the same repo as our custom detections.
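The one-role-per-bucket idea is easier to see with a tiny example. The real work happens in Terraform, so treat this Python as an illustration of the grouping only: the function name plan_roles, the role name prefix, and the source and bucket names are all made up for the sketch.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def plan_roles(
    sources: List[Tuple[str, str]],
    prefix: str = "panther-log-access",  # hypothetical naming convention
) -> Dict[str, Dict[str, object]]:
    """Group (source_name, bucket) pairs so each bucket gets one role.

    Returns a mapping of bucket name to the role that would be created
    for it and the log sources that would share that role.
    """
    by_bucket = defaultdict(list)
    for source_name, bucket in sources:
        by_bucket[bucket].append(source_name)

    return {
        bucket: {
            "role_name": f"{prefix}-{bucket}",
            "sources": sorted(names),
        }
        for bucket, names in sorted(by_bucket.items())
    }


if __name__ == "__main__":
    # Three hypothetical sources spread across two buckets.
    example = [
        ("okta", "sec-logs-a"),
        ("github", "sec-logs-a"),
        ("vpn", "sec-logs-b"),
    ]
    for bucket, plan in plan_roles(example).items():
        print(bucket, plan["role_name"], plan["sources"])
```

Three sources end up with two roles here instead of three, and the difference grows as more sources land in the same bucket.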
The last bit of adjusting has been training new security engineers. FloQast values employees with a variety of backgrounds, and our security program is no different. We have to ensure onboarding gets engineers with less development experience up to speed with the development lifecycle, and engineers with less SIEM experience up to speed on SIEM. Everyone has to buy in to the DaC approach and understand how the CI/CD pipeline for Panther works.
As Niko pointed out in his blog on CI/CD integration, the process had to be easily teachable. For working with Panther, we need security engineers to understand what we are doing with DaC and why, learn enough Jenkins to Jenkins and Terraform to Terraform, and understand enough of what's going on to troubleshoot. We've added specific pieces to our onboarding process to help new engineers acclimate more quickly. We've collected resources about Jenkins and Terraform that provide a crash course on what's going on. We also have structured tool intro sessions. I think one of the best steps has been to identify quick wins that new hires can move through the pipeline. Tuning an existing detection or ingestion has been a great way to learn the process without having to focus as much on the content. More senior team members are available for code review and to walk through the release process.
Now our new security engineers are able to contribute faster and with more confidence. We also ask each new hire to point out places where we can do better with onboarding. We're careful to document how we are doing things with CI/CD: one person writes the initial documentation while working through something, and a different person follows it the next time and refines it. (We're big believers in taking vacation, so documentation means we are able to actually step away and recharge.)
Happily Ever After
So what's next? At this point, we are pushing the boundaries of how much Panther we can manage with code. As Panther adds capabilities on their end, we'll keep pushing what we can do. We have made a lot of progress over the last year and continue to look for opportunities to work more efficiently. We want to address as much of the mundane through automation as we can to increase scalability so we can focus on the things humans need to do.