Introduction

Turning security tool outputs into actionable insights is one of the biggest challenges for developers and security engineers. In this post, I’m sharing a minimal viable product (MVP) that takes Semgrep scan outputs and visualizes the top rules, the most affected files, and clusters of related findings.

Watch the Demo

How the MVP Works

  1. Aggregates Semgrep Results: JSON outputs from multiple scans are collected into a single dataset, ready for analysis.
  2. Highlights Top Rules & Top Files: Quickly identifies the rules triggered most often and the files with the highest number of findings. This helps prioritize what to fix first.
  3. Clusters Related Findings: Findings are grouped into logical clusters to reveal patterns and correlations between rules and code contexts.
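To make the three steps concrete, here is a minimal Python sketch. It is not the code from the repository. It assumes only the standard Semgrep JSON layout, where each report has a top-level "results" array whose entries carry a "check_id" (the rule) and a "path" (the file). The function names are hypothetical, and the clustering step is a deliberately simple stand-in (grouping by rule and parent directory), since the actual clustering logic may differ.

```python
import json
from collections import Counter, defaultdict
from pathlib import PurePosixPath


def load_reports(json_paths):
    """Read several Semgrep JSON report files into a list of dicts."""
    reports = []
    for path in json_paths:
        with open(path, encoding="utf-8") as handle:
            reports.append(json.load(handle))
    return reports


def merge_reports(reports):
    """Step 1: collect the "results" arrays of all reports into one dataset."""
    findings = []
    for report in reports:
        findings.extend(report.get("results", []))
    return findings


def top_rules(findings, n=10):
    """Step 2a: rules that fired most often, as (rule_id, count) pairs."""
    return Counter(f["check_id"] for f in findings).most_common(n)


def top_files(findings, n=10):
    """Step 2b: files with the most findings, as (path, count) pairs."""
    return Counter(f["path"] for f in findings).most_common(n)


def cluster_by_rule_and_dir(findings):
    """Step 3 (assumed heuristic): group findings by rule and parent directory."""
    clusters = defaultdict(list)
    for f in findings:
        directory = str(PurePosixPath(f["path"]).parent)
        clusters[(f["check_id"], directory)].append(f)
    return dict(clusters)


# Example usage (file names are placeholders):
#   findings = merge_reports(load_reports(["scan1.json", "scan2.json"]))
#   print(top_rules(findings, 5))
#   print(top_files(findings, 5))
```

Grouping by rule and directory is only one possible notion of "related"; the same dataset could just as well be clustered by severity, by rule category, or by textual similarity of the messages.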

Why This Matters

  • Provides a fast overview of large codebases.
  • Reduces noise by focusing on the findings that matter most.
  • Serves as the foundation for a Signal Engine: ingest → normalize → rank → export/report.

Next Steps

This MVP is just the start. The ultimate goal is a full tool, which I will call Signal Engine, that can:

  • Ingest results from multiple security tools
  • Normalize and deduplicate findings
  • Rank findings by risk
  • Export actionable reports for developers and security engineers
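As a rough illustration of the normalize, deduplicate, and rank stages listed above, the sketch below maps Semgrep results onto a small tool-agnostic record. Everything here is an assumption made for illustration: the Finding fields, the function names, and the severity weights are hypothetical and do not describe Signal Engine's actual design. The only thing taken from Semgrep is its JSON layout ("check_id", "path", "start.line", "extra.severity").

```python
from dataclasses import dataclass

# Assumed weights for illustration; a real ranking would likely use more signals.
SEVERITY_WEIGHT = {"ERROR": 3, "WARNING": 2, "INFO": 1}


@dataclass(frozen=True)
class Finding:
    """Hypothetical tool-agnostic record; frozen so instances are hashable."""
    tool: str
    rule: str
    path: str
    line: int
    severity: str


def normalize_semgrep(result):
    """Map one entry of a Semgrep "results" array onto the common record."""
    return Finding(
        tool="semgrep",
        rule=result["check_id"],
        path=result["path"],
        line=result["start"]["line"],
        severity=result.get("extra", {}).get("severity", "INFO").upper(),
    )


def deduplicate(findings):
    """Drop exact duplicates while keeping first-seen order."""
    return list(dict.fromkeys(findings))


def rank(findings):
    """Sort by descending severity weight, then by path and line for stability."""
    return sorted(
        findings,
        key=lambda f: (-SEVERITY_WEIGHT.get(f.severity, 0), f.path, f.line),
    )
```

Adding another scanner would then mean writing one more normalize function that emits the same record, while deduplication and ranking stay unchanged.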

The demo shows a minimal but immediately usable implementation of this approach.

Try it Yourself

If you want to explore this workflow with your own Semgrep outputs, this MVP provides a fast way to see patterns, prioritize rules, and focus on what really matters in your code security scans.

The code is AGPLv3 licensed and released here: https://github.com/thesp0nge/mvp_semgrep