Authoring Custom Packs

Create custom standards packs for organization-specific rules.

Pack structure

Each pack is a Python package inside src/guard/packs/:

src/guard/packs/
  my_custom_pack/
    __init__.py       # Module docstring (1 line)
    pack.yaml         # Pack metadata + rules

__init__.py

"""My Custom Pack — description of what this pack covers."""

pack.yaml

name: "My Custom Pack"
description: "One-sentence description of the pack's purpose"

rules:
  - id: MYPACK-001
    name: rule-name
    type: regex
    pattern: "..."
    severity: high
    file_glob: "**/*.py"

Rule format

Required fields

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique ID (prefix with pack abbreviation) |
| name | string | Kebab-case name |
| type | string | regex, required_pattern, file_policy, ast, llm, or documentation_obligation |
| severity | string | critical, high, medium, low, or info |
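
As a rough illustration of the required fields listed above, the following sketch checks a rule dictionary before it is written to pack.yaml. The helper name validate_rule is hypothetical and not part of sentrik; only the field names and allowed values come from this page.

```python
# Hypothetical pre-flight check; not part of sentrik's API.
ALLOWED_TYPES = {
    "regex", "required_pattern", "file_policy",
    "ast", "llm", "documentation_obligation",
}
ALLOWED_SEVERITIES = {"critical", "high", "medium", "low", "info"}
REQUIRED_FIELDS = ("id", "name", "type", "severity")


def validate_rule(rule: dict) -> list[str]:
    """Return a list of problems; an empty list means the rule looks valid."""
    problems = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if field not in rule
    ]
    if "type" in rule and rule["type"] not in ALLOWED_TYPES:
        problems.append(f"unknown type: {rule['type']}")
    if "severity" in rule and rule["severity"] not in ALLOWED_SEVERITIES:
        problems.append(f"unknown severity: {rule['severity']}")
    return problems
```

Running a check like this in CI can catch typos such as `regexp` before the pack is loaded.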

Optional fields

| Field | Type | Description |
| --- | --- | --- |
| description | string | What the rule detects |
| file_glob | string | Glob pattern for files to check |
| pattern | string | Regex pattern (for regex/required_pattern) |
| all_patterns | list | All patterns must match (composite regex) |
| check | string | Check function name (for file_policy) |
| ast_check | string | Built-in Python AST check name (for ast) |
| language | string | Tree-sitter language name (for ast with query) |
| query | string | Tree-sitter S-expression query (for ast with language) |
| params | dict | Configurable thresholds |
| autofix | string | remove_line, comment_out, or replace |
| replacement | string | Replacement text (with autofix: replace) |
| standard | string | Regulatory standard name |
| clause | string | Standard clause reference |
| remediation_guidance | string | Fix instructions shown in reports |
| tags | list | Categorization tags |

Rule type examples

Regex

- id: MYPACK-001
  name: no-hardcoded-passwords
  type: regex
  pattern: "(?i)password\\s*=\\s*[\"'][^\"']{4,}"
  severity: critical
  file_glob: "**/*.py"
  remediation_guidance: "Use environment variables or a secrets manager"
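
Before shipping a regex rule, it can help to try the pattern against sample lines. The sketch below does that with Python's re module. It assumes the engine searches for the pattern anywhere in a line, which this page does not confirm, and has_hardcoded_password is a hypothetical helper name.

```python
import re

# The MYPACK-001 pattern after YAML unescaping ("\\s" in YAML becomes "\s").
PATTERN = re.compile(r"""(?i)password\s*=\s*["'][^"']{4,}""")


def has_hardcoded_password(line: str) -> bool:
    # Assumption: the pattern may match anywhere in the line (re.search).
    return PATTERN.search(line) is not None


samples = [
    'password = "hunter2"',         # quoted literal, 4+ characters
    "DB_PASSWORD='s3cret!'",        # (?i) makes the match case-insensitive
    'password = os.environ["PW"]',  # value is not a quoted literal
    'password = "abc"',             # literal shorter than 4 characters
]
for sample in samples:
    print(has_hardcoded_password(sample), sample)
```

The first two samples match and the last two do not, which is the behavior the rule is after.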

Composite regex

- id: MYPACK-002
  name: sql-injection-risk
  type: regex
  severity: critical
  file_glob: "**/*.py"
  all_patterns:
    - "execute\\("
    - "request\\.(GET|POST|form)"
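
The all_patterns semantics can be sketched in a few lines: the rule fires only when every pattern is found. Whether sentrik evaluates the patterns per line or per file is not stated here, so the sketch simply takes whatever text it is given; all_patterns_match is a hypothetical name.

```python
import re

# Patterns from MYPACK-002 after YAML unescaping.
PATTERNS = [r"execute\(", r"request\.(GET|POST|form)"]


def all_patterns_match(text: str, patterns: list[str]) -> bool:
    # The rule fires only if every pattern is found somewhere in the text.
    return all(re.search(pattern, text) for pattern in patterns)
```

A call to `execute(` with no request data nearby, or request data with no `execute(`, does not trigger the rule on its own.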

Required pattern

- id: MYPACK-003
  name: require-license-header
  type: required_pattern
  pattern: "Licensed under"
  severity: low
  file_glob: "src/**/*.py"
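
A required_pattern rule inverts the regex logic: a file is reported when the pattern is absent. A minimal sketch of that inversion, assuming the check runs once per file (missing_required_pattern is a hypothetical name):

```python
import re


def missing_required_pattern(text: str, pattern: str) -> bool:
    # True means the file would be reported: the required text is absent.
    return re.search(pattern, text) is None
```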

File policy

- id: MYPACK-004
  name: max-file-length
  type: file_policy
  check: max_lines
  severity: medium
  file_glob: "**/*.py"
  params:
    max_lines: 400

Available checks: starts_with_docstring, max_lines, must_contain_pattern, must_not_import, max_function_count.

AST — built-in Python checks

- id: MYPACK-005
  name: strict-complexity
  type: ast
  ast_check: high_complexity
  severity: high
  file_glob: "src/**/*.py"
  params:
    max_complexity: 7

Available checks: no_mutable_defaults, no_star_imports, high_complexity, no_nested_functions, no_global_variables, max_function_length.
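
To give a feel for what a built-in check such as no_mutable_defaults looks for, here is a standalone sketch using Python's standard ast module. It is an illustration only and makes no claim about how sentrik implements the check; mutable_default_lines is a hypothetical name.

```python
import ast

# Literal node types treated as mutable defaults in this sketch.
MUTABLE_NODES = (ast.List, ast.Dict, ast.Set)


def mutable_default_lines(source: str) -> list[int]:
    """Return line numbers of functions that use a mutable default argument."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # kw_defaults holds None for keyword-only args without a default.
            defaults = node.args.defaults + [
                d for d in node.args.kw_defaults if d is not None
            ]
            if any(isinstance(d, MUTABLE_NODES) for d in defaults):
                lines.append(node.lineno)
    return sorted(lines)
```

The sketch only recognizes literal lists, dicts, and sets; calls such as `list()` or `dict()` would need an extra case.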

AST — tree-sitter queries (multi-language)

The ast rule type also supports tree-sitter queries, enabling precise structural analysis across any supported language — C++, Go, Rust, Java, JavaScript, TypeScript, Kotlin, Ruby, C#, and more. No regex approximations: rules match real AST nodes.

- id: MYPACK-006
  name: no-eval
  description: "eval() executes arbitrary code and is a critical security risk"
  type: ast
  language: python
  query: |
    (call function: (identifier) @fn (#eq? @fn "eval"))
  severity: critical
  file_glob: "**/*.py"
  remediation_guidance: "Use ast.literal_eval() for safe data parsing."

The language field and query field are both required when using tree-sitter. The query uses tree-sitter's S-expression pattern syntax with support for #eq?, #match?, #any-of?, and other predicates.
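
For readers who want to reason about the no-eval query without installing a grammar, roughly the same structural match can be written with Python's standard ast module. This is an analogy only, not how sentrik evaluates tree-sitter queries, and eval_call_lines is a hypothetical name.

```python
import ast


def eval_call_lines(source: str) -> list[int]:
    """Line numbers of calls whose callee is the bare name `eval`."""
    return sorted(
        node.lineno
        for node in ast.walk(ast.parse(source))
        if isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)  # like (identifier) @fn
        and node.func.id == "eval"           # like (#eq? @fn "eval")
    )
```

As with the query, attribute calls such as `obj.eval(...)` and `ast.literal_eval(...)` are not matched, because their callee is not a plain identifier.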

Supported languages

| language value | Aliases | Grammar package |
| --- | --- | --- |
| python | | tree-sitter-python |
| cpp | c++ | tree-sitter-cpp |
| c | | tree-sitter-c |
| javascript | js | tree-sitter-javascript |
| typescript | ts | tree-sitter-typescript |
| go | golang | tree-sitter-go |
| rust | | tree-sitter-rust |
| java | | tree-sitter-java |
| kotlin | | tree-sitter-kotlin |
| ruby | rb | tree-sitter-ruby |
| c_sharp | csharp, cs | tree-sitter-c-sharp |
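
The alias column can be read as a simple lookup from alias to canonical language value. The sketch below encodes the list above; canonical_language is a hypothetical helper, and the case-insensitive handling is an assumption rather than documented behavior.

```python
# Aliases and canonical names copied from the supported languages list.
ALIASES = {
    "c++": "cpp", "js": "javascript", "ts": "typescript",
    "golang": "go", "rb": "ruby", "csharp": "c_sharp", "cs": "c_sharp",
}
CANONICAL = {
    "python", "cpp", "c", "javascript", "typescript", "go",
    "rust", "java", "kotlin", "ruby", "c_sharp",
}


def canonical_language(value: str) -> str:
    # Assumption: lookups ignore case and surrounding whitespace.
    name = value.strip().lower()
    name = ALIASES.get(name, name)
    if name not in CANONICAL:
        raise ValueError(f"unsupported language: {value!r}")
    return name
```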

Install all grammars at once: pip install "sentrik[treesitter]"

Writing queries

The best way to explore the AST structure for a language is the tree-sitter playground. Paste in sample code, browse the node tree, and copy node type names directly into your query.

Common patterns:

# Flag a specific function call by name (Go)
query: |
  (call_expression function: (selector_expression
    field: (field_identifier) @fn (#eq? @fn "Exec")))

# Flag any class whose name starts with a lowercase letter (Java)
query: |
  (class_declaration name: (identifier) @name
    (#match? @name "^[a-z]"))

# Flag reinterpret_cast in C++
query: |
  (call_expression function: (template_function
    name: (identifier) @fn (#eq? @fn "reinterpret_cast")))

# Flag empty catch blocks (JavaScript)
query: |
  (catch_clause body: (statement_block . "}"))

Each captured node in the query becomes one finding, with the exact line number and source snippet. Findings on the same line are deduplicated automatically.
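
The deduplication described above can be pictured as keeping the first finding per line. In this sketch the Finding shape and the choice of (rule, file, line) as the key are assumptions made for illustration; sentrik's actual data model may differ.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    # Hypothetical shape; field names are not taken from sentrik.
    rule_id: str
    path: str
    line: int
    snippet: str


def dedupe_by_line(findings: list[Finding]) -> list[Finding]:
    """Keep the first finding for each (rule, file, line) combination."""
    seen = set()
    unique = []
    for finding in findings:
        key = (finding.rule_id, finding.path, finding.line)
        if key not in seen:
            seen.add(key)
            unique.append(finding)
    return unique
```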

Installation check

sentrik list-packs   # Tree-sitter rules load silently if grammars are missing
python -c "from guard.rules.treesitter_checks import language_available; print(language_available('cpp'))"

LLM-enforced

llm rules use an AI model to analyze each matching file and identify violations. They can catch semantic issues that regex and AST cannot — constructor contracts, naming conventions, design patterns, architectural constraints.

- id: MYPACK-007
  name: single-argument-constructors-explicit
  description: "Single-argument constructors must be marked explicit to prevent implicit conversions"
  type: llm
  severity: high
  file_glob: "**/*.{hpp,h}"
  remediation_guidance: "Add the 'explicit' keyword before the constructor declaration."

Behavior by environment:

| LLM configured? | Behavior |
| --- | --- |
| Yes | Evaluated per file; findings include line numbers and confidence: 0.75 |
| No | Degrades to a single informational obligation; the pack still loads and reports the rule |

Configure an LLM in VS Code (SENTRIK: Configure AI Provider) or via env var (GUARD_LLM_PROVIDER=anthropic). The rule description and remediation_guidance are sent as-is to the model — write them precisely.

Tips for effective LLM rules:

- Be specific in the description: "constructors with exactly one non-defaulted parameter" is better than "constructors"
- Keep file_glob tight, since LLM rules are more expensive than regex
- Use params.llm_max_chars: 6000 to limit file size for large files
- Severity should reflect the real impact, because the LLM returns findings at the severity you set

Documentation obligation

- id: MYPACK-DOC-001
  name: incident-response-plan
  type: documentation_obligation
  severity: info
  standard: "Internal Policy"
  clause: "IRP-01"
  remediation_guidance: "Create and maintain an incident response plan"

Adding skill guidance

Skills give AI agents rich, actionable context for your custom pack rules — correct code patterns, anti-patterns, and a verification checklist. Without skills, agents see only the raw rule descriptions.

Create a skills/ subdirectory alongside your pack.yaml:

.sentrik/rules/
  my-api-rules/
    pack.yaml
    skills/
      api-security.md     # covers MYAPI-001 through MYAPI-003

Each .md file is a full skill with frontmatter + sections. See the Compliance Skills guide for the complete format.

Minimal example:

---
skill_id: MYAPI-SECURITY
version: 1.0.0
pack: my-api-rules
rules:
  - MYAPI-001
  - MYAPI-002
languages: [python]
severity_floor: high
---

## Purpose
Enforce authentication and input validation on all API endpoints.

## Requirements
1. All endpoints must require authentication.
2. All inputs must be validated before use.

## Patterns

```python
@require_auth
def get_user(user_id: int) -> User:
    validated_id = validate_int(user_id, min=1)
    return db.get(validated_id)
```

## Anti-Patterns

```python
# WRONG: no auth, raw input passed to query
def get_user(user_id):
    return db.execute(f"SELECT * FROM users WHERE id={user_id}")
```

## Verification Checklist

- [ ] All endpoints decorated with @require_auth
- [ ] All path/query parameters validated before use

Option 2: Auto-synthesized (zero work)

If no skills/ directory exists, sentrik generates a minimal skill from your pack.yaml at runtime. The quality is lower (no code examples), but AI agents still receive the rule requirements and remediation hints automatically.

To get the best results from auto-synthesis, write detailed description and remediation_guidance fields in your rules:

```yaml
rules:
  - id: MYAPI-001
    description: "All API endpoints must require authentication; unauthenticated access enables data theft."
    remediation_guidance: "Add @require_auth decorator. See docs/auth-patterns.md for implementation examples."
    severity: high
```

Registering a pack

Add an entry to _BUILTIN_PACKS in src/guard/packs/registry.py:

_BUILTIN_PACKS: dict[str, str] = {
    "fda-iec-62304": "guard.packs.fda_iec_62304",
    "owasp-top-10": "guard.packs.owasp_top_10",
    "soc2": "guard.packs.soc2",
    "my-custom-pack": "guard.packs.my_custom_pack",
}
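
The registry entries above follow a visible convention: the pack name uses hyphens, and the module name is the same string with underscores under guard.packs. A small sketch can check new entries against that convention. The helper names are hypothetical, and the convention is inferred from the entries shown, not from a documented rule.

```python
def module_path_for(pack_name: str, package: str = "guard.packs") -> str:
    # "my-custom-pack" -> "guard.packs.my_custom_pack"
    return f"{package}.{pack_name.replace('-', '_')}"


def inconsistent_entries(registry: dict[str, str]) -> list[str]:
    """Return pack names whose module path breaks the naming convention."""
    return [
        name for name, module in registry.items()
        if module != module_path_for(name)
    ]


# Copy of the registry shown above, used only to exercise the helper.
registry = {
    "fda-iec-62304": "guard.packs.fda_iec_62304",
    "owasp-top-10": "guard.packs.owasp_top_10",
    "soc2": "guard.packs.soc2",
    "my-custom-pack": "guard.packs.my_custom_pack",
}
print(inconsistent_entries(registry))
```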

Enabling

sentrik add-pack my-custom-pack

Or in .guard.yaml:

standards_packs:
  - my-custom-pack

Testing

sentrik list-packs           # Verify pack loads
sentrik list-rules           # Verify rules appear
sentrik scan                 # Check findings

ID naming conventions

| Pack | Format | Example |
| --- | --- | --- |
| FDA IEC 62304 | IEC62304-{category}-{num} | IEC62304-CODE-001 |
| OWASP Top 10 | OWASP-{category}-{num} | OWASP-A01-001 |
| SOC2 | SOC2-{category}-{num} | SOC2-CC6-001 |

Use a consistent prefix for your organization.
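
One way to keep IDs consistent is to check them against a pattern in CI. The regex below is inferred from the examples on this page (uppercase prefix, optional category segments, three-digit number) and is not an official sentrik rule; is_valid_rule_id is a hypothetical name.

```python
import re

# Assumed shape: PREFIX[-CATEGORY...]-NNN, e.g. MYPACK-001 or OWASP-A01-001.
RULE_ID = re.compile(r"[A-Z][A-Z0-9]*(?:-[A-Z0-9]+)*-\d{3}")


def is_valid_rule_id(rule_id: str) -> bool:
    return RULE_ID.fullmatch(rule_id) is not None
```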

Severity guidelines

| Severity | When to use |
| --- | --- |
| critical | Security vulnerabilities, data exposure |
| high | Bugs likely to cause issues, unsafe patterns |
| medium | Code quality, maintainability risks |
| low | Style issues, minor improvements |
| info | Documentation obligations, informational |
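
When post-processing scan results, the five levels can be treated as an ordered scale. The sketch below shows one way to filter by a minimum severity; the ordering follows the table above, while at_or_above is a hypothetical helper and not part of sentrik.

```python
# Ordered from least to most severe, following the table above.
SEVERITY_ORDER = ["info", "low", "medium", "high", "critical"]


def at_or_above(severity: str, floor: str) -> bool:
    """True if `severity` is at least as severe as `floor`."""
    return SEVERITY_ORDER.index(severity) >= SEVERITY_ORDER.index(floor)


severities = ["info", "critical", "medium", "high"]
blocking = [s for s in severities if at_or_above(s, "high")]
print(blocking)
```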