Authoring Custom Packs¶
Create custom standards packs for organization-specific rules.
Pack structure¶
Each pack is a Python package inside src/guard/packs/:
src/guard/packs/
    my_custom_pack/
        __init__.py    # Module docstring (1 line)
        pack.yaml      # Pack metadata + rules
__init__.py¶
pack.yaml¶
name: "My Custom Pack"
description: "One-sentence description of the pack's purpose"
rules:
  - id: MYPACK-001
    name: rule-name
    type: regex
    pattern: "..."
    severity: high
    file_glob: "**/*.py"
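The layout above can be scaffolded with a few lines of standard-library Python. This is only a sketch: `scaffold_pack` and `PACK_YAML_TEMPLATE` are hypothetical names, and the empty `rules: []` placeholder is an assumption rather than something the tool is known to accept.

```python
from pathlib import Path
import tempfile

# Hypothetical template; the values are placeholders taken from the example above.
PACK_YAML_TEMPLATE = """\
name: "{title}"
description: "One-sentence description of the pack's purpose"
rules: []
"""


def scaffold_pack(packs_root: Path, package_name: str, title: str) -> Path:
    """Create <packs_root>/<package_name>/ containing __init__.py and pack.yaml."""
    pack_dir = packs_root / package_name
    pack_dir.mkdir(parents=True, exist_ok=False)
    # __init__.py carries only a one-line module docstring.
    (pack_dir / "__init__.py").write_text(f'"""{title}."""\n', encoding="utf-8")
    (pack_dir / "pack.yaml").write_text(
        PACK_YAML_TEMPLATE.format(title=title), encoding="utf-8"
    )
    return pack_dir


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        created = scaffold_pack(Path(tmp), "my_custom_pack", "My Custom Pack")
        print(sorted(p.name for p in created.iterdir()))
```

In a real checkout, `packs_root` would point at src/guard/packs/.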
Rule format¶
Required fields¶
| Field | Type | Description |
|---|---|---|
| id | string | Unique ID (prefix with pack abbreviation) |
| name | string | Kebab-case name |
| type | string | regex, required_pattern, file_policy, ast, llm, or documentation_obligation |
| severity | string | critical, high, medium, low, or info |
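A pre-commit check for the required fields could look roughly like the sketch below. It works on a rule that has already been parsed into a dict; `validate_rule` is a hypothetical helper, not part of the tool, and the allowed values are copied from the table above.

```python
REQUIRED_FIELDS = ("id", "name", "type", "severity")
RULE_TYPES = {
    "regex", "required_pattern", "file_policy",
    "ast", "llm", "documentation_obligation",
}
SEVERITIES = {"critical", "high", "medium", "low", "info"}


def validate_rule(rule: dict) -> list[str]:
    """Return a list of problems; an empty list means the rule passes."""
    problems = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if field not in rule
    ]
    if "type" in rule and rule["type"] not in RULE_TYPES:
        problems.append(f"unknown type: {rule['type']}")
    if "severity" in rule and rule["severity"] not in SEVERITIES:
        problems.append(f"unknown severity: {rule['severity']}")
    return problems
```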
Optional fields¶
| Field | Type | Description |
|---|---|---|
| description | string | What the rule detects |
| file_glob | string | Glob pattern for files to check |
| pattern | string | Regex pattern (for regex/required_pattern) |
| all_patterns | list | All patterns must match (composite regex) |
| check | string | Check function name (for file_policy) |
| ast_check | string | Built-in Python AST check name (for ast) |
| language | string | Tree-sitter language name (for ast with query) |
| query | string | Tree-sitter S-expression query (for ast with language) |
| params | dict | Configurable thresholds |
| autofix | string | remove_line, comment_out, or replace |
| replacement | string | Replacement text (with autofix: replace) |
| standard | string | Regulatory standard name |
| clause | string | Standard clause reference |
| remediation_guidance | string | Fix instructions shown in reports |
| tags | list | Categorization tags |
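Several optional fields only make sense for a particular rule type. The sketch below encodes the pairings the table implies. `check_type_fields` is a hypothetical helper, and treating a missing companion field as an error is an assumption; the tool's own validation may be stricter or more lenient.

```python
def check_type_fields(rule: dict) -> list[str]:
    """Report optional fields that the rule's type appears to need."""
    problems = []
    rule_type = rule.get("type")
    if rule_type == "regex":
        if "pattern" not in rule and "all_patterns" not in rule:
            problems.append("regex rules need pattern or all_patterns")
    elif rule_type == "required_pattern":
        if "pattern" not in rule:
            problems.append("required_pattern rules need pattern")
    elif rule_type == "file_policy":
        if "check" not in rule:
            problems.append("file_policy rules need check")
    elif rule_type == "ast":
        has_builtin = "ast_check" in rule
        has_query = "language" in rule and "query" in rule
        if not (has_builtin or has_query):
            problems.append("ast rules need ast_check, or both language and query")
    # Assumed pairing: replacement text accompanies autofix: replace.
    if rule.get("autofix") == "replace" and "replacement" not in rule:
        problems.append("autofix: replace needs replacement")
    return problems
```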
Rule type examples¶
Regex¶
- id: MYPACK-001
  name: no-hardcoded-passwords
  type: regex
  pattern: "(?i)password\\s*=\\s*[\"'][^\"']{4,}"
  severity: critical
  file_glob: "**/*.py"
  remediation_guidance: "Use environment variables or a secrets manager"
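Testing a pattern against sample lines before adding it to a pack catches escaping mistakes early. The sketch below uses Python's `re` module and assumes line-by-line matching, which may differ from how the scanner applies patterns. Once YAML processes the double-quoted string, the regex is the one written in `PATTERN`.

```python
import re

# The YAML double-quoted string above decodes to this regex.
PATTERN = re.compile(r"""(?i)password\s*=\s*["'][^"']{4,}""")


def find_matches(source: str) -> list[int]:
    """Return 1-based line numbers where the pattern matches."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if PATTERN.search(line)
    ]


sample = '''\
PASSWORD = "hunter2!"
password = os.environ["DB_PASSWORD"]
password = "abc"
'''

# Only the first line has a quoted literal of four or more characters.
print(find_matches(sample))
```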
Composite regex¶
- id: MYPACK-002
  name: sql-injection-risk
  type: regex
  severity: critical
  file_glob: "**/*.py"
  all_patterns:
    - "execute\\("
    - "request\\.(GET|POST|form)"
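The "all patterns must match" semantics amount to a logical AND over the list. A minimal sketch follows. Whether the scanner evaluates the patterns per line or per file is not stated here, so `matches_all` (a hypothetical helper) simply tests whatever text it is given.

```python
import re

ALL_PATTERNS = [r"execute\(", r"request\.(GET|POST|form)"]


def matches_all(text: str, patterns: list[str]) -> bool:
    """True only when every pattern is found somewhere in the text."""
    return all(re.search(pattern, text) for pattern in patterns)
```

With this logic, a line that calls execute( without touching request data is not flagged, which keeps the rule quieter than either pattern alone.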
Required pattern¶
- id: MYPACK-003
  name: require-license-header
  type: required_pattern
  pattern: "Licensed under"
  severity: low
  file_glob: "src/**/*.py"
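A required_pattern rule inverts the regex logic: the finding is the absence of a match. The sketch below assumes the check is made once per file. `missing_required` is a hypothetical helper that takes a mapping of path to file content.

```python
import re


def missing_required(files: dict[str, str], pattern: str) -> list[str]:
    """Return the paths whose content never matches the required pattern."""
    regex = re.compile(pattern)
    return sorted(path for path, text in files.items() if not regex.search(text))
```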
File policy¶
- id: MYPACK-004
  name: max-file-length
  type: file_policy
  check: max_lines
  severity: medium
  file_glob: "**/*.py"
  params:
    max_lines: 400
Available checks: starts_with_docstring, max_lines, must_contain_pattern, must_not_import, max_function_count.
AST — built-in Python checks¶
- id: MYPACK-005
  name: strict-complexity
  type: ast
  ast_check: high_complexity
  severity: high
  file_glob: "src/**/*.py"
  params:
    max_complexity: 7
Available checks: no_mutable_defaults, no_star_imports, high_complexity, no_nested_functions, no_global_variables, max_function_length.
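To give a feel for what a check such as no_mutable_defaults looks for, here is a sketch built on the standard-library `ast` module. It is an illustration only, not the tool's implementation, and it covers literal list, dict, and set defaults but not calls such as `list()`.

```python
import ast

MUTABLE_LITERALS = (ast.List, ast.Dict, ast.Set)


def mutable_default_lines(source: str) -> list[int]:
    """Return line numbers of functions with a list, dict, or set literal default."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # kw_defaults holds None for keyword-only arguments without a default.
            defaults = node.args.defaults + [
                d for d in node.args.kw_defaults if d is not None
            ]
            if any(isinstance(d, MUTABLE_LITERALS) for d in defaults):
                lines.append(node.lineno)
    return sorted(lines)
```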
AST — tree-sitter queries (multi-language)¶
The ast rule type also supports tree-sitter queries, enabling precise structural analysis across any supported language — C++, Go, Rust, Java, JavaScript, TypeScript, Kotlin, Ruby, C#, and more. No regex approximations: rules match real AST nodes.
- id: MYPACK-006
  name: no-eval
  description: "eval() executes arbitrary code and is a critical security risk"
  type: ast
  language: python
  query: |
    (call function: (identifier) @fn (#eq? @fn "eval"))
  severity: critical
  file_glob: "**/*.py"
  remediation_guidance: "Use ast.literal_eval() for safe data parsing."
The language field and query field are both required when using tree-sitter. The query uses tree-sitter's S-expression pattern syntax with support for #eq?, #match?, #any-of?, and other predicates.
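The no-eval query above matches a call only when the callee is a bare identifier named eval. The same structural test can be approximated for Python sources with the standard-library `ast` module. This is a stand-in for illustration, not how the tree-sitter engine runs queries.

```python
import ast


def eval_call_lines(source: str) -> list[int]:
    """Return line numbers of calls whose callee is the bare name eval."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)  # bare identifier, not obj.eval
            and node.func.id == "eval"
        ):
            hits.append(node.lineno)
    return sorted(hits)
```

Method calls such as model.eval() and ast.literal_eval(...) are not flagged, because their callee is an attribute access rather than an identifier. A plain regex on "eval(" cannot make that distinction easily.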
Supported languages¶
| language value | Aliases | Grammar package |
|---|---|---|
| python | — | tree-sitter-python |
| cpp | c++ | tree-sitter-cpp |
| c | — | tree-sitter-c |
| javascript | js | tree-sitter-javascript |
| typescript | ts | tree-sitter-typescript |
| go | golang | tree-sitter-go |
| rust | — | tree-sitter-rust |
| java | — | tree-sitter-java |
| kotlin | — | tree-sitter-kotlin |
| ruby | rb | tree-sitter-ruby |
| c_sharp | csharp, cs | tree-sitter-c-sharp |
Install all grammars at once: pip install "sentrik[treesitter]"
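When generating rules programmatically, it can help to normalize aliases to the canonical language value first. The mapping below is copied from the table. `normalize_language` is a hypothetical helper, and its case-insensitive handling is an assumption about what is convenient, not documented behavior.

```python
LANGUAGE_ALIASES = {
    "c++": "cpp",
    "js": "javascript",
    "ts": "typescript",
    "golang": "go",
    "rb": "ruby",
    "csharp": "c_sharp",
    "cs": "c_sharp",
}
CANONICAL_LANGUAGES = {
    "python", "cpp", "c", "javascript", "typescript", "go",
    "rust", "java", "kotlin", "ruby", "c_sharp",
}


def normalize_language(value: str) -> str:
    """Map an alias to its canonical language value, or raise ValueError."""
    key = value.strip().lower()
    key = LANGUAGE_ALIASES.get(key, key)
    if key not in CANONICAL_LANGUAGES:
        raise ValueError(f"unsupported language: {value!r}")
    return key
```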
Writing queries¶
The best way to explore the AST structure for a language is the tree-sitter playground. Paste in sample code, browse the node tree, and copy node type names directly into your query.
Common patterns:
# Flag a specific function call by name (Go)
query: |
  (call_expression function: (selector_expression
    field: (field_identifier) @fn (#eq? @fn "Exec")))
# Flag any class whose name starts with a lowercase letter (Java)
query: |
  (class_declaration name: (identifier) @name
    (#match? @name "^[a-z]"))
# Flag reinterpret_cast in C++
query: |
  (call_expression function: (template_function
    name: (identifier) @fn (#eq? @fn "reinterpret_cast")))
# Flag empty catch blocks (JavaScript)
query: |
  (catch_clause body: (statement_block . "}"))
Each captured node in the query becomes one finding, with the exact line number and source snippet. Findings on the same line are deduplicated automatically.
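The same-line deduplication described above can be pictured as follows. The finding shape (`rule_id`, `file`, `line`, `snippet`) and the choice of key are assumptions made for illustration; the tool's internal representation is not documented here.

```python
def dedupe_by_line(findings: list[dict]) -> list[dict]:
    """Keep the first finding for each (rule_id, file, line) combination."""
    seen = set()
    unique = []
    for finding in findings:
        key = (finding["rule_id"], finding["file"], finding["line"])
        if key not in seen:
            seen.add(key)
            unique.append(finding)
    return unique
```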
Installation check¶
sentrik list-packs # Tree-sitter rules load silently if grammars are missing
python -c "from guard.rules.treesitter_checks import language_available; print(language_available('cpp'))"
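If the guard package is not importable in the current environment, a rougher check is to ask whether the grammar module itself can be found. The sketch assumes each grammar package installs a module named `tree_sitter_<language>` (for example, tree-sitter-cpp providing `tree_sitter_cpp`). `grammar_installed` is a hypothetical helper.

```python
import importlib.util


def grammar_installed(language: str) -> bool:
    """True if a module named tree_sitter_<language> can be located."""
    return importlib.util.find_spec(f"tree_sitter_{language}") is not None


if __name__ == "__main__":
    for name in ("python", "cpp", "go"):
        print(name, grammar_installed(name))
```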
LLM-enforced¶
llm rules use an AI model to analyze each matching file and identify violations. They can catch semantic issues that regex and AST cannot — constructor contracts, naming conventions, design patterns, architectural constraints.
- id: MYPACK-007
  name: single-argument-constructors-explicit
  description: "Single-argument constructors must be marked explicit to prevent implicit conversions"
  type: llm
  severity: high
  file_glob: "**/*.{hpp,h}"
  remediation_guidance: "Add the 'explicit' keyword before the constructor declaration."
Behavior by environment:
| LLM configured? | Behavior |
|---|---|
| Yes | Evaluated per file, findings with line numbers, confidence: 0.75 |
| No | Degrades to a single informational obligation — pack still loads and reports the rule |
Configure an LLM in VS Code (SENTRIK: Configure AI Provider) or via env var (GUARD_LLM_PROVIDER=anthropic). The rule description and remediation_guidance are sent as-is to the model — write them precisely.
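The two behaviors in the table can be summarized as a single decision. The sketch below looks only at the GUARD_LLM_PROVIDER environment variable mentioned above. It ignores the VS Code setting, and `llm_rule_mode` and its return values are hypothetical.

```python
import os
from typing import Mapping, Optional


def llm_rule_mode(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return "evaluate" when a provider is configured, else "obligation"."""
    env = os.environ if environ is None else environ
    # An unset or empty variable is treated as "no LLM configured".
    return "evaluate" if env.get("GUARD_LLM_PROVIDER") else "obligation"
```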
Tips for effective LLM rules:
- Be specific in the description: "constructors with exactly one non-defaulted parameter" is better than "constructors"
- Keep file_glob tight — LLM rules are more expensive than regex
- Use params.llm_max_chars: 6000 to limit file size for large files
- Severity should reflect the real impact — the LLM returns findings at the severity you set
Documentation obligation¶
- id: MYPACK-DOC-001
  name: incident-response-plan
  type: documentation_obligation
  severity: info
  standard: "Internal Policy"
  clause: "IRP-01"
  remediation_guidance: "Create and maintain an incident response plan"
Adding skill guidance¶
Skills give AI agents rich, actionable context for your custom pack rules — correct code patterns, anti-patterns, and a verification checklist. Without skills, agents see only the raw rule descriptions.
Option 1: Hand-authored skills (recommended)¶
Create a skills/ subdirectory alongside your pack.yaml:
.sentrik/rules/
    my-api-rules/
        pack.yaml
        skills/
            api-security.md    # covers MYAPI-001 through MYAPI-003
Each .md file is a full skill with frontmatter + sections. See the Compliance Skills guide for the complete format.
Minimal example:
---
skill_id: MYAPI-SECURITY
version: 1.0.0
pack: my-api-rules
rules:
- MYAPI-001
- MYAPI-002
languages: [python]
severity_floor: high
---
## Purpose
Enforce authentication and input validation on all API endpoints.
## Requirements
1. All endpoints must require authentication.
2. All inputs must be validated before use.
## Patterns
```python
@require_auth
def get_user(user_id: int) -> User:
    validated_id = validate_int(user_id, min=1)
    return db.get(validated_id)
```
## Anti-Patterns
```python
# WRONG: no auth, raw input passed to query
def get_user(user_id):
    return db.execute(f"SELECT * FROM users WHERE id={user_id}")
```
## Verification Checklist
- [ ] All endpoints decorated with @require_auth
- [ ] All path/query parameters validated before use
Option 2: Auto-synthesized (zero work)¶
If no `skills/` directory exists, sentrik generates a minimal skill from your `pack.yaml` at runtime. The quality is lower (no code examples), but AI agents still receive the rule requirements and remediation hints automatically.
To get the best results from auto-synthesis, write detailed `description` and `remediation_guidance` fields in your rules:
```yaml
rules:
  - id: MYAPI-001
    description: "All API endpoints must require authentication — unauthenticated access enables data theft."
    remediation_guidance: "Add @require_auth decorator. See docs/auth-patterns.md for implementation examples."
    severity: high
```
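To see why detailed fields matter for auto-synthesis, consider the rough sketch below, which builds a skill body from rule metadata. The output format is invented for illustration and `synthesize_skill` is a hypothetical function; the skill sentrik actually generates may look quite different.

```python
def synthesize_skill(pack_name: str, rules: list[dict]) -> str:
    """Build a minimal Markdown skill body from rule metadata (hypothetical format)."""
    lines = ["## Purpose", f"Rules from the {pack_name} pack.", "## Requirements"]
    for number, rule in enumerate(rules, start=1):
        # Fall back to the rule name when no description was written.
        summary = rule.get("description") or rule["name"]
        lines.append(f"{number}. [{rule['id']}] {summary}")
        guidance = rule.get("remediation_guidance")
        if guidance:
            lines.append(f"   Remediation: {guidance}")
    return "\n".join(lines) + "\n"
```

A rule with no description contributes only its kebab-case name, which gives an agent very little to act on.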
Registering a pack¶
Add an entry to _BUILTIN_PACKS in src/guard/packs/registry.py:
_BUILTIN_PACKS: dict[str, str] = {
    "fda-iec-62304": "guard.packs.fda_iec_62304",
    "owasp-top-10": "guard.packs.owasp_top_10",
    "soc2": "guard.packs.soc2",
    "my-custom-pack": "guard.packs.my_custom_pack",
}
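The registry maps a pack name to a dotted module path, so a lookup presumably ends in an import of that module. The sketch below shows one way this could work. `resolve_pack_module` is a hypothetical helper, and the actual loader in registry.py may behave differently.

```python
import importlib
from types import ModuleType


def resolve_pack_module(name: str, registry: dict[str, str]) -> ModuleType:
    """Import the module registered under a pack name, with a readable error."""
    try:
        dotted_path = registry[name]
    except KeyError:
        known = ", ".join(sorted(registry))
        raise ValueError(f"unknown pack {name!r}; registered packs: {known}") from None
    return importlib.import_module(dotted_path)
```

A typo in either the key or the dotted path surfaces here: the first as the ValueError above, the second as an ImportError from `import_module`.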
Enabling¶
Or in .guard.yaml:
Testing¶
sentrik list-packs # Verify pack loads
sentrik list-rules # Verify rules appear
sentrik scan # Check findings
ID naming conventions¶
| Pack | Format | Example |
|---|---|---|
| FDA IEC 62304 | IEC62304-{category}-{num} | IEC62304-CODE-001 |
| OWASP Top 10 | OWASP-{category}-{num} | OWASP-A01-001 |
| SOC2 | SOC2-{category}-{num} | SOC2-CC6-001 |
Use a consistent prefix for your organization.
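A small check can enforce the prefix convention across a pack. The pattern below is inferred from the examples in this guide (uppercase segments separated by hyphens, ending in a three-digit number). It is an assumption, not a format the tool is known to enforce, and `is_valid_rule_id` is a hypothetical helper.

```python
import re

# Uppercase prefix, optional category segments, then a three-digit number.
RULE_ID = re.compile(r"[A-Z][A-Z0-9]*(?:-[A-Z0-9]+)*-\d{3}")


def is_valid_rule_id(rule_id: str, prefix: str) -> bool:
    """True when the ID fits the inferred shape and starts with the given prefix."""
    return RULE_ID.fullmatch(rule_id) is not None and rule_id.startswith(prefix + "-")
```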
Severity guidelines¶
| Severity | When to use |
|---|---|
| critical | Security vulnerabilities, data exposure |
| high | Bugs likely to cause issues, unsafe patterns |
| medium | Code quality, maintainability risks |
| low | Style issues, minor improvements |
| info | Documentation obligations, informational |