Configure code chunking in GrepAI. Use this skill to optimize how code is split for embedding.
npx skill4agent add yoanbernabeu/grepai-skills grepai-chunking

┌─────────────────────────────────────┐
│          Large Source File          │
│           (1000+ tokens)            │
└─────────────────────────────────────┘
                  ↓
┌─────────┐  ┌─────────┐  ┌─────────┐
│ Chunk 1 │  │ Chunk 2 │  │ Chunk 3 │
│  ~512   │  │  ~512   │  │  ~512   │
│ tokens  │  │ tokens  │  │ tokens  │
└─────────┘  └─────────┘  └─────────┘
                  ↓
           Each chunk gets
          its own embedding

# .grepai/config.yaml
chunking:
  size: 512      # Tokens per chunk
  overlap: 50    # Overlap between chunks

| Size | Effect |
|---|---|
| 256 | More precise, less context |
| 512 | Balanced (default) |
| 1024 | More context, less precise |

| Overlap | Effect |
|---|---|
| 0 | No overlap, may lose context at boundaries |
| 50 | Standard overlap (default) |
| 100 | More context, larger index |
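The `size` and `overlap` settings in the tables above can be read as a sliding window: each chunk holds up to `size` tokens, and the next chunk starts `size - overlap` tokens later. The Go sketch below illustrates that arithmetic only. The `Range` type and `chunkRanges` helper are hypothetical names introduced here, and it assumes a plain token-offset window, which may not match how GrepAI actually splits files.

```go
package main

import "fmt"

// Range is a 1-based, inclusive span of token positions.
type Range struct{ Start, End int }

// chunkRanges splits total tokens into windows of at most size tokens,
// where each window starts (size - overlap) tokens after the previous one.
// Hypothetical helper for illustration; not part of GrepAI.
func chunkRanges(total, size, overlap int) []Range {
	if total <= 0 || size <= 0 || overlap < 0 || overlap >= size {
		return nil // invalid settings: the window would never advance
	}
	step := size - overlap
	var out []Range
	for start := 1; start <= total; start += step {
		end := start + size - 1
		if end > total {
			end = total
		}
		out = append(out, Range{start, end})
		if end == total {
			break // the last chunk reached the end of the file
		}
	}
	return out
}

func main() {
	// Assumed inputs: a 1000-token file with the default settings.
	for i, r := range chunkRanges(1000, 512, 50) {
		fmt.Printf("Chunk %d: tokens %d-%d\n", i+1, r.Start, r.End)
	}
}
```

With `size: 512` and `overlap: 50` on a 1000-token file, this prints three chunks: tokens 1-512, 463-974, and 925-1000.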

File: auth.go (1000 tokens)

Chunk 1: tokens 1-512
┌────────────────────────────────────┐
│ func Login(user, pass)...          │
└────────────────────────────────────┘
                    ↘
                      50 token overlap
                    ↙
Chunk 2: tokens 463-974
┌────────────────────────────────────┐
│ ...validate credentials...         │
└────────────────────────────────────┘
                    ↘
                      50 token overlap
                    ↙
Chunk 3: tokens 925-1000
┌──────────────┐
│ ...return    │
└──────────────┘

chunking:
  size: 768      # Larger to capture full methods
  overlap: 75

chunking:
  size: 512      # Standard size
  overlap: 50

chunking:
  size: 384      # Smaller for precise results
  overlap: 40

chunking:
  size: 384      # Capture individual functions
  overlap: 40

chunking:
  size: 768      # Capture more context
  overlap: 100

chunking:
  size: 512      # Balanced default
  overlap: 50

func calculateTotal(items []Item) float64 {
    total := 0.0
    for _, item := range items {
        total += item.Price * float64(item.Quantity)
    }
    return total
}

| Size | Overlap | Chunks per 10K tokens | Index Impact |
|---|---|---|---|
| 512 | 0 | ~20 | Smallest |
| 512 | 50 | ~22 | Standard |
| 512 | 100 | ~24 | +10% |
| 256 | 50 | ~44 | +100% |
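The chunk counts in the table above follow from how far each window advances: a chunk covers `size` tokens but only moves forward by `size - overlap`. Below is a rough Go sketch of that arithmetic, assuming a simple sliding window over token offsets. The `estimateChunks` helper is a hypothetical name, and GrepAI's real chunker may count differently, so expect results near the table's rounded figures rather than identical to them.

```go
package main

import "fmt"

// estimateChunks returns how many chunks a sliding window of the given size
// and overlap needs to cover total tokens.
// Hypothetical helper for illustration; not part of GrepAI.
func estimateChunks(total, size, overlap int) int {
	if total <= 0 || size <= 0 || overlap < 0 || overlap >= size {
		return 0 // invalid settings
	}
	if total <= size {
		return 1 // the whole input fits in one chunk
	}
	step := size - overlap
	// One chunk for the first window, plus a ceiling division for the rest.
	return (total-size+step-1)/step + 1
}

func main() {
	const total = 10000 // tokens, matching the table's "per 10K tokens" column
	configs := []struct{ size, overlap int }{
		{512, 0}, {512, 50}, {512, 100}, {256, 50},
	}
	for _, c := range configs {
		fmt.Printf("size=%d overlap=%d -> ~%d chunks\n",
			c.size, c.overlap, estimateChunks(total, c.size, c.overlap))
	}
}
```

Under this model, halving `size` roughly doubles the number of chunks, which is why the 256-token row carries the largest index impact.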

Query: "authentication middleware"
Result: "...c.AbortWithStatus(401)..."
(Fragment, missing context)

Query: "authentication middleware"
Result: "func AuthMiddleware() gin.HandlerFunc {
    return func(c *gin.Context) {
        token := c.GetHeader("Authorization")
        if token == "" {
            c.AbortWithStatus(401)
            return
        }
        // validate token...
    }
}"
(Complete function with context)

Query: "authentication middleware"
Result: "// Multiple unrelated functions...
func AuthMiddleware()... (your match)
func LoggingMiddleware()...
func CORSMiddleware()..."
(Too much noise)

chunking:
  size: 384
  overlap: 40

rm .grepai/index.gob
grepai watch

grepai search "your query"

grepai search "authentication" > before.txt

grepai search "authentication" > after.txt
diff before.txt after.txt

chunking:
  size: 768

chunking:
  size: 384

chunking:
  overlap: 100

✅ Chunking Configuration
Size: 512 tokens
Overlap: 50 tokens

Index Statistics:
- Total files: 245
- Total chunks: 1,234
- Avg chunks/file: 5.0
- Avg chunk size: 478 tokens

Recommendations:
- Current settings are balanced
- Consider size: 384 for more precise results
- Consider size: 768 for more context