
LangChain4j RAG Implementation Patterns

When to Use This Skill

Use this skill when:
  • Building knowledge-based AI applications requiring external document access
  • Implementing question-answering systems over large document collections
  • Creating AI assistants with access to company knowledge bases
  • Building semantic search capabilities for document repositories
  • Implementing chat systems that reference specific information sources
  • Creating AI applications requiring source attribution
  • Building domain-specific AI systems with curated knowledge
  • Implementing hybrid search combining vector similarity with traditional search
  • Creating AI applications requiring real-time document updates
  • Building multi-modal RAG systems with text, images, and other content types

Overview

Implement complete Retrieval-Augmented Generation (RAG) systems with LangChain4j. RAG enhances language models by providing relevant context from external knowledge sources, improving accuracy and reducing hallucinations.

Instructions

Initialize RAG Project

Create a new Spring Boot project with required dependencies:
pom.xml:
xml
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-spring-boot-starter</artifactId>
    <version>1.8.0</version>
</dependency>
<dependency>
    <groupId>dev.langchain4j</groupId>
    <artifactId>langchain4j-open-ai</artifactId>
    <version>1.8.0</version>
</dependency>

Set Up Document Ingestion

Configure document loading and processing:
java
@Configuration
public class RAGConfiguration {

    @Bean
    public EmbeddingModel embeddingModel() {
        return OpenAiEmbeddingModel.builder()
            .apiKey(System.getenv("OPENAI_API_KEY"))
            .modelName("text-embedding-3-small")
            .build();
    }

    @Bean
    public EmbeddingStore<TextSegment> embeddingStore() {
        return new InMemoryEmbeddingStore<>();
    }
}
Create document ingestion service:
java
@Service
@RequiredArgsConstructor
public class DocumentIngestionService {

    private final EmbeddingModel embeddingModel;
    private final EmbeddingStore<TextSegment> embeddingStore;

    public void ingestDocument(String filePath, Map<String, Object> metadata) {
        Document document = FileSystemDocumentLoader.loadDocument(filePath);
        // Copy caller-supplied metadata onto the document entry by entry
        metadata.forEach((key, value) -> document.metadata().put(key, value.toString()));

        DocumentSplitter splitter = DocumentSplitters.recursive(
            500, 50, new OpenAiTokenCountEstimator("text-embedding-3-small")
        );

        List<TextSegment> segments = splitter.split(document);
        List<Embedding> embeddings = embeddingModel.embedAll(segments).content();
        embeddingStore.addAll(embeddings, segments);
    }
}

Configure Content Retrieval

Set up content retrieval with filtering:
java
@Configuration
public class ContentRetrieverConfiguration {

    @Bean
    public ContentRetriever contentRetriever(
            EmbeddingStore<TextSegment> embeddingStore,
            EmbeddingModel embeddingModel) {

        return EmbeddingStoreContentRetriever.builder()
            .embeddingStore(embeddingStore)
            .embeddingModel(embeddingModel)
            .maxResults(5)
            .minScore(0.7)
            .build();
    }
}
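
The minScore threshold above filters matches by relevance score, which is derived from the cosine similarity between the query embedding and each stored segment's embedding. The following is a minimal stdlib sketch of what that score measures — it is illustrative only, not the LangChain4j API, and the (cos + 1) / 2 mapping shown is a common convention rather than a guaranteed implementation detail:

```java
// Illustrative sketch (not the LangChain4j API): how a minScore threshold
// relates to cosine similarity between a query vector and stored vectors.
public class CosineSketch {

    // Cosine similarity in [-1, 1] between two equal-length vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            na += a[i] * a[i];
            nb += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // A common way to map cosine similarity onto a [0, 1] relevance score.
    static double relevance(double cosine) {
        return (cosine + 1.0) / 2.0;
    }

    public static void main(String[] args) {
        double[] query = {0.9, 0.1, 0.0};
        double[] close = {0.8, 0.2, 0.1};
        double[] far   = {0.0, 0.1, 0.9};
        System.out.printf("close: %.3f, far: %.3f%n",
            relevance(cosine(query, close)), relevance(cosine(query, far)));
        // With minScore(0.7), only segments whose relevance >= 0.7 are returned.
    }
}
```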

Create RAG-Enabled AI Service

Define AI service with context retrieval:
java
interface KnowledgeAssistant {
    @SystemMessage("""
        You are a knowledgeable assistant with access to a comprehensive knowledge base.

        When answering questions:
        1. Use the provided context from the knowledge base
        2. If information is not in the context, clearly state this
        3. Provide accurate, helpful responses
        4. When possible, reference specific sources
        5. If the context is insufficient, ask for clarification
        """)
    String answerQuestion(String question);
}

@Service
public class KnowledgeService {

    private final KnowledgeAssistant assistant;

    // Note: no @RequiredArgsConstructor here — Lombok would generate a second
    // constructor for the final field that conflicts with this explicit one.
    public KnowledgeService(ChatModel chatModel, ContentRetriever contentRetriever) {
        this.assistant = AiServices.builder(KnowledgeAssistant.class)
            .chatModel(chatModel)
            .contentRetriever(contentRetriever)
            .build();
    }

    public String answerQuestion(String question) {
        return assistant.answerQuestion(question);
    }
}

Examples

Basic Document Processing

java
public class BasicRAGExample {
    public static void main(String[] args) {
        var embeddingStore = new InMemoryEmbeddingStore<TextSegment>();

        var embeddingModel = OpenAiEmbeddingModel.builder()
            .apiKey(System.getenv("OPENAI_API_KEY"))
            .modelName("text-embedding-3-small")
            .build();

        var ingestor = EmbeddingStoreIngestor.builder()
            .embeddingModel(embeddingModel)
            .embeddingStore(embeddingStore)
            .build();

        ingestor.ingest(Document.from("Spring Boot is a framework for building Java applications with minimal configuration."));

        var retriever = EmbeddingStoreContentRetriever.builder()
            .embeddingStore(embeddingStore)
            .embeddingModel(embeddingModel)
            .build();

        // Retrieve the segments most relevant to a query
        List<Content> results = retriever.retrieve(Query.from("What is Spring Boot?"));
        results.forEach(content -> System.out.println(content.textSegment().text()));
    }
}

Multi-Domain Assistant

java
interface MultiDomainAssistant {
    @SystemMessage("""
        You are an expert assistant with access to multiple knowledge domains:
        - Technical documentation
        - Company policies
        - Product information
        - Customer support guides

        Tailor your response based on the type of question and available context.
        Always indicate which domain the information comes from.
        """)
    String answerQuestion(@MemoryId String userId, String question);
}

Hierarchical RAG

java
@Service
@RequiredArgsConstructor
public class HierarchicalRAGService {

    private final EmbeddingStore<TextSegment> chunkStore;
    private final EmbeddingStore<TextSegment> summaryStore;
    private final EmbeddingModel embeddingModel;

    public String performHierarchicalRetrieval(String query) {
        List<EmbeddingMatch<TextSegment>> summaryMatches = searchSummaries(query);
        List<TextSegment> relevantChunks = new ArrayList<>();

        for (EmbeddingMatch<TextSegment> summaryMatch : summaryMatches) {
            String documentId = summaryMatch.embedded().metadata().getString("documentId");
            List<EmbeddingMatch<TextSegment>> chunkMatches = searchChunksInDocument(query, documentId);
            chunkMatches.stream()
                .map(EmbeddingMatch::embedded)
                .forEach(relevantChunks::add);
        }

        return generateResponseWithChunks(query, relevantChunks);
    }
}

Best Practices

Document Segmentation

  • Use recursive splitting with 500-1000 token chunks for most applications
  • Maintain 20-50 token overlap between chunks for context preservation
  • Consider document structure (headings, paragraphs) when splitting
  • Use token-aware splitters for optimal embedding generation
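
To make the overlap recommendation concrete, here is a minimal sketch of fixed-size chunking with overlap, using characters as a stand-in for tokens (production splitters such as DocumentSplitters.recursive count tokens and prefer paragraph and sentence boundaries):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of fixed-size chunking with overlap. Characters stand in for tokens
// so the mechanics are easy to see; real splitters are token-aware.
public class ChunkingSketch {

    static List<String> split(String text, int chunkSize, int overlap) {
        List<String> chunks = new ArrayList<>();
        int step = chunkSize - overlap; // how far the window advances each chunk
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + chunkSize, text.length());
            chunks.add(text.substring(start, end));
            if (end == text.length()) break;
        }
        return chunks;
    }

    public static void main(String[] args) {
        // Consecutive chunks share `overlap` characters of context.
        System.out.println(split("abcdefghij", 4, 2)); // → [abcd, cdef, efgh, ghij]
    }
}
```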

Metadata Strategy

  • Include rich metadata for filtering and attribution:
    • User and tenant identifiers for multi-tenancy
    • Document type and category classification
    • Creation and modification timestamps
    • Version and author information
    • Confidentiality and access level tags
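
As an illustration of metadata-driven filtering, the multi-tenancy case can be modeled as a predicate over each segment's metadata map. This is a stdlib sketch, not the LangChain4j filter API, and the Segment record and field names here are illustrative:

```java
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Sketch of metadata-based filtering at retrieval time, reduced to a plain
// predicate over a metadata map.
public class MetadataFilterSketch {

    record Segment(String text, Map<String, Object> metadata) {}

    static List<Segment> filter(List<Segment> segments, Predicate<Map<String, Object>> filter) {
        return segments.stream().filter(s -> filter.test(s.metadata())).toList();
    }

    public static void main(String[] args) {
        List<Segment> segments = List.of(
            new Segment("Q3 report", Map.of("tenantId", "acme", "docType", "report")),
            new Segment("HR policy", Map.of("tenantId", "globex", "docType", "policy")));

        // Multi-tenancy: only return segments belonging to the caller's tenant.
        List<Segment> visible = filter(segments, m -> "acme".equals(m.get("tenantId")));
        System.out.println(visible.size() + " segment(s) visible"); // prints 1
    }
}
```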

Query Processing

  • Implement query preprocessing and cleaning
  • Consider query expansion for better recall
  • Apply dynamic filtering based on user context
  • Use re-ranking for improved result quality
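
A minimal sketch of the preprocessing step: lowercase, strip punctuation, and drop stopwords before embedding. The stopword list here is illustrative; real systems typically use a curated list or an LLM-based query transformer:

```java
import java.util.Arrays;
import java.util.Set;
import java.util.stream.Collectors;

// Sketch of simple query preprocessing before embedding the query.
public class QueryPreprocessor {

    // Illustrative stopword list only.
    private static final Set<String> STOPWORDS = Set.of("the", "a", "an", "is", "of", "to", "what");

    static String preprocess(String query) {
        return Arrays.stream(query.toLowerCase()
                .replaceAll("[^a-z0-9\\s]", " ")   // strip punctuation
                .split("\\s+"))
            .filter(w -> !w.isBlank() && !STOPWORDS.contains(w))
            .collect(Collectors.joining(" "));
    }

    public static void main(String[] args) {
        System.out.println(preprocess("What is the refund policy?")); // → refund policy
    }
}
```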

Performance Optimization

  • Cache embeddings for repeated queries
  • Use batch embedding generation for bulk operations
  • Implement pagination for large result sets
  • Consider asynchronous processing for long operations
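
Caching embeddings for repeated queries can be as simple as memoizing the model call. In this sketch the hypothetical EmbeddingCache class is illustrative, and embedFn stands in for a real EmbeddingModel invocation:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Sketch of caching embeddings for repeated queries; the raw query text is
// the cache key.
public class EmbeddingCache {

    private final Map<String, List<Double>> cache = new ConcurrentHashMap<>();
    private final Function<String, List<Double>> embedFn;

    EmbeddingCache(Function<String, List<Double>> embedFn) {
        this.embedFn = embedFn;
    }

    List<Double> embed(String text) {
        // computeIfAbsent invokes the model only on a cache miss.
        return cache.computeIfAbsent(text, embedFn);
    }

    public static void main(String[] args) {
        AtomicInteger calls = new AtomicInteger();
        EmbeddingCache cache = new EmbeddingCache(text -> {
            calls.incrementAndGet();                // count simulated model calls
            return List.of((double) text.length()); // fake embedding
        });
        cache.embed("hello");
        cache.embed("hello"); // served from cache, no second model call
        System.out.println("model calls: " + calls.get()); // prints 1
    }
}
```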

Common Patterns

Simple RAG Pipeline

java
@RequiredArgsConstructor
@Service
public class SimpleRAGPipeline {

    private final EmbeddingModel embeddingModel;
    private final EmbeddingStore<TextSegment> embeddingStore;
    private final ChatModel chatModel;

    public String answerQuestion(String question) {
        Embedding queryEmbedding = embeddingModel.embed(question).content();
        EmbeddingSearchRequest request = EmbeddingSearchRequest.builder()
            .queryEmbedding(queryEmbedding)
            .maxResults(3)
            .build();

        List<TextSegment> segments = embeddingStore.search(request).matches().stream()
            .map(EmbeddingMatch::embedded)
            .collect(Collectors.toList());

        String context = segments.stream()
            .map(TextSegment::text)
            .collect(Collectors.joining("\n\n"));

        return chatModel.chat(context + "\n\nQuestion: " + question + "\nAnswer:");
    }
}

Hybrid Search (Vector + Keyword)

java
@Service
@RequiredArgsConstructor
public class HybridSearchService {

    private final EmbeddingStore<TextSegment> vectorStore;
    private final FullTextSearchEngine keywordEngine;
    private final EmbeddingModel embeddingModel;

    public List<Content> hybridSearch(String query, int maxResults) {
        // Vector search
        List<Content> vectorResults = performVectorSearch(query, maxResults);

        // Keyword search
        List<Content> keywordResults = performKeywordSearch(query, maxResults);

        // Combine and re-rank using RRF algorithm
        return combineResults(vectorResults, keywordResults, maxResults);
    }
}
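
The RRF step mentioned in the comment (Reciprocal Rank Fusion) can be sketched with stdlib collections. Documents are reduced to ids here, and k = 60 is the conventional smoothing constant:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch of Reciprocal Rank Fusion: each input ranking contributes
// 1 / (k + rank) per document, and documents are re-sorted by summed score.
public class ReciprocalRankFusion {

    static List<String> fuse(List<List<String>> rankings, int maxResults) {
        int k = 60; // conventional RRF smoothing constant
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranking : rankings) {
            for (int rank = 0; rank < ranking.size(); rank++) {
                scores.merge(ranking.get(rank), 1.0 / (k + rank + 1), Double::sum);
            }
        }
        return scores.entrySet().stream()
            .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
            .limit(maxResults)
            .map(Map.Entry::getKey)
            .toList();
    }

    public static void main(String[] args) {
        List<String> vector = List.of("doc1", "doc2", "doc3");
        List<String> keyword = List.of("doc3", "doc1", "doc4");
        // doc1 ranks first: it appears near the top of both lists.
        System.out.println(fuse(List.of(vector, keyword), 3));
    }
}
```

Documents found by both searches accumulate score from each list, so agreement between vector and keyword retrieval is rewarded without needing to normalize their incomparable raw scores.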

Troubleshooting

Common Issues

Poor Retrieval Results
  • Check document chunk size and overlap settings
  • Verify embedding model compatibility
  • Ensure metadata filters are not too restrictive
  • Consider adding re-ranking step
Slow Performance
  • Use cached embeddings for frequent queries
  • Optimize database indexing for vector stores
  • Implement pagination for large datasets
  • Consider async processing for bulk operations
High Memory Usage
  • Use disk-based embedding stores for large datasets
  • Implement proper pagination and filtering
  • Clean up unused embeddings periodically
  • Monitor and optimize chunk sizes
