Drawbacks:
- Time/latency: it takes time for the LLM to output all the chunks.
- Hitting the output context window cap: since you're essentially re-creating entire documents, just in chunks, you'll often hit the token capacity of the output window.
- Cost: since you're essentially outputting entire documents again, your costs go up (see the sketch below for why output tokens scale with document length).
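To make the drawbacks concrete, here is a minimal sketch of LLM-based (agentic) chunking. `call_llm` is a hypothetical wrapper standing in for whatever chat-completion client you use; the prompt wording is illustrative, not from the original post. Because the model re-emits the full document, output tokens (and therefore latency and cost) grow with document length.

```python
from typing import List

CHUNKING_PROMPT = """Split the following document into self-contained chunks.
Separate chunks with the line '---CHUNK---'. Preserve the original text verbatim.

Document:
{document}"""


def call_llm(prompt: str) -> str:
    """Hypothetical helper: replace with a call to your LLM provider."""
    raise NotImplementedError


def agentic_chunk(document: str) -> List[str]:
    # The model re-creates the entire document in its output, so the
    # output token count is roughly the document length: this is where
    # the latency, context-window, and cost drawbacks above come from.
    response = call_llm(CHUNKING_PROMPT.format(document=document))
    return [c.strip() for c in response.split("---CHUNK---") if c.strip()]
```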
This can be combined with contextual RAG, rounding it out into a complete RAG setup.
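A minimal sketch of the contextual-RAG idea, reusing the hypothetical `call_llm` helper from the sketch above: before embedding, ask the LLM to write a short document-level context for each chunk and prepend it, so the chunk retrieves well in isolation. The prompt text is an assumption for illustration.

```python
CONTEXT_PROMPT = """Here is a document:
{document}

Here is a chunk from that document:
{chunk}

Write one or two sentences situating this chunk within the overall document,
to improve search retrieval of the chunk. Answer with the context only."""


def contextualize_chunks(document: str, chunks: list[str]) -> list[str]:
    contextualized = []
    for chunk in chunks:
        context = call_llm(CONTEXT_PROMPT.format(document=document, chunk=chunk))
        # The generated context is prepended; the combined text is what
        # gets embedded and indexed.
        contextualized.append(f"{context}\n\n{chunk}")
    return contextualized
```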
Note: when splitting, you often don't need to split at the sentence level; splitting on newline characters is usually good enough.
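A minimal sketch of that newline-based approach: no sentence segmentation, just accumulate lines greedily up to a size cap. The `max_len` parameter is an illustrative assumption, not something from the post.

```python
def split_on_newlines(text: str, max_len: int = 1000) -> list[str]:
    chunks: list[str] = []
    current = ""
    for line in text.splitlines(keepends=True):
        # Flush the current chunk when adding this line would exceed the cap.
        if len(current) + len(line) > max_len and current.strip():
            chunks.append(current.strip())
            current = ""
        current += line
    if current.strip():
        chunks.append(current.strip())
    return chunks
```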
References:
- Reddit discussion: https://www.reddit.com/r/LocalLLaMA/comments/1kmcdyt/llm_better_chunking_method/
- Chunking strategies for RAG (Weaviate): https://weaviate.io/blog/chunking-strategies-for-rag#agentic-chunking
Implementation code
[[The related code has been posted to 知识星球 (Knowledge Planet)]]
Everyone is welcome to join the discussion.