Text splitters break long documents into smaller chunks.
Each chunk should fit within the model's context window while keeping related text together.
Splitter Types
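CharacterTextSplitter: Splits on a single separator ("\n\n" by default) and measures chunk size in characters
RecursiveCharacterTextSplitter: Tries a list of separators in order (paragraphs, then lines, then words) until chunks fit the size limit
TokenTextSplitter: Measures chunk size in tokens rather than characters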
NLTKTextSplitter, SpacyTextSplitter: By sentences (require the nltk or spacy package)
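To make the recursive idea concrete, here is a minimal pure-Python sketch. It is not LangChain's implementation: the function name recursive_split and its exact merging rules are assumptions made for illustration, and it ignores overlap entirely.

```python
def recursive_split(text, chunk_size, separators=("\n\n", "\n", " ", "")):
    """Illustrative sketch only (hypothetical helper, not the LangChain API).

    Split on the coarsest separator first, and fall back to finer
    separators only for pieces that are still longer than chunk_size.
    """
    if len(text) <= chunk_size:
        return [text] if text else []
    if not separators or separators[0] == "":
        # Last resort: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if len(candidate) <= chunk_size:
            current = candidate  # piece still fits in the current chunk
            continue
        if current:
            chunks.append(current)
        if len(piece) <= chunk_size:
            current = piece  # start a new chunk with this piece
        else:
            # Piece is too long on its own: retry with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
            current = ""
    if current:
        chunks.append(current)
    return chunks


sample = "First paragraph.\n\nSecond paragraph is a bit longer than the first.\n\nThird."
for chunk in recursive_split(sample, chunk_size=30):
    print(repr(chunk))
```

Paragraph breaks are preferred as cut points, and only the oversized middle paragraph is broken further at word boundaries.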
Example
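# In newer LangChain releases, import from langchain_text_splitters instead.
from langchain.text_splitter import RecursiveCharacterTextSplitter

splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,    # maximum chunk length, in characters by default
    chunk_overlap=200,  # maximum overlap between neighboring chunks
)
# long_text: a str containing the document to split
chunks = splitter.split_text(long_text)  # returns a list of str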
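The effect of chunk_size and chunk_overlap can be seen in a simplified sliding-window sketch. This is an assumption-laden illustration, not what the library does internally: sliding_window is a hypothetical helper that cuts at fixed character offsets, whereas the real splitter cuts at separators, so its actual overlap may be smaller than the configured value.

```python
def sliding_window(text, chunk_size, chunk_overlap):
    """Hypothetical helper: fixed-width character windows with overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


print(sliding_window("abcdefghij", chunk_size=4, chunk_overlap=2))
# → ['abcd', 'cdef', 'efgh', 'ghij']
```

Each chunk repeats the last two characters of the previous one, which is the context that overlap is meant to preserve across a cut.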
Best Practices
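✅ Use overlap so text cut at a chunk boundary keeps its surrounding context
✅ Match chunk size to model limits (chunk_size counts characters by default, not tokens)
✅ Test on your own data and inspect the resulting chunks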
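One way to test on your data is to summarize the lengths of the chunks a splitter produced. The sketch below assumes chunks is a list of strings such as the one returned by split_text; chunk_report and max_chars are hypothetical names introduced here, and the limit is measured in characters.

```python
from statistics import mean


def chunk_report(chunks, max_chars):
    """Hypothetical helper: summarize chunk lengths in characters."""
    lengths = [len(c) for c in chunks]
    return {
        "count": len(lengths),
        "min": min(lengths, default=0),
        "max": max(lengths, default=0),
        "mean": mean(lengths) if lengths else 0.0,
        # Chunks longer than the limit you care about.
        "over_limit": sum(1 for n in lengths if n > max_chars),
    }


print(chunk_report(["abcd", "ab", "abcdef"], max_chars=5))
```

A nonzero over_limit count, or a minimum far below the mean, suggests trying a different chunk size or separator list.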
Conclusion
Text splitters prepare documents for LLMs!