If you’re building production-ready RAG (Retrieval-Augmented Generation), you’ve likely heard of R2R — the open-source framework for building, scaling, and managing RAG pipelines. While most tutorials focus on ingesting and querying, one question keeps popping up in the community: How do I download my data out of R2R?
from r2r import R2RClient

client = R2RClient("http://localhost:7272")
response = client.documents.export(document_id="doc-123")

# Save to file (assumes response.json is already a JSON string)
with open("exported_chunks.json", "w") as f:
    f.write(response.json)

# Or download raw file
file_content = client.documents.download("doc-123")
with open("downloaded.pdf", "wb") as f:
    f.write(file_content)

Advanced: Download from R2R Cloud

If you’re using R2R Cloud, the process is similar. Just set your API key:
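One way to keep the key out of your source code is to read it from an environment variable before creating the client. The sketch below is a minimal illustration only: the variable name `R2R_API_KEY` and the helper `read_api_key` are assumptions made for this example, so check the R2R Cloud documentation for how your client version expects to receive the key.

```python
import os


def read_api_key(env_var: str = "R2R_API_KEY") -> str:
    """Return the API key stored in env_var, or raise if it is missing.

    The default variable name is an assumption for illustration;
    use whatever name your deployment actually expects.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        # Fail early with a clear message instead of sending unauthenticated requests.
        raise RuntimeError(f"Set {env_var} before connecting to R2R Cloud")
    return key
```

Failing early like this makes a missing key show up as a clear error at startup rather than as a rejected request in the middle of a download.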
r2r documents list --output-format json > all_docs.json
r2r documents list --ids-only | xargs -I {} r2r documents export {} --output ./exports/{}.json

4. Download via Python SDK

For programmatic access:
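The bulk export that the `xargs` one-liner performs can also be sketched in Python. This is a hedged illustration, not R2R's own API: `export_all` and its `export_one` parameter are hypothetical names, and the actual client call is left to a function you pass in, so the sketch makes no claim about the SDK's method signatures.

```python
import json
from pathlib import Path
from typing import Callable, Iterable, List


def export_all(
    doc_ids: Iterable[str],
    export_one: Callable[[str], object],
    out_dir: str = "exports",
) -> List[Path]:
    """Write one JSON file per document ID and return the paths written.

    export_one is a caller-supplied function that takes a document ID and
    returns JSON-serializable data (for example, a wrapper around your client).
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)  # mirrors ./exports/ in the CLI example
    written = []
    for doc_id in doc_ids:
        payload = export_one(doc_id)
        path = target / f"{doc_id}.json"
        path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
        written.append(path)
    return written
```

To use it, pass the list of document IDs and a small function that wraps whatever export call your client version exposes. Keeping that call behind a parameter also lets you test the file-writing logic without a running R2R server.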