A pure Rust tool and library that converts various document formats into Markdown — designed for LLM consumption.
MarkItDown is a great Python library for converting documents to Markdown. But integrating Python into Rust applications means bundling a Python runtime (~50 MB), dealing with cross-platform compatibility issues, and managing dependency hell.
anytomd solves this with a single cargo add anytomd — zero external runtime, no C bindings, no subprocess calls. Just pure Rust.
| Format | Extensions | Notes |
|---|---|---|
| DOCX | .docx | Headings, tables, lists, bold/italic, hyperlinks, images, text boxes |
| PPTX | .pptx | Slides, tables, speaker notes, images, group shapes |
| XLSX | .xlsx | Multi-sheet, date/time handling, images |
| XLS | .xls | Legacy Excel (via calamine) |
| HTML | .html, .htm | Full DOM: headings, tables, lists, links, blockquotes, code blocks |
| CSV | .csv | Converted to Markdown tables |
| Jupyter Notebook | .ipynb | Markdown cells preserved, code cells in fenced blocks with language detection |
| JSON | .json | Pretty-printed in fenced code blocks |
| XML | .xml | Pretty-printed in fenced code blocks |
| Images | .png, .jpg, .gif, .webp, .bmp, .tiff, .svg, .heic, .avif | Optional LLM-based alt text via ImageDescriber |
| Code | .py, .rs, .js, .ts, .c, .cpp, .go, .java, .rb, .swift, .sh, ... | Fenced code blocks with language identifier |
| Plain Text | .txt, .md, .rst, .log, .toml, .yaml, .ini, etc. | Passthrough with encoding detection (UTF-8, UTF-16, Windows-1252) |
Note on PDF: PDF conversion is intentionally out of scope. Gemini, ChatGPT, and Claude already provide native PDF support (with plan/model-specific limits), so anytomd focuses on formats that still benefit from dedicated Markdown conversion. Attempting to convert a PDF will return a descriptive FormatNotSupported error.
Format is auto-detected from magic bytes and file extension. ZIP-based formats (DOCX/PPTX/XLSX) are distinguished by inspecting internal archive structure.
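The detection order described above can be sketched with the standard library alone. This is a minimal illustration, not anytomd's implementation: the `Format` enum, `detect_format`, and the byte-search shortcut for ZIP entry names are all assumptions made for the example (a real detector would read the archive's central directory).

```rust
/// Formats recognized by this sketch (a small, hypothetical subset).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Docx,
    Pptx,
    Xlsx,
    Xls,
    Png,
    Pdf,
    Csv,
    Html,
    Unknown,
}

/// Returns true if `needle` occurs anywhere in `haystack`.
fn contains(haystack: &[u8], needle: &[u8]) -> bool {
    !needle.is_empty() && haystack.windows(needle.len()).any(|w| w == needle)
}

/// Guess a format from leading magic bytes, falling back to the extension.
pub fn detect_format(bytes: &[u8], extension: Option<&str>) -> Format {
    // ZIP local file header. Entry names are stored uncompressed, so a
    // byte search is a cheap stand-in for real archive inspection.
    if bytes.starts_with(b"PK\x03\x04") {
        if contains(bytes, b"word/document.xml") {
            return Format::Docx;
        }
        if contains(bytes, b"ppt/presentation.xml") {
            return Format::Pptx;
        }
        if contains(bytes, b"xl/workbook.xml") {
            return Format::Xlsx;
        }
    }
    if bytes.starts_with(b"\x89PNG\r\n\x1a\n") {
        return Format::Png;
    }
    if bytes.starts_with(b"%PDF-") {
        return Format::Pdf;
    }
    // No usable signature (text formats have none): use the extension.
    let ext = extension.map(|e| e.trim_start_matches('.').to_ascii_lowercase());
    match ext.as_deref() {
        Some("docx") => Format::Docx,
        Some("pptx") => Format::Pptx,
        Some("xlsx") => Format::Xlsx,
        Some("xls") => Format::Xls,
        Some("png") => Format::Png,
        Some("pdf") => Format::Pdf,
        Some("csv") => Format::Csv,
        Some("html") | Some("htm") => Format::Html,
        _ => Format::Unknown,
    }
}
```

Checking magic bytes before the extension means a misnamed file is still classified by its content; in this sketch the extension only decides when no signature matches.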
A CSV file with multilingual data:
Name,Age,City
Alice,30,Seoul
Bob,25,東京
Charlie,35,New York
다영,28,서울
Output:
| Name | Age | City |
|---|---|---|
| Alice | 30 | Seoul |
| Bob | 25 | 東京 |
| Charlie | 35 | New York |
| 다영 | 28 | 서울 |

A Word document with headings, links, Korean text, and emoji:
Output:
# Sample Document
This is a simple paragraph.
## Section One
Visit [Example](https://example.com) for more info.
Korean: 한국어 테스트
Emoji: 🚀✨🌍
### Subsection
Final paragraph with mixed content.

A PowerPoint presentation with slides, tables, speaker notes, and multilingual content:
Output:
## Slide 1: Sample Presentation
Welcome to the presentation.
---
## Slide 2
Data Overview
| Name | Value | Status |
|---|---|---|
| Alpha | 100 | Active |
| Beta | 200 | Inactive |
| Gamma | 300 | Active |
> Note: Remember to explain the data table.
---
## Slide 3: Multilingual
한국어 테스트
🚀✨🌍
> Note: Test multilingual rendering.

cargo add anytomd

npm install anytomd

import init, { convertBytes } from 'anytomd';
await init();
const response = await fetch('document.docx');
const bytes = new Uint8Array(await response.arrayBuffer());
const result = convertBytes(bytes, 'docx');
console.log(result.markdown);

| Feature | Dependencies | Description |
|---|---|---|
| (default) | async-gemini | Async API + AsyncGeminiDescriber; all async features enabled out of the box |
| async | futures-util | Async API (convert_file_async, convert_bytes_async, AsyncImageDescriber trait) |
| async-gemini | async + reqwest | AsyncGeminiDescriber for concurrent image descriptions via Gemini |
| wasm | wasm-bindgen, js-sys, wasm-bindgen-futures | WebAssembly bindings (convertBytes, convertBytesWithOptions) for browser/edge use |
| wasm + async-gemini | (combined) | Adds convertBytesWithGemini for async Gemini-powered conversion in WASM |
Async features are included by default. To opt out:
anytomd = { version = "1", default-features = false }

anytomd compiles to wasm32-unknown-unknown, enabling client-side document conversion in browsers, Cloudflare Workers, Deno Deploy, and other edge runtimes. Documents never leave the user's device.
# Basic WASM build (sync conversion only)
wasm-pack build --target web --no-default-features --features wasm
# With Gemini async image descriptions
wasm-pack build --target web --no-default-features --features wasm,async-gemini

import init, { convertBytes } from './pkg/anytomd.js';
await init();
const response = await fetch('document.docx');
const bytes = new Uint8Array(await response.arrayBuffer());
const result = convertBytes(bytes, 'docx');
console.log(result.markdown);
console.log(result.plainText);
console.log(result.title); // string or null
console.log(result.warnings); // string[]

import init, { convertBytesWithGemini } from './pkg/anytomd.js';
await init();
const response = await fetch('presentation.pptx');
const bytes = new Uint8Array(await response.arrayBuffer());
// Images are described concurrently via the Gemini API
const result = await convertBytesWithGemini(bytes, 'pptx', 'your-gemini-api-key');
console.log(result.markdown); // images have LLM-generated alt text

| API | Native | WASM |
|---|---|---|
| convert_bytes / convertBytes | Yes | Yes |
| convert_bytes_async | Yes | Yes |
| convert_file / convert_file_async | Yes | No (no filesystem) |
| GeminiDescriber (sync) | Yes | No (uses ureq) |
| AsyncGeminiDescriber / convertBytesWithGemini | Yes | Yes (wasm + async-gemini) |
All 12 format converters work on WASM via convert_bytes.
cargo install anytomd

# Convert a single file
anytomd document.docx > output.md
# Convert multiple files (separated by <!-- source: path --> comments)
anytomd report.docx data.csv slides.pptx > combined.md
# Write output to a file
anytomd document.docx -o output.md
# Read from stdin (--format is required)
cat data.csv | anytomd --format csv
# Override format detection
anytomd --format html page.dat
# Strict mode: treat recoverable errors as hard errors
anytomd --strict document.docx
# Plain text output (Markdown formatting stripped)
anytomd --plain-text document.docx
# Plain text from stdin
echo "Name,Age" | anytomd --format csv --plain-text
# Auto image descriptions (just set GEMINI_API_KEY)
export GEMINI_API_KEY=your-key
anytomd presentation.pptx

| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Conversion failure |
| 2 | Invalid arguments |
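When the CLI is driven from another program, the exit codes in the table above can be mapped onto a small enum. The sketch below is one possible wrapper, not part of anytomd: `CliOutcome`, `classify_exit`, and `run_anytomd` are hypothetical names, and it assumes the `anytomd` binary is on `PATH`.

```rust
use std::process::Command;

/// Outcome of one CLI run, following the documented exit codes.
#[derive(Debug, PartialEq, Eq)]
pub enum CliOutcome {
    Success,
    ConversionFailure,
    InvalidArguments,
    /// Any other code, or termination by a signal (no code at all).
    Other(Option<i32>),
}

/// Map a raw exit code onto the documented meanings (0, 1, 2).
pub fn classify_exit(code: Option<i32>) -> CliOutcome {
    match code {
        Some(0) => CliOutcome::Success,
        Some(1) => CliOutcome::ConversionFailure,
        Some(2) => CliOutcome::InvalidArguments,
        other => CliOutcome::Other(other),
    }
}

/// Run `anytomd <path>` and return its stdout plus the classified outcome.
/// Assumes the `anytomd` binary is installed and on PATH.
pub fn run_anytomd(path: &str) -> std::io::Result<(String, CliOutcome)> {
    let output = Command::new("anytomd").arg(path).output()?;
    let markdown = String::from_utf8_lossy(&output.stdout).into_owned();
    Ok((markdown, classify_exit(output.status.code())))
}
```

Separating code 1 from code 2 lets a caller retry or skip a file that failed to convert while still treating a malformed invocation as a bug in its own code.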
use anytomd::{convert_file, convert_bytes, ConversionOptions};
// Convert a file (format auto-detected from extension and magic bytes)
let options = ConversionOptions::default();
let result = convert_file("document.docx", &options).unwrap();
println!("{}", result.markdown);
// Convert raw bytes with an explicit format
let csv_data = b"Name,Age\nAlice,30\nBob,25";
let result = convert_bytes(csv_data, "csv", &options).unwrap();
println!("{}", result.markdown);

Every conversion produces both Markdown and plain text output. The plain text is extracted directly from the source document, with no post-processing or Markdown stripping, so source characters like **kwargs or # comment are preserved exactly.
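To see why rendering both outputs from the same parsed data differs from stripping Markdown afterwards, here is a small standalone sketch for a CSV table. It is an illustration only: `Rendered`, `parse_simple_csv`, `render_table`, and the tab-separated plain-text layout are assumptions for this example, not anytomd's internals.

```rust
/// Both renderings of one table, produced from the same parsed cells.
#[derive(Debug, PartialEq, Eq)]
pub struct Rendered {
    pub markdown: String,
    pub plain_text: String,
}

/// Naive CSV split: no quoting or escaped commas (a real parser handles both).
pub fn parse_simple_csv(input: &str) -> Vec<Vec<&str>> {
    input
        .lines()
        .filter(|line| !line.is_empty())
        .map(|line| line.split(',').collect())
        .collect()
}

/// Render rows twice: as a Markdown table and as tab-separated plain text.
pub fn render_table(rows: &[Vec<&str>]) -> Rendered {
    let mut markdown = String::new();
    let mut plain_text = String::new();
    for (i, row) in rows.iter().enumerate() {
        // Markdown side: escape pipes so cell text cannot break the table.
        let cells: Vec<String> = row.iter().map(|c| c.replace('|', "\\|")).collect();
        markdown.push_str(&format!("| {} |\n", cells.join(" | ")));
        if i == 0 {
            // Separator row after the header, one `---|` per column.
            markdown.push_str(&format!("|{}\n", "---|".repeat(row.len())));
        }
        // Plain-text side: the original cell text, untouched.
        plain_text.push_str(&row.join("\t"));
        plain_text.push('\n');
    }
    Rendered { markdown, plain_text }
}
```

Because the plain-text side never passes through a Markdown stripper, a cell such as `**kwargs` cannot be mistaken for bold markup and shortened.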
use anytomd::{convert_file, ConversionOptions};
let result = convert_file("document.docx", &ConversionOptions::default()).unwrap();
// Markdown output
println!("{}", result.markdown);
// Plain text output (no headings, bold, tables, code fences, etc.)
println!("{}", result.plain_text);

use anytomd::{convert_file, ConversionOptions};
let options = ConversionOptions {
    extract_images: true,
    ..Default::default()
};
let result = convert_file("presentation.pptx", &options).unwrap();
for (filename, bytes) in &result.images {
    std::fs::write(filename, bytes).unwrap();
}

anytomd can generate alt text for images using any LLM backend via the ImageDescriber trait. A built-in Google Gemini implementation is included.
use std::sync::Arc;
use anytomd::{convert_file, ConversionOptions, ImageDescriber, ConvertError};
use anytomd::gemini::GeminiDescriber;
// Option 1: Use the built-in Gemini describer
let describer = GeminiDescriber::from_env() // reads GEMINI_API_KEY
    .unwrap()
    .with_model("gemini-3-flash-preview".to_string());
let options = ConversionOptions {
    image_describer: Some(Arc::new(describer)),
    ..Default::default()
};
let result = convert_file("document.docx", &options).unwrap();
// Images now have LLM-generated alt text: 
// Option 2: Implement your own describer for any backend
struct MyDescriber;
impl ImageDescriber for MyDescriber {
    fn describe(
        &self,
        image_bytes: &[u8],
        mime_type: &str,
        prompt: &str,
    ) -> Result<String, ConvertError> {
        // Call your preferred LLM API here
        Ok("description of the image".to_string())
    }
}

For documents with many images, the async API resolves all descriptions concurrently. Included by default since v0.11.0.
use std::sync::Arc;
use anytomd::{convert_file_async, AsyncConversionOptions, AsyncImageDescriber, ConvertError};
use anytomd::gemini::AsyncGeminiDescriber;
#[tokio::main]
async fn main() {
    let describer = AsyncGeminiDescriber::from_env().unwrap();
    let options = AsyncConversionOptions {
        async_image_describer: Some(Arc::new(describer)),
        ..Default::default()
    };
    let result = convert_file_async("presentation.pptx", &options).await.unwrap();
    println!("{}", result.markdown);
    // All images are described concurrently, a significant speedup for multi-image documents
}

The library has no tokio dependency; the caller provides the async runtime. Any runtime (tokio, async-std, etc.) works.
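As a rough illustration of what "the caller provides the async runtime" means, the sketch below drives an executor-independent `async fn` with a tiny hand-written `block_on` built only on the standard library. Everything here is hypothetical: `describe_all` and `describe_one` are stand-ins rather than anytomd functions, and the stand-in awaits its items one at a time instead of concurrently.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

/// Wakes the blocked thread when the future can make progress.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

/// Minimal single-future executor: poll, park until woken, repeat.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = pin!(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(value) => return value,
            Poll::Pending => thread::park(),
        }
    }
}

/// Hypothetical stand-in for one image description call.
async fn describe_one(name: &str) -> String {
    format!("description of {name}")
}

/// Names no executor, so any runtime (or `block_on` above) can drive it.
/// Sequential for simplicity; a concurrent version would join the futures.
pub async fn describe_all(images: &[&str]) -> Vec<String> {
    let mut out = Vec::new();
    for name in images {
        out.push(describe_one(name).await);
    }
    out
}
```

Calling `block_on(describe_all(&["a.png", "b.png"]))` runs the future to completion on the current thread; swapping in tokio or async-std would not require changing `describe_all`.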
/// Convert a file at the given path to Markdown.
/// Format is auto-detected from magic bytes and file extension.
pub fn convert_file(
    path: impl AsRef<Path>,
    options: &ConversionOptions,
) -> Result<ConversionResult, ConvertError>

/// Convert raw bytes to Markdown with an explicit format extension.
pub fn convert_bytes(
    data: &[u8],
    extension: &str,
    options: &ConversionOptions,
) -> Result<ConversionResult, ConvertError>

Included by default (requires the async feature if default features are disabled).

/// Convert a file at the given path to Markdown with async image description.
/// If an async_image_describer is set, all image descriptions are resolved concurrently.
pub async fn convert_file_async(
    path: impl AsRef<Path>,
    options: &AsyncConversionOptions,
) -> Result<ConversionResult, ConvertError>

Included by default (requires the async feature if default features are disabled).

/// Convert raw bytes to Markdown with async image description.
pub async fn convert_bytes_async(
    data: &[u8],
    extension: &str,
    options: &AsyncConversionOptions,
) -> Result<ConversionResult, ConvertError>

| Field | Type | Default | Description |
|---|---|---|---|
| extract_images | bool | false | Extract embedded images into result.images |
| max_total_image_bytes | usize | 50 MB | Hard cap for total extracted image bytes |
| max_input_bytes | usize | 100 MB | Maximum input file size |
| max_uncompressed_zip_bytes | usize | 500 MB | ZIP bomb guard |
| strict | bool | false | Error on recoverable failures instead of warnings |
| image_describer | Option<Arc<dyn ImageDescriber>> | None | LLM backend for image alt text generation |
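The size limits in the table above amount to two checks: reject oversized input up front, and stop once the running total of uncompressed ZIP data passes a budget. The sketch below shows that idea under stated assumptions: `Limits`, `LimitError`, `check_input`, and `check_zip_entries` are hypothetical names, and "MB" is taken to mean 1024 * 1024 bytes, which the table does not specify.

```rust
/// Assumption: 1 MB = 1024 * 1024 bytes.
const MB: usize = 1024 * 1024;

#[derive(Debug, PartialEq, Eq)]
pub enum LimitError {
    InputTooLarge { size: usize, max: usize },
    ZipTooLarge { max: usize },
}

/// Hypothetical mirror of two documented limits.
pub struct Limits {
    pub max_input_bytes: usize,
    pub max_uncompressed_zip_bytes: usize,
}

impl Default for Limits {
    fn default() -> Self {
        Limits {
            max_input_bytes: 100 * MB,
            max_uncompressed_zip_bytes: 500 * MB,
        }
    }
}

/// Reject the input before any parsing if it exceeds the cap.
pub fn check_input(size: usize, limits: &Limits) -> Result<(), LimitError> {
    if size > limits.max_input_bytes {
        return Err(LimitError::InputTooLarge { size, max: limits.max_input_bytes });
    }
    Ok(())
}

/// Sum uncompressed entry sizes and fail as soon as the budget is exceeded.
/// Sizes declared in ZIP headers can be forged, so a real guard would count
/// bytes while decompressing instead of trusting these numbers.
pub fn check_zip_entries(entry_sizes: &[usize], limits: &Limits) -> Result<usize, LimitError> {
    let mut total: usize = 0;
    for &size in entry_sizes {
        total = total.saturating_add(size);
        if total > limits.max_uncompressed_zip_bytes {
            return Err(LimitError::ZipTooLarge { max: limits.max_uncompressed_zip_bytes });
        }
    }
    Ok(total)
}
```

Failing inside the loop rather than after it means a ZIP bomb is rejected before the remaining entries are even considered.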
pub struct ConversionResult {
    pub markdown: String,                 // The converted Markdown
    pub plain_text: String,               // Plain text (extracted directly, no markdown syntax)
    pub title: Option<String>,            // Document title, if detected
    pub images: Vec<(String, Vec<u8>)>,   // Extracted images (filename, bytes)
    pub warnings: Vec<ConversionWarning>, // Recoverable issues encountered
}

Conversion is best-effort by default. If a single element fails to parse (e.g., a corrupted table), it is skipped and a warning is added to result.warnings. The rest of the document is still converted.
Set strict: true in ConversionOptions to turn recoverable failures into errors instead.
Warning codes: SkippedElement, UnsupportedFeature, ResourceLimitReached, MalformedSegment.
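The difference between best-effort and strict mode can be sketched in a few lines. This is a simplified model rather than anytomd's code: `Warning`, `convert_elements`, and the representation of each element as a `Result<&str, &str>` are assumptions for the example; only the four warning code names come from the list above.

```rust
/// Warning codes as listed in this README.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WarningCode {
    SkippedElement,
    UnsupportedFeature,
    ResourceLimitReached,
    MalformedSegment,
}

/// Hypothetical warning record: a code plus a human-readable message.
#[derive(Debug, PartialEq, Eq)]
pub struct Warning {
    pub code: WarningCode,
    pub message: String,
}

/// Each element is either rendered Markdown (`Ok`) or a parse failure (`Err`).
/// Best-effort mode skips failures and records a warning; strict mode stops
/// at the first failure and returns it as an error.
pub fn convert_elements(
    elements: &[Result<&str, &str>],
    strict: bool,
) -> Result<(String, Vec<Warning>), String> {
    let mut markdown = String::new();
    let mut warnings = Vec::new();
    for (index, element) in elements.iter().enumerate() {
        match element {
            Ok(text) => {
                markdown.push_str(text);
                markdown.push('\n');
            }
            Err(reason) if strict => {
                return Err(format!("element {index}: {reason}"));
            }
            Err(reason) => warnings.push(Warning {
                code: WarningCode::SkippedElement,
                message: format!("element {index}: {reason}"),
            }),
        }
    }
    Ok((markdown, warnings))
}
```

With `strict` set to `false`, a document containing one corrupted table still yields Markdown for everything else plus one `SkippedElement` warning; with `strict` set to `true`, the same input returns an error.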
cargo build && cargo test && cargo clippy -- -D warnings

A Docker environment is available for reproducible Linux builds:
docker compose run --rm verify # Full loop: fmt + clippy + test + release build
docker compose run --rm test # Run all tests
docker compose run --rm lint # clippy + fmt check
docker compose run --rm shell # Interactive bash

Apache-2.0