Metadata-Version: 2.4
Name: spreadsheet-engine
Version: 0.1.0
Summary: Compile Excel workbooks into semantic blocks and machine-readable artifacts
Project-URL: Homepage, https://github.com/your-org/spreadsheet-engine
Project-URL: Documentation, https://github.com/your-org/spreadsheet-engine#readme
Project-URL: Repository, https://github.com/your-org/spreadsheet-engine
License-Expression: MIT
License-File: LICENSE
Requires-Python: >=3.12
Requires-Dist: networkx>=3.0
Requires-Dist: numpy>=1.26.0
Requires-Dist: openpyxl>=3.1.5
Requires-Dist: pandas>=2.0.0
Requires-Dist: pyarrow>=14.0.0
Requires-Dist: pydantic>=2.0.0
Requires-Dist: python-dotenv>=1.2.2
Requires-Dist: rapidfuzz>=3.0.0
Provides-Extra: dev
Requires-Dist: pytest>=8.0.0; extra == 'dev'
Provides-Extra: llm
Requires-Dist: langchain-core>=0.3.0; extra == 'llm'
Requires-Dist: langchain-openai>=0.3.0; extra == 'llm'
Requires-Dist: openai>=2.38.0; extra == 'llm'
Description-Content-Type: text/markdown

# spreadsheet-engine

Open-source **Spreadsheet Understanding Engine** (`spreadsheet_engine`). Compiles Excel workbooks into semantic blocks, compressed formulas, UI render models, and retrieval chunks—without dumping raw cell grids.

Install from PyPI:

```bash
pip install spreadsheet-engine
# optional LLM block enrichment
pip install "spreadsheet-engine[llm]"
```

Or from git:

```bash
pip install "spreadsheet-engine @ git+https://github.com/your-org/spreadsheet-engine.git"
```

## Quick start

```bash
# From a clone of the repository
cd spreadsheet-engine
uv sync --extra dev

# Compile workbook → ./out
uv run spreadsheet-engine compile input_1.xlsx --out ./out

# Single sheet
uv run spreadsheet-engine compile input_1.xlsx --sheet Dashboard --out ./out
```
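
If you would rather drive the compiler from a Python script than from a shell, one option is to wrap the same CLI call with `subprocess`. The sketch below is not part of the package: `build_compile_command` and `compile_workbook` are hypothetical helper names, and it assumes the `spreadsheet-engine` console script is on your `PATH` (for example after `pip install spreadsheet-engine`). It uses only the `compile`, `--sheet`, and `--out` arguments shown above.

```python
import subprocess


def build_compile_command(workbook, out_dir, sheet=None):
    """Build the argument list for `spreadsheet-engine compile` (hypothetical helper)."""
    command = ["spreadsheet-engine", "compile", str(workbook)]
    if sheet is not None:
        # Restrict compilation to a single sheet, as in the second command above.
        command += ["--sheet", str(sheet)]
    command += ["--out", str(out_dir)]
    return command


def compile_workbook(workbook, out_dir, sheet=None):
    """Run the CLI and raise CalledProcessError if it exits with a non-zero status."""
    return subprocess.run(build_compile_command(workbook, out_dir, sheet), check=True)
```

Calling `compile_workbook("input_1.xlsx", "./out", sheet="Dashboard")` would then mirror the single-sheet command above.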

## Artifact outputs (public contract)

The compiler writes these files to the `--out` directory, where downstream tools (e.g. [spreadsheet-agent](https://github.com/your-org/spreadsheet-agent)) consume them:

| File | Purpose |
|------|---------|
| `artifact.json` | Semantic IR (blocks, tables, timelines) |
| `artifact_ui.json` | UI positioning + sparse cell runs |
| `artifact_retrieval.jsonl` | Embedding-ready chunks |
| `artifact_values.jsonl` | Resolved cell values for metric lookup |

Optional: pass `--enrich-llm` to enable LLM block enrichment (requires the `spreadsheet-engine[llm]` extra and `OPENAI_API_KEY` to be set).
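
To consume the artifacts from Python, a minimal sketch like the following may be enough. It relies only on the file names in the table above and on the standard `.json` / `.jsonl` formats; it makes no assumption about the fields inside each record, and `load_json`, `iter_jsonl`, and `load_artifacts` are hypothetical helpers, not part of `spreadsheet_engine`.

```python
import json
from pathlib import Path


def load_json(path):
    """Parse a single-document JSON artifact such as artifact.json."""
    with Path(path).open(encoding="utf-8") as handle:
        return json.load(handle)


def iter_jsonl(path):
    """Yield one parsed record per non-empty line of a JSONL artifact."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                yield json.loads(line)


def load_artifacts(out_dir):
    """Load whichever of the four documented artifact files exist in out_dir."""
    out_dir = Path(out_dir)
    loaded = {}
    for name in ("artifact.json", "artifact_ui.json"):
        path = out_dir / name
        if path.exists():
            loaded[name] = load_json(path)
    for name in ("artifact_retrieval.jsonl", "artifact_values.jsonl"):
        path = out_dir / name
        if path.exists():
            loaded[name] = list(iter_jsonl(path))
    return loaded
```

For example, `load_artifacts("./out")` returns a dict keyed by file name. Inspect a few records before relying on any particular field, since the record schema is not described here.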

## Architecture

```
Workbook → SparseCellStore → BlockDetection → Table/Timeline/Formula analysis
         → Semantic enrichment → Retrieval + UI emitters
```

## Tests

```bash
uv sync --extra dev
uv run pytest spreadsheet_engine/tests -q
```

The end-to-end tests need `input_1.xlsx` at the repo root; when the file is absent, those tests are skipped.

## Related projects

- **spreadsheet-agent** — LangGraph Q&A over compiled artifacts (`pip install` this package, compile to `out/`, then use the agent).

## License

MIT — see [LICENSE](LICENSE).
