---
title: Schema Translator
emoji: 🔄
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 5.0.0
app_file: app_gradio.py
pinned: false
license: mit
tags:
- llm
- database
- schema-translation
- natural-language-query
python_version: "3.12"
---
# Schema Translator
A contract schema translation system that lets you run a single query across multiple enterprise customers with heterogeneous database schemas, using LLM-powered semantic understanding to map between them.
## Prerequisites
- Python 3.10+
- uv package manager
- Anthropic API key
## Setup Instructions
### 1. Clone the Repository
```bash
git clone <repo-url>
cd schema_translator_v2
```
### 2. Create Virtual Environment with UV
```bash
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
```
### 3. Install Dependencies
```bash
uv pip install -r requirements.txt
```
### 4. Configure Environment Variables
```bash
cp .env.example .env
# Edit .env and add your ANTHROPIC_API_KEY
```
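As a quick check that the key from step 4 is visible to Python, a minimal sketch along these lines may help. The function name `get_anthropic_api_key` is hypothetical and not part of the package; the sketch assumes the key is stored as `ANTHROPIC_API_KEY` and uses python-dotenv (listed in the Tech Stack) only if it is installed.

```python
import os

try:
    # python-dotenv's load_dotenv() reads a .env file into os.environ.
    from dotenv import load_dotenv
except ImportError:
    # Keep the sketch usable when python-dotenv is not installed.
    load_dotenv = None


def get_anthropic_api_key(env=None):
    """Return ANTHROPIC_API_KEY, or raise a clear error if it is unset or blank.

    Hypothetical helper for illustration; `env` can be any mapping,
    which makes the function easy to test without touching os.environ.
    """
    if env is None:
        if load_dotenv is not None:
            # By default this does not override variables that are already set.
            load_dotenv()
        env = os.environ
    key = env.get("ANTHROPIC_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "ANTHROPIC_API_KEY is not set; add it to .env or export it in your shell."
        )
    return key


if __name__ == "__main__":
    get_anthropic_api_key()
    print("ANTHROPIC_API_KEY found")
```

Failing early with an explicit message tends to be easier to debug than an authentication error surfacing later from the LLM client.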
### 5. Generate Mock Data
```bash
python -m schema_translator.mock_data
```
### 6. Run Tests
```bash
pytest tests/
```
### 7. Start the Application
```bash
chainlit run app.py
```
## Project Structure
```
schema_translator_v2/
├── README.md
├── requirements.txt
├── .env.example
├── .gitignore
├── databases/                  # SQLite databases
├── schema_translator/          # Main package
│   ├── config.py               # Configuration management
│   ├── models.py               # Pydantic data models
│   ├── mock_data.py            # Mock data generation
│   ├── knowledge_graph.py      # Schema knowledge graph
│   ├── query_compiler.py       # SQL generation
│   ├── database_executor.py    # Query execution
│   ├── result_harmonizer.py    # Result normalization
│   ├── orchestrator.py         # Main pipeline orchestrator
│   ├── agents/                 # LLM agents
│   └── learning/               # Learning and feedback
├── tests/                      # Test suite
└── app.py                      # Chainlit application
```
```
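The module comments above suggest a pipeline of schema mapping, SQL generation, query execution, and result normalization. The sketch below illustrates that idea with the standard library only. It is not the package's actual API: the `MAPPINGS` dictionary, the function names, and both customer schemas are hypothetical, and a plain dictionary stands in for the knowledge graph.

```python
import sqlite3

# Hypothetical mappings from canonical fields to each customer's physical columns.
# "scale" converts the stored value into US dollars.
MAPPINGS = {
    "customer_a": {
        "table": "contracts",
        "fields": {"contract_id": "id", "value_usd": "contract_value"},
        "scale": 1,
    },
    "customer_b": {
        "table": "agreements",
        "fields": {"contract_id": "agreement_no", "value_usd": "amount_k"},
        "scale": 1000,  # this customer stores amounts in thousands of dollars
    },
}


def compile_query(customer, min_value_usd):
    """Build a parameterized SELECT against one customer's physical schema."""
    m = MAPPINGS[customer]
    cols = ", ".join(f"{phys} AS {canon}" for canon, phys in m["fields"].items())
    value_col = m["fields"]["value_usd"]
    sql = f"SELECT {cols} FROM {m['table']} WHERE {value_col} >= ?"
    # Express the threshold in the customer's own storage unit.
    return sql, (min_value_usd / m["scale"],)


def harmonize(customer, rows):
    """Normalize raw rows into one canonical shape with values in dollars."""
    scale = MAPPINGS[customer]["scale"]
    return [
        {"customer": customer, "contract_id": str(cid), "value_usd": value * scale}
        for cid, value in rows
    ]


def query_all(connections, min_value_usd):
    """Run the same logical query against every customer and merge the results."""
    results = []
    for customer, conn in connections.items():
        sql, params = compile_query(customer, min_value_usd)
        results.extend(harmonize(customer, conn.execute(sql, params).fetchall()))
    return results


def build_demo_connections():
    """Create two in-memory SQLite databases with different schemas."""
    a = sqlite3.connect(":memory:")
    a.execute("CREATE TABLE contracts (id TEXT, contract_value REAL)")
    a.executemany(
        "INSERT INTO contracts VALUES (?, ?)",
        [("A-1", 50000.0), ("A-2", 250000.0)],
    )
    b = sqlite3.connect(":memory:")
    b.execute("CREATE TABLE agreements (agreement_no TEXT, amount_k REAL)")
    b.executemany(
        "INSERT INTO agreements VALUES (?, ?)",
        [("B-1", 120.0), ("B-2", 80.0)],
    )
    return {"customer_a": a, "customer_b": b}


if __name__ == "__main__":
    # Logical query: contracts worth at least 100,000 USD, across all customers.
    for row in query_all(build_demo_connections(), 100_000):
        print(row)
```

In this toy setup the mappings are written by hand; per the description above, the project relies on LLM-powered semantic understanding to relate each customer's schema to a shared one.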
## Tech Stack
- **Language:** Python 3.10+
- **LLM:** Anthropic Claude (claude-sonnet-4-20250514)
- **Database:** SQLite
- **UI Framework:** Chainlit (`app.py`); the Hugging Face Space is configured for Gradio (`app_gradio.py`)
- **Data Validation:** Pydantic
- **Graph:** NetworkX
- **Testing:** pytest
- **Environment:** python-dotenv
## Development
### Running Tests
```bash
pytest tests/ -v
```
### Code Formatting
```bash
black schema_translator/ tests/
```
### Linting
```bash
ruff check schema_translator/ tests/
```
### Type Checking
```bash
mypy schema_translator/
```
## License
Released under the MIT License. See the LICENSE file for details.