Commit db77419 · committed by Faham
1 Parent(s): b1acf7e

UPDATE: readme

Files changed:
- .dockerignore +1 -1
- README.md +27 -1
- src/utils/preprocessing.py +1 -4
.dockerignore CHANGED
@@ -48,7 +48,7 @@ Thumbs.db
 .ipynb_checkpoints/
 
 # Models (if they're large)
-
+model_weights/*.pth
 
 # Logs
 *.log
README.md CHANGED
@@ -76,9 +76,23 @@ sentiment-fused/
 ├── notebooks/ # Development notebooks
 │   ├── audio_sentiment_analysis.ipynb # Audio model development
 │   └── vision_sentiment_analysis.ipynb # Vision model development
-
+├── model_weights/ # Model storage directory (downloaded .pth files)
+└── src/ # Source code package
+    ├── __init__.py # Package initialization
+    ├── config/ # Configuration settings
+    ├── models/ # Model logic and inference code
+    ├── utils/ # Utility functions and preprocessing
+    └── ui/ # User interface components
 ```
 
+### Directory Explanation
+
+- **`model_weights/`**: Contains the actual trained model files (`.pth` files) downloaded from Google Drive at inference time.
+- **`src/models/`**: Contains the Python code for model loading, inference, and prediction logic
+- **`src/utils/`**: Contains preprocessing utilities for audio, vision, and text data
+- **`src/config/`**: Contains centralized configuration settings for the entire application
+- **`src/ui/`**: Contains Streamlit UI components and styling
+
 ## Key Features
 
 - **Real-time Analysis**: Instant sentiment predictions with confidence scores
@@ -264,5 +278,17 @@ Key libraries used:
 6. **Production Ready**: Docker containerization and deployment
 7. **Video Analysis**: Comprehensive video processing with multi-modal extraction
 8. **Speech Recognition**: Audio-to-text transcription for enhanced analysis
+9. **Modular Architecture**: Clean, maintainable code structure with separated concerns
+10. **Professional Code Organization**: Proper Python packaging with config, models, utils, and UI modules
+
+## Recent Improvements
+
+The project has been refactored from a monolithic structure to a clean, modular architecture:
+
+- **Modular Design**: Separated into logical modules (`src/config/`, `src/models/`, `src/utils/`, `src/ui/`)
+- **Centralized Configuration**: All settings consolidated in `src/config/settings.py`
+- **Clean Separation**: Model logic, preprocessing, and UI components are now in dedicated modules
+- **Better Maintainability**: Easier to modify, test, and extend individual components
+- **Professional Structure**: Follows Python packaging best practices
 
 This project serves as a comprehensive example of building production-ready multimodal AI applications with modern Python tools and frameworks.
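The README text in this commit describes `model_weights/` as holding `.pth` files that are downloaded at inference time, and the `.dockerignore` change keeps those files out of the image. A minimal sketch of that download-on-first-use pattern is shown here. The function name `ensure_weights`, the `fetch` callable, and the file name used in the usage note are hypothetical and do not come from the repository; the actual Google Drive download step is left to whatever fetcher the project uses.

```python
from pathlib import Path
from typing import Callable


def ensure_weights(
    filename: str,
    fetch: Callable[[Path], None],
    weights_dir: Path = Path("model_weights"),
) -> Path:
    """Return the local path to a weights file, fetching it only if absent.

    `fetch` is a hypothetical callable that writes the file to the path it
    is given (for example, a wrapper around a Google Drive download).
    """
    weights_dir.mkdir(parents=True, exist_ok=True)
    target = weights_dir / filename
    if not target.exists():
        # Write to a temporary name first so an interrupted download
        # never leaves a truncated file at the final path.
        partial = target.with_name(target.name + ".part")
        fetch(partial)
        partial.replace(target)
    return target
```

A caller would pass its own downloader, for instance `ensure_weights("vision_model.pth", my_fetcher)`, and load the returned path. Later calls find the file on disk and skip the download.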
src/utils/preprocessing.py CHANGED
@@ -20,13 +20,10 @@ except ImportError:
 from ..config.settings import (
     IMAGE_TRANSFORMS,
     AUDIO_MODEL_CONFIG,
-    VISION_MODEL_CONFIG,
-    SUPPORTED_IMAGE_FORMATS,
-    SUPPORTED_AUDIO_FORMATS,
 )
 
 # Add Any to typing imports
-from typing import List, Optional,
+from typing import List, Optional, Union, Any
 
 # Add torch import for audio preprocessing
 try:
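The hunk for `src/utils/preprocessing.py` sits between two guarded imports: the `except ImportError:` in the hunk header and the `try:` that follows the torch comment. A minimal sketch of that optional-dependency pattern is shown here. It assumes nothing about the repository's actual code, and the helper name `optional_import` and the flag `TORCH_AVAILABLE` are hypothetical.

```python
import importlib
from types import ModuleType
from typing import Optional


def optional_import(name: str) -> Optional[ModuleType]:
    """Import a module by name, returning None when it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# torch is treated as optional here: callers check the flag before using it.
torch = optional_import("torch")
TORCH_AVAILABLE = torch is not None
```

Code that needs torch can then branch on `TORCH_AVAILABLE` and fall back or raise a clear error, so a missing dependency does not fail at import time.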