Spaces: iamfaham/multimodal-sentiment-analysis

Faham committed on Aug 27

Commit db77419 (1 parent: b1acf7e)

UPDATE: readme

Files changed (3)

.dockerignore +1 -1
README.md +27 -1
src/utils/preprocessing.py +1 -4

.dockerignore CHANGED

@@ -48,7 +48,7 @@ Thumbs.db
 .ipynb_checkpoints/
 # Models (if they're large)
-models/*.pth
+model_weights/*.pth
 # Logs
 *.log
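
To see what the renamed ignore rule covers, here is a rough Python sketch. It is only an approximation: Docker evaluates `.dockerignore` patterns with Go's `filepath.Match` rules, where `*` does not cross a `/`, while Python's `fnmatchcase` lets `*` match `/` as well. The file paths are hypothetical and chosen only for illustration.

```python
from fnmatch import fnmatchcase

OLD_PATTERN = "models/*.pth"         # rule before this commit
NEW_PATTERN = "model_weights/*.pth"  # rule after this commit

# Hypothetical paths, not taken from the repository.
paths = [
    "model_weights/vision.pth",
    "models/vision.pth",
    "src/models/vision_model.py",
]


def ignored(path: str, pattern: str) -> bool:
    """Approximate a single .dockerignore rule with fnmatchcase."""
    return fnmatchcase(path, pattern)


old_hits = [p for p in paths if ignored(p, OLD_PATTERN)]
new_hits = [p for p in paths if ignored(p, NEW_PATTERN)]
print("old rule ignores:", old_hits)
print("new rule ignores:", new_hits)
```

Under this approximation, the new rule excludes weight files in `model_weights/` from the build context and leaves the Python code in `src/models/` untouched.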

README.md CHANGED

@@ -76,9 +76,23 @@ sentiment-fused/
 ├── notebooks/                     # Development notebooks
 │   ├── audio_sentiment_analysis.ipynb    # Audio model development
 │   └── vision_sentiment_analysis.ipynb   # Vision model development
-└── models/                        # Model storage directory
+├── model_weights/                 # Model storage directory (downloaded .pth files)
+└── src/                           # Source code package
+    ├── __init__.py               # Package initialization
+    ├── config/                   # Configuration settings
+    ├── models/                   # Model logic and inference code
+    ├── utils/                    # Utility functions and preprocessing
+    └── ui/                       # User interface components
 ```
+### Directory Explanation
+- **`model_weights/`**: Contains the actual trained model files (`.pth` files) downloaded from Google Drive at inference time.
+- **`src/models/`**: Contains the Python code for model loading, inference, and prediction logic
+- **`src/utils/`**: Contains preprocessing utilities for audio, vision, and text data
+- **`src/config/`**: Contains centralized configuration settings for the entire application
+- **`src/ui/`**: Contains Streamlit UI components and styling
 ## Key Features
 - **Real-time Analysis**: Instant sentiment predictions with confidence scores
@@ -264,5 +278,17 @@ Key libraries used:
 6. **Production Ready**: Docker containerization and deployment
 7. **Video Analysis**: Comprehensive video processing with multi-modal extraction
 8. **Speech Recognition**: Audio-to-text transcription for enhanced analysis
+9. **Modular Architecture**: Clean, maintainable code structure with separated concerns
+10. **Professional Code Organization**: Proper Python packaging with config, models, utils, and UI modules
+## Recent Improvements
+The project has been refactored from a monolithic structure to a clean, modular architecture:
+- **Modular Design**: Separated into logical modules (`src/config/`, `src/models/`, `src/utils/`, `src/ui/`)
+- **Centralized Configuration**: All settings consolidated in `src/config/settings.py`
+- **Clean Separation**: Model logic, preprocessing, and UI components are now in dedicated modules
+- **Better Maintainability**: Easier to modify, test, and extend individual components
+- **Professional Structure**: Follows Python packaging best practices
 This project serves as a comprehensive example of building production-ready multimodal AI applications with modern Python tools and frameworks.
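
The README change describes weights that live in `model_weights/` and are downloaded at inference time. One way this could work is sketched below. This is not the project's actual loader: `ensure_weights`, `WEIGHTS_DIR`, and `fake_download` are hypothetical names, and the download step is passed in as a callable so the sketch needs no network access or Google Drive client.

```python
import tempfile
from pathlib import Path
from typing import Callable, List

# Hypothetical default location, mirroring the directory named in the README.
WEIGHTS_DIR = Path("model_weights")


def ensure_weights(
    filename: str,
    download: Callable[[Path], None],
    weights_dir: Path = WEIGHTS_DIR,
) -> Path:
    """Return the local path of a weights file, fetching it only if absent."""
    target = weights_dir / filename
    if not target.exists():
        weights_dir.mkdir(parents=True, exist_ok=True)
        download(target)  # caller supplies the actual fetch logic
    return target


# Demo with a stand-in downloader that records each call.
calls: List[str] = []


def fake_download(dest: Path) -> None:
    calls.append(dest.name)
    dest.write_bytes(b"placeholder")


with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp) / "model_weights"
    first = ensure_weights("vision.pth", fake_download, folder)
    second = ensure_weights("vision.pth", fake_download, folder)
    same_path = first == second
    existed = first.exists()

print(calls)  # the downloader runs once; the second call reuses the file
```

Keeping the fetch step injectable means the code in `src/models/` can stay independent of where the weights are hosted.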

src/utils/preprocessing.py CHANGED

@@ -20,13 +20,10 @@ except ImportError:
 from ..config.settings import (
     IMAGE_TRANSFORMS,
     AUDIO_MODEL_CONFIG,
-    VISION_MODEL_CONFIG,
-    SUPPORTED_IMAGE_FORMATS,
-    SUPPORTED_AUDIO_FORMATS,
 )
 # Add Any to typing imports
-from typing import List, Optional, Tuple, Union, Any
+from typing import List, Optional, Union, Any
 # Add torch import for audio preprocessing
 try:
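
The hunk above drops several imported names, presumably because nothing in the module still referenced them (the commit does not say so explicitly). A minimal sketch of how such names could be found with the standard-library `ast` module follows. `unused_imports` and the sample source are hypothetical, and the check is deliberately simple: it misses names used only inside string annotations or re-exported via `__all__`.

```python
import ast
from typing import List, Set


def unused_imports(source: str) -> List[str]:
    """Return imported names that are never referenced in `source`."""
    tree = ast.parse(source)
    imported: Set[str] = set()
    used: Set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom)):
            for alias in node.names:
                # `import a.b` binds `a`; an `as` clause binds the alias.
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.Name):
            used.add(node.id)
    return sorted(imported - used)


# Hypothetical module text, loosely modelled on the typing import above.
sample = '''
from typing import List, Optional, Tuple

def first(items: List[int]) -> Optional[int]:
    return items[0] if items else None
'''

print(unused_imports(sample))  # ['Tuple']
```

In practice a linter such as pyflakes or ruff performs this check more thoroughly; the sketch only illustrates the idea.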