A Flask-based web application that extracts inventory data from images using OCR (Optical Character Recognition) and Azure OpenAI, allowing users to ask natural language questions about their inventory data.
InventoryQA is an intelligent inventory management assistant that:
- Extracts data from images containing tables of inventory items using Tesseract OCR technology
- Converts extracted data into structured DataFrames for analysis
- Processes queries using Azure OpenAI (ChatGPT-4o-mini) to answer natural language questions
- Generates visualizations as charts when appropriate for better insights
- Maintains history of all questions and answers for future reference
- Multi-Image Upload: Upload multiple inventory table images at once
- Automatic OCR Extraction: Extract table contents from images using Tesseract OCR and img2table
- DataFrame Generation: Automatically convert extracted data into pandas DataFrames
- AI-Powered Q&A: Ask natural language questions about your inventory using Azure OpenAI
- Smart Visualization: Automatically generate charts when questions are best answered with graphs
- Query History: View and reference all previous questions and answers with timestamps
- Image Management: Upload, view, and delete inventory images
- Flask 3.1.2 - Web framework
- SQLAlchemy 2.0.43 - ORM for database operations
- Flask-SQLAlchemy 3.1.1 - SQLAlchemy integration with Flask
- Flask-WTF 1.2.2 - Form handling and CSRF protection
- Flask-Migrate 4.1.0 - Database migration management
- Tesseract OCR - Optical character recognition engine
- img2table - Table extraction from images
- OpenCV (cv2) - Image processing and manipulation
- Pillow - Image handling and thumbnail generation
- pandas - DataFrame operations and data analysis
- Azure OpenAI SDK - Integration with Azure OpenAI models
- Azure Identity - Authentication for Azure services
- WTForms 3.2.1 - Form validation
- SQLite - Local database storage
InventoryQA_OCR_GenAI/
├── inventory_app_deploy/ # Main Flask application package
│ ├── __init__.py # App factory and configuration
│ ├── config.py # Application configuration
│ ├── models.py # SQLAlchemy ORM models
│ ├── forms.py # WTForms form definitions
│ ├── routes.py # Flask route handlers
│ ├── services.py # Business logic and utilities
│ ├── cursor.py # Database cursor helper
│ ├── dataframe_processor_class.py # DataFrame processing logic
│ ├── static/ # Static assets
│ │ ├── css/ # Stylesheets
│ │ └── uploads/ # Uploaded images and thumbnails
│ └── templates/ # HTML templates
│ ├── home.html # Home page
│ ├── layout.html # Base template
│ ├── uploads.html # Image upload form
│ ├── qa_history.html # Query history view
│ ├── view_dataframe.html # DataFrame display
│ └── view_image.html # Image viewer
├── run.py # Application entry point
├── requirements.txt # Python dependencies
├── Dockerfile # Docker container definition
├── Steps_to_Deploy.txt # Deployment instructions
└── README.md # This file
Stores metadata for uploaded inventory images:
id(Integer, Primary Key) - Unique image identifierimage_name(String) - Filename of the original uploaded imagethumbnail_name(String, Optional) - Filename of the generated thumbnail
Logs all user interactions with the AI:
id(Integer, Primary Key) - Unique history entry identifierquestion(Text) - Raw question text submitted by the useranswer(Text) - Formatted answer with analysis and visualizationstimestamp(String) - When the interaction occurred (human-readable format)
Handles multi-file image uploads:
images(MultipleFileField) - Required field for selecting multiple image filessubmit(SubmitField) - Button to submit the upload request
Provides CSRF protection and client-side validation for file selection.
- Python 3.11 or higher
- Tesseract OCR installed on your system
- Azure OpenAI API credentials for an LLM
- Environment variables configured
-
Clone the repository and navigate to the project directory:
cd InventoryQA_OCR_GenAI -
Create and activate a virtual environment:
python -m venv inventory_app inventory_app\Scripts\activate # On Windows # OR source inventory_app/bin/activate # On macOS/Linux
-
Install dependencies:
pip install -r requirements.txt
-
Configure environment variables - Create a
.envfile with:SECRET_KEY=my_secret_key_for_flask_app AZURE_OPENAI_API_KEY=azure_open_ai_api_key OPEN_AI_ENDPOINT=azure_open_ai_endpoint MODEL_NAME=llm_model_name DEPLOYMENT_NAME=deployment_name AZURE_OPENAI_API_VERSION=azure_open_ai_api_version -
Run the application:
python run.py
The app will be available at
http://127.0.0.1:5000/
- Navigate to the Uploads page
- Select one or more images containing inventory tables
- Click Upload to process the images
- The app will extract data and generate thumbnails
- Go to the Home page
- Type a natural language question about your inventory (e.g., "What are the total units in stock?")
- The app will analyze the extracted data and respond with:
- Text-based answer
- Charts/visualizations (when appropriate)
- View all previous questions and answers in the QA History page
- View DataFrame: See the structured data extracted from your images
- View Images: Review uploaded images and their thumbnails
- QA History: Browse all previous interactions with timestamps
Build and run the application in a Docker container:
docker build -t inventory_app .
docker run -p 5000:5000 \
-e SECRET_KEY=my_secret_key_for_flask_app \
-e AZURE_OPENAI_API_KEY=azure_open_ai_api_key \
-e OPEN_AI_ENDPOINT=azure_open_ai_endpoint \
-e MODEL_NAME=llm_model_name \
-e DEPLOYMENT_NAME=deployment_name \
-e AZURE_OPENAI_API_VERSION=azure_open_ai_api_version \
inventory_appFor production deployment to Azure:
-
Create Azure Container Registry:
az acr create --resource-group your-rg --name your-registry --sku Basic
-
Tag and push image:
docker tag inventory_app:latest your-registry.azurecr.io/inventory_app:latest docker push your-registry.azurecr.io/inventory_app:latest
-
Deploy to Azure Container Instances:
az container create \ --resource-group your-rg \ --name inventory-app \ --image your-registry.azurecr.io/inventory_app2:latest \ --cpu 4 \ --memory 16 \ --registry-login-server your-registry.azurecr.io \ --registry-username your-username \ --registry-password your-password \ --ports 5000 \ --ip-address Public \ --dns-name-label your-dns-label \ --environment-variables \ -e SECRET_KEY=my_secret_key_for_flask_app \ -e AZURE_OPENAI_API_KEY=azure_open_ai_api_key \ -e OPEN_AI_ENDPOINT=azure_open_ai_endpoint \ -e MODEL_NAME=llm_model_name \ -e DEPLOYMENT_NAME=deployment_name \ -e AZURE_OPENAI_API_VERSION=azure_open_ai_api_version
The application is configured via:
config.py- Core Flask settings (database, secret key).envfile - Environment variables (API keys, credentials)
Key settings:
SECRET_KEY- Flask session encryption keyAZURE_OPENAI_API_KEY- Azure OpenAI API credentialOPEN_AI_ENDPOINT- Azure OpenAI service endpoint URLMODEL_NAME- LLM model name (e.g.,gpt-4o-mini)DEPLOYMENT_NAME- Azure deployment name (e.g.,gpt-4o-mini)AZURE_OPENAI_API_VERSION- Azure OpenAI API version (e.g.,2024-12-01-preview)
- Image Upload: User uploads images → Validation → Storage in
static/uploads/ - OCR Processing: Tesseract extracts text → img2table extracts tables → pandas DataFrame created
- Data Persistence: DataFrame and image metadata stored in SQLite database
- Query Processing: User question → Passed to Azure OpenAI with DataFrame context → LLM generates answer
- Visualization: If answer benefits from charts → Matplotlib generates visualization
- History: Question and answer stored in
QA_Historytable with timestamp
Central hub for business logic:
- Image ingestion and validation
- OCR and table extraction
- DataFrame preparation and processing
- Azure OpenAI integration for Q&A
- Visualization generation
- QA history pagination and retrieval
Handles DataFrame operations:
- Data cleaning and normalization
- Format conversion
- Aggregation and analysis
models.py- SQLAlchemy model definitionscursor.py- Database query helper- SQLite database in
inventory_app_deploy/databases/site.db
The application provides these main routes:
/or/home- Home page with Q&A interface/upload- Image upload form and handler/view_image/<image_id>- View specific uploaded image/view_dataframe- Display extracted DataFrame/qa_history- View all previous questions and answers- (Additional routes handled by Flask-Login and Flask-Migrate)
- Support for additional file formats (PDF, CSV)
- Advanced filtering and export capabilities
- Batch processing for large image sets
- Custom model fine-tuning for domain-specific inventory terms
- Real-time collaboration features
- Mobile app for on-the-go inventory checking