Skip to content

teibok/InventoryQA_OCR_GenAI

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

InventoryQA OCR GenAI

A Flask-based web application that extracts inventory data from images using OCR (Optical Character Recognition) and Azure OpenAI, allowing users to ask natural language questions about their inventory data.

Overview

InventoryQA is an intelligent inventory management assistant that:

  1. Extracts data from images containing tables of inventory items using Tesseract OCR technology
  2. Converts extracted data into structured DataFrames for analysis
  3. Processes queries using Azure OpenAI (ChatGPT-4o-mini) to answer natural language questions
  4. Generates visualizations as charts when appropriate for better insights
  5. Maintains history of all questions and answers for future reference

Features

  • Multi-Image Upload: Upload multiple inventory table images at once
  • Automatic OCR Extraction: Extract table contents from images using Tesseract OCR and img2table
  • DataFrame Generation: Automatically convert extracted data into pandas DataFrames
  • AI-Powered Q&A: Ask natural language questions about your inventory using Azure OpenAI
  • Smart Visualization: Automatically generate charts when questions are best answered with graphs
  • Query History: View and reference all previous questions and answers with timestamps
  • Image Management: Upload, view, and delete inventory images

Technology Stack

Backend

  • Flask 3.1.2 - Web framework
  • SQLAlchemy 2.0.43 - ORM for database operations
  • Flask-SQLAlchemy 3.1.1 - SQLAlchemy integration with Flask
  • Flask-WTF 1.2.2 - Form handling and CSRF protection
  • Flask-Migrate 4.1.0 - Database migration management

OCR & Data Processing

  • Tesseract OCR - Optical character recognition engine
  • img2table - Table extraction from images
  • OpenCV (cv2) - Image processing and manipulation
  • Pillow - Image handling and thumbnail generation
  • pandas - DataFrame operations and data analysis

AI & LLM

  • Azure OpenAI SDK - Integration with Azure OpenAI models
  • Azure Identity - Authentication for Azure services

Security & Validation

  • WTForms 3.2.1 - Form validation

Database

  • SQLite - Local database storage

Project Structure

InventoryQA_OCR_GenAI/
├── inventory_app_deploy/           # Main Flask application package
│   ├── __init__.py                 # App factory and configuration
│   ├── config.py                   # Application configuration
│   ├── models.py                   # SQLAlchemy ORM models
│   ├── forms.py                    # WTForms form definitions
│   ├── routes.py                   # Flask route handlers
│   ├── services.py                 # Business logic and utilities
│   ├── cursor.py                   # Database cursor helper
│   ├── dataframe_processor_class.py # DataFrame processing logic
│   ├── static/                     # Static assets
│   │   ├── css/                    # Stylesheets
│   │   └── uploads/                # Uploaded images and thumbnails
│   └── templates/                  # HTML templates
│       ├── home.html               # Home page
│       ├── layout.html             # Base template
│       ├── uploads.html            # Image upload form
│       ├── qa_history.html         # Query history view
│       ├── view_dataframe.html     # DataFrame display
│       └── view_image.html         # Image viewer
├── run.py                          # Application entry point
├── requirements.txt                # Python dependencies
├── Dockerfile                      # Docker container definition
├── Steps_to_Deploy.txt             # Deployment instructions
└── README.md                       # This file

Database Models

Images Table

Stores metadata for uploaded inventory images:

  • id (Integer, Primary Key) - Unique image identifier
  • image_name (String) - Filename of the original uploaded image
  • thumbnail_name (String, Optional) - Filename of the generated thumbnail

QA_History Table

Logs all user interactions with the AI:

  • id (Integer, Primary Key) - Unique history entry identifier
  • question (Text) - Raw question text submitted by the user
  • answer (Text) - Formatted answer with analysis and visualizations
  • timestamp (String) - When the interaction occurred (human-readable format)

Forms

UploadImagesForm

Handles multi-file image uploads:

  • images (MultipleFileField) - Required field for selecting multiple image files
  • submit (SubmitField) - Button to submit the upload request

Provides CSRF protection and client-side validation for file selection.

Getting Started

Prerequisites

  • Python 3.11 or higher
  • Tesseract OCR installed on your system
  • Azure OpenAI API credentials for an LLM
  • Environment variables configured

Installation

  1. Clone the repository and navigate to the project directory:

    cd InventoryQA_OCR_GenAI
  2. Create and activate a virtual environment:

    python -m venv inventory_app
    inventory_app\Scripts\activate  # On Windows
    # OR
    source inventory_app/bin/activate  # On macOS/Linux
  3. Install dependencies:

    pip install -r requirements.txt
  4. Configure environment variables - Create a .env file with:

     SECRET_KEY=my_secret_key_for_flask_app
     AZURE_OPENAI_API_KEY=azure_open_ai_api_key
     OPEN_AI_ENDPOINT=azure_open_ai_endpoint
     MODEL_NAME=llm_model_name
     DEPLOYMENT_NAME=deployment_name
     AZURE_OPENAI_API_VERSION=azure_open_ai_api_version
    
  5. Run the application:

    python run.py

    The app will be available at http://127.0.0.1:5000/

Usage

Uploading Inventory Images

  1. Navigate to the Uploads page
  2. Select one or more images containing inventory tables
  3. Click Upload to process the images
  4. The app will extract data and generate thumbnails

Asking Questions

  1. Go to the Home page
  2. Type a natural language question about your inventory (e.g., "What are the total units in stock?")
  3. The app will analyze the extracted data and respond with:
    • Text-based answer
    • Charts/visualizations (when appropriate)
  4. View all previous questions and answers in the QA History page

Viewing Data

  • View DataFrame: See the structured data extracted from your images
  • View Images: Review uploaded images and their thumbnails
  • QA History: Browse all previous interactions with timestamps

Deployment

Local Docker Deployment

Build and run the application in a Docker container:

docker build -t inventory_app .
docker run -p 5000:5000 \
  -e  SECRET_KEY=my_secret_key_for_flask_app \
  -e  AZURE_OPENAI_API_KEY=azure_open_ai_api_key \
  -e  OPEN_AI_ENDPOINT=azure_open_ai_endpoint \
  -e  MODEL_NAME=llm_model_name \
  -e  DEPLOYMENT_NAME=deployment_name \
  -e  AZURE_OPENAI_API_VERSION=azure_open_ai_api_version \
  inventory_app

Azure Container Registry & Azure Container Instance

For production deployment to Azure:

  1. Create Azure Container Registry:

    az acr create --resource-group your-rg --name your-registry --sku Basic
  2. Tag and push image:

    docker tag inventory_app:latest your-registry.azurecr.io/inventory_app:latest
    docker push your-registry.azurecr.io/inventory_app:latest
  3. Deploy to Azure Container Instances:

    az container create \
      --resource-group your-rg \
      --name inventory-app \
      --image your-registry.azurecr.io/inventory_app2:latest \
      --cpu 4 \
      --memory 16 \
      --registry-login-server your-registry.azurecr.io \
      --registry-username your-username \
      --registry-password your-password \
      --ports 5000 \
      --ip-address Public \
      --dns-name-label your-dns-label \
      --environment-variables   \
         -e  SECRET_KEY=my_secret_key_for_flask_app \
         -e  AZURE_OPENAI_API_KEY=azure_open_ai_api_key \
         -e  OPEN_AI_ENDPOINT=azure_open_ai_endpoint \
         -e  MODEL_NAME=llm_model_name \
         -e  DEPLOYMENT_NAME=deployment_name \
         -e  AZURE_OPENAI_API_VERSION=azure_open_ai_api_version 

Configuration

The application is configured via:

  • config.py - Core Flask settings (database, secret key)
  • .env file - Environment variables (API keys, credentials)

Key settings:

  • SECRET_KEY - Flask session encryption key
  • AZURE_OPENAI_API_KEY - Azure OpenAI API credential
  • OPEN_AI_ENDPOINT - Azure OpenAI service endpoint URL
  • MODEL_NAME - LLM model name (e.g., gpt-4o-mini)
  • DEPLOYMENT_NAME - Azure deployment name (e.g., gpt-4o-mini)
  • AZURE_OPENAI_API_VERSION - Azure OpenAI API version (e.g., 2024-12-01-preview)

Architecture & Data Flow

  1. Image Upload: User uploads images → Validation → Storage in static/uploads/
  2. OCR Processing: Tesseract extracts text → img2table extracts tables → pandas DataFrame created
  3. Data Persistence: DataFrame and image metadata stored in SQLite database
  4. Query Processing: User question → Passed to Azure OpenAI with DataFrame context → LLM generates answer
  5. Visualization: If answer benefits from charts → Matplotlib generates visualization
  6. History: Question and answer stored in QA_History table with timestamp

Key Components

Services Module (services.py)

Central hub for business logic:

  • Image ingestion and validation
  • OCR and table extraction
  • DataFrame preparation and processing
  • Azure OpenAI integration for Q&A
  • Visualization generation
  • QA history pagination and retrieval

Data Processing (dataframe_processor_class.py)

Handles DataFrame operations:

  • Data cleaning and normalization
  • Format conversion
  • Aggregation and analysis

Database Layer

  • models.py - SQLAlchemy model definitions
  • cursor.py - Database query helper
  • SQLite database in inventory_app_deploy/databases/site.db

API Routes

The application provides these main routes:

  • / or /home - Home page with Q&A interface
  • /upload - Image upload form and handler
  • /view_image/<image_id> - View specific uploaded image
  • /view_dataframe - Display extracted DataFrame
  • /qa_history - View all previous questions and answers
  • (Additional routes handled by Flask-Login and Flask-Migrate)

Future Enhancements

  • Support for additional file formats (PDF, CSV)
  • Advanced filtering and export capabilities
  • Batch processing for large image sets
  • Custom model fine-tuning for domain-specific inventory terms
  • Real-time collaboration features
  • Mobile app for on-the-go inventory checking

About

Build an end-to-end system that ingests inventory report PDFs/images, runs OCR to normalize and extract tabular data, stores the cleaned dataset, and exposes a secure, conversational agent that can answer business queries over the data (aggregation, filtering, joins, trends), returning tables, charts, and exportable results.

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors