# Deployment Guide for File Processing System

## Prerequisites

1. System Requirements:
   - Node.js (v14 or higher)
   - Python (3.8 or higher)
   - MongoDB (v4.4 or higher)
   - Windows machine (required for win32com.client functionality)

2. API Keys:
   - OpenAI API key (for AI integration)

## Installation Steps

### 1. Node.js Setup
```bash
# Install Node.js dependencies
npm install
```

### 2. Python Environment Setup
```bash
# Create a virtual environment
python -m venv venv

# Activate virtual environment
# On Windows (Command Prompt):
venv\Scripts\activate
# On Unix/macOS (win32com.client functionality is unavailable there):
source venv/bin/activate

# Install Python dependencies
pip install -r requirements.txt
```
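
Before going further, it can help to confirm that the interpreter matches the prerequisites. The following is a minimal sketch, not part of the project itself: the function name `preflight_problems` is hypothetical, and it only checks the Python version, the operating system, and whether a virtual environment is active.

```python
import platform
import sys


def preflight_problems(version_info=None, system=None, in_venv=None):
    """Return a list of human-readable problems; an empty list means OK.

    All arguments default to values read from the running interpreter and
    exist only so the checks can be exercised with other inputs.
    """
    version_info = sys.version_info if version_info is None else version_info
    system = platform.system() if system is None else system
    if in_venv is None:
        # Inside a venv, sys.prefix points at the venv, not the base install.
        in_venv = sys.prefix != sys.base_prefix

    problems = []
    if tuple(version_info[:2]) < (3, 8):
        problems.append("Python 3.8 or higher is required")
    if system != "Windows":
        problems.append("win32com.client functionality requires Windows")
    if not in_venv:
        problems.append("the virtual environment is not activated")
    return problems


if __name__ == "__main__":
    for problem in preflight_problems():
        print("WARNING:", problem)
```

Running the script inside the activated environment on the target machine should print nothing if all three checks pass.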

### 3. MongoDB Setup
1. Install MongoDB (v4.4 or higher)
2. Create a new database for the application
3. Set `MONGODB_URI` in the .env file to the connection string for that database
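
If the database uses authentication, special characters in the username or password must be percent-encoded inside a `mongodb://` connection string. The sketch below shows one way to build such a string with the standard library; the helper name `build_mongodb_uri` and the example credentials, host, and database name are illustrative assumptions.

```python
from urllib.parse import quote_plus


def build_mongodb_uri(user, password, host, database, port=27017):
    """Build a mongodb:// URI with percent-encoded credentials."""
    return "mongodb://{}:{}@{}:{}/{}".format(
        quote_plus(user), quote_plus(password), host, port, database
    )


if __name__ == "__main__":
    # Hypothetical values; substitute the real ones for your deployment.
    print(build_mongodb_uri("app_user", "p@ss:word/1", "localhost", "files"))
```

The printed value can then be pasted into the .env file as `MONGODB_URI`.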

### 4. Environment Configuration

Create a .env file in the project root directory with the following variables, replacing the placeholder values with your own:

```env
# Server Configuration
PORT=3000
NODE_ENV=production

# MongoDB Configuration
MONGODB_URI=mongodb://<host>:<port>/<database>

# Upload Paths
UPLOAD_PATH=/path/to/uploads
DIR_PATH=/path/to/base/directory

# OpenAI Configuration
OPENAI_API_KEY=your-openai-api-key
```
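
A missing or empty variable is easy to overlook, so a quick check before starting the server may save some debugging. This is a minimal sketch that assumes the .env file uses plain `KEY=VALUE` lines as in the example above; `parse_env` and `missing_keys` are hypothetical helper names, and the parser is deliberately simpler than a full dotenv implementation.

```python
from pathlib import Path

# The variables listed in the example .env file above.
REQUIRED_KEYS = [
    "PORT",
    "NODE_ENV",
    "MONGODB_URI",
    "UPLOAD_PATH",
    "DIR_PATH",
    "OPENAI_API_KEY",
]


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty."""
    return [key for key in required if not values.get(key)]


if __name__ == "__main__":
    env_file = Path(".env")
    if env_file.exists():
        print("Missing variables:", missing_keys(parse_env(env_file.read_text())))
```

An empty list means every variable from the example is present and non-empty; it does not verify that the values themselves are valid.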

### 5. Directory Structure Setup

Create required directories:
```bash
mkdir -p exhibitions
chmod 755 exhibitions
```
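
The commands above assume a Unix-style shell. Since the target is a Windows machine, the directory can also be created from Python in a platform-independent way. This is a sketch under that assumption; `ensure_directory` is a hypothetical helper name. Note that Python's `chmod` on Windows only controls the read-only flag, so the `755` mode has no direct equivalent there.

```python
from pathlib import Path


def ensure_directory(path):
    """Create the directory (and any parents) if needed; return it as a Path."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


if __name__ == "__main__":
    print("Ready:", ensure_directory("exhibitions").resolve())
```

Because `exist_ok=True` is set, running the script repeatedly is harmless.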

### 6. Windows-Specific Setup

1. Install Microsoft Office (required for .doc file processing)
2. Configure COM permissions for the service account

## Deployment Steps

1. Build the application:
```bash
npm run build
```

2. Start the server:
```bash
# Using PM2 (recommended for production)
pm2 start app.js --name "file-processor"

# Or using Node directly
node app.js
```

3. Verify the installation:
```bash
# Test the file upload endpoint (assumes PORT=3000 and a test.pdf in the current directory)
curl -X POST http://localhost:3000/upload \
  -F "file=@test.pdf" \
  -F "ref=agents" \
  -F "ref_id=123"
```
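
If the `curl` request fails, a first step is to confirm that anything is listening on the configured port at all. The sketch below does only that; it does not exercise the `/upload` endpoint. The function name `port_is_open` is hypothetical, and the default port assumes `PORT=3000` from the .env example.

```python
import socket


def port_is_open(host="localhost", port=3000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


if __name__ == "__main__":
    print("Server reachable:", port_is_open())
```

A `False` result points at the server process or firewall rather than at the upload handling itself.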

## Monitoring and Maintenance

1. Monitor the logs:
```bash
# If using PM2
pm2 logs file-processor

# Check application logs
tail -f logs/app.log
```

2. Monitor queue status:
- Inspect the `queue_statuses` collection in MongoDB
- Check the `processing_summary.json` files in the `processed_files` directories

3. Regular maintenance:
- Clean up temporary files
- Monitor disk space usage
- Check MongoDB indexes
- Review error logs
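
The first two maintenance tasks above can be scripted. The following is a minimal sketch using only the standard library; the helper names, the seven-day retention period, and the idea of pointing it at a temporary-files directory are assumptions, so adjust them to match where the application actually writes temporary files.

```python
import shutil
import time
from pathlib import Path


def remove_old_files(directory, max_age_days=7, now=None):
    """Delete regular files older than max_age_days; return the removed paths."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    removed = []
    # Materialize the listing first so deletions do not disturb the walk.
    for path in list(Path(directory).rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed


def disk_usage_percent(path="."):
    """Return the percentage of space in use on the volume holding path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    print("Disk usage: {:.1f}%".format(disk_usage_percent(".")))
```

`remove_old_files` deletes without confirmation, so it is worth trying it on a copy of the directory first.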

## Troubleshooting

1. File Processing Issues:
- Confirm that the Python virtual environment is activated
- Verify Microsoft Office installation
- Ensure proper file permissions

2. Queue Processing Issues:
- Check MongoDB connection
- Verify queue status entries
- Review process logs

3. AI Integration Issues:
- Verify OpenAI API key
- Check network connectivity
- Monitor API rate limits

## Security Considerations

1. File Upload Security:
- Configure maximum file size limits
- Implement file type validation
- Set up proper file permissions

2. API Security:
- Implement rate limiting
- Use HTTPS
- Configure CORS properly

3. Database Security:
- Enable MongoDB authentication with strong credentials
- Configure network access rules
- Apply security updates regularly
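
As an illustration of the file size and file type checks listed under File Upload Security, here is a minimal sketch. The allowed extensions and the 10 MiB limit are assumptions chosen for the example, not values taken from the application, and `validate_upload` is a hypothetical helper name.

```python
from pathlib import Path

# Assumed policy for illustration only; set these to match your deployment.
ALLOWED_EXTENSIONS = {".pdf", ".doc", ".docx"}
MAX_FILE_SIZE = 10 * 1024 * 1024  # 10 MiB


def validate_upload(filename, size_bytes):
    """Return (ok, reason) for an upload's file name and size in bytes."""
    extension = Path(filename).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        return False, "file type not allowed: {!r}".format(extension)
    if size_bytes <= 0:
        return False, "file is empty"
    if size_bytes > MAX_FILE_SIZE:
        return False, "file exceeds size limit"
    return True, "ok"


if __name__ == "__main__":
    print(validate_upload("test.pdf", 2048))
```

An extension check alone is easy to bypass by renaming a file, so it is best treated as a first filter rather than the only validation step.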

## Backup and Recovery

1. Regular Backups:
- MongoDB data
- Uploaded files
- Configuration files

2. Recovery Procedures:
- Database restore process
- File system recovery
- Configuration restore
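
For the uploaded files, a timestamped archive is one simple backup approach. The sketch below zips a directory using the standard library; `archive_directory`, the `uploads` label, and the example paths are assumptions. It does not cover MongoDB data, which is normally exported with MongoDB's own tools such as `mongodump`.

```python
import shutil
from datetime import datetime
from pathlib import Path


def archive_directory(source_dir, backup_dir, label="uploads", now=None):
    """Zip source_dir into backup_dir and return the path of the archive."""
    now = datetime.now() if now is None else now
    backup_root = Path(backup_dir)
    backup_root.mkdir(parents=True, exist_ok=True)
    # The timestamp keeps successive backups from overwriting each other.
    base_name = backup_root / "{}-{}".format(label, now.strftime("%Y%m%d-%H%M%S"))
    archive = shutil.make_archive(str(base_name), "zip", root_dir=str(source_dir))
    return Path(archive)


if __name__ == "__main__":
    # Hypothetical paths; use the UPLOAD_PATH from your .env file.
    source = Path("/path/to/uploads")
    if source.is_dir():
        print("Wrote", archive_directory(source, "/path/to/backups"))
```

Keep the backup directory outside the directory being archived, otherwise earlier archives end up inside later ones.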

## Performance Optimization

1. System Configuration:
- Adjust Node.js memory limits
- Configure MongoDB indexes
- Optimize Python process pool

2. Storage Management:
- Implement file cleanup policies
- Monitor disk usage
- Configure log rotation

3. Queue Management:
- Adjust concurrent processing limits
- Configure timeout values
- Implement retry mechanisms
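
One common way to implement a retry mechanism is exponential backoff, where the wait doubles after each failed attempt. The following is a generic sketch, not the application's own queue logic: the `retry` helper, its default of three attempts, and the one-second base delay are all assumptions.

```python
import time


def retry(operation, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or the attempts run out.

    Waits base_delay, then 2 * base_delay, then 4 * base_delay, and so on
    between attempts. The last failure is re-raised to the caller.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)


if __name__ == "__main__":
    # Stand-in for a real task such as processing one queued file.
    print(retry(lambda: "processed", attempts=3))
```

In practice it is usually better to catch only the exceptions known to be transient (for example network timeouts) so that genuine bugs fail immediately instead of being retried.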