Blog Generator with Image Optimization
Overview
The main_optimized.py script extends the original blog generator with image optimization and cloud storage uploads.
Key Features
🖼️ Image Optimization
- Automatic resizing: Images larger than 1200x1200px are resized while maintaining aspect ratio
- Format conversion: PNG/JPG images are converted to WebP for better compression
- Quality optimization: Images are saved at a quality setting of 85 to balance file size against visual quality
- EXIF handling: Automatic rotation based on EXIF orientation data
- Size reduction: Files typically shrink by 85-95%
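The optimization steps listed above could be sketched with Pillow roughly as follows. This is a minimal illustration, not the script's actual code: the function name `optimize_image` and the constants are hypothetical, and only the 1200px limit and the quality value of 85 come from this README.

```python
import io

from PIL import Image, ImageOps

MAX_SIDE = 1200    # README: images larger than 1200x1200px are resized
WEBP_QUALITY = 85  # README: quality setting of 85


def optimize_image(data: bytes) -> bytes:
    """Return WebP-encoded bytes for PNG/JPG input bytes (hypothetical helper)."""
    with Image.open(io.BytesIO(data)) as img:
        # Apply the EXIF orientation so the pixels are stored upright.
        img = ImageOps.exif_transpose(img)

        # WebP output here is RGB or RGBA; keep alpha only if the source has it.
        has_alpha = img.mode in ("RGBA", "LA") or "transparency" in img.info
        if img.mode not in ("RGB", "RGBA"):
            img = img.convert("RGBA" if has_alpha else "RGB")

        # thumbnail() only shrinks and preserves the aspect ratio.
        img.thumbnail((MAX_SIDE, MAX_SIDE))

        out = io.BytesIO()
        img.save(out, format="WEBP", quality=WEBP_QUALITY)
        return out.getvalue()
```

Because `thumbnail()` never enlarges, images already within the limit are only re-encoded.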
☁️ Cloud Storage Integration
- Google Cloud Storage: Automatically uploads optimized images to GCP bucket
- Public URLs: Generates public URLs for all uploaded images
- Organized structure: Images are organized by folder (e.g., 06/ for June)
- CDN delivery: Images served from Google's global CDN for fast loading
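As an illustration of how an object name maps to the two address forms used in this README, here is a small sketch. The helper names `gs_uri` and `public_url` are hypothetical; the google-cloud-storage library exposes a comparable value as the `public_url` property of a blob, which the script may well use instead.

```python
from urllib.parse import quote

DEFAULT_BUCKET = "filipkin-blog-images"  # default bucket name from this README


def gs_uri(blob_name: str, bucket: str = DEFAULT_BUCKET) -> str:
    """gs:// address of an object (hypothetical helper)."""
    return f"gs://{bucket}/{blob_name}"


def public_url(blob_name: str, bucket: str = DEFAULT_BUCKET) -> str:
    """HTTPS address of a publicly readable object (hypothetical helper)."""
    # quote() keeps "/" intact and percent-encodes characters such as spaces.
    return f"https://storage.googleapis.com/{bucket}/{quote(blob_name)}"


print(gs_uri("06/image1.webp"))
print(public_url("06/image1.webp"))
```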
🌐 HTML Generation
- Public URLs: HTML uses cloud storage URLs instead of local file paths
- Fallback support: If upload fails, falls back to local file references
- Same format: Maintains the same HTML structure as the original script
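The fallback rule can be pictured as a single choice of `src` value. The sketch below is an assumption about the shape of that logic: the function name `img_tag` and the exact tag layout are hypothetical, since this README does not show the HTML the script emits.

```python
from html import escape
from typing import Optional


def img_tag(local_path: str, public_url: Optional[str], alt: str = "") -> str:
    """Build an <img> tag, preferring the cloud URL (hypothetical helper)."""
    # public_url is None (or empty) when the upload failed, so fall back
    # to the local file reference.
    src = public_url or local_path
    return f'<img src="{escape(src, quote=True)}" alt="{escape(alt, quote=True)}">'
```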
Setup
1. Install Dependencies
pip install python-docx google-cloud-translate google-cloud-storage Pillow
2. Configure Environment
# Required: Path to service account JSON
export GOOGLE_APPLICATION_CREDENTIALS="./service_account.json"
# Optional: Override default bucket name
export GCP_BUCKET_NAME="your-bucket-name" # defaults to "filipkin-blog-images"
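For reference, a script could read these two variables along the following lines. This is a sketch under assumptions: `load_config` and the returned keys are hypothetical names, and only the variable names and the default bucket come from this README.

```python
import os
from typing import Dict, Mapping

DEFAULT_BUCKET = "filipkin-blog-images"  # default from this README


def load_config(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Read the required and optional settings (hypothetical helper)."""
    credentials = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not credentials:
        # Fail early with an actionable message instead of a late upload error.
        raise RuntimeError(
            "GOOGLE_APPLICATION_CREDENTIALS is not set; "
            "point it at your service account JSON file."
        )
    return {
        "credentials_path": credentials,
        "bucket": env.get("GCP_BUCKET_NAME", DEFAULT_BUCKET),
    }
```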
3. Ensure GCP Bucket Exists
The bucket (filipkin-blog-images by default, or the name set in GCP_BUCKET_NAME) must already exist and be publicly readable.
Usage
python main_optimized.py input.docx output.html
Example
python main_optimized.py files/06/index.docx files/06/index_optimized.html
What Happens
- Extract images from the DOCX file
- Optimize each image:
  - Resize if larger than 1200x1200px
  - Convert to WebP format
  - Compress with 85% quality
  - Apply EXIF orientation fixes
- Upload to GCP bucket:
  - Upload to gs://filipkin-blog-images/FOLDER/imagename.webp
  - Generate public URL: https://storage.googleapis.com/filipkin-blog-images/FOLDER/imagename.webp
- Generate HTML with public URLs
- Save backup of optimized images locally
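To make the first step concrete: a .docx file is a ZIP archive, and embedded images are stored under `word/media/`. The sketch below reads them with the standard `zipfile` module. This is a swapped-in technique for illustration; the script itself depends on python-docx and may extract images differently, and `extract_docx_images` is a hypothetical name.

```python
import zipfile
from pathlib import PurePosixPath
from typing import Dict


def extract_docx_images(docx_file) -> Dict[str, bytes]:
    """Map image file names to raw bytes (hypothetical helper).

    docx_file may be a path or a binary file-like object.
    """
    images: Dict[str, bytes] = {}
    with zipfile.ZipFile(docx_file) as archive:
        for name in archive.namelist():
            # Embedded media lives under word/media/ inside the archive.
            if name.startswith("word/media/") and not name.endswith("/"):
                images[PurePosixPath(name).name] = archive.read(name)
    return images
```

Each extracted image would then go through the optimize and upload steps before the HTML is written.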
Performance Improvements
Before (Original)
- File sizes: 1-3MB per image
- Format: PNG/JPG
- Storage: Local files only
- Loading: Slow, especially on mobile
- Total size: ~35MB for a typical blog post
After (Optimized)
- File sizes: 50-400KB per image (85-95% reduction)
- Format: WebP (better compression)
- Storage: Google Cloud Storage with CDN
- Loading: Fast global delivery
- Total size: ~3.5MB for the same blog post
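The percentage figures follow from a simple ratio. As a quick check of the totals quoted above (the function name is hypothetical):

```python
def reduction_percent(before: float, after: float) -> float:
    """Percentage by which a size shrank; both values in the same unit."""
    if before <= 0:
        raise ValueError("before must be positive")
    return 100.0 * (before - after) / before


# Totals from this README: ~35MB before, ~3.5MB after.
print(reduction_percent(35, 3.5))  # 90.0
```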
Error Handling
- Upload failures: Falls back to local file references
- Optimization failures: Uses original image if optimization fails
- Missing credentials: Clear error messages with setup instructions
- Network issues: Continues processing other images if one fails
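One way to get the behavior described above is to wrap each stage in its own `try` block, so a failure affects only the current image. The sketch below assumes hypothetical `optimize` and `upload` callables and a hypothetical `local_dir`; a real implementation would likely catch narrower exception types than `Exception`.

```python
import logging
from pathlib import PurePosixPath
from typing import Callable, Dict

log = logging.getLogger(__name__)


def process_images(
    images: Dict[str, bytes],
    optimize: Callable[[bytes], bytes],
    upload: Callable[[str, bytes], str],
    local_dir: str = "images",
) -> Dict[str, str]:
    """Return the reference to use in HTML for each original image name."""
    sources: Dict[str, str] = {}
    for name, data in images.items():
        try:
            payload = optimize(data)
            out_name = f"{PurePosixPath(name).stem}.webp"
        except Exception as exc:
            # Optimization failed: keep the original image and file name.
            log.warning("Could not optimize %s: %s", name, exc)
            payload, out_name = data, name
        try:
            sources[name] = upload(out_name, payload)
        except Exception as exc:
            # Upload failed: fall back to a local reference and move on.
            log.warning("Could not upload %s: %s", out_name, exc)
            sources[name] = f"{local_dir}/{out_name}"
    return sources
```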
File Organization
filipkin-blog-images/
├── 06/
│   ├── image1.webp
│   ├── image2.webp
│   └── ...
├── 07/
│   ├── image1.webp
│   └── ...
└── test/
    └── test_images...
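Going by the usage example (`files/06/index.docx`) and the layout above, the folder plausibly comes from the directory that contains the input document. The sketch below encodes that guess; both the rule and the name `blob_name_for` are assumptions, not confirmed behavior of the script.

```python
from pathlib import PurePosixPath


def blob_name_for(docx_path: str, image_filename: str) -> str:
    """Object name inside the bucket for one image (hypothetical helper)."""
    # Assumption: the folder is the input document's parent directory name.
    folder = PurePosixPath(docx_path).parent.name
    stem = PurePosixPath(image_filename).stem
    return f"{folder}/{stem}.webp" if folder else f"{stem}.webp"


print(blob_name_for("files/06/index.docx", "image1.png"))  # 06/image1.webp
```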
Benefits
- Faster loading: Image files are 85-95% smaller
- Better user experience: Especially on mobile/slow connections
- Global CDN: Fast delivery worldwide via Google's infrastructure
- Future-proof: Easy to update images without re-deploying
- Cost effective: Reduces bandwidth costs
- SEO benefits: Faster page loads can help search rankings
Backward Compatibility
The optimized script maintains full compatibility with the original:
- Same command-line interface
- Same HTML structure
- Same translation features
- Falls back gracefully if cloud features aren't available