Welcome to the File Processing System! This project automatically processes CSV files, extracts their metadata, and stores it in AWS services emulated locally with LocalStack. It mimics real-world cloud-based file processing without incurring AWS costs. 🏗️☁️
- ✅ Uploads CSV files to an S3 bucket (LocalStack S3)
- ✅ Extracts metadata (row count, column count, column names, file size, timestamp)
- ✅ Stores metadata in DynamoDB (LocalStack DynamoDB)
- ✅ Uses LocalStack to simulate AWS services locally
- ✅ Ensures a smooth pipeline from upload ➝ processing ➝ storage
Technology | Purpose |
---|---|
Python 🐍 | Core scripting language |
LocalStack 🏗️ | AWS Cloud Emulator |
Boto3 🔗 | AWS SDK for Python |
AWS S3 (LocalStack) 📂 | Cloud storage for CSV files |
AWS Lambda (LocalStack) ⚡ | Serverless file processing |
AWS DynamoDB (LocalStack) 🗄️ | NoSQL database to store metadata |
Make sure you have the following installed on your system:
- Docker 🐳 (Ensure it's running ✅)
- LocalStack 🏗️ (AWS Cloud Emulator)
- Python (Recommended: 3.8+)
- Pip for package management
1️⃣ Clone this repository:
git clone https://github.com/KritikaK21/AWSProject1.git
cd AWSProject1
2️⃣ Install required dependencies:
pip install -r requirements.txt
3️⃣ Start LocalStack in Docker mode:
localstack start -d # Runs in detached mode
4️⃣ Verify LocalStack is running:
localstack status
Expected Output:
Runtime status | ✔ running (name: "localstack-main", IP: 172.17.0.2)
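The same check can be scripted. The sketch below is a minimal example, not part of the project: it assumes LocalStack listens on its default edge port (4566) and exposes the `/_localstack/health` endpoint, which may differ on older LocalStack versions. The function name `localstack_health` is made up for illustration.

```python
import json
import urllib.error
import urllib.request

# Assumption: default LocalStack edge port and health endpoint path.
HEALTH_URL = "http://localhost:4566/_localstack/health"


def localstack_health(url=HEALTH_URL, timeout=2.0):
    """Return the parsed health document, or None if LocalStack is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.loads(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a body that is not valid JSON.
        return None
```

Calling `localstack_health()` returns `None` when nothing answers on that port; otherwise the returned dictionary should contain a `services` entry listing the emulated services.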
Create the S3 bucket, upload a CSV file, and list the bucket contents:
awslocal s3 mb s3://file-processing-bucket
awslocal s3 cp example.csv s3://file-processing-bucket/
awslocal s3 ls s3://file-processing-bucket/
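The `awslocal` commands above can also be done from Python with Boto3. This is a rough sketch under a few assumptions: LocalStack on its default port 4566, region `us-east-1`, and dummy `test` credentials. The helper names `localstack_client_kwargs` and `upload_csv` are hypothetical and not taken from the project's scripts.

```python
from pathlib import Path

BUCKET = "file-processing-bucket"  # bucket name used in the awslocal commands


def localstack_client_kwargs(endpoint_url="http://localhost:4566"):
    """Connection settings for a boto3 client that talks to LocalStack.

    Assumptions: default edge port 4566, region us-east-1, dummy credentials.
    """
    return {
        "endpoint_url": endpoint_url,
        "region_name": "us-east-1",
        "aws_access_key_id": "test",
        "aws_secret_access_key": "test",
    }


def upload_csv(path, bucket=BUCKET):
    """Create the bucket, upload one file, and return the keys in the bucket."""
    import boto3  # imported lazily so the settings helper works without boto3

    s3 = boto3.client("s3", **localstack_client_kwargs())
    s3.create_bucket(Bucket=bucket)  # like `awslocal s3 mb`
    s3.upload_file(str(path), bucket, Path(path).name)  # like `awslocal s3 cp`
    listing = s3.list_objects_v2(Bucket=bucket)  # like `awslocal s3 ls`
    return [obj["Key"] for obj in listing.get("Contents", [])]
```

With LocalStack running, `upload_csv("example.csv")` should return a list that includes `example.csv`.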
Run the metadata extraction script:
python extract_metadata.py
Expected Output:
{
"filename": "example.csv",
"upload_timestamp": "2025-03-09 12:00:16",
"file_size_bytes": 118,
"row_count": 3,
"column_count": 5,
"column_names": "['id', 'name', 'age', 'city', 'date']"
}
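To show where these fields can come from, here is a minimal sketch of the extraction step using only the standard library. It is not the actual `extract_metadata.py`. It assumes UTF-8 input, a header in the first row, and that `row_count` excludes the header; the sample data in the usage note is made up.

```python
import csv
import io
from datetime import datetime


def extract_metadata(filename, data, uploaded_at=None):
    """Build a metadata record for CSV content given as bytes.

    Assumptions: UTF-8 input, first row is the header, and row_count
    counts data rows only.
    """
    text = io.StringIO(data.decode("utf-8"), newline="")
    rows = [row for row in csv.reader(text) if row]  # skip blank lines
    header = rows[0] if rows else []
    uploaded_at = uploaded_at or datetime.now()
    return {
        "filename": filename,
        "upload_timestamp": uploaded_at.strftime("%Y-%m-%d %H:%M:%S"),
        "file_size_bytes": len(data),
        "row_count": max(len(rows) - 1, 0),
        "column_count": len(header),
        # Kept as the string form of the list, matching the output above.
        "column_names": str(header),
    }
```

For example, passing `b"id,name,age,city,date\n1,Ana,30,Lima,2025-01-01\n"` yields `row_count` 1 and `column_count` 5.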
Scan the DynamoDB table to check the stored metadata:
awslocal dynamodb scan --table-name FileMetadata
Expected Output:
{
"Items": [
{
"filename": { "S": "example.csv" },
"upload_timestamp": { "S": "2025-03-09 12:00:16" },
"file_size_bytes": { "N": "118" },
"row_count": { "N": "3" },
"column_count": { "N": "5" },
"column_names": { "S": "['id', 'name', 'age', 'city', 'date']" }
}
]
}
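The `S` and `N` wrappers in the scan output are DynamoDB's typed attribute format: strings go under `S`, and numbers go under `N` as strings. The sketch below shows one way to build such an item and write it with Boto3. It assumes the `FileMetadata` table already exists and that LocalStack runs on port 4566; `to_dynamodb_item` and `store_metadata` are hypothetical names.

```python
def to_dynamodb_item(metadata):
    """Convert a flat dict of str/int values to DynamoDB's typed format."""
    item = {}
    for key, value in metadata.items():
        if isinstance(value, int) and not isinstance(value, bool):
            item[key] = {"N": str(value)}  # numbers are sent as strings
        else:
            item[key] = {"S": str(value)}
    return item


def store_metadata(metadata, table="FileMetadata",
                   endpoint_url="http://localhost:4566"):
    """Write one metadata record to a table that is assumed to exist."""
    import boto3  # imported lazily so the converter works without boto3

    client = boto3.client(
        "dynamodb",
        endpoint_url=endpoint_url,
        region_name="us-east-1",
        aws_access_key_id="test",  # dummy credentials for LocalStack
        aws_secret_access_key="test",
    )
    client.put_item(TableName=table, Item=to_dynamodb_item(metadata))
```

Writing the typed format by hand keeps the example close to what the scan prints.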
If LocalStack is not running, start it:
localstack start -d
Check if Docker is running and restart LocalStack:
docker ps # Ensure LocalStack container is running
localstack stop && localstack start -d
Make sure your file exists at the expected location (for example, C:\Users\Hello\) and update your script with the correct file path.
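A small guard in the script can make a wrong path obvious before anything is uploaded. This is only a sketch; `resolve_csv_path` is a made-up helper name.

```python
from pathlib import Path


def resolve_csv_path(path):
    """Return the absolute path of an existing file, or raise a clear error."""
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"CSV file not found: {resolved}")
    return resolved
```

The error message contains the fully resolved path, which shows exactly where the script looked.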
- ✅ Add an AWS Lambda trigger for automatic processing
- ✅ Implement error handling & logging
- ✅ Store metadata in AWS RDS (PostgreSQL/MySQL)
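For the Lambda trigger idea, the handler would first need to read the bucket and object key from the S3 event it receives. The sketch below covers only that parsing step, based on the standard S3 event notification layout. It is not existing project code, and `objects_from_s3_event` and `handler` are hypothetical names.

```python
from urllib.parse import unquote_plus


def objects_from_s3_event(event):
    """Return (bucket, key) pairs from an S3 event notification payload."""
    pairs = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        # Object keys arrive URL-encoded in S3 events (spaces become "+").
        pairs.append((s3["bucket"]["name"], unquote_plus(s3["object"]["key"])))
    return pairs


def handler(event, context):
    """Hypothetical Lambda entry point: report which objects would be processed."""
    pairs = objects_from_s3_event(event)
    for bucket, key in pairs:
        print(f"would process s3://{bucket}/{key}")
    return {"processed": len(pairs)}
```

The metadata extraction and the DynamoDB write would then run once per `(bucket, key)` pair.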
This project simulates real AWS cloud operations locally, making it cost-effective & developer-friendly! 🚀 If you have any suggestions or improvements, feel free to contribute! 😊
🔗 GitHub Repository: https://github.com/KritikaK21/AWSProject1
💬 Need help? Open an issue or contact me!