Recipe: Items-only pipeline
Store only scraped items in PostgreSQL (no request tracking, minimal logging).
1) Enable (settings.py)
ITEM_PIPELINES = {
    'scrapy_item_ingest.ItemsPipeline': 300,
}
# Database config (pick ONE style)
DB_URL = 'postgresql://user:password@localhost:5432/database'
# or discrete fields:
# DB_HOST = 'localhost'
# DB_PORT = 5432
# DB_USER = 'user'
# DB_PASSWORD = 'password'
# DB_NAME = 'database'
CREATE_TABLES = True
2) Run
scrapy crawl your_spider
Expected
Items are saved as JSON rows in job_items.
No job_requests rows are created in this recipe.
Tip
Use discrete DB fields to avoid URL‑encoding special characters in passwords.
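If you prefer to keep the single DB_URL style, the password has to be percent-encoded before it goes into the URL. The snippet below is a minimal sketch of that step using only the Python standard library; the sample password and the `encoded` variable name are made up for illustration, and the user, host, port, and database values are the placeholders from the settings above.

```python
from urllib.parse import quote

# Hypothetical password containing characters that are special in URLs.
password = "p@ss:w/rd#1"

# safe="" makes quote() encode every reserved character, including "/".
encoded = quote(password, safe="")

# Same placeholder user, host, port, and database as in the recipe.
DB_URL = f"postgresql://user:{encoded}@localhost:5432/database"

print(DB_URL)
```

With the discrete fields, no encoding is needed: DB_PASSWORD can hold the raw password string.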