Recipe: Items-only pipeline

Store only items to PostgreSQL (no request tracking, minimal logging).

1) Enable (settings.py)

ITEM_PIPELINES = {
    'scrapy_item_ingest.ItemsPipeline': 300,
}

# Database config (pick ONE style)
DB_URL = 'postgresql://user:password@localhost:5432/database'
# or discrete fields:
# DB_HOST = 'localhost'
# DB_PORT = 5432
# DB_USER = 'user'
# DB_PASSWORD = 'password'
# DB_NAME = 'database'

CREATE_TABLES = True

2) Run

scrapy crawl your_spider

Expected

  • Items saved as JSON rows in job_items.

  • No job_requests rows are created in this recipe.

Tip

Use discrete DB fields to avoid URL‑encoding special characters in passwords.