Configuration

This section covers only the essential settings. Add them to your Scrapy project’s settings.py.

Required

Database (pick ONE style)

# Single URL
DB_URL = 'postgresql://user:password@localhost:5432/database'

# OR discrete fields (no URL encoding needed)
# DB_HOST = 'localhost'
# DB_PORT = 5432
# DB_USER = 'user'
# DB_PASSWORD = 'password'
# DB_NAME = 'database'
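
If you would rather not hard-code credentials, the discrete fields can be filled from the environment. This is a minimal sketch for settings.py; the environment variable names are assumptions chosen to mirror the setting names, not something the package requires.

```python
import os

# Hypothetical environment variable names; use whatever your deployment provides.
# The second argument is the fallback used when the variable is not set.
DB_HOST = os.environ.get('DB_HOST', 'localhost')
DB_PORT = int(os.environ.get('DB_PORT', '5432'))  # env values are strings, so convert
DB_USER = os.environ.get('DB_USER', 'user')
DB_PASSWORD = os.environ.get('DB_PASSWORD', 'password')
DB_NAME = os.environ.get('DB_NAME', 'database')
```

Because the password is passed as its own field, it needs no URL encoding even if it contains characters such as @ or $.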

Recommended

ITEM_PIPELINES = {
    'scrapy_item_ingest.DbInsertPipeline': 300,
}

EXTENSIONS = {
    'scrapy_item_ingest.LoggingExtension': 500,
}

Optional

CREATE_TABLES = True   # auto-create job_items, job_requests, job_logs
# JOB_ID = 1           # omit to use spider name

Table names (optional)

# Defaults
# ITEMS_TABLE = 'job_items'
# REQUESTS_TABLE = 'job_requests'
# LOGS_TABLE = 'job_logs'

Logging to DB (optional)

# Minimum level stored in DB
# LOG_DB_LEVEL = 'INFO'  # or 'DEBUG', 'WARNING', ...

# Capture level for Scrapy loggers routed to DB (does not change console)
# LOG_DB_CAPTURE_LEVEL = 'DEBUG'

# Include/exclude loggers and messages
# LOG_DB_LOGGERS = ['scrapy']
# LOG_DB_EXCLUDE_LOGGERS = ['scrapy.core.scraper']
# LOG_DB_EXCLUDE_PATTERNS = ['Scraped from <']
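
To get a feel for how these filters might combine, here is a rough sketch built on the standard logging module. It is not the package's implementation: the class name DbLogFilter is hypothetical, and the matching rules (prefix match on logger names, substring match on messages) are assumptions made for illustration.

```python
import logging


def _matches(name, prefixes):
    # True if the logger name equals a prefix or is one of its children.
    return any(name == p or name.startswith(p + '.') for p in prefixes)


class DbLogFilter(logging.Filter):
    """Illustrative only: NOT the package's implementation."""

    def __init__(self, level='INFO', loggers=(), exclude_loggers=(),
                 exclude_patterns=()):
        super().__init__()
        self.level = getattr(logging, level)  # e.g. 'INFO' -> logging.INFO
        self.loggers = tuple(loggers)
        self.exclude_loggers = tuple(exclude_loggers)
        self.exclude_patterns = tuple(exclude_patterns)

    def filter(self, record):
        # Drop records below the minimum level (assumed role of LOG_DB_LEVEL).
        if record.levelno < self.level:
            return False
        # Keep only the included loggers, if any are listed.
        if self.loggers and not _matches(record.name, self.loggers):
            return False
        # Drop explicitly excluded loggers.
        if _matches(record.name, self.exclude_loggers):
            return False
        # Drop messages containing any excluded pattern (assumed substring match).
        message = record.getMessage()
        return not any(p in message for p in self.exclude_patterns)


db_filter = DbLogFilter(
    level='INFO',
    loggers=['scrapy'],
    exclude_loggers=['scrapy.core.scraper'],
    exclude_patterns=['Scraped from <'],
)
```

Under these assumptions, an INFO record from scrapy.core.engine would be stored, while anything from scrapy.core.scraper, or any message containing 'Scraped from <', would be skipped.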

Tips

If your password contains @ or $ and you use DB_URL, percent-encode those characters: @ -> %40, $ -> %24. Other reserved characters such as : and / need encoding too.
Prefer discrete fields to avoid URL encoding.
Set CREATE_TABLES = True for the first run; afterwards you can leave it on or set it to False.
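
The URL encoding mentioned in the first tip can be done with Python's standard library instead of by hand. A minimal sketch, using a made-up example password:

```python
from urllib.parse import quote

raw_password = 'p@ss$word'  # example value only

# safe='' makes quote() encode every reserved character, including '/'.
encoded = quote(raw_password, safe='')  # 'p%40ss%24word'

DB_URL = f'postgresql://user:{encoded}@localhost:5432/database'
```

Encode only the password (and the user name, if it contains special characters), not the whole URL, so that separators such as :// and @ keep their meaning.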