Fetch NEMWEB data

Download data with the CLI

# All active tables for a date range
# (quote the range: an unquoted > is treated as a redirect by POSIX shells)
uv run populate --location ./data --date_range "2024-01-01->2024-06-30"

# A specific table only
uv run populate --location ./data --date_range "2024-01-01->2024-06-30" --table DISPATCHREGIONSUM

# Force re-download (overwrite existing)
uv run populate --location ./data --date_range "2024-01-01->2024-01-31" --force_new

CLI flags:

Flag          Default          Description
--location    ~/.nemdb_cache   Directory to store Parquet files
--filesystem  file             Storage backend (file, gcs, etc.)
--date_range  (required)       Date range as YYYY-MM-DD->YYYY-MM-DD
--table       all              Table name or all
--force_new   False            Re-download existing data
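
When the CLI is driven from a script, passing the arguments as a list avoids shell quoting of the `->` separator altogether. The following is a minimal sketch that uses only the flags listed above; `build_populate_command` is a hypothetical helper for illustration and is not part of nemdb.

```python
import subprocess
from datetime import date


def build_populate_command(location, start, end, table=None, force_new=False):
    """Build the argument list for `uv run populate` (hypothetical helper)."""
    if start > end:
        raise ValueError("start must not be after end")
    cmd = [
        "uv", "run", "populate",
        "--location", str(location),
        # The CLI expects YYYY-MM-DD->YYYY-MM-DD; isoformat() yields YYYY-MM-DD.
        "--date_range", f"{start.isoformat()}->{end.isoformat()}",
    ]
    if table is not None:
        cmd += ["--table", table]
    if force_new:
        cmd.append("--force_new")
    return cmd


cmd = build_populate_command(
    "./data", date(2024, 1, 1), date(2024, 6, 30), table="DISPATCHREGIONSUM"
)
print(cmd)
# To actually run the download: subprocess.run(cmd, check=True)
```

Because no shell is involved when `subprocess.run` receives a list, the `>` in the date range is passed through literally.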

Download data from Python

from nemdb import NEMWEBManager, Config

Config.set_cache_dir("./data")
nemweb = NEMWEBManager(Config)

# Populate all active tables
nemweb.populate(slice("2024-01-01", "2024-06-30"))

# Populate a single table
nemweb.DISPATCHREGIONSUM.populate(slice("2024-01-01", "2024-06-30"))

# Add a single month
nemweb.DISPATCHREGIONSUM.add_data(year=2024, month=1)
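
Since `add_data` takes one year and month at a time, a longer period can be covered by looping over calendar months. The sketch below shows one way to enumerate them; `iter_months` is a hypothetical helper, and the `add_data` call is left as a comment because it needs a configured manager.

```python
from datetime import date


def iter_months(start, end):
    """Yield (year, month) for every calendar month touched by [start, end]."""
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        yield year, month
        # After December, roll over to January of the next year.
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)


months = list(iter_months(date(2023, 11, 15), date(2024, 2, 3)))
print(months)

# With a configured manager, each pair could be passed to add_data:
# for year, month in months:
#     nemweb.DISPATCHREGIONSUM.add_data(year=year, month=month)
```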

Available tables

Tables populated by default

These tables are downloaded when populate is run without a specific table (--table all, the default):

Table                   Filter type      Description
DISPATCHREGIONSUM       Settlement date  Regional demand, supply, and availability by 5-min interval
DISPATCHLOAD            Settlement date  Unit-level dispatch (MW cleared, ramp rates, FCAS)
DISPATCHPRICE           Settlement date  Regional reference prices and FCAS prices
BIDPEROFFER_D           Settlement date  Unit volume bids by 5-min interval (large table)
BIDDAYOFFER_D           Settlement date  Unit price bids by market day
DUDETAILSUMMARY         Start/end date   Unit metadata (region, connection point, loss factors)
DUDETAIL                Effective date   Detailed unit parameters (capacity, ramp rates, storage)
DUALLOC                 None             DUID to GENSETID mapping
GENUNITS                None             Generator set information (voltage, fuel, emissions)
STATION                 None             Station name and address
STATIONOPERATINGSTATUS  None             Station operating status
MNSP_INTERCONNECTOR     Effective date   Market network service provider interconnectors

Additional tables (not populated by default)

Table                         Filter type      Description
DISPATCHCONSTRAINT            Settlement date  Generic constraints used in dispatch
GENCONDATA                    Effective date   Generic constraint definitions
SPDREGIONCONSTRAINT           Effective date   Regional constraint LHS terms
SPDCONNECTIONPOINTCONSTRAINT  Effective date   Connection point constraint LHS terms
SPDINTERCONNECTORCONSTRAINT   Effective date   Interconnector constraint LHS terms
INTERCONNECTOR                Effective date   Interconnector region links
INTERCONNECTORCONSTRAINT      Effective date   Interconnector loss parameters
LOSSMODEL                     Effective date   Interconnector loss model breakpoints
LOSSFACTORMODEL               Effective date   Loss factor demand coefficients
DISPATCHINTERCONNECTORRES     Settlement date  Interconnector dispatch results
RESERVE                       Settlement date  FCAS reserve requirements
ZONE_SUBSTATION               Year             DNSP zone substation loads

Query patterns

By settlement date (5-minute intervals)

# Returns data for a single 5-minute interval
df = nemweb.DISPATCHREGIONSUM.get_data("2024/01/15 12:00:00")
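
To look up several consecutive intervals, the timestamp strings can be generated rather than typed by hand. This is a small sketch that produces strings in the same `YYYY/MM/DD HH:MM:SS` format as the call above; `five_minute_stamps` is a hypothetical helper, and it makes no assumption about how nemdb itself aligns intervals.

```python
from datetime import datetime, timedelta


def five_minute_stamps(start, end):
    """Return timestamps from start to end (inclusive) in 5-minute steps."""
    if start.minute % 5 or start.second or start.microsecond:
        raise ValueError("start must lie on a 5-minute boundary")
    stamps = []
    current = start
    while current <= end:
        stamps.append(current.strftime("%Y/%m/%d %H:%M:%S"))
        current += timedelta(minutes=5)
    return stamps


stamps = five_minute_stamps(
    datetime(2024, 1, 15, 12, 0), datetime(2024, 1, 15, 12, 15)
)
print(stamps)
```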

By start/end date range

# Returns data where start_date <= date <= end_date
df = nemweb.DUDETAILSUMMARY.get_data("2024/01/15")

By effective date and version

# Returns latest version of data effective on or before the date
df = nemweb.GENCONDATA.get_data("2024/01/15")
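
The "latest version effective on or before the date" rule can be illustrated on toy records. The sketch below is not nemdb's implementation; it assumes the AEMO-style column names EFFECTIVEDATE and VERSIONNO, uses made-up rows, and `latest_effective` is a hypothetical helper.

```python
from datetime import date

# Made-up rows for illustration; column names follow AEMO's usual conventions.
records = [
    {"GENCONID": "A", "EFFECTIVEDATE": date(2023, 7, 1), "VERSIONNO": 1},
    {"GENCONID": "A", "EFFECTIVEDATE": date(2024, 1, 1), "VERSIONNO": 1},
    {"GENCONID": "A", "EFFECTIVEDATE": date(2024, 1, 1), "VERSIONNO": 2},
    {"GENCONID": "A", "EFFECTIVEDATE": date(2024, 3, 1), "VERSIONNO": 1},
    {"GENCONID": "B", "EFFECTIVEDATE": date(2024, 2, 1), "VERSIONNO": 1},
]


def latest_effective(rows, key, as_of):
    """Per key, keep the row with the greatest (EFFECTIVEDATE, VERSIONNO)
    among rows whose EFFECTIVEDATE is on or before as_of."""
    best = {}
    for row in rows:
        if row["EFFECTIVEDATE"] > as_of:
            continue  # not yet in effect on the requested date
        rank = (row["EFFECTIVEDATE"], row["VERSIONNO"])
        current = best.get(row[key])
        if current is None or rank > (current["EFFECTIVEDATE"], current["VERSIONNO"]):
            best[row[key]] = row
    return list(best.values())


selected = latest_effective(records, "GENCONID", date(2024, 1, 15))
print(selected)
```

Here constraint "A" resolves to version 2 of the 2024-01-01 record, and "B" is omitted because its only record takes effect after the requested date.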

Lazy queries with Polars

import polars as pl

# Create a lazy query
lf = nemweb.DISPATCHREGIONSUM.scan()

# Filter, aggregate, then collect
result = (
    lf.filter(
        (pl.col("REGIONID") == "NSW1")
        & (pl.col("SETTLEMENTDATE") >= pl.datetime(2024, 1, 1))
        & (pl.col("SETTLEMENTDATE") < pl.datetime(2024, 2, 1))
    )
    .group_by(pl.col("SETTLEMENTDATE").dt.date())
    .agg(pl.col("TOTALDEMAND").mean())
    .sort("SETTLEMENTDATE")
    .collect()
)

Read bid data

Bid data is handled separately due to its size:

# Price and volume bids for a specific day
price_bids = nemweb.get_unit_price_bids("2024/01/15 12:00:00")
volume_bids = nemweb.get_unit_volume_bids("2024/01/15 12:00:00")

Use cloud storage

nemdb supports any fsspec-compatible backend:

from nemdb import Config

# Google Cloud Storage
Config.set_cache_dir("gs://my-bucket/nemweb")
Config.set_filesystem("gcs")

# Via CLI (quote the range: an unquoted > is treated as a redirect by POSIX shells)
uv run populate --location gs://my-bucket/nemweb --filesystem gcs --date_range "2024-01-01->2024-01-31"
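
The location and the filesystem name have to agree, so a script that accepts either local or cloud paths may want to derive one from the other. This is a hedged sketch: `infer_filesystem` and its mapping are hypothetical, and only the "file" and "gcs" values come from the documentation above.

```python
from urllib.parse import urlparse

# Hypothetical mapping from URL scheme to the filesystem name.
# Paths without a scheme (./data, ~/.nemdb_cache) are treated as local.
SCHEME_TO_FILESYSTEM = {"": "file", "file": "file", "gs": "gcs", "gcs": "gcs"}


def infer_filesystem(location):
    """Return the filesystem name matching a cache location (hypothetical helper)."""
    scheme = urlparse(location).scheme
    try:
        return SCHEME_TO_FILESYSTEM[scheme]
    except KeyError:
        raise ValueError(f"no known filesystem for scheme {scheme!r}") from None


print(infer_filesystem("gs://my-bucket/nemweb"))
print(infer_filesystem("./data"))
```

Other backends would need their own entries in the mapping; Windows drive paths such as `C:\data` parse with a one-letter scheme and would need special handling.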