Dimension Registration

Dimension registration allows publishers to promote frequently-used targeting attributes from generic key-value pairs to first-class dimensions with type validation, indexing, and optimized analytics.

Overview

OpenAdServe supports two complementary approaches for custom targeting:

Registered Dimensions

  • Purpose: Well-defined, frequently-used targeting attributes
  • Benefits: Type validation, ClickHouse indexing, structured analytics
  • Use Cases: Content categories, user tiers, page positions, brand safety ratings
  • Configuration: Defined in YAML configuration file

Custom Key-Values

  • Purpose: Dynamic, ad-hoc targeting needs
  • Benefits: Maximum flexibility, no configuration overhead
  • Use Cases: Experimental attributes, one-off campaigns, A/B test groups
  • Configuration: None required - works out of the box

When to Use Registered Dimensions

Register dimensions when you have targeting attributes that are:

  1. Frequently Used: Targeted by multiple campaigns or line items
  2. Well-Defined: Clear set of possible values (enums) or value ranges (numbers)
  3. Performance Critical: Need fast analytics queries or reporting
  4. Validation Required: Want to prevent targeting errors from invalid values

Examples of good dimension candidates:

  • Content categories (sports, news, tech, etc.)
  • User subscription tiers (free, premium, enterprise)
  • Page positions (above_fold, below_fold, sidebar)
  • Brand safety ratings (safe, moderate, restricted)

Configuration

Setup

Docker Compose: No setup required - the .env file sets DIMENSIONS_CONFIG_PATH=config/dimensions.yaml, and that file already includes basic dimensions for content categorization, page positioning, and user segmentation.

Custom Configuration: Set the DIMENSIONS_CONFIG_PATH environment variable:

export DIMENSIONS_CONFIG_PATH=my-custom-dimensions.yaml
./openadserve

Configuration Schema

dimensions:
  - name: dimension_name               # Required: Unique identifier
    type: enum|string|number|boolean   # Required: Data type
    values: [val1, val2, val3]         # For enum type: allowed values
    min: 0                             # For number type: minimum value
    max: 100                           # For number type: maximum value
    indexed: true|false                # Whether to create ClickHouse index
    required: true|false               # Whether dimension is required in requests
    default: "default_value"           # Default value when missing
    description: "Description"         # Human-readable description
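As a rough illustration of the schema above, the following sketch checks a configuration that has already been loaded into a Python dictionary (for example by a YAML parser). The function names and the specific rules (non-empty values for enums, min not above max, unique names) are assumptions drawn from the field comments, not OpenAdServe's actual validation logic.

```python
from typing import Any

VALID_TYPES = {"enum", "string", "number", "boolean"}


def check_dimension_definition(dim: dict[str, Any]) -> list[str]:
    """Return the problems found in one dimension entry (empty list if none)."""
    problems = []
    name = dim.get("name")
    if not name:
        problems.append("missing required field: name")
    dim_type = dim.get("type")
    if dim_type not in VALID_TYPES:
        problems.append(f"{name}: type must be one of {sorted(VALID_TYPES)}")
    # Assumption: an enum without allowed values is treated as a config error.
    if dim_type == "enum" and not dim.get("values"):
        problems.append(f"{name}: enum dimensions need a non-empty values list")
    if dim_type == "number":
        lo, hi = dim.get("min"), dim.get("max")
        if lo is not None and hi is not None and lo > hi:
            problems.append(f"{name}: min must not exceed max")
    return problems


def check_config(config: dict[str, Any]) -> list[str]:
    """Check every entry under 'dimensions' and flag duplicate names."""
    problems = []
    seen = set()
    for dim in config.get("dimensions", []):
        problems.extend(check_dimension_definition(dim))
        name = dim.get("name")
        if name in seen:
            problems.append(f"{name}: duplicate dimension name")
        seen.add(name)
    return problems
```

Running a check like this before deploying a new dimensions file can surface typos early, since configuration changes only take effect after a restart.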

Dimension Types

Enum

Most common type for categorical data with predefined values.

- name: content_category
  type: enum
  values: [sports, news, entertainment, tech]
  indexed: true
  default: "news"
  description: "Primary content category"

String

For open-ended text values like author names or IDs.

- name: article_author
  type: string
  indexed: true
  description: "Article author name"

Number

For numeric values with optional min/max constraints.

- name: user_engagement_score
  type: number
  min: 0
  max: 100
  indexed: true
  description: "User engagement score (0-100)"

Boolean

For true/false flags.

- name: is_premium_content
  type: boolean
  indexed: true
  default: "false"
  description: "Whether content is premium"
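To make the four types concrete, here is a minimal sketch of how a raw request value might be checked and converted for each one. It assumes values arrive as strings and that booleans are spelled "true" or "false"; the function name coerce_value and these parsing rules are illustrative, not the server's documented behavior.

```python
import math


def coerce_value(dim: dict, raw: str):
    """Convert a raw string to the dimension's declared type.

    Raises ValueError when the value does not satisfy the definition.
    """
    name, kind = dim["name"], dim["type"]
    if kind == "string":
        return raw
    if kind == "enum":
        if raw not in dim["values"]:
            raise ValueError(f"{name} value '{raw}' not in allowed values: {dim['values']}")
        return raw
    if kind == "number":
        number = float(raw)  # float() itself raises ValueError for non-numeric text
        if not math.isfinite(number):
            raise ValueError(f"{name} must be a finite number")
        if "min" in dim and number < dim["min"]:
            raise ValueError(f"{name} value {number} is below min {dim['min']}")
        if "max" in dim and number > dim["max"]:
            raise ValueError(f"{name} value {number} is above max {dim['max']}")
        return number
    if kind == "boolean":
        lowered = raw.strip().lower()
        if lowered not in ("true", "false"):
            raise ValueError(f"{name} must be 'true' or 'false'")
        return lowered == "true"
    raise ValueError(f"{name} has unknown type '{kind}'")
```

With the user_engagement_score definition above, "42" would convert to a float while "250" would be rejected for exceeding max.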

Indexing Strategy

Set indexed: true for dimensions you'll frequently:

  • Filter by in analytics queries
  • Group by in reports
  • Use for campaign optimization

Indexed dimensions get dedicated ClickHouse columns with names like registered_dim_content_category, registered_dim_subscription_tier.

Non-indexed dimensions are still validated but stored in a JSON column for occasional use.
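The storage split described above can be pictured with a small sketch that builds one analytics row. The registered_dim_ prefix comes from the column names shown here; the JSON column name registered_dims_json and the function build_row are hypothetical placeholders, since the actual column name is not given.

```python
import json

COLUMN_PREFIX = "registered_dim_"
JSON_COLUMN = "registered_dims_json"  # hypothetical name for the JSON column


def build_row(definitions: list[dict], values: dict) -> dict:
    """Put indexed dimensions in dedicated columns and the rest in one JSON column."""
    by_name = {d["name"]: d for d in definitions}
    row = {}
    overflow = {}
    for name, value in values.items():
        if by_name[name].get("indexed", False):
            row[COLUMN_PREFIX + name] = value
        else:
            overflow[name] = value
    row[JSON_COLUMN] = json.dumps(overflow, sort_keys=True)
    return row
```

The trade-off is the usual one: dedicated columns are cheap to filter and group on, while the JSON column avoids a schema change for dimensions that are rarely queried.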

Usage in Ad Requests

Send targeting data in the ext.kv field as usual. The ad server automatically splits registered dimensions from custom key-values:

{
  "ext": {
    "kv": {
      "content_category": "sports",      // Registered dimension
      "subscription_tier": "premium",    // Registered dimension
      "page_position": "above_fold",     // Registered dimension
      "experiment_group": "variant_a"    // Custom key-value
    }
  }
}
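The split itself is conceptually simple. A minimal sketch, assuming the set of registered names is known from the dimension configuration (split_key_values is an illustrative name, not part of OpenAdServe's API):

```python
def split_key_values(kv: dict, registered_names: set) -> tuple[dict, dict]:
    """Separate registered dimensions from custom key-values by name."""
    registered = {k: v for k, v in kv.items() if k in registered_names}
    custom = {k: v for k, v in kv.items() if k not in registered_names}
    return registered, custom


kv = {
    "content_category": "sports",
    "subscription_tier": "premium",
    "page_position": "above_fold",
    "experiment_group": "variant_a",
}
registered, custom = split_key_values(
    kv, {"content_category", "subscription_tier", "page_position"}
)
```

Here registered would hold the three configured dimensions and custom would hold only experiment_group, which passes through without validation.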

Processing Flow

  1. Request Received: Ad server receives targeting data in ext.kv
  2. Dimension Splitting: Registered dimensions separated from custom key-values
  3. Validation: Registered dimensions validated against configuration
  4. Type Conversion: Values converted to appropriate types (string, number, boolean)
  5. Targeting: Both registered dimensions and custom key-values used for ad selection
  6. Analytics: Registered dimensions stored in indexed ClickHouse columns

Error Handling

Invalid dimension values result in HTTP 400 responses with descriptive errors:

{
  "error": "dimension validation failed: content_category value 'invalid' not in allowed values: [sports, news, entertainment, tech]"
}
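For readers building test fixtures or mock servers, the sketch below reproduces the message format shown in the example response. The format is copied from that example; the function name validation_error_body and the idea of returning a status and body pair are assumptions, and the real server may construct its errors differently.

```python
import json


def validation_error_body(name: str, value: str, allowed: list[str]) -> tuple[int, str]:
    """Return an HTTP status and JSON body mimicking the documented enum error."""
    allowed_list = ", ".join(allowed)
    message = (
        f"dimension validation failed: {name} value '{value}' "
        f"not in allowed values: [{allowed_list}]"
    )
    return 400, json.dumps({"error": message})
```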

Analytics Benefits

Indexed Queries

Registered dimensions enable fast analytics queries:

-- Fast: Uses indexed column
SELECT COUNT(*) FROM impressions
WHERE registered_dim_content_category = 'sports'
AND registered_dim_subscription_tier = 'premium'

-- Slower: JSON extraction from custom key-values
SELECT COUNT(*) FROM impressions
WHERE JSONExtractString(key_values, 'custom_attr') = 'value'

Automatic Aggregation

ClickHouse materialized views can automatically aggregate by registered dimensions:

-- Auto-generated aggregation table
CREATE MATERIALIZED VIEW impressions_by_category
ENGINE = SummingMergeTree()
ORDER BY (publisher_id, registered_dim_content_category, date)
AS SELECT
    publisher_id,
    registered_dim_content_category,
    toDate(timestamp) as date,
    count() as impressions,
    sum(revenue) as revenue
FROM impressions
GROUP BY publisher_id, registered_dim_content_category, date

Line Item Targeting

Line items can target registered dimensions exactly like custom key-values:

-- Target registered dimension
INSERT INTO line_items (campaign_id, key_values)
VALUES (1, '{"content_category": "sports", "subscription_tier": "premium"}');

-- Mixed targeting (registered + custom)
INSERT INTO line_items (campaign_id, key_values)
VALUES (2, '{"content_category": "tech", "custom_segment": "high_value"}');

The targeting engine checks both registered dimensions and custom key-values transparently.
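One way to picture this transparency is a matcher that compares a line item's key_values JSON against the full ext.kv map without caring which keys are registered. The sketch assumes every pair in the line item must match exactly (AND semantics); that rule and the name line_item_matches are assumptions, not a description of the actual targeting engine.

```python
import json


def line_item_matches(line_item_key_values: str, request_kv: dict) -> bool:
    """True when every key-value pair the line item targets appears in the request."""
    wanted = json.loads(line_item_key_values)
    return all(
        key in request_kv and request_kv[key] == value
        for key, value in wanted.items()
    )


request_kv = {
    "content_category": "sports",
    "subscription_tier": "premium",
    "experiment_group": "variant_a",
}
```

Against this request, the first line item above (sports plus premium) would match, while the second (tech plus high_value) would not.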

Best Practices

Dimension Design

  • Use clear, descriptive names: content_category not cat
  • Keep enum values stable: Avoid frequent changes to registered values
  • Start conservative: Register fewer dimensions initially, expand based on usage
  • Document thoroughly: Include descriptions for all dimensions

Performance Optimization

  • Index frequently queried dimensions: Analytics, reporting, campaign optimization
  • Don't over-index: Each index adds storage and write overhead
  • Monitor usage: Remove unused dimensions periodically
  • Batch configuration changes: Update dimensions.yaml and restart during low-traffic periods

Validation Strategy

  • Use enums for controlled vocabularies: Content categories, user tiers, page positions
  • Set reasonable number ranges: Prevent extreme values that could indicate bugs
  • Provide sensible defaults: Reduce errors from missing values
  • Make optional when possible: Required dimensions can break existing integrations
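The interplay between defaults and required dimensions can be sketched as follows. It assumes a default is applied before the required check, so a required dimension with a default never fails for being absent; that ordering and the name apply_defaults are assumptions for illustration.

```python
def apply_defaults(definitions: list[dict], registered: dict) -> tuple[dict, list[str]]:
    """Fill in defaults and report required dimensions that are still missing."""
    resolved = dict(registered)
    missing = []
    for dim in definitions:
        name = dim["name"]
        if name in resolved:
            continue
        if "default" in dim:
            resolved[name] = dim["default"]
        elif dim.get("required", False):
            missing.append(name)
    return resolved, missing
```

Under this reading, giving a dimension a default is a gentler alternative to marking it required, because older integrations that omit the key keep working.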

Backward Compatibility

  • Never break existing campaigns: Ensure line items can still target dimensions after registration
  • Test thoroughly: Validate targeting still works after registering dimensions
  • Monitor error rates: Watch for validation failures after configuration changes
  • Provide migration guides: Help publishers transition smoothly

Schema Generation Tool

OpenAdServe includes a ClickHouse schema generation tool that creates the necessary DDL statements for indexed dimensions. This tool bridges the gap between dimension configuration and database schema.

Usage

# Generate schema (uses default config/dimensions.yaml)
go run tools/generate_clickhouse_schema/main.go -output schema.sql

# Include materialized views and backfill statements
go run tools/generate_clickhouse_schema/main.go -materialized-views -backfill -output schema.sql

# Preview without creating file
go run tools/generate_clickhouse_schema/main.go -dry-run
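
To give a feel for the kind of DDL involved, the sketch below emits one ALTER TABLE statement per indexed dimension. It is not the tool's implementation or its real output: the type mapping, the events table default, and the function name add_column_statements are all assumptions, so always review the generated schema.sql itself.

```python
# Assumed mapping from dimension types to ClickHouse column types.
CLICKHOUSE_TYPES = {
    "enum": "LowCardinality(String)",
    "string": "String",
    "number": "Float64",
    "boolean": "UInt8",
}


def add_column_statements(definitions: list[dict], table: str = "events") -> list[str]:
    """Build ADD COLUMN statements for indexed dimensions only."""
    statements = []
    for dim in definitions:
        if not dim.get("indexed", False):
            continue  # non-indexed dimensions do not get a dedicated column
        column = "registered_dim_" + dim["name"]
        ch_type = CLICKHOUSE_TYPES[dim["type"]]
        statements.append(
            f"ALTER TABLE {table} ADD COLUMN IF NOT EXISTS {column} {ch_type};"
        )
    return statements
```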

Deployment Workflow

# 1. Generate and review schema
go run tools/generate_clickhouse_schema/main.go -output schema.sql
cat schema.sql

# 2. Apply to ClickHouse
# Development
docker compose exec -T clickhouse clickhouse-client --multiquery < schema.sql
# Production
clickhouse-client --multiquery < schema.sql

# 3. Restart server to detect new columns
docker compose restart openadserve

# 4. Verify (optional)
clickhouse-client -q "SHOW CREATE TABLE events" | grep registered_dim_

Testing

Send an ad request and verify dimension data is stored in dedicated columns:

# Check events table after sending requests
clickhouse-client -q "
SELECT
    request_id,
    registered_dim_content_category,
    registered_dim_page_position,
    key_values['custom_attr'] as custom_attr
FROM events
WHERE request_id = 'your-request-id'
FORMAT Vertical"

Important Notes

  • Schema changes must be manually applied - not automatic
  • Server restart required after schema changes
  • New data uses dimension columns, existing data remains in key_values
  • Apply during low traffic to minimize performance impact

Troubleshooting

Validation errors: Add missing values to dimension config or use custom key-values

Slow queries: Set indexed: true for dimensions you filter or group by often so they get dedicated columns; keep indexed: false only for rarely-queried dimensions

Targeting not working: Verify line item key_values match dimension config exactly

Missing columns: Generate the schema with go run tools/generate_clickhouse_schema/main.go -output schema.sql, apply it with clickhouse-client, then restart the server

Debug: Check server logs for "dimension registry initialized" message

Example Configurations

Publisher: News Website

dimensions:
  - name: content_category
    type: enum
    values: [breaking_news, politics, sports, business, technology]
    indexed: true
    required: true
  - name: article_author
    type: string
    indexed: true
  - name: paywall_status
    type: enum
    values: [free, subscriber_only, premium]
    indexed: true
    default: "free"

Publisher: E-commerce Site

dimensions:
  - name: product_category
    type: enum
    values: [electronics, clothing, home_garden, sports, books]
    indexed: true
  - name: price_range
    type: enum
    values: [budget, mid_range, premium, luxury]
    indexed: true
  - name: user_loyalty_tier
    type: enum
    values: [new, bronze, silver, gold, platinum]
    indexed: true
    default: "new"

Publisher: Gaming Platform

dimensions:
  - name: game_genre
    type: enum
    values: [action, strategy, rpg, sports, puzzle, simulation]
    indexed: true
  - name: player_level
    type: number
    min: 1
    max: 100
    indexed: true
  - name: in_game_purchase_history
    type: boolean
    indexed: true
    default: "false"

Integration

  • API Reference - Send registered dimensions in ad requests via the ext.kv field
  • Integration Guide - JavaScript SDK usage with mixed registered dimensions and custom key-values

Analytics

  • Analytics - Query indexed dimension columns for fast campaign performance analysis
  • ClickHouse Schema Tool - Generate DDL statements for dimension columns and materialized views

Configuration

  • Targeting - Line items target registered dimensions the same way as custom key-values
  • Custom Events - Registered dimensions are automatically tracked with all custom events