🏠 OMS to D365 Integration Overview

Complete guide to building an event-driven, cloud-native integration platform from Order Management System (OMS) to Dynamics 365 Finance & Operations via Azure Integration Services (AIS). This architecture prioritizes reliability, security, and operational excellence.

🎯 Key Highlights

✅ Event-Driven Architecture
✅ Dead Letter Queue (DLQ) Handling
✅ Logic App Exception Handling
✅ Function App Exception Handling
✅ Idempotent Processing
✅ Cosmos DB State Machine
✅ Secure Event Grid Endpoint
✅ ZIP-based DIXF Import
✅ 3-Environment IaC (Dev/UAT/Prod)
✅ Application Insights Telemetry
✅ Key Vault Secret Management
✅ Bicep + Terraform IaC

📊 Big Picture Data Flow

Here's the complete journey of an order from OMS to D365:

graph LR
    OMS["🏭 OMS Order Created"] -->|"CloudEvent every 4h"| EG["📋 Event Grid Custom Topic"]
    EG -->|"Route"| SB["📨 Service Bus Topic + Subscription"]
    SB -->|"ServiceBusTrigger"| F1["⚡ Function 1 Ingestion Validate + Store"]
    F1 -->|"Upsert status=Pending"| COSMOS["Cosmos DB oms-orders"]
    COSMOS -->|"Query every 4h"| F2["Function 2 Timer Transform Batch and ZIP"]
    F2 -->|"Upload ZIP"| BLOB["Blob Storage oms-d365-payloads"]
    F2 -->|"Update status=Processed"| COSMOS
    BLOB -->|"T+4:15h List and Get"| LA["Logic App Delivery Upload and DIXF"]
    LA -->|"DIXF Upload"| D365["D365 Finance and Operations SalesOrderHeadersV2 DIXF Import"]
    F1 -.->|"Telemetry"| AI["App Insights"]
    LA -.->|"Telemetry"| AI
    F1 -.->|"Get Secrets"| KV["Key Vault"]
    LA -.->|"Get Secrets"| KV

📅 Processing Timeline

sequenceDiagram
    participant OMS as OMS System
    participant EG as Event Grid
    participant SB as Service Bus
    participant F1 as Function 1
    participant COSMOS as Cosmos DB
    participant F2 as Function 2
    participant BLOB as Blob Storage
    participant LA as Logic App
    participant D365 as D365 F&O
    Note over OMS,D365: 4-Hour Processing Window
    rect rgb(240, 248, 255)
        Note over OMS: T+0:00 - OMS Publishing Phase
        OMS->>EG: POST CloudEvents (orders collected)
        EG->>SB: Route to oms-orders-topic
        SB->>F1: ServiceBusTrigger (per message)
        F1->>COSMOS: Check duplicates (10min window)
        F1->>COSMOS: Upsert document (status: Pending)
    end
    rect rgb(255, 248, 220)
        Note over F2: T+4:00 - Batch Transform Phase
        F2->>COSMOS: Query WHERE status='Pending'
        COSMOS-->>F2: Return pending orders
        F2->>COSMOS: Update status='Processing'
        F2->>F2: Transform to D365 schema
        F2->>F2: Create ZIP archive
        F2->>BLOB: Upload ZIP file
        F2->>COSMOS: Update status='Processed'
    end
    rect rgb(240, 255, 240)
        Note over LA: T+4:15 - D365 Delivery Phase
        LA->>BLOB: List blobs (oms-d365-payloads)
        BLOB-->>LA: Return latest ZIP
        LA->>D365: Upload via DIXF connector
        D365->>D365: Execute import job
        D365-->>LA: Import completion
        LA->>COSMOS: Update delivery status
    end
    Note over D365: T+4:30 - Orders visible in D365 Sales module

🏗️ Component Interaction Overview

graph TB subgraph PUSH["Every 4 Hours - OMS Push"] P1["OMS scheduled job runs"] --> P2["Collect orders since last run"] P2 --> P3["POST CloudEvents to Event Grid"] P3 --> P4["Event Grid validates and routes"] P4 --> P5["Service Bus queues messages"] P5 --> P6["Function 1 processes each message - validate - dedup - Cosmos Pending"] end subgraph BATCH["Every 4 Hours - Transform Batch"] B1["Function 2 timer fires"] --> B2["Query Cosmos WHERE status=Pending"] B2 --> B3["Claim records - status=Processing"] B3 --> B4["Map to D365 schema"] B4 --> B5["Create ZIP archive - header.json - package.yaml - sales_orders.json - lines.json"] B5 --> B6["Upload ZIP to Blob Storage"] B6 --> B7["Update Cosmos - status=Processed - blobReference=URI"] end subgraph DELIVER["Every 4h+15min - Logic App Delivery"] D1["Logic App recurrence fires"] --> D2["List blobs in oms-d365-payloads"] D2 --> D3{"New blob found?"} D3 -->|"Yes"| D4["Download ZIP content"] D4 --> D5["Upload to D365 via connector"] D5 --> D6["Trigger DIXF import job"] D6 --> D7["Poll until complete"] D7 --> D8["Log success to App Insights"] D3 -->|"No"| D9["Terminate - no work to do"] end subgraph EXCEPTION["Exception Paths"] E1["DLQ: malformed JSON or validation failure"] --> E2["Azure Monitor alert fires"] E2 --> E3["On-call notified - manual remediation"] E4["Logic App failure"] --> E5["Exception scope catches error"] E5 --> E6["Alert email sent - App Insights failure logged"] end P6 -.-> BATCH BATCH -.-> DELIVER style PUSH fill:#0078d4,color:#fff style BATCH fill:#107c10,color:#fff style DELIVER fill:#7b4f9e,color:#fff style EXCEPTION fill:#d83b01,color:#fff

⏱️ Why Every 4 Hours?

Rationale: The 4-hour batch window balances several competing concerns:
  • Cost: Fewer Function 2 executions mean lower compute cost, and batched publishing keeps billable Service Bus operations down
  • Throughput: Batching multiple orders into a single ZIP reduces D365 DIXF import overhead
  • Timeliness: Orders reach D365 within ~4.25 hours (4h batch + 15min offset) — acceptable for most business processes
  • Rate Limiting: Respects D365 API throttling limits; single ZIP upload per 4h window
  • Idempotency: Clear processing windows reduce replay/duplicate risk
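The 4-hour cadence maps directly to an NCRONTAB expression on Function 2's timer trigger. A minimal sketch, assuming the .NET isolated worker model described later (the class and binding shape here are illustrative, not the project's actual file):

```csharp
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

public class OmsTimerTransform
{
    private readonly ILogger<OmsTimerTransform> _logger;

    public OmsTimerTransform(ILogger<OmsTimerTransform> logger) => _logger = logger;

    // NCRONTAB format: {second} {minute} {hour} {day} {month} {day-of-week}.
    // "0 0 */4 * * *" fires at 00:00, 04:00, 08:00, ... UTC — six runs per day.
    [Function("OmsTimerTransform")]
    public void Run([TimerTrigger("0 0 */4 * * *")] TimerInfo timer)
    {
        _logger.LogInformation("Transform batch started at {UtcNow}", DateTime.UtcNow);
        // Query Pending orders, transform, ZIP, upload — see the Function 2 flow below.
    }
}
```

The Logic App's recurrence is then configured 15 minutes behind this schedule to give the batch time to land in Blob Storage.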

🔄 Processing Status States

Each order moves through these states in Cosmos DB:
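The states and their allowed transitions come from the order state machine diagram later in this guide (Pending → Processing → Processed/Failed, with Failed re-queued to Pending by an admin). A hedged sketch of how that could be encoded as a guard helper — the type names are illustrative:

```csharp
using System;
using System.Collections.Generic;

// Processing states from the Cosmos DB order state machine.
public enum ProcessingStatus { Pending, Processing, Processed, Failed }

public static class OrderStateMachine
{
    // Allowed transitions, mirroring the stateDiagram in this guide.
    private static readonly Dictionary<ProcessingStatus, ProcessingStatus[]> Allowed = new()
    {
        [ProcessingStatus.Pending]    = new[] { ProcessingStatus.Processing },
        [ProcessingStatus.Processing] = new[] { ProcessingStatus.Processed, ProcessingStatus.Failed },
        [ProcessingStatus.Failed]     = new[] { ProcessingStatus.Pending },   // admin re-queue
        [ProcessingStatus.Processed]  = Array.Empty<ProcessingStatus>()       // terminal; TTL cleans up
    };

    public static bool CanTransition(ProcessingStatus from, ProcessingStatus to) =>
        Array.IndexOf(Allowed[from], to) >= 0;
}
```

Guarding transitions this way keeps an order from being claimed twice or resurrected after it has been delivered.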

📈 Expected Performance

| Metric | Value | Notes |
|---|---|---|
| Ingestion Latency | < 1 second | Service Bus trigger fires instantly; Function 1 validates & writes to Cosmos |
| End-to-End Latency | ~4-4.5 hours | Batch window (4h) + delivery offset (15min) + D365 processing (variable) |
| Throughput | 1,000+ orders/batch | ZIP size typically 2-5 MB for standard order volumes |
| DLQ Rate | < 0.1% | Only malformed events or network failures; good data quality assumed |
| Availability Target | 99.9% (three nines) | SLA: deliveries reach D365 within +30min of scheduled time |

🎓 What You'll Learn

By exploring this tutorial, you will understand how each component below fits together, from event capture through D365 delivery, and why each design decision was made.

🏛️ System Architecture

High-Level Design

Layered architecture showing data flow from OMS source through D365 destination, with observability and security cross-cutting concerns.

graph TB
    subgraph SOURCE["OMS Source System"]
        OMS[("OMS Order Management System")]
    end
    subgraph LAYER1["Layer 1 - Event Capture"]
        EG["Event Grid Custom Topic"]
        SB["Service Bus Topic plus Subscription plus DLQ"]
    end
    subgraph LAYER2["Layer 2 - Ingestion"]
        FA1["Function 1: OmsOrderIngestion - ServiceBusTrigger"]
    end
    subgraph LAYER3["Layer 3 - State Store"]
        COSMOS[("Cosmos DB - oms-orders - partition: /orderId")]
    end
    subgraph LAYER4["Layer 4 - Transform"]
        FA2["Function 2: OmsTimerTransform - Timer every 4h"]
    end
    subgraph LAYER5["Layer 5 - Stage"]
        BLOB["Blob Storage - oms-d365-payloads - ZIP files"]
    end
    subgraph LAYER6["Layer 6 - Deliver"]
        LA["Logic App Standard - oms-to-d365-delivery - Every 4h+15min"]
    end
    subgraph DEST["D365 Finance and Operations"]
        D365[("Dynamics 365 FandO - SalesOrderHeadersV2 - DIXF Import")]
    end
    subgraph OBS["Observability and Security"]
        AI["Application Insights - Structured Logs and Alerts"]
        KV["Key Vault - All Secrets and Certs"]
    end
    OMS -->|"CloudEvent OMS.Order.Created"| EG
    EG -->|"Event subscription routing"| SB
    SB -->|"ServiceBusTrigger Peek-Lock"| FA1
    FA1 -->|"Upsert status=Pending"| COSMOS
    COSMOS -->|"Query WHERE status=Pending"| FA2
    FA2 -->|"Upload ZIP package"| BLOB
    FA2 -->|"Update status=Processed"| COSMOS
    BLOB -->|"Fetch blob content"| LA
    LA -->|"DIXF import trigger"| D365
    LA -->|"Success and Failure telemetry"| AI
    FA1 -->|"Structured logs"| AI
    KV -.->|"Secrets"| FA1
    KV -.->|"Secrets"| LA
    style SOURCE fill:#1e3a5f,color:#fff
    style LAYER1 fill:#0078d4,color:#fff
    style LAYER2 fill:#107c10,color:#fff
    style LAYER3 fill:#7b4f9e,color:#fff
    style LAYER4 fill:#c75000,color:#fff
    style LAYER5 fill:#106ebe,color:#fff
    style LAYER6 fill:#d83b01,color:#fff
    style DEST fill:#00bcf2,color:#000
    style OBS fill:#2d4a2d,color:#fff

Component Overview

Layer 1 — Event Capture: OMS publishes CloudEvents to the Event Grid custom topic every 4 hours. Event Grid subscriptions route events to a Service Bus topic for reliable, durable queuing.

Layer 2 — Ingestion: Azure Functions v4 (Function 1) binds to Service Bus subscription via ServiceBusTrigger. Validates JSON schema, checks for duplicates, and upserts order documents to Cosmos DB with status "Pending".

Layer 3 — State Store: Cosmos DB NoSQL container stores order state machine. Documents partitioned by orderId; session-level consistency; 30-day TTL for automatic cleanup.

Layer 4 — Transform: Function 2 runs on timer trigger (every 4 hours). Queries all "Pending" orders, claims them (status → "Processing"), transforms to D365 schema, and creates ZIP archive.
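The batching and packaging step in Layer 4 can be sketched with `System.IO.Compression`. The helper below is illustrative (the document does not show Function 2's actual code): it builds a blob name following the Layer 5 convention and bundles the transformed payload files into an in-memory ZIP.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class BatchPackager
{
    // Blob naming convention from Layer 5: oms_yyyyMMdd_HHmmss_guid.zip
    public static string BuildBlobName(DateTime utcNow, Guid batchId) =>
        $"oms_{utcNow:yyyyMMdd_HHmmss}_{batchId:N}.zip";

    // Bundle transformed payloads (e.g. sales_orders.json, lines.json) into a ZIP.
    public static byte[] CreateZip(IReadOnlyDictionary<string, string> entries)
    {
        using var buffer = new MemoryStream();
        using (var archive = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
        {
            foreach (var (name, content) in entries)
            {
                var entry = archive.CreateEntry(name, CompressionLevel.Optimal);
                using var writer = new StreamWriter(entry.Open(), Encoding.UTF8);
                writer.Write(content);
            }
        }
        return buffer.ToArray();
    }
}
```

Building the archive in memory is reasonable at the stated batch sizes (2-5 MB); a temp-file stream would be the fallback for much larger batches.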

Layer 5 — Stage: ZIP files stored in Blob Storage container "oms-d365-payloads". Blob naming convention: oms_yyyyMMdd_HHmmss_guid.zip. Files retained for 90 days.

Layer 6 — Deliver: Logic App Standard runs every 4h + 15min offset. Lists blobs, downloads the latest ZIP, calls the D365 connector to upload, and triggers the DIXF import job. On success it records the delivery status in Cosmos DB (Function 2 has already marked the orders "Processed").

Destination — D365 F&O: Dynamics 365 receives ZIP package via DIXF (Data Import Export Framework). Executes standard import job for SalesOrderHeadersV2 entity. Orders appear in D365 Sales Order module.

Observability: Application Insights collects structured telemetry from Functions and Logic App. Key Vault holds all connection strings, API keys, and D365 service principal credentials.

Low-Level Design

Detailed component architecture showing service SKUs, configuration parameters, and inter-service communication details.

graph TD
    OMS --> EG
    EG --> SB
    SB --> TOPIC
    TOPIC --> SUB
    SUB --> F1
    SUB --> DLQ
    F1 --> COSMOS
    COSMOS --> F2
    F2 --> BLOB
    BLOB --> LA
    LA --> D365
    F1 --> AI
    LA --> AI
    KV --> F1
    KV --> LA

Key Configuration Details

🔀 Complete Data Flow

Step-by-step sequence of events showing how an order moves through the system.

Sequence Diagram: End-to-End Order Processing

sequenceDiagram
    participant OMS as 🏭 OMS
    participant EG as 📋 Event Grid
    participant SB as 📨 Service Bus
    participant F1 as ⚡ Function 1
    participant COSMOS as 🌐 Cosmos DB
    participant F2 as Function 2
    participant BLOB as Blob Storage
    participant LA as 🔗 Logic App
    participant D365 as D365 Finance and Operations
    participant DLQ as 🚨 Dead Letter Queue
    participant AI as 📈 App Insights
    rect rgb(0, 120, 212, 0.1)
        note over OMS,AI: T+0h: Order Ingestion Phase
        OMS->>EG: POST CloudEvent (OMS.Order.Created)
        EG->>SB: Route to oms-orders-topic
        SB->>F1: ServiceBusTrigger (Peek-Lock)
        F1->>F1: Validate JSON schema
        F1->>F1: Check orderId not duplicate
        F1->>COSMOS: Upsert(id=order123, status="Pending", ingestedAt=now)
        COSMOS-->>F1: Document acknowledged
        F1->>AI: Track event: "IngestionSuccess"
        F1->>SB: CompleteMessage (remove from queue)
    end
    rect rgb(107, 124, 16, 0.1)
        note over OMS,AI: T+4h: Timer Transform Phase
        activate F2
        note over F2: Timer trigger fires
        F2->>COSMOS: Query WHERE processingStatus="Pending"
        COSMOS-->>F2: Return [order123, order456, ...]
        F2->>F2: Claim batch (status→"Processing")
        F2->>COSMOS: Update orders status="Processing", pickedUpAt=now
        F2->>F2: Transform orders to D365 schema
        F2->>F2: Create ZIP archive
        F2->>BLOB: Upload ZIP (oms_20260314_093000_uuid.zip)
        BLOB-->>F2: Blob URI + SAS URL
        F2->>COSMOS: Update status="Processed", blobReference=uri
        F2->>AI: Track event: "TransformSuccess", batchSize=42
        deactivate F2
    end
    rect rgb(212, 59, 1, 0.1)
        note over OMS,AI: T+4:15h: Delivery Phase
        LA->>BLOB: List blobs (filter by time)
        BLOB-->>LA: Return latest ZIP
        LA->>BLOB: Get blob content (binary)
        LA->>D365: Call ImportSalesOrders (connector)
        D365->>D365: Execute DIXF import job
        D365-->>LA: Job ID + status
        LA->>AI: Track event: "DeliverySuccess"
    end
    rect rgb(139, 0, 0, 0.1)
        note over OMS,AI: Exception Path: DLQ Handling
        note over F1: Delivery count = 10
        F1->>DLQ: AutoDeadLetter
        DLQ->>AI: Alert: "DLQ message count > 0"
        AI->>AI: Trigger incident
    end

Data Flow Diagram — Functional Processes (DFD Level 0)

graph LR
    OMS["OMS Source System"]
    AIS["AIS Integration Platform"]
    D365["D365 Finance and Operations"]
    OMS -->|"CloudEvent Order Created"| AIS
    AIS -->|"ZIP DIXF Package Import"| D365
    style OMS fill:#1e3a5f,color:#fff
    style AIS fill:#0078d4,color:#fff
    style D365 fill:#00bcf2,color:#000

DFD Level 1 — Internal Processes

graph TB
    subgraph CAPTURE["1. Capture"]
        EG["Event Grid Topic"]
        SB["Service Bus Topic"]
    end
    subgraph INGEST["2. Ingest & Validate"]
        F1["Function 1 ServiceBusTrigger"]
        SCHEMA["JSON Schema Validation"]
        DEDUP["Duplicate Detection"]
    end
    subgraph STORE["3. Store State"]
        COSMOS["Cosmos DB State Store"]
        STATUS["Process Status Machine"]
    end
    subgraph BATCH["4. Batch & Transform"]
        F2["Function 2 Timer Trigger"]
        CLAIM["Claim Records Batch ID"]
        MAP["Transform to D365 Schema"]
        ZIP["Create ZIP Archive"]
    end
    subgraph STAGE["5. Stage"]
        BLOB["Blob Storage Processed"]
    end
    subgraph DELIVER["6. Deliver"]
        LA["Logic App Orchestration"]
        DIXF["DIXF Import D365 Connector"]
    end
    subgraph OUTPUT["7. Output"]
        D365["D365 Finance and Operations SalesOrders"]
    end
    %% Flow connections between subgraphs
    CAPTURE --> INGEST
    INGEST --> STORE
    STORE --> BATCH
    BATCH --> STAGE
    STAGE --> DELIVER
    DELIVER --> OUTPUT
    %% Internal subgraph connections
    EG --> SB
    SB --> F1
    F1 --> SCHEMA
    SCHEMA --> DEDUP
    COSMOS --> STATUS
    F2 --> CLAIM
    CLAIM --> MAP
    MAP --> ZIP
    LA --> DIXF
    style CAPTURE fill:#0078d4,color:#fff
    style INGEST fill:#107c10,color:#fff
    style STORE fill:#7b4f9e,color:#fff
    style BATCH fill:#c75000,color:#fff
    style STAGE fill:#106ebe,color:#fff
    style DELIVER fill:#d83b01,color:#fff
    style OUTPUT fill:#00bcf2,color:#000

Order State Machine Transitions

stateDiagram-v2
    direction LR
    [*] --> Pending : Function 1 ingests order
    Pending --> Processing : Function 2 claims batch every 4h
    Processing --> Processed : ZIP uploaded to Blob Storage
    Processing --> Failed : Transform exception caught
    Failed --> Pending : Admin re-queues for retry
    Processed --> [*] : TTL 30 days auto-delete
    Failed --> [*] : TTL 30 days auto-delete

Event Schema (CloudEvent → D365)

Input Event (from OMS):

{
  "specversion": "1.0",
  "type": "OMS.Order.Created",
  "source": "https://oms.contoso.com",
  "id": "12345-67890",
  "time": "2026-03-14T08:00:00Z",
  "datacontenttype": "application/json",
  "data": {
    "orderId": "ORD-20260314-001",
    "customerId": "CUST-12345",
    "orderDate": "2026-03-14",
    "totalAmount": 5000.00,
    "currency": "USD",
    "lineItems": [
      {
        "itemNumber": 1,
        "productCode": "PROD-ABC",
        "quantity": 10,
        "unitPrice": 500.00
      }
    ],
    "shippingAddress": {
      "street": "123 Main St",
      "city": "Seattle",
      "state": "WA",
      "zip": "98101",
      "country": "US"
    }
  }
}

Cosmos DB Document (after ingestion):

{
  "id": "ORD-20260314-001",
  "orderId": "ORD-20260314-001",
  "processingStatus": "Pending",
  "omsEvent": { /* full CloudEvent data */ },
  "ingestedAt": "2026-03-14T08:00:15Z",
  "pickedUpAt": null,
  "processedAt": null,
  "batchId": null,
  "blobReference": null,
  "retryCount": 0,
  "ttl": 2592000
}
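A hedged sketch of the corresponding C# document model (the actual project's model class is not shown in this guide; property names map to the JSON above via `JsonPropertyName`, and `ttl` drives Cosmos DB's automatic cleanup):

```csharp
using System;
using System.Text.Json;
using System.Text.Json.Serialization;

public class OmsOrderDocument
{
    [JsonPropertyName("id")] public string Id { get; set; } = default!;
    [JsonPropertyName("orderId")] public string OrderId { get; set; } = default!;
    [JsonPropertyName("processingStatus")] public string ProcessingStatus { get; set; } = "Pending";
    [JsonPropertyName("ingestedAt")] public DateTimeOffset IngestedAt { get; set; }
    [JsonPropertyName("pickedUpAt")] public DateTimeOffset? PickedUpAt { get; set; }
    [JsonPropertyName("processedAt")] public DateTimeOffset? ProcessedAt { get; set; }
    [JsonPropertyName("batchId")] public string? BatchId { get; set; }
    [JsonPropertyName("blobReference")] public string? BlobReference { get; set; }
    [JsonPropertyName("retryCount")] public int RetryCount { get; set; }
    // Cosmos DB container-level TTL override: 30 days in seconds.
    [JsonPropertyName("ttl")] public int Ttl { get; set; } = 2592000;
}
```

Setting `id` equal to `orderId` is what makes `UpsertItemAsync` idempotent: replaying the same event overwrites the same document rather than creating a duplicate.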

🧩 Why Azure? Component Decisions

Decision matrix showing component selection rationale and alternatives considered.

Component Selection Matrix

| Component | Why Chosen | Alternatives Considered | Decision Criteria |
|---|---|---|---|
| Event Grid Custom Topic | Push-based, CloudEvents spec, 10M+ events/s, SAS + Managed Identity auth, built-in subscription routing, no server management | Apache Kafka, Event Hubs, direct HTTP webhooks, SNS (AWS) | Native Azure, serverless, low cost for < 1M events, automatic subscription routing, CloudEvents standard compliance |
| Service Bus Standard | Reliable at-least-once delivery, peek-lock semantics, built-in DLQ, session support, 10-retry default, duplicate detection, FIFO ordering (topics with sessions) | Storage Queue, Event Hubs, RabbitMQ, Amazon SQS | Enterprise-grade messaging, DLQ critical for resilience, automatic retry + dead-lettering, cost-effective for intermittent load |
| Azure Functions v4 .NET 8 | Serverless (consumption plan), no idle cost, elastic scale 0→1000s, isolated worker process, built-in Service Bus trigger, Managed Identity native | App Service, Container Apps, Durable Functions, AWS Lambda | Minimal ops overhead, scales with message volume, excellent .NET integration, automatic trigger binding for Service Bus |
| Cosmos DB NoSQL | Schema-flexible for OMS variant structures, partition by orderId ensures hot-data locality, idempotent upsert, 10ms p99 reads, TTL for auto-cleanup, global distribution option | Azure SQL, Table Storage, MongoDB, PostgreSQL | Order documents vary in structure; SQL would require schema migration per OMS change. Upsert operation = built-in idempotency. Session consistency sufficient. |
| Blob Storage | Cheap binary storage ($0.01/GB/month cool tier), streaming upload, SAS URLs for D365 access, 90-day retention policies, ZRS replication | Azure Files, Data Lake, direct database BLOB field, SFTP server | ZIP packages are inherently binary; Blob is designed for this. Cost << SQL storage. SAS URLs are secure temporary download links. No server to manage. |
| Logic App Standard | Low-code D365 connector (built-in), visual workflow designer, retry policies per action, scope-based exception handling, Managed Identity support, VNet integration for private networking | API Management + custom code, AWS Step Functions, Zapier, Power Automate Cloud | Out-of-box D365 F&O connector saves 3+ weeks dev time. Visual design reduces bugs. Exception scopes map cleanly to business logic. |
| Application Insights | Native Azure telemetry SDK, structured logging, KQL query language, auto-correlation across services, alert rules, 90-day retention, Log Analytics integration | Datadog, Splunk, New Relic, CloudWatch (AWS) | Low instrumentation cost in Azure. KQL is powerful. Correlation IDs auto-tracked. Built-in SLA/SLO dashboards. Data is exportable (no vendor lock-in). |
| Key Vault | Centralized secret store, RBAC-based access (no shared keys), Managed Identity auto-auth, audit logging, optional hardware security module (HSM), secret rotation automation, soft-delete recovery | App Configuration, environment variables, AWS Secrets Manager, HashiCorp Vault | Zero secrets in code/configs. Managed Identity = no passwords to rotate manually. RBAC = principle of least privilege. HSM option for compliance (PCI-DSS, HIPAA). |

Design Decision Narratives

📨 Why NOT Premium Service Bus?

Cost-Benefit Analysis: Premium SKU costs $584/month (dedicated capacity) vs Standard at ~$20/month. Premium benefits:
  • 99.95% SLA vs 99.9%
  • Guaranteed throughput with partitions
  • VNet integration
Our decision: Standard is sufficient because:
  • 4-hour batch cycle = ~6,250 messages/day max = well below Standard limits
  • 99.9% SLA acceptable for order processing (not real-time payments)
  • IP firewall rules and Entra ID auth cover network security on Standard (private endpoints would require Premium)
  • Upgrade path exists if load increases

🌐 Why Session Consistency (not Strong) in Cosmos DB?

Consistency Level Tradeoff: Strong consistency guarantees linearizability but impacts latency and throughput.
  • Strong: higher p99 latency, and reads cost roughly double the RUs (served by a replica quorum)
  • Session: p99 latency < 50ms, optimal RU cost
Our decision: Session because:
  • Single-region deployment = fast propagation (< 1ms internal)
  • Function 1 reads its own writes within a session, so session consistency suffices for the dedup check
  • Function 2's "Pending" reads are eventually consistent across sessions — acceptable given the 4h batch delay
  • No cross-partition transactions needed
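The consistency level can be pinned per client. A minimal sketch, assuming Managed Identity auth; the endpoint and database/container names are illustrative:

```csharp
using Azure.Identity;
using Microsoft.Azure.Cosmos;

// Session consistency at the client level (can only relax, never strengthen,
// the account's default consistency).
var client = new CosmosClient(
    "https://cosmos-oms-prod.documents.azure.com:443/",  // hypothetical endpoint
    new DefaultAzureCredential(),
    new CosmosClientOptions
    {
        ConsistencyLevel = ConsistencyLevel.Session,
        ApplicationName  = "oms-integration"
    });

Container container = client.GetContainer("oms-db", "oms-orders");
```

Reusing one `CosmosClient` instance per process (e.g. registered as a singleton in the Function host) is what keeps the session token flowing between Function 1's write and its subsequent dedup reads.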

⚡ Why /orderId as Partition Key (not /processingStatus)?

Hot Partition Analysis: Partition key cardinality determines data distribution.
  • /processingStatus: Only 4 values → ALL "Pending" orders on same partition → hot partition → throttling (429 errors)
  • /orderId: Millions of unique values → even distribution → no hot partition
Our decision: /orderId because:
  • Function 1 writes per-order = natural distribution
  • Function 2 cross-partition query (status=Pending) only runs 6x/day (acceptable)
  • Each order's operations stay on same partition (better cache locality)
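Function 2's claim query is the one deliberate cross-partition operation. A hedged sketch using the Cosmos SDK's query iterator (`container` and the document type are assumed from the surrounding discussion):

```csharp
using Microsoft.Azure.Cosmos;

// Cross-partition query over /orderId-partitioned data; acceptable because it
// runs only six times a day, not per request.
var query = new QueryDefinition(
        "SELECT * FROM c WHERE c.processingStatus = @status")
    .WithParameter("@status", "Pending");

using FeedIterator<OmsOrderDocument> iterator =
    container.GetItemQueryIterator<OmsOrderDocument>(
        query,
        requestOptions: new QueryRequestOptions { MaxItemCount = 100 }); // page size

while (iterator.HasMoreResults)
{
    foreach (OmsOrderDocument order in await iterator.ReadNextAsync())
    {
        // Claim: flip status to "Processing" and stamp pickedUpAt before transforming.
    }
}
```

Because no partition key is supplied in `QueryRequestOptions`, the SDK fans the query out to all physical partitions, which is exactly the tradeoff the narrative above accepts.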

🔐 Why Managed Identity over Connection Strings?

Security First: Connection strings are credentials that can leak in logs, error messages, or git history.
  • Connection String Model: "DefaultEndpointProtocol=..." stored in Key Vault → retrieved at runtime → risk of exposure in memory/logs
  • Managed Identity Model: Function app has an identity → RBAC role grants "Service Bus Data Receiver" → no secret ever stored
Our decision: Managed Identity because:
  • AAD token lifetime = 1 hour (auto-refreshed)
  • No static credentials = no rotation overhead
  • Audit logs show which identity accessed what (fine-grained accountability)
  • Complies with zero-trust security model
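In code, the difference is simply which credential the SDK client receives. A sketch of the Managed Identity model for Service Bus (namespace name taken from the topology diagram later; the RBAC role "Azure Service Bus Data Receiver" must be assigned to the app's identity):

```csharp
using Azure.Identity;
using Azure.Messaging.ServiceBus;

// No connection string anywhere: DefaultAzureCredential resolves the Function
// App's Managed Identity at runtime and fetches a short-lived AAD token.
await using var client = new ServiceBusClient(
    "sb-oms-integration-prod.servicebus.windows.net",
    new DefaultAzureCredential());

ServiceBusReceiver receiver = client.CreateReceiver(
    "oms-orders-topic", "oms-d365-subscription");
```

Locally, `DefaultAzureCredential` falls back to the developer's Azure CLI or Visual Studio login, so the same code runs unchanged in every environment.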

⏰ Why Logic App Standard over Cloud?

Logic App Hosting Decision:
  • Consumption: cheaper per execution, fully serverless, but cold starts cause D365 connector latency spikes
  • Standard (Workflow Standard plan): higher baseline cost, but warm instances, predictable latency, and built-in VNet integration — better for regulated orgs
Our decision: Standard because:
  • D365 connector calls are sensitive to latency (200ms+ cold start can exceed D365 API timeout)
  • Built-in VNet integration enables private networking
  • Predictable monthly cost (no surprise execution fees)
  • Better monitoring + debugging experience

⚡ Event Grid & Secure Event Publishing

Event Grid Custom Topic endpoint security and 4-hour push cycle rationale.

Securing the Event Grid Endpoint

Several defence-in-depth layers protect the Event Grid Custom Topic from unauthorized publishers:

Authentication Flow: OMS to Event Grid

sequenceDiagram
    participant OMS as OMS Application
    participant AAD as Microsoft Entra ID
    participant EG as Event Grid Topic
    participant SB as Service Bus Topic
    Note over OMS,SB: Every 4 hours - Secure Publishing Flow
    OMS->>AAD: Request OAuth2 token (client_credentials grant)
    Note right of OMS: client_id + client_secret from Key Vault
    AAD-->>OMS: Bearer JWT Token (valid 1 hour)
    OMS->>EG: POST /api/events (Bearer token in Authorization header)
    Note right of OMS: CloudEvent payload in request body
    EG->>AAD: Validate token - check EventGrid Data Sender role
    AAD-->>EG: Token valid - role confirmed
    EG-->>OMS: HTTP 200 OK - Event accepted
    EG->>SB: Route event to oms-orders-topic subscription
    SB-->>EG: Delivery confirmed
    Note over OMS,SB: Process repeats every 4 hours

Event Grid Security Layers (Defence-in-Depth)

graph LR
    subgraph OMS_SIDE["OMS Side"]
        OMS_APP["OMS Application App Registration in Entra ID"]
    end
    subgraph AUTH_LAYER["Authentication Layer"]
        ENTRA["Microsoft Entra ID Token Endpoint EventGrid Data Sender Role Assignment"]
    end
    subgraph NETWORK_LAYER["Network Layer"]
        IP_RULE["IP Firewall Rules Whitelist OMS IP ranges Deny all others"]
        PE["Private Endpoint Prod environment No public internet"]
    end
    subgraph EG_LAYER["Event Grid"]
        EG_TOPIC["Event Grid Custom Topic oms-events-prod CloudEvents 1.0 Schema Input validation"]
    end
    subgraph DEST_LAYER["Destination"]
        SB_DEST["Service Bus Topic oms-orders-topic Filtered subscription OMS.Order.Created only"]
    end
    OMS_APP -->|"1. Get AAD token"| ENTRA
    ENTRA -->|"2. JWT Bearer token"| OMS_APP
    OMS_APP -->|"3. POST CloudEvent with Bearer token"| IP_RULE
    IP_RULE -->|"4. IP allowed"| EG_TOPIC
    PE -.->|"Alt: Private network"| EG_TOPIC
    EG_TOPIC -->|"5. Route matching events"| SB_DEST
    style OMS_SIDE fill:#1e3a5f,color:#fff
    style AUTH_LAYER fill:#7b4f9e,color:#fff
    style NETWORK_LAYER fill:#d83b01,color:#fff
    style EG_LAYER fill:#0078d4,color:#fff
    style DEST_LAYER fill:#107c10,color:#fff

CloudEvents Payload Format

OMS publishes CloudEvents (CNCF standard) batched in HTTP POST every 4 hours:

[
  {
    "specversion": "1.0",
    "type": "OMS.Order.Created",
    "source": "https://oms.contoso.com",
    "id": "order-uuid-1234",
    "time": "2026-03-14T12:00:00Z",
    "datacontenttype": "application/json",
    "data": {
      "orderId": "ORD-001",
      "customerId": "CUST-001",
      "totalAmount": 5000.00,
      "lineItems": []
    }
  }
]
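On the OMS side, the publish step can be sketched with the `Azure.Messaging.EventGrid` SDK, which handles CloudEvents serialization and token acquisition. The topic endpoint below is hypothetical; the caller's identity needs the "EventGrid Data Sender" role:

```csharp
using Azure.Identity;
using Azure.Messaging;
using Azure.Messaging.EventGrid;

// Entra ID auth instead of a topic access key.
var client = new EventGridPublisherClient(
    new Uri("https://oms-events-prod.eventgrid.azure.net/api/events"),  // hypothetical topic
    new DefaultAzureCredential());

// One batch per 4-hour window; each order becomes a CloudEvent.
var events = new[]
{
    new CloudEvent(
        source: "https://oms.contoso.com",
        type: "OMS.Order.Created",
        jsonSerializableData: new { orderId = "ORD-001", customerId = "CUST-001", totalAmount = 5000.00 })
};

await client.SendEventsAsync(events);
```

Sending the whole window's orders in one `SendEventsAsync` call keeps the publish atomic from OMS's point of view; Event Grid still fans them out as individual messages to Service Bus.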

⏱️ Why Every 4 Hours?

| Factor | 4-Hour Window |
|---|---|
| Cost | 6 Function 2 runs/day; low compute cost |
| E2E Latency | ~4.25 hours (batch + 15min offset); acceptable for order processing |
| Batch Size | 50-500 orders typical; 1-2 MB ZIP |
| D365 Throttling | 1 ZIP upload every 4h; well below 200/min import limit |
| Idempotency | Clear window boundaries make dedup logic simple |

📨 Service Bus & Dead Letter Queue

At-least-once delivery, peek-lock semantics, and automatic dead-lettering for failed messages.

Service Bus Configuration

| Component | Configuration | Rationale |
|---|---|---|
| Namespace SKU | Standard | Cost-effective; 40M messages/month included |
| Topic | Partitioned; TTL 14 days; duplicate detection 10 min | Partitioning scales throughput; duplicate detection prevents replay |
| Subscription | maxDelivery=10; lockDuration=5min; DLQ enabled | Up to 10 delivery attempts, each under a 5-minute lock; exhausted messages auto-dead-letter |
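The subscription settings in the table map directly to `ServiceBusAdministrationClient` options. A hedged provisioning sketch (in practice this project provisions via Bicep/Terraform; the SDK form is shown for clarity):

```csharp
using System;
using Azure.Identity;
using Azure.Messaging.ServiceBus.Administration;

var admin = new ServiceBusAdministrationClient(
    "sb-oms-integration-prod.servicebus.windows.net",
    new DefaultAzureCredential());

var options = new CreateSubscriptionOptions("oms-orders-topic", "oms-d365-subscription")
{
    MaxDeliveryCount = 10,                          // auto-dead-letter after 10 attempts
    LockDuration = TimeSpan.FromMinutes(5),         // peek-lock window per attempt
    DeadLetteringOnMessageExpiration = true         // expired messages also land in the DLQ
};

await admin.CreateSubscriptionAsync(options);
```

Keeping these values in IaC rather than ad-hoc scripts is what makes the Dev/UAT/Prod environments identical.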

Service Bus Namespace Topology

graph TB
    subgraph NS["Service Bus Namespace: sb-oms-integration-prod - SKU: Standard"]
        subgraph TOPIC["Topic: oms-orders-topic - Partitioned - TTL 14 days - Dup Detection 10min"]
            SUB["Subscription: oms-d365-subscription - MaxDelivery: 10 - LockDuration: 5min"]
            DLQ_MAIN["Dead Letter Queue - oms-d365-subscription/$deadletterqueue - TTL: 14 days"]
            MON_SUB["Subscription: oms-dlq-monitor - For alerting and remediation"]
        end
    end
    EG_IN["Event Grid - Source"] -->|"Route OMS.Order.Created"| TOPIC
    SUB -->|"ServiceBusTrigger - Peek-Lock"| FA["Function 1: OmsOrderIngestion"]
    SUB -.->|"After 10 delivery failures"| DLQ_MAIN
    DLQ_MAIN -.->|"Monitor and alert"| MON_SUB
    style NS fill:#106ebe,color:#fff
    style TOPIC fill:#0078d4,color:#fff

Dead Letter Queue (DLQ) Flow

Messages are sent to the DLQ when:
  • the JSON body cannot be deserialised (poison message — dead-lettered immediately, no retry)
  • schema or business-rule validation fails (dead-lettered immediately, no retry)
  • transient failures (e.g. Cosmos DB errors) exhaust the maximum delivery count of 10 (Service Bus auto-dead-letters)

DLQ Message Processing Flow

flowchart TD
    MSG["Message arrives on Service Bus"] --> PEEK["Function 1 receives message - Peek-Lock mode - Lock: 5 minutes"]
    PEEK --> PARSE{"JSON Deserialisation"}
    PARSE -->|"JsonException"| DLQ_PARSE["DeadLetterMessage immediately - Reason: JsonDeserialiseFailure - No retry possible"]
    PARSE -->|"Success"| VALIDATE{"Schema Validation"}
    VALIDATE -->|"ValidationException"| DLQ_VALID["DeadLetterMessage immediately - Reason: ValidationFailure - No retry possible"]
    VALIDATE -->|"Valid"| DEDUP{"Duplicate Check - orderId in Cosmos DB?"}
    DEDUP -->|"Exists already"| COMPLETE_DUP["CompleteMessage - Idempotent discard - Order already stored"]
    DEDUP -->|"New order"| COSMOS_WRITE{"Write to Cosmos DB"}
    COSMOS_WRITE -->|"Success"| COMPLETE_OK["CompleteMessage - Order stored as Pending - SUCCESS"]
    COSMOS_WRITE -->|"CosmosException transient"| ABANDON["AbandonMessage - Delivery count incremented - Service Bus retries"]
    ABANDON --> RETRY_CHECK{"Delivery Count = 10?"}
    RETRY_CHECK -->|"No"| PEEK
    RETRY_CHECK -->|"Yes"| DLQ_AUTO["Service Bus AUTO Dead-Letters - oms-d365-subscription slash deadletterqueue"]
    DLQ_AUTO --> ALERT["Azure Monitor Alert fires - DLQ count greater than 0 - Page on-call engineer"]
    ALERT --> REMEDIATE["Ops Team Remediation - Review DLQ message - Fix source data - Re-publish to Event Grid"]
    style DLQ_PARSE fill:#d83b01,color:#fff
    style DLQ_VALID fill:#d83b01,color:#fff
    style DLQ_AUTO fill:#d83b01,color:#fff
    style COMPLETE_OK fill:#107c10,color:#fff
    style COMPLETE_DUP fill:#107c10,color:#fff
    style ALERT fill:#c75000,color:#fff
    style REMEDIATE fill:#7b4f9e,color:#fff

DLQ Monitoring Alert

Azure Monitor Alert: If DLQ message count > 0 for 5 min, trigger high-severity alert. Action: Page ops on-call, create incident ticket.
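During remediation, the on-call engineer needs to see the dead-letter reason before re-publishing fixed events. A hedged inspection sketch using the Service Bus SDK's dead-letter sub-queue (namespace and entity names from the topology above):

```csharp
using System;
using Azure.Identity;
using Azure.Messaging.ServiceBus;

await using var client = new ServiceBusClient(
    "sb-oms-integration-prod.servicebus.windows.net",
    new DefaultAzureCredential());

// Address the subscription's dead letter sub-queue directly.
ServiceBusReceiver dlqReceiver = client.CreateReceiver(
    "oms-orders-topic", "oms-d365-subscription",
    new ServiceBusReceiverOptions { SubQueue = SubQueue.DeadLetter });

// Peek (non-destructive) so inspection never loses evidence.
foreach (ServiceBusReceivedMessage msg in await dlqReceiver.PeekMessagesAsync(maxMessages: 10))
{
    Console.WriteLine(
        $"{msg.MessageId}: {msg.DeadLetterReason} - {msg.DeadLetterErrorDescription}");
}
```

`DeadLetterReason` carries the values set by Function 1 (`JsonDeserialiseFailure`, `ValidationFailure`) or `MaxDeliveryCountExceeded` for Service Bus auto-dead-lettering, which tells the team whether to fix source data or investigate a transient outage.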

Managed Identity Authentication

Function App accesses Service Bus via Managed Identity (no connection strings):

[Function("OmsOrderIngestion")]
public async Task Run(
    [ServiceBusTrigger("oms-orders-topic", "oms-d365-subscription",
        Connection = "ServiceBusConnection")]
    ServiceBusReceivedMessage message,
    FunctionContext context)
{
    // With the app setting ServiceBusConnection__fullyQualifiedNamespace set to
    // the namespace host name, the binding authenticates via Managed Identity —
    // no connection string is ever stored.
}

⚡ Azure Functions: Ingestion and Transformation

Two serverless functions handle real-time validation, state management, and batch transformation with comprehensive exception handling.

Function 1: OmsOrderIngestion

Trigger: Azure Service Bus (ServiceBusTrigger)  |  Runtime: .NET 8 Isolated  |  Timeout: 5 minutes

Processing Flow Diagram

flowchart TD
    START["Service Bus Message Received - Peek-Lock acquired - Lock duration: 5 minutes"] --> SCOPE_LOG["Begin structured log scope - CorrelationId, MessageId, SequenceNumber, EnqueuedTime"]
    SCOPE_LOG --> PARSE{"Deserialise JSON body - JsonSerializer.Deserialize"}
    PARSE -->|"JsonException"| DLQ1["DeadLetterMessageAsync - Reason: JsonDeserialiseFailure - No retry - message is poison"]
    PARSE -->|"Success - OmsOrderEvent object"| VALIDATE{"Validate business rules - OmsOrderValidator.Validate"}
    VALIDATE -->|"Fails validation"| DLQ2["DeadLetterMessageAsync - Reason: ValidationFailure - Details: validation errors"]
    VALIDATE -->|"Passes"| DEDUP{"Check Cosmos DB - orderId already exists?"}
    DEDUP -->|"CosmosException transport error"| ABANDON1["AbandonMessageAsync - Delivery count plus 1 - Will retry up to 10 times"]
    DEDUP -->|"Document found - duplicate"| COMPLETE1["CompleteMessageAsync - Idempotent success - Log: duplicate skipped"]
    DEDUP -->|"Null - new order"| UPSERT{"UpsertItemAsync to Cosmos DB - status: Pending - ingestedAt: utcNow"}
    UPSERT -->|"CosmosException"| ABANDON2["AbandonMessageAsync - Delivery count plus 1 - Retry with backoff"]
    UPSERT -->|"Success"| TELEMETRY["Log IngestionSuccess to App Insights - OrderId, CorrelationId, Duration"]
    TELEMETRY --> COMPLETE2["CompleteMessageAsync - Message removed from queue - SUCCESS"]
    style DLQ1 fill:#d83b01,color:#fff
    style DLQ2 fill:#d83b01,color:#fff
    style COMPLETE1 fill:#107c10,color:#fff
    style COMPLETE2 fill:#107c10,color:#fff
    style ABANDON1 fill:#c75000,color:#fff
    style ABANDON2 fill:#c75000,color:#fff

Full C# Implementation

using Azure.Messaging.ServiceBus;
using FunctionApp.OmsIntegration.Models;
using FunctionApp.OmsIntegration.Services;
using FunctionApp.OmsIntegration.Validators;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;
using System.Text.Json;

namespace FunctionApp.OmsIntegration.Functions;

public class OmsOrderIngestionFunction
{
    private readonly ICosmosDbService _cosmosDbService;
    private readonly ILogger<OmsOrderIngestionFunction> _logger;
    private static readonly JsonSerializerOptions _jsonOptions = new()
    {
        PropertyNameCaseInsensitive = true
    };

    public OmsOrderIngestionFunction(
        ICosmosDbService cosmosDbService,
        ILogger<OmsOrderIngestionFunction> logger)
    {
        _cosmosDbService = cosmosDbService;
        _logger = logger;
    }

    [Function(nameof(OmsOrderIngestionFunction))]
    public async Task Run(
        [ServiceBusTrigger(
            "%ServiceBusTopicName%",
            "%ServiceBusSubscriptionName%",
            Connection = "ServiceBusConnection")]
        ServiceBusReceivedMessage message,
        ServiceBusMessageActions  messageActions)
    {
        var correlationId = message.CorrelationId ?? Guid.NewGuid().ToString();

        using var logScope = _logger.BeginScope(new Dictionary<string, object>
        {
            ["CorrelationId"]  = correlationId,
            ["MessageId"]      = message.MessageId,
            ["DeliveryCount"]  = message.DeliveryCount,
            ["EnqueuedTime"]   = message.EnqueuedTime
        });

        _logger.LogInformation("OmsOrderIngestionFunction started. MessageId={MessageId}", message.MessageId);

        // STEP 1: Deserialise JSON
        OmsOrderEvent? omsEvent;
        try
        {
            omsEvent = JsonSerializer.Deserialize<OmsOrderEvent>(
                message.Body.ToString(), _jsonOptions);
        }
        catch (JsonException ex)
        {
            _logger.LogError(ex, "JSON deserialisation failed for MessageId={MessageId}", message.MessageId);
            await messageActions.DeadLetterMessageAsync(
                message,
                deadLetterReason: "JsonDeserialiseFailure",
                deadLetterErrorDescription: ex.Message);
            return;
        }

        if (omsEvent is null)
        {
            await messageActions.DeadLetterMessageAsync(message,
                deadLetterReason: "NullPayload",
                deadLetterErrorDescription: "Deserialised event was null");
            return;
        }

        // STEP 2: Validate business rules
        var validationResult = OmsOrderValidator.Validate(omsEvent);
        if (!validationResult.IsValid)
        {
            _logger.LogWarning("Validation failed for OrderId={OrderId}: {Errors}",
                omsEvent.OrderId, string.Join("; ", validationResult.Errors));
            await messageActions.DeadLetterMessageAsync(message,
                deadLetterReason: "ValidationFailure",
                deadLetterErrorDescription: string.Join("; ", validationResult.Errors));
            return;
        }

        // STEP 3: Idempotency check
        OmsOrderDocument? existing = null;
        try
        {
            existing = await _cosmosDbService.GetOrderByIdAsync(omsEvent.OrderId);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Cosmos DB read failed for OrderId={OrderId}", omsEvent.OrderId);
            await messageActions.AbandonMessageAsync(message);
            return;
        }

        if (existing is not null)
        {
            _logger.LogInformation("Duplicate: OrderId={OrderId} already in Cosmos DB. Completing.", omsEvent.OrderId);
            await messageActions.CompleteMessageAsync(message);
            return;
        }

        // STEP 4: Persist to Cosmos DB
        var document = new OmsOrderDocument
        {
            Id               = omsEvent.OrderId,
            OrderId          = omsEvent.OrderId,
            ProcessingStatus = "Pending",
            OmsEvent         = omsEvent,
            IngestedAt       = DateTimeOffset.UtcNow,
            RetryCount       = 0
        };

        try
        {
            await _cosmosDbService.UpsertOrderAsync(document);
            _logger.LogInformation("Ingested OrderId={OrderId} with status=Pending", omsEvent.OrderId);
            await messageActions.CompleteMessageAsync(message);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Failed to upsert OrderId={OrderId}. Abandoning.", omsEvent.OrderId);
            await messageActions.AbandonMessageAsync(message);
        }
    }
}
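The `OmsOrderValidator` called in STEP 2 isn't shown in this guide. Below is a minimal sketch of what it and its `ValidationResult` might look like; the event and line-item shapes here are trimmed-down assumptions (the real `OmsOrderEvent` carries more fields, as the sample Cosmos DB document later shows), and the rules are illustrative, not the production rule set:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Trimmed-down assumed shapes, for illustration only
public sealed record OmsLineItem(int ItemNumber, string ProductCode, decimal Quantity, decimal UnitPrice);
public sealed record OmsOrderEvent(string OrderId, string CustomerId, List<OmsLineItem> LineItems);

public sealed record ValidationResult(bool IsValid, IReadOnlyList<string> Errors);

public static class OmsOrderValidator
{
    public static ValidationResult Validate(OmsOrderEvent e)
    {
        var errors = new List<string>();

        // Required identifiers
        if (string.IsNullOrWhiteSpace(e.OrderId))    errors.Add("OrderId is required");
        if (string.IsNullOrWhiteSpace(e.CustomerId)) errors.Add("CustomerId is required");

        // At least one line item, each with a positive quantity
        if (e.LineItems is null || e.LineItems.Count == 0)
            errors.Add("Order must contain at least one line item");
        else
            errors.AddRange(e.LineItems
                .Where(li => li.Quantity <= 0)
                .Select(li => $"Line {li.ItemNumber}: quantity must be positive"));

        return new ValidationResult(errors.Count == 0, errors);
    }
}
```

Dead-lettering joins the errors with "; " (as STEP 2 does), so keeping each rule's message short makes the DLQ reason readable.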

Function 2: OmsTimerTransform

Trigger: Timer CRON (0 0 */4 * * *)  |  Runtime: .NET 8 Isolated  |  Timeout: 30 minutes

Processing Flow Diagram

flowchart TD
    TIMER["Timer fires every 4 hours - CRON: 0 0 */4 * * * - Runs at 00:00 04:00 08:00 12:00 16:00 20:00 UTC"] --> QUERY["Query Cosmos DB - WHERE processingStatus = Pending - ORDER BY ingestedAt ASC - Cross-partition query"]
    QUERY -->|"No Pending orders"| EXIT["Log: No pending orders - Function exits - Next run in 4 hours"]
    QUERY -->|"N orders found"| CLAIM["IDEMPOTENCY STEP 1 - Bulk update status = Processing - Set batchId = new GUID - Set pickedUpAt = utcNow"]
    CLAIM --> MAP["Transform each order - OmsToD365Mapper.Map - OmsOrderEvent to D365SalesOrder - Map line items to D365SalesOrderLine"]
    MAP --> BUILD_ZIP["Build ZIP archive in MemoryStream - header.json - package.yaml - sales_orders.json - sales_order_lines.json"]
    BUILD_ZIP --> UPLOAD{"Upload ZIP to Blob Storage - Container: oms-d365-payloads - Name: oms_yyyyMMdd_HHmmss_guid.zip"}
    UPLOAD -->|"BlobServiceException"| FAIL["Update status = Failed - retryCount++ - Log error to App Insights - Alert on-call"]
    UPLOAD -->|"Success"| MARK["IDEMPOTENCY STEP 2 - Bulk update status = Processed - Set processedAt = utcNow - Set blobReference = blob URI"]
    MARK --> TELEMETRY["Log TransformSuccess to App Insights - batchId, batchSize, duration, blobName"]
    style EXIT fill:#107c10,color:#fff
    style TELEMETRY fill:#107c10,color:#fff
    style FAIL fill:#d83b01,color:#fff

ZIP Package Contents

Every ZIP file contains 4 files required by D365 DIXF:
header.json — Package envelope with batch metadata and timestamp
package.yaml — DIXF manifest listing entities and import sequence
sales_orders.json — Transformed D365 SalesOrderHeadersV2 records
sales_order_lines.json — Transformed D365 SalesOrderLinesV2 records

Full C# Implementation

using FunctionApp.OmsIntegration.Mappers;
using FunctionApp.OmsIntegration.Models;
using FunctionApp.OmsIntegration.Services;
using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;
using System.IO.Compression;
using System.Text;
using System.Text.Json;

namespace FunctionApp.OmsIntegration.Functions;

public class OmsTimerTransformFunction
{
    private readonly ICosmosDbService    _cosmosDbService;
    private readonly IBlobStorageService _blobStorageService;
    private readonly ILogger<OmsTimerTransformFunction> _logger;

    private static readonly JsonSerializerOptions _jsonOptions = new()
    {
        WriteIndented        = true,
        PropertyNamingPolicy = JsonNamingPolicy.CamelCase
    };

    public OmsTimerTransformFunction(
        ICosmosDbService    cosmosDbService,
        IBlobStorageService blobStorageService,
        ILogger<OmsTimerTransformFunction> logger)
    {
        _cosmosDbService    = cosmosDbService;
        _blobStorageService = blobStorageService;
        _logger             = logger;
    }

    // CRON: fires at 00:00, 04:00, 08:00, 12:00, 16:00, 20:00 UTC
    [Function(nameof(OmsTimerTransformFunction))]
    public async Task Run([TimerTrigger("0 0 */4 * * *")] TimerInfo timerInfo)
    {
        var batchId   = Guid.NewGuid().ToString("N")[..8];
        var startTime = DateTimeOffset.UtcNow;

        _logger.LogInformation("OmsTimerTransformFunction started. BatchId={BatchId}", batchId);

        // STEP 1: Query pending orders
        var pendingOrders = await _cosmosDbService.GetPendingOrdersAsync();

        if (!pendingOrders.Any())
        {
            _logger.LogInformation("No pending orders found. Exiting. BatchId={BatchId}", batchId);
            return;
        }

        _logger.LogInformation("Found {Count} pending orders. BatchId={BatchId}", pendingOrders.Count, batchId);

        // STEP 2: IDEMPOTENCY — claim records immediately
        await _cosmosDbService.ClaimOrdersForProcessingAsync(pendingOrders, batchId);

        // STEP 3: Map OMS orders to D365 schema
        var salesOrders     = pendingOrders.Select(o => OmsToD365Mapper.MapHeader(o.OmsEvent)).ToList();
        var salesOrderLines = pendingOrders.SelectMany(o => OmsToD365Mapper.MapLines(o.OmsEvent)).ToList();

        // STEP 4: Build ZIP archive
        var blobName = $"oms_{startTime:yyyyMMdd_HHmmss}_{batchId}.zip";
        using var zipStream = new MemoryStream();

        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Create, leaveOpen: true))
        {
            // header.json
            var header = new PackageHeader
            {
                BatchId    = batchId,
                BatchSize  = pendingOrders.Count,
                CreatedAt  = startTime,
                ExportedBy = "OmsTimerTransformFunction"
            };
            AddJsonEntry(archive, "header.json", header);

            // package.yaml
            AddTextEntry(archive, "package.yaml",
                "Name: OMS-D365-Integration\n" +
                "Entities:\n" +
                "  - name: SalesOrderHeadersV2\n" +
                "    file: sales_orders.json\n" +
                "  - name: SalesOrderLinesV2\n" +
                "    file: sales_order_lines.json\n");

            // sales_orders.json
            AddJsonEntry(archive, "sales_orders.json", salesOrders);

            // sales_order_lines.json
            AddJsonEntry(archive, "sales_order_lines.json", salesOrderLines);
        }

        // STEP 5: Upload ZIP to Blob Storage
        zipStream.Position = 0;
        try
        {
            await _blobStorageService.UploadAsync(blobName, zipStream);
            _logger.LogInformation("ZIP uploaded: {BlobName}. BatchId={BatchId}", blobName, batchId);
        }
        catch (Exception ex)
        {
            _logger.LogError(ex, "Upload failed for BatchId={BatchId}. Marking orders Failed.", batchId);
            await _cosmosDbService.MarkOrdersFailedAsync(pendingOrders.Select(o => o.OrderId), batchId);
            throw;
        }

        // STEP 6: IDEMPOTENCY — mark orders as processed
        await _cosmosDbService.MarkOrdersProcessedAsync(
            pendingOrders.Select(o => o.OrderId),
            batchId,
            blobReference: blobName);

        var duration = DateTimeOffset.UtcNow - startTime;
        _logger.LogInformation(
            "TransformSuccess. BatchId={BatchId} BatchSize={BatchSize} Duration={Duration}ms BlobName={BlobName}",
            batchId, pendingOrders.Count, (int)duration.TotalMilliseconds, blobName);
    }

    private static void AddJsonEntry<T>(ZipArchive archive, string fileName, T obj)
    {
        var entry = archive.CreateEntry(fileName, CompressionLevel.Optimal);
        using var writer = new StreamWriter(entry.Open(), Encoding.UTF8);
        writer.Write(JsonSerializer.Serialize(obj, _jsonOptions));
    }

    private static void AddTextEntry(ZipArchive archive, string fileName, string content)
    {
        var entry = archive.CreateEntry(fileName, CompressionLevel.Optimal);
        using var writer = new StreamWriter(entry.Open(), Encoding.UTF8);
        writer.Write(content);
    }
}
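Because the archive is built entirely in a `MemoryStream`, the packaging logic is easy to verify without touching Blob Storage: write entries the same way `AddTextEntry` does, rewind the stream (just as the function does before upload), and read the archive back. A sketch (the helper name here is ours, not part of the function app):

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Text;

public static class ZipRoundTrip
{
    // Build a ZIP in memory the same way the function does, then list its entries.
    public static IReadOnlyList<string> BuildAndListEntries(IDictionary<string, string> files)
    {
        using var stream = new MemoryStream();
        using (var archive = new ZipArchive(stream, ZipArchiveMode.Create, leaveOpen: true))
        {
            foreach (var (name, content) in files)
            {
                var entry = archive.CreateEntry(name, CompressionLevel.Optimal);
                using var writer = new StreamWriter(entry.Open(), Encoding.UTF8);
                writer.Write(content);
            }
        }

        stream.Position = 0; // rewind before reading, just like before the blob upload
        using var readBack = new ZipArchive(stream, ZipArchiveMode.Read);
        return readBack.Entries.Select(e => e.FullName).ToList();
    }
}
```

A test can then assert that exactly the four DIXF entries listed above are present before any environment is involved.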

🔗 Logic App Workflow: D365 Delivery Orchestration

Low-code workflow that schedules every 4 hours and delivers processed orders to D365.

Workflow Steps (Visual Representation)

Recurrence Trigger
Every 4 hours; offset +15 min (so fires at 00:15, 04:15, 08:15, etc.)
🔧
Initialize Variables
Set WorkflowRunId, StartTime, BlobFound=false, D365Status="Pending"
📁
List Blobs in oms-d365-payloads
Filter: blobs created in last 4 hours; sort by timestamp DESC; take latest
Condition: New Blob Found?
If blobs count > 0, proceed to download; else terminate

✅ TRUE Branch: Blob Found

🔧
Set BlobName Variable
Extract latest blob name from list (outputs[0].Name)
📥
Get Blob Content
Azure Blob Storage connector; retrieve binary ZIP file
➡️
Scope: Main Processing Flow
Container for upload & import actions (for exception handling)
📤
Call D365: SalesOrderHeadersV2
Dynamics 365 Finance & Operations connector; POST ZIP content
⚙️
Trigger DIXF Import Job
D365 connector: start import job with job definition ID
🔄
Wait for Job Completion
Poll D365 import job status every 30 sec; timeout after 10 min
Log Success Telemetry
Track event: DeliverySuccess; include BlobName, JobId, Duration
🚨
Scope: Exception Handling
Runs if Main Processing throws error
📊
Log Failure Telemetry
Track event: DeliveryFailure; include error details, blob name
📧
Send Alert Email
To: ops@contoso.com; Subject: "D365 Delivery Failed"; include error trace
⏹️
Terminate (Failure)
Set run status = Failed; this triggers Logic App alert rule

❌ FALSE Branch: No Blob

⏭️
Terminate (Success)
No blobs to process; normal during off-hours. Exit cleanly.

Logic App Workflow JSON Definition

{
  "definition": {
    "$schema": "https://schema.management.azure.com/providers/Microsoft.Logic/schemas/2016-06-01/workflowdefinition.json#",
    "actions": {
      "Initialize_StartTime": {
        "type": "InitializeVariable",
        "inputs": {
          "variables": [
            { "name": "StartTime", "type": "string", "value": "@utcNow()" }
          ]
        },
        "runAfter": {}
      },
      "Initialize_BlobName": {
        "type": "InitializeVariable",
        "inputs": {
          "variables": [
            { "name": "BlobName", "type": "string", "value": "" }
          ]
        },
        "runAfter": { "Initialize_StartTime": ["Succeeded"] }
      },
      "Initialize_D365Status": {
        "type": "InitializeVariable",
        "inputs": {
          "variables": [
            { "name": "D365Status", "type": "string", "value": "Pending" }
          ]
        },
        "runAfter": { "Initialize_BlobName": ["Succeeded"] }
      },
      "List_Blobs": {
        "type": "ApiConnection",
        "inputs": {
          "host": {
            "connection": { "name": "@parameters('$connections')['azureblob']['connectionId']" }
          },
          "method": "get",
          "path": "/datasets/default/foldersV2/@{encodeURIComponent(encodeURIComponent('oms-d365-payloads'))}",
          "queries": {
            "useFlatListing": false,
            "pageSize": 10,
            "searchPattern": "oms_*.zip"
          }
        },
        "runAfter": { "Initialize_D365Status": ["Succeeded"] }
      },
      "Condition_BlobFound": {
        "type": "If",
        "expression": {
          "and": [
            {
              "greater": [
                "@length(outputs('List_Blobs')?['body/value'])",
                0
              ]
            }
          ]
        },
        "actions": {
          "Set_BlobName": {
            "type": "SetVariable",
            "inputs": {
              "name": "BlobName",
              "value": "@outputs('List_Blobs')?['body/value'][0]['Name']"
            }
          },
          "Get_Blob_Content": {
            "type": "ApiConnection",
            "inputs": {
              "host": {
                "connection": { "name": "@parameters('$connections')['azureblob']['connectionId']" }
              },
              "method": "get",
              "path": "/datasets/default/files/@{encodeURIComponent(encodeURIComponent(concat('oms-d365-payloads/', variables('BlobName'))))}",
              "queries": { "inferContentType": true }
            },
            "runAfter": { "Set_BlobName": ["Succeeded"] }
          },
          "Scope_MainFlow": {
            "type": "Scope",
            "actions": {
              "Upload_to_D365": {
                "type": "ApiConnection",
                "inputs": {
                  "host": {
                    "connection": { "name": "@parameters('$connections')['dynamicscrmonline']['connectionId']" }
                  },
                  "method": "post",
                  "path": "/api/data/v9.2/SalesOrderHeadersV2",
                  "body": "@outputs('Get_Blob_Content')?['body']",
                  "headers": { "Content-Type": "application/zip" }
                }
              },
              "Trigger_DIXF_Import": {
                "type": "ApiConnection",
                "inputs": {
                  "host": {
                    "connection": { "name": "@parameters('$connections')['dynamicscrmonline']['connectionId']" }
                  },
                  "method": "post",
                  "path": "/api/data/v9.2/dmf_importexecution",
                  "body": {
                    "dmf_jobdefinitionid": "@parameters('DixfJobDefinitionId')",
                    "dmf_sourcename": "@variables('BlobName')"
                  }
                },
                "runAfter": { "Upload_to_D365": ["Succeeded"] }
              },
              "Wait_for_JobCompletion": {
                "type": "Until",
                "expression": "@or(equals(variables('D365Status'), 'Completed'), equals(variables('D365Status'), 'Failed'))",
                "limit": {
                  "count": 20,
                  "timeout": "PT10M"
                },
                "actions": {
                  "Get_Job_Status": {
                    "type": "ApiConnection",
                    "inputs": {
                      "host": { "connection": { "name": "@parameters('$connections')['dynamicscrmonline']['connectionId']" } },
                      "method": "get",
                      "path": "/api/data/v9.2/dmf_importexecution(@{outputs('Trigger_DIXF_Import')?['body/dmf_importexecutionid']})"
                    }
                  },
                  "Set_D365Status": {
                    "type": "SetVariable",
                    "inputs": {
                      "name": "D365Status",
                      "value": "@{outputs('Get_Job_Status')?['body/dmf_executionstatus']}"
                    },
                    "runAfter": { "Get_Job_Status": ["Succeeded"] }
                  },
                  "Delay_30s": {
                    "type": "Wait",
                    "inputs": { "interval": { "count": 30, "unit": "Second" } },
                    "runAfter": { "Set_D365Status": ["Succeeded"] }
                  }
                }
              },
              "Log_Success": {
                "type": "ApiConnection",
                "inputs": {
                  "host": { "connection": { "name": "@parameters('$connections')['applicationinsights']['connectionId']" } },
                  "method": "post",
                  "path": "/api/logEvent",
                  "body": {
                    "name": "DeliverySuccess",
                    "properties": {
                      "BlobName": "@variables('BlobName')",
                      "Duration": "@{sub(ticks(utcNow()), ticks(variables('StartTime')))}"
                    }
                  }
                },
                "runAfter": { "Wait_for_JobCompletion": ["Succeeded"] }
              }
            },
            "runAfter": { "Get_Blob_Content": ["Succeeded"] }
          },
          "Scope_ExceptionHandling": {
            "type": "Scope",
            "actions": {
              "Log_Failure": {
                "type": "ApiConnection",
                "inputs": {
                  "host": { "connection": { "name": "@parameters('$connections')['applicationinsights']['connectionId']" } },
                  "method": "post",
                  "path": "/api/logEvent",
                  "body": {
                    "name": "DeliveryFailure",
                    "properties": {
                      "BlobName": "@variables('BlobName')",
                      "Error": "@{result('Scope_MainFlow')}"
                    }
                  }
                }
              },
              "Send_Alert_Email": {
                "type": "ApiConnection",
                "inputs": {
                  "host": { "connection": { "name": "@parameters('$connections')['office365']['connectionId']" } },
                  "method": "post",
                  "path": "/Mail",
                  "body": {
                    "To": "ops@contoso.com",
                    "Subject": "ALERT: D365 Order Delivery Failed",
                    "Body": "Blob: @{variables('BlobName')}\nError: @{result('Scope_MainFlow')}"
                  }
                }
              },
              "Terminate_Failure": {
                "type": "Terminate",
                "inputs": {
                  "runStatus": "Failed",
                  "runError": { "message": "@{result('Scope_MainFlow')}" }
                }
              }
            },
            "runAfter": { "Scope_MainFlow": ["Failed"] }
          }
        },
        "else": {
          "actions": {
            "Terminate_NoBlobs": {
              "type": "Terminate",
              "inputs": { "runStatus": "Succeeded" }
            }
          }
        },
        "runAfter": { "List_Blobs": ["Succeeded"] }
      }
    },
    "contentVersion": "1.0.0.0",
    "outputs": {},
    "parameters": {
      "$connections": { "defaultValue": {}, "type": "Object" },
      "DixfJobDefinitionId": { "defaultValue": "", "type": "String" }
    },
    "triggers": {
      "Recurrence": {
        "type": "Recurrence",
        "recurrence": {
          "frequency": "Hour",
          "interval": 4,
          "startTime": "2026-03-14T00:15:00Z",
          "timeZone": "UTC"
        }
      }
    }
  }
}

Connection Details

| Connection | Service | Authentication | Actions Used |
| --- | --- | --- | --- |
| azureblob | Azure Blob Storage | Managed Identity (preferred) or Storage Account Key | List Blobs, Get Blob Content |
| dynamicscrmonline | Dynamics 365 Finance & Operations | Service Principal (app registration) with DIXF admin role | Post to SalesOrderHeadersV2, Trigger DIXF Job, Get Job Status |
| applicationinsights | Application Insights | Instrumentation Key or API Key | Track Events (custom telemetry) |
| office365 | Office 365 / Outlook | OAuth (user account) | Send Email Alert |

🌐 Cosmos DB: State Management

Order state machine powered by Azure Cosmos DB NoSQL. Stores processing status and original OMS event data.

Document Schema

Each order in Cosmos DB is stored as a JSON document with the following schema:

| Field | Type | Partition Key? | Description |
| --- | --- | --- | --- |
| id | string | ❌ (document ID) | Unique document identifier; same value as orderId |
| orderId | string | ✅ (logical partition key) | Order identifier from OMS; chosen for even data distribution |
| processingStatus | string | ❌ (indexed) | "Pending", "Processing", "Processed", or "Failed" |
| omsEvent | object (JSON) | ❌ | Full CloudEvent data object (kept for audit trail) |
| ingestedAt | ISO 8601 string | ❌ (composite index) | Timestamp when Function 1 ingested the order |
| pickedUpAt | ISO 8601 string or null | ❌ | When Function 2 claimed the order for batch processing |
| processedAt | ISO 8601 string or null | ❌ | When the ZIP was created and uploaded to Blob Storage |
| batchId | string or null | ❌ | Batch identifier assigned by Function 2; links orders processed together |
| blobReference | URI string or null | ❌ | Blob Storage URI of the ZIP file containing this order |
| retryCount | integer | ❌ | Number of processing attempts; incremented on retry |
| ttl | integer | ❌ | Time-to-live in seconds (2,592,000 = 30 days); document auto-deletes |
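In C#, this schema maps to a document model along the following lines (a sketch; explicit `JsonPropertyName` attributes are used here instead of a global camelCase policy, and the exact shape of the stored event is elided to `object`). Note that Cosmos DB reads per-item TTL from a property named `ttl` — system properties such as `_ts` and `_etag` carry the underscore, but TTL does not:

```csharp
using System;
using System.Text.Json.Serialization;

public class OmsOrderDocument
{
    [JsonPropertyName("id")]               public string Id { get; set; } = default!;
    [JsonPropertyName("orderId")]          public string OrderId { get; set; } = default!;
    [JsonPropertyName("processingStatus")] public string ProcessingStatus { get; set; } = "Pending";
    [JsonPropertyName("omsEvent")]         public object? OmsEvent { get; set; }   // full CloudEvent, kept for audit
    [JsonPropertyName("ingestedAt")]       public DateTimeOffset IngestedAt { get; set; }
    [JsonPropertyName("pickedUpAt")]       public DateTimeOffset? PickedUpAt { get; set; }
    [JsonPropertyName("processedAt")]      public DateTimeOffset? ProcessedAt { get; set; }
    [JsonPropertyName("batchId")]          public string? BatchId { get; set; }
    [JsonPropertyName("blobReference")]    public string? BlobReference { get; set; }
    [JsonPropertyName("retryCount")]       public int RetryCount { get; set; }

    // Per-item TTL: Cosmos DB expects the property name "ttl" (no underscore)
    [JsonPropertyName("ttl")]              public int? Ttl { get; set; } = 2_592_000; // 30 days
}
```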

Sample Document (JSON)

{
  "id": "ORD-20260314-001",
  "orderId": "ORD-20260314-001",
  "processingStatus": "Processed",
  "omsEvent": {
    "specversion": "1.0",
    "type": "OMS.Order.Created",
    "source": "https://oms.contoso.com/api/orders",
    "id": "order-20260314-001-uuid-1234",
    "time": "2026-03-14T12:30:45Z",
    "datacontenttype": "application/json",
    "data": {
      "orderId": "ORD-20260314-001",
      "customerId": "CUST-98765",
      "orderDate": "2026-03-14",
      "totalAmount": 12500.00,
      "currency": "USD",
      "lineItems": [
        {
          "itemNumber": 1,
          "productCode": "PROD-XYZ-001",
          "quantity": 5,
          "unitPrice": 2500.00
        }
      ],
      "shippingAddress": {
        "street": "456 Oak Ave",
        "city": "Portland",
        "state": "OR",
        "zip": "97201",
        "country": "US"
      }
    }
  },
  "ingestedAt": "2026-03-14T12:30:52Z",
  "pickedUpAt": "2026-03-14T16:00:15Z",
  "processedAt": "2026-03-14T16:02:33Z",
  "batchId": "a1b2c3d4",
  "blobReference": "https://stomsomsintegrationprod.blob.core.windows.net/oms-d365-payloads/oms_20260314_160230_a1b2c3d4.zip",
  "retryCount": 0,
  "ttl": 2592000
}

State Machine Diagram

stateDiagram-v2
    direction LR
    [*] --> Pending : Function 1 ingests order
    Pending --> Processing : Function 2 claims batch
    Processing --> Processed : ZIP uploaded to Blob
    Processing --> Failed : Transform exception
    Failed --> Pending : Admin re-queues for retry
    Processed --> [*] : TTL 30 days auto-delete
    Failed --> [*] : TTL 30 days auto-delete
    note right of Pending
        ingestedAt is set
        batchId is null
        blobReference is null
    end note
    note right of Processing
        pickedUpAt is set
        batchId is assigned
        retryCount may increment
    end note
    note right of Processed
        processedAt is set
        blobReference is set
        Order in D365
    end note
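The transitions in the diagram are few enough to enforce with a simple guard table, which keeps a buggy code path from, say, moving a Processed order back to Processing. A sketch (the production `ICosmosDbService` implementations are assumed to consult something like this before each status update):

```csharp
using System;
using System.Collections.Generic;

public static class OrderStateMachine
{
    // Legal transitions, mirroring the state diagram above
    private static readonly Dictionary<string, string[]> Allowed = new()
    {
        ["Pending"]    = new[] { "Processing" },           // Function 2 claims the batch
        ["Processing"] = new[] { "Processed", "Failed" },  // upload succeeded / transform threw
        ["Failed"]     = new[] { "Pending" },              // admin re-queues for retry
        ["Processed"]  = Array.Empty<string>()             // terminal; TTL deletes the document
    };

    public static bool CanTransition(string from, string to) =>
        Allowed.TryGetValue(from, out var next) && Array.IndexOf(next, to) >= 0;
}
```

Rejecting an illegal transition loudly (log and abandon, rather than silently overwriting) preserves the audit trail the omsEvent field exists for.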

Why /orderId as Partition Key?

Partition Key Selection Rationale

| Candidate | Cardinality | Data Distribution | Issues | Decision |
| --- | --- | --- | --- | --- |
| /orderId | Millions of unique values | Even spread across logical partitions | None; ideal | ✅ Chosen |
| /processingStatus | Only 4 values (Pending, Processing, Processed, Failed) | All "Pending" orders land in one logical partition (hot partition) | Throttling (429) once batch size exceeds ~10K orders/4h; per-partition RU limits | ❌ Rejected |
| /customerId | ~10K unique values (uneven) | Large customers create hot partitions | Skewed distribution; some partitions overloaded | ❌ Rejected |
| /ingestedAt (day) | 365 unique values | Today's date maps to a single partition until midnight | Hot partition during peak hours; cold afterwards | ❌ Rejected |

Impact Analysis

Function 1 (per-order writes): each order writes to its own logical partition (keyed by orderId), so there is no write contention and write throughput scales out with the number of physical partitions.
Function 2 (cross-partition query): the query WHERE processingStatus = 'Pending' fans out across every physical partition, but it runs only 6x/day (every 4h), so the extra latency and RU cost are acceptable. If this were real-time, we'd need a separate materialized view (e.g., maintained via the change feed) or a dedicated lookup container.
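Concretely, the batch query Function 2 issues (and which the first composite index in the indexing policy below is designed to serve) is:

```sql
SELECT * FROM c
WHERE c.processingStatus = 'Pending'
ORDER BY c.ingestedAt ASC
```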

Indexing Policy

{
  "indexingPolicy": {
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
      {
        "path": "/*"
      }
    ],
    "excludedPaths": [
      {
        "path": "/omsEvent/*"
      }
    ],
    "compositeIndexes": [
      [
        { "path": "/processingStatus", "order": "ascending" },
        { "path": "/ingestedAt", "order": "descending" }
      ],
      [
        { "path": "/batchId", "order": "ascending" },
        { "path": "/processedAt", "order": "descending" }
      ]
    ]
  }
}

Index Rationale

/omsEvent/* is excluded from indexing because the raw CloudEvent is stored purely for audit and is never filtered or sorted on; excluding it meaningfully reduces the RU cost of every write. The [processingStatus, ingestedAt] composite index serves Function 2's batch query (equality filter on status, ordered by ingestion time); the [batchId, processedAt] composite index supports operational lookups of all orders handled in a given batch.

Throughput & RU Configuration

| Environment | RU/s Mode | RU/s Value | Rationale |
| --- | --- | --- | --- |
| Dev | Manual | 400 RU/s | Low cost; a batch of 50 orders costs roughly 50 RU of writes plus 50 RU of status updates (~100 RU per batch, 6 batches/day) |
| UAT | Autoscale | 400–1000 RU/s | Peak load testing; autoscale absorbs spikes (e.g., retry batches) |
| Prod | Autoscale | 1000–2000 RU/s | Headroom for growth; 2000 RU/s supports roughly 4x current load without throttling |

Time-to-Live (TTL) Configuration

TTL = 2,592,000 seconds (30 days): each document is automatically deleted 30 days after its last write (Cosmos DB measures TTL from the item's _ts timestamp). This keeps storage costs low and ensures old orders don't accumulate. D365 retains the imported orders indefinitely; Cosmos DB only stores intermediate state.

Cosmos DB Account Configuration

| Setting | Value | Notes |
| --- | --- | --- |
| Kind | GlobalDocumentDB | SQL (NoSQL) API, not MongoDB, Cassandra, etc. |
| Consistency Level | Session | Balanced: faster than Strong, sufficient for our use case |
| Region(s) | Primary: East US; Failover: West US (prod only) | Single region for dev/uat; multi-region for HA in prod |
| Backup Policy | Continuous 30 days | Point-in-time restore available for 30 days (prod) |
| Network | Private Endpoint (prod); Public (dev/uat) | Private endpoint restricts access to VNet-connected services |

🏗️ Infrastructure as Code (Bicep & Terraform)

Complete IaC for 3-environment deployments (Dev/UAT/Prod) using Bicep and Terraform.

Bicep uses modular structure: main.bicep orchestrates modules (servicebus.bicep, cosmosdb.bicep, etc.). Parameter files specify per-environment values.
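For example, a dev parameter file could look like the following (a sketch; the filename and values are illustrative, not taken from the repository):

```bicep
// main.dev.bicepparam - illustrative per-environment values
using './main.bicep'

param environment = 'dev'
param location = 'eastus'
```

A file like this is typically deployed with `az deployment group create --resource-group <rg-name> --parameters main.dev.bicepparam`, with one parameter file per environment.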

main.bicep — Orchestrator

Entry point. Declares all module deployments and wires outputs of one module into inputs of dependent modules. Also creates RBAC role assignments.

targetScope = 'resourceGroup'

@description('Deployment environment')
@allowed(['dev', 'uat', 'prod'])
param environment string

@description('Azure region for all resources')
param location string = resourceGroup().location

@description('Tags applied to all resources')
param tags object = {
  project: 'oms-d365-integration'
  environment: environment
  managedBy: 'bicep'
  team: 'azure-integration'
}

// ── Module: Application Insights + Log Analytics ───────────────────────────────────
module appInsights './modules/appInsights.bicep' = {
  name: 'deploy-appInsights-${environment}'
  params: { environment: environment, location: location, tags: tags }
}

// ── Module: Key Vault ──────────────────────────────────────────────────────────────
module keyVault './modules/keyVault.bicep' = {
  name: 'deploy-keyVault-${environment}'
  params: { environment: environment, location: location, tags: tags, tenantId: tenant().tenantId }
}

// ── Module: Service Bus ────────────────────────────────────────────────────────────
module serviceBus './modules/serviceBus.bicep' = {
  name: 'deploy-serviceBus-${environment}'
  params: { environment: environment, location: location, tags: tags }
}

// ── Module: Cosmos DB ──────────────────────────────────────────────────────────────
module cosmosDb './modules/cosmosDb.bicep' = {
  name: 'deploy-cosmosDb-${environment}'
  params: { environment: environment, location: location, tags: tags }
}

// ── Module: Storage Account ────────────────────────────────────────────────────────
module storage './modules/storage.bicep' = {
  name: 'deploy-storage-${environment}'
  params: { environment: environment, location: location, tags: tags }
}

// ── Module: Event Grid (depends on Service Bus) ────────────────────────────────────
module eventGrid './modules/eventGrid.bicep' = {
  name: 'deploy-eventGrid-${environment}'
  params: {
    environment: environment
    location: location
    tags: tags
    serviceBusTopicId: serviceBus.outputs.topicId
    serviceBusNamespaceId: serviceBus.outputs.namespaceId
  }
}

// ── Module: Function App (depends on most modules) ─────────────────────────────────
module functionApp './modules/functionApp.bicep' = {
  name: 'deploy-functionApp-${environment}'
  params: {
    environment: environment
    location: location
    tags: tags
    storageAccountName: storage.outputs.storageAccountName
    appInsightsConnectionString: appInsights.outputs.connectionString
    cosmosDbEndpoint: cosmosDb.outputs.endpoint
    cosmosDbAccountName: cosmosDb.outputs.accountName
    serviceBusNamespaceName: serviceBus.outputs.namespaceName
    serviceBusNamespaceId: serviceBus.outputs.namespaceId
    keyVaultUri: keyVault.outputs.keyVaultUri
    keyVaultName: keyVault.outputs.keyVaultName
  }
}

// ── Module: Logic App ──────────────────────────────────────────────────────────────
module logicApp './modules/logicApp.bicep' = {
  name: 'deploy-logicApp-${environment}'
  params: {
    environment: environment
    location: location
    tags: tags
    storageAccountName: storage.outputs.storageAccountName
    storageAccountId: storage.outputs.storageAccountId
    appInsightsConnectionString: appInsights.outputs.connectionString
    keyVaultUri: keyVault.outputs.keyVaultUri
    keyVaultName: keyVault.outputs.keyVaultName
  }
}

// ── RBAC: Function App MI → Service Bus Data Receiver ─────────────────────────────
resource sbRbac 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(serviceBus.outputs.namespaceId, functionApp.outputs.principalId, 'sb-receiver')
  scope: resourceGroup()
  properties: {
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '4f6d3b9b-027b-4f4c-9142-0e5a2a2247e0')
    principalId: functionApp.outputs.principalId
    principalType: 'ServicePrincipal'
  }
}

// ── RBAC: Function App MI → Storage Blob Data Contributor ──────────────────────────
resource storageFaRbac 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(storage.outputs.storageAccountId, functionApp.outputs.principalId, 'storage-blob')
  scope: resourceGroup()
  properties: {
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', 'ba92f5b4-2d11-453d-a403-e96b0029c9fe')
    principalId: functionApp.outputs.principalId
    principalType: 'ServicePrincipal'
  }
}

// ── RBAC: Logic App MI → Storage Blob Data Contributor ─────────────────────────────
resource storageLaRbac 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(storage.outputs.storageAccountId, logicApp.outputs.principalId, 'storage-blob-la')
  scope: resourceGroup()
  properties: {
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', 'ba92f5b4-2d11-453d-a403-e96b0029c9fe')
    principalId: logicApp.outputs.principalId
    principalType: 'ServicePrincipal'
  }
}

// ── Outputs ────────────────────────────────────────────────────────────────────────
output functionAppName string = functionApp.outputs.functionAppName
output cosmosDbEndpoint string = cosmosDb.outputs.endpoint
output serviceBusNamespaceName string = serviceBus.outputs.namespaceName
output logicAppName string = logicApp.outputs.logicAppName
output keyVaultUri string = keyVault.outputs.keyVaultUri
output appInsightsName string = appInsights.outputs.appInsightsName
output storageAccountName string = storage.outputs.storageAccountName

modules/serviceBus.bicep

Creates: Service Bus Namespace (Standard SKU), Topic (oms-orders-topic) with partitioning and duplicate detection, Subscription with DLQ enabled.

@description('Deployment environment')
param environment string
@description('Azure region')
param location string
@description('Resource tags')
param tags object

resource sbNamespace 'Microsoft.ServiceBus/namespaces@2022-10-01-preview' = {
  name: 'sb-oms-integration-${environment}'
  location: location
  sku: { name: 'Standard', tier: 'Standard' }
  properties: {
    minimumTlsVersion: '1.2'
    zoneRedundant: environment == 'prod'
  }
  tags: tags
}

// Topic: partitioned for throughput; dup detection prevents replay within 10 min
resource sbTopic 'Microsoft.ServiceBus/namespaces/topics@2022-10-01-preview' = {
  parent: sbNamespace
  name: 'oms-orders-topic'
  properties: {
    enablePartitioning: true
    requiresDuplicateDetection: true
    duplicateDetectionHistoryTimeWindow: 'PT10M'
    defaultMessageTimeToLive: 'P14D'
    maxSizeInMegabytes: 1024
  }
}

// Subscription: maxDelivery=10 then auto-DLQ; lockDuration=5min matches function timeout
resource sbSubscription 'Microsoft.ServiceBus/namespaces/topics/subscriptions@2022-10-01-preview' = {
  parent: sbTopic
  name: 'oms-d365-subscription'
  properties: {
    maxDeliveryCount: 10
    lockDuration: 'PT5M'
    deadLetteringOnMessageExpiration: true
    defaultMessageTimeToLive: 'P14D'
    enableBatchedOperations: true
  }
}

output namespaceName string = sbNamespace.name
output namespaceId string = sbNamespace.id
output topicId string = sbTopic.id
output subscriptionName string = sbSubscription.name
output namespaceHostname string = '${sbNamespace.name}.servicebus.windows.net'

modules/cosmosDb.bicep

Creates: Cosmos DB Account (NoSQL/GlobalDocumentDB), Database, Container with /orderId partition key, composite index on [processingStatus, ingestedAt], 30-day TTL.

@description('Deployment environment')
param environment string
@description('Azure region')
param location string
@description('Resource tags')
param tags object

// Autoscale in prod; manual 400 RU/s in dev/uat
var throughputSettings = environment == 'prod'
  ? { autoscaleSettings: { maxThroughput: 4000 } }
  : { throughput: 400 }

// Session consistency: best trade-off - cheaper than Strong, sufficient for our batch pattern
resource cosmosAccount 'Microsoft.DocumentDB/databaseAccounts@2023-04-15' = {
  name: 'cosmos-oms-integration-${environment}'
  location: location
  kind: 'GlobalDocumentDB'
  properties: {
    databaseAccountOfferType: 'Standard'
    consistencyPolicy: {
      // maxStalenessPrefix / maxIntervalInSeconds apply only to BoundedStaleness,
      // so Session needs just the consistency level
      defaultConsistencyLevel: 'Session'
    }
    locations: [{
      locationName: location
      failoverPriority: 0
      isZoneRedundant: environment == 'prod'
    }]
    enableAutomaticFailover: false
    publicNetworkAccess: environment == 'prod' ? 'Disabled' : 'Enabled'
    minimalTlsVersion: 'Tls12'
  }
  tags: tags
}

resource database 'Microsoft.DocumentDB/databaseAccounts/sqlDatabases@2023-04-15' = {
  parent: cosmosAccount
  name: 'oms-integration-db'
  properties: { resource: { id: 'oms-integration-db' } }
}

// Partition key: /orderId — millions of unique values = perfect cardinality, no hot partition
// TTL: 30 days auto-deletes processed documents to control storage cost
resource container 'Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers@2023-04-15' = {
  parent: database
  name: 'oms-orders'
  properties: {
    resource: {
      id: 'oms-orders'
      partitionKey: { paths: ['/orderId'], kind: 'Hash', version: 2 }
      defaultTtl: 2592000
      indexingPolicy: {
        indexingMode: 'consistent'
        automatic: true
        includedPaths: [{ path: '/*' }]
        excludedPaths: [{ path: '/omsEvent/*' }, { path: '/"_etag"/?' }]
        compositeIndexes: [
          [
            { path: '/processingStatus', order: 'ascending' }
            { path: '/ingestedAt', order: 'ascending' }
          ]
        ]
      }
    }
    options: throughputSettings
  }
}

output endpoint string = cosmosAccount.properties.documentEndpoint
output accountName string = cosmosAccount.name
output accountId string = cosmosAccount.id
output databaseName string = database.name
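The composite index on `[/processingStatus, /ingestedAt]` exists to serve exactly one query shape: equality on the first path, `ORDER BY` on the second — the batch query Function 2 runs every four hours. The sketch below shows that query and emulates its semantics locally (the documents are fabricated sample data).

```python
# The composite index on (/processingStatus, /ingestedAt) serves this query
# shape: equality filter on the first path, ORDER BY on the second.
PENDING_QUERY = (
    "SELECT * FROM c "
    "WHERE c.processingStatus = 'Pending' "
    "ORDER BY c.ingestedAt ASC"
)

def pending_batch(docs: list[dict]) -> list[dict]:
    """Local emulation of the batch query: oldest pending orders first."""
    pending = [d for d in docs if d["processingStatus"] == "Pending"]
    return sorted(pending, key=lambda d: d["ingestedAt"])

docs = [
    {"orderId": "A", "processingStatus": "Processed", "ingestedAt": "2024-01-01T08:00:00Z"},
    {"orderId": "B", "processingStatus": "Pending",   "ingestedAt": "2024-01-01T09:30:00Z"},
    {"orderId": "C", "processingStatus": "Pending",   "ingestedAt": "2024-01-01T09:00:00Z"},
]
batch = pending_batch(docs)
print([d["orderId"] for d in batch])   # ['C', 'B'] — oldest pending first
```

Without the composite index, the `ORDER BY` over a filtered set would either fail or force an expensive cross-document sort; with it, the query is served directly from the index.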

modules/storage.bicep

Creates: Storage Account (ZRS in prod, LRS in dev), Blob Service, Container oms-d365-payloads, Lifecycle policy (auto-delete ZIPs after 7 days).

@description('Deployment environment')
param environment string
@description('Azure region')
param location string
@description('Resource tags')
param tags object

// ZRS in prod = zone-redundant (survives datacenter failure)
// LRS in dev/uat = cheaper, sufficient for non-prod
resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' = {
  name: 'saomsintegration${environment}'   // max 24 chars, no hyphens
  location: location
  sku: { name: environment == 'prod' ? 'Standard_ZRS' : 'Standard_LRS' }
  kind: 'StorageV2'
  properties: {
    accessTier: 'Hot'
    minimumTlsVersion: 'TLS1_2'
    allowBlobPublicAccess: false
    supportsHttpsTrafficOnly: true
    networkAcls: {
      defaultAction: environment == 'prod' ? 'Deny' : 'Allow'
      bypass: 'AzureServices'
    }
  }
  tags: tags
}

resource blobService 'Microsoft.Storage/storageAccounts/blobServices@2023-01-01' = {
  parent: storageAccount
  name: 'default'
  properties: {
    deleteRetentionPolicy: { enabled: true, days: 7 }
    isVersioningEnabled: environment == 'prod'
  }
}

// Staging container: holds ZIP packages for Logic App to read and deliver to D365
resource stagingContainer 'Microsoft.Storage/storageAccounts/blobServices/containers@2023-01-01' = {
  parent: blobService
  name: 'oms-d365-payloads'
  properties: { publicAccess: 'None' }
}

// Safety net: any ZIP older than 7 days was not processed — auto-delete
resource lifecyclePolicy 'Microsoft.Storage/storageAccounts/managementPolicies@2023-01-01' = {
  parent: storageAccount
  name: 'default'
  properties: {
    policy: {
      rules: [{
        name: 'delete-old-staging-zips'
        enabled: true
        type: 'Lifecycle'
        definition: {
          filters: { blobTypes: ['blockBlob'], prefixMatch: ['oms-d365-payloads/'] }
          actions: { baseBlob: { delete: { daysAfterModificationGreaterThan: 7 } } }
        }
      }]
    }
  }
}

output storageAccountName string = storageAccount.name
output storageAccountId string = storageAccount.id
output stagingContainerName string = stagingContainer.name
output blobEndpoint string = storageAccount.properties.primaryEndpoints.blob
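Function 2's "batch and ZIP" step can be sketched in a few lines: package the transformed orders into an in-memory ZIP and name the blob with a UTC timestamp so each 4-hour window produces a distinct artifact in `oms-d365-payloads`. The entry name reuses the `SalesOrderHeadersV2` entity from the flow above, but the CSV layout and blob-naming scheme are assumptions — the real package format is dictated by the DIXF import definition.

```python
import io
import zipfile
from datetime import datetime, timezone

def build_staging_zip(orders: list[dict]) -> tuple[str, bytes]:
    """Package transformed orders into the ZIP that Function 2 uploads to the
    oms-d365-payloads container. Entry name and CSV columns are illustrative."""
    # Fixed timestamp for a reproducible example; production code would use now(UTC)
    stamp = datetime(2024, 1, 1, 4, 0, tzinfo=timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    blob_name = f"oms-orders-{stamp}.zip"
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        lines = ["orderId,customerAccount,totalAmount"]
        lines += [f'{o["orderId"]},{o["customerAccount"]},{o["totalAmount"]}' for o in orders]
        zf.writestr("SalesOrderHeadersV2.csv", "\n".join(lines))
    return blob_name, buf.getvalue()

name, payload = build_staging_zip(
    [{"orderId": "ORD-1", "customerAccount": "CUST-9", "totalAmount": 125.50}]
)
print(name)                                             # oms-orders-20240101T040000Z.zip
print(zipfile.ZipFile(io.BytesIO(payload)).namelist())  # ['SalesOrderHeadersV2.csv']
```

The 7-day lifecycle rule then acts as the safety net: any ZIP the Logic App never picked up is deleted automatically rather than accumulating cost.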

modules/functionApp.bicep

Creates: App Service Plan (Y1 Consumption), Function App (.NET 8 isolated), all app settings with Key Vault references for secrets, System-Assigned Managed Identity.

@description('Deployment environment')
param environment string
@description('Azure region')
param location string
@description('Resource tags')
param tags object
@description('Storage account name for AzureWebJobsStorage')
param storageAccountName string
@description('Application Insights connection string')
param appInsightsConnectionString string
@description('Cosmos DB endpoint URL')
param cosmosDbEndpoint string
@description('Cosmos DB account name')
param cosmosDbAccountName string
@description('Service Bus namespace name')
param serviceBusNamespaceName string
@description('Service Bus namespace resource ID')
param serviceBusNamespaceId string
@description('Key Vault URI')
param keyVaultUri string
@description('Key Vault name for KV reference syntax')
param keyVaultName string

resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' existing = {
  name: storageAccountName
}

// Consumption plan Y1: pay-per-execution, elastic scale, zero idle cost
resource appServicePlan 'Microsoft.Web/serverfarms@2023-01-01' = {
  name: 'asp-oms-integration-${environment}'
  location: location
  sku: { name: 'Y1', tier: 'Dynamic' }
  kind: 'functionapp'
  properties: { reserved: false }
  tags: tags
}

resource functionApp 'Microsoft.Web/sites@2023-01-01' = {
  name: 'func-oms-integration-${environment}'
  location: location
  kind: 'functionapp'
  identity: { type: 'SystemAssigned' }
  properties: {
    serverFarmId: appServicePlan.id
    httpsOnly: true
    siteConfig: {
      netFrameworkVersion: 'v8.0'
      use32BitWorkerProcess: false
      ftpsState: 'Disabled'
      minTlsVersion: '1.2'
      appSettings: [
        { name: 'FUNCTIONS_EXTENSION_VERSION', value: '~4' }
        { name: 'FUNCTIONS_WORKER_RUNTIME', value: 'dotnet-isolated' }
        { name: 'AzureWebJobsStorage', value: 'DefaultEndpointsProtocol=https;AccountName=${storageAccount.name};AccountKey=${storageAccount.listKeys().keys[0].value}' }
        { name: 'APPLICATIONINSIGHTS_CONNECTION_STRING', value: appInsightsConnectionString }
        // Service Bus — Managed Identity (no connection string)
        { name: 'ServiceBusConnection__fullyQualifiedNamespace', value: '${serviceBusNamespaceName}.servicebus.windows.net' }
        { name: 'ServiceBusTopicName', value: 'oms-orders-topic' }
        { name: 'ServiceBusSubscriptionName', value: 'oms-d365-subscription' }
        // Cosmos DB — Managed Identity
        { name: 'CosmosDbEndpoint', value: cosmosDbEndpoint }
        { name: 'CosmosDbDatabaseName', value: 'oms-integration-db' }
        { name: 'CosmosDbContainerName', value: 'oms-orders' }
        // Blob Storage
        { name: 'BlobStorageEndpoint', value: 'https://${storageAccountName}.blob.core.windows.net' }
        { name: 'BlobContainerName', value: 'oms-d365-payloads' }
        // Key Vault References — secrets never appear in plain text
        { name: 'D365BaseUrl', value: '@Microsoft.KeyVault(VaultName=${keyVaultName};SecretName=D365-Base-Url)' }
        { name: 'D365ClientId', value: '@Microsoft.KeyVault(VaultName=${keyVaultName};SecretName=D365-Client-Id)' }
        { name: 'D365ClientSecret', value: '@Microsoft.KeyVault(VaultName=${keyVaultName};SecretName=D365-Client-Secret)' }
        { name: 'Environment', value: environment }
        { name: 'WEBSITE_RUN_FROM_PACKAGE', value: '1' }
      ]
    }
  }
  tags: tags
}

output functionAppName string = functionApp.name
output functionAppId string = functionApp.id
output principalId string = functionApp.identity.principalId
output defaultHostname string = functionApp.properties.defaultHostName
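The secret-bearing settings above use the `@Microsoft.KeyVault(VaultName=…;SecretName=…)` reference form: App Service resolves the reference at runtime via the site's managed identity, so the function code only ever sees the resolved value in its environment. A small sketch of building and validating that reference string (the vault and secret names are examples from this module):

```python
import re

def kv_reference(vault_name: str, secret_name: str) -> str:
    """Build a Key Vault app-setting reference in the VaultName/SecretName
    form used by the Bicep module above."""
    return f"@Microsoft.KeyVault(VaultName={vault_name};SecretName={secret_name})"

# Shape check for the reference syntax (illustrative validator, not a platform API)
KV_REF = re.compile(r"^@Microsoft\.KeyVault\(VaultName=([^;]+);SecretName=([^)]+)\)$")

ref = kv_reference("kv-oms-integration-dev", "D365-Client-Secret")
m = KV_REF.match(ref)
print(m.group(1), m.group(2))   # kv-oms-integration-dev D365-Client-Secret
```

Resolution only succeeds once the function app's managed identity holds the Key Vault Secrets User role on the vault, which is why the RBAC assignments ship in the same deployment.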

modules/logicApp.bicep

Creates: App Service Plan (WS1 Workflow Standard — required for Logic App Standard), Logic App Standard site, app settings with KV references, System-Assigned MI.

@description('Deployment environment')
param environment string
@description('Azure region')
param location string
@description('Resource tags')
param tags object
@description('Storage account name (required by Logic App Standard)')
param storageAccountName string
@description('Storage account resource ID')
param storageAccountId string
@description('Application Insights connection string')
param appInsightsConnectionString string
@description('Key Vault URI')
param keyVaultUri string
@description('Key Vault name')
param keyVaultName string

resource storageAccount 'Microsoft.Storage/storageAccounts@2023-01-01' existing = {
  name: storageAccountName
}

// Logic App Standard REQUIRES a dedicated ASP (not Consumption)
// WS1 = 1 vCore, 3.5 GB RAM — suitable for our delivery workflow
resource logicAppPlan 'Microsoft.Web/serverfarms@2023-01-01' = {
  name: 'asp-la-oms-integration-${environment}'
  location: location
  sku: { name: 'WS1', tier: 'WorkflowStandard' }
  kind: 'windows'
  properties: {
    targetWorkerCount: environment == 'prod' ? 2 : 1
    maximumElasticWorkerCount: environment == 'prod' ? 4 : 2
  }
  tags: tags
}

resource logicApp 'Microsoft.Web/sites@2023-01-01' = {
  name: 'la-oms-d365-delivery-${environment}'
  location: location
  kind: 'functionapp,workflowapp'
  identity: { type: 'SystemAssigned' }
  properties: {
    serverFarmId: logicAppPlan.id
    httpsOnly: true
    siteConfig: {
      use32BitWorkerProcess: false
      ftpsState: 'Disabled'
      minTlsVersion: '1.2'
      appSettings: [
        { name: 'APP_KIND', value: 'workflowApp' }
        { name: 'FUNCTIONS_EXTENSION_VERSION', value: '~4' }
        { name: 'FUNCTIONS_WORKER_RUNTIME', value: 'node' }
        { name: 'WEBSITE_NODE_DEFAULT_VERSION', value: '~18' }
        { name: 'AzureWebJobsStorage', value: 'DefaultEndpointsProtocol=https;AccountName=${storageAccount.name};AccountKey=${storageAccount.listKeys().keys[0].value}' }
        { name: 'APPLICATIONINSIGHTS_CONNECTION_STRING', value: appInsightsConnectionString }
        { name: 'BlobStorageEndpoint', value: 'https://${storageAccountName}.blob.core.windows.net' }
        { name: 'BlobStagingContainer', value: 'oms-d365-payloads' }
        { name: 'D365BaseUrl', value: '@Microsoft.KeyVault(VaultName=${keyVaultName};SecretName=D365-Base-Url)' }
        { name: 'D365ClientId', value: '@Microsoft.KeyVault(VaultName=${keyVaultName};SecretName=D365-Client-Id)' }
        { name: 'D365ClientSecret', value: '@Microsoft.KeyVault(VaultName=${keyVaultName};SecretName=D365-Client-Secret)' }
        { name: 'Environment', value: environment }
        { name: 'WEBSITE_RUN_FROM_PACKAGE', value: '1' }
      ]
    }
  }
  tags: tags
}

output logicAppName string = logicApp.name
output logicAppId string = logicApp.id
output principalId string = logicApp.identity.principalId
output defaultHostname string = logicApp.properties.defaultHostName

modules/eventGrid.bicep

Creates: Event Grid Custom Topic (CloudEvents 1.0 schema), Event Subscription routing OMS.Order.Created events to Service Bus Topic, RBAC for Event Grid MI to publish to Service Bus.

@description('Deployment environment')
param environment string
@description('Azure region')
param location string
@description('Resource tags')
param tags object
@description('Service Bus topic resource ID (event destination)')
param serviceBusTopicId string
@description('Service Bus namespace resource ID (for RBAC)')
param serviceBusNamespaceId string

// CloudEvents 1.0 schema enforces structured, validated payload format
// Prevents arbitrary unstructured JSON from being published
resource eventGridTopic 'Microsoft.EventGrid/topics@2023-12-15-preview' = {
  name: 'egt-oms-events-${environment}'
  location: location
  identity: { type: 'SystemAssigned' }   // Needed for delivery to Service Bus
  properties: {
    inputSchema: 'CloudEventSchemaV1_0'
    publicNetworkAccess: environment == 'prod' ? 'Disabled' : 'Enabled'
    disableLocalAuth: false
  }
  tags: tags
}

// Event Grid MI needs Azure Service Bus Data Sender role to forward events
resource egToSbRole 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(serviceBusNamespaceId, eventGridTopic.id, 'sb-data-sender')
  scope: resourceGroup()
  properties: {
    roleDefinitionId: subscriptionResourceId('Microsoft.Authorization/roleDefinitions', '69a216fc-b8fb-44d8-bc22-1f3c2cd27a39')
    principalId: eventGridTopic.identity.principalId
    principalType: 'ServicePrincipal'
  }
}

// Filter: only route OMS.Order.Created and OMS.Order.Updated events
resource eventSubscription 'Microsoft.EventGrid/topics/eventSubscriptions@2023-12-15-preview' = {
  parent: eventGridTopic
  name: 'oms-to-servicebus-subscription'
  properties: {
    // Deliver using the topic's system-assigned MI so the Data Sender role above is honoured
    deliveryWithResourceIdentity: {
      identity: { type: 'SystemAssigned' }
      destination: {
        endpointType: 'ServiceBusTopic'
        properties: { resourceId: serviceBusTopicId }
      }
    }
    filter: {
      includedEventTypes: ['OMS.Order.Created', 'OMS.Order.Updated']
      enableAdvancedFilteringOnArrays: true
    }
    eventDeliverySchema: 'CloudEventSchemaV1_0'
    retryPolicy: { maxDeliveryAttempts: 30, eventTimeToLiveInMinutes: 1440 }
  }
  dependsOn: [egToSbRole]
}

output topicEndpoint string = eventGridTopic.properties.endpoint
output topicId string = eventGridTopic.id
output topicName string = eventGridTopic.name
output principalId string = eventGridTopic.identity.principalId
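Because the topic enforces `CloudEventSchemaV1_0`, every event OMS publishes must carry the CloudEvents 1.0 required context attributes (`specversion`, `id`, `source`, `type`). A minimal sketch of constructing such an envelope — the `source` URI and payload fields are assumed values; only the attribute names come from the spec:

```python
import uuid
from datetime import datetime, timezone

# CloudEvents 1.0 required context attributes
REQUIRED_ATTRS = {"specversion", "id", "source", "type"}

def make_order_event(order: dict) -> dict:
    """Minimal CloudEvents 1.0 envelope for an OMS order event."""
    return {
        "specversion": "1.0",
        "id": str(uuid.uuid4()),           # unique per event occurrence
        "source": "/oms/orders",           # assumed source URI for this producer
        "type": "OMS.Order.Created",       # matched by includedEventTypes above
        "time": datetime.now(timezone.utc).isoformat(),
        "datacontenttype": "application/json",
        "data": order,
    }

event = make_order_event({"orderId": "ORD-1001", "totalAmount": 125.50})
missing = REQUIRED_ATTRS - event.keys()
print(sorted(missing))    # [] — envelope satisfies the required attributes
print(event["type"])      # OMS.Order.Created
```

Events whose `type` is not in the subscription's `includedEventTypes` filter are accepted by the topic but never routed to Service Bus.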

modules/keyVault.bicep

Creates: Key Vault with RBAC authorization model (not access policies), soft delete 90 days, purge protection in prod. Secrets are set post-deployment via CI/CD — never hardcoded in Bicep.

@description('Deployment environment')
param environment string
@description('Azure region')
param location string
@description('Resource tags')
param tags object
@description('AAD Tenant ID for Key Vault')
param tenantId string

// RBAC model preferred over access policies:
//   - Standard Azure RBAC = consistent with other resources
//   - No 16-policy limit (RBAC is unlimited)
//   - Can scope permissions to individual secrets
//   - Works seamlessly with Managed Identity
resource keyVault 'Microsoft.KeyVault/vaults@2023-07-01' = {
  name: 'kv-oms-integration-${environment}'
  location: location
  properties: {
    tenantId: tenantId
    sku: { name: 'standard', family: 'A' }
    enableRbacAuthorization: true          // RBAC model — not access policies
    enableSoftDelete: true
    softDeleteRetentionInDays: 90
    enablePurgeProtection: environment == 'prod' ? true : null   // explicit false is rejected by ARM; omit in non-prod
    publicNetworkAccess: environment == 'prod' ? 'Disabled' : 'Enabled'
    networkAcls: {
      defaultAction: environment == 'prod' ? 'Deny' : 'Allow'
      bypass: 'AzureServices'
    }
  }
  tags: tags
}

output keyVaultUri string = keyVault.properties.vaultUri
output keyVaultName string = keyVault.name
output keyVaultId string = keyVault.id

modules/appInsights.bicep

Creates: Log Analytics Workspace (workspace-based mode — classic AI is deprecated), Application Insights linked to the workspace. Configures sampling percentage per environment.

@description('Deployment environment')
param environment string
@description('Azure region')
param location string
@description('Resource tags')
param tags object

// Log Analytics Workspace: required for workspace-based Application Insights
// Classic (non-workspace) Application Insights is deprecated since 2024
resource logAnalyticsWorkspace 'Microsoft.OperationalInsights/workspaces@2023-09-01' = {
  name: 'law-oms-integration-${environment}'
  location: location
  properties: {
    sku: { name: 'PerGB2018' }
    retentionInDays: environment == 'prod' ? 90 : 30
    features: { enableLogAccessUsingOnlyResourcePermissions: true }
  }
  tags: tags
}

// Workspace-based Application Insights: logs stored in Log Analytics
// Enables unified KQL querying across all Azure resources
resource appInsights 'Microsoft.Insights/components@2020-02-02' = {
  name: 'appi-oms-integration-${environment}'
  location: location
  kind: 'web'
  properties: {
    Application_Type: 'web'
    WorkspaceResourceId: logAnalyticsWorkspace.id
    RetentionInDays: environment == 'prod' ? 90 : 30
    // Sampling: 50% in prod reduces ingestion cost while maintaining statistical accuracy
    // 100% in dev/uat ensures full visibility during testing
    SamplingPercentage: environment == 'prod' ? 50 : 100
    DisableIpMasking: false      // GDPR compliance: mask client IPs
  }
  tags: tags
}

// Use connectionString (not instrumentationKey) — instrumentationKey is deprecated
output connectionString string = appInsights.properties.ConnectionString
output instrumentationKey string = appInsights.properties.InstrumentationKey
output appInsightsName string = appInsights.name
output appInsightsId string = appInsights.id
output logAnalyticsWorkspaceId string = logAnalyticsWorkspace.id
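Fixed-rate sampling at 50% halves ingestion volume without losing aggregate accuracy, because each retained telemetry item is stamped with an `itemCount` representing the items it stands in for — KQL then uses `sum(itemCount)` instead of `count()`. A local simulation of that mechanism (the event shape is fabricated; the `itemCount` semantics are App Insights' own):

```python
import random

def sample(events: list[dict], percentage: float) -> list[dict]:
    """Fixed-rate sampling sketch: keep `percentage`% of events, stamping each
    kept event with itemCount so true counts remain recoverable downstream."""
    keep = percentage / 100.0
    item_count = round(1 / keep)           # each kept item represents this many originals
    rng = random.Random(42)                # seeded for a reproducible illustration
    return [dict(e, itemCount=item_count) for e in events if rng.random() < keep]

events = [{"name": "OrderIngested", "orderId": i} for i in range(10_000)]
kept = sample(events, 50)                  # prod setting: SamplingPercentage = 50
estimated_total = sum(e["itemCount"] for e in kept)
print(len(kept) < len(events))                                   # True — roughly half the volume
print(abs(estimated_total - len(events)) / len(events) < 0.05)   # True — counts still accurate
```

This is why 50% sampling is acceptable in prod: dashboards built on `sum(itemCount)` stay statistically accurate while ingestion cost drops, whereas dev/uat keep 100% for full per-event visibility.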

parameters/dev.bicepparam — Development Environment

using '../main.bicep'

// Development environment — cost-optimised, public access enabled for developer tooling
param environment = 'dev'
param location = 'westeurope'
param tags = {
  project:     'oms-d365-integration'
  environment: 'dev'
  costCenter:  'IT-DEV-001'
  managedBy:   'bicep'
  team:        'azure-integration'
  deployedBy:  'ci-cd-pipeline'
  owner:       'integration-team@company.com'
}

parameters/uat.bicepparam — UAT Environment

using '../main.bicep'

// UAT environment — mirrors prod config where possible for test parity
// Public access still enabled (no private endpoints) to reduce test friction
param environment = 'uat'
param location = 'westeurope'
param tags = {
  project:          'oms-d365-integration'
  environment:      'uat'
  costCenter:       'IT-UAT-001'
  managedBy:        'bicep'
  team:             'azure-integration'
  deployedBy:       'ci-cd-pipeline'
  owner:            'integration-team@company.com'
  testEnvironment:  'true'
  dataClassification: 'non-production'
}

parameters/prod.bicepparam — Production Environment

using '../main.bicep'

// PRODUCTION — full security hardening, zone-redundant, private endpoints
// Requires: manual approval gate in CI/CD pipeline before apply
// Requires: deployment during off-peak window (00:00–04:00 UTC)
param environment = 'prod'
param location = 'westeurope'
param tags = {
  project:           'oms-d365-integration'
  environment:       'prod'
  costCenter:        'IT-PROD-001'
  managedBy:         'bicep'
  team:              'azure-integration'
  deployedBy:        'ci-cd-pipeline'
  owner:             'integration-team@company.com'
  slaTarget:         '99.9'
  dataClassification: 'confidential'
  complianceScope:   'SOC2'
  businessUnit:      'supply-chain'
  criticalityLevel:  'high'
}

Deployment Command

# Dev deployment
az deployment group create \
  --name oms-d365-integration-dev \
  --resource-group rg-oms-d365-dev \
  --template-file main.bicep \
  --parameters parameters/dev.bicepparam

# Prod deployment
az deployment group create \
  --name oms-d365-integration-prod \
  --resource-group rg-oms-d365-prod \
  --template-file main.bicep \
  --parameters parameters/prod.bicepparam

Terraform Implementation

Alternative IaC using HashiCorp Terraform. Three files: main.tf (resources), variables.tf (inputs), outputs.tf (results).

main.tf

terraform {
  required_version = ">= 1.3"
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "~> 3.0"
    }
  }
  backend "azurerm" {
    # Configure backend state storage
  }
}

provider "azurerm" {
  features {}
  subscription_id = var.subscription_id
}

# ── Resource Group ────────────────────────────────────────────────
resource "azurerm_resource_group" "oms_integration" {
  name     = "rg-oms-d365-${var.environment}"
  location = var.location
  tags     = var.tags
}

# ── Service Bus Namespace ─────────────────────────────────────────
resource "azurerm_servicebus_namespace" "main" {
  name                = "sb-oms-integration-${var.environment}"
  location            = azurerm_resource_group.oms_integration.location
  resource_group_name = azurerm_resource_group.oms_integration.name
  sku                 = "Standard"
  capacity            = 1
  tags                = var.tags
}

# ── Service Bus Topic ─────────────────────────────────────────────
resource "azurerm_servicebus_topic" "oms_orders" {
  name                                    = "oms-orders-topic"
  namespace_id                            = azurerm_servicebus_namespace.main.id
  enable_partitioning                     = true
  max_size_in_megabytes                   = 1024
  default_message_ttl                     = "P14D"
  requires_duplicate_detection            = true
  duplicate_detection_history_time_window = "PT10M"
}

# ── Service Bus Subscription ──────────────────────────────────────
resource "azurerm_servicebus_subscription" "oms_d365" {
  name                 = "oms-d365-subscription"
  topic_id             = azurerm_servicebus_topic.oms_orders.id
  max_delivery_count   = 10
  lock_duration        = "PT5M"
  dead_lettering_on_message_expiration = true
  default_message_ttl  = "P14D"
  enable_batched_operations = true
}

# ── Cosmos DB Account ─────────────────────────────────────────────
resource "azurerm_cosmosdb_account" "main" {
  name                = "cosmos-oms-integration-${var.environment}"
  location            = azurerm_resource_group.oms_integration.location
  resource_group_name = azurerm_resource_group.oms_integration.name
  offer_type          = "Standard"
  kind                = "GlobalDocumentDB"
  
  consistency_policy {
    consistency_level = "Session"
  }

  geo_location {
    location          = var.location
    failover_priority = 0
  }

  tags = var.tags
}

# ── Cosmos DB SQL Database ────────────────────────────────────────
resource "azurerm_cosmosdb_sql_database" "main" {
  name                = "oms-integration-db"
  account_name        = azurerm_cosmosdb_account.main.name
  resource_group_name = azurerm_resource_group.oms_integration.name
}

# ── Cosmos DB SQL Container ───────────────────────────────────────
resource "azurerm_cosmosdb_sql_container" "oms_orders" {
  name                = "oms-orders"
  account_name        = azurerm_cosmosdb_account.main.name
  database_name       = azurerm_cosmosdb_sql_database.main.name
  resource_group_name = azurerm_resource_group.oms_integration.name
  partition_key_path  = "/orderId"
  throughput          = var.environment == "prod" ? 1000 : 400
  default_ttl         = 2592000

  indexing_policy {
    indexing_mode = "consistent"

    included_path {
      path = "/*"
    }

    excluded_path {
      path = "/omsEvent/*"
    }

    composite_index {
      index {
        path  = "/processingStatus"
        order = "Ascending"
      }
      index {
        path  = "/ingestedAt"
        order = "Ascending"
      }
    }
  }
}

# ── Storage Account ───────────────────────────────────────────────
resource "azurerm_storage_account" "main" {
  name                     = "st${replace(var.environment, "-", "")}oms${random_string.storage_suffix.result}"
  resource_group_name      = azurerm_resource_group.oms_integration.name
  location                 = azurerm_resource_group.oms_integration.location
  account_tier             = "Standard"
  account_replication_type = var.environment == "prod" ? "ZRS" : "LRS"
  min_tls_version          = "TLS1_2"

  tags = var.tags
}

# ── Blob Container ────────────────────────────────────────────────
resource "azurerm_storage_container" "oms_payloads" {
  name                  = "oms-d365-payloads"
  storage_account_name  = azurerm_storage_account.main.name
  container_access_type = "private"
}

# ── Application Insights ──────────────────────────────────────────
resource "azurerm_application_insights" "main" {
  name                = "appinsights-oms-integration-${var.environment}"
  location            = azurerm_resource_group.oms_integration.location
  resource_group_name = azurerm_resource_group.oms_integration.name
  application_type    = "web"
  retention_in_days   = 90

  tags = var.tags
}

# ── Key Vault ─────────────────────────────────────────────────────
resource "azurerm_key_vault" "main" {
  # Key Vault names are capped at 24 characters, so the prefix stays short
  name                       = "kv-oms-${var.environment}-${random_string.keyvault_suffix.result}"
  location                   = azurerm_resource_group.oms_integration.location
  resource_group_name        = azurerm_resource_group.oms_integration.name
  tenant_id                  = data.azurerm_client_config.current.tenant_id
  sku_name                   = "standard"
  soft_delete_retention_days = 90
  purge_protection_enabled   = var.environment == "prod"

  tags = var.tags
}

# ── Random Strings for Naming ─────────────────────────────────────
resource "random_string" "storage_suffix" {
  length  = 4
  special = false
  upper   = false
}

resource "random_string" "keyvault_suffix" {
  length  = 4
  special = false
  upper   = false
}

# ── Data Source: Current User ─────────────────────────────────────
data "azurerm_client_config" "current" {}

variables.tf

variable "subscription_id" {
  description = "Azure subscription ID"
  type        = string
}

variable "environment" {
  description = "Deployment environment"
  type        = string
  validation {
    condition     = contains(["dev", "uat", "prod"], var.environment)
    error_message = "Environment must be dev, uat, or prod."
  }
}

variable "location" {
  description = "Azure region for resources"
  type        = string
  default     = "westeurope"
}

variable "tags" {
  description = "Resource tags"
  type        = map(string)
  default = {
    project   = "oms-d365-integration"
    managedBy = "terraform"
  }
}

outputs.tf

output "resource_group_name" {
  value = azurerm_resource_group.oms_integration.name
}

output "service_bus_namespace_name" {
  value = azurerm_servicebus_namespace.main.name
}

output "service_bus_namespace_id" {
  value = azurerm_servicebus_namespace.main.id
}

output "cosmosdb_account_endpoint" {
  value = azurerm_cosmosdb_account.main.endpoint
}

output "cosmosdb_account_primary_key" {
  value     = azurerm_cosmosdb_account.main.primary_key
  sensitive = true
}

output "storage_account_name" {
  value = azurerm_storage_account.main.name
}

output "blob_container_name" {
  value = azurerm_storage_container.oms_payloads.name
}

output "application_insights_connection_string" {
  value     = azurerm_application_insights.main.connection_string
  sensitive = true
}

output "key_vault_id" {
  value = azurerm_key_vault.main.id
}

output "key_vault_uri" {
  value = azurerm_key_vault.main.vault_uri
}

Terraform Deployment

# Initialize Terraform
terraform init

# Plan deployment (dev)
terraform plan \
  -var-file="environments/dev.tfvars" \
  -out=tfplan

# Apply deployment
terraform apply tfplan

# Outputs
terraform output -json > outputs.json

🔒 Security Architecture

Defense-in-depth security model: network, identity, data, and secret management layers.

Security Architecture Overview

graph TB subgraph INTERNET["External Zone - Internet"] OMS_EXT["OMS Application - External Publisher - Entra ID App Registration"] end subgraph ENTRA["Microsoft Entra ID - Identity Plane"] AAD_EG["EventGrid Data Sender Role - Assigned to OMS App Registration"] AAD_FA["System-Assigned MI - Function App - Service Bus Data Receiver - Cosmos DB Data Contributor - Storage Blob Contributor - Key Vault Secrets User"] AAD_LA["System-Assigned MI - Logic App - Storage Blob Contributor - Key Vault Secrets User"] end subgraph NETWORK["Azure Network - Perimeter Layer"] IP_FW["Event Grid IP Firewall - Whitelist OMS IP ranges - Deny all others"] PE_SB["Private Endpoint - Service Bus - prod only"] PE_COSMOS["Private Endpoint - Cosmos DB - prod only"] PE_BLOB["Private Endpoint - Blob Storage - prod only"] PE_KV["Private Endpoint - Key Vault - prod only"] VNET["VNet Integration - Function App outbound - Logic App outbound"] end subgraph DATA["Data Services - Azure PaaS"] EG["Event Grid - CloudEvents schema validation - IP firewall enforced"] SB["Service Bus - TLS 1.2 enforced - Managed Identity auth"] COSMOS["Cosmos DB - AES-256 at rest - TLS 1.2 in transit - Managed Identity auth"] BLOB["Blob Storage - AES-256 at rest - TLS 1.2 - No public blob access"] KV["Key Vault - RBAC model - Soft delete 90 days - Purge protection prod"] end OMS_EXT -->|"AAD Token - EventGrid Data Sender role"| IP_FW IP_FW -->|"Validated request"| EG EG -->|"Route event - MI delivery"| SB AAD_FA -->|"RBAC grants"| SB AAD_FA -->|"RBAC grants"| COSMOS AAD_FA -->|"RBAC grants"| BLOB AAD_FA -->|"RBAC grants"| KV AAD_LA -->|"RBAC grants"| BLOB AAD_LA -->|"RBAC grants"| KV PE_SB -.->|"Private traffic"| SB PE_COSMOS -.->|"Private traffic"| COSMOS PE_BLOB -.->|"Private traffic"| BLOB PE_KV -.->|"Private traffic"| KV VNET -.->|"Outbound routing"| PE_SB VNET -.->|"Outbound routing"| PE_COSMOS style INTERNET fill:#d83b01,color:#fff style ENTRA fill:#7b4f9e,color:#fff style NETWORK fill:#c75000,color:#fff style DATA fill:#107c10,color:#fff

Zero-Trust Security Principles Applied

graph LR subgraph VERIFY["Verify Explicitly"] V1["AAD Token for every request - Role-based access enforced - Token expires every 1 hour"] end subgraph LEAST["Least Privilege Access"] L1["Function App: only Service Bus Data RECEIVER - not Sender - not Manage"] L2["Logic App: only Storage Blob CONTRIBUTOR - not Account Owner"] L3["OMS: only EventGrid Data SENDER - not Topic Owner"] end subgraph BREACH["Assume Breach"] B1["Private endpoints isolate all traffic - TLS 1.2 encrypts all transit - AES-256 encrypts all at-rest data"] B2["App Insights logs every access - Azure Monitor alerts on anomalies - Key Vault audit logs all secret access"] end style VERIFY fill:#0078d4,color:#fff style LEAST fill:#107c10,color:#fff style BREACH fill:#7b4f9e,color:#fff

Security Layers

1. Network Security

2. Identity & Access Control (IAM)

All services use Managed Identity + RBAC (no shared keys or passwords).

graph TB subgraph IDENTITY["Identity & RBAC"] FA_MI["Function App Managed Identity"] LA_MI["Logic App Managed Identity"] OMS_APP["OMS App Registration (Service Principal)"] end subgraph RBAC["Role Assignments"] ROLE1["🔐 Service Bus Data Receiver → Function App MI"] ROLE2["🔐 Cosmos DB Data Contributor → Function App MI"] ROLE3["🔐 Storage Blob Data Contributor → Function App MI + Logic App MI"] ROLE4["🔐 Key Vault Secrets User → Function App MI + Logic App MI"] ROLE5["🔐 EventGrid Data Sender → OMS App Registration"] end FA_MI --> ROLE1 FA_MI --> ROLE2 FA_MI --> ROLE3 FA_MI --> ROLE4 LA_MI --> ROLE3 LA_MI --> ROLE4 OMS_APP --> ROLE5 style IDENTITY fill:#2d4a2d,color:#fff style RBAC fill:#1e3a5f,color:#fff
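Every role assignment in these modules is named with Bicep's `guid()` function, which returns a deterministic hash of its arguments: the same (scope, principal, label) triple always yields the same assignment name, so redeployments are idempotent instead of piling up duplicate assignments. The sketch below illustrates that property with `uuid5` — an assumption for illustration only, not the exact algorithm `guid()` uses.

```python
import uuid

# A fixed namespace makes the derived name deterministic, mirroring how Bicep's
# guid() hashes its arguments. (uuid5 is illustrative, not guid()'s algorithm.)
NAMESPACE = uuid.NAMESPACE_URL

def assignment_name(scope_id: str, principal_id: str, label: str) -> str:
    return str(uuid.uuid5(NAMESPACE, f"{scope_id}|{principal_id}|{label}"))

a = assignment_name("/subscriptions/.../sb-ns", "principal-123", "sb-data-sender")
b = assignment_name("/subscriptions/.../sb-ns", "principal-123", "sb-data-sender")
c = assignment_name("/subscriptions/.../sb-ns", "principal-456", "sb-data-sender")
print(a == b)   # True  — same inputs, same name: safe to redeploy
print(a == c)   # False — a different principal gets a distinct assignment
```

This is why the Bicep snippets below seed `guid()` with the scope, the principal, and a role label: change any one of the three and a new, independent assignment is created.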

RBAC Configuration (Bicep)

// Grant Service Bus Data Receiver role to Function App
resource sbRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(resourceGroup().id, functionApp.id, 'Service Bus Data Receiver')
  scope: sbTopic  // Azure RBAC cannot be scoped to a Service Bus subscription entity; topic is the narrowest scope
  properties: {
    roleDefinitionId: subscriptionResourceId(
      'Microsoft.Authorization/roleDefinitions',
      '4f6d3b9b-027b-4f4c-9142-0e5a2a2247ff'  // Service Bus Data Receiver
    )
    principalId: functionApp.identity.principalId
    principalType: 'ServicePrincipal'
  }
}

// Grant Cosmos DB Built-in Data Contributor (data-plane role).
// Note: Cosmos DB data-plane roles are assigned via sqlRoleAssignments,
// not Microsoft.Authorization/roleAssignments.
resource cosmosRoleAssignment 'Microsoft.DocumentDB/databaseAccounts/sqlRoleAssignments@2023-04-15' = {
  parent: cosmosDb
  name: guid(resourceGroup().id, functionApp.id, 'Cosmos DB Data Contributor')
  properties: {
    roleDefinitionId: '${cosmosDb.id}/sqlRoleDefinitions/00000000-0000-0000-0000-000000000002'  // Built-in Data Contributor
    principalId: functionApp.identity.principalId
    scope: cosmosDb.id
  }
}

// Grant Storage Blob Data Contributor
resource storageRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(resourceGroup().id, functionApp.id, 'Storage Blob Data Contributor')
  scope: storage
  properties: {
    roleDefinitionId: subscriptionResourceId(
      'Microsoft.Authorization/roleDefinitions',
      'ba92f5b4-2d11-453d-a403-e96b0029c9fe'  // Storage Blob Data Contributor
    )
    principalId: functionApp.identity.principalId
    principalType: 'ServicePrincipal'
  }
}

// Grant Key Vault Secrets User
resource kvRoleAssignment 'Microsoft.Authorization/roleAssignments@2022-04-01' = {
  name: guid(resourceGroup().id, functionApp.id, 'Key Vault Secrets User')
  scope: keyVault
  properties: {
    roleDefinitionId: subscriptionResourceId(
      'Microsoft.Authorization/roleDefinitions',
      '4633458b-17de-408a-b874-0445c86b69e6'  // Key Vault Secrets User
    )
    principalId: functionApp.identity.principalId
    principalType: 'ServicePrincipal'
  }
}

3. Data Encryption

4. Secret Management (Key Vault)

All sensitive configuration is stored in Azure Key Vault (not in app settings or git).

| Secret | Stored In | Accessed By | Rotation |
| --- | --- | --- | --- |
| D365 Service Principal Password | Key Vault | Function 1, Logic App | 90 days (Azure Automation scheduled) |
| Cosmos DB Connection String | Key Vault | Function 1, Function 2 (via Managed Identity) | Never (Managed Identity needs no connection string) |
| Service Bus SAS Key (fallback) | Key Vault | Function binding reference | 180 days |
| Event Grid SAS Key (fallback) | Key Vault | OMS app config | 90 days |
| Application Insights API Key | Key Vault | Logic App telemetry connector | 1 year |

Key Vault Reference in App Settings

{
  "name": "D365ServicePrincipalPassword",
  "value": "@Microsoft.KeyVault(VaultName=kv-oms-integration-prod;SecretName=d365-sp-password)"
}

5. Audit & Compliance Logging

6. Access Control Principle: Least Privilege

Each service receives only the minimum permissions it needs:
  • Function App: "Service Bus Data Receiver" (can consume messages; cannot send or manage entities)
  • Function App: "Cosmos DB Data Contributor" (can read and write documents; cannot manage the account or delete containers)
  • Logic App: "Storage Blob Data Contributor" (can read and write blobs; cannot manage the storage account or its keys)
  • OMS App Registration: "EventGrid Data Sender" (can only publish events; cannot read or delete topics)
Result: if one credential is compromised, the attacker's reach is limited to that service's scope.

📊 Monitoring & Observability

Complete telemetry stack: Application Insights for logging, KQL for querying, alerts for proactive incident detection.

Application Insights Integration

Key KQL (Kusto Query Language) Queries

Query 1: Ingestion Success Rate (last 24h)

customEvents
| where name == "IngestionSuccess"
| where timestamp > ago(24h)
| summarize
    success_count = count(),
    avg_duration_ms = avg(todouble(customDimensions.DurationMs))
    by bin(timestamp, 1h)
| render timechart with (title="Hourly Ingestion Success Rate")

Query 2: Dead Letter Queue Events

traces
| where message contains "DeadLetter"
| where timestamp > ago(24h)
| project
    timestamp,
    orderId = tostring(customDimensions.OrderId),
    reason = tostring(customDimensions.DeadLetterReason),
    error_message = tostring(customDimensions.ErrorMessage)
| order by timestamp desc

Query 3: End-to-End Latency (Ingest → Deliver)

let ingested = customEvents
| where name == "IngestionSuccess"
| extend orderId = tostring(customDimensions.OrderId), ingestedAt = timestamp;
let delivered = customEvents
| where name == "DeliverySuccess"
| extend orderId = tostring(customDimensions.OrderId), deliveredAt = timestamp;
ingested
| join kind=inner delivered on orderId
| extend latency_hours = (deliveredAt - ingestedAt) / 1h
| summarize
    avg_latency = avg(latency_hours),
    p95_latency = percentile(latency_hours, 95),
    p99_latency = percentile(latency_hours, 99)
    by bin(ingestedAt, 4h)
| render linechart

Query 4: Function Exception Rate

traces
| where severityLevel >= 2  // 2 = Warning, 3 = Error
| where timestamp > ago(1h)
| summarize error_count = count()
    by functionName = tostring(customDimensions.FunctionName), severityLevel
| order by error_count desc

Query 5: Cosmos DB RU Consumption

dependencies
| where target == "cosmos-oms-integration-prod"
| where timestamp > ago(24h)
| extend ru = todouble(customDimensions.RUConsumed)
| summarize
    total_ru = sum(ru),
    avg_ru_per_op = avg(ru),
    max_ru_per_op = max(ru)
    by bin(timestamp, 1h)
| render barchart

Azure Monitor Alerts

| Alert Name | Condition | Severity | Action Group |
| --- | --- | --- | --- |
| DLQ Message Alert | Service Bus subscription deadLetterMessageCount > 0 | 🔴 Critical | Page ops on-call; create incident ticket |
| Function Failure Rate High | Failed invocations > 5% in last 5 min | 🟠 High | Email ops@contoso.com; log to dashboard |
| Logic App Run Failed | Logic App oms-to-d365-delivery run status = Failed | 🟠 High | Page ops-critical group; send Teams notification |
| Cosmos DB Throttling | 429 error count > 10 in 5 min window | 🟡 Medium | Email ops@contoso.com; evaluate RU scale-up |
| Event Grid Failed Deliveries | Event Grid dead-lettered events metric > 0 | 🟡 Medium | Email ops@contoso.com; review Event Grid dead-letter log |

Application Insights Dashboard

Pin the charts produced by the KQL queries above (hourly ingestion success rate, DLQ events, end-to-end latency, Cosmos DB RU consumption) to a shared dashboard for real-time visibility.

SLA & SLO Targets

| Metric | Target (SLO) | Measurement |
| --- | --- | --- |
| Order Delivery Success Rate | 99.9% (three nines) | Orders successfully imported into D365 / total orders initiated |
| End-to-End Latency (p99) | < 5 hours | Order appears in D365 within 5 hours of OMS submission |
| Ingestion Latency (p95) | < 1 second | Order visible in Cosmos DB within 1 second of Service Bus arrival |
| DLQ Rate | < 0.1% | Messages reaching the DLQ / total messages attempted |
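A quick sketch of how the success-rate and DLQ targets above could be evaluated from raw message counts (Python; the function and field names are illustrative):

```python
def slo_report(total: int, delivered: int, dlq: int) -> dict:
    """Evaluate the delivery-success and DLQ SLOs from raw counts."""
    success_pct = delivered / total * 100
    dlq_pct = dlq / total * 100
    return {
        "delivery_success_pct": round(success_pct, 3),
        "dlq_pct": round(dlq_pct, 3),
        "meets_delivery_slo": success_pct >= 99.9,  # target: 99.9% (three nines)
        "meets_dlq_slo": dlq_pct < 0.1,             # target: < 0.1%
    }
```

For example, 9,995 delivered and 5 dead-lettered out of 10,000 orders meets both targets.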

🚀 Future Roadmap

Planned enhancements across resilience, security, observability, and operational excellence.

Phase 2 — Enhanced Resilience (3–6 months)

🔄 Durable Functions Orchestration

Replace simple timer functions with Azure Durable Functions for checkpointing and fault tolerance.

  • Activity functions for each step (query Cosmos, transform, zip, upload)
  • Automatic retry with exponential backoff without manual logic
  • Suspend/resume on transient failures (network timeouts)
  • Human approval workflow for DLQ remediation
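The retry behavior described above boils down to a capped exponential delay. A minimal sketch (Python; the base and cap values are assumptions, not figures from this design):

```python
def backoff_delay(attempt: int, base_seconds: float = 2.0, cap_seconds: float = 300.0) -> float:
    """Delay before retry `attempt` (0-based): base * 2^attempt, capped."""
    return min(base_seconds * (2 ** attempt), cap_seconds)

# attempt 0 -> 2.0s, attempt 1 -> 4.0s, attempt 2 -> 8.0s, attempt 10 -> 300.0s (capped)
```

Durable Functions applies this kind of policy declaratively via its retry options, so no hand-rolled loop is needed.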
🌍 Multi-Region Cosmos DB

Add secondary region (West US) with automatic failover.

  • Read replicas reduce latency for queries (p99 < 30ms from any region)
  • Automatic failover if primary region unreachable (RTO < 5min)
  • Multi-region write for high-frequency updates (eventual consistency)
🔌 Circuit Breaker for D365

Add circuit breaker pattern for D365 API calls to prevent cascading failures.

  • Detect D365 downtime or throttling (429 > 10/min)
  • Open circuit; queue blobs in Blob Storage for retry later
  • Automatic recovery when D365 health restored
  • Fallback: email notification for manual intervention
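A minimal circuit-breaker sketch in Python, with a failure threshold in the spirit of the 429 > 10/min trigger above (state handling simplified; an assumption-level sketch, not the production implementation):

```python
import time

class CircuitBreaker:
    """Open the circuit after N consecutive failures; probe again after a timeout."""

    def __init__(self, failure_threshold: int = 10, reset_timeout: float = 60.0):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # None => circuit closed

    def allow_request(self) -> bool:
        if self.opened_at is None:
            return True  # closed: traffic flows normally
        if time.monotonic() - self.opened_at >= self.reset_timeout:
            return True  # half-open: let one probe request through
        return False     # open: fail fast, queue the payload for later

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = time.monotonic()
```

While the circuit is open, payloads stay in Blob Storage and are retried once a probe call succeeds.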

Phase 3 — Security Hardening (1–3 months)

🔐 Private Endpoints Everywhere

Migrate all PaaS services to private endpoint architecture.

  • Cosmos DB private endpoint in VNet
  • Service Bus private endpoint
  • Blob Storage private endpoint
  • Event Grid private endpoint (when GA)
  • Outcome: no traffic traverses the public internet; all traffic stays on the Azure backbone
🛡️ VNet Isolation

Deploy the Function App with regional VNet integration and the Logic App (Standard) with VNet integration or in an App Service Environment (ASE).

  • All outbound traffic through VNet (NSG rules, firewall filtering)
  • Inbound: Function App via Azure Front Door with WAF
  • Result: no direct internet exposure
📋 Microsoft Defender for Cloud

Enable advanced threat protection and compliance scanning.

  • Continuous vulnerability assessment for container images
  • CSPM (Cloud Security Posture Management) reports
  • Regulatory compliance dashboard (SOC2, PCI-DSS, HIPAA if applicable)
👤 Managed Identity for OMS

Replace SAS keys with Entra ID app registration and token-based auth.

  • OMS gets app registration with "EventGrid Data Sender" role
  • No SAS keys to rotate; token-based (1-hour lifetime)
  • Audit trail in Azure AD sign-in logs
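The token-based flow is a standard OAuth2 client-credentials request against the Entra ID v2.0 token endpoint. A sketch (Python; the helper name is illustrative, the scope shown is Event Grid's data-plane default scope):

```python
def client_credentials_request(tenant_id: str, client_id: str, client_secret: str):
    """Return the Entra ID v2.0 token endpoint URL and the form body
    for the OAuth2 client-credentials grant."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "https://eventgrid.azure.net/.default",  # Event Grid data plane
    }
    return url, body
```

OMS POSTs this form body, then sends the returned bearer token in the `Authorization` header when publishing CloudEvents; the token expires after roughly one hour and must be refreshed.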

Phase 4 — Observability (2–4 months)

📊 Azure Monitor Workbooks

Create executive dashboards for order throughput, latency SLOs, and cost tracking.

  • Orders processed per 4-hour batch (trend chart)
  • End-to-end latency (p50/p95/p99 percentiles)
  • Cost per order (RU cost + Function execution + data transfer)
  • DLQ event drill-down (count, root causes)
📡 Distributed Tracing with W3C TraceContext

Implement W3C TraceContext standard for end-to-end tracing.

  • OMS generates traceparent header; included in CloudEvent
  • Service Bus, Functions, Logic App forward traceparent header
  • All related logs linked by trace ID (searchable in App Insights)
  • Integration with Jaeger or similar APM tools
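Generating and propagating the header follows the W3C TraceContext format (`version-traceid-parentid-flags`). A minimal sketch in Python (helper names are illustrative):

```python
import os

def new_traceparent() -> str:
    """Generate a traceparent: version 00, random trace-id and parent-id, sampled flag 01."""
    return f"00-{os.urandom(16).hex()}-{os.urandom(8).hex()}-01"

def child_traceparent(parent: str) -> str:
    """Keep the trace-id from the incoming header; mint a new parent-id for this hop."""
    version, trace_id, _, flags = parent.split("-")
    return f"{version}-{trace_id}-{os.urandom(8).hex()}-{flags}"
```

OMS would call `new_traceparent()` once per order; each hop (Function, Logic App) forwards a child header so all logs share one trace ID.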
📈 Grafana Integration

Use Grafana for multi-cloud visualization and alerting.

  • Azure Monitor data source plugin
  • Custom panels for order throughput, latency, cost trends
  • Alert rules in Grafana (send to PagerDuty, Slack)

Phase 5 — Operational Excellence

🤖 GitHub Actions CI/CD Pipeline

Automated testing and deployment for Functions and Logic App.

  • Trigger: git push to main branch
  • Build: dotnet build + unit tests (xUnit)
  • Integration tests: Deploy to dev, run Newman tests against D365 sandbox
  • Deploy to UAT on successful tests; manual approval for Prod
🧪 Automated Integration Tests

Newman (Postman collections) for order pipeline testing.

  • Test suite: Create order → ingest → batch → delivery
  • Assert: Order appears in D365 F&O within SLO (5 hours)
  • Run nightly against dev environment; alert on failure
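The pipeline's status transitions (Pending, Processing, Processed, plus the final delivery update) form a small state machine that these tests can assert against. A sketch in Python, where the `Delivered` terminal state and the retry edge back to `Pending` are assumptions:

```python
# Illustrative status state machine for an order document in Cosmos DB.
VALID_TRANSITIONS = {
    "Pending": {"Processing"},
    "Processing": {"Processed", "Pending"},  # back to Pending on transform failure
    "Processed": {"Delivered"},
    "Delivered": set(),                      # terminal state
}

def can_transition(current: str, target: str) -> bool:
    """True if the pipeline is allowed to move an order from current to target."""
    return target in VALID_TRANSITIONS.get(current, set())
```

An integration test can then assert that each observed status change is a valid edge, catching orders that skip a stage.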
🔍 Infrastructure Drift Detection

Continuous compliance checking against IaC source of truth.

  • Policy-as-Code: Azure Policy enforces naming conventions, tagging, encryption settings
  • Automated remediation: non-compliant resources auto-corrected
  • Report: weekly email of drift incidents
⚔️ Chaos Engineering with Azure Chaos Studio

Proactive failure testing to improve resilience.

  • Fault injection: simulate Cosmos DB latency spikes (100ms → 500ms)
  • Service outages: temporarily disable Service Bus to test fallback logic
  • Measure: how long until automated recovery, and what is the impact on the SLA?
  • Frequency: Monthly chaos experiments in UAT
🚨 Automated Runbooks

Azure Automation runbooks for common remediation tasks.

  • DLQ Remediation: Auto-requeue messages after validation
  • Cosmos DB Scale-up: Auto-scale RU/s when throttling detected
  • Function App Cold Start: Pre-warm instances during peak hours
  • SAS Key Rotation: Auto-rotate Event Grid / Service Bus keys (90 days)