Role Summary

Designs the analytical and operational data platforms that power downstream business and AI use cases. Specialty areas include lakehouse architectures, data contract programs, master-data management, and the governance models that keep data platforms reliable as they scale beyond the original architects.

Treats data platforms as products with consumers, not as infrastructure. Insists that producers own the contracts they publish. Pushes back on enterprise data warehouse projects that promise to “consolidate everything” without an enforcement mechanism, and on lakehouse projects that ignore governance.

Skills

  • Lakehouse architectures on Delta, Iceberg, and Hudi
  • Cloud data platforms (Snowflake, Databricks, BigQuery, Redshift, Synapse)
  • Data warehouse design and dimensional modeling
  • Data mesh evaluation and selective adoption
  • Data contract programs and producer-owned schema discipline
  • Schema-evolution discipline and backward/forward compatibility
  • Master-data management strategy and tooling selection
  • Reference-data governance for regulated industries
  • Data-quality frameworks (Great Expectations, Soda, Monte Carlo)
  • Data lineage tooling (DataHub, Atlan, OpenLineage, Collibra)
  • Data catalog design and metadata management
  • Data classification and sensitivity labeling
  • Privacy controls (PII tagging, masking, tokenization, redaction)
  • Data security architecture (RBAC, ABAC, row-level and column-level security)
  • Streaming data architecture for analytical and operational pipelines
  • CDC patterns (Debezium, native cloud CDC) for source-system replication
  • Real-time vs batch tradeoff analysis and architectural sequencing
  • Cross-cloud and cross-region data-replication patterns
  • Data-platform cost modeling and FinOps for analytical workloads
  • Architecture decision records and reference-architecture libraries for data

Capabilities & Focus Areas

  • Lakehouse and warehouse architecture aligned to client analytical use cases
  • Data-contract programs spanning producers and downstream consumers
  • Master-data and reference-data management approaches
  • Data-governance models including stewardship, lineage, and quality
  • Platform standards for ingestion, transformation, and serving
  • Architectural oversight on data engineering implementation work
  • Migration sequencing for data-platform consolidation programs

Typical Engagement Patterns

  • Four- to eight-week data-platform assessment and architecture engagements
  • Twelve- to twenty-four-week greenfield lakehouse implementation programs
  • Data-contract program design and rollout (eight to sixteen weeks)
  • Embedded architectural oversight on long-running data modernization programs
  • Master-data and reference-data architecture engagements for regulated industries

Outcomes Delivered

  • Data platforms that producers and consumers both trust
  • Data contracts that are versioned and enforced at ingestion rather than left aspirational
  • Lineage and quality observability surfacing issues before consumers find them
  • Migration programs from legacy warehouse to lakehouse without analytical regressions
  • Data governance models that keep working after the consulting engagement ends
