time, building a medallion architecture in Microsoft Fabric meant stitching together a small orchestra of moving parts: notebooks for the transformations, pipelines for orchestration, schedules for refresh, custom code for data quality checks, and the Monitor Hub for keeping an eye on whether anything actually worked. Every layer worked – until something didn’t, and then you had to figure out which layer broke, why, and which downstream layers got affected along the way.
If you’ve ever tried to debug a silver layer that didn’t update because the bronze notebook failed three hours ago, you know exactly what I’m talking about.
Then, at FabCon Atlanta in March 2026, materialized lake views (MLVs) went generally available. And the story they’re telling is simple: what if your entire medallion pipeline could be a few SELECT statements?
Let me walk you through the whole thing – what they are, how they work, what changed between preview and GA, and where they fit (and where they don’t) in your architecture.
A materialized lake view is a persisted, automatically refreshed view defined in Spark SQL or PySpark. You write a SELECT query that describes the transformation you want, and Fabric takes care of execution, storage, refresh, dependency tracking, and data quality enforcement.
The result is stored as a Delta table in your lakehouse. So downstream consumers, such as Power BI Direct Lake, Spark notebooks, SQL endpoints, can query it just like any other Delta table. No special handling, no different syntax.
To put it in plain English: an MLV is nothing else but a SELECT statement that learned to materialize itself, manage its own dependencies, schedule its own refresh, and check its own data quality.

OK, that’s nice. But what does that actually replace?
That’s a fair question. Before MLVs, building a single bronze-to-silver-to-gold flow looked roughly like this: you’d write a notebook for each transformation, set up a Data Factory pipeline to call them in the right order, configure schedules, build custom validation logic, and then wire up the Monitor Hub to watch for failures. Five different surfaces, five different things to debug when something breaks.
With MLVs, all of that collapses into declarative SQL. You describe what you want. Fabric figures out the rest.
Every MLV moves through four stages. According to the Microsoft documentation, understanding them is the foundation for everything else:
Now let’s dive into each piece.
Here’s the full Spark SQL pseudo-code syntax for creating an MLV, straight from the Microsoft Learn reference:
CREATE [OR REPLACE] MATERIALIZED LAKE VIEW [IF NOT EXISTS]
[workspace.lakehouse.schema].MLV_Identifier
[(CONSTRAINT constraint_name CHECK (condition) [ON MISMATCH DROP | FAIL], ...)]
[PARTITIONED BY (col1, col2, ...)]
[COMMENT “description”]
[TBLPROPERTIES (”key1”=”val1”, ...)]
AS select_statement
A real example – cleaning order data joined from products and orders, with a data quality constraint and partitioning:
CREATE OR REPLACE MATERIALIZED LAKE VIEW silver.cleaned_order_data
(
CONSTRAINT valid_quantity CHECK (quantity > 0) ON MISMATCH DROP
)
PARTITIONED BY (category)
COMMENT “Cleaned order data joined from products and orders”
AS
SELECT
p.productID, p.productName, p.category,
o.orderDate, o.quantity, o.totalAmount
FROM bronze.products p
INNER JOIN bronze.orders o ON p.productID = o.productID
Two things worth flagging right away. First, MLV names are case-insensitive (MyView becomes myview). Second, all-uppercase schema names (like MYSCHEMA) aren’t supported, so use either mixed or lowercase.
You also need a schema-enabled lakehouse and Fabric Runtime 1.3 or higher. If your lakehouse doesn’t have schemas turned on, MLVs aren’t available, that’s the very first prerequisite.
Here’s where MLVs stop being clever and start being smart.
When source data changes, Fabric’s optimal refresh engine looks at every MLV in the lineage and asks a series of questions: Did anything actually change? Can I process just the changes? Or do I need to rebuild from scratch?
Three possible outcomes:

But, and this is important, incremental refresh isn’t free. It has prerequisites:
delta.enableChangeDataFeed=true).Without CDF enabled, optimal refresh can only choose between skip and full. With CDF on, the full incremental path opens up. Enabling CDF on your source tables has no measurable storage or performance impact for append-only workloads, so there’s very little reason not to turn it on:
ALTER TABLE bronze.orders SET TBLPROPERTIES (delta.enableChangeDataFeed = true);
ALTER TABLE bronze.products SET TBLPROPERTIES (delta.enableChangeDataFeed = true);
Can it get better than this? Actually, yes! And, this is where the GA story really begins.
MLVs were introduced in preview at Build 2025. Between then and GA in March 2026, Microsoft closed the most important gaps. Five major changes turned MLVs from “interesting” into “production-ready”:
Let me take them one at a time.
In preview, you could only refresh all MLVs in a lakehouse on a single schedule. Need finance to update hourly, but analytics to update every six hours? You had to work around it with notebooks, which broke dependency awareness, error reporting, and retry logic. Notebook-triggered refreshes don’t surface MLV error details. Failures appear only in cell output, and dependent views have no awareness of them. Errors can persist week after week without anyone knowing the pipeline is broken.
Now you can define named schedules within a single lakehouse, each targeting a specific subset of views. Finance pipeline hourly. Analytics every six hours. Marketing every 15 minutes. All in the same lakehouse, no custom code.

When a named schedule runs, Fabric still refreshes all upstream dependencies in the correct order, runs independent views in parallel, and surfaces errors centrally. If a run is already in progress when a schedule fires, the new run is skipped, and the next window proceeds as expected – so you don’t have to worry about overlapping runs stomping on each other.
Incremental refresh used to fall back to full quite often, because the list of “supported” SQL constructs was narrow. At GA, that list expanded significantly. MLVs now refresh incrementally when the definition includes:
COUNT and SUM with GROUP BYThat’s a meaningful change. Most real-world medallion pipelines I’ve worked on use exactly these patterns, and now they qualify for incremental processing without being rewritten. With optimal refresh, a built-in decision engine examines each refresh, evaluates the volume of changed data against the cost of a full recomputation, and automatically chooses the faster path.
I hear you, I hear you: Nikola, what happens if my query uses something the engine can’t handle incrementally? Don’t worry, it’s much easier than it sounds:) Using unsupported constructs doesn’t prevent you from creating the MLV. It only means that Fabric uses a full refresh instead of an incremental one. Optimal refresh automatically falls back to full when needed, so you don’t normally need to force it. If you do want to force one (for example, to reprocess data after a correction), there’s a one-liner for that:
REFRESH MATERIALIZED LAKE VIEW silver.cleaned_order_data FULL;
This one is huge! SQL is great until your transformation logic involves a custom Python library, an ML inference call, or a UDF that wraps complex business rules. Then you’d hit a wall as MLVs were SQL-only.
With PySpark authoring, you can now create, refresh, and replace MLVs from Fabric notebooks using PySpark and the familiar DataFrameWriter API. The fmlv module exposes a decorator-based pattern, documented in the official PySpark MLV reference:
import fmlv
from pyspark.sql import functions as F
@fmlv.materialized_lake_view(
name=”LH1.silver.customer_silver”,
comment=”Cleaned & enriched customer silver MLV”,
partition_cols=[”year”, “city”],
table_properties={”delta.enableChangeDataFeed”: “true”},
replace=True
)
@fmlv.check(name=”non_null_sales”, condition=”sales IS NOT NULL”, action=”DROP”)
def customer_silver():
df = spark.read.table(”bronze.customer_bronze”)
cleaned_df = df.filter(F.col(”sales”).isNotNull())
enriched_df = cleaned_df.withColumn(”sales_in_usd”, F.col(”sales”) * 1.0)
return enriched_df
A few PySpark gotchas worth knowing about:
@fmlv decorator doesn’t support dynamic parameters or variables. All parameters must be hardcoded.createOrReplaceTempView) – the engine can’t see session-scoped views. Use physical Delta tables or other MLVs as sources.So if your transformation can be expressed cleanly in SQL, SQL is still the better choice for performance. PySpark MLVs unlock the cases where SQL alone won’t do.
Business logic changes. A filter shifts. A join gains a column. An aggregation adds a metric. In preview, updating an MLV definition required dropping and recreating it, which lost refresh history and forced downstream consumers to reconnect.
Now, with the Replace capability, you update an MLV’s definition in place. Fabric validates the new logic, swaps it in, and preserves the view’s identity, metadata, and lineage. Downstream dependencies remain intact. Works for both SQL (CREATE OR REPLACE) and PySpark (replace=True).
This is one of those “under the radar” GA features that doesn’t get headlines but matters enormously day-to-day. If you’ve ever had to coordinate dropping and recreating a heavily consumed table while production is running, you know the pain. That goes away with this.
Data quality constraints are nothing new in MLVs, but at GA, they got a serious upgrade. You can now:
Combine that with the auto-generated data quality reports, and you get something close to a built-in data observability layer. You can quickly spot which rules fail most often, which views they affect, and how trends shift over time, without building a separate monitoring pipeline.
When one MLV references another (or a table), Fabric infers the relationship automatically. No manual configuration, no external orchestration tool. The dependencies are discovered from your SQL.
That dependency graph becomes a visual lineage in your lakehouse. Each node represents a transformation. Arrows show execution order. Fabric makes sure that when bronze data lands, the bronze-to-silver MLV runs first, then the silver-to-gold MLV runs against the freshly updated silver.

This is where the declarative approach really pays off. You’re not writing pipelines. You’re not defining orchestration. You’re writing what each layer should look like, and Fabric figures out the order. This is the beauty of a declarative approach:)
A few useful behaviors to know about:
I touched on this above, but it deserves its own section because it’s one of the things that makes MLVs feel different from a hand-built pipeline.
Every MLV can have one or more data quality constraints attached:
CREATE OR REPLACE MATERIALIZED LAKE VIEW silver.valid_orders
(
CONSTRAINT positive_quantity CHECK (quantity > 0) ON MISMATCH DROP,
CONSTRAINT valid_date CHECK (orderDate >= ‘2020-01-01’) ON MISMATCH FAIL
)
AS
SELECT * FROM bronze.orders
Two action types:
If multiple constraints are present and both behaviors are configured, FAIL takes precedence.
Violations surface in the lineage view and run details. Fine, but what does that actually look like in practice? Well, in the data quality report, you’ll see counts by constraint, by view, over time. So if a constraint that normally drops 0.1% of rows suddenly drops 15%, you’ll see the spike and know exactly which rule failed and which view it belongs to. That’s a quality signal you’d otherwise have to build by hand.
The Microsoft docs also note that the new expression-based constraints support built-in Spark/SQL functions like UPPER(), LOWER(), TRIM(), COALESCE(), INITCAP(), and DATE_FORMAT(), so your CHECK conditions can be richer than just simple comparisons.
MLVs are not a hammer for every nail. The Microsoft documentation is unusually direct about where they fit and where they don’t.

Use MLVs when you have:
Don’t use MLVs when:
I’ll add a personal note here. I’m currently deep in a Microsoft Fabric engagement where the silver layer design choice – Warehouse vs. Lakehouse with MLVs – is actively on the table. And what I keep coming back to is this: MLVs aren’t competing with the Warehouse the way some people frame it. They’re competing with the spaghetti of notebooks-and-pipelines that you’d otherwise build inside the Lakehouse. If your team is already SQL-fluent and your transformations live naturally in SELECT statements, the case for MLVs as the silver layer in a Lakehouse-based architecture is genuinely strong.
I’d be doing you a disservice if I painted MLVs as a silver bullet. They have meaningful limitations, some of which will matter to your architecture:
INSERT, UPDATE, or DELETE into an MLV. The data is whatever the SELECT produces.VERSION AS OF and TIMESTAMP AS OF aren’t allowed.createOrReplaceTempView() outputs aren’t visible to the MLV engine.None of these are showstoppers for most pipelines. But they’re worth knowing before you commit an architecture to MLVs and discover the limitation halfway through.
If you’ve been building medallion design patterns in Fabric using notebooks and pipelines, MLVs are worth a serious look. They collapse five surfaces into one declarative layer. The dependency management is automatic. The data quality is built in. The lineage is visible. And as of FabCon Atlanta, they’re production-ready.
The roadmap from Microsoft is clear: optimal refresh for PySpark-authored MLVs is coming, more SQL operators will become incremental-refresh-eligible, and deeper integration with other Fabric workloads is on the way. This is a milestone, not the finish line – and I’m curious to see how MLVs evolve over the next few quarters, especially around PySpark incremental refresh and any cross-lakehouse story Microsoft might tell.
Two takeaways I’d hold onto:
Thanks for reading!