    Databricks open-sources declarative ETL framework powering 90% faster pipeline builds

    By Techurz, June 12, 2025

    Today, at its annual Data + AI Summit, Databricks announced that it is open-sourcing its core declarative ETL framework as Apache Spark Declarative Pipelines, making it available to the entire Apache Spark community in an upcoming release. 

    Databricks launched the framework as Delta Live Tables (DLT) in 2022 and has since expanded it to help teams build and operate reliable, scalable data pipelines end-to-end. The move to open-source it reinforces the company’s commitment to open ecosystems while marking an effort to one-up rival Snowflake, which recently launched its own Openflow service for data integration—a crucial component of data engineering. 

    Snowflake’s offering taps Apache NiFi to centralize any data from any source into its platform, while Databricks is making its in-house pipeline engineering technology open, allowing users to run it anywhere Apache Spark is supported — and not just on its own platform.

    Declare pipelines, let Spark handle the rest

    Traditionally, data engineering has been associated with three main pain points: complex pipeline authoring, manual operations overhead and the need to maintain separate systems for batch and streaming workloads. 

    With Spark Declarative Pipelines, engineers describe what their pipeline should do using SQL or Python, and Apache Spark handles the execution. The framework automatically tracks dependencies between tables, manages table creation and evolution and handles operational tasks like parallel execution, checkpoints, and retries in production.
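To make the idea concrete, here is a minimal sketch in plain Python of how a declarative pipeline might track dependencies. It is not the Spark Declarative Pipelines API. ToyPipeline, its dataset decorator, and the order-related dataset names are hypothetical, and treating a function's parameter names as its upstream datasets is an assumption made purely for illustration.

```python
import inspect
from graphlib import TopologicalSorter


class ToyPipeline:
    """Toy stand-in for a declarative pipeline; NOT the Spark API."""

    def __init__(self):
        self._datasets = {}  # dataset name -> function that computes it

    def dataset(self, fn):
        """Register fn as a dataset; its parameter names are its upstream datasets."""
        self._datasets[fn.__name__] = fn
        return fn

    def plan(self):
        """Return dataset names in an order that respects dependencies."""
        graph = {
            name: set(inspect.signature(fn).parameters)
            for name, fn in self._datasets.items()
        }
        return list(TopologicalSorter(graph).static_order())

    def run(self):
        """Compute every dataset once, feeding each one its upstream results."""
        results = {}
        for name in self.plan():
            fn = self._datasets[name]
            inputs = {p: results[p] for p in inspect.signature(fn).parameters}
            results[name] = fn(**inputs)
        return results


pipeline = ToyPipeline()


# Declared out of order on purpose: the planner works out the sequence.
@pipeline.dataset
def daily_totals(clean_orders):
    totals = {}
    for order in clean_orders:
        totals[order["day"]] = totals.get(order["day"], 0) + order["amount"]
    return totals


@pipeline.dataset
def clean_orders(raw_orders):
    return [o for o in raw_orders if o["amount"] > 0]


@pipeline.dataset
def raw_orders():
    return [
        {"day": "mon", "amount": 10},
        {"day": "mon", "amount": -1},  # invalid row, dropped downstream
        {"day": "tue", "amount": 5},
    ]


print(pipeline.plan())
print(pipeline.run()["daily_totals"])
```

The author of each dataset states only what it is built from. The execution order comes from the declarations rather than from hand-written orchestration code, which is the property the framework is described as providing at Spark scale.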

    “You declare a series of datasets and data flows, and Apache Spark figures out the right execution plan,” Michael Armbrust, distinguished software engineer at Databricks, said in an interview with VentureBeat. 

    The framework supports batch, streaming and semi-structured data out of the box, including files from object storage systems like Amazon S3, ADLS, or GCS. Engineers define both real-time and periodic processing through a single API, and pipeline definitions are validated before execution to catch issues early, so there is no need to maintain separate systems.
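Validating a definition before execution can be pictured with a small plain-Python sketch. This is an illustration of the general technique, not how Spark implements it. The validate function and the dictionary format (dataset name mapped to a list of upstream names) are assumptions introduced here.

```python
from graphlib import CycleError, TopologicalSorter


def validate(declared):
    """Check a {dataset: [upstream, ...]} mapping before anything runs.

    Returns a list of human-readable problems; an empty list means the
    definition passed these two checks.
    """
    problems = []
    # Check 1: every upstream reference must point at a declared dataset.
    for name, upstream in declared.items():
        for dep in upstream:
            if dep not in declared:
                problems.append(f"{name} reads undeclared dataset {dep}")
    # Check 2: the dependency graph must not contain a cycle.
    try:
        TopologicalSorter(declared).prepare()  # raises CycleError on a cycle
    except CycleError as err:
        cycle = " -> ".join(err.args[1])
        problems.append(f"cycle detected: {cycle}")
    return problems


good = {"raw": [], "clean": ["raw"], "report": ["clean"]}
typo = {"raw": [], "clean": ["rwa"]}   # misspelled upstream name
loop = {"a": ["b"], "b": ["a"]}        # datasets that depend on each other

for definition in (good, typo, loop):
    print(validate(definition))
```

Because both checks need only the declarations, a misspelled table name or a circular dependency surfaces before any data is read, not partway through a long-running job.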

    “It’s designed for the realities of modern data like change data feeds, message buses, and real-time analytics that power AI systems. If Apache Spark can process it (the data), these pipelines can handle it,” Armbrust explained. He added that the declarative approach marks the latest effort from Databricks to simplify Apache Spark.

    “First, we made distributed computing functional with RDDs (Resilient Distributed Datasets). Then we made query execution declarative with Spark SQL. We brought that same model to streaming with Structured Streaming and made cloud storage transactional with Delta Lake. Now, we’re taking the next leap of making end-to-end pipelines declarative,” he said.

    Proven at scale 

    While the declarative pipeline framework is set to be committed to the Spark codebase, its prowess is already known to thousands of enterprises that have used it as part of Databricks’ Lakeflow solution to handle workloads ranging from daily batch reporting to sub-second streaming applications.

    The benefits are similar across the board: teams spend far less time developing and maintaining pipelines and get better performance, latency, or cost, depending on what they choose to optimize for.

    Financial services company Block used the framework to cut development time by over 90%, while Navy Federal Credit Union reduced pipeline maintenance time by 99%. The Spark Structured Streaming engine, on which declarative pipelines are built, enables teams to tailor the pipelines for their specific latencies, down to real-time streaming.
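The trade-off between batch and low-latency processing can be sketched in plain Python. The example below is a conceptual illustration only and does not use Structured Streaming. The function names and the batch_size parameter are hypothetical, with batch_size standing in for whatever latency control a real engine exposes.

```python
from itertools import islice


def running_totals(events, totals=None):
    """Fold (key, amount) events into per-key totals. Shared by both modes."""
    totals = {} if totals is None else totals
    for key, amount in events:
        totals[key] = totals.get(key, 0) + amount
    return totals


def run_batch(events):
    """Periodic mode: process everything in one pass."""
    return running_totals(events)


def run_micro_batches(events, batch_size):
    """Streaming mode: yield updated totals after every micro-batch.

    A smaller batch_size gives fresher results at the cost of more
    per-batch overhead; it is a hypothetical knob for this sketch.
    """
    it = iter(events)
    totals = {}
    while True:
        chunk = list(islice(it, batch_size))
        if not chunk:
            return
        totals = running_totals(chunk, totals)
        yield dict(totals)  # snapshot so callers can keep each state


events = [("a", 1), ("b", 2), ("a", 3), ("b", 4), ("a", 5)]
snapshots = list(run_micro_batches(events, batch_size=2))

print(snapshots)
print(run_batch(events))
```

The transformation logic lives in one function, and only the driver around it changes. The last streaming snapshot matches the batch result, which is what makes it practical to pick a latency target without rewriting the pipeline.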

    “As an engineering manager, I love the fact that my engineers can focus on what matters most to the business,” said Jian Zhou, senior engineering manager at Navy Federal Credit Union. “It’s exciting to see this level of innovation now being open-sourced, making it accessible to even more teams.”

    Brad Turnbaugh, senior data engineer at 84.51°, noted the framework has “made it easier to support both batch and streaming without stitching together separate systems” while reducing the amount of code his team needs to manage.

    Different approach from Snowflake

    Snowflake, one of Databricks’ biggest rivals, also moved to address data challenges at its recent conference, debuting an ingestion service called Openflow. However, its approach differs from that of Databricks in scope.

    Openflow, built on Apache NiFi, focuses primarily on data integration and movement into Snowflake’s platform. Users still need to clean, transform and aggregate data once it arrives in Snowflake. Spark Declarative Pipelines, on the other hand, goes further, covering the path from source to usable data.

    “Spark Declarative Pipelines is built to empower users to spin up end-to-end data pipelines — focusing on the simplification of data transformation and the complex pipeline operations that underpin those transformations,” Armbrust said.

    The open-source nature of Spark Declarative Pipelines also differentiates it from proprietary solutions. Users don’t need to be Databricks customers to leverage the technology, aligning with the company’s history of contributing major projects like Delta Lake, MLflow and Unity Catalog to the open-source community.

    Availability timeline

    Apache Spark Declarative Pipelines will be committed to the Apache Spark codebase in an upcoming release. The exact timeline, however, remains unclear.

    “We’ve been excited about the prospect of open-sourcing our declarative pipeline framework since we launched it,” Armbrust said. “Over the last 3+ years, we’ve learned a lot about the patterns that work best and fixed the ones that needed some fine-tuning. Now it’s proven and ready to thrive in the open.”

    The open-source rollout also coincides with the general availability of Databricks Lakeflow Declarative Pipelines, the commercial version of the technology that includes additional enterprise features and support.

    Databricks Data + AI Summit runs from June 9 to 12, 2025
