Delta Sharing is the industry’s first open protocol for secure data sharing, making it simple to share data with other organizations regardless of which computing platforms they use. It lets you securely share your Delta Lake data with external recipients without having to copy it. Recipients can connect directly from Pandas, Spark, Rust, and other systems without first having to deploy a specific compute platform. Delta Sharing also includes mechanisms for data governance, tracking, and auditing.
Needs to be a Unity Catalog-enabled workspace (https://docs.databricks.com/en/data-governance/unity-catalog/enable-workspaces.html)
Needs to have a Unity Catalog metastore configured
Can share with a Databricks user via their Databricks sharing identifier, or create a token and allow open access (if enabled on the workspace)
Open sharing uses a token, and the data can be accessed from Databricks, Apache Spark, Pandas, Power BI, etc.
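A minimal sketch of the recipient side of open sharing, assuming the open-source `delta-sharing` Python connector is installed and the provider has sent a profile (credential) file containing the token. The file name `config.share` and the share, schema, and table names are hypothetical placeholders.

```python
def table_url(profile_path, share, schema, table):
    """Build the '<profile>#<share>.<schema>.<table>' address the connector expects."""
    return f"{profile_path}#{share}.{schema}.{table}"


def list_shared_tables(profile_path):
    """Return 'share.schema.table' names visible with this profile."""
    import delta_sharing  # lazy import: needs `pip install delta-sharing`

    client = delta_sharing.SharingClient(profile_path)
    return [f"{t.share}.{t.schema}.{t.name}" for t in client.list_all_tables()]


def load_shared_table(profile_path, share, schema, table):
    """Read one shared table into a pandas DataFrame."""
    import delta_sharing  # lazy import: needs `pip install delta-sharing`

    return delta_sharing.load_as_pandas(table_url(profile_path, share, schema, table))


# Hypothetical usage, once a real profile file is available:
#   df = load_shared_table("config.share", "sales_share", "retail", "orders")
```

The connector is imported inside the functions so that the address helper can be used and checked without the package installed.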
Databricks-to-Databricks (Dbr-Dbr): for sharing with users who don't have access to your Unity Catalog metastore; they must still have access to a Unity Catalog-enabled workspace. These users can be on AWS, GCP, or Azure. No token is required, and you can also share notebooks.
The provider needs the recipient's sharing identifier to set this up - it looks like you can only share with a user
A share is a read-only collection of tables and table partitions made available to one or more recipients
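The provider-side setup can be sketched as a handful of Databricks SQL statements. The helper below only builds the statement strings; it assumes they would then be run in a Databricks notebook (for example with `spark.sql`). The share, table, and recipient names and the `<sharing-identifier>` placeholder are hypothetical.

```python
def share_setup_statements(share, tables, recipient, sharing_identifier):
    """Build SQL to create a share, add tables, and grant it to a recipient.

    `tables` is a list of (table_name, partition_clause) pairs; use None as the
    partition clause to share the whole table.
    """
    statements = [f"CREATE SHARE IF NOT EXISTS {share}"]
    for table, partition in tables:
        stmt = f"ALTER SHARE {share} ADD TABLE {table}"
        if partition:
            # Share only the partitions matching this clause
            stmt += f" PARTITION ({partition})"
        statements.append(stmt)
    # Databricks-to-Databricks recipient, identified by their sharing identifier
    statements.append(
        f"CREATE RECIPIENT IF NOT EXISTS {recipient} USING ID '{sharing_identifier}'"
    )
    statements.append(f"GRANT SELECT ON SHARE {share} TO RECIPIENT {recipient}")
    return statements


# Hypothetical usage in a Databricks notebook, where `spark` is predefined:
#   for stmt in share_setup_statements(
#       "sales_share",
#       [("main.retail.orders", None), ("main.retail.events", "region = 'EU'")],
#       "partner_co",
#       "<sharing-identifier>",
#   ):
#       spark.sql(stmt)
```

Only `SELECT` is granted, which matches the read-only nature of a share.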
The data is not copied; rather, the recipient is given read-only access to the source data in the provider's account. When the data is actually read (or transferred), the server generates short-lived pre-signed URLs that let the client read the table's Parquet files directly from the cloud provider, so the transfer can happen in parallel at massive bandwidth without streaming through the sharing server.
Data providers can update the data in real time using Delta Lake transactions, and recipients will always see a consistent view
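A rough sketch of what a client does with those pre-signed URLs, assuming the sharing server's query response is newline-delimited JSON in which each data file appears as a `file` entry with a `url` field. The sample response and URLs below are made up for illustration; a real client would normally leave all of this to a connector.

```python
import json
from concurrent.futures import ThreadPoolExecutor
from urllib.request import urlopen


def presigned_urls(ndjson_body):
    """Extract the pre-signed file URLs from a newline-delimited JSON response."""
    urls = []
    for line in ndjson_body.splitlines():
        if not line.strip():
            continue
        entry = json.loads(line)
        if "file" in entry:  # skip the protocol and metaData lines
            urls.append(entry["file"]["url"])
    return urls


def _http_get(url):
    """Download one file straight from cloud storage, not via the sharing server."""
    with urlopen(url) as response:
        return response.read()


def download_all(urls, fetch=_http_get, max_workers=8):
    """Fetch all files in parallel; results come back in the same order as `urls`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch, urls))


# Illustrative response only; the URLs are placeholders, not real endpoints.
SAMPLE_RESPONSE = "\n".join([
    json.dumps({"protocol": {"minReaderVersion": 1}}),
    json.dumps({"metaData": {"id": "example-table", "partitionColumns": []}}),
    json.dumps({"file": {"url": "https://storage.example/part-0.parquet?sig=a", "id": "0", "size": 10}}),
    json.dumps({"file": {"url": "https://storage.example/part-1.parquet?sig=b", "id": "1", "size": 12}}),
])
```

Because the URLs are short-lived, a client would request them just before reading rather than caching them.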
Only tables and views in a Unity Catalog metastore can be shared
Only tables in Delta format can be shared
View sharing is only available for Databricks-to-Databricks sharing
There are limits on the number of files allowed in the metadata of a shared table (700K add files and 100K remove files in the Delta log) - see here for workarounds