Administration & Architecture
Explore discussions on Databricks administration, deployment strategies, and architectural best practices.
Has anyone else run into a situation where a breaking schema change on a SQL Server source table leaves their Lakeflow Connect pipeline in a state it can't recover from, even after destroying and recreating the pipeline?

Here's what happened to us:
- ...
Hi @lrm_data, yes, this one catches a lot of people. A few things to check on the SQL Server side that commonly block recovery even after destroy + recreate:

Stale lakeflow_* capture instance. SQL Server allows only 2 capture instances per table. If bo...
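As a rough sketch of the check above: the T-SQL below uses SQL Server's standard CDC procedures to list capture instances and drop a stale one. The schema, table, and instance names are placeholders, and the commented `pyodbc` call assumes you have a working DSN to the source database.

```python
# Diagnostic sketch for stale CDC capture instances on the SQL Server source.
# The procedures are standard SQL Server CDC; names below are placeholders.

LIST_CAPTURE_INSTANCES = """
EXEC sys.sp_cdc_help_change_data_capture;  -- lists all capture instances per table
"""

# Drop a stale capture instance so a new pipeline can create a fresh one
# (schema/table/instance names are hypothetical examples):
DROP_STALE_INSTANCE = """
EXEC sys.sp_cdc_disable_table
    @source_schema    = N'dbo',
    @source_name      = N'orders',
    @capture_instance = N'lakeflow_dbo_orders_1';
"""

if __name__ == "__main__":
    # To actually run these against the source database:
    # import pyodbc
    # conn = pyodbc.connect("DSN=sqlserver-source")   # placeholder DSN
    # for row in conn.execute(LIST_CAPTURE_INSTANCES):
    #     print(row)
    print(LIST_CAPTURE_INSTANCES.strip())
```

If the listing shows two `lakeflow_*` instances on the affected table, dropping the older one usually clears the way for the recreated pipeline to register its own.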
We need to import a large amount of Jira data into Databricks, and we should import only the delta changes. What's the best approach: using the Fivetran Jira connector, or developing our own Python scripts/pipeline code? Thanks.
Hi @greengil, good question. I went through something similar recently, so I'm sharing what I found.

My instinct was also to build it in Python, but once I dug in, the "just write a script" path hides a lot of pain:

Deletions are invisible. Jira's RES...
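To make the "hand-rolled" side concrete, here is a minimal sketch of what an incremental pull has to manage: a JQL `updated >=` watermark and offset pagination. The function names are hypothetical and `fetch_page` is injected, so nothing here actually calls Jira; note also what this approach cannot see, namely issues deleted since the last sync.

```python
from datetime import datetime

# Hypothetical sketch of hand-rolled incremental Jira sync logic.
# fetch_page stands in for a real Jira search call; the JQL watermark and
# offset pagination mirror the general shape of Jira's search API.

def build_jql(last_sync: datetime) -> str:
    # JQL compares timestamps at minute precision: "yyyy-MM-dd HH:mm"
    stamp = last_sync.strftime("%Y-%m-%d %H:%M")
    return f'updated >= "{stamp}" ORDER BY updated ASC'

def pull_updates(last_sync: datetime, fetch_page, page_size: int = 100):
    """Collect all issues updated since last_sync, page by page."""
    jql = build_jql(last_sync)
    issues, start = [], 0
    while True:
        page = fetch_page(jql, start_at=start, max_results=page_size)
        issues.extend(page)
        if len(page) < page_size:   # short page => no more results
            return issues
        start += page_size
```

Even this toy version has to get the watermark format and the pagination termination right; deletion tracking, rate limits, and schema drift are all extra work on top, which is the trade-off against a managed connector.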
I have a Databricks workspace on AWS (serverless compute). I created a network policy with "Allow access to all destinations" enabled and attached it to my workspace. When I run a Python notebook and try to make an HTTP request or curl to any externa...
Most likely the egress policy change hasn't actually taken effect on the serverless compute that's running your notebook. Check these things in order:

Verify the network policy itself (Account Console → Security → Networking → Context-based ingress ...
Hey all! We are trying out the Beta connector for SharePoint and found that the connector will not work at the root-level site. Is there a reason for this limitation? It is unfortunately a hard blocker for us to use the native connector. MUST_START...
We are integrating Databricks with ServiceNow via Lakeflow Connect for data ingestion, and are looking for guidance on enforcing integration-user based data access.

Observed behaviour: U2M OAuth authentication succeeds when ServiceNow access is granted to the works...
Hi, looking through some internal resources, this seems most likely to be down to ServiceNow-side ACLs, High Security Settings, or domain/scope restrictions overriding the admin role on the system tables the connector queries.

Quick things to check:
- Run t...
Hello, I have been generating a Databricks personal access token in my YAML-based CI pipeline using a bash script. The pipeline installs the Databricks CLI and then creates a token using Service Principal (Azure AD application) credentials.

Current ...
Hi, I'm pretty sure what you're hitting is stricter auth detection in the newer CLI/SDK. Your error shows azure_tenant_id, client_id, and client_secret all populated, so it's seeing more than one credential type and refusing to guess between them. Th...
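A sketch of how the CI script can disambiguate, assuming the standard Databricks CLI unified-auth environment variables; the host URL is a placeholder, and the `$AZURE_*` variables are assumed to be your pipeline's existing secrets:

```shell
# Remove competing credentials so the CLI sees exactly one auth type:
unset DATABRICKS_TOKEN

# Pin the auth type explicitly instead of letting the CLI guess:
export DATABRICKS_AUTH_TYPE="azure-client-secret"
export DATABRICKS_HOST="https://adb-1234567890123456.7.azuredatabricks.net"  # placeholder
export ARM_TENANT_ID="${AZURE_TENANT_ID:-}"
export ARM_CLIENT_ID="${AZURE_CLIENT_ID:-}"
export ARM_CLIENT_SECRET="${AZURE_CLIENT_SECRET:-}"

# Then the token-creation call (commented out; requires a real workspace):
# databricks tokens create --lifetime-seconds 3600
```

Setting `DATABRICKS_AUTH_TYPE` explicitly is the key step: once only one credential type is both populated and selected, the newer CLI stops refusing to choose.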
Hi, I was trying to make a dashboard using AI Genie. It worked well for the basics, but it was unable to perform some modifications that were obviously doable via the UI. Per Genie's response below, it only knows Databricks documentation up to Ap...
Hi, just wanted to add to what others have said and make sure it's clear: I think you're trying to use a Genie Space to create your dashboard, but Genie code is probably a better tool here. If you have visuals created in a Genie Space that you would l...
Is there any documentation available around the changeset size thresholds for materialized view incremental refreshes? Are these configurable at all? Are they constant or do the thresholds change depending on the number of rows/size of the material...
Hi, on top of Pradeep's reply, which I'd recommend trying, I'd also suggest raising a support ticket for this. They may be able to tweak the settings in the backend (not guaranteed), but it could help. Thanks, Emma
I don't want users using serverless interactive compute for their jobs. How do I disable it for everyone, or for specific users?
Btw, I just realized that at least with VNet-injected workspaces you can probably prevent any sensible serverless usage by not granting permissions and a network route to the needed resources. At least in Azure Databricks, notebooks need access to Databr...
Has anyone else seen full refresh snapshots trigger outside of their configured refresh window in Lakeflow Connect?

Here's our situation:
- We have a full refresh window configured to restrict snapshot operations to off-hours
- On at least one occasion,...
@lrm_data It is very unlikely for the refresh to be triggered outside the configured window, though I would still suggest checking the configured window and the auto full refresh policy once to be sure. If it still persists, you may raise a support...
How do I connect Claude Desktop to the Databricks connector (available in connectors)? What are the steps involved? Can anyone provide a detailed step-by-step implementation so that I can query the data using Claude Desktop, please?
@ShivaPolusani This is achieved via mcp-remote, using a few arguments such as the workspace link and a token. Check the documentation here: https://docs.databricks.com/aws/en/generative-ai/mcp/connect-external-services?language=Claude+Desktop#pat-examples

Video Li...
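As a sketch of what the linked docs walk through, a Claude Desktop `claude_desktop_config.json` entry using `mcp-remote` generally has this shape; the workspace URL, MCP server path, and token below are placeholders, so follow the linked documentation for the exact server URL for your workspace:

```json
{
  "mcpServers": {
    "databricks": {
      "command": "npx",
      "args": [
        "-y",
        "mcp-remote",
        "https://<your-workspace-url>/api/2.0/mcp/<server-path>",
        "--header",
        "Authorization: Bearer <your-personal-access-token>"
      ]
    }
  }
}
```

After saving the config and restarting Claude Desktop, the Databricks server should appear as an available tool.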
Issue Description: I am attempting to disable public network access on the Azure Databricks managed storage account. However, I am encountering the following error:

"Failed to save resource settings — access is denied due to a deny assignment created by..."
@MyProfile This should be helpful; take a look: https://learn.microsoft.com/en-us/answers/questions/1707749/managed-storage-accounts-compliance
Has anyone attempted to truncate a Delta Live gold-level table that gets populated via a pipeline, and then tried to repopulate it by starting the pipeline? I have a situation wherein I need to reprocess all the data in my gold table, so I stopped the ...
My blog on this: https://medium.com/@singh.sanjiv/truncate-and-load-streaming-live-table-8f840eb424d1
Hi, we're exploring replacing one of the use cases we are running in our cloud provider with Databricks pipelines. We have explored the possibility of subscribing to an Event Hub using SDP pipelines, feeding our IoT data into a Delta table where...
Hi @leopold_cudzik, The pattern you are suggesting is feasible, but it’s much easier to manage if you separate history ingestion from the 7-day serving view instead of cleaning the streaming sink table in place. A common architecture on Databricks wo...
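To make the split concrete, here is a minimal sketch of the serving-side logic under that architecture, with plain Python stand-ins for the Delta tables; the names and the 7-day constant are assumptions taken from the post:

```python
from datetime import datetime, timedelta

# Sketch of the "append-only history table + rolling 7-day serving view" split.
# history_rows stands in for the history Delta table; the serving view is just
# a time filter over it, so nothing is ever deleted from the streaming sink.

RETENTION = timedelta(days=7)

def serving_view(history_rows, now: datetime):
    """Rows a 7-day serving view would expose; history stays untouched."""
    cutoff = now - RETENTION
    return [r for r in history_rows if r["event_time"] >= cutoff]
```

In a real pipeline the same filter would typically live in a view or materialized view over the history table (e.g. a `WHERE event_time >= current_timestamp() - INTERVAL 7 DAYS` predicate), so retention becomes a query-time concern rather than destructive deletes against the streaming sink.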
Queries that previously worked started failing in SQL Warehouse (Dashboards) without any changes on our side. The query succeeds, but fails to render results with the error:

"Cannot read properties of undefined (reading 'data')"

This happens with:
- system.b...
Same problem here. I previously reported this issue, and it was resolved at the time. However, the problem has now reoccurred.

When ingesting large tables (over 100k rows), the system is unable to properly render the data, preventing the tab...