Showing posts with label Data Mesh. Show all posts

Wednesday, October 16, 2024

Data Lakehouses & Apache Iceberg

Alex Merced (@AMdatalakehouse, Senior Tech Evangelist, @dremio) talks about everything data, and we dig deep into Apache Iceberg and Data Lakehouses.

SHOW: 865

Want to go to All Things Open in Raleigh for FREE? (Oct 27th-29th)

We are offering 5 free passes, first come, first served, for the Cloudcast Community -> Registration Link

Instructions:

  1. Click reg link
  2. Click “Get Tickets”
  3. Choose ticket option
  4. Proceed with registration (discount will automatically be applied, cost will be $0)

SHOW TRANSCRIPT: The Cloudcast #865 Transcript

SHOW VIDEO: https://youtube.com/@TheCloudcastNET 

CLOUD NEWS OF THE WEEK: - http://bit.ly/cloudcast-cnotw

NEW TO CLOUD? CHECK OUT OUR OTHER PODCAST: - "CLOUDCAST BASICS" 

SHOW NOTES:

Topic 1 - Welcome to the show. Tell us a little bit about your background.

Topic 2 - It’s been a little while since we talked about Data Lakehouses. Can you give us a little bit of background on this space, and what the most recent dynamics are around these technologies?

Topic 3 - What are the typical integrations with a Data Lakehouse? How are users/developers typically interacting with Data Lakehouse technologies? [The marketplace for Iceberg catalogs like Nessie and Polaris]

Topic 4 - How does an open data format like Apache Iceberg fit into the bigger picture of data lakehouses, or large scale stores of data?

Topic 5 - How does Dremio enable Iceberg? How does Dremio sit at the intersection of the Data Lakehouse, Data Mesh and Data Virtualization trends, all of which come from the same fundamental problem: the growing scale of data use cases?

Topic 6 - We’ve seen companies start to rethink their data-in-the-cloud strategies. Are you seeing on-premises making a comeback for large data applications?

FEEDBACK?

Wednesday, July 22, 2020

Introduction to Data Mesh

Zhamak Dehghani (@zhamakd, Portfolio Tech Director @ThoughtWorks) talks about the concepts behind Data Mesh, the challenges and problems of Data Lakes / Data Warehouses, and how Cloud-native principles can be applied to Data. 

SHOW: 459

SHOW SPONSOR LINKS:


CLOUD NEWS OF THE WEEK - http://bit.ly/cloudcast-cnotw

PodCTL Podcast is Back (Enterprise Kubernetes) - http://podctl.com

SHOW NOTES:

Topic 1 - Welcome to the show. We were introduced to you through the O’Reilly events, but you’ve been involved in software development and architecture for quite a while. Tell us a little bit about your background and your focus areas at ThoughtWorks.

Topic 2 - About a year ago, you introduced this new concept called “Data Mesh”. Before we get into that, give us a little bit of background on the problems that previous generations of Data Warehouses or Data Lakes created. 

Topic 3 - Let’s begin to walk through how Data Mesh is different from Data Lake. We’re not talking about just dumping all the various data sources into one “pool”; there’s a concept of “domains” within this big pool of data. What are the new concepts of source and consumption?

Topic 4 - Explain the concept of how pipelines are tied into Data Mesh and how this allows the creation of new products/features from the Data Mesh.

Topic 5 - You talk about the data being truthful, and then you bring the SRE concept of an SLO into the truthfulness of the data. How might that work?

Topic 6 - Once a Data Mesh is in place, what are the “roles” (or teams) that have specific tasks, and who are the typical consumers of the Data Mesh platform?


FEEDBACK?