Shaping Big Data - Schema on Read or Schema on Write
Data lakes often face some of the same performance and security decisions as the data warehouses of years past. Teams need to decide whether the data in a lake is stored in producer formats, consumer formats, or a combination of the two. Storage is essentially unlimited, which means we may choose to store the data in multiple consumer-oriented formats. Compute is also essentially unlimited, so we may instead decide to apply view-style restrictions and access controls at read time.
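To make the read-time option concrete, here is a minimal Python sketch of a view-style restriction applied as rows are read. The role names, column names, and the `restricted_view` helper are all hypothetical and chosen only for illustration; a real lake would normally enforce this in its query engine or catalog rather than in application code.

```python
from typing import Dict, Iterable, Iterator, Set

# Hypothetical policy: which columns each role may see (illustrative names only).
ROLE_COLUMNS: Dict[str, Set[str]] = {
    "analyst": {"order_id", "region", "amount"},
    "support": {"order_id", "customer_email"},
}


def restricted_view(rows: Iterable[dict], role: str) -> Iterator[dict]:
    """Yield each row reduced to the columns the given role may see."""
    # An unknown role gets an empty column set, so its rows carry no fields.
    allowed = ROLE_COLUMNS.get(role, set())
    for row in rows:
        yield {key: value for key, value in row.items() if key in allowed}


# Data stored once, in the producer's format.
lake_rows = [
    {"order_id": 1, "region": "EU", "amount": 40.0, "customer_email": "a@example.com"},
    {"order_id": 2, "region": "US", "amount": 15.5, "customer_email": "b@example.com"},
]

analyst_rows = list(restricted_view(lake_rows, "analyst"))
support_rows = list(restricted_view(lake_rows, "support"))
```

The stored data is never copied per consumer here; each consumer's shape is computed on demand, which trades extra compute at read time for less duplicated storage.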
Video
Speaker's Notes
One option is to use the essentially unlimited compute to pull in data in its stored format and filter it as it streams to the client. The client schema is built as the data is read.
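A rough sketch of that schema-on-read idea, assuming the lake holds newline-delimited JSON records written in the producer's format. The field names, the `CLIENT_SCHEMA` mapping, and the `read_with_schema` helper are assumptions made for this example, not part of any particular product.

```python
import io
import json
from typing import Callable, Dict, Iterator, TextIO, Tuple

# Hypothetical client schema: output field -> (source field, type converter).
CLIENT_SCHEMA: Dict[str, Tuple[str, Callable]] = {
    "id": ("order_id", int),
    "region": ("region", str),
    "amount": ("amount", float),
}


def read_with_schema(
    stream: TextIO,
    schema: Dict[str, Tuple[str, Callable]],
    keep: Callable[[dict], bool],
) -> Iterator[dict]:
    """Filter raw records and shape them to the client schema while streaming."""
    for line in stream:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        raw = json.loads(line)
        if not keep(raw):
            continue  # filter applied at read time
        # Project and convert only the fields the client asked for.
        yield {out: convert(raw[src]) for out, (src, convert) in schema.items()}


# Stand-in for a raw file in the lake; values are stored as the producer wrote them.
raw_lake_file = io.StringIO(
    '{"order_id": "1", "region": "EU", "amount": "40.0", "extra": "x"}\n'
    '{"order_id": "2", "region": "US", "amount": "15.5"}\n'
    '{"order_id": "3", "region": "EU", "amount": "7"}\n'
)

eu_orders = list(
    read_with_schema(raw_lake_file, CLIENT_SCHEMA, lambda r: r["region"] == "EU")
)
```

Because the function is a generator, records are shaped one at a time as they stream through, so the client never has to hold the raw data set in memory or know the producer's layout.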
Created 5/2020