
Posts

Featured

Data Lakes are not just for squares

Columnar-only lakes are just another warehouse

Data lakes are intended to be a source of truth for their varied data. Some organizations restrict their lake to columnar data, violating one of the main precepts behind data lakes. They limit the lake to large data set transformations or automated analytics. This narrow definition leaves those companies with nowhere to store a significant subset of their total data pool.

Data Lakes are not restricted

Data lakes hold data in its original format to retain data fidelity. All data sets retain their original structure, data types, and raw data format.

Some enterprise data lakes make the data more usable by storing the same data in multiple formats: the original format and a more queryable, accessible format. This approach preserves the original data exactly while making it more accessible.
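
The two-copy idea can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular lake product: the `ingest_csv` function, the `raw` and `queryable` folder names, and the choice of SQLite as the queryable format are all assumptions made for the example. It writes the incoming bytes untouched, then builds a derived copy that can be queried with SQL.

```python
import csv
import hashlib
import io
import sqlite3
from pathlib import Path


def ingest_csv(raw_bytes: bytes, name: str, lake_root: Path) -> dict:
    """Store one CSV twice: the untouched original plus a queryable copy.

    Hypothetical helper; the folder layout and SQLite target are assumptions.
    """
    raw_dir = lake_root / "raw"
    query_dir = lake_root / "queryable"
    raw_dir.mkdir(parents=True, exist_ok=True)
    query_dir.mkdir(parents=True, exist_ok=True)

    # Copy 1: the original format, written byte for byte with no parsing.
    raw_path = raw_dir / f"{name}.csv"
    raw_path.write_bytes(raw_bytes)

    # Copy 2: a derived, queryable copy. Assumes UTF-8 text with a header row.
    text = raw_bytes.decode("utf-8")
    rows = list(csv.reader(io.StringIO(text, newline="")))
    header, data = rows[0], rows[1:]

    db_path = query_dir / f"{name}.sqlite"
    columns = ", ".join(f'"{col}"' for col in header)
    placeholders = ", ".join("?" for _ in header)
    conn = sqlite3.connect(db_path)
    with conn:  # commits the transaction on success
        conn.execute(f'DROP TABLE IF EXISTS "{name}"')
        conn.execute(f'CREATE TABLE "{name}" ({columns})')
        conn.executemany(f'INSERT INTO "{name}" VALUES ({placeholders})', data)
    conn.close()

    # The checksum lets anyone confirm later that the raw copy is unchanged.
    return {
        "raw_path": raw_path,
        "db_path": db_path,
        "sha256": hashlib.sha256(raw_bytes).hexdigest(),
    }
```

The raw copy remains the source of truth. The queryable copy is disposable and can be rebuilt from the raw file at any time, which is why the sketch drops and recreates the table on each ingest.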

Examples of multiple-copy, same-data storage include:
CSV and other data that is also stored in directly queryable formats li…

Latest Posts

Machine Intelligence Feature Flow

Can federal programs really be Agile when multiple firms are involved?

No hack required for Linux on Chromebooks with the Termina VM and containers or Virtualbox

Tether Kali Linux to iPhone over USB

Silent Disco style conference breakout sessions with AWS

A portable Program Increment (PI) planning wall

My first DEFCON Experience.

The DEFCON 26 Experience Day 0 Registration

Who shares in a company's success?

Caution: Feed an Open Source Project and it might become yours.