Snowflake Posts

The Elephant in the Data Lake and Snowflake

Let’s talk about the elephant in the data lake, Hadoop, and the constant evolution of technology. Hadoop (symbolized by an elephant) was created to handle massive amounts of raw data that were beyond the capabilities of existing database technologies. At its core, Hadoop is simply a distributed file system. There are no restrictions on the … Continue reading “The Elephant in the Data Lake and Snowflake”

Reduce Your DBAs’ Burden with Snowflake

Today’s DBA typically manages tens, hundreds, or even thousands of databases, often from multiple vendors, both on premises and in the cloud. These may include RDBMS, NoSQL DBMS, and/or Hadoop clusters. While management automation has made substantial strides enabling DBAs to handle larger workloads, the care and feeding of these databases is still … Continue reading “Reduce Your DBAs’ Burden with Snowflake”

Agile Cost Management in Snowflake – Part 1, Compute

Snowflake’s revolutionary architecture brings new opportunities and approaches to managing the cost of running its cloud data warehouse. To paraphrase Voltaire and Peter Parker’s uncle Ben Parker, “with great innovation comes great power; with great power comes new responsibilities”. The overall workload architecture is key to managing spend, which in turn requires new modes of … Continue reading “Agile Cost Management in Snowflake – Part 1, Compute”

Agile Cost Management in Snowflake – Part 2, Storage

In Agile Cost Management in Snowflake – Part 1, Compute, we discussed managing compute costs. Managing storage costs is much simpler, but it is still very important, as poorly managed storage will result in unexpected expenses. As with Compute, Snowflake’s revolutionary architecture requires a different approach and mindset to managing storage. In legacy DW RDBMS, … Continue reading “Agile Cost Management in Snowflake – Part 2, Storage”

Notes from Snow Summit Product Innovations Keynote

This morning’s keynote was full of upcoming and very exciting innovations. The biggest are: Going cross-region and cross-cloud. GCP coming, currently in preview. Cross-region and cross-cloud bidirectional replication. Transparent failover. Consolidated billing and management of all related accounts. Load data from Google Cloud Storage as stage, prior to full implementation of Snowflake … Continue reading “Notes from Snow Summit Product Innovations Keynote”

Snowflake Micro-partition vs Legacy Macro-partition Pruning

I have been in the data business through several RDBMS generations and have seen many attempts at comparing performance between competing vendors. To say those comparisons should be taken with a grain of salt is an understatement. The resulting salt consumption would not be good for anybody’s health. The Transaction Processing Performance Council (TPC) performance benchmarks … Continue reading “Snowflake Micro-partition vs Legacy Macro-partition Pruning”

Snowflake and ELVT vs [ELT, ETL]

Over several generations of RDBMS technologies, I have learned that common practices, knowledge, and attitudes become obsolete. Query planning and optimization are continually evolving. This evolution, combined with Snowflake’s revolutionary performance capabilities and data model agnosticism, is leading many database practitioners to a new architectural pattern. A key element of this new pattern is what … Continue reading “Snowflake and ELVT vs [ELT, ETL]”