Why You Should Keep SQL at the Center of Your Analytics Workflow
Over the years, many tools have emerged that abstract away SQL-style analytics tasks and move them outside the database. Some of the most common include:
Tidyverse – R
Pandas – Python
Polars – Python
Excel
Tableau
Power BI
KNIME
These tools are incredibly useful—I especially love Python and R!
But here’s a word of caution:
The farther you move away from SQL and push your data transformations outside the database, the more likely you are to encounter challenges such as:
Vendor lock-in
Slower performance
Less accurate or inconsistent reporting
Longer analytics development cycles
Difficulties in auditing and troubleshooting
A Better Mindset: SQL-First
What I recommend is adopting a SQL-first mindset—do as much work as possible in the database, then use external tools for visualization, modeling, or niche tasks.
For example:
In Tableau dashboards, I recommend building each component from a dedicated SQL view. All views should pull from a central, regularly refreshed detail table in your database.
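As a rough sketch of what that looks like, here is a hypothetical dedicated view; the table and column names (sales_detail, order_date, and so on) are placeholders, and DATE_TRUNC is Postgres/Snowflake-style syntax that other dialects spell differently:

```sql
-- Hypothetical dedicated view for one dashboard component.
-- sales_detail is the central, regularly refreshed detail table;
-- all names here are illustrative placeholders.
CREATE OR REPLACE VIEW vw_monthly_sales_by_region AS
SELECT
    region,
    DATE_TRUNC('month', order_date) AS order_month,
    SUM(order_amount)               AS total_sales,
    COUNT(DISTINCT order_id)        AS order_count
FROM sales_detail
GROUP BY
    region,
    DATE_TRUNC('month', order_date);
```

Tableau then connects to vw_monthly_sales_by_region directly, so the aggregation logic lives in the database, where it can be versioned, audited, and reused by other dashboards.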
In Python, if I need to run a machine learning model with scikit-learn, I pull only the necessary data from a SQL view: just the features and records I need. I don't dump SELECT * FROM table into multiple DataFrames, join them in RAM, and aggregate them locally. Even though Polars makes this kind of in-memory processing much faster than Pandas today, it's still better to offload joins and aggregations to the database.
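Here is a minimal sketch of how that might be set up, again with invented table, column, and view names: the join and aggregation are defined once as a view in the database, and Python only reads the finished feature set.

```sql
-- Hypothetical feature view: the join and aggregation happen in the
-- database, so Python receives only model-ready rows and columns.
CREATE OR REPLACE VIEW vw_churn_features AS
SELECT
    c.customer_id,
    c.tenure_months,
    c.is_churned,
    COALESCE(SUM(o.order_amount), 0) AS total_spend,
    COUNT(o.order_id)                AS order_count
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.customer_id
GROUP BY
    c.customer_id,
    c.tenure_months,
    c.is_churned;
```

From Python, a single pandas.read_sql("SELECT customer_id, tenure_months, total_spend, order_count, is_churned FROM vw_churn_features", engine) call then hands scikit-learn exactly the rows and columns it needs, with no local joins or aggregations.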
Final Thoughts
SQL isn't just a legacy skill—it's the foundation of reliable, scalable, and maintainable data systems. Treat your database as the engine, and let everything else be the interface.
Thanks for reading.