Databricks
Description
Databricks combines a Data Lakehouse with Generative AI into a Data Intelligence Platform.
History
1980 - Data warehouse | Collects and stores structured data to support refined analysis and reporting. |
2000 - Data lake | Collects and stores raw data for exploratory analysis. |
2021 - Data lakehouse | Unified platform that combines the benefits of data lakes and data warehouses. |
Aspect | Data Warehouse | Data Lake | Data Lakehouse |
---|---|---|---|
Data Type | Structured, processed, and refined data | Raw data: structured, semi-structured, and unstructured | Combines raw and processed data |
Schema | Schema-on-write: Data is structured before storage | Schema-on-read: Structure applied when accessed | Flexible: Schema-on-read for raw data; schema-on-write for structured data |
Purpose | Optimized for business intelligence (BI), reporting, and predefined analytics | Designed for big data analytics, machine learning, and exploratory analysis | Unified analytics platform for BI, AI/ML, streaming, and real-time analytics |
Processing Approach | ETL: Data is cleaned and transformed before storage | ELT: Data is loaded first and transformed as needed | Both ETL and ELT; enables real-time processing |
Scalability | Less scalable and more expensive to scale | Highly scalable and cost-effective for large volumes of diverse data | Combines scalability of lakes with performance optimization of warehouses |
Users | Business analysts and decision-makers | Data scientists, engineers, and analysts | BI teams, data scientists, engineers |
Accessibility | More rigid; changes to structure are complex | Flexible; easy to update and adapt | Highly adaptable; supports schema evolution |
Security & Maturity | Mature security measures; better suited for sensitive data | Security measures evolving; risk of "data swamp" if not managed properly | Strong governance with ACID transactions; improved reliability |
Use Cases | Operational reporting, dashboards, KPIs | Predictive analytics, AI/ML models, real-time analytics | Unified platform for BI dashboards, AI/ML workflows, streaming analytics |
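
The Schema row of the table above can be illustrated with a small sketch. It is not Databricks code: it uses only the Python standard library, and the `SCHEMA` dictionary, the function names, and the sample records are hypothetical, chosen only to show the difference between enforcing structure before storage (schema-on-write) and applying it when the data is accessed (schema-on-read).

```python
import json

# Hypothetical schema for illustration: field name -> type constructor.
SCHEMA = {"user_id": int, "amount": float}

def apply_schema(record: dict) -> dict:
    """Coerce a raw record to SCHEMA; raises on missing or invalid fields."""
    return {field: cast(record[field]) for field, cast in SCHEMA.items()}

def write_schema_on_write(store: list, raw_line: str) -> None:
    """Warehouse style: structure is enforced before the record is stored."""
    store.append(apply_schema(json.loads(raw_line)))

def write_schema_on_read(store: list, raw_line: str) -> None:
    """Lake style: the raw line is stored untouched."""
    store.append(raw_line)

def read_schema_on_read(store: list) -> list:
    """Lake style: structure is applied only when the data is accessed."""
    rows = []
    for raw_line in store:
        try:
            rows.append(apply_schema(json.loads(raw_line)))
        except (KeyError, ValueError, TypeError):
            continue  # malformed records are skipped at read time
    return rows

raw_lines = [
    '{"user_id": "1", "amount": "9.5"}',
    '{"user_id": "2"}',  # missing "amount"
]

warehouse, lake = [], []
rejected = 0
for line in raw_lines:
    write_schema_on_read(lake, line)  # always succeeds
    try:
        write_schema_on_write(warehouse, line)
    except (KeyError, ValueError, TypeError):
        rejected += 1  # bad records never reach the warehouse

clean_rows = read_schema_on_read(lake)
```

In this sketch the warehouse-style store rejects the incomplete record at write time, while the lake-style store keeps both raw lines and only discovers the problem when the data is read. A lakehouse, as described in the table, aims to support both behaviors on the same storage.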