🔗 Useful Links

Recommended external resources, tools, and platforms

💡 Tip

Bookmark this page! It gathers all the useful links for your Data Engineering journey.


🛠️ Development Tools

IDEs & Editors

Tool | Usage | Free | Link
VS Code | Versatile IDE (recommended) | | code.visualstudio.com
PyCharm | Python IDE | ✅ Community | jetbrains.com/pycharm
IntelliJ IDEA | Scala/Java IDE | ✅ Community | jetbrains.com/idea
DataGrip | Powerful SQL IDE | 30-day trial | jetbrains.com/datagrip
DBeaver | Universal SQL client | | dbeaver.io
Postman | REST API testing | | postman.com

Analytical Databases

Tool | Description | Free | Link
DuckDB | In-process analytical SQL database | | duckdb.org
SQLite | Embedded SQL database | | sqlite.org
ClickHouse | Real-time OLAP | | clickhouse.com

Dashboards & Visualization

Tool | Description | Free | Link
Streamlit | Python dashboards (recommended) | | streamlit.io
Gradio | Quick ML UI | | gradio.app
Apache Superset | Open-source BI | | superset.apache.org
Metabase | Simple BI | | metabase.com
Plotly Dash | Interactive dashboards | | dash.plotly.com

Terminal & Shell

Tool | Description | OS | Link
iTerm2 | Advanced terminal | macOS | iterm2.com
Windows Terminal | Modern terminal | Windows | Microsoft Store
Oh My Zsh | Zsh framework | All | ohmyz.sh
Starship | Fast, good-looking prompt | All | starship.rs
tmux | Session multiplexer | All | github.com/tmux
fzf | Fuzzy finder | All | github.com/junegunn/fzf

Git & Collaboration

Tool | Description | Link
GitHub | Git hosting + CI/CD | github.com
GitLab | Self-hosted alternative | gitlab.com
GitKraken | Visual Git client | gitkraken.com
Sourcetree | Atlassian Git client | sourcetreeapp.com
lazygit | Fast Git TUI | github.com/jesseduffield/lazygit

📊 Datasets for Practice

Bootcamp Datasets

Dataset | Used in | Link
Video Game Sales | 🎮 Beginner Project | kaggle.com/rush4ratio
Olist E-commerce | 📦 Intermediate Project | kaggle.com/olistbr

Popular Datasets

Dataset | Description | Size | Link
NYC Taxi | New York taxi trips | ~100 GB | nyc.gov/tlc
Spotify Dataset | Tracks & audio features | ~1 GB | kaggle.com/spotify
Stack Overflow | Developer Q&A | ~50 GB | archive.org/stackoverflow
Wikipedia | Full dumps | ~100 GB | dumps.wikimedia.org
Common Crawl | Web crawl | Petabytes | commoncrawl.org

Open Data

Source | Description | Link
data.gouv.fr | French open data | data.gouv.fr
data.europa.eu | EU open data | data.europa.eu
NYC Open Data | New York data | opendata.cityofnewyork.us
World Bank | Global economic data | data.worldbank.org
Awesome Public Datasets | Curated GitHub list | github.com/awesomedata

☁️ Cloud & Platforms

Cloud Providers

Provider | Console | Free Tier | Data Services
AWS | console.aws.amazon.com | 12 months | S3, EMR, Glue, Redshift, Kinesis
Google Cloud | console.cloud.google.com | $300 credits | GCS, Dataproc, BigQuery, Pub/Sub
Azure | portal.azure.com | $200 credits | Blob, Synapse, Data Factory

Data Platforms

Platform | Type | Free Tier | Link
Databricks | Lakehouse | Community Edition ✅ | databricks.com
Snowflake | Data Warehouse | 30-day trial | snowflake.com
Confluent Cloud | Managed Kafka | $400 credits | confluent.cloud
dbt Cloud | Transformation | Free Developer ✅ | getdbt.com
Fivetran | Ingestion | 14-day trial | fivetran.com
Airbyte | Ingestion | Open Source ✅ Self-hosted | airbyte.com

Registries & Hubs

Tool | Description | Link
Docker Hub | Docker images | hub.docker.com
Artifact Hub | Helm charts | artifacthub.io
OperatorHub | Kubernetes operators | operatorhub.io
PyPI | Python packages | pypi.org

🧪 Playgrounds & Labs

SQL

Tool | Engines | Link
DB Fiddle | PostgreSQL, MySQL, SQLite | db-fiddle.com
SQL Fiddle | Multi-DB | sqlfiddle.com
DuckDB Shell | DuckDB online | shell.duckdb.org
Mode SQL Tutorial | Interactive course | mode.com/sql-tutorial
SQLZoo | Progressive tutorials | sqlzoo.net

Python & Notebooks

Tool | Description | Link
Google Colab | Free notebooks + GPU | colab.research.google.com
Databricks Community | Free Spark | community.cloud.databricks.com
Kaggle Notebooks | Datasets + compute | kaggle.com/code
Deepnote | Collaborative notebooks | deepnote.com
Replit | Online IDE | replit.com

Kubernetes

Tool | Description | Link
Killercoda | Interactive K8s labs | killercoda.com
Play with Kubernetes | Ephemeral cluster | labs.play-with-k8s.com
Play with Docker | Docker online | labs.play-with-docker.com

📚 Learning

Free Courses

Platform | Content | Certification | Link
dbt Learn | dbt fundamentals | ✅ Free | courses.getdbt.com
Databricks Academy | Spark, Delta Lake | ✅ Some | databricks.com/learn
Confluent Developer | Kafka, streaming | ✅ Some | developer.confluent.io
Data Talks Club | DE Zoomcamp | ✅ Free | datatalks.club
freeCodeCamp | Data Analysis with Python | ✅ Free | freecodecamp.org

Paid Courses

Platform | Focus | Link
Coursera | Cloud certifications | coursera.org
DataCamp | Interactive data skills | datacamp.com
Udemy | Varied courses | udemy.com
Pluralsight | Enterprise tech | pluralsight.com
O’Reilly | Books + videos | oreilly.com

YouTube Channels

Channel | Focus | Link
Databricks | Spark, Delta Lake, Lakehouse | youtube.com/@Databricks
Confluent | Kafka, Event Streaming | youtube.com/@Confluent
Seattle Data Guy | DE careers, tutorials | youtube.com/@SeattleDataGuy
Data with Zach | DE concepts | youtube.com/@datawithzach
Andreas Kretz | Data architectures | youtube.com/@andreaskayy
TechWorld with Nana | DevOps, K8s | youtube.com/@TechWorldwithNana

📰 Blogs & Newsletters

Newsletters (worth subscribing to)

Newsletter | Frequency | Link
Data Engineering Weekly | Weekly | dataengineeringweekly.com
Seattle Data Guy | Weekly | seattledataguy.substack.com
ByteByteGo | Weekly | bytebytego.com
Data Council | Monthly | datacouncil.ai
Last Week in AWS | Weekly | lastweekinaws.com
TLDR | Daily | tldr.tech

Technical Blogs

Blog | Focus | Link
Databricks Blog | Spark, Delta, Lakehouse | databricks.com/blog
Confluent Blog | Kafka, Streaming | confluent.io/blog
dbt Blog | Analytics Engineering | getdbt.com/blog
DuckDB Blog | In-process analytics | duckdb.org/news
Start Data Engineering | Hands-on tutorials | startdataengineering.com
Martin Fowler | Architecture | martinfowler.com

Company Tech Blogs

Blog | Scale | Link
Netflix Tech Blog | Streaming, ML | netflixtechblog.com
Uber Engineering | Data at scale | uber.com/blog/engineering
Airbnb Tech | Data platform | medium.com/airbnb-engineering
Spotify Engineering | Data pipelines | engineering.atspotify.com
LinkedIn Engineering | Big Data | engineering.linkedin.com

👥 Communities

Forums & Slack

Community | Platform | Link
r/dataengineering | Reddit | reddit.com/r/dataengineering
dbt Community | Slack (~50k) | community.getdbt.com
Data Talks Club | Slack (~40k) | datatalks.club/slack
Apache Slack | Slack | the-asf.slack.com
MLOps Community | Slack | mlops.community
Locally Optimistic | Slack | locallyoptimistic.com

Conferences

Event | Focus | Link
Data + AI Summit | Databricks, Spark | databricks.com/dataaisummit
Kafka Summit | Streaming | kafka-summit.org
dbt Coalesce | Analytics Engineering | coalesce.getdbt.com
Data Council | Data Engineering | datacouncil.ai
QCon | Software Architecture | qconferences.com

🎯 Interview Preparation

Coding Practice

Site | Focus | Level | Link
LeetCode | Algo + SQL | All | leetcode.com
DataLemur | SQL for DE | 🟦🟩 | datalemur.com
StrataScratch | Realistic SQL | 🟩🟥 | stratascratch.com
HackerRank | Varied challenges | All | hackerrank.com
SQLPad | Interactive SQL | 🟦 | sqlpad.io

System Design

Resource | Type | Link
System Design Primer | Free, on GitHub | github.com/donnemartin
ByteByteGo | Newsletter + book | bytebytego.com
Grokking System Design | Paid course | educative.io
Designing Data-Intensive Apps | Book | O’Reilly

Career Resources

Resource | Type | Link
levels.fyi | Tech salaries | levels.fyi
Glassdoor | Company reviews | glassdoor.com
Blind | Anonymous discussions | teamblind.com

🔖 Quick Bookmarks

📌 Essential links to bookmark
📁 Data Engineering Bookmarks/
│
├── 📁 Documentation/
│   ├── spark.apache.org/docs
│   ├── kafka.apache.org/documentation  
│   ├── docs.delta.io
│   ├── docs.getdbt.com
│   ├── duckdb.org/docs
│   └── kubernetes.io/docs
│
├── 📁 Playgrounds/
│   ├── community.cloud.databricks.com (free Spark)
│   ├── shell.duckdb.org (DuckDB online)
│   ├── db-fiddle.com (SQL)
│   └── killercoda.com (K8s)
│
├── 📁 Learning/
│   ├── courses.getdbt.com (free dbt)
│   ├── developer.confluent.io (Kafka)
│   └── datatalks.club (DE Zoomcamp)
│
├── 📁 Communities/
│   ├── reddit.com/r/dataengineering
│   ├── community.getdbt.com (Slack)
│   └── datatalks.club/slack
│
└── 📁 Interviews/
    ├── datalemur.com (SQL)
    ├── leetcode.com
    └── github.com/donnemartin/system-design-primer
