DEV Community
#dataengineering
Posts
Entity Resolution at Scale: Matching Products Across Amazon, Reddit, and RTINGS
Daniel Rozin
Apr 10
#ai #webdev #dataengineering #tutorial
4 min read
Apache Data Lakehouse Weekly: April 3–9, 2026
Alex Merced
Apr 9
#news #data #dataengineering #opensource
7 min read
ETL vs ELT: Which One Should You Use and Why?
Robert Njuguna
Apr 14
#dataengineering #database #data #luxdev
1 reaction
7 min read
AWS Lake Formation: Why Your Data Lake Permissions Are Probably a Mess (And How to Fix That)
Soumyadeep Basu
Apr 9
#dataengineering #awsdatalake #aws
3 min read
Airflow vs Prefect vs Dagster: Picking the Right Orchestrator in 2026
DataStackX
Apr 9
#dataengineering #python #airflow #dagster
6 min read
Your Customer Table Has Duplicates You Can't See With SQL: How I Built a Cross-Platform Identity Resolution Layer for a Dark Kitchen Data Platform
SARAN TEJA MALLELA
Apr 9
#dataengineering #apachespark #kafka #deltalake
3 reactions
8 min read
How to Bypass the Pandas "Object Tax": Building an 8x Faster CSV Engine in C
NARESH-CN2
Apr 9
#python #performance #dataengineering #datascience
2 min read
PostgreSQL Foreign Data Wrappers: Cross-Database Queries Explained
Philip McClarence
Apr 9
#database #dataengineering #postgres #sql
4 min read
How Google Maps Predicts Traffic in Real Time: Live Data and ETA Explained
Ashish Kumar
Apr 9
#googlemaps #traffic #gps #dataengineering
3 min read
ETL vs ELT: Two Paradigms, One Goal
Edmund Eryuba
Apr 13
#dataengineering #datascience #database #cicd
5 min read
How I cut Python JSON memory overhead from 1.9GB to ~0MB (11x Speedup)
NARESH-CN2
Apr 8
#python #c #performance #dataengineering
2 min read
Building Market Data Pipelines Without REST Boilerplate
Infoway API
Apr 13
#api #architecture #backend #dataengineering
1 reaction
1 comment
4 min read
I Got a Teams Message: 'Why Do We Need a Semantic Layer?'
Data Tech Bridge
Apr 7
#analytics #architecture #data #dataengineering
7 min read
ETL vs ELT: Which One Should You Use and Why?
Rachel Muriuki
Apr 12
#dataengineering #etl #elt #bigdata
3 reactions
5 min read
Python was too slow for 10M rows—So I built a C-Bridge (and found the hidden data loss)
NARESH-CN2
Apr 7
#python #cpp #performance #dataengineering
2 min read