Empowering Data Intelligence with Distributed SQL for Sharding, Scalability, and Security Across All Databases.
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
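For orientation, the sketch below shows one way to tail Debezium-style change events from Kafka using the plain Java consumer client rather than Debezium's own embedded API. The broker address, consumer group, and topic name (Debezium conventionally publishes one topic per captured table, e.g. server.schema.table) are placeholder assumptions.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class CdcEventReader {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // assumed broker address
        props.put("group.id", "cdc-demo");                   // assumed consumer group
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Illustrative topic name only; Debezium topics follow <server>.<schema>.<table>.
            consumer.subscribe(List.of("dbserver1.inventory.customers"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    // Each value is a JSON change event carrying "before", "after", and "op" fields.
                    System.out.printf("key=%s value=%s%n", record.key(), record.value());
                }
            }
        }
    }
}
```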
Flink CDC is a streaming data integration tool
BitSail is a distributed, high-performance data integration engine that supports batch, streaming, and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of records every day.
SeaTunnel is a distributed, high-performance data integration platform for the synchronization and transformation of massive data (offline & real-time).
Flexible development framework for building streaming data applications in SQL with Kafka, Flink, Postgres, GraphQL, and more.
Kafka Streams made easy with a YAML file
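Since the tool's YAML schema isn't reproduced here, the following is a minimal plain Kafka Streams topology in Java showing the kind of read-transform-write pipeline such YAML-driven wrappers typically generate; the application id, broker address, and topic names are placeholder assumptions.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.KStream;

public class UppercaseTopology {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "uppercase-demo");     // assumed app id
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");  // assumed broker
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        // Read from an input topic, transform each value, and write to an output topic.
        KStream<String, String> input = builder.stream("input-topic");        // assumed topic
        input.mapValues(value -> value.toUpperCase())
             .to("output-topic");                                             // assumed topic

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}
```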
cron replacement to schedule complex data workflows
Data pipeline using Apache Kafka, Apache Spark and HDFS
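A Kafka-to-Spark-to-HDFS pipeline of this shape can be sketched with Spark Structured Streaming. The Java example below is a generic illustration rather than that repository's actual job; the broker address, topic, and HDFS paths are assumptions.

```java
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;
import org.apache.spark.sql.streaming.StreamingQuery;

public class KafkaToHdfsJob {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("kafka-to-hdfs")                  // assumed app name
                .getOrCreate();

        // Read raw Kafka records; topic and broker address are placeholders.
        Dataset<Row> events = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "events")
                .load()
                .selectExpr("CAST(key AS STRING) AS key", "CAST(value AS STRING) AS value");

        // Continuously append the decoded records to HDFS as Parquet files.
        StreamingQuery query = events.writeStream()
                .format("parquet")
                .option("path", "hdfs:///data/events")                  // assumed HDFS path
                .option("checkpointLocation", "hdfs:///checkpoints/events")
                .start();

        query.awaitTermination();
    }
}
```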
Toolkit for describing data transformation pipelines by composing simple reusable components.
An end-to-end data pipeline with Kafka and Spark Streaming integration
LinkedIn's previous generation Kafka to HDFS pipeline.
Data-processing and common libraries used in the main project, all available under Apache 2.0
⚡ Data integration | DataLink is a lightweight data integration framework built on top of DataX, Spark, and Flink
A real-time data pipeline using Kafka, Spark, and Cassandra for processing and storing credit card expenses. Includes a Spring Boot application for retrieving personnel data from MySQL, storing images in S3, and displaying employee details with expense reports on a web interface.
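The Kafka, Spark, and Cassandra flow described above is commonly wired together with Structured Streaming's foreachBatch sink. The sketch below is a hedged illustration rather than the repository's code: it assumes the DataStax Spark Cassandra Connector is on the classpath, and the topic, keyspace, table, and checkpoint names are placeholders.

```java
import org.apache.spark.api.java.function.VoidFunction2;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.SparkSession;

public class ExpensesToCassandraJob {
    public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder()
                .appName("expenses-to-cassandra")           // assumed app name
                .getOrCreate();

        // Stream credit card expense events from Kafka; broker and topic names are placeholders.
        Dataset<Row> expenses = spark.readStream()
                .format("kafka")
                .option("kafka.bootstrap.servers", "localhost:9092")
                .option("subscribe", "expenses")
                .load()
                .selectExpr("CAST(value AS STRING) AS json");

        // Write each micro-batch to Cassandra through the Spark Cassandra Connector.
        VoidFunction2<Dataset<Row>, Long> writeBatch = (batch, batchId) ->
                batch.write()
                        .format("org.apache.spark.sql.cassandra")
                        .option("keyspace", "finance")       // assumed keyspace
                        .option("table", "expenses")         // assumed table
                        .mode("append")
                        .save();

        expenses.writeStream()
                .foreachBatch(writeBatch)
                .option("checkpointLocation", "/tmp/checkpoints/expenses")
                .start()
                .awaitTermination();
    }
}
```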
Real Time Data Streaming Pipeline
A real-time cryptocurrency data streaming pipeline.
CS502Capstone