#Interview# A curated list of awesome System Design (a.k.a. Distributed Systems) resources.
HadoopOffice - Analyze Office documents using the Hadoop ecosystem (Spark/Flink/Hive)
IBIS is a workflow-creation engine that abstracts the Hadoop internals of ingesting RDBMS data.
Life cycle: internal workings of HDFS, Sqoop, Hive, Spark, HBase, and Kafka, with code.
Hadoop 3.2 in single-node or cluster mode with the gotty web terminal, Spark, Jupyter with PySpark, Hive, and other ecosystem tools.
Instructions for setting up Hadoop, HDFS, Java, sbt, Kafka, Scala, Spark, and Flume on Ubuntu 18.04
Dockerfile for running Apache Knox (http://knox.apache.org/) in Docker
This project sets up a Hadoop High Availability (HA) Cluster using Docker Compose with three master nodes and two worker nodes for fault-tolerant big data processing. It includes Zookeeper & JournalNo...
#Computer Science# The goal of this project is to identify flood-prone areas and estimate the probability of flooding in counties on a future date, using Spark MLlib.
Analysis of YouTube data using the Hadoop MapReduce framework in Java.
A large-scale distributed data processing system for streaming analytics, built on the Hadoop ecosystem (Apache Spark and HDFS) in the cloud for real-time spatial analytics.
Helm chart for Apache Knox
Collecting tweets with the Flume service and analyzing them
Spark Streaming & Kafka Quick Start Tutorial
Practice programs in the Hadoop ecosystem, for reference
[BigData] One-year weblog analysis using Pig
Big data from various customers is stored and analyzed using Hadoop and other tools such as Hive, ZooKeeper, HBase, and Sqoop; all customer details are analyzed and the results are reported. This result is ...