Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
A Scala kernel for Jupyter
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]
Qubole Sparklens tool for performance tuning Apache Spark
🐍 Quick reference guide to common patterns & functions in PySpark.
The Internals of Spark SQL
Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience to help with the creation, editing, and management of Spark jobs on Azure HDInsigh...
Use SQL to build ELT pipelines on a data lakehouse.
Apache Spark™ and Scala Workshops
Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!
A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache HBase, Apache Parquet, Apache Avro, Apach...
A prototype big data platform project, with the source code for the book Big Data Platform Architecture and Prototype
Spark Structured Streaming / Kafka / Cassandra / Elastic
An encrypted data analytics platform
Spark SQL implementations of ItemCF, UserCF, and Swing: recommender systems, recommendation algorithms, collaborative filtering