spark-sql · GitHub Topics

Make Your Company Data Driven. Connect to any data source, easily visualize, dashboard and share your data.

redash Python visualization analytics bi redshift BigQuery athena MySQL PostgreSQL dashboard JavaScript business-intelligence databricks Apache Spark spark-sql Hacktoberfest

Python 27.77 k

2 days ago

apache / kyuubi

Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.

Apache Spark hive SQL thrift jdbc spark-sql data-lake hadoop Kubernetes Hacktoberfest

Scala 2.24 k

2 days ago

dotnet / spark

.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.

Apache Spark C# .NET analytics bigdata spark-streaming spark-sql machine-learning F# dotnet-standard streaming Azure hdinsight databricks emr Microsoft

C# 2.07 k

2 months ago

almond-sh / almond

A Scala kernel for Jupyter

Jupyter Notebook Scala Repl.it jupyter-kernels Apache Spark spark-sql

Scala 1.62 k

2 days ago

apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.

clickhouse simd spark-sql vectorization velox arrow

Scala 1.44 k

1 day ago

databricks / LearningSparkV2

This is the GitHub repo for Learning Spark: Lightning-Fast Data Analytics [2nd Edition]

Apache Spark spark-sql spark-mllib mlflow delta-lake

Scala 1.34 k

8 months ago

oeljeklaus-you / UserActionAnalyzePlatform

Big data platform for e-commerce user behavior analysis

Apache Spark Java hadoop sparkjava spark-sql

Java 1.06 k

3 years ago

kevinschaich / pyspark-cheatsheet

🐍 Quick reference guide to common patterns & functions in PySpark.

pyspark cheatsheet cheat cheatsheets reference references documentation data-science data Apache Spark spark-sql guide guides quickstart

601

3 years ago

qubole / sparklens

Qubole Sparklens tool for performance tuning Apache Spark

Apache Spark Scala Simulation scheduler scheduling performance performance-analysis performance-metrics performance-tuning performance-visualization spark-sql sparkjava spark-mllib spark-ml cluster

Scala 583

1 year ago

japila-books / spark-sql-internals

The Internals of Spark SQL

Apache Spark spark-sql internals mkdocs-material book

474

5 days ago

zsvoboda / ngods-stocks

New Generation Opensource Data Stack Demo

cube dagster datahub dbt iceberg metabase Python Apache Spark spark-sql trino

Jupyter Notebook 449

3 years ago

microsoft / data-accelerator

Data Accelerator for Apache Spark simplifies onboarding to Streaming of Big Data. It offers a rich, easy to use experience to help with creation, editing and management of Spark jobs on Azure HDInsigh...

Apache Spark spark-streaming spark-sql sparksql streaming-data streaming servicefabric Node.js Docker hdinsight cosmosdb React Azure iothub big-data Internet of things kafka kafka-streams

C# 305

5 months ago

cuebook / cuelake

Use SQL to build ELT pipelines on a data lakehouse.

apache-iceberg delta lakehouse datalake data-lake elt etl data-engineering data-integration data-ingestion Apache Spark spark-sql data-transfer pipelines data-pipeline zeppelin-notebook SQL

JavaScript 288

3 years ago

jaceklaskowski / spark-workshop

Apache Spark™ and Scala Workshops

workshop Apache Spark spark-sql spark-mllib

HTML 263

1 year ago

Qbeast-io / qbeast-spark

Qbeast-spark: DataSource enabling multi-dimensional indexing and efficient data sampling. Big Data, free from the unnecessary!

Apache Spark Scala big-data sampling datasource spark-sql

Scala 233

8 months ago

Chabane / bigdata-playground

A complete example of a big data application using: Kubernetes (kops/aws), Apache Spark SQL/Streaming/MLlib, Apache Flink, Scala, Python, Apache Kafka, Apache Hbase, Apache Parquet, Apache Avro, Apach...