This cheat sheet explains how to install and maintain a few tools for programming with Java and Scala, in particular for Spark-powered data processing.
- Material for the Data platform - Modern Data Stack (MDS) in a box
- Material for the Data platform - Data life cycle
- Data Engineering Helpers - Knowledge Sharing - Minio
- Data Engineering Helpers - Knowledge Sharing - Trino
- Data Engineering Helpers - Knowledge Sharing - DuckDB
- Data Engineering Helpers - Knowledge Sharing - PostgreSQL
- Data Engineering Helpers - Knowledge Sharing - Hive Metastore
- If Java needs to be installed (e.g., on systems not packaging it natively),
it is advised to install and use SDKMan
- Once SDKMan has been installed, installing a specific version of Java alongside any existing ones becomes as easy as (here, for the Amazon-supported Corretto OpenJDK 11):
  sdk install java 11.0.21-amzn
- On macOS, Java may simply be installed with Homebrew:
  brew install openjdk
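Before installing anything, it may help to check which Java version, if any, is already on the `PATH`. The following is a minimal sketch, not part of the original cheat sheet: the helper name `java_major_version` is hypothetical, and it assumes the usual `java -version` banner format, where the version appears in double quotes (e.g., `openjdk version "11.0.21" 2023-10-17 LTS`).

```bash
#!/usr/bin/env bash

# Hypothetical helper: extract the major Java version from a `java -version` banner.
# Handles the legacy scheme ("1.8.0_392" -> 8) and the modern one ("11.0.21" -> 11).
java_major_version() {
  local banner="$1"
  local version
  # Keep only the quoted version string from the first line that has one
  version="$(printf '%s\n' "$banner" | sed -n 's/.*version "\([^"]*\)".*/\1/p' | head -n 1)"
  # Fail if no quoted version was found
  [ -n "$version" ] || return 1
  case "$version" in
    1.*) version="${version#1.}" ;;  # legacy scheme: 1.8.0_392 -> 8.0_392
  esac
  # Drop everything from the first non-digit character onwards
  printf '%s\n' "${version%%[!0-9]*}"
}

# Usage: `java -version` writes its banner to stderr, hence the 2>&1
if command -v java >/dev/null 2>&1; then
  echo "Java major version: $(java_major_version "$(java -version 2>&1)")"
else
  echo "Java not found; consider installing it with SDKMan or Homebrew"
fi
```

If the reported major version is not the one needed, a specific version can then be installed side by side with SDKMan as described above.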
- The packages may be searched for on Maven Central
- Hadoop download page (as of the end of 2023, the latest version is 3.3.6, released in June 2023): https://archive.apache.org/dist/hadoop/common/current/
- Hive Metastore standalone download page (as of the end of 2023, the latest version is 3.0.0, released in 2018): https://downloads.apache.org/hive/hive-standalone-metastore-3.0.0/
- The recent PostgreSQL JDBC drivers (42.x series, such as the 42.6.0 version below) are built for JDK version 8 and later
- PostgreSQL JDBC driver:
  $ wget https://repo1.maven.org/maven2/org/postgresql/postgresql/42.6.0/postgresql-42.6.0.jar
- Download page for Apache Spark: https://spark.apache.org/downloads.html
- Delta Spark: io.delta:delta-spark_2.12:3.0.0 (package page)
  - Download the JAR package:
    $ wget https://repo1.maven.org/maven2/io/delta/delta-spark_2.12/3.0.0/delta-spark_2.12-3.0.0.jar
- Delta standalone: io.delta:delta-standalone_2.12:3.0.0 (package page)
  - Download the JAR package:
    $ wget https://repo1.maven.org/maven2/io/delta/delta-standalone_2.12/3.0.0/delta-standalone_2.12-3.0.0.jar
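The download URLs above all follow the same Maven Central layout: the dots of the group ID become path separators, followed by the artifact ID, the version, and a file named `<artifact>-<version>.jar`. As an illustration, the sketch below derives such a URL from the Maven coordinates. The function name `maven_jar_url` is hypothetical, and the sketch only covers plain JAR artifacts without a classifier.

```bash
#!/usr/bin/env bash

# Hypothetical helper: build the Maven Central URL of a JAR from its coordinates.
# Arguments: group ID, artifact ID, version (e.g., io.delta delta-spark_2.12 3.0.0)
maven_jar_url() {
  local group_id="$1" artifact_id="$2" version="$3"
  local base_url="https://repo1.maven.org/maven2"
  # Dots in the group ID become path separators: io.delta -> io/delta
  local group_path="${group_id//.//}"
  printf '%s/%s/%s/%s/%s-%s.jar\n' \
    "$base_url" "$group_path" "$artifact_id" "$version" "$artifact_id" "$version"
}

# Usage: print the URLs of the packages listed above
maven_jar_url org.postgresql postgresql 42.6.0
maven_jar_url io.delta delta-spark_2.12 3.0.0
maven_jar_url io.delta delta-standalone_2.12 3.0.0

# To actually download a package (requires network access):
# wget "$(maven_jar_url io.delta delta-spark_2.12 3.0.0)"
```

Keeping the coordinates in one place this way makes it easier to bump a version without editing every URL by hand.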