1. Find a suitable version on the official site. Here I use maven-3.6.1.
wget http://mirrors.tuna.tsinghua.edu.cn/apache/maven/maven-3/3.6.1/binaries/apache-maven-3.6.1-bin.tar.gz
2. Extract it into the /usr/local directory, then delete the downloaded archive
tar -zxvf apache-maven-3.6.1-bin.tar.gz -C /usr/local
rm apache-maven-3.6.1-bin.tar.gz
3. Go to /usr/local and rename the Maven directory
cd /usr/local
mv apache-maven-3.6.1 maven-3.6.1
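Steps 2 and 3 can be collected into one small helper. The following is only a minimal sketch, not part of the original walkthrough: the function name install_tarball and its four arguments are hypothetical, and it assumes the archive has already been downloaded and that you have write permission on the target prefix.

```shell
#!/bin/sh
# Hypothetical helper: extract an archive into a prefix directory and
# rename the unpacked directory.
# Usage: install_tarball ARCHIVE PREFIX OLD_NAME NEW_NAME
install_tarball() {
    it_archive=$1; it_prefix=$2; it_old=$3; it_new=$4

    # Refuse to overwrite an existing installation.
    if [ -e "$it_prefix/$it_new" ]; then
        echo "already exists: $it_prefix/$it_new" >&2
        return 1
    fi

    # Each step runs only if the previous one succeeded.
    mkdir -p "$it_prefix" &&
        tar -zxf "$it_archive" -C "$it_prefix" &&
        mv "$it_prefix/$it_old" "$it_prefix/$it_new"
}
```

With the names used in this post, the call would be: install_tarball apache-maven-3.6.1-bin.tar.gz /usr/local apache-maven-3.6.1 maven-3.6.1. The sketch deliberately leaves the archive in place, so a failed extraction can be retried without downloading again.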
4. Next, configure the Maven environment variables
vim /etc/profile
Add the following lines to the file:
export MAVEN_HOME=/usr/local/maven-3.6.1
export PATH=$JAVA_HOME/bin:$MAVEN_HOME/bin:$PATH
5. Reload the environment variables
source /etc/profile
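If you would rather not edit the profile by hand, the two export lines can be appended from a script. This is a minimal sketch under stated assumptions: the function name add_maven_env is hypothetical, JAVA_HOME is assumed to be set elsewhere as in the lines above, and writing to /etc/profile requires root.

```shell
#!/bin/sh
# Hypothetical helper: append the Maven variables to a profile file once.
# Usage: add_maven_env PROFILE_FILE MAVEN_HOME_DIR
add_maven_env() {
    ame_profile=$1; ame_home=$2

    # Skip if a MAVEN_HOME export is already present, so that running the
    # helper again does not duplicate the lines.
    if grep -q '^export MAVEN_HOME=' "$ame_profile" 2>/dev/null; then
        return 0
    fi

    {
        echo "export MAVEN_HOME=$ame_home"
        # Single quotes keep the variables unexpanded until the profile is sourced.
        echo 'export PATH=$JAVA_HOME/bin:$MAVEN_HOME/bin:$PATH'
    } >> "$ame_profile"
}
```

For this post the call would be add_maven_env /etc/profile /usr/local/maven-3.6.1, followed by source /etc/profile as in step 5.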
6. Check that Maven was installed successfully
mvn -version
7. Writing a standalone Java application
1) Go to the user's home directory
cd ~
2) Create the sparkapp2 project directory, including the source directory used in the next step
mkdir -p sparkapp2/src/main/java
3) Create a SimpleApp.java file in ./sparkapp2/src/main/java
vim ./sparkapp2/src/main/java/SimpleApp.java
The contents are as follows:
/*** SimpleApp.java ***/
import org.apache.spark.api.java.*;
import org.apache.spark.api.java.function.Function;

public class SimpleApp {
  public static void main(String[] args) {
    // Any text file works here; this one ships with Spark.
    String logFile = "file:///usr/local/spark-2.4.3/README.md";
    // Arguments: master, application name, Spark home, jars to distribute.
    JavaSparkContext sc = new JavaSparkContext("local", "Simple App",
      "file:///usr/local/spark-2.4.3/", new String[]{"target/simple-project-1.0.jar"});
    // Cache the file because it is scanned twice below.
    JavaRDD<String> logData = sc.textFile(logFile).cache();

    // Count the lines that contain the letter "a".
    long numAs = logData.filter(new Function<String, Boolean>() {
      public Boolean call(String s) {
        return s.contains("a");
      }
    }).count();

    // Count the lines that contain the letter "b".
    long numBs = logData.filter(new Function<String, Boolean>() {
      public Boolean call(String s) { return s.contains("b"); }
    }).count();

    System.out.println("Lines with a: " + numAs + ", lines with b: " + numBs);
  }
}
This program depends on the Spark Java API, so it has to be compiled and packaged with Maven.
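The two counts the program prints can be cross-checked without Spark. This is a minimal sketch, assuming a POSIX grep; the function name count_lines_with is hypothetical. Like String.contains, grep -F matches a literal, case-sensitive substring, and -c counts matching lines rather than individual occurrences, so the numbers should agree with the program's output for an ordinary text file.

```shell
#!/bin/sh
# Hypothetical helper: print how many lines of FILE contain SUBSTRING.
# Usage: count_lines_with SUBSTRING FILE
count_lines_with() {
    # -F: treat the pattern as a fixed string, not a regular expression.
    # -c: print the number of matching lines.
    # Note: grep exits with status 1 when the count is 0; the 0 is still printed.
    grep -c -F -- "$1" "$2"
}
```

For example, count_lines_with a /usr/local/spark-2.4.3/README.md should print the same number as the "Lines with a" value reported by the job.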
4) Create a new file pom.xml in ./sparkapp2
If you do not know which version of a dependency to use, you can look it up on the MVNRepository site: https://mvnrepository.com/
The contents are as follows:
<project>
  <groupId>edu.berkeley</groupId>
  <artifactId>simple-project</artifactId>
  <modelVersion>4.0.0</modelVersion>
  <name>Simple Project</name>
  <packaging>jar</packaging>
  <version>1.0</version>
  <repositories>
    <repository>
      <id>Akka repository</id>
      <url>http://repo.akka.io/releases</url>
    </repository>
  </repositories>
  <dependencies>
    <dependency> <!-- Spark dependency -->
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-core_2.11</artifactId>
      <version>2.4.3</version>
    </dependency>
  </dependencies>
</project>
5) Package the Java program with Maven. First check the project's file structure:
cd ~/sparkapp2
find .
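The output of find . should list pom.xml at the top level and SimpleApp.java under src/main/java. The same check can be scripted; the following is a minimal sketch, and the function name check_layout is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: verify that the files Maven needs are in place.
# Usage: check_layout PROJECT_DIR
check_layout() {
    cl_root=$1
    cl_status=0

    # These two paths are the ones created in steps 3) and 4).
    for cl_path in pom.xml src/main/java/SimpleApp.java; do
        if [ ! -f "$cl_root/$cl_path" ]; then
            echo "missing: $cl_path" >&2
            cl_status=1
        fi
    done

    return "$cl_status"
}
```

Running check_layout ~/sparkapp2 returns 0 when both files exist and prints the missing paths to stderr otherwise.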
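6) Next, package the whole application into a jar. This takes a long time, roughly ten to twenty minutes, longer than with sbt. On success, Maven prints BUILD SUCCESS.
/usr/local/maven-3.6.1/bin/mvn package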
7) Run the program with spark-submit
Submit the generated jar to Spark with spark-submit:
/usr/local/spark-2.4.3/bin/spark-submit --class "SimpleApp" ~/sparkapp2/target/simple-project-1.0.jar 2>&1 | grep "Lines with a"
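The 2>&1 | grep part of the command is there because Spark, with its default logging configuration, writes its log messages to stderr, while the program's println output goes to stdout. Merging the two streams lets grep drop the log noise and keep only the result line. The sketch below imitates this with a stand-in function; fake_job, its log lines, and the numbers it prints are placeholders, not real Spark output.

```shell
#!/bin/sh
# Stand-in for spark-submit (hypothetical): log lines go to stderr,
# the program's own output goes to stdout. The counts are placeholders.
fake_job() {
    echo "INFO placeholder log line: job starting" >&2
    echo "Lines with a: 3, lines with b: 1"
    echo "INFO placeholder log line: job finished" >&2
}

# Merge stderr into stdout, then keep only the result line.
fake_job 2>&1 | grep "Lines with a"
# prints: Lines with a: 3, lines with b: 1
```

Without 2>&1, the log lines would bypass the pipe and still appear on the terminal around the result.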