
I am fairly new to cloud and SBT/IntelliJ, so I am trying my luck with the IntelliJ & SBT build environment to deploy my jar on a Dataproc cluster.


Here's a screenshot of my project structure:


The code is quite simple: main is defined in 'mytestmain', which calls another method defined in 'ReadYamlConfiguration'. That method needs the moultingyaml dependency, which I have included in my build.sbt.


Here are my build.sbt & assembly.sbt files:

// build.sbt
lazy val root = (project in file(".")).
  settings(
    name := "MyTestProjectNew",
    version := "0.0.1-SNAPSHOT",
    scalaVersion := "2.11.12",
    // fully qualified object name, without the .scala file extension
    mainClass in Compile := Some("com.test.processing.jobs.mytestmain")
  )

libraryDependencies ++= Seq(
  "net.jcazevedo" %% "moultingyaml" % "0.4.2"
)

scalaSource in Compile := baseDirectory.value / "src"


// assembly.sbt (sbt plugins are declared under the project/ directory)
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.10")


I created assembly.sbt to build an uber jar that includes the required dependencies, and ran 'sbt assembly' from the terminal. It created the assembly jar successfully, and I was able to deploy and run it on the Dataproc cluster:

gcloud dataproc jobs submit spark \
--cluster my-dataproc-cluster \
--region europe-north1 --class com.test.processing.jobs.mytestmain \
--jars gs://my-test-bucket/spark-jobs/MyTestProjectNew-assembly-0.0.1-SNAPSHOT.jar


The code works as expected with no issues.


Now I would like to have my own custom directory structure as shown below:


For example, I would like to have a folder named 'spark-job' with a subdirectory named 'SparkDataProcessing', and under that a src/main/scala folder with the packages and their respective Scala classes, objects, etc.


My main method is defined in the package 'job' within the 'com.test.processing' package.


What changes do I need to make in build.sbt? I am looking for a detailed explanation with a sample build.sbt that matches my project structure. Also, please suggest what needs to be included in the .gitignore file.


I am using IntelliJ IDEA 2020 Community Edition and SBT version 1.3.3. I tried a few things here and there but always ended up with issues in the project structure, the jar, or build.sbt. I was hoping for an answer similar to the one in the post below.



As you can see in the picture below, the source directory has been changed.



When I build with the path below, it does not work:

scalaSource in Compile := baseDirectory.value / "src"


It works when I keep the default structure, i.e. src/main/scala.



You also need to change the package name after the package keyword at the top of affected files. However, if you refactor using IntelliJ (by creating the packages and then dragging the files into the package using the UI), then IntelliJ will do this for you.


Nothing else needs to be changed (build.sbt and related files can stay the same).
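
For the nested layout described in the question, here is a minimal sketch of one possible build.sbt. It assumes (this is not stated in the post) that build.sbt stays in spark-job/ while the sources move to SparkDataProcessing/src/main/scala; the value name sparkDataProcessing is made up for illustration. If build.sbt instead sits inside SparkDataProcessing itself, the default src/main/scala layout already applies and the file can stay as it is.

```scala
// build.sbt, assumed to live in spark-job/ (hypothetical placement)
lazy val sparkDataProcessing = (project in file("SparkDataProcessing"))
  .settings(
    name := "MyTestProjectNew",
    version := "0.0.1-SNAPSHOT",
    scalaVersion := "2.11.12",
    libraryDependencies += "net.jcazevedo" %% "moultingyaml" % "0.4.2",
    // No scalaSource override here: sbt compiles
    // SparkDataProcessing/src/main/scala by default.
    // Entry point recorded in the assembly jar's manifest (sbt-assembly key).
    mainClass in assembly := Some("com.test.processing.jobs.job.mytestmain")
  )
```

With this layout the uber jar would be built with `sbt sparkDataProcessing/assembly`, and project/assembly.sbt keeps the same addSbtPlugin line as before.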


Finally, remember to change the class argument to reflect changes in entrypoint locations; you would pass --class com.test.processing.jobs.job.mytestmain instead of --class com.test.processing.jobs.mytestmain.
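
To make the link between the package declaration and the --class argument concrete, here is a hedged skeleton of the relocated entry point. The file path in the comment and the body of main are placeholders, not the asker's actual code.

```scala
// Assumed location:
// SparkDataProcessing/src/main/scala/com/test/processing/jobs/job/mytestmain.scala
package com.test.processing.jobs.job

object mytestmain {
  def main(args: Array[String]): Unit = {
    // Placeholder body; the real job would call into ReadYamlConfiguration here.
    println("mytestmain started")
  }
}
```

The package line plus the object name gives the fully qualified name com.test.processing.jobs.job.mytestmain, which is the value to pass to --class.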


As for .gitignore: please take a look at an example gitignore file which includes:

  • output directories, including "target"
  • IntelliJ directories such as ".idea"


Another gitignore example takes a different approach and ignores all .class files generated by the compiler. As a rule, .gitignore should list every file that is generated dynamically and whose changes do not matter to other developers.
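
Putting the points above together, a minimal .gitignore for an sbt project opened in IntelliJ might look like the sketch below. The *.iml entry is an assumption based on common IntelliJ setups and is not mentioned in the examples referred to above.

```gitignore
# sbt build output (the pattern matches target/ at any depth, including project/target/)
target/

# IntelliJ IDEA project files
.idea/
# assumption: module files, if IntelliJ generates them for this project
*.iml

# compiled classes
*.class
```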
