Spark(5)Upgrade the Spark to 1.0.2 Version

 

1. Upgrade the Version to 1.0.2
If you plan to build from source:
>git clone https://github.com/apache/spark.git 

List the available tags and check out the release tag:
>git tag -l
>git checkout v1.0.2


>sbt/sbt -Dhadoop.version=2.2.0 -Pyarn assembly
>sbt/sbt -Dhadoop.version=2.2.0 -Pyarn publish-local

Or run a normal build:
>sbt/sbt update
>sbt/sbt compile
>sbt/sbt assembly

Instead, I downloaded the prebuilt binary from the official website and continued with my example.

Error Message
14/08/08 17:33:03 WARN scheduler.TaskSetManager: Loss was due to java.lang.NoClassDefFoundError
java.lang.NoClassDefFoundError: Could not initialize class scala.Predef$
Bad type in putfield/putstatic
14/08/08 22:07:07 ERROR executor.ExecutorUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-0,5,run-main-group-0]
java.lang.VerifyError: Bad type on operand stack
Exception Details:
  Location:
    scala/collection/IndexedSeq$.ReusableCBF$lzycompute()Lscala/collection/generic/GenTraversableFactory$GenericCanBuildFrom; @19: putfield
  Reason:
    Type 'scala/collection/IndexedSeq$$anon$1' (current frame, stack[1]) is not assignable to 'scala/collection/generic/GenTraversableFactory$GenericCanBuildFrom'
  Current Frame:
    bci: @19
    flags: { }
    locals: { 'scala/collection/IndexedSeq$', 'scala/collection/IndexedSeq$' }
    stack: { 'scala/collection/IndexedSeq$', 'scala/collection/IndexedSeq$$anon$1' }
Solution:
https://spark.apache.org/docs/latest/tuning.html#data-serialization

Joda-Time timezone problem. Either set the JVM default timezone:
-Duser.timezone=UTC

Or construct times with an explicit zone:
new DateTime(DateTimeZone.forID("UTC"))
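The timezone fix can be sketched in plain Scala. This uses only the JDK so it runs without Joda-Time on the classpath; the Joda-Time equivalent is shown in comments:

```scala
import java.util.TimeZone

// Pin the JVM default timezone to UTC, the programmatic equivalent of
// launching with -Duser.timezone=UTC. Doing this on both the driver and
// the executors keeps time calculations consistent across the cluster.
object TimezonePin {
  def pinUtc(): String = {
    TimeZone.setDefault(TimeZone.getTimeZone("UTC"))
    // With Joda-Time you would instead build instants with an explicit zone:
    //   new DateTime(DateTimeZone.forID("UTC"))
    TimeZone.getDefault.getID
  }
}
```

After calling TimezonePin.pinUtc(), new Date() and other default-zone constructors resolve against UTC instead of the machine's local zone.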

Object serializer: consider Kryo.
https://github.com/EsotericSoftware/kryo

Also update the Scala version to 2.10.4, which matches the Scala version Spark 1.0.2 is built against.
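Wiring Kryo into a Spark job looks roughly like this. The configuration keys are Spark's own, but treat this as an untested sketch assuming spark-core 1.0.2 on the classpath; the registrator class name is a hypothetical placeholder:

```scala
import org.apache.spark.SparkConf

// Switch Spark's object serializer from Java serialization to Kryo,
// as recommended by the tuning guide linked above.
val conf = new SparkConf()
  .setAppName("SerializerDemo")
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Optional: register your own classes for more compact serialized output.
  // "com.sillycat.spark.MyRegistrator" is a placeholder, not a real class.
  .set("spark.kryo.registrator", "com.sillycat.spark.MyRegistrator")
```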


2. Deployment
Standalone
Start Master
>vi conf/spark-env.sh
SPARK_MASTER_IP=localhost
SPARK_LOCAL_IP=localhost

>./sbin/start-master.sh

The main class is org.apache.spark.deploy.master.Master.

Web UI
http://localhost:8080/

Start Worker
>./bin/spark-class org.apache.spark.deploy.worker.Worker spark://localhost:7077


Error Message
java.lang.NoClassDefFoundError: com/google/protobuf/ProtocolMessageEnum
     at java.lang.ClassLoader.defineClass1(Native Method)
     at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
     at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
     at java.net.URLClassLoader.access$100(URLClassLoader.java:71)

Solution:
>wget http://central.maven.org/maven2/com/google/protobuf/protobuf-java/2.4.1/protobuf-java-2.4.1.jar
>wget http://central.maven.org/maven2/org/spark-project/protobuf/protobuf-java/2.4.1-shaded/protobuf-java-2.4.1-shaded.jar

Check the log file for the error message:
>tail -f spark-root-org.apache.spark.deploy.master.Master-1-carl-macbook.local.out

Error Message:
14/08/09 03:48:45 ERROR EndpointWriter: AssociationError [akka.tcp://sparkMaster@192.168.11.11:7077] -> [akka.tcp://spark@192.168.11.11:62531]: Error [Association failed with [akka.tcp://spark@192.168.11.11:62531]] [ akka.remote.EndpointAssociationException: Association failed with [akka.tcp://spark@192.168.11.11:62531] Caused by: akka.remote.transport.netty.NettyTransport$$anonfun$associate$1$$anon$2: Connection refused: /192.168.11.11:62531 ]

Solution:
After a while I switched to trying to build it myself.

On the Mac, I went directly to System Preferences to add a user named spark.
Generate an SSH key if needed:
>ssh-keygen -t rsa

Find the public key
>cat /Users/carl/.ssh/id_rsa.pub

That did not seem to work, so I moved to an Ubuntu VM instead.
>sudo adduser sparkWorker --force-badname

Check my hostname resolution:
>host ubuntu-master
Host ubuntu-master not found: 3(NXDOMAIN)

Check that Spark is listening:
>netstat -at | grep 7077
tcp6       0      0 ubuntu-master:7077      [::]:*                  LISTEN

>bin/spark-submit --class com.sillycat.spark.app.ClusterComplexJob --master spark://192.168.11.12:7077 --total-executor-cores 1 /Users/carl/work/sillycat/sillycat-spark/target/scala-2.10/sillycat-spark-assembly-1.0.jar
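The ClusterComplexJob source is not shown in this post; a minimal sketch of what a submittable class looks like, assuming spark-core 1.0.2 on the classpath (the object name matches the --class flag above, the rest is illustrative):

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Minimal entry point for spark-submit: the master URL comes from the
// --master flag, so the code only needs to name the application.
object ClusterComplexJob {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("ClusterComplexJob")
    val sc = new SparkContext(conf)
    // Trivial workload so a successful submit is easy to verify in the logs.
    val evens = sc.parallelize(1 to 100).filter(_ % 2 == 0).count()
    println("even numbers: " + evens)
    sc.stop()
  }
}
```

Packaged with sbt assembly, this produces the fat jar that spark-submit expects as its last argument.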

Turn off IPv6 on the Mac:
networksetup -listallnetworkservices | sed 1d | xargs -I {} networksetup -setv6off {}

None of these attempts worked.

I also tried the latest version from GitHub, 1.1.0-SNAPSHOT.

The standalone cluster is still not working.

References:
http://spark.apache.org/docs/latest/spark-standalone.html 
http://spark.apache.org/docs/latest/building-with-maven.html 
https://github.com/mesos/spark.git 

http://www.iteblog.com/archives/1038 
http://www.iteblog.com/archives/1016 

My Spark Blogs
http://sillycat.iteye.com/blog/1871204 
http://sillycat.iteye.com/blog/1872478 
http://sillycat.iteye.com/blog/2083193 
http://sillycat.iteye.com/blog/2083194 

ubuntu add/remove user
https://www.digitalocean.com/community/tutorials/how-to-add-and-delete-users-on-ubuntu-12-04-and-centos-6

https://spark.apache.org/docs/latest/submitting-applications.html
https://spark.apache.org/docs/latest/spark-standalone.html

disable ipv6
http://askubuntu.com/questions/309461/how-to-disable-ipv6-permanently

spark source code
http://www.cnblogs.com/hseagle/p/3673147.html
